Wednesday, January 6, 2021

View Transactions from MySQL Binlogs to Resolve Replication Issues

Summary

Sometimes Replication abends need investigation and the first step is to determine the transaction that is failing.  So we can determine the transaction based on logfile and position from the replication abend is the most common reason I go after transactions from the bin logs so I know what we need to do to fix.

Replication Abend Message

2017-03-20T14:16:10.383779Z 287072 [ERROR] Slave SQL for channel 'group_replication_recovery': Worker 0 failed executing

transaction '57671649-2432-4b1a-af66-74f16f43c510:1588295' at master log mynode-bin.000387, end_log_pos 33127552;

Could not execute Delete_rows event on table myschema.tablename; Can't find record in 'tablename',

Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log FIRST, end_log_pos 33127552, Error_code: 1032

 

In the example message we can determine the log file name and position

mynode-bin.000387

33127552

Mysql Binlog Utility Options

--base64-output=DECODE-ROWS

Required for the ROW based replication format

 

--verbose

Required to verbose the output in a readable text format

 

MySQL Bin log to a text file to be able to search for records

Example:

mysqlbinlog --base64-output=DECODE-ROWS --verbose  mynode-bin.000387 > mynode.binlog.00387.txt

 

 

 

Search the binlog text file for Records

Edit the text file created in the previous and then search for the log positon

 

vi mynode.binlog.00387.txt

/33127552

 

#170318  2:42:06 server id 101  end_log_pos 33127552    Delete_rows: table id 595 flags: STMT_END_F

### DELETE FROM `myschema`.`tablename`

### WHERE

###   @1='05d7490c-0971-408f-82f5-d5024922dcd7'

###   @2='received.x12.005010x222a1'

###   @3=526

###   @4=9223372036854775807

###   @5='claims/claims'

###   @6=1489819270

### DELETE FROM `myschema`.`tablename`

### WHERE

###   @1='05d7490c-0971-408f-82f5-d5024922dcd7'

###   @2='received.x12.005010x223a2'

###   @3=1

###   @4=9223372036854775807

###   @5='claims/claims'

###   @6=1489819270

### DELETE FROM `myschema`.`tablename`

### WHERE

###   @1='087f46ce-c8a2-490b-89c7-a4dd77584070'

###   @2='processed'

###   @3=24

###   @4=9223372036854775807

###   @5='claims/claims'

###   @6=1489819270

 

** note in this case 3 delete statements, if you just skipped the position in replication you would skip all 3 delete statements probably no exactly what you would want to do.

 

Resolve the Replication Issue

Based on key value of table find which of the 3 deletes is an issue.

 

After that we can insert a record into the table that would satisfy the delete so that replication can successfully remove the dummy record and continue.


No comments: