Redo Log
Format
Example
Incidentally, for a sparse page there are log types (MLOG_COMP_LIST_START, MLOG_COMP_LIST_END, etc.) that store a list of operations.
Mini-Transaction
Atomic operations
- Group Redo Log: makes multiple redo log records atomic.
- Single Redo Log
Group Redo Log
Single Redo Log
Mini-Transaction
A mini-transaction is an atomic operation that consists of either a group of redo log records or a single atomic redo log record.
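The all-or-nothing behaviour can be sketched as follows. This is a minimal illustration, not InnoDB's record layout: the end marker `GROUP_END`, the `single` flag, and the `RedoRecord` type are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass

GROUP_END = "GROUP_END"  # hypothetical end-of-group marker, not an InnoDB identifier


@dataclass
class RedoRecord:
    kind: str             # record type, or GROUP_END
    single: bool = False  # hypothetical flag: the record is a whole mini-transaction


def complete_mini_transactions(records):
    """Return the mini-transactions that are complete, in log order."""
    complete, pending = [], []
    for rec in records:
        if rec.single:
            # A single redo log record is atomic on its own.
            complete.append([rec])
        elif rec.kind == GROUP_END:
            # The marker closes the group, so the whole group becomes usable.
            complete.append(pending)
            pending = []
        else:
            pending.append(rec)
    # Records still in `pending` belong to an unfinished group and are dropped.
    return complete
```

A group whose end marker never reached the log is discarded as a whole, which is what makes the group atomic.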
Relationship
Redo Log Buffer
Redo Log Block
Write Redo Log to Redo Log Buffer
Concurrency of Redo Log Buffer
When to write to disk
- There is not enough log buffer space.
- A transaction is committing.
- A dirty page is about to be written: its redo log must be written first.
- A background thread writes periodically.
- The server is shut down.
- A checkpoint is updated.
Redo Log File
File Group
File Format
Log File header
Log File checkpoint
Flags
The four values are Log Sequence Number, flushed_to_disk_lsn, oldest_modification, and checkpoint_lsn.
| Variable | Meaning | How it is updated |
|---|---|---|
| Log Sequence Number | Position at which the next redo log record starts. | Advances whenever an operation generates redo log. |
| flushed_to_disk_lsn | LSN up to which the redo log has been written to disk. | Advances when redo log blocks in the redo log buffer are written to disk. |
| oldest_modification | Oldest LSN among the dirty pages. | Advances as dirty pages are flushed periodically. |
| checkpoint_lsn | LSN of the latest checkpoint. | Executing a checkpoint: update checkpoint_lsn and write the checkpoint to the log file. |
Rank
checkpoint_lsn <= oldest_modification <= flushed_to_disk_lsn <= Log Sequence Number
How to update checkpoint_lsn: checkpoint
- Read the current oldest_modification and set checkpoint_lsn to it.
- Write the current checkpoint's information to the redo log file (actually the checkpoint area of the first log file).
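The two checkpoint steps can be sketched as below. `LogState`, `dirty_pages`, and the `checkpoint_area` dictionary are hypothetical stand-ins for the real structures, and the fallback used when no dirty page exists is an assumption of this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class LogState:
    lsn: int = 0                   # Log Sequence Number
    flushed_to_disk_lsn: int = 0
    checkpoint_lsn: int = 0
    # Hypothetical map: page id -> oldest_modification of that dirty page.
    dirty_pages: dict = field(default_factory=dict)

    def oldest_modification(self):
        # Assumption: with no dirty pages, fall back to flushed_to_disk_lsn.
        return min(self.dirty_pages.values(), default=self.flushed_to_disk_lsn)

    def checkpoint(self, checkpoint_area):
        # Step 1: read the current oldest_modification into checkpoint_lsn.
        self.checkpoint_lsn = self.oldest_modification()
        # Step 2: record the checkpoint in the checkpoint area of the log file.
        checkpoint_area["checkpoint_lsn"] = self.checkpoint_lsn

    def ordering_holds(self):
        # The rank stated above, checked on the current values.
        return (self.checkpoint_lsn <= self.oldest_modification()
                <= self.flushed_to_disk_lsn <= self.lsn)
```

For example, with dirty pages whose oldest_modification values are 200 and 350, a checkpoint sets checkpoint_lsn to 200, because everything before 200 is already on disk.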
Crash Recovery
Recovery Range
See the rank picture.
The recovery range runs from checkpoint_lsn to flushed_to_disk_lsn, namely part 4.
However, part 1 was already written to disk before the crash, so it is skipped.
Start Point
Get checkpoint_lsn from the checkpoint area of the first log file.
End Point
Scan forward from the checkpoint until reaching a log block whose LOG_BLOCK_HDR_DATA_LEN is not 512.
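A minimal sketch of that scan, assuming 512-byte blocks and assuming the data-length field is a 2-byte big-endian value at offset 4 of each block header. The offset, the byte order, and the function names are assumptions of this example and should be checked against the InnoDB source.

```python
import struct

BLOCK_SIZE = 512      # size of one redo log block
DATA_LEN_OFFSET = 4   # assumed offset of LOG_BLOCK_HDR_DATA_LEN in the block header


def find_end_block(log_bytes, start_block=0):
    """Return the index of the first block whose data length is not 512."""
    n_blocks = len(log_bytes) // BLOCK_SIZE
    for i in range(start_block, n_blocks):
        offset = i * BLOCK_SIZE + DATA_LEN_OFFSET
        (data_len,) = struct.unpack_from(">H", log_bytes, offset)
        if data_len != BLOCK_SIZE:
            return i  # a partially filled block marks the end point
    return None  # every scanned block was full


def make_block(data_len):
    # Helper for the example: an empty block with only the length field set.
    block = bytearray(BLOCK_SIZE)
    struct.pack_into(">H", block, DATA_LEN_OFFSET, data_len)
    return bytes(block)
```

A full block reports a data length of 512, so the first block reporting anything else is the last one that was being written when the crash happened.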
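How to skip part 1 while recovering
Every page has a FIL_PAGE_LSN field in its File Header (the page's own header, not the redo log file header), which records the LSN of the last modification to that page.
If this value is greater than the last checkpoint_lsn, the page was flushed after the checkpoint and is more up to date, so redo log records it already contains are skipped.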
Optimization
While reading the redo log file, organize the redo log records in a hash table whose key is the record's space ID and page number.
Records for the same page are kept in the order they were written, which preserves the original order.
If a page has been skipped (namely, it was already written before the crash), it is not written again.
- minimize disk I/O by hash table
- avoid duplicate write
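The hash table and the skip rule can be sketched together. This is a simplified model: records are plain `(space_id, page_no, lsn, change)` tuples, `page_lsn` stands in for reading FIL_PAGE_LSN from each page, and the strict `lsn > FIL_PAGE_LSN` comparison is an assumption of this example.

```python
from collections import defaultdict


def group_by_page(records):
    """Group (space_id, page_no, lsn, change) tuples by page, keeping log order."""
    table = defaultdict(list)
    for space_id, page_no, lsn, change in records:
        table[(space_id, page_no)].append((lsn, change))
    return table


def recover(records, page_lsn):
    """Return the changes to apply per page, skipping those already on the page.

    page_lsn maps (space_id, page_no) to that page's FIL_PAGE_LSN.
    """
    to_apply = {}
    for page, changes in group_by_page(records).items():
        current = page_lsn.get(page, 0)
        # Keep only changes newer than what the page already holds.
        pending = [change for lsn, change in changes if lsn > current]
        if pending:
            to_apply[page] = pending
    return to_apply
```

Grouping by page means each page is read once and all of its changes are applied together, and a page that is already up to date is never touched.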
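Safety vs Concurrency
innodb_flush_log_at_trx_commit controls when the redo log is flushed while a transaction is committing.
- 0: not right away; flushing is left to the background thread.
- 1 (default): right away; the redo log is written and flushed to disk at commit.
- 2: written with buffered I/O at commit; the data stays in the operating system's page cache until it is flushed later.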
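The difference between the three values comes down to whether a commit only fills the log buffer, also calls write, or also calls fsync. The sketch below is a simplified model using Python's `os.write` and `os.fsync`; the `commit` and `demo` functions are illustrative names and not part of MySQL.

```python
import os
import tempfile


def commit(fd, log_buffer, record, mode):
    """Handle one commit under a simplified model of innodb_flush_log_at_trx_commit."""
    log_buffer.append(record)
    if mode == 0:
        return  # left in the log buffer for the background thread
    os.write(fd, b"".join(log_buffer))  # hand the data to the operating system
    log_buffer.clear()
    if mode == 1:
        os.fsync(fd)  # force the data from the page cache to disk


def demo():
    """Run one commit per mode and report the file size after each."""
    fd, path = tempfile.mkstemp()
    log_buffer, sizes = [], []
    try:
        for mode, record in [(0, b"r1"), (2, b"r2"), (1, b"r3")]:
            commit(fd, log_buffer, record, mode)
            sizes.append(os.path.getsize(path))
    finally:
        os.close(fd)
        os.remove(path)
    return sizes
```

With 0, a crash of the database process can lose committed records still in the log buffer; with 2, they survive a process crash but not an operating system crash; with 1, they survive both, at the cost of an fsync per commit.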