Log based recovery:Let us first describe the term transaction log in the context of DBMS. A transaction log is a record in DBMS that remains track of all the transactions of a database system that update any data values in the database. A log haves the following information about a transaction:
- A transaction start marker
- The transaction identification: The transaction id, user id or terminal id etc.
- The operations being performed by the transaction such as delete, update, insert.
- The objects or data items that are affected by the transaction as well as name of the table, row number and column number.
- The previous or before values (also known as UNDO values) and changed or after values (also called REDO values) of the data items that have been updated.
- A pointer to the next transaction log record, if required.
- The COMMIT marker of the transaction.
In a database system various transactions run simultaneously. When a transaction commits, the data buffers used by it require not be written back to the physical database stored on the secondary storage as these buffers may be used by various other transactions that have not yet committed. On the other hand, some of the data buffers that may have updates by various uncommitted transactions may be forced back to the physical database, as they is no longer being used by the database. So the transaction log helps in remembering which transaction did which alters. Therefore the system knows exactly how to divide the changes made by transactions that have already committed from those alters that is made by the transactions that did not yet commit. Any operation such as start transaction, insert /delete/update and end transaction (commit), adds information to the log having the transaction identifier and enough information to redo or undo the changes.
But how do we recover using log? Let us show this with the help of an example having three concurrent transactions that are active on ACCOUNTS table as:
Transaction T1
|
Transaction T2
|
Transaction T3
|
Read X
|
Read A
|
Read Z
|
Subtract 100
|
Add 200
|
Subtract 500
|
Write X
|
Write A
|
Write Z
|
Read Y
|
|
Add 100
|
Write Y
|
Figure: The sample transactions
Suppose that these transactions have the following log file (hypothetical) at a point:
Transaction Begin Marker
|
Transaction
Id
|
Operation on ACCOUNTS table
|
UNDO values (assumed)
|
REDO
values
|
Transaction Commit Marker
|
Y
|
T1
|
Sub on X Add on Y
|
500
800
|
400
Not done yet
|
N
|
Y
|
T2
|
Add on A
|
1000
|
1200
|
N
|
Y
|
T3
|
Sub on Z
|
900
|
400
|
Y
|
Figure: A sample (hypothetical) Transaction log
Now suppose at this point of time a failure occurs, then how the recovery of the database will be completed on restart.
Values
|
Initial
|
Just before the failure
|
Operation
Required for recovery
|
Recovered Database Values
|
X
|
500
|
400 (assuming update has been done in physical database also)
|
UNDO
|
500
|
Y
|
800
|
800
|
UNDO
|
800
|
A
|
1000
|
1000 (assuming update has not been done in physical database)
|
UNDO
|
1000
|
Z
|
900
|
900 (assuming update has not been done in physical database)
|
REDO
|
400
|