Module 4
Transaction Processing
Concurrent execution has following problems:
- Lost Update.
- Dirty Read.
- Unrepeatable Read.
Lost Update: -
The update of one transaction is overwritten by other transaction.Eg. Credit $50 in A by transaction T1 & T2 debits $ 50 from account AA=500
*Schedule S2*
Initial value in account A=$500
Output must be 500 only because T1 is crediting $50 in A & transaction T2 is debiting $50 from A.
But in above schedule, finds value of amount in account A is $450.
Update done by transaction T1 is lost.
- Dirty Read:-
A=500
*Schedule S3*
- Here, value of A has been modified by T1. T2 reads this modified value.
- If because of reason, T1 fails , then as per the atomicity property, T1 must be restarted. So, the modification will be rolled back.
- A will become 500.
- But value of A read by T2 is 600.
- So, T2 reads the value of A which does not exist.
- Reading value of A by T2 is called as a Dirty Read.
Unrepeatable Read:-
If T2 reads (A), then A is altered by T1. T1 commits. Now, if T2 reads (A), then the values read by T2 in both reads will be different.
A=500.
*Schedule S4*
- If T2 reads, A for first time, it will be 500. Now T1 commits.
- Now if T2 again reads A, it will be 600.
- So, in same schedule, T2 reads two different values of A.
Note:- Serial execution, does not suffer from these problems, since in serial execution there is no sharing of data.
Serializable Schedule:-
A schedule ‘S’ of ‘n’ transactions is serializable if it is equivalent to some serial schedule of the same ‘n’ transactions.
Eg.
*Schedule S5*
S5 is a serial schedule because S1 contains T1 & T2. Both these transactions are working serially
*Schedule S6*
S6 is a serializable schedule because it is equivalent to serial schedule S5.
There are two types of Serializability:-
- Conflict &
- View Serializability.
Conflict Serializable Schedule:-
- Conflict Equivalent :-
- Two schedules are said to be conflict equivalent if the order of any two conflicting operations is same in both the schedules.
- If two conflict operating area applied in different order in two schedules, the effect can be different on the database & hence the schedules are not conflict equivalent.
Eg. *Schedule S7* *Schedule S8*
Iii. These schedules are not conflict equivalent.
Iv. Because (i) & (ii) are conflicting operations. But this sequence of 1 & 2 operations is different in S7 & S8.
v. Value of X in S7 is 15 i.e. final value of x in database & final value x in S8 is 10.
Vi. This happens because of different sequence of conflicting instructions in S7 & S8.
Vii.
Note:- Schedules S5 & S6 are conflict equivalent because conflicting instructions (1 & 2) in both schedules are in same order.
Conflict Serializable:-
- A schedule is conflict serializable if it is conflict equivalent to some serial schedule.
- This can be achieved by two ways.
- Reordering of non-conflicting operations.
- Constructing precedence graph.
Testing for Seriazability:
- Reordering of non-conflicting operations:
Eg. Consider S5 & S6.
If we reorder instructions (1 & 2) in S6 as:-
We are reordering non-conflicting instructions (2,3) & (4,5). The resultant schedule is equivalent to schedule S5. So, S6 is called as conflict serializable because it is conflict equivalent to serial schedule S5.
B. Precedence Graph:- It is used for Testing of conflict serializability of a schedule.
Following Algorithm can be used to test a schedule for conflict serializability:-
- For each transaction Ti participating in schedule S, create a node labeled Ti in the precedence graph.
- For each case in S, where Tj executes a read(X) after Ti executes a write (x), create an edge To -> Tj in precedence graph.
- For each case in S, where Tj executes a write(X) after Ti executes a read(x) create an edge Ti -> Tj in precedence graph.
- For each case in S, where Tj executes a write(X) after Ti executes a write(X), create an edge Ti -> Tj in precedence graph.
- The schedule S is conflict serializable if & only if precedence graph has no cycle.
HINT:- (To draw edge)
- Two instructions must be from different transactions.
- One of the instructions must be write instructions.
- The transaction whoever performing 1st instructions, draw edge from that transaction to other.
1) T1 T2 Precedence Graph
No edge. One of the instructions must be written.
2) T1 T2
T1 performs 1st instruction
3) T1 T2
T2 performs 1st instruction
4) T1 T2
T2 performs 1st instruction
5) T1 T2
T1 performs 1st instruction
6) T1 T2
T2 performs 1st instruction.
Eg. Precedence graph for S6 :-
Consider schedule S9 as:-
This graph has no cycles so it is conflict serializable. So it must have some conflict equivalent schedule. The equivalent serial schedule can be found by using Topological Sorting.
Steps in Topological Sorting:-
- Remove the node which is not having any incoming edge. T4 is the target node to remove because it is not having incoming edge.
2. T5 is not having incoming edges. So remove T5 from graph.
3. Now, remove T6 from graph.
The sequence of node removal is :-
This is nothing but equivalent serial schedule executed in the T4 -> T5 -> T6
View Equivalence & View serializability: -
- Two schedules S & S’ are said to be view equivalent if the following three conditions hold:
- The same set of transactions participates in S & S’ and S & S’ include the same operations of those transactions.
- For any operation Ri(X) of Ti in S, if the value of X has been written by an operation Wj(X) of Tj (or if it is the original value of X before the schedule started), the same condition must hold in S’
- If the operation Wκ(Y) of Tκ is the last operation to write item Y in S, then Wκ(Y) of Tκ must also be the last operation to write item Y in S’
2. The idea behind view equivalence is that, as long as each read operation of a transaction reads the result of same write operation in both schedules, the write operations of each transaction must produce the same results.
3. The read operations are hence said to see the same view in both schedules.
4. A schedules S is said to be view serializable if it is view equivalent to a serial schedule.
Eg.
- Once a transaction T is committed, it should never be necessary to Roll back T.
- The schedules that theoretically meet this criterion are called recoverable schedules & those do not are called non-recoverable schedules.
- Recoverable schedule is one where, for each pair of transactions Ti & Tj, such that Tj reads a data item previously written by Ti, the commit operation of Ti appears before the commit operation of Tj.
Eg.
T2 reads the value of X, which has been written by T1. T1 commits & then T2. As per definition, commit of T1 must appear before commit of T2. So, the schedule is called as a recoverable schedule.
References:
1. “Principles of Database and Knowledge – Base Systems”, Vol 1 by J. D. Ullman, Computer Science Press.
2. “Fundamentals of Database Systems”, 5th Edition by R. Elmasri and S. Navathe, Pearson Education
3. “Foundations of Databases”, Reprint by Serge Abiteboul, Richard Hull, Victor Vianu, Addison-Wesley