Transaction Processing Concept
Transaction System
Definition:
A Transaction System in a database is a way of ensuring that a series of actions (called a transaction) is executed as a single, complete unit. Either all actions in the transaction happen successfully, or none of them happens at all.
Key Properties (ACID):
A good transaction system follows the ACID properties:
1. Atomicity: A transaction is all-or-nothing. If any part fails, the entire transaction is rolled back.
- Example: If money is deducted from one account but not added to another, the entire process is canceled.
2. Consistency: The database stays in a valid state before and after the transaction.
- Example: After transferring money, the total amount in all accounts should remain the same.
3. Isolation: Transactions happening at the same time don’t interfere with each other.
- Example: If two people are booking the last train seat, only one transaction will succeed.
4. Durability: Once a transaction is completed, the changes are saved permanently, even if there is a system failure.
- Example: After transferring money, the new balances are stored securely.
Example:
1. A bank transaction to transfer money:
- Step 1: Deduct ₹500 from Account A.
- Step 2: Add ₹500 to Account B.
If Step 2 fails, Step 1 is also canceled to maintain consistency.
2. A shopping website order:
- Step 1: Check if the item is in stock.
- Step 2: Deduct payment from the customer.
- Step 3: Reduce the stock count in the database.
If any step fails, the entire transaction is rolled back.
Testing of Serializability
Definition:
Serializability is a way to check if multiple transactions running at the same time (concurrent transactions) produce the same result as if they were executed one after another (serially).
Why Is It Important?
When multiple users access the database simultaneously, their transactions should not interfere and lead to incorrect data. Testing for serializability ensures the database remains consistent.
Types of Serializability:
1. Conflict Serializability: Checks if transactions can be rearranged by swapping non-conflicting operations to achieve a serial order.
- Example: Two transactions:
- T1: Read data, then update data.
- T2: Read the same data.
If T1 and T2 don’t conflict, they can be swapped.
2. View Serializability: Ensures that the order of transactions maintains the same final result, even if the intermediate steps are different.
How to Test for Serializability:
The testing process involves creating a precedence graph:
1. Create a graph where:
- Each transaction is a node.
- Draw a directed edge between transactions if one depends on the other.
2. Check the graph:
- If there is a cycle (a loop), the schedule is not serializable.
- If there is no cycle, the schedule is serializable.
Example:
1. Two Transactions:
- T1: Write(X).
- T2: Read(X).
Schedule:
- T1 writes X → T2 reads X.
Create a graph:
- Node T1 → Node T2 (T2 depends on T1’s result).
No cycle means the schedule is serializable.
Conclusion:
- A Transaction System ensures reliability, consistency, and proper execution of operations using ACID properties.
- Testing Serializability verifies that concurrent transactions maintain the database's consistency as if they were executed one at a time.
Serializability of Schedules
Definition:
A schedule is the sequence in which multiple database transactions are executed. Serializability ensures that the outcome of executing transactions concurrently (at the same time) is the same as if they were executed one after the other (serially).
Why Is It Important?
- When many transactions are running at the same time, they may affect each other’s results. Serializability ensures the database remains consistent and produces correct results.
Example:
Transactions T1 and T2:
- T1: Reads X, Updates X
- T2: Reads X
Schedule 1 (Concurrent):
- T1: Read X
- T2: Read X
- T1: Update X
If the final result of this concurrent schedule is the same as running T1 and T2 one after another, it is serializable.
Conflict Serializable Schedule
Definition:
A schedule is Conflict Serializable if it can be rearranged (by swapping non-conflicting operations) to achieve the same result as a serial schedule.
What Is a Conflict?
A conflict occurs when two transactions access the same data and at least one of them is a write operation.
- Types of Conflicts:
1. Read-Write Conflict: One transaction reads data while another writes to it.
2. Write-Read Conflict: One transaction writes data while another reads it.
3. Write-Write Conflict: Two transactions write to the same data.
How to Test for Conflict Serializability:
- Create a precedence graph:
1. Each transaction is a node.
2. Draw a directed edge from one transaction to another if there is a conflict (e.g., T1 writes, and T2 reads the same data).
3. If the graph has a cycle, the schedule is not conflict serializable. If there’s no cycle, it is conflict serializable.
Example:
Transactions T1 and T2:
- T1: Read(X), Write(X)
- T2: Write(X)
Schedule:
- T1: Read(X)
- T2: Write(X)
- T1: Write(X)
Precedence Graph:
- T1 → T2 (T2 writes X after T1 reads X).
- T2 → T1 (T1 writes X after T2 writes X).
This graph has a cycle, so the schedule is not conflict serializable.
View Serializable Schedule
Definition:
A schedule is View Serializable if it produces the same final result as a serial schedule, even if the intermediate steps are different.
Difference Between Conflict and View Serializability:
- Conflict Serializability: Focuses on swapping non-conflicting operations to match a serial order.
- View Serializability: Focuses on the final output of the transactions, ignoring intermediate steps.
Steps to Check View Serializability:
A schedule is view serializable if:
1. Initial Reads Are the Same: Transactions read the same initial values as in a serial schedule.
2. Writes Are Consistent: The data written by transactions in the schedule matches what would be written in a serial schedule.
3. Final Writes Are the Same: The final result of the database is identical to that of a serial schedule.
Example:
Transactions T1 and T2:
- T1: Write(A)
- T2: Read(A), Write(A)
Schedule 1 (Concurrent):
- T1: Write(A)
- T2: Read(A)
- T2: Write(A)
This schedule produces the same final result as running T1 before T2. Hence, it is view serializable.
Conclusion:
- Serializability of Schedules ensures that concurrent transactions produce correct results like serial transactions.
- Conflict Serializable Schedules focus on rearranging operations to avoid conflicts.
- View Serializable Schedules ensure the final outcome matches a serial schedule, even if intermediate steps differ.
Recoverability
Definition:
In a database, recoverability ensures that if a transaction fails or is rolled back, the database can still maintain consistency by undoing or recovering affected changes.
Key Concepts in Recoverability:
1. Commit: A transaction is said to be committed when all its changes are permanently saved to the database.
2. Rollback: If a transaction encounters an error or fails, all its changes are undone to maintain consistency.
3. Dependency Between Transactions:
- Sometimes, one transaction depends on another. For example, if Transaction T2 reads data modified by Transaction T1, T1 must commit first for the database to remain consistent.
Types of Schedules Based on Recoverability:
1. Recoverable Schedule:
- A schedule where a transaction only commits if all the transactions it depends on have also committed.
- Example:
- T1: Write(A)
- T2: Read(A)
- T1 commits → T2 commits (Recoverable).
2. Non-Recoverable Schedule:
- A schedule where a transaction commits even if the transactions it depends on fail or roll back.
- Example:
- T1: Write(A)
- T2: Read(A) → T2 commits
- T1 fails (Not Recoverable).
3. Cascading Rollbacks:
- When one transaction’s failure causes a chain reaction, rolling back multiple dependent transactions.
- Example:
- T1: Write(A)
- T2: Read(A)
- T3: Read(A)
- If T1 fails, both T2 and T3 must roll back.
4. Cascadeless Schedule:
- A schedule where transactions do not read uncommitted changes from other transactions. This prevents cascading rollbacks.
Importance of Recoverability:
- Ensures the database remains consistent and reliable, even during transaction failures.
Recovery from Transaction Failures
Definition:
When a transaction fails (due to system crashes, power failures, or logical errors), the database must recover to a consistent state. This is achieved by undoing incomplete transactions or redoing committed ones.
Types of Failures:
1. System Failure: Hardware or software crashes that stop transactions abruptly.
2. Transaction Failure: Errors like invalid input or deadlocks that stop a single transaction.
3. Disk Failure: Physical damage to the storage where the database is stored.
Techniques for Recovery:
1. Log-Based Recovery:
- A log is a record of all changes made by transactions. It helps in undoing or redoing changes during recovery.
- Steps:
- Undo: Revert changes of incomplete transactions using the log.
- Redo: Reapply changes of committed transactions to ensure they are not lost.
2. Deferred Update:
- Changes made by a transaction are stored in a temporary area and applied to the database only after the transaction commits.
- If the transaction fails, no changes are made to the database.
3. Immediate Update:
- Changes are applied to the database immediately but are logged so they can be undone if the transaction fails.
4. Checkpoints:
- A checkpoint is a saved state of the database at a particular point in time. During recovery, the system starts from the last checkpoint, reducing the recovery time.
Example of Recovery:
1. Transaction T1:
- Step 1: Write(A).
- Step 2: Write(B).
- System crashes before T1 commits.
Recovery Process:
- Undo: Remove the changes made to A and B because T1 didn’t commit.
2. Transaction T2:
- Step 1: Write(C).
- T2 commits.
- System crashes.
Recovery Process:
- Redo: Apply the changes to C because T2 had committed.
Importance of Recovery:
- Prevents data loss during failures.
- Maintains the reliability and consistency of the database.
Conclusion:
- Recoverability ensures the database can handle failures without losing consistency.
- Recovery from Transaction Failures involves techniques like logging, checkpoints, and undo/redo to restore the database to a consistent state.
Log-Based Recovery
Definition:
In a database, a log is a record that keeps track of all the actions performed by transactions. Log-Based Recovery is a process that uses these logs to recover the database to a consistent state after a failure.
Key Concepts:
1. Logs:
- Logs store information like:
- Transaction ID
- Data item being modified
- Old value (before change)
- New value (after change)
2. Undo and Redo:
- Undo: If a transaction failed or was incomplete, reverse its changes.
- Redo: If a transaction was committed before the failure, apply its changes again to ensure they are not lost.
Types of Log-Based Recovery:
1. Deferred Update:
- Changes are recorded in the log but applied to the database only after the transaction commits.
- Advantage: No need to undo changes since uncommitted changes are not written to the database.
2. Immediate Update:
- Changes are written to the database immediately and recorded in the log.
- Requires both undo and redo operations during recovery.
Example:
- Transaction T1:
1. Log: "T1 starts."
2. Log: "T1 changes A from 10 to 20."
3. Log: "T1 commits.
- If the system crashes after step 2:
- Undo changes (restore A to 10).
- If the system crashes after step 3:
- Redo changes (ensure A is 20).
Why is Log-Based Recovery Important?
- Ensures no data is lost during a crash.
- Maintains the reliability and consistency of the database.
Checkpoints
Definition:
A checkpoint is a saved state of the database at a specific point in time. It helps reduce the time and effort required for recovery by providing a point to start from instead of scanning all previous logs.
How Checkpoints Work:
1. Periodically, the system saves a snapshot of the database and records it in the log.
2. During recovery, the system starts from the last checkpoint instead of the beginning of the log.
Example:
- Without Checkpoints:
If a system crash occurs, the recovery process must check all transactions from the start of the log.
- With Checkpoints:
If a checkpoint is created every 10 minutes, recovery only checks logs after the last checkpoint, saving time.
Why Are Checkpoints Important?
- Speeds up recovery after a failure.
- Reduces the number of transactions to analyze.
Deadlock Handling
Definition:
A deadlock occurs when two or more transactions wait for each other to release resources, and none of them can proceed.
Example of a Deadlock:
- Transaction T1 locks data item A and needs data item B.
- Transaction T2 locks data item B and needs data item A.
- T1 waits for T2, and T2 waits for T1. Both are stuck, causing a deadlock.
How to Handle Deadlocks:
1. Prevention:
- Avoid deadlocks by controlling how resources are allocated.
- Example: Transactions must lock all required resources at the start.
2. Detection and Resolution:
- The system periodically checks for deadlocks.
- If a deadlock is found, one transaction is aborted (rolled back) to break the cycle.
3. Timeout:
- If a transaction waits too long, it is rolled back and restarted.
Why Is Deadlock Handling Important?
- Prevents transactions from getting stuck forever.
- Ensures smooth execution of concurrent transactions.
Conclusion:
- Log-Based Recovery uses logs to undo or redo transactions after a failure.
- Checkpoints save the database state periodically to speed up recovery.
- Deadlock Handling ensures that transactions don’t block each other permanently, maintaining smooth operation of the database.
No comments:
Post a Comment