Unit 5 | DBMS Notes | AKTU Notes


Concurrency Control Techniques

    Concurrency Control

    Definition:  

    In a database, Concurrency Control is a process that manages how multiple users or processes access and modify the database at the same time without causing errors or inconsistencies.

    Why is Concurrency Control Important?

    When multiple users or programs try to perform operations (like reading or updating data) at the same time, problems can occur:

    1. Lost Updates: Two users update the same data simultaneously, and one update overwrites the other.  

       - Example: User A and User B both edit the price of a product at the same time. User A's update gets overwritten by User B's.

    2. Dirty Reads: One user reads uncommitted changes made by another, which might be rolled back later.  

       - Example: User A sees temporary changes made by User B, but User B cancels their transaction.

    3. Inconsistent Data: Multiple users access and modify related data, leading to inconsistent results.  

       - Example: A bank updates account balances for a transfer, but one balance is updated while the other is not.

    Objectives of Concurrency Control:

    1. Ensure data consistency even when multiple users access the database.  

    2. Maintain isolation between concurrent transactions so they don't interfere with each other.  

    3. Prevent conflicts like lost updates, dirty reads, and data inconsistencies.


    Locking Techniques for Concurrency Control

    Definition:  

    Locking is a technique used in concurrency control to ensure that only one transaction can access or modify data at a time. Locks act like "reserved signs" that prevent others from using the same data until the transaction is complete.

    Types of Locks:

    1. Shared Lock (Read Lock):

       - Multiple users can read the data simultaneously, but no one can modify it.  

       - Example: If User A has a shared lock on a product’s price, User B can also read it, but neither can change it.

    2. Exclusive Lock (Write Lock):

       - Only one user can read or modify the data. Others must wait.  

       - Example: If User A has an exclusive lock on a product’s price, User B cannot read or update it until User A finishes.
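
    The behaviour of shared and exclusive locks can be sketched with a small, in-memory lock table. This is a minimal illustration only; the class and method names below are invented for the example and do not correspond to any real DBMS API.

```python
# Minimal sketch of a shared/exclusive lock table (illustrative only).
# Assumption: single-threaded demo; a real DBMS would add queuing and blocking.

class LockTable:
    def __init__(self):
        # item -> {"mode": "S" or "X", "holders": set of transaction ids}
        self.locks = {}

    def acquire(self, txn, item, mode):
        """Try to acquire a lock; return True if granted, False if the txn must wait."""
        entry = self.locks.get(item)
        if entry is None:
            self.locks[item] = {"mode": mode, "holders": {txn}}
            return True
        # Shared locks are compatible only with other shared locks.
        if mode == "S" and entry["mode"] == "S":
            entry["holders"].add(txn)
            return True
        return False  # any conflict involving an X lock must wait

    def release(self, txn, item):
        entry = self.locks.get(item)
        if entry and txn in entry["holders"]:
            entry["holders"].discard(txn)
            if not entry["holders"]:
                del self.locks[item]


lt = LockTable()
print(lt.acquire("A", "price", "S"))  # True  -> User A reads the price
print(lt.acquire("B", "price", "S"))  # True  -> User B may also read it
print(lt.acquire("C", "price", "X"))  # False -> User C must wait to update it
```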

    Locking Levels:

    1. Row-Level Lock:

       - Locks a specific row in a table.  

       - Advantage: Allows higher concurrency, since only the affected row is locked and other transactions can still use the rest of the table.  

       - Example: Locking only one student’s record in a marks table.

    2. Table-Level Lock:

       - Locks the entire table.  

       - Advantage: Simple to implement.  

       - Disadvantage: Slower because it blocks all other users.  

       - Example: Locking the whole marks table while updating multiple records.

    3. Database-Level Lock:

       - Locks the entire database.  

       - Advantage: Ensures consistency for large operations.  

       - Disadvantage: Slows down other users.  

       - Example: Locking the database during backup.

    Locking Protocols:

    To avoid conflicts, databases use protocols to manage locks:

    1. Two-Phase Locking (2PL):

       - Divided into two phases:

         - Growing Phase: The transaction may acquire locks but cannot release any.  

         - Shrinking Phase: The transaction may release locks but cannot acquire new ones.  

       - Advantage: Guarantees serializability, so concurrent transactions produce the same result as some serial (one-after-another) execution.

    2. Deadlock Handling in Locking:

       - Deadlock: A situation where two or more transactions are waiting for each other to release locks, and no one can proceed.

       - Solution:  

         - Timeout: Abort transactions that wait too long.  

         - Deadlock Detection: Identify and resolve deadlocks by aborting one of the transactions.
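
    The two-phase locking discipline and deadlock detection described above can be sketched together. The comment block shows the order of operations 2PL imposes, and the function below detects a cycle in a wait-for graph; both the graph representation and the function name are assumptions made for this illustration.

```python
# Illustrative sketch: two-phase locking order plus wait-for-graph deadlock detection.

# Two-phase locking, shown as the order of operations inside one transaction:
#   growing phase   -> lock(A), lock(B)       (acquire only, release nothing)
#   work            -> read/write A and B
#   shrinking phase -> unlock(A), unlock(B)   (release only, acquire nothing)

def has_deadlock(wait_for):
    """Detect a cycle in a wait-for graph {txn: set of txns it is waiting on}."""
    visited, in_stack = set(), set()

    def dfs(node):
        visited.add(node)
        in_stack.add(node)
        for nxt in wait_for.get(node, set()):
            if nxt in in_stack:          # back edge -> cycle -> deadlock
                return True
            if nxt not in visited and dfs(nxt):
                return True
        in_stack.discard(node)
        return False

    return any(dfs(t) for t in wait_for if t not in visited)


# T1 waits for T2's lock and T2 waits for T1's lock: classic deadlock.
print(has_deadlock({"T1": {"T2"}, "T2": {"T1"}}))   # True
print(has_deadlock({"T1": {"T2"}, "T2": set()}))    # False
```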

    Examples of Locking in Action:

    - Bank Transactions: When transferring money, the system locks both accounts to ensure one balance is deducted while the other is credited without interference.

    - Online Shopping: When one user is checking out a product, the system locks the inventory count to prevent others from buying the same item until the transaction is complete.

    Conclusion:

    - Concurrency Control prevents problems like lost updates, dirty reads, and inconsistent data when multiple users access the database simultaneously.  

    - Locking Techniques ensure data consistency by allowing only one transaction to access or modify data at a time.


    Time Stamping Protocols for Concurrency Control

    Definition:  

    Time stamping protocols ensure that transactions are executed in a specific order based on their timestamps. A timestamp is a unique number assigned to each transaction when it begins, representing the order of its arrival.

    How It Works:

    1. Assigning Timestamps:  

       - Every transaction gets a timestamp when it starts.  

       - The system uses the timestamp to decide the order of operations.

    2. Rules for Execution:  

       - A transaction with an earlier timestamp gets priority over transactions with later timestamps.

       - Operations (like reading or writing data) are checked to ensure they follow this order.

    Two Types of Time Stamping:

    1. Read Timestamp (RTS):  

       - Records the timestamp of the most recent (youngest) transaction that read the data item.

    2. Write Timestamp (WTS):  

       - Records the timestamp of the most recent (youngest) transaction that wrote the data item.

    Example:

    - Transaction T1 starts at 10:00 AM and gets timestamp 10.  

    - Transaction T2 starts at 10:05 AM and gets timestamp 15.  

    If both transactions want to modify the same data:  

    - The conflicting operations must be executed in timestamp order, so T1's update is applied before T2's.  

    - If an operation arrives out of order (for example, T1 tries to write the item after T2 has already read or written it), the offending transaction is rolled back and restarted with a new timestamp.
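
    The checks performed by the basic timestamp-ordering protocol can be sketched as follows, using the RTS/WTS values above and the timestamps from the example. The item representation and function names are assumptions for this illustration; rollback and restart are only indicated by the returned string.

```python
# Basic timestamp-ordering checks (illustrative sketch).
# Each data item keeps RTS (largest read timestamp) and WTS (largest write timestamp).

item = {"value": 100, "rts": 0, "wts": 0}

def read(ts, item):
    """Allow the read only if no younger transaction has already written the item."""
    if ts < item["wts"]:
        return "abort"            # reading a value 'from the future' -> roll back
    item["rts"] = max(item["rts"], ts)
    return item["value"]

def write(ts, item, value):
    """Allow the write only if no younger transaction has read or written the item."""
    if ts < item["rts"] or ts < item["wts"]:
        return "abort"            # the write arrived too late -> roll back
    item["value"], item["wts"] = value, ts
    return "ok"

# T1 (timestamp 10) and T2 (timestamp 15) touch the same item:
print(write(10, item, 500))   # ok    -> T1 writes first
print(write(15, item, 600))   # ok    -> T2's later write is also in order
print(read(10, item))         # abort -> T1 would read a value written by the younger T2
```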

    Advantages:

    - No need for locks, so transactions don’t block each other.  

    - Ensures transactions execute in a consistent order.

    Disadvantages:

    - May lead to transaction aborts when an operation arrives out of timestamp order.  

    - Can result in more rollbacks compared to locking methods.


    Validation-Based Protocol

    Definition:  

    A Validation-Based Protocol (also known as optimistic concurrency control) executes each transaction in three phases. Before committing a transaction, the system checks (validates) whether it conflicts with other transactions.

    How It Works:

    1. Three Phases of Execution:

       - Phase 1: Read Phase 

         The transaction reads data from the database and stores it temporarily.  

         Changes are made only in the transaction's local memory, not the database.

       - Phase 2: Validation Phase  

         Before committing, the system checks if applying the transaction's changes will conflict with other transactions.  

         If there’s no conflict, the transaction is allowed to proceed. Otherwise, it is aborted.

       - Phase 3: Write Phase  

         If validation is successful, the transaction writes the changes to the database.

    Example:

    - Transaction T1 reads a product price and calculates a discount.  

    - Transaction T2 modifies the same product price.  

    - In the validation phase:

      - If T1’s calculations depend on outdated data, T1 will be aborted to prevent errors.
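
    A minimal sketch of the validation phase for this example, assuming each transaction records the set of items it read and the system remembers the write sets of transactions that committed while it was running. The names and data structures are invented for illustration.

```python
# Illustrative sketch of validation (optimistic concurrency control).

def validate(read_set, write_sets_of_overlapping_committed):
    """Pass validation only if nothing this transaction read was overwritten
    by a transaction that committed while it was running."""
    for other_write_set in write_sets_of_overlapping_committed:
        if read_set & other_write_set:
            return False          # conflict -> abort and restart
    return True

# T1 read the product price; T2 committed a new price while T1 was still running.
t1_reads = {"product_price"}
committed_writes = [{"product_price"}]      # T2's write set

if validate(t1_reads, committed_writes):
    print("T1 commits: the write phase applies its changes")
else:
    print("T1 aborts: its calculation used outdated data")
```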

    Advantages:

    - Reduces conflicts during execution since transactions don't interfere with each other in the read phase.  

    - No need for locks, so there’s less waiting.

    Disadvantages:

    - If conflicts are found during the validation phase, transactions may have to restart, leading to wasted work.  

    - Validation can be complex for large databases.

    Conclusion:

    - Time Stamping Protocols use timestamps to ensure transactions follow a strict order, avoiding conflicts but potentially causing rollbacks.  

    - Validation-Based Protocols check for conflicts before committing transactions, ensuring consistency but requiring extra work if validation fails.


    Time Stamping Protocol vs. Validation-Based Protocol

    Feature             | Time Stamping Protocol                       | Validation-Based Protocol
    Key Idea            | Orders transactions based on timestamps.     | Validates transactions before committing.
    Conflict Detection  | During the execution of the transaction.     | At the end of the transaction (before committing).
    Locking             | Does not use locks.                          | Does not use locks.
    Advantages          | No blocking; ensures order of execution.     | Avoids conflicts during execution.
    Disadvantages       | High chance of rollbacks if conflicts occur. | Validation can be complex; aborted transactions waste work.

    Multiple Granularity

    Definition:  

    In databases, Multiple Granularity is a technique that allows locks to be applied at different levels of the database hierarchy, such as at the level of a database, table, page, row, or even a single field. This helps improve performance by letting transactions lock only the part of the database they actually need.

    How It Works:

    - The database is divided into levels of granularity, starting from the largest (database) to the smallest (field).  

      - Example of levels:  

        1. Database  

        2. Table  

        3. Page (a small portion of a table)  

        4. Row (a single record)  

        5. Field (a specific value in a record)


    - Transactions can lock data at different levels based on their requirements:

      - A transaction updating an entire table locks the table level.  

      - A transaction accessing just one row locks the row level.

    Lock Types:

    1. Intention Locks: These are placed on coarser levels (such as a table) to signal the intention of acquiring finer-grained locks (such as row locks) lower in the hierarchy.

       - Intention Shared (IS): Signals that the transaction wants a shared lock at a finer level.

       - Intention Exclusive (IX): Signals that the transaction wants an exclusive lock at a finer level.

    2. Actual Locks:

       - Shared (S): Allows multiple transactions to read the data but not modify it.

       - Exclusive (X): Allows only one transaction to read or modify the data.

    Advantages:

    1. Flexibility: Locks can be applied at the appropriate level, reducing unnecessary blocking.  

    2. Improved Performance: Small transactions can lock only the data they need without locking the entire table.  

    Disadvantages:

    1. Complexity: Managing locks at multiple levels is more difficult.  

    2. Overhead: Using multiple locks may consume more system resources.

    Example:

    - A banking database has a customer table.  

      - If one transaction is summarizing all customer balances, it locks the table level.  

      - If another transaction is updating the balance of a single customer, it locks the row level.

    These locks ensure the two transactions don’t interfere with each other.
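
    The way these lock modes interact is usually summarised as a compatibility matrix. The sketch below is illustrative and omits the SIX (shared plus intention-exclusive) mode found in some textbooks; the function name is an assumption for this example.

```python
# Compatibility of multiple-granularity lock modes (illustrative sketch, SIX omitted).
# True  -> the two modes can be held on the same node by different transactions.

COMPATIBLE = {
    ("IS", "IS"): True,  ("IS", "IX"): True,  ("IS", "S"): True,  ("IS", "X"): False,
    ("IX", "IS"): True,  ("IX", "IX"): True,  ("IX", "S"): False, ("IX", "X"): False,
    ("S",  "IS"): True,  ("S",  "IX"): False, ("S",  "S"): True,  ("S",  "X"): False,
    ("X",  "IS"): False, ("X",  "IX"): False, ("X",  "S"): False, ("X",  "X"): False,
}

def can_grant(requested, held_modes):
    """A new lock is granted only if it is compatible with every lock already held."""
    return all(COMPATIBLE[(requested, held)] for held in held_modes)

# One transaction holds IX on the customer table (it will update individual rows);
# another asks for IS on the same table (it will read some rows): allowed.
print(can_grant("IS", ["IX"]))   # True
# A table-level S lock (summarising all balances) conflicts with the IX already held.
print(can_grant("S", ["IX"]))    # False
```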


    Multi-Version Schemes

    Definition:  

    In Multi-Version Schemes, the database keeps multiple versions of a data item to allow reading and writing to happen simultaneously without conflicts. This improves concurrency by letting readers access old versions while writers create new versions.

    How It Works:

    1. Versions of Data:

       - Each data item in the database has multiple versions, each with a timestamp to show when it was created.

    2. Reading:

       - Transactions reading data always see a consistent version based on the timestamp.  

       - They never wait for ongoing write transactions to finish.

    3. Writing:

       - When a transaction modifies data, it creates a new version with an updated timestamp.  

       - Other transactions can still access the old versions.

    Advantages:

    1. No Blocking: Readers don’t have to wait for writers, and writers don’t block readers.  

    2. Improved Concurrency: Multiple transactions can work on the same data without interference.  

    Disadvantages:

    1. Storage Overhead: Keeping multiple versions requires more storage space.  

    2. Complex Management: Managing and cleaning up old versions is complicated.

    Example:

    - In an online shopping app:

      - A customer views product details (read operation), seeing the price as $100 (old version).  

      - At the same time, the seller updates the price to $120 (write operation), creating a new version.  

      - The customer’s read transaction is unaffected by the update and continues to see $100.
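
    The shopping example can be sketched in a few lines: each item keeps a list of timestamped versions, and a reader sees the newest version written at or before its own timestamp. The version list and function name are assumptions made for this illustration.

```python
# Illustrative sketch of multi-version reads: each item keeps a list of versions.

# (write_timestamp, value) pairs for one product's price, oldest first.
versions = [(5, 100), (20, 120)]     # price was 100 at ts 5, updated to 120 at ts 20

def read_version(versions, reader_ts):
    """Return the newest version written at or before the reader's timestamp."""
    visible = [value for ts, value in versions if ts <= reader_ts]
    return visible[-1] if visible else None

print(read_version(versions, 10))   # 100 -> the customer who started at ts 10 still sees $100
print(read_version(versions, 25))   # 120 -> a later reader sees the seller's new price
```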

    Conclusion:

    - Multiple Granularity uses locks at different levels to improve flexibility and efficiency while controlling access to the database.  

    - Multi-Version Schemes store multiple versions of data to allow simultaneous reading and writing without conflicts, improving concurrency at the cost of additional storage and complexity.


    Multiple Granularity vs. Multi-Version Schemes

    Feature             | Multiple Granularity                                           | Multi-Version Schemes
    Key Idea            | Locks are applied at different levels (database, table, row). | Multiple versions of data are maintained for concurrency.
    Conflict Handling   | Locks prevent conflicts between transactions.                 | Readers and writers don’t block each other.
    Performance         | Improves performance by locking only the needed level.        | High concurrency due to non-blocking operations.
    Storage Requirement | No additional storage needed for locks.                        | Requires more storage to keep multiple versions.
    Complexity          | Complex to manage locks at various levels.                     | Complex to manage and clean up old versions.

    Recovery with Concurrent Transactions

    What It Means:  

    In a database, concurrent transactions are multiple transactions running at the same time. Sometimes, these transactions can fail due to reasons like system crashes, power failures, or deadlocks. Recovery with Concurrent Transactions ensures that the database remains accurate and consistent even when these failures occur.

    How It Works:

    1. Logs:

       - The database maintains a log file to track every transaction.  

       - Each operation is recorded, such as when a transaction starts, writes data, or commits changes.

    2. Rollback and Commit:

       - If a transaction fails, it is rolled back using the logs, undoing all its changes.  

       - If a transaction successfully completes, it is committed, and the changes are made permanent.

    3. Concurrency Control:

       - While recovering, the system ensures that no incomplete transaction affects other running transactions.

       - Locks and timestamps help manage conflicts during recovery.

    4. Checkpointing:

       - At regular intervals, the database takes a checkpoint.  

       - A checkpoint records the database's current state (and the list of active transactions), so recovery only needs to examine the log from the last checkpoint onward.
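
    A minimal sketch of undo/redo recovery from a log, under the simplifying assumptions that the whole log survives the crash and every write record stores both the old and the new value; checkpointing is omitted. The record format is invented for this illustration.

```python
# Illustrative sketch of log-based recovery after a crash (simplified undo/redo).
# Each log record is a tuple; a real DBMS log is far more detailed.

log = [
    ("start", "T1"),
    ("write", "T1", "A", 100, 150),   # item, old value, new value
    ("commit", "T1"),
    ("start", "T2"),
    ("write", "T2", "B", 200, 50),
    # crash happens here: T2 never committed
]

db = {"A": 100, "B": 50}              # state on disk at crash time

committed = {rec[1] for rec in log if rec[0] == "commit"}

# Redo phase: reapply the new values written by committed transactions.
for rec in log:
    if rec[0] == "write" and rec[1] in committed:
        _, txn, item, old, new = rec
        db[item] = new

# Undo phase: restore the old values written by uncommitted transactions (in reverse).
for rec in reversed(log):
    if rec[0] == "write" and rec[1] not in committed:
        _, txn, item, old, new = rec
        db[item] = old

print(db)   # {'A': 150, 'B': 200} -> T1's deposit kept, T2's partial transfer undone
```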

    Why It’s Important:

    - Ensures data consistency even with multiple transactions running simultaneously.  

    - Prevents partial or incorrect data from being stored.  

    - Maintains the integrity of the database.

    Example:

    Imagine a bank database where two transactions occur:  

    - Transaction A: Depositing money into an account.  

    - Transaction B: Transferring money to another account.  

    If the system crashes during these transactions:  

    - The database uses its log files to undo or complete the affected transactions.  

    - Other running transactions remain unaffected.


    Case Study of Oracle

    What is Oracle?

    Oracle is a popular relational database management system (RDBMS) used by organizations to store and manage data. It has advanced features for handling transactions, recovery, and concurrency.

    How Oracle Handles Recovery:

    1. Redo Logs:  

       - Oracle uses redo logs to record changes made by transactions.  

       - These logs are used to replay changes during recovery after a failure.

    2. Undo Tablespaces:

       - Oracle uses undo tablespaces to store old versions of data.  

       - If a transaction is rolled back, the undo data restores the database to its previous state.

    3. Automatic Recovery:

       - Oracle automatically detects crashes or failures.  

       - It uses checkpointing, redo logs, and undo tablespaces to recover the database without manual intervention.

    4. Concurrency Control:

       - Oracle allows multiple users to access the database simultaneously.  

       - It uses techniques like locks, multi-versioning, and timestamps to prevent conflicts between concurrent transactions.

    Key Features in Oracle for Recovery:

    1. ACID Properties:  

       - Oracle ensures Atomicity, Consistency, Isolation, and Durability for transactions.

    2. Flashback Technology:  

       - Oracle’s Flashback feature allows users to recover data from accidental deletion or modification without performing a full database restore.

    3. Distributed Databases:  

       - Oracle can recover and manage transactions even across multiple distributed databases.

    Why Oracle is Used:

    - Reliable and secure for critical applications like banking, e-commerce, and healthcare.  

    - Handles large amounts of data efficiently.  

    - Advanced tools for recovery, ensuring minimal downtime after failures.

    Example:

    - In an e-commerce database managed by Oracle:

      - A customer places an order, and the transaction modifies inventory and payment records.

      - If the system crashes, Oracle’s recovery system uses redo logs and undo tablespaces to restore the database and ensure the order is processed correctly.

    Conclusion:

    - Recovery with Concurrent Transactions ensures that databases remain consistent and recover quickly even during failures, handling multiple transactions simultaneously.  

    - Oracle provides advanced recovery mechanisms like redo logs, undo tablespaces, and flashback tools, making it a powerful solution for managing critical data systems.
