Unit 5 | unit 5 concurrency control techniques

DBMS

Unit - 5

Concurrency Control Techniques

Q1) What is concurrency control?

A1) Concurrency Control in a Database Management System is a method of controlling multiple actions at the same time without their interfering. It ensures that Database transactions are completed in a timely and accurate manner to deliver accurate results without jeopardizing the database's data integrity.

When all users are doing is reading data, concurrent access is simple. There's no way they'll get in each other's way. Concurrency is an issue in any practical database because it will include a mix of READ and WRITE activities.

Concurrency control in DBMSs is used to resolve such conflicts, which are most common in multi-user systems. As a result, Concurrency Control is the most crucial component for a Database Management System's correct operation when two or more database transactions demand access to the same data at the same time.

Example

Assume that two customers visit electronic kiosks at the same time to purchase a movie ticket for the same film and show time.

However, there is just one seat left in that particular theatre for the movie screening. It's feasible that both moviegoers will buy a ticket if the DBMS doesn't provide concurrency control. However, the concurrency control approach prevents this from happening. Both moviegoers have access to the data stored in the cinema seating database. Concurrency control, on the other hand, only gives a ticket to the first buyer who completes the transaction process.

Q2) Explain locking technique for concurrency control?

A2) Lock-based protocols in database systems use a method that prevents any transaction from reading or writing data until it obtains an appropriate lock on it. There are two types of locks.

Binary Locks - A data item's lock can be in one of two states: locked or unlocked.

Shared/exclusive - This sort of locking mechanism distinguishes the locks according to their intended usage. It is an exclusive lock when a lock is obtained on a data item in order to perform a write operation. Allowing many transactions to write to the same data item would cause the database to become incompatible. Because no data value is changed, read locks are shared.

There are four different types of lock procedures.

Simplistic Lock Protocol

Transactions can obtain a lock on every object before performing a 'write' operation using simple lock-based protocols. After performing the ‘write' procedure, transactions may unlock the data item.

Pre-claiming Lock Protocol

Pre-claiming protocols assess their activities and generate a list of data items for which locks are required. Before beginning an execution, the transaction requests all of the locks it will require from the system. If all of the locks are granted, the transaction runs and then releases all of the locks when it's finished. If none of the locks are granted, the transaction is aborted and the user must wait until all of the locks are granted.

Fig 1: Pre claiming lock protocol

Two-Phase Locking 2PL

This locking protocol divides a transaction's execution phase into three segments. When the transaction first starts, it asks for permission to obtain the locks it requires. The transaction then obtains all of the locks in the second section. The third phase begins as soon as the transaction's first lock is released. The transaction can't demand any additional locks at this point; it can only release the ones it already possesses.

Fig 2: Two phase locking

Two-phase locking has two phases: one is growing, in which the transaction acquires all of the locks, and the other is shrinking, in which the transaction releases the locks it has gained.

A transaction must first obtain a shared (read) lock and then upgrade it to an exclusive lock in order to claim an exclusive (write) lock.

Strict Two-Phase Locking

Strict-2PL's initial phase is identical to 2PL's. The transaction continues to run normally after gaining all of the locks in the first phase. However, unlike 2PL, Strict-2PL does not release a lock after it has been used. Strict-2PL keeps all locks until the commit point, then releases them all at once.

Fig 3: Strict two phase locking

Q3) What are time stamping protocols?

A3) The timestamp-based protocol is the most widely used concurrency protocol. As a timestamp, this protocol employs either system time or a logical counter.

At the moment of execution, lock-based protocols regulate the order between conflicting pairings among transactions, whereas timestamp-based protocols begin working as soon as a transaction is initiated.

Every transaction has a timestamp, and the order of the transactions is controlled by the transaction's age. All subsequent transactions would be older than a transaction started at 0002 clock time. For example, any transaction 'y' entering the system at 0004 is two seconds younger than the older one, and the older one will be prioritized.

In addition, the most recent read and write timestamps are assigned to each data item. This tells the system when the last ‘read and write' operation on the data item was done.

Timestamp ordering protocol

The Timestamp Ordering Protocol is used to place transactions in chronological order based on their timestamps. The transaction order is just the ascending sequence of transaction creation.

Because the older transaction has a greater priority, it runs first. This protocol employs system time or a logical counter to determine the transaction's timestamp.

At execution time, the lock-based protocol is utilized to govern the order between conflicting pairs of transactions. However, protocols based on timestamps begin working as soon as a transaction is created.

Assume T1 and T2 are two separate transactions. Assume that transaction T1 has been processed 007 times and transaction T2 has been processed 009 times. T1 has a higher priority, thus it runs first because it was the first to enter the system.

The timestamp ordering protocol also saves the timestamp of the most recent data 'read' and 'write' operations.

Timestamp ordering protocol works as follows −

● If a transaction Ti issues a read(X) operation −

○ If TS(Ti) < W-timestamp(X)

■ Operation rejected.

○ If TS(Ti) >= W-timestamp(X)

■ Operation executed.

○ All data-item timestamps updated.

● If a transaction Ti issues a write(X) operation −

○ If TS(Ti) < R-timestamp(X)

■ Operation rejected.

○ If TS(Ti) < W-timestamp(X)

■ Operation rejected and Ti rolled back.

○ Otherwise, operation executed.

Q4) Describe a validation based protocol?

A4) Optimistic concurrency control technique is another name for the validation phase. The transaction is carried out in three phases in the validation-based protocol:

Read phase -

The transaction T is read and executed in this step. It is used to read the value of various data elements and store them in local variables that are temporary. It is capable of performing all write operations on temporary variables without affecting the database.

Validation phase -

The temporary variable value will be checked against the actual data to see if it violates serializability in this phase.

Write phase -

If the transaction's validation is successful, the temporary results are saved to the database or system; otherwise, the transaction is rolled back.

The following are the different timestamps for each phase:

Start(Ti) - It contains the start time of Ti's execution.

Validation (Ti) - It specifies when Ti completes its read phase and begins its validation phase.

Finish (Ti) - It records the time Ti completes its write phase.

● This protocol is used to determine the transaction's time stamp for serialization by using the time stamp of the validation phase, as this is the phase that determines whether the transaction will commit or rollback.

● Hence TS(T) = validation(T).

● During the validation procedure, the serializability is determined. It's impossible to know ahead of time.

● It ensures a higher level of concurrency and fewer conflicts while processing the transaction.

● As a result, it contains transactions with fewer rollbacks.

Q5) What is multiple granularity?

A5) It is defined as dividing the database into chunks that can be locked in a hierarchical manner.

Multiple Granularity improves concurrency while lowering lock overhead.

It keeps track of what needs to be locked and how it should be locked.

It makes it simple to choose whether to lock or unlock a data item. A tree can be used to graphically display this type of organization.

Example

Take a look at a tree with four tiers of nodes.

● The full database is displayed at the first level or higher level.

● A node of type area is represented at the second level. These are the areas that make up the higher-level database.

● The region is made up of files, which are child nodes. No file can exist in more than one location.

● Finally, each file contains records, which are child nodes. The file contains only the records that belong to its child nodes. There are no records in more than one file.

As a result, starting at the top, the tree's tiers are as follows:

● Database

● Area

● File

● Record

Fig 4: Multiple granularity hierarchy

The top level in this example displays the full database. File, record, and fields are the levels below.

There are three more lock modes with different levels of granularity:

Intention Mode Lock

Intention-shared (IS): At a lower level of the tree, it has explicit locking, but only with shared locks.

Intention-Exclusive (IX): It has explicit locking with exclusive or shared locks at a lower level.

Shared & Intention-Exclusive (SIX): The node is locked in shared mode in this lock, while another node is locked in exclusive mode by the same transaction.

Compatibility Matrix with Intention Lock Modes: The compatibility matrix for these lock modes is shown in the table below:

To assure serializability, it employs intention lock modes. It stipulates that if a transaction tries to lock a node, the node must adhere to the following protocols:

● The lock-compatibility matrix should be followed by transaction T1.

● T1 is the first transaction to lock the tree's root. It has the ability to lock it in any mode.

● If T1 currently has the parent of the node locked in either IX or IS mode, T1 will only lock the node in S or IS mode.

● T1 will lock a node in X, SIX, or IX mode only if the parent of the node is currently locked in IX or SIX mode.

● T1 can lock a node if T1 has not previously unlocked any nodes.

● Transaction T1 will unlock a node if none of the descendants of the node are currently locked.

The locks are obtained in top-down order in multiple-granularity, and they must be released in bottom-up order.

● If transaction T1 reads record Ra9 from file Fa, the database, region A1, and file Fa must be locked in IX mode. Finally, Ra2 must be locked in S mode.

● After locking the database, region A1, and file Fa in IX mode, transaction T2 can modify record Ra9 in file Fa. Finally, the Ra9 must be locked in X mode.

● If transaction T3 reads all of the records from file Fa, it must lock the database and put region A into IS mode. Finally, Fa must be locked in S mode.

● When transaction T4 reads the full database, the database must be locked in S mode.

Q6) Write about a multiversion scheme?

A6) When a data item is modified, other protocols for concurrency control preserve the old values. Because many versions (values) of an object are kept, this is referred to as multiversion concurrency control. When a transaction requires access to an item, an appropriate version is selected to preserve the serializability of the currently running schedule, if at all possible.

The concept is that by reading an older version of the item, some read operations that would be refused by other techniques can still be allowed, preserving serializability. When a transaction publishes an item, it creates a new version while keeping the old version(s). Rather than conflict serializability, some multiversion concurrency control techniques use the idea of view serializability.

Multiversion approaches have the obvious disadvantage of requiring extra storage to keep several versions of database items. Older versions, on the other hand, may need to be kept for various reasons, such as recovery. Furthermore, some database systems necessitate the preservation of prior versions in order to keep a history of data item value evolution. A temporal database, on the other hand, keeps track of all changes and the times at which they occurred. Multiversion approaches have no additional storage penalty in such instances.

There have been several multiversion concurrency control techniques proposed. Here, we'll look at two schemes: one based on timestamp ordering and the other on 2PL. In addition, the concurrency control mechanism for validation keeps track of various versions.

Multiversion Technique Based on Timestamp Ordering

Each data item X is preserved in several versions X1, X2,..., Xk using this manner. The value of version Xi, as well as the following two timestamps, are preserved for each version:

Read_TS(Xi) - The read timestamp of X is the most recent of all the timestamps of transactions that have read version Xi successfully.

Write_TS(Xi) - The timestamp of the transaction that wrote the value of version Xi is the write timestamp of Xi.

When a transaction T is granted permission to do a write item(X) operation, a new version Xk+1 of item X is produced, with both the write TS(Xk+1) and read TS(Xk+1) set to TS (T). When a transaction T is permitted to read the value of version Xi, read TS(Xi) is set to the bigger of the current read TS(Xi) and TS values (T).

Multiversion 2 phase locking:

Each successful write creates a new version of the data item that was written. Versions are identified by timestamps. Select an appropriate version of X based on the transaction's timestamp when a read(X) operation is issued.

Q7) What do you mean by case study of oracle?

A7) Oracle's adoption of internet technologies gained hold in the early years of the new millennium, even as the so-called "internet economy" slowed. Despite a drop in enterprise IT spending during this time, Oracle continued to innovate and achieve results. Major technical advances for Oracle highlighted the era, some of which were the product of millions of dollars and years of research and development and would set the company's future trajectory. Oracle Real Application Clusters, Oracle E-Business Suite, Oracle Grid Computing, enterprise Linux support, and Oracle Fusion are all part of Oracle's 30-year commitment to innovation and leadership.

Current performance

● The corporation has lowered costs and saved $ 1 billion in comparison to the previous year's budget.

● Due to decreased consulting services growth rates, revenues climbed by 15%, which was lower than previous years.

● Sales and marketing costs account for 26% of total operating costs.

● Expenses for research and development increased by 20%.

● As earnings per share climbed, the P/E ratio decreased.

● Liquidity is adequate.

● It is funded by equity rather than debt, which is more expensive.

● The return on investment has multiplied by 2.82.

Analysis of Strategic Factor

Strength

● Oracle is the world's number one integrated suite of enterprise applications, compatible with any operating system.

● Has become the gold standard for database technology and applications in businesses all around the world, from multinational organizations to the corner coffee shop.

● It had attracted legions of developers with its aggressive marketing initiatives.

● Its sales from subsidiaries and agents around the world account for over half of its total revenue.

● The amount of money made from online sales will skyrocket.

● In all respects, there is a high level of professionalism.

Weaknesses

● The cost of an Oracle varies per country.

● Liabilities must cover investments more than equity because it is less expensive.

● The CEO's eagerness in speaking to competitors could result in legal issues.

● Some staff may be demotivated by the CEO's harsh attitude toward them.

Opportunity

● Expand your market share by developing new products and services.

● The IT industry is booming all over the world.

● The enormous success of the E-business suite, as well as the company's strong financial position, will attract a large number of investors and venture capital funds.

Threats

“Microsoft will always be the biggest threat because of the great efforts it employs on R&D and dominating power in the market.”

Strategic Alternatives:

Concentric Diversification: Acquire companies in comparable industries to ensure market share and reduce competitive threats.

Recommended Strategy:

● Because of the fierce competition, growth strategy is the most effective short- and long-term approach.

● By incorporating new applications into the company's information system, the retrenchment strategy will be maintained and reinforced.

● Oracle's quality will be maintained by continuing to invest in research and development.

Implementation

● When the e-commerce suite is more widely used, employee costs will fall.

● When the e-commerce suite is more widely used, IT expenditures will fall.

● Because of its extensive experience and dedicated efforts, R&D will develop innovative goods and services.

Evaluation and Control

“Oracle is and owns the #1 integrated suite of enterprise applications for evaluation and control. It has applied it internally with great success and the result was high efficiency and effectiveness”

Q8) Write the characteristics of good concurrency protocol?

A8) An ideal concurrency control DBMS mechanism has the following objectives:

● Must be resilient to site and communication failures.

● It allows the parallel execution of transactions to achieve maximum concurrency.

● Its storage mechanisms and computational methods should be modest to minimize overhead.

● It must enforce some constraints on the structure of atomic actions of transactions.

Q9) Why do we use the concurrency method?

A9) Reasons for using Concurrency control method is DBMS:

● To apply Isolation through mutual exclusion between conflicting transactions

● To resolve read-write and write-write conflict issues

● To preserve database consistency through constantly preserving execution obstructions

● The system needs to control the interaction among the concurrent transactions. This control is achieved using concurrent-control schemes.

● Concurrency control helps to ensure serializability

Q10) Write the problem of concurrency?

A10) Here, are some issues which you will likely to face while using the DBMS Concurrency Control method:

● Lost Updates occur when multiple transactions select the same row and update the row based on the value selected

● Uncommitted dependency issues occur when the second transaction selects a row which is updated by another transaction (dirty read)

● Non-Repeatable Read occurs when a second transaction is trying to access the same row several times and reads different data each time.

● Incorrect Summary issue occurs when one transaction takes summary over the value of all the instances of a repeated data-item, and second transaction update few instances of that specific data-item. In that situation, the resulting summary does not reflect a correct result.

Q11) What are the advantages and disadvantages of timestamp protocol?

A11) Advantages of timestamp ordering protocol

1) Conflicting operations are processed in timestamp order and therefore it ensures conflict serializability.

2) Since timestamps do not use locks, deadlocks cannot occur.

Disadvantages of timestamp ordering protocol

1) Starvation may occur if a transaction is continually aborted and restarted.

2) It does not ensure recoverable schedules.

Q12) Write the pros and cons of multiversion timestamp ordering?

A12) Advantages

1) The read request never fails and is never made to wait.

Disadvantages

1) It requires more storage to maintain multiple versions of the data item.

2) Reading of a data item also requires the update of the R - timestamp field, resulting in two potential disk accesses.

3) Conflicts between transactions are resolved through rollbacks, rather than through waits. This may be expensive.

Q13) Write the pros and cons of multiversion two phase locking?

A13) Advantages

1) Reads can proceed concurrently with a write operation but it is not permitted in standard two-phase locking.

2) It avoids cascading aborts, since transactions are only allowed to read the version that was written by a committed transaction.

Disadvantages

1) It requires more storage to maintain multiple versions of data items.

2) Dead locks may occur.

Unit - 5

Concurrency Control Techniques

Q1) What is concurrency control?

Example

Assume that two customers visit electronic kiosks at the same time to purchase a movie ticket for the same film and show time.

Q2) Explain locking technique for concurrency control?

A2) Lock-based protocols in database systems use a method that prevents any transaction from reading or writing data until it obtains an appropriate lock on it. There are two types of locks.

Binary Locks - A data item's lock can be in one of two states: locked or unlocked.

There are four different types of lock procedures.

Simplistic Lock Protocol

Pre-claiming Lock Protocol

Fig 1: Pre claiming lock protocol

Two-Phase Locking 2PL

Fig 2: Two phase locking

Two-phase locking has two phases: one is growing, in which the transaction acquires all of the locks, and the other is shrinking, in which the transaction releases the locks it has gained.

A transaction must first obtain a shared (read) lock and then upgrade it to an exclusive lock in order to claim an exclusive (write) lock.

Strict Two-Phase Locking

Fig 3: Strict two phase locking

Q3) What are time stamping protocols?

A3) The timestamp-based protocol is the most widely used concurrency protocol. As a timestamp, this protocol employs either system time or a logical counter.

In addition, the most recent read and write timestamps are assigned to each data item. This tells the system when the last ‘read and write' operation on the data item was done.

Timestamp ordering protocol

The Timestamp Ordering Protocol is used to place transactions in chronological order based on their timestamps. The transaction order is just the ascending sequence of transaction creation.

Because the older transaction has a greater priority, it runs first. This protocol employs system time or a logical counter to determine the transaction's timestamp.

The timestamp ordering protocol also saves the timestamp of the most recent data 'read' and 'write' operations.

Timestamp ordering protocol works as follows −

● If a transaction Ti issues a read(X) operation −

○ If TS(Ti) < W-timestamp(X)

■ Operation rejected.

○ If TS(Ti) >= W-timestamp(X)

■ Operation executed.

○ All data-item timestamps updated.

● If a transaction Ti issues a write(X) operation −

○ If TS(Ti) < R-timestamp(X)

■ Operation rejected.

○ If TS(Ti) < W-timestamp(X)

■ Operation rejected and Ti rolled back.

○ Otherwise, operation executed.

Q4) Describe a validation based protocol?

A4) Optimistic concurrency control technique is another name for the validation phase. The transaction is carried out in three phases in the validation-based protocol:

Read phase -

Validation phase -

The temporary variable value will be checked against the actual data to see if it violates serializability in this phase.

Write phase -

If the transaction's validation is successful, the temporary results are saved to the database or system; otherwise, the transaction is rolled back.

The following are the different timestamps for each phase:

Start(Ti) - It contains the start time of Ti's execution.

Validation (Ti) - It specifies when Ti completes its read phase and begins its validation phase.

Finish (Ti) - It records the time Ti completes its write phase.

● Hence TS(T) = validation(T).

● During the validation procedure, the serializability is determined. It's impossible to know ahead of time.

● It ensures a higher level of concurrency and fewer conflicts while processing the transaction.

● As a result, it contains transactions with fewer rollbacks.

Q5) What is multiple granularity?

A5) It is defined as dividing the database into chunks that can be locked in a hierarchical manner.

Multiple Granularity improves concurrency while lowering lock overhead.

It keeps track of what needs to be locked and how it should be locked.

It makes it simple to choose whether to lock or unlock a data item. A tree can be used to graphically display this type of organization.

Example

Take a look at a tree with four tiers of nodes.

● The full database is displayed at the first level or higher level.

● A node of type area is represented at the second level. These are the areas that make up the higher-level database.

● The region is made up of files, which are child nodes. No file can exist in more than one location.

● Finally, each file contains records, which are child nodes. The file contains only the records that belong to its child nodes. There are no records in more than one file.

As a result, starting at the top, the tree's tiers are as follows:

● Database

● Area

● File

● Record

Fig 4: Multiple granularity hierarchy

The top level in this example displays the full database. File, record, and fields are the levels below.

There are three more lock modes with different levels of granularity:

Intention Mode Lock

Intention-shared (IS): At a lower level of the tree, it has explicit locking, but only with shared locks.

Intention-Exclusive (IX): It has explicit locking with exclusive or shared locks at a lower level.

Shared & Intention-Exclusive (SIX): The node is locked in shared mode in this lock, while another node is locked in exclusive mode by the same transaction.

Compatibility Matrix with Intention Lock Modes: The compatibility matrix for these lock modes is shown in the table below: