Distributed and Local Transactions

Local Transactions:

A local transaction is one in which all the data the transaction reads or writes resides in a single database. It is called "local" because every change happens within one boundary, that is, within a single system.

A local transaction has properties defined by the acronym ACID:

  • Atomicity: A transaction is a single, indivisible unit of work. All its updates are performed, or none of them are.

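  • Consistency: A transaction brings the database from one consistent state to another, meaning the rules defined on the data (such as constraints and keys) hold both before and after it.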

  • Isolation: Concurrent transactions do not interfere with one another. Intermediate states produced by a transaction are not visible to others, so each transaction behaves as if it were running alone. In practice, how strictly this holds depends on the isolation level the database is configured to use.

  • Durability: Once a transaction is committed, its updates persist in the database, even in the event of a system failure.
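As a minimal sketch of atomicity in a local transaction, the following uses Python's built-in sqlite3 module with an in-memory database. The accounts table, the account names, and the transfer helper are hypothetical and exist only for illustration. The second transfer violates a constraint partway through, and the database undoes the part that had already been applied.

```python
import sqlite3

# In-memory database with a hypothetical "accounts" table (illustrative only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 50)")
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one atomic local transaction."""
    try:
        # Used as a context manager, the connection commits if the block
        # succeeds and rolls back if it raises.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected an overdraft. The credit to dst,
        # which had already run, is rolled back with everything else.
        return False


def balances(conn):
    """Return the current balances as a {name: balance} dict."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))


ok_small = transfer(conn, "alice", "bob", 30)   # fits within alice's balance
after_small = balances(conn)
ok_large = transfer(conn, "alice", "bob", 500)  # would overdraw alice
after_large = balances(conn)
```

The failed transfer leaves both balances exactly as they were after the first one: either both updates apply or neither does.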

Distributed Transactions:

On the other hand, a distributed transaction is one where the data accessed to process the transaction spans multiple databases, possibly on different machines or locations. This could be the case, for example, in a microservices architecture where different services have their own databases, but a business operation might span multiple services.

One of the challenges with distributed transactions is maintaining the ACID properties across multiple databases. For instance, suppose a transaction updates data in two different databases and an error occurs after the first update has been committed. The first database considers its work finished, so that update can no longer be undone with an ordinary rollback; it has to be reversed by a separate, explicit operation.
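The problem can be reproduced in a few lines. This is a sketch under stated assumptions: two independent in-memory SQLite databases stand in for two services, and the orders and stock tables and the place_order_naively function are hypothetical. Each database runs its own local transaction, with nothing coordinating the two.

```python
import sqlite3

# Two independent in-memory databases standing in for two services.
orders_db = sqlite3.connect(":memory:")
inventory_db = sqlite3.connect(":memory:")

orders_db.execute("CREATE TABLE orders (item TEXT NOT NULL, qty INTEGER NOT NULL)")
inventory_db.execute(
    "CREATE TABLE stock (item TEXT PRIMARY KEY, "
    "qty INTEGER NOT NULL CHECK (qty >= 0))"
)
inventory_db.execute("INSERT INTO stock VALUES ('widget', 1)")
inventory_db.commit()


def place_order_naively(item, qty):
    """Record an order, then reserve stock, as two separate local transactions."""
    # Step 1: this local transaction commits on the first database.
    with orders_db:
        orders_db.execute("INSERT INTO orders VALUES (?, ?)", (item, qty))

    # Step 2: a second, unrelated local transaction on the other database.
    try:
        with inventory_db:
            inventory_db.execute(
                "UPDATE stock SET qty = qty - ? WHERE item = ?", (qty, item)
            )
    except sqlite3.IntegrityError:
        # Only the inventory change is rolled back. The order committed in
        # step 1 is already permanent in orders_db.
        return False
    return True


succeeded = place_order_naively("widget", 5)  # only 1 widget in stock
order_count = orders_db.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
stock_left = inventory_db.execute(
    "SELECT qty FROM stock WHERE item = 'widget'"
).fetchone()[0]
```

After the call, the order row exists even though the stock was never reserved, so the two databases disagree about what happened.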

To handle this problem, distributed transactions often use a two-phase commit protocol:

  1. Prepare Phase: The transaction manager asks every database involved whether it can commit the transaction. Each one checks that the transaction does not violate any constraints and that it can make the changes permanent. If so, it responds affirmatively and waits for the final decision.

  2. Commit Phase: If all databases responded affirmatively in the prepare phase, the transaction manager issues a 'global commit', which tells every participant to make its changes permanent. If any database responded that it cannot commit, the transaction manager issues a 'global abort' instead, and all databases roll back the changes made during the transaction.
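The two phases above can be sketched as a small in-memory simulation. This is an illustration of the control flow only, not a production protocol: the Participant class, its can_commit flag, and the two_phase_commit function are hypothetical, and the sketch ignores timeouts, crashes, and the durable logging a real implementation needs.

```python
from dataclasses import dataclass, field


@dataclass
class Participant:
    """Hypothetical in-memory stand-in for one database in the transaction."""

    name: str
    can_commit: bool = True  # simulates whether this participant votes yes
    committed: dict = field(default_factory=dict)
    staged: dict = field(default_factory=dict)

    def prepare(self, changes):
        # Phase 1: stage the changes and vote. A real database would also
        # persist the staged changes before voting yes.
        if not self.can_commit:
            return False
        self.staged = dict(changes)
        return True

    def commit(self):
        # Phase 2, global commit: make the staged changes permanent.
        self.committed.update(self.staged)
        self.staged = {}

    def rollback(self):
        # Phase 2, global abort: discard anything that was staged.
        self.staged = {}


def two_phase_commit(work):
    """Run both phases over (participant, changes) pairs; return the decision."""
    # Phase 1: collect a vote from every participant.
    votes = [participant.prepare(changes) for participant, changes in work]

    # Phase 2: commit only if every vote was yes, otherwise abort everywhere.
    if all(votes):
        for participant, _ in work:
            participant.commit()
        return "commit"
    for participant, _ in work:
        participant.rollback()
    return "abort"


orders = Participant("orders")
inventory = Participant("inventory")
payments = Participant("payments", can_commit=False)

first = two_phase_commit([(orders, {"order-1": "widget"}), (inventory, {"widget": 4})])
second = two_phase_commit([(orders, {"order-2": "gadget"}), (payments, {"charge": 10})])
```

In the second run, the orders participant had already staged its change and voted yes, but the single negative vote from payments causes every participant to discard its staged work.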

Distributed transactions are more complex than local ones and have more potential points of failure. They also require more resources, because the participating databases must track the transaction's state and exchange messages with the transaction manager. For these reasons, they should be used only when necessary. In some cases, it is preferable to design the system so that operations can be performed as local transactions within a single database, or to use eventual consistency models where applicable.