Never forget ACID again!!
We often forget how ACID properties function individually since these are the core of Relational Databases it is important we know their importance to ensure clear communication between team members and to use those properties efficiently
Atomicity: All or nothing. This property makes sure that in a transaction either all changes are committed or if something fails all changes are reverted.
A great example of seeing why it is critical to have atomicity is Money Transfers.
Imagine transferring money from bank account A to B. The transaction involves subtracting the balance from A and adding the balance to B. If any of these changes are partially applied to the database, it will lead to money either not being debited or credited, depending on when it failed.
Consistency: This property makes sure that every data in the database is correct. Any attempt to add incorrect data will cause failure.
What does inconsistent data look like? The balance of an account can be updated as negative which is not possible in the real-world Leftover of orphaned data such as comments on a blog post. If a blog post is deleted, its comments must also be deleted from the database
Correctness is ensured by defining rules such as Foreign Key constraints, Check constraints, On Delete Cascades, On Update Cascades, etc.
Isolation: This property ensures that no more than one transaction can have write(exclusive lock) or read(shared lock) access to a particular row to prevent incorrectly updating data and phantom reads.
Take a case when two or more transfers happen on the same account simultaneously, without isolation there will be a high probability that the wrong amount is updated in the database and there will be a mismatch between the amount debited and the amount credited
Durability: This property ensures that once the transactions commit, the changes survive any outages, crashes, and failures, which means any writes that have gone through as part of the successful transaction should never abruptly vanish.
A typical example of this is your purchase order placed on Amazon, which should continue to exist and remain unaffected even after their database faced an outage. So, to ensure something outlives a crash, it has to be stored in non-volatile storage like a Disk; and this forms the core idea of durability.
Now that you know what ACID properties are and how to use them effectively, Do you ever wonder how they are implemented under the hood?
Most databases implement Atomicity using logging. When a transaction starts, the engine starts logging all changes in a file. When the engine receives
COMMIT it then applies those changes permanently. If it receives
ROLLBACK or some error happens, that log file is discarded and thus no changes get applied.
Alternatively, Atomicity can also be implemented by keeping a copy of the data before starting the transaction and using it during rollbacks.
Integrity constraints like primary key, foreign key and datatype checks are checked at runtime when the changes are being applied to the data.
Cascade operations i.e operations to be performed when data containing foreign keys are modified are performed synchronously along with the running transaction. Most database engines also provide a way to make them asynchronous, allowing us to keep our transactions leaner.
A transaction before altering any row takes a lock (shared or exclusive) on that row, disallowing any other transaction to act on it. The other transactions might have to wait until the first one either commits or rollbacks.
The granularity and the scope of locking depend on the isolation level configured. Every database engine supports multiple Isolation levels, which determines how stringent the locking is. The 4 isolation levels are
The Serializable isolation level provides the strictest transaction isolation. This level emulates serial transaction execution for all committed transactions; as if transactions had been executed one after another, serially, rather than concurrently.
The Repeatable Read isolation level only sees data committed before the transaction began; it never sees either uncommitted data or changes committed during transaction execution by concurrent transactions.
Read Committed is the default isolation level in PostgreSQL. When a transaction uses this isolation level, a SELECT query (without a FOR UPDATE/SHARE clause) sees only data committed before the query began; it never sees either uncommitted data or changes committed during query execution by concurrent transactions.
The uncommitted read isolation level allows an application to access the uncommitted changes of other transactions. Moreover, UR does not prevent another application from accessing a row that is being read, unless that application is attempting to alter or drop the table.
The most fundamental way to achieve durability is by using a fast transactional log. The changes to be made to the actual data are first flushed on a separate transactional log and then the actual update is made.
This flushed transactional log enables us to reprocess and replay the transaction during database reboot and reconstruct the system's state to the one that it was in right before the failure occurred - typically the last consistent state of the database. The write to a transaction log is made fast by keeping the file append-only and thus minimizing the disk seeks.