Blockchain solutions are definitely becoming more popular. The technology is best known for creating the world of cryptocurrency, but it can solve technical issues in other places as well. One of those happens to be Azure SQL Database. A colleague asked me the other day to help him understand the value proposition behind Azure SQL Database ledger and whether he needed to be using that more frequently.
I should first take a step back and clarify what a “ledger” is. A distributed ledger is a consensus-based system (meaning multiple systems agree on the data’s validity) which is replicated and shared. Each node in the system keeps an identical, independent copy of the ledger which is independently updated. Once an update occurs, it is shared with the other nodes. Those nodes then use a consensus algorithm to agree on the order of transactions and establish that the new update is correct. The nodes then update their records to reflect the consensus results. If consensus is not reached, the transaction is invalidated. Different systems can use different approaches to reach consensus. Under the covers, it’s cryptographic algorithms. Blockchains are a specific type of distributed ledger.
This model is completely decentralized. That means that there is no single system of record. This means that tampering with any node in the system has the current set of records. It also means that the set of systems collectively prevents tampering with any single system, since tampering would alter the records in a way that prevents consensus on the changes.
How does this relate to Azure SQL? Historically, we have no way to know whether the current state of the records in the database is legitimate. That is, we cannot conclusively prove that there wasn’t tampering that was removed from the logs. There’s no way natively to ensure that an administrator or someone with physical access did not alter the data. Using a distributed ledger to record the changes helps to solve this problem.
Under the covers, Azure SQL will create SHA-256 hash of both the data in the current transaction and the hash of the previous transaction. This hashed record will be added to the ledger. The history of each of these changes is also stored in the database. This is not very different from traditional change tracking tables, other than the fact that the details are committed to a ledger. Because any tampering with the data would cause a disagreement with the ledger, it’s possible to compare the two sets of data to ensure that no tampering has occurred. When a verification is requested, the process can recompute all of the hashes for the stored changes in the database, comparing each to the record in the ledger. This allows the system to quickly identify inconsistencies in the data. There is more to the actual implementation, but hopefully this gives you an understanding of the basic principals.
In short, the system allows us to cryptographically prove that an administrator has not tampered with the data being stored in the database. It also allows us to review the history of changes that led to the current state.
The implementation is built on a Microsoft Research project called the Confidential Consortium Framework. This framework powers the Azure Confidential Ledger service. This systems stores immutable, tamper-protected, append-only ledgers, ensuring that sensitive data records can be securely stored and protected. At an even lower level, this system relies on Azure Confidential Computing to provide hardware-based trusted data processing. The short version – it ensures that the system itself cannot compromise the data. Microsoft has a longer explanation of CCF and CC.
Note that the data is append-only. This means it continues to grow over time, since it cannot ever be deleted (since deleting would eliminate historical records). It also means that the system perpetually ensures the validity of ALL of the associated data.
That brings us back to the question – when would you use this technology? It comes down to security: are you storing data which, if modified, could violate legal compliance requirements? For example:
- The history of medicines distributed to patients from a prescription
- The access or modification of medical records
- The maintenance history for military aircraft
- Financial transaction data stored by a bank
- Records of access to secure systems
- Transfers of legal title or ownership
In each of these cases, a modified or deleted record could indicate a criminal activity which could subject the owner of the data to fines and penalties. By being able to prove that the data is accurate and complete, these operations can be protected. From an auditing perspective, it also makes it easy for a third party to quickly verify the integrity and completeness of the data.
It’s worth knowing that once this feature is enabled, the ledger database cannot be disabled and data in the ledger tables cannot be modified or deleted. In addition, non-ledger tables cannot be converted into ledger tables. All of those options would enable a way to tamper with the data or hide some portion of the history.
To summarize, Azure SQL Database Ledger covers a specific use case in which it is critical to ensure the stored data has not been altered and that no tampering has occurred. It’s not something which all databases or systems will require, and it requires careful consideration before enabling since it is a permanent decision related to securing the data perpetually. I let my colleague know that it isn’t something I would expect he would need to be using more frequently. Instead, it’s an additional tool that can be used when the situation warrants that level of protection.