ACID Transactionality

  1. What ACID means
  2. InfinityDB 1.0 'ACD' Transactionality
  3. InfinityDB 2.0 ACID Optimistic Locking
  4. Two-Phase
  5. Prefix Locks
  6. Promotion Locks
  7. Space Usage
  8. Performance
  9. Fairness
  10. Transaction Isolation Level
  11. No Log
  12. Java Transaction Architecture JTA
  13. Example
  14. Further Reading
Previous Next

What ACID means

The transaction processing community describes the capability of a transaction processing system such as a DBMS using the acronym 'ACID', which stands for 'Atomic', 'Consistent', 'Isolated', and 'Durable'. These characteristics are vital for any serious database system.

Here are the meanings of these features:

Atomic A transaction either commits entirely at a single 'moment' or not at all. Often, each transaction commit will be considered to have moved the system forwards by one 'SCN' or 'System Change Number' or similar. InfinityDB does not use version numbering.
Consistent The data that a transaction commits is self-consistent, not partial or corrupted. This is largely the application's own issue.
Isolated Transactions can execute concurrently but not interfere with each other (except for delays or retries caused by lock conflicts or deadlocks). Transactions can begin and optionally commit or discard changes they make in any order and in any possibly overlapping pattern. There can be no violation of read/write or write/write ordering dependencies between the transactions. Other transactions never see the 'dirty' data or work of another transaction before the other transaction commits.
Durable Once committed, a transaction will never disappear, even if there is some kind of application or system crash. It is not acceptable for there to be a report to the application of a completed commit with a later loss of the transaction.

Essentially all transaction processing systems that provide isolation use use some form of locking. This is a serious burden on the system and its clients, but it has never been possible to eliminate practically. Sometimes locking is carried out in a strict predetermined order or some knowledge can be introduced to help reduce the need or simplify the locking. However, in general-purpose systems, there is typically no advance knowledge about the locking pattern. The InfinityDB 2.0 ACID transaction facility has no inherent restrictions on locking patterns and requires no client consideration other than for the handling of rare OptimisticLockConflictExceptions as discussed below. It is possible for clients to improve performance by ordering accesses, however.

InfinityDB 1.0 'ACD' Transactionality

InfinityDB 1.0 transactionality provides no isolation, but only the 'ACD' features, with a single global transaction in effect. This transactionality makes all modifications to the InfinityDB data store instantly visible to all Threads: in many cases this is exactly what is needed and performance is maximized by the simplicity of this model. The atomic commit() captures all modifications up to that point, with guaranteed durability due to a special copy-on-write protocol. There is no inter-thread isolation. There is also a no-wait-for-durability type of commit for speed.

InfinityDB 2.0 ACID Optimistic Locking

InfinityDB 2.0, adds transparent multi-Threaded transactionality. Each Thread can either view and modify the database using the previous global ACD protocol, even with unlimited concurrent Threads, but now it can also enter the 'ACID' domain by creating and using TransactionalItemSpace. The ACD level is still available for backwards compatibility, as in InfinityDB 1.0, and it can be used by Threads coexisting with ACID user Threads. The ACD mode is now used normally only for bulk loading or large queries or other situations where ACID level transactionality and consistency are not important. The previous ACD level is fully Thread safe, but only globally transactional, while the ACID level provides non-thread-safe TransactionalItemSpaces, normally one per thread. The client threads do not need to know about the locking going on, and can simply proceed with their normal accesses and modifications as desired, with locks set as needed by the system.

The InfinityDB 2.0 ACID implementation uses 'optimistic' locking, in that data to be worked on by a transaction in a particular Thread is assumed to be generally available for use and lock conflicts are usually rare. In practice, the system is also useful in situations with high conflict rates due to considerations we will describe below (e.g. fine lock granularity and scheduler characteristics). The optimistic system differs mainly from pessimistic systems in that there are no cases where data to be locked is waited for at the same time as locks are held. When locks are both held and waited on by a transactions, there is always the possibility of a cyclic wait-for graph - in other words a deadlock between two or more threads that hold data or other resources desired by the others. InfinityDB does not use pessimistic locking, so there is no possibility of deadlocks. When InfinityDB's commit() is invoked, it never fails due to locking issues.

Two-Phase

The InfinityDB optimistic protocol is two-phase. While locks are being acquired for data, it is always possible that a transaction executing in a particular Thread will encounter a lock already held by another transaction in another Thread, and if so, the existing locks of the requester are all immediately released, all changes are rolled back, the transaction is reset to the not-in-progress state, and an OptimisticLockConflictException is thrown. The Thread desiring the data must start over by invoking begin() again, and by repeating the transaction's work, with its view of the database brought up-to-date.

Releasing all locks on conflict may seem draconian, but the alternative is equally draconian in a pessimistic locking system: in a pessimistic system there must be some kind of deadlock detector, usually running slowly (on the order of seconds) that finds deadlocks and throws Deadlock Exceptions into waiting Threads. It is possible to detect deadlocks as the locks are set, but this is in theoretical complexity and in practice very slow. The result, in any case, is that a pessimistic transaction or an optimistic transaction both must respond to failure cases by handling failure returns or Exceptions. The optimistic system fails immediately, however, rather than waiting for the deadlock detector. The ostensible advantage of a pessimistic system is an increase in scheduling fairness in highly contentious situations, but in practice, such situations simply produce 'deadlock storms'. Highly contentious situations are always difficult and normally just require reducing Thread count. but see Fairness below.

In practice, the handling of the OptimisticLockConflictException can:

  1. Report an 'in-flight' conflict problem to the user, such as someone looking at a web page. This is the response, for example in the Bugzilla bug management system.
  2. Loop, waiting for all of the desired locks to be available at once. Note that after the locks for all of the 'desired prefixes' are acquired, no further operations can cause the OptimisticLockConflictException to be thrown, so application code can be simplified.
  3. Ignore the OptimisticLockConflictException and drop the transaction entirely. For example, a data collection system could just omit a data point.

Prefix Locks

The simplicity of the data model of InfinityDB - the ItemSpace - allows for an equally simple locking system. Each lock corresponds to a single Item that locks all Items in the database of which it is a prefix. It also locks itself. When a lock conflict occurs, a subclass of OptimisticLockConflictException is thrown.

When a Transaction desires to set a lock, it does not need to know that it is doing so: it just goes about its way, doing normal data accesses and modifications, and the locks are automatically set as needed. For example, a next(Cu cu, int protectedPrefixLength) operation is one of the basic retrieval operations, and it just sets a lock for the prefix that corresponds to the characters of the Item currently in the cursor cu, shortened to the length given by protectedPrefixLength. The next(Cu cu, int protectedPrefixLength) simply changes the characters of the Item currently in cu to be the next stored Item in sorted sequence greater than that in cu while preventing a change to the given Item in the characters before the protectedPrefixLength. Therefore, next(Cu cu, int protectedPrefixLength) cannot possibly need to see Items other than those having the given prefix, and the lock can be requested on the protected portion of the characters in cu. This locking is transparent to the client, and very fast. For insert(Cu cu) and delete(Cu cu), the entire sequence of chars in the cu Item is used as the lock prefix.

Other optimistic systems usually have a 'version number' column in each table that is used to detect 'in-flight collisions', whereas InfinityDB automatically locks all data without such a column. Instead, every piece of data that is read or written is automatically locked, and conflicts are reported immediately, not on commit. Hence commit() or rollback() can never cause a lock conflict.

Promotion Locks

It is possible for a transaction to obtain a group of locks and then request and obtain a lock prefix that 'contains' those locks. In other words, the new lock prefix obtained is a prefix of some set of existing locks. The request will throw an OptimisticPromotionLockException if any of the existing locks are owned by other transactions, but in case they are not, all of the contained locks will be replaced by a 'promoted' lock having the given prefix. This is expensive to the extent that a large number of existing locks may need to be dealt with. OptimisticPromotionLockException is a subclass of OptimisticLockConflictException.

Space Usage

There is no limit to the number of locks, either per-Thread, or per-transaction, or per-database or in total for the JVM, other than total heap memory, within which the prefixes are kept efficiently packed together. The lock prefixes are not Java Objects, but only data packed efficiently into internal B-trees, saving a considerable amount of space, especially on 64-bit platforms. Individual lock prefix space usage will typically be a few bytes to tens of bytes, or more depending on the application.

The 'deltas', which are the pending per-transaction changes to the database, are also kept packed efficiently in in-memory BTrees until commit. As for locks, only JVM memory limits the total delta usage for all transactions. Note that while the sum of lock and delta storage is limited by memory, the typical alternative is to lock entire blocks in memory, which is hundreds of times less efficient and can easily lead to memory pressure failures. Commit() and rollback() release most of the resources associated with a TransactionalItemSpace, but eventually a TransactionalItemSpace.dispose() should be invoked to avoid leaks instead of allowing a TransactionalItemSpace to be garbage collected.

A TransactionalItemSpace requires space starting in the 10's of KB. There is at least one 10KB BTree block for the locks, one for the insertions, and one for the deletions, and, if necessary, more blocks as transactions grow. On commit, the blocks beyond the initial three, plus some overhead, may be reclaimed by the system. This space usage description is only an approximation and only applies to v2.0.x.

Performance

The lock performance, either for testing or setting a lock, goes as O(log(nLocks)). Ignoring the commit time, during a transaction the deltas provide insert performance as O(log(insertedItemCount)). The deletes are similar, performing as O(log(deletedItemCount)). These operations are very fast compared to the modifications of the InfinityDB itself, so the bottleneck often occurs at commit instead, which incurs the final actual updates to the database in the block cache as well as the final I/O's and two sync()'s. The I/O's and sync()'s are shared between concurrently committing threads: this is a 'group commit' and it can have a dramatic effect on performance, so it is good to have many threads. One to a thousand threads is common. Another dramatic performance improvement can be had by using the commit(boolean isWaitingForDurable) method, with the parameter false. This causes all but one thread to return immediately on commit(): the one 'unlucky' thread makes sure the I/O's and sync do happen soon.

Retrievals during a transaction are slightly slower than modifications because they must examine both the deltas and the base InfinityDB in order to provide to the thread an up-to-date view of the modifications it has done to the ItemSpace. However, for speed, the thread can opt to bypass the consultation of the deltas during retrievals via TransactionalItemSpace.setPassThroughRetrievals(boolean isPassThroughRetrievals).

One way to simplify and speed up a transaction is to access all of the necessary Item prefixes at the beginning of the transaction: after that, no more lock conflicts can occur. Accessing the necessary prefixes early usually only requires arranging code properly or doing one or a few extra accesses. For example, a loop that modifies a large number of Items can be moved after an access that sets a lock that covers all of the inserts, avoiding a lock promotion. Note again that the ACID rules are automatically enforced no matter what lock ordering strategy is used, and any lock ordering can be used without consideration of locking if desired. To see a printout of the lock prefixes that are in effect, use ItemSpaceTransactionManager.dumpLocks(PrintStream) or TransactionalItemSpace.dumpLocks(PrintStream).

Fairness

The optimistic model has a perceived flaw in that the retries after lock conflict - the throwing of the OptimisticLockConflictException - can loop for ever. This does not happen in practice.

Fairness is provided by the Thread scheduler, which tries to keep each Thread busy, and tends to cause retries to work fairly. Below is an output of a tester that shows locks being set and released fairly. The 'S' is a successful commit, with the following digit identifying the Thread, an 'i' is an insertion attempt, the '_' is a prefix lock conflict, and a ' ' is a promotion conflict. The prefix lengths are randomly generated, so some cover more ground than others, but they all get their locks in a short time. Some of the shorter generated prefixes correspond to table locks, some longer prefixes correspond to row locks, some to value locks within a row, some to index entry locks, and so on, but these correspondences, if they are imposed by the client at all, are not enforced by or even visible to the locking system. Each transaction in this test locks 10 prefixes by doing 10 random inserts. There are 10 threads. There are no sleeps. OptimisticLockConflictExceptions are handled by looping. This is typical in testing, and has been tested for a range of 1 to 1K Threads. The test database is only 500MB, so conflicts should not be rare. As can be seen, the optimistic lock protocol is working well as in fact long runs of ' ' or '_' are not common in the data - only short runs, and all Threads, identified by their thread number, get access.

commits/s total=68.82312456985547 count=400 t=5.812 dt=1.656 this commits/s=60.38647342995169
iiiiiiiii_iiiiiiiiiiS4iiiiiiiiiiS9iiiiiiiiii
S6iiiiiiiiiiS3iiiiiiiiiiS4iiiiiiiiiiS5i_ii_iiiiiiiiiiS8iiiiiiiiiiS7iiiiiiiiiiS9i
iiiiiiiiiS2iiiiiiiiiiS1iiiiiiiiiiS0iiiiiiiiiiS6iiiiiiiiiiS4iiiiiiiiiiS2i_iiiiiii
iiiS9iiiiiiiiiiS0iiiiiiiiiiS7iiiiiiiiiiS5iiiiiiiiiiS3iiiiiiiiiiS8iiiiiiii i i i 
i i iiiiiiiiiiS1iiiiiiiiiiS4iiiiiiiiiiS7iiiiiiiiiiS1iiiiiiiiiiS9iiiffiiiiiiiS6iiii
iiiiiiS3iiiiiiiiii_iiiiiiiiiiS8iiiiiiiiiiS5iiiiiiiiiiS0iiii_iiiiiii_iiiiiiiiiiS2
iiiiii_iiiiiiiiiiS2iiiiiiiiiiS9iiiiiiiiiiS1iiiiiiiiiiS0iiiiiiiiiiS4iiiiiiiiiiS3i
iiiiiiiiiS5iiiiiiiii_iiiiiiiiiiS6iiiiiiiiiiS7iiiiiiiiiiS8iiiiiiiiiiS6iiiiiiiiiiS2
iiiiiiiiiiS9iiiiiiiiiiS3iiiiiiiiiiS7iiiiiiiiiiS1iiiiiiiiiiS4iiiii_iiiiiiiiiiS8ii
iiiiiiiiS0iiiiiiiiiiS5iiiiiiiiiiS8iiiiiiiiiiS4iiiiiiiiiiS5iiiiii_iiiiii i iiiiii
iiiiS3iii_iiiiiiiiiiS7iiiiiiiiiiS2iiiiiiiiiiS6iiiiiiiiiiS0iiiiiiiiiiS1iiiiiiiiii

Transaction Isolation Level

For performance reasons, other ACID systems always must make concessions with respect to the consistency of the data, even though the 'C' stands for consistency. InfinityDB does not make concessions. There are four 'standard' basic transaction isolation levels: READ_UNCOMMITTED, READ_COMMITTED, REPEATABLE_READ, and SERIALIZED. InfinityDB tries to stay only with the best: SERIALIZABLE. Whenever a transaction-aware programmer hears this, the assumption the programmer makes is that there must be locks that limit access to the entire database in a strict sequence, with no transaction overlap, and terrible performance. InfinityDB avoids the SERIALIZED performance problem by using extremely fine grained locks. Using the SERIALIZABLE level means there are no possible reads of data that change on subsequent reads, and no possible 'phantom' reads. Most other databases provide READ_COMMITTED level by default for performance, but this and the other non-SERIALIZABLE transaction isolation levels allow inconsistencies to appear in certain cases, and these inconsistencies are normally simply ignored. InfinityDB does provide READ_COMMITTED as well, but it is not generally necessary (it differs from SERIALIZABLE in that it sets exclusive write locks but no read locks.) In the examples directory you can find a 'ConcurrencyTest.java' that demonstrates this and allows performance testing and experimentation.

The locking granularity is as fine as possible for every transaction. Although InfinityDB does not innately use the table/column/value terminology or relational concepts but instead the ItemSpace model, we can still compare the effective lock granularity assuming a typical application-imposed organization of the data. There are then effectively locks possible on tables, rows, column values, and index entries or even index key prefixes, with the finest granularity usable for each lock that is set. In fact every possible InfinityDB client-imposed data structure that can be devised on top of the ItemSpace data model is automatically locked at the finest possible granularity.

No Log

InfinityDB 2.0, as its predecessor 1.0, requires no recovery log. The transactionality is hidden inside the single InfinityDB data file, without any other new files required. The existing InfinityDB files will continue to work, with no changes to the format and no automatic upgrades to the files. Also, the new API is backwards compatible.

Almost all (possibly all) other transactional DBMS require a transaction log, which is used for transaction rollback and transaction durable commit. The committed data for a transaction is kept in this log and migrated into the actual database as quickly as possible. On crash recovery, the log is scanned again and the transactions are redone or undone as necessary. This log can grow very large if there are long-running update transactions, and it must be kept in a safe place, associated with its related database in case recovery is needed.

InfinityDB instead uses a 'Copy on Write' style system inside the database file to ensure global atomicity, rollback capability, and durability. This original InfinityDB feature is built on by the 2.0 ACID transactionality. Hence there is no time-consuming recovery phase on system bring-up: InfinityDB databases recover instantly.

The JTA Java Transaction Architecture

The API for InfinityDB transactions is the standard Java Transaction Architecture or JTA v1.0.1 (there is also a JTA 1.1 but it is not implemented currently in InfinityDB). However most of the methods there are not implemented or necessary. In order to create a transaction, one starts by asking the InfinityDB to create a new TransactionalItemSpace and thereafter the begin(), commit(), commit(boolean isWaitingForDurable), and rollback() methods are available on that TransactionalItemSpace. A TransactionalItemSpace is non-thread-safe, so normally one (or, rarely, more than one) will be created for the use of each client thread.

In some circumstances, such as in legacy code, the TransactionalItemSpace will be visible only as as its superclass ItemSpace, and the commit() and so on are not visible. In these situations, the new ItemSpace.getTransactionManager() can return a ItemSpaceTransactionManager which can be used instead for begin() commit() and so on.

There is no need for Objects of the Transaction class in the current implementation. Without a Transaction class, there is no suspend() or resume(Transaction). These methods are needed rarely in normal JTA in order to switch the thread-local transaction context between different transactions: in InfinityDB, each TransactionalItemSpace contains a transaction context, so there is no need to switch back and forth.

Example

Here is an example of transactional code, using a single insert() to get a lock. We then read the stored Item back from outside the transaction. There are no threads, so this is rather contrived.

package com.infinitydb.manual;

import java.io.IOException;

import com.infinitydb.Cu;
import com.infinitydb.InfinityDB;
import com.infinitydb.OptimisticLockConflictException;
import com.infinitydb.TransactionalItemSpace;

public class ACIDTest {

	/**
	 * Do the simplest possible thing using InfinityDB 2.0 ACID
	 * Transactionality.
	 */
	public static void main(String[] args) {
		try { // catch IOException from InfinityDB.open() or db.insert()
			
			InfinityDB db = InfinityDB.create("c:/temp/junk", true);
			
			/*
			 *  Create a TransactionalItemSpace. This is only for
			 *  demonstration - it is necessary only with multiple
			 *  competing threads.
			 */
			TransactionalItemSpace tdb = db.getNewTransactionalItemSpace();
			/*
			 * We use the looping technique to force success. Instead we could
			 * just report the error or skip the transaction.
			 */
			for (boolean isSuccess = false; !isSuccess;) {
				tdb.begin();
				try {
					/*
					 * Do as many operations as desired. They will automatically
					 * be locked invisibly. If there is a collision, the
					 * Exception is thrown and we choose to retry. We could have
					 * a retry counter too.
					 */
					tdb.insert("hello");
					isSuccess = true;
				} catch (OptimisticLockConflictException e) {
					/*
					 * The lock has been removed, but will be attempted again on
					 * the next db.insert() above
					 */
					isSuccess = false;
				}
			}
			tdb.commit(); // Always do either commit or rollback
			/*
			 *  Find the Item we inserted. The Item has one component: a string
			 */
			Cu cu = Cu.alloc("hello");
			if (tdb.exists(cu)) {
				System.out.println("success!");
			}
			/*
			 * Release resources. 
			 */
			tdb.dispose(); 
		} catch (IOException e) {
			e.printStackTrace();
		}
	}		
}

Further Reading

Look in the JavaDoc for ItemSpaceTransactionManager and TransactionalItemSpace for full details. The examples directory contains ConcurrencyTest.java, which will fully demonstrate the system. The ConcurrencyTest can show the global consistency maintenance property of the SERIALIZABLE isolation level, and can show performance characteristics, lock conflict frequency, and much more.

Previous Next