Prefetch Serialization

Previous Next

Avoiding Locks with Prefetch Serialization

InfinityDB does not like locks. It goes to great lengths internally to avoid them, and to allow applications to be written without the need to handle deadlock exceptions that can occur whenever locking is used. Locks also reduce performance, sometimes dramatically. Locks take considerable time to set and release even if no blocking occurs, plus there is overhead for deadlock detection, queueing, and managing the storage space and GC of the lock Objects themselves. Often there are many HashMaps and ArrayLists full of various locking- related Objects per transaction or per database. Locks can remain unreleased in some circumstances, leaving the db 'wedged' or making some data inaccessible.

Nevertheless, it is important to be able to isolate some transactions from each other, even if some only do queries. Here is a simple solution for many situations that works because InfinityDB is used in embedded environments rather than in SQL/RDBMS/JDBC environments where queries and transactions are ad-hoc.

'Prefetch Serialization' simply serializes access to the database by critical transactions, yet avoids a performance decrease by doing 'prefetching' of blocks that are not in the cache. The entire operation is handled by application code, but it is simple, and only used where and when needed. It completely avoids deadlocks and database inconsistencies.

The scheme works like this. A database-scoped lock is acquired and released for part of any transaction that needs isolation. Before acquisition, the application executes an optional series of retrieval operations that are targeted at bringing necessary blocks into the InfinityDB block cache memory. With the blocks in memory, the application acquires the global lock, then executes the 'meat' of the transaction, including all retrievals and updates very quickly in-cache without I/O's, and finally does a commit and releases the lock. (It is possible also to use multiple locks for increased concurrency when transactions update disjoint subsets of a given database but this can be treated as distinct instances of the one-lock case which we will consider below.)

The prefetching is executed unlocked and in parallel with all other operations, so it is very efficient. A large number of blocks can be prefetched by different transactions at once depending on the cache size. A small 10MB cache can contain 1K blocks, and caches are often 100MB or more. (There will be occasions when the prefetched blocks will turn out not to be the ones actually used by the transaction due to inter-transaction interactions, but these cases will be rare and the performance will not be significantly affected. The semantics are not affected.)

The prefetching can be done by using a small set of prefixes of Items that are known to be related to the transaction. For example, a transaction that updates a table record and some indexes would do several ItemSpace.exists(Cu) operations using the primary and secondary keys, to use the relational terminology, or the Entity and some Inversions, to use the EAV terminology.

The prefetching code can be limited to parts of the system that are actual bottlenecks; this is simpler than the handling of deadlocks in every piece of application code in locking systems. Also, with complete serialization, application correctness is assured, and performance is made second priority. This allows quicker application delivery followed by safe, reliable performance tuning only if needed. With locking systems, it is necessary to choose a 'transaction isolation level' that trades off speed against the quality of data consistency. For example, the 'read-committed' transaction isolation level is almost universally used in typical RDBMS transactions (with occasional use of serialized level), but it suffers from lost writes, phantom reads, and lack of preservation of global consistency constraints. There is no tradeoff with prefetch serialization, which always provides perfect data consistency and maximum speed.

Note that prefetch serialization works with any possible ItemSpace data structure: it is not limited to the Entity-Attribute-Value or relational data models. See "ItemSpace Data Structures.pdf" in the docs directory for some examples of other structures.

Here is one simple way to use prefetch serialization where needed. We write the transaction as a single method with an ItemSpace parameter against which the transaction does updates:

 	void myTransaction() { // this is all cut-and-paste code
 		ItemSpace roDb = new ReadOnlyItemSpace(db); // wrap db
		myTransaction(roDb); // touch important blocks
		synchronized (db) { // get global lock (we could use > 1)
			myTransaction(db); // now update actual db
			db.commit(); // fast because all data is in memory
		}
	}
	// The actual transaction code operates on a given db
	void myTransaction(ItemSpace db) {
		// any kind of accesses.
		while (db.next(..)) {
			...
		}
		db.insert(cu); 
	}
The above technique guarantees that all blocks are touched once and brought into memory before the actual work, no matter how complex myTransaction(ItemSpace) is, but it does the in-cache part of the transaction twice. The in-cache part will be orders-of-magnitude faster than the disk I/O caused by the prefetch, so doing it twice will cost very little. It may be easier or faster to do the prefetching using only a few db.exists(cu) operations on the relevant prefixes to force in one or a few blocks if a performance bottleneck is identified in some transaction code.

It is possible that the update transaction to be prefetched does updates that later affect its own operation by being read back. In this case, a DeltaItemSpace would be used instead of a ReadOnlyItemSpace.

DeltaItemSpaces Generalize Rollback

When operating in overlappingCommits() mode (via InfinityDB.setOverlappingCommits(true)), InfinityDB.rollBack() should be avoided. Instead, a DeltaItemSpace can be used. A DeltaItemSpace is a wrapper for a 'baseSpace' ItemSpace, and it isolates the baseSpace from changes until desired, while providing a view of the baseSpace that appears to be dynamic. The changes can be written all at once before InfinityDB.commit() is invoked. Also, this allows the transaction to rollback the changes before commit simply by discarding the DeltaItemSpace. The next section will explain.

DeltaItemSpaces Speed Prefetch Serialization

In order to allow a prefetch-serialized transaction to avoid the extra work of breaking its execution into two phases, a DeltaItemSpace can be used. (DeltaItemSpace is a beta-test feature as of 1.0.48 5/23/06). For example:
	// Start transaction. We set isMinimizeDeltas true so the baseSpace gets touched. 
	ItemSpace deltaDb = new DeltaItemSpace(db, true); // db is 'baseSpace'. DeltaItemSpace is fast
	myTransaction(deltaDb); // This touches the actual db too but does not modify it
	// at this point we can rollback simply by discarding deltaDb
	synchronized (db) { // globally lock the actual db. Or, have multiple isolated locks
		deltaDb.writeToBaseSpace(); // Change db in one fast step
		db.commit(); // fast because cache blocks are in memory.
	}
Of course, if there is a need to verify the compatibility of the changes to be applied with the db state once the lock is aquired, we are back to a two-phase process. This would happen if there are reads that affect the writes. It is necessary then to know in advance that database changes will be compatible with the existing database - for example, new sales transactions can come in without interfering with existing transactions. The new data will still appear atomically to readers. Of course, if the data does not need to appear atomically, the writes can just go directly to db and the whole issue of transactionality is avoided. If changes are not inherently isolated, then the regular two-phase prefetch serialization technique described above must be used or some kind of locking must be added.

DeltaItemSpace by default uses two new VolatileItemSpaces to store changes. These are extremely fast, but they do each have a minimum footprint of one 10KB block, so it may be necessary to keep and re-use them. DeltaItemSpace can control whether it touches the baseSpace through its isMinimizeDeltas constructor parameter: when true, the baseSpace is touched.

Breaking Up Transactions with Prefix Serialization

The basic prefetch serialization approach works only for transactions that are 'self-terminating' - in other words, for transactions that do not depend on external events to allow them to continue to completion. So, a transaction that may need to wait for a user to press Enter - a 'user- terminated' transaction - cannot use this system as-is, because it would hold the database-scope lock for a long time.

User-terminated or other long-duration transactions often can be broken into subparts, each of which can use prefetch-serialization. For example, a bulk load can be broken into small batches which can be prefetched and then committed quickly atomically. A user looking at a display of available airline seats would see an available seat in the most recent database state and decide to book it, but when the seat is actually to be booked, a prefetch-serialized transaction would attempt to complete the transaction, possibly reporting that the seat had already been taken. The actual booking is an atomic operation that is not user-terminated.

Hand-Coded Locks with Prefetch Serialization

Sometimes transactions simply must lock some part of the database for a long time. In these cases, prefetch-serialization can be combined with explicit, hand-coded locks. A hand-coded lock can use Items stored with the data being protected, such as an extra Attribute for the appropriate EntityClass. The lock is set by inserting Values for the 'lock' Attribute, for example when a user first views a record: then when the user is ready, the database-scope lock is obtained and the operation completes, with the hand-coded lock being deleted before commit.

The hand-coded lock can be much more intelligent than an automatic DBMS-internal lock, which can only force the user to wait. A hand-coded lock can include the user id of the locker, for example, so that other users can determine who is holding the lock and can ask them to release it or can steal the lock if they think it is safe to do so. When a lock is stored as data, it can contain, for example:

User-terminated transactions often only need a finite number of locks, such as a single lock on an account, rather than the potentially unlimited number of locks that can result if all transactions use traditional implicit locking. The granularity can be as fine as desired: there is no need to use crude locks at the block level or table level. There is no limit to the number of locks available and no failures can result from running out of locks or leaving dangling locks. Standard RDMBS', even mature ones, often leave locks dangling in some situations, and alert DBA's must watch for them.

Cleanup of hand-coded locks can be automatic even though they are indefinitely persistent: when a lock is discovered, the time of setting of the lock can be examined, and if it is earlier than the latest system restart time, the lock can be released. Or, locks can be systematically released at a specific time of day or a specific interval. It is not necessary to scan the database for all old locks: obsolete locks are merely removed when discovered.

An alternative to hand-coded locks is 'optimistic locking' of various kinds. An optimistic lock uses a 'version number' Attribute that tracks changes to the record. (Here a 'record' in the EAV system means the set of Items related to a given Entity). When a user-terminated transaction is begun, the record's version number is temporarily saved by the client code, and when the transaction is to be committed, the record to be modified is examined again to discover whether the version number has been changed by another transaction. If the version number has changed, the user is notified of the 'collision' so the transaction can be restarted for the user. This code is very similar to the hand-coded lock described above. Prefetch serialization can be used when the version number is examined initially and again during the attempt at commit.

Previous Next


Copyright © 1997-2006 Boiler Bay.