Nevertheless, it is important to be able to isolate some transactions from each other, even if some only do queries. Here is a simple solution for many situations that works because InfinityDB is used in embedded environments rather than in SQL/RDBMS/JDBC environments where queries and transactions are ad-hoc.
'Prefetch Serialization' simply serializes access to the database by critical transactions, yet avoids a performance decrease by doing 'prefetching' of blocks that are not in the cache. The entire operation is handled by application code, but it is simple, and only used where and when needed. It completely avoids deadlocks and database inconsistencies.
The scheme works like this. A database-scoped lock is acquired and released for part of any transaction that needs isolation. Before acquisition, the application executes an optional series of retrieval operations that are targeted at bringing necessary blocks into the InfinityDB block cache memory. With the blocks in memory, the application acquires the global lock, then executes the 'meat' of the transaction, including all retrievals and updates very quickly in-cache without I/O's, and finally does a commit and releases the lock. (It is possible also to use multiple locks for increased concurrency when transactions update disjoint subsets of a given database but this can be treated as distinct instances of the one-lock case which we will consider below.)
The prefetching is executed unlocked and in parallel with all other operations, so it is very efficient. A large number of blocks can be prefetched by different transactions at once depending on the cache size. A small 10MB cache can contain 1K blocks, and caches are often 100MB or more. (There will be occasions when the prefetched blocks will turn out not to be the ones actually used by the transaction due to inter-transaction interactions, but these cases will be rare and the performance will not be significantly affected. The semantics are not affected.)
The prefetching can be done by using a small set of prefixes of
Items that are known to be related to the transaction.
For example, a transaction that updates a table record and
some indexes would do several ItemSpace.exists(Cu)
operations using the primary and secondary keys, to use
the relational terminology, or the Entity and some
Inversions, to use the EAV terminology.
The prefetching code can be limited to parts of the system that are actual bottlenecks; this is simpler than the handling of deadlocks in every piece of application code in locking systems. Also, with complete serialization, application correctness is assured, and performance is made second priority. This allows quicker application delivery followed by safe, reliable performance tuning only if needed. With locking systems, it is necessary to choose a 'transaction isolation level' that trades off speed against the quality of data consistency. For example, the 'read-committed' transaction isolation level is almost universally used in typical RDBMS transactions (with occasional use of serialized level), but it suffers from lost writes, phantom reads, and lack of preservation of global consistency constraints. There is no tradeoff with prefetch serialization, which always provides perfect data consistency and maximum speed.
Note that prefetch serialization works with any possible
ItemSpace data structure: it is not
limited to the Entity-Attribute-Value or relational data
models. See "ItemSpace Data Structures.pdf"
in the docs directory for some examples of
other structures.
Here is one simple way to use prefetch serialization where needed. We write the transaction as a single method with an ItemSpace parameter against which the transaction does updates:
void myTransaction() { // this is all cut-and-paste code
ItemSpace roDb = new ReadOnlyItemSpace(db); // wrap db
myTransaction(roDb); // touch important blocks
synchronized (db) { // get global lock (we could use > 1)
myTransaction(db); // now update actual db
db.commit(); // fast because all data is in memory
}
}
// The actual transaction code operates on a given db
void myTransaction(ItemSpace db) {
// any kind of accesses.
while (db.next(..)) {
...
}
db.insert(cu);
}
The above technique guarantees that all blocks are touched
once and brought into memory before the actual work, no matter
how complex myTransaction(ItemSpace) is, but it
does the in-cache part of the transaction twice. The
in-cache part will be orders-of-magnitude faster
than the disk I/O caused by the prefetch, so doing it twice will
cost very little. It may be easier or faster to
do the prefetching using only a few db.exists(cu)
operations on the relevant prefixes to force in one or
a few blocks if a performance bottleneck is identified
in some transaction code.
It is possible that the update transaction to be prefetched
does updates that later affect its own operation by being
read back. In this case, a DeltaItemSpace would be used
instead of a ReadOnlyItemSpace.
InfinityDB.setOverlappingCommits(true)),
InfinityDB.rollBack() should be avoided.
Instead, a DeltaItemSpace can be used.
A DeltaItemSpace is a wrapper for a
'baseSpace' ItemSpace, and it isolates the
baseSpace from changes until desired, while providing
a view of the baseSpace that appears to be dynamic.
The changes can be written all at once before
InfinityDB.commit()
is invoked. Also, this allows the transaction to rollback
the changes before commit simply by discarding the
DeltaItemSpace. The next section will explain.
DeltaItemSpace can be used.
(DeltaItemSpace is a beta-test feature as of 1.0.48
5/23/06). For example:
// Start transaction. We set isMinimizeDeltas true so the baseSpace gets touched.
ItemSpace deltaDb = new DeltaItemSpace(db, true); // db is 'baseSpace'. DeltaItemSpace is fast
myTransaction(deltaDb); // This touches the actual db too but does not modify it
// at this point we can rollback simply by discarding deltaDb
synchronized (db) { // globally lock the actual db. Or, have multiple isolated locks
deltaDb.writeToBaseSpace(); // Change db in one fast step
db.commit(); // fast because cache blocks are in memory.
}
Of course, if there is a need to verify the
compatibility of the
changes to be applied with the db state once the lock is
aquired, we are back to a two-phase process. This would
happen if there are reads that affect the writes. It is
necessary then to know in advance that
database changes will be compatible with the
existing database - for example, new sales transactions can
come in without interfering with existing transactions. The
new data will still appear atomically to readers. Of course,
if the data does not need to appear
atomically, the writes can just
go directly to db and the whole issue of transactionality is
avoided. If changes are not inherently isolated, then
the regular two-phase prefetch serialization technique
described above must be used or some kind of locking must be
added.
DeltaItemSpace by default uses two new VolatileItemSpaces
to store changes. These are extremely fast, but they do each have
a minimum footprint of one 10KB block, so it may be necessary to
keep and re-use them. DeltaItemSpace can control whether it
touches the baseSpace through its isMinimizeDeltas
constructor parameter: when true, the baseSpace is touched.
User-terminated or other long-duration transactions often can be broken into subparts, each of which can use prefetch-serialization. For example, a bulk load can be broken into small batches which can be prefetched and then committed quickly atomically. A user looking at a display of available airline seats would see an available seat in the most recent database state and decide to book it, but when the seat is actually to be booked, a prefetch-serialized transaction would attempt to complete the transaction, possibly reporting that the seat had already been taken. The actual booking is an atomic operation that is not user-terminated.
The hand-coded lock can be much more intelligent than an automatic DBMS-internal lock, which can only force the user to wait. A hand-coded lock can include the user id of the locker, for example, so that other users can determine who is holding the lock and can ask them to release it or can steal the lock if they think it is safe to do so. When a lock is stored as data, it can contain, for example:
User-terminated transactions often only need a finite number of locks, such as a single lock on an account, rather than the potentially unlimited number of locks that can result if all transactions use traditional implicit locking. The granularity can be as fine as desired: there is no need to use crude locks at the block level or table level. There is no limit to the number of locks available and no failures can result from running out of locks or leaving dangling locks. Standard RDMBS', even mature ones, often leave locks dangling in some situations, and alert DBA's must watch for them.
Cleanup of hand-coded locks can be automatic even though they are indefinitely persistent: when a lock is discovered, the time of setting of the lock can be examined, and if it is earlier than the latest system restart time, the lock can be released. Or, locks can be systematically released at a specific time of day or a specific interval. It is not necessary to scan the database for all old locks: obsolete locks are merely removed when discovered.
An alternative to hand-coded locks is 'optimistic locking' of various kinds. An optimistic lock uses a 'version number' Attribute that tracks changes to the record. (Here a 'record' in the EAV system means the set of Items related to a given Entity). When a user-terminated transaction is begun, the record's version number is temporarily saved by the client code, and when the transaction is to be committed, the record to be modified is examined again to discover whether the version number has been changed by another transaction. If the version number has changed, the user is notified of the 'collision' so the transaction can be restarted for the user. This code is very similar to the hand-coded lock described above. Prefetch serialization can be used when the version number is examined initially and again during the attempt at commit.