Transactions


The Global Transaction Model

InfinityDB does not provide the classical multiple-overlapping-transaction model seen in most modern DBMS. Instead, there is one 'global' transaction in progress at all times. Multiple overlapping Threads are supported, but the javax.transaction package is not presently used; instead, a few methods on the InfinityDB class are exposed. The global transactionality nevertheless provides the standard DBMS 'ACID' capabilities. There are various ways to use the global transactionality to circumvent the need for overlapping transactions, such as prefetch serialization. Below is a description of these global 'ACID' properties and an overview of the implementation provided by InfinityDB:

Atomicity
Atomicity means that the commit operation either completes with all changes within the transaction being applied to the db or with none being applied. InfinityDB versions prior to 1.0.48 are atomic except that modifications made by other Threads during the execution of commit() itself are not guaranteed to make it into the db on that commit cycle. All updates made before commit() is invoked are saved atomically, and all updates made after commit() returns are not saved. A new setOverlappingCommits(boolean) method makes commits completely atomic and also speeds commits by allowing multiple Threads to be inside the commit() method at the same time rather than serializing the commits. The performance improvement of overlapping commits is roughly proportional to the number of Threads waiting inside the commit() invocation. A new beta feature also allows 'best-effort' or 'no-sync' commits for even greater speed, still preserving atomicity. A global InfinityDB.rollback() can be invoked at any time; however, the cache is emptied as a side effect, so it is not fast. It throws away all updates that happened after the most recent commit() returned, but it waits for any overlapping commits to finish. Any amount of changes can be rolled back.
Consistency
The database changes from one consistent state directly to another. This is actually a property provided mostly by the application, since only the application can determine whether the db is in a state corresponding to its invariants (except for referential integrity constraints, which are sometimes provided by a DBMS but which are often disabled due to performance limitations.)
Isolation
Multiple overlapping transactions must not interfere with each other by reading and/or writing data in conflicting ways. This is a complex topic, involving the concepts of 'transaction isolation levels', locking, and the possibility of deadlocks due to the locking. Applications tend to sweep these issues under the rug, simply allowing the DBMS to handle it all, while ignoring the serious implications of isolation level on data consistency and the effect of locks on reliability and performance. InfinityDB does not use internal locks at all, and therefore has the valuable property of avoiding problems with internally generated deadlocks. The lack of the possibility of deadlocks allows reliable, fast applications using techniques like "prefetch serialization".
Durability
When a transaction finishes committing, the changes it has made to the db must never be lost. In most DBMS, this is handled by providing 'redo logs', which are files that constantly accumulate transaction data. InfinityDB uses a single-file model, and does not use logs at all, simplifying application installation, configuration, operation, space management, and more. A special on-disk file update protocol guarantees data retention, eliminating the usual 'recovery' process normally needed after hard shut-down. An InfinityDB database cannot lose internal consistency or lose committed data, barring media failure.
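The commit and rollback semantics described above can be modeled with a small sketch. This is a toy in-memory model of the behavior only, not InfinityDB's implementation or API: the class GlobalTransactionSketch and its methods are hypothetical, and it uses synchronized for brevity even though InfinityDB itself avoids internal locks.

```java
import java.util.TreeSet;

// Toy model of global-transaction semantics only; NOT InfinityDB's implementation.
// All names here are hypothetical.
class GlobalTransactionSketch {
    private TreeSet<String> committed = new TreeSet<>(); // state as of the last commit
    private TreeSet<String> working = new TreeSet<>();   // state seen by all Threads

    synchronized void insert(String item) { working.add(item); }

    synchronized void delete(String item) { working.remove(item); }

    synchronized boolean contains(String item) { return working.contains(item); }

    // Make every change since the last commit permanent in one step.
    synchronized void commit() { committed = new TreeSet<>(working); }

    // Discard every change made since the most recent commit.
    synchronized void rollback() { working = new TreeSet<>(committed); }

    public static void main(String[] args) {
        GlobalTransactionSketch db = new GlobalTransactionSketch();
        db.insert("a");
        db.commit();     // "a" is now part of the committed state
        db.insert("b");
        db.delete("a");
        db.rollback();   // both uncommitted changes are thrown away
        System.out.println("a present: " + db.contains("a") + ", b present: " + db.contains("b"));
    }
}
```

Because there is only one transaction, a rollback affects the uncommitted work of every Thread, not just the caller's.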

Commit Speed

The speed of transaction processing applications in a DBMS is primarily limited by the speed of the I/O system writing one or more blocks on disk at the end of the redo log and then doing a file sync() on the log to flush data permanently to disk. When multiple Threads are available, all trying to commit at once, it is possible to write these final blocks containing transaction data for multiple transactions all at the same time, hence the maximum commit rate becomes ThreadsInCommit / (BlockWriteTime + SyncTime).
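The bound above can be evaluated directly. The following is a minimal sketch; the class and method names are hypothetical, and the two timings are illustrative assumptions rather than measurements.

```java
// Hypothetical helper; the timings passed in are illustrative, not measurements.
class CommitRateSketch {
    // Upper bound on commits/sec when threadsInCommit commits share one block write and one sync.
    static double maxCommitsPerSecond(int threadsInCommit, double blockWriteSec, double syncSec) {
        return threadsInCommit / (blockWriteSec + syncSec);
    }

    public static void main(String[] args) {
        double write = 0.010; // assumed seconds to write the final log block
        double sync = 0.015;  // assumed seconds for the file sync
        for (int threads : new int[] {1, 4, 16}) {
            System.out.printf("%d Threads: %.0f commits/sec%n",
                    threads, maxCommitsPerSecond(threads, write, sync));
        }
    }
}
```

The rate scales linearly with the number of Threads in commit because the write and sync cost is paid once for the whole group.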

InfinityDB performance is not far from the maximum measured commit speed. The InfinityDB commit protocol involves flushing and syncing all dirty blocks from the memory block cache to disk, followed by writing and syncing a special block in the file header to lock in all changes. The maximum measured speed for a file sync operation in one test on a modern system in 2006 was about 55/sec. Since random disk block I/O's run at about 110/sec, this implies that two random I/O's are being performed per sync, which agrees with the requirement that a directory entry must also be updated to reflect the update time.

The InfinityDB double-sync reduces the maximum performance achievable; however, the limitation will only be a bottleneck in certain situations. For example, if transactions do more than a few operations, such as updating multiple inversions and therefore doing more than a few random disk block I/O's (which run about 110/sec) then the commit speed limit is not so noticeable. A large number of Threads, a large number of cache misses between commits, or a large number of update operations between commits will mitigate the expense of the double-sync. This table shows the worst-case limits for a 2.5GHz X86 when in overlapping-commit mode (ItemSpace.setOverlappingCommits(true)):

Threads   Ops/Commit   Commits/sec   Ops/sec
1         1            13            13
1K        1            834           834
1         1K           2.9           2900
1K        1K           50            50K
1         >>1K                       approaches 130K
1K        >>1K                       approaches 130K

As can be seen, the worst case occurs with one Thread and one operation per commit, where we see 13 commits/sec, substantially below the 55/sec tested outer limit. However, the 55/sec tested limit was obtained with only one block being written and only on the end of a file, not randomly. Any change to this test setup, such as writing multiple blocks, reduced performance considerably. Note the very positive effect of multi-Threading, and note the high throughput with more operations per commit.
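The Ops/sec column is the product of the operations per commit and the commits per second. A minimal sketch that reproduces the first four rows follows; the class and method names are hypothetical.

```java
// Hypothetical helper that re-derives the Ops/sec column of the table above.
class CommitTableSketch {
    // Operations per second = operations per commit x commits per second.
    static long opsPerSecond(long opsPerCommit, double commitsPerSecond) {
        return Math.round(opsPerCommit * commitsPerSecond);
    }

    public static void main(String[] args) {
        // {opsPerCommit, commitsPerSec} taken from the first four table rows.
        double[][] rows = { {1, 13}, {1, 834}, {1000, 2.9}, {1000, 50} };
        for (double[] r : rows) {
            long ops = (long) r[0];
            System.out.println(ops + " ops/commit at " + r[1] + " commits/sec -> "
                    + opsPerSecond(ops, r[1]) + " ops/sec");
        }
    }
}
```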

No-Sync Commit

A beta feature of InfinityDB, added on customer request as of 5/18/06, is the ability to request a commit without ensuring durability on return from commit(). With this feature, Threads can continue quickly while being given only a 'best-effort' assurance that data will become durable. In this mode, commit speed is almost removed as a performance consideration. The feature is used simply by invoking commit(boolean isWaitingForSync). When the commit does take effect on disk, it is still atomic and durable. The actual delay between invocation of commit() and durability (the latency) is normally a maximum of a few seconds, but usually much less than a second. Even with only one Thread, no-sync performance is very high.

A mixture of no-sync and normal commits can be used, but no-sync commits can overwhelm normal commits if there is an extreme imbalance favoring the no-sync commits, in which case the interval between sync's can grow to the order of several seconds. This only occurs when there are many Threads and only a few are doing normal commits. There is still 'sharing' of the sync's between committers, so the normal commit throughput is not actually as low as the above might imply. The effect will show up as an increased latency for normal committers in heavily loaded, highly Threaded environments. One way to fix this is to throttle the no-sync committers by using Thread.sleep(), for example.
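The Thread.sleep() throttling mentioned above might look like the following sketch. The Committer interface is a hypothetical stand-in for the database handle, modeling only the commit(boolean isWaitingForSync) method named in the text, and the sketch assumes that passing false requests a no-sync commit. The pause length is an arbitrary illustrative value.

```java
import java.util.concurrent.atomic.AtomicInteger;

class NoSyncThrottleSketch {
    // Hypothetical stand-in for the database handle; only commit(boolean) is modeled.
    interface Committer {
        void commit(boolean isWaitingForSync);
    }

    // Run `batches` units of work, issuing a no-sync commit after each one and then
    // sleeping so that committers which wait for the sync are not starved.
    // Returns the number of commits issued before completion or interruption.
    static int runThrottled(Committer db, int batches, long pauseMillis) {
        int issued = 0;
        for (int i = 0; i < batches; i++) {
            // ... application updates for this batch would go here ...
            db.commit(false); // assumption: false means "do not wait for the sync"
            issued++;
            try {
                Thread.sleep(pauseMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt and stop early
                break;
            }
        }
        return issued;
    }

    public static void main(String[] args) {
        AtomicInteger noSyncCommits = new AtomicInteger();
        // Fake committer that only counts no-sync commits.
        Committer fake = waitForSync -> { if (!waitForSync) noSyncCommits.incrementAndGet(); };
        int issued = runThrottled(fake, 5, 10);
        System.out.println(issued + " commits issued, " + noSyncCommits.get() + " were no-sync");
    }
}
```

A fixed sleep is the simplest throttle; the right pause depends on how much latency the normal committers can tolerate.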

When sync's are far apart, experiments have shown that the disk block writes actually become more efficient, and throughput is increased. This may be due to the disk and driver using an 'elevator' algorithm to reduce head motion, or to writing multiple blocks within one spindle rotation.

Throughput

In this section, we discuss the performance limitations of the basic insert() and delete() update operations other than as related to commit speed. If commits are very rare or no-sync commits are used, then the throughput becomes the primary performance consideration. The basic in-cache operation performance of InfinityDB is approximately:

Operation    Ops/sec at 2.5GHz   Ops/sec/GHz
retrievals   250K                100K
updates      130K                52K

VolatileItemSpace is several times faster, but data is not durable (no file is used.) Run com.infinitydb.examples.InfinityDBPerformanceTest for more precise numbers on your system.

InfinityDB was designed to maximize throughput by optimizing in-cache operations as well as disk I/O. Blocks that are updated or read repeatedly while in the memory cache do not incur I/O on each update; instead the updates are batched by an I/O Thread, and blocks are written to disk in background as space is needed in the cache. Each block contains only Items in sort order, and Items with common prefixes are generally kept together in one block as well. Threads that require block I/O do not interfere with Threads that need data already in memory.

Blocks in InfinityDB are about 10KB in memory with 75% utilization (25% free space inside the block), and since Items are often about 30 bytes (very roughly, as Items can be structured in many ways) there should be nominally 250 Items per block. However, prefix compression may double the Items per block, while using CharacterLongObjects and BinaryLongObjects may reduce it to only a few. (Note that blocks on disk are much smaller than 10KB due to compression during block write.) The in-cache insert speed on a 2.5 GHz CPU is about 130K operations/sec. Thus blocks with 250 Items can be created by repeated invocations of ItemSpace.insert() at a rate of about 520 per second. This far exceeds the random-access write rate of blocks to disk (about 110/sec), so it is not possible for continuous random insert() or delete() operations to be limited by CPU performance.
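The arithmetic in the paragraph above can be checked with a short sketch. The class and method names are hypothetical, and a block is taken as 10,000 bytes so that the result matches the nominal 250 Items quoted in the text.

```java
// Hypothetical helper that re-derives the block-fill figures quoted above.
class BlockFillSketch {
    // Items that fit in the used portion of a block.
    static long itemsPerBlock(long blockBytes, double utilization, long itemBytes) {
        return (long) (blockBytes * utilization) / itemBytes;
    }

    // Blocks filled per second at a given in-cache insert rate.
    static long blocksPerSecond(long insertsPerSecond, long itemsPerBlock) {
        return insertsPerSecond / itemsPerBlock;
    }

    public static void main(String[] args) {
        long items = itemsPerBlock(10_000, 0.75, 30);   // 10KB block, 75% full, 30-byte Items
        long blocks = blocksPerSecond(130_000, items);  // 130K inserts/sec at 2.5 GHz
        long diskBlocksPerSecond = 110;                 // random-access block writes/sec from the text
        System.out.println("Items per block: " + items);
        System.out.println("Blocks filled per second by the CPU: " + blocks);
        System.out.println("CPU outpaces disk: " + (blocks > diskBlocksPerSecond));
    }
}
```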

The above calculations have implications for the choice of data model. The Entity-Attribute-Value data model can be considered 'update()-intensive'. In EAV, a single 'record' is stored as a set of Items having a common prefix like <ENTITY_CLASS, entity> rather than, for example, as a single Item containing all of the record's fields concatenated (see Record Retrieval). Given the update speed of InfinityDB, it is not possible for the modification, creation or deletion of one or more EAV 'records' or any other structure inside a single block to be a bottleneck when the sets of operations are associated with random-access block I/O.
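To make the EAV layout concrete, here is a minimal sketch that stores each attribute of a 'record' as a separate sorted Item sharing the prefix <ENTITY_CLASS, entity>. It models Items as delimited strings in a java.util.TreeSet purely for illustration. It does not use the InfinityDB API, and the class, the method names, and the string encoding are all assumptions.

```java
import java.util.NavigableSet;
import java.util.SortedSet;
import java.util.TreeSet;

// Toy model of EAV Items kept in sort order; NOT the InfinityDB API.
class EavSketch {
    private static final char SEP = '\u0000'; // component separator; sorts below all other chars
    private final NavigableSet<String> items = new TreeSet<>();

    // Encode a tuple of components as one sortable string (illustrative encoding only).
    static String item(String... components) {
        return String.join(String.valueOf(SEP), components);
    }

    void insert(String... components) {
        items.add(item(components));
    }

    // All Items sharing the prefix <entityClass, entity>, i.e. one EAV 'record'.
    SortedSet<String> record(String entityClass, String entity) {
        String prefix = item(entityClass, entity) + SEP;
        return items.subSet(prefix, prefix + Character.MAX_VALUE);
    }

    public static void main(String[] args) {
        EavSketch db = new EavSketch();
        db.insert("Person", "42", "name", "Alice");
        db.insert("Person", "42", "age", "30");
        db.insert("Person", "43", "name", "Bob");
        // The two Items for entity 42 are adjacent in sort order.
        for (String it : db.record("Person", "42")) {
            System.out.println(it.replace(SEP, '/'));
        }
    }
}
```

Because the Items of one record are adjacent in sort order, they tend to land in the same block, which is why updating a whole record usually costs no more block I/O than updating a single Item.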



Copyright © 1997-2006 Boiler Bay.