An IncrementalMergingItemSpace
can do these things, even when the database is much larger than available memory.
(To understand the limitations and considerations surrounding memory usage versus
database size, see Bulk Loading and Sorting.)
For databases that fit in memory, any ItemSpace will be
fast - either an InfinityDB or a
VolatileItemSpace, the basic on-disk and memory-only ItemSpaces.
These will provide about 60K Insertions/Sec/GHz.
IncrementalMergingItemSpace is a compromise.
It is not as fast as possible either for offline bulk loading or for
online retrievals and updates using a single InfinityDB ItemSpace,
but it directly addresses the difficult problem of combining
the two requirements.
An IncrementalMergingItemSpace wraps any kind of lower-level or
baseSpace, the baseSpace normally being an InfinityDB, where it keeps
the persistent set of Items. The Items stored there are organized in a special
way. The Items stored in the baseSpace are given special prefixes to
separate them into groups of various sizes called batches.
Thus it is not possible to take existing data in an
ItemSpace and wrap an IncrementalMergingItemSpace around it
transparently. However an IncrementalMergingItemSpace is a
first-class ItemSpace and it can be use anywhere any other ItemSpace can be
used with no changes to the client code.
As inserts come in, the number of batches increases, and the more batches
there are the lower is the performance.
No special background threads are used internally:
each insert not only adds to one of the existing batches, but it also
does some incremental merging work to combine batches to keep the
baseSpace in an efficient state. It is possible, of course,
for client threads to share the IncrementalMergingItemSpace.
Inserts and deletes become visible immediately
on method return. The normal InfinityDB commit() method can be used on the
base space between insert() or delete() operations.
The advance() method can be used to make incremental progress
merging the batches together, ultimately into one batch. If advance()
is invoked by a background Thread when there is available
time, the batches can be merged gradually and the
speed of the system will be improved.
In fact, insert() simply invokes advance() once on each invocation. If it
becomes known at some point that no further inserts() will be needed right away,
such as during non-business hours, then advance()
can be invoked in a tight loop until it returns false in order
to finish the merging of the batches into a single batch, thereby
optimizing for retrievals and deletes. In such an optimized state, the
index will have performance similar to that of the underlying baseDb, i.e.
without the IncrementalMergingItemSpace overhead.
Usage is simple:
// baseDb is any ItemSpace, normally an InfinityDB
IncrementalMergingItemSpace db =
new IncrementalMergingItemSpace(baseDb, true/*isCreate*/, CACHE_SIZE);
// Now use db in any way desired.
// In beta 4/8/2007, only one update should be active concurrently.
CACHE_SIZE is the amount of temporary memory that is assumed to be
available to the cache in the underlying baseDb. If baseDb is an InfinityDB, then the cache
size provided in the open of the InfinityDB should be used, or somewhat less. If
this parameter is too large, performance may degrade considerably. Using
somewhat too small a value will also degrade performance but not significantly, so
it is best to underestimate. Also, if there are other threads using significant amounts
of the cache memory by virtue of their access patterns, the CACHE_SIZE
parameter may need to be reduced well below the cacheSize
parameter to InfinityDB.create() or InfinityDB.open().
Bulk loads using IncrementalMergingItemSpace
can be roughly 10 to 100 times as fast as simple random insertion
into a large InfinityDB database (insertions that are random in that
they always cause cache misses), while retrievals and deletes or updates will be
affected in a way that depends on the data patterns,
slowing down by 1 to 50 times or more. The speed of the basic operations
decreases logarithmically with database size.
Databases up to ten times the size of memory will be only
slowed by approximately the ratio of database to memory size. Each tenfold
increase in database size thereafter will decrease speed by about the same factor.
In other words, a load time for a database 10 times memory size will take S
seconds, while a database 100 times memory in size will take 20S seconds,
1000 times memory size will take 300S seconds and so on. These are very rough
estimates and actual performance does not decrease smoothly, being somewhat
'jagged'. Retrievals and deletes slow down similarly. There is no limit on
database size.
When an IncrementalMergingItemSpacewraps an InfinityDB, Items
can be inserted randomly into large databases at roughly 5K Items/sec to
40K Items/sec at 3GHz, whereas a simple InfinityDB will be limited by seek
speed to about 70 to 150 Items/sec (for one disk) when Items have no locality.
If there is locality (Items 'group together in sequence') then
a simple InfinityDB may actually be very good, since only the first
Item in a localized group will cause a disk I/O, and the rest of the Items
go very quickly into a single block that loaded into the cache. For example, if a
stock record is comprised of 30 Items having common prefixes that identify a
transaction, then only one disk block needs to be read and
written to insert that record, taking about 4 to 10 msec for a seek
for each operation. In that situation, the
30-Item records can be inserted at up to 3750 Items/sec (considering only the
I/O time), which is similar to the minimum
rate achieved by the IncrementalMergingItemSpace. However,
most applications will mix in some inversions on various attributes,
and there will be some non-local insertion. For the stock records, we also want to
insert inverse Items beginning with stock symbol, brokerage id,
customer id, and on to allow access given such data values - these
will be 'random' and will considerably slow down an InfinityDB used directly, but
will not slow down an IncrementalMergingItemSpace.
Retrieval speed depends on the access pattern. Each retrieval causes a set of
accesses on the baseDb, one for each existing 'batch'. As the database grows,
the number of batches grows logarithmically, starting at 1 and going up to 10, 20,
possibly even 50 for huge databases. In the worst case, a single random
read or delete access will require one disk I/O per batch. On the other hand, sequential
accesses will initially incur such repeated I/O's for an initial random access
followed by rather efficient scanning through the blocks that are
brought into memory as a result. Further sequential access
will only bring in blocks as needed into the cache, and there will be no
unnecessary block reads. While there is no extra I/O for such further sequential
retrievals or deletes, the in-cache access overhead of the operations is still
increased by the need for each retrieval to examine each batch to
find the next Item in sequence. So, while retrievals on a baseDb that
is an InfinityDB might run at 120K Items/sec/Ghz, an
IncrementalMergingItemSpace with 25 batches in the baseDb
would retrieve Items from blocks already in memory at 120K/25 Items/Sec/Ghz.
In this release, which is a Beta test release on 4/8/2007, concurrent access should be limited to one updater and multiple reader threads. In other words, any thread may retrieve, but only one should do inserts, deletes, or updates at a given time.