January 28, 2007
New Sort Engines can be used as Bulk Loaders
InfinityDB now comes with two high-speed Item sorters for bulk sorting of large sets of Items. The API is extremely simple, consisting of only a few basic methods and classes for use within an embedded system that also contains an InfinityDB database instance. Bulk loading data into an InfinityDB database can be accelerated dramatically by providing the Items in sorted order, so that each new Item goes at the end of the BTree as it builds up, rather than at a random position in the database. Random access incurs slow disk seeks, but sorted input appends data directly to the end of the database file. You can see the performance improvement for yourself by running the new example programs provided with InfinityDB.
The sorters use an efficient sort/merge algorithm that requires minimal, controllable temporary space on disk. One sorter provides maximum disk space efficiency, while the other provides maximum speed. The 'ItemSpaceBased' sorter keeps all temporary data in InfinityDB instances, so it gains from the high compression efficiency of InfinityDB itself, which is always good and can be as high as 10-fold. The 'DiskBased' sorter keeps temporary data in a very simple uncompressed form for faster, sequential transfers.
The API requires only four steps: instantiating a sorter, inserting Items into it, getting an Item reader object from the sorter, and reading back the sorted Items. The sorters are easy to use for virtually any kind of efficient sorting job, because an Item can contain virtually any kind of data (see docs/manual/basic-operations.html).
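To make the workflow above concrete, here is a minimal, self-contained Java sketch of the same four steps on top of a simple sort/merge. It is not the InfinityDB API: the class name ToyMergeSorter and its methods insert() and reader() are hypothetical, Items are stood in for by plain Strings without line breaks, and each sorted run is an ordinary temporary text file. The real sorters are documented in the manual and Javadoc.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.PriorityQueue;

// Hypothetical stand-in, NOT the InfinityDB API: Items are plain Strings
// and every sorted run is an uncompressed temporary text file.
public class ToyMergeSorter {
    private final int maxItemsInMemory;              // bounds memory use per run
    private final List<String> buffer = new ArrayList<>();
    private final List<Path> runs = new ArrayList<>();

    // Step 1: instantiate a sorter.
    public ToyMergeSorter(int maxItemsInMemory) {
        this.maxItemsInMemory = maxItemsInMemory;
    }

    // Step 2: insert Items in any order; a full buffer is spilled as a sorted run.
    public void insert(String item) {
        buffer.add(item);
        if (buffer.size() >= maxItemsInMemory) {
            spill();
        }
    }

    // Sort the in-memory buffer and write it sequentially to a temporary file.
    private void spill() {
        if (buffer.isEmpty()) {
            return;
        }
        try {
            Collections.sort(buffer);
            Path run = Files.createTempFile("toy-run", ".txt");
            run.toFile().deleteOnExit();
            Files.write(run, buffer);                // one Item per line
            runs.add(run);
            buffer.clear();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Step 3: get a reader that merges all runs; the heap holds one head per run.
    public Iterator<String> reader() {
        spill();
        final PriorityQueue<Head> heap = new PriorityQueue<>();
        try {
            for (Path run : runs) {
                BufferedReader in = Files.newBufferedReader(run);
                String first = in.readLine();
                if (first != null) {
                    heap.add(new Head(first, in));
                } else {
                    in.close();
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new Iterator<String>() {
            @Override
            public boolean hasNext() {
                return !heap.isEmpty();
            }

            @Override
            public String next() {
                Head smallest = heap.poll();
                if (smallest == null) {
                    throw new NoSuchElementException();
                }
                try {
                    // Refill the heap from the run that supplied the smallest Item.
                    String following = smallest.source.readLine();
                    if (following != null) {
                        heap.add(new Head(following, smallest.source));
                    } else {
                        smallest.source.close();
                    }
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
                return smallest.item;
            }
        };
    }

    // The current first Item of one run, ordered by Item value.
    private static final class Head implements Comparable<Head> {
        final String item;
        final BufferedReader source;

        Head(String item, BufferedReader source) {
            this.item = item;
            this.source = source;
        }

        @Override
        public int compareTo(Head other) {
            return item.compareTo(other.item);
        }
    }

    public static void main(String[] args) {
        ToyMergeSorter sorter = new ToyMergeSorter(2);
        for (String item : new String[] {"pear", "apple", "fig", "kiwi", "date"}) {
            sorter.insert(item);
        }
        // Step 4: read back the sorted Items.
        Iterator<String> sorted = sorter.reader();
        while (sorted.hasNext()) {
            System.out.println(sorted.next());       // apple, date, fig, kiwi, pear
        }
    }
}
```

In a bulk load, the loop in step 4 would insert each Item into the database instead of printing it, so that the Items arrive in sorted order. The buffer size plays the role of the controllable space limit: a smaller buffer means less memory per run but more runs to merge.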
For a complete description of these sorters, see the manual in docs/manual/bulk-loading-and-sorting.html, as well as the Javadoc, which has extensive explanations of sorter usage. These docs are available in the trial and deployment downloads. The classes are called ItemSpaceBasedMergeSorterItemOutput and DiskBasedMergeSorterItemOutput.