InfinityDB now has an alpha-test client/server version as well.
We created the fastest, most extensible, no-SQL, Embedded Java Database Engine, now used in thousands of successful deployments. Here are the features:
- Nestable Multi-value (can represent trees, graphs, K/V, documents, huge sparse arrays, tables)
- Strong Typing (with no text/binary trap, plus CLOBs, BLOBs, and short byte and char arrays)
- Runtime schema evolution (for forwards/backwards compatibility)
- 1M ops/sec in memory multi-threaded
- Multi-core Concurrency – a true, scalable performance boost
- Extreme Compression – ZLib, UTF-8, variable data, variable blocks, shared prefixes
- Transactions – optimistic ACID for threads, or global ACD for bulk operations
- Self administration (one file, no DBA, no upgrade scripts, no logs, no configuration)
- Instant recovery – a unique disk write pattern prevents corruption or data loss
- Simple ‘ItemSpace’ API – instant developer productivity.
- Dynamic views of data for queries – set logic views, delta views, ranges
Reliability and Safety
InfinityDB uses a rugged internal storage update protocol, both for persistence on demand and for cache spilling to disk when the data is large. The protocol maintains system-wide data integrity and survives abrupt application termination, file system bugs, and kernel panics. The single data file remains up-to-date, safe, correct, and usable through any event. There is no log-based recovery, hence restart and recovery are immediate in all cases. (There is one exception, for power failure, which is a fundamental weakness of any software and is explained in Operating Guidelines.) No unexpected Exceptions are ever thrown, not even due to any kind of deadlock or internal resource limit (optional optimistic locking, however, throws expected Exceptions on conflict). No dangerous off-heap storage or native libraries are used.
InfinityDB is designed to use one single file. The combination of this feature and the instant recovery helps make this product administrator-free. No logs need to be archived or re-applied. There are no configuration files, temporary files, or text logs. No junk files are left behind after any kind of termination, so there is never any cleanup.
Fast Multi-Core Design
Our product was already incredibly fast, but then we redesigned it to make use of all cores at the same time, each operating safely on a different thread. Now, InfinityDB runs at 1 million ops per second on 8 cores with good scaling. You can take advantage of this speed immediately on a server, or you can use multiple threading in your application. Cores are multiplying at Moore’s-law speed, and applications are adding more and more threads.
Without the multi-core technology supplied by InfinityDB to avoid inter-thread interference, bottlenecks called ‘convoys’ can occur when threads contend for data. Performance can drop dramatically, even far below single-thread speed.
We are here to assist you in multi-threaded programming. Write to us at email@example.com
Mixed Relational and Application-Specific Data Models
InfinityDB provides a rich data representation space for structured, semi-structured, or unstructured data. The simple basic data model is used by the application to define and represent any mixture of trees, graphs, key/value maps, documents, text indexes, huge sparse arrays, tables with an unlimited number of columns of an unlimited number of values per row and column, nested multi-maps, inverted Entity-Attribute-Value triples, or creative custom structures.
Nested Map-based Access
There are two APIs, both of which are simple: one is the versatile, fast low-level proprietary ‘ItemSpace’ and the other one is a simple nested Map view.
The nested Map view is a wrapper around the ItemSpace, and it implements and extends the java.util.concurrent.ConcurrentNavigableMap, thereby providing the capability of a ConcurrentHashMap or ConcurrentSkipListMap. InfinityDBMaps may contain other InfinityDBMaps or InfinityDBSets which are standard ConcurrentSets. The InfinityDBMap is a light-weight Object which can be constructed dynamically without itself being persisted: only the Map mutator methods store data in the ItemSpace.
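As a rough illustration of the nested, sorted, concurrent Map style described above, the sketch below uses only standard java.util.concurrent classes as a stand-in. It is not the InfinityDBMap API, and nothing in it is persisted; the class name NestedMapSketch and the sample keys are assumptions made for the example.

```java
import java.util.NavigableSet;
import java.util.concurrent.ConcurrentNavigableMap;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.ConcurrentSkipListSet;

// Stand-in built from standard JDK classes; this is NOT the InfinityDBMap API.
class NestedMapSketch {
    // customer -> attribute -> sorted set of values
    final ConcurrentNavigableMap<String, ConcurrentNavigableMap<String, NavigableSet<String>>> root =
            new ConcurrentSkipListMap<>();

    // Inner maps and sets come into existence on first use.
    void add(String customer, String attribute, String value) {
        root.computeIfAbsent(customer, k -> new ConcurrentSkipListMap<>())
            .computeIfAbsent(attribute, k -> new ConcurrentSkipListSet<>())
            .add(value);
    }

    public static void main(String[] args) {
        NestedMapSketch db = new NestedMapSketch();
        db.add("bob", "email", "bob@example.com");
        db.add("alice", "phone", "555-0199");
        db.add("alice", "phone", "555-0100");
        // Keys and values are kept sorted, so range views are available.
        System.out.println(db.root.get("alice").get("phone"));
        System.out.println(db.root.tailMap("b", true).keySet());
    }
}
```

The inner maps and sets here are ordinary heap objects; in the nested Map view described above, only the mutator methods would store data in the ItemSpace.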
Direct No-SQL ‘ItemSpace’ API
For the ultimate speed and extreme flexibility, our trivial lower-level ‘ItemSpace’ API allows you access to the same data as the Map-based view. There are only a few storage and retrieval operations that operate on the noSQL InfinityDB ItemSpace. You gain low-level access by momentarily allocating a cursor, and then using it to insert, delete, update, locate, or scan data in the ItemSpace, which is nothing but an ordered set of variable-length ‘Items’. An Item can be thought of as an encoded extended tuple.
The database is defined entirely by the set of Items it contains – there is no other state. The Items are all kept and accessed sorted in the database, and can be accessed in sequence with prefix matching. All other structure is higher-level and is defined dynamically by the application via a few simple access operations. There is also a wide set of helper utilities.
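The idea of a database that is nothing but a sorted set of Items, scanned by prefix, can be sketched with a standard sorted set of tuples. This is a minimal model under stated assumptions, not the ItemSpace API: the class ItemSpaceSketch, its method names, and the use of String-only components are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.NavigableSet;
import java.util.concurrent.ConcurrentSkipListSet;

// Hypothetical model of an ordered set of tuples; NOT the InfinityDB ItemSpace API.
class ItemSpaceSketch {
    // Compare tuples component by component; a tuple sorts before its own extensions.
    static final Comparator<List<String>> ITEM_ORDER = (a, b) -> {
        int n = Math.min(a.size(), b.size());
        for (int i = 0; i < n; i++) {
            int c = a.get(i).compareTo(b.get(i));
            if (c != 0) return c;
        }
        return Integer.compare(a.size(), b.size());
    };

    final NavigableSet<List<String>> items = new ConcurrentSkipListSet<>(ITEM_ORDER);

    void insert(String... components) { items.add(List.of(components)); }

    void delete(String... components) { items.remove(List.of(components)); }

    // Return every Item that starts with the given prefix, in sorted order.
    List<List<String>> scan(String... prefix) {
        List<String> p = List.of(prefix);
        List<List<String>> out = new ArrayList<>();
        // Items sharing a prefix are contiguous, so stop at the first mismatch.
        for (List<String> item : items.tailSet(p, true)) {
            if (item.size() < p.size() || !item.subList(0, p.size()).equals(p)) break;
            out.add(item);
        }
        return out;
    }

    public static void main(String[] args) {
        ItemSpaceSketch space = new ItemSpaceSketch();
        space.insert("Customer", "bob", "email", "bob@example.com");
        space.insert("Customer", "alice", "phone", "555-0100");
        space.insert("Order", "17", "total", "42");
        System.out.println(space.scan("Customer"));
    }
}
```

Because the set itself is the only state, deleting an Item is the exact inverse of inserting it, and a prefix scan needs no separate index.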
Appropriate for Small to Large Installations
You can use InfinityDB as an In-Memory-Only DBMS keeping all data in the cache, or let it grow smoothly to hundreds of GB with no code changes. Access to data in the memory cache is fully multi-core, while infrequently used data is paged to disk transparently.
Data is operated on efficiently with fine granularity for small or large data structures regardless of memory capacity. Fine granularity accesses transition smoothly to coarse block-oriented granularity when and where needed.
Extensibility and Forwards and Backwards Compatibility
No application-defined data structures have physical or practical size limits but can always expand efficiently from zero size upwards to any size. Data structures that are empty take no space, hence any additional structure requires no reorganization, as each data structure effectively already exists virtually but with no size.
Applications can anticipate extensions: they can often provide some forwards compatibility, and they can provide the vital full backwards compatibility. No upgrade or downgrade scripts are needed – in fact there are no scripts at all, only runtime-created structures.
For example, when data is structured and viewed as tables, a new table or column can be added at any time, because all column values are nullable and come into existence on the first use, taking no space until then. There is no physical or practical limit on number of tables, columns per table, or the number of values per column. A single-valued column that is to be converted to multi-value is already in the proper format, and more values can simply be inserted along with it. Column values can also become aggregate structures at runtime.
Continuous Space Reclamation
Space allocation for individual and aggregated data is fully dynamic: no space is used until structures are created or after they are deleted. During growing or shrinking, structure storage is always minimal and efficient. The single data file is 100% efficient with compressed data on initial loading, and stays at least 50% efficient in the worst case after very large global transactions, which may include any amount of data. Normally, free space is about 10%. The file never shrinks. Applications can run forever without gradual space loss. There are no temporary peaks in space usage, or temporary external files. There is no need for occasional reorganization or packing, and there is no garbage collector thread. Freed space is recycled immediately.
High Compression on Disk and In Memory
InfinityDB’s continuous, dynamic ZLib and UTF-8 data compression packs data into variable-length blocks, avoiding almost all of the wasted space that internal fragmentation would normally cause. I/O bandwidth is reduced accordingly. Variable-length binary-encoded primitives, variable-length concatenations of primitives or ‘Items’, and prefix and branch-cell suffix compression are used on disk and in the memory cache as well. Data compression means that the branching factor is kept high for fast access, and the OS file cache is better used.
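To show why shared prefixes compress well in sorted data, here is a small sketch of generic front coding, where each key stores only the length of the prefix it shares with the previous key plus the remaining suffix. It is a textbook technique offered as an illustration, not InfinityDB's on-disk format; the class name and the "length|suffix" text layout are assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Generic front coding of sorted keys; NOT InfinityDB's actual block format.
class FrontCodingSketch {
    // Encode each key as "<sharedPrefixLength>|<suffix>" relative to the previous key.
    static List<String> encode(List<String> sortedKeys) {
        List<String> out = new ArrayList<>();
        String prev = "";
        for (String key : sortedKeys) {
            int max = Math.min(prev.length(), key.length());
            int shared = 0;
            while (shared < max && prev.charAt(shared) == key.charAt(shared)) shared++;
            out.add(shared + "|" + key.substring(shared));
            prev = key;
        }
        return out;
    }

    // Rebuild the full keys by reusing the shared prefix of the previous key.
    static List<String> decode(List<String> encoded) {
        List<String> out = new ArrayList<>();
        String prev = "";
        for (String entry : encoded) {
            int bar = entry.indexOf('|');
            int shared = Integer.parseInt(entry.substring(0, bar));
            String key = prev.substring(0, shared) + entry.substring(bar + 1);
            out.add(key);
            prev = key;
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> keys = List.of(
                "customer/alice/email", "customer/alice/phone", "customer/bob/email");
        List<String> packed = encode(keys);
        System.out.println(packed);
        System.out.println(decode(packed).equals(keys));
    }
}
```

The more deeply nested and regular the keys are, the longer the shared prefixes become and the less each additional key costs.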
There is no pre-allocation or waste in ‘extents’, ‘segments’, ‘clusters’, or fixed-size blocks. Deletions or updates do not leave sparse structures – all freed space is reclaimed completely for immediate reuse without rebuilding indexes or running offline reorganizers. No gradual space leaks can occur because free space management is transactional. Any size database benefits from the compression, from 10KB to 10GB and beyond.
Transactions
There are two kinds of transactions, global and optimistic:
- Global. This persists all current changes to disk, providing Atomic, Consistent, and Durable semantics. It does not use any kind of lock, so it does not provide inter-thread Isolation. However all access is concurrent during the commit by any threads. Effectively, there is a single ‘global transaction’ in effect at all times. Optimistic Locking commits also cause global commits.
- Optimistic. Fine-grained multi-thread transactions use optimistic locking and support complete ACID (atomic, consistent, isolated, and durable) semantics. Locks do not follow the usual rules of other DBMSs, but they have capability equivalent to table locks, row locks, index locks, or even single-column value locks and single set element locks. This diversity of lock types is not actually a complex spectrum of details: it follows trivially from the basic data model and is automatic and almost invisible to the programmer. If desired, the programmer can easily control the lock order for maximum concurrency simply by accessing appropriate data early in the transaction. The locks are actually just set on prefixes of tuples, i.e. prefixes of Items, and are maintained transparently. The set of locked prefixes is kept in memory per database globally and is also associated with each thread. Lock conflicts throw an OptimisticLockConflictException and are optionally retried by the application code. Concurrent optimistic transactions can reach hundreds of commits per second on disk, and thousands per second on flash.
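The retry-on-conflict pattern used with optimistic locking can be sketched in plain Java. The example below is a generic version-check scheme on a single in-memory value, not InfinityDB's prefix-lock implementation; the class OptimisticRetrySketch and its nested ConflictException are hypothetical stand-ins for the real transaction calls and for OptimisticLockConflictException.

```java
// Generic optimistic concurrency on one value; NOT InfinityDB's locking code.
class OptimisticRetrySketch {
    // Hypothetical stand-in for OptimisticLockConflictException.
    static class ConflictException extends Exception {}

    private long value;
    private long version;

    // Take a consistent snapshot: {value, version}.
    synchronized long[] read() { return new long[] {value, version}; }

    // Commit only if nobody else committed since the snapshot was taken.
    synchronized void commit(long newValue, long expectedVersion) throws ConflictException {
        if (version != expectedVersion) throw new ConflictException();
        value = newValue;
        version++;
    }

    // Application-level retry loop; returns the number of attempts needed.
    int increment() {
        int attempts = 0;
        while (true) {
            attempts++;
            long[] snapshot = read();
            try {
                commit(snapshot[0] + 1, snapshot[1]);
                return attempts;
            } catch (ConflictException e) {
                // Another thread committed first: re-read and try again.
            }
        }
    }

    // Two threads each perform 1000 increments; returns the final value.
    static long runDemo() {
        OptimisticRetrySketch cell = new OptimisticRetrySketch();
        Runnable work = () -> { for (int i = 0; i < 1000; i++) cell.increment(); };
        Thread a = new Thread(work);
        Thread b = new Thread(work);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return cell.read()[0];
    }

    public static void main(String[] args) {
        System.out.println(runDemo());
    }
}
```

No thread ever blocks while holding a lock across its own computation; a loser simply repeats its work, which is why retrying is left to the application.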
Sensible Data Representation
Data is not stored as formatted text or as custom raw binary, but as an intermediate form, with standard pre-defined binary encodings of the individual Java primitives in a consistent way that allows extremely high speed combined with clarity of representation. Applications do not need to invent binary encodings or convert primitives to binary or text.
InfinityDB natively supports all common primitive Java data types and more:
- long (stored as compressed bits so byte, short, and int take no more space)
- String (stored as UTF-8)
- short byte and char arrays (sort by length first)
- short byte strings (sort like strings but with bytes instead of 2-byte chars)
- ‘EntityClasses’ and ‘Attributes’ which are metadata that describe the semantics of an Item containing other types, which are ‘primitives’.
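To illustrate how a long can be stored so that small values take no more space than a byte, short, or int would, here is a sketch of a generic variable-length integer encoding (zigzag plus 7-bit groups). It is only an example of the general idea: it is not InfinityDB's encoding, and unlike a database key format it does not preserve sort order. The class name VarLongSketch is an assumption.

```java
import java.io.ByteArrayOutputStream;

// Generic zigzag + base-128 varint; NOT InfinityDB's binary encoding.
class VarLongSketch {
    static byte[] encode(long v) {
        // Zigzag maps small negative and positive values to small unsigned values.
        long z = (v << 1) ^ (v >> 63);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // Emit 7 bits per byte, low bits first; the high bit marks "more bytes follow".
        while ((z & ~0x7FL) != 0) {
            out.write((int) ((z & 0x7F) | 0x80));
            z >>>= 7;
        }
        out.write((int) z);
        return out.toByteArray();
    }

    static long decode(byte[] bytes) {
        long z = 0;
        int shift = 0;
        for (byte b : bytes) {
            z |= (long) (b & 0x7F) << shift;
            shift += 7;
        }
        // Undo the zigzag mapping.
        return (z >>> 1) ^ -(z & 1);
    }

    public static void main(String[] args) {
        long[] samples = {0, -1, 5, 1_000_000, Long.MAX_VALUE};
        for (long v : samples) {
            System.out.println(v + " takes " + encode(v).length + " byte(s)");
        }
    }
}
```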
When used to represent a nestable relational structure, InfinityDB ‘entities’ (i.e. relational composite keys) can be:
- ‘tuples’, where a tuple is any concatenation of zero or more primitives of any type,
- heterogeneous – different keys can have any primitive types or tuple types,
- variadic – different keys can be tuples of different lengths, or
- nestable sparse arrays of unlimited size of any key type.
Also, Attribute (i.e. relational column) values can be the same as keys plus:
- multi-valued, with no limit on number, and where an absence of any value takes no storage,
- nested tables (by means of multiple ‘EntityClass’ metadata components in the ‘Item’),
- nested attributes (by means of multiple ‘Attribute’ metadata components in the Item),
- nested structured documents (by means of ‘EntityClasses’ nested within an ‘Attribute’ in the Item), or
- CharacterLongObjects or BinaryLongObjects of unlimited size.
All structures in the entire database are represented as a magnitude-ordered set of ‘Items’ which are each a short limited-length composition of one or more arbitrary strongly typed binary-encoded ‘components’. An Item can be thought of as an extended tuple. These ordered Items represent the entire state of the database. All other conceptual upper-level structures are composed of Items with an application-defined meaning. Prefixes of Items are used to logically nest Items into arbitrary recursive sub-spaces. All basic access to the database uses a cursor containing one Item and no other state. The binary encoding of each component in an Item is transparent to the application, which is a level above, and which uses only Java primitives indirectly to build and examine Items in a cursor. The binary encoding is done by InfinityDB in a hidden, fixed permanent way.
The relational model is actually further extended by allowing column values to be nested tables. The relations and their nesting are determined by means of a pair of special ‘non-primitive’ Item components called ‘EntityClasses’ and ‘Attributes’. These two, when included in the series of components that comprise an Item, describe the meta semantics, like ‘punctuation’. A table requires a single EntityClass, followed by zero or more primitives (the ‘entity’ tuple or ‘primary key’), then an Attribute component, and finally zero or more primitive components forming the ‘value’ tuple. A relation is a set of such ‘quads’ having the same EntityClass. The EntityClass is like a table name, the entity tuple is like a primary key, the Attribute is like a column name, and the rest after the Attribute is the column value. By adding further EntityClasses and Attributes into each of the Items, nested tables, subtables, sub-attributes and other structures are formed. There is no other structure-determining storage elsewhere, so adding a single Item can instantly create multi-values, nested tables or any other extension. Deleting an Item reverses its insertion, with no other maintenance required.
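The ‘quad’ layout described above can be modelled in a few lines to show how a new column or an extra value appears simply by inserting one more Item. This is a simplified sketch, not the InfinityDB API: every component is a plain String, each Item has exactly four components, and the class QuadTableSketch and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

// Simplified model: each Item is (EntityClass, entity key, Attribute, value).
// All names here are illustrative; this is NOT the InfinityDB API.
class QuadTableSketch {
    // Order Items component by component, left to right.
    static final Comparator<List<String>> ORDER = (a, b) -> {
        for (int i = 0; i < 4; i++) {
            int c = a.get(i).compareTo(b.get(i));
            if (c != 0) return c;
        }
        return 0;
    };

    final TreeSet<List<String>> items = new TreeSet<>(ORDER);

    void insert(String entityClass, String key, String attribute, String value) {
        items.add(List.of(entityClass, key, attribute, value));
    }

    void delete(String entityClass, String key, String attribute, String value) {
        items.remove(List.of(entityClass, key, attribute, value));
    }

    // All values of one column for one row; an absent column is just an empty list.
    List<String> values(String entityClass, String key, String attribute) {
        List<String> prefix = List.of(entityClass, key, attribute);
        List<String> out = new ArrayList<>();
        // "" is the smallest String, so this probe sorts before every value of the column.
        for (List<String> item : items.tailSet(List.of(entityClass, key, attribute, ""), true)) {
            if (!item.subList(0, 3).equals(prefix)) break;
            out.add(item.get(3));
        }
        return out;
    }

    public static void main(String[] args) {
        QuadTableSketch db = new QuadTableSketch();
        db.insert("Customer", "alice", "phone", "555-0100");
        db.insert("Customer", "alice", "phone", "555-0199"); // the column becomes multi-valued
        db.insert("Customer", "alice", "email", "alice@example.com"); // a new column, no schema change
        System.out.println(db.values("Customer", "alice", "phone"));
        System.out.println(db.values("Customer", "bob", "phone"));
    }
}
```

In this model there is no catalog to update: the set of Items is the schema, so removing the last Item of a column removes the column.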
Avoid the Text/Binary Trap
Other NoSQL databases store either text or custom binary and are ‘traps’.
JSON or XML representations have various limitations. Some require slow formatting and parsing, most are non-hierarchical and targeted only at document granularity per key within practical document size limits, some cannot sort by key due to their hashtable structure, some may use only strings as keys, some cannot compose keys, some cannot natively or efficiently represent binary or character streams or ‘LOBs’, especially when they are long, some cannot have multi-values, and some can be space inefficient. Usually, a key has a fixed, non-hierarchical meaning, so a given key store serves only one purpose.
Java Object serialization is also targeted at chunk-at-a-time access within practical size limits, and does not provide key-based or other access of the chunks or to their internal structures. Programmers must carefully architect the storage structure, and that structure becomes bound to the class structure instead of the data semantics. Serialization has extensibility, versioning, security, upgrade, Object integrity, documentation, reliability, coding complexity, space efficiency, and other issues. Similar problems afflict POJO persistence.
Object/Relational mapping has the familiar, classic ‘impedance mismatch’ problem. The systems are complex and high maintenance. Database structure is determined by both the class structure and the relation structure, which must be versioned in sync and require both upgrade scripts and class code rewrites. Hence runtime extensibility is impossible. Objects end up with either embedded dynamic SQL or else heavy mapping frameworks to separate out the SQL.
Virtual View ItemSpaces
InfinityDB provides many utilities for dynamically viewing one or more underlying ItemSpaces as a virtual ItemSpace. All underlying ItemSpace changes reflect immediately in the virtual view ItemSpace. A view is a true ItemSpace itself:
- SubSpace virtually hides and restricts by a fixed prefix of an ItemSpace;
- DeltaItemSpace is a mutable view of a fixed ItemSpace with commit/rollback;
- AndSpace views a logically intersected set of underlying ItemSpaces;
- OrSpace views a logically unioned set of underlying ItemSpaces;
- RangeItemSpace views a limited range of Items;
- VolatileItemSpace stores Items in memory non-persistently;
- IncrementalMergingItemSpace views a special kind of index that can be incrementally built and optimized efficiently at any size while being accessed concurrently. Concurrent deletions are allowed. Text indexing is one use.
Views can be nested. A nesting of set operation views can be flattened automatically for best speed. These capabilities provide a type of instant dynamic query capability without indexes, query compilation, execution, or temporary space usage. The virtual ItemSpaces are light-weight Objects. Any number of views can exist at once. The views can use the Map-based wrappers.
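The notion of a live set-logic view can be sketched with standard sorted sets: the view holds references to its underlying sets and computes results only when scanned, so later changes show through. This is a simplified model, not the AndSpace or OrSpace classes themselves; the View interface, the class SetViewSketch, and the use of naturally ordered String sets are assumptions.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

// Simplified live views over naturally ordered sets; NOT the InfinityDB view classes.
class SetViewSketch {
    interface View {
        boolean contains(String item);
        List<String> scan(); // sorted contents at the moment of the call
    }

    private static String next(Iterator<String> it) { return it.hasNext() ? it.next() : null; }

    // Intersection: an item is visible only if both underlying sets hold it.
    static View and(NavigableSet<String> a, NavigableSet<String> b) {
        return new View() {
            public boolean contains(String item) { return a.contains(item) && b.contains(item); }
            public List<String> scan() {
                List<String> out = new ArrayList<>();
                for (String s : a) if (b.contains(s)) out.add(s);
                return out;
            }
        };
    }

    // Union: merge the two sorted sequences, emitting duplicates once.
    static View or(NavigableSet<String> a, NavigableSet<String> b) {
        return new View() {
            public boolean contains(String item) { return a.contains(item) || b.contains(item); }
            public List<String> scan() {
                List<String> out = new ArrayList<>();
                Iterator<String> ia = a.iterator(), ib = b.iterator();
                String x = next(ia), y = next(ib);
                while (x != null || y != null) {
                    int c = (x == null) ? 1 : (y == null) ? -1 : x.compareTo(y);
                    if (c <= 0) {
                        out.add(x);
                        x = next(ia);
                        if (c == 0) y = next(ib);
                    } else {
                        out.add(y);
                        y = next(ib);
                    }
                }
                return out;
            }
        };
    }

    public static void main(String[] args) {
        NavigableSet<String> a = new TreeSet<>(List.of("apple", "cherry", "fig"));
        NavigableSet<String> b = new TreeSet<>(List.of("banana", "cherry"));
        View both = and(a, b);
        System.out.println(both.scan());
        b.add("fig"); // the view reflects the change without being rebuilt
        System.out.println(both.scan());
        System.out.println(or(a, b).scan());
    }
}
```

Since a view here is just a small object holding two references, any number of views can exist at once, and a view could itself be wrapped by another view.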
See the Manual for detailed information. See Documents on the internal structure or the principles for constructing any higher-order data model from the trivial underlying ‘ItemSpace’ data model. For a Free Trial Download see the shop.
For licensing, email firstname.lastname@example.org.
The new Client/Server InfinityDB
Please see the new version 5 improvement for client/server, i.e. non-embedded use.
Current multi-year InfinityDB licensees of the embedded version include: