InfinityDB Java NoSQL DBMS

InfinityDB Embedded is a Java NoSQL database, a hierarchical sorted key-value store. It is high-performance, multi-core, flexible, and maintenance-free. InfinityDB Client/Server is now available as well.

InfinityDB Embedded is easy to use:

  • The entire database is in a single file, used by a single JVM
  • 10 simple API calls (insert, delete, delete suffixes, update, first, next, last, previous, commit, rollback)
  • 12 data types (String, double, float, long, boolean, date, byte array, byte string, char array, index, ‘EntityClass’, ‘Attribute’)
  • Extremely fast, beyond 1M ops/s, multi-core performance with patent-pending technology
  • Data compression up to 10x or more
  • Transactions
  • Instant recovery after any application failure with no log
  • APIs: fast ‘ItemSpace’, fast ConcurrentNavigableMap adapter, and JSON parser/printer
  • Zero administration
  • Optional ‘flexible’ self-extending schema goes far beyond tabular

InfinityDB Client/Server provides secure, remote, shared access to multiple InfinityDB Embedded files:

  • Web-based browsing and editing of data as flexible graphical tables, JSON, raw tuples, and images
  • Web-based administration of users, roles, permissions, and multiple InfinityDB Embedded databases
  • Security: SSL/TLS, bcrypt hashed and encrypted passwords, and more
  • NoSQL ‘Pattern Queries’ for easy powerful data restructuring, filtering, select/project/join/order-by
  • Remote REST access by Python or shell via ‘curl’ command, or local and remote Java

See the short sample code. A dzone.com article shows definitive detailed JMH performance testing.

There are thousands of deployments in current use by these large companies and more:

Atlassian has been shipping InfinityDB Embedded for years in their successful Crucible and Fisheye repository browser as the foundation for a fast web server, where it gathers and presents repository structure. Atlassian is a $4.3B Australia-based company that produces collaboration software for software developers.

A $21B international company uses InfinityDB Embedded in Kuwait in a time-series database for collecting real-time signals from distributed nodes for wellhead health and productivity monitoring. Their system keeps up with a very fast stream of input from hundreds of sensors over radio links, and archives the data for critical later analysis.

A $9B Canadian text processing software company ships InfinityDB Embedded in multiple text applications to thousands of customers.

Pacific Knowledge Systems is a well-established Australian company that uses InfinityDB Embedded in its ‘ripple-down-rules’ medical data analysis software products.

 

These companies and more have relied on InfinityDB Embedded for years for critical data storage in their successful commercial products. These projects require extreme speed, zero maintenance, and unique data structuring capabilities. InfinityDB Embedded is licensed for distribution in applications as a jar.

Fast Multi-Core Design

InfinityDB Embedded was already incredibly fast, but then we redesigned it to make use of all cores at the same time, each operating safely on a different thread. Now, InfinityDB Embedded runs at over 1 million ops per second on 8 cores as it scales. You can take advantage of this speed immediately on a server, or you can use multithreading in your application. Cores are multiplying at Moore’s-law speed, and applications are adding more and more threads.

Without the multi-core technology in InfinityDB Embedded to avoid inter-thread interference, bottlenecks called ‘convoys’ can occur when threads contend for data. Performance can drop dramatically, even far below single-thread speed. The concurrency algorithm is patent pending.

Zero Administration

InfinityDB Embedded applications can run indefinitely with no DBA attention for installation, management, application upgrade, or schema definitions like create table scripts.

A Database is a Single File

InfinityDB Embedded uses a single file for all purposes. The combination of this feature and the instant guaranteed recovery on abrupt application termination helps make InfinityDB Embedded administrator-free. No logs need to be archived or re-applied. There are no configuration files, temporary files, or text logs. No junk files are left behind after any kind of termination, so there is never any cleanup.

Reliability and Safety

InfinityDB Embedded uses a rugged internal storage update protocol, both for persistence on demand and for spilling the cache to disk when there is a large amount of data. The protocol maintains system-wide data integrity and survives abrupt application termination or other problems. The single data file remains up-to-date, safe, correct, and usable through any event. There is no log-based recovery, hence restart and recovery are immediate in all cases. No unexpected Exceptions are ever thrown, not even due to any kind of deadlock or internal resource limits (optional optimistic locking does throw expected Exceptions on conflict, however). No dangerous off-heap storage or native libraries are used.

Efficient Storage

Continuous Space Reclamation

Space allocation for individual and aggregated data is fully dynamic: no space is used until structures are created or after they are deleted. During growing or shrinking, structure storage is always minimal and efficient. The single data file is 100% efficient with compressed data on initial loading, and stays at least 50% efficient in the worst case after very large global transactions, which may include any amount of data. Normally, free space is about 10%. The single file never shrinks. Applications can run forever without gradual space loss. There are no temporary peaks in space usage, or temporary external files. There is no need for occasional reorganization or packing, and there is no garbage collector thread. All freed space is recycled on commit or rollback. Deletions or updates do not leave sparse structures behind – all freed space is reclaimed completely for immediate reuse without rebuilding indexes or running offline reorganizers.

High Compression on Disk and In Memory

InfinityDB Embedded uses continuous, dynamic ZLib and UTF-8 data compression to pack data into variable-length blocks, avoiding almost all wasted space that would normally be needed for internal fragmentation. I/O bandwidth is reduced accordingly. Variable-length binary-encoded primitives, variable-length concatenations of primitives or ‘Items’, and prefix and branch-cell suffix compression are used on disk and in the memory cache as well. Data compression means that the branching factor is kept high for fast access, and the OS file cache is better used. For compressible data, 10x is often achieved. There is no pre-allocation or waste in ‘extents’, ‘segments’, ‘clusters’, or fixed-size blocks. No gradual space leaks can occur because free space management is transactional. Any size database benefits from the compression, from 10KB to 100GB and beyond.
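
As a rough illustration of why prefix compression pays off on sorted keys, here is a minimal sketch of the general technique. It is not InfinityDB's actual block format: the class name PrefixCompressionSketch, the "length:suffix" text encoding, and the sample keys are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PrefixCompressionSketch {
    // Encodes each key as "<shared prefix length>:<remaining suffix>",
    // where the shared prefix is measured against the previous key.
    public static List<String> compress(List<String> sortedKeys) {
        List<String> out = new ArrayList<>();
        String prev = "";
        for (String key : sortedKeys) {
            int shared = 0;
            int max = Math.min(prev.length(), key.length());
            while (shared < max && prev.charAt(shared) == key.charAt(shared)) {
                shared++;
            }
            out.add(shared + ":" + key.substring(shared));
            prev = key;
        }
        return out;
    }

    // Reverses compress() by rebuilding each key from the previous one.
    public static List<String> decompress(List<String> encoded) {
        List<String> out = new ArrayList<>();
        String prev = "";
        for (String e : encoded) {
            int colon = e.indexOf(':');
            int shared = Integer.parseInt(e.substring(0, colon));
            String key = prev.substring(0, shared) + e.substring(colon + 1);
            out.add(key);
            prev = key;
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList(
                "Trees/oak/height", "Trees/oak/leaf", "Trees/pine/height");
        List<String> packed = compress(keys);
        System.out.println(packed);             // [0:Trees/oak/height, 10:leaf, 6:pine/height]
        System.out.println(decompress(packed)); // the original three keys
    }
}
```

Because neighbouring keys in a sorted store tend to share long prefixes, most entries shrink to a small count plus a short suffix, which is the effect the paragraph above describes.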

Simple NoSQL APIs

There are three APIs. The versatile, fast, low-level proprietary ‘ItemSpace’ is the foundation. A fast, extended nested Map view adapter can wrap the ItemSpace for more versatility. JSON can be parsed and printed directly from the data.

Nested Map-based Access

The nested Map view is a wrapper around the basic ItemSpace API, and it implements and extends java.util.concurrent.ConcurrentNavigableMap, thereby providing the capability of a ConcurrentHashMap or ConcurrentSkipListMap. InfinityDBMaps may contain other InfinityDBMaps or InfinityDBSets, which are extended ConcurrentSets. The InfinityDBMap is a light-weight Object which can be constructed dynamically without itself being persisted: the Map mutator methods actually store data in the database. Extensions to the NavigableMap API include:

  • composite keys – variable data types and component count
  • composite values or set elements – variable data types and component count
  • multi-map – unlimited values per key
  • tuple access via Object[] at the interface, variable length
  • nestable Maps and Sets
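
Since the nested Map view implements the standard java.util.concurrent.ConcurrentNavigableMap interface, code written against that interface gives a feel for the access style. The following is a minimal sketch that uses the JDK's ConcurrentSkipListMap as an in-memory stand-in. How an InfinityDBMap is constructed is not shown here, and the class name NestedMapSketch and the sample data are assumptions for illustration only.

```java
import java.util.concurrent.ConcurrentNavigableMap;
import java.util.concurrent.ConcurrentSkipListMap;

public class NestedMapSketch {
    // Builds a two-level sorted map: category -> (item -> count).
    // ConcurrentSkipListMap is only a stand-in for a persistent nested Map view.
    public static ConcurrentNavigableMap<String, ConcurrentNavigableMap<String, Long>> build() {
        ConcurrentNavigableMap<String, ConcurrentNavigableMap<String, Long>> root =
                new ConcurrentSkipListMap<>();
        root.computeIfAbsent("fruit", k -> new ConcurrentSkipListMap<>()).put("pear", 5L);
        root.computeIfAbsent("fruit", k -> new ConcurrentSkipListMap<>()).put("apple", 3L);
        root.computeIfAbsent("veg", k -> new ConcurrentSkipListMap<>()).put("kale", 2L);
        return root;
    }

    public static void main(String[] args) {
        ConcurrentNavigableMap<String, ConcurrentNavigableMap<String, Long>> root = build();
        // Keys are always sorted, at every level of nesting.
        System.out.println(root.firstKey());                                  // fruit
        System.out.println(root.higherKey("fruit"));                          // veg
        System.out.println(root.get("fruit").headMap("pear", false).keySet()); // [apple]
    }
}
```

Code that depends only on the ConcurrentNavigableMap interface, as above, can in principle be pointed at any implementation of it.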

Direct No-SQL ‘ItemSpace’ API

For the ultimate speed and extreme flexibility, the simple lower-level ‘ItemSpace’ API allows you access to the same data as the Map-based view. There are only 10 essential storage and retrieval methods that operate on the ItemSpace: insert, delete, deleteSubspace, update, first, next, last, previous, commit, and rollback. You gain low-level access by momentarily allocating a ‘Cu’ cursor, and then using it for the API method invocations and disposing it. There are helper utilities for things like text indexes, hierarchical sorting, inversions, and more. Applications can define rich creative models on top of the ItemSpace.
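
To make the idea of a sorted space of tuples concrete, here is a toy in-memory model. It is not the InfinityDB API: the class ToyItemSpace, its method signatures, and the restriction to String components are assumptions of this sketch, and only some of the method names echo the list above.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.ConcurrentSkipListSet;

public class ToyItemSpace {
    // Orders tuples component by component; a tuple sorts before its extensions.
    static final Comparator<List<String>> ITEM_ORDER = (a, b) -> {
        int n = Math.min(a.size(), b.size());
        for (int i = 0; i < n; i++) {
            int c = a.get(i).compareTo(b.get(i));
            if (c != 0) return c;
        }
        return Integer.compare(a.size(), b.size());
    };

    private final ConcurrentSkipListSet<List<String>> items =
            new ConcurrentSkipListSet<>(ITEM_ORDER);

    public void insert(String... components) {
        items.add(Arrays.asList(components.clone()));
    }

    public boolean delete(String... components) {
        return items.remove(Arrays.asList(components));
    }

    // Removes every tuple that starts with the prefix; returns the number removed.
    public int deleteSubspace(String... prefix) {
        List<String> p = Arrays.asList(prefix);
        int removed = 0;
        for (List<String> item : items.tailSet(p, true)) {
            if (!startsWith(item, p)) break; // sorted order: the subspace is contiguous
            if (items.remove(item)) removed++;
        }
        return removed;
    }

    // Smallest tuple greater than or equal to the argument, or null.
    public List<String> first(String... from) {
        return items.ceiling(Arrays.asList(from));
    }

    // Smallest tuple strictly greater than the argument, or null.
    public List<String> next(String... from) {
        return items.higher(Arrays.asList(from));
    }

    static boolean startsWith(List<String> item, List<String> prefix) {
        return item.size() >= prefix.size()
                && item.subList(0, prefix.size()).equals(prefix);
    }

    public static void main(String[] args) {
        ToyItemSpace space = new ToyItemSpace();
        space.insert("Trees", "oak", "height", "20");
        space.insert("Trees", "oak", "leaf", "lobed");
        space.insert("Trees", "pine", "height", "30");
        space.insert("Users", "ann");
        System.out.println(space.next("Trees", "oak"));           // [Trees, oak, height, 20]
        System.out.println(space.deleteSubspace("Trees", "oak")); // 2
        System.out.println(space.first("Trees"));                 // [Trees, pine, height, 30]
    }
}
```

The point of the sketch is that a handful of ordered-set operations is enough to navigate and prune whole subtrees, because every tuple sharing a prefix is adjacent in the sort order.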

JSON Access

Data in the database can be mapped directly to JSON text with a one-to-one correspondence. Utilities for parsing and generating JSON are provided. This goes beyond ‘Document’ databases, because the JSON is not stored as text but instead as compressed ‘paths’ or ‘Items’ each of which represents a JSON value. The 12 data types can be encoded into extended JSON, or ‘underscore-quoted’ standard JSON so any primitive can be a key or value, such as a date, and all keys are sorted. There is no artificial distinction between the ‘container’ of the documents and the documents themselves, so the size of any JSON sub-document depends on only the given path to it – from an entire database down to individual values. Access does not depend on loading and storing entire JSON documents – any scope can also be incrementally or atomically operated on at any scale via the ItemSpace or Map access, and in memory, this can reach millions of ops/sec.
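
The correspondence between a document and a set of paths can be sketched with plain Java collections. This is an illustration only: it uses nested java.util Maps and Lists in place of parsed JSON, it does not use InfinityDB's JSON utilities, and the class name JsonPathsSketch and the use of an Integer list position as the array component are assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class JsonPathsSketch {
    // Emits one path per leaf value: [key, key, ..., value].
    public static List<List<Object>> flatten(Object node) {
        List<List<Object>> out = new ArrayList<>();
        walk(node, new ArrayList<>(), out);
        return out;
    }

    private static void walk(Object node, List<Object> prefix, List<List<Object>> out) {
        if (node instanceof Map) {
            for (Map.Entry<?, ?> e : ((Map<?, ?>) node).entrySet()) {
                prefix.add(e.getKey());
                walk(e.getValue(), prefix, out);
                prefix.remove(prefix.size() - 1);
            }
        } else if (node instanceof List) {
            List<?> list = (List<?>) node;
            for (int i = 0; i < list.size(); i++) {
                prefix.add(i); // the list position stands in for an array component
                walk(list.get(i), prefix, out);
                prefix.remove(prefix.size() - 1);
            }
        } else {
            List<Object> path = new ArrayList<>(prefix);
            path.add(node); // the leaf value ends the path
            out.add(path);
        }
    }

    public static void main(String[] args) {
        Map<String, Object> doc = new TreeMap<>();
        doc.put("name", "oak");
        doc.put("tags", Arrays.asList("tall", "old"));
        // Prints [[name, oak], [tags, 0, tall], [tags, 1, old]]
        System.out.println(flatten(doc));
    }
}
```

Each emitted path is independent of the others, which is why a sub-document at any depth can be read or changed without touching the rest.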

Transactionality

Two kinds of transactions are available:
  • Global. This persists all current changes to disk, providing Atomic, Consistent, and Durable semantics. It does not use any kind of lock, so it does not provide inter-thread Isolation. However all access is concurrent during the commit by any threads. Effectively, there is a single ‘global transaction’ in effect at all times. Optimistic Locking commits also cause global commits.
  • Optimistic. Fine-grained multi-thread transactions use optimistic locking and support complete ACID (atomic, consistent, isolated, and durable) semantics. Locks do not follow the usual rules of other DBMSs but have capabilities equivalent to table locks and row locks, index locks, or even single-column value locks and single set element locks. This diversity of lock types is not actually a complex spectrum of details – it follows trivially from the basic data model and is automatic and almost invisible to the programmer. If desired, the programmer can easily control the lock order for maximum concurrency simply by accessing appropriate data early in the transaction. The locks are actually just set on prefixes of tuples, i.e. prefixes of Items, and are maintained transparently. The set of locked prefixes is kept in memory per database globally and also associated with each thread. Lock conflicts throw an OptimisticLockConflictException and are optionally retried by the application code. Concurrent optimistic transactions can reach hundreds of commits per second on disk, and thousands per second on flash.
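
The retry-on-conflict pattern mentioned for optimistic transactions can be sketched in isolation. Everything in this example is a stand-in and an assumption: ConflictException is a local class (not the library's OptimisticLockConflictException), VersionedCounter is a toy in-memory value with a version number, and updateWithRetry is a hypothetical helper.

```java
import java.util.function.IntUnaryOperator;

public class OptimisticRetrySketch {
    // Local stand-in for a conflict exception; not the library's class.
    public static class ConflictException extends RuntimeException {}

    // Toy versioned value: a commit succeeds only if nothing was
    // committed since the caller took its snapshot.
    public static class VersionedCounter {
        private int value;
        private int version;

        public synchronized int[] read() {
            return new int[] { value, version };
        }

        public synchronized void commit(int newValue, int readVersion) {
            if (readVersion != version) throw new ConflictException();
            value = newValue;
            version++;
        }
    }

    // Read, compute, try to commit; on conflict, start over from a fresh read.
    public static int updateWithRetry(VersionedCounter c, IntUnaryOperator fn, int maxRetries) {
        for (int attempt = 0; ; attempt++) {
            int[] snapshot = c.read();
            try {
                int next = fn.applyAsInt(snapshot[0]);
                c.commit(next, snapshot[1]);
                return next;
            } catch (ConflictException e) {
                if (attempt >= maxRetries) throw e; // give up after too many conflicts
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        VersionedCounter counter = new VersionedCounter();
        Thread[] threads = new Thread[4];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(() -> {
                for (int j = 0; j < 1000; j++) {
                    updateWithRetry(counter, v -> v + 1, Integer.MAX_VALUE);
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) t.join();
        System.out.println(counter.read()[0]); // 4000: no increment is lost
    }
}
```

The design choice shown is that no thread ever blocks while computing: work is done against a snapshot and simply redone if another thread committed first.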

Data Structures

Applications do not need to invent binary encodings or convert primitives to binary or text. Data is not stored as formatted text or as custom raw binary, but as an intermediate form, with standard pre-defined binary encodings of the individual Java primitives in a consistent way that allows extremely high speed.

InfinityDB Embedded supports all primitive Java data types and more:

  • long (stored as compressed bits to handle byte, short, char, with no more space)
  • float
  • double
  • boolean
  • String (stored as zlib compressed UTF-8)
  • Date/time
  • index (for ‘huge sparse arrays’, lists in JSON, and BLOBs/CLOBs, texts)
  • short byte and char arrays (sort by length first, used for BLOBs and CLOBs)
  • short byte strings (sort like strings but with bytes instead of 2-byte chars)
  • ‘EntityClasses’ and ‘Attributes’ which are optional metadata for rich self-extending ‘flexible’ structures.

Application-Specific Data Models

InfinityDB provides a rich data representation space for structured, semi-structured, or unstructured data. The  basic data model is simple but flexible enough to be used by the application to define and represent any mixture of trees, graphs, key/value maps, documents, text indexes, huge sparse arrays, tables with an unlimited number of columns of an unlimited number of values per column, nested multi-maps, inverted Entity-Attribute-Value triples, or creative custom structures.

Items and the ItemSpace

All structures in the entire database are represented as a magnitude-ordered set of ‘Items’ which are each a short variable-length composition of one or more arbitrary strongly-typed variable-length binary-encoded ‘components’. An Item can be thought of as a variable-length tuple, but is at base a logical array of 0 to 1665 chars – this internal binary format allows great speed and compression. These ordered Items represent the entire state of the database. All other conceptual upper-level structures are composed of Items with an application-defined meaning. Prefixes of Items are often used to logically nest Items into arbitrary recursive sub-spaces, i.e. sets of suffixes. All basic access to the database uses a temporary ‘Cu’ cursor containing one Item and no other state. The binary encoding of each component in an Item is transparent to the application, which uses only Java primitives indirectly to build and examine Items in a cursor. The internal binary encoding is done by InfinityDB Embedded in a fixed permanent way.

Variable-length Items can represent multiple sets of fixed-length tuples, the equivalent of multiple CSV files, or can represent paths to JSON type data. JSON can be parsed and formatted from the Items. The JSON is not stored literally: the entire database can be accessed at any level of hierarchical detail, because there is no fixed predefined division between keys and JSON documents.

Flexible Extensible Data Structures with ‘EntityClass’ and ‘Attribute’ Data Types

If the special EntityClass and Attribute data types are mixed in with the other ‘primitive’ data types in the Items, flexible, ‘incrementally self extending’ structures can be represented. See the InfinityDB Client/Server page for a graphical view of some examples of the flexible structures. An initial EntityClass component is normally used to separate data for unlimited independent uses even without the flexible structuring in a single InfinityDB Embedded file. An EntityClass is encoded as binary but contains a string with an initial capital letter followed by zero or more letters, digits, dot, dash, or underscore (as a regex: [A-Z][A-Za-z0-9._-]*). An Attribute is identical but starts with a lower case letter.
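
The naming rule above can be checked with java.util.regex. In this small sketch the EntityClass pattern is the one quoted in the text, while the Attribute pattern is derived from the statement that an Attribute is identical except for a lower-case first letter. The class and method names are hypothetical helpers and are not part of InfinityDB.

```java
import java.util.regex.Pattern;

public class MetaNameSketch {
    // EntityClass names: pattern as given in the text.
    static final Pattern ENTITY_CLASS = Pattern.compile("[A-Z][A-Za-z0-9._-]*");
    // Attribute names: same rule, but starting with a lower-case letter.
    static final Pattern ATTRIBUTE = Pattern.compile("[a-z][A-Za-z0-9._-]*");

    public static boolean isEntityClass(String s) {
        return ENTITY_CLASS.matcher(s).matches();
    }

    public static boolean isAttribute(String s) {
        return ATTRIBUTE.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isEntityClass("Trees"));    // true
        System.out.println(isEntityClass("trees"));    // false
        System.out.println(isAttribute("height_m"));   // true
        System.out.println(isAttribute("Height"));     // false
        System.out.println(isEntityClass("9Lives"));   // false
    }
}
```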

When used to represent a ‘flexible’ tabular structure, keys can be:

  • ‘tuples’, where a tuple is any concatenation of zero or more primitives of any type,
  • heterogeneous – different keys can have different primitive types or tuple types,
  • variadic – different keys can be tuples of a different number of primitive types,
  • nestable sparse arrays or lists of unlimited size of any key type, using the ‘index’ data type

Flexible table column values can be the same as keys plus:

  • multi-valued, with no limit on number, and where an absence of any value takes no storage,
  • CharacterLongObjects or BinaryLongObjects of unlimited size.

Furthermore, any such flexible structures can be nested by concatenating their Items onto the ends of other Items. A particular set of suffixes can contain any kind of nested structure. The ‘EntityClass’ and ‘Attribute’ data type components can represent four patterns depending on their pairings within each Item:

Pairing                                           | Meaning
EntityClass then data then Attribute then data    | a ‘table’
EntityClass then data then EntityClass then data  | a ‘sub-table’
Attribute then data then Attribute then data      | a ‘sub-attribute’
Attribute then data then EntityClass then data    | a ‘nested table’

The GUI display of such flexible structures is very rich – see it in action in InfinityDB Client/Server. The displays look like nestable ‘documents’, tables, lists, trees, and so on. Here is a flexible table with EntityClass “Trees”, a multi-value Attribute, ‘composite’ keys of variable component count, and a nested table “Location”.

Forwards and Backwards Schema Compatibility

The ItemSpace model is inherently extensible, but with the flexible ‘EntityClass’ and ‘Attribute’ metadata data types embedded in the Items, databases become ‘self-describing’ and can be extended in ways that avoid incompatibilities with earlier or later database backups, old or new application versions, or changing or extending data producers and consumers like users, Python scripts, bash ‘curl’ commands, IoT devices, or distributed databases.

Virtual View ItemSpaces

InfinityDB Embedded provides many utilities for dynamically viewing one or more underlying ItemSpaces as a virtual ItemSpace. All underlying ItemSpace changes are reflected immediately in the virtual view ItemSpace. A view is a true ItemSpace itself:

  • ItemSubspace virtually hides a fixed prefix of an ItemSpace and restricts the view to the Items under it;
  • DeltaItemSpace is a mutable view of a fixed underlying ItemSpace with its own commit and rollback;
  • AndSpace views a logically intersected set of underlying ItemSpaces;
  • OrSpace views a logically unioned set of underlying ItemSpaces;
  • RangeItemSpace views a limited range of Items;
  • VolatileItemSpace stores Items in memory non-persistently;
  • IncrementalMergingItemSpace views a special kind of index that can be incrementally built and optimized efficiently at any size while being accessed concurrently. Concurrent deletions are allowed. Text indexing is one use.

Views can be nested. An arbitrarily deep nesting of AndSpace and OrSpace can be flattened automatically for best speed. These capabilities provide a type of instant dynamic query capability without indexes, query compilation, execution, or temporary space usage. The virtual ItemSpaces are light-weight Objects. Any number of views can exist at once. The views can underlie the Map-based wrappers. They work with the flexible data representation using EntityClass and Attribute data types as well.
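
How an intersection or union view can answer navigation requests without copying data can be sketched with JDK sorted sets. This is an illustration of the general technique only, not the implementation of AndSpace or OrSpace: the class SetViewSketch and its methods are hypothetical, and plain Strings stand in for Items.

```java
import java.util.Arrays;
import java.util.NavigableSet;
import java.util.TreeSet;

public class SetViewSketch {
    // Smallest element >= from that is present in both sets, or null.
    // Leapfrogs between the sets with ceiling(), so nothing is materialized.
    public static String nextInIntersection(NavigableSet<String> a,
                                            NavigableSet<String> b, String from) {
        String x = a.ceiling(from);
        while (x != null) {
            String y = b.ceiling(x);
            if (y == null) return null;  // b is exhausted
            if (y.equals(x)) return x;   // both sets contain x
            x = a.ceiling(y);            // skip a forward to b's position
        }
        return null;
    }

    // Smallest element >= from that is present in either set, or null.
    public static String nextInUnion(NavigableSet<String> a,
                                     NavigableSet<String> b, String from) {
        String x = a.ceiling(from);
        String y = b.ceiling(from);
        if (x == null) return y;
        if (y == null) return x;
        return x.compareTo(y) <= 0 ? x : y;
    }

    public static void main(String[] args) {
        NavigableSet<String> a = new TreeSet<>(Arrays.asList("apple", "cherry", "kiwi", "plum"));
        NavigableSet<String> b = new TreeSet<>(Arrays.asList("banana", "cherry", "plum"));
        System.out.println(nextInIntersection(a, b, "a")); // cherry
        b.add("apple"); // a change to an underlying set is visible immediately
        System.out.println(nextInIntersection(a, b, "a")); // apple
        System.out.println(nextInUnion(a, b, "b"));        // banana
    }
}
```

Because each call consults the underlying sets directly, there is no index to rebuild and no temporary result set, which matches the behaviour the paragraph above describes for views.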

More Information

See the Manual for detailed information on InfinityDB Embedded. See Documents for the internal structure and the principles for constructing any higher-order data model from the trivial underlying ‘ItemSpace’ data model. For graphical representations of the ‘flexible’ structures using EntityClass and Attribute data types, see the InfinityDB Client/Server page. For a free InfinityDB Embedded trial download, see the shop. Here is the InfinityDB Embedded Trial License.

For licensing, email support@boilerbay.com.

Learn more about InfinityDB Client/Server

 

The AirConcurrentMap Java ConcurrentNavigableMap

A separate product, the fast com.infinitydb.map.AirConcurrentMap API is identical to the Standard Java Maps – in fact it is a java.util.concurrent.ConcurrentNavigableMap, optimized for more than about 1K Entries. You can use our time-limited trial version to compare the performance with that of the Standard Java Maps. Memory efficiency is higher than any JDK Map as well. Our extensions provide extreme performance for parallel operations even beyond streams.

Learn more about AirConcurrentMap.