InfinityDB Java NoSQL Database

InfinityDB Server is a complete secure user-friendly DBMS. It can connect the database directly to REST clients and other servers with PatternQueries. It is based on InfinityDB Embedded internally.

InfinityDB Embedded and InfinityDB Encrypted are Java modules for use in Java applications, licensable separately.

The Server

With the secure server, effortlessly generate and maintain data, schemas, REST or Microservices APIs, queries, and indexing via the web to enhance productivity without the need for coding. PatternQueries go far beyond SQL; they almost resemble programs but remain declarative rather than algorithmic. They are simple, clear, concise, easy to learn, and editable as text or graphics. Each REST access URL corresponds to a single query, serving as a secure interface that conceals the database structure – nothing else is necessary. Schema and API evolution, tuning, testing, exploration and experimentation can all be done by making live edits to data and queries via the web, without any server downtime or interruption.

All Data Types

Images, binary data (Blobs), documents, tables, tuples, sets, and lists may be flexibly nested, each with 12 data types. Metadata ‘class’ and ‘attribute’ data types are seamlessly embedded within the data, allowing for instant and incremental extensions. All data is more than just JSON documents; it is strongly typed, efficiently sorted, binary encoded for speed, and interchangeable with any format on demand. Databases are stored as highly-compressed ‘B-Trees’. Queries are stored as structured data within their respective databases.

Web-Based

Web-based database administration and data exploration and editing are available in both text and graphical modes. Data displays as a mix of binary data, nestable structures and text, and is rich, flexible, informal or formalized. All data can be viewed and edited as ‘i-code,’ JSON, CSV, other text forms, or graphically. All formats can embed images, pdfs, plain text, or any other stored blob or file.

Optimized

The PatternQuery compiler is highly optimized, and while it offers optional fanout hints, it always completes the compilation phase within milliseconds, unlike traditional RDBMS systems. Optionally, users can customize queries to effectively embed optimal execution plans, such as the choice of inversions. A unique ‘zig-zag’ algorithm significantly accelerates joins, intersections, and differences, all without requiring temporary storage, even in the absence of indexes. The underlying InfinityDB Embedded ‘database engine’ is highly optimized as well, and is multi-core threaded for speed and isolation of short and long operations.
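To make the idea of a zig-zag join concrete, here is a minimal sketch of intersecting two sorted sets by alternately seeking each one forward to the other's current candidate. It uses only the standard `java.util.NavigableSet`; the class and method names are hypothetical, and this is a generic illustration of the technique, not the PatternQuery implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

class ZigZagIntersection {
    /** Intersects two sorted sets by alternately seeking each one to the other's candidate. */
    static <T extends Comparable<T>> List<T> intersect(NavigableSet<T> a, NavigableSet<T> b) {
        List<T> out = new ArrayList<>();
        if (a.isEmpty() || b.isEmpty()) return out;
        T candidate = a.first();
        while (candidate != null) {
            // "Zig": seek b to its first element >= candidate.
            T inB = b.ceiling(candidate);
            if (inB == null) break;
            if (inB.compareTo(candidate) == 0) {
                out.add(candidate);              // present in both sets
                candidate = a.higher(candidate); // move past the match
            } else {
                // "Zag": b skipped ahead, so seek a forward to catch up.
                candidate = a.ceiling(inB);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        NavigableSet<Integer> a = new TreeSet<>(List.of(1, 3, 5, 7, 9, 11));
        NavigableSet<Integer> b = new TreeSet<>(List.of(2, 3, 4, 9, 10, 11, 12));
        System.out.println(intersect(a, b)); // prints [3, 9, 11]
    }
}
```

Each seek can skip many elements at once, and apart from the output list nothing is buffered, which is the property the paragraph above attributes to the approach.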

Secure

Unlike hand-crafted REST interfaces and web servers, InfinityDB is secured by SSL/TLS for all communications. Administrative data is encrypted and passwords are hashed. There are users, passwords, roles, permissions, databases, and grants. REST access is finely controllable via named interfaces.

Remote

REST access from almost any language can easily do request-response data interchanges with the server. Also, remote servers or Java clients can be connected via a binary ‘ItemPacket’ protocol for consolidation, backup, curation, and collaborative purposes.

For more see InfinityDB Server where we show the web user interface and explain the data model and PatternQueries.

Applications

Server Architecture

The InfinityDB Server version utilizes the architecture depicted below. It is designed for seamless deployment on Amazon Elastic Compute Cloud (EC2), allowing for instant launch as a virtual machine. Additionally, it can be executed on any platform where Java is supported with an ‘on-premises’ license. On the right-hand side, you can see the REST clients, which can encompass various devices such as laptops, desktops, mobile devices, or IoT devices. The REST protocol is widely adopted and extremely user-friendly, making it accessible for any client program. A REST request functions like a remote procedure call, with each API method being handled by a straightforward declarative PatternQuery.

The remote servers represent other InfinityDB instances that have been configured as remotes within the referring server. From the perspective of clients, these remotes appear as regular databases. On the left side, you’ll find the web console, which serves as the interface for administration and interactive data access and querying. All communication is secured through SSL/TLS encryption.

Within the EC2 instance, there are two Elastic Block Store (EBS) volumes: one serves as the root volume for the code, while the other is designated for data storage. The root volume is periodically updated to accommodate newer versions. When operating outside the EC2 environment, the use of two separate volumes is not required.

InfinityDB Architecture

InfinityDB Embedded: The Underlying Database ‘Engine’

InfinityDB Embedded and InfinityDB Encrypted are also available separately from the server. They comprise the ‘engine’ that underlies InfinityDB Server and provide the highest available performance, according to our customers and the provided performance tests.

Performance

  • More than 1M ops/sec are typical for multi-threaded insert, delete, and next in cache
  • Multi-core overlapping operations scale almost linearly in thread count
  • Almost all cores are used with many threads
  • Threads use fair scheduling, with very low inter-thread interference
  • Random I/O scales logarithmically in file size, with no size limit
  • Huge caches are efficient – 1MB to 100GB or more, and are on-heap
  • Transactions are fast: 50/s on disk, 300/s on flash, or thousands/sec for delayed durability
  • Database open is immediate, even for recovery after abrupt exit

Features

Here are the basic InfinityDB Embedded and InfinityDB Encrypted ‘Database Engine’ features. InfinityDB Encrypted is identical to InfinityDB Embedded but encrypts 100% of the database 100% of the time.

  • Each standalone database is in a single file, used by a single JVM. The file format never changes.
  • 10 simple API calls (insert, delete, delete suffixes, update, first, next, last, previous, commit, rollback)
  • 12 basic data types:
    • String, double, float, long, boolean, date, byte array, byte string, char array, index, ‘Class’, ‘Attribute’
  • Unlimited nestable data structures and patterns:
    • images, files, mime-typed BLOBs, trees, documents, inversions, lists, maps, sets, variable strongly-typed tuples, elegant Entity-Attribute-Value
  • Schema is dynamically extensible but can be backwards compatible
  • Data compression up to 10x or more, continuous, no memory or storage leaks
  • APIs: fast ‘ItemSpace’, fast extended Java-standard ConcurrentNavigableMap adapter
  • Zero administration

Sample Code

See the short embedded example code, map access example code, encrypted example code or client/server example code. Also see PatternQuery examples. A dzone.com article shows definitive performance testing with the Java Microbenchmarking Harness.

Mission-Critical Deployments

Thousands of deployments have been in use for years at these large companies and more:

Atlassian has been shipping InfinityDB Embedded for years to tens of thousands of customers in its successful Crucible and Fisheye repository browsers, where it serves as the foundation for a fast web server that gathers and presents repository structure. Atlassian is an Australia-based company that produces collaboration software for software developers.

An international company uses InfinityDB Embedded in Kuwait in a time-series database for collecting real-time signals from distributed nodes for wellhead health and productivity monitoring. Their system keeps up with a very fast stream of input from hundreds of sensors over radio links, and archives the data for critical later analysis.

A large Canadian text processing software company ships InfinityDB Embedded in an enterprise-grade text indexing system. The index provides access to distributed documents based on content.

An innovative Australian company uses InfinityDB Encrypted in its ‘ripple-down-rules’ medical data analysis software products. It uses encryption for health security compliance.

These companies and more have relied on InfinityDB Embedded for years for critical data storage of their successful commercial products.  These projects require extreme speed,  zero maintenance, and unique data structuring capabilities. InfinityDB Embedded is licensed for distribution in applications as a jar.

Multi-Core Concurrency

InfinityDB Embedded was already incredibly fast, but then we redesigned it to make use of all cores at the same time, each operating safely on a different thread. Now, InfinityDB Embedded runs at over 1 million ops per second on 8 cores as it scales. You can take advantage of this speed immediately on a server, or you can use multiple threading in your  application. Cores are multiplying at Moore’s-law speed, and applications are adding more and more threads.

Without the multi-core technology in InfinityDB Embedded to avoid inter-thread interference, bottlenecks called ‘convoys’ can occur when threads contend for data. Performance can drop dramatically, even far below single-thread speed. The concurrency algorithm is patented now.

Zero Administration

InfinityDB Embedded applications can run indefinitely with no DBA attention for installation, management, application upgrade, or schema definitions like create table scripts.

A Database is a Single File

InfinityDB Embedded uses a single file for each database. The combination of this feature and the instant guaranteed recovery on abrupt application termination helps make InfinityDB Embedded administrator-free. No logs need to be archived or re-applied. There are no configuration files, temporary files, or text logs. No junk files are left behind after any kind of termination, so there is never any cleanup.

Reliability and Safety

InfinityDB Embedded uses a rugged internal storage update protocol that maintains system-wide data integrity and survives abrupt application termination or other problems. The protocol covers both persistence on demand and cache spilling to disk for large amounts of data. The single data file remains up-to-date, safe, correct, and usable through any event. There is no log-based recovery, hence restart and recovery are immediate in all cases. No unexpected Exceptions are ever thrown, not even due to any kind of deadlock or internal resource limits (optional optimistic locking does throw expected Exceptions on conflict, however). No dangerous off-heap storage or native libraries are used.

Storage Efficiency

Continuous Space Reclamation

Space allocation for individual and aggregated data is fully dynamic: no space is used until structures are created or after they are deleted. During growing or shrinking, structure storage is always minimal and efficient. The single data file is 100% efficient with compressed data on initial loading, and stays at least 50% efficient in the worst case after very large global transactions, which may include any amount of data. Normally, free space is about 10%. The single file never shrinks. Applications can run forever without gradual space loss. There are no temporary peaks in space usage, or temporary external files. There is no need for occasional reorganization or packing, and there is no garbage collector thread. All freed space is recycled on commit or rollback. Deletions or updates do not leave sparse structures behind – all freed space is reclaimed completely for immediate reuse without rebuilding indexes or running offline reorganizers.

High Compression on Disk and In Memory

InfinityDB Embedded uses continuous, dynamic ZLib and UTF-8 data compression to pack data into variable-length blocks, avoiding almost all wasted space that would normally be needed for internal fragmentation. I/O bandwidth is reduced accordingly. Variable-length binary-encoded primitives, variable-length concatenations of primitives or ‘Items’, and prefix and branch-cell suffix compression are used on disk and in the memory cache as well. Data compression means that the branching factor is kept high for fast access, and the OS file cache is better used. For compressible data, 10x is often achieved. There is no pre-allocation or waste in ‘extents’, ‘segments’, ‘clusters’, or fixed-size blocks. No gradual space leaks can occur because free space management is transactional. Any size database benefits from the compression, from 10KB to 100GB and beyond.
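As an illustration of why prefix compression works well on sorted keys, the sketch below applies plain front coding: each key stores only the number of leading characters it shares with the previous key, plus the remaining suffix. This is a generic textbook technique shown under stated assumptions; the class name, the slash-separated sample keys, and the entry layout are hypothetical and do not describe InfinityDB's actual block format.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class FrontCoding {
    /** Encodes sorted keys as (shared-prefix length, suffix) pairs relative to the previous key. */
    static List<Map.Entry<Integer, String>> encode(List<String> sortedKeys) {
        List<Map.Entry<Integer, String>> out = new ArrayList<>();
        String prev = "";
        for (String key : sortedKeys) {
            int shared = 0;
            int max = Math.min(prev.length(), key.length());
            while (shared < max && prev.charAt(shared) == key.charAt(shared)) shared++;
            out.add(Map.entry(shared, key.substring(shared)));
            prev = key;
        }
        return out;
    }

    /** Rebuilds the original keys by reusing the prefix of the previously decoded key. */
    static List<String> decode(List<Map.Entry<Integer, String>> entries) {
        List<String> out = new ArrayList<>();
        String prev = "";
        for (Map.Entry<Integer, String> e : entries) {
            String key = prev.substring(0, e.getKey()) + e.getValue();
            out.add(key);
            prev = key;
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> keys = List.of("Trees/oak/height", "Trees/oak/leaf", "Trees/pine/height");
        System.out.println(encode(keys)); // prints [0=Trees/oak/height, 10=leaf, 6=pine/height]
    }
}
```

Because neighbouring keys in a sorted structure tend to share long prefixes, most entries shrink to a small integer and a short suffix.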

Simple NoSQL APIs

There are several APIs, all of which can efficiently transfer images, blobs, or any other data:

  • For InfinityDB Embedded and InfinityDB Encrypted:
    • ItemSpace –  a small proprietary set of methods to query and modify the database at very high speed.
    • Map – an extended java.util.concurrent.ConcurrentNavigableMap that wraps the ItemSpace
    • JSON – any ItemSpace data can be formatted and parsed
  • For InfinityDB Server:
    • REST – a widely used network request/response protocol based on HTTP, available through JavaScript, Java, and Python clients, or even from the unix/Linux shell via the curl utility.
    • ItemPacket – a proprietary binary protocol for fast Java client to InfinityDB Server remote access, or between Servers.
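For the REST route, a Java client needs nothing beyond the JDK's `java.net.http` package (Java 11 or later). The sketch below only builds a request and does not send it. The URL, the use of HTTP Basic credentials, and the `Accept` header are assumptions made for illustration; consult the server documentation for the real URL layout and authentication scheme.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class RestRequestSketch {
    /** Builds (but does not send) a GET request carrying HTTP Basic credentials. */
    static HttpRequest buildGet(String url, String user, String password) {
        String token = Base64.getEncoder()
                .encodeToString((user + ":" + password).getBytes(StandardCharsets.UTF_8));
        return HttpRequest.newBuilder(URI.create(url))
                .header("Authorization", "Basic " + token) // assumed auth scheme
                .header("Accept", "application/json")      // assumed response format
                .GET()
                .build();
    }

    public static void main(String[] args) {
        // Hypothetical host and path, for illustration only.
        HttpRequest request = buildGet("https://db.example.com/some/query", "user", "pass");
        System.out.println(request.method() + " " + request.uri());
    }
}
```

To actually perform the call, pass the request to `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())` and read the body of the returned response.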

ItemSpace API

For the ultimate speed and extreme flexibility, the simple lower-level ‘ItemSpace’ API can do anything. There are only 10 essential storage and retrieval methods that operate on the ItemSpace: insert, delete, deleteSubspace, update, first, next, last, previous, commit, and rollback. You gain low-level access by momentarily allocating a ‘Cu’ cursor, using it for the API method invocations, and then disposing of it. There are helper utilities for things like text indexes, hierarchical sorting, inversions, and more. Applications can define rich creative models on top of the ItemSpace. An ItemSpace is like a single sorted set of tuples, each being any sequence of components of the 12 primitive data types. For extreme speed, the tuples are actually dealt with as ‘Items’, which use a standard binary encoding for the components and can be up to 1665 chars in length, but the encodings are never directly dealt with.
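The "single sorted set of tuples" model can be imitated in a few lines with a standard `ConcurrentSkipListSet`. The toy class below is not the InfinityDB ItemSpace or its ‘Cu’ cursor API: the class name, the method signatures, the restriction to String components, and the choice that deleting a subspace also removes the prefix tuple itself are all assumptions made only to show the shape of the operations.

```java
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.ConcurrentSkipListSet;

class ToyItemSpace {
    // Component-wise ordering; a tuple sorts before any longer tuple that it prefixes.
    static final Comparator<List<String>> ORDER = (x, y) -> {
        int n = Math.min(x.size(), y.size());
        for (int i = 0; i < n; i++) {
            int c = x.get(i).compareTo(y.get(i));
            if (c != 0) return c;
        }
        return Integer.compare(x.size(), y.size());
    };

    private final ConcurrentSkipListSet<List<String>> items = new ConcurrentSkipListSet<>(ORDER);

    void insert(String... components) { items.add(List.of(components)); }

    void delete(String... components) { items.remove(List.of(components)); }

    /** Smallest stored tuple strictly greater than the given one, or null if none. */
    List<String> next(String... components) { return items.higher(List.of(components)); }

    /** Removes every stored tuple that starts with the given prefix. */
    void deleteSubspace(String... prefix) {
        List<String> p = List.of(prefix);
        items.removeIf(t -> t.size() >= p.size() && t.subList(0, p.size()).equals(p));
    }

    int size() { return items.size(); }

    public static void main(String[] args) {
        ToyItemSpace db = new ToyItemSpace();
        db.insert("Trees", "oak", "height", "20");
        db.insert("Trees", "oak", "leaf", "lobed");
        db.insert("Trees", "pine", "height", "30");
        System.out.println(db.next("Trees", "oak")); // prints [Trees, oak, height, 20]
        db.deleteSubspace("Trees", "oak");
        System.out.println(db.size());               // prints 1
    }
}
```

Calling `next` with a prefix lands on the first tuple under that prefix, which is how a sorted tuple set can serve tables, trees, and nested structures alike.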

Map API

The nested Map view is a wrapper around the basic ItemSpace API, and it implements and extends the java.util.concurrent.ConcurrentNavigableMap, thereby providing the capability of a ConcurrentHashMap or ConcurrentSkipListMap. InfinityDBMaps may contain other InfinityDBMaps or InfinityDBSets which are extended ConcurrentSets. The InfinityDBMap is a light-weight Object which can be constructed dynamically without itself being persisted: the Map mutator methods actually store data in the ItemSpace database.  Extensions to the ConcurrentNavigableMap API include:

  • composite keys – variable data types and component count
  • composite values or set elements – variable data types and component count
  • multi-map – unlimited values per key
  • tuple access via variable-length Object arrays
  • nestable Maps and Sets
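Because the Map view implements `java.util.concurrent.ConcurrentNavigableMap`, code written against that standard interface shows what callers can expect. The sketch below uses the JDK's `ConcurrentSkipListMap` as a stand-in, since the InfinityDBMap constructors are not described here; the class name and sample data are hypothetical, and the InfinityDB extensions listed above (composite keys, multi-maps, tuple access) are not shown.

```java
import java.util.concurrent.ConcurrentNavigableMap;
import java.util.concurrent.ConcurrentSkipListMap;

class NavigableMapSketch {
    /** Builds a two-level nested map: species -> (attribute -> value). */
    static ConcurrentNavigableMap<String, ConcurrentNavigableMap<String, String>> sample() {
        ConcurrentNavigableMap<String, ConcurrentNavigableMap<String, String>> trees =
                new ConcurrentSkipListMap<>();
        // computeIfAbsent creates the nested map the first time a key is seen.
        trees.computeIfAbsent("oak", k -> new ConcurrentSkipListMap<>()).put("height", "20");
        trees.computeIfAbsent("oak", k -> new ConcurrentSkipListMap<>()).put("leaf", "lobed");
        trees.computeIfAbsent("pine", k -> new ConcurrentSkipListMap<>()).put("height", "30");
        return trees;
    }

    public static void main(String[] args) {
        ConcurrentNavigableMap<String, ConcurrentNavigableMap<String, String>> trees = sample();
        System.out.println(trees.firstKey());               // prints oak
        System.out.println(trees.ceilingKey("p"));          // prints pine
        System.out.println(trees.headMap("pine").keySet()); // prints [oak]
        System.out.println(trees.get("oak").get("leaf"));   // prints lobed
    }
}
```

The sorted navigation calls (`firstKey`, `ceilingKey`, `headMap`) are what distinguish a navigable map from a plain hash map.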

JSON API

Data in the database can be mapped directly to extended JSON text with a one-to-one correspondence. Utilities for parsing and generating JSON are provided. This goes beyond ‘Document’ databases, because the JSON is not stored as text but instead as compressed ‘paths’ or ‘Items’ each of which represents a JSON value. The 12 data types can be encoded into extended JSON, or ‘underscore-quoted’ standard JSON so any primitive can be a key or value, such as a date, and all keys are sorted. There is no artificial distinction between the ‘container’ of the documents and the documents themselves, so the size of any JSON sub-document depends on only the given path to it – from an entire database down to individual values. Access does not depend on loading and storing entire JSON documents. JSON has been extended to handle blobs such as images or any other file type. A better alternative format is the optional proprietary ‘i-code’ language.

Transactionality

Two kinds of transactions are available:
  • Global. This persists all current changes to disk, providing Atomic, Consistent, and Durable semantics. It does not use any kind of lock, so it does not provide inter-thread Isolation. However all access is concurrent during the commit by any threads. Effectively, there is a single ‘global transaction’ in effect at all times. Optimistic Locking commits also cause global commits.
  • Optimistic. Fine-grained multi-thread transactions use optimistic locking and support complete ACID (atomic, consistent, isolated, and durable) semantics. Locks do not follow the usual rules of other DBMSs but have capabilities equivalent to table locks, row locks, index locks, or even single-column value locks and single set element locks. This diversity of lock types is not actually a complex spectrum of details – it follows trivially from the basic data model and is automatic and almost invisible to the programmer. If desired, the programmer can easily control the lock order for maximum concurrency simply by accessing appropriate data early in the transaction. The locks are actually just set on prefixes of tuples, i.e. prefixes of Items, and are maintained transparently. The set of locked prefixes is kept in memory per database globally and also associated with each thread. Lock conflicts throw an OptimisticLockConflictException and are optionally retried by the application code. Concurrent optimistic transactions can reach hundreds of commits per second on disk, and thousands per second on flash.
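The retry-on-conflict pattern mentioned for optimistic transactions can be sketched with a toy in-memory counter. Everything here is illustrative: `ConflictException` is a hypothetical stand-in for the OptimisticLockConflictException named above, and an `AtomicLong` compare-and-set plays the role of the commit that fails when another thread got there first. It is not InfinityDB's locking code.

```java
import java.util.concurrent.atomic.AtomicLong;

class OptimisticRetrySketch {
    /** Hypothetical stand-in for the conflict exception named in the text. */
    static class ConflictException extends RuntimeException {}

    private final AtomicLong value = new AtomicLong();

    /** One attempt: read, compute, then commit only if nobody else changed the value. */
    void tryAdd(long delta) {
        long seen = value.get();
        long updated = seen + delta;
        if (!value.compareAndSet(seen, updated)) throw new ConflictException();
    }

    /** Application-level retry loop; returns the number of attempts that were needed. */
    int addWithRetry(long delta, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                tryAdd(delta);
                return attempt;
            } catch (ConflictException e) {
                // Another thread committed first; loop and redo the work on fresh data.
            }
        }
        throw new IllegalStateException("gave up after " + maxAttempts + " attempts");
    }

    long get() { return value.get(); }
}
```

The important point is that the whole unit of work, including the reads, is repeated after a conflict, so the transaction body should be free of side effects outside the database.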

Advanced Data Model

Applications do not need to invent binary encodings or convert primitives to binary or text. Data is not stored as formatted text or as custom raw binary, but as an intermediate form, with standard pre-defined binary encodings of the individual Java primitives in a consistent way that allows extremely high speed.

InfinityDB Embedded supports all primitive Java data types and more:

  • long (stored as compressed bits to handle byte, short, char, with no more space)
  • float
  • double
  • boolean
  • String (encoded as UTF-16)
  • Date/time
  • index (for ‘huge sparse arrays’, lists in JSON, and BLOBs/CLOBs, texts)
  • short byte arrays up to 1KB (sort by length first, used for BLOBs)
  • short UTF-16 char arrays up to 1K chars (sort by length first, used for CLOBs)
  • short byte strings up to 1KB (these sort like strings but with bytes instead of 2-byte chars)
  • ‘Classes’ metadata identifiers embedded in Items for delimiting other data, describing the schema
  • ‘Attributes’ metadata identifiers

Application-Specific Data Models

InfinityDB provides a rich data representation space for structured, semi-structured, or unstructured data. The basic data model is simple but flexible enough to be used by the application to define and represent any mixture of images or other BLOBs, texts or other CLOBs, trees, graphs, key/value multi-maps, sets, documents, text indexes or other inversions, huge sparse arrays, tables with an unlimited number of columns of an unlimited number of values or nested structures per column, Entity-Attribute-Value structures or creative custom structures.

Items and the ItemSpace

All structures in the entire database are represented at the lowest level as a magnitude-ordered set of ‘Items’, where an Item is  logically a char array from 0 to 1665 chars long. The simplicity of the low-level format allows great speed and compression. These ordered Items represent the entire state of the database. ItemSpaces come in a wide variety of implementations, but they all have the same simple structure.

In order to make use of the ItemSpace, an Item is formatted as a packed series of ‘components’ of the 12 primitive data types, each of which is variable-length, compressed, explicitly typed, and self-delimiting. So, an Item can be thought of as a tuple with a variable number of elements of any type. The data in the components is formatted such that the sorting is appropriate: raw floats or other types would not sort properly. The internal binary encoding is done by InfinityDB Embedded in a fixed permanent way  for forwards and backwards compatibility.
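The remark that raw floats would not sort properly can be made concrete. The raw IEEE 754 bit patterns of negative doubles compare in reverse numeric order, so a sortable binary form has to transform them. The sketch below shows one well-known order-preserving transform; it is a generic technique offered as an assumption-laden illustration, since the actual InfinityDB component encoding is not specified here, and the class and method names are hypothetical.

```java
class SortableDouble {
    /** Maps a double to a long whose unsigned order matches Double.compare order. */
    static long encode(double d) {
        long bits = Double.doubleToLongBits(d);
        // Negative values: flip every bit so that larger magnitudes sort lower.
        // Non-negative values: set the sign bit so they sort above all negatives.
        return bits < 0 ? ~bits : bits ^ Long.MIN_VALUE;
    }

    /** Inverse of encode. */
    static double decode(long encoded) {
        long bits = encoded < 0 ? encoded ^ Long.MIN_VALUE : ~encoded;
        return Double.longBitsToDouble(bits);
    }

    public static void main(String[] args) {
        // Raw bits put -1.0 before -2.0, which is the wrong numeric order.
        System.out.println(Double.doubleToLongBits(-1.0) < Double.doubleToLongBits(-2.0)); // prints true
        // The encoded forms compare correctly as unsigned values.
        System.out.println(Long.compareUnsigned(encode(-2.0), encode(-1.0)) < 0);          // prints true
    }
}
```

With such a transform, a byte-wise or unsigned comparison of encoded components yields the same order as comparing the original numbers, which is what allows a single generic sort over mixed tuples.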

All basic access to the database uses a temporary ‘Cu’ cursor containing one Item and no other state. The binary encoding of each component in an Item is unimportant to the application, which uses only Java primitives indirectly to build and examine Items in a Cu cursor.

Prefixes of Items are often used to logically nest Items into arbitrary recursive sub-spaces, i.e. sets of suffixes. Items can be used to represent sets of fixed-length tuples, the equivalent of CSV files or tables, or they can represent paths to JSON terminal values. JSON can be parsed and formatted from the Items.  The JSON is not stored literally: the entire database can be accessed at any level of hierarchical detail, because there is no fixed predefined division between keys and JSON documents. Combinations of tabular, document, blob, or many other structures can be easily intermixed.
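The correspondence between a nested document and a set of paths can be shown with ordinary Java collections. The sketch below flattens a nested `Map` into one path per terminal value, emitted in key order; the class name and sample data are hypothetical, lists and the InfinityDB ‘index’ type are left out, and this is not the JSON utility shipped with the product.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class PathFlattener {
    /** Emits one path per terminal value, depth-first and in sorted key order. */
    static List<List<Object>> flatten(Map<String, ?> doc) {
        List<List<Object>> out = new ArrayList<>();
        walk(doc, new ArrayList<>(), out);
        return out;
    }

    @SuppressWarnings("unchecked")
    private static void walk(Object node, List<Object> prefix, List<List<Object>> out) {
        if (node instanceof Map) {
            // Assumes String keys; sorting them makes the path order stable.
            Map<String, Object> sorted = new TreeMap<>((Map<String, Object>) node);
            for (Map.Entry<String, Object> e : sorted.entrySet()) {
                prefix.add(e.getKey());
                walk(e.getValue(), prefix, out);
                prefix.remove(prefix.size() - 1); // backtrack to the parent prefix
            }
        } else {
            List<Object> path = new ArrayList<>(prefix);
            path.add(node); // the terminal value is the last component
            out.add(path);
        }
    }

    public static void main(String[] args) {
        Map<String, Object> doc = Map.of(
                "oak", Map.of("height", 20, "leaf", "lobed"),
                "pine", Map.of("height", 30));
        System.out.println(flatten(doc));
        // prints [[oak, height, 20], [oak, leaf, lobed], [pine, height, 30]]
    }
}
```

Every sub-document is then simply the set of paths that share a prefix, which is why no fixed boundary between "key" and "document" is needed.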

Flexible Extensible Data Structures with ‘Class’ and ‘Attribute’ Data Types

If the special Class and Attribute data types are mixed in with the other ‘primitive’ data types in the Items, flexible, ‘incrementally self extending’ structures can be represented. See the InfinityDB Client/Server page for a graphical view of some examples of the flexible structures. An initial Class component is normally used to separate data for unlimited independent uses even without the flexible structuring in a single InfinityDB Embedded file. A Class is encoded as binary but contains a string with an initial capital letter followed by zero or more letters, digits, dot, dash, or underscore (as a regex: [A-Z][A-Za-z0-9._-]*). An Attribute is identical but starts with a lower case letter.
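The naming rules for Class and Attribute components translate directly into `java.util.regex` patterns. In the sketch below the Class pattern is the one quoted above, while the Attribute pattern is derived from the statement that an Attribute is identical except for a lower-case first letter; the helper class and method names are hypothetical.

```java
import java.util.regex.Pattern;

class MetaNames {
    // Class pattern as given in the text.
    static final Pattern CLASS = Pattern.compile("[A-Z][A-Za-z0-9._-]*");
    // Attribute pattern derived from "identical but starts with a lower case letter".
    static final Pattern ATTRIBUTE = Pattern.compile("[a-z][A-Za-z0-9._-]*");

    static boolean isClassName(String s) { return CLASS.matcher(s).matches(); }

    static boolean isAttributeName(String s) { return ATTRIBUTE.matcher(s).matches(); }

    public static void main(String[] args) {
        System.out.println(isClassName("Trees"));      // prints true
        System.out.println(isClassName("trees"));      // prints false
        System.out.println(isAttributeName("height")); // prints true
    }
}
```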

When used to represent a ‘flexible’ tabular structure, keys can be:

  • ‘tuples’, where a tuple is any concatenation of zero or more primitives of any type,
  • heterogeneous – different keys can have different primitive types or tuple types,
  • variadic – different keys can be tuples of a different number of primitive types,
  • nestable sparse arrays or lists of unlimited size of any key type, i.e. lists, using the ‘index’ data type

Flexible table column values can be the same as keys plus:

  • multi-valued, with no limit on number, and where an absence of any value takes no storage,
  • CharacterLongObjects or BinaryLongObjects of unlimited size, such as images or files.

Furthermore, any such flexible structures can be nested by concatenating their Items onto the ends of other Items. A particular set of suffixes can contain any kind of nested structure. The ‘Class’ and ‘Attribute’ data type components can represent four patterns depending on their pairings within each Item:

Pairing                                       Meaning
Class then data then Attribute then data      a ‘table’
Class then data then Class then data          a ‘sub-table’
Attribute then data then Attribute then data  a ‘sub-attribute’
Attribute then data then Class then data      a ‘nested table’

The GUI display of such flexible structures is very rich – see it in action in InfinityDB Client/Server. Data can also be handled in various character-oriented formats. The displays look like nestable ‘documents’, tables, lists, trees, and so on. Here is a flexible table with Class “Trees”, a multi-value Attribute, ‘composite’ keys of variable component count, and a nested table “Location”.

Forwards and Backwards Schema Compatibility

The ItemSpace model is inherently extensible, but with the flexible ‘Class’ and ‘Attribute’ metadata data types embedded in the Items, databases become ‘self-describing’ and can be extended in ways that avoid incompatibilities with earlier or later database backups, old or new application versions, or changing or extending data producers and consumers like users, Python scripts, bash ‘curl’ commands,  or IoT’s or distributed databases.

Virtual View ItemSpaces

InfinityDB Embedded provides many utilities for dynamically viewing one or more underlying ItemSpaces as a virtual ItemSpace. All underlying ItemSpace changes are reflected immediately in the virtual view ItemSpace. A view is a true ItemSpace itself:

  • ItemSubspace  virtually hides and restricts by a fixed prefix of an ItemSpace;
  • DeltaItemSpace is a mutable view of a fixed underlying ItemSpace with its own commit and rollback;
  • AndSpace views a logically intersected set of underlying ItemSpaces;
  • OrSpace views a logically unioned set of underlying ItemSpaces;
  • RangeItemSpace views a limited range of Items;
  • VolatileItemSpace stores Items in memory non-persistently;
  • IncrementalMergingItemSpace views a special kind of index that can be incrementally built and optimized efficiently at any size while being accessed concurrently. Concurrent deletions are allowed. Text indexing is one use.

Views can be nested. An arbitrarily deep nesting of AndSpace and OrSpace can be flattened automatically for best speed. These capabilities provide a type of instant dynamic query capability without indexes, query compilation, execution, accessor allocation or destruction or temporary space usage. The virtual ItemSpaces are light-weight Objects. Any number of views can exist at once. The views can underlie the Map-based wrappers. They work with the flexible data representation using Class and Attribute data types as well.
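The notion of a live, copy-free view has a close analogue in the standard Java collections, where `NavigableSet.subSet` returns a view backed by the original set. The sketch below uses it to imitate a prefix-restricted view; it is an analogy only, not the ItemSubspace class. The class name and the slash-separated keys are hypothetical, and the upper bound assumes keys never contain the character U+FFFF directly after the prefix.

```java
import java.util.NavigableSet;
import java.util.concurrent.ConcurrentSkipListSet;

class LiveViewSketch {
    /** Live view of all keys that start with the prefix; nothing is copied. */
    static NavigableSet<String> prefixView(NavigableSet<String> base, String prefix) {
        // Keys starting with the prefix fall in the range [prefix, prefix + '\uffff').
        return base.subSet(prefix, true, prefix + Character.MAX_VALUE, false);
    }

    public static void main(String[] args) {
        NavigableSet<String> base = new ConcurrentSkipListSet<>();
        base.add("Birds/owl");
        base.add("Trees/oak");
        NavigableSet<String> trees = prefixView(base, "Trees/");
        System.out.println(trees.size()); // prints 1
        base.add("Trees/pine");           // change the underlying set only
        System.out.println(trees.size()); // prints 2, the view sees the change
    }
}
```

Because the view holds no data of its own, creating it costs almost nothing and it never goes stale.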

More Information

See the Manual for detailed information on InfinityDB Embedded (an old doc). See Documents (old docs) on the internal structure or the principles for constructing any higher-order data model from the trivial underlying ‘ItemSpace’ data model. For a Free InfinityDB Embedded Trial Download see the shop. Here is the InfinityDB Embedded_Trial License.

For licensing, email support@boilerbay.com, or go to the Amazon Web Services Marketplace.

The AirConcurrentMap Java ConcurrentNavigableMap

A separate product, the fast com.infinitydb.map.AirConcurrentMap API is identical to the Standard Java Maps – in fact it is a java.util.concurrent.ConcurrentNavigableMap, optimized for more than about 1K Entries. You can use our free non-commercial edition or license the commercial edition. Compare the performance with that of the Standard Java Maps. Memory efficiency is higher than any JDK Map as well. Our extensions provide extreme performance for parallel operations even beyond streams. The Map is fast and concurrent due to the same patented technology as is used in InfinityDB.

Learn more about AirConcurrentMap.