InfinityDB

The fast, instant, Map-based NoSQL Java Embedded Database

  • Standard Map interface (extended java.util.concurrent.ConcurrentNavigableMap)
  • Nestable Multi-value (can represent trees, graphs, tables, K/V, documents, huge sparse arrays)
  • Strong Typing (with no text/binary trap, plus CLOBs, BLOBs, and short byte and char arrays)
  • Runtime schema evolution (for forwards/backwards compatibility)
  • 1M ops/sec in memory multi-threaded
  • Multi-core Concurrency
  • Extreme Compression – freed space is immediately reused
  • Transactions – optimistic or global
  • Self administration (one file, no DBA, instant developer productivity)

We provide free advanced customer support by a software engineer for our products. Contact us at support@boilerbay.com.

The InfinityDB Java embedded database is as simple to use as the Java Map interface, and is fast, multi-core concurrent, and data-compressing. It requires no schema upgrades or maintenance, so it is very easy to administer. It provides a rich data representation space for structured, semi-structured, or unstructured data, while protecting against the ‘text/binary trap’. The extended java.util.concurrent.ConcurrentNavigableMap interface and its underlying ‘ItemSpace’ enable creative developers to superimpose novel, fast, efficient data representations and access accelerators such as indexes. Data is encoded efficiently at fine granularity and grows smoothly without limit in size and structure, whether it fits in memory or not. Data spills to disk transparently from the memory cache as needed. There is only a single file. The NoSQL InfinityDB Java embedded DB is safe, even though it avoids a log and has instant restart in all situations.

Simple extended java.util.Map Interface

InfinityDB is accessible via an upper-level wrapper that implements and extends java.util.concurrent.ConcurrentNavigableMap, thereby providing the capabilities of a HashMap, TreeMap, ConcurrentHashMap, or ConcurrentSkipListMap. Data is stored internally as ‘tuples’, which are manipulated by nested InfinityDBMaps, each of which corresponds to an immutable tuple prefix, i.e. a composite key. Nested Maps may be created dynamically as needed to access data, but the InfinityDBMap Object itself is not stored – only the prefixed tuples are.
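
To make the Map capabilities listed above concrete, here is a minimal sketch of the standard ConcurrentNavigableMap operations. It deliberately uses the JDK's own ConcurrentSkipListMap as a stand-in, not an InfinityDB class, so it shows only the interface the wrapper is described as implementing; the class name and sample data are made up for illustration.

```java
import java.util.concurrent.ConcurrentNavigableMap;
import java.util.concurrent.ConcurrentSkipListMap;

class NavigableMapSketch {
    // Stand-in only: a JDK ConcurrentSkipListMap, not an InfinityDB map.
    static ConcurrentNavigableMap<String, Long> sampleMap() {
        ConcurrentNavigableMap<String, Long> map = new ConcurrentSkipListMap<>();
        map.put("apple", 3L);
        map.put("banana", 7L);
        map.put("cherry", 1L);
        return map;
    }

    public static void main(String[] args) {
        ConcurrentNavigableMap<String, Long> map = sampleMap();

        // Point lookup by key, as with a HashMap.
        System.out.println(map.get("banana"));

        // Atomic conditional insert, as with a ConcurrentHashMap.
        map.putIfAbsent("date", 9L);

        // Ordered navigation, as with a TreeMap: smallest key >= "b".
        System.out.println(map.ceilingKey("b"));

        // Range view over keys from "b" (inclusive) to "d" (exclusive).
        System.out.println(map.subMap("b", "d").keySet());
    }
}
```

Code written against the ConcurrentNavigableMap interface type, as here, should not depend on which implementation sits behind it.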

Sophisticated web applications and many other kinds of applications can be supported entirely and simply by InfinityDB – no RDBMS is needed, just the easy extended Map interface and optionally the low-level InfinityDB ‘ItemSpace’ data model. Initial development is easy, yet the NoSQL InfinityDB supports indefinite application growth and complexity along with increasing data volume and velocity. Backwards compatibility with older databases, and even some forwards compatibility, is easy because the database is schemaless. The instant InfinityDB Java database avoids the text/binary trap, so applications need not read and slowly parse chunks of limited-size text or devise complex, dangerous, rigid, opaque binary encodings of data.

The InfinityDB NoSQL Java embedded DB has been in use for years, with thousands of active installations.

Enormous Speed and Concurrency

The performance of InfinityDB is over one million operations per second at the low level with multiple threads. Its Multi-Core technology takes advantage of all of the cores available in modern mobile devices, IoT devices, personal computers, embedded systems, and servers. Cores are multiplying at Moore’s-law speed, and applications are adding more and more threads. Without the multi-core technology supplied by the NoSQL InfinityDB to avoid inter-thread interference, bottlenecks called ‘convoys’ can occur when threads contend for data, and performance can drop dramatically, even far below single-thread speed. InfinityDB is completely thread-safe.

Appropriate for small to large installations

You can use InfinityDB as an In-Memory-Only DBMS keeping all data in the cache, or let it grow smoothly to hundreds of GB with no code changes. Access to data in the memory cache is fully multi-core, while infrequently used data is paged to disk as necessary. Data is not serialized in slow batches, as it would be with the standard Java Object serialization, XML, or JSON, for example, but is instead operated on efficiently with fine granularity for small or large data sizes regardless of memory capacity. Fine granularity accesses can smoothly transition to coarse block-oriented granularity accesses as needed.

Extensibility and Forwards Compatibility

No data structures have physical or practical size limits but can always expand efficiently from zero values upwards to any size. For example, a multi-value column or attribute, ordered set, or huge sparse array can smoothly extend to any size with fine-grained, efficient and smooth access performance. Data structures that are empty take no space, hence any additional structure requires no reorganization, as each data structure effectively already exists virtually but with no size.

For example, when data is viewed as relations, a new table or column can be added at any time, because all column values are nullable and come into existence on the first use, taking no space until then. There is no limit on tables, columns, values or value size. A single-valued column that is to be converted to multi-value is already in the proper format, and more values can simply be inserted along with it. Because of this generality, such as extending the concept of a column to have multiple values, we call the model ‘Entity-Attribute-Value’, but the extensions are not required, and a basic key/value store is simple as well using just the standard Map view.
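
The following minimal sketch illustrates the idea in the paragraph above: attributes come into existence on first use, and a single-valued attribute becomes multi-valued simply by inserting another value. It is built from nested JDK maps, not from InfinityDB classes, and the class, method, and attribute names are hypothetical.

```java
import java.util.Map;
import java.util.NavigableSet;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.ConcurrentSkipListSet;

class EavSketch {
    // entity -> attribute -> ordered set of values (hypothetical layout,
    // built from JDK collections rather than InfinityDB classes).
    private final Map<String, Map<String, NavigableSet<String>>> store =
            new ConcurrentSkipListMap<>();

    // Inserting creates the entity and attribute on first use; no schema step.
    void insert(String entity, String attribute, String value) {
        store.computeIfAbsent(entity, e -> new ConcurrentSkipListMap<>())
             .computeIfAbsent(attribute, a -> new ConcurrentSkipListSet<>())
             .add(value);
    }

    // An attribute that was never written simply reads as an empty set.
    NavigableSet<String> values(String entity, String attribute) {
        Map<String, NavigableSet<String>> attributes = store.get(entity);
        if (attributes == null) {
            return new ConcurrentSkipListSet<>();
        }
        NavigableSet<String> found = attributes.get(attribute);
        return found == null ? new ConcurrentSkipListSet<>() : found;
    }

    public static void main(String[] args) {
        EavSketch db = new EavSketch();
        db.insert("alice", "email", "alice@example.com");
        // The same attribute becomes multi-valued with one more insert.
        db.insert("alice", "email", "alice@example.org");
        // A brand-new attribute needs no prior declaration.
        db.insert("alice", "phone", "555-0100");
        System.out.println(db.values("alice", "email"));
        System.out.println(db.values("alice", "fax")); // never written: empty
    }
}
```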

Applications can anticipate extensions, so they can often provide some forwards compatibility as well as the vital full backwards compatibility. No upgrade scripts are needed.

Data Structures

The data type for individual values is determined dynamically, yet values are not simple strings to be parsed but are binary, strongly-typed, self-describing, and compact. InfinityDB converts encoded values to Java primitives and back transparently and almost instantly without application effort.

Aggregate data structures include, for example, potentially large CharacterLongObjects or BinaryLongObjects, which can be added at any time in any context. Typical application-determined structures include relations, key/value maps, EAV or ER triples, taxonomies, trees, text indexes, documents, DAGs, sets, or general graphs, including extensible dense mixtures of these, as explained in the documentation. These structures all rest on the same fast, simple Map API or the lower-level ‘ItemSpace Engine’, and developers can easily design and optimize creatively.

Continuous Space Reclamation

Space allocation for individual values or aggregates of values is fully dynamic – they take no space until created or after deletion. During growth, values and aggregate structures require minimal space, and during shrinking, they dynamically return all freed space and remain efficient. The single data file is 100% efficient with compressed data on initial loading, and stays at least 50% efficient in the worst case after very large transactions, which may include any amount of data. Normally, free space is about 10%. The file never shrinks. Applications can run forever without gradual unlimited space loss. There is no need for occasional reorganizations or packing, and there is no garbage collector thread.

High Compression on Disk and In Memory

InfinityDB’s continuous, dynamic ZLib and UTF-8 data compression packs data into variable-length blocks, avoiding almost all of the space that would normally be wasted by internal fragmentation. I/O bandwidth is reduced accordingly. Variable-length data items and prefix and branch-cell suffix compression are used in the memory cache as well. Data compression means that the branching factor is kept high for fast access, and the OS file cache is better used.

InfinityDB requires no configuration file, no extraneous directories, and no temporary work files. Only one file per database is needed.

Direct NoSQL ‘ItemSpace’ API

For the ultimate speed, our trivial lower-level ‘ItemSpace’ API gives you access to all data. Only a few storage and retrieval operations operate on the NoSQL InfinityDB ItemSpace. You gain low-level access by allocating a ‘Cu’ cursor, and then using it to insert, delete, update, locate, or scan data in the ItemSpace, which is an ordered set of variable-length Items, each Item being a concatenation of self-delimiting variable-length primitive values. You never need to refer to external files such as per-table files or index files. The ItemSpace and Map access techniques are completely compatible and interchangeable.
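
The ordered-set-of-Items model can be pictured with a small sketch. This is not the ItemSpace or ‘Cu’ API: it models each Item as a List of Strings held in a JDK ConcurrentSkipListSet (real Items are described above as mixing strongly typed primitives), and the names ItemSpaceSketch, ITEM_ORDER, and scan are hypothetical. It shows how one component-wise ordering is enough to scan everything under a given prefix.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.NavigableSet;
import java.util.concurrent.ConcurrentSkipListSet;

class ItemSpaceSketch {
    // Component-by-component ordering; a shorter Item sorts before its extensions.
    static final Comparator<List<String>> ITEM_ORDER = (a, b) -> {
        int n = Math.min(a.size(), b.size());
        for (int i = 0; i < n; i++) {
            int c = a.get(i).compareTo(b.get(i));
            if (c != 0) {
                return c;
            }
        }
        return Integer.compare(a.size(), b.size());
    };

    // Sample data only; Strings stand in for typed primitive components.
    static NavigableSet<List<String>> sample() {
        NavigableSet<List<String>> items = new ConcurrentSkipListSet<>(ITEM_ORDER);
        items.add(List.of("users", "alice", "email", "alice@example.com"));
        items.add(List.of("users", "alice", "phone", "555-0100"));
        items.add(List.of("users", "bob", "email", "bob@example.com"));
        items.add(List.of("orders", "1", "total", "9"));
        return items;
    }

    // Collects every Item that starts with the prefix: seek to the prefix,
    // then walk forward until an Item no longer matches.
    static List<List<String>> scan(NavigableSet<List<String>> items, List<String> prefix) {
        List<List<String>> out = new ArrayList<>();
        for (List<String> item : items.tailSet(prefix, true)) {
            boolean matches = item.size() >= prefix.size()
                    && item.subList(0, prefix.size()).equals(prefix);
            if (!matches) {
                break;
            }
            out.add(item);
        }
        return out;
    }

    public static void main(String[] args) {
        for (List<String> item : scan(sample(), List.of("users", "alice"))) {
            System.out.println(item);
        }
    }
}
```

Because Items sharing a prefix are contiguous in the ordering, the same scan serves as a table scan, a row read, or a single-attribute read depending only on the prefix length.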

Transactionality

Two kinds of transactions are available:
  • Global. This persists all current changes to disk, providing Atomic, Consistent, and Durable semantics. It does not use any kind of lock, so it does not provide inter-thread Isolation. However, all threads keep concurrent access during the commit. Effectively, there is a single ‘global transaction’ in effect at all times. Optimistic locking commits also cause global commits.
  • Optimistic. Fine-grained multi-thread transactions use optimistic locking and support complete ACID (atomic, consistent, isolated, and durable) semantics. Locks do not follow the usual rules of other DBMSs but have capability equivalent to table locks, row locks, index locks, or even single-column value locks and single set element locks. This diversity of lock types is not actually a complex spectrum of details – it follows trivially from the basic data model and is automatic and almost invisible to the programmer. If desired, the programmer can easily control the lock order for maximum concurrency simply by accessing appropriate data early in the transaction. The locks are actually just set on prefixes of tuples, i.e. prefixes of Items, and are maintained transparently. The set of locked prefixes is kept in memory per database globally and also associated with each thread.
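
One way to see why prefix locks can stand in for table, row, and value locks is the sketch below. It is not InfinityDB's locking code or API. It assumes, for illustration only, that two locked prefixes overlap exactly when one is a prefix of the other, and that an optimistic commit is refused when any prefix the transaction touched overlaps one changed by another transaction in the meantime. All names are hypothetical.

```java
import java.util.List;
import java.util.Set;

class PrefixLockSketch {
    // Assumed rule: two prefix locks overlap when one is a prefix of the other.
    // A short prefix therefore behaves like a table lock, a longer one like
    // a row lock, and a full tuple like a single-value lock.
    static boolean conflicts(List<String> a, List<String> b) {
        int n = Math.min(a.size(), b.size());
        return a.subList(0, n).equals(b.subList(0, n));
    }

    // Optimistic validation: allow the commit only if none of this
    // transaction's prefixes overlaps a prefix changed by other transactions.
    static boolean canCommit(Set<List<String>> mine, Set<List<String>> changedByOthers) {
        for (List<String> m : mine) {
            for (List<String> o : changedByOthers) {
                if (conflicts(m, o)) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> table = List.of("users");
        List<String> aliceRow = List.of("users", "alice");
        List<String> bobRow = List.of("users", "bob");

        System.out.println(conflicts(table, aliceRow));  // table-level vs row-level
        System.out.println(conflicts(aliceRow, bobRow)); // two different rows
        System.out.println(canCommit(Set.of(aliceRow), Set.of(bobRow)));
    }
}
```

Under this assumed rule, a transaction that touches only one row's prefix never blocks on work confined to another row, which is the fine-grained behaviour the bullet describes.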

No Need for Schema Upgrades

InfinityDB is infinitely, incrementally expandable while maintaining backwards schema compatibility. The schema does not need to be embedded in the database itself, but is implied by the way the application uses the data. This means that no SQL scripts need to be run to accomplish an upgrade. Instead it provides some forwards-compatibility and easy backwards compatibility of databases with applications because future applications do not need to go back and change schema definitions in order to continue operating. Often, future applications require virtually no effort to be backwards-compatible with older databases. Nevertheless, stored data has sensible, self-descriptive regular binary structure that makes it easy to represent and understand. The schema ‘rides along’ with the data as the data is stored. For example, a new table does not need to be created, or columns added to an existing table, before data can be stored that logically belongs in the new table. There is no proliferation of upgrade scripts in a growing directed acyclic graph of schema upgrade paths between versions as applications evolve.

Sensible Data Representation

Data is not stored as raw text or as raw binary, but as an intermediate form, with standard pre-defined binary encodings of the individual data items in a consistent way that allows extremely high speed combined with clarity of representation. The composite binary data items can be converted to regular text for humans by simply printing them while they are in a cursor. Applications do not need to invent binary encodings or use custom Java object serialization formats that will later have upgrade, reliability, flexibility, consistency, documentation, safety, security or coding complexity issues. InfinityDB has more extensibility, flexibility, and self-documenting characteristics than the best text formats without the performance limitations, such as reading and writing entire JSON texts or XML DOMs on each access.

InfinityDB natively supports all common primitive Java data types and more:

  • long (stored as compressed bits so byte, short, and int take no more space)
  • float
  • double
  • boolean
  • String (stored as UTF-8)
  • Date
  • short byte and char arrays (sort by length first)
  • short byte strings (sort like strings but with bytes instead of 2-byte chars)
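
The note on long in the list above says small integers take no more space than they need. InfinityDB's actual encoding is not described here, so the sketch below shows a different, well-known technique with the same space property: zigzag mapping followed by a 7-bits-per-byte variable-length encoding. It is an illustration only, and unlike a format meant for ordered Items, this encoding does not preserve sort order. The class and method names are hypothetical.

```java
import java.io.ByteArrayOutputStream;

class VarLongSketch {
    // Zigzag maps small negative and positive values to small unsigned ones,
    // then 7 bits are written per byte with the high bit meaning "more follows".
    static byte[] encode(long value) {
        long z = (value << 1) ^ (value >> 63);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((z & ~0x7FL) != 0) {
            out.write((int) ((z & 0x7F) | 0x80));
            z >>>= 7;
        }
        out.write((int) z);
        return out.toByteArray();
    }

    // Reverses encode: reassemble the 7-bit groups, then undo the zigzag.
    static long decode(byte[] bytes) {
        long z = 0;
        int shift = 0;
        for (byte b : bytes) {
            z |= (long) (b & 0x7F) << shift;
            shift += 7;
        }
        return (z >>> 1) ^ -(z & 1);
    }

    public static void main(String[] args) {
        long[] samples = {5L, -1L, 300L, 100_000L, Long.MAX_VALUE};
        for (long v : samples) {
            byte[] encoded = encode(v);
            System.out.println(v + " -> " + encoded.length
                    + " byte(s), round trip " + (decode(encoded) == v));
        }
    }
}
```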

When used to represent a relational structure, InfinityDB column values can be either:

  • sets of an arbitrary mixture of:
    • composites, i.e. heterogeneous concatenations of one or more primitive data types, or
    • any value type identified by a heterogeneous composite of primitive data types
  • sparse arrays of unlimited size of any other value type, or
  • CharacterLongObjects or BinaryLongObjects of unlimited size.

In such an extended relational structure, keys can be limited-length, composite, and heterogeneous.

Full-text indexing is available as well – see the example code.

All structures are represented as an ordered set of ‘Items’ which are each a short limited-length composition of one or more arbitrary strongly typed binary-encoded primitives, where Items are accessed independently. Any structure is nullable, but no space is required for such null structures or values, hence all structures are ‘virtual’ and can be inserted anywhere with no other structure changes. The possible structures are not limited to such extended relations however.

On the other hand, JSON or XML representations require slow formatting and parsing, are targeted only at document granularity within practical size limits, cannot sort by key or value, use only strings as keys, cannot natively or efficiently represent binary or character streams or ‘LOBs’, cannot compose keys, and cannot have multi-values. Object serialization is also targeted at chunk-at-a-time access within practical size limits, does not provide key-based access to the chunks, and has versioning, security, Object integrity and many other issues.

Reliability and Safety

InfinityDB uses a rugged internal storage update protocol, both for persistence on demand and for cache spilling to disk with large amounts of data. The protocol maintains system-wide ACID properties except Isolation, protects data integrity, and survives abrupt application termination, file system bugs, and kernel panics, all with no locking. The single data file remains up-to-date, safe, correct, and usable through any event. There is no log-based recovery, hence restart is immediate in all cases. For smaller per-thread transactions, ACID properties including Isolation are also provided via optimistic locking. (There is one exception for power failure, a fundamental weakness of any software, which is explained in Operating Guidelines.)

See the Manual for detailed information. See Essentials for documents on the internal structure or the principles for constructing any higher-order data model from the trivial underlying ‘ItemSpace’ data model.


For licensing, email support@boilerbay.com.