InfinityDB Client/Server Java NoSQL Database

InfinityDB Client/Server is a Java NoSQL Database. It organizes a set of InfinityDB Embedded database files into a secure, remotely accessible, shareable database.

Applications

  • Data science via the server’s database browser and PatternQueries
  • Collaboration among people sharing local and remote distributed data
  • IoT and Microservices networks via the REST APIs with PatternQueries defining the REST URLs
  • Distributed database architectures via the server’s ItemPacket protocol
  • High performance Java applications via the low-level ItemSpace Data Model and Map APIs

Features

  • Data can natively include images or other blob types, Items, texts, hierarchies, relations
  • A built-in light-weight web server provides:
    • A data browser and editor with graphical tabular, JSON, i code, image, and raw data views
    • Administration of users, roles, permissions, passwords, and databases
  • Security includes SSL/TLS, hashed and encrypted passwords, encrypted system data
  • Pattern Queries provide:
    • Simple, powerful graphical or text definitions without SQL or algorithms, so no programming is needed
    • High speed execution – often only milliseconds. Queries can overlap in different CPU cores
    • Compilation and optimization that take only milliseconds
    • Ability to transform, select, and compute with rich data including images
    • Reducers that allow fast data aggregation, such as statistics, hashing, and custom reducers
  • REST access is a form of Remote Procedure Call, usable from Python or from the shell via the ‘curl’ command, and is protected by SSL/TLS
    • API is defined by PatternQueries for simplicity, flexibility, isolation, security and extensibility
  • A fast secure ‘ItemPacket’ protocol provides:
    • Client application access to remote servers with minimal application changes
    • Interconnected servers that redirect access between themselves with minimal application changes

The Communications Architecture

InfinityDB servers and client programs can communicate in a variety of ways. The ‘ItemPacket’ protocol links allow servers to cooperate by transparently remoting databases that appear to clients to be local to a particular server. Remote servers are for data backup, migration and aggregation. A server can run in a local computer as a sandbox, or in the cloud, for sharing. There are also Remote Procedure Calls via the standard REST type interface:

InfinityDB Communications Architecture

Access the Server Here

Here is a link for the JSON of the Items describing some aircraft in the infinitydb.com demo/readonly database: Aircraft (username ‘testUser’, password ‘db’). Here is a link for the Items for a stored picture: Apollo-Soyuz.

The Flexible ‘ItemSpace’ Data Model

InfinityDB is a novel DBMS with an almost trivial, very flexible, self-adapting ‘ItemSpace’ data model. It’s different because you don’t define schemas like tables or key/value stores ahead of time. Instead, they develop gradually and extend automatically as the need arises, as new data sources come online, and as client apps evolve. The things we show here are online, and you can experiment with them yourself. The simplicity allows you to quickly and easily create not only informal data structures but also databases as formalized and curated as in any relational DBMS.

The flexible-format data model is so adaptable that with just a few concepts, you can create multi-user accessible structures that represent tables, trees, documents, and blobs like images, all nestable and hierarchical. It’s perfect for data science or highly dynamic environments like tech startups, or anywhere you have evolving data to share between people, scripts, IoT devices, evolving applications and so on. For example, you can have what look like structured documents that are editable by multiple users – somewhat like Google Docs. It is possible to store any Binary Large Object (BLOB) data, like files, in the database as well, and image BLOBs are displayed graphically in the data browser.

Try the Database Browser Web Access

You can see the backend web app now at https://infinitydb.com:37411. The guest user is ‘testUser’ and the password is ‘db’. We appreciate your experimentation and exploration of it. To get a flavor of the system, select ‘Access Databases’ on the home page. You will see a table of the available InfinityDB Embedded database files for users in the guest group, which are demo/readonly and demo/writeable. You can select ‘edit’ on either one to see the main browsing and editing page with a list of the existing ‘EntityClasses’ (which are like nestable tables). Choose any of them, such as ‘Pictures’, ‘Samples’, ‘Aircraft’ or ‘Documentation’, to see the main tabular browsing page with the rich flexible data format. The ‘Documentation’ EntityClass there describes the system overall. You can try the Tabular, regular JSON, extended JSON, CSV, ‘i code’, ‘List of Strings’ or Item views and more. If you want, you can add and modify your own data in demo/writeable for others to see, and then please send us your comments. We will see your changes! If you contact us, we will create a free personal public or private trial database for you and provide the secure Python REST driver.

The Flexible Data Model Viewed by the Backend Web UI

There is some flexible-format demo data that is actually in the distribution ‘demo/readonly’ database. We’ll show various structures you can get with a simple set of ‘Items’ plus the two special ‘EntityClass’ and ‘Attribute’ data types.

Here is a set of samples in ‘demo/readonly’ from a simulated IoT device, where pressure and temperature are measured very quickly. This could be embedded in another table or document or have additional internal structure. You can add new columns without changing a schema. For example, a new humidity sensor comes online and sends its data to the ‘SamplesIndexed’ table using a new column name, and the old and new sensor data are merged as they arrive, with no changes to the old sensor or database by anyone. The data is not stored as text, but is ‘strongly typed’ with 10 basic data types plus the special ‘EntityClass’ and ‘Attribute’ data types that encode the semantics. The ‘sensors’ could be IoT devices, Java programs, Python scripts, ‘curl’ commands, other ‘RESTful’ data sources or sinks, or even users. There is no rewriting of a global database schema to incorporate the data – the structure is in the data.

Here is a table in ‘demo/readonly’ that contains BLOBs or ‘binary large objects’, which happen to be images in this case. This is a ‘table’ called ‘Pictures’, with four ‘columns’ or attributes.

Below is some documentation about the system in the flexible format. It’s not a pretty word doc with fonts and so on, but it captures the logic of a rich document that can be concurrently edited. All of it down to individual paragraphs is independently and concurrently editable by multiple users, and the embedded tables can be added, removed, and edited too, down to one cell at a time, by multiple users. You can use hierarchical numerical or text-titled sections, deeply nested documents or tables and so on. It is real database data, mixed in with any other kind, accessible to remote programs.

The above is actually a table that just looks like a rich document. The ‘keys’ at the left are the section titles, and the contents of the ‘description’ column are the text. The text has multiple embedded tables – the visible one is called ‘Display Components’, where the key is the name of a screen widget, and its single column is the description of the widget. The rest of the document is similar. The structure of this is determined solely by a set of ‘Items’ in the database with a very simple format. The back-end GUI web app interprets the Item patterns to generate the display resembling a document.

Below is a deeply nested table. This is the definition of a ‘Pattern Query’ found in the public database ‘demo/readonly’. Pattern Queries are how you can transform structures of data in a database to change almost anything about their organization, or select data, or sort, and so on. First of all, there is an outer table called ‘Query’. The keys are the query names, so you can re-use queries. Inside that are the ‘query’ and ‘description’ columns, where the first is the query’s specification, and the second is an explanation, a little documentation about what it is for. You can see the value of nesting documents inside tables here. The description is not limited in size, and can be rich, although individual lines are always limited to 1K characters to fit in Items. The definition has nested columns called ‘pattern’ and ‘result’, plus a nested table called ‘Where’. You can execute these queries yourself at https://infinitydb.com:37411 (user ‘testUser’, password ‘db’) or experiment, creating anything you want in the public ‘demo/writeable’.

This is just more structure represented by the almost trivial data model called the ‘ItemSpace’, with normal data types plus the two special flexible data types ‘EntityClass’ and ‘Attribute’ mixed in to some of the Items. The GUI formats it with certain simple fixed rules into a graphical display based on patterns in the data. The displays above follow these fixed rules, without any special ‘formatting’ instructions or external data structure. That means every structure above is nestable and can be combined with any other structure at any time. The ‘document’ shown above is not a file, but a data pattern. It can be read from and written to at any level of hierarchical detail. The ‘IndexedSamples’ table is not a file but a pattern of Items, as are the ‘Picture’ and ‘Pattern Query’ displays.

Database Browser User Interface

Here is the full backend browser page looking at a ‘Trees’ table in the flexible format. The functions of the display components are described in the table ‘Documentation’ in database ‘demo/readonly’ using the flexible format itself.

The ‘Current Prefix’ here is like your ‘current working directory’ in the shell. This prefix contains an ‘Item’ which is composed of strings, longs, doubles, floats, Booleans, dates, indexes, short byte arrays, short byte strings, and short char arrays, but in the flexible format it also can have any combination of two optional additional special ‘EntityClass’ and ‘Attribute’ type components that describe the schema of the Item internally. There are 12 data types in total.

There is no schema structure defined anywhere but inside the flexible-format ‘Items’ themselves. The database is nothing more than an ordered set of these Items. When an Item is inserted, say with the insert button or by a secure Python or Java client, or by a curl command in the shell, the database schema is effectively extended at that moment. Deleting the Item reverses it, leaving behind the exact original structure. Any kind of structure can be created instantly and painlessly: more JSON trees, raw or flexible tables, rows, new attributes, values, nested structures, whatever.
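The idea that the database is nothing more than an ordered set of Items can be illustrated with a minimal Python sketch. This is not the InfinityDB API: the `ToyItemSpace` class and its method names are hypothetical, Items are modeled as plain tuples of strings, and a sorted in-memory list stands in for the real storage engine.

```python
import bisect


class ToyItemSpace:
    """A toy ordered set of Items; each Item is a tuple of components."""

    def __init__(self):
        self._items = []  # kept sorted at all times

    def insert(self, item):
        i = bisect.bisect_left(self._items, item)
        if i == len(self._items) or self._items[i] != item:
            self._items.insert(i, item)

    def delete(self, item):
        i = bisect.bisect_left(self._items, item)
        if i < len(self._items) and self._items[i] == item:
            del self._items[i]

    def suffixes(self, prefix):
        """Return the suffix of every Item that starts with `prefix`."""
        n = len(prefix)
        i = bisect.bisect_left(self._items, prefix)
        result = []
        while i < len(self._items) and self._items[i][:n] == prefix:
            result.append(self._items[i][n:])
            i += 1
        return result


db = ToyItemSpace()
# Inserting Items is all it takes to make a 'Trees' table appear.
db.insert(("Trees", "red fir", "type", "conifer"))
db.insert(("Trees", "oak", "type", "deciduous"))
trees = db.suffixes(("Trees",))

# Deleting the same Items leaves nothing behind.
db.delete(("Trees", "oak", "type", "deciduous"))
db.delete(("Trees", "red fir", "type", "conifer"))
empty = db.suffixes(("Trees",))
```

In this sketch the ‘schema’ is only whatever the prefix scan happens to find, which is why inserting and deleting Items is enough to create and remove structure.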

This self-extending system allows us, for example, to add a brand new EntityClass – think of an EntityClass like a ‘table name’ for now. All raw or flexible data begins with an ‘EntityClass’. An EntityClass or Attribute contains a string to name it, with an EntityClass beginning with an upper case character, an Attribute beginning with a lower case letter, and thereafter zero or more letters, digits, dot, dash, or underscore.  (The .-_ can be used for Morse code when necessary.) Here is an Item with an EntityClass called ‘Trees’ and then an ‘entity’ data component for the tree type, “red fir” – which is like a key – then a new Attribute we are creating at the same time called “type” and then “conifer” at the end for the value of the Attribute:
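The naming rule for EntityClasses and Attributes can be written as a pair of regular expressions. The sketch below is one reading of the rule, assuming ASCII letters only; the function name `classify_name` is hypothetical and not part of any InfinityDB API.

```python
import re

# Assumed reading of the rule: one leading letter whose case decides the kind,
# followed by zero or more letters, digits, dots, dashes, or underscores.
ENTITY_CLASS = re.compile(r"[A-Z][A-Za-z0-9._-]*")
ATTRIBUTE = re.compile(r"[a-z][A-Za-z0-9._-]*")


def classify_name(name):
    """Return 'EntityClass', 'Attribute', or None if the name fits neither rule."""
    if ENTITY_CLASS.fullmatch(name):
        return "EntityClass"
    if ATTRIBUTE.fullmatch(name):
        return "Attribute"
    return None


kinds = [classify_name(n) for n in ("Trees", "type", "Location", "9lives")]
```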

You can read this Item like a sentence: “There is a Tree called a red fir whose type is conifer”. Now just insert it with the Insert button, and there is a new table:

If you delete that Item, the table vanishes, leaving nothing at all behind. Now you can edit the Item in the current prefix line to put in a new tree – “oak” as a “deciduous”.

When it is inserted, there are now two rows. You can get back the Items by clicking on the table display – so the original Items re-appear in the current prefix. You can delete the entire table with the ‘delete suffixes’ button after clicking the ‘Trees’ EntityClass to set the current prefix. Click ‘commit’ from time to time when you like the current state, so you can click rollback to go back there. (This is the ‘global’ transaction feature, not the fine-grained ‘Optimistic’ ACID feature obtained by checking ‘Transactional’ explained elsewhere).

Adding a column happens when an Item with a brand new Attribute is inserted. Let’s add “hardwood” as an attribute with the value ‘false’. (Oak is in fact a hardwood; the value is used here only for illustration.) The insert causes the table to widen by the new column. The cell for ‘red fir’ under hardwood is gray because it has no Item. Insert false there and it goes white. Each white cell has an Item in the database.
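The way columns follow from the data, rather than from a declared schema, can be sketched in a few lines of Python. This is only an illustration: Items are simplified to (entity, attribute, value) triples, and `build_table` is a hypothetical helper, not how the browser is implemented.

```python
def build_table(items):
    """Group (entity, attribute, value) Items into rows; columns come from the data."""
    columns = sorted({attribute for _, attribute, _ in items})
    rows = {}
    for entity, attribute, value in items:
        rows.setdefault(entity, {}).setdefault(attribute, []).append(value)
    # None marks a cell with no Item behind it (a gray cell in the browser).
    table = {
        entity: [cells.get(column) for column in columns]
        for entity, cells in sorted(rows.items())
    }
    return columns, table


items = [
    ("red fir", "type", "conifer"),
    ("oak", "type", "deciduous"),
    ("oak", "hardwood", False),  # the first Item with a brand new Attribute
]
columns, table = build_table(items)
```

Here the ‘hardwood’ column exists only because one Item mentions it, and the ‘red fir’ row has no value in that column until an Item for it is inserted.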

You can edit the data cells. Here is the edit box containing “deciduous”: you can update it and hit the checkmark, hit plus after changing the text in the box to get a new entry, or hit minus to delete the entry in the edit box. The edit box appears when you click on the already selected cell. You can’t add structure this way, only ‘data’. For structure changes, you use the Current Prefix. Structure change basically means creating a brand-new EntityClass or Attribute. The table GUI finds the Attributes to display by looking forwards a bit in the database from the Current Prefix.

The tables we get this way are more flexible than regular tables. You can put structure inside the cells as well as data. Suppose we add ‘larch’, and then discover that ‘conifer’ and ‘deciduous’ are not mutually exclusive! We can put in multiple values to fix it. This is not possible in a standard tabular or relational DBMS. Hence standard relational tables require an initial analysis ‘pass’ by the developers in which they figure out the desired capabilities of an application once and for all, and then create the schema and write apps that assume that schema. InfinityDB does not require that, and so is more ‘agile’.

Now suppose we find that we want to store facts about red as well as white oak. We can edit the “oak” cell to add “red” and hit ‘+’.

And now we have:

Now there is a row starting with “oak” “red”, and we can put in the hardwood and type attributes for it. The original “oak” row can be changed to “oak” “white” by clicking and editing and clicking the checkmark. The result is a table with ‘composite keys’. The fact that there are multiple components in some keys is OK. Any white cell can have any combination of any number of components of the 10 ‘primitive’ data types, including string, number, Boolean, date and so on.

We call the white-cell components under ‘Trees’ ‘entities’, and those under ‘hardwood’ or ‘type’ ‘values’. A sequence of zero or more primitive components in a white cell is a ‘tuple’. Each of the multiple values of an attribute can be a different tuple. So the “conifer” and “deciduous” of the “larch” form two values, not a tuple. If you move the pointer over a tuple, it goes yellow, including all of the primitive values in it, so you can distinguish multi-component tuples from multi-value attributes if they wrap. Each value can be a tuple, with no limit on the number of values. Tuples should stay relatively short, though.

Relational systems cannot have keys of varying numbers of columns – this is baked in, and impossible to change later. Also, relational systems limit the data types of the keys and values to a fixed type. This is good and bad, because the limitation keeps the data ‘clean’, but it precludes extension. Using varying-length tuples, we can do things like create hierarchies, where the entity tuples represent the paths to the substructure. Using varying data types, we can still expand when the data type assumptions turn out later to be too limiting. We can combine numbers and text, to form hierarchically numbered sections that can have titles too.

We’ll extend this even further below, but as an aside, here is a bit about the non-tabular modes.

The JSON and Item formats

If you need the JSON data format, here is our Trees table (click on ‘Trees’, select ‘show as Extended JSON’ and click load). You can edit, cut and paste and save it or email it. The initial ‘Trees’ EntityClass component is implied. Note that the keys are Attributes in some places – we can actually use any of the 12 data types for keys or values or list elements. To get rid of that behavior, see ‘Underscore quoted JSON’, in which the 12 data types are encoded as strings with an initial underscore to identify them for compatibility with standard JSON. Plain strings that happen to have an initial underscore have one more ‘stuffed in’ at the front to avoid being interpreted as non-string data types. There are data types for raw hex-displayed short byte arrays or byte ‘strings’ or short char arrays with which we can store BLOBs, even in JSON. A JSON list uses the ‘Index’ data type.

{
    "larch" : {
        type : {
            "conifer" : null,
            "deciduous" : null
        }
    },
    "oak" : {
        hardwood : false,
        type : "deciduous",
        "red" : null
    },
    "red fir" : {
        hardwood : false,
        type : "conifer"
    }
}
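The underscore-stuffing rule described for ‘Underscore quoted JSON’ can be sketched as a pair of helper functions. Only the handling of plain strings is shown; the helper names are hypothetical, the reading that a single leading underscore always marks a non-string type is an assumption, and the actual encodings of the non-string types are not reproduced here.

```python
def quote_plain_string(text):
    """Stuff one extra underscore in front of a plain string that already starts with one."""
    return "_" + text if text.startswith("_") else text


def unquote_plain_string(token):
    """Reverse quote_plain_string; return None for tokens that are not plain strings."""
    if token.startswith("__"):
        return token[1:]
    if token.startswith("_"):
        return None  # assumed: a single leading underscore marks a non-string data type
    return token


round_trip = unquote_plain_string(quote_plain_string("_tmp"))
```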

Here are the underlying Items for the Trees (Click on ‘Trees’, Select Show as Items and hit load). The initial ‘Trees’ EntityClass component is implied. The text is in the standard InfinityDB ‘tokenized’ representation for components.

"larch" type "conifer"
"larch" type "deciduous"
"oak" hardwood false
"oak" type "deciduous"
"oak" "red"
"red fir" hardwood false
"red fir" type "conifer"

The secure Python and other RESTful access uses JSON, while the secure Java ‘ItemPacket’ protocol uses discrete Items or batches in their underlying binary form for extreme speed. The browser can also show, edit and store ‘i code Java Style’, ‘i code Python Style’, Lists of CSV, Sets of CSV, Lists of Strings, or plain text CLOBs. The i code formats make editing PatternQueries as text more familiar to some users.
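The correspondence between the Item list and the JSON tree shown earlier in this section can be reproduced with a short Python sketch. It is an illustration under assumptions: Attributes are modeled as plain strings, and the collapsing rule (a node whose only child has no children of its own is written as a plain value) is inferred from the example output rather than taken from an InfinityDB specification.

```python
def items_to_tree(items):
    """Build a nested dict in which each component is a key under its prefix."""
    root = {}
    for item in items:
        node = root
        for component in item:
            node = node.setdefault(component, {})
    return root


def collapse(node):
    """Turn leaves into None and single-leaf nodes into plain values."""
    if not node:
        return None
    if len(node) == 1:
        key, child = next(iter(node.items()))
        if not child:
            return key
    return {key: collapse(child) for key, child in node.items()}


items = [
    ("larch", "type", "conifer"),
    ("larch", "type", "deciduous"),
    ("oak", "hardwood", False),
    ("oak", "type", "deciduous"),
    ("oak", "red"),
    ("red fir", "hardwood", False),
    ("red fir", "type", "conifer"),
]
tree = collapse(items_to_tree(items))
```

With these seven Items, `tree` has the same shape as the Extended JSON listing, with Python's `None` and `False` standing in for JSON's `null` and `false`.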

Nested Tables

Now back to tabular view. We will add more info to the flexible Trees table because we realized we have nurseries that stock them for sale. (In a relational system, we would create a new Nurseries table and have a connection table with a fixed composite key of nursery id and species id including quantities, then set up relational integrity maintenance mechanisms.) We can transform and query our Trees table easily with the pattern query feature into a different form at any time, but we want other people and fixed data sources like IoTs and older databases like backups or distributed databases to remain compatible with the existing structure.  (And, we don’t need three tables and annoying joins every time we access it as in relational systems.) So let’s add a subtable to each tree that lists the nurseries and on-hand stock. We click on the “oak” “red” to get back the Trees “oak” “red” Item, then add to it:

This can be read “There is a tree called ‘oak’ and it is a ‘red’ subtype, which is in a nursery in the location aptos in quantity 2”. We start ‘nursery’ with lower case to make it an Attribute, and ‘Location’ with upper case so it is an ‘EntityClass’. Now we have a nested table:

This kind of extension is almost limitless. Any attribute can have any number of values, distinct nested tables, distinct nested attributes, lists, pictures or BLOBs all at once. The structure grows as data flows in. It is determined entirely by the placement of components in the Items that flow in and out. Any possible set of Items has a unique corresponding representation, either in the table display, JSON, or a text Item list. A particular sensor or script producing data will often keep sending in Items of the same structure, but new data sources come along all the time, with new structure to merge in. If the structure becomes limiting in some way, it can be transformed almost limitlessly using the ‘Pattern Query’ feature.

The semantics of the Items containing EntityClasses and Attributes depends only on how pairs of them occur in the Item. There are four ways to pair them at any position in the Item:

Pairing                                               Meaning
EntityClass then data then Attribute then data        a ‘table’
EntityClass then data then EntityClass then data      a ‘sub-table’
Attribute then data then Attribute then data          a ‘sub-attribute’
Attribute then data then EntityClass then data        a ‘nested table’

The data parts are any adjacent sequence of zero or more of the 10 primitive data types. The adjacent primitives are a ‘tuple’. The tuple after an EntityClass is an ‘entity’, and the tuple after an Attribute is a ‘value’.
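The four pairings can be sketched as a small lookup in Python. The `EntityClass` and `Attribute` classes below are hypothetical stand-ins for the two special data types, and the sample Item is reconstructed from the prose reading of the nursery example, so treat it as an assumption rather than a copy of the stored data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EntityClass:
    name: str


@dataclass(frozen=True)
class Attribute:
    name: str


PAIRINGS = {
    (EntityClass, Attribute): "table",
    (EntityClass, EntityClass): "sub-table",
    (Attribute, Attribute): "sub-attribute",
    (Attribute, EntityClass): "nested table",
}


def pairings(item):
    """List the meaning of each adjacent pair of EntityClass/Attribute components."""
    # Primitive components between the pair are the tuple and do not affect the meaning.
    meta = [c for c in item if isinstance(c, (EntityClass, Attribute))]
    return [
        (first.name, second.name, PAIRINGS[type(first), type(second)])
        for first, second in zip(meta, meta[1:])
    ]


item = (EntityClass("Trees"), "oak", "red", Attribute("nursery"),
        EntityClass("Location"), "aptos", Attribute("quantity"), 2)
result = pairings(item)
```

Note that the tuple between ‘nursery’ and ‘Location’ is empty here, which is allowed because a tuple may have zero components.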

Examples of the rich GUI displays you will see for different simple and complex Item patterns are shown in the database ‘demo/readonly’ and the EntityClass ‘Documents’ at https://infinitydb.com:37411 (user name ‘testUser’, password ‘db’).

The Index Data Type

There is also an ‘index’ primitive data type that allows lists of Items to be described, as shown in the ‘Documentation’ EntityClass above, where the paragraphs are numbered by ‘[n]’ components. These can be appended to easily, and they form lists in the JSON form, which are easily editable to handle renumbering if you insert new paragraphs in the middle.
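How index components turn a set of Items into a list, and why renumbering is easiest in list form, can be sketched as follows. The `Index` class is a hypothetical stand-in for the index data type, zero-based numbering is an assumption, and the ‘Overview’ section title in the prefix is made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class Index:
    n: int


def list_to_items(prefix, values):
    """Store each list element as one Item: prefix, then [n], then the element."""
    return [prefix + (Index(i), value) for i, value in enumerate(values)]


def items_to_list(prefix, items):
    """Collect the elements under `prefix` in index order."""
    n = len(prefix)
    indexed = [item[n:] for item in items if item[:n] == prefix]
    return [value for _, value in sorted(indexed, key=lambda pair: pair[0])]


prefix = ("Documentation", "Overview", "description")
items = list_to_items(prefix, ["first paragraph", "third paragraph"])

# Inserting in the middle: edit the list form, then write it back renumbered.
paragraphs = items_to_list(prefix, items)
paragraphs.insert(1, "second paragraph")
renumbered = list_to_items(prefix, paragraphs)
```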

Dialing it Back with Access Permissions

Of course this is so flexible that it can get out of hand if everyone is just adding data any time they want. So in an organization, there will be various means of coordinating the changes. This is done currently by having multiple databases with different access permissions for each role, and users can be given multiple roles. Then, a user can own a database and use it for loosely organized things, or a more formal database’s structure can be maintained by agreement at routine meetings in various groups, or a single person or two might be the controllers of the structure of a ‘curated’ database, which can be made read-only to many other users and data sinks via role permissions. There is a single ‘admin’ user who creates databases, users, roles, and permissions.

Pattern Queries

A very powerful feature is the ‘Pattern Query’, which can transform an ItemSpace in a wide variety of ways based on only an input ‘pattern’ and an output ‘result’ plus a ‘Where’ table. These query definition elements are stored as normal Items, so they can be viewed in tabular form in the web-based data browser and editor or used in any other way as data themselves. They can be named and individually documented for re-use. The definition pattern and result can resemble tables and other flexible structure given by EntityClass and Attribute components that may exist in the Items.

The queries can do these things with simple sets of a few definition Items and no SQL or other language:

  • Match patterns of input Items with ‘symbols’, literals, and expressions
    • Match multiple correlated input Items – these ‘join’
  • Output results creating new Items from symbols, literals, and expressions
  • Follow rules based on attributes of the symbols
  • Execute ‘reducers’ such as counters, sums, statistics, secure hashing, timers, randomizers and more

These capabilities are much more powerful than the relational ‘select’, ‘project’, ‘join’, and ‘order by’, but far simpler.

The results can be easily moved, copied, trimmed, annotated, filtered, simplified, canonicalized, sorted, restructured, re-nested, and more, with no complex syntax like SQL. See the PatternQuery Reference, PatternQuery Perspective and PatternQuery Implementation for details.
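The pattern-and-result idea can be conveyed with a toy matcher in Python. This is not the PatternQuery engine or its syntax: writing symbols as strings that start with ‘=’ is only a convention chosen for this sketch, the ‘ByType’ output name is made up, and joins, expressions, the ‘Where’ table and reducers are all left out.

```python
def is_symbol(component):
    return isinstance(component, str) and component.startswith("=")


def match(pattern, item):
    """Return symbol bindings if `item` fits `pattern`, else None."""
    if len(pattern) != len(item):
        return None
    bindings = {}
    for expected, actual in zip(pattern, item):
        if is_symbol(expected):
            # A symbol binds on first use and must agree on later uses.
            if bindings.setdefault(expected, actual) != actual:
                return None
        elif expected != actual:
            return None
    return bindings


def run_query(pattern, result, items):
    """Emit one output Item per matching input Item, substituting bound symbols."""
    output = set()
    for item in items:
        bindings = match(pattern, item)
        if bindings is not None:
            output.add(tuple(bindings[c] if is_symbol(c) else c for c in result))
    return sorted(output)


items = [
    ("Trees", "larch", "type", "conifer"),
    ("Trees", "larch", "type", "deciduous"),
    ("Trees", "oak", "hardwood", False),
    ("Trees", "oak", "type", "deciduous"),
    ("Trees", "red fir", "type", "conifer"),
]
pattern = ("Trees", "=tree", "type", "=kind")
result = ("ByType", "=kind", "=tree")
inverted = run_query(pattern, result, items)
```

The output re-nests the same facts by type instead of by tree, which is the kind of restructuring the text describes, although the real feature does far more.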

Transfer Suffixes

Another back-end feature is the ability to move data quickly based on a pair of prefixes. Any Item can be a source or destination prefix, and data can be moved within or between databases, for backup, database copying or structure re-organization. The operands are the two sets of suffixes. The operation can copy, move, union, difference, or intersect the two sets of suffixes. Data ‘aliasing’ is handled – for example, the error that occurs in Unix when a directory is moved ‘inside itself’, or ‘outside itself’ does not occur. The operation does not depend on the pattern of components in the Items.
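The suffix-set operations can be sketched with ordinary Python sets. The exact semantics of each operation are an assumption made for this example (copy and move replace the destination suffixes, and the other three combine destination with source), and `transfer` is a hypothetical function, not the server's implementation.

```python
def suffixes(items, prefix):
    """Return the set of suffixes of the Items that start with `prefix`."""
    n = len(prefix)
    return {item[n:] for item in items if item[:n] == prefix}


def transfer(items, source, destination, operation):
    """Return a new Item set with `operation` applied to the two suffix sets."""
    # Both operands are captured before anything changes, so overlapping
    # prefixes cannot alter them part-way through in this sketch.
    src = suffixes(items, source)
    dst = suffixes(items, destination)
    new_dst = {
        "copy": src,
        "move": src,
        "union": dst | src,
        "difference": dst - src,
        "intersect": dst & src,
    }[operation]
    kept = {i for i in items if i[:len(destination)] != destination}
    if operation == "move":
        kept = {i for i in kept if i[:len(source)] != source}
    return kept | {destination + suffix for suffix in new_dst}


items = {
    ("Trees", "oak", "type", "deciduous"),
    ("Trees", "larch", "type", "conifer"),
    ("Backup", "oak", "type", "deciduous"),
}
copied = transfer(items, ("Trees",), ("Backup",), "copy")
moved = transfer(items, ("Trees",), ("Archive",), "move")
```

Because only prefixes and suffixes are involved, the sketch never looks at which components are EntityClasses or Attributes, matching the statement that the operation does not depend on the pattern of components.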

Custom Structures

The types of structures we have discussed above do not limit the uses. Applications sometimes create specific structures, such as text indexes or time-series databases. These will show up in the browser/editor as ‘raw’ Items of the 10 primitive data types without the EntityClass or Attribute data types.

Differences with the Embedded Version

Many applications are light-weight, using only one or a few database files, and they use only InfinityDB Embedded. No back-end server is necessary for many applications, but the database is then used in a single JVM process. Security and administration of an InfinityDB Embedded file is much simpler. InfinityDB Client/Server requires a directory structure to store web pages, data files, SSL keys and certificates, and encrypted metadata. InfinityDB Client/Server requires an ‘admin’ user – at least at first – to manage users, roles, permissions, and databases. The need for these arises as soon as data is exposed on a port, and a simplistic remote socket connection protocol on the other hand cannot easily add security and other features.

However, once an InfinityDB Client/Server instance is running, little administration is required, because schema changes and extension are so easy in the ItemSpace data model. There are no SQL scripts for upgrade/downgrade or for the countless maintenance issues that arise in other systems.

Licensing

We provide our clients with free online trials and limited personal trial data storage at https://infinitydb.com:37411, and we license the InfinityDB Client/Server software for private on-premises use, such as behind a firewall or on clients’ secure public servers. We license it for inclusion in clients’ products as well. We are working on providing servers within IoT devices, so that a device can collect data on its own, and then provide it to applications or users on demand.

Amazon Cloud Servers

Finally, we are working on providing client-operated cloud servers on Amazon Elastic Compute Cloud EC2. This will be done through the AWS Marketplace, where the client launches an AMI provided by us, and then pays the usage fees for it plus our small fee on top while it is running. Such an arrangement provides highly secure private storage.

Contact us at support@boilerbay.com.