July 18, 2005

EAV vs RDBMS

 This problem is handled in SQL products either by using a secondary index or by
'full table scan', if there is no index, which is the exhaustive search. I always have
used a 'secondary index' in InfinityDB B-Tree but in the EAV model its called an Inversion.

First, lets talk about the original data, the 'table' just to be sure we agree on
the terminology and so on.  Each message has an identifier
which is the MessageID and this we call the Entity - a 'key' to an rdbms. Then, each
field is stored in a separate Item formatted like:

<MessageID, 12345, From, a@b.com>

where MessageID is an EntityClass component, 12345 is a long component,
>From is an Attribute component, and a@b.com is a string component. These
components are concatenated into an Item and stored as a unit using methods
on the Cu cursor class.  (Note that somewhere we have static final
EntityClass MessageID = new EntityClass(33/*unique*/)
somewhere in our code, and also static final Attribute From = new Attribute(3/*unique*/).)
The Items we construct are just seqences of bytes that are stored atomically in sorted
order.

In order to store the To 'field' we use:

<MessageID, 12345, To, z@x.com>

Where the MessageID is the same EntityClass, 12345 is the same long component
so that these Items store together, To is an Attribute, and z@x.com is a string
component again. There is one more Item to fill out this 'row':

<MessageID, 12345, Sent, 01/01/2004>

These Items sort together because they start with . Now with
this, you can enumerate messages by message id between any two values
either direction or do direct access on any message id.

Now to be able to search and sort by Sent date, we need an inversion. An inversion
is formed by pairing two Items that mean the same thing but have the components
reordered and changed a bit. Thus we also store an Item:

<MessageSentDate, 0101/2004, MessageIdSent, 12345>

which is the inverse of the first Item above. Each  Item with the Sent Attribute
in the original data has a corresponding inverse Item that is stored in the
db in a different place. In this one, the Sent Attribute corresponds
to the MessageSentDate EntityClass, and the MessageIdSent Attribute corresponds
to the MessageId EntityClass. Every Item makes a simple statement of fact, and
inverses are Items that make the same statement in a different way.   You can now use the first(), next(), last()
and previous() methods on the database to direct access or scan a subrange with
high efficiency. As you scan the MessageSentDate EntityClass this way in date
order, you can retrieve the MessageIdSent attributes, which are message identifiers,
and then go use them to retrieve the original information sorted by message id.

The same basic technology is used internally in an rdbms, but you don't have the
advantages of composite attributes or even composite attributes with a varying
number of components, multi-value attributes, heterogenous-typed
keys and attributes, zero-space null fields, zero-space empty relations or indexes, dynamic schema
extension, array fields, or totally custom high-performance structures like
directory trees, full-text indexes, and so on..

For example code, see the Patient example in the examples directory. Still intrigued?
Ask more! Hope I have answered your question with this long answer.

Roger Deran


X wrote:

> Hi Guys,
>  
> We're evaluating Infinity for use in as an embedded db in a spam filtering product. I've already imported a few thousand messages to test with and things are going great. I'm a bit stuck on a retrieval question though, and am having trouble thinking through a solution.
>  
> We've got messages stored in an EAV structure per your doco with the following fields stored for mail items.
>  
> MessageID    12345    From    a@b.com    To    z@x.com    Sent    01/01/2004    ...
>  
> Now, in our SQL implementation, we could do an "where Sent > 30/11/2004 order by Sent DESC" to get the last 30 days or so of mail. Which I think is what you guys call a dynamic query?
>  
> What's the best way to implement something like that?  I thought one way was to store a secondary index on Sent date. But I'm assuming I'll still have to do an exhaustive iteration of the secondary index to see which values fit my query date range? Is that right, or are there way to take advantage of the BTree's ordering so I can immediately cut down my iteration to values within a date range? (eg a subtree)
>  
> Thanks heaps,
>  
> X.


Posted 5 years, 7 months ago on July 18, 2005
The trackback url for this post is http://boilerbay.com/infinitydb/forum/bblog/trackback.php/22/

...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation

Comments have now been turned off for this post