July 18, 2005
EAV vs RDBMS
This problem is handled in SQL products either by using a secondary index or by 'full table scan', if there is no index, which is the exhaustive search. I always have used a 'secondary index' in InfinityDB B-Tree but in the EAV model its called an Inversion. First, lets talk about the original data, the 'table' just to be sure we agree on the terminology and so on. Each message has an identifier which is the MessageID and this we call the Entity - a 'key' to an rdbms. Then, each field is stored in a separate Item formatted like: <MessageID, 12345, From, a@b.com> where MessageID is an EntityClass component, 12345 is a long component, >From is an Attribute component, and a@b.com is a string component. These components are concatenated into an Item and stored as a unit using methods on the Cu cursor class. (Note that somewhere we have static final EntityClass MessageID = new EntityClass(33/*unique*/) somewhere in our code, and also static final Attribute From = new Attribute(3/*unique*/).) The Items we construct are just seqences of bytes that are stored atomically in sorted order. In order to store the To 'field' we use: <MessageID, 12345, To, z@x.com> Where the MessageID is the same EntityClass, 12345 is the same long component so that these Items store together, To is an Attribute, and z@x.com is a string component again. There is one more Item to fill out this 'row': <MessageID, 12345, Sent, 01/01/2004> These Items sort together because they start with. Now with this, you can enumerate messages by message id between any two values either direction or do direct access on any message id. Now to be able to search and sort by Sent date, we need an inversion. An inversion is formed by pairing two Items that mean the same thing but have the components reordered and changed a bit. Thus we also store an Item: <MessageSentDate, 0101/2004, MessageIdSent, 12345> which is the inverse of the first Item above. Each Item with the Sent Attribute in the original data has a corresponding inverse Item that is stored in the db in a different place. In this one, the Sent Attribute corresponds to the MessageSentDate EntityClass, and the MessageIdSent Attribute corresponds to the MessageId EntityClass. Every Item makes a simple statement of fact, and inverses are Items that make the same statement in a different way. You can now use the first(), next(), last() and previous() methods on the database to direct access or scan a subrange with high efficiency. As you scan the MessageSentDate EntityClass this way in date order, you can retrieve the MessageIdSent attributes, which are message identifiers, and then go use them to retrieve the original information sorted by message id. The same basic technology is used internally in an rdbms, but you don't have the advantages of composite attributes or even composite attributes with a varying number of components, multi-value attributes, heterogenous-typed keys and attributes, zero-space null fields, zero-space empty relations or indexes, dynamic schema extension, array fields, or totally custom high-performance structures like directory trees, full-text indexes, and so on.. For example code, see the Patient example in the examples directory. Still intrigued? Ask more! Hope I have answered your question with this long answer. Roger Deran X wrote: > Hi Guys, > > We're evaluating Infinity for use in as an embedded db in a spam filtering product. I've already imported a few thousand messages to test with and things are going great. I'm a bit stuck on a retrieval question though, and am having trouble thinking through a solution. > > We've got messages stored in an EAV structure per your doco with the following fields stored for mail items. > > MessageID 12345 From a@b.com To z@x.com Sent 01/01/2004 ... > > Now, in our SQL implementation, we could do an "where Sent > 30/11/2004 order by Sent DESC" to get the last 30 days or so of mail. Which I think is what you guys call a dynamic query? > > What's the best way to implement something like that? I thought one way was to store a secondary index on Sent date. But I'm assuming I'll still have to do an exhaustive iteration of the secondary index to see which values fit my query date range? Is that right, or are there way to take advantage of the BTree's ordering so I can immediately cut down my iteration to values within a date range? (eg a subtree) > > Thanks heaps, > > X.
Posted 5 years, 7 months ago on July 18, 2005
The trackback url for this post is http://boilerbay.com/infinitydb/forum/bblog/trackback.php/22/
The trackback url for this post is http://boilerbay.com/infinitydb/forum/bblog/trackback.php/22/
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
...
Comment pending moderation
Comments have now been turned off for this post