With Doradus OLAP, links can only reference objects in the same shard. A link such as Person.Manager cannot reference an object in another shard. (In fact, setting Person.Manager to a given object ID implicitly creates the inverse object in the same shard if it does not already exist.) This means that some objects may need to be duplicated in multiple shards so that each shard is a complete graph, allowing queries to work efficiently. Because OLAP stores data compactly, the duplication is worthwhile since same-shard link path evaluation is extremely fast.
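The same-shard behavior described above can be illustrated with a small in-memory model. This is only a sketch, not Doradus code: the `shards` dictionary, the `set_link` helper, and the inverse link name `DirectReports` are assumptions made for the example.

```python
# Hypothetical model: shard name -> {object ID -> {link name -> set of IDs}}.
shards = {"2014-01": {}, "2014-02": {}}

def set_link(shard, obj_id, link, target_id, inverse):
    """Set a link inside one shard, creating the target object if needed."""
    objects = shards[shard]
    obj = objects.setdefault(obj_id, {})
    # The target is looked up (and implicitly created) in the SAME shard only.
    target = objects.setdefault(target_id, {})
    obj.setdefault(link, set()).add(target_id)
    target.setdefault(inverse, set()).add(obj_id)

# "DirectReports" is an assumed name for the inverse of Person.Manager.
set_link("2014-02", "p2", "Manager", "p1", "DirectReports")
```

After the call, object `p1` exists in shard `2014-02` even though it was never added explicitly, and nothing is created in shard `2014-01`.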
<field name="InReplyTo" type="XLINK" table="Message" inverse="Responses" junction="ThreadID"/>
<field name="Responses" type="XLINK" table="Message" inverse="InReplyTo" junction="_ID"/>
• The _ID of a root message that begins a new conversation thread is used as the thread ID.
• We can then traverse Message.Responses to navigate from the root message to other messages in the same thread. To do this, Doradus takes the root message’s _ID (because it is the junction field for Responses) and searches for messages in other shards whose ThreadID matches (because it is the junction field for the inverse link, InReplyTo).
• Similarly, we can traverse Message.InReplyTo to navigate from any message back to the root message. In this case, Doradus takes the message’s ThreadID and searches for another message with a matching _ID.
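The two traversals can be sketched with a small in-memory model. This is an illustration of junction-field matching only, not how Doradus is implemented: the `Message` class, the shard names, and the helper functions are assumptions, and the root message is assumed to leave its ThreadID unset.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

@dataclass
class Message:
    obj_id: str                      # stands in for the _ID field
    thread_id: Optional[str] = None  # stands in for ThreadID; None if unset

# Hypothetical time-oriented shards, each holding its own messages.
shards: Dict[str, List[Message]] = {
    "2014-01": [Message("m1")],                       # root of the thread
    "2014-02": [Message("m2", "m1")],
    "2014-03": [Message("m3", "m1"), Message("m4")],  # m4 is unrelated
}

def all_messages(shards):
    """Yield every message in every shard."""
    for messages in shards.values():
        yield from messages

def responses(shards, msg):
    """Responses: take msg's _ID and match it against other ThreadIDs."""
    return [m for m in all_messages(shards) if m.thread_id == msg.obj_id]

def in_reply_to(shards, msg):
    """InReplyTo: take msg's ThreadID and match it against other _IDs."""
    if msg.thread_id is None:
        return []
    return [m for m in all_messages(shards) if m.obj_id == msg.thread_id]

root = shards["2014-01"][0]
reply = shards["2014-03"][0]
reply_ids = [m.obj_id for m in responses(shards, root)]
root_ids = [m.obj_id for m in in_reply_to(shards, reply)]
```

Here `reply_ids` holds the messages found in later shards, and `root_ids` leads back to the root in the oldest shard. The linear scan is for clarity only.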
One consideration behind this example is shard merging. In an OLAP database that uses time-oriented shards, we generally want to add data to new shards, which are then merged. We don’t want to modify data in older shards if possible because this requires extra merging. In the example above, message threads are formed by simply setting the ThreadID of newer messages. Older messages in the thread, including the root message, are never modified, hence we don’t need to merge older shards.
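The effect on merging can be seen in a minimal sketch that tracks which shards have unmerged changes. The `needs_merge` set and the `add_message` and `merge` helpers are hypothetical bookkeeping for the example and do not correspond to a Doradus API.

```python
from collections import defaultdict

shards = defaultdict(list)  # shard name -> list of message records
needs_merge = set()         # shards holding updates not yet merged

def add_message(shard, obj_id, thread_id=None):
    """Add a message to a shard; only that shard becomes dirty."""
    shards[shard].append({"_ID": obj_id, "ThreadID": thread_id})
    needs_merge.add(shard)

def merge(shard):
    """Pretend to merge a shard, clearing its dirty flag."""
    needs_merge.discard(shard)

# The root message is stored and merged in an older shard.
add_message("2014-01", "m1")
merge("2014-01")

# A later reply joins the thread purely by setting its own ThreadID.
add_message("2014-02", "m2", thread_id="m1")
dirty_after_reply = set(needs_merge)
```

Because the reply carries the thread ID itself, `dirty_after_reply` contains only the new shard and the root message's shard is left untouched.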
• Each xlink identifies a junction field, which must be a text field belonging to the same table or the _ID field. The junction field is a foreign key to related objects. In a given relationship, at least one xlink must use the _ID field as its junction field. If the junction field is not explicitly defined, it defaults to the _ID field.
• The xlink InReplyTo defines ThreadID as its junction field. This means an object is related via InReplyTo to the message(s) whose _ID matches its ThreadID.
• The xlink Responses uses _ID as its junction field. This means an object is related via Responses to the message(s) whose ThreadID matches its _ID.
• If both xlinks in a relationship use _ID as their junction field, each object is related to objects with the same object ID. This is allowed even if the xlinks are defined in different tables.
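The junction rules above can be expressed as a small check over a pair of xlink definitions. This is a sketch under stated assumptions: `check_xlink_pair` and `junction_of` are hypothetical helpers, not part of Doradus, and the check covers only the default and the "at least one _ID" rule, not whether the junction is a text field of the table.

```python
import xml.etree.ElementTree as ET

IN_REPLY_TO = ('<field name="InReplyTo" type="XLINK" table="Message" '
               'inverse="Responses" junction="ThreadID"/>')
RESPONSES = ('<field name="Responses" type="XLINK" table="Message" '
             'inverse="InReplyTo" junction="_ID"/>')

def junction_of(field):
    """Return the junction field name, defaulting to _ID when omitted."""
    return field.get("junction", "_ID")

def check_xlink_pair(xml_a, xml_b):
    """Check two inverse xlink definitions and return their junction fields."""
    a, b = ET.fromstring(xml_a), ET.fromstring(xml_b)
    for field in (a, b):
        if field.get("type", "").upper() != "XLINK":
            raise ValueError(f"{field.get('name')} is not an xlink")
    if a.get("inverse") != b.get("name") or b.get("inverse") != a.get("name"):
        raise ValueError("the two xlinks are not inverses of each other")
    junctions = (junction_of(a), junction_of(b))
    if "_ID" not in junctions:
        raise ValueError("at least one xlink must use _ID as its junction")
    return junctions

pair = check_xlink_pair(IN_REPLY_TO, RESPONSES)
```

For the Message example, `pair` is `("ThreadID", "_ID")`. A pair in which neither definition uses _ID raises `ValueError`.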
Xlinks form soft relationships, hence no referential integrity is assured. When a junction field is assigned a value, there may or may not exist any foreign objects with a matching value. Likewise, if two objects are related, the relationship may be broken by altering the junction field value, deleting one of the objects, or shard aging. Traversing an xlink whose junction field doesn’t match any foreign objects acts as if the xlink is null.
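The soft-relationship behavior can be sketched as follows. The `messages` dictionary and the `in_reply_to` helper are assumptions for illustration; the point is only that a junction value with no matching object yields an empty result rather than an error.

```python
# Hypothetical store: object ID -> field values.
messages = {
    "m1": {"ThreadID": None},   # root message
    "m2": {"ThreadID": "m1"},   # reply to m1
    "m3": {"ThreadID": "m9"},   # junction value with no matching object
}

def in_reply_to(obj_id):
    """Follow InReplyTo; an unmatched junction value acts like a null link."""
    thread_id = messages[obj_id]["ThreadID"]
    return [thread_id] if thread_id in messages else []

dangling = in_reply_to("m3")   # no object has _ID "m9"
before = in_reply_to("m2")     # the relationship is intact
del messages["m1"]             # deleting the root breaks it silently
after = in_reply_to("m2")
```

Assigning `"m9"` to ThreadID is accepted even though no such object exists, and deleting `m1` leaves `m2` with a link that simply resolves to nothing.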