• Objects have a timestamp field whose value is stable, meaning it is rarely modified. In the example schema, the Message table’s SendDate field works well because it is rarely modified once a message is created. This timestamp field is used as the sharding field.
• To benefit from a sharded table, queries must include an equality clause or range clause that uses the sharding field. For example, both of the following queries select objects in specific time frames:

Normally, Doradus Spider creates a single term vector for each field/term combination. For example, the term vector with key “Body/the” holds references to all objects that use the term “the” in the field Body. For common terms, the term vector may point to every object in the table, and very large term vectors slow query performance. When sharding is enabled, separate term vectors are created for objects in each shard. Faster searching occurs when the sharding field is then included in queries.
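The difference between one term vector per field/term and one per shard can be illustrated with a small Python sketch. This is a toy model, not Doradus Spider's storage code: the object IDs, the shard numbers, and the `(shard, "Body/term")` key layout are all assumptions made up for illustration.

```python
from collections import defaultdict

# Toy objects: (object ID, shard number, terms found in the Body field).
# IDs and shard numbers are hypothetical.
objects = [
    ("msg1", 1, {"the", "report"}),
    ("msg2", 1, {"the", "meeting"}),
    ("msg3", 2, {"the", "invoice"}),
]

unsharded = defaultdict(set)  # key: "Body/term"
sharded = defaultdict(set)    # key: (shard, "Body/term"); assumed layout

for obj_id, shard, terms in objects:
    for term in terms:
        unsharded[f"Body/{term}"].add(obj_id)
        sharded[(shard, f"Body/{term}")].add(obj_id)

# Without sharding, the vector for a common term references every object.
all_hits = unsharded["Body/the"]

# With sharding, a query limited to shard 2 reads only that shard's vector.
shard2_hits = sharded[(2, "Body/the")]
```

In this model, a query that restricts the sharding field to a time frame only needs the vectors of the shards covering that time frame, which is why the smaller per-shard vectors pay off only when the sharding field appears in the query.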
• sharding-field: This option enables sharding and identifies the sharding field. Its value must be a timestamp field defined in the schema.
• sharding-granularity: This option specifies what time period causes objects to be assigned to a new shard. It can be HOUR, DAY, WEEK, or MONTH. If not specified, it defaults to MONTH. The value should be chosen so that each shard has a reasonable number of objects (< 1 million).
• sharding-start: This option specifies the date on which sharding begins for the table. Objects whose sharding-field value is null or less than the sharding-start value are considered “un-sharded” and assigned to shard #0. Objects whose sharding field is greater than or equal to the sharding-start value are assigned a shard number based on the difference between the two values and the sharding-granularity. If not explicitly assigned, sharding-start defaults to “now”, meaning the timestamp of the schema change that enables sharding.

Table sharding can also benefit certain links that have very high fan-outs. See the description later on Sharded Links.
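The shard assignment rule described for sharding-start and sharding-granularity can be sketched in Python as follows. The function name `shard_number` is hypothetical, and the exact arithmetic is an assumption: the text only states that the number is based on the difference between the two values and the granularity. Here the first period is assumed to be shard 1, MONTH is assumed to count calendar-month boundaries, and the other granularities count whole elapsed periods.

```python
from datetime import datetime, timedelta
from typing import Optional

def shard_number(value: Optional[datetime], start: datetime,
                 granularity: str = "MONTH") -> int:
    """Return a shard number for a sharding-field value (illustrative only)."""
    # Null values and values before sharding-start are un-sharded: shard 0.
    if value is None or value < start:
        return 0
    if granularity == "MONTH":
        # Assumption: count calendar-month boundaries since sharding-start.
        periods = (value.year - start.year) * 12 + (value.month - start.month)
    else:
        # Assumption: count whole fixed-length periods since sharding-start.
        unit = {
            "HOUR": timedelta(hours=1),
            "DAY": timedelta(days=1),
            "WEEK": timedelta(weeks=1),
        }[granularity]
        periods = (value - start) // unit
    # Assumption: the first period after sharding-start is shard 1.
    return periods + 1

start = datetime(2010, 7, 17)
before_start = shard_number(datetime(2010, 7, 1), start)          # un-sharded
same_day = shard_number(datetime(2010, 7, 17), start, "DAY")      # first shard
later_month = shard_number(datetime(2010, 9, 1), start, "MONTH")  # later shard
```

Under these assumptions, a finer granularity such as DAY produces many small shards, while MONTH produces fewer, larger ones, which is the trade-off to weigh against the target of fewer than 1 million objects per shard.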