CIP-85: AppendCollection schemas
Author | Paul Le Cam |
---|---|
Discussions-To | https://github.com/ceramicnetwork/CIP/issues/90 |
Status | Withdrawn |
Category | RFC |
Created | 2021-02-16 |
Requires | 82, 88 |
Table of Contents
Simple Summary
Provide schemas and associated tools to help interact with large lists in Ceramic.
Abstract
This CIP presents a way to store a possibly large list of items split into multiple streams based on two complementary schemas: AppendCollection
and CollectionSlice
.
This allows to create a virtually infinite list of items such as a feed by leveraging multiple Ceramic streams rather than a single one.
Motivation
Storing a list of items in a single Ceramic Tile stream can make the stream grow large in size, possibly exceeding Ceramic’s internal limits for this stream stype. Providing a standard way to represent large lists such as feeds would be useful to provide reference schemas to solve this issue and associated tools to simplify interactions.
Specification
Concepts
Stream references
The schemas use stream references defined in CIP-82 to identify relations.
Cursors
A cursor is a unique identifier for an item in the collection, made of the tuple (CollectionSlice StreamID, item index). A collection slice should contain a maximum of 256 items, so the index of an item in the slice can be represented in a single byte.
Slice tag
A slice tag is a string identifying a unique slice part of a collection based on the index of the slice, as: <collection StreamID string>:<slice index>
.
Schemas
Two schemas are needed to represent a collection: the AppendCollection
schema represents an entry point to the collection, and the CollectionSlice
schema represents a single slice of the full list of items.
A Collection “instance” would therefore be made of 1 AppendCollection
stream and any number of CollectionSlice
streams with cross-references, as presented in the graphic below:
AppendCollection schema
The AppendCollection
schema must be an object
with a $comment
field from CIP-88 using the cip88:appendCollection
prefix and pointing to the slice’s schema
, along with the following properties:
sliceMaxItems
: the maximum number of items a single slice should containslicesCount
: the total number of slices the collection contains
{
$schema: 'http://json-schema.org/draft-07/schema#',
$comment: 'cip88:appendCollection:<slice schema ID>',
title: 'MyCollection',
type: 'object',
properties: {
sliceMaxItems: { type: 'integer', minimum: 10, maximum: 256 },
slicesCount: { type: 'integer', minimum: 1 },
},
required: ['sliceMaxItems', 'slicesCount']
}
CollectionSlice schema
The CollectionSlice
schema must be an object
with a $comment
field from CIP-88 having the value cip88:collectionSlice
, along with the following properties:
collection
: the StreamID of the collection the slice is part ofsliceIndex
: index of the slice in the collection, between0
and the collection’sslicesCount
minus1
contents
: array with amaxItems
value matching theAppendCollection
sliceMaxItems
property and defining the items schemas, that must include{ type: 'null' }
in order to support removals
{
$schema: 'http://json-schema.org/draft-07/schema#',
$comment: 'cip88:collectionSlice',
title: 'MyCollectionSlice',
type: 'object',
properties: {
collection: { type: 'string', maxLength: 150 },
sliceIndex: { type: 'integer', minimum: 0 },
contents: {
type: 'array',
maxItems: 256,
minItems: 0,
items: {
oneOf: [{ type: 'object', ... }, { type: 'null' }],
},
},
},
required: ['collection', 'sliceIndex', 'contents'],
}
Algorithms
Deterministic slice access
Accessing slices in the collection relies on Ceramic’s ability to load streams deterministically based on their metadata by using { deterministic: true, tags: ['<slice tag>'] }
, where slice tag
is a string of <collection StreamID string>:<slice index>
.
First insertion
This flow assumes a prerequisite check that the
AppendCollection
stream has not been created yet.
- Create the
AppendCollection
stream with aslicesCount
of1
. - Create a deterministic
CollectionSlice
stream with a tag using theindex
value0
. - Update the created
CollectionSlice
stream with thecollection
string,sliceIndex
number andcontents
array containing the item to insert.
Other insertions
- Load the
AppendCollection
stream. - Load the most recent
CollectionSlice
stream based on its deterministic content, using theslicesCount
from the collection. - Check the length of the
contents
array of theCollectionSlice
:
- If it is lower than the
sliceMaxItems
value of theAppendCollection
, add the item to thecontents
array. - If it is equal to the
sliceMaxItems
value:- Create a new deterministic
CollectionSlice
stream withindex
value of the previous slice plus1
. - Update the created
CollectionSlice
stream with thecollection
string,sliceIndex
number andcontents
array containing the item to insert. - Update the
AppendCollection
stream with the incrementedslicesCount
.
- Create a new deterministic
Single item loading
Loading a single item is based on the item cursor (slice StreamID + item index in slice):
- Load the
CollectionSlice
stream from its StreamID. - Access the item at the given index in the
contents
array.
Multiple item loading
Loading multiple items can be done in order (from the first
slice) or reverse order (from the last
slice), with an optional cursor
to start from, as described below:
-
first: N
- Load the
CollectionSlice
stream based on its determistic content with asliceIndex
of0
. - Iterate through
contents
filtering outnull
values untilN
items are collected. - If
N
items are not collected, load the next slice deterministically and continue from previous step.
- Load the
-
first: N, after: cursor(slice ID, offset)
- Load the
CollectionSlice
stream from theslice ID
. - Skip the first
contents
items according to theoffset
. - Iterate through
contents
filtering outnull
values untilN
items are collected. - If
N
items are not collected, load the next slice deterministically and continue from previous step.
- Load the
-
last: N
- Load the
AppendCollection
stream. - Load the
CollectionSlice
stream based on its determistic content, using theslicesCount
from the collection. - Iterate through
contents
in reverse order filtering outnull
values untilN
items are collected. - If
N
items are not collected and the first slice is not reached, load the previous slice deterministically and continue from previous step
- Load the
-
last: N, before: cursor(slice ID, offset)
- Load the
CollectionSlice
stream from theslice ID
. - Skip the first
contents
items according to theoffset
. - Iterate through
contents
in reverse order filtering outnull
values untilN
items are collected. - If
N
items are not collected and the first slice is not reached, load the previous slice deterministically and continue from previous step
- Load the
Single item removal
Removing an item consists in replacing the item value by null
in the contents
array identified by the given cursor(slice ID, offset)
:
- Load the
CollectionSlice
stream from theslice ID
. - Access the item from the
contents
array at the givenoffset
. - If the item exists, replace it by
null
.
Rationale
This CIP implements a well-known data structure (a doubly linked list) in a similar way to the ActivityStreams collections.
The implementation and algorithms are meant to support GraphQL’s connections so this interface can easily be used on top of AppendCollection
.
Backwards Compatibility
None, as these are new schemas.
Implementation
None yet.
Security Considerations
The spec defines clear expectations about the number of items a slice can contain, and how cursors should be stable, but as with any other Ceramic stream, it is up to the schema validation and correct spec implementations to ensure the correct behavior is applied.
Copyright
Copyright and related rights waived via CC0.
Citation
Please cite this document as:
Paul Le Cam, "CIP-85: AppendCollection schemas," Ceramic Improvement Proposals, no. 85, February 2021. [Online serial]. Available: https://github.com/ceramicnetwork/CIPs/blob/main/CIPs/cip-85.md