querycontext

package
v3.35.0
Warning

This package is not in the latest version of its module.
Published: Mar 30, 2023 License: Apache-2.0 Imports: 13 Imported by: 0

Documentation

Overview

Package querycontext provides a semi-transactional layer wrapping access to multiple underlying transactional databases.

Background

Within a single transactional database, a transaction provides the ability to make multiple changes atomically (either all or none are visible, you can never observe only some of them), and manages the lifetime of data access; for instance, a database could return objects which point into database-managed storage, but which will be invalidated after the transaction completes. A transaction also provides a stable view of the data; even if you aren't writing, a transaction you open will be able to see the same data across multiple queries, even if changes are being made to the database.

FeatureBase uses multiple transactional databases in parallel, and we want to preserve transactional semantics across these databases as well as we can. We want to be able to see a consistent view of each database, to be able to make multiple changes in sequence which are committed atomically, and to ensure that data we're still using isn't invalidated.

We can approximate this by having an object which tracks transactions for each individual database, and then closes them all at once, or commits them all at once. However, if we do this naively, deadlock becomes very likely: two running queries can each hold a lock until they finish, while each waits for a lock the other is holding.

The QueryContext offers a resolution to this by tracking the scope of each query's prospective writes. Queries register their prospective writes when they're created, and cannot begin running (potentially acquiring locks) until there are no existing queries that could contest any locks with them.
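
The admission rule described above can be illustrated with a toy model. This is a minimal sketch, not the package's implementation: the `scope` and `gate` types and all their methods are hypothetical, and a scope is modeled simply as a set of database keys. The point it shows is that a query declares everything it might lock up front and waits before taking any lock, so two admitted queries can never contest a lock.

```go
package main

import (
	"fmt"
	"sync"
)

// scope is a hypothetical stand-in for a query's prospective writes:
// the set of database keys the query may need to lock.
type scope map[string]bool

// overlaps reports whether two scopes share any key.
func (s scope) overlaps(o scope) bool {
	for k := range s {
		if o[k] {
			return true
		}
	}
	return false
}

// gate admits a query only when no running query has an overlapping scope.
type gate struct {
	mu      sync.Mutex
	cond    *sync.Cond
	running map[int]scope
	nextID  int
}

func newGate() *gate {
	g := &gate{running: map[int]scope{}}
	g.cond = sync.NewCond(&g.mu)
	return g
}

// conflicts must be called with g.mu held.
func (g *gate) conflicts(s scope) bool {
	for _, r := range g.running {
		if r.overlaps(s) {
			return true
		}
	}
	return false
}

// admit must be called with g.mu held; it registers s and returns its id.
func (g *gate) admit(s scope) int {
	g.nextID++
	g.running[g.nextID] = s
	return g.nextID
}

// enter blocks until no running query overlaps s, then registers s.
func (g *gate) enter(s scope) int {
	g.mu.Lock()
	defer g.mu.Unlock()
	for g.conflicts(s) {
		g.cond.Wait()
	}
	return g.admit(s)
}

// tryEnter is the non-blocking variant, used here to observe the rule.
func (g *gate) tryEnter(s scope) (int, bool) {
	g.mu.Lock()
	defer g.mu.Unlock()
	if g.conflicts(s) {
		return 0, false
	}
	return g.admit(s), true
}

// leave unregisters a finished query and wakes any waiting queries.
func (g *gate) leave(id int) {
	g.mu.Lock()
	delete(g.running, id)
	g.mu.Unlock()
	g.cond.Broadcast()
}

func main() {
	g := newGate()
	a := g.enter(scope{"i/0": true})
	_, ok := g.tryEnter(scope{"i/0": true, "i/1": true})
	fmt.Println("overlapping query admitted:", ok)
	b, ok := g.tryEnter(scope{"j/0": true})
	fmt.Println("disjoint query admitted:", ok)
	g.leave(a)
	_, ok = g.tryEnter(scope{"i/0": true, "i/1": true})
	fmt.Println("admitted after the first query finished:", ok)
	g.leave(b)
}
```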

Data Organization

The overall data set maintained by FeatureBase is divided into Indexes (which correspond roughly to SQL tables), Fields, Views, and Shards. Shards divide the set of records into blocks of adjacent records, while the Index/Field/View hierarchy corresponds more to tables and columns. The intersection of {index, field, view, shard} is called a fragment. The QueryContext/TxStore design abstracts away the question of which fragments are stored in which backend database files; you specify your scope in terms of indexes, fields, and shards, and you request access to fragments.

Usage

The major exported types from this package are KeySplitter, QueryContext, QueryScope, and TxStore. In general, you create a TxStore representing your underlying collection of databases, using a KeySplitter to determine how the stored data is split into individual database files. When you wish to operate on data, you request a new QueryContext from the TxStore. If you want to write, you use a QueryScope to identify which parts of the data you want to write to. (The request for a QueryContext will block until it can be satisfied.) The QueryContext then provides read and write access to individual fragments, and handles any necessary multiplexing between fragment access and database access.

Copyright 2022 Molecula Corp (DBA FeatureBase). All rights reserved.

Constants

This section is empty.

Variables

This section is empty.

Functions

func NewFlexibleKeySplitter

func NewFlexibleKeySplitter(indexes ...IndexName) *flexibleKeySplitter

NewFlexibleKeySplitter creates a KeySplitter which uses index/shard splits by default, but splits things into fields if they're in the indexes provided. This design is experimental, and should be considered pre-deprecated for production use for now.

func NewRBFTxStore

func NewRBFTxStore(path string, cfg *rbfcfg.Config, splitter KeySplitter) (*rbfTxStore, error)

NewRBFTxStore creates a new RBF-backed TxStore in the given directory. If cfg is nil, it uses the config produced by `NewDefaultConfig`. All databases will be opened using the same config. If splitter is nil, it uses an index/shard splitter.

With the index/shard key splitter, database directory paths look like `path/indexes/i/shards/00000000`, with each shard directory containing data/wal files.

Types

type FieldName

type FieldName string

type IndexName

type IndexName string

type KeySplitter

type KeySplitter interface {

	// Scope() yields a new scope which is aware of this KeySplitter
	// and will give correct results for Overlap calls.
	Scope() QueryScope
	// contains filtered or unexported methods
}

KeySplitter knows how to convert fragment identifiers (index/field/view/shard) into database backends. This is handled by the unexported `keys` method. The exported method, Scope, produces a QueryScope which follows corresponding rules.

Imagine that you have two QueryScopes A and B from the same TxStore, and two fragment identifiers such that A.Allowed(i1, f1, v1, s1) and B.Allowed(i2, f2, v2, s2) are both true. If the keys method produces the same database key for these two fragment identifiers, then the two query scopes are said to overlap. It doesn't matter whether the whole fragment identifiers are identical.
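
To make the overlap rule concrete, here is a toy model. It is only a sketch under stated assumptions: `frag`, `keyFunc`, `byIndexShard`, `byField`, and `overlap` are hypothetical names, the key formats are invented, and scopes are modeled as explicit lists of fragments rather than as QueryScope objects. It shows that the same pair of scopes can overlap under one splitter and not under another, even though their fragment identifiers differ in both cases.

```go
package main

import "fmt"

// frag identifies a fragment; the field names are illustrative.
type frag struct {
	index, field, view string
	shard              uint64
}

// keyFunc is a hypothetical stand-in for the unexported keys method: it
// maps a fragment identifier to the key of the database that stores it.
type keyFunc func(frag) string

// byIndexShard stores every field of an index/shard pair in one database.
func byIndexShard(f frag) string {
	return fmt.Sprintf("%s/%d", f.index, f.shard)
}

// byField gives each field of an index its own database.
func byField(f frag) string {
	return fmt.Sprintf("%s/%s", f.index, f.field)
}

// overlap reports whether any fragment in a shares a database key with any
// fragment in b under the given key function.
func overlap(keys keyFunc, a, b []frag) bool {
	seen := map[string]bool{}
	for _, f := range a {
		seen[keys(f)] = true
	}
	for _, f := range b {
		if seen[keys(f)] {
			return true
		}
	}
	return false
}

func main() {
	a := []frag{{"i", "f1", "standard", 0}}
	b := []frag{{"i", "f2", "standard", 0}}
	// Different fragments, same index/shard database: the scopes overlap.
	fmt.Println(overlap(byIndexShard, a, b)) // true
	// With per-field databases the same scopes no longer overlap.
	fmt.Println(overlap(byField, a, b)) // false
}
```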

type QueryContext

type QueryContext interface {
	// NewRead requests a new QueryRead object for the indicated fragment.
	NewRead(IndexName, FieldName, ViewName, ShardID) (QueryRead, error)
	// NewWrite requests a new QueryWrite object for the indicated fragment.
	NewWrite(IndexName, FieldName, ViewName, ShardID) (QueryWrite, error)
	// Error sets a persistent error state and indicates that this QueryContext
	// must not commit its writes.
	Error(...interface{})
	// Errorf is a convenience function equivalent to Error(fmt.Errorf(...))
	Errorf(string, ...interface{})
	// Release releases resources held by this QueryContext without committing
	// writes. If writes have already been committed, they are not affected.
	// A release after a commit (or another release) is harmless.
	Release()
	// Commit attempts to commit writes, unless an error has already been
	// recorded or the parent context has been canceled. If it does not attempt
	// to commit writes, it reports the error that prevented it. Otherwise it
	// attempts the writes and reports an error if any errors occurred.
	// It is an error to try to commit twice or use the QueryContext after a
	// commit.
	Commit() error
}

QueryContext represents the lifespan of a query or similar thing which is accessing one or more backend databases. The individual databases are transactional; a transaction allows seeing consistent data (even if other things may be writing to the database), keeps memory returned by the backend from being invalidated, and makes sets of changes take effect atomically.

The QueryContext should not be released or committed until all access to data returned from queries is complete.

The Error/Errorf methods tell the QueryContext that an error has occurred which should prevent it from committing. If you call either of them for a QueryContext, Commit() must fail. (It may yield the error provided, or another error which seemed important.) NewRead and NewWrite also fail once an error has been reported.

A QueryContext is created with a parent context.Context, and will also fail, and refuse to commit, if that context is canceled before you try to commit.

type QueryRead

type QueryRead interface {
	// ContainerIterator yields a container iterator starting at
	// the given key. The found bool return indicates whether that
	// exact container was present. The iterator's Close() function
	// must be called when done using it.
	ContainerIterator(ckey uint64) (citer roaring.ContainerIterator, found bool, err error)

	// ApplyFilter applies a roaring.BitmapFilter to the fragment, starting
	// at the given container key. The container objects passed to the
	// filter's ConsiderData method are transient objects; both the
	// container header and the data associated with the container can be
	// overwritten by the filter after each call. If you need the Container
	// objects, or the data they reference, after that method is called,
	// you must clone them.
	ApplyFilter(ckey uint64, filter roaring.BitmapFilter) (err error)

	// Container returns the *roaring.Container for the container key,
	// which may be nil if the container isn't present. The container
	// returned is valid for the life of the query context.
	Container(ckey uint64) (*roaring.Container, error)

	// Contains determines whether the bit is set.
	Contains(v uint64) (exists bool, err error)

	// Count returns the count of bits set in the fragment.
	Count() (uint64, error)

	// Max returns the highest bit set in the fragment.
	Max() (uint64, error)

	// Min returns the lowest bit set in the fragment.
	Min() (uint64, bool, error)

	// CountRange returns the count of set bits in the range [start, end)
	// in this fragment. The lower bound is inclusive, the upper bound is
	// exclusive.
	CountRange(start, end uint64) (uint64, error)

	// OffsetRange returns a bitmap containing the containers covering the
	// range (in bits) from start (inclusive) to end (exclusive). Despite
	// the range being specified in bits, all three parameters must be multiples
	// of 65,536 (the size of a Container).
	//
	// The bits returned will have their offsets adjusted by (offset-start).
	// For instance, if start is 0, and offset is 65536, all bits will be
	// 65536 higher (which is to say, all container keys will be one higher
	// than they were in the fragment).
	//
	// OffsetRange is used to translate from a row of a fragment to a shard
	// of a database-wide Row. For instance:
	//
	// OffsetRange(7 * ShardWidth, 3 * ShardWidth, 4 * ShardWidth)
	//
	// would yield the third "row" of a fragment, with its container keys adjusted
	// to reflect the range covered by shard 7 of the index.
	//
	// The resulting bitmap is valid for the lifespan of the QueryContext.
	OffsetRange(offset, start, end uint64) (*roaring.Bitmap, error)

	// RoaringBitmap produces a roaring.Bitmap representing the entire fragment.
	// The resulting bitmap is valid for the lifespan of the QueryContext.
	RoaringBitmap() (*roaring.Bitmap, error)
}

QueryRead represents read access to a fragment. When functions in this interface return an error, the error indicates a failed operation, such as an I/O error. Empty or nonexistent data is not an error. For example, the Container method can return a nil pointer if no such container exists, but would also return a nil error in that case. An error would be returned only if the attempt to determine whether the container exists failed for some reason.
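
The offset arithmetic described for OffsetRange can be worked through with plain integers. The sketch below follows the parameter order of the signature (offset, start, end) and the stated adjustment of (offset-start). `shiftBit` is a hypothetical helper, not part of the package, and the shard width of 2^20 bits is an assumption made for the example.

```go
package main

import "fmt"

const (
	containerWidth = 1 << 16 // bits per container (65,536, as documented)
	shardWidth     = 1 << 20 // assumed shard width, for illustration only
)

// shiftBit reports where a fragment bit would land after
// OffsetRange(offset, start, end), and whether it lies in [start, end).
func shiftBit(bit, offset, start, end uint64) (uint64, bool) {
	if bit < start || bit >= end {
		return 0, false
	}
	return bit - start + offset, true
}

func main() {
	row, shard := uint64(3), uint64(7)
	offset := shard * shardWidth
	start, end := row*shardWidth, (row+1)*shardWidth

	bit := row*shardWidth + 5 // column 5 of row 3 within the fragment
	got, ok := shiftBit(bit, offset, start, end)
	fmt.Println(got, ok, got == shard*shardWidth+5) // 7340037 true true

	// All three arguments are multiples of the container width, so whole
	// containers move: key 48 in the fragment becomes key 112 in the row.
	fmt.Println(start/containerWidth, "->", offset/containerWidth) // 48 -> 112
}
```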

type QueryScope

type QueryScope interface {
	// Allowed determines whether a specific fragment
	// is covered by this QueryScope.
	Allowed(IndexName, FieldName, ViewName, ShardID) bool

	// Overlap reports whether there are any overlaps between this
	// QueryScope object and another. An overlap exists wherever
	// calls to Allowed with the same parameters would return true for
	// both objects.
	Overlap(QueryScope) bool

	AddAll() QueryScope
	AddIndex(IndexName) QueryScope
	AddField(IndexName, FieldName) QueryScope
	AddIndexShards(IndexName, ...ShardID) QueryScope
	AddFieldShards(IndexName, FieldName, ...ShardID) QueryScope

	String() string
}

QueryScope represents a possible set of things that can be written to. A QueryScope can in principle represent arbitrary patterns with special rules. However! Our system depends on using QueryScope objects to detect and prevent overlapping writes, to ensure that queries running in parallel won't deadlock against each other.

So each TxStore can yield QueryScope objects, the Overlap semantics of which match the TxStore's database definitions. If two QueryScopes are considered to overlap, that means that there exist fragment identifiers such that each QueryScope returns true for Allowed on at least one of these fragment identifiers, and the TxStore's KeySplitter would produce the same database key for those fragment identifiers.

The Add functions return the scope to allow things like

txs.NewWriteQueryContext(ctx, txs.Scope().AddIndex("i"))

and chaining add operations in simple cases.
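
The chaining pattern works because each Add method returns the scope it was called on. A minimal sketch of that pattern follows; `toyScope` is hypothetical, uses plain strings instead of the package's name types, and tracks only whole indexes and index/field pairs, ignoring shards.

```go
package main

import "fmt"

// toyScope is a hypothetical scope that tracks whole indexes and
// index/field pairs only; the real QueryScope also handles shards.
type toyScope struct {
	all     bool
	indexes map[string]bool
	fields  map[[2]string]bool
}

func newToyScope() *toyScope {
	return &toyScope{
		indexes: map[string]bool{},
		fields:  map[[2]string]bool{},
	}
}

// Each Add method returns the receiver so that calls can be chained.
func (s *toyScope) AddAll() *toyScope {
	s.all = true
	return s
}

func (s *toyScope) AddIndex(index string) *toyScope {
	s.indexes[index] = true
	return s
}

func (s *toyScope) AddField(index, field string) *toyScope {
	s.fields[[2]string{index, field}] = true
	return s
}

// Allowed reports whether writes to the given index/field are in scope.
func (s *toyScope) Allowed(index, field string) bool {
	return s.all || s.indexes[index] || s.fields[[2]string{index, field}]
}

func main() {
	s := newToyScope().AddIndex("i").AddField("j", "f")
	fmt.Println(s.Allowed("i", "anything")) // true: the whole index is in scope
	fmt.Println(s.Allowed("j", "f"))        // true: this field was added
	fmt.Println(s.Allowed("j", "g"))        // false: only field f of j was added
}
```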

type QueryWrite

type QueryWrite interface {
	QueryRead

	// PutContainer stores c under the given key in the fragment.
	PutContainer(ckey uint64, c *roaring.Container) error

	// RemoveContainer deletes the roaring.Container under the given key
	// in the fragment.
	RemoveContainer(ckey uint64) error

	// Add sets the given bits in the fragment, and reports how many bits
	// actually changed.
	Add(a ...uint64) (changeCount int, err error)

	// Remove clears the given bits in the fragment, and reports how many
	// bits actually changed.
	Remove(a ...uint64) (changeCount int, err error)

	// ApplyRewriter applies a roaring.BitmapRewriter to a specified shard,
	// starting at the given container key. The filter's ConsiderData
	// method may be called with transient Container objects which *must
	// not* be retained or referenced after that function exits. Similarly,
	// their data must not be retained. If you need the data later, you
	// must copy it into some other memory. However, it is safe to overwrite
	// the returned container; for instance, you can DifferenceInPlace on
	// it.
	ApplyRewriter(ckey uint64, filter roaring.BitmapRewriter) (err error)

	// ImportRoaringBits does efficient bulk import using a roaring.RoaringIterator.
	//
	// See the roaring package for details of the RoaringIterator.
	//
	// If clear is true, the bits from rit are cleared, otherwise they are set in the
	// specified fragment.
	ImportRoaringBits(rit roaring.RoaringIterator, clear bool, rowSize uint64) (changed int, rowSet map[uint64]int, err error)
}

QueryWrite represents write access to a fragment. As with QueryRead, errors indicate an unexpected error. For instance, if you try to remove a container that doesn't exist, that's not an "error", but if you try to remove a container and get a disk write error or something like that, that's an error.
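
The distinction between "nothing changed" and "something failed" shows up in the return values of Add and Remove. The sketch below uses a hypothetical in-memory `toyFragment` with the same method signatures; it is not the package's implementation, and it can never fail, so the error is always nil.

```go
package main

import "fmt"

// toyFragment is a hypothetical in-memory bit set, used only to illustrate
// the documented return values.
type toyFragment map[uint64]struct{}

// Add sets the given bits and reports how many were not already set.
func (f toyFragment) Add(a ...uint64) (changeCount int, err error) {
	for _, v := range a {
		if _, ok := f[v]; !ok {
			f[v] = struct{}{}
			changeCount++
		}
	}
	return changeCount, nil
}

// Remove clears the given bits and reports how many were actually set.
func (f toyFragment) Remove(a ...uint64) (changeCount int, err error) {
	for _, v := range a {
		if _, ok := f[v]; ok {
			delete(f, v)
			changeCount++
		}
	}
	return changeCount, nil
}

func main() {
	f := toyFragment{}
	n, err := f.Add(1, 2, 3)
	fmt.Println(n, err) // 3 <nil>
	n, err = f.Add(3, 4)
	fmt.Println(n, err) // 1 <nil>: bit 3 was already set
	n, err = f.Remove(9)
	fmt.Println(n, err) // 0 <nil>: removing an absent bit is not an error
}
```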

type ShardID

type ShardID uint64

type TxStore

type TxStore interface {
	KeySplitter

	// NewQueryContext yields a new query context which is read-only.
	NewQueryContext(context.Context) (QueryContext, error)

	// NewWriteQueryContext yields a new query context which can
	// write to the things in the given QueryScope
	NewWriteQueryContext(context.Context, QueryScope) (QueryContext, error)

	// Close attempts to shut down. It can fail if there are still
	// open transactions.
	Close() error
}

TxStore represents a transactional database backend, mapping {index,field,view,shard} tuples to combinations of specific on-disk databases and keys to use with them to find related data.

Each TxStore implements the KeySplitter interface, possibly by having a KeySplitter embedded in it. The KeySplitter used with the TxStore determines which database to use (the database key) and which part of that database to use (the fragment key) to access a given fragment.

type ViewName

type ViewName string
