package base

v0.0.0-...-417737b
Published: Jun 21, 2022 License: BSD-3-Clause Imports: 12 Imported by: 0

Documentation

Index

Constants

const (
	DefaultBlockRestartInterval = 16
	DefaultBlockSize            = 4096
	DefaultBlockSizeThreshold   = 90
)

SSTable block defaults.

const InternalTrailerLen = 8

InternalTrailerLen is the number of bytes used to encode InternalKey.Trailer.

Variables

var DefaultComparer = &Comparer{
	Compare: bytes.Compare,
	Equal:   bytes.Equal,

	AbbreviatedKey: func(key []byte) uint64 {
		if len(key) >= 8 {
			return binary.BigEndian.Uint64(key)
		}
		var v uint64
		for _, b := range key {
			v <<= 8
			v |= uint64(b)
		}
		return v << uint(8*(8-len(key)))
	},

	FormatKey: DefaultFormatter,

	Separator: func(dst, a, b []byte) []byte {
		i, n := SharedPrefixLen(a, b), len(dst)
		dst = append(dst, a...)

		min := len(a)
		if min > len(b) {
			min = len(b)
		}
		if i >= min {
			// Do not shorten if one string is a prefix of the other.
			return dst
		}

		if a[i] >= b[i] {
			// b is smaller than a or a is already the shortest possible.
			return dst
		}

		if i < len(b)-1 || a[i]+1 < b[i] {
			// Incrementing a[i] and truncating still yields a key less than b.
			i += n
			dst[i]++
			return dst[:i+1]
		}

		// a[i]+1 == b[i] and i == len(b)-1, so incrementing a[i] would produce
		// b itself; look for a later byte of a that can be incremented instead.
		i += n + 1
		for ; i < len(dst); i++ {
			if dst[i] != 0xff {
				dst[i]++
				return dst[:i+1]
			}
		}
		return dst
	},

	Successor: func(dst, a []byte) (ret []byte) {
		for i := 0; i < len(a); i++ {
			if a[i] != 0xff {
				dst = append(dst, a[:i+1]...)
				dst[len(dst)-1]++
				return dst
			}
		}

		// a is entirely 0xff bytes; leave it unchanged.
		return append(dst, a...)
	},

	Name: "leveldb.BytewiseComparator",
}

DefaultComparer is the default implementation of the Comparer interface. It uses the natural ordering, consistent with bytes.Compare.
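
As a hedged illustration, the sketch below exercises DefaultComparer. It assumes this package lives at github.com/joshua-goldstein/pebble/internal/base (an internal package, so the snippet would need to live inside that module); the Separator call mirrors the "aqua"/"black"/"blue" example documented under the Separator type below.

package main

import (
	"fmt"

	// Assumed import path; internal packages are only importable from
	// within the same module.
	"github.com/joshua-goldstein/pebble/internal/base"
)

func main() {
	c := base.DefaultComparer

	fmt.Println(c.Compare([]byte("apple"), []byte("banana"))) // -1
	fmt.Println(c.Equal([]byte("blue"), []byte("blue")))      // true

	// AbbreviatedKey packs up to the first eight key bytes into a
	// big-endian uint64, so abbreviated keys order like the keys do.
	fmt.Printf("%#x\n", c.AbbreviatedKey([]byte("ab"))) // 0x6162000000000000

	// Separator appends a shortened form of "black" (relative to "blue")
	// to the dst prefix "aqua".
	fmt.Printf("%s\n", c.Separator([]byte("aqua"), []byte("black"), []byte("blue"))) // aquablb
}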

var DefaultFormatter = func(key []byte) fmt.Formatter {
	return FormatBytes(key)
}

DefaultFormatter is the default implementation of user key formatting: non-ASCII data is formatted as escaped hexadecimal values.

var DefaultMerger = &Merger{
	Merge: func(key, value []byte) (ValueMerger, error) {
		res := &AppendValueMerger{}
		res.buf = append(res.buf, value...)
		return res, nil
	},

	Name: "pebble.concatenate",
}

DefaultMerger is the default implementation of the Merger interface. It concatenates the two values to merge.
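
A minimal sketch of DefaultMerger's concatenation behavior, under the same import assumption as the sketch above: Merge seeds the merger with the oldest operand and MergeNewer appends to it.

func concatExample() {
	vm, err := base.DefaultMerger.Merge([]byte("k"), []byte("hello"))
	if err != nil {
		panic(err)
	}
	if err := vm.MergeNewer([]byte(" world")); err != nil {
		panic(err)
	}
	// Finish concatenates the operands from oldest to newest.
	res, closer, err := vm.Finish(true /* includesBase */)
	if err != nil {
		panic(err)
	}
	if closer != nil {
		defer closer.Close()
	}
	fmt.Printf("%s\n", res) // hello world
}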

var ErrCorruption = errors.New("pebble: corruption")

ErrCorruption is a marker to indicate that data in a file (WAL, MANIFEST, sstable) isn't in the expected format.

var ErrNotFound = errors.New("pebble: not found")

ErrNotFound means that a get or delete call did not find the requested key.

var InvalidInternalKey = MakeInternalKey(nil, 0, InternalKeyKindInvalid)

InvalidInternalKey is an invalid internal key for which Valid() will return false.

Functions

func CorruptionErrorf

func CorruptionErrorf(format string, args ...interface{}) error

CorruptionErrorf formats according to a format specifier and returns the string as an error value that is marked as a corruption error.

func InternalCompare

func InternalCompare(userCmp Compare, a, b InternalKey) int

InternalCompare compares two internal keys using the specified comparison function. For equal user keys, internal keys compare in descending sequence number order. For equal user keys and sequence numbers, internal keys compare in descending kind order (this may happen in practice among range keys).
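
A small sketch (same import assumption as above) showing the descending sequence number order for equal user keys:

func internalCompareExample() {
	// Same user key: the key with the higher sequence number sorts first.
	a := base.MakeInternalKey([]byte("k"), 5, base.InternalKeyKindSet)
	b := base.MakeInternalKey([]byte("k"), 2, base.InternalKeyKindSet)
	fmt.Println(base.InternalCompare(base.DefaultComparer.Compare, a, b)) // -1
}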

func MakeFilename

func MakeFilename(fileType FileType, fileNum FileNum) string

MakeFilename builds a filename from components.
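
A hedged usage sketch (same import assumption as above); the commented outputs follow Pebble's usual on-disk naming and are an expectation, not a guarantee.

func filenameExample() {
	fmt.Println(base.MakeFilename(base.FileTypeTable, base.FileNum(7)))    // e.g. 000007.sst
	fmt.Println(base.MakeFilename(base.FileTypeManifest, base.FileNum(1))) // e.g. MANIFEST-000001
}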

func MakeFilepath

func MakeFilepath(fs vfs.FS, dirname string, fileType FileType, fileNum FileNum) string

MakeFilepath builds a filepath from components.

func MakeTrailer

func MakeTrailer(seqNum uint64, kind InternalKeyKind) uint64

MakeTrailer constructs an internal key trailer from the specified sequence number and kind.
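
A brief sketch (same import assumption as above); the bit layout, with the sequence number in the upper 56 bits and the kind in the low 8 bits, is inferred from constants such as InternalKeyRangeDeleteSentinel below.

func trailerExample() {
	tr := base.MakeTrailer(7, base.InternalKeyKindSet)
	fmt.Println(tr == (7<<8)|uint64(base.InternalKeyKindSet)) // true
}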

func MarkCorruptionError

func MarkCorruptionError(err error) error

MarkCorruptionError marks given error as a corruption error.

func MustExist

func MustExist(fs vfs.FS, filename string, fataler Fataler, err error)

MustExist checks if err is an error indicating a file does not exist. If it is, it lists the containing directory's files to annotate the error with counts of the various types of files and invokes the provided fataler. See cockroachdb/cockroach#56490.

func ParseFilename

func ParseFilename(fs vfs.FS, filename string) (fileType FileType, fileNum FileNum, ok bool)

ParseFilename parses the components from a filename.

func SharedPrefixLen

func SharedPrefixLen(a, b []byte) int

SharedPrefixLen returns the largest i such that a[:i] equals b[:i]. This function can be useful in implementing the Comparer interface.
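
For reference, a minimal sketch equivalent to SharedPrefixLen (illustrative only, not this package's source):

// sharedPrefixLen walks the two slices until they diverge.
// sharedPrefixLen([]byte("black"), []byte("blue")) == 2.
func sharedPrefixLen(a, b []byte) int {
	n := len(a)
	if len(b) < n {
		n = len(b)
	}
	i := 0
	for i < n && a[i] == b[i] {
		i++
	}
	return i
}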

func Visible

func Visible(seqNum uint64, snapshot uint64) bool

Visible returns true if a key with the provided sequence number is visible at the specified snapshot sequence number.
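
A hedged usage sketch (same import assumption as above), assuming the usual convention that a snapshot at sequence number s observes keys with sequence numbers strictly less than s:

func visibleExample() {
	fmt.Println(base.Visible(5, 10))  // true: sequence number 5 precedes snapshot 10
	fmt.Println(base.Visible(10, 10)) // false: not visible at its own sequence number
}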

Types

type AbbreviatedKey

type AbbreviatedKey func(key []byte) uint64

AbbreviatedKey returns a fixed length prefix of a user key such that AbbreviatedKey(a) < AbbreviatedKey(b) iff a < b and AbbreviatedKey(a) > AbbreviatedKey(b) iff a > b. If AbbreviatedKey(a) == AbbreviatedKey(b) an additional comparison is required to determine if the two keys are actually equal.

This helps optimize indexed batch comparisons for cache locality. If a Split function is specified, AbbreviatedKey usually returns the first eight bytes of the user key prefix in the order that gives the correct ordering.

type AppendValueMerger

type AppendValueMerger struct {
	// contains filtered or unexported fields
}

AppendValueMerger concatenates merge operands in order from oldest to newest.

func (*AppendValueMerger) Finish

func (a *AppendValueMerger) Finish(includesBase bool) ([]byte, io.Closer, error)

Finish returns the buffer that was constructed on-demand in `Merge{Newer,Older}()` calls.

func (*AppendValueMerger) MergeNewer

func (a *AppendValueMerger) MergeNewer(value []byte) error

MergeNewer appends value to the result.

func (*AppendValueMerger) MergeOlder

func (a *AppendValueMerger) MergeOlder(value []byte) error

MergeOlder prepends value to the result, which involves allocating a new buffer.

type ArchiveCleaner

type ArchiveCleaner struct{}

ArchiveCleaner archives files instead of deleting them.

func (ArchiveCleaner) Clean

func (ArchiveCleaner) Clean(fs vfs.FS, fileType FileType, path string) error

Clean archives the file.

func (ArchiveCleaner) String

func (ArchiveCleaner) String() string

type BlockPropertyFilter

type BlockPropertyFilter interface {
	// Name returns the name of the block property collector.
	Name() string
	// Intersects returns true if the set represented by prop intersects with
	// the set in the filter.
	Intersects(prop []byte) (bool, error)
}

BlockPropertyFilter is used in an Iterator to filter sstables and blocks within the sstable. It should not maintain any per-sstable state, and must be thread-safe.

type Cleaner

type Cleaner interface {
	Clean(fs vfs.FS, fileType FileType, path string) error
}

Cleaner cleans obsolete files.

type Compare

type Compare func(a, b []byte) int

Compare returns -1, 0, or +1 depending on whether a is 'less than', 'equal to' or 'greater than' b. The two arguments can only be 'equal' if their contents are exactly equal. Furthermore, the empty slice must be 'less than' any non-empty slice. Compare is used to compare user keys, such as those passed as arguments to the various DB methods, as well as those returned from Separator, Successor, and Split.

type Comparer

type Comparer struct {
	Compare        Compare
	Equal          Equal
	AbbreviatedKey AbbreviatedKey
	FormatKey      FormatKey
	FormatValue    FormatValue
	Separator      Separator
	Split          Split
	Successor      Successor

	// Name is the name of the comparer.
	//
	// The LevelDB on-disk format stores the comparer name, and opening a
	// database with a different comparer from the one it was created with
	// will result in an error.
	Name string
}

Comparer defines a total ordering over the space of []byte keys: a 'less than' relationship.

type DeletableValueMerger

type DeletableValueMerger interface {
	ValueMerger

	// DeletableFinish enables a value merger to indicate that the result of a merge operation
	// is non-existent. See Finish for a description of includesBase.
	DeletableFinish(includesBase bool) (value []byte, delete bool, closer io.Closer, err error)
}

DeletableValueMerger is an extension to ValueMerger which allows indicating that the result of a merge operation is non-existent. Such non-existent entries will eventually be deleted during compaction. Note that during compaction, non-existence of the result of a merge means that the merge operands will not result in any record being output. This is not the same as transforming the merge operands into a deletion tombstone, as older merge operands will still be visible during iteration. Deletion of the merge operands in this way is akin to the way a SingleDelete+Set combine into non-existence while leaving older records for the same key unaffected.

type DeleteCleaner

type DeleteCleaner struct{}

DeleteCleaner deletes files.

func (DeleteCleaner) Clean

func (DeleteCleaner) Clean(fs vfs.FS, fileType FileType, path string) error

Clean removes the file.

func (DeleteCleaner) String

func (DeleteCleaner) String() string

type Equal

type Equal func(a, b []byte) bool

Equal returns true if a and b are equivalent. For a given Compare, Equal(a,b) must return true iff Compare(a,b) returns zero, that is, Equal is a (potentially faster) specialization of Compare.

type Fataler

type Fataler interface {
	Fatalf(format string, args ...interface{})
}

A Fataler terminates the process with a message when called.

type FileNum

type FileNum uint64

FileNum is an internal DB identifier for a file.

func (FileNum) String

func (fn FileNum) String() string

String returns a string representation of the file number.

type FileType

type FileType int

FileType enumerates the types of files found in a DB.

const (
	FileTypeLog FileType = iota
	FileTypeLock
	FileTypeTable
	FileTypeManifest
	FileTypeCurrent
	FileTypeOptions
	FileTypeOldTemp
	FileTypeTemp
)

The FileType enumeration.

type FilterPolicy

type FilterPolicy interface {
	// Name names the filter policy.
	Name() string

	// MayContain returns whether the encoded filter may contain given key.
	// False positives are possible, where it returns true for keys not in the
	// original set.
	MayContain(ftype FilterType, filter, key []byte) bool

	// NewWriter creates a new FilterWriter.
	NewWriter(ftype FilterType) FilterWriter
}

FilterPolicy is an algorithm for probabilistically encoding a set of keys. The canonical implementation is a Bloom filter.

Every FilterPolicy has a name. This names the algorithm itself, not any one particular instance. Aspects specific to a particular instance, such as the set of keys or any other parameters, will be encoded in the []byte filter returned by NewWriter.

The name may be written to files on disk, along with the filter data. To use these filters, the FilterPolicy name at the time of writing must equal the name at the time of reading. If they do not match, the filters will be ignored, which will not affect correctness but may affect performance.

type FilterType

type FilterType int

FilterType is the level at which to apply a filter: block or table.

const (
	TableFilter FilterType = iota
)

The available filter types.

func (FilterType) String

func (t FilterType) String() string

type FilterWriter

type FilterWriter interface {
	// AddKey adds a key to the current filter block.
	AddKey(key []byte)

	// Finish appends to dst an encoded filter that holds the current set of
	// keys. The writer state is reset after the call to Finish allowing the
	// writer to be reused for the creation of additional filters.
	Finish(dst []byte) []byte
}

FilterWriter provides an interface for creating filter blocks. See FilterPolicy for more details about filters.

type FormatBytes

type FormatBytes []byte

FormatBytes formats a byte slice using hexadecimal escapes for non-ASCII data.

func (FormatBytes) Format

func (p FormatBytes) Format(s fmt.State, c rune)

Format implements the fmt.Formatter interface.

type FormatKey

type FormatKey func(key []byte) fmt.Formatter

FormatKey returns a formatter for the user key.

type FormatValue

type FormatValue func(key, value []byte) fmt.Formatter

FormatValue returns a formatter for the user value. The key is also specified for the value formatter in order to support value formatting that is dependent on the key.

type GaugeSampleMetric

type GaugeSampleMetric struct {
	// contains filtered or unexported fields
}

GaugeSampleMetric is used to measure a gauge value (e.g. queue length) by accumulating samples of that gauge.
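
A brief usage sketch (same import assumption as above); the expected mean follows from the descriptions of AddSample and Mean below.

func gaugeExample() {
	var q base.GaugeSampleMetric
	q.AddSample(3) // e.g. an observed queue length
	q.AddSample(5)
	fmt.Println(q.Mean()) // 4
}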

func (*GaugeSampleMetric) AddSample

func (gsm *GaugeSampleMetric) AddSample(sample int64)

AddSample adds the given sample.

func (*GaugeSampleMetric) Mean

func (gsm *GaugeSampleMetric) Mean() float64

Mean returns the mean value.

func (*GaugeSampleMetric) Merge

func (gsm *GaugeSampleMetric) Merge(x GaugeSampleMetric)

Merge accumulates the information from another gauge metric.

type InternalIterator

type InternalIterator interface {
	// SeekGE moves the iterator to the first key/value pair whose key is greater
	// than or equal to the given key. Returns the key and value if the iterator
	// is pointing at a valid entry, and (nil, nil) otherwise. Note that SeekGE
	// only checks the upper bound. It is up to the caller to ensure that key
	// is greater than or equal to the lower bound.
	//
	// trySeekUsingNext is a performance optimization that callers can use to
	// indicate that they have not done any action to move this iterator beyond
	// the first key that would be found if this iterator were to honestly do
	// the intended seek. For example, say the caller did a
	// SeekGE(k1...), followed by SeekGE(k2...) where k1 <= k2, without any
	// intermediate positioning calls. The caller can safely specify true for this
	// parameter in the second call. As another example, say the caller did do one
	// call to Next between the two Seek calls, and k1 < k2. Again, the caller can
	// safely specify a true value for this parameter. Note that a false value is
	// always safe. The callee is free to ignore the true value if its
	// implementation does not permit this optimization.
	// - We make the caller do this determination since a string comparison of
	//   k1, k2 is not necessarily cheap, and there may be many iterators in
	//   the iterator stack. Doing it once at the root of the iterator stack
	//   is cheaper.
	SeekGE(key []byte, trySeekUsingNext bool) (*InternalKey, []byte)

	// SeekPrefixGE moves the iterator to the first key/value pair whose key is
	// greater than or equal to the given key. Returns the key and value if the
	// iterator is pointing at a valid entry, and (nil, nil) otherwise. Note that
	// SeekPrefixGE only checks the upper bound. It is up to the caller to ensure
	// that key is greater than or equal to the lower bound.
	//
	// The prefix argument is used by some InternalIterator implementations (e.g.
	// sstable.Reader) to avoid expensive operations. A user-defined Split
	// function must be supplied to the Comparer for the DB. The supplied prefix
	// will be the prefix of the given key returned by that Split function. If
	// the iterator is able to determine that no key with the prefix exists, it
	// can return (nil,nil). Unlike SeekGE, this is not an indication that
	// iteration is exhausted.
	//
	// Note that the iterator may return keys not matching the prefix. It is up
	// to the caller to check if the prefix matches.
	//
	// Calling SeekPrefixGE places the receiver into prefix iteration mode. Once
	// in this mode, reverse iteration may not be supported and will return an
	// error. Note that pebble/Iterator.SeekPrefixGE has this same restriction on
	// not supporting reverse iteration in prefix iteration mode until a
	// different positioning routine (SeekGE, SeekLT, First or Last) switches the
	// iterator out of prefix iteration.
	//
	// trySeekUsingNext is a performance optimization that callers can use to
	// indicate that they have not done any action to move this iterator
	// beyond the first key that would be found if this iterator were to
	// honestly do the indicated seek. For example, say the caller did a
	// SeekPrefixGE(k1...), followed by SeekPrefixGE(k2...) where k1 <= k2,
	// without any intermediate positioning calls. The caller can safely
	// specify true for this parameter in the second call. As another example,
	// say the caller did do one call to Next between the two SeekPrefixGE
	// calls, and k1 < k2. Again, the caller can safely specify a true value
	// for this parameter. Note that a false value is always safe. The callee
	// is free to ignore the true value if its implementation does not permit
	// this optimization.
	// - We make the caller do this determination since a string comparison of
	//   k1, k2 is not necessarily cheap, and there may be many iterators in
	//   the iterator stack. Doing it once at the root of the iterator stack
	//   is cheaper.
	// - This optimization could also be applied to SeekLT (where it would be
	//   trySeekUsingPrev). We currently only do it for SeekPrefixGE and SeekGE
	//   because this is where this optimization helps the performance of
	//   CockroachDB. The SeekLT cases in CockroachDB are typically accompanied
	//   with bounds that change between seek calls, and is optimized inside
	//   certain iterator implementations, like singleLevelIterator, without any
	//   extra parameter passing (though the same amortization of string
	//   comparisons could be done to improve that optimization, by making the
	//   root of the iterator stack do it).
	SeekPrefixGE(prefix, key []byte, trySeekUsingNext bool) (*InternalKey, []byte)

	// SeekLT moves the iterator to the last key/value pair whose key is less
	// than the given key. Returns the key and value if the iterator is pointing
	// at a valid entry, and (nil, nil) otherwise. Note that SeekLT only checks
	// the lower bound. It is up to the caller to ensure that key is less than
	// the upper bound.
	SeekLT(key []byte) (*InternalKey, []byte)

	// First moves the iterator to the first key/value pair. Returns the key and
	// value if the iterator is pointing at a valid entry, and (nil, nil)
	// otherwise. Note that First only checks the upper bound. It is up to the
	// caller to ensure that First() is not called when there is a lower bound,
	// and instead call SeekGE(lower).
	First() (*InternalKey, []byte)

	// Last moves the iterator to the last key/value pair. Returns the key and
	// value if the iterator is pointing at a valid entry, and (nil, nil)
	// otherwise. Note that Last only checks the lower bound. It is up to the
	// caller to ensure that Last() is not called when there is an upper bound,
	// and instead call SeekLT(upper).
	Last() (*InternalKey, []byte)

	// Next moves the iterator to the next key/value pair. Returns the key and
	// value if the iterator is pointing at a valid entry, and (nil, nil)
	// otherwise. Note that Next only checks the upper bound. It is up to the
	// caller to ensure that key is greater than or equal to the lower bound.
	//
	// It is valid to call Next when the iterator is positioned before the first
	// key/value pair due to either a prior call to SeekLT or Prev which returned
	// (nil, nil). It is not allowed to call Next when the previous call to SeekGE,
	// SeekPrefixGE or Next returned (nil, nil).
	Next() (*InternalKey, []byte)

	// Prev moves the iterator to the previous key/value pair. Returns the key
	// and value if the iterator is pointing at a valid entry, and (nil, nil)
	// otherwise. Note that Prev only checks the lower bound. It is up to the
	// caller to ensure that key is less than the upper bound.
	//
	// It is valid to call Prev when the iterator is positioned after the last
	// key/value pair due to either a prior call to SeekGE or Next which returned
	// (nil, nil). It is not allowed to call Prev when the previous call to SeekLT
	// or Prev returned (nil, nil).
	Prev() (*InternalKey, []byte)

	// Error returns any accumulated error.
	Error() error

	// Close closes the iterator and returns any accumulated error. Exhausting
	// all the key/value pairs in a table is not considered to be an error.
	// It is valid to call Close multiple times. Other methods should not be
	// called after the iterator has been closed.
	Close() error

	// SetBounds sets the lower and upper bounds for the iterator. Note that the
	// result of Next and Prev will be undefined until the iterator has been
	// repositioned with SeekGE, SeekPrefixGE, SeekLT, First, or Last.
	//
	// The bounds provided must remain valid until a subsequent call to
	// SetBounds has returned. This requirement exists so that iterator
	// implementations may compare old and new bounds to apply low-level
	// optimizations.
	SetBounds(lower, upper []byte)

	fmt.Stringer
}

InternalIterator iterates over a DB's key/value pairs in key order. Unlike the Iterator interface, the returned keys are InternalKeys composed of the user-key, a sequence number and a key kind. In forward iteration, key/value pairs for identical user-keys are returned in descending sequence order. In reverse iteration, key/value pairs for identical user-keys are returned in ascending sequence order.

InternalIterators provide 5 absolute positioning methods and 2 relative positioning methods. The absolute positioning methods are:

- SeekGE
- SeekPrefixGE
- SeekLT
- First
- Last

The relative positioning methods are:

- Next
- Prev

The relative positioning methods can be used in conjunction with any of the absolute positioning methods with one exception: SeekPrefixGE does not support reverse iteration via Prev. It is undefined to call relative positioning methods without ever calling an absolute positioning method.

InternalIterators can optionally implement a prefix iteration mode. This mode is entered by calling SeekPrefixGE and exited by any other absolute positioning method (SeekGE, SeekLT, First, Last). When in prefix iteration mode, a call to Next will advance to the next key which has the same "prefix" as the one supplied to SeekPrefixGE. Note that "prefix" in this context is not a strict byte prefix, but defined by byte equality for the result of the Comparer.Split method. An InternalIterator is not required to support prefix iteration mode, and can implement SeekPrefixGE by forwarding to SeekGE.

Bounds, [lower, upper), can be set on iterators, either using the SetBounds() function in the interface, or in implementation specific ways during iterator creation. The forward positioning routines (SeekGE, First, and Next) only check the upper bound. The reverse positioning routines (SeekLT, Last, and Prev) only check the lower bound. It is up to the caller to ensure that the forward positioning routines respect the lower bound and the reverse positioning routines respect the upper bound (i.e. calling SeekGE instead of First if there is a lower bound, and SeekLT instead of Last if there is an upper bound). This imposition is done in order to elevate that enforcement to the caller (generally pebble.Iterator or pebble.mergingIter) rather than having it duplicated in every InternalIterator implementation.

Additionally, the caller needs to ensure that SeekGE/SeekPrefixGE are not called with a key > the upper bound, and SeekLT is not called with a key < the lower bound. InternalIterator implementations are required to respect the iterator bounds, never returning records outside of the bounds with one exception: an iterator may generate synthetic RANGEDEL marker records. See levelIter.syntheticBoundary for the sole existing example of this behavior. Specifically, levelIter can return synthetic keys whose user key is equal to the lower/upper bound.

The bounds provided to an internal iterator must remain valid until a subsequent call to SetBounds has returned. This requirement exists so that iterator implementations may compare old and new bounds to apply low-level optimizations. The pebble.Iterator satisfies this requirement by maintaining two bound buffers and switching between them.

An iterator must be closed after use, but it is not necessary to read an iterator until exhaustion.

An iterator is not goroutine-safe, but it is safe to use multiple iterators concurrently, either in separate goroutines or switching between the iterators in a single goroutine.

It is also safe to use an iterator concurrently with modifying its underlying DB, if that DB permits modification. However, the resultant key/value pairs are not guaranteed to be a consistent snapshot of that DB at a particular point in time.

InternalIterators accumulate errors encountered during operation, exposing them through the Error method. All of the absolute positioning methods reset any accumulated error before positioning. Relative positioning methods return without advancing if the iterator has accumulated an error.
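
A minimal forward-iteration sketch over an InternalIterator (same import assumption as above). It assumes no lower bound is configured on iter; with a lower bound, SeekGE(lower, false) should replace First, per the contract above.

func dumpAll(iter base.InternalIterator) error {
	for key, val := iter.First(); key != nil; key, val = iter.Next() {
		fmt.Printf("%s#%d,%s => %q\n", key.UserKey, key.SeqNum(), key.Kind(), val)
	}
	if err := iter.Error(); err != nil {
		_ = iter.Close()
		return err
	}
	return iter.Close()
}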

type InternalIteratorStats

type InternalIteratorStats struct {
	// Bytes in the loaded blocks. If the block was compressed, this is the
	// compressed bytes. Currently, only the second-level index and data blocks
	// containing points are included.
	BlockBytes uint64
	// Subset of BlockBytes that were in the block cache.
	BlockBytesInCache uint64

	// Bytes in keys that were iterated over. Currently, only point keys are
	// included.
	KeyBytes uint64
	// Bytes in values that were iterated over. Currently, only point values are
	// included.
	ValueBytes uint64
	// The count of points iterated over.
	PointCount uint64
	// Points that were iterated over that were covered by range tombstones. It
	// can be useful for discovering instances of
	// https://github.com/joshua-goldstein/pebble/issues/1070.
	PointsCoveredByRangeTombstones uint64
}

InternalIteratorStats contains miscellaneous stats produced by InternalIterators that are part of the InternalIterator tree. Not every field is relevant for an InternalIterator implementation. The field values are aggregated as one goes up the InternalIterator tree.

func (*InternalIteratorStats) Merge

func (s *InternalIteratorStats) Merge(from InternalIteratorStats)

Merge merges the stats in from into the receiver's stats.

type InternalIteratorWithStats

type InternalIteratorWithStats interface {
	InternalIterator
	Stats() InternalIteratorStats
	ResetStats()
}

InternalIteratorWithStats extends InternalIterator to expose stats.

func WrapIterWithStats

func WrapIterWithStats(iter InternalIterator) InternalIteratorWithStats

WrapIterWithStats ensures that either iter implements the stats methods or wraps it, such that the return value implements InternalIteratorWithStats.

type InternalKey

type InternalKey struct {
	UserKey []byte
	Trailer uint64
}

InternalKey is a key used for the in-memory and on-disk partial DBs that make up a pebble DB.

It consists of the user key (as given by the code that uses package pebble) followed by 8 bytes of metadata:

  • 1 byte for the type of internal key: delete or set,
  • 7 bytes for a uint56 sequence number, in little-endian format.
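
A round-trip sketch (same import assumption as above); the commented byte dump is the expectation implied by the layout above, with the user key followed by a little-endian 8-byte trailer whose low byte is the kind.

func encodeExample() {
	k := base.MakeInternalKey([]byte("foo"), 1, base.InternalKeyKindSet)
	buf := make([]byte, k.Size())
	k.Encode(buf)
	fmt.Printf("% x\n", buf)                          // expected: 66 6f 6f 01 01 00 00 00 00 00 00
	fmt.Println(base.DecodeInternalKey(buf).SeqNum()) // 1
}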

func DecodeInternalKey

func DecodeInternalKey(encodedKey []byte) InternalKey

DecodeInternalKey decodes an encoded internal key. See InternalKey.Encode().

func MakeExclusiveSentinelKey

func MakeExclusiveSentinelKey(kind InternalKeyKind, userKey []byte) InternalKey

MakeExclusiveSentinelKey constructs an internal key that is an exclusive sentinel key, used as the upper boundary for an sstable when a range key is the largest key in an sstable.

func MakeInternalKey

func MakeInternalKey(userKey []byte, seqNum uint64, kind InternalKeyKind) InternalKey

MakeInternalKey constructs an internal key from a specified user key, sequence number and kind.

func MakeRangeDeleteSentinelKey

func MakeRangeDeleteSentinelKey(userKey []byte) InternalKey

MakeRangeDeleteSentinelKey constructs an internal key that is a range deletion sentinel key, used as the upper boundary for an sstable when a range deletion is the largest key in an sstable.

func MakeSearchKey

func MakeSearchKey(userKey []byte) InternalKey

MakeSearchKey constructs an internal key that is appropriate for searching for the specified user key. The search key contains the maximal sequence number and kind, ensuring that it sorts before any other internal keys for the same user key.

func ParseInternalKey

func ParseInternalKey(s string) InternalKey

ParseInternalKey parses the string representation of an internal key. The format is <user-key>.<kind>.<seq-num>. If the seq-num starts with a "b" it is marked as a batch-seq-num (i.e. the InternalKeySeqNumBatch bit is set).
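
A brief sketch (same import assumption as above); "SET" is assumed to be the string form of InternalKeyKindSet.

func parseExample() {
	k := base.ParseInternalKey("foo.SET.3")
	fmt.Println(k.SeqNum(), k.Kind()) // 3 SET
}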

func ParsePrettyInternalKey

func ParsePrettyInternalKey(s string) InternalKey

ParsePrettyInternalKey parses the pretty string representation of an internal key. The format is <user-key>#<seq-num>,<kind>.

func (InternalKey) Clone

func (k InternalKey) Clone() InternalKey

Clone clones the storage for the UserKey component of the key.

func (InternalKey) Encode

func (k InternalKey) Encode(buf []byte)

Encode encodes the receiver into the buffer. The buffer must be large enough to hold the encoded data. See InternalKey.Size().

func (InternalKey) EncodeTrailer

func (k InternalKey) EncodeTrailer() [8]byte

EncodeTrailer returns the trailer encoded to an 8-byte array.

func (InternalKey) IsExclusiveSentinel

func (k InternalKey) IsExclusiveSentinel() bool

IsExclusiveSentinel returns whether this internal key excludes point keys with the same user key if used as an end boundary. See the comment on InternalKeyRangeDeleteSentinel.

func (InternalKey) Kind

func (k InternalKey) Kind() InternalKeyKind

Kind returns the kind component of the key.

func (InternalKey) Pretty

func (k InternalKey) Pretty(f FormatKey) fmt.Formatter

Pretty returns a formatter for the key.

func (InternalKey) Separator

func (k InternalKey) Separator(
	cmp Compare, sep Separator, buf []byte, other InternalKey,
) InternalKey

Separator returns a separator key such that k <= x && x < other, where less than is consistent with the Compare function. The buf parameter may be used to store the returned InternalKey.UserKey, though it is valid to pass a nil. See the Separator type for details on separator keys.

func (InternalKey) SeqNum

func (k InternalKey) SeqNum() uint64

SeqNum returns the sequence number component of the key.

func (*InternalKey) SetKind

func (k *InternalKey) SetKind(kind InternalKeyKind)

SetKind sets the kind component of the key.

func (*InternalKey) SetSeqNum

func (k *InternalKey) SetSeqNum(seqNum uint64)

SetSeqNum sets the sequence number component of the key.

func (InternalKey) Size

func (k InternalKey) Size() int

Size returns the encoded size of the key.

func (InternalKey) String

func (k InternalKey) String() string

String returns a string representation of the key.

func (InternalKey) Successor

func (k InternalKey) Successor(cmp Compare, succ Successor, buf []byte) InternalKey

Successor returns a successor key such that k <= x. A simple implementation may return k unchanged. The buf parameter may be used to store the returned InternalKey.UserKey, though it is valid to pass a nil.

func (InternalKey) Valid

func (k InternalKey) Valid() bool

Valid returns true if the key has a valid kind.

func (InternalKey) Visible

func (k InternalKey) Visible(snapshot uint64) bool

Visible returns true if the key is visible at the specified snapshot sequence number.

type InternalKeyKind

type InternalKeyKind uint8

InternalKeyKind enumerates the kind of key: a deletion tombstone, a set value, a merged value, etc.

const (
	InternalKeyKindDelete  InternalKeyKind = 0
	InternalKeyKindSet     InternalKeyKind = 1
	InternalKeyKindMerge   InternalKeyKind = 2
	InternalKeyKindLogData InternalKeyKind = 3
	//InternalKeyKindColumnFamilyDeletion     InternalKeyKind = 4
	//InternalKeyKindColumnFamilyValue        InternalKeyKind = 5
	//InternalKeyKindColumnFamilyMerge        InternalKeyKind = 6
	InternalKeyKindSingleDelete InternalKeyKind = 7
	//InternalKeyKindColumnFamilySingleDelete InternalKeyKind = 8
	//InternalKeyKindBeginPrepareXID          InternalKeyKind = 9
	//InternalKeyKindEndPrepareXID            InternalKeyKind = 10
	//InternalKeyKindCommitXID                InternalKeyKind = 11
	//InternalKeyKindRollbackXID              InternalKeyKind = 12
	//InternalKeyKindNoop                     InternalKeyKind = 13
	//InternalKeyKindColumnFamilyRangeDelete  InternalKeyKind = 14
	InternalKeyKindRangeDelete InternalKeyKind = 15

	// InternalKeyKindSeparator is a key used for separator / successor keys
	// written to sstable block indexes.
	//
	// NOTE: the RocksDB value has been repurposed. This was done to ensure that
	// keys written to block indexes with value "17" (when 17 happened to be the
	// max value, and InternalKeyKindMax was therefore set to 17), remain stable
	// when new key kinds are supported in Pebble.
	InternalKeyKindSeparator InternalKeyKind = 17

	// InternalKeyKindSetWithDelete keys are SET keys that have met with a
	// DELETE or SINGLEDEL key in a prior compaction. This key kind is
	// specific to Pebble. See
	// https://github.com/joshua-goldstein/pebble/issues/1255.
	InternalKeyKindSetWithDelete InternalKeyKind = 18

	// InternalKeyKindRangeKeyDelete removes all range keys within a key range.
	// See the internal/rangekey package for more details.
	InternalKeyKindRangeKeyDelete InternalKeyKind = 19
	// InternalKeyKindRangeKeySet and InternalKeyKindRangeKeyUnset represent
	// keys that set and unset values associated with ranges of key
	// space. See the internal/rangekey package for more details.
	InternalKeyKindRangeKeyUnset InternalKeyKind = 20
	InternalKeyKindRangeKeySet   InternalKeyKind = 21

	// This maximum value isn't part of the file format. It's unlikely,
	// but future extensions may increase this value.
	//
	// When constructing an internal key to pass to DB.Seek{GE,LE},
	// internalKeyComparer sorts decreasing by kind (after sorting increasing by
	// user key and decreasing by sequence number). Thus, use InternalKeyKindMax,
	// which sorts 'less than or equal to' any other valid internalKeyKind, when
	// searching for any kind of internal key formed by a certain user key and
	// seqNum.
	InternalKeyKindMax InternalKeyKind = 21

	// A marker for an invalid key.
	InternalKeyKindInvalid InternalKeyKind = 255

	// InternalKeySeqNumBatch is a bit that is set on batch sequence numbers
	// which prevents those entries from being excluded from iteration.
	InternalKeySeqNumBatch = uint64(1 << 55)

	// InternalKeySeqNumMax is the largest valid sequence number.
	InternalKeySeqNumMax = uint64(1<<56 - 1)

	// InternalKeyRangeDeleteSentinel is the marker for a range delete sentinel
	// key. This sequence number and kind are used for the upper sstable boundary
	// when a range deletion tombstone is the largest key in an sstable. This is
	// necessary because sstable boundaries are inclusive, while the end key of a
	// range deletion tombstone is exclusive.
	InternalKeyRangeDeleteSentinel = (InternalKeySeqNumMax << 8) | uint64(InternalKeyKindRangeDelete)

	// InternalKeyBoundaryRangeKey is the marker for a range key boundary. This
	// sequence number and kind are used during interleaved range key and point
	// iteration to allow an iterator to stop at range key start keys where
	// there exists no point key.
	InternalKeyBoundaryRangeKey = (InternalKeySeqNumMax << 8) | uint64(InternalKeyKindRangeKeySet)
)

These constants are part of the file format, and should not be changed.

func ParseKind

func ParseKind(s string) InternalKeyKind

ParseKind parses the string representation of an internal key kind.

func (InternalKeyKind) String

func (k InternalKeyKind) String() string

type Merge

type Merge func(key, value []byte) (ValueMerger, error)

Merge creates a ValueMerger for the specified key initialized with the value of one merge operand.

type Merger

type Merger struct {
	Merge Merge

	// Name is the name of the merger.
	//
	// Pebble stores the merger name on disk, and opening a database with a
	// different merger from the one it was created with will result in an error.
	Name string
}

Merger defines an associative merge operation. The merge operation merges two or more values for a single key. A merge operation is requested by writing a value using {Batch,DB}.Merge(). The value at that key is merged with any existing value. It is valid to Set a value at a key and then Merge a new value. Similar to non-merged values, a merged value can be deleted by either Delete or DeleteRange.

The merge operation is invoked when a merge value is encountered during a read, either during a compaction or during iteration.

type NeedsFileContents

type NeedsFileContents interface {
	// contains filtered or unexported methods
}

NeedsFileContents is implemented by a cleaner that needs the contents of the files that it is being asked to clean.

type Separator

type Separator func(dst, a, b []byte) []byte

Separator is used to construct SSTable index blocks. A trivial implementation is `return a`, but appending fewer bytes leads to smaller SSTables.

Given keys a, b for which Compare(a, b) < 0, Separator returns a key k such that:

1. Compare(a, k) <= 0, and
2. Compare(k, b) < 0.

As a special case, b may be nil in which case the second condition is dropped.

For example, if dst, a and b are the []byte equivalents of the strings "aqua", "black" and "blue", then the result may be "aquablb". Similarly, if the arguments were "aqua", "green" and "", then the result may be "aquah".

type Split

type Split func(a []byte) int

Split returns the length of the prefix of the user key that corresponds to the key portion of an MVCC encoding scheme to enable the use of prefix bloom filters.

The method will only ever be called with valid MVCC keys, that is, keys that the user could potentially store in the database. Pebble does not know which keys are MVCC keys and which are not, and may call Split on both MVCC keys and non-MVCC keys.

A trivial MVCC scheme is one in which Split() returns len(a). This corresponds to assigning a constant version to each key in the database. For performance reasons, it is preferable to use a `nil` split in this case.

The returned prefix must have the following properties:

1) The prefix must be a byte prefix:

	bytes.HasPrefix(a, prefix(a))

2) A key consisting of just a prefix must sort before all other keys with that prefix:

	Compare(prefix(a), a) < 0 if len(suffix(a)) > 0

3) Prefixes must be used to order keys before suffixes:

	If Compare(a, b) <= 0, then Compare(prefix(a), prefix(b)) <= 0

4) Suffixes themselves must be valid keys and comparable, respecting the same ordering as within a key:

	If Compare(prefix(a), prefix(b)) == 0, then Compare(suffix(a), suffix(b)) == Compare(a, b)
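
As a hedged illustration of a Split that satisfies these properties, the sketch below assumes a hypothetical MVCC encoding in which the first 16 bytes of every key are the logical key and any remaining bytes are the version suffix; a fixed-width prefix keeps byte-wise comparison consistent with the four properties above.

// examplePrefixLen and fixedPrefixSplit are illustrative only; they are not
// part of this package.
const examplePrefixLen = 16

func fixedPrefixSplit(a []byte) int {
	if len(a) > examplePrefixLen {
		return examplePrefixLen
	}
	return len(a)
}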

type Successor

type Successor func(dst, a []byte) []byte

Successor returns a shortened key given a key a, such that Compare(k, a) >= 0. A simple implementation may return a unchanged. The dst parameter may be used to store the returned key, though it is valid to pass nil. The returned key must be valid to pass to Compare.

type ThroughputMetric

type ThroughputMetric struct {
	// Bytes is the number of bytes processed by the component.
	Bytes int64
	// WorkDuration is the duration that the component spent doing work.
	WorkDuration time.Duration
	// IdleDuration is the duration that the component was idling, waiting for
	// work.
	IdleDuration time.Duration
}

ThroughputMetric is used to measure the byte throughput of some component that performs work in a single-threaded manner. The throughput can be approximated by Bytes/(WorkDuration+IdleDuration). The idle time is tracked separately so that the user of this metric can approximate the peak throughput as Bytes/WorkDuration. The metric is designed to be cumulative (see Merge).
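
A brief usage sketch (same import assumption as above, plus "time"); the commented values follow the approximations described here.

func throughputExample() {
	tm := base.ThroughputMetric{
		Bytes:        64 << 20, // 64 MiB processed
		WorkDuration: 2 * time.Second,
		IdleDuration: 6 * time.Second,
	}
	fmt.Println(tm.Rate())     // bytes per second over work + idle time (~64 MiB / 8s)
	fmt.Println(tm.PeakRate()) // bytes per second over work time only (~64 MiB / 2s)
}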

func (*ThroughputMetric) Merge

func (tm *ThroughputMetric) Merge(x ThroughputMetric)

Merge accumulates the information from another throughput metric.

func (*ThroughputMetric) PeakRate

func (tm *ThroughputMetric) PeakRate() int64

PeakRate returns the approximate peak rate if there was no idling.

func (*ThroughputMetric) Rate

func (tm *ThroughputMetric) Rate() int64

Rate returns the observed rate.

type ValueMerger

type ValueMerger interface {
	// MergeNewer adds an operand that is newer than all existing operands.
	// The caller retains ownership of value.
	//
	// If an error is returned the merge is aborted and no other methods must
	// be called.
	MergeNewer(value []byte) error

	// MergeOlder adds an operand that is older than all existing operands.
	// The caller retains ownership of value.
	//
	// If an error is returned the merge is aborted and no other methods must
	// be called.
	MergeOlder(value []byte) error

	// Finish does any final processing of the added operands and returns a
	// result. The caller can assume the returned byte slice will not be mutated.
	//
	// Finish must be the last function called on the ValueMerger. The caller
	// must not call any other ValueMerger functions after calling Finish.
	//
	// If `includesBase` is true, the oldest merge operand was part of the
	// merge. This will always be true during normal iteration, but may be
	// false during compaction when only a subset of operands may be
	// available. Note that `includesBase` is set to true conservatively: a false
	// value means that we could not definitely determine that the base merge
	// operand was included.
	//
	// If a Closer is returned, the returned slice will remain valid until it is
	// closed. The caller must arrange for the closer to be eventually closed.
	Finish(includesBase bool) ([]byte, io.Closer, error)
}

ValueMerger receives merge operands one by one. The operand received is either newer or older than all operands received so far as indicated by the function names, `MergeNewer()` and `MergeOlder()`. Once all operands have been received, the client will invoke `Finish()` to obtain the final result. The order of a merge is not changed after the first call to `MergeNewer()` or `MergeOlder()`, i.e. the same method is used to submit all operands.

The implementation may choose to merge values into the result immediately upon receiving each operand, or buffer operands until Finish() is called. For example, buffering may be useful to avoid (de)serializing partial merge results.

The merge operation must be associative. That is, for the values A, B, C:

Merge(A).MergeOlder(B).MergeOlder(C) == Merge(C).MergeNewer(B).MergeNewer(A)

Examples of merge operators are integer addition, list append, and string concatenation.
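
A hedged sketch of a user-defined ValueMerger that sums decimal integer operands (it assumes imports of "io", "strconv" and the base package, and is not part of this package). Addition is associative and commutative, so MergeNewer and MergeOlder can share one implementation; the accompanying Merger's Name must stay stable across database openings.

type addValueMerger struct {
	sum int64
}

func (m *addValueMerger) add(value []byte) error {
	n, err := strconv.ParseInt(string(value), 10, 64)
	if err != nil {
		return err
	}
	m.sum += n
	return nil
}

func (m *addValueMerger) MergeNewer(value []byte) error { return m.add(value) }
func (m *addValueMerger) MergeOlder(value []byte) error { return m.add(value) }

func (m *addValueMerger) Finish(includesBase bool) ([]byte, io.Closer, error) {
	return []byte(strconv.FormatInt(m.sum, 10)), nil, nil
}

var addMerger = &base.Merger{
	Merge: func(key, value []byte) (base.ValueMerger, error) {
		m := &addValueMerger{}
		if err := m.add(value); err != nil {
			return nil, err
		}
		return m, nil
	},
	Name: "example.add",
}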
