data

package
v0.0.0-...-ae1e168

This package is not in the latest version of its module.

Published: Sep 5, 2021 License: BSD-2-Clause Imports: 10 Imported by: 18

Documentation

Constants

const (
	DefaultDocMaxRoom = 2 * 1048576 // DefaultDocMaxRoom is the default maximum size (in bytes) that a single document may not exceed.
	DocHeader         = 1 + 10      // DocHeader is the size of document header fields.
	EntrySize         = 1 + 10 + 10 // EntrySize is the size of a single hash table entry.
	BucketHeader      = 10          // BucketHeader is the size of hash table bucket's header fields.
)
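
The size constants above are sums of small fixed widths. The following is a minimal sketch of how a 1 + 10 + 10 byte entry could be packed, assuming (this page does not confirm it) that the leading byte is a validity flag and that each 10-byte slot holds a varint-encoded integer; 10 is the value of binary.MaxVarintLen64 in the standard library. The helpers encodeEntry and decodeEntry are hypothetical and not part of this package.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// entrySize mirrors EntrySize = 1 + 10 + 10.
// Assumption: 1 validity byte, then two 10-byte varint slots (key, value).
const entrySize = 1 + binary.MaxVarintLen64 + binary.MaxVarintLen64

// encodeEntry (hypothetical) packs one entry into a fixed-size slot.
func encodeEntry(valid bool, key, val int) []byte {
	buf := make([]byte, entrySize)
	if valid {
		buf[0] = 1
	}
	binary.PutVarint(buf[1:11], int64(key))
	binary.PutVarint(buf[11:21], int64(val))
	return buf
}

// decodeEntry (hypothetical) reverses encodeEntry.
func decodeEntry(buf []byte) (valid bool, key, val int) {
	k, _ := binary.Varint(buf[1:11])
	v, _ := binary.Varint(buf[11:21])
	return buf[0] == 1, int(k), int(v)
}

func main() {
	buf := encodeEntry(true, 42, 1048576)
	valid, key, val := decodeEntry(buf)
	fmt.Println(len(buf), valid, key, val)
}
```

Fixed-width slots waste some space for small integers, but they let an entry be located by plain offset arithmetic.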
const (
	HT_FILE_GROWTH = 32 * 1048576 // Default hash table file initial size & file growth
	HASH_BITS      = 16           // Default number of hash key bits
)
const (
	COL_FILE_GROWTH = 32 * 1048576 // Default collection file initial size & size growth (32 MBytes)
)

Variables

This section is empty.

Functions

func LooksEmpty

func LooksEmpty(buf gommap.MMap) bool

Return true if the buffer begins with 64 consecutive zero bytes.
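
The check described above can be sketched on a plain byte slice, which stands in here for gommap.MMap. The function looksEmpty is a hypothetical re-implementation, and its handling of buffers shorter than 64 bytes (only the available bytes are checked) is an assumption, not documented behaviour.

```go
package main

import "fmt"

// looksEmpty (hypothetical) reports whether buf begins with 64 zero bytes.
// Assumption: a shorter buffer is judged on the bytes it actually has.
func looksEmpty(buf []byte) bool {
	n := 64
	if len(buf) < n {
		n = len(buf)
	}
	for _, b := range buf[:n] {
		if b != 0 {
			return false
		}
	}
	return true
}

func main() {
	fresh := make([]byte, 128)
	used := make([]byte, 128)
	used[10] = 'x'
	fmt.Println(looksEmpty(fresh), looksEmpty(used))
}
```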

Types

type Collection

type Collection struct {
	*DataFile
	*Config
}

Collection file contains document headers and document text data.

func (*Collection) Delete

func (col *Collection) Delete(id int) error

Delete a document by ID.

func (*Collection) ForEachDoc

func (col *Collection) ForEachDoc(fun func(id int, doc []byte) bool)

Run the function on every document; stop when the function returns false.
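
The early-exit callback contract can be illustrated with an in-memory stand-in. In this sketch forEachDoc and collectFirst are hypothetical helpers, and a slice index plays the role of the document ID; in the real collection the ID is a physical location in the file.

```go
package main

import "fmt"

// forEachDoc (hypothetical) calls fun for every document and stops
// as soon as fun returns false.
func forEachDoc(docs [][]byte, fun func(id int, doc []byte) bool) {
	for id, doc := range docs {
		if !fun(id, doc) {
			return
		}
	}
}

// collectFirst (hypothetical) gathers at most n documents by returning
// false from the callback once enough have been seen.
func collectFirst(docs [][]byte, n int) []string {
	if n <= 0 {
		return nil
	}
	var out []string
	forEachDoc(docs, func(id int, doc []byte) bool {
		out = append(out, string(doc))
		return len(out) < n
	})
	return out
}

func main() {
	docs := [][]byte{[]byte(`{"a":1}`), []byte(`{"b":2}`), []byte(`{"c":3}`)}
	fmt.Println(collectFirst(docs, 2))
}
```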

func (*Collection) Insert

func (col *Collection) Insert(data []byte) (id int, err error)

Insert a new document, return the new document ID.

func (*Collection) Read

func (col *Collection) Read(id int) []byte

Find and retrieve a document by ID (physical document location). Return value is a copy of the document.

func (*Collection) Update

func (col *Collection) Update(id int, data []byte) (newID int, err error)

Overwrite or re-insert a document, return the new document ID if re-inserted.
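
The overwrite-or-re-insert behaviour can be sketched with an in-memory store. Everything here is hypothetical: the slot and store types, the rule that a document is overwritten in place when it fits into its reserved room, and the policy of reserving twice the document's length on insert are assumptions made for illustration only.

```go
package main

import (
	"errors"
	"fmt"
)

// slot (hypothetical) stands in for one document record.
type slot struct {
	valid bool
	room  int    // bytes reserved for the document
	data  []byte // current document content
}

// store (hypothetical) stands in for a collection; the slice index is the ID.
type store struct{ slots []slot }

// insert reserves twice the document's length (an assumed padding policy).
func (s *store) insert(data []byte) int {
	s.slots = append(s.slots, slot{
		valid: true,
		room:  2 * len(data),
		data:  append([]byte(nil), data...),
	})
	return len(s.slots) - 1
}

// update overwrites in place when the new data fits into the reserved room;
// otherwise it invalidates the old slot and re-inserts under a new ID.
func (s *store) update(id int, data []byte) (int, error) {
	if id < 0 || id >= len(s.slots) || !s.slots[id].valid {
		return 0, errors.New("document does not exist")
	}
	if len(data) <= s.slots[id].room {
		s.slots[id].data = append([]byte(nil), data...)
		return id, nil
	}
	s.slots[id].valid = false
	return s.insert(data), nil
}

func main() {
	s := &store{}
	id := s.insert([]byte("abcd"))
	same, _ := s.update(id, []byte("abcdefgh"))
	moved, _ := s.update(id, []byte("abcdefghi"))
	fmt.Println(id, same, moved)
}
```

Because the ID may change on update, a caller that needs a stable handle has to record the returned ID.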

type Config

type Config struct {
	DocMaxRoom    int  // DocMaxRoom is the maximum size of a single document that will ever be accepted into database.
	ColFileGrowth int  // ColFileGrowth is the size (in bytes) to grow collection data file when new documents have to fit in.
	PerBucket     int  // PerBucket is the number of entries pre-allocated to each hash table bucket.
	HTFileGrowth  int  // HTFileGrowth is the size (in bytes) to grow hash table file to fit in more entries.
	HashBits      uint // HashBits is the number of bits to consider for hashing indexed key, also determines the initial number of buckets in a hash table file.

	InitialBuckets int    `json:"-"` // InitialBuckets is the number of buckets initially allocated in a hash table file.
	Padding        string `json:"-"` // Padding is pre-allocated filler (space characters) for new documents.
	LenPadding     int    `json:"-"` // LenPadding is the calculated length of Padding string.
	BucketSize     int    `json:"-"` // BucketSize is the calculated size of each hash table bucket.
}

Config consists of tuning parameters initialised once upon creation of a new database, the properties heavily influence performance characteristics of all collections in a database. Adjust with care!

func CreateOrReadConfig

func CreateOrReadConfig(path string) (conf *Config, err error)

CreateOrReadConfig creates a default performance configuration underneath the input database directory or, as the name suggests, reads the configuration that already exists there.

func (*Config) CalculateConfigConstants

func (conf *Config) CalculateConfigConstants()

CalculateConfigConstants assigns the internal fields the values calculated from the other fields.
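
A rough sketch of what such derived fields might look like follows. The formulas are assumptions inferred from the field comments, not taken from the package source: InitialBuckets as 2 to the power of HashBits, BucketSize as the bucket header plus PerBucket entries, and a 128-space Padding string chosen arbitrarily. The config type and calculate method are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

const (
	entrySize    = 1 + 10 + 10 // same value as EntrySize
	bucketHeader = 10          // same value as BucketHeader
)

// config (hypothetical) carries a subset of the Config fields.
type config struct {
	PerBucket int
	HashBits  uint

	InitialBuckets int
	Padding        string
	LenPadding     int
	BucketSize     int
}

// calculate fills the derived fields. All formulas are assumptions.
func (c *config) calculate() {
	c.InitialBuckets = 1 << c.HashBits
	c.BucketSize = bucketHeader + c.PerBucket*entrySize
	c.Padding = strings.Repeat(" ", 128) // arbitrary length for illustration
	c.LenPadding = len(c.Padding)
}

func main() {
	c := &config{PerBucket: 16, HashBits: 16}
	c.calculate()
	fmt.Println(c.InitialBuckets, c.BucketSize, c.LenPadding)
}
```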

func (*Config) GetPartitionRange

func (conf *Config) GetPartitionRange(partNum, totalParts int) (start int, end int)

Divide the entire hash table into roughly equally sized partitions, and return the start/end key range of the chosen partition.
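
One way to divide a key space into roughly equal partitions is sketched below. The function partitionRange is hypothetical; in particular, the half-open [start, end) convention and the choice to give the remainder to the first partitions are assumptions, and the package may distribute the remainder differently.

```go
package main

import "fmt"

// partitionRange (hypothetical) splits the keys [0, total) into totalParts
// contiguous half-open ranges and returns the range of partition partNum.
// The first total%totalParts partitions each take one extra key.
func partitionRange(total, partNum, totalParts int) (start, end int) {
	perPart := total / totalParts
	leftOver := total % totalParts
	extraBefore := partNum
	if extraBefore > leftOver {
		extraBefore = leftOver
	}
	start = partNum*perPart + extraBefore
	end = start + perPart
	if partNum < leftOver {
		end++
	}
	return start, end
}

func main() {
	for part := 0; part < 3; part++ {
		start, end := partitionRange(10, part, 3)
		fmt.Println(part, start, end)
	}
}
```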

func (*Config) HashKey

func (conf *Config) HashKey(key int) int

Smear the integer entry key and return the portion (HASH_BITS bits) used for allocating the entry.
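
The smear-then-mask idea can be sketched as follows. The function hashKey is hypothetical and its mixing steps are placeholders chosen for illustration; only the final step, keeping a fixed number of bits so that the result indexes one of 2^hashBits buckets, reflects the description above.

```go
package main

import "fmt"

// hashKey (hypothetical) mixes the bits of key and keeps the low hashBits bits.
// The shift amounts and the XOR constant are illustrative placeholders.
func hashKey(key int, hashBits uint) int {
	k := uint64(key)
	k ^= k >> 4
	k = (k ^ 0xdeadbeef) + (k << 5)
	k ^= k >> 11
	mask := uint64(1)<<hashBits - 1
	return int(k & mask)
}

func main() {
	for _, key := range []int{0, 1, 2, 65536, -1} {
		fmt.Println(key, hashKey(key, 16))
	}
}
```

Mixing before masking matters because sequential or aligned keys would otherwise crowd into a few buckets.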

func (*Config) OpenCollection

func (conf *Config) OpenCollection(path string) (col *Collection, err error)

Open a collection file.

func (*Config) OpenHashTable

func (conf *Config) OpenHashTable(path string) (ht *HashTable, err error)

Open a hash table file.

func (*Config) OpenPartition

func (conf *Config) OpenPartition(colPath, lookupPath string) (part *Partition, err error)

Open a collection partition.

type DataFile

type DataFile struct {
	Path               string
	Size, Used, Growth int
	Fh                 *os.File
	Buf                gommap.MMap
}

Data file keeps track of the amount of total and used space.

func OpenDataFile

func OpenDataFile(path string, growth int) (file *DataFile, err error)

Open a data file that grows by the specified size.

func (*DataFile) Clear

func (file *DataFile) Clear() (err error)

Clear the entire file and resize it to initial size.

func (*DataFile) Close

func (file *DataFile) Close() (err error)

Un-map the file buffer and close the file handle.

func (*DataFile) EnsureSize

func (file *DataFile) EnsureSize(more int) (err error)

Ensure there is enough room for that many bytes of data.
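
The growth rule implied by Size, Used and Growth can be sketched with an in-memory buffer. The dataFile type and ensureSize method below are hypothetical; the real file is memory-mapped, so growing it presumably involves resizing the file and re-mapping it, which this sketch replaces with a slice append.

```go
package main

import (
	"errors"
	"fmt"
)

// dataFile (hypothetical) tracks total and used space of a growable buffer.
type dataFile struct {
	Size, Used, Growth int
	buf                []byte
}

// ensureSize grows the buffer in Growth-sized steps until `more` bytes
// fit after the Used mark.
func (f *dataFile) ensureSize(more int) error {
	if f.Growth <= 0 {
		return errors.New("growth must be positive")
	}
	for f.Used+more > f.Size {
		f.buf = append(f.buf, make([]byte, f.Growth)...)
		f.Size += f.Growth
	}
	return nil
}

func main() {
	f := &dataFile{Size: 8, Used: 6, Growth: 8, buf: make([]byte, 8)}
	if err := f.ensureSize(20); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(f.Size, len(f.buf))
}
```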

type HashTable

type HashTable struct {
	*Config
	*DataFile

	Lock *sync.RWMutex
	// contains filtered or unexported fields
}

Hash table file is a binary file containing buckets of hash entries.

func (*HashTable) Clear

func (ht *HashTable) Clear() (err error)

Clear the entire hash table.

func (*HashTable) Get

func (ht *HashTable) Get(key, limit int) (vals []int)

Look up values by key.

func (*HashTable) GetPartition

func (ht *HashTable) GetPartition(partNum, partSize int) (keys, vals []int)

Return all entries in the chosen partition.

func (*HashTable) Put

func (ht *HashTable) Put(key, val int)

Store the entry into a vacant (invalidated or empty) place in the appropriate bucket.

func (*HashTable) Remove

func (ht *HashTable) Remove(key, val int)

Flag an entry as invalid, so that Get will not return it later on.
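
The interplay of Put, Get and Remove, where removal only flags an entry and a later Put may reuse the vacant place, can be sketched in memory. All types and methods below are hypothetical. Two simplifications are assumed: the bucket is chosen by masking the key directly (no smearing), and a limit of 0 in get means no limit.

```go
package main

import "fmt"

// entry (hypothetical) mirrors a hash entry: validity flag, key, value.
type entry struct {
	valid    bool
	key, val int
}

// hashTable (hypothetical) holds 2^hashBits buckets of entries.
type hashTable struct {
	hashBits uint
	buckets  [][]entry
}

func newHashTable(hashBits uint) *hashTable {
	return &hashTable{hashBits: hashBits, buckets: make([][]entry, 1<<hashBits)}
}

// bucket picks a bucket by masking the key (assumed; no smearing here).
func (ht *hashTable) bucket(key int) int {
	return int(uint64(key) & (uint64(1)<<ht.hashBits - 1))
}

// put stores the entry in the first vacant (invalidated) place,
// or appends to the bucket when none is vacant.
func (ht *hashTable) put(key, val int) {
	b := ht.bucket(key)
	for i := range ht.buckets[b] {
		if !ht.buckets[b][i].valid {
			ht.buckets[b][i] = entry{true, key, val}
			return
		}
	}
	ht.buckets[b] = append(ht.buckets[b], entry{true, key, val})
}

// get returns up to limit values stored under key (0 means no limit).
func (ht *hashTable) get(key, limit int) (vals []int) {
	for _, e := range ht.buckets[ht.bucket(key)] {
		if e.valid && e.key == key {
			vals = append(vals, e.val)
			if limit > 0 && len(vals) == limit {
				return vals
			}
		}
	}
	return vals
}

// remove flags the matching entry as invalid without shrinking the bucket.
func (ht *hashTable) remove(key, val int) {
	b := ht.bucket(key)
	for i, e := range ht.buckets[b] {
		if e.valid && e.key == key && e.val == val {
			ht.buckets[b][i].valid = false
			return
		}
	}
}

func main() {
	ht := newHashTable(4)
	ht.put(1, 10)
	ht.put(1, 11)
	ht.remove(1, 10)
	ht.put(17, 20) // same bucket as key 1; reuses the vacant place
	fmt.Println(ht.get(1, 0), ht.get(17, 0), len(ht.buckets[1]))
}
```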

type Partition

type Partition struct {
	*Config

	DataLock *sync.RWMutex // guard against concurrent document updates
	// contains filtered or unexported fields
}

Partition associates a hash table with collection documents, allowing addressing of a document using an unchanging ID.
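
The idea of an unchanging ID layered over a changing physical location can be sketched with a map. The partition type below is hypothetical: a slice index stands in for the physical ID, a Go map stands in for the lookup hash table, and update always relocates the document, which is a simplification made to show the ID staying stable.

```go
package main

import (
	"errors"
	"fmt"
)

// partition (hypothetical) pairs document storage with an ID lookup.
type partition struct {
	docs   [][]byte    // index = physical ID
	lookup map[int]int // unchanging ID -> physical ID
}

// insert stores the document and records where it physically lives.
func (p *partition) insert(id int, data []byte) (physID int) {
	p.docs = append(p.docs, append([]byte(nil), data...))
	physID = len(p.docs) - 1
	p.lookup[id] = physID
	return physID
}

// update relocates the document (a simplification) and repoints the lookup,
// so the caller keeps using the same ID.
func (p *partition) update(id int, data []byte) error {
	old, ok := p.lookup[id]
	if !ok {
		return errors.New("document does not exist")
	}
	p.docs[old] = nil // the old location is abandoned
	p.insert(id, data)
	return nil
}

// read resolves the unchanging ID to the current physical location.
func (p *partition) read(id int) ([]byte, error) {
	physID, ok := p.lookup[id]
	if !ok {
		return nil, errors.New("document does not exist")
	}
	return p.docs[physID], nil
}

func main() {
	p := &partition{lookup: make(map[int]int)}
	first := p.insert(100, []byte("v1"))
	_ = p.update(100, []byte("v2"))
	doc, _ := p.read(100)
	fmt.Println(first, p.lookup[100], string(doc))
}
```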

func (*Partition) ApproxDocCount

func (part *Partition) ApproxDocCount() int

Return approximate number of documents in the partition.

func (*Partition) Clear

func (part *Partition) Clear() error

Clear data file and lookup hash table.

func (*Partition) Close

func (part *Partition) Close() error

Close all file handles.

func (*Partition) Delete

func (part *Partition) Delete(id int) (err error)

Delete a document.

func (*Partition) ForEachDoc

func (part *Partition) ForEachDoc(partNum, totalPart int, fun func(id int, doc []byte) bool) (moveOn bool)

Partition documents into roughly equally sized portions, and run the function on every document in the portion.

func (*Partition) Insert

func (part *Partition) Insert(id int, data []byte) (physID int, err error)

Insert a document. The ID may be used to retrieve/update/delete the document later on.

func (*Partition) LockUpdate

func (part *Partition) LockUpdate(id int)

Lock a document for exclusive update.

func (*Partition) Read

func (part *Partition) Read(id int) ([]byte, error)

Find and retrieve a document by ID.

func (*Partition) UnlockUpdate

func (part *Partition) UnlockUpdate(id int)

Unlock a document to make it ready for the next update.
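
A per-document exclusive lock with blocking semantics can be sketched as below. The updateLocks type is hypothetical and only one possible design: a mutex-guarded map holds a channel per locked ID, and waiters block on that channel until the holder unlocks. Whether the package's LockUpdate blocks in this way is an assumption.

```go
package main

import (
	"fmt"
	"sync"
)

// updateLocks (hypothetical) tracks which document IDs are locked.
type updateLocks struct {
	mu     sync.Mutex
	locked map[int]chan struct{}
}

func newUpdateLocks() *updateLocks {
	return &updateLocks{locked: make(map[int]chan struct{})}
}

// lock blocks until the caller holds the exclusive lock for id.
func (l *updateLocks) lock(id int) {
	for {
		l.mu.Lock()
		ch, busy := l.locked[id]
		if !busy {
			l.locked[id] = make(chan struct{})
			l.mu.Unlock()
			return
		}
		l.mu.Unlock()
		<-ch // wait for the current holder to unlock, then retry
	}
}

// unlock releases the lock for id and wakes all waiters.
func (l *updateLocks) unlock(id int) {
	l.mu.Lock()
	ch, ok := l.locked[id]
	delete(l.locked, id)
	l.mu.Unlock()
	if ok {
		close(ch)
	}
}

// countWithLocks increments a shared counter from n goroutines,
// each holding the lock for document 7 while it does so.
func countWithLocks(n int) int {
	locks := newUpdateLocks()
	counter := 0
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			locks.lock(7)
			counter++
			locks.unlock(7)
		}()
	}
	wg.Wait()
	return counter
}

func main() {
	fmt.Println(countWithLocks(100))
}
```

Locks for different IDs do not contend with each other beyond the brief map access, so updates to unrelated documents can proceed in parallel.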

func (*Partition) Update

func (part *Partition) Update(id int, data []byte) (err error)

Update a document.
