owdb

package
v0.0.2
Warning

This package is not in the latest version of its module.

Published: Feb 25, 2024 License: MIT Imports: 13 Imported by: 0

Documentation

Overview

Package owdb provides OrbweaverDB stores. OrbweaverDB is a NoSQL database that holds web analytics data, originally for the araneastats project. It is a very simple combined OLAP and OLTP system that is safe for concurrent access.

This is something like the DAO for this project. It has nothing to do with web crawling; quite the opposite, this spider sits in its web and waits for requests to come to *it*. Fundamentally this is architected as a multi-valued time-series oriented DB system.

Use Open to create a Store that persists to a file on disk. The Store provides a full-featured database. The data within can be saved to disk by calling Store.Persist at appropriate times; when a Store is no longer in use, call Store.Close to end all current operations. An in-memory Store is obtained either by creating a &Store{} manually or by calling Import to create one from previously obtained bytes.

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type Criterion

type Criterion[E any] struct {
	Meets     func(v E) bool
	Format    string
	NotFormat string
	EstLimits Limits[E]
}

Criterion is match criteria for a single property of a Hit. Its Meets function performs the actual check as to whether the given value meets it.

The Format string is used for printing the Criterion to a human-readable string. It will be passed the formatted string that gives the name of the property that will be checked against the criterion at the time of formatting; this will be "VALUE" for cases when there is no specific property being checked (such as when calling String() by itself). If Format is not set, a generic string will be used instead.

NotFormat, if given, defines what to show when this Criterion has a Not applied to it. It is optional and Not will default to a generic string if not given.

Both Format and NotFormat have the potential to be used for equality checking. Two Criterion values with the same Format string should return the same values from their Meets functions when given identical inputs. The same applies to NotFormat.

EstLimits gives the minimum and maximum values, according to some known ordering, that would be captured by this Criterion. It is used for query planning; a Criterion that does not define it cannot be used for query planning or limiting, even if it is applied to an indexed field. This does not mean the Criterion will not be applied, just that the search for the initial set to apply it to will not use it for that purpose. Limits are always an estimate, even when defined - the real "minimum" is the smallest value that meets the criterion, which may not always be easily determinable. Limits must always be no narrower than the set of inputs the Criterion matches on, but they may be wider.
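The relationship between Meets and EstLimits can be illustrated with a small self-contained sketch. It does not import this package: Limits and Criterion below are local stand-ins reduced to the fields needed, and hasPrefix and inLimits are hypothetical helpers invented for the example. The estimate is deliberately a little wider than the set of matching strings, which the contract above allows.

```go
package main

import (
	"fmt"
	"strings"
)

// Limits and Criterion are local stand-ins for the documented types,
// reduced to the fields this sketch needs.
type Limits[E any] struct {
	Min *E
	Max *E
}

type Criterion[E any] struct {
	Meets     func(v E) bool
	EstLimits Limits[E]
}

// hasPrefix is a hypothetical constructor, not part of the package. Every
// string starting with prefix sorts at or after prefix and before upper, so
// [prefix, upper] is a safe estimate that is slightly wider than needed.
func hasPrefix(prefix, upper string) Criterion[string] {
	return Criterion[string]{
		Meets:     func(v string) bool { return strings.HasPrefix(v, prefix) },
		EstLimits: Limits[string]{Min: &prefix, Max: &upper},
	}
}

// inLimits reports whether v falls inside the estimate; nil bounds are open.
func inLimits(lim Limits[string], v string) bool {
	if lim.Min != nil && v < *lim.Min {
		return false
	}
	if lim.Max != nil && v > *lim.Max {
		return false
	}
	return true
}

func main() {
	// '0' is the byte immediately after '/', so "api0" bounds the prefix.
	crit := hasPrefix("api/", "api0")

	for _, v := range []string{"api/users", "api0", "blog/post"} {
		// The limits only narrow the candidate set; Meets still decides.
		fmt.Println(v, inLimits(crit.EstLimits, v), crit.Meets(v))
	}
}
```

Here "api0" falls inside the estimate but fails Meets, which is the "wider but never narrower" case: the limits may admit extra candidates, but they must not exclude a real match.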

func CollatesAfter

func CollatesAfter(s string) Criterion[string]

CollatesAfter returns a Criterion that checks that the string property of interest comes after the given value.

func CollatesAfterOrEquals

func CollatesAfterOrEquals(s string) Criterion[string]

CollatesAfterOrEquals returns a Criterion that checks that the string property of interest comes after the given value or is the given value.

func CollatesBefore

func CollatesBefore(s string) Criterion[string]

CollatesBefore returns a Criterion that checks that the string property of interest comes before the given value.

func CollatesBeforeOrEquals

func CollatesBeforeOrEquals(s string) Criterion[string]

CollatesBeforeOrEquals returns a Criterion that checks that the string property of interest comes before or equals the given value.

func CollatesBetween

func CollatesBetween(start, end string) Criterion[string]

CollatesBetween returns a Criterion that checks that the string property of interest is between the given values, inclusive.

func DoesNot

func DoesNot[E any](c Criterion[E]) Criterion[E]

func EqualsIP

func EqualsIP(addr string) Criterion[net.IP]

EqualsIP returns a Criterion that checks that the net.IP property of interest equals the address in the given value. For the purposes of this function, an IPv4 address and that same address in IPv6 form are considered the same address. Panics if addr is not a parsable IPv4 or IPv6 address. To check for a nil IP on a hit, use IsNullIP instead.

func EqualsString

func EqualsString(s string) Criterion[string]

EqualsString returns a Criterion that checks that the string property of interest exactly equals the given value.

func EqualsTime

func EqualsTime(val time.Time) Criterion[time.Time]

EqualsTime returns a Criterion that requires the time-based property of interest to be exactly the given value.

func IsAfter

func IsAfter(t time.Time) Criterion[time.Time]

IsAfter returns a Criterion that requires the time-based property of interest to be after the given time, non-inclusive.

func IsAfterOrEquals

func IsAfterOrEquals(t time.Time) Criterion[time.Time]

IsAfterOrEquals returns a Criterion that requires the time-based property of interest to be on or after the given time.

func IsBefore

func IsBefore(t time.Time) Criterion[time.Time]

IsBefore returns a Criterion that requires the time-based property of interest to be before the given time, non-inclusive.

func IsBeforeOrEquals

func IsBeforeOrEquals(t time.Time) Criterion[time.Time]

IsBeforeOrEquals returns a Criterion that requires the time-based property of interest to be on or before the given time.

func IsBetweenIPs

func IsBetweenIPs(start, end string) Criterion[net.IP]

IsBetweenIPs returns a Criterion that checks that the net.IP property of interest lies between the two addresses, inclusive. Equality checks are performed as per EqualsIP, and less than and greater than checks are performed as per IsLessThanIP and IsGreaterThanIP. Panics if either start or end is not a parsable IPv4 or IPv6 address.

func IsBetweenTimes

func IsBetweenTimes(start, end time.Time) Criterion[time.Time]

IsBetweenTimes returns a Criterion that requires the time-based property of interest to be between the given times, inclusive.

func IsGreaterThanIP

func IsGreaterThanIP(addr string) Criterion[net.IP]

IsGreaterThanIP returns a Criterion that checks that the net.IP property of interest comes after the given value when the two are compared on a byte-by-byte level. Both are converted to full IPv6 representations before the comparison is made. A nil IP address is considered to be less than all other addresses. Panics if addr is not a parsable IPv4 or IPv6 address.

func IsGreaterThanOrEqualsIP

func IsGreaterThanOrEqualsIP(addr string) Criterion[net.IP]

IsGreaterThanOrEqualsIP returns a Criterion that checks that the net.IP property of interest comes after the given value when the two are compared on a byte-by-byte level, or that the two are equal. Both are converted to full IPv6 representations before the comparison is made. A nil IP address is considered to be less than all other addresses. Panics if addr is not a parsable IPv4 or IPv6 address.

func IsLessThanIP

func IsLessThanIP(addr string) Criterion[net.IP]

IsLessThanIP returns a Criterion that checks that the net.IP property of interest comes before the given value when the two are compared on a byte-by-byte level. Both are converted to full IPv6 representations before the comparison is made. A nil IP address is considered to be less than all other addresses. Panics if addr is not a parsable IPv4 or IPv6 address.

func IsLessThanOrEqualsIP

func IsLessThanOrEqualsIP(addr string) Criterion[net.IP]

IsLessThanOrEqualsIP returns a Criterion that checks that the net.IP property of interest comes before the given value when the two are compared on a byte-by-byte level, or that the two are equal. Both are converted to full IPv6 representations before the comparison is made. A nil IP address is considered to be less than all other addresses. Panics if addr is not a parsable IPv4 or IPv6 address.

func IsNullIP

func IsNullIP() Criterion[net.IP]

IsNullIP returns a Criterion that checks that the net.IP property of interest is not set.

func IsNullString

func IsNullString() Criterion[string]

IsNullString returns a Criterion that checks that the string property of interest is not set. This is equivalent to EqualsString(""), but does not have any estimated limits defined.

func IsNullTime

func IsNullTime() Criterion[time.Time]

IsNullTime returns a Criterion that checks that the time property of interest is not set. This is equivalent to EqualsTime(time.Time{}), but does not have any estimated limits defined.

func Meets

func Meets[E any](fn func(v E) bool, baseName ...string) Criterion[E]

Meets returns a Criterion that matches against input by using the provided function as its Meets field. This is a convenience function for defining a new Criterion on-the-fly when the caller does not particularly care about display format related fields, or intends to set them later. It makes it so the generic type parameter of the Criterion can be inferred from the function given, which cannot be done when directly instantiating Criterion as of Go 1.19.

Instead of having to write something long and difficult to read such as Criterion[time.Time]{Meets: func(v time.Time) bool { return v == myTime }}, this function can be used to write Meets(func(v time.Time) bool { return v == myTime }), which is a bit easier to grok.

baseName, if given, gives the baseName to use for the display field. Only the first baseName is read, if present; all after the first are ignored. If one is given, it is used as the basis for both Format and NotFormat. If one is provided and it is blank, this function will panic.

In order to avoid breaking the contract that functions with the same format strings return equivalent values from their Meets methods, and given that fn itself cannot be exhaustively checked to ensure that, every Criterion created with Meets that doesn't provide a baseName is given a random name that is used to fill its display-format related fields. The randomness is not suited for cryptographic applications. If this is needed, callers should avoid using the default name generation by providing a base-name themselves.

func (Criterion[E]) FilledString

func (crit Criterion[E]) FilledString(value string) string

FilledString returns the string representation of this Criterion when it is being used to check against a particular value. The value could be the name of a property, or the string representation of an actual value along with any delimiter characters that show it. This value is passed unchanged to crit's Format to create the formatted check-string. If Format is set to the empty string, a generic format string is used instead.

func (Criterion[E]) String

func (crit Criterion[E]) String() string

String returns the string representation of crit, which will be the same as FilledString called with a placeholder string.

type Filter

type Filter interface {
	Node() FilterNode
	And(clause Filter, clauses ...Filter) FilterNode
	Or(clause Filter, clauses ...Filter) FilterNode
	Negate() FilterNode

	TimeIndexLimits() Limits[time.Time]
	Matches(h Hit) bool
}

Filter is an interface that is implemented by all types that can be combined into a Where. Because a Where requires that anything combined with it itself be a Where, this interface indicates that the type can be converted to one and then combined with it. Additionally, Filter supports the creation of a Where via the addition of an operator and any applicable operands, selected via And, Or, or Negate.

type FilterNode

type FilterNode struct {
	Cond  *Where
	Op    Operator
	Group []FilterNode
}

FilterNode is a set of conditions to match against all Hits that an operation is to apply to. It is either a "condition"-mode FilterNode, which includes specific criteria, or a "group"-mode FilterNode, which combines other FilterNodes with an operator. A FilterNode cannot be both. If And and Or are used to combine FilterNodes and Wheres, this will be handled automatically.

Cond determines whether the FilterNode is in condition mode or group mode. If it is set to a non-nil condition, the FilterNode is in condition mode and Group and Op are ignored. If Cond is set to nil, the FilterNode is in group mode and will test a hit against all FilterNodes in Group, combined with Op.

An Op of NOT in group mode will use only a single operand from Group. If a FilterNode is evaluated with Op of NOT and multiple operands in Group, all others are ignored. A FilterNode in group mode with an operand of NOT will return false for all Hits passed to Matches.

The zero-value is a ready to use FilterNode in group mode that will match all Hits.
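The two modes can be sketched as a small recursive evaluator. Everything below is a local stand-in: hit, operator, and node are hypothetical simplified types, Cond is a plain function instead of a *Where, and the handling of a NOT with an empty group is an assumption made for the example rather than documented behavior.

```go
package main

import "fmt"

// hit, operator, and node are hypothetical stand-ins for Hit, Operator, and
// FilterNode, reduced to what the sketch needs.
type hit struct{ Host string }

type operator int

const (
	and operator = iota
	or
	not
)

type node struct {
	Cond  func(hit) bool // non-nil means condition mode
	Op    operator
	Group []node
}

// matches evaluates the node against h. In condition mode only Cond is
// consulted; in group mode the members of Group are combined with Op.
func (n node) matches(h hit) bool {
	if n.Cond != nil {
		return n.Cond(h)
	}
	switch n.Op {
	case or:
		for _, g := range n.Group {
			if g.matches(h) {
				return true
			}
		}
		return false
	case not:
		// Assumption for this sketch: NOT with no operand matches nothing,
		// and only the first operand is used.
		if len(n.Group) == 0 {
			return false
		}
		return !n.Group[0].matches(h)
	default: // and
		for _, g := range n.Group {
			if !g.matches(h) {
				return false
			}
		}
		return true
	}
}

func main() {
	isBlog := node{Cond: func(h hit) bool { return h.Host == "blog.example.com" }}
	notBlog := node{Op: not, Group: []node{isBlog}}
	var zero node // group mode, AND over nothing

	h := hit{Host: "blog.example.com"}
	fmt.Println(isBlog.matches(h), notBlog.matches(h), zero.matches(h))
}
```

An AND over an empty group has nothing that can fail, which is why the zero value in this sketch matches every hit.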

func Not

func Not(f Filter) FilterNode

func (FilterNode) And

func (n FilterNode) And(com Filter, coms ...Filter) FilterNode

And returns a new FilterNode that matches only those Hits that match all of the given combined conditions. Multiple Filters can be given to have them all be a part of the same sequence of Ands, and will be evaluated in order.

Calling this returns a FilterNode that represents the composite condition given by (n && com && coms[0] ... && coms[N]).

func (FilterNode) IsOperation

func (n FilterNode) IsOperation() bool

IsOperation returns whether the FilterNode represents a grouped operation, that is, one or more operands that an operator is applied to. A FilterNode that represents this is said to be in "group" mode as opposed to "condition" mode, because its Group (and Op) members are used to check whether it matches some input as opposed to the Where in the FilterNode's Cond member.

A FilterNode with Cond set to nil is considered an operation, regardless of the values of Group and Op. Likewise, a FilterNode with Cond set to a non-nil value is considered not an operation (though the conceptual line gets a bit blurry when Cond contains multiple criteria, which strictly speaking are treated as though they are AND'd together).

TODO: above parenthetical not needed, move that to pkg docs

If IsOperation returns false, then the FilterNode is in condition mode.

func (FilterNode) Matches

func (n FilterNode) Matches(h Hit) bool

Matches returns whether the given Hit matches this FilterNode.

func (FilterNode) Negate

func (n FilterNode) Negate() FilterNode

Negate returns a FilterNode that matches only those Hits that do *not* match n.

Calling this returns a FilterNode that represents the composite condition given by !n.

func (FilterNode) Node

func (n FilterNode) Node() FilterNode

Node returns the FilterNode itself. It is included for implementation of Filter.

func (FilterNode) Or

func (n FilterNode) Or(com Filter, coms ...Filter) FilterNode

Or returns a new FilterNode that matches all Hits that match at least one of the given combined conditions. Multiple Filters can be given to have them all be a part of the same sequence of Ors, and will be evaluated in order.

Calling this returns a FilterNode that represents the composite condition given by (n || com || coms[0] ... || coms[N]).

func (FilterNode) Simplify

func (n FilterNode) Simplify() FilterNode

Simplify returns a FilterNode that represents the same logic as this one but with any redundant operations removed (such as a double NOT). If n is already in simplest terms, it will return itself.

func (FilterNode) String

func (n FilterNode) String() string

String prints out the string representation of the FilterNode. Two FilterNodes should be considered exactly equivalent if they produce the same string, as they will match and fail to match on the same inputs.

func (FilterNode) TimeIndexLimits

func (n FilterNode) TimeIndexLimits() Limits[time.Time]

TimeIndexLimits returns the limits on the indexed Time field of Hits that this FilterNode would impose. If either end is "open", it will be nil. Both ends being open means this doesn't have any limits. Returned Limits should be checked with IsImpossible() to verify that they are possible before using.

type Hit

type Hit struct {
	// Time is the time that the event was recorded.
	Time time.Time

	// Host is an identifier of the host that the client was accessing. It is
	// usually a DNS name but may also be an IP address.
	Host string

	// Resource is the identifier of the resource within the host that the
	// client was attempting to access. This is usually a path for HTTP servers
	// or something similar.
	Resource string

	// Client is information on the HTTP client who made the request.
	Client Requester
}

Hit is a single hit on a website from a particular IP address, which may or may not be unique.

func (Hit) Equal

func (hit Hit) Equal(other interface{}) bool

Equal returns whether other is a Hit with the same properties as hit.

func (Hit) MarshalBinary

func (h Hit) MarshalBinary() ([]byte, error)

func (Hit) String

func (hit Hit) String() string

func (*Hit) UnmarshalBinary

func (h *Hit) UnmarshalBinary(data []byte) error

type Limits

type Limits[E any] struct {
	Min *E
	Max *E
}

Limits gives the max and minimum for a value. It is used for query planning based on a filter provided. It can be applied even to normally non-comparable types, as long as Min <= Max according to some known ordering.

If Min or Max are set to nil, they should be considered as "no limit" on that side of things.

func (Limits[E]) Contains

func (lim Limits[E]) Contains(pt E, eq func(e1, e2 E) bool, gt func(e1, e2 E) bool) bool

Contains returns whether the limits include the given item pt. This is checked with both equality and greater-than comparisons, using the provided eq and gt functions respectively.

Limits is said to contain a point when it falls between the two bounds. Undefined bounds are treated as infinity in that direction; a nil Min is negative infinity, and a nil Max is positive infinity.
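A containment check along these lines might look like the following sketch. Limits is redeclared locally so the example is self-contained, contains is a hypothetical free function rather than the package's method, and treating both bounds as inclusive is an assumption suggested by the presence of eq.

```go
package main

import "fmt"

// Limits is a local stand-in for the documented type.
type Limits[E any] struct {
	Min *E
	Max *E
}

// contains is a hypothetical free-function version of the check. A nil Min
// is treated as negative infinity and a nil Max as positive infinity. Both
// bounds are assumed to be inclusive.
func contains[E any](lim Limits[E], pt E, eq, gt func(e1, e2 E) bool) bool {
	// Below Min: pt is neither equal to nor greater than the lower bound.
	if lim.Min != nil && !eq(pt, *lim.Min) && !gt(pt, *lim.Min) {
		return false
	}
	// Above Max: pt is strictly greater than the upper bound.
	if lim.Max != nil && gt(pt, *lim.Max) {
		return false
	}
	return true
}

func main() {
	lo, hi := 10, 20
	eq := func(a, b int) bool { return a == b }
	gt := func(a, b int) bool { return a > b }

	closed := Limits[int]{Min: &lo, Max: &hi}
	openTop := Limits[int]{Min: &lo}

	fmt.Println(contains(closed, 10, eq, gt), contains(closed, 21, eq, gt), contains(openTop, 1000, eq, gt))
}
```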

func (Limits[E]) IsImpossible

func (lim Limits[E]) IsImpossible(gt func(e1, e2 E) bool) bool

IsImpossible returns whether the limits are impossible for any value to fall between, due to both being set and Min being greater than Max. The function gt tells whether e1 is greater than e2.

func (Limits[E]) Narrow

func (lim Limits[E]) Narrow(other Limits[E], gt func(e1, e2 E) bool) Limits[E]

Narrow returns a new Limits whose Max is the lesser of lim and other, and whose Min is the greater of lim and other. A nil Min or Max is always overcome by a non-nil Min or Max. The function gt must be provided to give how to tell that the first operand is greater than the second.

func (Limits[E]) Widen

func (lim Limits[E]) Widen(other Limits[E], gt func(e1, e2 E) bool) Limits[E]

Widen returns a new Limits whose Max is the greater of lim and other, and whose Min is the lesser of lim and other. A nil Min or Max will always overcome a non-nil one, as they represent infinite coverage on that end. The function gt must be provided to give how to tell that the first operand is greater than the second.
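How Narrow, Widen, and IsImpossible treat nil bounds can be sketched as follows. These are hypothetical free functions over a local Limits stand-in, written from the descriptions above; they are not the package's implementation.

```go
package main

import "fmt"

// Limits is a local stand-in for the documented type.
type Limits[E any] struct {
	Min *E
	Max *E
}

// narrow keeps the tighter bound on each side; nil loses to non-nil.
func narrow[E any](a, b Limits[E], gt func(e1, e2 E) bool) Limits[E] {
	out := a
	if b.Min != nil && (out.Min == nil || gt(*b.Min, *out.Min)) {
		out.Min = b.Min
	}
	if b.Max != nil && (out.Max == nil || gt(*out.Max, *b.Max)) {
		out.Max = b.Max
	}
	return out
}

// widen keeps the looser bound on each side; nil beats non-nil.
func widen[E any](a, b Limits[E], gt func(e1, e2 E) bool) Limits[E] {
	out := a
	if out.Min != nil && (b.Min == nil || gt(*out.Min, *b.Min)) {
		out.Min = b.Min
	}
	if out.Max != nil && (b.Max == nil || gt(*b.Max, *out.Max)) {
		out.Max = b.Max
	}
	return out
}

// impossible reports whether both bounds are set and Min exceeds Max.
func impossible[E any](lim Limits[E], gt func(e1, e2 E) bool) bool {
	return lim.Min != nil && lim.Max != nil && gt(*lim.Min, *lim.Max)
}

// ptr is a small helper for taking the address of a literal.
func ptr[E any](v E) *E { return &v }

func main() {
	gt := func(a, b int) bool { return a > b }
	a := Limits[int]{Min: ptr(1), Max: ptr(10)}
	b := Limits[int]{Min: ptr(5)} // open on the upper side

	n := narrow(a, b, gt)
	w := widen(a, b, gt)
	fmt.Println(*n.Min, *n.Max, *w.Min, w.Max == nil)

	// Narrowing with a disjoint range produces limits nothing can satisfy.
	fmt.Println(impossible(narrow(a, Limits[int]{Min: ptr(20)}, gt), gt))
}
```

Checking the result of a narrowing step for impossibility before using it lets a query planner skip a scan that cannot return anything.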

type Operator

type Operator int

Operator is an operation that is applied to all operands of a FilterNode in group mode. NOT is a unary operator and can only apply to a single operand; attempting to evaluate it with more than one operand will result in a panic.

const (
	AND Operator = iota
	OR
	NOT
)

func (Operator) String

func (op Operator) String() string

type Requester

type Requester struct {

	// Address is the IP address of the client.
	Address net.IP

	// Country is the name of the country that the source IP address is from, as
	// per geolocation lookup of the Address.
	Country string

	// City is the name of the city that the source IP address is from, as per
	// geolocation lookup of the Address.
	City string
}

Requester holds information on an HTTP request client.

func (Requester) Equal

func (r Requester) Equal(other interface{}) bool

Equal returns whether other is a Requester with the same properties as r.

func (Requester) MarshalBinary

func (r Requester) MarshalBinary() ([]byte, error)

func (Requester) String

func (r Requester) String() string

func (*Requester) UnmarshalBinary

func (r *Requester) UnmarshalBinary(data []byte) error

type Store

type Store struct {
	// DataFile is the file on disk that the store will store state data in when
	// [Store.Persist] is called. It will be set automatically when the Store is
	// created with a call to [Open].
	//
	// If set to the empty string, calls to [Store.Persist] will have no effect.
	// This allows for in-memory database behavior.
	DataFile string
	// contains filtered or unexported fields
}

Store holds analytics data and provides access to both storage (OLTP) and analytics of events. The zero-value is in-memory only, but one that syncs to disk on calls to Store.Persist can be made by calling Open or setting Store.DataFile manually.

Store is safe to use from multiple goroutines concurrently. It serializes access to internal storage.

The zero-value is a Store with no Hits in it ready for immediate use as an in-memory database whose Persist function does not save it to disk. Store must not be copied once created.

func Import

func Import(data []byte) (*Store, error)

Import loads the given data bytes into a new in-memory Store. The data bytes must have been created by a prior call to Store.Export.

The returned Store will be in-memory only by default, and will not persist to disk when Store.Persist is called. To change this, set DataFile on the returned Store.

func ImportFile

func ImportFile(file string) (*Store, error)

ImportFile reads the bytes in the given file and returns the result of calling Import on the read bytes.

func Open

func Open(file string) (*Store, error)

Open creates a new Store that will persist itself to the given data file. If the file already exists, its entire contents are loaded into a new *Store which is then returned. If the file does not exist, it will be created.

The returned Store will have its DataFile member set to the given file. This does not make it so the returned Store will automatically save its contents to disk, rather Store.Persist or Store.Close must be called manually to flush it.

If file is set to the empty string, the Store will be opened in in-memory mode and calls to Persist and Close will only finalize any pending changes and will not write to disk.

func (*Store) Close

func (s *Store) Close() error

Close ends the Store connection. It automatically persists any unflushed changes (if persistence is configured via the DataFile member) and releases any other outstanding resources.

After Close returns, the Store cannot be used again, regardless of whether the returned error is nil.

If the Store has already been closed, calling this method will have no effect and the returned error will be nil.

func (*Store) DataString

func (s *Store) DataString() string

DataString returns a string containing all current data in the store. It can be useful for debugging. If the Store has already been closed, the data will not be shown.

func (*Store) Delete

func (s *Store) Delete(f Filter) (int, error)

Delete removes hits from the store. All hits that match the given Filter will be deleted. If f is nil, all hits will be considered to match. Returns the number of hits deleted. If no hit matches, nothing is performed and no error is returned.

This operation runs in O(n) with respect to the number of elements in the DB.

func (*Store) Export

func (s *Store) Export() ([]byte, error)

Export exports all data to bytes that can be later decoded with Open or Import.

func (*Store) Insert

func (s *Store) Insert(h Hit) error

Insert adds a new hit to the store. The time of the hit is not modified and is used to determine the storage location of the data.

This operation runs in O(n) with respect to the number of elements in the DB.

func (*Store) MarshalBinary

func (s *Store) MarshalBinary() ([]byte, error)

MarshalBinary converts the store to a binary bytes representation of itself. These bytes may be saved to disk or loaded into another Store with UnmarshalBinary.

This function is not concurrent safe and requires a read lock. Users of Store should prefer calling Store.Persist (or Store.Export if the exact bytes are needed) instead, which safely obtain one and handle any other required operations.

func (*Store) Persist

func (s *Store) Persist() error

Persist waits for any pending data updates in the Store to be applied and then saves the data, generally to disk. Persistence to disk will occur if Store.DataFile is set to a non-empty string. If Store.DataFile is the empty string (i.e. if Store is in in-memory mode), calling Persist will only do whatever is necessary to make any pending changes visible to future requests.

Persist is not automatically called; the user must do so themselves at the correct frequency. It is recommended it be called after each logical "batch" of operations.

When Persist is called, all data in s is marshaled to bytes and saved to disk, regardless of whether any changes occurred to the data since it was last persisted or loaded. This has performance implications, especially as the amount of data grows large.

func (*Store) Select

func (s *Store) Select(f Filter) ([]Hit, error)

Select selects all hits that match the given Filter. If there are no matches, a slice with length 0 will be returned along with a nil error.

func (*Store) String

func (s *Store) String() string

func (*Store) UnmarshalBinary

func (s *Store) UnmarshalBinary(data []byte) error

UnmarshalBinary converts a binary byte representation of a Store located at the start of data and uses it to set the values on the Store.

This function is not concurrent safe and requires a write lock. Users of Store should prefer calling Open or Import to create a Store from bytes, which safely handle obtaining synchronization primitives and any other required operations.

func (*Store) Update

func (s *Store) Update(f Filter, update func(Hit) Hit) (matched, updated int, err error)

Update applies a transformation function to hits to get a new one. All hits that match the given filter will be passed to the given function in order to create a new one. Update returns the number of records that match the filter as well as the number of records actually changed by the provided function.

While it is possible, within the function, to apply one's own checks and return the Hit unchanged when it doesn't meet them, this will result in poorer performance than if the Filter is used to limit the query to only those Hits which are to be modified.

If the update function modifies an indexed field, a performance hit is incurred as the indexes will then need to be modified.

type Where

type Where struct {
	Time          Criterion[time.Time]
	Host          Criterion[string]
	Resource      Criterion[string]
	ClientAddress Criterion[net.IP]
	ClientCountry Criterion[string]
	ClientCity    Criterion[string]
}

Where is a set of criteria that a Hit can be matched against. It may have up to one check per property of a Hit.

func (Where) And

func (w Where) And(com Filter, coms ...Filter) FilterNode

And combines both this and any other Filters given into a single FilterNode that matches only those Hits that match all of them. Multiple Filters can be given to have them all be a part of the same sequence of Ands, and will be evaluated in order.

Calling this returns a FilterNode that represents the composite condition given by (w && com && coms[0] ... && coms[N]).

func (Where) Matches

func (w Where) Matches(h Hit) bool

Matches returns whether the criteria defined by this Where match the given Hit.

func (Where) Negate

func (w Where) Negate() FilterNode

Negate returns a FilterNode that matches only those Hits that do *not* match the Where.

Calling this returns a FilterNode that represents the composite condition given by !w.

func (Where) Node

func (cond Where) Node() FilterNode

Node returns a new condition-mode FilterNode that matches Hits against this Where. It is included to implement Filter.

func (Where) Or

func (w Where) Or(com Filter, coms ...Filter) FilterNode

Or combines both this and any other Filters given into a single FilterNode that matches only those Hits that match at least one of them. Multiple Filters can be given to have them all be a part of the same sequence of Ors, and will be evaluated in order.

Calling this returns a FilterNode that represents the composite condition given by (w || com || coms[0] ... || coms[N]).

func (Where) String

func (w Where) String() string

String prints out the string representation of this Where. Two Where structs that return the same values from String() should be considered exactly equivalent, as they will produce identical output from their respective Matches.

func (Where) TimeIndexLimits

func (w Where) TimeIndexLimits() Limits[time.Time]

TimeIndexLimits returns the limits on the indexed Time field of Hits that this Where would impose. If either end is "open", it will be nil. Both ends being open means this doesn't have any limits.
