Documentation ¶
Overview ¶
Package owdb provides OrbweaverDB stores. OrbweaverDB is a NoSQL database that holds web analytics data, originally for the araneastats project. It is a very simple combined OLAP and OLTP system that is safe for concurrent access.
This is something like the DAO for this project. It has nothing to do with web crawling; quite the opposite, this spider sits in its web and waits for requests to come to *it*. Fundamentally this is architected as a multi-valued time-series oriented DB system.
Use Open to create a Store that persists to a file on disk. The Store provides a full-featured database. The data within can be saved to disk by calling Store.Persist at appropriate times. When a Store is no longer in use, call Store.Close to end all current operations. An in-memory Store is obtained either by creating a &Store{} manually or by calling Import to create one from previously-obtained bytes.
Index ¶
- type Criterion
- func CollatesAfter(s string) Criterion[string]
- func CollatesAfterOrEquals(s string) Criterion[string]
- func CollatesBefore(s string) Criterion[string]
- func CollatesBeforeOrEquals(s string) Criterion[string]
- func CollatesBetween(start, end string) Criterion[string]
- func DoesNot[E any](c Criterion[E]) Criterion[E]
- func EqualsIP(addr string) Criterion[net.IP]
- func EqualsString(s string) Criterion[string]
- func EqualsTime(val time.Time) Criterion[time.Time]
- func IsAfter(t time.Time) Criterion[time.Time]
- func IsAfterOrEquals(t time.Time) Criterion[time.Time]
- func IsBefore(t time.Time) Criterion[time.Time]
- func IsBeforeOrEquals(t time.Time) Criterion[time.Time]
- func IsBetweenIPs(start, end string) Criterion[net.IP]
- func IsBetweenTimes(start, end time.Time) Criterion[time.Time]
- func IsGreaterThanIP(addr string) Criterion[net.IP]
- func IsGreaterThanOrEqualsIP(addr string) Criterion[net.IP]
- func IsLessThanIP(addr string) Criterion[net.IP]
- func IsLessThanOrEqualsIP(addr string) Criterion[net.IP]
- func IsNullIP() Criterion[net.IP]
- func IsNullString() Criterion[string]
- func IsNullTime() Criterion[time.Time]
- func Meets[E any](fn func(v E) bool, baseName ...string) Criterion[E]
- type Filter
- type FilterNode
- func (n FilterNode) And(com Filter, coms ...Filter) FilterNode
- func (n FilterNode) IsOperation() bool
- func (n FilterNode) Matches(h Hit) bool
- func (n FilterNode) Negate() FilterNode
- func (n FilterNode) Node() FilterNode
- func (n FilterNode) Or(com Filter, coms ...Filter) FilterNode
- func (n FilterNode) Simplify() FilterNode
- func (n FilterNode) String() string
- func (n FilterNode) TimeIndexLimits() Limits[time.Time]
- type Hit
- type Limits
- func (lim Limits[E]) Contains(pt E, eq func(e1, e2 E) bool, gt func(e1, e2 E) bool) bool
- func (lim Limits[E]) IsImpossible(gt func(e1, e2 E) bool) bool
- func (lim Limits[E]) Narrow(other Limits[E], gt func(e1, e2 E) bool) Limits[E]
- func (lim Limits[E]) Widen(other Limits[E], gt func(e1, e2 E) bool) Limits[E]
- type Operator
- type Requester
- type Store
- func (s *Store) Close() error
- func (s *Store) DataString() string
- func (s *Store) Delete(f Filter) (int, error)
- func (s *Store) Export() ([]byte, error)
- func (s *Store) Insert(h Hit) error
- func (s *Store) MarshalBinary() ([]byte, error)
- func (s *Store) Persist() error
- func (s *Store) Select(f Filter) ([]Hit, error)
- func (s *Store) String() string
- func (s *Store) UnmarshalBinary(data []byte) error
- func (s *Store) Update(f Filter, update func(Hit) Hit) (matched, updated int, err error)
- type Where
- func (w Where) And(com Filter, coms ...Filter) FilterNode
- func (w Where) Matches(h Hit) bool
- func (w Where) Negate() FilterNode
- func (cond Where) Node() FilterNode
- func (w Where) Or(com Filter, coms ...Filter) FilterNode
- func (w Where) String() string
- func (w Where) TimeIndexLimits() Limits[time.Time]
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type Criterion ¶
type Criterion[E any] struct {
	Meets     func(v E) bool
	Format    string
	NotFormat string
	EstLimits Limits[E]
}
Criterion is match criteria for a single property of a Hit. Its Meets function performs the actual check as to whether the given value meets it.
The Format string is used for printing the Criterion to a human-readable string. It will be passed the formatted string that gives the name of the property that will be checked against the criterion at the time of formatting; this will be "VALUE" for cases when there is no specific property being checked (such as when calling String() by itself). If Format is not set, a generic string will be used instead.
NotFormat, if given, defines what to show when this Criterion has a Not applied to it. It is optional and Not will default to a generic string if not given.
Both Format and NotFormat have the potential to be used for equality checking. Two Criterion values with the same Format strings should return the same values from their Meets functions when given identical inputs. The same applies to NotFormat.
EstLimits gives the minimum and maximum values, according to some known ordering, that would be captured by this Criterion. It is used for query planning, and all Criterion that do not define it will not be able to be used for query planning or limiting, even if they are applied to an indexed field. This does not mean the Criterion will not be applied, just that the search for the initial set to apply it to will not use it for that purpose. Limits are always an estimate, even when defined - the real "minimum" is the smallest one that meets the criterion, which may not always be easily determinable. Limits must always be no narrower than the set of input the Criterion matches on, but they may be wider.
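As a rough illustration of how Meets and EstLimits relate, the sketch below builds a criterion over plain ints. The Limits and Criterion types are local stand-ins that copy the field names from the declaration above, and atLeast is a hypothetical constructor, not part of the package. Format and NotFormat are left unset.

```go
package main

import "fmt"

// Limits and Criterion are local stand-ins mirroring the documented field
// names; they are not the package's own types.
type Limits[E any] struct {
	Min *E
	Max *E
}

type Criterion[E any] struct {
	Meets     func(v E) bool
	Format    string
	NotFormat string
	EstLimits Limits[E]
}

// atLeast is a hypothetical constructor. It matches ints >= n and advertises
// a lower bound of n with an open (nil) upper bound, so the estimated limits
// are exactly as wide as the set of matching inputs.
func atLeast(n int) Criterion[int] {
	return Criterion[int]{
		Meets:     func(v int) bool { return v >= n },
		EstLimits: Limits[int]{Min: &n},
	}
}

func main() {
	c := atLeast(10)
	fmt.Println(c.Meets(12), c.Meets(3))
	fmt.Println(*c.EstLimits.Min, c.EstLimits.Max == nil)
}
```

A criterion built from an arbitrary predicate would leave EstLimits at its zero value, which by the description above means it is still applied but cannot narrow the initial search.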
func CollatesAfter ¶
CollatesAfter returns a Criterion that checks that the string property of interest comes after the given value.
func CollatesAfterOrEquals ¶
CollatesAfterOrEquals returns a Criterion that checks that the string property of interest comes after the given value or is the given value.
func CollatesBefore ¶
CollatesBefore returns a Criterion that checks that the string property of interest comes before the given value.
func CollatesBeforeOrEquals ¶
CollatesBeforeOrEquals returns a Criterion that checks that the string property of interest comes before or equals the given value.
func CollatesBetween ¶
CollatesBetween returns a Criterion that checks that the string property of interest collates between the given values, inclusive.
func EqualsIP ¶
EqualsIP returns a Criterion that checks that the net.IP property of interest equals the address in the given value. For the purposes of this function, an IPv4 address and that same address in IPv6 form are considered the same address. Panics if addr is not a parsable IPv4 or IPv6 address. To check for a nil IP on a hit, use IsNullIP instead.
func EqualsString ¶
EqualsString returns a Criterion that checks that the string property of interest exactly equals the given value.
func EqualsTime ¶
EqualsTime returns a Criterion that requires the time-based property of interest to be exactly the given value.
func IsAfter ¶
IsAfter returns a Criterion that requires the time-based property of interest to be after the given time, non-inclusive.
func IsAfterOrEquals ¶
IsAfterOrEquals returns a Criterion that requires the time-based property of interest to be on or after the given time.
func IsBefore ¶
IsBefore returns a Criterion that requires the time-based property of interest to be before the given time, non-inclusive.
func IsBeforeOrEquals ¶
IsBeforeOrEquals returns a Criterion that requires the time-based property of interest to be on or before the given time.
func IsBetweenIPs ¶
IsBetweenIPs returns a Criterion that checks that the net.IP property of interest lies between the two addresses, inclusive. Equality checks are performed as per EqualsIP, and less than and greater than checks are performed as per IsLessThanIP and IsGreaterThanIP. Panics if either start or end is not a parsable IPv4 or IPv6 address.
func IsBetweenTimes ¶
IsBetweenTimes returns a Criterion that requires the time-based property of interest to be between the given times, inclusive.
func IsGreaterThanIP ¶
IsGreaterThanIP returns a Criterion that checks that the net.IP property of interest comes after the given value when the two are compared on a byte-by-byte level. Both are converted to full IPv6 representations before the comparison is made. A nil IP address is considered to be less than all other addresses. Panics if addr is not a parsable IPv4 or IPv6 address.
func IsGreaterThanOrEqualsIP ¶
IsGreaterThanOrEqualsIP returns a Criterion that checks that the net.IP property of interest comes after the given value when the two are compared on a byte-by-byte level, or that the two are equal. Both are converted to full IPv6 representations before the comparison is made. A nil IP address is considered to be less than all other addresses. Panics if addr is not a parsable IPv4 or IPv6 address.
func IsLessThanIP ¶
IsLessThanIP returns a Criterion that checks that the net.IP property of interest comes before the given value when the two are compared on a byte-by-byte level. Both are converted to full IPv6 representations before the comparison is made. A nil IP address is considered to be less than all other addresses. Panics if addr is not a parsable IPv4 or IPv6 address.
func IsLessThanOrEqualsIP ¶
IsLessThanOrEqualsIP returns a Criterion that checks that the net.IP property of interest comes before the given value when the two are compared on a byte-by-byte level, or that the two are equal. Both are converted to full IPv6 representations before the comparison is made. A nil IP address is considered to be less than all other addresses. Panics if addr is not a parsable IPv4 or IPv6 address.
func IsNullIP ¶
IsNullIP returns a Criterion that checks that the net.IP property of interest is not set.
func IsNullString ¶
IsNullString returns a Criterion that checks that the string property of interest is not set. This is equivalent to EqualsString(""), but does not have any estimated limits defined.
func IsNullTime ¶
IsNullTime returns a Criterion that checks that the time property of interest is not set. This is equivalent to EqualsTime(time.Time{}), but does not have any estimated limits defined.
func Meets ¶
Meets returns a Criterion that matches against input by using the provided function as its Meets field. This is a convenience function for defining a new Criterion on-the-fly when the caller does not particularly care about display format related fields, or intends to set them later. It makes it so the generic type parameter of the Criterion can be inferred from the function given, which cannot be done when directly instantiating Criterion as of Go 1.19.
Instead of having to write something long and difficult to read such as Criterion[time.Time]{Meets: func(v time.Time) bool { return v == myTime }}, this function can be used to write Meets(func(v time.Time) bool { return v == myTime }), which is a bit easier to grok.
baseName, if given, gives the baseName to use for the display field. Only the first baseName is read, if present; all after the first are ignored. If one is given, it is used as the basis for both Format and NotFormat. If one is provided and it is blank, this function will panic.
In order to avoid breaking the contract that functions with the same format strings return equivalent values from their Meets methods, and given that fn itself cannot be exhaustively checked to ensure that, every Criterion created with Meets that doesn't provide a baseName is given a random name that is used to fill its display-format related fields. The randomness is not suited for cryptographic applications. If this is needed, callers should avoid using the default name generation by providing a base-name themselves.
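The inference benefit described above can be seen in a small self-contained sketch. The Criterion type and Meets function here are simplified local stand-ins: this Meets only fills the Meets field and skips the baseName and random-name behavior entirely.

```go
package main

import (
	"fmt"
	"time"
)

// Criterion is a local stand-in with only the field needed for this sketch.
type Criterion[E any] struct {
	Meets func(v E) bool
}

// Meets is a simplified stand-in for the documented function. It omits the
// naming behavior and only wraps fn.
func Meets[E any](fn func(v E) bool) Criterion[E] {
	return Criterion[E]{Meets: fn}
}

func main() {
	myTime := time.Date(2023, time.January, 1, 0, 0, 0, 0, time.UTC)

	// Direct instantiation: the type argument must be spelled out.
	long := Criterion[time.Time]{Meets: func(v time.Time) bool { return v.Equal(myTime) }}

	// Via the helper: E is inferred from the function's parameter type.
	short := Meets(func(v time.Time) bool { return v.Equal(myTime) })

	fmt.Println(long.Meets(myTime), short.Meets(myTime.Add(time.Second)))
}
```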
func (Criterion[E]) FilledString ¶
FilledString returns the string representation of this Criterion when it is being used to check against a particular value. The value could be the name of a property, or the string representation of an actual value along with any delimiter characters that show it. This value is passed unchanged to crit's Format to create the formatted check-string. If Format is set to the empty string, a generic format string is used instead.
type Filter ¶
type Filter interface {
	Node() FilterNode
	And(clause Filter, clauses ...Filter) FilterNode
	Or(clause Filter, clauses ...Filter) FilterNode
	Negate() FilterNode
	TimeIndexLimits() Limits[time.Time]
	Matches(h Hit) bool
}
Filter is an interface that is implemented by all types that can be combined into a FilterNode. Because a FilterNode requires that anything combined with it itself be a FilterNode, this interface indicates that the type can be converted to one (via Node) and then combined with it. Additionally, Filter supports the creation of a FilterNode via the addition of an operator and any applicable operands, selected via And, Or, or Negate.
type FilterNode ¶
type FilterNode struct {
	Cond  *Where
	Op    Operator
	Group []FilterNode
}
FilterNode is a set of conditions to match against all Hits that an operation is to apply to. It is either a "condition"-mode FilterNode, which includes specific criteria, or a "group"-mode FilterNode, which combines condition-mode FilterNodes with operators. A FilterNode cannot be both. If And and Or are used to combine FilterNodes and Wheres, this will be handled automatically.
Cond determines whether the FilterNode is in condition mode or group mode. If it is set to a non-nil condition, the FilterNode is in condition mode, and Group and Op are ignored. If Cond is set to nil, the FilterNode is in group mode and will test a hit against all FilterNodes in Group, combined with Op.
An Op of NOT in group mode will use only a single operand from Group. If a FilterNode is evaluated with Op of NOT and multiple operands in Group, all others are ignored. A FilterNode in group mode with an operand of NOT will return false for all Hits passed to Matches.
The zero-value is a ready to use FilterNode in group mode that will match all Hits.
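The two modes can be sketched with heavily reduced stand-in types. Everything below is an assumption made for illustration: Where is collapsed to a single Host predicate, the Operator constant names and their order are invented (with AND as the zero value so that the zero FilterNode matches everything), and a NOT with no operands is treated as matching nothing, which is only one reading of the text above.

```go
package main

import (
	"fmt"
	"strings"
)

type Hit struct{ Host string }

// Where is reduced to a single optional predicate for brevity.
type Where struct{ Host func(string) bool }

type Operator int

// Constant names and ordering are assumptions for this sketch.
const (
	AND Operator = iota
	OR
	NOT
)

type FilterNode struct {
	Cond  *Where
	Op    Operator
	Group []FilterNode
}

func (n FilterNode) Matches(h Hit) bool {
	if n.Cond != nil { // condition mode: Op and Group are ignored
		return n.Cond.Host == nil || n.Cond.Host(h.Host)
	}
	switch n.Op { // group mode
	case OR:
		for _, g := range n.Group {
			if g.Matches(h) {
				return true
			}
		}
		return false
	case NOT: // only the first operand is consulted
		return len(n.Group) > 0 && !n.Group[0].Matches(h)
	default: // AND; an empty group matches every Hit
		for _, g := range n.Group {
			if !g.Matches(h) {
				return false
			}
		}
		return true
	}
}

// hostHasSuffix is a hypothetical helper that builds a condition-mode node.
func hostHasSuffix(s string) FilterNode {
	return FilterNode{Cond: &Where{Host: func(h string) bool { return strings.HasSuffix(h, s) }}}
}

func main() {
	either := FilterNode{Op: OR, Group: []FilterNode{hostHasSuffix(".org"), hostHasSuffix(".com")}}
	neither := FilterNode{Op: NOT, Group: []FilterNode{either}}
	h := Hit{Host: "example.org"}
	fmt.Println(FilterNode{}.Matches(h), either.Matches(h), neither.Matches(h))
}
```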
func Not ¶
func Not(f Filter) FilterNode
func (FilterNode) And ¶
func (n FilterNode) And(com Filter, coms ...Filter) FilterNode
And returns a new FilterNode that matches only those Hits that match all of the given combined conditions. Multiple Filters can be given to have them all be a part of the same sequence of Ands, and will be evaluated in order.
Calling this returns a FilterNode that represents the composite condition given by (n && com ... && comN).
func (FilterNode) IsOperation ¶
func (n FilterNode) IsOperation() bool
IsOperation returns whether the FilterNode represents a grouped operation, that is, one or more operands that an operator is applied to. A FilterNode that represents this is said to be in "group" mode as opposed to "condition" mode, because its Group (and Op) members are used to check whether it matches some input as opposed to the Where in the FilterNode's Cond member.
A FilterNode with Cond set to nil is considered an operation, regardless of the values of Group and Op. Likewise, a FilterNode with Cond set to a non-nil value is considered not an operation (though the conceptual line gets a bit blurry when Cond contains multiple criteria, which strictly speaking are treated as though they are AND'd together).
TODO: above parenthetical not needed, move that to pkg docs
If IsOperation returns false, then the FilterNode is in condition mode.
func (FilterNode) Matches ¶
func (n FilterNode) Matches(h Hit) bool
Matches returns whether the given Hit matches this FilterNode.
func (FilterNode) Negate ¶
func (n FilterNode) Negate() FilterNode
Negate returns a FilterNode that matches only those Hits that do *not* match n.
Calling this returns a FilterNode that represents the composite condition given by !n.
func (FilterNode) Node ¶
func (n FilterNode) Node() FilterNode
Node returns the FilterNode itself. It is included for implementation of Filter.
func (FilterNode) Or ¶
func (n FilterNode) Or(com Filter, coms ...Filter) FilterNode
Or returns a new FilterNode that matches all Hits that match at least one of the given combined conditions. Multiple Filters can be given to have them all be a part of the same sequence of Ors, and will be evaluated in order.
Calling this returns a FilterNode that represents the composite condition given by (n || com ... || comN).
func (FilterNode) Simplify ¶
func (n FilterNode) Simplify() FilterNode
Simplify returns a FilterNode that represents the same logic as this one but with any redundant operations removed (such as a double NOT). If n is already in simplest terms, it will return itself.
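Double-NOT elimination, the one simplification named above, can be sketched on a tiny expression type. The expr type and the not and simplify helpers are hypothetical and far simpler than FilterNode; the source does not say which other rewrites Simplify performs, so none are shown.

```go
package main

import "fmt"

// expr is a hypothetical expression: either a leaf, or a NOT of child.
// Invariant assumed by this sketch: when not is true, child is non-nil.
type expr struct {
	leaf  string
	not   bool
	child *expr
}

func not(e expr) expr { return expr{not: true, child: &e} }

func (e expr) String() string {
	if e.not {
		return "NOT(" + e.child.String() + ")"
	}
	return e.leaf
}

// simplify removes every NOT(NOT(x)) pair, working from the inside out.
func simplify(e expr) expr {
	if !e.not {
		return e // a leaf is already in simplest terms
	}
	inner := simplify(*e.child)
	if inner.not {
		return *inner.child // NOT(NOT(x)) collapses to x
	}
	return expr{not: true, child: &inner}
}

func main() {
	a := expr{leaf: "Host = example.org"}
	fmt.Println(simplify(not(not(a))))      // Host = example.org
	fmt.Println(simplify(not(not(not(a))))) // NOT(Host = example.org)
}
```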
func (FilterNode) String ¶
func (n FilterNode) String() string
String prints out the string representation of the FilterNode. Two FilterNodes should be considered exactly equivalent if they produce the same string, as they will match and fail to match on the same inputs.
func (FilterNode) TimeIndexLimits ¶
func (n FilterNode) TimeIndexLimits() Limits[time.Time]
TimeIndexLimits returns the limits on the indexed Time field of Hits that this FilterNode would impose. If either end is "open", it will be nil. Both ends being open means this doesn't have any limits. Returned Limits should be checked with IsImpossible() to verify that they are possible before using.
type Hit ¶
type Hit struct {
	// Time is the time that the event was recorded.
	Time time.Time

	// Host is an identifier of the host that the client was accessing. It is
	// usually a DNS name but may also be an IP address.
	Host string

	// Resource is the identifier of the resource within the host that the
	// client was attempting to access. This is usually a path for HTTP servers
	// or something similar.
	Resource string

	// Client is information on the HTTP client who made the request.
	Client Requester
}
Hit is a single hit on a website from a particular IP address, which may or may not be unique.
func (Hit) MarshalBinary ¶
func (*Hit) UnmarshalBinary ¶
type Limits ¶
type Limits[E any] struct {
	Min *E
	Max *E
}
Limits gives the max and minimum for a value. It is used for query planning based on a filter provided. It can be applied even to normally non-comparable types, as long as Min <= Max according to some known ordering.
If Min or Max are set to nil, they should be considered as "no limit" on that side of things.
func (Limits[E]) Contains ¶
Contains returns whether the limits include the given item pt. This will be checked by both equality and greater-than checks, using the provided eq and gt functions respectively.
Limits is said to contain a point when it falls between the two bounds. Undefined bounds are treated as infinity in that direction; a nil Min is negative infinity, and a nil Max is positive infinity.
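A minimal sketch of such a containment check follows, using a local stand-in for Limits. It assumes both bounds are inclusive, which the text above suggests by taking an eq function but does not state outright.

```go
package main

import "fmt"

// Limits is a local stand-in mirroring the documented fields.
type Limits[E any] struct {
	Min *E
	Max *E
}

// Contains treats a nil bound as infinity on that side and, as an assumption
// of this sketch, treats both bounds as inclusive.
func (lim Limits[E]) Contains(pt E, eq func(e1, e2 E) bool, gt func(e1, e2 E) bool) bool {
	aboveMin := lim.Min == nil || eq(pt, *lim.Min) || gt(pt, *lim.Min)
	belowMax := lim.Max == nil || eq(pt, *lim.Max) || gt(*lim.Max, pt)
	return aboveMin && belowMax
}

func intEq(a, b int) bool { return a == b }
func intGt(a, b int) bool { return a > b }

func main() {
	lo := 10
	lim := Limits[int]{Min: &lo} // open on the upper side
	fmt.Println(lim.Contains(10, intEq, intGt), lim.Contains(9, intEq, intGt), lim.Contains(1000, intEq, intGt))
}
```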
func (Limits[E]) IsImpossible ¶
IsImpossible returns whether the limits are impossible for any value to fall between, due to both being set and Min being greater than Max. The function gt tells whether e1 is greater than e2.
func (Limits[E]) Narrow ¶
Narrow returns a new Limits whose Max is the lesser of lim and other, and whose Min is the greater of lim and other. A nil Min or Max is always overcome by a non-nil Min or Max. The function gt must be provided to give how to tell that the first operand is greater than the second.
func (Limits[E]) Widen ¶
Widen returns a new Limits whose Max is the greater of lim and other, and whose Min is the lesser of lim and other. A nil Min or Max will always overcome a non-nil one, as they represent infinite coverage on that end. The function gt must be provided to give how to tell that the first operand is greater than the second.
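The three operations above can be sketched together on a local stand-in type. The ptr helper is hypothetical and exists only to make the example compact; the method bodies are one plausible reading of the descriptions, not the package's source.

```go
package main

import "fmt"

type Limits[E any] struct {
	Min *E
	Max *E
}

// ptr is a hypothetical convenience for taking the address of a literal.
func ptr[E any](v E) *E { return &v }

// Narrow keeps the greater Min and the lesser Max; nil loses to non-nil.
func (lim Limits[E]) Narrow(other Limits[E], gt func(e1, e2 E) bool) Limits[E] {
	out := lim
	if other.Min != nil && (out.Min == nil || gt(*other.Min, *out.Min)) {
		out.Min = other.Min
	}
	if other.Max != nil && (out.Max == nil || gt(*out.Max, *other.Max)) {
		out.Max = other.Max
	}
	return out
}

// Widen keeps the lesser Min and the greater Max; nil beats non-nil.
func (lim Limits[E]) Widen(other Limits[E], gt func(e1, e2 E) bool) Limits[E] {
	out := lim
	if out.Min != nil && (other.Min == nil || gt(*out.Min, *other.Min)) {
		out.Min = other.Min
	}
	if out.Max != nil && (other.Max == nil || gt(*other.Max, *out.Max)) {
		out.Max = other.Max
	}
	return out
}

// IsImpossible reports whether both bounds are set and Min exceeds Max.
func (lim Limits[E]) IsImpossible(gt func(e1, e2 E) bool) bool {
	return lim.Min != nil && lim.Max != nil && gt(*lim.Min, *lim.Max)
}

func intGt(a, b int) bool { return a > b }

func main() {
	a := Limits[int]{Min: ptr(1), Max: ptr(5)}
	b := Limits[int]{Min: ptr(7)}
	n := a.Narrow(b, intGt)
	w := a.Widen(b, intGt)
	fmt.Println(*n.Min, *n.Max, n.IsImpossible(intGt)) // 7 5 true
	fmt.Println(*w.Min, w.Max == nil)                  // 1 true
}
```

Narrowing corresponds to combining two conditions that must both hold, and widening to conditions where either may hold, which is why an impossible result can only come out of Narrow.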
type Operator ¶
type Operator int
Operator is an operation that is applied to all operands of a FilterNode in group mode. NOT is a unary operator and can only apply to a single operand; attempting to evaluate it with more than one operand will result in a panic.
type Requester ¶
type Requester struct {
	// Address is the IP address of the client.
	Address net.IP

	// Country is the name of the country that the source IP address is from, as
	// per geolocation lookup of the Address.
	Country string

	// City is the name of the city that the source IP address is from, as per
	// geolocation lookup of the Address.
	City string
}
Requester holds information on an HTTP request client.
func (Requester) MarshalBinary ¶
func (*Requester) UnmarshalBinary ¶
type Store ¶
type Store struct {
	// DataFile is the file on disk that the store will store state data in when
	// [Store.Persist] is called. It will be set automatically when the Store is
	// created with a call to [Open].
	//
	// If set to the empty string, calls to [Store.Persist] will have no effect.
	// This allows for in-memory database behavior.
	DataFile string
	// contains filtered or unexported fields
}
Store holds analytics data and provides access to both storage (OLTP) and analytics of events. The zero-value is in-memory only, but one that syncs to disk on calls to Store.Persist can be made by calling Open or setting [Store.DataFile] manually.
Store is safe to use from multiple goroutines concurrently. It serializes access to internal storage.
The zero-value is a Store with no Hits in it ready for immediate use as an in-memory database whose Persist function does not save it to disk. Store must not be copied once created.
func Import ¶
Import loads the given data bytes into a new in-memory Store. The data bytes must have been created by a prior call to Store.Export.
The returned Store will be in-memory only by default, and will not persist to disk when Store.Persist is called. To change this, set DataFile on the returned Store.
func ImportFile ¶
ImportFile reads the bytes in the given file and returns the result of calling Import on the read bytes.
func Open ¶
Open creates a new Store that will persist itself to the given data file. If the file already exists, its entire contents are loaded into a new *Store which is then returned. If the file does not exist, it will be created.
The returned Store will have its DataFile member set to the given file. This does not make the returned Store automatically save its contents to disk; Store.Persist or Store.Close must be called manually to flush it.
If file is set to the empty string, the Store will be opened in in-memory mode and calls to Persist and Close will only finalize any pending changes and will not write to disk.
func (*Store) Close ¶
Close ends the Store connection. It automatically persists any unflushed changes (if persistence is configured via the DataFile member) and releases any other outstanding resources.
After Close returns, the Store cannot be used again, regardless of whether the returned error is nil.
If the Store has already been closed, calling this method will have no effect and the returned error will be nil.
func (*Store) DataString ¶
DataString returns a string containing all current data in the store. It can be useful for debugging. If the Store has already been closed, the data will not be shown.
func (*Store) Delete ¶
Delete removes hits from the store. All hits that match the given Filter will be deleted. If f is nil, all hits will be considered to match. Returns the number of hits deleted. If no hit matches, nothing is performed and no error is returned.
This operation runs in O(n) with respect to the number of elements in the DB.
func (*Store) Export ¶
Export exports all data to bytes that can be later decoded with Open or Import.
func (*Store) Insert ¶
Insert adds a new hit to the store. The time of the hit is not modified and is used to determine the storage location of the data.
This operation runs in O(n) with respect to the number of elements in the DB.
func (*Store) MarshalBinary ¶
MarshalBinary converts the store to a binary bytes representation of itself. These bytes may be saved to disk or loaded into another Store with UnmarshalBinary.
This function is not concurrent safe and requires a read lock. Users of Store should prefer calling Store.Persist (or Store.Export if the exact bytes are needed) instead, which safely obtain the lock and handle any other required operations.
func (*Store) Persist ¶
Persist waits for any pending data updates in the Store to be applied and then saves the data, generally to disk. Persistence to disk will occur if Store.DataFile is set to a non-empty string. If Store.DataFile is the empty string (i.e. if Store is in in-memory mode), calling Persist will only do whatever is necessary to make any pending changes visible to future requests.
Persist is not automatically called; the user must do so themselves at the correct frequency. It is recommended it be called after each logical "batch" of operations.
When Persist is called, all data in s is marshaled to bytes and saved to disk, regardless of whether any changes occurred to the data since it was last persisted or loaded. This has performance implications, especially as the amount of data grows large.
func (*Store) Select ¶
Select selects all hits that match the given Filter. If there are no matches, a slice with length 0 will be returned along with a nil error.
func (*Store) UnmarshalBinary ¶
UnmarshalBinary converts a binary byte representation of a Store located at the start of data and uses it to set the values on the Store.
This function is not concurrent safe and requires a write lock. Users of Store should prefer calling Open or Import to create a Store from bytes, which safely handle obtaining synchronization primitives and any other required operations.
func (*Store) Update ¶
Update applies a transformation function to matching hits. Each hit that matches the given filter is passed to the given function, and the Hit it returns replaces the original. Update returns the number of records that match the filter as well as the number of records actually changed by the provided function.
While it is possible to apply one's own checks within the function and return the Hit unchanged when it doesn't meet them, this will result in poorer performance than if the Filter is used to limit the query to only those Hits which are to be modified.
If the update function modifies an indexed field, a performance hit is incurred as the indexes will then need to be modified.
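The difference between the matched and updated counts can be sketched over a plain slice. Here update is a hypothetical free function with a reduced Hit type and a simple predicate in place of a Filter; it ignores indexing and locking entirely.

```go
package main

import "fmt"

// Hit is a reduced stand-in with only comparable fields.
type Hit struct{ Host, Resource string }

// update is a hypothetical helper. It counts every hit accepted by match, and
// separately counts those that fn actually changed.
func update(hits []Hit, match func(Hit) bool, fn func(Hit) Hit) (matched, updated int) {
	for i, h := range hits {
		if !match(h) {
			continue
		}
		matched++
		if nh := fn(h); nh != h {
			hits[i] = nh
			updated++
		}
	}
	return matched, updated
}

func main() {
	hits := []Hit{
		{Host: "example.org", Resource: "/old"},
		{Host: "example.org", Resource: "/new"},
		{Host: "example.com", Resource: "/old"},
	}
	matched, updated := update(hits,
		func(h Hit) bool { return h.Host == "example.org" },
		func(h Hit) Hit { h.Resource = "/new"; return h },
	)
	fmt.Println(matched, updated) // 2 1
}
```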
type Where ¶
type Where struct {
	Time          Criterion[time.Time]
	Host          Criterion[string]
	Resource      Criterion[string]
	ClientAddress Criterion[net.IP]
	ClientCountry Criterion[string]
	ClientCity    Criterion[string]
}
Where is a set of criteria that a Hit can be matched against. It may have up to one check per property of a Hit.
func (Where) And ¶
func (w Where) And(com Filter, coms ...Filter) FilterNode
And combines both this and any other Filters given into a single FilterNode that matches only those Hits that match all of them. Multiple Filters can be given to have them all be a part of the same sequence of Ands, and will be evaluated in order.
Calling this returns a FilterNode that represents the composite condition given by (w && com ... && comN).
func (Where) Matches ¶
Matches returns whether the criteria defined by this Where match the given Hit.
func (Where) Negate ¶
func (w Where) Negate() FilterNode
Negate returns a FilterNode that matches only those Hits that do *not* match the Where.
Calling this returns a FilterNode that represents the composite condition given by !w.
func (Where) Node ¶
func (cond Where) Node() FilterNode
Node returns a new condition-mode FilterNode that matches Hits against this Where. It is included to implement Filter.
func (Where) Or ¶
func (w Where) Or(com Filter, coms ...Filter) FilterNode
Or combines both this and any other Filters given into a single FilterNode that matches only those Hits that match at least one of them. Multiple Filters can be given to have them all be a part of the same sequence of Ors, and will be evaluated in order.
Calling this returns a FilterNode that represents the composite condition given by (w || com ... || comN).