Documentation ¶
Overview ¶
Package filters provides a data-record filtering mechanism and basic implementations for typical use cases. It is intended as a complement to the formats sister package, useful for automating unique record extraction from a data file.
A loose naming convention of adding "s" on the end implies that the filter is applied independently for each field of the record. Thus the missing "s" on "require" means that all supplied fields are required simultaneously. The currently supported filters are:
"require" - drops any record that does NOT match ALL of its field entries. An empty string ("") require field is skipped, so to require records with blank fields, use the special string FilterBlankEntry.
"excludes" - drops any record matching at least one of its field entries. An empty string ("") exclude field is skipped, so to exclude records with blank fields, use the special string FilterBlankEntry. To exclude multiple keywords from one field, either use multiple excludes or write a new Filter.
"null_fields" - remaps fields from a placeholder string into an empty string. For example, many data sources use a placeholder of "-" or "n/a" to indicate a missing element. This filter may also be used to suppress particular values from records.
"split_fields" - splits fields on a delimiter, creating a new record for each split value. For example, a single record with 3="A,B,C" and a delimiter of "," emits three records with 3="A", 3="B" and 3="C". Note that the delimiter "" is not allowed.
"date_formats" - parses the field value using an strptime format string and reformats it into the standard representation "2006-01-02 15:04:05" in UTC. Note that not all strptime formats are available; see the package at github.com/pbnjay/strptime for a listing.
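The "split_fields" behavior described above can be illustrated with a small standalone sketch. It does not use this package's implementation: splitField and the Record alias are hypothetical names introduced here, and the pass-through behavior for a missing field is an assumption.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Record mirrors the map type used by the Filter interface (local alias, assumption).
type Record = map[interface{}]string

// splitField is a hypothetical helper, not part of the package API. It emits
// one copy of rec for each delimited value found in rec[key].
func splitField(rec Record, key interface{}, delim string) ([]Record, error) {
	if delim == "" {
		// The documentation states that an empty delimiter is not allowed.
		return nil, errors.New("empty delimiter is not allowed")
	}
	val, ok := rec[key]
	if !ok {
		// Assumption: a record without the field passes through unchanged.
		return []Record{rec}, nil
	}
	parts := strings.Split(val, delim)
	out := make([]Record, 0, len(parts))
	for _, p := range parts {
		// Copy the record so each emitted record owns its own map.
		cp := make(Record, len(rec))
		for k, v := range rec {
			cp[k] = v
		}
		cp[key] = p
		out = append(out, cp)
	}
	return out, nil
}

func main() {
	recs, err := splitField(Record{1: "gene1", 3: "A,B,C"}, 3, ",")
	if err != nil {
		panic(err)
	}
	for _, r := range recs {
		fmt.Println(r[1], r[3])
	}
}
```

Each emitted record keeps the untouched fields (here field 1) and differs only in the split field.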
To support new filters, simply implement the Filter interface and call RegisterFilter before using GetFilter or FilterSet.Append.
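As a rough illustration of implementing the Filter interface, the sketch below defines a hypothetical upperFilter. The interface is copied locally so the example compiles on its own; the filter name, its behavior, and the way it interprets Setup parts are all assumptions made for illustration. The registration call is left as a comment because the shape of FilterGetter is not shown in this section.

```go
package main

import (
	"fmt"
	"strings"
)

// Filter is a local copy of the interface documented in this package, so the
// sketch is self-contained. Real code would use the package's Filter type.
type Filter interface {
	Setup(parts map[interface{}]string) error
	Apply(fields map[interface{}]string) []map[interface{}]string
}

// upperFilter is a hypothetical filter that upper-cases selected fields.
type upperFilter struct {
	keys []interface{}
}

// Setup records which fields to transform. Assumption: any non-empty part
// enables the field, echoing the "empty string is skipped" convention.
func (u *upperFilter) Setup(parts map[interface{}]string) error {
	u.keys = u.keys[:0]
	for k, p := range parts {
		if p != "" {
			u.keys = append(u.keys, k)
		}
	}
	return nil
}

// Apply upper-cases the selected fields in place and returns the record (1-to-1).
func (u *upperFilter) Apply(fields map[interface{}]string) []map[interface{}]string {
	for _, k := range u.keys {
		if v, ok := fields[k]; ok {
			fields[k] = strings.ToUpper(v)
		}
	}
	return []map[interface{}]string{fields}
}

// Compile-time check that upperFilter satisfies the interface.
var _ Filter = (*upperFilter)(nil)

func main() {
	f := &upperFilter{}
	if err := f.Setup(map[interface{}]string{"name": "on", "id": ""}); err != nil {
		panic(err)
	}
	out := f.Apply(map[interface{}]string{"name": "alice", "id": "a1"})
	fmt.Println(out[0]["name"], out[0]["id"])

	// With the real package, registration would come before GetFilter or
	// FilterSet.Append, along the lines of:
	//   filters.RegisterFilter("uppers", getter)
	// where getter has the package's FilterGetter type (not shown here).
}
```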
Variables ¶
var (
	// FilterBlankEntry is a placeholder for blank string matching in RequireFilter and
	// ExcludeFilter. If for some reason your input contains this text and you need a
	// different representation, this may be overridden in user code.
	FilterBlankEntry = "<BLANK>"
)
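To make the role of the blank placeholder concrete, here is a minimal sketch of "require"-style matching. It is not the package's RequireFilter: matchesAll and blankEntry are hypothetical names, and exact string equality is an assumption, since the documentation only says "match".

```go
package main

import "fmt"

// blankEntry mirrors the documented default value of FilterBlankEntry.
const blankEntry = "<BLANK>"

// matchesAll is a hypothetical helper: it reports whether rec satisfies every
// non-empty entry in want, treating blankEntry as "this field must be blank".
func matchesAll(rec, want map[interface{}]string) bool {
	for k, w := range want {
		switch w {
		case "":
			// Empty entries are skipped, as described for "require".
			continue
		case blankEntry:
			// A missing key reads as "", so it also counts as blank here.
			if rec[k] != "" {
				return false
			}
		default:
			// Assumption: matching means exact equality.
			if rec[k] != w {
				return false
			}
		}
	}
	return true
}

func main() {
	want := map[interface{}]string{0: "chr1", 1: "", 2: blankEntry}
	fmt.Println(matchesAll(map[interface{}]string{0: "chr1", 1: "x", 2: ""}, want))
	fmt.Println(matchesAll(map[interface{}]string{0: "chr1", 1: "x", 2: "y"}, want))
}
```

The first record satisfies all three entries; the second fails only because field 2 is not blank.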
Functions ¶
func RegisterFilter ¶
func RegisterFilter(name string, fg FilterGetter)
RegisterFilter adds a new named Filter for discovery by GetFilter or FilterSet.Append.
Types ¶
type Filter ¶
type Filter interface {
	// Setup defines the part strings used to apply this filter to new records.
	Setup(parts map[interface{}]string) error

	// Apply takes an input record and applies the Filter to create 0 or more records.
	Apply(fields map[interface{}]string) []map[interface{}]string
}
Filter defines an interface that manipulates fields from one record into a new slice of records (most often 1-to-1). These manipulations can have optional parameters provided by Setup to control them.
type FilterSet ¶
type FilterSet struct {
// contains filtered or unexported fields
}
FilterSet defines an ordered set of filters that are applied to incoming data records. These filters can be used to restrict, reformat, and subdivide data into unique records. Filters are applied in the order they are added with Append(), so results are cumulative and early restrictions can bypass more expensive field splits.
func (*FilterSet) Apply ¶
Apply calls Filter.Apply for each filter in the FilterSet and accumulates the results. Restrictive filters (such as Require/Exclude) should be applied as early as possible, and expansive filters (such as Split and DateFormat) as late as possible, to reduce computation time.
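The ordering advice can be seen in a small standalone sketch of cumulative application. This is not the FilterSet implementation: step, applyAll, excludeStatus, and splitTags are hypothetical names, and the early exit on an empty result is an assumption about how a restrictive filter saves work.

```go
package main

import (
	"fmt"
	"strings"
)

// Record mirrors the map type used by the Filter interface (local alias, assumption).
type Record = map[interface{}]string

// step stands in for Filter.Apply: one record in, zero or more records out.
type step func(Record) []Record

// applyAll feeds every record produced by one step into the next step.
func applyAll(steps []step, rec Record) []Record {
	current := []Record{rec}
	for _, s := range steps {
		var next []Record
		for _, r := range current {
			next = append(next, s(r)...)
		}
		current = next
		if len(current) == 0 {
			// Everything was dropped, so later (costlier) steps never run.
			break
		}
	}
	return current
}

// excludeStatus drops records whose "status" field equals bad.
func excludeStatus(bad string) step {
	return func(r Record) []Record {
		if r["status"] == bad {
			return nil
		}
		return []Record{r}
	}
}

// splitTags emits one record per delimited value in the "tags" field.
func splitTags(delim string) step {
	return func(r Record) []Record {
		var out []Record
		for _, t := range strings.Split(r["tags"], delim) {
			cp := Record{}
			for k, v := range r {
				cp[k] = v
			}
			cp["tags"] = t
			out = append(out, cp)
		}
		return out
	}
}

func main() {
	// Restrictive step first, expansive step last.
	steps := []step{excludeStatus("obsolete"), splitTags(",")}
	kept := applyAll(steps, Record{"status": "ok", "tags": "a,b"})
	dropped := applyAll(steps, Record{"status": "obsolete", "tags": "a,b,c,d"})
	fmt.Println(len(kept), len(dropped))
}
```

With the exclusion placed first, the obsolete record is discarded before any split copies are allocated; reversing the order would give the same final records but would split four ways before discarding them all.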