dataparse

v0.0.0-...-6c8f576 Latest
Published: May 24, 2024 License: BSD-3-Clause Imports: 16 Imported by: 0

README

dataparse

Too often I have to work with these annoyances:

  1. CSV files that require rewriting the parsing from strings for every varying type
  2. Excel files that report erroneous types
  3. Unreliable APIs reporting e.g. integers as floats
  4. Unreliable, (almost) undocumented APIs that return an object or a list depending on the number of results

And so on.

To solve these annoyances dataparse was born.

A one-stop shop that makes it easy to retrieve information from varying sources and handles the transformation between types.

General use

APIs

If an API does not offer an OpenAPI spec it is left to the consumer to implement a client. Usually it is enough to have a look at the results with curl, define structs with tags accordingly and then json.Unmarshal into those.

Sometimes these APIs (especially SGML-to-JSON-wrapped and to a lesser extent Java-backed APIs) report values inconsistently, e.g. reporting integers as floats or numbers as strings.

In those cases dataparse.Map can help:

// Execute the request to the API
resp, err := http.Get("https://outdated-but-important.api/path/to/endpoint")
if err != nil {
    return err
}
defer resp.Body.Close()

// Read the returned JSON data into a dataparse.Map
m, err := dataparse.FromJsonSingle(resp.Body)
if err != nil {
    return err
}

i, err := m.Int("integer_value")
if err != nil {
    return err
}

log.Printf("integer value: %d", i)

In this case the API can return the integer as an integer, a string or a float, and dataparse will transform it into the desired int.
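To illustrate the kind of coercion involved, here is a minimal standalone sketch that uses only the standard library. The helper toInt is hypothetical and is not dataparse's implementation; it only shows how an integer that arrives as an int, a float64 or a string could be normalized.

```go
package main

import (
	"fmt"
	"math"
	"strconv"
	"strings"
)

// toInt is a hypothetical helper, not part of dataparse. It coerces the
// value shapes a JSON API typically produces into an int.
func toInt(v any) (int, error) {
	switch t := v.(type) {
	case int:
		return t, nil
	case float64:
		// encoding/json decodes every JSON number into float64.
		if math.IsInf(t, 0) || t != math.Trunc(t) {
			return 0, fmt.Errorf("value %v is not a whole number", t)
		}
		return int(t), nil
	case string:
		s := strings.TrimSpace(t)
		if i, err := strconv.Atoi(s); err == nil {
			return i, nil
		}
		// Fall back to float syntax such as "42.0".
		f, err := strconv.ParseFloat(s, 64)
		if err != nil {
			return 0, fmt.Errorf("cannot parse %q as integer: %w", t, err)
		}
		return toInt(f)
	default:
		return 0, fmt.Errorf("unhandled type %T", v)
	}
}

func main() {
	// The same logical value in four different encodings.
	for _, v := range []any{42, 42.0, "42", " 42.0 "} {
		i, err := toInt(v)
		fmt.Println(i, err)
	}
}
```

The sketch rejects fractional values instead of truncating them; whether dataparse does the same is not stated in this documentation.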

Unmarshalling into structs

Another useful utility is unmarshalling data into structs, e.g. when reading CSVs.

Assuming a CSV file with the headers hostname,ip,logsize:

type myData struct {
    Hostname  string `dataparse:"hostname"`
    IPAddress net.IP `dataparse:"ip"`
    Logsize   int    `dataparse:"logsize"`
}

// If the CSV file has no headers they can also be passed like this:
// dataparse.From("...", dataparse.WithHeaders("hostname", "ip", "logsize"))
resCh, err := dataparse.From("/path/to/data.csv")
if err != nil {
    return err
}

// Each FromResult carries either a Map or an error.
// This loop assumes that From closes the channel once the input is exhausted.
for res := range resCh {
    if res.Err != nil {
        log.Printf("error from dataparse: %v", res.Err)
        continue
    }
    // Read the CSV data into a struct to utilize the discrete types.
    d := myData{}
    if err := res.Map.To(&d); err != nil {
        log.Printf("error reading data: %v | %#v", err, res.Map)
        continue
    }
    // handle d further
}

Documentation

Constants

This section is empty.

Variables

var (
	ErrValueIsNil        = errors.New("dataparse: value is nil")
	ErrValueIsNotPointer = errors.New("dataparse: value is not pointer")
	ErrValueCannotBeSet  = errors.New("dataparse: value cannot be set")
)
var BoolStringsFalse = []string{
	"",

	"0",
	"no",
	"n",
	"false",

	"na",
	"n/a",
}

BoolStringsFalse are strings that will be interpreted as false by Value.Bool.

var BoolStringsTrue = []string{
	"1",
	"yes",
	"y",
	"true",
}

BoolStringsTrue are strings that will be interpreted as true by Value.Bool.

var DefaultStringSeparators = []string{
	",",
	"\n",
}

DefaultStringSeparators is used when no separators are passed.

ParseTimeFormats are the various formats ParseTime and its consumers utilize to attempt to parse timestamps.

Functions

func AddCustomToFunc

func AddCustomToFunc(fn CustomToFunc)

func FilterSlice

func FilterSlice[V comparable](in []V, removees ...V) []V

FilterSlice returns a copy of the passed slice with the removees removed.
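The behavior described above can be sketched with a small generic function. filterSlice below is a hypothetical re-implementation for illustration only and is not taken from the dataparse source.

```go
package main

import "fmt"

// filterSlice is a hypothetical re-implementation for illustration.
// It returns a new slice and leaves the input untouched.
func filterSlice[V comparable](in []V, removees ...V) []V {
	// Build a set of the values to drop for constant-time lookups.
	drop := make(map[V]struct{}, len(removees))
	for _, r := range removees {
		drop[r] = struct{}{}
	}
	out := make([]V, 0, len(in))
	for _, v := range in {
		if _, found := drop[v]; !found {
			out = append(out, v)
		}
	}
	return out
}

func main() {
	in := []string{"a", "", "b", "n/a", "c"}
	out := filterSlice(in, "", "n/a")
	fmt.Println(out)     // [a b c]
	fmt.Println(len(in)) // 5
}
```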

func From

func From(path string, opts ...FromOption) (chan FromResult, error)

From returns maps parsed from a file.

From utilizes other functions for various data types like JSON and CSV.

From automatically unpacks the following archives based on their file extension:

  • gzip: .gz

func FromCsv

func FromCsv(reader io.Reader, opts ...FromOption) chan FromResult

FromCsv returns maps read from a CSV stream.

func FromJson

func FromJson(reader io.Reader, opts ...FromOption) chan FromResult

FromJson returns maps parsed from a stream which may consist of:

  1. A single JSON document
  2. A stream of JSON documents
  3. An array of JSON documents

func ListToAny

func ListToAny[V any](input []V) []any

ListToAny returns a copy of the passed list as []any.

func ListToMap

func ListToMap[K comparable](input []K) map[K]bool

ListToMap returns a map where each of the list members is a key and each value is true. Utilized to take a list of options and transform it into a map for lookups.
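The lookup use case mentioned above can be sketched as follows. listToMap is a hypothetical re-implementation based only on the documented signature and description.

```go
package main

import "fmt"

// listToMap is a hypothetical re-implementation for illustration.
func listToMap[K comparable](input []K) map[K]bool {
	out := make(map[K]bool, len(input))
	for _, k := range input {
		out[k] = true
	}
	return out
}

func main() {
	// Missing keys yield the zero value false, so a plain index
	// expression is enough to test membership.
	allowed := listToMap([]string{"csv", "json"})
	fmt.Println(allowed["csv"], allowed["xml"]) // true false
}
```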

func ParseTime

func ParseTime(s string) (time.Time, error)

ParseTime attempts to parse s as time utilizing all formats in ParseTimeFormats. An empty string will return a default time.Time.
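The approach of trying a list of formats can be sketched with the standard library alone. The contents of ParseTimeFormats are not shown in this documentation, so the layouts slice below is a hypothetical stand-in, and parseTime is not dataparse's implementation.

```go
package main

import (
	"fmt"
	"time"
)

// layouts is a hypothetical stand-in for ParseTimeFormats.
var layouts = []string{
	time.RFC3339,
	"2006-01-02 15:04:05",
	"2006-01-02",
}

// parseTime tries each layout in order and returns the first match.
func parseTime(s string) (time.Time, error) {
	if s == "" {
		// Mirror the documented behavior: empty input yields the zero time.
		return time.Time{}, nil
	}
	for _, layout := range layouts {
		if t, err := time.Parse(layout, s); err == nil {
			return t, nil
		}
	}
	return time.Time{}, fmt.Errorf("no known layout matches %q", s)
}

func main() {
	for _, s := range []string{"2024-05-24T10:00:00Z", "2024-05-24", "", "yesterday"} {
		t, err := parseTime(s)
		fmt.Println(t, err)
	}
}
```

The order of the layouts matters when more than one could match the same input, so more specific layouts are listed first.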

Types

type CustomToFunc

type CustomToFunc func(source Value, other any) (any, bool, error)

type ErrNoValidKey

type ErrNoValidKey struct {
	// contains filtered or unexported fields
}

func NewErrNoValidKey

func NewErrNoValidKey(keys []any) ErrNoValidKey

func (ErrNoValidKey) Error

func (e ErrNoValidKey) Error() string

func (ErrNoValidKey) Keys

func (e ErrNoValidKey) Keys() []any

type ErrUnhandled

type ErrUnhandled struct {
	Value any
}

ErrUnhandled is returned as an error if the underlying type is not handled by dataparse.

func NewErrUnhandled

func NewErrUnhandled(value any) ErrUnhandled

NewErrUnhandled returns an ErrUnhandled with the given value.

func (ErrUnhandled) Error

func (e ErrUnhandled) Error() string

type FromOption

type FromOption func(*fromConfig)

func WithChannelSize

func WithChannelSize(i int) FromOption

WithChannelSize defines the buffer size of channels for functions returning channels. Defaults to 100.

func WithHeaders

func WithHeaders(headers ...string) FromOption

WithHeaders defines which headers are expected when reading delimited formats like CSV. If no headers are set, the input is expected to contain a header row. Defaults to an empty []string.

func WithSeparator

func WithSeparator(sep string) FromOption

WithSeparator defines the separator to use when splitting strings or when reading formats with delimiters. Defaults to ",". This does not apply to unmarshalled values like JSON.

func WithTrimSpace

func WithTrimSpace(trim bool) FromOption

WithTrimSpace defines whether values are trimmed when parsing input. Defaults to true. This does not apply to unmarshalled values like JSON.
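The FromOption functions follow the functional options pattern. The sketch below shows the general idea with hypothetical names: the fields of the unexported fromConfig are not documented, so the config struct here is an assumption, with defaults taken from the descriptions above.

```go
package main

import "fmt"

// config is hypothetical. It only mirrors the documented defaults.
type config struct {
	channelSize int
	separator   string
	trimSpace   bool
}

// option mutates a config, analogous to FromOption.
type option func(*config)

func withChannelSize(n int) option {
	return func(c *config) { c.channelSize = n }
}

func withSeparator(sep string) option {
	return func(c *config) { c.separator = sep }
}

func withTrimSpace(trim bool) option {
	return func(c *config) { c.trimSpace = trim }
}

// newConfig starts from the defaults and applies the options in order.
func newConfig(opts ...option) config {
	c := config{channelSize: 100, separator: ",", trimSpace: true}
	for _, opt := range opts {
		opt(&c)
	}
	return c
}

func main() {
	fmt.Printf("%+v\n", newConfig())
	fmt.Printf("%+v\n", newConfig(withSeparator(";"), withTrimSpace(false)))
}
```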

type FromResult

type FromResult struct {
	Map Map
	Err error
}

type Fromer

type Fromer interface {
	From(Value) error
}

type Map

type Map map[any]any

Map is one of the two central types in dataparse. It is used to store and retrieve data taken from various sources.

func FromJsonSingle

func FromJsonSingle(reader io.Reader, opts ...FromOption) (Map, error)

FromJsonSingle is a wrapper around FromJson and returns the first map and error in the result set. It is only intended for instances where it is already known that the input can only contain a single document.

func FromKVString

func FromKVString(kv string, opts ...FromOption) (Map, error)

FromKVString returns a map based on the passed string.

Example:

input: a=1,b=test,c
output: {
	a: 1,
	b: "test",
	c: nil,
}
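A rough standalone sketch of this kind of parsing follows. parseKV is a hypothetical helper and is not dataparse's implementation. It keeps every value as a string, while the documentation does not say whether FromKVString converts values such as 1 to numbers.

```go
package main

import (
	"fmt"
	"strings"
)

// parseKV is a hypothetical helper. Keys without "=" map to nil.
func parseKV(kv, sep string) map[string]any {
	out := map[string]any{}
	for _, pair := range strings.Split(kv, sep) {
		pair = strings.TrimSpace(pair)
		if pair == "" {
			continue
		}
		key, value, found := strings.Cut(pair, "=")
		if !found {
			out[key] = nil
			continue
		}
		out[key] = value
	}
	return out
}

func main() {
	m := parseKV("a=1,b=test,c", ",")
	fmt.Printf("%#v\n", m)
}
```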

func FromSingle

func FromSingle(path string, opts ...FromOption) (Map, error)

FromSingle is a wrapper around From and returns the first map and error in the result set. It is only intended for instances where it is already known that the input can only contain a single document.

func NewMap

func NewMap(in any) (Map, error)

NewMap creates a map from the passed value. Valid values are maps and structs.

func (Map) Get

func (m Map) Get(keys ...any) (Value, error)

Get checks for Value entries for each of the given keys in order and returns the first. If no Value is found, a nil dataparse.Value and an error are returned.

Nested values can be accessed by providing the keys separated by dots.

Example:

m, err := NewMap(map[string]any{
	"a": map[string]any{
		"b": map[string]any{
			"c": "lorem ipsum",
		},
	},
})
if err != nil {
	return err
}
v, err := m.Get("a.b.c")
if err != nil {
	return err
}
fmt.Println(v.MustString())

Will print "lorem ipsum".

Note: Errors from attempting to convert Values to Maps are returned as stdlib multierrors and only when no match is found.

Note: The entire key including dots is tested as well and the value returned if it exists. Example:

m, err := NewMap(map[string]any{
	"a.b.c": "dolor sic amet",
})
if err != nil {
	return err
}
m2, err := NewMap(map[string]any{
	"a": map[string]any{
		"b": map[string]any{
			"c": "lorem ipsum",
		},
	},
	"a.b.c": "dolor sic amet",
})
if err != nil {
	return err
}
v, err := m.Get("a.b.c")
if err != nil {
	return err
}
v2, err := m2.Get("a.b.c")
if err != nil {
	return err
}
fmt.Println(v.MustString())
fmt.Println(v2.MustString())

Will print "dolor sic amet" and "lorem ipsum".
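The lookup order implied by this example (nested path first, then the literal key including dots) can be sketched as follows. lookup is a hypothetical helper working on map[string]any. The order is inferred from the example output and is not taken from the dataparse source.

```go
package main

import (
	"fmt"
	"strings"
)

// lookup is a hypothetical helper. It first walks the nested maps along
// the dot-separated path and falls back to the literal key.
func lookup(m map[string]any, key string) (any, bool) {
	var current any = m
	found := true
	for _, part := range strings.Split(key, ".") {
		next, ok := current.(map[string]any)
		if !ok {
			found = false
			break
		}
		if current, ok = next[part]; !ok {
			found = false
			break
		}
	}
	if found {
		return current, true
	}
	// Fallback: the entire key including dots.
	v, ok := m[key]
	return v, ok
}

func main() {
	flat := map[string]any{"a.b.c": "dolor sic amet"}
	nested := map[string]any{
		"a":     map[string]any{"b": map[string]any{"c": "lorem ipsum"}},
		"a.b.c": "dolor sic amet",
	}
	v1, _ := lookup(flat, "a.b.c")
	v2, _ := lookup(nested, "a.b.c")
	fmt.Println(v1) // dolor sic amet
	fmt.Println(v2) // lorem ipsum
}
```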

func (Map) Has

func (m Map) Has(keys ...any) bool

Has returns true if the map has an entry for any of the passed keys. The keys are checked in order.

func (Map) Int

func (m Map) Int(keys ...any) (int, error)

Int is a shortcut to retrieve a value and call a function on the resulting Value.

Calling this method is equivalent to:

val, err := m.Get("a")
if err != nil {
	// error handling
}
parsed, err := val.Int()
if err != nil {
	// error handling
}

func (Map) Int64

func (m Map) Int64(keys ...any) (int64, error)

Int64 is a shortcut to retrieve a value and call a function on the resulting Value.

Calling this method is equivalent to:

val, err := m.Get("a")
if err != nil {
	// error handling
}
parsed, err := val.Int64()
if err != nil {
	// error handling
}

func (Map) Map

func (m Map) Map(keys ...any) (Map, error)

Map works like Get but returns a Map.

func (Map) MustGet

func (m Map) MustGet(keys ...any) Value

MustGet is the error-ignoring version of Get.

func (Map) MustInt

func (m Map) MustInt(keys ...any) int

MustInt is the error-ignoring version of Int.

func (Map) MustInt64

func (m Map) MustInt64(keys ...any) int64

MustInt64 is the error-ignoring version of Int64.

func (Map) MustMap

func (m Map) MustMap(keys ...any) Map

MustMap is the error-ignoring version of Map.

func (Map) MustString

func (m Map) MustString(keys ...any) string

MustString is the error-ignoring version of String.

func (Map) MustTime

func (m Map) MustTime(keys ...any) time.Time

MustTime is the error-ignoring version of Time.

func (Map) MustUint

func (m Map) MustUint(keys ...any) uint

MustUint is the error-ignoring version of Uint.

func (Map) MustUint64

func (m Map) MustUint64(keys ...any) uint64

MustUint64 is the error-ignoring version of Uint64.

func (Map) String

func (m Map) String(keys ...any) (string, error)

String is a shortcut to retrieve a value and call a function on the resulting Value.

Calling this method is equivalent to:

val, err := m.Get("a")
if err != nil {
	// error handling
}
parsed, err := val.String()
if err != nil {
	// error handling
}

func (Map) Time

func (m Map) Time(keys ...any) (time.Time, error)

Time is a shortcut to retrieve a value and call a function on the resulting Value.

Calling this method is equivalent to:

val, err := m.Get("a")
if err != nil {
	// error handling
}
parsed, err := val.Time()
if err != nil {
	// error handling
}

func (Map) To

func (m Map) To(dest any, opts ...ToOption) error

To reads the map into a struct similar to json.Unmarshal, utilizing Value.To. The passed variable must be a pointer to a struct.

Multiple keys can be given, separated by a comma `,`:

type example struct {
	Field string `dataparse:"field1,field2"`
}

By default the field name is looked up if none of the keys in the dataparse tag are found.

By default it is an error if a struct field cannot be found in the Map. Fields without a dataparse tag can be skipped implicitly by passing the option WithSkipFieldsWithoutTag or explicitly by setting `dataparse:""`:

type example struct {
	Field string `dataparse:""`
}

Value.To uses the underlying field type to call the correct Value method to transform the source value into the targeted struct field type. E.g. if the field type is string and the map contains a number the field will contain a string with the number formatted in.
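The tag handling described above can be illustrated with a heavily reduced reflection sketch. fill is a hypothetical stand-in that only supports string and int fields and formats values with fmt.Sprint. It is not how Map.To is implemented and ignores all ToOptions.

```go
package main

import (
	"errors"
	"fmt"
	"reflect"
	"strconv"
	"strings"
)

// record is a hypothetical example type.
type record struct {
	Hostname string `dataparse:"hostname,host"`
	Logsize  int    `dataparse:"logsize"`
	Comment  string `dataparse:""` // explicitly skipped
}

// fill is a hypothetical, reduced stand-in for Map.To.
func fill(m map[string]any, dest any) error {
	rv := reflect.ValueOf(dest)
	if rv.Kind() != reflect.Ptr || rv.IsNil() || rv.Elem().Kind() != reflect.Struct {
		return errors.New("dest must be a non-nil pointer to a struct")
	}
	rv = rv.Elem()
	for i := 0; i < rv.NumField(); i++ {
		field := rv.Type().Field(i)
		tag, hasTag := field.Tag.Lookup("dataparse")
		if !field.IsExported() || (hasTag && tag == "") {
			continue
		}
		// Candidate keys: names from the tag first, then the field name.
		var keys []string
		if tag != "" {
			keys = strings.Split(tag, ",")
		}
		keys = append(keys, field.Name)
		var raw any
		found := false
		for _, k := range keys {
			if raw, found = m[k]; found {
				break
			}
		}
		if !found {
			return fmt.Errorf("no valid key for field %s (tried %v)", field.Name, keys)
		}
		// Convert via the textual form, chosen by the field's kind.
		text := fmt.Sprint(raw)
		switch field.Type.Kind() {
		case reflect.String:
			rv.Field(i).SetString(text)
		case reflect.Int:
			n, err := strconv.Atoi(text)
			if err != nil {
				return fmt.Errorf("field %s: %w", field.Name, err)
			}
			rv.Field(i).SetInt(int64(n))
		default:
			return fmt.Errorf("field %s: unhandled kind %s", field.Name, field.Type.Kind())
		}
	}
	return nil
}

func main() {
	var r record
	err := fill(map[string]any{"host": "web01", "logsize": "2048"}, &r)
	fmt.Printf("%+v %v\n", r, err)
}
```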

func (Map) Uint

func (m Map) Uint(keys ...any) (uint, error)

Uint is a shortcut to retrieve a value and call a function on the resulting Value.

Calling this method is equivalent to:

val, err := m.Get("a")
if err != nil {
	// error handling
}
parsed, err := val.Uint()
if err != nil {
	// error handling
}

func (Map) Uint64

func (m Map) Uint64(keys ...any) (uint64, error)

Uint64 is a shortcut to retrieve a value and call a function on the resulting Value.

Calling this method is equivalent to:

val, err := m.Get("a")
if err != nil {
	// error handling
}
parsed, err := val.Uint64()
if err != nil {
	// error handling
}

type ToOption

type ToOption func(*toConfig)

func WithCollectErrors

func WithCollectErrors() ToOption

WithCollectErrors configures Map.To to not return on the first encountered error when processing properties.

Instead occurring errors are collected with errors.Join and returned after processing all fields.

The default is false.
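The difference between returning on the first error and collecting errors with errors.Join can be sketched with a small standalone example. parseAll and its collect flag are hypothetical and only mirror the idea behind this option.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
)

// parseAll is a hypothetical helper. With collect set to false it stops
// at the first error, otherwise it processes every input and joins all
// errors it encountered.
func parseAll(inputs []string, collect bool) ([]int, error) {
	var errs []error
	out := make([]int, 0, len(inputs))
	for _, in := range inputs {
		n, err := strconv.Atoi(in)
		if err != nil {
			if !collect {
				return out, err
			}
			errs = append(errs, err)
			continue
		}
		out = append(out, n)
	}
	// errors.Join returns nil when errs is empty.
	return out, errors.Join(errs...)
}

func main() {
	inputs := []string{"1", "x", "3", "y"}
	_, err := parseAll(inputs, false)
	fmt.Println("fail fast:", err)
	_, err = parseAll(inputs, true)
	fmt.Println("collected:", err)
}
```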

func WithIgnoreNoValidKeyError

func WithIgnoreNoValidKeyError() ToOption

WithIgnoreNoValidKeyError configures Map.To to ignore errors when no field could be found by the configured tags.

This is primarily useful for inconsistent input or when using the same structure to parse data from different sources with different properties.

The default is false.

func WithLookupFieldName

func WithLookupFieldName(lookupFieldName bool) ToOption

WithLookupFieldName configures Map.To to try to look up the field name in addition to any names given in the dataparse tag.

If this option is set the field name will be looked up after any names in the dataparse tag.

The default is true.

func WithSkipFieldsWithoutTag

func WithSkipFieldsWithoutTag() ToOption

WithSkipFieldsWithoutTag configures Map.To to skip fields without explicit tags.

Note that this also skips fields without an explicit dataparse tag even if WithLookupFieldName is set.

The default is false.

type Value

type Value struct {
	Data any
}

Value is one of the two central types in dataparse. It is used to transform data between various representations.

func NewValue

func NewValue(data any) Value

NewValue returns the passed data as a Value.

func (Value) Bool

func (v Value) Bool() (bool, error)

Bool returns a boolean for the lowercased underlying value.

Strings in BoolStringsFalse and BoolStringsTrue are considered to be false and true respectively.

If neither applies strconv.ParseBool is utilized.
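The documented order (lowercase, check both lists, then fall back to strconv.ParseBool) can be sketched as below. parseBool and the two slices are hypothetical copies for illustration. The real method operates on a Value, not on a plain string.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Hypothetical copies of BoolStringsFalse and BoolStringsTrue.
var (
	falseStrings = []string{"", "0", "no", "n", "false", "na", "n/a"}
	trueStrings  = []string{"1", "yes", "y", "true"}
)

// parseBool is a hypothetical helper following the documented order.
func parseBool(s string) (bool, error) {
	lower := strings.ToLower(s)
	for _, f := range falseStrings {
		if lower == f {
			return false, nil
		}
	}
	for _, t := range trueStrings {
		if lower == t {
			return true, nil
		}
	}
	// Neither list matched, so defer to the standard library.
	return strconv.ParseBool(s)
}

func main() {
	for _, s := range []string{"YES", "N/A", "", "t", "maybe"} {
		b, err := parseBool(s)
		fmt.Printf("%q -> %v %v\n", s, b, err)
	}
}
```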

func (Value) Float32

func (v Value) Float32() (float32, error)

Float32 returns the underlying data as a float32.

func (Value) Float64

func (v Value) Float64() (float64, error)

Float64 returns the underlying data as a float64.

func (Value) IP

func (v Value) IP() (net.IP, error)

func (Value) Int

func (v Value) Int() (int, error)

Int returns the underlying data as an int.

func (Value) Int16

func (v Value) Int16() (int16, error)

Int16 returns the underlying data as an int16.

func (Value) Int32

func (v Value) Int32() (int32, error)

Int32 returns the underlying data as an int32.

func (Value) Int64

func (v Value) Int64() (int64, error)

Int64 returns the underlying data as an int64.

func (Value) Int8

func (v Value) Int8() (int8, error)

Int8 returns the underlying data as an int8.

func (Value) IsNil

func (v Value) IsNil() bool

IsNil returns true if the data Value stores is nil.

func (Value) List

func (v Value) List(seps ...string) ([]Value, error)

List returns the underlying data as a slice of Values.

The given separators are passed to ListString if the underlying value is a string.

Warning: This method is very simplistic and at the moment only returns a proper slice of values if the underlying data is a slice.

func (Value) ListString

func (v Value) ListString(seps ...string) ([]string, error)

ListString returns the underlying data as a slice of strings.

If the underlying data is a slice each member is transformed into a string using the Value.String method.

If the underlying data is a string the string is split using the passed separators. If no separators are passed DefaultStringSeparators is used.
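Splitting on several separators could look like the following sketch. splitAny is a hypothetical helper. How dataparse combines multiple separators is not documented here, so splitting on every separator in turn is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// defaultSeparators is a hypothetical copy of DefaultStringSeparators.
var defaultSeparators = []string{",", "\n"}

// splitAny is a hypothetical helper. It splits s on each separator in
// turn, so a boundary is created wherever any separator occurs.
func splitAny(s string, seps ...string) []string {
	if len(seps) == 0 {
		seps = defaultSeparators
	}
	parts := []string{s}
	for _, sep := range seps {
		var next []string
		for _, p := range parts {
			next = append(next, strings.Split(p, sep)...)
		}
		parts = next
	}
	return parts
}

func main() {
	fmt.Printf("%q\n", splitAny("a,b\nc"))
	fmt.Printf("%q\n", splitAny("a;b,c", ";"))
}
```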

func (Value) MAC

func (v Value) MAC() (net.HardwareAddr, error)

func (Value) Map

func (v Value) Map() (Map, error)

Map returns the underlying data as a Map.

func (Value) MustBool

func (v Value) MustBool() bool

func (Value) MustFloat32

func (v Value) MustFloat32() float32

MustFloat32 is the error-ignoring version of Float32.

func (Value) MustFloat64

func (v Value) MustFloat64() float64

MustFloat64 is the error-ignoring version of Float64.

func (Value) MustIP

func (v Value) MustIP() net.IP

func (Value) MustInt

func (v Value) MustInt() int

MustInt is the error-ignoring version of Int.

func (Value) MustInt16

func (v Value) MustInt16() int16

MustInt16 is the error-ignoring version of Int16.

func (Value) MustInt32

func (v Value) MustInt32() int32

MustInt32 is the error-ignoring version of Int32.

func (Value) MustInt64

func (v Value) MustInt64() int64

MustInt64 is the error-ignoring version of Int64.

func (Value) MustInt8

func (v Value) MustInt8() int8

MustInt8 is the error-ignoring version of Int8.

func (Value) MustList

func (v Value) MustList() []Value

MustList is the error-ignoring version of List.

func (Value) MustListString

func (v Value) MustListString(sep string) []string

MustListString is the error-ignoring version of ListString.

func (Value) MustMAC

func (v Value) MustMAC() net.HardwareAddr

func (Value) MustMap

func (v Value) MustMap() Map

MustMap is the error-ignoring version of Map.

func (Value) MustString

func (v Value) MustString() string

MustString is the error-ignoring version of String.

func (Value) MustTime

func (v Value) MustTime() time.Time

func (Value) MustUint

func (v Value) MustUint() uint

MustUint is the error-ignoring version of Uint.

func (Value) MustUint16

func (v Value) MustUint16() uint16

MustUint16 is the error-ignoring version of Uint16.

func (Value) MustUint32

func (v Value) MustUint32() uint32

MustUint32 is the error-ignoring version of Uint32.

func (Value) MustUint64

func (v Value) MustUint64() uint64

MustUint64 is the error-ignoring version of Uint64.

func (Value) MustUint8

func (v Value) MustUint8() uint8

MustUint8 is the error-ignoring version of Uint8.

func (Value) String

func (v Value) String() (string, error)

String returns the underlying value as a string.

Note that String never returns an error and is identical to MustString. String and MustString are only kept to follow the same conventions as all other transformation methods.

func (Value) Time

func (v Value) Time() (time.Time, error)

func (Value) To

func (v Value) To(other any, opts ...ToOption) error

To transforms the stored data into the target type and returns any occurring errors.

The passed value must be a pointer.

To utilizes the various transformation methods and returns their errors.

If the parameter satisfies the Fromer interface it will be used to set the value.

func (Value) TrimString

func (v Value) TrimString() string

TrimString returns the result of String with spaces trimmed.

func (Value) Uint

func (v Value) Uint() (uint, error)

Uint returns the underlying data as a uint.

func (Value) Uint16

func (v Value) Uint16() (uint16, error)

Uint16 returns the underlying data as a uint16.

func (Value) Uint32

func (v Value) Uint32() (uint32, error)

Uint32 returns the underlying data as a uint32.

func (Value) Uint64

func (v Value) Uint64() (uint64, error)

Uint64 returns the underlying data as a uint64.

func (Value) Uint8

func (v Value) Uint8() (uint8, error)

Uint8 returns the underlying data as a uint8.

Directories

Path Synopsis
cmd
customtoers
pq
