datautils

v0.0.0-...-bff1372 Latest
Published: May 1, 2021 License: MIT Imports: 10 Imported by: 0

README

datautils

The best toolbox for processing textual data.

Introduction

The Data Utilities are a collection of handy text manipulation tools. They are meant to make a data wrangler’s life on the command line easier.

Much of this functionality can be achieved with standard command-line tools (awk, sed, cut, sort, uniq, …), but doing so often becomes tedious. Zealots of the Unix philosophy will probably skip these tools and build a set of sophisticated aliases instead.

On the other hand, some of the tools fix actual problems. The tools use UTF-8 by default, so one does not have to deal with the quirks of sort and uniq with respect to non-ASCII input.

Installation

go get -v github.com/sfischer13/datautils/...

Tools

These tools are part of the collection:

  • count
  • norm
  • rows
  • text
  • trim

Usage

count

$ echo "a\na\na\nb\nb\nc"
a
a
a
b
b
c
$ echo "a\na\na\nb\nb\nc" | count --keys
3	a
2	b
1	c
$ echo "a\na\na\nb\nb\nc" | count --counts
1	c
2	b
3	a
$ echo "a\na\na\nb\nb\nc" | count --flip
a	3
b	2
c	1
$ echo "a\na\na\nb\nb\nc" | count --threshold 2
3	a
2	b
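
The counting shown above can be approximated in a few lines of Go. This is a minimal sketch and not the tool's actual implementation: `pair` and `countLines` are hypothetical names, and the key-sorted order is inferred from the `--keys` sample output.

```go
package main

import (
	"bufio"
	"fmt"
	"sort"
	"strings"
)

// pair is a hypothetical helper type: one distinct line and how often it occurred.
type pair struct {
	Key   string
	Count int64
}

// countLines tallies identical lines of input and returns them sorted by key.
// Go compares strings bytewise, which for valid UTF-8 equals code point order.
func countLines(input string) []pair {
	counts := make(map[string]int64)
	sc := bufio.NewScanner(strings.NewReader(input))
	for sc.Scan() {
		counts[sc.Text()]++
	}
	pairs := make([]pair, 0, len(counts))
	for k, v := range counts {
		pairs = append(pairs, pair{Key: k, Count: v})
	}
	sort.Slice(pairs, func(i, j int) bool { return pairs[i].Key < pairs[j].Key })
	return pairs
}

func main() {
	// Print each count and key separated by a tab, as in the sample output.
	for _, p := range countLines("a\na\na\nb\nb\nc\n") {
		fmt.Printf("%d\t%s\n", p.Count, p.Key)
	}
}
```

Sorting by count instead only requires a different comparison function in `sort.Slice`.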

norm

$ echo "¹²³" | norm --nfc
¹²³
$ echo "¹²³" | norm --nfkc
123

rows

$ echo "a\nb\nc\nd\ne" | rows --rows 2:4
b
c
d
$ echo "a\nb\nc\nd\ne" | rows --rows 1,5
a
e

text

$ echo abca | text chars
a
b
c
a
$ echo "This is a test." | text words
This
is
a
test.
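
The two splitting modes above map naturally onto Go's rune iteration and `strings.Fields`. The sketch below assumes that `chars` splits on Unicode code points and `words` splits on whitespace, which matches the sample output; the function names are hypothetical and the tool itself may behave differently on edge cases.

```go
package main

import (
	"fmt"
	"strings"
)

// chars splits s into its Unicode code points, one string per rune.
func chars(s string) []string {
	out := make([]string, 0, len(s))
	for _, r := range s {
		out = append(out, string(r))
	}
	return out
}

// words splits s on runs of whitespace; punctuation stays attached to the word.
func words(s string) []string {
	return strings.Fields(s)
}

func main() {
	fmt.Println(strings.Join(chars("abca"), "\n"))
	fmt.Println(strings.Join(words("This is a test."), "\n"))
}
```

Ranging over a string yields runes rather than bytes, so multi-byte characters such as "ä" stay intact.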

trim

$ echo "   abc" | trim --left
abc
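
Left-trimming as shown above can be sketched with the standard library. The helper name `trimLeft` is hypothetical, and treating every Unicode space character as trimmable is an assumption about the tool's behavior.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// trimLeft removes leading Unicode whitespace and leaves the rest untouched.
func trimLeft(s string) string {
	return strings.TrimLeftFunc(s, unicode.IsSpace)
}

func main() {
	// %q makes any remaining whitespace visible.
	fmt.Printf("%q\n", trimLeft("   abc"))
}
```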

Credits

This project is authored and maintained by Stefan Fischer.
The source code is available under the MIT License.
See LICENSE for further details.

Documentation

Overview

Package datautils is a collection of handy text manipulation tools.

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func CountTrue

func CountTrue(s []bool) int

CountTrue counts the number of true values in a slice.
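
A minimal sketch of the documented behavior, written as a local stand-in (`countTrue`) so that it runs without importing the package. The real implementation may differ in detail.

```go
package main

import "fmt"

// countTrue is a local stand-in with the same signature as CountTrue.
func countTrue(s []bool) int {
	n := 0
	for _, v := range s {
		if v {
			n++
		}
	}
	return n
}

func main() {
	fmt.Println(countTrue([]bool{true, false, true}))
}
```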

func DefaultApp

func DefaultApp(version, date, goVersion string) *cli.App

DefaultApp TODO

func Intervals2Func

func Intervals2Func(rs []Interval) func(int64) bool

Intervals2Func TODO

func IsNotWhite

func IsNotWhite(s string) bool

IsNotWhite TODO

func IsWhite

func IsWhite(s string) bool

IsWhite TODO

func Max

func Max(x, y int64) int64

Max returns the larger of x and y.

func ParseRFCDate

func ParseRFCDate(timeString string) time.Time

ParseRFCDate parses a date string in RFC3339 format. On failure, the current time will be returned.

func StdinSource

func StdinSource() <-chan string

StdinSource TODO

func StdoutSink

func StdoutSink(src <-chan string)

StdoutSink TODO

func TransformPipe

func TransformPipe(src <-chan string, transform func(string) string) <-chan string

TransformPipe TODO

func TransformStdin

func TransformStdin(c *cli.Context, transform func(string) string) error

TransformStdin applies a string transformation function to stdin.

Types

type Interval

type Interval struct {
	Low  int64
	High int64
	Step int64
}

Interval TODO

func NewInterval

func NewInterval(low, high, step string) Interval

NewInterval TODO

func String2Intervals

func String2Intervals(s string) []Interval

String2Intervals TODO

func (*Interval) Contains

func (r *Interval) Contains(i int64) bool

Contains TODO
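
Since Contains is undocumented, the following is only a guess at its semantics: bounds are assumed to be inclusive (the `rows --rows 2:4` example prints rows 2, 3 and 4), and Step is assumed to count from Low. The lower-case `interval` type is a local stand-in that mirrors the documented fields.

```go
package main

import "fmt"

// interval mirrors the fields of the documented Interval type.
type interval struct {
	Low  int64
	High int64
	Step int64
}

// contains reports whether i lies in [Low, High] and, for steps larger
// than 1, is a whole number of steps away from Low. These semantics are
// assumed, not taken from the package source.
func (r *interval) contains(i int64) bool {
	if i < r.Low || i > r.High {
		return false
	}
	if r.Step <= 1 {
		return true
	}
	return (i-r.Low)%r.Step == 0
}

func main() {
	r := interval{Low: 2, High: 4, Step: 1}
	for i := int64(1); i <= 5; i++ {
		fmt.Println(i, r.contains(i))
	}
}
```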

type List

type List struct {
	Elements []string
}

List TODO

func NewList

func NewList() List

NewList TODO

func (*List) Add

func (l *List) Add(element string)

Add TODO

func (*List) Clear

func (l *List) Clear()

Clear TODO

func (*List) IsEmpty

func (l *List) IsEmpty() bool

IsEmpty TODO

type Pair

type Pair struct {
	Key   string
	Value int64
}

Pair TODO

type Pairs

type Pairs []Pair

Pairs TODO

func (Pairs) ReverseKeys

func (ps Pairs) ReverseKeys()

ReverseKeys TODO

func (Pairs) ReverseValues

func (ps Pairs) ReverseValues()

ReverseValues TODO

func (Pairs) SortKeys

func (ps Pairs) SortKeys()

SortKeys TODO

func (Pairs) SortValues

func (ps Pairs) SortValues()

SortValues TODO

Directories

Path Synopsis
cmd
