distil-ingest

v0.0.0-...-2f503fb
Published: Nov 17, 2019 License: Apache-2.0

Dependencies

Requires the Go programming language binaries with the GOPATH environment variable specified and $GOPATH/bin in your PATH.
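
The prerequisites above can be checked before installing. What follows is a minimal sketch, assuming a POSIX-compatible shell; the helper names path_contains_dir and check_go_env are illustrative and not part of this repository.

```shell
#!/bin/sh
# Sketch: verify the Go prerequisites described above.
# path_contains_dir and check_go_env are hypothetical helpers,
# not part of distil-ingest.

# Succeeds if directory $1 is an entry in the colon-separated list $2.
path_contains_dir() {
    case ":$2:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Prints one line per problem found; prints nothing when all checks pass.
check_go_env() {
    command -v go >/dev/null 2>&1 || echo "go binary not found in PATH"
    if [ -z "${GOPATH:-}" ]; then
        echo "GOPATH is not set"
    elif ! path_contains_dir "$GOPATH/bin" "$PATH"; then
        echo "\$GOPATH/bin is not in PATH"
    fi
}
```

Calling check_go_env with no arguments reports any missing prerequisite. Note that the PATH comparison is literal, so an entry written with a trailing slash would not match.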

Installation

go get github.com/uncharted-distil/distil-ingest

Development

Clone the repository:

mkdir -p $GOPATH/src/github.com/uncharted-distil
cd $GOPATH/src/github.com/uncharted-distil
git clone git@github.com:uncharted-distil/distil-ingest.git

Install dependencies:

cd distil-ingest
make install

Build executable:

make build

Usage

The repository contains CLIs used to parse and ingest D3M OpenML datasets (those with a name beginning with o_) into Elasticsearch.

Merging training and target datasets:
Classifying merged datasets:
  • Update and ensure the arguments in ./classify_all.sh are correct
  • Run ./classify_all.sh
Ingesting merged and classified datasets:
  • Update and ensure the arguments in ./ingest_all.sh are correct
  • Run ./ingest_all.sh
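
Since ingestion operates on datasets that have already been classified, the scripts above are presumably run in order. The sketch below shows one way to stop at the first failing stage; run_in_order is a hypothetical helper, not something the repository provides.

```shell
#!/bin/sh
# Sketch: run pipeline scripts in order and stop at the first failure.
# run_in_order is a hypothetical helper, not part of distil-ingest.

# Executes each argument as a command; returns 1 as soon as one fails.
run_in_order() {
    for script in "$@"; do
        echo "running $script"
        if ! "$script"; then
            echo "failed: $script" >&2
            return 1
        fi
    done
}

# Example (assumes the arguments inside each script were already reviewed):
# run_in_order ./classify_all.sh ./ingest_all.sh
```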

Common Issues:

"EOF"
  • The Elasticsearch instance does not have http.compression enabled.
  • The mappings JSON argument is invalid, most likely because it is missing a closing bracket.
"No Elasticsearch node available"
  • You are accessing an Elasticsearch instance that requires a VPN and it is not on.
  • The Elasticsearch instance is temporarily down.
"dep: command not found":
  • Cause: $GOPATH/bin has not been added to your $PATH.
  • Solution: Add export PATH=$PATH:$GOPATH/bin to your .bash_profile or .bashrc.
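
Two of the issues listed above can be narrowed down with quick checks before re-running an ingest. The sketch below is illustrative only: the helper names are hypothetical, the brace count is a crude test that also counts braces inside string values, and the endpoint passed to es_reachable is whatever URL your own Elasticsearch instance uses.

```shell
#!/bin/sh
# Sketch: quick diagnostics for the issues listed above.
# All helpers are hypothetical and not part of distil-ingest.

# Prints how many times the single character $1 occurs in string $2.
count_char() {
    printf '%s' "$2" | tr -cd "$1" | wc -c | tr -d ' '
}

# Succeeds when '{' and '}' occur equally often in the JSON string $1.
# Crude check: braces inside string values are counted too.
braces_balanced() {
    [ "$(count_char '{' "$1")" -eq "$(count_char '}' "$1")" ]
}

# Succeeds if the endpoint $1 answers with a non-error HTTP status
# within 5 seconds. The URL is an assumption; substitute your own.
es_reachable() {
    curl --silent --fail --max-time 5 --output /dev/null "$1"
}
```

For example, braces_balanced "$MAPPINGS" || echo "mappings JSON has unbalanced braces" flags the most likely cause of an "EOF" error, where MAPPINGS is assumed to hold the mappings string you pass to the CLI.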
