hector
A machine learning library written in Go. It is forked from github.com/xlvector/hector, but has been rebuilt for clearer CLI commands and better algorithm scalability.
Supported Algorithms
- Logistic Regression
- Factorization Machine
- CART, Random Forest, Random Decision Tree, Gradient Boosting Decision Tree
- Neural Network
Hector supports a libsvm-like data format. The following is a sample dataset:
1 1:0.7 3:0.1 9:0.4
0 2:0.3 4:0.9 7:0.5
0 2:0.7 5:0.3
...
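Each row in the sample above is a label followed by sparse `index:value` pairs. As a minimal sketch of how such a row could be read, the following parser is illustrative only: the `Sample` type and `ParseLine` function are hypothetical names and are not part of hector's API, and integer labels are an assumption based on the sample rows.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Sample is a hypothetical container for one parsed row (not a hector type).
type Sample struct {
	Label    int
	Features map[int]float64 // sparse: feature index -> value
}

// ParseLine parses one "label index:value index:value ..." row.
func ParseLine(line string) (Sample, error) {
	fields := strings.Fields(line)
	if len(fields) == 0 {
		return Sample{}, fmt.Errorf("empty line")
	}
	label, err := strconv.Atoi(fields[0])
	if err != nil {
		return Sample{}, fmt.Errorf("bad label %q: %v", fields[0], err)
	}
	s := Sample{Label: label, Features: make(map[int]float64)}
	for _, f := range fields[1:] {
		parts := strings.SplitN(f, ":", 2)
		if len(parts) != 2 {
			return Sample{}, fmt.Errorf("bad feature %q: want index:value", f)
		}
		idx, err := strconv.Atoi(parts[0])
		if err != nil {
			return Sample{}, fmt.Errorf("bad index in %q: %v", f, err)
		}
		val, err := strconv.ParseFloat(parts[1], 64)
		if err != nil {
			return Sample{}, fmt.Errorf("bad value in %q: %v", f, err)
		}
		s.Features[idx] = val
	}
	return s, nil
}

func main() {
	s, err := ParseLine("1 1:0.7 3:0.1 9:0.4")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(s.Label, len(s.Features))
}
```

Storing features in a map keeps the representation sparse, which matches the format: indices that do not appear in a row are treated as zero.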
How to Run
Install
go get github.com/pantsing/hector
hector --help
The supported algorithms are:
- lr : logistic regression with SGD and L2 regularization.
- ftrl : FTRL-proximal logistic regression with L1 regularization. For more details, see the paper "Ad Click Prediction: a View from the Trenches".
- ep : Bayesian logistic regression with expectation propagation. For more details, see the paper "Web-Scale Bayesian Click-Through Rate Prediction for Sponsored Search Advertising in Microsoft’s Bing Search Engine".
- fm : factorization machine
- cart : classification tree
- cart-regression : regression tree
- rf : random forest
- rdt : random decision trees
- gbdt : gradient boosting decision tree
- linear-svm : linear svm with L1 regularization
- svm : svm optimized by SMO (currently, it is a linear svm)
- l1vm : vector machine with L1 regularization and an RBF kernel
- knn : k-nearest neighbor classification
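To make the `lr` entry above more concrete, here is a rough sketch of a single SGD update for logistic regression with L2 regularization on sparse features. It is not taken from hector's source: the function names, the learning rate, the regularization strength, and the choice to apply the L2 penalty only to features present in the current sample are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"math"
)

// Predict returns the estimated P(y=1 | x) for a sparse feature vector x.
func Predict(w, x map[int]float64) float64 {
	z := 0.0
	for i, v := range x {
		z += w[i] * v
	}
	return 1 / (1 + math.Exp(-z))
}

// SGDStep applies one update for a single sample (x, y), with y in {0, 1}:
//   w_i <- w_i - lr * ((p - y) * x_i + lambda * w_i)
// Only weights of features present in x are touched (an assumed simplification).
func SGDStep(w, x map[int]float64, y, lr, lambda float64) {
	p := Predict(w, x)
	for i, v := range x {
		w[i] -= lr * ((p-y)*v + lambda*w[i])
	}
}

func main() {
	// Two rows from the sample dataset format: one positive, one negative.
	pos := map[int]float64{1: 0.7, 3: 0.1, 9: 0.4}
	neg := map[int]float64{2: 0.3, 4: 0.9, 7: 0.5}

	w := make(map[int]float64)
	for epoch := 0; epoch < 100; epoch++ {
		SGDStep(w, pos, 1, 0.1, 0.01)
		SGDStep(w, neg, 0, 0.1, 0.01)
	}
	fmt.Println(Predict(w, pos) > 0.5, Predict(w, neg) < 0.5)
}
```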
Benchmark
Binary Classification
The following datasets are used in the benchmarks. You can find them in the UCI Machine Learning Repository:
- heart
- fourclass
I ran 5-fold cross-validation on each dataset and used AUC as the evaluation metric. The results are as follows:
| DataSet   | Method  | AUC    |
|-----------|---------|--------|
| heart     | FTRL-LR | 0.9109 |
| heart     | EP-LR   | 0.8982 |
| heart     | CART    | 0.8231 |
| heart     | RDT     | 0.9155 |
| heart     | RF      | 0.9019 |
| heart     | GBDT    | 0.9061 |
| fourclass | FTRL-LR | 0.8281 |
| fourclass | EP-LR   | 0.7986 |
| fourclass | CART    | 0.9832 |
| fourclass | RDT     | 0.9925 |
| fourclass | RF      | 0.9947 |
| fourclass | GBDT    | 0.9958 |
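For readers who want to reproduce the metric, AUC can be computed from predicted scores and binary labels with the rank-sum (Mann-Whitney) formulation. The sketch below is one standard way to do it and is not necessarily how hector computes it; the function name `AUC` and the choice to return 0.5 when only one class is present are assumptions.

```go
package main

import (
	"fmt"
	"sort"
)

// AUC computes the area under the ROC curve from scores and 0/1 labels
// using the rank-sum formulation. Tied scores share their average rank.
func AUC(scores []float64, labels []int) float64 {
	n := len(scores)
	idx := make([]int, n)
	for i := range idx {
		idx[i] = i
	}
	// Sort sample indices by ascending score.
	sort.Slice(idx, func(a, b int) bool { return scores[idx[a]] < scores[idx[b]] })

	pos, neg := 0, 0
	rankSum := 0.0 // sum of ranks of the positive samples
	for i := 0; i < n; {
		// Find the run [i, j) of samples with equal scores.
		j := i
		for j < n && scores[idx[j]] == scores[idx[i]] {
			j++
		}
		// The run occupies 1-based ranks i+1..j, so its average rank is (i+1+j)/2.
		avgRank := float64(i+1+j) / 2
		for k := i; k < j; k++ {
			if labels[idx[k]] == 1 {
				rankSum += avgRank
				pos++
			} else {
				neg++
			}
		}
		i = j
	}
	if pos == 0 || neg == 0 {
		return 0.5 // AUC is undefined with a single class; 0.5 is an assumed convention
	}
	return (rankSum - float64(pos)*float64(pos+1)/2) / (float64(pos) * float64(neg))
}

func main() {
	scores := []float64{0.1, 0.4, 0.35, 0.8}
	labels := []int{0, 0, 1, 1}
	fmt.Println(AUC(scores, labels))
}
```

In a 5-fold setup, this function would be applied to the held-out predictions of each fold and the five values averaged.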