Documentation ¶
Index ¶
- func Chebyshev(a, b Vector) (d float64)
- func CoordinatesSetEqual(X, Y Matrix) bool
- func Euclidean(a, b Vector) (d float64)
- func EuclideanSq(a, b Vector) (d float64)
- func InvPerm(x []int)
- func Manhattan(a, b Vector) (d float64)
- func Minkowski(a, b Vector, p float64) (d float64)
- func PermEqual(x, y []int) bool
- func Permute(x, p []int) (y []int)
- type ActiveSet
- type Classes
- type Clusterer
- type Distances
- type HClusters
- type HClustersGeneric
- type HClustersSingle
- type Heap
- type Hierarchizer
- type Hopach
- type Hopacher
- type KMeans
- type KMedians
- type KMedoids
- type KeyValue
- type Linkage
- type Linkages
- type Matrix
- type MetricOp
- type MixModel
- type Partitions
- type Segregator
- type Split
- type Splitter
- type Subclusterer
- type UnionFind
- type Vector
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func CoordinatesSetEqual ¶
CoordinatesSetEqual returns whether X and Y contain the same set of coordinates. Each row of X and Y is a tuple of coordinates.
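The comparison described above can be sketched as follows. This is a minimal illustration, not the package's implementation: the `Matrix` definition, the helper `rowSet`, and the lower-case `coordinatesSetEqual` are all assumptions made for the example. It also assumes that row order and duplicate rows are ignored, and that coordinates are compared by exact value.

```go
package main

import "fmt"

// Matrix is assumed here to be a slice of rows; the package's own
// definition is not shown in this documentation.
type Matrix [][]float64

// rowSet (hypothetical helper) collects the distinct rows of m,
// keyed by their printed form.
func rowSet(m Matrix) map[string]struct{} {
	set := make(map[string]struct{}, len(m))
	for _, row := range m {
		set[fmt.Sprint(row)] = struct{}{}
	}
	return set
}

// coordinatesSetEqual (hypothetical stand-in) reports whether x and y
// hold the same set of rows, ignoring order and duplicates.
func coordinatesSetEqual(x, y Matrix) bool {
	xs, ys := rowSet(x), rowSet(y)
	if len(xs) != len(ys) {
		return false
	}
	for k := range xs {
		if _, ok := ys[k]; !ok {
			return false
		}
	}
	return true
}

func main() {
	x := Matrix{{1, 2}, {3, 4}}
	y := Matrix{{3, 4}, {1, 2}}
	fmt.Println(coordinatesSetEqual(x, y)) // prints true
}
```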
func EuclideanSq ¶
EuclideanSq returns the squared Euclidean distance between points a and b.
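As a rough illustration of the quantity involved, the squared Euclidean distance is the sum of squared coordinate differences, without the final square root. The sketch below assumes `Vector` is a slice of `float64` and that both vectors have the same length; `euclideanSq` is a hypothetical stand-in, not the package's function.

```go
package main

import "fmt"

// Vector is assumed here to be a slice of float64; the package's own
// definition is not shown in this documentation.
type Vector []float64

// euclideanSq (hypothetical stand-in) sums the squared coordinate
// differences of a and b. It assumes len(a) == len(b).
func euclideanSq(a, b Vector) (d float64) {
	for i := range a {
		diff := a[i] - b[i]
		d += diff * diff
	}
	return d
}

func main() {
	a := Vector{0, 0}
	b := Vector{3, 4}
	fmt.Println(euclideanSq(a, b)) // prints 25
}
```

Skipping the square root preserves the ordering of distances, which is often all that a nearest-center comparison needs.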
Types ¶
type ActiveSet ¶
type ActiveSet struct {
// contains filtered or unexported fields
}
ActiveSet is an array-based singly linked list of integers. Nodes can be removed but cannot be added.
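One way such a structure could look is sketched below. The type `activeSet`, its fields, and its methods are assumptions for illustration, since the real fields are unexported and not documented here. Each slot of the `next` array stores the index of the following node, and `len(next)` serves as the end marker. Removal in this sketch scans for the predecessor, which a real implementation may avoid.

```go
package main

import "fmt"

// activeSet is a hypothetical array-based singly linked list over 0..n-1.
type activeSet struct {
	next  []int // next[i] is the node after i; len(next) marks the end
	first int   // first live node, or len(next) when the list is empty
}

// newActiveSet links the nodes 0, 1, ..., n-1 in order.
func newActiveSet(n int) *activeSet {
	next := make([]int, n)
	for i := range next {
		next[i] = i + 1
	}
	return &activeSet{next: next, first: 0}
}

// remove unlinks node i. It does nothing if i is out of range or was
// already removed.
func (s *activeSet) remove(i int) {
	n := len(s.next)
	if i < 0 || i >= n {
		return
	}
	if s.first == i {
		s.first = s.next[i]
		return
	}
	for j := s.first; j < n; j = s.next[j] {
		if s.next[j] == i {
			s.next[j] = s.next[i]
			return
		}
	}
}

// items returns the remaining nodes in list order.
func (s *activeSet) items() []int {
	var out []int
	for i := s.first; i < len(s.next); i = s.next[i] {
		out = append(out, i)
	}
	return out
}

func main() {
	s := newActiveSet(5)
	s.remove(2)
	s.remove(0)
	fmt.Println(s.items()) // prints [1 3 4]
}
```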
func NewActiveSet ¶
type Classes ¶
type Classes struct {
	// classification index
	Index Partitions
	K     int
	Cost  float64
}
func FindClusters ¶
FindClusters runs the clustering algorithm for the specified number of repeats.
FIXME There should be multiple instances of Clusterer
func (*Classes) Partitions ¶
Partitions returns an array of partition element arrays.
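The grouping this describes can be sketched as below: element indices are collected per partition label. The function `partitionElements` is hypothetical, and the sketch assumes labels run from 1 to k, which this documentation does not confirm for `Classes`. Labels outside that range are skipped.

```go
package main

import "fmt"

// partitionElements (hypothetical helper) groups element indices by
// their partition label. Labels are assumed to lie in 1..k.
func partitionElements(index []int, k int) [][]int {
	parts := make([][]int, k)
	for i, label := range index {
		if label < 1 || label > k {
			continue // ignore labels outside the assumed range
		}
		parts[label-1] = append(parts[label-1], i)
	}
	return parts
}

func main() {
	index := []int{1, 2, 1, 3, 2}
	fmt.Println(partitionElements(index, 3)) // prints [[0 2] [1 4] [3]]
}
```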
type Distances ¶
type Distances struct {
// contains filtered or unexported fields
}
func NewDistances ¶
type HClusters ¶
type HClusters struct {
	// Data points [m x n]
	X Matrix
	// Distance metric
	Metric MetricOp
	// number of clusters
	K int
	// linkage method
	Method int
	// Distances between data points [m x m]
	D *Distances
	// Step-wise dendrogram
	Dendrogram Linkages
	// cluster center assignment index
	Index []int
	// cost
	Cost float64
	// contains filtered or unexported fields
}
func (*HClusters) CutTreeHeight ¶
CutTreeHeight cuts the hierarchical cluster tree at the specified height.
type HClustersGeneric ¶
type HClustersGeneric struct {
	HClusters
	// contains filtered or unexported fields
}
HClustersGeneric implements generic hierarchical clustering using Müllner's algorithm.
func NewHClustersGeneric ¶
func NewHClustersGeneric(X Matrix, metric MetricOp, method int, d *Distances) *HClustersGeneric
func (*HClustersGeneric) Cluster ¶
func (c *HClustersGeneric) Cluster(k int) (classes *Classes)
type HClustersSingle ¶
type HClustersSingle struct {
	HClusters
	// contains filtered or unexported fields
}
HClustersSingle implements single-linkage hierarchical clustering using a minimum spanning tree (MST) algorithm.
func NewHClustersSingle ¶
func NewHClustersSingle(X Matrix, metric MetricOp, d *Distances) *HClustersSingle
func (*HClustersSingle) Cluster ¶
func (c *HClustersSingle) Cluster(k int) (classes *Classes)
func (*HClustersSingle) Hierarchize ¶
func (c *HClustersSingle) Hierarchize() Linkages
type Heap ¶
type Heap struct {
// contains filtered or unexported fields
}
Heap is a min binary heap: a complete binary tree with a partial order in which every node stores a value that is less than or equal to the values of its children.
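The heap property can be illustrated with a small slice-backed sketch. The type `minHeap` and its methods are assumptions for the example; the package's `Heap` has unexported fields and may store different element types. The children of the node at index i sit at indices 2i+1 and 2i+2.

```go
package main

import "fmt"

// minHeap is a hypothetical slice-backed min binary heap of float64 values.
type minHeap struct {
	items []float64
}

// push appends v and sifts it up until its parent is no larger.
func (h *minHeap) push(v float64) {
	h.items = append(h.items, v)
	i := len(h.items) - 1
	for i > 0 {
		parent := (i - 1) / 2
		if h.items[parent] <= h.items[i] {
			break
		}
		h.items[parent], h.items[i] = h.items[i], h.items[parent]
		i = parent
	}
}

// pop removes and returns the smallest value; ok is false if the heap is empty.
func (h *minHeap) pop() (v float64, ok bool) {
	n := len(h.items)
	if n == 0 {
		return 0, false
	}
	top := h.items[0]
	h.items[0] = h.items[n-1]
	h.items = h.items[:n-1]
	n--
	// Sift the moved value down until both children are no smaller.
	i := 0
	for {
		l, r, smallest := 2*i+1, 2*i+2, i
		if l < n && h.items[l] < h.items[smallest] {
			smallest = l
		}
		if r < n && h.items[r] < h.items[smallest] {
			smallest = r
		}
		if smallest == i {
			break
		}
		h.items[i], h.items[smallest] = h.items[smallest], h.items[i]
		i = smallest
	}
	return top, true
}

func main() {
	h := &minHeap{}
	for _, v := range []float64{5, 1, 4, 2} {
		h.push(v)
	}
	var sorted []float64
	for {
		v, ok := h.pop()
		if !ok {
			break
		}
		sorted = append(sorted, v)
	}
	fmt.Println(sorted) // prints [1 2 4 5]
}
```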
type Hierarchizer ¶
type Hierarchizer interface {
	// Hierarchize organizes data clusters in a dendrogram
	Hierarchize() Linkages
}
type Hopach ¶
type Hopach struct {
	Base Hopacher
	// contains filtered or unexported fields
}
func (*Hopach) Hierarchize ¶
type KMeans ¶
type KMeans struct {
	// Matrix of data points
	X Matrix
	// Distance metric
	Metric MetricOp
	// number of clusters
	K int
	// Distances between data points [m x m]
	D *Distances
	// Matrix of centroids
	Centers Matrix
	// Total distance of members to each centroid
	Errors Vector
	// cluster center assignment index
	Clusters []int
	// cost
	Cost float64
	// Maximum number of iterations
	MaxIter int
	// ordered index of elements subset
	Index []int
}
func (*KMeans) Cluster ¶
Cluster runs the k-means algorithm once with random initialization. It returns the classification information.
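A single k-means run of this kind might look like the sketch below. It uses the textbook Lloyd iteration with squared Euclidean distance and picks k distinct data points as initial centers. This is an assumption, since the documentation does not say how the package initializes or which metric it applies. The names `kmeansOnce`, `sqDist`, and the `seed` parameter are hypothetical.

```go
package main

import (
	"fmt"
	"math/rand"
)

// Matrix is assumed here to be a slice of rows of equal length.
type Matrix [][]float64

// sqDist returns the squared Euclidean distance between a and b.
func sqDist(a, b []float64) (d float64) {
	for i := range a {
		diff := a[i] - b[i]
		d += diff * diff
	}
	return d
}

// kmeansOnce (hypothetical) runs Lloyd's algorithm once, starting from k
// distinct data points chosen at random. It returns a 0-based cluster label
// per row and the sum of squared distances from the last assignment step.
func kmeansOnce(x Matrix, k, maxIter int, seed int64) (index []int, cost float64) {
	m := len(x)
	if m == 0 || k < 1 || k > m {
		return nil, 0
	}
	n := len(x[0])
	rng := rand.New(rand.NewSource(seed))
	centers := make(Matrix, k)
	for c, i := range rng.Perm(m)[:k] {
		centers[c] = append([]float64(nil), x[i]...)
	}
	index = make([]int, m)
	for iter := 0; iter < maxIter; iter++ {
		// Assignment step: attach each point to its nearest center.
		changed := false
		cost = 0
		for i, row := range x {
			best, bestD := 0, sqDist(row, centers[0])
			for c := 1; c < k; c++ {
				if d := sqDist(row, centers[c]); d < bestD {
					best, bestD = c, d
				}
			}
			if index[i] != best {
				changed = true
			}
			index[i] = best
			cost += bestD
		}
		if iter > 0 && !changed {
			break
		}
		// Update step: move each center to the mean of its members.
		sums := make(Matrix, k)
		counts := make([]int, k)
		for c := range sums {
			sums[c] = make([]float64, n)
		}
		for i, row := range x {
			counts[index[i]]++
			for j, v := range row {
				sums[index[i]][j] += v
			}
		}
		for c := range centers {
			if counts[c] == 0 {
				continue // keep the old center of an empty cluster
			}
			for j := range centers[c] {
				centers[c][j] = sums[c][j] / float64(counts[c])
			}
		}
	}
	return index, cost
}

func main() {
	x := Matrix{{0}, {0.1}, {0.2}, {10}, {10.1}, {10.2}}
	index, cost := kmeansOnce(x, 2, 100, 1)
	fmt.Println(index, cost)
}
```

Because one run can end in a local optimum, repeating it with different seeds and keeping the lowest cost is the usual remedy, which matches the repeats mentioned for FindClusters.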
func (*KMeans) Segregations ¶
type KMedians ¶
type KMedians struct {
KMeans
}
func NewKMedians ¶
type KMedoids ¶
type KMedoids struct {
KMeans
}
type Matrix ¶
func Segregations ¶
Segregations returns a matrix of distances between data points and clusters.
func SegregationsFromCenters ¶
SegregationsFromCenters returns a matrix of distances between data points and cluster centers.
type MixModel ¶
type MixModel struct {
	// Matrix of data points [m x n]
	X Matrix
	// number of clusters
	K int
	// Matrix of Gaussians [k x n]
	Means, Variances Matrix
	// Vector of mixing proportions [k]
	Mixings Vector
	// Negative likelihood to be minimized
	NLogLikelihood float64
	// Maximum number of iterations
	MaxIter int
	// contains filtered or unexported fields
}
type Partitions ¶
type Partitions []int
func (Partitions) Equal ¶
func (p Partitions) Equal(q Partitions) bool
Equal returns whether partitions p and q are equal.
func (Partitions) Len ¶
func (p Partitions) Len() int
func (Partitions) Less ¶
func (p Partitions) Less(i, j int) bool
func (Partitions) Reassign ¶
func (p Partitions) Reassign()
Reassign relabels the partitions so that they receive 1-based indices in order of first appearance.
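For illustration, relabeling in order of appearance could be done as in the sketch below. The free function `reassign` is a hypothetical stand-in for the method, operating on a locally declared `Partitions` type that mirrors the one documented above.

```go
package main

import "fmt"

// Partitions mirrors the documented type: one label per element.
type Partitions []int

// reassign (hypothetical stand-in) rewrites labels in place so that the
// first label seen becomes 1, the next new label becomes 2, and so on.
func reassign(p Partitions) {
	labels := make(map[int]int)
	for i, old := range p {
		label, ok := labels[old]
		if !ok {
			label = len(labels) + 1
			labels[old] = label
		}
		p[i] = label
	}
}

func main() {
	p := Partitions{7, 7, 2, 9, 2}
	reassign(p)
	fmt.Println(p) // prints [1 1 2 3 2]
}
```

Canonical labels of this kind make two partitionings comparable even when the clustering runs numbered their clusters differently.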
func (Partitions) Swap ¶
func (p Partitions) Swap(i, j int)
type Segregator ¶
type Split ¶
func SegregateByMeanSil ¶
func SegregateByMeanSil(seg Segregator, K int) (s Split)
TODO Should the silhouettes of singleton clusters be excluded from the average?
func SplitByMeanSplitSil ¶
K is the maximum number of clusters. L is the maximum number of children clusters for any cluster.
type Splitter ¶
type Splitter interface {
	Segregator
	Subset(index []int) Splitter
}
type Subclusterer ¶
type UnionFind ¶
type UnionFind struct {
	// parent index array
	Parent []int
	// contains filtered or unexported fields
}
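The role of a parent index array can be seen in the textbook disjoint-set sketch below. It is an illustration only: `unionFind`, `find`, and `union` are hypothetical names, and the package's `UnionFind` may work differently, for example by allocating a new node for each merged cluster.

```go
package main

import "fmt"

// unionFind is a hypothetical disjoint-set structure over 0..n-1.
type unionFind struct {
	parent []int // parent[i] == i marks i as the root of its set
}

// newUnionFind starts with every element in its own set.
func newUnionFind(n int) *unionFind {
	parent := make([]int, n)
	for i := range parent {
		parent[i] = i
	}
	return &unionFind{parent: parent}
}

// find returns the root of i's set, shortening the path as it goes
// (path halving).
func (u *unionFind) find(i int) int {
	for u.parent[i] != i {
		u.parent[i] = u.parent[u.parent[i]]
		i = u.parent[i]
	}
	return i
}

// union merges the sets that contain a and b.
func (u *unionFind) union(a, b int) {
	ra, rb := u.find(a), u.find(b)
	if ra != rb {
		u.parent[rb] = ra
	}
}

func main() {
	u := newUnionFind(5)
	u.union(0, 1)
	u.union(3, 4)
	fmt.Println(u.find(0) == u.find(1), u.find(1) == u.find(3)) // prints true false
}
```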
func NewUnionFind ¶
type Vector ¶
func Silhouettes ¶
Silhouettes returns a vector of silhouettes for data points. If S is a matrix of average distances from each element to the other elements in each cluster, then the returned values are conventionally considered silhouettes. If S is a matrix of distances from each element to each cluster center, then the returned values can be considered shadows.
TODO special case: silhouette is not defined for two singleton clusters
TODO faithful calculation of "shadow" as defined by Friedrich Leisch (average two nearest centroids for 'b')
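The usual silhouette formula can be sketched as follows, as a minimal illustration under stated assumptions. For element i, let a be its distance to its own cluster and b the smallest distance to any other cluster, so that the silhouette is (b - a) / max(a, b). The function `silhouettes`, its second argument `index` (0-based cluster labels), and the choice to return 0 when only one cluster exists or when both distances are 0 are assumptions; the package's actual signature is not shown in this documentation.

```go
package main

import (
	"fmt"
	"math"
)

// silhouettes (hypothetical) computes (b - a) / max(a, b) for each element.
// s[i][c] is the distance from element i to cluster c, and index[i] is the
// assumed 0-based label of the cluster that element i belongs to.
func silhouettes(s [][]float64, index []int) []float64 {
	out := make([]float64, len(s))
	for i, row := range s {
		a := row[index[i]]
		// b is the distance to the nearest cluster other than i's own.
		b := math.Inf(1)
		for c, d := range row {
			if c != index[i] && d < b {
				b = d
			}
		}
		if math.IsInf(b, 1) {
			continue // only one cluster: leave the value at 0
		}
		if m := math.Max(a, b); m > 0 {
			out[i] = (b - a) / m
		}
	}
	return out
}

func main() {
	s := [][]float64{
		{1, 4},
		{2, 4},
		{8, 2},
	}
	index := []int{0, 0, 1}
	fmt.Println(silhouettes(s, index)) // prints [0.75 0.5 0.75]
}
```

Values near 1 indicate an element that sits much closer to its own cluster than to any other, while values near 0 or below suggest it lies between clusters or is misassigned.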