gnlp

package module
v0.0.0-...-cea4454 Latest
Published: Aug 26, 2023 License: MIT Imports: 3 Imported by: 0

README

gnlp

Package gnlp provides a generic Natural Language Processing toolkit written in Go.

Documentation

Overview

Package gnlp provides a generic Natural Language Processing toolkit.

Constants

This section is empty.

Variables

This section is empty.

Functions

func BLEU

func BLEU[T comparable](candidate []T, references [][]T) float64

BLEU computes a sentence-level BLEU score.

Papineni, Kishore, Salim Roukos, Todd Ward, and Wei-Jing Zhu. 2002. "BLEU: a method for automatic evaluation of machine translation." In Proceedings of ACL. https://www.aclweb.org/anthology/P02-1040.pdf

The candidate parameter is a sequence of tokens, and the references parameter is a set of token sequences. This function returns zero if there is no reference.
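The core ingredient of BLEU in the cited paper is the modified (clipped) n-gram precision. The following is a minimal sketch of that idea for unigrams only, not the package's implementation. The helper name clippedUnigramPrecision is hypothetical, and a full BLEU score would also combine several n-gram orders and apply a brevity penalty.

```go
package main

import "fmt"

// clippedUnigramPrecision is a hypothetical helper, not part of gnlp.
// Each candidate token is credited at most as many times as it occurs in
// any single reference; the credited count is divided by the candidate length.
func clippedUnigramPrecision[T comparable](candidate []T, references [][]T) float64 {
	if len(candidate) == 0 || len(references) == 0 {
		return 0
	}
	// maxRef[t] is the largest count of t in any one reference.
	maxRef := map[T]int{}
	for _, ref := range references {
		counts := map[T]int{}
		for _, t := range ref {
			counts[t]++
		}
		for t, c := range counts {
			if c > maxRef[t] {
				maxRef[t] = c
			}
		}
	}
	candCounts := map[T]int{}
	for _, t := range candidate {
		candCounts[t]++
	}
	matched := 0
	for t, c := range candCounts {
		if c > maxRef[t] {
			c = maxRef[t] // clip to the reference maximum
		}
		matched += c
	}
	return float64(matched) / float64(len(candidate))
}

func main() {
	// Example from the BLEU paper: only 2 of the 7 "the" tokens survive clipping.
	candidate := []string{"the", "the", "the", "the", "the", "the", "the"}
	references := [][]string{
		{"the", "cat", "is", "on", "the", "mat"},
		{"there", "is", "a", "cat", "on", "the", "mat"},
	}
	fmt.Printf("%.4f\n", clippedUnigramPrecision(candidate, references))
}
```

Without clipping, the candidate above would score a perfect unigram precision, which is the failure mode the clipped count is meant to prevent.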

func CorpusBLEU

func CorpusBLEU[T comparable](candidateList [][]T, referencesList [][][]T) float64

CorpusBLEU computes a corpus-level BLEU score.

Papineni, Kishore, Salim Roukos, Todd Ward, and Wei-Jing Zhu. 2002. "BLEU: a method for automatic evaluation of machine translation." In Proceedings of ACL. https://www.aclweb.org/anthology/P02-1040.pdf

The candidate list and the references list must have the same length; otherwise the function returns zero.

Note that this function does not return the average of sentence-level BLEU scores. It computes the micro-average of the precisions, as in the original BLEU paper.
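The difference between averaging per-sentence precisions and micro-averaging can be seen with a small numeric sketch. The sentenceStats type and both helper functions are hypothetical and only illustrate the arithmetic; they are not taken from the package.

```go
package main

import "fmt"

// sentenceStats holds hypothetical per-sentence counts: how many candidate
// n-grams matched a reference (after clipping) and how many there were.
type sentenceStats struct {
	matched, total int
}

// macroPrecision averages the per-sentence precisions.
func macroPrecision(stats []sentenceStats) float64 {
	if len(stats) == 0 {
		return 0
	}
	sum := 0.0
	for _, s := range stats {
		if s.total > 0 {
			sum += float64(s.matched) / float64(s.total)
		}
	}
	return sum / float64(len(stats))
}

// microPrecision pools the counts over the whole corpus before dividing.
func microPrecision(stats []sentenceStats) float64 {
	matched, total := 0, 0
	for _, s := range stats {
		matched += s.matched
		total += s.total
	}
	if total == 0 {
		return 0
	}
	return float64(matched) / float64(total)
}

func main() {
	stats := []sentenceStats{{matched: 1, total: 2}, {matched: 9, total: 10}}
	fmt.Printf("macro=%.4f micro=%.4f\n", macroPrecision(stats), microPrecision(stats))
}
```

With these counts the macro average is (0.5 + 0.9) / 2 = 0.7, while the micro average is 10 / 12, so longer sentences carry more weight in the corpus-level figure.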

func DamerauLevenshteinDistance

func DamerauLevenshteinDistance[T comparable](a, b []T) int64

DamerauLevenshteinDistance computes Damerau-Levenshtein distance between two sequences.
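The documentation does not say which variant of the distance is implemented. As an illustration only, the sketch below computes the restricted variant, often called optimal string alignment, which adds adjacent transpositions to the Levenshtein edit operations. The function name osaDistance is hypothetical, and the package may use the unrestricted variant instead.

```go
package main

import "fmt"

// osaDistance is a hypothetical helper computing the optimal string
// alignment distance: insertions, deletions, substitutions, and swaps of
// two adjacent items, where no item is edited more than once.
func osaDistance[T comparable](a, b []T) int64 {
	n, m := len(a), len(b)
	d := make([][]int64, n+1)
	for i := range d {
		d[i] = make([]int64, m+1)
		d[i][0] = int64(i)
	}
	for j := 0; j <= m; j++ {
		d[0][j] = int64(j)
	}
	for i := 1; i <= n; i++ {
		for j := 1; j <= m; j++ {
			cost := int64(1)
			if a[i-1] == b[j-1] {
				cost = 0
			}
			best := d[i-1][j] + 1 // deletion
			if v := d[i][j-1] + 1; v < best {
				best = v // insertion
			}
			if v := d[i-1][j-1] + cost; v < best {
				best = v // substitution or match
			}
			if i > 1 && j > 1 && a[i-1] == b[j-2] && a[i-2] == b[j-1] {
				if v := d[i-2][j-2] + 1; v < best {
					best = v // adjacent transposition
				}
			}
			d[i][j] = best
		}
	}
	return d[n][m]
}

func main() {
	fmt.Println(osaDistance([]rune("ab"), []rune("ba")))  // 1: one transposition
	fmt.Println(osaDistance([]rune("ca"), []rune("abc"))) // 3 under this variant
}
```

The second input pair is the usual case where the two variants disagree: the unrestricted Damerau-Levenshtein distance for it is 2.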

func DiceIndex

func DiceIndex[T comparable](a, b []T) float64

DiceIndex computes Sørensen-Dice index (Sørensen-Dice similarity coefficient) of two sets.
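As a rough sketch of the definition, the coefficient is twice the size of the intersection divided by the sum of the two set sizes. The helper names are hypothetical, and returning 0 for two empty inputs is an assumption, since the documentation does not specify that case.

```go
package main

import "fmt"

// uniq collapses a slice into a set (hypothetical helper).
func uniq[T comparable](xs []T) map[T]struct{} {
	set := make(map[T]struct{}, len(xs))
	for _, x := range xs {
		set[x] = struct{}{}
	}
	return set
}

// dice computes 2*|A∩B| / (|A| + |B|) over the distinct items of a and b.
func dice[T comparable](a, b []T) float64 {
	sa, sb := uniq(a), uniq(b)
	if len(sa)+len(sb) == 0 {
		return 0 // assumption: two empty sets score 0
	}
	common := 0
	for x := range sa {
		if _, ok := sb[x]; ok {
			common++
		}
	}
	return 2 * float64(common) / float64(len(sa)+len(sb))
}

func main() {
	// Two shared items, three items on each side: 2*2 / (3+3).
	fmt.Printf("%.4f\n", dice([]int{1, 2, 3}, []int{2, 3, 4}))
}
```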

func HammingDistance

func HammingDistance[T comparable](a, b []T) (int64, error)

HammingDistance computes Hamming distance between two sequences of the same length.
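A minimal sketch of the computation follows. The signature mirrors the one above, but the function name and the error message are hypothetical; the assumption is that the error return covers inputs of different lengths.

```go
package main

import (
	"errors"
	"fmt"
)

// hamming counts the positions at which a and b differ.
// Assumption: unequal lengths are reported through the error value.
func hamming[T comparable](a, b []T) (int64, error) {
	if len(a) != len(b) {
		return 0, errors.New("sequences must have the same length")
	}
	var dist int64
	for i := range a {
		if a[i] != b[i] {
			dist++
		}
	}
	return dist, nil
}

func main() {
	d, err := hamming([]rune("karolin"), []rune("kathrin"))
	fmt.Println(d, err) // 3 <nil>

	_, err = hamming([]rune("abc"), []rune("ab"))
	fmt.Println(err != nil) // true
}
```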

func JaccardIndex

func JaccardIndex[T comparable](a, b []T) float64

JaccardIndex computes Jaccard index (Jaccard similarity coefficient) of two sets.
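By definition, the index is the size of the intersection divided by the size of the union. The sketch below illustrates this with a hypothetical helper; the value returned for two empty inputs is an assumption.

```go
package main

import "fmt"

// jaccard computes |A∩B| / |A∪B| over the distinct items of a and b.
func jaccard[T comparable](a, b []T) float64 {
	inA := map[T]bool{}
	for _, x := range a {
		inA[x] = true
	}
	inB := map[T]bool{}
	for _, x := range b {
		inB[x] = true
	}
	common := 0
	for x := range inA {
		if inB[x] {
			common++
		}
	}
	union := len(inA) + len(inB) - common
	if union == 0 {
		return 0 // assumption: two empty sets score 0
	}
	return float64(common) / float64(union)
}

func main() {
	// Intersection {2, 3}, union {1, 2, 3, 4}.
	fmt.Println(jaccard([]int{1, 2, 3}, []int{2, 3, 4})) // 0.5
}
```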

func JaroSimilarity

func JaroSimilarity[T comparable](a, b []T) float64

JaroSimilarity computes Jaro similarity between two sequences.
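For reference, here is a sketch of the textbook Jaro computation: items match when they are equal and no further apart than half the longer length minus one, and half the number of out-of-order matches counts as transpositions. The function name is hypothetical, and the handling of empty inputs is an assumption rather than the package's documented behavior.

```go
package main

import "fmt"

// jaro is a hypothetical helper implementing the textbook Jaro similarity.
func jaro[T comparable](a, b []T) float64 {
	if len(a) == 0 && len(b) == 0 {
		return 1 // assumption: two empty sequences are identical
	}
	if len(a) == 0 || len(b) == 0 {
		return 0
	}
	longer := len(a)
	if len(b) > longer {
		longer = len(b)
	}
	window := longer/2 - 1
	if window < 0 {
		window = 0
	}
	aMatched := make([]bool, len(a))
	bMatched := make([]bool, len(b))
	matches := 0
	for i := range a {
		lo, hi := i-window, i+window+1
		if lo < 0 {
			lo = 0
		}
		if hi > len(b) {
			hi = len(b)
		}
		for j := lo; j < hi; j++ {
			if !bMatched[j] && a[i] == b[j] {
				aMatched[i], bMatched[j] = true, true
				matches++
				break
			}
		}
	}
	if matches == 0 {
		return 0
	}
	// Walk both match lists in order and count positions that disagree.
	outOfOrder, j := 0, 0
	for i := range a {
		if !aMatched[i] {
			continue
		}
		for !bMatched[j] {
			j++
		}
		if a[i] != b[j] {
			outOfOrder++
		}
		j++
	}
	m := float64(matches)
	t := float64(outOfOrder) / 2
	return (m/float64(len(a)) + m/float64(len(b)) + (m-t)/m) / 3
}

func main() {
	fmt.Printf("%.4f\n", jaro([]rune("MARTHA"), []rune("MARHTA")))
	fmt.Printf("%.4f\n", jaro([]rune("DIXON"), []rune("DICKSONX")))
}
```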

func JaroWinklerSimilarity

func JaroWinklerSimilarity[T comparable](a, b []T) float64

JaroWinklerSimilarity computes Jaro-Winkler similarity between two sequences. The scaling factor is set to 0.1.
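The Winkler adjustment raises a Jaro score according to the length of the shared prefix. The sketch below applies it to an already computed Jaro score; the function name is hypothetical, and the prefix cap of 4 follows the usual definition and is an assumption about this package. Only the scaling factor of 0.1 comes from the documentation above.

```go
package main

import "fmt"

// winklerBoost is a hypothetical helper that applies the Winkler prefix
// adjustment to a precomputed Jaro score.
func winklerBoost[T comparable](a, b []T, jaroScore float64) float64 {
	const scaling = 0.1 // scaling factor stated in the documentation
	const maxPrefix = 4 // assumption: the customary cap on the prefix length
	prefix := 0
	for prefix < len(a) && prefix < len(b) && prefix < maxPrefix && a[prefix] == b[prefix] {
		prefix++
	}
	return jaroScore + float64(prefix)*scaling*(1-jaroScore)
}

func main() {
	// The Jaro similarity of MARTHA and MARHTA is 17/18; the shared prefix is "MAR".
	fmt.Printf("%.4f\n", winklerBoost([]rune("MARTHA"), []rune("MARHTA"), 17.0/18.0))
}
```

Because the boost is proportional to 1 minus the Jaro score, identical sequences stay at 1 and pairs without a common prefix keep their Jaro score unchanged.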

func LevenshteinDistance

func LevenshteinDistance[T comparable](a, b []T) int64

LevenshteinDistance computes Levenshtein distance between two sequences.
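A common way to compute this distance is dynamic programming over two rows, sketched below. The function name is hypothetical and the package's own implementation may differ. Because the items only need to be comparable, the same code works on characters and on word tokens.

```go
package main

import "fmt"

// levenshtein is a hypothetical helper: the minimum number of insertions,
// deletions, and substitutions needed to turn a into b.
func levenshtein[T comparable](a, b []T) int64 {
	prev := make([]int64, len(b)+1)
	curr := make([]int64, len(b)+1)
	for j := range prev {
		prev[j] = int64(j)
	}
	for i := 1; i <= len(a); i++ {
		curr[0] = int64(i)
		for j := 1; j <= len(b); j++ {
			cost := int64(1)
			if a[i-1] == b[j-1] {
				cost = 0
			}
			best := prev[j] + 1 // deletion
			if v := curr[j-1] + 1; v < best {
				best = v // insertion
			}
			if v := prev[j-1] + cost; v < best {
				best = v // substitution or match
			}
			curr[j] = best
		}
		prev, curr = curr, prev
	}
	return prev[len(b)]
}

func main() {
	fmt.Println(levenshtein([]rune("kitten"), []rune("sitting"))) // 3

	// Token-level distance: one word substituted.
	x := []string{"the", "cat", "sat"}
	y := []string{"the", "dog", "sat"}
	fmt.Println(levenshtein(x, y)) // 1
}
```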

func LongestCommonSubsequences

func LongestCommonSubsequences[T comparable](a, b []T) [][]T

LongestCommonSubsequences returns the longest subsequences common to the two given sequences. It returns all valid subsequences.

This function always returns a slice containing at least one sequence. It returns [][]T{{}} if there is no common subsequence.
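One way to enumerate every longest common subsequence is to fill the usual length table and then backtrack along all optimal paths, removing duplicates. The sketch below follows that approach with hypothetical helper names; it is not optimized and is not necessarily how the package does it.

```go
package main

import "fmt"

// sameSeq reports whether x and y hold the same items in the same order.
func sameSeq[T comparable](x, y []T) bool {
	if len(x) != len(y) {
		return false
	}
	for i := range x {
		if x[i] != y[i] {
			return false
		}
	}
	return true
}

// dedup drops repeated sequences while keeping first-seen order.
func dedup[T comparable](seqs [][]T) [][]T {
	out := [][]T{}
	for _, s := range seqs {
		seen := false
		for _, o := range out {
			if sameSeq(s, o) {
				seen = true
				break
			}
		}
		if !seen {
			out = append(out, s)
		}
	}
	return out
}

// allLCS is a hypothetical helper returning every distinct longest common
// subsequence of a and b; with nothing in common it returns [][]T{{}}.
func allLCS[T comparable](a, b []T) [][]T {
	// dp[i][j] is the LCS length of a[:i] and b[:j].
	dp := make([][]int, len(a)+1)
	for i := range dp {
		dp[i] = make([]int, len(b)+1)
	}
	for i := 1; i <= len(a); i++ {
		for j := 1; j <= len(b); j++ {
			switch {
			case a[i-1] == b[j-1]:
				dp[i][j] = dp[i-1][j-1] + 1
			case dp[i-1][j] >= dp[i][j-1]:
				dp[i][j] = dp[i-1][j]
			default:
				dp[i][j] = dp[i][j-1]
			}
		}
	}
	var walk func(i, j int) [][]T
	walk = func(i, j int) [][]T {
		if i == 0 || j == 0 {
			return [][]T{{}}
		}
		out := [][]T{}
		if a[i-1] == b[j-1] {
			for _, s := range walk(i-1, j-1) {
				out = append(out, append(append([]T{}, s...), a[i-1]))
			}
			return out
		}
		if dp[i-1][j] >= dp[i][j-1] {
			out = append(out, walk(i-1, j)...)
		}
		if dp[i][j-1] >= dp[i-1][j] {
			out = append(out, walk(i, j-1)...)
		}
		return dedup(out)
	}
	return walk(len(a), len(b))
}

func main() {
	for _, s := range allLCS([]rune("ABCBDAB"), []rune("BDCABA")) {
		fmt.Println(string(s))
	}
}
```

The backtracking is exponential in the worst case, which is inherent to listing all solutions: the number of distinct longest common subsequences can itself grow exponentially.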

func NGrams

func NGrams[T any](seq []T, n int) (ngram [][]T)

NGrams returns all contiguous subsequences of n items (n-grams) from the given sequence.
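A sliding window is enough to produce n-grams, as in the sketch below. The function name is hypothetical, and returning nil when n is not positive or exceeds the sequence length is an assumption about edge cases the documentation does not cover.

```go
package main

import "fmt"

// ngrams is a hypothetical helper returning every window of n consecutive
// items. The windows share the backing array of seq rather than copying it.
func ngrams[T any](seq []T, n int) [][]T {
	if n <= 0 || n > len(seq) {
		return nil // assumption: no n-grams for out-of-range n
	}
	out := make([][]T, 0, len(seq)-n+1)
	for i := 0; i+n <= len(seq); i++ {
		out = append(out, seq[i:i+n])
	}
	return out
}

func main() {
	tokens := []string{"to", "be", "or", "not"}
	fmt.Println(ngrams(tokens, 2)) // [[to be] [be or] [or not]]
}
```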

func ROUGEL

func ROUGEL[T comparable](candidate []T, references [][]T) (recall, precision float64)

ROUGEL computes a ROUGE-L score, which is a text summarization metric based on the longest common subsequence.

Chin-Yew Lin. 2004. "ROUGE: A Package for Automatic Evaluation of Summaries." In Proceedings of ACL. https://aclanthology.org/W04-1013.pdf
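For a single reference, ROUGE-L recall and precision are the length of the longest common subsequence divided by the reference length and the candidate length respectively. The sketch below covers only that single-reference case with hypothetical helper names; how the package combines several references is not stated here and is not modeled.

```go
package main

import "fmt"

// lcsLength returns the length of the longest common subsequence of a and b.
func lcsLength[T comparable](a, b []T) int {
	prev := make([]int, len(b)+1)
	curr := make([]int, len(b)+1)
	for i := 1; i <= len(a); i++ {
		for j := 1; j <= len(b); j++ {
			switch {
			case a[i-1] == b[j-1]:
				curr[j] = prev[j-1] + 1
			case prev[j] >= curr[j-1]:
				curr[j] = prev[j]
			default:
				curr[j] = curr[j-1]
			}
		}
		prev, curr = curr, prev
	}
	return prev[len(b)]
}

// rougeLSingle is a hypothetical single-reference version of ROUGE-L.
func rougeLSingle[T comparable](candidate, reference []T) (recall, precision float64) {
	if len(candidate) == 0 || len(reference) == 0 {
		return 0, 0
	}
	lcs := float64(lcsLength(candidate, reference))
	return lcs / float64(len(reference)), lcs / float64(len(candidate))
}

func main() {
	candidate := []string{"the", "cat", "sat"}
	reference := []string{"the", "cat", "sat", "on", "the", "mat"}
	r, p := rougeLSingle(candidate, reference)
	fmt.Println(r, p) // 0.5 1
}
```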

func ROUGEN

func ROUGEN[T comparable](candidate []T, references [][]T, n int) float64

ROUGEN computes a ROUGE-N score, which is a recall-oriented text summarization metric.

Chin-Yew Lin. 2004. "ROUGE: A Package for Automatic Evaluation of Summaries." In Proceedings of ACL. https://aclanthology.org/W04-1013.pdf

func SimpsonIndex

func SimpsonIndex[T comparable](a, b []T) float64

SimpsonIndex computes Szymkiewicz–Simpson index (Szymkiewicz–Simpson similarity coefficient) of two sets.
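This coefficient, also known as the overlap coefficient, divides the size of the intersection by the size of the smaller set. The sketch below illustrates the definition with a hypothetical helper; returning 0 when either input is empty is an assumption.

```go
package main

import "fmt"

// simpson computes |A∩B| / min(|A|, |B|) over the distinct items of a and b.
func simpson[T comparable](a, b []T) float64 {
	setA := map[T]struct{}{}
	for _, x := range a {
		setA[x] = struct{}{}
	}
	setB := map[T]struct{}{}
	for _, x := range b {
		setB[x] = struct{}{}
	}
	smaller := len(setA)
	if len(setB) < smaller {
		smaller = len(setB)
	}
	if smaller == 0 {
		return 0 // assumption: an empty set overlaps with nothing
	}
	common := 0
	for x := range setA {
		if _, ok := setB[x]; ok {
			common++
		}
	}
	return float64(common) / float64(smaller)
}

func main() {
	// {2, 3} is a subset of {1, 2, 3}, so the overlap is complete.
	fmt.Println(simpson([]int{1, 2, 3}, []int{2, 3})) // 1
}
```

Unlike the Jaccard and Dice coefficients, this measure reaches 1 whenever one set is contained in the other, regardless of how much larger the other set is.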

Types

This section is empty.
