charset

package
v0.0.0-...-7534ea8
Published: Jul 2, 2022 | License: MIT | Imports: 2 | Imported by: 5

Documentation

Overview

Package charset provides data types and functions for managing sets of characters.

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type Set

type Set struct {
	// Bits is the bit array for indicating which chars are in the set.
	// We have 256 bits because a char can have 256 different values.
	Bits [4]uint64
}

A Set represents a set of chars.

func New

func New(chars []byte) Set

New returns a charset that accepts every char in `chars`. All chars must be valid ASCII characters (<128).
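To illustrate what a constructor like New has to do, here is a minimal sketch, not the package's actual source. It assumes that byte b is stored in word b/64 at bit position b%64. The local Set type only mirrors the documented struct, and newSet is a hypothetical stand-in. The sketch accepts any byte and does not model the documented ASCII-only requirement.

```go
package main

import "fmt"

// Set is a local stand-in that mirrors the documented struct.
type Set struct {
	Bits [4]uint64
}

// newSet is a hypothetical equivalent of New. Assumption: byte b is
// stored in word b/64 at bit position b%64.
func newSet(chars []byte) Set {
	var s Set
	for _, b := range chars {
		s.Bits[b>>6] |= uint64(1) << (b & 63)
	}
	return s
}

func main() {
	s := newSet([]byte("az"))
	// 'a' is 97 and 'z' is 122, so both land in word 1 (bytes 64..127).
	fmt.Printf("%#x\n", s.Bits[1])
}
```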

func Range

func Range(low, high byte) Set

Range returns a charset matching all characters between `low` and `high` inclusive.
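An inclusive byte range has one subtle edge: when `high` is 255, a loop counter of type byte wraps around and never terminates. The sketch below shows one way to avoid that, under the same assumed bit layout (word b/64, bit b%64). The names rangeSet and has are hypothetical and the code is not taken from the package.

```go
package main

import "fmt"

// Set is a local stand-in that mirrors the documented struct.
type Set struct {
	Bits [4]uint64
}

// rangeSet is a hypothetical equivalent of Range.
func rangeSet(low, high byte) Set {
	var s Set
	// Iterate with an int so that high == 255 cannot wrap around.
	for i := int(low); i <= int(high); i++ {
		s.Bits[i>>6] |= uint64(1) << (uint(i) & 63)
	}
	return s
}

// has reports whether b is in the set (assumed layout).
func (s *Set) has(b byte) bool {
	return s.Bits[b>>6]&(uint64(1)<<(b&63)) != 0
}

func main() {
	digits := rangeSet('0', '9')
	fmt.Println(digits.has('0'), digits.has('9'), digits.has('a')) // prints true true false
}
```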

func (Set) Add

func (c Set) Add(c1 Set) Set

Add returns the union of two charsets: a charset matching every character matched by either `c` or `c1`.
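With a bit-array representation, a union can be computed one word at a time with a bitwise OR. The following is a sketch under that assumption. The function union is a hypothetical stand-in for Add, not the package's code.

```go
package main

import "fmt"

// Set is a local stand-in that mirrors the documented struct.
type Set struct {
	Bits [4]uint64
}

// union is a hypothetical equivalent of Add: a word-wise OR.
func union(a, b Set) Set {
	var out Set
	for i := range out.Bits {
		out.Bits[i] = a.Bits[i] | b.Bits[i]
	}
	return out
}

func main() {
	a := Set{Bits: [4]uint64{0b0011, 0, 0, 0}} // bits 0 and 1
	b := Set{Bits: [4]uint64{0b0110, 0, 0, 0}} // bits 1 and 2
	fmt.Printf("%04b\n", union(a, b).Bits[0]) // prints 0111
}
```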

func (Set) Complement

func (c Set) Complement() Set

Complement returns a charset that matches all characters except for those matched by `c`.
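Because the documented struct holds all 256 possible byte values, a complement can be sketched as a word-wise NOT. The function complement is a hypothetical stand-in, not the package's code. One consequence under this assumption is that the complement of an ASCII-only set contains bytes >=128.

```go
package main

import (
	"fmt"
	"math/bits"
)

// Set is a local stand-in that mirrors the documented struct.
type Set struct {
	Bits [4]uint64
}

// complement is a hypothetical equivalent of Complement: a word-wise NOT.
func complement(s Set) Set {
	var out Set
	for i, w := range s.Bits {
		out.Bits[i] = ^w
	}
	return out
}

func main() {
	var empty Set
	all := complement(empty)
	n := 0
	for _, w := range all.Bits {
		n += bits.OnesCount64(w)
	}
	fmt.Println(n) // prints 256
}
```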

func (*Set) Has

func (c *Set) Has(r byte) bool

Has reports whether the charset accepts the character `r`. The pointer receiver is used for performance, since it avoids copying the set on each call.
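A membership test on a bit array comes down to one index and one mask. The sketch below assumes byte b lives in word b/64 at bit b%64. The method has is a hypothetical stand-in for Has and uses a pointer receiver in the same spirit as the documented signature.

```go
package main

import "fmt"

// Set is a local stand-in that mirrors the documented struct.
type Set struct {
	Bits [4]uint64
}

// has is a hypothetical equivalent of Has (assumed bit layout).
func (s *Set) has(b byte) bool {
	return s.Bits[b>>6]&(uint64(1)<<(b&63)) != 0
}

func main() {
	// Bit 10 of word 0 corresponds to byte 10, the newline character.
	newline := Set{Bits: [4]uint64{1 << 10, 0, 0, 0}}
	fmt.Println(newline.has('\n'), newline.has(' ')) // prints true false
}
```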

func (Set) IsSmall

func (c Set) IsSmall() bool

IsSmall reports whether this set can be converted to a SmallSet, that is, whether it only matches bytes <128.

func (Set) Size

func (c Set) Size() int

Size returns the number of chars matched by this Set.
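Counting the members of a bit array is a population count over its words. The sketch below uses `bits.OnesCount64` from the standard library. The function size is a hypothetical stand-in for Size, not the package's code.

```go
package main

import (
	"fmt"
	"math/bits"
)

// Set is a local stand-in that mirrors the documented struct.
type Set struct {
	Bits [4]uint64
}

// size is a hypothetical equivalent of Size: it sums the set bits of each word.
func size(s Set) int {
	n := 0
	for _, w := range s.Bits {
		n += bits.OnesCount64(w)
	}
	return n
}

func main() {
	s := Set{Bits: [4]uint64{0b1011, 0, 1, 0}}
	fmt.Println(size(s)) // prints 4
}
```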

func (Set) SmallSet

func (c Set) SmallSet() SmallSet

SmallSet converts this Set to a SmallSet.
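The relationship between IsSmall and SmallSet can be sketched as follows, assuming bytes 128..255 occupy the upper two words of Bits. The functions isSmall and toSmall are hypothetical stand-ins. What the real conversion does with a set that is not small is not stated in this documentation, so the sketch only converts after checking.

```go
package main

import "fmt"

// Set and SmallSet are local stand-ins that mirror the documented structs.
type Set struct {
	Bits [4]uint64
}

type SmallSet struct {
	Bits [2]uint64
}

// isSmall is a hypothetical equivalent of IsSmall: no byte >=128 is present.
func isSmall(s Set) bool {
	return s.Bits[2] == 0 && s.Bits[3] == 0
}

// toSmall is a hypothetical equivalent of SmallSet: it keeps the lower two words.
func toSmall(s Set) SmallSet {
	return SmallSet{Bits: [2]uint64{s.Bits[0], s.Bits[1]}}
}

func main() {
	ascii := Set{Bits: [4]uint64{1 << 48, 0, 0, 0}} // only byte 48, '0'
	high := Set{Bits: [4]uint64{0, 0, 1, 0}}        // only byte 128
	fmt.Println(isSmall(ascii), isSmall(high))      // prints true false
	if isSmall(ascii) {
		small := toSmall(ascii)
		fmt.Println(small.Bits[0] == ascii.Bits[0]) // prints true
	}
}
```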

func (Set) String

func (c Set) String() string

String returns the string representation of the charset.

func (Set) Sub

func (c Set) Sub(c1 Set) Set

Sub returns a charset containing the characters of `c` that are not in `c1`.
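Set difference maps onto Go's AND NOT operator (`&^`) applied word by word. This is a sketch under the bit-array assumption. The function sub is a hypothetical stand-in for Sub, not the package's code.

```go
package main

import "fmt"

// Set is a local stand-in that mirrors the documented struct.
type Set struct {
	Bits [4]uint64
}

// sub is a hypothetical equivalent of Sub: it clears in a every bit set in b.
func sub(a, b Set) Set {
	var out Set
	for i := range out.Bits {
		out.Bits[i] = a.Bits[i] &^ b.Bits[i]
	}
	return out
}

func main() {
	a := Set{Bits: [4]uint64{0b0111, 0, 0, 0}}
	b := Set{Bits: [4]uint64{0b0010, 0, 0, 0}}
	fmt.Printf("%04b\n", sub(a, b).Bits[0]) // prints 0101
}
```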

type SmallSet

type SmallSet struct {
	Bits [2]uint64
}

A SmallSet is the same as a Set but can only represent 128 possible chars. This is an optimization: in the common case only ASCII bytes (<128) are used. The full Set is only necessary when bytes >=128 must be matched, for example for Unicode control characters.

func (*SmallSet) Has

func (c *SmallSet) Has(r byte) bool

Has reports whether the charset accepts the character `r`. The pointer receiver is used for performance, since it avoids copying the set on each call.
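With only two words available, a lookup for a byte >=128 would index past the array under the assumed layout (word b/64). The sketch below guards against that by returning false. Whether the real Has does the same is not stated here, so treat both the guard and the name has as assumptions.

```go
package main

import "fmt"

// SmallSet is a local stand-in that mirrors the documented struct.
type SmallSet struct {
	Bits [2]uint64
}

// has is a hypothetical equivalent of (*SmallSet).Has.
func (s *SmallSet) has(b byte) bool {
	if b >= 128 {
		// Assumption: bytes outside the 128-bit range are never members.
		return false
	}
	return s.Bits[b>>6]&(uint64(1)<<(b&63)) != 0
}

func main() {
	// Bit 1 of word 1 corresponds to byte 65, 'A'.
	s := SmallSet{Bits: [2]uint64{0, 1 << 1}}
	fmt.Println(s.has('A'), s.has(200)) // prints true false
}
```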

func (SmallSet) Size

func (c SmallSet) Size() int

Size returns the number of chars matched by this SmallSet.
