Documentation ¶
Overview ¶
Example ¶
package main

import (
	"os"

	"github.com/arnodel/grammar"
)

type Token = grammar.SimpleToken

type SExpr struct {
	grammar.OneOf
	Number *Token `tok:"number"`
	String *Token `tok:"string"`
	Atom   *Token `tok:"atom"`
	*List
}

type List struct {
	grammar.Seq
	OpenBkt  Token `tok:"bkt,("`
	Items    []SExpr
	CloseBkt Token `tok:"bkt,)"`
}

var tokenise = grammar.SimpleTokeniser([]grammar.TokenDef{
	{
		// If Name is empty, the token is skipped in the token stream
		Ptn: `\s+`,
	},
	{
		Name: "bkt",
		Ptn:  `[()]`,
	},
	{
		Name: "string",
		Ptn:  `"[^"]*"`,
	},
	{
		Name: "number",
		Ptn:  `-?[0-9]+(?:\.[0-9]+)?`,
	},
	{
		Name: "atom",
		Ptn:  `[a-zA-Z_][a-zA-Z0-9_-]*`,
	},
})

func main() {
	tokenStream, _ := tokenise(`(cons a (list 123 "c")))`)
	var sexpr SExpr
	grammar.Parse(&sexpr, tokenStream)
	grammar.PrettyWrite(os.Stdout, sexpr)
}
Output:

SExpr {
  List: List {
    OpenBkt: {bkt (}
    Items: [
      SExpr {
        Atom: {atom cons}
      }
      SExpr {
        Atom: {atom a}
      }
      SExpr {
        List: List {
          OpenBkt: {bkt (}
          Items: [
            SExpr {
              Atom: {atom list}
            }
            SExpr {
              Number: {number 123}
            }
            SExpr {
              String: {string "c"}
            }
          ]
          CloseBkt: {bkt )}
        }
      }
    ]
    CloseBkt: {bkt )}
  }
}
Index ¶
- Variables
- func PrettyWrite(out io.Writer, r interface{}) error
- func SimpleTokeniser(tokenDefs []TokenDef) func(string) (*SimpleTokenStream, error)
- type Empty
- type FieldType
- type Mode
- type OneOf
- type ParseError
- type ParseOption
- type ParseOptions
- type Parser
- type ParserState
- type RuleDef
- type RuleField
- type Seq
- type SimpleToken
- type SimpleTokenStream
- type SizeOptions
- type Token
- type TokenDef
- type TokenParseOptions
- type TokenStream
Examples ¶
Constants ¶
This section is empty.
Variables ¶
var EOF = SimpleToken{
TokType: "EOF",
TokValue: "EOF",
}
EOF is the token that is returned when trying to consume an exhausted token stream.
var WithDefaultLogger = WithLogger(log.Default())
Functions ¶
func PrettyWrite ¶
PrettyWrite outputs a pretty representation of the rule r on the writer. For a struct like this:
type RuleType struct {
	Field1 Rule1
	Field2 []Rule2
}
It looks like this:
RuleType {
  Field1: <pretty representation of the value of Field1>
  Field2: [
    <pretty representation of the first item in Field2>
    <pretty representation of the second item>
    <...>
  ]
}
Empty optional fields and empty repeated fields are omitted altogether.
func SimpleTokeniser ¶
func SimpleTokeniser(tokenDefs []TokenDef) func(string) (*SimpleTokenStream, error)
SimpleTokeniser takes a list of TokenDefs and returns a function that can tokenise a string. Designed for simple use-cases.
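The behaviour of such a tokeniser can be sketched in plain Go: anchor each pattern at the current position, emit a token for the first definition that matches, and skip matches whose definition has no name. The tokenDef and token types below are illustrative local stand-ins, not the library's TokenDef and SimpleToken.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// tokenDef and token are local stand-ins for the library's types.
type tokenDef struct {
	Name string // empty means the match is skipped (e.g. whitespace)
	Ptn  string
}

type token struct {
	Type, Value string
}

// tokenise scans the input left to right, trying each pattern at the
// current position and emitting a token for the first one that matches.
func tokenise(defs []tokenDef, input string) ([]token, error) {
	res := make([]*regexp.Regexp, len(defs))
	for i, d := range defs {
		// Anchor the pattern so it only matches at the current position.
		res[i] = regexp.MustCompile(`^(?:` + d.Ptn + `)`)
	}
	var toks []token
	for len(input) > 0 {
		matched := false
		for i, re := range res {
			if m := re.FindString(input); m != "" {
				if defs[i].Name != "" {
					toks = append(toks, token{defs[i].Name, m})
				}
				input = input[len(m):]
				matched = true
				break
			}
		}
		if !matched {
			return nil, fmt.Errorf("no token matches %q", input)
		}
	}
	return toks, nil
}

func main() {
	toks, err := tokenise([]tokenDef{
		{Ptn: `\s+`}, // skipped: no name
		{Name: "bkt", Ptn: `[()]`},
		{Name: "number", Ptn: `-?[0-9]+`},
		{Name: "atom", Ptn: `[a-zA-Z_][a-zA-Z0-9_-]*`},
	}, `(cons 12)`)
	if err != nil {
		panic(err)
	}
	parts := make([]string, len(toks))
	for i, t := range toks {
		parts[i] = t.Type + ":" + t.Value
	}
	fmt.Println(strings.Join(parts, " ")) // bkt:( atom:cons number:12 bkt:)
}
```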
Types ¶
type Empty ¶
type Empty struct{}
func (Empty) Parse ¶
func (Empty) Parse(r interface{}, s *ParserState, opts ParseOptions) *ParseError
type OneOf ¶
type OneOf struct{}
OneOf should be used as the first field of a Rule struct to signify that it should match exactly one of the fields.
func (OneOf) Parse ¶
func (OneOf) Parse(r interface{}, s *ParserState, opts ParseOptions) *ParseError
type ParseError ¶
type ParseError struct {
	Err error
	Token
	TokenParseOptions []TokenParseOptions
	Pos               int
}
func Parse ¶
func Parse(dest interface{}, s TokenStream, opts ...ParseOption) *ParseError
Parse tries to interpret dest as a grammar rule and use it to parse the given token stream. Parse can panic if dest is not a valid grammar rule. It returns a non-nil *ParseError if the token stream does not match the rule.
func ParseWithOptions ¶
func ParseWithOptions(dest interface{}, s *ParserState, opts ParseOptions) *ParseError
ParseWithOptions is the same as Parse but the ParseOptions are explicitly given (this is mostly used by the parser generator).
func (*ParseError) Error ¶
func (e *ParseError) Error() string
func (*ParseError) Merge ¶
func (e *ParseError) Merge(e2 *ParseError) *ParseError
type ParseOption ¶
type ParseOption func(s *ParserState)
func WithLogger ¶
func WithLogger(l *log.Logger) ParseOption
type ParseOptions ¶
type ParseOptions struct {
TokenParseOptions []TokenParseOptions
}
func (ParseOptions) MatchToken ¶
func (o ParseOptions) MatchToken(tok Token) (bool, bool)
type Parser ¶
type Parser interface {
Parse(r interface{}, s *ParserState, opts ParseOptions) *ParseError
}
A Parser can parse a token stream to populate itself. It must return an error if it fails to do so.
type ParserState ¶
type ParserState struct {
	TokenStream
	// contains filtered or unexported fields
}
func (*ParserState) Debug ¶
func (s *ParserState) Debug() bool
func (*ParserState) Logf ¶
func (s *ParserState) Logf(fstr string, args ...interface{})
func (*ParserState) MergeError ¶
func (s *ParserState) MergeError(err *ParseError) *ParseError
type RuleField ¶
type RuleField struct {
	FieldType
	ParseOptions
	SizeOptions
	Name  string
	Index int
}
type Seq ¶
type Seq struct{}
Seq should be used as the first field of a Rule struct to signify that it should match all the fields in sequence.
func (Seq) Parse ¶
func (Seq) Parse(r interface{}, s *ParserState, opts ParseOptions) *ParseError
type SimpleToken ¶
SimpleToken is a simple implementation of both the Token and the Parser interfaces, so it can be used in rules to match a token field. It is also the concrete type of Tokens returned by SimpleTokenStream.
func (*SimpleToken) Parse ¶
func (t *SimpleToken) Parse(_ interface{}, s *ParserState, opts ParseOptions) *ParseError
Parse tries to match the next token in the given TokenStream with the ParseOptions, using opts.MatchToken. If they match, the receiver is loaded with the token data; if not, a non-nil *ParseError is returned. In either case the next token in the token stream has been consumed.
func (SimpleToken) Value ¶
func (t SimpleToken) Value() string
Value returns the value of the token.
type SimpleTokenStream ¶
type SimpleTokenStream struct {
// contains filtered or unexported fields
}
SimpleTokenStream is a very simple implementation of the TokenStream interface which the Parse function requires.
func NewSimpleTokenStream ¶
func NewSimpleTokenStream(toks []Token) *SimpleTokenStream
func (*SimpleTokenStream) Dump ¶
func (s *SimpleTokenStream) Dump(w io.Writer)
func (*SimpleTokenStream) Next ¶
func (s *SimpleTokenStream) Next() Token
Next consumes the next token in the token stream and returns it. If the stream is exhausted, EOF is returned.
func (*SimpleTokenStream) Restore ¶
func (s *SimpleTokenStream) Restore(pos int)
Restore rewinds the token stream to the given position (which should have been obtained from s.Save()). In general this may panic; in this implementation it is always possible to rewind.
func (*SimpleTokenStream) Save ¶
func (s *SimpleTokenStream) Save() int
Save returns the current position in the token stream.
type SizeOptions ¶
type SizeOptions struct {
Min, Max int
}
type Token ¶
type Token interface {
	Type() string  // The type of the token (e.g. operator, number, string, identifier)
	Value() string // The value of the token (e.g. `+`, `123`, `"abc"`, `if`)
}
A Token has a Type() and Value() method. A TokenStream's Next() method must return a Token.
type TokenDef ¶
type TokenDef struct {
	Ptn     string              // The regular expression the token should match
	Name    string              // The name given to this token type
	Special func(string) string // If defined, it takes over the tokenising for this pattern

	Mode     string
	PushMode string
	PopMode  bool
}
A TokenDef defines a type of token and the pattern that matches it. Used by SimpleTokeniser to build tokenisers for simple use-cases.
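The Mode, PushMode and PopMode fields are not documented above. A common design for fields like these is a mode stack: a definition is only active in its declared mode, and a match may push a new mode or pop the current one. The sketch below is a speculative illustration of that design in plain Go, not the library's implementation.

```go
package main

import (
	"fmt"
	"regexp"
)

// modeDef is a local stand-in for a TokenDef with mode fields; the
// semantics implemented here are a guess at the intended design.
type modeDef struct {
	Name, Ptn string
	Mode      string // mode this definition is active in ("" = initial mode)
	PushMode  string // if non-empty, push this mode after a match
	PopMode   bool   // if true, pop the current mode after a match
}

func tokeniseModes(defs []modeDef, input string) ([]string, error) {
	modes := []string{""} // mode stack, starting in the initial mode
	var toks []string
	for len(input) > 0 {
		cur := modes[len(modes)-1]
		matched := false
		for _, d := range defs {
			if d.Mode != cur {
				continue // definition not active in the current mode
			}
			re := regexp.MustCompile(`^(?:` + d.Ptn + `)`)
			m := re.FindString(input)
			if m == "" {
				continue
			}
			if d.Name != "" {
				toks = append(toks, d.Name+":"+m)
			}
			input = input[len(m):]
			if d.PushMode != "" {
				modes = append(modes, d.PushMode)
			} else if d.PopMode {
				modes = modes[:len(modes)-1]
			}
			matched = true
			break
		}
		if !matched {
			return nil, fmt.Errorf("stuck at %q in mode %q", input, cur)
		}
	}
	return toks, nil
}

func main() {
	// A string literal handled with a "str" mode: the opening quote pushes
	// the mode, the closing quote pops it.
	defs := []modeDef{
		{Ptn: `\s+`},
		{Name: "open", Ptn: `"`, PushMode: "str"},
		{Name: "chars", Ptn: `[^"]+`, Mode: "str"},
		{Name: "close", Ptn: `"`, Mode: "str", PopMode: true},
		{Name: "atom", Ptn: `[a-z]+`},
	}
	toks, _ := tokeniseModes(defs, `ab "cd"`)
	fmt.Println(toks)
}
```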
type TokenParseOptions ¶
func (TokenParseOptions) String ¶
func (o TokenParseOptions) String() string
type TokenStream ¶
type TokenStream interface {
	Next() Token // Advance the stream by one token and return it.
	Save() int   // Return the current position in the token stream.
	Restore(int) // Return the stream to the given position.
}
A TokenStream represents a sequence of tokens that will be consumed by the Parse() function. The SimpleTokenStream implementation in this package can be used, or users can provide their own implementation (e.g. with a richer data type for tokens, or "on demand" tokenisation).
The Restore() method may panic if it is not possible to return to the given position.
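An "on demand" implementation might look like the sketch below: it tokenises input only as far as the parser has consumed, while still supporting Save/Restore for backtracking by keeping every token produced so far. String tokens and whitespace splitting are simplifications for illustration; this is not the library's code.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// lazyStream tokenises its input on demand and mirrors the shape of the
// TokenStream interface (Next/Save/Restore), using plain strings as tokens.
type lazyStream struct {
	input string   // remaining untokenised input
	toks  []string // tokens produced so far
	pos   int      // index of the next token to hand out
}

func (s *lazyStream) Next() string {
	for s.pos >= len(s.toks) { // need to tokenise more input
		s.input = strings.TrimLeftFunc(s.input, unicode.IsSpace)
		if s.input == "" {
			return "EOF"
		}
		i := strings.IndexFunc(s.input, unicode.IsSpace)
		if i < 0 {
			i = len(s.input)
		}
		s.toks = append(s.toks, s.input[:i])
		s.input = s.input[i:]
	}
	t := s.toks[s.pos]
	s.pos++
	return t
}

func (s *lazyStream) Save() int       { return s.pos }
func (s *lazyStream) Restore(pos int) { s.pos = pos } // always possible here

func main() {
	s := &lazyStream{input: "cons a b"}
	fmt.Println(s.Next()) // cons
	mark := s.Save()
	fmt.Println(s.Next()) // a
	s.Restore(mark)
	fmt.Println(s.Next()) // a again, after rewinding
}
```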