fml

v0.0.0-...-2909b0d
Published: Apr 1, 2021 License: Apache-2.0 Imports: 6 Imported by: 1

README

fml (Fast MARC Library)

fml is a Go library for parsing MARC 21 formatted data. The library interface should still be considered unstable and may change in backwards incompatible ways.

There is also an fml command line utility that can be used to pull a single MARC record from a file by control number. The command can be installed with:

$ go get github.com/mitlibraries/fml/cmd/fml

How do I use this?

import "github.com/mitlibraries/fml"

Start by creating a new MarcIterator:

m := fml.NewMarcIterator(<io.Reader>)

A MarcIterator can be iterated over by using the Next() and Value() methods. This is mostly just a thin wrapper around a bufio.Scanner. Next() returns false when there is no more data to process or an unrecoverable error has occurred. In either case, iteration stops, and any error is available from the Err() method:

for m.Next() {
  record, err := m.Value()
  if err != nil {
    // handle the error for this record, then move on
    continue
  }
  // do something with record
}
if err := m.Err(); err != nil {
  // do something with error
}

A Record contains a leader struct and a slice of all the control and data fields. If you want to iterate over this slice you will need to use a type switch to determine the type of field. There are a few convenience methods that are probably better for accessing specific fields, though.
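
As a rough illustration of the type switch mentioned above, the sketch below walks a []interface{} holding both field types. The struct definitions are local stand-ins copied from the documented types so that the example runs on its own; whether fml stores values or pointers in Fields is an assumption here (values), and describe is a hypothetical helper.

```go
package main

import "fmt"

// Local stand-ins mirroring the documented fml struct definitions.
type ControlField struct{ Tag, Value string }

type SubField struct{ Code, Value string }

type DataField struct {
	Indicator1, Indicator2, Tag string
	SubFields                   []SubField
}

// describe (hypothetical helper) renders one element of a Fields slice.
// It assumes the slice holds struct values rather than pointers.
func describe(f interface{}) string {
	switch v := f.(type) {
	case ControlField:
		return fmt.Sprintf("control %s=%s", v.Tag, v.Value)
	case DataField:
		return fmt.Sprintf("data %s with %d subfield(s)", v.Tag, len(v.SubFields))
	default:
		return fmt.Sprintf("unknown field type %T", v)
	}
}

func main() {
	fields := []interface{}{
		ControlField{Tag: "001", Value: "12345"},
		DataField{
			Tag:        "245",
			Indicator1: "1",
			Indicator2: "0",
			SubFields:  []SubField{{Code: "a", Value: "Tomb of Annihilation"}},
		},
	}
	for _, f := range fields {
		fmt.Println(describe(f))
	}
}
```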

The ControlField method returns a slice of control fields. A control field contains a tag and a value:

for _, cf := range record.ControlField("001", "003") {
  fmt.Printf("%s: %s\n", cf.Tag, cf.Value)
}

The DataField method returns a slice of data fields. A data field has two indicators, a tag and a slice of subfields. The subfields themselves have a code and a value. There's also a SubField method that returns a slice of specified subfields:

for _, df := range record.DataField("260") {
  fmt.Printf("%s: %s %s\n", df.Tag, df.Indicator1, df.Indicator2)
  for _, sf := range df.SubField("a", "b") {
    fmt.Printf("\t%s: %s\n", sf.Code, sf.Value)
  }
}

There is also a Filter method inspired by traject. Filter takes one or more query strings, each consisting of a three digit MARC tag optionally followed by an indicator filter and/or one or more subfield codes. If no subfields are specified, all subfields for the matching tag are returned. The indicator filter consists of two indicator codes between pipes. A * character can be used to match any indicator code. Here are a few examples of valid query strings:

200
500|02|
650x
245|*1|ac

Filter returns a slice of string slices. Each slice member represents an instance of a matching tag and a slice of all the matching subfields, or data values in the case of control fields. For example, take a MARC record with the following structure:

245 $a Tomb of Annihilation
650 $a Dungeons and Dragons (Game) $v Handbooks, manuals, etc.
650 $a Dungeons and Dragons (Game) $v Rules.

The following code:

for _, t := range record.Filter("245", "650v") {
  for _, v := range t {
    fmt.Println(v)
  }
}

outputs:

Tomb of Annihilation
Handbooks, manuals, etc.
Rules.

Developing

This package uses modules for managing dependencies. Use:

$ go get -u ./...

to upgrade all minor/patch versions of dependencies.

Tests can be run with:

$ go test -v ./...

Benchmarks can be run with:

$ go test -bench=.

Documentation

Overview

fml is a library for parsing MARC 21 formatted data.

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type ControlField

type ControlField struct {
	Tag   string
	Value string
}

ControlField just contains a Tag and a Value.

type DataField

type DataField struct {
	Indicator1 string
	Indicator2 string
	Tag        string
	SubFields  []SubField
}

DataField contains two Indicators, a Tag, and a slice of SubFields. If you want a specific subfield or subfields you should use the SubField method.

func (DataField) SubField

func (d DataField) SubField(subfield ...string) []SubField

SubField takes an arbitrary number of subfield code strings and returns a slice of SubFields.
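
One plausible way to implement this kind of variadic selection is sketched below. It is an illustration only, not fml's code: selectSubFields is a hypothetical free function, SubField is a local copy of the documented struct, and returning matches in their original order is an assumption.

```go
package main

import "fmt"

// SubField is a local stand-in mirroring the documented fml struct.
type SubField struct{ Code, Value string }

// selectSubFields (hypothetical) returns the subfields whose code appears
// in codes, preserving the order in which they occur in all.
func selectSubFields(all []SubField, codes ...string) []SubField {
	want := make(map[string]bool, len(codes))
	for _, c := range codes {
		want[c] = true
	}
	var out []SubField
	for _, sf := range all {
		if want[sf.Code] {
			out = append(out, sf)
		}
	}
	return out
}

func main() {
	subfields := []SubField{
		{Code: "a", Value: "Dungeons and Dragons (Game)"},
		{Code: "v", Value: "Rules."},
	}
	for _, sf := range selectSubFields(subfields, "v") {
		fmt.Printf("%s: %s\n", sf.Code, sf.Value) // prints "v: Rules."
	}
}
```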

type Leader

type Leader struct {
	Status        byte // 05 byte position
	Type          byte // 06
	BibLevel      byte // 07
	Control       byte // 08
	EncodingLevel byte // 17
	Form          byte // 18
	Multipart     byte // 19
}

Leader contains a subset of the bytes in the record leader. Omitted are bytes specifying the length of parts of the record and bytes which do not vary from record to record.
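
The byte positions in the struct comments map directly onto offsets in the 24 byte MARC 21 leader. The sketch below shows that mapping with a local copy of the Leader struct; parseLeader is a hypothetical function written for illustration and is not part of the documented API.

```go
package main

import (
	"errors"
	"fmt"
)

// Leader is a local stand-in mirroring the documented fml struct.
type Leader struct {
	Status        byte // 05
	Type          byte // 06
	BibLevel      byte // 07
	Control       byte // 08
	EncodingLevel byte // 17
	Form          byte // 18
	Multipart     byte // 19
}

// parseLeader (hypothetical) picks the documented byte positions out of a
// raw MARC 21 leader, which is always 24 bytes long.
func parseLeader(raw []byte) (Leader, error) {
	if len(raw) < 24 {
		return Leader{}, errors.New("leader must be 24 bytes")
	}
	return Leader{
		Status:        raw[5],
		Type:          raw[6],
		BibLevel:      raw[7],
		Control:       raw[8],
		EncodingLevel: raw[17],
		Form:          raw[18],
		Multipart:     raw[19],
	}, nil
}

func main() {
	l, err := parseLeader([]byte("00714cam a2200205 a 4500"))
	if err != nil {
		fmt.Println("bad leader:", err)
		return
	}
	fmt.Printf("status=%c type=%c level=%c\n", l.Status, l.Type, l.BibLevel)
	// prints "status=c type=a level=m"
}
```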

type MarcIterator

type MarcIterator struct {
	// contains filtered or unexported fields
}

MarcIterator will iterate over a set of MARC records using the Next() and Value() methods. Use the NewMarcIterator function to create a MarcIterator.

func NewMarcIterator

func NewMarcIterator(r io.Reader) *MarcIterator

NewMarcIterator creates and returns a new instance of a MarcIterator. This function should be used to create a MarcIterator rather than instantiating one yourself.

func (*MarcIterator) Err

func (m *MarcIterator) Err() error

Err will return the first error encountered by the MarcIterator.

func (*MarcIterator) Next

func (m *MarcIterator) Next() bool

Next advances the MarcIterator to the next record, which will be available through the Value method. It returns false when the MarcIterator has reached the end of the file or has encountered an error. Any error will be accessible from the Err method.

func (*MarcIterator) Value

func (m *MarcIterator) Value() (Record, error)

Value returns the current Record of the MarcIterator.

type Record

type Record struct {
	Data   string
	Fields []interface{}
	Leader Leader
}

Record is a struct representing a MARC record. It has a Fields slice which contains both ControlFields and DataFields.

func (Record) ControlField

func (r Record) ControlField(tag ...string) []ControlField

ControlField method takes an arbitrary number of tag strings and returns a slice of matching ControlFields.

func (Record) ControlNum

func (r Record) ControlNum() string

ControlNum returns the record's control number.

func (Record) DataField

func (r Record) DataField(tag ...string) []DataField

DataField method takes an arbitrary number of tag strings and returns a slice of matching DataFields. Note that one tag may return multiple DataFields as they can be repeated.

func (Record) Filter

func (r Record) Filter(query ...string) [][]string

Filter takes one or more tag queries and returns a slice of string slices containing the matching subfield values. A tag query consists of the three digit MARC tag optionally followed by one or more subfield codes, for example: "245ac", "650x" or "100". Filtering for indicators can be done by including the two desired indicators between pipes after the tag. A * character can be used to match any indicator, for example: "245|*1|ac" or "650|01|x".

type SubField

type SubField struct {
	Code  string
	Value string
}

SubField contains a Code and a Value.

Directories

Path     Synopsis
cmd/fml
