docparser

package
v2.0.0-...-6bef632
Warning

This package is not in the latest version of its module.

Published: May 25, 2024 License: GPL-3.0 Imports: 11 Imported by: 0

Documentation

Overview

Package docparser implements a parser for legacy MS Word documents (.doc). It depends on the wvWare tool, which must be installed.

The metadata parser is mainly taken from https://github.com/sajari/docconv/blob/master/doc.go

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type DocMetadata

type DocMetadata struct {
	Author   string `json:"author,omitempty"`
	Category string `json:"category,omitempty"`
	Comment  string `json:"comment,omitempty"`
	Company  string `json:"company,omitempty"`
	Keywords string `json:"keywords,omitempty"`
	Manager  string `json:"manager,omitempty"`
	Subject  string `json:"subject,omitempty"`
	Title    string `json:"title,omitempty"`

	Created   *time.Time `json:"created,omitempty"`
	Modified  *time.Time `json:"modified,omitempty"`
	PageCount int32      `json:"page_count,omitempty"`
	CharCount int32      `json:"char_count,omitempty"`
	WordCount int32      `json:"word_count,omitempty"`
}
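
Because every field carries an `omitempty` JSON tag, fields left at their zero value (empty strings, nil time pointers, zero counts) should drop out of the encoded output. The following is a minimal sketch of that behavior using only `encoding/json`. It declares a local copy of the struct so that it compiles without importing the package; the helper `encodeMetadata` and the sample values are hypothetical and not part of docparser.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// DocMetadata is a local copy of the struct documented above, declared here
// only so the example is self-contained.
type DocMetadata struct {
	Author   string `json:"author,omitempty"`
	Category string `json:"category,omitempty"`
	Comment  string `json:"comment,omitempty"`
	Company  string `json:"company,omitempty"`
	Keywords string `json:"keywords,omitempty"`
	Manager  string `json:"manager,omitempty"`
	Subject  string `json:"subject,omitempty"`
	Title    string `json:"title,omitempty"`

	Created   *time.Time `json:"created,omitempty"`
	Modified  *time.Time `json:"modified,omitempty"`
	PageCount int32      `json:"page_count,omitempty"`
	CharCount int32      `json:"char_count,omitempty"`
	WordCount int32      `json:"word_count,omitempty"`
}

// encodeMetadata (hypothetical helper) marshals m to a JSON string.
// Zero-valued fields are omitted because of the omitempty tags.
func encodeMetadata(m DocMetadata) (string, error) {
	b, err := json.Marshal(m)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	// Sample values invented for illustration.
	created := time.Date(2024, time.May, 25, 0, 0, 0, 0, time.UTC)
	m := DocMetadata{Title: "Report", Created: &created, PageCount: 3}

	out, err := encodeMetadata(m)
	if err != nil {
		fmt.Println("marshal failed:", err)
		return
	}
	fmt.Println(out)
}
```

One consequence of `omitempty` on the count fields is that a genuine count of zero cannot be distinguished from a missing value in the JSON output.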

type WordDoc

type WordDoc struct {
	// contains filtered or unexported fields
}

func NewFromBytes

func NewFromBytes(data []byte) (doc *WordDoc, err error)

func NewFromStream

func NewFromStream(stream io.ReadCloser) (doc *WordDoc, err error)

func (*WordDoc) Close

func (d *WordDoc) Close()

Close is a no-op.

func (*WordDoc) Metadata

func (d *WordDoc) Metadata() DocMetadata

func (*WordDoc) MetadataMap

func (d *WordDoc) MetadataMap() map[string]string
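
Go maps iterate in an unspecified order, so printing a `map[string]string` such as the one MetadataMap returns gives unstable output unless the keys are sorted first. The sketch below shows one way to do that. The helper `sortedLines` and the sample keys are hypothetical; the keys actually produced by MetadataMap are not documented here.

```go
package main

import (
	"fmt"
	"sort"
)

// sortedLines (hypothetical helper) renders a string map as "key: value"
// lines, ordered by key so that the output is deterministic.
func sortedLines(m map[string]string) []string {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	lines := make([]string, 0, len(keys))
	for _, k := range keys {
		lines = append(lines, k+": "+m[k])
	}
	return lines
}

func main() {
	// Sample map invented for illustration; real keys may differ.
	sample := map[string]string{"title": "Report", "author": "A. Writer"}
	for _, line := range sortedLines(sample) {
		fmt.Println(line)
	}
}
```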

func (*WordDoc) StreamText

func (d *WordDoc) StreamText(w io.Writer)

func (*WordDoc) Text

func (d *WordDoc) Text() string
