Documentation ¶
Overview ¶
Package encoding provides the generic APIs implemented by parquet encodings in its sub-packages.
Index ¶
- Variables
- type ByteArrayList
- func MakeByteArrayList(capacity int) ByteArrayList
- func (list *ByteArrayList) Cap() int
- func (list *ByteArrayList) Clone() ByteArrayList
- func (list *ByteArrayList) Grow(n int)
- func (list *ByteArrayList) Index(i int) []byte
- func (list *ByteArrayList) Len() int
- func (list *ByteArrayList) Less(i, j int) bool
- func (list *ByteArrayList) Push(v []byte)
- func (list *ByteArrayList) PushSize(n int) []byte
- func (list *ByteArrayList) Range(f func([]byte) bool)
- func (list *ByteArrayList) Reset()
- func (list *ByteArrayList) Size() int64
- func (list *ByteArrayList) Slice(i, j int) ByteArrayList
- func (list *ByteArrayList) Split() [][]byte
- func (list *ByteArrayList) Swap(i, j int)
- type Decoder
- type Encoder
- type Encoding
- type NotSupported
- type NotSupportedDecoder
- func (NotSupportedDecoder) DecodeBoolean([]bool) (int, error)
- func (NotSupportedDecoder) DecodeByteArray(*ByteArrayList) (int, error)
- func (NotSupportedDecoder) DecodeDouble([]float64) (int, error)
- func (NotSupportedDecoder) DecodeFixedLenByteArray(size int, data []byte) (int, error)
- func (NotSupportedDecoder) DecodeFloat([]float32) (int, error)
- func (NotSupportedDecoder) DecodeInt16([]int16) (int, error)
- func (NotSupportedDecoder) DecodeInt32([]int32) (int, error)
- func (NotSupportedDecoder) DecodeInt64([]int64) (int, error)
- func (NotSupportedDecoder) DecodeInt8([]int8) (int, error)
- func (NotSupportedDecoder) DecodeInt96([]deprecated.Int96) (int, error)
- func (NotSupportedDecoder) Encoding() format.Encoding
- func (NotSupportedDecoder) Reset(io.Reader)
- func (NotSupportedDecoder) SetBitWidth(int)
- type NotSupportedEncoder
- func (NotSupportedEncoder) EncodeBoolean([]bool) error
- func (NotSupportedEncoder) EncodeByteArray(ByteArrayList) error
- func (NotSupportedEncoder) EncodeDouble([]float64) error
- func (NotSupportedEncoder) EncodeFixedLenByteArray(int, []byte) error
- func (NotSupportedEncoder) EncodeFloat([]float32) error
- func (NotSupportedEncoder) EncodeInt16([]int16) error
- func (NotSupportedEncoder) EncodeInt32([]int32) error
- func (NotSupportedEncoder) EncodeInt64([]int64) error
- func (NotSupportedEncoder) EncodeInt8([]int8) error
- func (NotSupportedEncoder) EncodeInt96([]deprecated.Int96) error
- func (NotSupportedEncoder) Encoding() format.Encoding
- func (NotSupportedEncoder) Reset(io.Writer)
- func (NotSupportedEncoder) SetBitWidth(int)
Constants ¶
This section is empty.
Variables ¶
var (
	// ErrNotSupported is an error returned when the underlying encoding does
	// not support the type of values being encoded or decoded.
	//
	// This error may be wrapped with type information, applications must use
	// errors.Is rather than equality comparisons to test the error values
	// returned by encoders and decoders.
	ErrNotSupported = errors.New("encoding not supported")

	// ErrInvalidArguments is an error returned when arguments passed to the
	// encoding functions are incorrect and will lead to an expected failure.
	//
	// As with ErrNotSupported, this error may be wrapped with specific
	// information about the problem and applications are expected to use
	// errors.Is for comparisons.
	ErrInvalidArguments = errors.New("invalid encoding arguments")
)
Functions ¶
This section is empty.
Types ¶
type ByteArrayList ¶
type ByteArrayList struct {
// contains filtered or unexported fields
}
ByteArrayList is a container similar to [][]byte with a smaller memory overhead. Where using byte slices introduces ~24 bytes of overhead per element, ByteArrayList requires only 8 bytes per element. Extra efficiency also comes from reducing GC pressure by using contiguous areas of memory instead of allocating individual slices for each element. For lists with many small-size elements, the memory footprint can be reduced by 40-80%.
func MakeByteArrayList ¶
func MakeByteArrayList(capacity int) ByteArrayList
func (*ByteArrayList) Cap ¶
func (list *ByteArrayList) Cap() int
func (*ByteArrayList) Clone ¶
func (list *ByteArrayList) Clone() ByteArrayList
func (*ByteArrayList) Grow ¶
func (list *ByteArrayList) Grow(n int)
func (*ByteArrayList) Index ¶
func (list *ByteArrayList) Index(i int) []byte
func (*ByteArrayList) Len ¶
func (list *ByteArrayList) Len() int
func (*ByteArrayList) Less ¶
func (list *ByteArrayList) Less(i, j int) bool
func (*ByteArrayList) Push ¶
func (list *ByteArrayList) Push(v []byte)
func (*ByteArrayList) PushSize ¶
func (list *ByteArrayList) PushSize(n int) []byte
func (*ByteArrayList) Range ¶
func (list *ByteArrayList) Range(f func([]byte) bool)
func (*ByteArrayList) Reset ¶
func (list *ByteArrayList) Reset()
func (*ByteArrayList) Size ¶
func (list *ByteArrayList) Size() int64
func (*ByteArrayList) Slice ¶
func (list *ByteArrayList) Slice(i, j int) ByteArrayList
func (*ByteArrayList) Split ¶
func (list *ByteArrayList) Split() [][]byte
func (*ByteArrayList) Swap ¶
func (list *ByteArrayList) Swap(i, j int)
type Decoder ¶
type Decoder interface {
	// Calling Reset clears the decoder state and changes the io.Reader where
	// encoded values are read from to the one given as argument.
	//
	// The io.Reader may be nil, in which case the decoder must not be used
	// until Reset is called again with a non-nil reader.
	//
	// Calling Reset does not override the bit-width configured on the decoder.
	Reset(io.Reader)

	// Decodes an array of boolean values using this decoder, returning
	// the number of decoded values, and io.EOF if the end of the underlying
	// io.Reader was reached.
	DecodeBoolean(data []bool) (int, error)

	// Decodes an array of 8 bits integer values using this decoder, returning
	// the number of decoded values, and io.EOF if the end of the underlying
	// io.Reader was reached.
	//
	// The parquet type system does not have 8 bits integers, this method
	// is intended to decode INT32 values but receives them as an array of
	// int8 values to enable greater memory efficiency when the application
	// knows that all values can fit in 8 bits.
	DecodeInt8(data []int8) (int, error)

	// Decodes an array of 16 bits integer values using this decoder, returning
	// the number of decoded values, and io.EOF if the end of the underlying
	// io.Reader was reached.
	//
	// The parquet type system does not have 16 bits integers, this method
	// is intended to decode INT32 values but receives them as an array of
	// int16 values to enable greater memory efficiency when the application
	// knows that all values can fit in 16 bits.
	DecodeInt16(data []int16) (int, error)

	// Decodes an array of 32 bits integer values using this decoder, returning
	// the number of decoded values, and io.EOF if the end of the underlying
	// io.Reader was reached.
	DecodeInt32(data []int32) (int, error)

	// Decodes an array of 64 bits integer values using this decoder, returning
	// the number of decoded values, and io.EOF if the end of the underlying
	// io.Reader was reached.
	DecodeInt64(data []int64) (int, error)

	// Decodes an array of 96 bits integer values using this decoder, returning
	// the number of decoded values, and io.EOF if the end of the underlying
	// io.Reader was reached.
	DecodeInt96(data []deprecated.Int96) (int, error)

	// Decodes an array of 32 bits floating point values using this decoder,
	// returning the number of decoded values, and io.EOF if the end of the
	// underlying io.Reader was reached.
	DecodeFloat(data []float32) (int, error)

	// Decodes an array of 64 bits floating point values using this decoder,
	// returning the number of decoded values, and io.EOF if the end of the
	// underlying io.Reader was reached.
	DecodeDouble(data []float64) (int, error)

	// Decodes an array of variable length byte array values using this decoder,
	// returning the number of decoded values, and io.EOF if the end of the
	// underlying io.Reader was reached.
	//
	// The values are written to the `data` buffer by calling the Push method,
	// the method returns the number of values written. DecodeByteArray will
	// stop pushing values to the output ByteArrayList if its total capacity is
	// reached.
	DecodeByteArray(data *ByteArrayList) (int, error)

	// Decodes an array of fixed length byte array values using this decoder,
	// returning the number of decoded values, and io.EOF if the end of the
	// underlying io.Reader was reached.
	DecodeFixedLenByteArray(size int, data []byte) (int, error)

	// Configures the bit-width on the decoder.
	//
	// Not all encodings require declaring the bit-width, but applications that
	// use the Decoder abstraction should not make assumptions about the
	// underlying type of the decoder, and therefore should call SetBitWidth
	// prior to decoding repetition and definition levels.
	SetBitWidth(bitWidth int)
}
The Decoder interface is implemented by decoder types.
type Encoder ¶
type Encoder interface {
	// Calling Reset clears the encoder state and changes the io.Writer where
	// encoded values are written to the one given as argument.
	//
	// The io.Writer may be nil, in which case the encoder must not be used
	// until Reset is called again with a non-nil writer.
	//
	// Calling Reset does not override the bit-width configured on the encoder.
	Reset(io.Writer)

	// Encodes an array of boolean values using this encoder.
	EncodeBoolean(data []bool) error

	// Encodes an array of 8 bits integer values using this encoder.
	//
	// The parquet type system does not have 8 bits integers, this method
	// is intended to encode INT32 values but receives them as an array of
	// int8 values to enable greater memory efficiency when the application
	// knows that all values can fit in 8 bits.
	EncodeInt8(data []int8) error

	// Encodes an array of 16 bits integer values using this encoder.
	//
	// The parquet type system does not have 16 bits integers, this method
	// is intended to encode INT32 values but receives them as an array of
	// int16 values to enable greater memory efficiency when the application
	// knows that all values can fit in 16 bits.
	EncodeInt16(data []int16) error

	// Encodes an array of 32 bit integer values using this encoder.
	EncodeInt32(data []int32) error

	// Encodes an array of 64 bit integer values using this encoder.
	EncodeInt64(data []int64) error

	// Encodes an array of 96 bit integer values using this encoder.
	EncodeInt96(data []deprecated.Int96) error

	// Encodes an array of 32 bit floating point values using this encoder.
	EncodeFloat(data []float32) error

	// Encodes an array of 64 bit floating point values using this encoder.
	EncodeDouble(data []float64) error

	// Encodes an array of variable length byte array values using this encoder.
	EncodeByteArray(data ByteArrayList) error

	// Encodes an array of fixed length byte array values using this encoder.
	//
	// The list is encoded contiguously in the `data` byte slice, in chunks of
	// `size` elements.
	EncodeFixedLenByteArray(size int, data []byte) error

	// Configures the bit-width on the encoder.
	//
	// Not all encodings require declaring the bit-width, but applications that
	// use the Encoder abstraction should not make assumptions about the
	// underlying type of the encoder, and therefore should call SetBitWidth
	// prior to encoding repetition and definition levels.
	SetBitWidth(bitWidth int)
}
The Encoder interface is implemented by encoder types.
Some encodings only support a subset of the parquet types.
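The comment on EncodeFixedLenByteArray describes values packed back to back in a single byte slice. The sketch below shows how such a buffer could be split into individual values, assuming that size is the byte length of each value (the usual meaning for parquet FIXED_LEN_BYTE_ARRAY columns). splitFixedLen is a hypothetical helper, not part of this package.

```go
package main

import "fmt"

// splitFixedLen returns the values packed contiguously in data, each of
// which is size bytes long. The returned slices alias data.
func splitFixedLen(size int, data []byte) ([][]byte, error) {
	if size <= 0 || len(data)%size != 0 {
		return nil, fmt.Errorf("invalid size %d for %d bytes of data", size, len(data))
	}
	values := make([][]byte, 0, len(data)/size)
	for i := 0; i < len(data); i += size {
		values = append(values, data[i:i+size])
	}
	return values, nil
}

func main() {
	// Three 4-byte values stored back to back.
	data := []byte("aaaabbbbcccc")
	values, err := splitFixedLen(4, data)
	if err != nil {
		panic(err)
	}
	for _, v := range values {
		fmt.Println(string(v))
	}
}
```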
type Encoding ¶
type Encoding interface {
	// Returns a human-readable name for the encoding.
	String() string

	// Returns the parquet code representing the encoding.
	Encoding() format.Encoding

	// Checks whether the encoding is capable of serializing parquet values of
	// the given type.
	CanEncode(format.Type) bool

	// Creates a decoder reading encoded values from the io.Reader passed as
	// argument.
	//
	// The io.Reader may be nil, in which case the decoder's Reset method must
	// be called with a non-nil io.Reader prior to decoding values.
	NewDecoder(io.Reader) Decoder

	// Creates an encoder writing values to the io.Writer passed as argument.
	//
	// The io.Writer may be nil, in which case the encoder's Reset method must
	// be called with a non-nil io.Writer prior to encoding values.
	NewEncoder(io.Writer) Encoder
}
The Encoding interface is implemented by types representing parquet column encodings.
Encoding instances must be safe to use concurrently from multiple goroutines.
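CanEncode lets a caller check type support before creating an encoder. The sketch below illustrates selecting the first capable encoding from a list of candidates. Everything in it is a local stand-in: valueType replaces format.Type, canEncoder is a subset of the Encoding interface, and boolOnly and plainLike are hypothetical encodings.

```go
package main

import "fmt"

// valueType is a local stand-in for format.Type.
type valueType int

const (
	Boolean valueType = iota
	Int32
	ByteArray
)

// canEncoder is the subset of the Encoding interface used by this sketch.
type canEncoder interface {
	String() string
	CanEncode(valueType) bool
}

// boolOnly is a hypothetical encoding that supports boolean values only.
type boolOnly struct{}

func (boolOnly) String() string             { return "BOOL_ONLY" }
func (boolOnly) CanEncode(t valueType) bool { return t == Boolean }

// plainLike is a hypothetical encoding that supports every type.
type plainLike struct{}

func (plainLike) String() string           { return "PLAIN_LIKE" }
func (plainLike) CanEncode(valueType) bool { return true }

// pick returns the first candidate able to encode t, or nil if none can.
func pick(t valueType, candidates ...canEncoder) canEncoder {
	for _, c := range candidates {
		if c.CanEncode(t) {
			return c
		}
	}
	return nil
}

func main() {
	fmt.Println(pick(Boolean, boolOnly{}, plainLike{})) // BOOL_ONLY
	fmt.Println(pick(Int32, boolOnly{}, plainLike{}))   // PLAIN_LIKE
}
```

Both stand-in encodings are stateless value types, which is one simple way to satisfy the requirement that Encoding instances be safe for concurrent use.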
type NotSupported ¶
type NotSupported struct { }
NotSupported is a type satisfying the Encoding interface which supports neither encoding nor decoding of any value types.
func (NotSupported) Encoding ¶
func (NotSupported) Encoding() format.Encoding
func (NotSupported) NewDecoder ¶
func (NotSupported) NewDecoder(io.Reader) Decoder
func (NotSupported) NewEncoder ¶
func (NotSupported) NewEncoder(io.Writer) Encoder
func (NotSupported) String ¶
func (NotSupported) String() string
type NotSupportedDecoder ¶
type NotSupportedDecoder struct { }
NotSupportedDecoder is an implementation of the Decoder interface which does not support decoding any value types.
Many parquet encodings only support decoding a subset of the parquet types. They can embed this type to default to not supporting any decoding, then override specific Decode* methods to provide implementations for the types they do support.
func (NotSupportedDecoder) DecodeBoolean ¶
func (NotSupportedDecoder) DecodeBoolean([]bool) (int, error)
func (NotSupportedDecoder) DecodeByteArray ¶
func (NotSupportedDecoder) DecodeByteArray(*ByteArrayList) (int, error)
func (NotSupportedDecoder) DecodeDouble ¶
func (NotSupportedDecoder) DecodeDouble([]float64) (int, error)
func (NotSupportedDecoder) DecodeFixedLenByteArray ¶
func (NotSupportedDecoder) DecodeFixedLenByteArray(size int, data []byte) (int, error)
func (NotSupportedDecoder) DecodeFloat ¶
func (NotSupportedDecoder) DecodeFloat([]float32) (int, error)
func (NotSupportedDecoder) DecodeInt16 ¶
func (NotSupportedDecoder) DecodeInt16([]int16) (int, error)
func (NotSupportedDecoder) DecodeInt32 ¶
func (NotSupportedDecoder) DecodeInt32([]int32) (int, error)
func (NotSupportedDecoder) DecodeInt64 ¶
func (NotSupportedDecoder) DecodeInt64([]int64) (int, error)
func (NotSupportedDecoder) DecodeInt8 ¶
func (NotSupportedDecoder) DecodeInt8([]int8) (int, error)
func (NotSupportedDecoder) DecodeInt96 ¶
func (NotSupportedDecoder) DecodeInt96([]deprecated.Int96) (int, error)
func (NotSupportedDecoder) Encoding ¶
func (NotSupportedDecoder) Encoding() format.Encoding
func (NotSupportedDecoder) Reset ¶
func (NotSupportedDecoder) Reset(io.Reader)
func (NotSupportedDecoder) SetBitWidth ¶
func (NotSupportedDecoder) SetBitWidth(int)
type NotSupportedEncoder ¶
type NotSupportedEncoder struct { }
NotSupportedEncoder is an implementation of the Encoder interface which does not support encoding any value types.
Many parquet encodings only support encoding a subset of the parquet types. They can embed this type to default to not supporting any encoding, then override specific Encode* methods to provide implementations for the types they do support.
func (NotSupportedEncoder) EncodeBoolean ¶
func (NotSupportedEncoder) EncodeBoolean([]bool) error
func (NotSupportedEncoder) EncodeByteArray ¶
func (NotSupportedEncoder) EncodeByteArray(ByteArrayList) error
func (NotSupportedEncoder) EncodeDouble ¶
func (NotSupportedEncoder) EncodeDouble([]float64) error
func (NotSupportedEncoder) EncodeFixedLenByteArray ¶
func (NotSupportedEncoder) EncodeFixedLenByteArray(int, []byte) error
func (NotSupportedEncoder) EncodeFloat ¶
func (NotSupportedEncoder) EncodeFloat([]float32) error
func (NotSupportedEncoder) EncodeInt16 ¶
func (NotSupportedEncoder) EncodeInt16([]int16) error
func (NotSupportedEncoder) EncodeInt32 ¶
func (NotSupportedEncoder) EncodeInt32([]int32) error
func (NotSupportedEncoder) EncodeInt64 ¶
func (NotSupportedEncoder) EncodeInt64([]int64) error
func (NotSupportedEncoder) EncodeInt8 ¶
func (NotSupportedEncoder) EncodeInt8([]int8) error
func (NotSupportedEncoder) EncodeInt96 ¶
func (NotSupportedEncoder) EncodeInt96([]deprecated.Int96) error
func (NotSupportedEncoder) Encoding ¶
func (NotSupportedEncoder) Encoding() format.Encoding
func (NotSupportedEncoder) Reset ¶
func (NotSupportedEncoder) Reset(io.Writer)
func (NotSupportedEncoder) SetBitWidth ¶
func (NotSupportedEncoder) SetBitWidth(int)
Directories ¶
Path | Synopsis |
---|---|
plain | Package plain implements the PLAIN parquet encoding. |
rle | Package rle implements the hybrid RLE/Bit-Packed encoding employed in repetition and definition levels, dictionary indexed data pages, and boolean values in the PLAIN encoding. |