trace

package
v0.0.0-...-22ab5e0 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Mar 5, 2024 License: Apache-2.0 Imports: 8 Imported by: 0

README

trace: push arbitrary trace level data to an influxdb instance

This package has code to create a client that can be used to push events to an influxdb instance. It is used to collect trace data from many different nodes in a network. If there is no URL in the config.toml, then the underlying client is nil and no points will be written. The provided chainID and nodeID are used to tag all points. The underlying client is exposed to allow for custom writes, but the WritePoint method should be used for most cases, as it enforces the schema.

Usage and Schema

To use this package, first create a new client using the NewClient function, then pass that client to the relevant components that need to push events. After that, you can use the WritePoint method to push events to influxdb. In the below example, we're pushing a point in the consensus reactor to measure exactly when each step of consensus is reached for each node.

client.WritePoint(RoundStateTable, map[string]interface{}{
		HeightFieldKey: height,
		RoundFieldKey:  round,
		StepFieldKey:   step.String(),
})

Using this method enforces the typical schema, where we are tagging (aka indexing) each point by the chain-id and the node-id, then adding the local time of the creation of the event. If you need to push a custom point, you can use the underlying client directly. See influxdb2.WriteAPI for more details.

Schema

All points in influxdb are divided into a key value pair per field. These kvs are indexed first by a "measurement", which is used as a "table" in other dbs. Additional indexes can also be added, we're using the chain-id and node-id here. This allows for us to quickly query for trace data for a specific chain and/or node.

from(bucket: "e2e")
  |> range(start: -1h)
  |> filter(
    fn: (r) => r["_measurement"] == "consensus_round_state"
      and r.chain_id == "ci-YREG8X"
      and r.node_id == "0b529c309608172a29c49979394734260b42acfb"
    )

We can easily retrieve all fields in a relatively standard table format by using the pivot fluxQL command.

from(bucket: "mocha")
  |> range(start: -1h)
  |> filter(fn: (r) => r._measurement == "consensus_round_state")
  |> pivot(rowKey:["_time"], columnKey: ["_field"], valueColumn: "_value")
Querying Data Using Python

Python can be used to quickly search for and isolate specific patterns.

from influxdb_client import InfluxDBClient
from influxdb_client.client.write_api import SYNCHRONOUS

client = InfluxDBClient(url="http://your-influx-url:8086/", token="your-influx-token", org="celestia")

query_api = client.query_api()

def create_flux_table_query(start, bucket, measurement, filter_clause):
    flux_table_query = f'''
    from(bucket: "{bucket}")
      |> range(start: {start})
      |> filter(fn: (r) => r._measurement == "{measurement}")
      {filter_clause}
      |> pivot(rowKey:["_time"], columnKey: ["_field"], valueColumn: "_value")
    '''
    return flux_table_query

query = create_flux_table_query("-1h", "mocha", "consenus_round_state", "")
result = query_api.query(query=query)
Running a node with remote tracing on

Tracing will only occur if an influxdb URL in specified either directly in the config.toml or as flags provided to the start sub command.

Configure in the config.toml
#######################################################
###       Instrumentation Configuration Options     ###
#######################################################
[instrumentation]

...

# The URL of the influxdb instance to use for remote event
# collection. If empty, remote event collection is disabled.
influx_url = "http://your-influx-ip:8086/"

# The influxdb token to use for remote event collection.
influx_token = "your-token"

# The influxdb bucket to use for remote event collection.
influx_bucket = "e2e"

# The influxdb org to use for event remote collection.
influx_org = "celestia"

# The size of the batches that are sent to the database.
influx_batch_size = 20

# The list of tables that are updated when tracing. All available tables and
# their schema can be found in the pkg/trace/schema package.
influx_tables = ["consensus_round_state", "mempool_tx", ]

or

celestia-appd start --influxdb-url=http://your-influx-ip:8086/ --influxdb-token="your-token"
e2e tests

To push events from e2e tests, we only need to specify the URL and the token via the cli.

cd test/e2e
make && ./build/runner -f ./networks/ci.toml --influxdb-url=http://your-influx-ip:8086/ --influxdb-token="your-token"

Documentation

Overview

trace: push arbitrary trace level data to an influxdb instance

This package has code to create a client that can be used to push events to an influxdb instance. It is used to collect trace data from many different nodes in a network. If there is no URL in the config.toml, then the underlying client is nil and no points will be written. The provided chainID and nodeID are used to tag all points. The underlying client is exposed to allow for custom writes, but the WritePoint method should be used for most cases, as it enforces the schema.

## Usage and Schema

To use this package, first create a new client using the `NewClient` function, then pass that client to the relevant components that need to push events. After that, you can use the `WritePoint` method to push events to influxdb. In the below example, we're pushing a point in the consensus reactor to measure exactly when each step of consensus is reached for each node.

```go

if cs.traceClient.IsCollecting() {
		cs.traceClient.WritePoint("consensus", map[string]interface{}{
			"roundData": []interface{}{rs.Height, rs.Round, rs.Step},
		})
}

```

Using this method enforces the typical schema, where we are tagging (aka indexing) each point by the chain-id and the node-id, then adding the local time of the creation of the event. If you need to push a custom point, you can use the underlying client directly. See influxdb2.WriteAPI for more details.

### Schema

All points in influxdb are divided into a key value pair per field. These kvs are indexed first by a "measurement", which is used as a "table" in other dbs. Additional indexes can also be added, we're using the chain-id and node-id here. This allows for us to quickly query for trace data for a specific chain and/or node.

```flux from(bucket: "e2e")

|> range(start: -1h)
|> filter(
  fn: (r) => r["_measurement"] == "consensus"
    and r.chain_id == "ci-YREG8X"
    and r.node_id == "0b529c309608172a29c49979394734260b42acfb"
  )

```

### Running a node with remote tracing on

Tracing will only occur if an influxdb URL in specified either directly in the `config.toml` or as flags provided to the start sub command.

configure in the config.toml

```toml ####################################################### ### Instrumentation Configuration Options ### ####################################################### [instrumentation]

...

# The URL of the influxdb instance to use for remote event # collection. If empty, remote event collection is disabled. influx_url = "http://your-influx-ip:8086/"

# The influxdb token to use for remote event collection. influx_token = "your-token"

# The influxdb bucket to use for remote event collection. influx_bucket = "e2e"

# The influxdb org to use for event remote collection. influx_org = "celestia"

# The size of the batches that are sent to the database. influx_batch_size = 20 ```

or ```sh celestia-appd start --influxdb-url=http://your-influx-ip:8086/ --influxdb-token="your-token" ```

### e2e tests

To push events from e2e tests, we only need to specify the URL and the token via the cli.

```bash cd test/e2e make && ./build/runner -f ./networks/ci.toml --influxdb-url=http://your-influx-ip:8086/ --influxdb-token="your-token" ```

Index

Constants

View Source
const (
	NodeIDTag  = "node_id"
	ChainIDTag = "chain_id"
)
View Source
const (
	FlagInfluxDBURL              = "influxdb-url"
	FlagInfluxDBToken            = "influxdb-token"
	FlagInfluxDBURLDescription   = "URL of the InfluxDB instance to use for arbitrary data collection. If not specified, data will not be collected"
	FlagInfluxDBTokenDescription = "Token to use when writing to the InfluxDB instance. Must be specified if 'influxdb-url' is specified" //nolint:gosec

	FlagPyroscopeURL              = "pyroscope-url"
	FlagPyroscopeURLDescription   = "URL of the Pyroscope instance to use for continuous profiling. If not specified, profiling will not be enabled"
	FlagPyroscopeTrace            = "pyroscope-trace"
	FlagPyroscopeTraceDescription = "enable adding trace data to pyroscope profiling"
)

Variables

This section is empty.

Functions

This section is empty.

Types

type Client

type Client struct {

	// Client is the influxdb client. This field is nil if no connection is
	// established.
	Client influxdb2.Client
	// contains filtered or unexported fields
}

Client is an influxdb client that can be used to push events to influxdb. It is used to collect trace data from many different nodes in a network. If there is no URL in the config.toml, then the underlying client is nil and no points will be written. The provided chainID and nodeID are used to tag all points. The underlying client is exposed to allow for custom writes, but the WritePoint method should be used for most cases, as it enforces the schema.

func NewClient

func NewClient(cfg *config.InstrumentationConfig, logger log.Logger, chainID, nodeID string) (*Client, error)

NewClient creates a new influxdb client using the provided config. If there is no URL configured, then the underlying client will be nil, and each attempt to write a point will do nothing. The provided chainID and nodeID are used to tag all points.

func (*Client) IsCollecting

func (c *Client) IsCollecting(table string) bool

IsCollecting returns true if the client is collecting events.

func (*Client) Stop

func (c *Client) Stop()

Stop closes the influxdb client.

func (*Client) WritePoint

func (c *Client) WritePoint(table string, fields map[string]interface{})

WritePoint async writes a point to influxdb. To enforce the schema, it automatically adds the chain_id and node_id tags, along with setting the timestamp to the current time. If the underlying client is nil, it does nothing. The "table" arg is used as the influxdb "measurement" for the point. If other tags are needed, use WriteCustomPoint.

type ClientConfigConfig

type ClientConfigConfig struct {
	// URL is the influxdb url.
	URL string `mapstructure:"influx_url"`
	// Token is the influxdb token.
	Token string `mapstructure:"influx_token"`
	// Org is the influxdb organization.
	Org string `mapstructure:"influx_org"`
	// Bucket is the influxdb bucket.
	Bucket string `mapstructure:"influx_bucket"`
	// BatchSize is the number of points to write in a single batch.
	BatchSize int `mapstructure:"influx_batch_size"`
}

ClientConfigConfig is the influxdb client configuration used for collecting events.

Directories

Path Synopsis

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL