moxpopuli

package module
v0.0.0-...-6670d34 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Nov 5, 2022 License: MIT Imports: 5 Imported by: 0

README

Status GoDoc license

Mox Populi

See the presentation we gave at AsyncAPI Conf 2022 about Mox Populi: Building AsyncAPI Specs from Real-World API Data

And once you have an idea what it is, try Mox Populi from your browser.

Mox Populi is a tool to build comprehensive JSON Schemas from real-world events and payloads. That is, instead of the centralized, top-down approach of using a specification to drive development, you can generate a specification from actual running services. Mox Populi represents a democratic (or anarchic) approach to specification generation and API governance.

In pratical terms, Mox Populi is useful for any project concerned with ingesting and processing data from external systems where schema are unreliable or unpublished. It can be used for exploratory purposes, to better understand the data coming into a system, for building powerful regression testing systems, or as the basis for a hand-built specification.

Examples of where Mox Populi has been useful include:

  • Integrating products that do not have easy-to-use staging environments, so you cannot easily generate test data or load to send into your systems.
  • Products that do not have programmatic ways to generate load or test data (the API is read-only or read-mostly).
  • Generating enough load to test a backend in a way that would be impractical otherwise.
  • Fixture data for something similar to property-based testing, where the Mox Populi-generated schema can be used to fixture data used for unit and integration testing.

Mox Populi is currently used by WebhookDB to power its extensive unit and integration regression testing systems, and to make AsyncAPI specifications available for supported services.

Examples

Generate a JSONSchema using each object in a JSON Lines file:

moxpopuli schemagen --payload-loader=file://./inputs.jsonl

Load and save the JSONSchema to a file:

moxpopuli schemagen \
    --payload-loader=file://./testdata/complexpayloads.jsonl \
    --loader=file://./schema-from-inputs.json \
    --saver=file://./schema-from-inputs.json

Generate a JSONSchema for each row in a JSONB column in a Postgres database:

moxpopuli schemagen \
    --payload-loader=postgres://webhookdb:webhookdb@localhost:18005/webhookdb \
    --payload-loader-arg="SELECT request_body FROM logged_webhooks WHERE truncated_at IS NULL LIMIT 100"

Generate a full AsyncAPI spec from some events stored in a file:

moxpopuli specgen \
    --loader='file://./testdata/whdbspecseed.json' \
    --event-loader='file://./testdata/requests.jsonl'

Limitations

Schema generation is based on a supported subset of JSON Schema.

Schema generation does not support:

  • References. Do not include references in your schemas. moxpopuli will never write out references. Because we cannot be sure about the schemas we see, using references could result in too much diffing and confusion.
  • Non-object payloads. It's extremely rare for this to be a problem; if it is, support can be added.
  • YAML. Easy enough to add support but for now we're JSON-only.

Spec generation is limited only to what can be figured out from events, which is usually channels, subscription operations, and their data like bindings, headers, and payloads. Usually you will supply a 'seed' specification that includes an info section and whatever else is needed.

Mox Populi Server

Running moxpopuli server starts a server on $PORT, or 22021 by default. It exposes four endpoints, which roughly correspond to the schemagen, specgen, and datagen CLI commands.

To see the OpenAPI document describing the API, and test it out yourself, go to https://moxpopuli.webhookdb.com/swaggerui/index.html#/default/postV1SchemagenQuickstart

The OpenAPI specification is built mostly from the actual runtime Go structs using the Sashay tool. It's really neat since it uses actual runtime code that your server uses, not separate documentation!

You can run moxpopuli server openapi to generate the OpenAPI spec, or use moxpopuli server swaggerui to open a browser to the SwaggerUI page against localhost.

How it Works

The linked presentation covers this at a high level. But here are some lower level and implementation notes.

Building Schemas from Payloads

The core of moxpoopuli is generating JSON Schemas from real-world production payloads, such as generating a message payload from a request body, or message headers from request headers.

The gist of it is:

  • Schema calculation is always based on an existing schema (which can be empty), and a real-world payload.
  • moxpopuli generates a new schema from the real-world payload.
  • moxpopuli "merges" the two schemas such that the resulting schema is accurate for all seen payloads. For example, if the first payload has a property with the value of 1,000, we'd assume it's an int32. But if the next payload has the same property with a value of 15 billion, the new and resulting schema would be an int64.
  • moxpopuli tries to be as descriptive as possible about the payloads it analyzes. For example, any values that are not accomodated by a schema get added to examples, it uses additional data types for values (uuid, email, etc), and uses x- extension fields to store information it needs for future analysis or generating meaningful sample data.
Loaders and Savers

moxpopuli uses a system of loaders and savers to load information like specifications or events, and then save them after processing. It does this via "loaders" and "savers".

Loaders and savers are usually URLs. Each loader/saver can also take an additional argument, which can be used to configure the saver/loader. One example is using a postgres:// URL to connect to a database, with an argument that is the SQL query to select or update the required data.

Single Objects Load and Save

When moxpopuli is working with things like JSONSchema or an AsyncAPI specification, it loads and saves them as a single JSON object. This is always done with the --saver/-s and --loader/-l arguments, plus --saver-arg/-sa and --loader-arg/-la to provide saver and loader arguments.

Examples of valid loader options are:

  • moxpopuli schemagen -l file://./myschema.json would read myschema.json from the current directory and write it to STDOUT.
  • printf '{"x":1}' | moxpopuli schemagen -l - would read STDIN as a schema.
  • moxpopuli schemagen -l=_ -la='{"x":1}' would parse {"x":1} as JSON as the schema.
  • moxpopuli schemagen -l=postgres://u:p@localhost:5432/myapp -la='SELECT schema FROM asyncapischemas WHERE id=1' would parse the selected row/column as a schema. Note that for single objects, only one column must be returned.
  • If no loader is given, or '.' is used, do not load anything.

Examples of valid saver options are:

  • moxpopuli schemagen -s=- would write to stdout.
  • moxpopuli schemagen -s=file://./myschema.json would save to myschema.json.
  • moxpopuli schemagen -s=postgres://u:p@localhost:5432/myapp -sa='UPDATE asyncapischemas SET schema=$1 WHERE id=1 would run that query with the updated schema as the argument. Note that for single objects, there should be only one positional argument.
Iterator Loaders

When generating a schema or specification, moxpopuli runs over many events/payloads. In these cases, we use the same loader system, but expect the data to be a collection.

Examples of iterator loaders would be:

  • -pl=file://./requests.jsonl would treat each line in the JSONLines file as a separate object.
  • -pl=file://./requests.json would expect the file to be a JSON array.
  • -pl=_ -pla='{"x":1}\n{"x":2}' would expect each line in the loader argument to be a JSON object, like JSONLines.
  • -pl=- would expect each line from STDIN to be a JSON object, like JSONLines.
  • -pl=postgres://u:p@localhost:5432/myapp -pla='SELECT body FROM requests WHERE service=stripe LIMIT 10' would use select rows from Postgres.

There are two types of iterator loaders: Payloads and Events. The only difference is that:

  • Payloads are freeform. The entire JSON object is the payload. If using a postgres loader, the query must return a single column, which is parsed as JSON to get the payload.
  • Event loads expect a certain set of keys, based on the binding. If using a postgres loader, the select/column names should match the keys; if loading JSON directly through other loaders, the loaded JSON should match the keys.
  • Event loader keys are:
    • http binding: path (string), method (string), headers ({string:string} map), body ({string:any} map)

Development

Check out the Makefile, it has everything you will need.

If it doesn't, we should add to the Makefile 😬

Supported by

Documentation

Index

Constants

This section is empty.

Variables

View Source
var BuildSha = "00000000"
View Source
var BuildTime = time.Unix(0, 0).Format(time.RFC3339)

Functions

This section is empty.

Types

type Config

type Config struct {
	CorsOrigins []string
	LogFile     string
	LogFormat   string
	LogLevel    string
}

func LoadConfig

func LoadConfig() Config

Directories

Path Synopsis
Package asyncapispec wraps the minimum of the AsyncAPI specification that Mox Populi has to work with.
Package asyncapispec wraps the minimum of the AsyncAPI specification that Mox Populi has to work with.
cmd

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL