A service proxy with a failure injection API
This is a reference implementation of a client-side service proxy. It is
meant to be used with Gremlin, a systematic resiliency testing framework.
Every microservice instance making outbound API calls needs to have an
associated gremlin proxy. Typically, it runs in the same VM or container
alongside the calling process, and communicates over the loopback interface
with the caller.
Remote services and their instances have to be statically configured in the
configuration file. The service proxy acts as an HTTP/HTTPS request router
that forwards requests arriving at localhost:port to
remotehost:port. It has built-in support for load balancing requests
across remote service instances in a round robin manner. There is no
support for sticky sessions or client-side TLS. Note that while the proxy
can connect to HTTPS endpoints, the caller must connect to the proxy on
localhost via HTTP only. See example-config.json
for an example of how to support HTTPS upstream endpoints while connecting
to the proxy via http://localhost:port.
Failure injection
Requests that carry a pre-defined HTTP header are subjected to
various forms of fault injection. Requests can be aborted (the caller
gets back an HTTP 404, HTTP 503, etc.), delayed, or rewritten. The
proxy can be controlled remotely using a REST API, through which rules
for the various fault injection actions are installed. The
[Gremlin resiliency testing framework](https://github.com/ResilienceTesting/gremlinsdk)
provides a Python-based control plane library for writing high-level
recipes, which are automatically broken down into low-level fault
injection commands to be executed by the gremlin proxy.
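As a rough illustration of how a caller opts a request into fault injection, the sketch below builds a request addressed to the local proxy and tags it with the trigger header. The port 7777 and the header name X-Gremlin-ID are taken from the example configuration described later in this document; the `tagged_request` helper, the `/get` path, and the ID value are hypothetical placeholders.

```python
import urllib.request

# Assumptions: the port and header name mirror example-config.json;
# the helper name, path, and ID value are illustrative only.
PROXY_URL = "http://localhost:7777"
GREMLIN_HEADER = "X-Gremlin-ID"


def tagged_request(path, request_id):
    """Build a request to the local proxy that carries the gremlin header."""
    req = urllib.request.Request(PROXY_URL + path)
    # urllib normalizes the capitalization of header names; HTTP header
    # names are case-insensitive, so the proxy should still match it.
    req.add_header(GREMLIN_HEADER, request_id)
    return req
```

Passing the result of `tagged_request("/get", "testUser-1")` to `urllib.request.urlopen` would send it through a running proxy. A request built without the header would be forwarded untouched.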
Usage
Configuration
The services section of the config file describes a list of remote
services that need to be proxied. Each element in the list is a JSON
dictionary object, describing a single service.
The proxy block under each service specifies the local port at which
requests for the remote service will be received, the IP address to bind to
(defaults to localhost), and the proxy protocol. Valid protocol values are
"http" and "tcp". While the proxy can work with HTTP/HTTPS and generic TCP
endpoints, fault injection support for TCP endpoints is limited to
aborting or delaying connections at the beginning of a TCP session.
The loadbalancer section configures the set of hosts that provide
the remote service as well as the load balancing method (currently
roundrobin and random load balancing modes are supported). When the proxy
protocol is set to "http", you can specify hosts with or without a
scheme prefix (http:// or https://). When the scheme prefix is absent,
"http://" is prepended to the host entry. For example, if a host entry
is of the form 192.168.0.1:9080, request URLs will be of the form
http://192.168.0.1:9080. If you would like to proxy requests to
HTTPS endpoints, host entries in the loadbalancer section must be
prefixed with "https://" (e.g., https://myacc.cloudant.com).
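The host handling described above can be sketched as follows. This is not the proxy's actual code; `normalize_host` and `make_picker` are hypothetical names, and the sketch only mirrors the documented behavior: entries without a scheme get an http:// prefix, and hosts are chosen in either roundrobin or random mode.

```python
import itertools
import random


def normalize_host(entry):
    """Prefix a host entry with http:// when it carries no scheme."""
    if entry.startswith(("http://", "https://")):
        return entry
    return "http://" + entry


def make_picker(hosts, mode="roundrobin"):
    """Return a zero-argument function that yields the next upstream URL."""
    urls = [normalize_host(h) for h in hosts]
    if not urls:
        raise ValueError("at least one host is required")
    if mode == "roundrobin":
        cycle = itertools.cycle(urls)
        return lambda: next(cycle)
    if mode == "random":
        return lambda: random.choice(urls)
    raise ValueError("unsupported load balancing mode: " + mode)
```

With the hosts `["192.168.0.1:9080", "https://myacc.cloudant.com"]` in roundrobin mode, successive calls to the picker alternate between `http://192.168.0.1:9080` and `https://myacc.cloudant.com`.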
The router block configures the REST interface of the gremlin
proxy. By default, the service proxy exposes the REST API on port
9876. The gremlinheader parameter specifies the HTTP
header that triggers the fault injection actions. Requests that do not
contain this header are left untouched. The name parameter indicates
the name of the microservice for which this service proxy is being
used.
The loglevel, logjson, and logstash fields configure logging for the
service proxy. All logs from the service proxy can be
sent directly to a logstash server and then piped to
Elasticsearch. The Gremlin framework's assertion engine can directly
interface with Elasticsearch to execute assertions over the logs
generated by the gremlinproxy.
An example configuration file is provided in
example-config.json. It configures a proxy for
a Client microservice (as indicated by the name parameter in the
router block). The proxy listens for requests to the Server
microservice at 0.0.0.0:7777 and forwards them to either
54.175.222.246:80 or https://httpbin.org. All requests from the
Client microservice that contain the HTTP header X-Gremlin-ID will
be subjected to fault injection.
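To make the shape of such a configuration concrete, the sketch below assembles one as a Python dictionary and renders it as JSON. Only the top-level names (services, proxy, loadbalancer, router, gremlinheader, name, loglevel, logjson, logstash) and the values 7777, 0.0.0.0, 9876, X-Gremlin-ID, and the two hosts come from the description above. The nested key names (`port`, `bindhost`, `protocol`, `hosts`, `mode`), the per-service `name` key, and the logging values are assumptions; treat example-config.json as the authoritative reference.

```python
import json

# Hypothetical layout: nested key names and logging values are guesses
# for illustration and may differ from the shipped example-config.json.
EXAMPLE_CONFIG = {
    "services": [
        {
            "name": "Server",
            "proxy": {"port": 7777, "bindhost": "0.0.0.0", "protocol": "http"},
            "loadbalancer": {
                "hosts": ["54.175.222.246:80", "https://httpbin.org"],
                "mode": "roundrobin",
            },
        }
    ],
    "router": {"name": "Client", "port": 9876, "gremlinheader": "X-Gremlin-ID"},
    "loglevel": "info",            # placeholder value
    "logjson": True,               # placeholder value
    "logstash": "localhost:8092",  # placeholder address
}


def render_config(cfg):
    """Serialize the configuration dictionary as indented JSON text."""
    return json.dumps(cfg, indent=2)
```

Writing the output of `render_config(EXAMPLE_CONFIG)` to a file would produce something that could be passed to the proxy with the -c flag, provided the key names match what the proxy actually expects.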
Building and running the proxy
- Before you run the proxy, start the logstash server and Elasticsearch:
docker-compose -f compose-logstash-elasticsearch.yml up -d
- Set up your Go environment and GOPATH variable.
- Clone the repository to the $GOPATH/go/src/github.com/gremlin folder.
- Build:
go get && go build
- Run:
./gremlinproxy -c yourconfig.json
Proxy REST API
GET /gremlin/v1
: simple hello world test
POST /gremlin/v1/rules/add
: add a rule. The rule must be posted as JSON in the following format:
{
source: <source service name>,
dest: <destination service name>,
messagetype: <request|response|publish|subscribe|stream>,
headerpattern: <regex to match against the value of the X-Gremlin-ID tracking header present in HTTP headers>,
bodypattern: <regex to match against HTTP message body>,
delayprobability: <float, 0.0 to 1.0>,
delaydistribution: <uniform|exponential|normal, probability distribution function>,
mangleprobability: <float, 0.0 to 1.0>,
mangledistribution: <uniform|exponential|normal, probability distribution function>,
abortprobability: <float, 0.0 to 1.0>,
abortdistribution: <uniform|exponential|normal, probability distribution function>,
delaytime: <string, latency to inject into requests, e.g., "10ms", "1s", "5m", "3h", "1s500ms">,
errorcode: <number, HTTP error code or -1 to reset TCP connection>,
searchstring: <string to replace when Mangle is enabled>,
replacestring: <string to replace with for Mangle fault>
}
POST /gremlin/v1/rules/remove
: remove the rule specified in the message body (see rule format above)
GET /gremlin/v1/rules/list
: list all installed rules
DELETE /gremlin/v1/rules
: clear all rules
GET /gremlin/v1/proxy/:service/instances
: get the list of instances for :service
PUT /gremlin/v1/proxy/:service/:instances
: set the list of instances for :service. :instances is a comma-separated list.
DELETE /gremlin/v1/proxy/:service/instances
: clear the list of instances under :service
PUT /gremlin/v1/test/:id
: set a new test :id, which will be logged along with request/response logs
DELETE /gremlin/v1/test/:id
: remove the currently set test :id
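A minimal sketch of driving these endpoints from Python follows. It only constructs the requests and does not send them, so it can be read without a running proxy. The base URL assumes the default REST port 9876 on localhost. The helper names (`build_request`, `abort_rule`, `set_instances_request`) are hypothetical, and the rule shown sets only a subset of the fields listed above, on the assumption that the proxy tolerates omitted fields; that assumption is not confirmed by this document.

```python
import json
import urllib.request

# Assumption: the proxy's REST API is reachable on the default port.
BASE = "http://localhost:9876/gremlin/v1"


def build_request(method, path, payload=None):
    """Build (but do not send) a request against the proxy REST API."""
    data = None
    if payload is not None:
        data = json.dumps(payload).encode("utf-8")
    req = urllib.request.Request(BASE + path, data=data, method=method)
    if data is not None:
        req.add_header("Content-Type", "application/json")
    return req


def abort_rule(source, dest, header_regex, error_code=503):
    """Describe a rule that aborts every matching request."""
    return {
        "source": source,
        "dest": dest,
        "messagetype": "request",
        "headerpattern": header_regex,
        "abortprobability": 1.0,
        "abortdistribution": "uniform",
        "errorcode": error_code,
    }


def set_instances_request(service, instances):
    """Build the PUT that replaces the instance list of a service."""
    return build_request("PUT", "/proxy/" + service + "/" + ",".join(instances))


add_req = build_request("POST", "/rules/add", abort_rule("Client", "Server", "test-.*"))
list_req = build_request("GET", "/rules/list")
clear_req = build_request("DELETE", "/rules")
```

Each request object can be handed to `urllib.request.urlopen` once a proxy is listening. Keeping construction separate from sending makes it easy to inspect the URL and JSON body before installing a rule.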