Overview

Tyk Streams configuration is specified using YAML. The configuration consists of several main sections: input, pipeline, output and, optionally, logger.

Input

The input section defines the publisher source of the data stream. Tyk Streams supports various input types such as Kafka, HTTP, MQTT etc. Each input type has specific configuration parameters.
input:
  kafka:
    addresses:
      - localhost:9092
    topics:
      - example_topic
    consumer_group: example_group
    client_id: example_client

Pipeline

The pipeline section defines the processing steps applied to the data. It includes processors for filtering, mapping, enriching and transforming the data. Processors can be chained together.
pipeline:
  processors:
    - mapping: |
        root = this
        root.foo = this.bar.uppercase()
    - json_schema:
        schema_path: "./schemas/example_schema.json"

Output

The output section specifies the destination of the processed data. Similar to inputs, Tyk Streams supports various output types like Kafka, HTTP etc.
output:
  kafka:
    addresses:
      - localhost:9092
    topic: output_topic
    client_id: example_output_client

Logger (Optional)

The logger section is used to configure logging options, such as log level and output format.
logger:
  level: INFO
  format: json
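
Taken together, the four sections form a single YAML document. The following sketch simply combines the fragments shown above; the broker address, topic names and schema path are the same placeholder values used in those fragments, not recommendations.

```yaml
# Input: consume from a Kafka topic
input:
  kafka:
    addresses:
      - localhost:9092
    topics:
      - example_topic
    consumer_group: example_group
    client_id: example_client

# Pipeline: transform each message, then validate it against a schema
pipeline:
  processors:
    - mapping: |
        root = this
        root.foo = this.bar.uppercase()
    - json_schema:
        schema_path: "./schemas/example_schema.json"

# Output: publish the processed messages to another topic
output:
  kafka:
    addresses:
      - localhost:9092
    topic: output_topic
    client_id: example_output_client

# Logger (optional)
logger:
  level: INFO
  format: json
```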

Inputs

Overview

An input is a source of data piped through an array of optional processors:
input:
  label: my_kafka_input

  kafka:
    addresses: [ localhost:9092 ]
    topics: [ foo, bar ]
    consumer_group: foogroup

  # Optional list of processing steps
  processors:
    - avro:
        operator: to_json

Brokering

Only one input is configured at the root of a Tyk Streams config. However, the root input can be a broker which combines multiple inputs and merges the streams:
input:
  broker:
    inputs:
      - kafka:
          addresses: [ localhost:9092 ]
          topics: [ foo, bar ]
          consumer_group: foogroup

      - http_client:
          url: https://localhost:8085
          verb: GET
          stream:
            enabled: true

Labels

Inputs have an optional field label that can uniquely identify them in observability data such as logs.
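
As a sketch, the brokered inputs shown earlier could each be given a label so that the two sources can be told apart in logs. The label values are illustrative, and this assumes that child inputs of a broker accept a label in the same way as a root input does.

```yaml
input:
  broker:
    inputs:
      - label: orders_kafka        # hypothetical label
        kafka:
          addresses: [ localhost:9092 ]
          topics: [ foo, bar ]
          consumer_group: foogroup

      - label: orders_http         # hypothetical label
        http_client:
          url: https://localhost:8085
          verb: GET
```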

Broker

Allows you to combine multiple inputs into a single stream of data, where each input will be read in parallel.

Common

# Common config fields, showing default values
input:
  label: ""
  broker:
    inputs: [] # No default (required)
    batching:
      count: 0
      byte_size: 0
      period: ""
      check: ""

Advanced

# All config fields, showing default values
input:
  label: ""
  broker:
    copies: 1
    inputs: [] # No default (required)
    batching:
      count: 0
      byte_size: 0
      period: ""
      check: ""
      processors: [] # No default (optional)
A broker type is configured with its own list of input configurations and a field to specify how many copies of the list of inputs should be created. Adding more input types allows you to combine streams from multiple sources into one. For example, reading from both RabbitMQ and Kafka:
input:
  broker:
    copies: 1
    inputs:
      - amqp_0_9:
          urls:
            - amqp://guest:guest@localhost:5672/
          consumer_tag: tyk-consumer
          queue: tyk-queue

        # Optional list of input specific processing steps
        processors:
          - mapping: |
              root.message = this
              root.meta.link_count = this.links.length()
              root.user.age = this.user.age.number()

      - kafka:
          addresses:
            - localhost:9092
          client_id: tyk_kafka_input
          consumer_group: tyk_consumer_group
          topics: [ tyk_stream:0 ]
If the number of copies is greater than zero the list will be copied that number of times. For example, if your inputs were of type foo and bar, with ‘copies’ set to ‘2’, you would end up with two ‘foo’ inputs and two ‘bar’ inputs.
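
For instance, the following sketch would create two identical Kafka consumers from a single input definition. The topic and consumer group names are placeholders.

```yaml
input:
  broker:
    copies: 2 # The inputs list below is created twice
    inputs:
      - kafka:
          addresses: [ localhost:9092 ]
          topics: [ foo ]
          consumer_group: foogroup
```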
Batching
It’s possible to configure a batch policy with a broker using the batching fields. When doing this the feeds from all child inputs are combined. Some inputs do not support broker based batching and specify this in their documentation.
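
A minimal sketch of a broker-level batch policy, assuming both child inputs support broker-based batching: messages from the two feeds are combined and a batch is flushed once it holds 10 messages or one second has passed. The addresses and names are placeholders.

```yaml
input:
  broker:
    inputs:
      - kafka:
          addresses: [ localhost:9092 ]
          topics: [ foo ]
          consumer_group: foogroup

      - http_client:
          url: https://localhost:8085
          verb: GET

    # Applies to the combined feed of all child inputs
    batching:
      count: 10
      period: 1s
```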
Processors
It is possible to configure processors at the broker level, where they will be applied to all child inputs, as well as on the individual child inputs. If you have processors at both the broker level and on child inputs, the processors of each child input are applied first, followed by the broker processors.
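
The two levels can be sketched as follows. The fields written by the mappings (source, via_broker) are hypothetical, and the mappings assume JSON payloads.

```yaml
input:
  broker:
    inputs:
      - kafka:
          addresses: [ localhost:9092 ]
          topics: [ foo ]
          consumer_group: foogroup
        # Child level: only applied to messages from this Kafka input
        processors:
          - mapping: |
              root = this
              root.source = "kafka"

      - http_client:
          url: https://localhost:8085
          verb: GET

  # Broker level: applied to messages from every child input
  processors:
    - mapping: |
        root = this
        root.via_broker = true
```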

Fields

copies
Whatever is specified within inputs will be created this many times. Type: int Default: 1
inputs
A list of inputs to create. Type: array
batching
Allows you to configure a batching policy. Type: object
# Examples

batching:
  byte_size: 5000
  count: 0
  period: 1s

batching:
  count: 10
  period: 1s

batching:
  check: this.contains("END BATCH")
  count: 0
  period: 1m
batching.count
The number of messages at which the batch should be flushed. Set to 0 to disable count-based batching. Type: int Default: 0
batching.byte_size
The number of bytes at which the batch should be flushed. Set to 0 to disable size-based batching. Type: int Default: 0
batching.period
A period in which an incomplete batch should be flushed regardless of its size. Type: string Default: ""
# Examples

period: 1s

period: 1m

period: 500ms
batching.processors
A list of processors to apply to a batch as it is flushed. This allows you to aggregate and archive the batch however you see fit. Please note that all resulting messages are flushed as a single batch, therefore splitting the batch into smaller batches using these processors is a no-op. Type: array
# Examples

processors:
  - archive:
      format: concatenate

processors:
  - archive:
      format: lines

processors:
  - archive:
      format: json_array

HTTP Client

Connects to a server and continuously performs requests for a single message.

Common

# Common config fields, showing default values
input:
  label: ""
  http_client:
    url: "" # No default (required)
    verb: GET
    headers: {}
    timeout: 5s
    payload: "" # No default (optional)
    stream:
      enabled: false
      reconnect: true
    auto_replay_nacks: true

Advanced

# All config fields, showing default values
input:
  label: ""
  http_client:
    url: "" # No default (required)
    verb: GET
    headers: {}
    metadata:
      include_prefixes: []
      include_patterns: []
    dump_request_log_level: ""
    oauth:
      enabled: false
      consumer_key: ""
      consumer_secret: ""
      access_token: ""
      access_token_secret: ""
    oauth2:
      enabled: false
      client_key: ""
      client_secret: ""
      token_url: ""
      scopes: []
      endpoint_params: {}
    basic_auth:
      enabled: false
      username: ""
      password: ""
    jwt:
      enabled: false
      private_key_file: ""
      signing_method: ""
      claims: {}
      headers: {}
    tls:
      enabled: false
      skip_cert_verify: false
      enable_renegotiation: false
      root_cas: ""
      root_cas_file: ""
      client_certs: []
    extract_headers:
      include_prefixes: []
      include_patterns: []
    timeout: 5s
    retry_period: 1s
    max_retry_backoff: 300s
    retries: 3
    backoff_on:
      - 429
    drop_on: []
    successful_on: []
    proxy_url: "" # No default (optional)
    payload: "" # No default (optional)
    drop_empty_bodies: true
    stream:
      enabled: false
      reconnect: true
    auto_replay_nacks: true
Streaming
If you enable streaming then Tyk Streams will consume the body of the response as a continuous stream of data. This allows you to consume APIs that provide long lived streamed data feeds (such as Twitter).
Pagination
This input supports interpolation functions in the url and headers fields where data from the previous successfully consumed message (if there was one) can be referenced. This can be used in order to support basic levels of pagination.

Examples

Basic Pagination
Interpolation functions within the url and headers fields can be used to reference the previously consumed message, which allows simple pagination.
input:
  http_client:
    url: >-
      http://api.example.com/search?query=allmyfoos&start_time=${! (
        (timestamp_unix()-300).ts_format("2006-01-02T15:04:05Z","UTC").escape_url_query()
      ) }${! ("&next_token="+this.meta.next_token.not_null()) | "" }
    verb: GET

Fields

url
The URL to connect to. Type: string
verb
A verb to connect with. Type: string Default: "GET"
# Examples

verb: POST

verb: GET

verb: DELETE
headers
A map of headers to add to the request. Type: object Default: {}
# Examples

headers:
  Content-Type: application/octet-stream
  traceparent: ${! tracing_span().traceparent }
metadata
Specify optional matching rules to determine which metadata keys should be added to the HTTP request as headers. Type: object
metadata.include_prefixes
Provide a list of explicit metadata key prefixes to match against. Type: array Default: []
# Examples

include_prefixes:
  - foo_
  - bar_

include_prefixes:
  - kafka_

include_prefixes:
  - content-
metadata.include_patterns
Provide a list of explicit metadata key regular expression (re2) patterns to match against. Type: array Default: []
# Examples

include_patterns:
  - .*

include_patterns:
  - _timestamp_unix$
dump_request_log_level
Optionally set a level at which the request and response payload of each request made will be logged. Type: string Default: "" Options: TRACE, DEBUG, INFO, WARN, ERROR, FATAL, "".
oauth
Allows you to specify open authentication via OAuth version 1. Type: object
oauth.enabled
Whether to use OAuth version 1 in requests. Type: bool Default: false
oauth.consumer_key
A value used to identify the client to the service provider. Type: string Default: ""
oauth.consumer_secret
A secret used to establish ownership of the consumer key. Type: string Default: ""
oauth.access_token
A value used to gain access to the protected resources on behalf of the user. Type: string Default: ""
oauth.access_token_secret
A secret provided in order to establish ownership of a given access token. Type: string Default: ""
oauth2
Allows you to specify open authentication via OAuth version 2 using the client credentials token flow. Type: object
oauth2.enabled
Whether to use OAuth version 2 in requests. Type: bool Default: false
oauth2.client_key
A value used to identify the client to the token provider. Type: string Default: ""
oauth2.client_secret
A secret used to establish ownership of the client key. Type: string Default: ""
oauth2.token_url
The URL of the token provider. Type: string Default: ""
oauth2.scopes
A list of optional requested permissions. Type: array Default: []
oauth2.endpoint_params
A list of optional endpoint parameters, values should be arrays of strings. Type: object Default: {}
# Examples

endpoint_params:
  bar:
    - woof
  foo:
    - meow
    - quack
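
Bringing the oauth2 fields together, an http_client input using the client credentials flow might be configured as in the sketch below. The URLs, client key, scope and the CLIENT_SECRET environment variable are all hypothetical placeholders.

```yaml
input:
  http_client:
    url: https://api.example.com/events             # hypothetical endpoint
    verb: GET
    oauth2:
      enabled: true
      client_key: my_client_id                       # hypothetical client key
      client_secret: ${CLIENT_SECRET}                # read from the environment
      token_url: https://auth.example.com/oauth/token
      scopes:
        - read
```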
basic_auth
Allows you to specify basic authentication. Type: object
basic_auth.enabled
Whether to use basic authentication in requests. Type: bool Default: false
basic_auth.username
A username to authenticate as. Type: string Default: ""
basic_auth.password
A password to authenticate with. Type: string Default: ""
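
A short sketch of basic authentication on an http_client input. The URL and the API_USER and API_PASSWORD environment variable names are assumptions made for illustration.

```yaml
input:
  http_client:
    url: https://api.example.com/events   # hypothetical endpoint
    basic_auth:
      enabled: true
      username: ${API_USER}
      password: ${API_PASSWORD}
```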
jwt
Allows you to specify JWT authentication. Type: object
jwt.enabled
Whether to use JWT authentication in requests. Type: bool Default: false
jwt.private_key_file
A file containing a PEM-encoded private key in PKCS#1 or PKCS#8 format. Type: string Default: ""
jwt.signing_method
A method used to sign the token such as RS256, RS384, RS512 or EdDSA. Type: string Default: ""
jwt.claims
The claims to include in the JWT, as key/value pairs. Type: object Default: {}
jwt.headers
Add optional key/value headers to the JWT. Type: object Default: {}
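
As an illustration of how the jwt fields fit together, the sketch below signs requests with an RSA key. The key path, the claim and the header values are hypothetical, and which claims are required depends on the server being called.

```yaml
input:
  http_client:
    url: https://api.example.com/events   # hypothetical endpoint
    jwt:
      enabled: true
      private_key_file: ./keys/private.pem
      signing_method: RS256
      claims:
        sub: tyk-streams-client            # hypothetical claim
      headers:
        kid: example-key-id                # hypothetical JWT header
```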
tls
Custom TLS settings can be used to override system defaults. Type: object
tls.enabled
Whether custom TLS settings are enabled. Type: bool Default: false
tls.skip_cert_verify
Whether to skip server side certificate verification. Type: bool Default: false
tls.enable_renegotiation
Whether to allow the remote server to repeatedly request renegotiation. Enable this option if you’re seeing the error message local error: tls: no renegotiation. Type: bool Default: false
tls.root_cas
An optional root certificate authority to use. This is a string, representing a certificate chain from the parent trusted root certificate, to possible intermediate signing certificates, to the host certificate. Type: string Default: ""
# Examples

root_cas: |-
  -----BEGIN CERTIFICATE-----
  ...
  -----END CERTIFICATE-----
tls.root_cas_file
An optional path of a root certificate authority file to use. This is a file, often with a .pem extension, containing a certificate chain from the parent trusted root certificate, to possible intermediate signing certificates, to the host certificate. Type: string Default: ""
# Examples

root_cas_file: ./root_cas.pem
tls.client_certs
A list of client certificates to use. For each certificate either the fields cert and key, or cert_file and key_file should be specified, but not both. Type: array Default: []
# Examples

client_certs:
  - cert: foo
    key: bar

client_certs:
  - cert_file: ./example.pem
    key_file: ./example.key
tls.client_certs[].cert
A plain text certificate to use. Type: string Default: ""
tls.client_certs[].key
A plain text certificate key to use. Type: string Default: ""
tls.client_certs[].cert_file
The path of a certificate to use. Type: string Default: ""
tls.client_certs[].key_file
The path of a certificate key to use. Type: string Default: ""
tls.client_certs[].password
A plain text password for when the private key is password encrypted in PKCS#1 or PKCS#8 format. The obsolete pbeWithMD5AndDES-CBC algorithm is not supported for the PKCS#8 format. Warning: Since it does not authenticate the ciphertext, it is vulnerable to padding oracle attacks that can let an attacker recover the plaintext. Type: string Default: ""
# Examples

password: foo
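
Combined, the tls fields might be used as in this sketch, which trusts a custom root CA and presents a client certificate. The file paths reuse the placeholder values from the examples above; the URL is hypothetical.

```yaml
input:
  http_client:
    url: https://api.example.com/events   # hypothetical endpoint
    tls:
      enabled: true
      root_cas_file: ./root_cas.pem
      client_certs:
        # Use either cert/key or cert_file/key_file, not both
        - cert_file: ./example.pem
          key_file: ./example.key
```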
extract_headers
Specify which response headers should be added to resulting messages as metadata. Header keys are lowercased before matching, so ensure that your patterns target lowercased versions of the header keys that you expect. Type: object
extract_headers.include_prefixes
Provide a list of explicit metadata key prefixes to match against. Type: array Default: []
# Examples

include_prefixes:
  - foo_
  - bar_

include_prefixes:
  - kafka_

include_prefixes:
  - content-
extract_headers.include_patterns
Provide a list of explicit metadata key regular expression (re2) patterns to match against. Type: array Default: []
# Examples

include_patterns:
  - .*

include_patterns:
  - _timestamp_unix$
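
For example, the following sketch copies every response header that starts with content-, plus one specific header, into message metadata. The x-request-id header is a hypothetical choice; the patterns are written in lowercase because header keys are lowercased before matching.

```yaml
input:
  http_client:
    url: https://api.example.com/events   # hypothetical endpoint
    extract_headers:
      include_prefixes:
        - content-
      include_patterns:
        - ^x-request-id$
```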
timeout
A static timeout to apply to requests. Type: string Default: "5s"
retry_period
The base period to wait between failed requests. Type: string Default: "1s"
max_retry_backoff
The maximum period to wait between failed requests. Type: string Default: "300s"
retries
The maximum number of retry attempts to make. Type: int Default: 3
backoff_on
A list of status codes whereby the request should be considered to have failed and retries should be attempted, but the period between them should be increased gradually. Type: array Default: [429]
drop_on
A list of status codes whereby the request should be considered to have failed but retries should not be attempted. This is useful for preventing wasted retries for requests that will never succeed. Note that with these status codes the request is dropped, but the message that caused the request will not be dropped. Type: array Default: []
successful_on
A list of status codes whereby the attempt should be considered successful. This is useful for dropping requests that return non-2XX codes indicating that the message has been dealt with, such as a 303 See Other or a 409 Conflict. All 2XX codes are considered successful unless they are present within backoff_on or drop_on, regardless of this field. Type: array Default: []
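
The timeout and retry fields above can be combined as in the following sketch. The specific status codes and the shorter maximum backoff are illustrative choices, not recommended values.

```yaml
input:
  http_client:
    url: https://api.example.com/events   # hypothetical endpoint
    timeout: 5s
    retries: 3
    retry_period: 1s                # base wait between failed requests
    max_retry_backoff: 30s          # upper bound on the wait
    backoff_on: [ 429, 503 ]        # retry, with gradually increasing waits
    drop_on: [ 400, 404 ]           # fail without retrying
    successful_on: [ 409 ]          # treat as success despite being non-2XX
```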
proxy_url
An optional HTTP proxy URL. Type: string
payload
An optional payload to deliver for each request. Type: string
drop_empty_bodies
Whether empty payloads received from the target server should be dropped. Type: bool Default: true
stream
Allows you to set streaming mode, where requests are kept open and messages are processed line-by-line. Type: object
stream.enabled
Enables streaming mode. Type: bool Default: false
stream.reconnect
Sets whether to re-establish the connection once it is lost. Type: bool Default: true
auto_replay_nacks
Whether messages that are rejected (nacked) at the output level should be automatically replayed indefinitely, eventually resulting in back pressure if the cause of the rejections is persistent. If set to false these messages will instead be deleted. Disabling auto replays can greatly improve memory efficiency of high throughput streams as the original shape of the data can be discarded immediately upon consumption and mutation. Type: bool Default: true

HTTP Server

Receive messages POSTed over HTTP(S). HTTP 2.0 is supported when using TLS, which is enabled when key and cert files are specified.

Common

# Common config fields, showing default values
input:
  label: ""
  http_server:
    address: ""
    path: /post
    ws_path: /post/ws
    allowed_verbs:
      - POST
    timeout: 5s

Advanced

# All config fields, showing default values
input:
  label: ""
  http_server:
    address: ""
    path: /post
    ws_path: /post/ws
    ws_welcome_message: ""
    allowed_verbs:
      - POST
    timeout: 5s
    cert_file: ""
    key_file: ""
    cors:
      enabled: false
      allowed_origins: []
    sync_response:
      status: "200"
      headers:
        Content-Type: application/octet-stream
      metadata_headers:
        include_prefixes: []
        include_patterns: []
Responses
Endpoints
The following fields specify endpoints that are registered for sending messages, and support path parameters of the form /{foo}, which are added to ingested messages as metadata. A path ending in / will match against all extensions of that path:
path (defaults to /post)
This endpoint expects POST requests where the entire request body is consumed as a single message. If the request contains a multipart content-type header as per RFC 1341 then the multiple parts are consumed as a batch of messages, where each body part is a message of the batch.
ws_path (defaults to /post/ws)
Creates a websocket connection, where payloads received on the socket are passed through the pipeline as a batch of one message.
Please note that components within a Tyk Streams config will register their respective endpoints in a non-deterministic order. This means that establishing precedence of endpoints that are registered via multiple http_server inputs or outputs (either within brokers or from cohabiting streams) is not possible in a predictable way. This ambiguity makes it difficult to ensure that paths which are both a subset of a path registered by a separate component, and end in a slash (/) and will therefore match against all extensions of that path, do not prevent the more specific path from matching against requests. It is therefore recommended that you ensure paths of separate components do not collide unless they are explicitly non-competing.
For example, if you were to deploy two separate http_server inputs, one with a path /foo/ and the other with a path /foo/bar, it would not be possible to ensure that the path /foo/ does not swallow requests made to /foo/bar.
You may specify an optional ws_welcome_message, which is a static payload to be sent to all clients once a websocket connection is first established.
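
One way to follow the recommendation about non-colliding paths is sketched below: two http_server inputs whose paths share no common prefix and do not end in a slash. The paths are hypothetical, and the {id} segment uses the path parameter form described above, so its value would be added to each message as metadata.

```yaml
input:
  broker:
    inputs:
      - http_server:
          path: /orders/{id}
          allowed_verbs: [ POST ]

      - http_server:
          path: /payments/{id}
          allowed_verbs: [ POST ]
```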
Metadata
This input adds the following metadata fields to each message:
- http_server_user_agent
- http_server_request_path
- http_server_verb
- http_server_remote_ip
- All headers (only first values are taken)
- All query parameters
- All path parameters
- All cookies
If HTTPS is enabled, the following fields are added as well:
- http_server_tls_version
- http_server_tls_subject
- http_server_tls_cipher_suite
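
These metadata fields can be read in a mapping with the @ prefix, as the examples in this section do. The sketch below wraps the raw request body together with a few of them; the output field names (body, request_path, verb, remote_ip) are arbitrary choices for illustration.

```yaml
input:
  http_server:
    path: /post

  processors:
    - mapping: |
        root.body = content().string()
        root.request_path = @http_server_request_path
        root.verb = @http_server_verb
        root.remote_ip = @http_server_remote_ip
```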

Examples

Path Switching
This example shows an http_server input that captures all requests and processes them by switching on the request path:
input:
  http_server:
    path: /
    allowed_verbs: [ GET, POST ]
    sync_response:
      headers:
        Content-Type: application/json

  processors:
    - switch:
      - check: '@http_server_request_path == "/foo"'
        processors:
          - mapping: |
              root.title = "You Got Fooed!"
              root.result = content().string().uppercase()

      - check: '@http_server_request_path == "/bar"'
        processors:
          - mapping: 'root.title = "Bar Is Slow"'
          - sleep: # Simulate a slow endpoint
              duration: 1s
Mock OAuth 2.0 Server
This example shows an http_server input that mocks an OAuth 2.0 Client Credentials flow server at the endpoint /oauth2_test:
input:
  http_server:
    path: /oauth2_test
    allowed_verbs: [ GET, POST ]
    sync_response:
      headers:
        Content-Type: application/json

  processors:
    - log:
        message: "Received request"
        level: INFO
        fields_mapping: |
          root = @
          root.body = content().string()

    - mapping: |
        root.access_token = "MTQ0NjJkZmQ5OTM2NDE1ZTZjNGZmZjI3"
        root.token_type = "Bearer"
        root.expires_in = 3600

    - sync_response: {}
    - mapping: 'root = deleted()'

Fields

address
An alternative address to host from. If left empty the service wide address is used. Type: string Default: ""
path
The endpoint path to listen for POST requests. Type: string Default: "/post"
ws_path
The endpoint path to create websocket connections from. Type: string Default: "/post/ws"
ws_welcome_message
An optional message to deliver to fresh websocket connections. Type: string Default: ""
allowed_verbs
An array of verbs that are allowed for the path endpoint. Type: array Default: ["POST"] Requires version 3.33.0 or newer
timeout
Timeout for requests. If a consumed message takes longer than this to be delivered the connection is closed, but the message may still be delivered. Type: string Default: "5s"
cert_file
Enable TLS by specifying a certificate and key file. Only valid with a custom address. Type: string Default: ""
key_file
Enable TLS by specifying a certificate and key file. Only valid with a custom address. Type: string Default: ""
cors
Adds Cross-Origin Resource Sharing headers. Only valid with a custom address. Type: object Requires version 3.63.0 or newer
cors.enabled
Whether to allow CORS requests. Type: bool Default: false
cors.allowed_origins
An explicit list of origins that are allowed for CORS requests. Type: array Default: []
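
Because cert_file, key_file and cors are only valid with a custom address, they are typically set together with address. The sketch below assumes a hypothetical listen address, certificate paths and allowed origin.

```yaml
input:
  http_server:
    address: localhost:8443          # hypothetical custom address
    path: /post
    cert_file: ./example.pem         # enables TLS together with key_file
    key_file: ./example.key
    cors:
      enabled: true
      allowed_origins:
        - https://app.example.com    # hypothetical origin
```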
sync_response
Customize messages returned via synchronous responses. Type: object
sync_response.status
Specify the status code to return with synchronous responses. This is a string value, which allows you to customize it based on resulting payloads and their metadata. Type: string Default: "200"
# Examples

status: ${! json("status") }

status: ${! meta("status") }
sync_response.headers
Specify headers to return with synchronous responses. Type: object Default: {"Content-Type":"application/octet-stream"}
sync_response.metadata_headers
Specify criteria for which metadata values are added to the response as headers. Type: object
sync_response.metadata_headers.include_prefixes
Provide a list of explicit metadata key prefixes to match against. Type: array Default: []
# Examples

include_prefixes:
  - foo_
  - bar_

include_prefixes:
  - kafka_

include_prefixes:
  - content-
sync_response.metadata_headers.include_patterns
Provide a list of explicit metadata key regular expression (re2) patterns to match against. Type: array Default: []
# Examples

include_patterns:
  - .*

include_patterns:
  - _timestamp_unix$
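
Put together, the sync_response fields might look like the following sketch. It assumes that processors elsewhere in the config set a status metadata key and metadata keys prefixed with foo_, and that the stream is set up to return synchronous responses; none of that is shown here.

```yaml
input:
  http_server:
    path: /post
    sync_response:
      status: ${! meta("status") }   # status code taken from message metadata
      headers:
        Content-Type: application/json
      metadata_headers:
        include_prefixes:
          - foo_                     # metadata keys returned as response headers
```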

Kafka

Connects to Kafka brokers and consumes one or more topics.

Common

# Common config fields, showing default values
input:
  label: ""
  kafka:
    addresses: [] # No default (required)
    topics: [] # No default (required)
    target_version: 2.1.0 # No default (optional)
    consumer_group: ""
    checkpoint_limit: 1024
    auto_replay_nacks: true

Advanced

# All config fields, showing default values
input:
  label: ""
  kafka:
    addresses: [] # No default (required)
    topics: [] # No default (required)
    target_version: 2.1.0 # No default (optional)
    tls:
      enabled: false
      skip_cert_verify: false
      enable_renegotiation: false
      root_cas: ""
      root_cas_file: ""
      client_certs: []
    sasl:
      mechanism: none
      user: ""
      password: ""
      access_token: ""
      token_cache: ""
      token_key: ""
    consumer_group: ""
    client_id: tyk
    rack_id: ""
    start_from_oldest: true
    checkpoint_limit: 1024
    auto_replay_nacks: true
    commit_period: 1s
    max_processing_period: 100ms
    extract_tracing_map: root = @ # No default (optional)
    group:
      session_timeout: 10s
      heartbeat_interval: 3s
      rebalance_timeout: 60s
    fetch_buffer_cap: 256
    multi_header: false
    batching:
      count: 0
      byte_size: 0
      period: ""
      check: ""
      processors: [] # No default (optional)
Offsets are managed within Kafka under the specified consumer group, and partitions for each topic are automatically balanced across members of the consumer group.
The Kafka input allows parallel processing of messages from different topic partitions, and messages of the same topic partition are processed with a maximum parallelism determined by the field checkpoint_limit. In order to enforce ordered processing of partition messages set the checkpoint_limit to 1 and this will force partitions to be processed in lock-step, where a message will only be processed once the prior message is delivered.
Batching messages before processing can be enabled using the batching field, and this batching is performed per-partition such that messages of a batch will always originate from the same partition. This batching mechanism is capable of creating batches of greater size than the checkpoint_limit, in which case the next batch will only be created upon delivery of the current one.
Metadata
This input adds the following metadata fields to each message:
- kafka_key
- kafka_topic
- kafka_partition
- kafka_offset
- kafka_lag
- kafka_timestamp_unix
- kafka_tombstone_message
- All existing message headers (version 0.11+)
The field kafka_lag is the calculated difference between the high water mark offset of the partition at the time of ingestion and the current message offset.
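
As a sketch, these metadata fields could be copied into the message body with a mapping. The target field names under kafka are arbitrary, and the mapping assumes the consumed messages are JSON objects.

```yaml
input:
  kafka:
    addresses: [ localhost:9092 ]
    topics: [ foo ]
    consumer_group: foogroup

  processors:
    - mapping: |
        root = this
        root.kafka.topic = @kafka_topic
        root.kafka.partition = @kafka_partition
        root.kafka.offset = @kafka_offset
        root.kafka.lag = @kafka_lag
```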
Ordering
By default messages of a topic partition can be processed in parallel, up to a limit determined by the field checkpoint_limit. However, if strict ordered processing is required then this value must be set to 1 in order to process shard messages in lock-step. When doing so it is recommended that you perform batching at this component for performance as it will not be possible to batch lock-stepped messages at the output level.
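
A minimal sketch of strictly ordered consumption with batching performed at the input, as recommended above. The batch size and period are illustrative values.

```yaml
input:
  kafka:
    addresses: [ localhost:9092 ]
    topics: [ foo ]
    consumer_group: foogroup
    checkpoint_limit: 1   # process each partition in lock-step
    batching:
      count: 10           # flush after 10 messages...
      period: 1s          # ...or after one second
```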
Troubleshooting
  • I’m seeing logs that report Failed to connect to kafka: kafka: client has run out of available brokers to talk to (Is your cluster reachable?), but the brokers are definitely reachable.
Unfortunately this error message will appear for a wide range of connection problems even when the broker endpoint can be reached. Double check your authentication configuration and also ensure that you have enabled TLS if applicable.

Fields

addresses
A list of broker addresses to connect to. If an item of the list contains commas it will be expanded into multiple addresses. Type: array
# Examples

addresses:
  - localhost:9092

addresses:
  - localhost:9041,localhost:9042

addresses:
  - localhost:9041
  - localhost:9042
topics
A list of topics to consume from. Multiple comma separated topics can be listed in a single element. Partitions are automatically distributed across consumers of a topic. Alternatively, it’s possible to specify explicit partitions to consume from with a colon after the topic name, e.g. foo:0 would consume the partition 0 of the topic foo. This syntax supports ranges, e.g. foo:0-10 would consume partitions 0 through to 10 inclusive. Type: array Requires version 3.33.0 or newer
# Examples

topics:
  - foo
  - bar

topics:
  - foo,bar

topics:
  - foo:0
  - bar:1
  - bar:3

topics:
  - foo:0,bar:1,bar:3

topics:
  - foo:0-5
target_version
The version of the Kafka protocol to use. This limits the capabilities used by the client and should ideally match the version of your brokers. Defaults to the oldest supported stable version. Type: string
# Examples

target_version: 2.1.0

target_version: 3.1.0
tls
Custom TLS settings can be used to override system defaults. Type: object
tls.enabled
Whether custom TLS settings are enabled. Type: bool Default: false
tls.skip_cert_verify
Whether to skip server side certificate verification. Type: bool Default: false
tls.enable_renegotiation
Whether to allow the remote server to repeatedly request renegotiation. Enable this option if you’re seeing the error message local error: tls: no renegotiation. Type: bool Default: false Requires version 3.45.0 or newer
tls.root_cas
An optional root certificate authority to use. This is a string, representing a certificate chain from the parent trusted root certificate, to possible intermediate signing certificates, to the host certificate. Type: string Default: ""
# Examples

root_cas: |-
  -----BEGIN CERTIFICATE-----
  ...
  -----END CERTIFICATE-----
tls.root_cas_file
An optional path of a root certificate authority file to use. This is a file, often with a .pem extension, containing a certificate chain from the parent trusted root certificate, to possible intermediate signing certificates, to the host certificate. Type: string Default: ""
# Examples

root_cas_file: ./root_cas.pem
tls.client_certs
A list of client certificates to use. For each certificate either the fields cert and key, or cert_file and key_file should be specified, but not both. Type: array Default: []
# Examples

client_certs:
  - cert: foo
    key: bar

client_certs:
  - cert_file: ./example.pem
    key_file: ./example.key
tls.client_certs[].cert
A plain text certificate to use. Type: string Default: ""
tls.client_certs[].key
A plain text certificate key to use. Type: string Default: ""
tls.client_certs[].cert_file
The path of a certificate to use. Type: string Default: ""
tls.client_certs[].key_file
The path of a certificate key to use. Type: string Default: ""
tls.client_certs[].password
A plain text password for when the private key is password encrypted in PKCS#1 or PKCS#8 format. The obsolete pbeWithMD5AndDES-CBC algorithm is not supported for the PKCS#8 format. Warning: Since it does not authenticate the ciphertext, it is vulnerable to padding oracle attacks that can let an attacker recover the plaintext. Type: string Default: ""
# Examples

password: foo
sasl
Enables SASL authentication. Type: object
sasl.mechanism
The SASL authentication mechanism. If left empty, SASL authentication is not used. Type: string Default: "none"
Option         Summary
OAUTHBEARER    OAuth Bearer based authentication.
PLAIN          Plain text authentication. NOTE: When using plain text auth it is extremely likely that you’ll also need to enable TLS.
SCRAM-SHA-256  Authentication using the SCRAM-SHA-256 mechanism.
SCRAM-SHA-512  Authentication using the SCRAM-SHA-512 mechanism.
none           Default, no SASL authentication.
sasl.user
A PLAIN username. It is recommended that you use environment variables to populate this field. Type: string Default: ""
# Examples

user: ${USER}
sasl.password
A PLAIN password. It is recommended that you use environment variables to populate this field. Type: string Default: ""
# Examples

password: ${PASSWORD}
sasl.access_token
A static OAUTHBEARER access token. Type: string Default: ""
sasl.token_key
Required when using a token_cache, the key to query the cache with for tokens. Type: string Default: ""
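
Since PLAIN authentication usually needs TLS as well, the two are often enabled together. A sketch, using the USER and PASSWORD environment variables from the examples above and placeholder broker and topic names:

```yaml
input:
  kafka:
    addresses: [ localhost:9092 ]
    topics: [ foo ]
    consumer_group: foogroup
    tls:
      enabled: true
    sasl:
      mechanism: PLAIN
      user: ${USER}
      password: ${PASSWORD}
```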
consumer_group
An identifier for the consumer group of the connection. This field can be explicitly made empty in order to disable stored offsets for the consumed topic partitions. Type: string Default: ""
client_id
An identifier for the client connection. Type: string Default: "tyk"
rack_id
A rack identifier for this client. Type: string Default: ""
start_from_oldest
Determines whether to consume from the oldest available offset, otherwise messages are consumed from the latest offset. The setting is applied when creating a new consumer group or the saved offset no longer exists. Type: bool Default: true
checkpoint_limit
The maximum number of messages of the same topic and partition that can be processed at a given time. Increasing this limit enables parallel processing and batching at the output level to work on individual partitions. Any given offset will not be committed unless all messages under that offset are delivered in order to preserve at least once delivery guarantees. Type: int Default: 1024 Requires version 3.33.0 or newer
auto_replay_nacks
Whether messages that are rejected (nacked) at the output level should be automatically replayed indefinitely, eventually resulting in back pressure if the cause of the rejections is persistent. If set to false these messages will instead be deleted. Disabling auto replays can greatly improve memory efficiency of high throughput streams as the original shape of the data can be discarded immediately upon consumption and mutation. Type: bool Default: true
commit_period
The period of time between each commit of the current partition offsets. Offsets are always committed during shutdown. Type: string Default: "1s"
max_processing_period
A maximum estimate for the time taken to process a message, this is used for tuning consumer group synchronization. Type: string Default: "100ms"
group
Tuning parameters for consumer group synchronization. Type: object
group.session_timeout
A period after which a consumer of the group is kicked after no heartbeats. Type: string Default: "10s"
group.heartbeat_interval
A period in which heartbeats should be sent out. Type: string Default: "3s"
group.rebalance_timeout
A period after which rebalancing is abandoned if unresolved. Type: string Default: "60s"
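The offset and consumer group fields above could be combined as in the following sketch. It uses the documented defaults except for commit_period. The 5s value and the topic and group names are illustrative assumptions, not recommendations.
# Sketch: consumer group tuning (placeholder topic and group names)
input:
  kafka:
    addresses: [ localhost:9092 ]
    topics: [ example_topic ]
    consumer_group: example_group
    start_from_oldest: true
    checkpoint_limit: 1024
    commit_period: 5s        # illustrative value, the default is 1s
    max_processing_period: 100ms
    group:
      session_timeout: 10s
      heartbeat_interval: 3s
      rebalance_timeout: 60s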
fetch_buffer_cap
The maximum number of unprocessed messages to fetch at a given time. Type: int Default: 256
multi_header
Decode headers into lists to allow handling of multiple values with the same key. Type: bool Default: false
batching
Allows you to configure a batching policy. Type: object
# Examples

batching:
  byte_size: 5000
  count: 0
  period: 1s

batching:
  count: 10
  period: 1s

batching:
  check: this.contains("END BATCH")
  count: 0
  period: 1m
batching.count
The number of messages at which the batch should be flushed. If set to 0, count-based batching is disabled. Type: int Default: 0
batching.byte_size
The number of bytes at which the batch should be flushed. If set to 0, size-based batching is disabled. Type: int Default: 0
batching.period
A period in which an incomplete batch should be flushed regardless of its size. Type: string Default: ""
# Examples

period: 1s

period: 1m

period: 500ms
batching.processors
A list of processors to apply to a batch as it is flushed. This allows you to aggregate and archive the batch however you see fit. Please note that all resulting messages are flushed as a single batch, therefore splitting the batch into smaller batches using these processors is a no-op. Type: array
# Examples

processors:
  - archive:
      format: concatenate

processors:
  - archive:
      format: lines

processors:
  - archive:
      format: json_array
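
To show how the batching fields fit together on an input, here is a small sketch. It flushes a batch after 10 messages or after one second, whichever comes first, and applies the archive processor to each flushed batch. The broker address, topic and group name are placeholders.
# Sketch: count and period based batching on a kafka input (placeholder values)
input:
  kafka:
    addresses: [ localhost:9092 ]
    topics: [ example_topic ]
    consumer_group: example_group
    batching:
      count: 10
      period: 1s
      processors:
        # Applied to each batch as it is flushed
        - archive:
            format: json_array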

Outputs

Overview

An output is a sink where we wish to send our consumed data after applying an optional array of processors. Only one output is configured at the root of a Tyk Streams config. However, the output can be a broker which combines multiple outputs under a chosen brokering pattern. An output config section looks like this:
output:
  label: my_kafka_output

  kafka:
    addresses: [ localhost:9092 ]
    topic: "foobar"

  # Optional list of processing steps
  processors:
    - avro:
        operator: from_json

Labels

Outputs have an optional field label that can uniquely identify them in observability data such as logs.

Broker

Allows you to route messages to multiple child outputs using a range of brokering patterns.

Common

# Common config fields, showing default values
output:
  label: ""
  broker:
    pattern: fan_out
    outputs: [] # No default (required)
    batching:
      count: 0
      byte_size: 0
      period: ""
      check: ""

Advanced

# All config fields, showing default values
output:
  label: ""
  broker:
    copies: 1
    pattern: fan_out
    outputs: [] # No default (required)
    batching:
      count: 0
      byte_size: 0
      period: ""
      check: ""
      processors: [] # No default (optional)
Processors can be listed to apply across individual outputs or all outputs:
output:
  broker:
    pattern: fan_out
    outputs:
      - resource: foo
      - resource: bar
        # Processors only applied to messages sent to bar.
        processors:
          - resource: bar_processor

  # Processors applied to messages sent to all brokered outputs.
  processors:
    - resource: general_processor

Fields

copies
The number of copies of each configured output to spawn. Type: int Default: 1
pattern
The brokering pattern to use. Type: string Default: "fan_out" Options: fan_out, fan_out_fail_fast, fan_out_sequential, fan_out_sequential_fail_fast, round_robin, greedy.
outputs
A list of child outputs to broker. Type: array
batching
Allows you to configure a batching policy. Type: object
# Examples

batching:
  byte_size: 5000
  count: 0
  period: 1s

batching:
  count: 10
  period: 1s

batching:
  check: this.contains("END BATCH")
  count: 0
  period: 1m
batching.count
The number of messages at which the batch should be flushed. If set to 0, count-based batching is disabled. Type: int Default: 0
batching.byte_size
The number of bytes at which the batch should be flushed. If set to 0, size-based batching is disabled. Type: int Default: 0
batching.period
A period in which an incomplete batch should be flushed regardless of its size. Type: string Default: ""
# Examples

period: 1s

period: 1m

period: 500ms
batching.processors
A list of processors to apply to a batch as it is flushed. This allows you to aggregate and archive the batch however you see fit. Please note that all resulting messages are flushed as a single batch, therefore splitting the batch into smaller batches using these processors is a no-op. Type: array
# Examples

processors:
  - archive:
      format: concatenate

processors:
  - archive:
      format: lines

processors:
  - archive:
      format: json_array

Patterns

The broker pattern determines the way in which messages are allocated and can be chosen from the following:
fan_out
With the fan out pattern all outputs will be sent every message that passes through Tyk Streams in parallel. If an output applies back pressure it will block all subsequent messages, and if an output fails to send a message it will be retried continuously until completion or service shut down. This mechanism is in place to prevent one bad output from causing a larger retry loop that results in a good output receiving unbounded message duplicates.
fan_out_fail_fast
The same as the fan_out pattern, except that output failures will not be automatically retried. This pattern should be used with caution as busy retry loops could result in unlimited duplicates being introduced into the non-failure outputs.
fan_out_sequential
Similar to the fan out pattern except outputs are written to sequentially, meaning an output is only written to once the preceding output has confirmed receipt of the same message. If an output applies back pressure it will block all subsequent messages, and if an output fails to send a message it will be retried continuously until completion or service shut down. This mechanism is in place to prevent one bad output from causing a larger retry loop that results in a good output receiving unbounded message duplicates.
fan_out_sequential_fail_fast
The same as the fan_out_sequential pattern, except that output failures will not be automatically retried. This pattern should be used with caution as busy retry loops could result in unlimited duplicates being introduced into the non-failure outputs.
round_robin
With the round robin pattern each message will be assigned to a single output, following the order of the outputs. If an output applies back pressure it will block all subsequent messages. If an output fails to send a message then the message will be re-attempted with the next output, and so on.
greedy
The greedy pattern results in higher output throughput at the cost of potentially disproportionate message allocations to those outputs. Each message is sent to a single output, which is determined by allowing outputs to claim messages as soon as they are able to process them. This results in certain faster outputs potentially processing more messages at the cost of slower outputs.
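As an illustration of choosing a pattern, the sketch below uses fan_out_sequential so that each message is written to Kafka first and only then posted over HTTP. The broker address, topic name and URL are placeholders, not values taken from a real deployment.
# Sketch: write to kafka first, then to an HTTP endpoint (placeholder values)
output:
  broker:
    pattern: fan_out_sequential
    outputs:
      - kafka:
          addresses: [ localhost:9092 ]
          topic: output_topic
      - http_client:
          url: https://example.com/ingest
          verb: POST
Swapping pattern for round_robin or greedy would instead deliver each message to only one of the two child outputs.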

HTTP Client

Sends messages to an HTTP server.

Common

# Common config fields, showing default values
output:
  label: ""
  http_client:
    url: "" # No default (required)
    verb: POST
    headers: {}
    timeout: 5s
    max_in_flight: 64
    batching:
      count: 0
      byte_size: 0
      period: ""
      check: ""

Advanced

# All config fields, showing default values
output:
  label: ""
  http_client:
    url: "" # No default (required)
    verb: POST
    headers: {}
    metadata:
      include_prefixes: []
      include_patterns: []
    dump_request_log_level: ""
    oauth:
      enabled: false
      consumer_key: ""
      consumer_secret: ""
      access_token: ""
      access_token_secret: ""
    oauth2:
      enabled: false
      client_key: ""
      client_secret: ""
      token_url: ""
      scopes: []
      endpoint_params: {}
    basic_auth:
      enabled: false
      username: ""
      password: ""
    jwt:
      enabled: false
      private_key_file: ""
      signing_method: ""
      claims: {}
      headers: {}
    tls:
      enabled: false
      skip_cert_verify: false
      enable_renegotiation: false
      root_cas: ""
      root_cas_file: ""
      client_certs: []
    extract_headers:
      include_prefixes: []
      include_patterns: []
    timeout: 5s
    retry_period: 1s
    max_retry_backoff: 300s
    retries: 3
    backoff_on:
      - 429
    drop_on: []
    successful_on: []
    proxy_url: "" # No default (optional)
    batch_as_multipart: false
    propagate_response: false
    max_in_flight: 64
    batching:
      count: 0
      byte_size: 0
      period: ""
      check: ""
      processors: [] # No default (optional)
    multipart: []
When the number of retries expires the output will reject the message. The behavior after this depends on the pipeline, but usually it simply means the send is attempted again until successful whilst applying back pressure. The body of the HTTP request is the raw contents of the message payload. If the message has multiple parts (is a batch) and the field batch_as_multipart is set to true, the batch is sent as a single request according to RFC1341. When batch_as_multipart is false (the default), the messages of a batch are sent as individual requests.
Propagating Responses
It’s possible to propagate the response from each HTTP request back to the input source by setting propagate_response to true. Only inputs that support synchronous responses are able to make use of these propagated responses.
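A minimal sketch of response propagation follows. It assumes an http_server input with a path field is available as the synchronous source. The path and URL are placeholders.
# Sketch: return the downstream HTTP response to the original caller
input:
  # Assumption: an http_server input that supports synchronous responses
  http_server:
    path: /post
output:
  http_client:
    url: https://example.com/ingest   # placeholder URL
    verb: POST
    propagate_response: true
    # Copy matching response headers into the response message metadata
    extract_headers:
      include_prefixes: [ content- ]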

Performance

This output benefits from sending multiple messages in flight in parallel. You can tune the maximum number of in flight messages (or message batches) with the field max_in_flight. It also benefits from sending messages as a batch. Batches can be formed at both the input and output level.
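The two tuning options might be combined as in this sketch. The values are illustrative, not recommendations, and the URL is a placeholder. The archive processor joins each batch into one payload, which assumes the receiving server accepts line delimited bodies.
# Sketch: parallel requests plus output level batching (illustrative values)
output:
  http_client:
    url: https://example.com/ingest
    verb: POST
    max_in_flight: 16
    batching:
      count: 50
      period: 500ms
      processors:
        # Join the batch into a single line delimited message before sending
        - archive:
            format: lines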

Fields

url
The URL to connect to. Type: string
verb
The HTTP verb to use for requests. Type: string Default: "POST"
# Examples

verb: POST

verb: GET

verb: DELETE
headers
A map of headers to add to the request. Type: object Default: {}
# Examples

headers:
  Content-Type: application/octet-stream
  traceparent: ${! tracing_span().traceparent }
metadata
Specify optional matching rules to determine which metadata keys should be added to the HTTP request as headers. Type: object
metadata.include_prefixes
Provide a list of explicit metadata key prefixes to match against. Type: array Default: []
# Examples

include_prefixes:
  - foo_
  - bar_

include_prefixes:
  - kafka_

include_prefixes:
  - content-
metadata.include_patterns
Provide a list of explicit metadata key regular expression (re2) patterns to match against. Type: array Default: []
# Examples

include_patterns:
  - .*

include_patterns:
  - _timestamp_unix$
dump_request_log_level
Optionally set a level at which the request and response payload of each request made will be logged. Type: string Default: "" Options: TRACE, DEBUG, INFO, WARN, ERROR, FATAL, "".
oauth
Allows you to specify open authentication via OAuth version 1. Type: object
oauth.enabled
Whether to use OAuth version 1 in requests. Type: bool Default: false
oauth.consumer_key
A value used to identify the client to the service provider. Type: string Default: ""
oauth.consumer_secret
A secret used to establish ownership of the consumer key. Type: string Default: ""
oauth.access_token
A value used to gain access to the protected resources on behalf of the user. Type: string Default: ""
oauth.access_token_secret
A secret provided in order to establish ownership of a given access token. Type: string Default: ""
oauth2
Allows you to specify open authentication via OAuth version 2 using the client credentials token flow. Type: object
oauth2.enabled
Whether to use OAuth version 2 in requests. Type: bool Default: false
oauth2.client_key
A value used to identify the client to the token provider. Type: string Default: ""
oauth2.client_secret
A secret used to establish ownership of the client key. Type: string Default: ""
oauth2.token_url
The URL of the token provider. Type: string Default: ""
oauth2.scopes
A list of optional requested permissions. Type: array Default: []
oauth2.endpoint_params
A list of optional endpoint parameters, values should be arrays of strings. Type: object Default: {}
# Examples

endpoint_params:
  bar:
    - woof
  foo:
    - meow
    - quack
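Putting the oauth2 fields together, a client credentials setup could look like the sketch below. The URLs, the scope name and the environment variable names are hypothetical.
# Sketch: OAuth 2 client credentials flow (all values are placeholders)
output:
  http_client:
    url: https://api.example.com/events
    oauth2:
      enabled: true
      # Assumed environment variable names
      client_key: ${OAUTH_CLIENT_KEY}
      client_secret: ${OAUTH_CLIENT_SECRET}
      token_url: https://auth.example.com/oauth/token
      scopes: [ events.write ]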
basic_auth
Allows you to specify basic authentication. Type: object
basic_auth.enabled
Whether to use basic authentication in requests. Type: bool Default: false
basic_auth.username
A username to authenticate as. Type: string Default: ""
basic_auth.password
A password to authenticate with. Type: string Default: ""
jwt
Allows you to specify JWT authentication. Type: object
jwt.enabled
Whether to use JWT authentication in requests. Type: bool Default: false
jwt.private_key_file
The path of a PEM encoded private key file in PKCS#1 or PKCS#8 format. Type: string Default: ""
jwt.signing_method
A method used to sign the token such as RS256, RS384, RS512 or EdDSA. Type: string Default: ""
jwt.claims
A map of claims to include in the JWT. Type: object Default: {}
jwt.headers
Add optional key/value headers to the JWT. Type: object Default: {}
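As a rough sketch of the jwt fields, the fragment below signs tokens with RS256. The key path, claim values and the kid header are made up for illustration.
# Sketch: JWT authentication (hypothetical key path, claims and header)
output:
  http_client:
    url: https://api.example.com/events
    jwt:
      enabled: true
      private_key_file: ./keys/private.pem
      signing_method: RS256
      claims:
        iss: example-issuer
        aud: example-service
      headers:
        kid: example-key-id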
tls
Custom TLS settings can be used to override system defaults. Type: object
tls.enabled
Whether custom TLS settings are enabled. Type: bool Default: false
tls.skip_cert_verify
Whether to skip server side certificate verification. Type: bool Default: false
tls.enable_renegotiation
Whether to allow the remote server to repeatedly request renegotiation. Enable this option if you’re seeing the error message local error: tls: no renegotiation. Type: bool Default: false
tls.root_cas
An optional root certificate authority to use. This is a string, representing a certificate chain from the parent trusted root certificate, to possible intermediate signing certificates, to the host certificate. Type: string Default: ""
# Examples

root_cas: |-
  -----BEGIN CERTIFICATE-----
  ...
  -----END CERTIFICATE-----
tls.root_cas_file
An optional path of a root certificate authority file to use. This is a file, often with a .pem extension, containing a certificate chain from the parent trusted root certificate, to possible intermediate signing certificates, to the host certificate. Type: string Default: ""
# Examples

root_cas_file: ./root_cas.pem
tls.client_certs
A list of client certificates to use. For each certificate either the fields cert and key, or cert_file and key_file should be specified, but not both. Type: array Default: []
# Examples

client_certs:
  - cert: foo
    key: bar

client_certs:
  - cert_file: ./example.pem
    key_file: ./example.key
tls.client_certs[].cert
A plain text certificate to use. Type: string Default: ""
tls.client_certs[].key
A plain text certificate key to use. Type: string Default: ""
tls.client_certs[].cert_file
The path of a certificate to use. Type: string Default: ""
tls.client_certs[].key_file
The path of a certificate key to use. Type: string Default: ""
tls.client_certs[].password
A plain text password for when the private key is password encrypted in PKCS#1 or PKCS#8 format. The obsolete pbeWithMD5AndDES-CBC algorithm is not supported for the PKCS#8 format. Warning: Since it does not authenticate the ciphertext, it is vulnerable to padding oracle attacks that can let an attacker recover the plaintext. Type: string Default: ""
# Examples

password: foo
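For reference, the tls fields above can be combined into a mutual TLS setup along these lines. The file paths reuse the placeholder names from the examples above and the URL is hypothetical.
# Sketch: custom root CA plus a client certificate (placeholder paths)
output:
  http_client:
    url: https://secure.example.com/ingest
    tls:
      enabled: true
      root_cas_file: ./root_cas.pem
      client_certs:
        # Use either cert and key, or cert_file and key_file, but not both
        - cert_file: ./example.pem
          key_file: ./example.key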
extract_headers
Specify which response headers should be added to resulting synchronous response messages as metadata. Header keys are lowercased before matching, so ensure that your patterns target lowercased versions of the header keys that you expect. This field is not applicable unless propagate_response is set to true. Type: object
extract_headers.include_prefixes
Provide a list of explicit metadata key prefixes to match against. Type: array Default: []
# Examples

include_prefixes:
  - foo_
  - bar_

include_prefixes:
  - kafka_

include_prefixes:
  - content-
extract_headers.include_patterns
Provide a list of explicit metadata key regular expression (re2) patterns to match against. Type: array Default: []
# Examples

include_patterns:
  - .*

include_patterns:
  - _timestamp_unix$
timeout
A static timeout to apply to requests. Type: string Default: "5s"
retry_period
The base period to wait between failed requests. Type: string Default: "1s"
max_retry_backoff
The maximum period to wait between failed requests. Type: string Default: "300s"
retries
The maximum number of retry attempts to make. Type: int Default: 3
backoff_on
A list of status codes whereby the request should be considered to have failed and retries should be attempted, but the period between them should be increased gradually. Type: array Default: [429]
drop_on
A list of status codes whereby the request should be considered to have failed but retries should not be attempted. This is useful for preventing wasted retries for requests that will never succeed. Note that with these status codes the request is dropped, but the message that caused the request will not be dropped. Type: array Default: []
successful_on
A list of status codes whereby the attempt should be considered successful. This is useful for dropping requests that return non-2XX codes indicating that the message has been dealt with, such as a 303 See Other or a 409 Conflict. All 2XX codes are considered successful unless they are present within backoff_on or drop_on, regardless of this field. Type: array Default: []
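The retry related fields can be read together as one policy. The sketch below is an example of such a policy. The chosen status codes and periods are assumptions for illustration, not recommended values.
# Sketch: retry policy for an HTTP output (illustrative values)
output:
  http_client:
    url: https://example.com/ingest
    timeout: 5s
    retries: 5
    retry_period: 2s
    max_retry_backoff: 60s
    backoff_on: [ 429, 503 ]   # retry, with a gradually increasing wait
    drop_on: [ 400, 422 ]      # fail the request without retrying
    successful_on: [ 409 ]     # treat as already handled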
proxy_url
An optional HTTP proxy URL. Type: string
batch_as_multipart
Send message batches as a single request using RFC1341. If disabled messages in batches will be sent as individual requests. Type: bool Default: false
propagate_response
Whether responses from the server should be propagated back to the input. Type: bool Default: false
max_in_flight
The maximum number of parallel message batches to have in flight at any given time. Type: int Default: 64
batching
Allows you to configure a batching policy. Type: object
# Examples

batching:
  byte_size: 5000
  count: 0
  period: 1s

batching:
  count: 10
  period: 1s

batching:
  check: this.contains("END BATCH")
  count: 0
  period: 1m
batching.count
The number of messages at which the batch should be flushed. If set to 0, count-based batching is disabled. Type: int Default: 0
batching.byte_size
The number of bytes at which the batch should be flushed. If set to 0, size-based batching is disabled. Type: int Default: 0
batching.period
A period in which an incomplete batch should be flushed regardless of its size. Type: string Default: ""
# Examples

period: 1s

period: 1m

period: 500ms
batching.processors
A list of processors to apply to a batch as it is flushed. This allows you to aggregate and archive the batch however you see fit. Please note that all resulting messages are flushed as a single batch, therefore splitting the batch into smaller batches using these processors is a no-op. Type: array
# Examples

processors:
  - archive:
      format: concatenate

processors:
  - archive:
      format: lines

processors:
  - archive:
      format: json_array
multipart
Create explicit multipart HTTP requests by specifying an array of parts to add to the request, each part specified consists of content headers and a data field that can be populated dynamically. If this field is populated it will override the default request creation behavior. Type: array Default: []
multipart[].content_type
The content type of the individual message part. Type: string Default: ""
# Examples

content_type: application/bin
multipart[].content_disposition
The content disposition of the individual message part. Type: string Default: ""
# Examples

content_disposition: form-data; name="bin"; filename='${! @AttachmentName }'
multipart[].body
The body of the individual message part. Type: string Default: ""
# Examples

body: ${! this.data.part1 }
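A complete multipart definition could look like the following sketch, which sends two parts per request. The URL, part names and the message fields this.description and this.data.part1 are hypothetical.
# Sketch: explicit two part multipart request (hypothetical fields and URL)
output:
  http_client:
    url: https://example.com/upload
    verb: POST
    multipart:
      - content_type: text/plain
        content_disposition: 'form-data; name="description"'
        body: ${! this.description }
      - content_type: application/octet-stream
        content_disposition: 'form-data; name="bin"'
        body: ${! this.data.part1 }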

HTTP Server

Sets up an HTTP server that will send messages over HTTP(S) GET requests. HTTP 2.0 is supported when using TLS, which is enabled when key and cert files are specified.

Common

# Common config fields, showing default values
output:
  label: ""
  http_server:
    address: ""
    path: /get
    stream_path: /get/stream
    ws_path: /get/ws
    allowed_verbs:
      - GET

Advanced

# All config fields, showing default values
output:
  label: ""
  http_server:
    address: ""
    path: /get
    stream_path: /get/stream
    ws_path: /get/ws
    allowed_verbs:
      - GET
    timeout: 5s
    cert_file: ""
    key_file: ""
    cors:
      enabled: false
      allowed_origins: []
Three endpoints are registered at the paths specified by the fields path, stream_path and ws_path. They allow you to consume a single message batch, a continuous stream of line delimited messages, or a websocket of messages for each request, respectively. When messages are batched, the path endpoint encodes the batch according to RFC1341.
Please note that messages are considered delivered as soon as the data is written to the client. There is no concept of at least once delivery on this output.
Please note that components within a Tyk config will register their respective endpoints in a non-deterministic order. This means that establishing precedence of endpoints that are registered via multiple http_server inputs or outputs (either within brokers or from cohabiting streams) is not possible in a predictable way. A path that ends in a slash (/) matches all extensions of that path, so when it is a prefix of a more specific path registered by a separate component, there is no way to guarantee that the more specific path receives its requests. It is therefore recommended that you ensure paths of separate components do not collide unless they are explicitly non-competing. For example, if you were to deploy two separate http_server inputs, one with a path /foo/ and the other with a path /foo/bar, it would not be possible to ensure that the path /foo/ does not swallow requests made to /foo/bar.

Fields

address
An alternative address to host from. If left empty the service wide address is used. Type: string Default: ""
path
The path from which discrete messages can be consumed. Type: string Default: "/get"
stream_path
The path from which a continuous stream of messages can be consumed. Type: string Default: "/get/stream"
ws_path
The path from which websocket connections can be established. Type: string Default: "/get/ws"
allowed_verbs
An array of verbs that are allowed for the path and stream_path HTTP endpoints. Type: array Default: ["GET"]
timeout
The maximum time to wait before a blocking, inactive connection is dropped (only applies to the path endpoint). Type: string Default: "5s"
cert_file
Enable TLS by specifying a certificate and key file. Only valid with a custom address. Type: string Default: ""
key_file
Enable TLS by specifying a certificate and key file. Only valid with a custom address. Type: string Default: ""
cors
Adds Cross-Origin Resource Sharing headers. Only valid with a custom address. Type: object
cors.enabled
Whether to allow CORS requests. Type: bool Default: false
cors.allowed_origins
An explicit list of origins that are allowed for CORS requests. Type: array Default: []
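Since cert_file, key_file and cors are only valid with a custom address, a sketch combining them may help. The address, paths, certificate files and origin below are placeholders.
# Sketch: HTTPS server output with CORS on a custom address (placeholder values)
output:
  http_server:
    address: 0.0.0.0:4196
    path: /events
    stream_path: /events/stream
    ws_path: /events/ws
    allowed_verbs:
      - GET
    cert_file: ./server.pem
    key_file: ./server.key
    cors:
      enabled: true
      allowed_origins: [ "https://app.example.com" ]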

Kafka

The kafka output type writes a batch of messages to Kafka brokers and waits for acknowledgment before propagating it back to the input.

Common

# Common config fields, showing default values
output:
  label: ""
  kafka:
    addresses: [] # No default (required)
    topic: "" # No default (required)
    target_version: 2.1.0 # No default (optional)
    key: ""
    partitioner: fnv1a_hash
    compression: none
    static_headers: {} # No default (optional)
    metadata:
      exclude_prefixes: []
    max_in_flight: 64
    batching:
      count: 0
      byte_size: 0
      period: ""
      check: ""

Advanced

# All config fields, showing default values
output:
  label: ""
  kafka:
    addresses: [] # No default (required)
    tls:
      enabled: false
      skip_cert_verify: false
      enable_renegotiation: false
      root_cas: ""
      root_cas_file: ""
      client_certs: []
    sasl:
      mechanism: none
      user: ""
      password: ""
      access_token: ""
      token_cache: ""
      token_key: ""
    topic: "" # No default (required)
    client_id: tyk
    target_version: 2.1.0 # No default (optional)
    rack_id: ""
    key: ""
    partitioner: fnv1a_hash
    partition: ""
    custom_topic_creation:
      enabled: false
      partitions: -1
      replication_factor: -1
    compression: none
    static_headers: {} # No default (optional)
    metadata:
      exclude_prefixes: []
    inject_tracing_map: meta = @.merge(this) # No default (optional)
    max_in_flight: 64
    idempotent_write: false
    ack_replicas: false
    max_msg_bytes: 1000000
    timeout: 5s
    retry_as_batch: false
    batching:
      count: 0
      byte_size: 0
      period: ""
      check: ""
      processors: [] # No default (optional)
    max_retries: 0
    backoff:
      initial_interval: 3s
      max_interval: 10s
      max_elapsed_time: 30s
The config field ack_replicas determines whether we wait for acknowledgment from all replicas or just a single broker. Metadata will be added to each message sent as headers (version 0.11+), but can be restricted using the field metadata.
Strict Ordering and Retries
When strict ordering is required for messages written to topic partitions it is important to ensure that both the field max_in_flight is set to 1 and that the field retry_as_batch is set to true. You must also ensure that failed batches are never rerouted back to the same output. This can be done by setting the field max_retries to 0 and backoff.max_elapsed_time to empty, which will apply back pressure indefinitely until the batch is sent successfully. However, this also means that manual intervention will eventually be required in cases where the batch cannot be sent due to configuration problems such as an incorrect max_msg_bytes estimate. A less strict but automated alternative would be to route failed batches to a dead letter queue using a fallback broker, but this would allow subsequent batches to be delivered in the meantime whilst those failed batches are dealt with.
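The strict ordering settings described above translate into a config along these lines. This is a sketch only. The broker address and topic name are placeholders.
# Sketch: strict ordering for a kafka output (placeholder address and topic)
output:
  kafka:
    addresses: [ localhost:9092 ]
    topic: output_topic
    max_in_flight: 1        # one batch in flight at a time
    retry_as_batch: true    # retry the whole batch
    max_retries: 0
    backoff:
      max_elapsed_time: ""  # empty, so back pressure is applied indefinitely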
Troubleshooting
  • I’m seeing logs that report Failed to connect to kafka: kafka: client has run out of available brokers to talk to (Is your cluster reachable?), but the brokers are definitely reachable.
Unfortunately this error message will appear for a wide range of connection problems even when the broker endpoint can be reached. Double check your authentication configuration and also ensure that you have enabled TLS if applicable.

Performance

This output benefits from sending multiple messages in flight in parallel. You can tune the maximum number of in flight messages (or message batches) with the field max_in_flight. It also benefits from sending messages as a batch. Batches can be formed at both the input and output level.

Fields

addresses
A list of broker addresses to connect to. If an item of the list contains commas it will be expanded into multiple addresses. Type: array
# Examples

addresses:
  - localhost:9092

addresses:
  - localhost:9041,localhost:9042

addresses:
  - localhost:9041
  - localhost:9042
tls
Custom TLS settings can be used to override system defaults. Type: object
tls.enabled
Whether custom TLS settings are enabled. Type: bool Default: false
tls.skip_cert_verify
Whether to skip server side certificate verification. Type: bool Default: false
tls.enable_renegotiation
Whether to allow the remote server to repeatedly request renegotiation. Enable this option if you’re seeing the error message local error: tls: no renegotiation. Type: bool Default: false
tls.root_cas
An optional root certificate authority to use. This is a string, representing a certificate chain from the parent trusted root certificate, to possible intermediate signing certificates, to the host certificate. Type: string Default: ""
# Examples

root_cas: |-
  -----BEGIN CERTIFICATE-----
  ...
  -----END CERTIFICATE-----
tls.root_cas_file
An optional path of a root certificate authority file to use. This is a file, often with a .pem extension, containing a certificate chain from the parent trusted root certificate, to possible intermediate signing certificates, to the host certificate. Type: string Default: ""
# Examples

root_cas_file: ./root_cas.pem
tls.client_certs
A list of client certificates to use. For each certificate either the fields cert and key, or cert_file and key_file should be specified, but not both. Type: array Default: []
# Examples

client_certs:
  - cert: foo
    key: bar

client_certs:
  - cert_file: ./example.pem
    key_file: ./example.key
tls.client_certs[].cert
A plain text certificate to use. Type: string Default: ""
tls.client_certs[].key
A plain text certificate key to use. Type: string Default: ""
tls.client_certs[].cert_file
The path of a certificate to use. Type: string Default: ""
tls.client_certs[].key_file
The path of a certificate key to use. Type: string Default: ""
tls.client_certs[].password
A plain text password for when the private key is password encrypted in PKCS#1 or PKCS#8 format. The obsolete pbeWithMD5AndDES-CBC algorithm is not supported for the PKCS#8 format. Warning: Since it does not authenticate the ciphertext, it is vulnerable to padding oracle attacks that can let an attacker recover the plaintext. Type: string Default: ""
# Example

password: foo
sasl
Enables SASL authentication. Type: object
sasl.mechanism
The SASL authentication mechanism, if left empty SASL authentication is not used. Type: string Default: "none"
Option         Summary
OAUTHBEARER    OAuth Bearer based authentication.
PLAIN          Plain text authentication. NOTE: When using plain text auth it is extremely likely that you’ll also need to enable TLS.
SCRAM-SHA-256  Authentication using the SCRAM-SHA-256 mechanism.
SCRAM-SHA-512  Authentication using the SCRAM-SHA-512 mechanism.
none           Default, no SASL authentication.
sasl.user
A PLAIN username. It is recommended that you use environment variables to populate this field. Type: string Default: ""
# Examples

user: ${USER}
sasl.password
A PLAIN password. It is recommended that you use environment variables to populate this field. Type: string Default: ""
# Examples

password: ${PASSWORD}
sasl.access_token
A static OAUTHBEARER access token. Type: string Default: ""
sasl.token_cache
The name of a cache resource to fetch OAUTHBEARER tokens from, as an alternative to a static access_token. Type: string Default: ""
sasl.token_key
Required when using a token_cache, the key to query the cache with for tokens. Type: string Default: ""
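As a sketch of how the sasl fields fit together, the fragment below enables PLAIN authentication on a Kafka output and reads the credentials from environment variables. The broker address, topic and variable names (KAFKA_USER, KAFKA_PASSWORD) are placeholders, and the tls.enabled field is assumed to be available as described in the tls section.

```yaml
output:
  kafka:
    addresses: [ localhost:9092 ]   # placeholder broker address
    topic: example_topic            # placeholder topic
    tls:
      enabled: true                 # assumed field; PLAIN credentials should not travel unencrypted
    sasl:
      mechanism: PLAIN
      user: ${KAFKA_USER}           # hypothetical environment variable
      password: ${KAFKA_PASSWORD}   # hypothetical environment variable
```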
topic
The topic to publish messages to. Type: string
client_id
An identifier for the client connection. Type: string Default: "tyk"
target_version
The version of the Kafka protocol to use. This limits the capabilities used by the client and should ideally match the version of your brokers. Defaults to the oldest supported stable version. Type: string
# Examples

target_version: 2.1.0

target_version: 3.1.0
rack_id
A rack identifier for this client. Type: string Default: ""
key
The key to publish messages with. Type: string Default: ""
partitioner
The partitioning algorithm to use. Type: string Default: "fnv1a_hash" Options: fnv1a_hash, murmur2_hash, random, round_robin, manual.
partition
The manually-specified partition to publish messages to, relevant only when the field partitioner is set to manual. Must be able to parse as a 32-bit integer. Type: string Default: ""
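For illustration, a minimal sketch of manual partitioning, assuming a topic that has a partition 0. The address and topic are placeholders. Because the partition field is a string, the value is quoted.

```yaml
output:
  kafka:
    addresses: [ localhost:9092 ]   # placeholder broker address
    topic: example_topic            # placeholder topic
    partitioner: manual
    partition: "0"                  # must parse as a 32-bit integer
```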
custom_topic_creation
If enabled, topics will be created with the specified number of partitions and replication factor if they do not already exist. Type: object
custom_topic_creation.enabled
Whether to enable custom topic creation. Type: bool Default: false
custom_topic_creation.partitions
The number of partitions to create for new topics. Leave at -1 to use the broker configured default; any explicitly set value must be >= 1. Type: int Default: -1
custom_topic_creation.replication_factor
The replication factor to use for new topics. Leave at -1 to use the broker configured default. Must be an odd number, and less than or equal to the number of brokers. Type: int Default: -1
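A sketch of the three custom_topic_creation fields used together. The values are illustrative only and assume a cluster with at least three brokers, so that a replication factor of 3 is valid.

```yaml
custom_topic_creation:
  enabled: true
  partitions: 6            # illustrative; must be >= 1 when set explicitly
  replication_factor: 3    # odd, and no greater than the number of brokers
```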
compression
The compression algorithm to use. Type: string Default: "none" Options: none, snappy, lz4, gzip, zstd.
static_headers
An optional map of static headers that should be added to messages in addition to metadata. Type: object
# Examples

static_headers:
  first-static-header: value-1
  second-static-header: value-2
metadata
Specify criteria for which metadata values are sent with messages as headers. Type: object
metadata.exclude_prefixes
Provide a list of explicit metadata key prefixes to be excluded when adding metadata to sent messages. Type: array Default: []
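As an example of combining static headers with metadata filtering, the sketch below adds one fixed header and prevents metadata keys beginning with two prefixes from being sent. The header name and both prefixes are hypothetical.

```yaml
static_headers:
  source: tyk-streams      # hypothetical header added to every message
metadata:
  exclude_prefixes:
    - kafka_               # hypothetical prefix to drop
    - internal_            # hypothetical prefix to drop
```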
max_in_flight
The maximum number of messages to have in flight at a given time. Increase this to improve throughput. Type: int Default: 64
idempotent_write
Enable the idempotent write producer option. This requires the IDEMPOTENT_WRITE permission on CLUSTER and can be disabled if this permission is not available. Type: bool Default: false
ack_replicas
Ensure that messages have been copied across all replicas before acknowledging receipt. Type: bool Default: false
max_msg_bytes
The maximum size in bytes of messages sent to the target topic. Type: int Default: 1000000
timeout
The maximum period of time to wait for message sends before abandoning the request and retrying. Type: string Default: "5s"
retry_as_batch
When enabled forces an entire batch of messages to be retried if any individual message fails on a send, otherwise only the individual messages that failed are retried. Disabling this helps to reduce message duplicates during intermittent errors, but also makes it impossible to guarantee strict ordering of messages. Type: bool Default: false
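The delivery-related fields above can be combined. The following is a sketch of a configuration that favours durability and ordering over throughput, under the assumption that the cluster grants the IDEMPOTENT_WRITE permission. The address, topic and timeout value are placeholders.

```yaml
output:
  kafka:
    addresses: [ localhost:9092 ]   # placeholder broker address
    topic: example_topic            # placeholder topic
    idempotent_write: true          # requires IDEMPOTENT_WRITE on CLUSTER
    ack_replicas: true              # wait for all replicas before acknowledging
    retry_as_batch: true            # retry whole batches to keep strict ordering
    timeout: 10s                    # illustrative; default is 5s
```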
batching
Allows you to configure a batching policy. Type: object
# Examples

batching:
  byte_size: 5000
  count: 0
  period: 1s

batching:
  count: 10
  period: 1s

batching:
  check: this.contains("END BATCH")
  count: 0
  period: 1m
batching.count
A number of messages at which the batch should be flushed. If 0 disables count based batching. Type: int Default: 0
batching.byte_size
An amount of bytes at which the batch should be flushed. If 0 disables size based batching. Type: int Default: 0
batching.period
A period in which an incomplete batch should be flushed regardless of its size. Type: string Default: ""
# Examples

period: 1s

period: 1m

period: 500ms
batching.processors
A list of processors to apply to a batch as it is flushed. This allows you to aggregate and archive the batch however you see fit. Please note that all resulting messages are flushed as a single batch, therefore splitting the batch into smaller batches using these processors is a no-op. Type: array
# Examples

processors:
  - archive:
      format: concatenate

processors:
  - archive:
      format: lines

processors:
  - archive:
      format: json_array
max_retries
The maximum number of retries before giving up on the request. If set to zero there is no discrete limit. Type: int Default: 0
backoff
Control time intervals between retry attempts. Type: object
backoff.initial_interval
The initial period to wait between retry attempts. Type: string Default: "3s"
# Examples

initial_interval: 50ms

initial_interval: 1s
backoff.max_interval
The maximum period to wait between retry attempts. Type: string Default: "10s"
# Examples

max_interval: 5s

max_interval: 1m
backoff.max_elapsed_time
The maximum overall period of time to spend on retry attempts before the request is aborted. Setting this value to a zeroed duration (such as 0s) will result in unbounded retries. Type: string Default: "30s"
# Examples

max_elapsed_time: 1m

max_elapsed_time: 1h
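To show how max_retries and the backoff fields interact, here is a sketch with illustrative values: retries start 500ms apart, are spaced at most 5s apart, and stop after five attempts or one minute of total retry time.

```yaml
max_retries: 5
backoff:
  initial_interval: 500ms   # first wait between attempts
  max_interval: 5s          # upper bound on the wait between attempts
  max_elapsed_time: 1m      # total time budget for retries
```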

Processors

Overview

Tyk Streams processors are functions applied to messages passing through a pipeline. Processors are set via config, and depending on where in the config they are placed they will be run either immediately after a specific input (set in the input section), on all messages (set in the pipeline section) or before a specific output (set in the output section). Most processors apply to all messages and can be placed in the pipeline section:
pipeline:
  threads: 1
  processors:
    - label: my_avro
      avro:
        operator: "to_json"
        encoding: textual
The threads field in the pipeline section determines how many parallel processing threads are created. You can read more about parallel processing in the pipeline guide.

Labels

Processors have an optional field label that can uniquely identify them in observability data such as logs.

Avro

# Config fields, with default values
label: ""
avro:
  operator: "" # No default (required)
  encoding: textual
  schema: ""
  schema_path: ""
Note: If you are consuming or generating messages using a schema registry service then it is likely this processor will fail as those services require messages to be prefixed with the identifier of the schema version being used.

Operators

to_json
Converts Avro documents into a JSON structure. This makes it easier to manipulate the contents of the document within Tyk Streams. The encoding field specifies how the source documents are encoded.
from_json
Attempts to convert JSON documents into Avro documents according to the specified encoding.

Fields

operator
The operator to execute. Type: string Options: to_json, from_json.
encoding
An Avro encoding format to use for conversions to and from a schema. Type: string Default: "textual" Options: textual, binary, single.
schema
A full Avro schema to use. Type: string Default: ""
schema_path
The path of a schema document to apply. Use either this or the schema field. Type: string Default: ""
# Examples

schema_path: file://path/to/spec.avsc

schema_path: http://localhost:8081/path/to/spec/versions/1
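As a sketch of the schema field, the processor below converts JSON documents into binary Avro using an inline schema. The label, record name and field names are hypothetical; the schema itself follows the standard Avro record format.

```yaml
pipeline:
  processors:
    - label: encode_user            # hypothetical label
      avro:
        operator: from_json
        encoding: binary
        # Inline Avro schema; use schema_path instead to load it from a file or URL
        schema: |
          {
            "type": "record",
            "name": "User",
            "fields": [
              { "name": "name", "type": "string" },
              { "name": "age", "type": "int" }
            ]
          }
```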

Tracers

Overview

A tracer type represents a destination for Tyk Streams to send tracing events to, such as Jaeger. When a tracer is configured, all messages are allocated a root span during ingestion that represents their journey through a Streams pipeline. Many Streams processors create spans, so tracing is a useful way to analyse the pathways of individual messages as they progress through a Streams instance. Some inputs, such as http_server and http_client, are capable of extracting a root span from the source of the message (HTTP headers). This is a work in progress and should eventually expand so that all inputs have a way of doing so. Other inputs, such as kafka, can be configured to extract a root span by using the extract_tracing_map field. A tracer config section looks like this:
tracer:
  jaeger:
    agent_address: localhost:6831
    sampler_type: const
    sampler_param: 1
Note: Although the configuration spec of this component is stable, the format of spans, tags and logs created by Streams is subject to change as it is tuned for improvement.

Jaeger

# Common config fields, showing default values
tracer:
  jaeger:
    agent_address: ""
    collector_url: ""
    sampler_type: const
    flush_interval: "" # No default (optional)

Advanced

# All config fields, showing default values
tracer:
  jaeger:
    agent_address: ""
    collector_url: ""
    sampler_type: const
    sampler_param: 1
    tags: {}
    flush_interval: "" # No default (optional)
Send tracing events to a Jaeger agent or collector.

Fields

agent_address
The address of a Jaeger agent to send tracing events to. Type: string Default: ""
# Examples

agent_address: jaeger-agent:6831
collector_url
The URL of a Jaeger collector to send tracing events to. If set, this will override agent_address. Type: string Default: ""
# Examples

collector_url: https://jaeger-collector:14268/api/traces
sampler_type
The sampler type to use. Type: string Default: "const"
Option  Summary
const   Sample a percentage of traces. 1 or more means all traces are sampled, 0 means no traces are sampled and anything in between means a percentage of traces are sampled. Tuning the sampling rate is recommended for high-volume production workloads.
sampler_param
A parameter to use for sampling. This field is unused for some sampling types. Type: float Default: 1
tags
A map of tags to add to tracing spans. Type: object Default: {}
flush_interval
The period of time between each flush of tracing spans. Type: string
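The sketch below sends spans directly to a collector rather than an agent, samples half of all traces and attaches one tag to every span. The collector URL is taken from the example above; the tag name, tag value and flush interval are hypothetical.

```yaml
tracer:
  jaeger:
    collector_url: https://jaeger-collector:14268/api/traces   # overrides agent_address
    sampler_type: const
    sampler_param: 0.5        # between 0 and 1: sample a percentage of traces
    tags:
      service: example_stream # hypothetical tag
    flush_interval: 5s        # illustrative value
```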

OpenTelemetry Collector

# Common config fields, showing default values
tracer:
  open_telemetry_collector:
    http: [] # No default (required)
    grpc: [] # No default (required)
    sampling:
      enabled: false
      ratio: 0.85 # No default (optional)

Advanced

# All config fields, showing default values
tracer:
  open_telemetry_collector:
    http: [] # No default (required)
    grpc: [] # No default (required)
    tags: {}
    sampling:
      enabled: false
      ratio: 0.85 # No default (optional)
Note: This component is experimental and therefore subject to change or removal outside of major version releases.
Send tracing events to an Open Telemetry collector.

Fields

http
A list of HTTP collectors. Type: array
http[].address
The endpoint of a collector to send tracing events to. Type: string
# Examples

address: localhost:4318
http[].secure
Connect to the collector over HTTPS. Type: bool Default: false
grpc
A list of gRPC collectors. Type: array
grpc[].address
The endpoint of a collector to send tracing events to. Type: string
# Examples

address: localhost:4317
grpc[].secure
Connect to the collector with client transport security. Type: bool Default: false
tags
A map of tags to add to all tracing spans. Type: object Default: {}
sampling
Settings for trace sampling. Sampling is recommended for high-volume production workloads. Type: object
sampling.enabled
Whether to enable sampling. Type: bool Default: false
sampling.ratio
Sets the ratio of traces to sample. Type: float
# Examples

ratio: 0.85

ratio: 0.5
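Putting the fields together, a populated configuration might look like the sketch below. It assumes a collector listening locally on the example ports shown above without transport security; the tag is hypothetical.

```yaml
tracer:
  open_telemetry_collector:
    http:
      - address: localhost:4318
        secure: false
    grpc:
      - address: localhost:4317
        secure: false
    tags:
      environment: staging    # hypothetical tag added to all spans
    sampling:
      enabled: true
      ratio: 0.5              # sample roughly half of all traces
```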

Metrics

Overview

Streams emits lots of metrics in order to expose how components configured within your pipeline are behaving. You can configure exactly where these metrics end up with the config field metrics, which describes a metrics format and destination. For example, if you wished to push them via the Prometheus protocol you could use this configuration:
metrics:
  prometheus:
    push_interval: 1s
    push_job_name: in
    push_url: http://localhost:9091

Metric Names

Metrics are emitted with a prefix that can be configured with the field prefix. The default prefix is bento. The following metrics are emitted with the respective types:

Gauges

  • {prefix}_input_count Number of inputs currently active.
  • {prefix}_output_count Number of outputs currently active.
  • {prefix}_processor_count Number of processors currently active.
  • {prefix}_cache_count Number of caches currently active.
  • {prefix}_condition_count Number of conditions currently active.
  • {prefix}_input_connection_up 1 if a particular input is connected, 0 if it is not.
  • {prefix}_output_connection_up 1 if a particular output is connected, 0 if it is not.
  • {prefix}_input_running 1 if a particular input is running, 0 if it is not.
  • {prefix}_output_running 1 if a particular output is running, 0 if it is not.
  • {prefix}_processor_running 1 if a particular processor is running, 0 if it is not.
  • {prefix}_cache_running 1 if a particular cache is running, 0 if it is not.
  • {prefix}_condition_running 1 if a particular condition is running, 0 if it is not.
  • {prefix}_buffer_running 1 if a particular buffer is running, 0 if it is not.
  • {prefix}_buffer_available The number of messages that can be read from a buffer.
  • {prefix}_input_retry The number of active retry attempts for a particular input.
  • {prefix}_output_retry The number of active retry attempts for a particular output.
  • {prefix}_processor_retry The number of active retry attempts for a particular processor.
  • {prefix}_cache_retry The number of active retry attempts for a particular cache.
  • {prefix}_condition_retry The number of active retry attempts for a particular condition.
  • {prefix}_buffer_retry The number of active retry attempts for a particular buffer.
  • {prefix}_threads_active The number of processing threads currently active.

Counters

  • {prefix}_input_received Count of messages received by a particular input.
  • {prefix}_input_batch_received Count of batches received by a particular input.
  • {prefix}_output_sent Count of messages sent by a particular output.
  • {prefix}_output_batch_sent Count of batches sent by a particular output.
  • {prefix}_processor_processed Count of messages processed by a particular processor.
  • {prefix}_processor_batch_processed Count of batches processed by a particular processor.
  • {prefix}_processor_dropped Count of messages dropped by a particular processor.
  • {prefix}_processor_batch_dropped Count of batches dropped by a particular processor.
  • {prefix}_processor_error Count of errors returned by a particular processor.
  • {prefix}_processor_batch_error Count of batch errors returned by a particular processor.
  • {prefix}_cache_hit Count of cache key lookups that found a value.
  • {prefix}_cache_miss Count of cache key lookups that did not find a value.
  • {prefix}_cache_added Count of new cache entries.
  • {prefix}_cache_err Count of errors that occurred during a cache operation.
  • {prefix}_condition_hit Count of condition checks that passed.
  • {prefix}_condition_miss Count of condition checks that failed.
  • {prefix}_condition_error Count of errors that occurred during a condition check.
  • {prefix}_buffer_added Count of messages added to a particular buffer.
  • {prefix}_buffer_batch_added Count of batches added to a particular buffer.
  • {prefix}_buffer_read Count of messages read from a particular buffer.
  • {prefix}_buffer_batch_read Count of batches read from a particular buffer.
  • {prefix}_buffer_ack Count of messages removed from a particular buffer.
  • {prefix}_buffer_batch_ack Count of batches removed from a particular buffer.
  • {prefix}_buffer_nack Count of messages that failed to be removed from a particular buffer.
  • {prefix}_buffer_batch_nack Count of batches that failed to be removed from a particular buffer.
  • {prefix}_buffer_err Count of errors that occurred during a buffer operation.
  • {prefix}_buffer_batch_err Count of batch errors that occurred during a buffer operation.
  • {prefix}_input_error Count of errors that occurred during an input operation.
  • {prefix}_input_batch_error Count of batch errors that occurred during an input operation.
  • {prefix}_output_error Count of errors that occurred during an output operation.
  • {prefix}_output_batch_error Count of batch errors that occurred during an output operation.
  • {prefix}_resource_cache_error Count of errors that occurred during a resource cache operation.
  • {prefix}_resource_condition_error Count of errors that occurred during a resource condition operation.
  • {prefix}_resource_input_error Count of errors that occurred during a resource input operation.
  • {prefix}_resource_processor_error Count of errors that occurred during a resource processor operation.
  • {prefix}_resource_output_error Count of errors that occurred during a resource output operation.
  • {prefix}_resource_rate_limit_error Count of errors that occurred during a resource rate limit operation.

Timers

  • {prefix}_input_latency Latency of a particular input.
  • {prefix}_input_batch_latency Latency of a particular input at the batch level.
  • {prefix}_output_latency Latency of a particular output.
  • {prefix}_output_batch_latency Latency of a particular output at the batch level.
  • {prefix}_processor_latency Latency of a particular processor.
  • {prefix}_processor_batch_latency Latency of a particular processor at the batch level.
  • {prefix}_condition_latency Latency of a particular condition.
  • {prefix}_condition_batch_latency Latency of a particular condition at the batch level.
  • {prefix}_cache_latency Latency of a particular cache.
  • {prefix}_buffer_latency Latency of a particular buffer.
  • {prefix}_buffer_batch_latency Latency of a particular buffer at the batch level.

Metric Labels

All metrics are emitted with the following labels:
  • path The path of the component within the config.
  • label A custom label for the component, which is optional and falls back to the component type.

Prometheus

# Common config fields, showing default values
metrics:
  prometheus:
    prefix: tyk
    push_interval: ""
    push_job_name: bento_push
    push_url: ""

Advanced

# All config fields, showing default values
metrics:
  prometheus:
    prefix: tyk
    push_interval: ""
    push_job_name: bento_push
    push_url: ""
    push_basic_auth:
      enabled: false
      username: ""
      password: ""
    file_path: ""
    use_histogram_timing: false
    histogram_buckets: [0.000001, 0.00001, 0.0001, 0.001, 0.01, 0.1, 1.0]
Send metrics to a Prometheus push gateway, or expose them via HTTP endpoints.

Fields

prefix
A string prefix for all metrics. Type: string Default: "bento"
push_interval
The interval between pushing metrics to the push gateway. Type: string Default: ""
# Examples

push_interval: 1s

push_interval: 1m
push_job_name
A job name to attach to metrics pushed to the push gateway. Type: string Default: "bento_push"
push_url
The URL to push metrics to. Type: string Default: ""
# Examples

push_url: http://localhost:9091
push_basic_auth
Basic authentication configuration for the push gateway. Type: object
push_basic_auth.enabled
Whether to use basic authentication when pushing metrics. Type: bool Default: false
push_basic_auth.username
The username to authenticate with. Type: string Default: ""
push_basic_auth.password
The password to authenticate with. Type: string Default: ""
file_path
The file path to write metrics to. Type: string Default: ""
# Examples

file_path: /tmp/metrics.txt
use_histogram_timing
Whether to use histogram metrics for timing values. When set to false, summary metrics are used instead. Type: bool Default: false
histogram_buckets
A list of duration buckets to track when use_histogram_timing is set to true. Type: array Default: [0.000001, 0.00001, 0.0001, 0.001, 0.01, 0.1, 1.0]
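As a sketch, the configuration below pushes metrics to an authenticated push gateway every ten seconds and records timings as histograms with a reduced bucket list. The job name and environment variable names are hypothetical, and populating the credentials from environment variables is an assumption carried over from the recommendation for other credential fields in this document.

```yaml
metrics:
  prometheus:
    push_url: http://localhost:9091
    push_interval: 10s
    push_job_name: example_stream          # hypothetical job name
    push_basic_auth:
      enabled: true
      username: ${PUSHGATEWAY_USER}        # hypothetical environment variable
      password: ${PUSHGATEWAY_PASSWORD}    # hypothetical environment variable
    use_histogram_timing: true             # histograms instead of summaries
    histogram_buckets: [0.001, 0.01, 0.1, 1.0]
```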

Common Configuration

Batching

Tyk Streams is able to join sources and sinks with sometimes conflicting batching behaviours without sacrificing its strong delivery guarantees. Therefore, batching within Tyk Streams is a mechanism that serves multiple purposes:
  1. Performance (throughput)
  2. Compatibility (mixing multi and single part message protocols)

Performance

For most users the only benefit of batching messages is improving throughput over your output protocol. For some protocols this can happen in the background and requires no configuration from you. However, if an output has a batching configuration block this means it benefits from batching and requires you to specify how you’d like your batches to be formed by configuring a batching policy:
output:
  kafka:
    addresses: [ todo:9092 ]
    topic: tyk_stream

    # Either send batches when they reach 10 messages or when 100ms has passed
    # since the last batch.
    batching:
      count: 10
      period: 100ms
However, a small number of inputs such as kafka must be consumed sequentially (in this case by partition) and therefore benefit from specifying your batch policy at the input level instead:
input:
  kafka:
    addresses: [ todo:9092 ]
    topics: [ tyk_input_stream ]
    batching:
      count: 10
      period: 100ms

output:
  kafka:
    addresses: [ todo:9092 ]
    topic: tyk_stream
Inputs that behave this way are documented as such and have a batching configuration block. Sometimes you may prefer to create your batches before processing, in which case if your input doesn’t already support a batch policy you can instead use a broker, which also allows you to combine inputs with a single batch policy:
input:
  broker:
    inputs:
      - resource: foo
      - resource: bar
    batching:
      count: 50
      period: 500ms
This also works the same with output brokers.
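A sketch of the equivalent output configuration is shown below. It mirrors the input example and assumes that the output broker lists its children under an outputs field; foo and bar are placeholder resource names.

```yaml
output:
  broker:
    outputs:                 # assumed field name, by analogy with inputs
      - resource: foo
      - resource: bar
    batching:
      count: 50
      period: 500ms
```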

Compatibility

Tyk Streams is able to read and write over protocols that support multiple part messages, and all payloads travelling through Tyk Streams are represented as a multiple part message. Therefore, all components within Tyk Streams are able to work with multiple parts in a message as standard. When messages reach an output that doesn’t support multiple parts, the message is broken down into an individual message per part, and then one of two behaviours happens depending on the output. If the output supports batch sending then the collection of messages is sent as a single batch. Otherwise, Tyk Streams falls back to sending the messages sequentially in multiple, individual requests. This behaviour means that not only can multiple part message protocols be easily matched with single part protocols, but also that the concepts of multiple part messages and message batches are interchangeable within Tyk Streams.

Batch Policy

When an input or output component has a config field batching that means it supports a batch policy. This is a mechanism that allows you to configure exactly how your batching should work on messages before they are routed to the input or output it’s associated with. Batches are considered complete and will be flushed downstream when any of the following conditions is met:
  • The byte_size field is non-zero and the total size of the batch in bytes matches or exceeds it (disregarding metadata).
  • The count field is non-zero and the total number of messages in the batch matches or exceeds it.
  • The period field is non-empty and the time since the last batch exceeds its value.
This allows you to combine conditions:
output:
  kafka:
    addresses: [ todo:9092 ]
    topic: tyk_stream

    # Either send batches when they reach 10 messages or when 100ms has passed
    # since the last batch.
    batching:
      count: 10
      period: 100ms
A batch policy has the capability to create batches, but not to break them down.
If your configured pipeline is processing messages that are batched before they reach the batch policy then they may circumvent the conditions you’ve specified here, resulting in sizes you aren’t expecting.

Field Paths

Many components within Tyk Streams allow you to target certain fields using a JSON dot path. The syntax of a path within Tyk Streams is similar to JSON Pointers, except with dot separators instead of slashes (and no leading dot). When a path is used to set a value, any path segment that does not yet exist in the structure is created as an object. For example, if we had the following JSON structure:
{
  "foo": {
    "bar": 21
  }
}
The query path foo.bar would return 21. The characters ~ (%x7E) and . (%x2E) have special meaning in Tyk Streams paths. Therefore ~ needs to be encoded as ~0 and . needs to be encoded as ~1 when these characters appear within a key. For example, if we had the following JSON structure:
{
  "foo.foo": {
    "bar~bo": {
      "": {
        "baz": 22
      }
    }
  }
}
The query path foo~1foo.bar~0bo..baz would return 22.

Arrays

When Tyk Streams encounters an array whilst traversing a JSON structure it requires the next path segment to be either an integer of an existing index, or, depending on whether the path is used to query or set the target value, the character * or - respectively. For example, if we had the following JSON structure:
{
  "foo": [
    0, 1, { "bar": 23 }
  ]
}
The query path foo.2.bar would return 23.
Querying
When a query reaches an array the character * indicates that the query should return the value of the remaining path from each element of the array (within an array).
Setting
When an array is reached the character - indicates that a new element should be appended to the end of the existing elements. If this character is not the final segment of the path then the appended element is created as an object.
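The rules for * and - can be illustrated with a small worked example. The structure below is hypothetical and is written as YAML, with the paths and the results described above shown as comments.

```yaml
# Hypothetical document
foo:
  - bar: 1
  - bar: 2

# Querying
#   foo.0.bar   returns 1
#   foo.*.bar   returns [1, 2]  (the value of bar from each element, within an array)
#
# Setting
#   Setting foo.-.bar to 3 appends a new object, because - is not the
#   final segment of the path. The resulting document is:
#
#   foo:
#     - bar: 1
#     - bar: 2
#     - bar: 3
```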

Processing Pipelines

Within a Tyk Streams configuration, in between input and output, is a pipeline section. This section describes an array of processors that are to be applied to all messages, and are not bound to any particular input or output. If you have processors that are heavy on CPU and aren’t specific to a certain input or output they are best suited for the pipeline section. It is advantageous to use the pipeline section as it allows you to set an explicit number of parallel threads of execution:
input:
  resource: foo

pipeline:
  threads: 4
  processors:
    - avro:
        operator: "to_json"

output:
  resource: bar
If the field threads is set to -1 (the default) it will automatically match the number of logical CPUs available. By default almost all Tyk Streams sources will utilize as many processing threads as have been configured, which makes horizontal scaling easy.