nats bench
NATS is fast and lightweight, and places a priority on performance. The nats CLI tool can, amongst many other things, be used for running benchmarks and measuring performance of your target NATS service infrastructure. In this tutorial, you learn how to benchmark and tune NATS on your systems and environment.
Note: the numbers below are just examples and were obtained using a MacBook Pro M4 (November 2024) running version 2.12.1 of nats-server:
Model Name: MacBook Pro
Model Identifier: Mac16,1
Model Number: MW2U3LL/A
Chip: Apple M4
Total Number of Cores: 10 (4 performance and 6 efficiency)
Memory: 16 GB
System Firmware Version: 13822.1.2
OS Loader Version: 13822.1.2
Prerequisites
Start the NATS server with monitoring enabled
nats-server -m 8222 -js
Verify that the NATS server starts successfully, as well as the HTTP monitor:
[2932] 2025/10/28 12:29:02.879297 [INF] Starting nats-server
[2932] 2025/10/28 12:29:02.879658 [INF] Version: 2.12.1
[2932] 2025/10/28 12:29:02.879661 [INF] Git: [fab5f99]
[2932] 2025/10/28 12:29:02.879664 [INF] Name: NBIYCV5UNYPP2ZBZJZNGQ7UJNJILSQZCD6MK2CPWU6UY7PHYPKWOYYS4
[2932] 2025/10/28 12:29:02.879667 [INF] Node: YNleYaHo
[2932] 2025/10/28 12:29:02.879668 [INF] ID: NBIYCV5UNYPP2ZBZJZNGQ7UJNJILSQZCD6MK2CPWU6UY7PHYPKWOYYS4
[2932] 2025/10/28 12:29:02.880586 [INF] Starting http monitor on 0.0.0.0:8222
[2932] 2025/10/28 12:29:02.880696 [INF] Starting JetStream
[2932] 2025/10/28 12:29:02.880755 [WRN] Temporary storage directory used, data could be lost on system reboot
[2932] 2025/10/28 12:29:02.881014 [INF] _ ___ _____ ___ _____ ___ ___ _ __ __
[2932] 2025/10/28 12:29:02.881018 [INF] _ | | __|_ _/ __|_ _| _ \ __| /_\ | \/ |
[2932] 2025/10/28 12:29:02.881019 [INF] | || | _| | | \__ \ | | | / _| / _ \| |\/| |
[2932] 2025/10/28 12:29:02.881020 [INF] \__/|___| |_| |___/ |_| |_|_\___/_/ \_\_| |_|
[2932] 2025/10/28 12:29:02.881020 [INF]
[2932] 2025/10/28 12:29:02.881021 [INF] https://docs.nats.io/jetstream
[2932] 2025/10/28 12:29:02.881022 [INF]
[2932] 2025/10/28 12:29:02.881022 [INF] ---------------- JETSTREAM ----------------
[2932] 2025/10/28 12:29:02.881023 [INF] Strict: true
[2932] 2025/10/28 12:29:02.881026 [INF] Max Memory: 12.00 GB
[2932] 2025/10/28 12:29:02.881027 [INF] Max Storage: 233.86 GB
[2932] 2025/10/28 12:29:02.881027 [INF] Store Directory: "/var/folders/cx/x13pjm0n3ds6w4q_4xhr_c0r0000gn/T/nats/jetstream"
[2932] 2025/10/28 12:29:02.881029 [INF] API Level: 2
[2932] 2025/10/28 12:29:02.881030 [INF] -------------------------------------------
[2932] 2025/10/28 12:29:02.881335 [INF] Listening for client connections on 0.0.0.0:4222
[2932] 2025/10/28 12:29:02.881434 [INF] Server is ready
Run a publisher throughput test
Let's run a first test to see how fast a single publisher can publish one million 16-byte messages to the NATS server. This should yield very high numbers, as there is no subscriber on the subject being used.
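A sketch of such a run, assuming a current nats CLI with the bench subcommands (the subject name benchsubject is arbitrary, and exact flag spellings may vary between CLI versions; check nats bench pub --help):

```shell
# One publishing client, one million 16-byte messages, no subscribers
nats bench pub benchsubject --clients 1 --msgs 1000000 --size 16
```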
The output tells you the number of messages and the number of payload bytes that the client was able to publish per second:
Run a publish/subscribe throughput test
While the measurement above is an interesting data point, it is purely an academic measurement as you will usually have one (or more) subscribers for the messages being published.
Let's look at throughput for a single publisher with a single subscriber. For this, we need to run two instances of nats bench at the same time (e.g. in two shell windows), one to subscribe and one to publish.
First start the subscriber (it doesn't start measuring until it receives the first message from the publisher).
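For example, along these lines (benchsubject is an arbitrary subject name used throughout these sketches):

```shell
# Blocks until the first message arrives, then measures until
# 1,000,000 messages have been received
nats bench sub benchsubject --clients 1 --msgs 1000000
```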
Then start the publisher.
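In the second shell, something like the following (same subject and message count as the subscriber):

```shell
# Publishes 1,000,000 16-byte messages on the subject the subscriber is on
nats bench pub benchsubject --clients 1 --msgs 1000000 --size 16
```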
Publisher's output:
Subscriber's output:
We can also increase the size of the messages using --size, for example:
Publisher:
Subscriber:
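As a sketch, with 1024 bytes as an example size (run the two commands in separate shells, subscriber first):

```shell
# Subscriber shell
nats bench sub benchsubject --msgs 1000000
# Publisher shell: same message count, larger payload
nats bench pub benchsubject --msgs 1000000 --size 1024
```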
As expected, the number of messages per second decreases with the larger message size, but the data throughput increases massively.
Run a 1:N throughput test
You can also measure performance with a message fan-out, where multiple subscribers each receive a copy of the message. You can do this using the --clients flag; each client is a goroutine making its own connection to the server and subscribing to the subject.
When specifying multiple clients, nats bench will also report aggregated statistics.
For example, for a fan-out of 4, start the subscribers and then the publisher:
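A possible invocation (flag values are illustrative; run each command in its own shell, subscribers first):

```shell
# 4 subscribing clients, i.e. a fan-out of 4
nats bench sub benchsubject --clients 4 --msgs 1000000
# In another shell: a single publisher
nats bench pub benchsubject --msgs 1000000 --size 16
```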
Publisher's output:
Subscribers' output:
Run an N:M throughput test
When more than one publishing client is specified, nats bench evenly distributes the total number of messages (--msgs) across the number of publishers (--clients).
So let's increase the number of publishers and also increase the number of messages so the benchmark run lasts a little bit longer:
Subscriber:
Publisher:
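For example (the counts are illustrative; run each side in its own shell):

```shell
# Subscriber side: 4 subscribing clients
nats bench sub benchsubject --clients 4 --msgs 10000000
# Publisher side: 10,000,000 messages split across 4 publishing clients
nats bench pub benchsubject --clients 4 --msgs 10000000 --size 16
```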
Publisher's output
Subscriber's output:
Run a request-reply latency test
You can also test request/reply performance using nats bench service.
In one shell start a nats bench to act as a server and let it run:
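For instance (testservice is an arbitrary subject name; see nats bench service serve --help for your version's exact arguments):

```shell
# Acts as a service instance, replying to each request; leave it running
nats bench service serve testservice
```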
And in another shell send some requests (each request is sent synchronously, one after the other):
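A sketch of the requesting side:

```shell
# Synchronous request-reply round trips, good for measuring latency
nats bench service request testservice --clients 1 --msgs 10000
```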
In this case, the average request-reply latency between the two nats bench processes over NATS was 50.87 microseconds. However, since those requests are made synchronously, we cannot measure throughput this way. To generate a lot more load, we need more than one client making those synchronous requests at the same time, and we will also run more than one service instance (as you would in production) so that the requests are load-balanced between the service instances using the queue group functionality.
Start the service instances and leave running:
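For example (4 instances is an illustrative choice):

```shell
# 4 service instances, load-balanced via a queue group; leave running
nats bench service serve testservice --clients 4
```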
Clients making requests (since we are using a lot of clients to generate load, we will not show the progress bar while running the benchmark):
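Something along these lines (client and message counts are illustrative):

```shell
# Many requesting clients generating load; hide the progress bar
nats bench service request testservice --clients 20 --msgs 200000 --no-progress
```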
Run JetStream benchmarks
You can measure JetStream performance using the nats bench js commands.
Measure JetStream publication performance
You can measure the performance of publishing (storing) messages into a stream using nats bench js pub, which offers three options:
nats bench js pub sync publishes the messages synchronously, one after the other (so while it's good for measuring latency, it's not good for measuring throughput).
nats bench js pub async publishes a batch of messages asynchronously, waits for all of the publications' acknowledgements, and moves on to the next batch (which is a good way to measure throughput).
nats bench js pub batch uses the atomic batch publish (while batching is currently implemented only to provide atomicity, it has the side effect of potentially helping throughput, especially for smaller messages).
By default, nats bench js pub uses a stream called benchstream, and --create will automatically create the stream if it doesn't exist yet. You can use --purge to clear the stream first, specify stream attributes such as --replicas 3, --storage memory, or --maxbytes, or operate on any existing stream with --stream.
For example, test latency of publishing to a memory stream:
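A sketch, relying on the default benchstream stream (message count is illustrative):

```shell
# Synchronous publications to a memory-storage stream (created if needed)
nats bench js pub sync --create --storage memory --msgs 10000
```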
Test throughput using batch publishing:
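For instance (exact batch-related flags may differ by CLI version; check nats bench js pub batch --help):

```shell
# Atomic batch publishing; can help throughput for smaller messages
nats bench js pub batch --msgs 1000000 --size 16
```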
Remove the stream and test again with file storage (which is the default):
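For example:

```shell
# Delete the memory stream created earlier, without confirmation prompt
nats stream rm benchstream -f
# File storage is the default; async publication batches measure throughput
nats bench js pub async --create --msgs 1000000
```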
You can even measure publish performance to a --replicas 1 stream with asynchronous persistence using --persistasync, which yields throughput similar to memory storage. By default, JetStream flushes disk writes synchronously, meaning that even if the nats-server process is killed suddenly, no messages will be lost, as the OS already has them in its buffer and will flush them to disk. (It can also be configured to not just flush but also sync after every write, in which case no message will be lost even if the whole host goes down suddenly, at the obvious expense of latency.)
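A sketch of such a run:

```shell
# Replicas 1, file storage, but with asynchronous persistence
nats bench js pub async --create --replicas 1 --persistasync --msgs 1000000
```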
Measure JetStream consumption (replay) performance
Once you have stored some messages on a stream you can measure the replay performance in multiple ways:
nats bench js ordered uses an ordered ephemeral consumer to receive the messages (so each client gets its own copy of the messages).
nats bench js consume uses the Consume() (callback) function on a durable consumer to receive the messages.
nats bench js fetch uses the Fetch() function on a durable consumer to receive messages in batches.
nats bench js get gets the messages directly by sequence number (either synchronously one by one, or using 'batched gets') without using a consumer.
Starting with ordered consumer:
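For example:

```shell
# Each client gets its own ordered ephemeral consumer (and its own copy)
nats bench js ordered --msgs 1000000
```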
Then using consume to distribute consumption of messages between multiple clients through a durable consumer with explicit acknowledgements:
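A sketch (the client count is illustrative):

```shell
# 4 clients sharing a durable consumer, acknowledging each message
nats bench js consume --clients 4 --msgs 1000000
```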
Using fetch with two clients to retrieve batches of 400 messages through a durable consumer and without explicit acknowledgements:
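Something along these lines (the flag for disabling explicit acknowledgements is omitted here; check nats bench js fetch --help for your version):

```shell
# 2 clients fetching messages in batches of 400 from a durable consumer
nats bench js fetch --clients 2 --msgs 1000000 --batch 400
```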
Measuring the latency of direct synchronous gets:
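For instance:

```shell
# Direct gets by sequence number, one at a time (good for latency)
nats bench js get --msgs 10000
```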
And finally measuring throughput using batched gets with a fan out of 2:
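A sketch (the batch size and the exact flag for batched gets are assumptions; check nats bench js get --help):

```shell
# Batched direct gets with a fan-out of 2
nats bench js get --clients 2 --msgs 1000000 --batch 100
```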
Measuring publication and consumption together
While measuring publication and consumption to and from a stream separately yields interesting metrics, during normal operations the consumers will most of the time be online and consuming while the messages are being published to the stream.
First purge the stream and start the consuming instance of nats bench, for example using an ordered consumer and 8 clients (so a fan out of 8):
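For example:

```shell
# Purge the stream without confirmation prompt
nats stream purge benchstream -f
# 8 ordered-consumer clients, i.e. a fan-out of 8; leave running
nats bench js ordered --clients 8 --msgs 1000000
```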
Then start publishing to the stream, for example using 8 clients doing asynchronous publications:
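A sketch of the publishing side:

```shell
# 1,000,000 messages split across 8 asynchronously publishing clients
nats bench js pub async --clients 8 --msgs 1000000
```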
Consumer's output:
Measure KV performance
nats bench kv can be used to measure Key Value performance using synchronous put and get operations.
First put some data in the KV:
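For example (the bucket name and other defaults depend on your CLI version; see nats bench kv put --help):

```shell
# Synchronous puts into the KV bucket
nats bench kv put --msgs 100000
```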
Then simulate a bunch of clients doing gets on random keys:
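Something like the following:

```shell
# 10 clients doing synchronous gets on randomly chosen keys
nats bench kv get --clients 10 --msgs 1000000
```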
Play around with the knobs
Don't be afraid to test different JetStream storage and replication options (assuming you have access to a JetStream enabled cluster of servers if you want to go beyond --replicas 1), and of course the number of publishing/subscribing clients, and the batch and message sizes.
You can also use nats bench as a tool to generate traffic at a steady rate by using the --sleep flag to introduce a delay between the publication of each message (or batch of messages). You can also use that same flag to simulate processing time when consuming messages.
Note: if you change the attributes of a stream between runs, you will have to delete the stream first (e.g. run nats stream rm benchstream).
Leave no trace: clean up the resources when you are finished
Once you have finished benchmarking streams, remember that if you have stored many messages in the stream (which is very easy and fast to do), the stream may end up consuming a noticeable amount of resources (i.e. memory and files) on the nats-server infrastructure that you may want to reclaim.
You can use the --purge bench command flag to tell nats to purge the stream of messages before starting a benchmark, purge the stream manually using nats stream purge benchstream, or just delete it altogether using nats stream rm benchstream.