NATS 2.11
Last updated
Was this helpful?
Last updated
Was this helpful?
This guide is tailored for existing NATS users upgrading from NATS version v2.10.x. This will read as a summary with links to specific documentation pages to learn more about the feature or improvement.
Distributed message tracing: Users can now trace messages as they move through the system by setting a Nats-Trace-Dest
header to an inbox subject. Servers on the message path will return events to the provided subject that report each time a message enters or leaves a server, by which connection type, when subject mappings occur, or when messages traverse an account import/export boundary. Additionally, the Nats-Trace-Only
header (if set to true) will allow tracing events to propagate on a specific subject without delivering them to subscribers of that subject.
JetStream per-message TTLs: It is now possible to age out individual messages using a per-message TTL. The Nats-TTL
header, in either string or integer format (in seconds) allows for individual message expiration independent of stream limits. This can be combined with other limits in place on the stream. More information is available in .
Subject delete markers on MaxAge: The SubjectDeleteMarkerTTL
stream configuration option now allows for the placement of delete marker messages in the stream when the configured MaxAge
limit causes the last message for a given subject to be deleted. The delete markers include a Nats-Marker-Reason
header explaining which limit was responsible for the deletion.
Stream ingest rate limiting: New options max_buffered_size
and max_buffered_msgs
in the jetstream
configuration block enable rate limiting on Core NATS publishing into JetStream streams, protecting the system from overload.
Pull consumer priority groups: Pull consumers now support priority groups with pinning and overflow, enabling flexible failover and priority management when multiple clients are pulling from the same consumer. Configurable policies based on the number of pending messages on the consumer, or the number of pending acks, can control when messages overflow from one client to another, enabling new design patterns or regional awareness.
Consumer pausing: Message delivery to consumers can be temporarily suspended using the new pause API endpoint (or the PauseUntil
configuration option when creating), ideal for maintenance or migrations. Message delivery automatically resumes once the configured deadline has passed. Consumer clients continue to receive heartbeat messages as usual to ensure that they do not surface errors during the pause.
Replication traffic in asset accounts: Raft replication traffic can optionally be moved into the same account in which replicated assets live on a per-account basis, rather than being sent and received in the system account. When combined with multiple route connections, this can help to reduce latencies and avoid head-of-line blocking issues that may occur in heavily-loaded multi-tenant or multi-account deployments.
TLS first on leafnode connections: A new handshake_first
in the leafnode tls
block allows setting up leafnode connections that perform TLS negotiation first, before any other protocol handshakes take place.
Configuration state digest: A new -t
command line flag on the server binary can generate a hash of the configuration file. The config_digest
item in varz
displays the hash of the currently running configuration file, making it possible to check whether a configuration file has changed on disk compared to the currently running configuration.
TPM encryption on Windows: When running on Windows, the filestore can now store encryption keys in the TPM, useful in environments where physical access may be a concern.
SparkplugB: The built-in MQTT support is now compliant with SparkplugB Aware, with support for NBIRTH
and NDEATH
messages.
Replicated delete proposals: Message removals in clustered interest-based or workqueue streams are now propagated via Raft to guarantee consistent removal order across replicas, reducing a number of possible ways that a cluster failure can result in de-synced streams.
Metalayer, stream and consumer consistency: A new leader now only responds to read/write requests after synchronizing with its Raft log, preventing desynchronization between KV key updates and the stream during leader changes.
Replicated consumer reliability: Replicated consumers now consistently redeliver unacknowledged messages after a leader change.
Consumer starting sequence: The consumer starting sequence is now always respected, except for internal hidden consumers for sources/mirrors.
The NATS Server can now return a 429 error with type JSStreamTooManyRequests
when too many messages have been queued up for a stream. It should not generally be possible to hit this limit while using JetStream publishes and waiting for PubAcks, but may trigger if trying to publish into JetStream using Core NATS publishes without waiting for PubAcks, which is not advised.
The new max_buffered_size
and max_buffered_msgs
options control how many messages can be queued for each stream before the rate limit is hit, therefore if needed, you can increase these limits on your deployments. The default values for max_buffered_size
and max_buffered_msgs
are 128MB and 10,000 respectively, whereas in v2.10 these were unlimited.
You can detect in the server logs whether running into a queue limit with the following warning:
If your application starts to log the above warnings then you can first try to increase the limits to higher values while investigating the fast publishers, for example:
Since stream deletes are now replicated through group proposals in a replicated stream, there may be a slight increase in replication traffic on this version.
The js-server-only
healthcheck no longer checks for the health of the metaleader on v2.11.0. Since this healthcheck was designed to detect the server readiness (or in k8s for the readiness probe) checking the metaleader would sometimes cause a NATS server to be considered unhealthy when restarting the servers. In v2.11, this should no longer be an issue. If the previous behavior from v2.10 is preferred, there is a new healthcheck option js-meta-only
which can be used to check whether the meta group is healthy.
Earlier versions of the NATS Server would return an exit code 1 when gracefully shut down, i.e. after SIGTERM. From v2.11, an exit code of 0 (zero) will now be returned instead.
Configurations that have server, cluster, and gateway names with spaces are now considered invalid, as this can cause problems at the protocol level. A server running NATS v2.11 will fail to start with spaces configured in these names. Please ensure that spaces are not used in server, cluster or gateway names.
When downgrading from v2.11 to v2.10, the stream state files on disk will be rebuilt due to a change in the format of these files in v2.11. This requires re-scanning all stream message blocks, which may use higher CPU than usual and will likely take longer for the restarted node to report healthy. This will only happen on the first restart after downgrading and will not result in data loss.