-store indicates what type of store to use, in this case file. The other (-dir) indicates in which directory the state should be stored.
server.dat), another to record client information (
foo, and assuming that you started the server with -dir datastore, then you will find a directory called datastore/foo. In this directory you will find several files: one to record subscription information (subs.dat), and a series of files that log the messages.
-max_channels. When the limit is reached, any new subscription or message published on a new channel will produce an error.
-max_subs. A client that tries to create a subscription on a given channel (subject) for which the limit is reached will receive an error.
-max_bytes. However, for messages, the client does not get an error when the limit is reached. The oldest messages are discarded to make room for the new messages.
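The difference between the two behaviors (an error for channels and subscriptions, silent discarding for messages) can be illustrated with a toy model. This is only a sketch: the class name BoundedMessageLog is hypothetical, the limit here counts payload bytes only, and nothing below reflects how the server actually implements its stores.

```python
from collections import deque


class BoundedMessageLog:
    """Toy in-memory model of a channel limited by total payload bytes.

    Hypothetical illustration only, not the server's implementation.
    """

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.messages = deque()  # (sequence, payload) pairs, oldest first
        self.total_bytes = 0
        self.next_seq = 1

    def publish(self, payload):
        """Store a message. The publisher never gets an error for the limit."""
        seq = self.next_seq
        self.next_seq += 1
        self.messages.append((seq, payload))
        self.total_bytes += len(payload)
        # Discard the oldest messages until the limit is respected again.
        while self.total_bytes > self.max_bytes and self.messages:
            _, old = self.messages.popleft()
            self.total_bytes -= len(old)
        return seq


log = BoundedMessageLog(max_bytes=10)
for body in (b"aaaa", b"bbbb", b"cccc"):
    log.publish(body)

remaining = [seq for seq, _ in log.messages]
print(remaining)  # [2, 3]: message 1 was discarded to make room
```

All three publishes succeed, but only the two most recent messages remain available to subscribers.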
--file_slice_max_msgs), the size of the file, including the corresponding index file (--file_slice_max_bytes), or the period of time that a file slice should cover, starting at the time the first message is stored in that slice (--file_slice_max_age). By default, only the slice size is configured, at 64MB.
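As a rough sketch of how the three slice limits combine, the function below decides whether the current slice is full. The names SliceLimits, SliceState and needs_new_slice are hypothetical, and treating 0 as "not configured" is an assumption of this example rather than a statement about the server.

```python
from dataclasses import dataclass


@dataclass
class SliceLimits:
    # Assumption for this sketch: 0 means the limit is not configured.
    max_msgs: int = 0
    max_bytes: int = 64 * 1024 * 1024  # only the size is set by default
    max_age_secs: float = 0.0


@dataclass
class SliceState:
    msg_count: int = 0
    size_bytes: int = 0          # data file plus its index file
    first_msg_time: float = 0.0  # when the first message was stored


def needs_new_slice(state, limits, now):
    """Return True when any configured limit is reached for this slice."""
    if state.msg_count == 0:
        return False  # an empty slice can always take its first message
    if limits.max_msgs > 0 and state.msg_count >= limits.max_msgs:
        return True
    if limits.max_bytes > 0 and state.size_bytes >= limits.max_bytes:
        return True
    if limits.max_age_secs > 0 and now - state.first_msg_time >= limits.max_age_secs:
        return True
    return False


full = SliceState(msg_count=10, size_bytes=64 * 1024 * 1024, first_msg_time=0.0)
print(needs_new_slice(full, SliceLimits(), now=5.0))  # True: size limit reached
```

Any single limit is enough to close the slice, which is why configuring only the size still produces a bounded number of messages per file.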
--file_slice_archive_script), then the server will rename the slice files (data and index) with a .bak extension and invoke the script with the channel name, data and index file names. The files are left in the channel's directory, so it is the script's responsibility to delete them when done. In any case, those files will not be recovered on a server restart, but having many unused files in the directory may slow down the restart.
datastore/foo/msgs.1.idx), and you have configured the script /home/nats-streaming/archive_script.sh. The server will invoke:
.bak extension so that they are not going to be recovered if the script leaves those files in place.
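An archive script can be any executable. The following is a minimal sketch written in Python, assuming the three arguments arrive in the order described above (channel name, data file, index file). The ARCHIVE_ROOT environment variable, its default location and the function name archive_slice are hypothetical choices made for this example.

```python
import os
import shutil
import sys
from pathlib import Path

# Hypothetical destination; adjust to your environment.
ARCHIVE_ROOT = os.environ.get("ARCHIVE_ROOT", "/tmp/nats-streaming-archive")


def archive_slice(channel, data_file, index_file, archive_root=ARCHIVE_ROOT):
    """Move the renamed slice files out of the channel's directory.

    Moving (rather than copying) also removes the files from the store
    directory, which the server leaves to the script.
    """
    dest = Path(archive_root) / channel
    dest.mkdir(parents=True, exist_ok=True)
    moved = []
    for name in (data_file, index_file):
        src = Path(name)
        target = dest / src.name
        shutil.move(str(src), str(target))
        moved.append(target)
    return moved


if __name__ == "__main__":
    if len(sys.argv) == 4:
        archive_slice(sys.argv[1], sys.argv[2], sys.argv[3])
    else:
        print("usage: archive_script <channel> <data file> <index file>",
              file=sys.stderr)
```

Because the files are moved, nothing is left behind in the channel's directory to slow down a later restart.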
fds_limit (or command line parameter --file_fds_limit) may be considered to limit the total use of file descriptors.
unexpected EOF error during the recovery process.
-file_truncate_bad_eof parameter, the server will still print those bad records but truncate each file at the position of the first corrupted record in order to start successfully.
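To make "truncate at the position of the first corrupted record" concrete, here is a small sketch. The record layout used below (a 4-byte length and a 4-byte CRC32 followed by the payload) is purely hypothetical and is not a description of the server's on-disk format. The function names are likewise invented for this example.

```python
import os
import struct
import zlib

# Hypothetical record header: payload length, then CRC32 of the payload.
HEADER = struct.Struct("<II")


def encode_record(payload):
    """Build one record in the hypothetical layout."""
    return HEADER.pack(len(payload), zlib.crc32(payload)) + payload


def first_bad_offset(data):
    """Return the offset of the first incomplete or corrupted record,
    or len(data) when every record is intact."""
    offset = 0
    while offset < len(data):
        if offset + HEADER.size > len(data):
            return offset  # header cut short: unexpected EOF
        length, crc = HEADER.unpack_from(data, offset)
        start = offset + HEADER.size
        payload = data[start:start + length]
        if len(payload) < length or zlib.crc32(payload) != crc:
            return offset  # payload cut short or checksum mismatch
        offset = start + length
    return offset


def truncate_at_first_bad_record(path):
    """Drop everything from the first bad record onward; return the new size."""
    with open(path, "rb") as f:
        data = f.read()
    good = first_bad_offset(data)
    if good < len(data):
        os.truncate(path, good)
    return good
```

Everything before the first bad record is kept, and everything after it is lost, including any intact records that followed the corrupted one.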
server.dat: This file contains metadata and NATS subjects used to communicate with client applications. If a corruption is reported with this file, we suggest that you stop all your clients, stop the server, remove this file, and restart the server. This will create a new server.dat file, but the server will not attempt to recover the rest of the channels because it assumes that there is no state. So you should stop and restart the server once more. Then you can restart all your clients.
clients.dat: This contains information about client connections. If the file is truncated to move past an unexpected EOF error, this can result in no issue at all, or in client connections not being recovered. In the latter case, the server will not know about possibly running clients, so it will not try to deliver any message to those non-recovered clients, and it will reject incoming published messages from them. It is also possible that the server recovers a client connection that was actually closed. In this case, the server may attempt to deliver or redeliver messages unnecessarily.
subs.dat: This is a channel's subscriptions file (under the channel's directory). If this file is truncated and some records are lost, it may result in no issue at all, or in client applications not receiving their messages since the server will not know about them. It is also possible that acknowledged messages get redelivered (since their ack may have been lost).
msgs.<n>.dat: This is a channel's message log (several per channel). If one of those files is truncated, then message loss occurs. With the unexpected EOF errors, it is likely that only the last "file slice" of a channel will be affected. Nevertheless, if a lower sequence file slice is truncated, then gaps in the message sequence will occur. A channel could then hold messages 1..100 and 110..300, for instance, with messages 101 to 109 missing. Again, this is unlikely since we expect the unexpected end-of-file errors to occur on the last slice.
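If you want to check for such gaps after a truncation, the idea can be sketched as follows, assuming you already know which sequence ranges survived (how you obtain them is up to you; the helper name find_gaps is hypothetical).

```python
def find_gaps(ranges):
    """Given sorted, non-overlapping (first, last) sequence ranges,
    return the (first, last) ranges of missing sequences between them."""
    gaps = []
    for (_, prev_last), (next_first, _) in zip(ranges, ranges[1:]):
        if next_first > prev_last + 1:
            gaps.append((prev_last + 1, next_first - 1))
    return gaps


print(find_gaps([(1, 100), (110, 300)]))  # [(101, 109)]
```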
unexpected EOF errors for NATS Streaming file stores, however, you may want to simply delete all NATS Streaming and RAFT stores for the failed node and restart it. By design, the other nodes in the cluster have replicated the data, so this node will become a follower and catch up with the rest of the cluster, getting the data from the current leader and recreating its local stores.