It is possible to deploy a load balancer between the client applications and the cluster servers (or even between servers in a cluster, or between clusters in a super-cluster), but you don't need to: NATS already has its own mechanisms to balance connections across the seeds in the connection URL (including the clients randomizing the returned DNS A records) and to automatically re-establish dropped connections. With a cluster of 3 seed nodes you will often get more network throughput by connecting directly than by going through a load balancer: cloud provider load balancers can be woefully underpowered, and they also cost you more money, since a load balancer is typically billed by the amount of data going through it. Finally, if you want to use TLS for authentication, you do not want the load balancer to be the TLS termination point.
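To illustrate the idea of spreading connections across the seeds, here is a minimal Python sketch. It is not the NATS client implementation, only an illustration of shuffling a comma-separated seed list before trying each server in turn. The host names and the helper `shuffled_seeds` are hypothetical.

```python
import random
from typing import List, Optional


def shuffled_seeds(connect_url: str, rng: Optional[random.Random] = None) -> List[str]:
    """Split a comma-separated seed list and return it in random order."""
    seeds = [s.strip() for s in connect_url.split(",") if s.strip()]
    # Shuffle in place; `seeds` is a fresh list, so the input string is untouched.
    (rng or random.Random()).shuffle(seeds)
    return seeds


# Hypothetical seed URLs, for illustration only.
URL = (
    "nats://seed-a.example.com:4222,"
    "nats://seed-b.example.com:4222,"
    "nats://seed-c.example.com:4222"
)
order = shuffled_seeds(URL)
print(order)  # a client would attempt connections in this order
```

Because every client starts from a different random order, connections tend to spread evenly over the seeds without any component in the middle.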
If you do use load balancers, you need to understand the potential issues they introduce and adjust their settings accordingly. The main concerns are problems caused by incorrectly configured idle detection, protocol problems due to packet inspection, and ephemeral port problems at high scale.
If route or gateway connections go through load balancers, then you could very well run into the same problems as above, which could result in periods where JetStream loses quorum and in undue re-synchronization and protocol overhead traffic.
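The idle-detection and ephemeral-port concerns can be sanity-checked with some simple arithmetic. The sketch below is only a rough model: the function names are hypothetical, the timing values are illustrative, and it assumes the load balancer draws source ports from a range the size of the Linux default (`net.ipv4.ip_local_port_range`, 32768 to 60999), which may not match your provider.

```python
# Size of the Linux default ephemeral port range (32768-60999).
# Assumption: the load balancer's source-port pool is of similar size.
EPHEMERAL_PORTS = 60999 - 32768 + 1


def ping_keeps_connection_alive(ping_interval_s: float, lb_idle_timeout_s: float) -> bool:
    """True if a ping is sent before the load balancer's idle timer expires."""
    return ping_interval_s < lb_idle_timeout_s


def max_connections_to_one_server(lb_source_ips: int, ports_per_ip: int = EPHEMERAL_PORTS) -> int:
    """Upper bound on concurrent connections from the balancer to one server address."""
    return lb_source_ips * ports_per_ip


# Illustrative values: a 120 s ping interval behind a 60 s idle timeout
# would let the balancer drop quiet connections; 30 s would not.
print(ping_keeps_connection_alive(120, 60))  # False
print(ping_keeps_connection_alive(30, 60))   # True
print(max_connections_to_one_server(2))      # 56464
```

If the first check fails, either lower the ping interval or raise the load balancer's idle timeout so that quiet connections are not silently dropped.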
NATS is 'cloud native' and expected to be deployed in virtual environments and/or containers.
However, when it comes to ensuring the highest possible level of performance that NATS can provide, it is good to keep a few things in mind.
Think of Core NATS servers as a software equivalent of network switches. Enable JetStream, and they become a new kind of DB server as well.
What you need to remember when selecting the instance types and storage options for your NATS server hosts in public clouds is that you get what you pay for.
For example, instances that are not network optimized may give you 10 Gb/s of network bandwidth, but only for some period of time (like 30 minutes), after which the available bandwidth may drop dramatically (like to 5 Gb/s) for another period of time. So select network optimized instance types instead if you always need the advertised bandwidth.
It's the same when it comes to storage options: instance types with local SSDs can provide the best latency, while network-attached block storage (e.g. Elastic Block Store from AWS) can provide the highest overall throughput. When using EBS, again, you get what you pay for: a general purpose storage type may give you a certain number of IOPS, but you can sustain those rates only for some period of time, after which the number can drop dramatically. So select IO optimized storage types if you want to continuously sustain the same maximum number of IOPS (e.g. on AWS).
Be careful when setting resource limits for the nats-server containers. The nats-server processes use resources in proportion to the traffic generated by all the client applications. If the NATS (and JetStream) usage is high or bursty (nats-server is very fast and can process sharp bursts in traffic), then you will need to set the container resource limits accordingly, or the container orchestration system will kill the server's container. The nats-server will automatically detect the number of available cores, but it will try to use all of the host's memory rather than the resource limits set for the container, unless you specify GOMEMLIMIT. GOMEMLIMIT is available as an option in the official NATS Helm charts.
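As a rough aid for picking a value for the Go runtime's `GOMEMLIMIT` environment variable, the Python sketch below derives one from a container memory limit. The helper `suggest_gomemlimit` is hypothetical, and the 90% default is an assumed rule of thumb for leaving headroom, not a NATS recommendation; tune it for your workload. The output uses the `MiB` suffix, which the Go runtime accepts.

```python
def suggest_gomemlimit(container_limit_bytes: int, percent: int = 90) -> str:
    """Return a GOMEMLIMIT string set to `percent` of the container memory limit."""
    if container_limit_bytes <= 0:
        raise ValueError("container_limit_bytes must be positive")
    if not 0 < percent <= 100:
        raise ValueError("percent must be in the range 1..100")
    # Integer arithmetic avoids floating-point rounding surprises.
    target_bytes = container_limit_bytes * percent // 100
    # Round down to whole mebibytes.
    return f"{target_bytes // 2**20}MiB"


# A container limited to 4 GiB of memory, keeping 10% headroom (assumed).
print(suggest_gomemlimit(4 * 2**30))  # 3686MiB
```

The resulting string would be set as the `GOMEMLIMIT` environment variable of the nats-server container. Note that it is a soft limit for the Go runtime, so the container limit itself should still leave room above it.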