
Rethinking Connectivity at the Edge: Scaling Fleets of Low-Powered Devices Using NATS.io


Summary

Jeremy Saenz discusses NATS, an open-source project for services communication, and how to leverage NATS to streamline communication and fleet management for devices at the edge.

Bio

Jeremy Saenz is a senior software engineer at Synadia Communications and a maintainer of the open source messaging system NATS. He has worked on many popular open source projects in the Go community, including Martini, Negroni, CLI, Gin, and Inject. Previously Chief Product Officer at Kajabi, Jeremy enjoys wearing a bunch of hats and is passionate about nudging the software engineering industry forward.

About the conference

Software is changing the world. QCon San Francisco empowers software development by facilitating the spread of knowledge and innovation in the developer community. A practitioner-driven conference, QCon is designed for technical team leads, architects, engineering directors, and project managers who influence innovation in their teams.

Transcript

Saenz: I'm Jeremy. We are going to talk all about NATS.

First, let me just tell you a little bit about what NATS is. At the company I work for, which maintains the NATS open source project, we talk a lot about this concept of rethinking connectivity. Why rethink how things connect? Isn't that a solved problem? We've figured out how computers talk to each other and things like that. I think we have some industry trends and really some disruption in our industry related to multi-cloud and edge that's really starting to challenge some of our own preconceived notions of what it's like to build things for the web, or to build microservices or to build streaming platforms.

These things are driving just a massive transformation, and they're really challenging our thinking. None of this stuff is new. A lot of this has come out of research of distributed systems for a long time. We've somehow had amnesia for the past 10, 15 years. We've started moving into the cloud, and we've forgotten some other patterns along the way. What are some of the limitations of the ways that we build things right now? If you're building things for the web, usually you're using things like DNS and host names and IPs to discover things. That's just been the way that we do it all of the time.

If you want to talk to a computer, you need to know its IP address. To get its IP address, you use DNS. We also are so used to these pull based semantics, where you can make an HTTP request and get a response back. We use that hammer for every single problem. There may be some other options there. We've also assumed this perimeter-based security model, where you're like, first and foremost, we're just going to set up our VPC, and we're going to put all of our stuff in there. All of that stuff is going to be secure, because we have some wall around our own set of software. These assumptions are changing as well. We also are so used to these location-dependent backends.

We're going to just take this big database, and we're just going to put it somewhere and that's going to be where all of our stuff is going to be. So many of these patterns emerge from us using this idea of like this one-to-one communication mechanism, especially for microservices, like this HTTP one-to-one style communication, and the many layers built on top of that. We're going to challenge some of these notions today. What I hope that we can walk away with during this talk is just thinking slightly differently about how things can communicate, and to know that there are other options out there. There are different ways to think about how these things can communicate.

Introducing NATS

Let's talk about NATS. NATS is an open source, high-performance messaging system, like a connected mesh or a connected fabric. It's really Pub/Sub messaging at its core, but then has all these layers built on top of it, that can allow for some very highly distributed systems problems to be solved with it. It really aims to simplify the number of moving pieces involved in building complex distributed systems.

We really wanted to lean into this idea of adaptability and scale being some of the core tenets. Some of the ways it solves our current technology limitations, or the ways that we solve things today is that we have location-independent addressing. You just connect to NATS, and then you could talk to anything else that's connected to NATS. You don't use IP addresses anymore. You don't use DNS or domain names. You simply just use simple subject-based addressing. This sounds like a really simple like, "Ok, Jeremy. Yes, that makes sense." It's actually really powerful in concept. We're going to show that. We also lean into this idea of M to N communications.
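Those subject rules are simple enough to sketch: subjects are dot-separated tokens, `*` matches exactly one token, and `>` matches one or more trailing tokens. A minimal standalone matcher (not the server's implementation, just the semantics) might look like:

```python
def subject_matches(pattern: str, subject: str) -> bool:
    """NATS-style matching: subjects are dot-separated tokens,
    '*' matches exactly one token, '>' matches one or more trailing tokens."""
    p_tokens = pattern.split(".")
    s_tokens = subject.split(".")
    for i, p in enumerate(p_tokens):
        if p == ">":                        # final token; swallows everything after it
            return len(s_tokens) > i
        if i >= len(s_tokens):
            return False
        if p not in ("*", s_tokens[i]):
            return False
    return len(p_tokens) == len(s_tokens)

print(subject_matches("hello.*", "hello.jeremy"))          # True
print(subject_matches("hello.*", "hello.jeremy.saenz"))    # False
print(subject_matches("metrics.>", "metrics.nats-0.abc"))  # True
```

This is what makes the addressing location-independent: interest is declared against patterns like these, not against an IP or hostname.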

Rather than just one-to-one, we can support many patterns by being able to say, I'm going to ask a question, and I'm going to get maybe one answer back, maybe multiple answers back, just as an example. Leaning into this M to N communication piece is really central to how we can very elegantly solve some of these distributed systems problems. We also have push and pull based. You can ask a question and get an answer, like poll, or you can actually get things pushed to you, "I'm interested in a topic." Very simple Pub/Sub style, but also very powerful.
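The "ask a question, get maybe one answer, maybe many" idea can be sketched in-process: publish one request, then collect however many replies arrive before a deadline. This toy version uses plain threads and a queue, with hypothetical responder functions standing in for connected services; it is the shape of the pattern, not the NATS client API:

```python
import queue
import threading
import time

def scatter_gather(responders, payload, max_replies=None, timeout=0.5):
    """Send one request to every responder and gather replies on a shared
    inbox until we have enough answers or the deadline passes."""
    inbox = queue.Queue()
    for respond in responders:
        threading.Thread(target=lambda r=respond: inbox.put(r(payload))).start()
    replies, deadline = [], time.monotonic() + timeout
    while max_replies is None or len(replies) < max_replies:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        try:
            replies.append(inbox.get(timeout=remaining))
        except queue.Empty:
            break
    return replies

# Hypothetical "device info" responders standing in for connected clients.
devices = [
    lambda _: {"os": "Android"},
    lambda _: {"os": "iOS"},
    lambda _: {"os": "Android"},
]
print(sorted(r["os"] for r in scatter_gather(devices, "info")))
```

Setting `max_replies=1` gives you classic request-reply; leaving it unbounded gives you scatter-gather.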

NATS is also decentralized in its authentication mechanism, with some zero-trust security constructs. It creates this true multi-tenancy, where you have logical isolation or physical isolation available to you, while still being able to unite everything into a single system. We have this cool, intelligent, persistent layer called JetStream that I'll talk about. Like I said, it was designed for global scale, being able to have a single system that spans the globe, and can continue to deliver performant latency for everything that's connected to it.

NATS Architecture

What is NATS from an architecture standpoint? Quite simply, you have a NATS server, and then you have NATS clients. We have a bunch of different client libraries in various languages, 40 different implementations. We estimate we support about eight official languages, our golden-path languages, which are all the popular ones you would expect. To break this down into some systems, we have this idea of Core NATS, which is just our basic high-performance messaging. It's the ground floor of Pub/Sub. You connect to a NATS server, and you can throw messages around. It's temporally coupled, meaning that if you send a message and the receiver isn't there at that moment, it's not going to get delivered.

A lot of people look at this as a drawback. Actually, being able to have a completely stateless Pub/Sub model at your fingertips is very cool. There's a layer on top of that: if you need guarantees, you can use a subsystem that we call JetStream. It's still built on top of that NATS Core Pub/Sub model, in fact, on a request-reply model that we build on top of it. It's able to persist and easily move data around, replicate data, and support very many different models of how you can store and access that data. Like I said, Core NATS is really fast, fire and forget message publishing. It scales up to millions of messages a second. It's payload agnostic, meaning you can put whatever data you want.

The NATS server doesn't really do anything with that data, it just forwards it to whoever's interested in it. It supports a lot of different communication patterns that we have here, not just request-reply, like HTTP, but we have publish and subscribe, we have fan in and fan out, we have a scatter-gather pattern with our M to N communications. We even get load balancing for free amongst all these nodes. You can say, I have these microservices that are connecting into NATS, and I don't want to have to put a load balancer in front of them. NATS just does it for you. It does it in a very intelligent way. It's globally aware, so it can route you to the closest responder or the closest set of responders and load balance between those as well.
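That free load balancing corresponds to NATS queue groups: plain subscribers each get a copy of every message, while members of the same queue group share a subject and each message is delivered to only one of them. Here is a toy broker sketch of that split, using round-robin as a stand-in for the real server's distribution logic:

```python
from collections import defaultdict

class Broker:
    """Toy broker: plain subscribers fan out; queue-group members on the
    same subject split the messages, one member per message."""
    def __init__(self):
        self.plain = defaultdict(list)                        # subject -> callbacks
        self.groups = defaultdict(lambda: defaultdict(list))  # subject -> group -> members
        self.rr = defaultdict(int)                            # (subject, group) -> counter

    def subscribe(self, subject, cb, queue=None):
        if queue is None:
            self.plain[subject].append(cb)
        else:
            self.groups[subject][queue].append(cb)

    def publish(self, subject, msg):
        for cb in self.plain[subject]:            # every plain subscriber gets a copy
            cb(msg)
        for group, members in self.groups[subject].items():
            idx = self.rr[(subject, group)] % len(members)
            self.rr[(subject, group)] += 1
            members[idx](msg)                     # only one member of the group gets it

# Three workers join the same queue group; nine jobs are spread across them.
counts = {"a": 0, "b": 0, "c": 0}
broker = Broker()
for name in counts:
    broker.subscribe("work", lambda m, n=name: counts.__setitem__(n, counts[n] + 1),
                     queue="workers")
for _ in range(9):
    broker.publish("work", "job")
print(counts)  # each worker handled 3 jobs
```

The real server adds the global awareness described above (routing to the closest responders); this sketch only shows the in-group distribution idea.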

NATS JetStream

I'm going to talk quickly about NATS JetStream. What is JetStream? JetStream is this next-gen distributed persistence layer. Again, like I said, it's built on top of NATS Core, so it's Pub/Sub. It has all those same constructs at its core. It's just a layer on top that gives us those guarantees on persistence. Just like NATS is multi-tenant, it's very configurable and globally scalable. You can replicate this data, not just across data centers, but across oceans, and all over the place. It's fast. It's really simple to replicate data, as well as mux and demux data. This is going to be a really important bit as I talk about what this means for the edge, and for fleet management, and for data locality, and everything like that. What kind of patterns does JetStream support?

We have streaming. Typically, streaming and logs would be what you would typically use Kafka for. Then we have work queues, maybe like a RabbitMQ style thing. We have key-value stores like Redis. We have object stores like MinIO. All this is built on top of the same base. We have this globally ordered set of data that you index based on these subjects or topics. Then, you're able to build a lot of really neat things out of that. These are just some of the patterns that JetStream supports in a single model of a stream and a consumer. We're going to be playing with a lot of this stuff.
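To make the "all built on top of the same base" point concrete, here is a toy sketch of how a key-value view can be derived from an ordered, subject-indexed log: a put is just a publish, a get is the last message on the key's subject, and history falls out of the log for free. The `kv.config` prefix is illustrative, not the real bucket layout:

```python
class Stream:
    """Toy ordered log of (sequence, subject, data) tuples."""
    def __init__(self):
        self.messages = []

    def publish(self, subject, data):
        self.messages.append((len(self.messages) + 1, subject, data))

class KV:
    """Key-value view over the stream: put publishes, get reads the last
    message on the key's subject, history replays older messages."""
    def __init__(self, stream, prefix):
        self.stream, self.prefix = stream, prefix

    def put(self, key, value):
        self.stream.publish(f"{self.prefix}.{key}", value)

    def get(self, key):
        subject = f"{self.prefix}.{key}"
        for _, subj, data in reversed(self.stream.messages):
            if subj == subject:
                return data
        return None

    def history(self, key):
        subject = f"{self.prefix}.{key}"
        return [d for _, s, d in self.stream.messages if s == subject]

stream = Stream()
kv = KV(stream, "kv.config")
kv.put("theme", "blue")
kv.put("theme", "red")
print(kv.get("theme"))                # red
print(kv.history("theme"))            # ['blue', 'red']
```

Work queues and object stores fall out of the same stream-plus-consumer model with different retention and chunking rules.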

NATS Demo

Let's jump into the NATS demo. I'll show you a bit about how NATS works. Then we will talk about fleet management, and we'll close out with some more demos. NATS is actually really easy to play with. It's one of my favorite things to demo, because it's not like, we have a workshop to set up NATS, and it's going to take us 4 hours, because it's a big infrastructure. It really isn't. It's very operationally simple to run. I have a NATS server binary here. We have containers. We have Kubernetes support, all of that. I'm just going to run my NATS server like this locally, and then I can start connecting to it.

Actually, let me change my NATS context, because we'll be switching between a couple of these. I'm going to say nats sub hello. That's going to subscribe on a hello subject. I can even say hello.world. Now these tokens are things that I can very much wildcard. I can even just say I want to subscribe on hello.star, if I escape it correctly inside of my terminal, there we go. I can easily subscribe on the hello.star topic. I can easily publish with nats pub hello.jeremy.

I pass it a payload. In this case, I'm just passing it a string, and I say hello. I can receive that. I can send a bunch of messages. Again, it's really fast. I could even say we have a built-in benchmark command. Let's just publish a bunch of messages here, just see how fast it goes. Yes, on my local laptop, we're pushing about seven-and-a-half million messages a second. NATS Core is really fast. That's far more throughput than most folks need. It gives us a good base to start building on top of and deciding what goes into NATS Core versus what we want to persist and put into JetStream. We also support some request-reply semantics.

One of the really cool things that we can do with NATS in terms of microservices is we could use this scatter-gather pattern to say, give me all the microservices. Then all you guys, you're actually microservices right now. I switch to our cloud where you guys are connected. I'm going to say nats context select qcon. I'm going to say nats micro list. That's going to give me back a set of microservices. These IDs right here that show up, this is the QCon microservice. These are all instances. These are all you guys, which is really cool. I could say nats micro stats. I can even get stats for qcon. I can get stats for each of your microservices, as well as the endpoints, and everything like that. I can start calling these and get load balancing for free.

One of the endpoints that I have load balanced is this nickname endpoint. I'm going to say nats request qcon.nickname. Somebody is going to be the good lottery winner here. Let's see. I'll just pass nothing here. Merv, you're the lottery winner today. You got selected by the NATS load balancer. Merv won twice, and BP got the second one. Let's see if I could pipe this into nats pico, my Raspberry Pi Pico, the RP2040 over here. It looks like it's still working, which is great. I'm just going to pipe the output of whatever the QCon nickname was.

I'm going to say raw, pipe that into nats pub pico. I'm going to publish a message to Raspberry Pi Pico with the result of that. What do we get? Jeremy. I was the winner in that one. You can do all kinds of really neat things by being able to connect these together. This is a microcontroller, obviously, that's connected. It's actually using MQTT connecting into NATS because we have support for other kinds of transports there. I'm just going to say, QCon, just to reset this. We have request-reply. We have publish, subscribe. We have fan in, fan out. We can connect all of our microservices together, which is actually pretty neat.

The last thing I want to show is just a little overview of JetStream. You guys filled out that survey, and you can keep refreshing that page and it'll pull in that data. That data is actually stored on a JetStream. You guys signed into that website, there's no backend. NATS is the backend directly. You're talking directly with NATS via WebSockets, you're being authenticated, everything like that. I can actually say nats stream list, and I could see we have a survey stream with about 29 messages in it. The really neat thing that we could do is you can read these streams. I could say nats sub --stream survey.

This is essentially what you are doing every time you reload the page, you're just pulling in all that data. If people keep filling out the survey, all this data will come in. We have this concept of a really cool consumer model where we have ephemeral consumers, which is what you guys are using today where you just create them and they work automatically. We also have durable consumers to keep that cursor in case you're doing any stream processing work.

NATS for Fleet Management

I think we can move into the next portion which is on fleet management. NATS for fleet management. What do I mean by fleet management? I obviously don't mean literal cars. Sometimes I mean cars. We have NATS in cars as well. We do some literal fleet management. Really, when it comes to hardware and devices, I'm talking about just a large number of distributed devices. Typically, they might have a broad variance in hardware profiles. We might have everything connecting into this. Some will be PCs, some will be web browsers, like you guys have.

Some of these will be single board computers or microcontrollers. We need to consider all of those use cases for like, how do we create a single layer or level of communication between all of these things, especially when they're very distributed and not all living in the same place. They're going to also have unreliable network connectivity. We need to consider that. I didn't even get to talk about how reliable NATS is in terms of how much it does retries, how much it protects itself at its own cost, and how it does failover and fault tolerance automatically, at a global scale.

These are all the things that NATS gives you for this type of fleet management use case. The last part is this perimeterless security. Because if we try to do what we did in the cloud, it's not going to work for the edge. You can't just put a wall around everything because everything's scattered all over the place. Somebody could go take your hardware and do naughty things with it. We need to think about, what are some trusted security models that we could bake into this? How do we manage this at scale? We can't put users in a database if we have millions of them, and they're provisioning all the time. How do we scale something like that? AuthN and AuthZ are also things that I think NATS solves really well.

In working with a lot of organizations that are doing things at the edge, I've noticed that there are four patterns that have emerged in that use case that they've really asked for when engaging with a technology like NATS. The first one is live querying: being able to query all of the devices, ad hoc filter them, and select them, so that somebody can build applications on it, as well as being able to operate large scale fleets of these.

The way that we scale this out is with this typical scatter-gather pattern, where you can do this ad hoc querying, and NATS gives you a lot of interesting constructs to filter that and make it more performant. The second one is configuration management. We have all these devices rolled out, whether they're Starbucks stores, vehicles on the road, factory floor machines, or IoT devices, and all of these things require some form of configuration, often over-the-air configuration.

Being able to set that and have it just slurp it up automatically, even when these devices are offline and coming back online, all that needs to happen seamlessly. Similar to configuration management, we have these remote commands where you might want to send a command and have that command be consumed, but that device might be offline. We have solved for that use case as well. The last one is store and forward, which is all about data locality.

How can we have these applications that are running on these devices still work and function and save stuff and pretend like everything's ok, even if the device is offline? As soon as the device comes online, all that data is forwarded on. This has typically been a very manual process to roll out these things, and everybody's built their own bespoke platforms around them. With NATS, we get a lot of these things in a much easier way. I talked about live querying, selecting, filtering, grouping, configuration management patterns, remote commands, and finally store and forward. There's a lot of really neat things that you can do when you have this toolbox.

Demo

Let's actually try some of these things out right now. You can see that I'm connected into the nats-0 server. This is on the cloud. This is on Digital Ocean, so we're actually getting pretty good RTT. I'm actually getting like 10 milliseconds, 25 milliseconds. Pretty good. It's still in the cloud. We're going to have to maybe solve for some of that. The first thing I want to showcase is, let's talk about live querying. I showed how we can say, microservices, give me all your information, and you guys can respond back to me. Very similar to that, we have a way for us to get device info. I'm going to say, nats, let's look at device info. I can say, nats request device info. I'm going to say, set replies to zero, meaning I'm going to send one question out, and I'm going to wait until some timeout, this is just on the CLI, to get back all the answers that I can.

I can send a payload here. I'm going to immediately get info back from you guys from your web browser about what device that you're using, some device info about it, and everything like that. This is giving me everything. If I had a fleet of millions of devices, this wouldn't really cut it. We'd have to do some fun stuff with subject mapping and things like that. Even in this use case, I could do some simple filtering. Looks like there's a couple people on Android, couple people on iOS, I'm going to pick Android and I'm just going to generate a little JavaScript object right now. I will say os.name equals Android.

I'll pipe that into that nats request. Now I just got back all the Android devices. You could see you could filter and you could even operate on some of these filters. This is a really common pattern that I've seen folks use when they have lots of machines, lots of devices, and they have to manage this fleet. It's kind of like a core querying pattern where you can say, give me all of the things that qualify under this filter.

Next up, we have configuration management. The way that I like to do configuration management in NATS is to use a key-value store, which is persisted. It's just always there, always available. I can easily just start setting keys to certain subjects, and certain clients can then decide to subscribe to those subjects. That data gets persisted, so if that client goes offline and comes back up, they can get the latest value. We also have the idea of historicity inside of our key-value stores, which is really neat. If you have your phones up, I can say, make config change.

That's just going to publish a message there. You should have gotten your charts to change red, which is cool. I could easily change that back. It's not real time, but it's pretty close to real time, which is really nice. We can get that responsiveness out of all of it. Configuration management is also a really neat pattern to manage fleets at the edge. Remote commands are really similar to config management, except the data structure is just a little bit different. We typically don't want to execute remote commands more than once, but we do want to save them somewhere. When a device comes back online, it can consume that and acknowledge that it has consumed it.

I don't have a remote commands demo as part of this just yet, mainly because you guys are using ephemeral consumers. If you guys were using a consumer where you had a cursor, this would be a perfect example of where you can just start consuming commands and either pick the last one off and say, yes, I want to do X, or I want to do Y, or I want to do something that you really only want to do once.
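That cursor-plus-acknowledgment idea can be sketched with a toy durable consumer: commands are appended to a log, each durable name keeps a cursor recording how far it has acknowledged, and a device that was offline drains exactly what it missed. This simulates the pattern, not the JetStream API; the command names and the `device-42` durable are made up:

```python
class CommandStream:
    """Toy command log with durable consumers: each durable name keeps a
    cursor that only advances when the consumer acknowledges a command."""
    def __init__(self):
        self.log = []
        self.cursors = {}

    def publish(self, cmd):
        self.log.append(cmd)

    def next(self, durable):
        """Deliver the next unacknowledged command, or None if caught up."""
        seq = self.cursors.get(durable, 0)
        return self.log[seq] if seq < len(self.log) else None

    def ack(self, durable):
        self.cursors[durable] = self.cursors.get(durable, 0) + 1

cmds = CommandStream()
cmds.publish("reboot")
cmds.publish("update-firmware")       # sent while the device is offline

# The device reconnects and drains exactly what it missed, acking each one.
seen = []
while (cmd := cmds.next("device-42")) is not None:
    seen.append(cmd)
    cmds.ack("device-42")
print(seen)  # ['reboot', 'update-firmware']
```

Because the cursor only moves on ack, a crash between `next` and `ack` means the command is redelivered rather than lost, which is the property you want for commands you must not drop.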

Lastly, we have this idea of store and forward, which is about devices being able to store data when they're offline. Then they want to just have that data synced up with the cloud or with some data lake or something automatically without really having to think about it. The applications really shouldn't have knowledge of this pattern. They should just say, I want to save this thing, or I want to publish this message. They don't necessarily need to know, am I online, am I offline? NATS also handles this as well, mainly because the NATS server is this small, tiny little Go binary that can run on pretty much any small single board computer and above. It can't run on a microcontroller, but it can run on a Raspberry Pi.

I've run it on a Raspberry Pi Zero 2, just fine, with persistence. It's very possible to put these NATS servers anywhere that you want, and essentially embed them alongside of your clients. That way, you get a lot of power in being able to say, how do we want to move data around here? I'm going to show you guys a pretty cool example of how we can do that. You guys are all connected into the cloud, it should say NATS 0, 1, or 2. We're going to change that.

First and foremost, I'm going to take the NATS server that I had running on my local computer, close that out, and run it as a leaf node. A leaf node is essentially an extension to a NATS cluster. It's not a member in the NATS cluster. It's not a member of the Raft group for replication. It's essentially its own NATS system, but it has this bridge that's going to basically bridge whatever gap that I want to define in terms of what data moves over the wire. I can simply say, nats-server, and give it this leaf node configuration.
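As a sketch, a leaf node is typically started from a small config file that points at the remote system; the port, URL, and credentials file here are placeholders, not the ones used in this demo:

```
# Hypothetical leaf node config (leaf.conf); url and creds are placeholders.
port: 4222

leafnodes {
  remotes = [
    {
      url: "tls://connect.example.com:7422"
      credentials: "leaf.creds"
    }
  ]
}
```

You would then run something like nats-server -c leaf.conf, and local clients connect to the leaf while it bridges traffic to the remote cluster.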

That's going to fire up. It's going to connect into that cloud server. Now we can connect into this, and we could do all of the same things that we wanted, except that we're now connected locally for me. I'm going to have to fudge things a little bit, because I'm not going to attempt to do any NAT traversal, or anything like that, with this network that we have. I'm going to get everybody to connect over to this through a tunnel, just to illustrate what it looks like to be able to say, there's locality here. If we sever our connection with the cloud, everything still works. We're still all publishing messages. We're still all storing data. Everything still works until we get connectivity back, and then all that stuff syncs back up.

Let's pay attention to our dashboard over here. I need to actually fire up that tunnel. I'm going to say, make tunnel. I'm using Ngrok. Ngrok's awesome. Ngrok's a really neat tunnel. I'm going to say, make move. Hopefully, that should move everybody over to Jeremy's laptop. Now, the problem is we got an error, because we don't have any of that data on Jeremy's laptop. That's a problem. I'm going to say, make unmove. I'm just going to move everybody back. Hopefully, everybody's moved back to our cloud server. I'm going to say, make mirrors.

What this is going to do is this is going to create mirrors of that config key-value store. It's going to create mirrors of that survey data. It's going to replicate them in real time. It's going to keep them up to date, but it's going to persist it all right here on my laptop. I'm going to make those mirrors. Let's see how that looks: nats context select default, make mirrors. Then, nats stream list. Now I'm connected to my laptop and I can see I have the survey stream. I can say nats kv list. I can see I have that config key-value store as well. It looks like those are all replicated. We didn't have anything in config, because we deleted that key, but we did have those 30 messages.

That's actually right here on my laptop. It is in the cloud, but it's replicated now here on my laptop, which is really neat. Now I could do something like make move, and we have all that data. You guys could just refresh that page. Now that data is actually coming in. It's going through a tunnel, but it's coming in through my laptop. I can do something like go over here where I'm managing some of my accounts, and I can go, Accounts, QCon, and I bump this up. I'm going to go over to that leaf node user. Remember, we're using this decentralized based authentication mechanism.

I'm going to say, go ahead and revoke that leaf node. The cool thing about all that is that leaf node is now erroring out when it's trying to connect to the cloud, but all of our stuff is still going to work because I just moved all you guys over. I can say, nats micro list, and I should get responses of those QCon microservices because we are no longer in the cloud. We are all connected here in this room. That's one of the really special things about data locality and how fluid you can be with these NATS servers and moving clients around and being able to move data around. I'm going to go ahead and unrevoke that credential, and make sure that we're all synced up. Let's see. I think we're good now.

The last part I want to show is the store and forward piece. Now that we're all connected to this server right here on my laptop, one of the things that I have those clients emitting, I could say nats sub, I'm just going to subscribe to everything. I call this the metrics mode. It looks like everybody is emitting metrics. It's metrics dot whatever server that you're connected to dot whatever client ID you have. Then, we're just pushing some random data there, some device info, and everything like that. What if we wanted to store all of this stuff in a stream? Because right now, you guys are just publishing it out to the ether. You're just saying, here's some stuff, if anybody's interested, it's all Core NATS based.

What if I wanted to take all that stuff and put it in a stream? I could easily do that. I could say nats stream add. Let's give this the name of metrics. I can just say, what subject do I want to listen to? I want to listen to everything that's metrics dot anything else. I want to store this on a file system. I only have one server here, so my replication factor is going to be 1. There's lots of different options for how you can configure all of these streams. I'm just going to bump through them. Once I'm done, I could say nats stream list. I can see the metrics data is already starting to pour in, 57 messages now, 72 messages now.

We're now collecting data, but it's all local to my laptop. How do I forward this data? The way you do that is very simple. It's very similar to mirroring, except I'm going to go on to the cloud now because we have that cloud connection. I'm going to say, I want to make a stream, but I want it to be sourced from multiple streams. This is what we call muxing or demuxing data. This might be one site out of many sites that are living at the edge, and we're collecting that data, and we want to forward it into one gigantic stream in the cloud. One of the cool parts about that is, maybe because of hardware limitations, we want to keep a very tight retention policy on the data that lives here, but we want to keep a really long data retention policy on the stuff that lives up there.

You can even do a lot of really interesting things to say like, the stream here, it lives in memory, and the stream over there, it's on spinning rust. You could do a lot of really interesting things with that. I'm going to go ahead and switch my context back to the cloud. I'm now on the cloud, I'm just going to say, make source. That's going to add a new stream called global_metrics.
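That store-and-forward tradeoff (tight retention at the edge, long retention upstream) can be sketched with two toy streams and a cursor-based source that forwards whatever is new each time connectivity returns. This mimics the shape of stream sourcing, not the actual JetStream implementation:

```python
class Stream:
    """Toy stream with an optional max-message retention limit."""
    def __init__(self, max_msgs=None):
        self.max_msgs, self.messages, self.last_seq = max_msgs, [], 0

    def publish(self, data):
        self.last_seq += 1
        self.messages.append((self.last_seq, data))
        if self.max_msgs is not None and len(self.messages) > self.max_msgs:
            self.messages.pop(0)          # tight edge retention drops old data

def source(local, remote, cursor):
    """Forward everything past the cursor from the edge stream to the
    upstream one; safe to call again whenever connectivity returns."""
    for seq, data in local.messages:
        if seq > cursor:
            remote.publish(data)
            cursor = seq
    return cursor

edge = Stream(max_msgs=3)                 # small box: keep only recent data
cloud = Stream()                          # upstream: long retention
cursor = 0
for i in range(3):
    edge.publish(f"metric-{i}")
cursor = source(edge, cloud, cursor)      # first sync
for i in range(3, 5):
    edge.publish(f"metric-{i}")
cursor = source(edge, cloud, cursor)      # connectivity returns; catch up
print(len(edge.messages), len(cloud.messages))  # 3 5
```

The edge stream never holds more than three messages, while the cloud stream accumulates everything that was forwarded, which is exactly the retention asymmetry described above.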

We already have all that data replicated immediately, which is awesome. You could see that we can store stuff locally in our own locale, and we could sync it up with the cloud. We can push out configuration changes. It's all very fluid. This is one of the most awesome parts about NATS is it takes that idea of location transparency, of applications being able to be nomadic, of data being able to be nomadic, it really takes that to the maximum, to the point where you could really simply play in a lot of ways, moving data and applications around, and you have all this flexibility without having to manage a ton of configuration around it.

Questions and Answers

Participant 1: We've been struggling with something basic. If we weren't facing this sort of problem, how would the connectivity actually work here?

Saenz: The tunnel was a little bit of a workaround, because I can't control the network that we're on. If I could, and I could say, port forward something to me, and I could use dynamic DNS or something. What I faked was the tunnel, but in reality, inside of an edge network deployment, you would have control over that network and you'd just be able to connect directly to the node.

Participant 1: We just don't use [inaudible 00:34:58], so we're not replacing TCP/IP.

Saenz: This is all still TCP based.

Participant 1: This is all application level, not queues based.

Saenz: It's all layer 7. For folks who are really familiar with networking, they tend to understand NATS when I could say, it's very much like a software defined networking stack. It tries to bring more of the concepts of the stack of lower layers of networking up to L7, in terms of that flexibility, arbitrary topology, being able to move and shift data around very easily.

Participant 1: Following on that, so location independence, we're able to all connect, and if I walk out, I'd still be connected; if I moved to another country, I would still be connected, as long as there's internet. You're still using DNS, you're still using [inaudible 00:35:47].

Saenz: For the initial connection bit, you still use DNS.

Participant 1: What about after that?

Saenz: After that initial connection, you're now just using NATS to broker all of that communication. NATS uses a persistent TCP connection. For the initial connection, obviously you use DNS and IP and everything like that. A traditional microservices architecture, by contrast, would be ad hoc HTTP or gRPC, without a ton of long-lived connections. You'd be doing DNS lookups, and connecting to IPs ad hoc.

In order to accomplish that, you have to put a lot of layers in between that to make it globally scalable. Instead, if you just put some globally scalable mesh in the middle of it, it can broker a lot of that stuff a lot easier.

Participant 2: You mentioned multi-tenancy, but a lot of that looks like Docker limited, nested hierarchical. How does multi-tenancy work in NATS? How is that data kept separate from other tenants?

Saenz: We do have logical isolation and physical isolation. Actually, you guys all connected to my NATS cluster in the cloud. That's my general-purpose NATS cluster, but you guys were all inside of a QCon tenant. All of the subjects that you have are namespaced to you. You don't have to worry about overlap, or crossover. If you want, we have a sharing mechanism for being able to import and export subjects across different accounts in those tenants.

You can set very concrete contracts for how you interface. This is why we see a lot of folks rolling out NATS as like a full platform across their whole organization, they get everybody into what we call accounts. Those accounts can then be different teams, or different orgs. They can all interface with each other in a more explicit way, but still have their whole own universe to themselves. That's our logical isolation.

Our physical isolation with leaf nodes and things like that, people can run their own NATS servers and decide what goes over the wire. We like to say we try to solve the Coke and Pepsi problem. Like if we could put Coke and Pepsi on a global network, what would they require in order to feel comfortable? They'll probably never feel comfortable. How do we isolate things in that way, in terms of traffic, in terms of how data is stored, everything like that? That's just a taste of the multi-tenancy part. There's a lot that goes into the AuthN and AuthZ. There's a lot of really neat constructs there.

Participant 3: How do I maintain consistency across 40 different clients?

Saenz: It's hard. We've tried a lot of different things, a lot of different ideas over the years. There's times where we're like, could we just write a single core library in Rust and make bindings to all that? It's a give and take in terms of like, how do we introduce a new concept and pattern into the client libraries, but also try to stay idiomatic?

Everybody has a different way of handling concurrency and has different concurrency primitives. We're a small team, but we still have a lot of our resources dedicated to client libraries, because the clients are handling a bit of complexity there. They're doing a lot of client-side load balancing. They're doing flow control. They're doing all kinds of stuff. We have to maintain that across multiple languages and different idioms. We want people to be able to pick a NATS client up and be like, I know how to do this because I can write Java, or I can write .NET, or I can write Go or Rust. The long answer to that is, just a lot of hard work and feedback with people who are using it.

 


 

Recorded at:

Aug 21, 2024
