Welcome back, followers of the fearsome!
Today I’ll be giving a little update on the progress of making the online multiplayer part of our game work.
The networking stack for Viking Squad consists out of two layers. The bottom, most low-level layer is the communication layer. On top of this layer sits the game-layer, which is probably worth an entire blog post at some point in the future. Today I’ll describe my communication layer.
The communication layer
This layer is built on the UDP model, and ads connections, reliable messages, and data-blobs (for lack of a better name). I started developing this layer a long time ago, but it turns out to be extremely similar to Glenn Fiedlers excellent network layer, and it you are at all interested in game-networking, you should most definitely read his write-ups. I seriously can’t recommend it enough, he knows what he’s talking about.
When you want to send a large file over the internet, you probably want to just send parts of file over a reliable connection, until all the parts have been received on the other side, and the file can be reconstructed. A TCP connection is reliable connection, and acts like a pipe (data in on one side, data out on the other side), and is well suited for this purpose. When dealing with packet loss, TCP ensures reliability by re-sending packets, until it has arrived on the other side before continuing. Awesome! We don’t even need to do anything, it takes care of everything! Well, not so fast now (har har), re-sending packets will cause delays in the communication, and while this may be fine for sending a large file, it can be devastating for games.
The network functionality I actually want from the communication layer is as follows:
- 1) A very fast way to send the latest data for roughly 95% of the total traffic (unreliable is ok).
- 2) A reliable way to send data for roughly 5% of the total traffic.
- 3) A way to automatically replicate data.
- 4) Optimized for small data packets, in the order of magnitude of hundreds of bytes, not thousands.
I’ll go into detail a bit more on this points, and how they were implemented in the connection layer.
Sending the latest data fast
For most network traffic in games, only the very latest information matters. A simple example is the position of a player. When you send a message ‘player is at location XYZ‘, and it somehow doesn’t arrive, would you really want to resend that same information? No, of course not. You’d want to update it with the latest location of the player before you send it, wouldn’t you? In fact, you probably want to just send the latest position of the player a couple of times per second, and ignore any missed packets entirely. This is where UDP comes in. UDP is connection-less, and provides no reliability out-of-the-box, other than a checksum for data integrity. UDP is able to send messages between clients very fast, but not reliable. This doesn’t mean that it is not usable though, the communication layer just needs to be built to handle the lack of reliability.
When I started building the communication layer, I started with a simple UDP connection between two machines. One of the first things I added was a bad-connection-simulator. Traffic on a local area network is fast and pretty reliable, and I wanted to make sure to build the game on a really shitty connection, so I added a piece of code that simulates about 5% packet loss, and a lag of anywhere between 200 and 300ms. At this point I was able to send unreliable messages, in about 95% of the cases, they would arrive after 200-300ms.
The next thing to add was a the reliable message structure. I’m not going to describe the exact implementation of this layer, because it is excellently described here. In a nutshell, it uses ack/nak bits and re-sends messages that weren’t acknowledged as received by the other end, until the message is acknowledged. This is reminiscent of the way TCP works, so it should be expected that reliable messages may arrive later than you’d want. I want to point out that while this reliable message layer acts like TCP, the rest of the traffic still flows even if a packet gets lost. Cool, now we’re able to send reliable messages!
In previous games I’ve worked on I had to create new net-messages for each new piece of data I wanted to send, as well as code for handling this message when it arrives. This became quite a chore, and I wanted to fix this for Viking Squad. The new system I am using may not be as sophisticated as the replication system used in Unreal (for example), but it works well for my use, and the network traffic required is minimal.
The data-blob system has 8 local slots for data. Whenever you want a piece of data to be available to other clients, you submit a byte array to one of the 8 slots, and this data will be communicated to the other clients in the network automatically. This system uses a similar ack/nak system as described in the reliable message buffer. On top of this, it uses a binary difference algorithm to make sure only the changes in the data are sent over the network, rather than the entire data blob. Most data in the data-blob slots is somewhat coherent between frames, so the binary diffs are quite small. Whenever a new data-blob comes in from a client, an event is sent to the game layer notifying it of the newly arrived data-blob.
Small data packets
I kind of touched on this in the previous parts, but the messages generated by the game are kept as small as possible. The main message pump sends a steady number of packets per second, and each packet can contain multiple unreliable, reliable, and data-blob messages. The send rate of the message pump depends on the detected amount of packet loss and round trip time, in an effort to control the flow of data and prevent clogs.
To make sure data is saved in the smallest way possible I’ve implemented a BitStreamWriter and BitStreamReader. They allow for dense bit-packing, and are able to considerably reduce message sizes. For example, naively saving a ‘bool’ to the average stream would take 8 bits, since the smallest unit it deals with is a byte. The bit stream would only use up 1 bit. Saving an unsigned int that has a range between 0 and 127 would still take 32 bits in a ‘normal’ stream, but in the BitStreamReader/Writer it would save using only 7 bits. Large savings, if you know the range of your variables!
Here you can see all of the things described above. On the left is one client, on the right is another client, and in the middle is the bad-connection-simulator. At the moment of this screenshot, it was sending 30 reliable and 30 unreliable messages from both sides. More reliable traffic that I’d generally use in a game, but this is a test case, right?
In the middle you can see that I’ve set the lag to be 100-130ms for both send and receive, resulting in a total round trip time (RTT) of around 200-260ms. As you can see in the ‘RTT’ graph, which shows the average RTT, this is exactly what’s happening (the number that is cut off is 256.0). The detected packet loss varies between roughly between 4% and 8%, which is pretty close to the 5% I’ve set in the bad connection simulator.
Another interesting thing to look at is the graphs for ‘Reliable xfer time’ and ‘Unreliable xfer time’. Here you can clearly see that sending unreliable messages is consistently pretty fast (limited by lag only), and a reliable message can take much longer (a few multiples of the lag), based on both packet loss and lag.
The last thing I wanted to mention is the ‘Local Data Blobs’ and ‘Remote Data Blobs’ part. When you type into the Local Data Blobs fields, every change will be treated as a separate ‘revision’, signified by the first number that follows. The second number is the remotely ack-ed version, which is used to generate delta-compressed messages to make sure the other side is updated to the latest version.
Alright, I hope that gives you a bit of insight into how our lower level network layer is set up! Until next week, and don’t forget the art stream later today!