Why this matters for iroh
iroh is built on top of QUIC, providing peer-to-peer connectivity, NAT traversal, and encrypted connections out of the box. While iroh handles the hard parts of networking (holepunching, relay servers, and discovery), you still need to design how your application exchanges data once connected.
Many developers reach for iroh expecting it to completely abstract away the underlying transport. However, iroh intentionally exposes QUIC’s powerful stream API because:
- QUIC is more expressive than TCP - Multiple concurrent streams, fine-grained flow control, and cancellation give you tools TCP never had
- Protocol design matters - How you structure requests, responses, and notifications affects performance, memory usage, and user experience
- No one-size-fits-all - A file transfer protocol needs different patterns than a chat app or real-time collaboration tool
iroh uses a fork of Quinn, a pure-Rust implementation of QUIC. The fork is maintained by the n0 team. Quinn is production-ready, actively maintained, and used by projects beyond iroh. If you need lower-level QUIC access or want to understand the implementation details, check out the Quinn documentation.
Overview of the QUIC API
Implementing a new protocol on top of QUIC can be a little daunting initially. Although the API is not that extensive, it’s more complicated than e.g. TCP, where you can only send and receive bytes and eventually reach an end of stream.
There isn’t “one right way” to use the QUIC API. It depends on what interaction pattern your protocol intends to use. This document is an attempt at categorizing the interaction patterns. Perhaps you’ll find exactly what you want to do here. If not, perhaps the examples give you an idea for how you can utilize the QUIC API for your use case.
One thing to point out is that we’re looking at interaction patterns after establishing a connection, i.e. everything that happens once we have a Connection instance, whether we connected or accepted an incoming connection.
Unlike TCP, in QUIC you can open multiple streams. Either side of a connection can decide to “open” a stream at any time.
Just as a write on one side of a TCP-based protocol corresponds to a read on the other side, when a protocol opens a stream on one end, the other side of the protocol can accept such a stream.
Streams can be either uni-directional (open_uni/accept_uni), or bi-directional (open_bi/accept_bi).
- With uni-directional streams, only the opening side sends bytes to the accepting side. The receiving side can already start consuming bytes before the opening/sending side finishes writing all data. So it supports streaming, as the name suggests.
- With bi-directional streams, both sides can send bytes to each other at the same time. The API supports full duplex streaming.
One bi-directional stream is essentially the closest equivalent to a TCP stream. If your goal is to adapt a TCP protocol to the QUIC API, the easiest way is going to be opening a single bi-directional stream and then essentially using the send and receive stream pair as if it were a TCP stream.
- The SendStream side can .finish() the stream. This will send something like an “end of stream notification” to the other side, after all pending bytes have been sent on the stream. This “notification” can be received on the other end in various ways:
  - RecvStream::read will return Ok(None) if all pending data was read and the stream was finished. Other methods like read_chunk work similarly.
  - RecvStream::read_to_end will resolve once the finishing notification comes in, returning all pending data. If the sending side never calls .finish(), this will never resolve.
  - RecvStream::received_reset will resolve with Ok(None) if the stream was finished (or Ok(Some(code)) if it was reset).
- The SendStream side can also .reset() the stream. This will have the same effect as .finish()ing the stream, except for two differences:
  - Resetting will happen immediately and discard any pending bytes that haven’t been sent yet.
  - You can provide an application-specific “error code” (a VarInt) to signal the reason for the reset to the other side.
  This “notification” is received in these ways on the other end:
  - RecvStream::read and other methods like read_exact, read_chunk and read_to_end will return a ReadError::Reset(code) with the error code given on the send side.
  - RecvStream::received_reset will resolve to the error code Ok(Some(code)).
- The other way around, the RecvStream side can also notify the sending side that it’s not interested in reading any more data by calling RecvStream::stop with an application-specific code. This notification is received on the sending side:
  - SendStream::write and similar methods like write_all, write_chunk etc. will error out with a WriteError::Stopped(code).
  - SendStream::stopped resolves with Ok(code).
What is the difference between a bi-directional stream and two uni-directional streams?
- The bi-directional stream establishes the stream pair in a single “open -> accept” interaction. For two uni-directional streams in two directions, you’d need one side to open, then send data, then accept at the same time. The other side would have to accept and then open a stream.
- Two uni-directional streams can not be stopped or reset as a unit: One stream might be stopped or reset with one close code while the other is still open. Bi-directional streams can only be stopped or reset as a unit.
Each endpoint can .finish() or .reset() its send half, or .stop() its receiving half.
Finally, there’s one more important “notification” we have to cover:
Closing the connection.
Either end of the connection can decide to close the connection at any point by calling Connection::close with an application-specific error code (a VarInt) and a bunch of bytes indicating a “reason”. The reason could be some human-readable ASCII, but there is no guarantee that it will be delivered.
Once this notification is received on the other end, all stream writes return WriteError::ConnectionLost(ConnectionError::ApplicationClosed { .. }) and all reads return ReadError::ConnectionLost(ConnectionError::ApplicationClosed { .. }).
It can also be received by waiting for Connection::closed to resolve.
Importantly, this notification interrupts all flows of data:
- On the side that triggers it, it will drop all data to be sent
- On the side that receives it, it will immediately drop all data to be sent and the side will stop receiving new data.
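The application-specific codes passed when resetting or stopping a stream, or when closing a connection, are plain integers, so a protocol usually wants to give them names. A minimal sketch, assuming a hypothetical protocol with two known codes (the CloseCode enum and its values are made up for illustration; in Quinn, the resulting u32 could be turned into a VarInt with VarInt::from_u32):

```rust
/// Hypothetical application error codes for an example protocol.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum CloseCode {
    /// The interaction finished normally.
    Done,
    /// The peer sent a request we could not parse.
    BadRequest,
    /// A code this version of the protocol does not know about.
    Unknown(u32),
}

impl CloseCode {
    /// Converts the code to the integer that goes on the wire.
    fn to_u32(self) -> u32 {
        match self {
            CloseCode::Done => 0,
            CloseCode::BadRequest => 1,
            CloseCode::Unknown(code) => code,
        }
    }

    /// Interprets an integer received from the peer.
    fn from_u32(code: u32) -> Self {
        match code {
            0 => CloseCode::Done,
            1 => CloseCode::BadRequest,
            other => CloseCode::Unknown(other),
        }
    }
}
```

Keeping an Unknown variant means a newer peer can introduce additional codes without breaking older peers.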
Let’s look at some interaction pattern examples so we get a feeling for how all of these pieces fit together:
Request and Response
The most common type of protocol interaction. In this case, the connecting endpoint first sends a request. The accepting endpoint will read the full request before starting to send a response. Once the connecting endpoint has read the full response, it will close the connection. The accepting endpoint waits for this close notification before shutting down.
Full duplex Request & Response streaming
It’s possible to start sending a response before the request has finished coming in. This makes it possible to handle arbitrarily big requests in O(1) memory. In this toy example we’re reading u64s from the client and sending back each of them doubled.
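The core loop of such a doubling handler can be sketched without any networking. This is a simplified, synchronous sketch: std::io::Read and std::io::Write stand in for the receive and send halves of a bi-directional stream (the real stream types are async), and the function name double_stream and the big-endian encoding are assumptions made for illustration.

```rust
use std::io::{self, Read, Write};

/// Reads big-endian u64 values from `recv` one at a time and writes each
/// value doubled to `send`, without ever buffering the whole request.
/// Returns the number of values processed.
fn double_stream<R: Read, W: Write>(mut recv: R, mut send: W) -> io::Result<u64> {
    let mut buf = [0u8; 8];
    let mut count = 0;
    loop {
        match recv.read_exact(&mut buf) {
            Ok(()) => {}
            // End of input: the peer finished sending. In this sketch a
            // trailing partial value is also treated as the end of input.
            Err(e) if e.kind() == io::ErrorKind::UnexpectedEof => break,
            Err(e) => return Err(e),
        }
        let value = u64::from_be_bytes(buf);
        // Respond right away, before reading the next request value.
        send.write_all(&value.wrapping_mul(2).to_be_bytes())?;
        count += 1;
    }
    send.flush()?;
    Ok(count)
}
```

Only one 8-byte buffer is live at any time, which is what keeps memory use constant regardless of how many values the client sends.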
Multiple Requests & Responses
This is one of the main use cases QUIC was designed for: multiplexing multiple requests and responses on the same connection. HTTP3 is an example of a protocol using QUIC’s capabilities for this. A single HTTP3 connection to a server can handle multiple HTTP requests concurrently without the requests blocking each other. This is the main innovation in HTTP3: It makes HTTP/2’s connection pool obsolete. In HTTP3, each HTTP request is run as its own bi-directional stream. The request is sent in one direction while the response is received in the other direction. This way both stream directions are cancellable as a unit, which makes it possible for the user agent to cancel some HTTP requests without cancelling any others in the same HTTP3 connection. Using the QUIC API for this purpose will feel very natural.
Multiple ordered Notifications
Sending and receiving multiple notifications that can be handled one-by-one can be done by adding framing to the bytes on a uni-directional stream.
One way to do this is to use LengthDelimitedCodec and tokio-util’s codec feature to easily turn the SendStream and RecvStream that work as streams of bytes into streams of items, where each item in this case is a Bytes/BytesMut. In practice you would probably add byte parsing to this code first, and you might want to configure the LengthDelimitedCodec.
The resulting notifications are all in order since the bytes in the uni-directional streams are received in-order, and we’re processing one frame before continuing to read the next bytes off of the QUIC stream.
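To make the idea of framing concrete, here is a hand-rolled, synchronous sketch of length-prefixed frames. It is not the tokio-util implementation: it only mirrors the default layout of LengthDelimitedCodec (a 4-byte big-endian length header followed by the payload), uses std::io::Read and std::io::Write in place of the async stream types, and the names write_frame, read_frame and MAX_FRAME_LEN are made up for illustration.

```rust
use std::io::{self, Read, Write};

/// Upper bound on a frame's payload, so a bad length prefix cannot trigger
/// a huge allocation. The value is an arbitrary choice for this sketch.
const MAX_FRAME_LEN: usize = 1024 * 1024;

/// Writes one frame: a 4-byte big-endian length followed by the payload.
fn write_frame<W: Write>(send: &mut W, payload: &[u8]) -> io::Result<()> {
    if payload.len() > MAX_FRAME_LEN {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "frame too large"));
    }
    send.write_all(&(payload.len() as u32).to_be_bytes())?;
    send.write_all(payload)
}

/// Reads one frame. Returns `Ok(None)` if the stream ended cleanly before
/// the start of a new frame, and an error if it ended in the middle of one.
fn read_frame<R: Read>(recv: &mut R) -> io::Result<Option<Vec<u8>>> {
    let mut len_buf = [0u8; 4];
    // Read the first byte separately to tell a clean end of stream apart
    // from a truncated length prefix.
    match recv.read(&mut len_buf[..1])? {
        0 => return Ok(None),
        _ => recv.read_exact(&mut len_buf[1..])?,
    }
    let len = u32::from_be_bytes(len_buf) as usize;
    if len > MAX_FRAME_LEN {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "frame too large"));
    }
    let mut payload = vec![0u8; len];
    recv.read_exact(&mut payload)?;
    Ok(Some(payload))
}
```

The receiver calls read_frame in a loop and handles each notification before reading the next one, which is what preserves the ordering.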
There’s another somewhat common way of doing this: sending each notification on its own uni-directional stream.
The order in which accept_uni calls come in will match the order in which open_uni was called on the remote endpoint. (The same also goes for bi-directional streams.)
This way you would receive one notification per stream and know the order of notifications from the stream ID/the order of accepted streams.
The downside of doing it that way is you will occupy more than one stream. If you want to multiplex other things on the same connection, you’ll need to add some signaling.
Request with multiple Responses
If your protocol expects multiple responses for a request, we can implement that with the same primitive we’ve learned about in the section about multiple ordered notifications: We use framing to segment a single response byte stream into multiple ordered responses. Compared to the multiple ordered notifications pattern, there are two differences:
- The roles are reversed: The length-prefix sending happens on the accepting endpoint, and the length-prefix decoding on the connecting endpoint.
- We additionally send a request before we start receiving multiple responses.
At this point you should have a good feel for how to write request/response protocols using the QUIC API. For example, you should be able to piece together a full-duplex request/response protocol where you’re sending the request as multiple frames and the response comes in with multiple frames, too, by combining two length-delimited codecs, one in each direction, and taking notes from the full duplex section further above.
Requests with multiple unordered Responses
The previous example required all responses to come in ordered. What if that’s undesired? What if we want the connecting endpoint to receive incoming responses as quickly as possible? In that case, we need to break up the single response stream into multiple response streams. We can do this by “conceptually” splitting the “single” bi-directional stream into one uni-directional stream for the request and multiple uni-directional streams in the other direction for all the responses.
You might’ve noticed that this destroys the “association” between the two stream directions. This means we can’t use tricks similar to what HTTP3 does that we described above to multiplex multiple request-response interactions on the same connection.
This is unfortunate, but can be fixed by prefixing your requests and responses with a unique ID chosen per request. This ID then helps associate the responses to the requests that used the same ID.
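A minimal sketch of such an ID prefix, assuming an 8-byte big-endian request ID at the start of every request and response. The wire layout and the names encode_with_id, decode_with_id and PendingRequests are assumptions made for illustration, not part of any iroh or Quinn API.

```rust
use std::collections::HashMap;

/// Prefixes a message with the 8-byte big-endian ID of its request.
fn encode_with_id(request_id: u64, body: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(8 + body.len());
    out.extend_from_slice(&request_id.to_be_bytes());
    out.extend_from_slice(body);
    out
}

/// Splits a received message into its request ID and body.
/// Returns `None` if the message is too short to contain an ID.
fn decode_with_id(message: &[u8]) -> Option<(u64, &[u8])> {
    if message.len() < 8 {
        return None;
    }
    let mut id_bytes = [0u8; 8];
    id_bytes.copy_from_slice(&message[..8]);
    Some((u64::from_be_bytes(id_bytes), &message[8..]))
}

/// Collects responses per request ID, regardless of arrival order.
#[derive(Default)]
struct PendingRequests {
    responses: HashMap<u64, Vec<Vec<u8>>>,
}

impl PendingRequests {
    /// Routes one received response message to the request it answers.
    /// Returns `false` if the message has no valid ID prefix.
    fn on_response(&mut self, message: &[u8]) -> bool {
        match decode_with_id(message) {
            Some((id, body)) => {
                self.responses.entry(id).or_default().push(body.to_vec());
                true
            }
            None => false,
        }
    }
}
```

Each response stream would carry one such message, so the connecting endpoint can route a response to its request no matter which stream it arrived on.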
Another thing that might or might not be important for your use case is knowing when the unordered stream of responses is “done”.
You can introduce another message type that is interpreted as a finishing token, but there’s another, more elegant way of solving this. Instead of only opening a uni-directional stream for the request, you open a bi-directional one. The response stream will only be used to indicate the final response stream ID. It then acts as a sort of “control stream” to provide auxiliary information about the request for the connecting endpoint.
Proxying UDP traffic using the unreliable datagram extension
Time-sensitive Real-time interaction
We often see users reaching for the QUIC datagram extension when implementing real-time protocols. Doing this is in most cases misguided. QUIC datagrams still interact with QUIC’s congestion controller and are thus also acknowledged. Traditional protocols implemented on top of QUIC datagrams might thus not perform the way they were designed to.
Instead, it’s often better to use lots of streams that are then stopped, reset or prioritized. A real-world example is the media over QUIC protocol (MoQ for short): MoQ is used to transfer live video frames. It uses one QUIC stream for each frame (QUIC streams are cheap to create)! The receiver then stops streams that are “too old” to be delivered, e.g. because it’s a live video stream and newer frames were already fully received. Similarly, the sending side will also reset older streams at the application level to indicate to the QUIC stack that it doesn’t need to keep re-trying the transmission of an outdated live video frame. (MoQ will actually also use stream prioritization to make sure the newest video frames get scheduled to be sent first.)
https://discord.com/channels/1161119546170687619/1195362941134962748/1407266901939327007
https://discord.com/channels/976380008299917365/1063547094863978677/1248723504246030336
Closing Connections
Gracefully closing connections can be tricky to get right when first working with the QUIC API. If you don’t close connections gracefully, you’ll see the connection timing out on one endpoint, usually after 30s, even though the other endpoint finishes promptly without errors. This happens when the endpoint that finishes doesn’t notify the other endpoint about having finished operations. There are mainly two reasons this happens:
1. The protocol doesn’t call Connection::close at the right moment.
2. The endpoint that closes the connection is immediately dropped afterwards without waiting for Endpoint::close.
To make sure that you’re not hitting (2), simply always make sure to wait for Endpoint::close to resolve, on both Endpoints, if you can afford it.
Getting (1) right is harder. We might accidentally close connections too early, because we accidentally drop the Connection (which implicitly calls close). Instead, we should always keep around the connection and either wait for Connection::closed to resolve or call Connection::close ourselves at the right moment. When that is depends on what kind of protocol you’re implementing:
After a single Interaction
Protocols that implement a single interaction want to keep their connection alive for only the time of this interaction. In this case, the endpoint that received application data last will be the endpoint that calls Connection::close at that point in time.
Conversely, the other endpoint should wait for Connection::closed to resolve before ending its operations.
An example of this can be seen in the Request and Response section above: The connecting side closes the connection once it received the response and the accepting side waits for the connection to be closed after having sent off the response.
During continuous Interaction
Sometimes we want to keep open connections as long as the user is actively working with the application, so we don’t needlessly run handshakes or try to hole-punch repeatedly. In these cases, the protocol flow doesn’t indicate which endpoint of the connection will be the one that closes the connection. Instead, clients should concurrently monitor Connection::closed while they’re running the protocol.
After handle_connection returns, we need to make sure to wait for Endpoint::close to resolve.
Aborting Streams
https://discord.com/channels/949724860232392765/1399719019292000329/1399721482522984502
QUIC 0-RTT features
Server-side 0.5-RTT
QUIC connections always take 1 full round-trip time (RTT) to establish: the client sends a hello, the server responds with its hello and certificate, and only then can application data flow. However, the server can actually start sending application data before the client finishes the handshake, achieving what’s called “0.5-RTT” latency.
This works because after the server sends its hello, it doesn’t need to wait for the client’s final handshake message before it can start processing the client’s request and sending a response. The server knows the encryption keys at this point and can immediately begin sending data back.
How to use it: On the server side, this happens automatically - you don’t need to do anything special. As soon as you accept_bi() or accept_uni() a stream, you can start writing to it immediately, even if the handshake hasn’t fully completed on the client side yet.
- The server should treat 0.5-RTT requests as potentially non-idempotent
- Avoid performing actions with side effects (like making payments, deleting data, etc.) based solely on 0.5-RTT data
- If your protocol requires idempotency guarantees, wait for the handshake to complete before processing sensitive operations
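The rule in the list above can be expressed as a small decision function. This is only a sketch of the policy: has_side_effects and handshake_complete are hypothetical inputs that the application would have to determine itself, and the Decision type is made up for illustration.

```rust
/// What the server should do with an incoming request.
#[derive(Debug, PartialEq, Eq)]
enum Decision {
    /// Safe to handle right away, even before the handshake has completed.
    ProcessNow,
    /// Hold the request until the handshake has completed.
    WaitForHandshake,
}

/// Decides whether a request may be processed right away.
/// Both parameters are hypothetical inputs provided by the application.
fn decide(has_side_effects: bool, handshake_complete: bool) -> Decision {
    if has_side_effects && !handshake_complete {
        Decision::WaitForHandshake
    } else {
        Decision::ProcessNow
    }
}
```

Requests without side effects keep the latency benefit, while sensitive operations are deferred until the handshake is done.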