July 19, 2021—What they are and how they work (programming)
A fountain code is a forward error correction code for lossy channels. In other words a fountain code is an erasure code.
That’s just a fancy way to say that a fountain code is one of those things that lets you send a message somewhere, have parts of your communication fail to make it, and the other party will still be able to understand you clearly.
Fountain codes aren’t the only thing that can do this. But they are interesting for some reasons I’ll get into.
Send enough redundant information to the other party that on a good day you wasted some time talking and on a bad day they can still piece together exactly what you said.
What you will do:
What the other party will do:
And that’s it!
Suppose you want to send this message:
Hi!
Here are its ASCII bytes in hexadecimal:
0x48 0x69 0x21
And here are its ASCII bytes in binary:
01001000 01101001 00100001
There are three bytes, so let’s choose \(k = 3\) and split the message up accordingly:
Part | Binary |
---|---|
1 | 01001000 |
2 | 01101001 |
3 | 00100001 |
The next step is to choose a random subset of these three parts and XOR them together. Let’s choose parts one and two. Keep in mind that \(\oplus\) means XOR and \(01001000 \oplus 01101001 = 00100001\):
Parts | \(\oplus\) |
---|---|
1, 2 | 00100001 |
And we do that again. Let’s choose parts two and three:
Parts | \(\oplus\) |
---|---|
1, 2 | 00100001 |
2, 3 | 01001000 |
And let’s do that a third time, picking parts one, two, and three:
Parts | \(\oplus\) |
---|---|
1, 2 | 00100001 |
2, 3 | 01001000 |
1, 2, 3 | 00000000 |
Now let’s pretend like we have received those three rows. We can represent them as a system of equations:
Row | Part 1 | Part 2 | Part 3 | \(\oplus\) |
---|---|---|---|---|
1 | ✓ | ✓ | 00100001 |
|
2 | ✓ | ✓ | 01001000 |
|
3 | ✓ | ✓ | ✓ | 00000000 |
Now just solve the system of equations. The row operation we’ll use is XOR. Here are the steps to solve it:
Row | Part 1 | Part 2 | Part 3 | \(\oplus\) |
---|---|---|---|---|
1 | ✓ | ✓ | 00100001 |
|
2 | ✓ | ✓ | 01001000 |
|
3 | ✓ | 00100001 |
Row | Part 1 | Part 2 | Part 3 | \(\oplus\) |
---|---|---|---|---|
1 | ✓ | ✓ | 00100001 |
|
2 | ✓ | 01101001 |
||
3 | ✓ | 00100001 |
Row | Part 1 | Part 2 | Part 3 | \(\oplus\) |
---|---|---|---|---|
1 | ✓ | 01001000 |
||
2 | ✓ | 01101001 |
||
3 | ✓ | 00100001 |
And it’s solved! Did you notice what is now in that last column? Look a little more closely:
Binary | Hex | ASCII |
---|---|---|
01001000 |
0x48 | H |
01101001 |
0x69 | i |
00100001 |
0x21 | ! |
I’ll leave it as an exercise for you to figure out what would have happened if we had gathered four rows instead of three. Or if we had gathered a different three rows. Or if we had chosen something for \(k\) other than three. Or what the probability is of being unable to decode the message when you have gathered exactly \(n \geq k\) rows.
Fountain codes are rateless. That means that you don’t have to decide up front how many rows you’re going to send. You can just keep sending them until you get tired of it. This is different than Reed Solomon.
Fountain codes can easily be multicasted. Meaning several parties can send a message over individually slow connections to a single party and the receiving party will get it at the combined speed of all those connections. There are \(2^k - 1\) unique and useful rows for every message of \(k\) parts. Because that number gets very large very quickly there is no need to coordinate between all the senders: each sender can just randomly generate rows and they’re more than likely going to be unique and useful to the receiving party.
It’s pretty I/O intensive to generate a row. On average you have to read half of the message each time.
It’s pretty I/O intensive to solve the system of equations. If you use Gaussian Elimination then you’re looking at \(O(k^2)\) row operations. And rows can be pretty large depending on what \(k\) is and how large your message is.
Fountain codes only help in erasure channels. But many communication channels in real life are noisy. In practice that means you might have to transmit each row wrapped in an error detection code (like CRC) which adds overhead.
Throughput is strongly dependent on the size of each row. Rows that are too large will usually have a greater chance of being lost through an erasure channel. But row size is inversely proportional to \(k\), so too small a row size means it will take a very long time for the receiving party to solve the system of equations.
You have to send information about which parts were XOR’d together which adds overhead. This overhead can be very significant for small messages or large values of \(k\).
A systematic fountain code just ensures that the first \(k\) rows that get transmitted are the first \(k\) parts of the message in order.
I’ll leave it as an exercise for you to think about why that might be helpful. Hint: here’s what the first three generated rows would have been from our example above if our fountain code was systematic:
Row | Part 1 | Part 2 | Part 3 | \(\oplus\) |
---|---|---|---|---|
1 | ✓ | 01001000 |
||
2 | ✓ | 01101001 |
||
3 | ✓ | 00100001 |
https://github.com/matthew-a-thomas/Fountain—fountain codes applied to files.
https://github.com/matthew-a-thomas/cs-fountain-codes—simulated performance of a few different fountain codes (or fountain-code-esque things). The graphs on that page can be pretty confusing. Open an issue in that repository if anything isn’t clear.