This article explains Reed-Solomon erasure codes and the problems they solve in gory detail, with the aim of providing enough background to understand how the PAR1 and PAR2 file formats work, the details of which will be covered in future articles.

I’m assuming that the reader is familiar with programming, but has not had much exposure to coding theory or linear algebra. Thus, I’ll review the basics and treat the results we need as a “black box”, stating them and moving on. However, I’ll give self-contained proofs of those results in a companion article.

So let’s start with the problem we’re trying to solve! Let’s say you have n files of roughly the same size, and you want to guard against m of them being lost or corrupted. To do so, you generate m parity files ahead of time, and if in the future you lose up to m of the data files, you can use an equal number of parity files to recover the lost data files.

1–1 (1)