Practical correction and capacity estimation of deletion channel using correction trees
While we have a good understanding of information channels like the binary symmetric channel (BSC: a fixed independent probability of a bit flip) or the binary erasure channel (BEC: some values are marked as unknown), we do not even know the capacity of the simplest synchronization channel – the deletion channel (BDC): each bit is removed from the stream with an independent probability (p), e.g. 1010 -> 100.
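To make the channel model concrete, here is a minimal illustration (not part of the described implementation) of such a deletion channel:

```python
import random

# Binary deletion channel: each bit is dropped independently with probability p.
def deletion_channel(bits, p, rng=random):
    return [b for b in bits if rng.random() >= p]

# For example, deletion_channel([1, 0, 1, 0], 0.25) may return [1, 0, 0].
```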
This implementation uses a large-state (64 bit) analogue of sequential decoding (forward only), which required replacing convolutional codes with a specially designed code. The same encoding was previously used for BSC - the implementation, discussion, and analytical derivation of Pareto coefficients for BSC and BEC can be found here. To get some intuition, here is an interactive simulator for BSC.
We create a potentially huge tree of possible corrections. Each node corresponds to a corrected sequence of bytes up to a given point. Its branches correspond to the choice of the next corrected byte and the number of bits used for this purpose: 8 without correction, or a smaller number when we test scenarios with some deletions applied. In each node a uniformly distributed checksum is tested: R redundancy bits out of 8 bits (the rate is (8-R)/8 for R = 1 .. 7). These bits always agree for the proper correction and disagree with probability 1-2^-R for improper ones, allowing us to statistically limit the growth of the tree. After detecting an error, the most probable-looking node is chosen for further expansion (using a heap).
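A minimal sketch of how such a search could look. The 64-bit state update, the placement of the R checksum bits in the low part of each byte, and the flat penalty per assumed deletion are all assumptions made for illustration, not the actual implementation:

```python
import heapq

R = 4                        # redundancy bits per byte -> rate (8 - R) / 8
NODE_LIMIT = 5 * 10**7       # report failure once this many nodes were created
MASK64 = (1 << 64) - 1

def next_state(state, byte):
    # toy 64-bit rolling hash of the corrected history (assumed, not the real one)
    return ((state * 0x9E3779B97F4A7C15) ^ byte) & MASK64

def checksum_bits(state):
    # R pseudo-random bits derived from the state; this sketch assumes the encoder
    # placed the same bits in the low part of each encoded byte
    return (state >> 17) & ((1 << R) - 1)

def candidate_bytes(bits, deletions):
    """All 8-bit values obtainable by re-inserting `deletions` bits into the tuple
    `bits` of received bits (naive: may yield duplicates)."""
    if deletions == 0:
        yield sum(b << (7 - i) for i, b in enumerate(bits))
        return
    for i in range(len(bits) + 1):
        for b in (0, 1):
            yield from candidate_bytes(bits[:i] + (b,) + bits[i:], deletions - 1)

def correct(received_bits, frame_bytes):
    """received_bits: tuple of 0/1 bits after the deletion channel.
    Returns the corrected encoded bytes (payload extraction omitted)."""
    heap = [(0.0, 0, 0, ())]          # (penalty, bit position, state, decoded bytes)
    nodes = 0
    while heap and nodes < NODE_LIMIT:
        penalty, pos, state, decoded = heapq.heappop(heap)   # most promising node
        if len(decoded) == frame_bytes:
            return bytes(decoded)                            # full frame reconstructed
        for deletions in range(3):    # test 0, 1 or 2 deletions inside the next byte
            used = 8 - deletions      # received bits consumed by this byte
            if pos + used > len(received_bits):
                continue
            for byte in candidate_bytes(received_bits[pos:pos + used], deletions):
                nodes += 1
                if byte & ((1 << R) - 1) != checksum_bits(state):
                    continue          # checksum disagrees -> prune this branch
                # rough penalty per assumed deletion (a real weight would depend on p)
                heapq.heappush(heap, (penalty + 7.0 * deletions, pos + used,
                                      next_state(state, byte), decoded + (byte,)))
    return None                       # node limit exhausted -> correction failure
```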
For the deletion channel, different deletion patterns can lead to the same corrected sequence - such branches should be removed, which is achieved here by checking whether a given state has already been spotted at a given bit position.
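In the sketch above, this pruning could be added as a visited set consulted just before pushing a new branch (again only an illustration of the idea):

```python
# Prune branches whose (bit position, 64-bit state) pair has already appeared --
# they correspond to the same corrected prefix reached via a different deletion pattern.
visited = set()

def not_seen(pos, state):
    if (pos, state) in visited:
        return False              # equivalent branch already in the tree -> drop it
    visited.add((pos, state))
    return True
```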
The statistical behavior of such a random tree can be described by a Pareto coefficient (c): doubling the limit on the number of nodes reduces the probability of failure approximately 2^c times. We can improve performance by using more resources (memory and time) as long as c > 0. For BSC and BEC this coefficient can be found analytically, and c = 0 corresponds to the channel capacity there - it is the boundary at which the tree of possible corrections starts growing exponentially.
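Written out (an assumed parametrization, with A a constant depending on the channel, rate and frame length, and N the node limit):

```latex
P_{\text{fail}}(N) \;\approx\; A\, N^{-c}
\qquad\Longrightarrow\qquad
P_{\text{fail}}(2N) \;\approx\; 2^{-c}\, P_{\text{fail}}(N)
```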
It is argued that standard codebooks are not optimal for the deletion channel; in particular, they cannot work for p > 1/2, while a universal lower bound of (1-p)/9 on the rate is known (survey article). So the roughly extrapolated c = 0 positions here should be seen as lower bounds for low deletion probabilities - asymptotically achievable by the presented method. For large deletion probabilities, codes with long runs of the same value (0 or 1) are used instead. However, in practical applications deletions usually have low probability and appear alongside other types of damage like bit flips, which would break such a block structure. In contrast, other types of errors can be easily added to the presented approach as just different types of branches with corresponding probabilities.
The tests for R = 1, 4, 6, 7 (rate = 7/8, 1/2, 1/4, 1/8) were made for encoded sequences (frames) of 1000 bytes, with a 5*10^7 node limit. 1000 frames were tested for each case. "damaged" is the number of improperly corrected frames out of 1000. "nodes" is the average number of created tree nodes per encoded byte - a linear coefficient for time and memory cost (1 if error-free). The Pareto coefficient (c) was estimated by a linear fit to the central data ([1/3, 2/3]); a sketch of such an estimate follows the tables. The last column contains the roughly extrapolated c = 0 probability (test result files):
| rate 7/8 | p=0.001 | 0.003 | 0.005 | 0.007 | 0.009 | 0.011 | 0.012 | 0.013 | 0.014 | ~0.016 |
|---|---|---|---|---|---|---|---|---|---|---|
| c | 8.85 | 1.54 | 0.65 | 0.346 | 0.245 | 0.138 | 0.114 | 0.071 | 0.030 | 0 |
| damaged | 0 | 0 | 1 | 32 | 128 | 425 | 595 | 753 | 882 | - |
| nodes | 1.23 | 10.5 | 235 | 2355 | 9602 | 25k | 34k | 41k | 46k | - |

| rate 1/2 | p=0.01 | 0.02 | 0.03 | 0.04 | 0.05 | 0.06 | 0.07 | 0.08 | 0.09 | ~0.1 |
|---|---|---|---|---|---|---|---|---|---|---|
| c | 71 | 22 | 7.9 | 3.3 | 1.46 | 0.74 | 0.398 | 0.241 | 0.076 | 0 |
| damaged | 0 | 0 | 0 | 0 | 0 | 1 | 22 | 170 | 651 | - |
| nodes | 1.03 | 1.11 | 1.35 | 2.22 | 9.01 | 156 | 2093 | 13k | 37k | - |

| rate 1/4 | p=0.1 | 0.12 | 0.13 | 0.14 | 0.15 | 0.16 | 0.17 | 0.18 | 0.19 | ~0.2 |
|---|---|---|---|---|---|---|---|---|---|---|
| c | 7 | 2.9 | 2.1 | 1.4 | 0.89 | 0.66 | 0.46 | 0.287 | 0.153 | 0 |
| damaged | 0 | 0 | 0 | 0 | 0 | 7 | 35 | 125 | 446 | - |
| nodes | 1.54 | 2.80 | 4.88 | 11.7 | 84.7 | 746 | 2818 | 9996 | 28k | - |

| rate 1/8 | p=0.16 | 0.18 | 0.20 | 0.22 | 0.24 | 0.25 | 0.26 | 0.27 | 0.28 | ~0.29 |
|---|---|---|---|---|---|---|---|---|---|---|
| c | 8.70 | 5.51 | 3.26 | 1.77 | 0.877 | 0.611 | 0.377 | 0.245 | 0.098 | 0 |
| damaged | 0 | 0 | 0 | 0 | 0 | 7 | 48 | 228 | 623 | - |
| nodes | 1.40 | 1.82 | 2.92 | 9.40 | 97 | 911 | 4407 | 16k | 36k | - |
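
A sketch of how such a Pareto coefficient could be estimated from per-frame node counts, assuming that "central data ([1/3, 2/3])" refers to the central quantiles of the empirical distribution (the actual fitting procedure may differ in details):

```python
import numpy as np

def estimate_pareto_c(node_counts):
    """node_counts[i] = tree nodes needed to correct frame i (capped at the node
    limit for failed frames).  Assumes P(nodes > N) ~ A * N^(-c) so that c is
    minus the slope of the empirical complementary CDF on a log-log plot."""
    n = sorted(node_counts)
    m = len(n)
    ccdf = [(m - i - 1) / m for i in range(m)]   # fraction of frames needing > n[i] nodes
    lo, hi = m // 3, 2 * m // 3                  # central [1/3, 2/3] portion of the data
    x = np.log([n[i] for i in range(lo, hi)])
    y = np.log([ccdf[i] for i in range(lo, hi)])
    slope, _ = np.polyfit(x, y, 1)
    return -slope
```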
The estimated c = 0 positions are close to known theoretical capacity bounds (some recent article) and resemble the values for BSC, which for these rates are respectively 0.0171, 0.110, 0.214 and 0.295 bit-flip probability. Comparing with other implementations: an LDPC-based 2003 article breaks, for example, at p ~ 0.07-0.08 for a rate 0.2333 code (page 8), while the presented implementation still works above that deletion probability at rate 1/2 - allowing more than twice as much information to be transmitted through the same channel.
As this implementation uses the final state for verification, sending this state (protected) means that the rates should be reduced by a tiny factor, which decreases proportionally to the frame length (the Pareto coefficient does not depend on it). Without this state, the last part of the message may remain damaged. Having this state, we can also add bidirectional correction: simultaneously building a tree in the backward direction and finally merging the two. This way, two critical error concentrations are needed to essentially cripple the correction process, so the probability of failure is approximately squared (the Pareto coefficient is doubled), which is confirmed by the results for BSC.
Another direction for further work is trying different codewords (especially for large deletion probabilities) - the current coding was designed for bit flips. We could also try to find the Pareto coefficients analytically here; the difficulty compared to BSC is the cutting of branches (deletion patterns) corresponding to the same correction.
Jarek Duda, Kraków, 8/08/2014