# (IC 5.14) Finite-precision arithmetic coding - Decoder

## Метаданные

- **Канал:** mathematicalmonk
- **YouTube:** https://www.youtube.com/watch?v=RFPwltOz1IU
- **Дата:** 11.10.2011
- **Длительность:** 24:22
- **Просмотры:** 10,095
- **Источник:** https://ekstraktznaniy.ru/video/52875

## Описание

Pseudocode for the arithmetic coding decoder, using finite-precision.


A playlist of these videos is available at:
http://www.youtube.com/playlist?list=PLE125425EC837021F

## Транскрипт

### Segment 1 (00:00 - 05:00) []

in the previous video we wrote down the finite Precision version of the encoder for arithmetic coding and now we're ready for the decoder so let's do it all right so this is going to be the decoder finite precision and similarly to in the infinite Precision case we will have the following input so the output from the encoder was this binary sequence the encoded sequence and we're going to get that sequence and there may be some other bits at the end also so if you remember back to when we talked about this in the infinite Precision case the decoder it's not just necessarily going to get um this sequence on its own it's going to get that sequence plus some other bits you know these those other bits could be like additional messages that have been concatenated onto this one or they could just be you know it could just be this is stored in memory somewhere and you have to know where your file ends and if that's not given to you ahead of time you're not given this m you're not given the length of this encoded message you have to figure that out and the responsibility part of the responsibility of the decoder is figuring out what M is so that you know where your file ends okay so we get that and we also get our c just like before we get our c and d and let's suppose we also get this R once again like in the encoder very good so now similarly we so this is going to start out looking like the infinite Precision decoder we initialize a to zero and B to whole you know whole is playing the role of the number one and now let's initialize Z before we initialize this to the binary number course responding to this sequence but now let's initialize it to zero and we're going to approximate that binary number so Z starts out at Z and I is going to be a counter and index into this array then we'll start that out actually let's start it we'll start at one so the first part of the finite Precision decoder is initializing Z to an approximation of that binary number so here's how we're going to do that here's one way to do it so we'll do a little while loop while I is less or equal to Precision so we're going to use Precision bits Precision number of bits to for our approximation while I is less than precision and we don't want to overrun M so M You Know M could be an upper bound on this little M or it could just be if you're not giving an upper bound if this is like an infinite sequence then you would just drop this condition so m is just Big M is just some upper bound on Little M okay so we're going to do this Loop and we're GNA at each iteration if beta I is one then we increment Z by 2 to the Precision minus I so we're constructing this binary number Z and then we increment I so basically all this is doing is initializing so this is basically initialize Z to be the number beta 1 up to Beta Precision in binary and note this is an integer now I'm not using a it's not a binary expansion there's no point here this is an integer okay so that's our initialization and we're going to update Z as we go we'll do that below now we're going to have a for Loop similarly to or not a for Loop excuse me a while loop similarly to before we had the infinite while loop and we're going to decode a symbol at a time so here's what's going to happen this is going to look similar to the infinite precision decoder for JAL 0 up to n so we're going to consider each possible Source symbol that's the next possible symbol to be decoded and we will do the following operations so like in the infinite Precision case we set W to be B minus a and we consider the b and a corresponding to that next possible symbol

### Segment 2 (05:00 - 10:00) [5:00]

symbol but now we're doing integers so remember just like in the encoder everything is in integers here so this is going to be the exact same round rounding operation that we did in the encoder so however you implemented this you're going to do it exactly the same way here round W cxi divided by R oops that's not supposed to be I'm so used to writing the encoder version this is J excuse me we don't know what XI is so this is J and then just like once again in the infinite Precision algorithm we're going to check if that is indeed the next symbol and how do we check we just see whether the current Z is contained in this interval from a knot to B knot and this is slightly different than the infinite Precision algorith it's written exactly the same but remember that Z here is just an approximation to the true Z supposed to be defined so but it's the best possible approximation that we can make with our given level of precision it uses these um this Precision number of bits all right so in this case what we do is we update our so well first we emit J because we know that was the next symbol J to be decoded because it was contained in that interval it's the cor the interval corresponding to J contained C so we emit J just like before and right and then we update our a our new a is a not that was the correct new a and our new B is bot and we have to see if we're at the end of the file so we check is J our end of file symbol zero is if J is the end of file symbol then we quit finish so that's this is when the decoder can turn terminate but now we're not quite done yet this was when we finished the infinite Precision decoder but now we have some more work to do because we need to integrate those rescaling operations into our decoder all right so let me first indicate here this section of code was decoding decode next symbol and update and update A and B now the next step the next part of the algorithm or the last is those rescaling operations and this is the crucial part of the decoder and It's tricky to get it to work but I'm going to show you how to do it so the key thing here is to rescale and round and exactly the same so the key is to rescale at exactly the same times during decoding as you did during encoding for the particular sequence that you have so this is so important that I'm going to write rescale at exactly the same times the same points in the sequence so that you're so that when you do this rounding you get exactly the same values of A and B at any given step in the process so that when you do this comparison you get exactly the same result exactly same times during decoding decoding as during encoding so rescale exactly the same times and in exactly the same way so here's one way to do that so remember that above let's go back to our encoder here in the encoder we we were going through this for Loop and we had you know we computed the new A and B for the next symbol and then we did these rescaling operations so we're going to mimic this process during decoding so during encoding we went through this Loop some number of times

### Segment 3 (10:00 - 15:00) [10:00]

you know one time for each Source symbol encoded and depending on what A and B were we rescaled at certain points and so we're going to do that rescaling just like I said and exactly the same way and exact at exactly the same times during decoding as during encoding so let's do it so here we go so let's have let what color should we use let's do this orange while so in uh no not in that for Loop excuse me so we're we in this while loop here while maybe I should indicate where these Loops are going this is the four um this if contains this stuff okay and this while this out Outer while here is going to contain all this more stuff to go okay so now we have after we determined the next symbol and we updated A and B we're now going to do some rescaling so we're going to do it in exactly the same way while B less than half or a greater than half we're going to do possibly cases zero or one so in case zero and if B is less than half then we update A and B just like we did above a is 2 * a b is 2 * oops B and now we also need to update Z right so we need to keep that in the right relationship to a and b z becomes 2 Z otherwise if a is greater than half then we're in case one and we update in the same way that we did above a becomes 2 A minus half B becomes 2 B minus half and Z becomes 2 Z minus half so we're keeping Z in the same relation to the others now remember that Z was an approximation when we initialized Z up here we started out at zero and we initialized it in this way and Z was just an approximation to our given level of precision so when we do this rescaling think about what's going on here so we are we you know we're blowing things up again so when we multiply Z * 2 we're basically like left shifting that binary representation here so if this was a zero so that's the case where B is less than half that's the case when that first bit was a zero well the fir the first bit is going to be zero in that case that's not exactly what when that happens but the first bit is going to be a zero and this operation basically just left shifts this binary representation and similarly in this case here uh if a is greater than half then the first bit of our approximation to Z is going to be a one so this half is sub subtracting off that one and then the two is left shifting it basically it's way you can think about it so what we need to do after we do those rescalings is we need to update our repres our approximation of Z so let's do that so let's update that approximation and here's how we're going to do that so if we're not yet at the end of our binary sequence and the next bit in the sequence is a one then we should update Z to reflect that fact so then we set Z the new value of Z is z + one right because you know Z was Z is this binary thing and after we left shifted it there's this last bit that we need to read off the next bit in the sequence it's not going to be beta Precision it's I at this point so this is in the algorithm the last bit of Z and after we do that we should increment I because we've read off that next bit from our sequence of betas very good and now we have to do the other type of rescaling this was when we were in the lower half or the upper half and we have our our quote unquote middle type of

### Segment 4 (15:00 - 20:00) [15:00]

rescaling that we have to consider so if a is greater than quarter and B is less than 34s or 3 times quarter then we need to do the middle rescaling type of step and that's just the following so a becomes 2 A minus qu B becomes 2 B minus qu and Z becomes 2 Z minus qu Z to Z minus quarter so just as a reminder I'll just go back up here to remind you so this was uh in the encoder this was our rescaling and so here we and during decoding we're rescaling in exactly the same way so so in this middle rescaling step if a is B than a/ quarter and B is less than 3/4 then well we had to increment s before we don't need to worry about s anymore um but we do need to rescale in exactly the same way and we have to update Z as well all right whoa what happened oops okay there we are back to our decoder and similarly to before in this if we do this rescaling then we've shifted uh Z and we need to update our approximation so for every iteration through this while loop right here we need to update Z so if I is less than M and so this is exactly the same as what we did above and beta is one then add one to Z and increment we're always going to increment I so I is going to keep track of how many times we've left shifted Z okay and let me now indicate just like to be consistent I'll indicate the nesting structure here and that is going to do it for the decoder that does it so that big while loop out here this while one and each iteration of that while one so maybe just recap here so this is our decoder so we initialized we initialized Z to this approximation to this binary number and then we went through this while loop and on each iteration we decoded the next symbol by looking at our current approximation to Z and we emitted that symbol and we updated A and B and when if we hit the end of the file symbol then we quit and then we also did these rescaling operations to ensure that at each step at each iteration when we get to this point that our A's and B's are exactly the same as they were during the encoding process that is so critical okay you're getting tired of me saying that but it is really important and uh so we did this and now the output of this whole thing so we have the output we put that output is the original sequence that was encoded X1 up to XK and or in other words you know it was the original original Source sequence I'll put it this way so to be clear so what I mean by this is the sequence that was emitted during this algorithm which it coincides with the original sequence of course because we decoded it all right so that is our decoder our finite Precision decoder and One Last Detail to mention here to explain is how to get M so remember I said at the beginning that the responsibility of the decoder is not only to decode the message that's uh the initial part of this binary string represents but also to tell where that binary string ends so you need to figure out what m is in general you may and so I'm not going to write down algorithmically I'm going to leave it as an exercise for you to figure out if you are interested in in wrapping up that detail but I'll basically indicate

### Segment 5 (20:00 - 24:00) [20:00]

the rough idea of what you do the rough idea is that you're going to look at Z your current representation of Z at the termination of the algorithm and it's going to be something like you know if beta 1 so this is determining M so if we had beta 1 up to Beta M + 1 beta capital M then your final Z is going to be somewhere it's going to be some binary sequence that's some subset of these some substring of this right here so Z is going to be somewhere here and a and b are going to be you know some integers that you know whatever they happen to be at the end of the algorithm and what you're going to do is similarly to the you know to in the infinite Precision case you're going to find the first you know the first um few know however many bits of this initial part of Z so if Z is Z1 up to Z Precision at the end of the algorithm in binary so these are each of these is a bit z i is a bit then you're going to find whatever the initial sequence Z1 up to Z and then Zer so that the interval corresponding to this number is contained in the interval from A to B well I shouldn't put I here let me put um let me put J CU I so I is going to help you determine what m is because remember that I was keeping track of how many times we left shifted a b and Z so I is going to tell you where the end of your um Z occurs with respect to M so you're going to use so use Z A and B and I to determine m and I'll leave that detail to you okay so that is the decoder we did it so that finishes our description our description of the finite Precision arithmetic coding decoder and I'll iterate at the expense of being monotonous that the key thing here was to rescale at exactly the same times during decoding as during encoding okay so we finished our description of the finite Precision encoder and decoder and this was a quite a long section overall about arithmetic coding but it's truly a remarkable algorithm I hope you'll agree now that you're familiar with it and hopefully you'll have a chance to implement this algorithm and see for yourself how amazing its performance is and maybe you can even play around with a some non IID models and maybe you could even you know break some of the world records for compression performance some of the best performing algorithms or the best performing compression algorithms use arithmetic coding so maybe you can break the current world records you know implement the AL algorithm and Implement a good probabilistic model of language or different sources that are real world sources and maybe you can be the next one to have the best performing compression algorithm okay happy coding and I'll see you next time