算数暗号化の見直し

共通鍵暗号化を見てから再び算数暗号化の説明を読みたくなりました。
The point of arithmetic coding is to reduce a message of many symbols to a single number which consumes fewer bits than the original message. The ultimate goal is to be as close as possible to Shannon's entropy limit, defined as:
n*SIGMA(i)(oi/n) where oi is the number of occurrences of the ith character, and SIGMA(i)(oi/n) means sum over all i of the number of occurrences of ith character divided by n (number of all characters in string). The article I read about the subject (Wikipedia) mentions two methods: frequency based and probability based. The intuition behind these methods seems to be the same: basically the result is a sum within a given range of integers (or fractions). Characters are encoded by proceeding left to right, adding a term to the sum. As each term is added, the possible outcomes of the final number are restricted further.
In the case of the frequency based approach, we begin with the cumulative frequency of each character. This is apparently the number of occurrences of all characters of lower index and of that character. In DABDDB, the cumulative frequency of D is 6, of B is 1, and of A is 0. Besides the cumulative frequency, there is also the "cumulative interval", this seems to span the range [cf-oci,cf) for ith character. For D, this would be [3,6). The indices of each would be A=0, B=1, D=3. The frequencies would be 1 for A, 2 for B, and 3 for D.
I am not able to verify the proof that the encoding/decoding algorithms work, nor am I able to demonstrate that this is efficient. It seems that the decoder must know the cumulative intervals and frequencies of each symbol, costing 6*3+6*3 = 36 bits, as least, for a 6-character message.
Basically, the intuition for the decoding should be to transform the coded integer such that the symbol derived at each step can have only one possible value. The decoding seems to use a division/floor method initially. Thus, given the code divided by n^i, what function correlates a symbol's index to this floored quotient? e.g. say DABDDB encodes to 25100. Then, dividing 25100 by 6^5 and taking the floor yields 3. However, given that we obtain 3 in this step, how do we know that it is as a result of using index 3 (D) as the first letter? What if the later terms added in the encoding step amount to more than 6^5, and the index multiplied by 6^5 in encoding was less than 3?