
Post a Comment On: cbloom rants

"Some learnings from ZStd"

Blogger Jarek Duda said...

Hi Charles,
Thanks for the comments. I don't have experience with LZ, but regarding the last part: the best (still heuristic) method for tANS symbol spread that I am aware of is "tuned spread", which uses both the quantization and the actual probabilities to shift the symbol appearances left or right accordingly.
If the quantization is p[s] ~ q[s]/L, then symbol s has q[s] appearances i \in {q[s],...,2q[s]-1}, and
the preferred position for the i-th appearance is x ~ 1/(p[s] ln(1 + 1/i)).

For a singleton i=1, x ~ 1/(p[s] ln(2)) \in [L,2L-1], so it can represent probabilities ranging from p[s] ~ 1/(2L ln(2)) ~ 0.72/L for the rightmost position (x=2L) to p[s] ~ 1/(L ln(2)) ~ 1.44/L for the leftmost position (x=L).
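
A minimal Python sketch of the two formulas above may make the numbers easier to check. It only evaluates the preferred position x ~ 1/(p[s] ln(1 + 1/i)) and the singleton probability range; it is not the tuned spread implementation from the toolkit linked below, and the function names are made up for this illustration.

```python
import math

def preferred_position(p, i):
    """Preferred state x for the i-th appearance of a symbol with true probability p.

    Evaluates x ~ 1 / (p * ln(1 + 1/i)) from the comment above.
    """
    return 1.0 / (p * math.log(1.0 + 1.0 / i))

def preferred_positions(p, q):
    """Preferred positions for all appearances i = q, ..., 2q - 1 of one symbol.

    q is the quantized count (p ~ q / L); p is the true probability.
    """
    return [preferred_position(p, i) for i in range(q, 2 * q)]

def singleton_probability_range(L):
    """Probabilities a singleton (q = 1, i = 1) covers as x moves from 2L down to L."""
    low = 1.0 / (2 * L * math.log(2.0))   # ~ 0.72 / L, position x = 2L
    high = 1.0 / (L * math.log(2.0))      # ~ 1.44 / L, position x = L
    return low, high
```

For example, `singleton_probability_range(1024)` gives the two endpoints of the range a singleton can stand for with L = 1024, and `preferred_positions(3 / 16, 3)` lists where the three appearances of a symbol with p = 3/16 would ideally sit in [16, 32) when L = 16.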
https://encode.ru/threads/2013-Asymmetric-numeral-system-toolkit-and-fast-tuned-symbol-spread
https://github.com/JarekDuda/AsymmetricNumeralSystemsToolkit
Best,
Jarek

September 29, 2017 at 1:19 AM

Blogger cbloom said...

Jarek, but that would require transmitting the true probability (p), not the normalized probability (q). That may or may not be worth it, as the true probability may take more bits, and it would require the decoder to spend the time to normalize (to recreate the q's since it was sent the p's), which is non-trivial.

September 29, 2017 at 8:33 AM

Blogger Jarek Duda said...

Indeed, in practice some compromise is needed, for example distinguishing a singleton in the center of the range from the very low probability symbols.
A more sophisticated option is to store the probability distribution quantized in an optimized way (minimizing the cost of header + KL), so that the decoder also performs the proper quantization and symbol spread ... https://encode.ru/threads/1883-Reducing-the-header-optimal-quantization-compression-of-probability-distribution
Another option is storing the symbol spread in the header ... anyway, there are many possibilities to optimize among.
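
The "header + KL" trade-off mentioned above can be sketched roughly as follows. This is only an illustration of the cost being minimized, not the method from the linked thread: `header_bits` is a hypothetical input standing for however many bits a given header encoding takes, and the function names are made up here.

```python
import math

def kl_bits_per_symbol(p, q, L):
    """Expected extra bits per symbol when coding true probabilities p
    with quantized probabilities q[s] / L (Kullback-Leibler divergence)."""
    if sum(q) != L:
        raise ValueError("quantized counts must sum to L")
    total = 0.0
    for ps, qs in zip(p, q):
        if ps > 0.0:
            if qs == 0:
                raise ValueError("a symbol that occurs needs q[s] >= 1")
            total += ps * math.log2(ps / (qs / L))
    return total

def total_cost_bits(p, q, L, n_symbols, header_bits):
    """Header cost plus the quantization penalty over n_symbols coded symbols.

    header_bits is an assumed, externally supplied size for the stored
    distribution; a coarser quantization would lower it but raise the KL term.
    """
    return header_bits + n_symbols * kl_bits_per_symbol(p, q, L)
```

With p = [0.5, 0.5] quantized to q = [12, 4] at L = 16, the penalty is about 0.21 bits per symbol, so over 1000 symbols a 10-bit header would be dwarfed by roughly 208 bits of quantization loss.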

September 29, 2017 at 11:01 AM

Blogger rep_movsd said...

Hi Charles

Sorry for the off-topic comment, but whatever happened to your old content on cbloom.com?

It has always been a resource I pointed data compression enthusiasts to, but the website seems to have been defunct for quite a while.

Please put it back somewhere online if you still have that stuff!

March 15, 2018 at 12:20 PM
