Post a Comment On: cbloom rants

"09-12-10 - Challenges in Data Compression 2 - Sparse Contexts"

2 Comments

Blogger Shelwien said...

http://encode.ru/threads/1128-Probability-estimation-for-near-empty-contexts

September 14, 2010 at 2:28 PM

Blogger Matt Mahoney said...

Indirect context models, like those in PAQ or ZPAQ, handle this by looking at past instances of {n0=1, n1=0} in other contexts and counting how many times a 0 or 1 occurred next.
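As a rough illustration of the indirect idea (my own naming and smoothing choice, not the actual PAQ/ZPAQ code): the bit-history state itself becomes the context for a second table that records what bit actually followed that state anywhere in the file.

```python
from collections import defaultdict

# Hypothetical sketch of indirect modeling: map each bit-history state
# to counts of the bit that followed it, across all contexts.
followers = defaultdict(lambda: [0, 0])  # state -> [times 0 followed, times 1 followed]

def predict(state):
    """Laplace-smoothed P(next bit = 1) given a bit-history state."""
    c0, c1 = followers[state]
    return (c1 + 1) / (c0 + c1 + 2)

def learn(state, bit):
    """Record which bit actually followed this state."""
    followers[state][bit] += 1
```

So a near-empty context like {n0=1, n1=0} gets a prediction learned from every other place that same history was seen, rather than from its own single observation.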

To handle nonstationary sources, the counts are bounded so that a bit history can be represented in 1 byte: you bound n0*n1 < ~30 and additionally keep only the most recent bit. If an update would go outside this bound, you scale n0 and n1 down proportionally. This discards old statistics while keeping enough to make good predictions.
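A minimal sketch of that bounded-counter update (the real PAQ/ZPAQ bit-history states are a precomputed table; the bound of 30 and the halving-style proportional scaling here just illustrate the mechanism described above):

```python
BOUND = 30  # keep n0*n1 below roughly this value

def update(n0, n1, bit):
    """Count one bit, then rescale the history if it grows too large."""
    if bit == 0:
        n0 += 1
    else:
        n1 += 1
    # Scale both counts down proportionally once the bound is exceeded,
    # discounting stale statistics while preserving the n0:n1 ratio.
    while n0 * n1 >= BOUND:
        n0 = (n0 + 1) // 2
        n1 = (n1 + 1) // 2
    return n0, n1
```

For example, a history at {n0=5, n1=5} that sees another 0 would hit 6*5 = 30 and rescale to {n0=3, n1=3}, so recent bits weigh more than old ones.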

Keeping only the most recent bit in your history means that the sequences 00111 and 11001 would make the same prediction but that 11100 would make a different prediction, even though all three have {n0=2, n1=3}.
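That distinction is easy to check with a toy state function (the function name and tuple encoding are mine; the 1-byte states above encode the same information more compactly):

```python
def bit_history_state(bits):
    """Reduce a bit string to (n0, n1, last_bit) -- a simplified stand-in
    for the bounded 1-byte bit-history states described above."""
    n0 = bits.count('0')
    n1 = bits.count('1')
    return (n0, n1, bits[-1])

# 00111 and 11001 collapse to the same state, while 11100 does not,
# because only the most recent bit of the ordering survives.
same = bit_history_state('00111') == bit_history_state('11001')      # True
different = bit_history_state('00111') != bit_history_state('11100') # True
```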

September 14, 2010 at 4:23 PM
