General Question

Vincentt's avatar

Why does a Burrows-Wheeler Transformation work?

Asked by Vincentt (8094points) February 12th, 2008

I’ve read a few explanations of the principle of BWT, however, I still don’t get why it works, i.e. why does the output of BWT contain sequences of the same character that was originally only present at different locations in the input? Why aren’t the characters also scattered across the output?

Observing members: 0 Composing members: 0

4 Answers

paulc's avatar

Good question. Reading the Wikipedia article, I think this line gets close to your answer:

“So it can be seen that the success of this transform depends upon one value having a high probability of occurring before a sequence, so that in general it needs fairly long samples (a few kilobytes at least) of appropriate data (such as text).”

Basically I think it means that its most successful if a number of sequences repeat in the original sample. To get any deeper I think we’d be getting into some serious statistical mathematics which I’m not currently well versed in.

Vincentt's avatar

So the sequentials A’s and N’s in “Banana” would result from “NA” and “AN”, respectively? That sounds credible indeed, thanks!

paulc's avatar

Yes, by making all the rotations and then sorting them alphanumerically any often repeating patterns (AN or NA in BANANA) should expose themselves in strings of similar values or similar combinations of values. At least that’s my understanding.

Thanks for asking this question, I had never heard of this before and its really very interesting!

Vincentt's avatar

Thanks for answering :)

It’s for a project for school about data compression and indeed, it’s really interesting. You might also be interested in Huffman coding and Move-to-front transform, which I’ve also studied :)

Answer this question

Login

or

Join

to answer.

This question is in the General Section. Responses must be helpful and on-topic.

Your answer will be saved while you login or join.

Have a question? Ask Fluther!

What do you know more about?
or
Knowledge Networking @ Fluther