DoorKickers's blog

Blogs

后缀排序

2020-02-26 12:26:19 By DoorKickers

Pre-defination

S : a string named S that consists of several characters

|S| : the size of S

S[l:r] : a sub-string of S which consists of the characters S[l], ... , S[r]

Suf(i) : a suffix of S that means S[i:|S|]

Algorithm

Sqy Algorithm I is an algorithm that can sort the suffix of a given string S in O(nlogn) time, and the result of this algorithm will be stored in a array called Suffix Array.

Let me show the code to you first, and then I will explain some intricacy details

(Assumed that you have already got the Radix Sort which can sort an array in O(n) time.)

inline void SA_sort(char *s, int n) {
    for (int i = 1; i <= n; i++) ++c[x[i] = s[i]];
    for (int i = 2; i <= m; i++) c[i] += c[i - 1];
    for (int i = n; i >= 1; i--) sa[c[x[i]]--] = i;
    for (int k = 1; k <= n; k <<= 1) {
        int num = 0;
        for (int i = n - k + 1; i <= n; i++) y[++num] = i;
        for (int i = 1; i <= n; i++) {
            if (sa[i] > k)
                y[++num] = sa[i] - k;
        }
        for (int i = 1; i <= m; i++) c[i] = 0;
        for (int i = 1; i <= n; i++) ++c[x[i]];
        for (int i = 2; i <= m; i++) c[i] += c[i - 1];
        for (int i = n; i >= 1; i--) sa[c[x[y[i]]]--] = y[i], y[i] = 0;
        swap(x, y);
        x[sa[1]] = 1, num = 1;
        for (int i = 2; i <= n; i++) {
            x[sa[i]] = (y[sa[i]] == y[sa[i - 1]] && y[sa[i] + k] == y[sa[i - 1] + k]) ? num : ++num;
        }
        if (num == n) break;
        m = num;

    }
}

Ok, that's the code.

Above all, the most important part that can make you better understanding of that snippet is the meaning of each array.

To simplify, we can use the function concept instead of array concept.

A function f is a machine that can get input and return an output.

Good, looks like you understand it completely.

 

Next step : look through the arrays that mentioned in that snippet.

You will comprehend them via function concept (expect c[])

 

x[] : input a initial position i, and return the "value" of the first-key whose length is k and whose start position is i.

y[] : input a current partial rank k of second-keys, and return the initial position i of the first-key which corresponds to the No.k second-key.

sa[] : input a current global rank k of the combination of first-keys and it's corresponding second-key, and return the initial position i which represents for that rank k string.

Remember it!!!!!!!!!!!!!

Ok, the writter is fucked up.

So

To be continued.

Comments

No comments yet.

Post a comment

You can refer to mike by using "@mike", and "mike" will be highlighted. If you want to type the character "@", please use "@@" instead.

You can enter "/kel" to use the emoticon "kel".