11 Jan 2023 (+0800)

A shuffle algorithm inspired by the merge sort was presented and proved to be wrong.

The input array is divided into two parts as equal length as possible. After two parts are recursively shuffled, they’re merged to one array. The merge procedure is similar to the one in merge sort. But the decision of which element should be merged is based on the output of a random number generator, instead of the comparison between two elements. The C implementation is the following.

```
#define N 65536
int *mshuf(int *in, int n)
{
static int a[N];
int m, *l, *r, *p;
if (n < 2)
return in;
m = n/2;
l = mshuf(in, m);
r = mshuf(in+m, n-m);
p = a;
while (l<in+m && r < in+n)
if (rand()%2)
*p++ = *l++;
else
*p++ = *r++;
while (l<in+m)
*p++ = *l++;
while (r<in+n)
*p++ = *r++;
return memcpy(in, a, n*sizoef(*a));
}
```

For \(in=\{0,1,2,3,\ldots,n-1\}\),
\(n\) is a power of 2, we define event
\(S\) as \(a=\{0,1,2,\ldots,n-1\}\) after
`mshuf()`

returned. Assume `mshuf()`

is correct,
we should have:

\[Pr\{S\} = \frac{1}{n!}\]

We will prove that it’s false. To make \(a=\{0,1,2,\ldots,n-1\}\), after the
recursive `mshuf()`

calls returned, we must have \(l=\{0,1,\ldots,m-1\}\) and \(r=\{m,m+1,\ldots,n-1\}\). Denote this event
as \(D\). By assuming the correctness
of `mshuf()`

, we have:

\[ Pr\{D\} = ({\frac{1}{m!}})^2 \]

After \(D\) occurred, we must ensure \(a[i]=i\) in the merge procedure. We define event \(A_i\) as \(a[i]=i\) under the condition that \(D\) occurred. By the multiplication rule, we have:

\[ \begin{align} &Pr\{A_0A_1\ldots A_{n-1}\} \\ =&Pr\{A_0\}Pr\{A_1|A_0\}Pr\{A_2|A_0A_1\}\ldots Pr\{A_{n-1}|A_0A_1\ldots A_{n-2}\} \end{align} \]

When decide the value of `a[i]`

, if \(i<m\), we randomly chose a value from
`l[]`

or `r[]`

. If \(i\ge m\), we can only take the value from
`r[]`

because all values in `l[]`

are merged. Thus
we have:

\[ \begin{equation} Pr\{A_i|A_0A_1\ldots A_{i-1}\} = \left\{ \begin{aligned} \frac{1}{2}, && i \in [0,m) \\ 1, && i \in [m, n) \end{aligned} \right. \end{equation} \]

Thus we get:

\[ Pr\{A_0A_1\ldots A_{n-1}\} = \frac{1}{2^m} \]

Then we get:

\[ \begin{align} &Pr\{S\} \\ = &Pr\{DA_0A_1\ldots A_{n-1}\} \\ = &(\frac{1}{m!})^2\frac{1}{2^m} \\ \neq &\frac{1}{n!} \end{align} \]

Thus `mshuf()`

is incorrect.

*Q.E.D.*