Algorithm | Reservoir Sampling
by Botao Xiao
Randomly get a number with known length
For a list with known length n, each of the number has a possibility 1/n to be selected.
int rand = random.nextInt(n);
return arr[rand];
Randomly get a number without known length
Since we don’t know the total length of the array, we don’t really known what’s the probability of single number is, but we want 1/n. So we use the reservoir sampling to find the value.
The procedure of picking:
- For first one, we hold the value(doesn’t mean to return the value, could be replaced).
- For second one, it has a chance of 1/2 to replace the holding value.
- For third one, it has a chance of 1/3 to replace the holding value.
- For n, it has a chance of 1/n to replace the holding value.
Explaination:
- For a single number, the possibility for any single term to be chosen is:
possibility to be chosen = possibility chosen to hold * possibility not to be replaced
- Provement, 1 <= a <= n
- p = 1/a(hold) * ((a / a + 1) * (a + 1) / (a + 2) * … * (n - 1)/n) = 1/n
Randomly get k numbers without known length
- Same as the previous part, we take the first k.
- Then we take the [1, k + 1].
- finally we take [n - k + 1, n]
Reference
Subscribe via RSS