A Nifty Non-Replacing Selection Algorithm

Algorithms are awesome fun, so I was super pleased when my little bro asked me to help him with a toy problem he had.

The description is this: It’s a secret santa chooser. A group of people, where each person has to be matched up with one other person, but not themselves.

He’s setup an array that has an id for each person.

His initial shot was something like this (pseudo, obviously):

foreach $array as $key => $subarr {
  do {
      // $count is set to count($array)
      $var = rand(0, $count)
  } while $var != $key and $var isn't already assigned
  $array[$key][$assign] = $var

Initially he was mostly concerned that rand would get called a lot of times (it’s inefficient in the language he’s using).

However, there’s a ton of neat (non-obvious) problems with this algorithm:

  1. By the time we’re trying to match the last person, we’ll be calling rand (on average) N-1 times
  2. As a result, it’s inefficient as hell ( O(3N+1)/2)? )
  3. There is a small chance that on the last call we’ll actually lock – since we won’t have a non-dupe to match with
  4. Not obvious above, but he also considered recreating the array on every iteration of the loop *wince*

Add to this some interesting aspects of the language – immutable arrays (ie, there’s no inbuilt linked lists, so you can’t del from the middle of an array/list) & it becomes an interesting problem.

The key trick was to have two arrays:

One, 2-dimensional array (first dim holding keys, second the matches)
and one 1-dimensional array (which will only hold keys, in order).

Let’s call the first one “$list” and the second “$valid”.

The trick is this – $valid holds a list of all remaining valid keys, in the first N positions of the array, where initially N = $valid length. Both $list & $valid are initially loaded with all keys, in order.

So, to pick a valid key, we just select $valid[rand(N)] and make sure it’s not equal to the key we’re assigning to.
Then, we do two things:

  1. Swap the item at position rand(N) (which we just selected) with the Nth item in the $valid array, &
  2. Decrement N ($key_to_process).

This has the neat effect of ensuring that the item we just selected is always at position N+1. So, next time we rand(N), since N is now one smaller, we can be sure it’s impossible to re-select the just selected item.

Put another way, by the time we finish, $valid will still hold all the keys, just in reverse order that we selected them.

It also means we don’t have to do any array creation. There’s still a 1/N chance that we’ll self-select of course, but there’s no simple way of avoiding that.

Note that below we don’t do the swap (since really, why bother with two extra lines of code?) we simply ensure that position rand(N) (ie, $key_no) now holds the key we didn’t select – ie, the one that is just off the top of the selectable area.

Oh, and in this rand implementation rand(0, N) includes both 0 AND N (most only go 0->N-1 inclusive).

$valid = array_keys($list);
$key_to_process = count($valid) - 1;
do {
  $key_no = rand(0, $key_to_process);
  if ($key_to_process != $valid[$key_no]) {
    $list[$key_to_process][2] = $valid[$key_no];
    $valid[$key_no] = $valid[$key_to_process];
  # deal with the horrid edge case where the last 
  # $list key is equal to the last available 
  # $valid key
  if ($key_to_process == 0 and $valid[0] == 0) {
    $key_no = rand(1, count($list) - 1);
    $list[0][2] = $key_no;
    $list[$key_no][2] = 0;
} while ($key_to_process >= 0);

Without the edge-case code, this results in a super fast, nice slick little 10 or so line algorithm (depending on how/if you count {}’s :)

Elegant, I dig it.


  • No Related Posts

December 16th, 2008 | Algorithms, Code, Software-Engineering |