13
$\begingroup$

What is the least number of comparisons we need to find the median of 6 distinct numbers?

I am able to find the answer to the median of 5 distinct numbers to be 6 comparisons, and it makes sense, however in the case of 6 numbers I can't find an answer.

The best I was able to do it in by hand was 9 comparisons. Can that be minimized further?

Edit: Median in this case, we are assuming to be the lower median.

2 Answers 2

11

It takes at least 8 and can be done in exactly 8.

EDIT

As pointed out by hardmath, the lower bound below is actually an upper bound on the lower bound. JeffE's notes which I link at the bottom give a value of $7$, not $8$.

See this cstheory question here (again thanks to hardmath): https://cstheory.stackexchange.com/questions/12257/exact-number-of-comparisons-to-compute-the-median/12287#12287

which does indeed say the value is $8$.

Note: the below is an incomplete proof.

Lower Bound

A lower bound to find all the $k$ smallest elements is given here: Lower bounds for Selection.

Which says we need at least

$$ n - k + \sum_{n+1-k < j \le n} \lceil \log_{2} j \rceil $$

For $n=6, k=3$, this gives $9$.

$$ 6 - 3 + \lceil \log_{2} 5 \rceil + \lceil \log_{2} 6 \rceil = 9$$

Thus to find the smallest, second smallest and the third smallest, we need at least $9$ compares.

Now any algorithm which finds the $3^{rd}$ smallest element (say $z$) will have to find the set $S = \\{x, y \\}$ such that $x < z$ and $y < z$ (otherwise how does it know that $z$ is the third largest?).

An extra compare between $x$ and $y$ will now give us the smallest, second smallest and third smallest numbers.

Thus any algorithm which finds the third smallest must use at least 8 comparisons.


Upper Bound

As I.J. Kennedy notes, there is an algorithm to find the 3rd smallest element of 6 elements in 8 comparisons and can be found in Knuth's Art of Computer Programming Vol 3, Page 212.

According to Knuth, this is a result obtained by A. Hadian and M. Sobel [Univ of Minnesota, Dept of Statistics Report 121 (169)] and they show that the $k^{th}$ largest element can be found in

$$ n - k + (k-1) \lceil \log_{2} (n+2-k) \rceil $$

comparisons. For $n=6$ and $k=4$ we get the upper bound of $8$.


Algorithm

The way they do it (Knuth's book has all this information) is:

First create a knockout tournament on $n-k+2$ items, which takes $n-k+1$ compares.

The largest item (say $L$) of this cannot be a candidate. So of the remaining $k-2$ pick one and follow the path of $L$ up the tree, which gives the the new largest in at most $\lceil \log_{2} n+2-k \rceil$ compares. Remove this new largest, replacing it with one of the remaining and following up the path of this new largest.

Continue till the rest of the $k-2$ are exhausted. Finally replace the largest with $-\infty$ and determine the largest of the remaining $n+1-k$ which will be the $k^{th}$ largest as we have removed the largest $k-1$ so far.


Related: http://compgeom.cs.uiuc.edu/~jeffe/teaching/497/02-selection.pdf

  • 0
    Oh wait. Miscalculation. This gives 7.2010-10-18
  • 0
    And if it gives 7, I can't seem to reproduce that by hand..I can only do 9. Any chance you can figure out how the comparisons would be?2010-10-18
  • 0
    @Sev: This is not a definitive answer. The fact that this gives 7 does not mean it can be done in 7. It is just a _lower_ bound. It means it is one of 7,8,9. I am looking to see if that above paper has tighter bounds.2010-10-18
  • 0
    Thanks. Besides the theoretical, I'm looking for the smallest number of comparisons it *can* be done in. But of course the theoretical helps.2010-10-18
  • 0
    Also can you explain why you use 3 and not 4? Since the paper states, the k-th largest element. Not the k-th smallest.2010-10-18
  • 0
    @Sev: The reason I posted it was the I miscalculated and got 9. So it seemed you had the optimal algorithm. Unless you prove you cannot do better, you cannot be sure if what you have is the best. You have to get theoretical and start proving lower bounds/optimality. So, you can't go anywhere 'besides the theoretical'. As to k=3 or 4. It does not matter. Just invert your comparison function. So the paper applies to both smallest and largest.2010-10-18
  • 1
    The stated equation seems strange, in that the number of comparisons should be the same for k and n+1-k. Finding the kth largest is finding the (n+1-k)th smallest, so just flip the sense of the comparison. In your case of n=6, finding the 3rd or 4th should take the same number of comparisons.2010-10-18
  • 0
    Makes sense, thank you!2010-10-18
  • 0
    Perfect!! Thanks so much.2010-10-19
  • 0
    @Moron: would you agree with Kennedy about the upper bound being 8?2010-10-19
  • 0
    @Sev: Currently I am away from said Knuth's book. I will confirm that once I get access to it.2010-10-19
  • 0
    @Moron: thank you very much for all the help!2010-10-19
  • 0
    @Sev: Thank you for the interesting problem :-)2010-10-19
  • 1
    I found this Answer worth linking to a new one, about [finding the 3rd largest of seven numbers](http://math.stackexchange.com/q/861602/3111). However despite the title Lower Bounds on that Wikipedia page, the formula you appeal to in arguing a lower bound of 8 here is actually [an upper bound](https://en.wikipedia.org/wiki/Selection_algorithm#Lower_bounds). The class notes [by JeffE(?)](http://cstheory.stackexchange.com/users/111/j%C9%9B%EF%AC%80e) that you link to do give lower bounds on comparisons required (Thms. 1-3), but they give lower bound 7 (for finding the 3rd largest of six).2014-07-10
  • 1
    See also [JeffE's Answer and links](http://cstheory.stackexchange.com/a/12287/3161) on a related Theoretical CS Question.2014-07-10
  • 0
    @hardmath: It seems like you are right. The lower bound I state is actually an upper bound on the lower bound! Thanks for pointing that out. I will edit the answer.2014-07-10
6

It looks like the answer is 8.

My Knuth Volume Three (1970s—not as dusty as you think) reports an upper bound of 8, which, paired with Moron's lower bound of 8, ...

The general form of this question is called the Selection Problem. If you google that phrase you will get lots of useful results.

Edit: Knuth doesn't give an explicit algorithm for finding the median of 6 elements in at most 8 steps (at least in the first edition). However, in exercise 12 of section 5.3.3, he does give the explicit method for finding the median of 7 elements using at most 10 comparisons, which may be of some help.

  • 0
    Can you show me the formula or theorem you saw in the book that gives us the upper bound?2010-10-19
  • 0
    In my copy, it is Table 1 in section 5.3.3, page 215.2010-10-19