5
$\begingroup$

I have an idea for solving the knapsack problem, but it looks too good to be true. I would like someone to explain potential problems with this approach. I'll give an example: I want to find a subset of {2,7,11} which has elements that sum to 13. Here is the algorithm for solving this:

In binary notation 2=0010, 7=0111, 11=1011, 13=1101. Suppose that 2x+7y+11z=13, where x,y,z are {0,1}.

Then y+z=1 (mod 2), since the last bits of 7, 11, and 13 are 1 and the last bit of 2 is 0.

And (y+z-1)+2*(x+y+z)=0 (mod 4), since the second to last bits of 2, 7, and 11 are 1 and the second to last bit of 13 is 0. (The y+z-1 is the carry bit from before.)

And ((y+z-1)+2*(x+y+z)-0)+4y=4 (mod 8), since the third to last bits of 7 and 13 are 1 and the third to last bit of 2 and 11 are 0. (The (y+z-1)+2*(x+y+z)-0 is the carry bit from before.)

And finally ((y+z-1)+2*(x+y+z)-0)+4y-4)+8z=8 (mod 16), since the first bits of 11 and 13 are 1 and the first bit of 2 and 7 are 0. (The ((y+z-1)+2*(x+y+z)-0)+4y-4 is the carry bit from before.)

So we have four linear equations over a finite ring, if we change each one to mod 16.

8y+8z=8 (mod 16)

8x+12y+12z=4 (mod 16)

4x+14y+6z=10 (mod 16)

2x+7y+11z=13 (mod 16)

As we can see, a solution is x=1, y=0, z=1, as one would expect, since the set {2,11} has elements that sum to 13. We could have gotten this through Gaussian elimination (according to answers to my last question on this site).

By request, here is the general algorithm that I just wrote (which can be induced from the above example): We want to solve a_1*x_1 + ... a_n*x_n = b, where each a_j and b is an integer and each x_j is in {0,1}. Putting each a_j and b in binary, we have a_j=(a_{mj} ... a_{1j}) and b=(b_m ... b_1), for suitable m. For instance, a_j=5 in binary is (a_{4j} ... a_{1j})=(0 1 0 1) - highest bit on the left side, lowest on the right side.

Here is the simple algorithm for constructing a new matrix A=(a_{ij}) from this original (a_{ij}) and a new b=(b_i) from this original b:

`Let b_1 = 2^{m-1}*b_1; For j=1 to n Let a_{1j} = 2^{m-1}*a_{1j};

For i=2 to m { Let b_i = 2^{m-1}*b_i + b_{i-1}/2; For j=1 to n Let a_{ij} = 2^{m-1}*a_{ij} + a_{(i-1),j}/2; }`

Next solve the matrix equation Ax=b (mod 2^m), for the new A=(a_{ij}) and b=(b_i). If there is a solution to the knapsack problem, there should be a solution for x in {0,1} in Ax=b (mod 2^m) and vice versa (since the last row is just the original a_j and b). I'm curious to see if it works for large m and n. Anyone want to give it a shot?

What's wrong with this approach? It shouldn't work, but I don't know why.

  • 2
    I suggest you first generalize this method to the general case to see what you get.2010-12-01
  • 0
    I've tried it on other small examples like 2x+5y+6z=7 and 5x+7y+2z=12 and it's worked.2010-12-01
  • 3
    'where x,y,z are binary': Huh? I had to follow the argument for a few more steps before I realised what this meant: 'where x,y,z are all 0 or 1'.2010-12-01
  • 0
    @TonyK, thank you, I changed it.2010-12-01
  • 0
    If I understand this correctly, from {a,b,c} to get d, you have to solve log(d) equations?2010-12-01
  • 3
    @Craig Try to formulate the general algorithm, or at least test it on more than 3 numbers.2010-12-01
  • 1
    Why does it surprise you that "it works"? Knapsack _is_ computable after all! It is, however, thought not to be computable on a Turing machine in polynomial time but you hardly can expect qualified assessments in this direction without giving the general algorithm.2010-12-01
  • 0
    @Raphael I'll give the general algorithm later tonight; however, for now you should be able to generalize from the example. What makes this algorithm is too good to be true is the fact that it works in polynomial-time, since solving these matrix equations takes polynomial-time. So there has to be a catch.2010-12-01
  • 0
    OK, just added the general algorithm.2010-12-02
  • 0
    I think the problem is that your algorithm isn't bounded in the number of terms, but grows based on the size of N.2010-12-02
  • 0
    @Liberalkid The matrix is of size m x n, where n is the number of elements in the original set and a_i>0 and b>0 are less than 2^m. The matrix is of polynomial size as a function of the size of the input. So this algorithm definitely runs in polynomial-time.2010-12-02
  • 0
    And what if the lowest value of m that suffices is 10^5?2010-12-02
  • 0
    There's something strange about your algorithm. The four linear equations you wrote down, in mod 16, are equivalent, so I don't think you can perform Gaussian elimination.2010-12-02

3 Answers 3

9

The problem is that the equations generated are all linearly dependent on the original equation, so finding solutions of the matrix is just as hard as finding solutions to just the original equation. It works in the given example only because the matrix is so small that trial-and-error makes a solution apparent.

To use your example, let $S = \{2,7,11\}$ and $N = 13$. We want to find a subset of $S$ whose elements sum to $N$. Looking at your generated solutions modulo $2^n$, notice that $$2x + 7y + 11z \equiv 13 \pmod{16} \, \Rightarrow \, 4x + 14y + 6z \equiv 10 \pmod{16}$$ $$4x + 14y + 6z \equiv 10 \pmod{16} \, \Rightarrow \, 8x + 12y + 12z \equiv 4 \pmod{16}$$ $$8x + 12y + 12z \equiv 4 \pmod{16} \, \Rightarrow \, 8y + 8z \equiv 8 \pmod{16}$$ Since the reverse implications are not true in general, "simplifying" the equation in this manner does no good, as solving a simpler equation has no effect than to rule out possibilities (and as the problem scales it becomes infeasible to keep track of all remaining possibilities).

  • 0
    Brandon, I think you've solved the puzzle. They are all the same equations!2010-12-02
3

I'm not sure how one sees that finding solutions to the linear system over {0,1} can be accomplished in polynomial time. Reducing the matrix to row-echelon form is not sufficient for finding such solutions. Suppose that the reduced system has a parametrized solution set, as happens in your example. Is there any way, short of plugging in 0 and 1 for the free parameters in all possible ways and then checking the values of the remaining variables, of finding {0,1} solutions? Maybe, but I don't think this is immediately clear. It seems possible that there might be many free parameters, and that there would then be a combinatorial explosion in the number of cases that needed to be checked in order to find {0,1} solutions.

1

When you solve your equations mod 16, you get x, y, and z mod 16. Not mod 2. So all you have is integers x, y, and z between 0 and 15 inclusive such that 2x+7y+11z=13.