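What is wrong with this argument that appears to contradict a paper on AND-compression of SAT?

I obtained a simple construction that appears to contradict a paper whose result assumes a plausible conjecture.

Since the conjecture is unlikely to be false, what is wrong with the argument?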

An AND-compression is a deterministic polynomial-time algorithm that maps a set of SAT instances $x_1,\dots,x_t$ to a single SAT instance $y$ of size $\text{poly}(\max_i |x_i|)$ such that $y$ is satisfiable if and only if all $x_i$ are satisfiable. ... Unless the unlikely complexity-theoretic collapse $\sf coNP \subseteq NP/poly$ occurs, there is no AND-compression for SAT.

Construction:

If the $x_i$ are not in CNF, convert them to CNF, possibly adding new variables. This takes polynomial time.

In CNF one can encode an AND gate $C := A \land B$ and an OR gate $C := A \lor B$ .

These gate encodings have the property that every satisfying assignment of their CNFs satisfies $C \iff (A \land B)$ and $C \iff (A \lor B)$ , respectively.

Let the $j$ -th clause in $x_i$ be $x_{i,j} = [y_1, \ldots, y_k]$ for literals $y_1, \ldots, y_k$ .

Using the OR gate and new variables, compute the variable $O_{i,j} := y_1 \lor y_2 \lor \cdots \lor y_k$ .

Using the AND gate over the clause variables $O_{i,j}$ of $x_i$ , compute the variable $V_i := O_{i,1} \land O_{i,2} \land \cdots \land O_{i,m}$ , where $m$ is the number of clauses in $x_i$ .

By construction $x_i \iff V_i$ .

For all $V_i$ , using the AND gate compute $F := V_1 \land \cdots \land V_t$ .

$F \iff x_1 \land \cdots \land x_t$ .

So the final formula $y$ is the union of the CNFs for $O_{i,j}$ , $V_i$ , $F$ , and a unit clause $[F]$ .

$|y|$ is linear in the total number of literals and $t$ is polynomial in $\max |x_i|$ , which makes $|y|$ polynomial in $\max |x_i|$ .

This appears to contradict the claim in the paper, unless the stated collapse happens.

What is wrong with this argument that appears to contradict the claim in the paper?

A similar construction works for OR-compression, where at least one $x_i$ must be satisfiable.

The newly introduced variables are uniquely determined by the original variables.

OR gate in  CNF 3 := 1 \/ 2 : [[1 2 -3],[-1 3],[-2 3]]
AND gate in CNF 3 := 1 /\ 2 : [[-1 -2 3],[1 -3],[2 -3]]
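
The two gate encodings above can be checked by brute force. The following is a minimal sketch, assuming the clauses are read as DIMACS-style signed integers (variable 3 is the gate output, 1 and 2 are the inputs); the helper names `satisfies` and `gate_models` are made up for this example.

```python
from itertools import product

# Gate encodings copied from the question (signed integers = literals).
OR_GATE = [[1, 2, -3], [-1, 3], [-2, 3]]
AND_GATE = [[-1, -2, 3], [1, -3], [2, -3]]


def satisfies(cnf, assignment):
    """Return True if `assignment` (dict var -> bool) satisfies every clause."""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in cnf
    )


def gate_models(cnf):
    """Enumerate all satisfying assignments (a, b, c) of a 3-variable gate CNF."""
    models = []
    for a, b, c in product([False, True], repeat=3):
        if satisfies(cnf, {1: a, 2: b, 3: c}):
            models.append((a, b, c))
    return models


or_models = gate_models(OR_GATE)
and_models = gate_models(AND_GATE)

# Each gate should have exactly one model per input pair, and in every
# model the output must equal the OR (resp. AND) of the two inputs.
or_ok = len(or_models) == 4 and all(c == (a or b) for a, b, c in or_models)
and_ok = len(and_models) == 4 and all(c == (a and b) for a, b, c in and_models)
print(or_ok, and_ok)
```

If both checks hold, the output variable of each gate is uniquely determined by its inputs, which is the property the construction relies on.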


— elluser

You can't compute $O_{i,j}$ (and hence $V_i$ ); this would imply that you already know the solution to the SAT instances.

— Luke Mathieson

@LukeMathieson This works in practice with the explicit gates, it is computed symbolically on the literals. To compute $T := x \lor y \lor z$ , using the OR gate compute $t_1 := OR(y,z)$ , $T := OR(x,t_1)$ .

— elluser

@user114872 If they're not using the values, how are they then forced by the truth assignment? I can get behind $x_i$ is satisfiable $\Rightarrow$ $V_i$ is satisfiable, but I don't see why you get the reverse, the variables $O_{i,j}$ can all just be set to true.

— Luke Mathieson

@LukeMathieson I am not constructing a solution, I am constructing the CNF $y$ (working with the literals symbolically). Intuitively, convert the CNF $x_i$ to a circuit with a gate $o_i \iff x_i$ . Add AND gates between all $o_i$ . Convert the final circuit to CNF, forcing the final gate to TRUE. This is polynomial too.

— elluser

@user114872 I understand that you're constructing a CNF formula, what I'm suggesting is that it doesn't obey the condition that $F$ is satisfiable if and only if all $x_i$ s are satisfiable - $F$ is satisfiable just by setting all the $O_{i,j}$ variables to true. This tells us nothing about the instances $x_i$ .

— Luke Mathieson

Answers:

The confusion arises from a misunderstanding of what being polynomial in the size of the largest instance means. It does not mean that polynomial growth of the compressor's output is allowed as the number of instances ( $t$ ) increases. Rather it means the compressor's output is allowed to grow only as the maximum instance size grows, independent of $t$ , and then only within a fixed polynomial of that value.

Your compressor concatenates the CNF instances, converts them to circuits, and then converts the circuits back to CNF. Its output grows linearly with the number of instances ( $t$ ) independent of the instance sizes and is therefore bound to exceed any polynomial function of the maximum instance size. Because of this, indeed because it does not do any compression at all, your compressor does not cause any hierarchy collapse.
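
As a rough illustration of that linear growth, the sketch below builds the questioner's gate construction for $t$ copies of one fixed small instance and counts the clauses of the output. The function names (`or_gate`, `and_gate`, `chain`, `combine`), the sample instance, the use of clause count as the size measure, and the renaming of each instance's variables apart are all assumptions made for this example.

```python
def or_gate(a, b, out):
    """CNF clauses forcing out <-> (a OR b), for literals a, b."""
    return [[a, b, -out], [-a, out], [-b, out]]


def and_gate(a, b, out):
    """CNF clauses forcing out <-> (a AND b), for literals a, b."""
    return [[-a, -b, out], [a, -out], [b, -out]]


def chain(gate, inputs, fresh):
    """Fold `inputs` with a binary gate; return (output literal, clauses)."""
    clauses = []
    acc = inputs[0]
    for lit in inputs[1:]:
        out = fresh()
        clauses += gate(acc, lit, out)
        acc = out
    return acc, clauses


def combine(instances):
    """Build y from a list of CNFs (lists of clauses of signed integers)."""
    next_var = 0

    def fresh():
        nonlocal next_var
        next_var += 1
        return next_var

    clauses, v_lits = [], []
    for cnf in instances:
        rename = {}  # assumption: every instance gets its own variables

        def lit_of(lit):
            var = abs(lit)
            if var not in rename:
                rename[var] = fresh()
            return rename[var] if lit > 0 else -rename[var]

        o_lits = []
        for clause in cnf:
            out, cs = chain(or_gate, [lit_of(l) for l in clause], fresh)
            clauses += cs
            o_lits.append(out)          # O_{i,j}
        v, cs = chain(and_gate, o_lits, fresh)
        clauses += cs
        v_lits.append(v)                # V_i
    f, cs = chain(and_gate, v_lits, fresh)
    clauses += cs
    clauses.append([f])                 # unit clause [F]
    return clauses


sample = [[1, 2, 3], [-1, -2], [2, -3]]   # max |x_i| stays fixed
sizes = {t: len(combine([sample] * t)) for t in (1, 2, 4, 8)}
print(sizes)
```

With the instance size held constant, the clause count in `sizes` keeps growing with $t$, so no bound that depends only on $\max |x_i|$ can hold for all $t$.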

There also seems to be a misunderstanding as to what the AND-compressor function is supposed to do. It does not have to output an instance that preserves all satisfying assignments for all the input instances, nor does there need to be a parsimonious counting reduction from the input instances to the output instance. A conforming implementation could at the extreme output an empty instance if all the input instances are satisfiable and a single empty clause otherwise.

A more plausible AND-compressor might adapt standard data compression techniques such as the use of a dictionary. A table of unsatisfiable subformulas could be precomputed and used to search against the input SAT instances. A match would imply a forbidden variable assignment, which when instantiated as a learned clause might subsume (eliminate) several other clauses.

Another technique involves taking advantage of the different sizes of the input instances. If the input instances are of different sizes, the larger instances could be searched for subformulas equivalent to one of the smaller instances. Such a match implies the existence of a variable assignment that could produce the subformula, and therefore the larger instance is satisfiable if the smaller instance is satisfiable. If the smaller instance is unsatisfiable, the larger instance might still be satisfiable but it doesn't matter because the compressor output should still produce an unsatisfiable formula. So the output instance is effectively decoupled from the satisfiability of the larger instance and thus the larger instance no longer has to be represented in the output at all, eliminating a large number of clauses.

There are many other techniques. The theorem in the paper suggests that efficiently achieving high levels of compression is unlikely given the implied partial collapse of the polynomial hierarchy.

— Kyle Jones

Thank you, but this confuses me even more. Edited the question basically with this: Take $t = 2^n, \max |x_i|=n^2$ . As $n$ tends to infinity, $t$ is not polynomial in $\max |x_i|$ . Any operation depending on all $x_i$ is exponential, so in this case it appears to me that a polynomial solution independent of $t$ can't exist, for the simple reason that the size of the input is exponential in $\max |x_i|$ .

— elluser

You've mixed constraints. The compressor has to run in time polynomial in the size of the input, but the size of its output has to be polynomial in the size of the largest single input instance. The output size requirement has no bearing on how much time you're allowed to read and process the input.
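
To make the two separate constraints concrete with the numbers from the comment above ( $t = 2^n$ , $\max_i |x_i| = n^2$ ), they can be written side by side. The symbol $N$ for the total input length is introduced here only for illustration:

$$N = \sum_{i=1}^{t} |x_i| \le t \cdot \max_i |x_i| = 2^n n^2, \qquad \text{running time} \le \text{poly}(N), \qquad |y| \le \text{poly}(\max_i |x_i|) = \text{poly}(n).$$

So the running time may be exponential in $n$ because the input itself is that long, while the output must still stay polynomial in $n$ .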

— Kyle Jones

You mean the running time can be arbitrary (not polynomial), and only the output $y$ has to be polynomial?

— elluser

No, the running time has to be polynomial in the size of the input. As the input grows larger, so does the worst-case running time, but it stays bounded by the same polynomial in the size of the input.

— Kyle Jones

The size of the input is all SAT instances and is $\sum_{i=1}^t |x_i|$ ?

— elluser

Re Edit 3:

$\langle F, T \rangle$ is a satisfying assignment for the instance " $(\lnot x_0) \land x_1$ ".
$\langle T, F \rangle$ is a satisfying assignment for the instance " $x_0 \land \lnot x_1$ ".
The instances " $(\lnot x_0) \land x_1$ " and " $x_0 \land \lnot x_1$ " are both satisfiable.
The instance " $(\lnot x_0) \land x_1$ " is satisfiable iff the instance " $x_0 \land \lnot x_1$ " is satisfiable.
$\langle F, T \rangle$ is not a satisfying assignment for the instance " $x_0 \land \lnot x_1$ ".
$\langle F, T \rangle$ is a satisfying assignment for the instance " $(\lnot x_0) \land x_1$ " but not for the instance " $x_0 \land \lnot x_1$ ".
Thus the transformation from the instance " $(\lnot x_0) \land x_1$ " to the instance " $x_0 \land \lnot x_1$ " does not preserve all satisfying assignments of the input instance, but is such that " $(\lnot x_0) \land x_1$ " is satisfiable iff " $x_0 \land \lnot x_1$ " is satisfiable.
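
The claims above about the two small instances can be confirmed by enumerating all four assignments. This is only an illustrative sketch; the names `inst_a`, `inst_b` and `models` are invented for it, and assignments are written as pairs $(x_0, x_1)$ .

```python
from itertools import product


def inst_a(x0, x1):
    """The instance (not x0) and x1."""
    return (not x0) and x1


def inst_b(x0, x1):
    """The instance x0 and (not x1)."""
    return x0 and (not x1)


def models(formula):
    """Set of all (x0, x1) assignments that satisfy `formula`."""
    return {bits for bits in product([False, True], repeat=2) if formula(*bits)}


models_a = models(inst_a)
models_b = models(inst_b)

# Both instances are satisfiable, so "a is satisfiable iff b is satisfiable"
# holds, even though they have no satisfying assignment in common.
both_satisfiable = bool(models_a) and bool(models_b)
shared = models_a & models_b
print(models_a, models_b, both_satisfiable, shared)
```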

Here is a not-necessarily-efficient version of an AND-compression algorithm for SAT:
Determine whether or not all of the input SAT-instances are satisfiable. If they all are, then output the single SAT-instance " $x_0$ ", else output the single SAT-instance " $x_0 \land \lnot x_0$ ".
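
A direct, exponential-time rendering of that algorithm might look like the sketch below. It assumes instances are CNFs given as lists of clauses of signed integers, and it uses variable 1 to stand for $x_0$ because that encoding cannot name a variable 0; `is_satisfiable` and `trivial_and_compress` are hypothetical names.

```python
from itertools import product


def is_satisfiable(cnf):
    """Brute-force SAT check: try every assignment (exponential time)."""
    variables = sorted({abs(lit) for clause in cnf for lit in clause})
    for bits in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, bits))
        if all(
            any(assignment[abs(lit)] == (lit > 0) for lit in clause)
            for clause in cnf
        ):
            return True
    return False


def trivial_and_compress(instances):
    """Output a constant-size instance that is satisfiable iff all inputs are."""
    if all(is_satisfiable(cnf) for cnf in instances):
        return [[1]]          # the single-variable instance "x_0"
    return [[1], [-1]]        # the contradiction "x_0 and not x_0"


all_sat = trivial_and_compress([[[1, 2]], [[-1], [2]]])
one_unsat = trivial_and_compress([[[1, 2]], [[1], [-1]]])
print(all_sat, one_unsat)
```

The output size is constant, which meets the size requirement trivially; what the sketch lacks is the polynomial running time, since it decides satisfiability by exhaustive search.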

Thank you, this helped. Are these compressible: take $t$ (as large as possible) instances of the same size on disjoint variables. The instances are state-of-the-art random 3-SAT. Since they are on disjoint variables, circuit minimization doesn't work. One can't compress every single SAT instance, since this will solve it by sufficient calls to the minimization.

— elluser

Those are compressible (though not necessarily efficiently); just use the algorithm that I gave.

The only relation between this problem and circuit minimization that I see is the fact that, depending on whether or not circuits are required to actually map each input to a node, SAT may be efficiently reducible to circuit minimization.

"since this will solve" what "by sufficient calls to the minimization"?

One can compress every single SAT instance (though not necessarily efficiently); just replace "all of the input SAT-instances are satisfiable. If they all are" in my answer with "the input SAT-instance is satisfiable. If it is".

I mean efficient minimization/compression. Certainly solving gives O(1) compression.

— elluser

The disjointness of the variables is irrelevant.

I suspect that there are no known non-trivial algorithms for AND-compression of such distributions and no known (other) seemingly unlikely consequences of the existence of an efficient AND-compression algorithm for such distributions.

The last things that used to confuse me were the bounds on $t$ and the exact definition of polynomial.

The paper "Infeasibility of instance compression and succinct PCPs for NP" helped a lot.

It is about OR-compression, but its explanation of the bounds carries over to AND-compression.

In the notation of the original paper $t=m$ and $\max |x_i|=n$ .

The paper explains:

" $n$ possibly can be much less than $m$

(the algorithm) runs in time polynomial in $m$ and $n$ "

— elluser