\myheading{Using Z3 theorem prover to prove equivalence of some weird alternative to XOR operation}
\label{weird_XOR}

\leveldown{}

(The test was first published in my blog at April 2015: \url{http://blog.yurichev.com/node/86}).

There is a ``A Hacker's Assistant'' program\footnote{\url{http://www.hackersdelight.org/}} (\emph{Aha!}) written by Henry Warren,
who is also the author of the great ``Hacker's Delight'' book.

The \emph{Aha!} program is essentially \emph{superoptimizer}\footnote{\url{http://en.wikipedia.org/wiki/Superoptimization}},
which blindly brute-force a list of some generic RISC CPU instructions to achieve shortest possible (and jumpless or branch-free) 
CPU code sequence for desired operation.
For example, \emph{Aha!} can find jumpless version of abs() function easily.

Compiler developers use superoptimization to find shortest possible (and/or jumpless) code,
but I tried to do otherwise---to find longest code for some primitive operation.
I tried \emph{Aha!} to find equivalent of basic XOR operation without usage of the actual XOR instruction,
and the most bizarre example \emph{Aha!} gave is:

\begin{lstlisting}
Found a 4-operation program:
   add   r1,ry,rx
   and   r2,ry,rx
   mul   r3,r2,-2
   add   r4,r3,r1
   Expr: (((y & x)*-2) + (y + x))
\end{lstlisting}

And it's hard to say, why/where we can use it, maybe for obfuscation, I'm not sure.
I would call this \emph{suboptimization} (as opposed to \emph{superoptimization}).
Or maybe \emph{superdeoptimization}.

But my another question was also, is it possible to prove that this is correct formula at all?
The \emph{Aha!} checking some input/output values against XOR operation, but of course, not all the possible values.
It is 32-bit code, so it may take very long time to try all possible 32-bit inputs to test it.

We can try Z3 theorem prover for the job. It's called \emph{prover}, after all.

So I wrote this:

\begin{lstlisting}
#!/usr/bin/python
from z3 import *

x = BitVec('x', 32)
y = BitVec('y', 32)
output = BitVec('output', 32)
s = Solver()
s.add(x^y==output)
s.add(((y & x)*0xFFFFFFFE) + (y + x)!=output)
print s.check()
\end{lstlisting}

In plain English language, this means
``are there any case for $x$ and $y$ where $x \oplus y$ doesn't equals to $((y \& x)*-2) + (y + x)$?''
\dots and Z3 prints ``unsat'', meaning, it can't find any counterexample to the equation.
So this \emph{Aha!} result is proved to be working just like XOR operation.

Oh, I also tried to extend the formula to 64 bit:

\begin{lstlisting}
#!/usr/bin/python
from z3 import *

x = BitVec('x', 64)
y = BitVec('y', 64)
output = BitVec('output', 64)
s = Solver()
s.add(x^y==output)
s.add(((y & x)*0xFFFFFFFE) + (y + x)!=output)
print s.check()
\end{lstlisting}

Nope, now it says ``sat'', meaning, Z3 found at least one counterexample.
Oops, it's because I forgot to extend -2 number to 64-bit value:

\begin{lstlisting}
#!/usr/bin/python
from z3 import *

x = BitVec('x', 64)
y = BitVec('y', 64)
output = BitVec('output', 64)
s = Solver()
s.add(x^y==output)
s.add(((y & x)*0xFFFFFFFFFFFFFFFE) + (y + x)!=output)
print s.check()
\end{lstlisting}

Now it says ``unsat'', so the formula given by \emph{Aha!} works for 64-bit code as well.

\myheading{In SMT-LIB form}

Now we can rephrase our expression to more suitable form: $(x + y - ((x \& y)<<1))$.
It also works well in Z3Py:

\begin{lstlisting}
#!/usr/bin/python
from z3 import *

x = BitVec('x', 64)
y = BitVec('y', 64)
output = BitVec('output', 64)
s = Solver()
s.add(x^y==output)
s.add((x + y - ((x & y)<<1)) != output)
print s.check()
\end{lstlisting}

Here is how to define it in SMT-LIB way:

\begin{lstlisting}
(declare-const x (_ BitVec 64))
(declare-const y (_ BitVec 64))
(assert 
	(not
		(=
			(bvsub
				(bvadd x y)
				(bvshl (bvand x y) (_ bv1 64)))
			(bvxor x y)
		)
	)
)
(check-sat)
\end{lstlisting}

\myheading{Using universal quantifier}

Z3 supports universal quantifier \TT{exists}, which is true
if at least one set of variables satistfied underlying condition:

\begin{lstlisting}
(declare-const x (_ BitVec 64))
(declare-const y (_ BitVec 64))
(assert 
	(exists ((x (_ BitVec 64)) (y (_ BitVec 64)))
		(not (=
			(bvsub 
				(bvadd x y)
				(bvshl (bvand x y) (_ bv1 64))
			)
			(bvxor x y)
		))
	)
)
(check-sat)
\end{lstlisting}

It returns ``unsat'', meaning, Z3 couldn't find any counterexample of the equation, i.e., it's not exist.\\
\\
This is also known as $\exists$ in mathematical logic lingo.\\
\\
Z3 also supports universal quantifier \TT{forall}, which is true if the equation is true for all
possible values.
So we can rewrite our SMT-LIB example as:

\begin{lstlisting}
(declare-const x (_ BitVec 64))
(declare-const y (_ BitVec 64))
(assert 
	(forall ((x (_ BitVec 64)) (y (_ BitVec 64)))
		(=
			(bvsub 
				(bvadd x y)
				(bvshl (bvand x y) (_ bv1 64))
			)
			(bvxor x y)
		)
	)
)
(check-sat)
\end{lstlisting}

It returns ``sat'', meaning, the equation is correct for all possible 64-bit \TT{x} and \TT{y} values,
like them all were checked.

Mathematically speaking: $\forall n\!\in\!\mathbb{N}\; (x \oplus y = (x + y - ((x \& y)<<1)))$
\footnote{
$\forall$ means \emph{equation must be true for all possible values}, which are chosen from natural numbers ($\mathbb{N}$).}

\myheading{How the expression works}

First of all, binary addition can be viewed as binary XORing with carrying (\ref{adder}).
Here is an example: let's add 2 (10b) and 2 (10b).
XORing these two values resulting 0, but there is a carry generated during addition of two second bits.
That carry bit is propagated further and settles at the place of the 3rd bit: 100b.
4 (100b) is hence a final result of addition.

If the carry bits are not generated during addition, the addition operation is merely XORing.
For example, let's add 1 (1b) and 2 (10b). $1 + 2$ equals to 3, but $1 \oplus 2$ is also 3.

If the addition is XORing plus carry generation and application, we should eliminate effect of carrying somehow here.
The first part of the expression ($x + y$) is addition, the second ($(x \& y)<<1$) is just calculation of every carry bit which was used during addition.
If to subtract carry bits from the result of addition, the only XOR effect is left then.

It's hard to say how Z3 proves this:
maybe it just simplifies the equation down to single XOR using simple Boolean algebra rewriting rules?

\levelup{}