\myheading{Hacking CRC64 with KLEE}

All timings are when KLEE 3.0 running on AMD Ryzen 5 3600.

\leveldown{}

\renewcommand{\CURPATH}{CRC/KLEE}

\myheading{Buffer alteration case \#1}

Sometimes, you need to alter a piece of data which is \emph{protected} by some kind of checksum or \ac{CRC}, and you can't change checksum or CRC value, but can alter piece of data so that checksum will remain the same.

Let's pretend, we've got a piece of data with ``Hello, world!'' string at the beginning and ``and goodbye'' string at the end.
We can alter 14 characters at the middle, but for some reason, they must be in \emph{a..z} limits, but we can put any characters there.
CRC64 of the whole block must be \TT{0x12345678abcdef12}.

Let's see\footnote{There are several slightly different CRC64 implementations,
the one I use here can also be different from popular ones.}:

\lstinputlisting[style=customc]{\CURPATH/klee_CRC64.c}

We got 3 unit tests, the last one is correct:

\lstinputlisting{\CURPATH/klee_CRC64.txt}

Since our code uses memcmp() standard C/C++ function, we need to add \TT{--libc=uclibc} switch,
so KLEE will use its own uClibc implementation. % \ref{} -> closed programs

Maybe it's slow, but definitely faster than bruteforce.
Indeed, $log_2{26^{14}} \approx 65.8$
which is close to 64 bits.
In other words, one need $\approx 14$ latin characters to encode 64 bits.
And KLEE + \ac{SMT} solver needs 64 bits at some place it can alter to make final CRC64 value equal
to what we defined.

I tried to reduce length of the \emph{middle block} to 13 characters:
no luck for KLEE then, it has no space enough.

\myheading{Buffer alteration case \#2}

I went sadistic: what if the buffer must contain the CRC64 value which,
after calculation of CRC64, will result in the same value?
Fascinatingly, KLEE can solve this.
The buffer will have the following format:

\begin{lstlisting}
Hello, world!.. <8 bytes (64-bit value)> <6 more bytes> ... and goodbye
\end{lstlisting}

\lstinputlisting[caption=Diff]{\CURPATH/klee_CRC64_v2.patch}

It works:

\begin{lstlisting}
% time klee --libc=uclibc --optimize klee_CRC64_v2.bc
...
KLEE: done: total instructions = 16901
KLEE: done: completed paths = 1
KLEE: done: partially completed paths = 32
KLEE: done: generated tests = 3

real    2m4.957s
user    2m4.803s

% ls klee-last/*
klee-last/assembly.ll   klee-last/solver-queries.smt2  klee-last/test000002.ktest
klee-last/info          klee-last/test000001.kquery    klee-last/test000002.user.err
klee-last/messages.txt  klee-last/test000001.ktest     klee-last/test000003.ktest
klee-last/run.istats    klee-last/test000001.user.err  klee-last/warnings.txt
klee-last/run.stats     klee-last/test000002.kquery

% ktest-tool klee-last/test000003.ktest | grep data
object 0: data: b'Hello, world!.. \r\xc6\x98\x14\x05\xd2X\xef\xc2^\r\x01\x1b\xc9 ... and goodbye'
\end{lstlisting}

8 bytes between two strings is 64-bit value which equals to CRC64 of this whole block itself: 0xef58d2051498c60d.
Also, there are 14-8=6 bytes after that 64-bit value which seems random, but it's OK.
Again, it's faster than brute-force way to find it.
If to decrease last spare 6-byte buffer to 4 bytes or less, KLEE works so long so I've stopped it.

\myheading{Recovering input data for given CRC64 value of it}

I've always wanted to do so, but everyone knows this is impossible for input buffers larger than 8 bytes.
As my experiments show, it's still possible for tiny input buffers of data, which is constrained in some way.

The CRC64 value of 13-byte ``undisciplined'' string is known: \TT{0x811265a32d6ac13a}.
KLEE can find this 13-byte string, if it knows that each byte of input buffer is in \emph{a..z} limits:

\lstinputlisting[numbers=left,style=customc]{\CURPATH/klee_CRC64_undisciplined.c_part}

\begin{lstlisting}
% clang -emit-llvm -c -g -O0 -Xclang -disable-O0-optnone -I /tmp/klee_src/include/klee/ klee_CRC64_undisciplined.c
...

% time klee --libc=uclibc --optimize klee_CRC64_undisciplined.bc
...
real    0m13.948s
user    0m13.829s
...

% ls klee-last/*
klee-last/assembly.ll  klee-last/messages.txt  klee-last/run.stats            klee-last/test000001.ktest
klee-last/info         klee-last/run.istats    klee-last/solver-queries.smt2  klee-last/warnings.txt

% ktest-tool klee-last/test000001.ktest | grep data
object 0: data: b'undisciplined'
\end{lstlisting}

Still, it's no magic: if to remove condition at line 30 (i.e., if to relax constraint),
KLEE will produce some other string, which will be still correct for the CRC64 value given.

It works, because 13 Latin characters in \emph{a..z} limits contain $\approx 61$ bits:
$log_2{26^{13}} \approx 61$, which is even smaller value than 64.
In other words, the final CRC64 value holds enough bits to recover $\approx 61$ bits of input.

The input buffer can be even bigger, if each byte of it will be in even tighter
constraints (decimal digits, binary digits, etc).

\myheading{Finding string that results in CRC64(string)==0}

I always was wondering, how to find a string so that result of CRC64 operation will be zero.
Again, this string is constrained: each character is in \emph{a..z} range.

\lstinputlisting[style=customc]{\CURPATH/klee_CRC64_zero.c_part}

\begin{lstlisting}
% ktest-tool klee-last/test000001.ktest | grep data
object 0: data: b'adnwsskbroggni'
\end{lstlisting}

So \verb|CRC64("adnwsskbroggni")==0|, cool.

Buf shorter string (13 character) isn't possible, alas.

Every tool has its limitations, so is KLEE.
I couldn't find such string with characters in \emph{0..9} range.

\levelup{}