\myheading{LZSS decompressor} I've googled for a very simple \ac{LZSS} decompressor and landed at this page:\\ \url{https://web.archive.org/web/20151202051931/http://www.opensource.apple.com/source/boot/boot-132/i386/boot2/lzss.c} Let's pretend, we're looking at unknown compressing algorithm with no compressor available. Will it be possible to reconstruct a compressed piece of data so that decompressor would generate data we need? This is my first experiment: \lstinputlisting{KLEE/klee_lzss1.c} I decreased \TT{COMPRESSED\_LEN} gradually to check, whether KLEE would find compressed piece of data, and it did: \begin{lstlisting} % clang -emit-llvm -c -g -O0 -Xclang -disable-O0-optnone -I /tmp/klee_src/include/klee/ klee_lzss1.c ... % time klee --libc=uclibc --optimize klee_lzss1.bc KLEE: ERROR: klee_lzss1.c:73: invalid klee_assume call (provably false) KLEE: NOTE: now ignoring this error at this location KLEE: ERROR: klee_lzss1.c:46: memory error: out of bound pointer KLEE: NOTE: now ignoring this error at this location KLEE: ERROR: klee_lzss1.c:36: memory error: out of bound pointer KLEE: NOTE: now ignoring this error at this location KLEE: WARNING ONCE: skipping fork (memory cap exceeded) KLEE: WARNING: killing 2660 states (over memory cap: 2110MB) KLEE: WARNING ONCE: calling external: klee_assert(0) at klee_lzss1.c:77 2 KLEE: ERROR: klee_lzss1.c:75: failed external call: klee_assert KLEE: NOTE: now ignoring this error at this location KLEE: done: total instructions = 16853770 KLEE: done: completed paths = 0 KLEE: done: partially completed paths = 184070 KLEE: done: generated tests = 2664 real 6m31.177s user 5m43.183s sys 0m47.280s % ls klee-last/*err* klee-last/test000001.user.err klee-last/test000003.ptr.err klee-last/test000002.ptr.err klee-last/test002664.external.err % ktest-tool klee-last/test002664.ktest ktest file : 'klee-last/test002664.ktest' args : ['klee_lzss1.bc'] num objects: 1 object 0: name: 'input' object 0: size: 15 object 0: data: b'\xffBuffalo \tb\xef\xf4\xee\xf4' object 0: hex : 0xff42756666616c6f200962eff4eef4 object 0: text: .Buffalo .b.... \end{lstlisting} KLEE consumed $\approx 2GB$ of RAM and worked for $\approx 6$ minutes (on AMD Ryzen 5 3600). And here is it, a 15 bytes which, if decompressed by our copypasted algorithm, will result in desired plain text! So how \ac{LZSS} works? Without peeking into Wikipedia, we can say that: if \ac{LZSS} compressor observes some data it already had, it replaces the data with a link to some place in past with size. If it observes something yet unseen, it puts data as is. This is theory. This is indeed what we've got. Desired text is three ``Buffalo'' words, the first and the last are equivalent, but the second is \emph{almost} equivalent, differing with first by one character. That's what we see (better formatted C-string): \begin{lstlisting} "\xff" "Buffalo " "\x09" "b" "\xef\xf4\xee\xf5" \end{lstlisting} Here is some control byte (0xff), ``Buffalo'' word is placed \emph{as is}, then another control byte (0x09), then we see beginning of the second word (``b'') and more control bytes, perhaps, links to the beginning of the buffer. These are command to decompressor, like, in plain English, ``copy data from the buffer we've already done, from that place to that place'', etc. What we've got here is KLEE acting like a LZSS compressor -- very slow, complex and expensive, but compressor. Interesting, is it possible to meddle into this piece of compressed data? Out of whim, can we force KLEE to find a compressed data, where not just ``b'' character has been placed \emph{as is}, but also the second character of the word, i.e., ``bu''? I've modified main() function by adding \TT{klee\_assume()}: now the 11th byte of input (compressed) data (right after ``b'' byte) must have ``u''. I has no luck with 15 byte of compressed data, so I increased it to 16 bytes: \begin{lstlisting} int main() { #define COMPRESSED_LEN 16 uint8_t input[COMPRESSED_LEN]; uint8_t plain[PLAIN_TEXT_LEN]; uint32_t size=COMPRESSED_LEN; klee_make_symbolic(input, sizeof input, "input"); klee_assume(input[11]=='u'); decompress_lzss(plain, input, size); for (int i=0; i