\myheading{Amateur cryptography} \leveldown{} (Another term is \textit{Low-Complexity Cryptography}.) This is what you can find in serial numbers, license keys, executable file packers, \ac{CTF}, malware, etc. Sometimes even in ransomware (though rarely nowadays, in 2017-2021). Java obfuscators encrypt text strings using very simple algorithms. Amateur cryptography can often be broken using an SMT solver, or even KLEE. Amateur cryptography is usually based not on theory, but on visual complexity: if its creator gets results that seem chaotic enough, he or she often stops improving it further. This is security based not even on obscurity, but on a chaotic mess. This is also sometimes called ``The Fallacy of Complex Manipulation'' [see \href{https://tools.ietf.org/html/rfc4086}{RFC4086}]. Devising your own cryptoalgorithm is a very tricky thing to do. It can be compared to devising your own \ac{PRNG}. Even the famous Donald Knuth constructed one in 1959; it was visually very complex, but, as it turned out in practice, it had a very short cycle of length 3178 [see \ac{TAOCP} vol.II page 4, (1997)]. The very first problem is that making an algorithm which can generate very long expressions is a tricky thing in itself. A common mistake is to use operations like XOR and rotations/permutations, which can't help much. Even worse: some people think that XORing a value several times is better, like: $(x \oplus 1234) \oplus 5678$. Obviously, these two XOR operations (or, more precisely, any number of them) can be reduced to a single one. The same is true of operations like addition and subtraction---any chain of them can also be reduced to a single operation. Real cryptoalgorithms, like IDEA, use several operations from different groups, like XOR, addition and multiplication. Applying them all in a specific order makes the resulting expression irreducible.
When I prepared this article, I tried to make an example of such an amateur hash function: \lstinputlisting[style=customc]{crypto/1.c} KLEE can break it with little effort. Functions of such complexity are common in shareware that checks license keys, etc. Here is how we can make its work harder: by making the rotations dependent on the input, which makes the number of possible inputs much, much bigger: \lstinputlisting[style=customc]{crypto/2.c} Addition (or \textit{modular addition}\footnote{Read more about it in \MathForProg{}}, as cryptographers say) can make things even harder: \lstinputlisting[style=customc]{crypto/3.c} \textit{Heavy operations} for SAT/SMT are shifts/rotates by a variable, division and remainder. \textit{Easy operations}: shifts/rotates by a constant, bit twiddling. As an exercise, you can try to make a block cipher which KLEE wouldn't break. This is a quite sobering experience. Another exercise is to make an algorithm that converts plain text in English or your native language into a block of the same length, but with the highest possible entropy. It's not an easy task. (Data compression is cheating.) An easy way to test your algorithm: encrypt numbers starting at 0 (as in Counter/CTR encryption mode) and feed the resulting ciphertext to a \emph{diehard test}\footnote{\url{https://en.wikipedia.org/wiki/Diehard_tests}}. These tests are designed to check \ac{PRNG}s. In other words, these tests shouldn't find any regularities in a list of random numbers, nor in a ciphertext. Summary: if you deal with amateur cryptography, you can always give KLEE and an SMT solver a try. Even more: sometimes you have only the decryption function, and if the algorithm is simple enough, KLEE or an SMT solver can reverse things back. If a SAT/SMT solver can find a key faster than brute force, this is usually a very bad symptom for the algorithm.
One amusing thing to mention: if you try to implement an amateur cryptoalgorithm in the Verilog/VHDL language to run it on an \ac{FPGA}, maybe in a brute-force way, you may find that \ac{EDA} tools can optimize many things during synthesis (this is the word they use for ``compilation'') and can leave the cryptoalgorithm much smaller/simpler than it was. Even if you try to implement the \ac{DES} algorithm \emph{in bare metal} with a fixed key, Altera Quartus can optimize the first round of DES and make it slightly smaller/faster than the others. \myheading{Avalanche effect} A significant property of \textit{serious} cryptography is: ``The two inputs differ by only a single bit, but approximately half the bits are different in the digests.'' [cited from Alan A. A. Donovan, Brian W. Kernighan --- The Go Programming Language]. This is also known as the \emph{avalanche effect} in cryptography. Let's see how two SHA256 hashes differ if the input strings differ in one bit: \begin{lstlisting}[caption=*NIX shell] % echo hello0 | sha256sum aa36f6c221b2295332d8de275e4b9e1fd2383a8ca3f2598022958b82a1f85ce0 - % echo hello1 | sha256sum 8d678c54ff53cac1efd6d62697aeb5a005491cd04b83228fd8411ba9f02b98e7 - \end{lstlisting} \begin{lstlisting}[caption=Wolfram Mathematica] In[]:= a=16^^aa36f6c221b2295332d8de275e4b9e1fd2383a8ca3f2598022958b82a1f85ce0 Out[]= 76990297064032879667747642138410318794708739067059518462895617345038449138912 In[]:= b=16^^8d678c54ff53cac1efd6d62697aeb5a005491cd04b83228fd8411ba9f02b98e7 Out[]= 63959065433925909480062682925690708784612992082320075146058498580507093866727 In[]:= diff=BitXor[a,b] Out[]= 17784161787513000877216087191171407793188241648243295249170496887068053193735 In[]:= DigitCount[diff, 2, 1] Out[]= 130 In[]:= 256/130//N Out[]= 1.96923 \end{lstlisting} Almost half (130 bits out of 256). \myheading{Bugs} Another prominent feature of amateur cryptography is bugs.
Bugs here are often left uncaught because the output of the encrypting function visually looked ``good enough'' or ``obfuscated enough'', so the developer stopped working on it. This is especially true of hash functions: when you work on a block cipher, you have to write two functions (encryption/decryption), while a hash function is a single one. The weirdest amateur encryption algorithm I ever saw encrypted only the odd bytes of the input block, while the even bytes were left untouched, so the input plain text was partially visible in the resulting encrypted block. It was the encryption routine used in a license key validation. It is hard to believe someone did this on purpose; most likely, it was just an unnoticed bug. It is also not uncommon for both the encryption and decryption functions to lie side by side in the code: probably a programmer just wrote the two functions, and they reside in the same source code file/object file. So if you see a license decryption function in code, try to inspect the functions near it. Another common bug is when encryption key(s) are present in the executable file---even (RSA) private keys that can be used to sign licenses. \myheading{XOR ciphers} The simplest possible amateur cryptography is just the application of the XOR operation using some kind of table. Maybe even simpler. This is a real algorithm I once saw: \begin{lstlisting} for (i=0; i