\myheading{Simple examples}

Let's start with the simplest example:

\begin{lstlisting}
mov rax, rdi
imul rax, rsi
\end{lstlisting}

At the start, these symbols are assigned to the registers: RAX=initial\_RAX, RBX=initial\_RBX, RDI=arg1, RSI=arg2, RDX=arg3, RCX=arg4.

When we handle the MOV instruction, we just copy the expression from RDI to RAX.
When we handle the IMUL instruction, we create a new expression that multiplies the expressions from RAX and RSI, and put the result into RAX again.

I can feed this to the decompiler and we will see how the registers' state changes during processing:

\begin{lstlisting}
python td.py --show-registers --python-expr tests/mul.s

...

line=[mov rax, rdi]
rcx=('EXPR_SYMBOL', 'arg4')
rsi=('EXPR_SYMBOL', 'arg2')
rbx=('EXPR_SYMBOL', 'initial_RBX')
rdx=('EXPR_SYMBOL', 'arg3')
rdi=('EXPR_SYMBOL', 'arg1')
rax=('EXPR_SYMBOL', 'arg1')

line=[imul rax, rsi]
rcx=('EXPR_SYMBOL', 'arg4')
rsi=('EXPR_SYMBOL', 'arg2')
rbx=('EXPR_SYMBOL', 'initial_RBX')
rdx=('EXPR_SYMBOL', 'arg3')
rdi=('EXPR_SYMBOL', 'arg1')
rax=('EXPR_OP', '*', ('EXPR_SYMBOL', 'arg1'), ('EXPR_SYMBOL', 'arg2'))

...

result=('EXPR_OP', '*', ('EXPR_SYMBOL', 'arg1'), ('EXPR_SYMBOL', 'arg2'))
\end{lstlisting}

The IMUL instruction is mapped to the ``*'' string, and then a new expression is constructed in \TT{handle\_binary\_op()}, which puts the result into RAX.

In this output, the data structures are dumped using Python's \TT{str()} function, which produces mostly the same text as \TT{print()}.

The output is bulky, so we can turn off the Python expression output and see how this internal data structure can be rendered neatly using our internal \TT{expr\_to\_string()} function:

\begin{lstlisting}
python td.py --show-registers tests/mul.s

...

line=[mov rax, rdi]
rcx=arg4
rsi=arg2
rbx=initial_RBX
rdx=arg3
rdi=arg1
rax=arg1

line=[imul rax, rsi]
rcx=arg4
rsi=arg2
rbx=initial_RBX
rdx=arg3
rdi=arg1
rax=(arg1 * arg2)

...

result=(arg1 * arg2)
\end{lstlisting}

A slightly more advanced example:

\begin{lstlisting}
imul rdi, rsi
lea rax, [rdi+rdx]
\end{lstlisting}

The LEA instruction is treated just like ADD.

\begin{lstlisting}
python td.py --show-registers --python-expr tests/mul_add.s

...

line=[imul rdi, rsi]
rcx=('EXPR_SYMBOL', 'arg4')
rsi=('EXPR_SYMBOL', 'arg2')
rbx=('EXPR_SYMBOL', 'initial_RBX')
rdx=('EXPR_SYMBOL', 'arg3')
rdi=('EXPR_OP', '*', ('EXPR_SYMBOL', 'arg1'), ('EXPR_SYMBOL', 'arg2'))
rax=('EXPR_SYMBOL', 'initial_RAX')

line=[lea rax, [rdi+rdx]]
rcx=('EXPR_SYMBOL', 'arg4')
rsi=('EXPR_SYMBOL', 'arg2')
rbx=('EXPR_SYMBOL', 'initial_RBX')
rdx=('EXPR_SYMBOL', 'arg3')
rdi=('EXPR_OP', '*', ('EXPR_SYMBOL', 'arg1'), ('EXPR_SYMBOL', 'arg2'))
rax=('EXPR_OP', '+', ('EXPR_OP', '*', ('EXPR_SYMBOL', 'arg1'), ('EXPR_SYMBOL', 'arg2')), ('EXPR_SYMBOL', 'arg3'))

...

result=('EXPR_OP', '+', ('EXPR_OP', '*', ('EXPR_SYMBOL', 'arg1'), ('EXPR_SYMBOL', 'arg2')), ('EXPR_SYMBOL', 'arg3'))
\end{lstlisting}

And again, let's see this expression dumped neatly:

\begin{lstlisting}
python td.py --show-registers tests/mul_add.s

...

result=((arg1 * arg2) + arg3)
\end{lstlisting}

Now another example, where we use two input arguments:

\begin{lstlisting}
imul rdi, rdi, 1234
imul rsi, rsi, 5678
lea rax, [rdi+rsi]
\end{lstlisting}

\begin{lstlisting}
python td.py --show-registers --python-expr tests/mul_add3.s

...

line=[imul rdi, rdi, 1234]
rcx=('EXPR_SYMBOL', 'arg4')
rsi=('EXPR_SYMBOL', 'arg2')
rbx=('EXPR_SYMBOL', 'initial_RBX')
rdx=('EXPR_SYMBOL', 'arg3')
rdi=('EXPR_OP', '*', ('EXPR_SYMBOL', 'arg1'), ('EXPR_VALUE', 1234))
rax=('EXPR_SYMBOL', 'initial_RAX')

line=[imul rsi, rsi, 5678]
rcx=('EXPR_SYMBOL', 'arg4')
rsi=('EXPR_OP', '*', ('EXPR_SYMBOL', 'arg2'), ('EXPR_VALUE', 5678))
rbx=('EXPR_SYMBOL', 'initial_RBX')
rdx=('EXPR_SYMBOL', 'arg3')
rdi=('EXPR_OP', '*', ('EXPR_SYMBOL', 'arg1'), ('EXPR_VALUE', 1234))
rax=('EXPR_SYMBOL', 'initial_RAX')

line=[lea rax, [rdi+rsi]]
rcx=('EXPR_SYMBOL', 'arg4')
rsi=('EXPR_OP', '*', ('EXPR_SYMBOL', 'arg2'), ('EXPR_VALUE', 5678))
rbx=('EXPR_SYMBOL', 'initial_RBX')
rdx=('EXPR_SYMBOL', 'arg3')
rdi=('EXPR_OP', '*', ('EXPR_SYMBOL', 'arg1'), ('EXPR_VALUE', 1234))
rax=('EXPR_OP', '+', ('EXPR_OP', '*', ('EXPR_SYMBOL', 'arg1'), ('EXPR_VALUE', 1234)), ('EXPR_OP', '*', ('EXPR_SYMBOL', 'arg2'), ('EXPR_VALUE', 5678)))

...

result=('EXPR_OP', '+', ('EXPR_OP', '*', ('EXPR_SYMBOL', 'arg1'), ('EXPR_VALUE', 1234)), ('EXPR_OP', '*', ('EXPR_SYMBOL', 'arg2'), ('EXPR_VALUE', 5678)))
\end{lstlisting}

\dots and now the neat output:

\begin{lstlisting}
python td.py --show-registers tests/mul_add3.s

...

result=((arg1 * 1234) + (arg2 * 5678))
\end{lstlisting}

Now a Fahrenheit-to-Celsius conversion program:

\begin{lstlisting}
mov rax, rdi
sub rax, 32
imul rax, 5
mov rbx, 9
idiv rbx
\end{lstlisting}

You can see how the registers' state changes during execution (or rather, parsing).
Raw:

\lstinputlisting{\CURPATH/fahr_raw.txt}

Neat:

\lstinputlisting{\CURPATH/fahr_neat.txt}

It is interesting to note that the IDIV instruction also calculates the remainder of the division and places it into the RDX register.
It's not used here, but it is available for use.
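To make the tuple representation from the dumps above more concrete, here is a minimal sketch of how such expressions could be built and rendered. Only the tuple layout (\TT{EXPR\_SYMBOL}, \TT{EXPR\_VALUE}, \TT{EXPR\_OP}) and the names \TT{create\_binary\_expr()} and \TT{expr\_to\_string()} come from the text; their bodies here are my guesses, and \TT{create\_symbol\_expr()} and \TT{create\_val\_expr()} are hypothetical helper names. The output shown is that of this sketch, not of \TT{td.py}.

```python
# A sketch only: these bodies are assumptions, not the decompiler's real code.
# The tuple layout is taken from the register dumps shown earlier.

def create_symbol_expr(name):          # hypothetical helper name
    return ("EXPR_SYMBOL", name)

def create_val_expr(value):            # hypothetical helper name
    return ("EXPR_VALUE", value)

def create_binary_expr(op, left, right):
    return ("EXPR_OP", op, left, right)

def expr_to_string(expr):
    """Render an expression tuple as a fully parenthesized string."""
    kind = expr[0]
    if kind == "EXPR_SYMBOL":
        return expr[1]
    if kind == "EXPR_VALUE":
        return str(expr[1])
    if kind == "EXPR_OP":
        _, op, left, right = expr
        return "(" + expr_to_string(left) + " " + op + " " + expr_to_string(right) + ")"
    raise ValueError("unknown expression kind: %r" % (kind,))

# Replay the conversion program by hand:
rax = create_symbol_expr("arg1")                         # mov rax, rdi
rax = create_binary_expr("-", rax, create_val_expr(32))  # sub rax, 32
rax = create_binary_expr("*", rax, create_val_expr(5))   # imul rax, 5
rbx = create_val_expr(9)                                 # mov rbx, 9
quotient = create_binary_expr("/", rax, rbx)             # idiv rbx: quotient
remainder = create_binary_expr("%", rax, rbx)            # idiv rbx: remainder

print(expr_to_string(quotient))   # (((arg1 - 32) * 5) / 9)
print(expr_to_string(remainder))  # (((arg1 - 32) * 5) % 9)
```

Plain tuples keep the sketch short and make the raw dump trivial to print, which is presumably why the raw output above looks the way it does.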
This is how the quotient and remainder are stored in registers:

\begin{lstlisting}
def handle_unary_DIV_IDIV (registers, op1):
    op1_expr=register_or_number_in_string_to_expr (registers, op1)
    # save RAX, since it is overwritten by the quotient
    current_RAX=registers["rax"]
    # the quotient goes to RAX, the remainder goes to RDX
    registers["rax"]=create_binary_expr ("/", current_RAX, op1_expr)
    registers["rdx"]=create_binary_expr ("%", current_RAX, op1_expr)
\end{lstlisting}

Now here is the \TT{align2grain()} function\footnote{Taken from \url{https://docs.oracle.com/javase/specs/jvms/se6/html/Compiling.doc.html}}:

\begin{lstlisting}
; uint64_t align2grain (uint64_t i, uint64_t grain)
;    return ((i + grain-1) & ~(grain-1));

; rdi=i
; rsi=grain

sub rsi, 1
add rdi, rsi
not rsi
and rdi, rsi
mov rax, rdi
\end{lstlisting}

\lstinputlisting{\CURPATH/align2grain.txt}
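One way to gain confidence in such a symbolic result is to evaluate the expression tree for concrete inputs and compare it with the C formula. The sketch below does this for the \TT{align2grain()} listing above. Everything in it is an assumption on my part: the helper names \TT{sym()}, \TT{val()}, \TT{binop()}, \TT{unop()} and \TT{evaluate()} are hypothetical, and the three-element tuple used for the unary NOT is a guess, since the text does not show how unary operations are stored.

```python
MASK64 = (1 << 64) - 1  # registers are 64-bit, so wrap all results

# Hypothetical constructors following the tuple layout seen in the dumps.
def sym(name): return ("EXPR_SYMBOL", name)
def val(n): return ("EXPR_VALUE", n)
def binop(op, a, b): return ("EXPR_OP", op, a, b)
def unop(op, a): return ("EXPR_OP", op, a)  # assumed shape for unary ops

def evaluate(expr, env):
    """Compute an expression tuple for concrete symbol values."""
    kind = expr[0]
    if kind == "EXPR_SYMBOL":
        return env[expr[1]] & MASK64
    if kind == "EXPR_VALUE":
        return expr[1] & MASK64
    op = expr[1]
    args = [evaluate(a, env) for a in expr[2:]]
    if op == "~":
        return ~args[0] & MASK64
    a, b = args
    if op == "+":
        return (a + b) & MASK64
    if op == "-":
        return (a - b) & MASK64
    if op == "&":
        return a & b
    raise ValueError("unsupported operation: %r" % (op,))

# Replay the listing by hand on a tiny register file.
regs = {"rdi": sym("arg1"), "rsi": sym("arg2")}
regs["rsi"] = binop("-", regs["rsi"], val(1))       # sub rsi, 1
regs["rdi"] = binop("+", regs["rdi"], regs["rsi"])  # add rdi, rsi
regs["rsi"] = unop("~", regs["rsi"])                # not rsi
regs["rdi"] = binop("&", regs["rdi"], regs["rsi"])  # and rdi, rsi
regs["rax"] = regs["rdi"]                           # mov rax, rdi

def align2grain(i, grain):
    # The C formula from the listing's comment, wrapped to 64 bits.
    return ((i + grain - 1) & ~(grain - 1)) & MASK64

for i, grain in [(0, 8), (1, 8), (8, 8), (13, 16), (4097, 4096)]:
    assert evaluate(regs["rax"], {"arg1": i, "arg2": grain}) == align2grain(i, grain)

print(evaluate(regs["rax"], {"arg1": 13, "arg2": 16}))  # 16
```

A handful of test points is not a proof of equivalence, of course, but it catches most mistakes in an instruction handler quickly.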