The source code for rainbowcrack piqued my interest this weekend. I was looking through it and noticed that the whole thing was C++ with the exception of one part; see below:
__asm__ __volatile__ ( "mov %2, %%eax;"
"xor %%edx, %%edx;"
"divl %3;"
"mov %%eax, %0;"
"mov %%edx, %1;"
: "=m"(nIndexOfX32), "=m"(nTemp)
: "m"(nIndexOfX32), "m"(nPlainCharsetLen)
: "%eax", "%edx"
);
Well, I think I read through one or two pages of the Art of Assembly 3 or 4 years ago. Basically I know nothing about assembly.
I wanted to know what was so important about this code that it just had to be different from every other part of the app, so I started googling.
It was pretty easy to find out what the mov does. This website was a godsend. They mentioned that a mov instruction in Intel syntax takes the value of the second operand and puts it in the first operand.
"Ordering of operands: Unlike to Intel convention (first operand is destination), the order of operands is source(s) first, and destination last. For example, Intel syntax "mov eax, edx" will look like "mov %edx, %eax" in AT&T assembly."
so
mov eax, nIndexOfX32 = Intel
mov %2, %%eax; = AT&T
They both do the same thing; the arguments are just flipped.
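Here's a minimal sketch of that operand order in action, assuming GCC or Clang on x86. The function name `copy_with_mov` is made up for illustration and is not from rainbowcrack; on other platforms the sketch falls back to a plain assignment.

```cpp
#include <cstdint>

// Hypothetical helper (not from rainbowcrack): copies src into dst with a
// single AT&T-syntax mov, where the source comes first and the destination last.
std::uint32_t copy_with_mov(std::uint32_t src) {
    std::uint32_t dst = 0;
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    // AT&T order: "mov source, destination".
    // In Intel syntax the same instruction would be written "mov dst, src".
    __asm__ __volatile__("movl %1, %0" : "=r"(dst) : "r"(src));
#else
    // Assumption: not an x86 GCC/Clang build, so just do the copy in C++.
    dst = src;
#endif
    return dst;
}
```

Calling `copy_with_mov(42)` should hand back 42 either way; the only point is which side of the comma the destination sits on.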
Then I started wondering what the heck the lines starting with colons meant. Once again, that same article had the answer. Apparently in extended assembly, you can specify the "input registers, output registers and a list of clobbered registers." Well, that's exactly what's going on.
The first row specifies the output operands, the second row the input operands, and the third row a list of registers that are being "clobbered". I found this meant "the clobbered register after the third colon informs the GCC that the value of the register is to be modified inside "asm", so GCC won’t use this register to store any other value." Aha!
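Well, I was moving along, but was once again stuck. What the heck do the %0, %1, etc. values mean? Well, apparently they refer to the operands in the order they are listed, outputs first and then inputs, starting from 0. Here that makes %0 and %1 the two outputs (nIndexOfX32 and nTemp) and %2 and %3 the two inputs (nIndexOfX32 and nPlainCharsetLen).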
So in my example, %2 is nIndexOfX32, and because it is a memory operand (the "m" constraint), the mov loads the value stored at that memory location into the eax register, not a pointer to it.
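To make the numbering concrete, here's a small sketch that uses the same layout as the snippet above: two "=m" outputs, two "m" inputs, and eax/edx as clobbers. The function `swapped_copy` and its variables are hypothetical, not from rainbowcrack, and the sketch assumes GCC or Clang on x86, with a plain C++ fallback elsewhere.

```cpp
#include <cstdint>
#include <utility>

// Hypothetical example (names are not from rainbowcrack): returns {b, a},
// routing the values through eax and edx to show how %0..%3 are assigned.
std::pair<std::uint32_t, std::uint32_t> swapped_copy(std::uint32_t a,
                                                     std::uint32_t b) {
    std::uint32_t first = 0;   // becomes operand %0
    std::uint32_t second = 0;  // becomes operand %1
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __asm__ __volatile__(
        "movl %2, %%eax;"   // %2 is a: load the value stored in a into eax
        "movl %3, %%edx;"   // %3 is b: load the value stored in b into edx
        "movl %%edx, %0;"   // %0 is first: store b's value
        "movl %%eax, %1;"   // %1 is second: store a's value
        : "=m"(first), "=m"(second)  // outputs are numbered first: %0, %1
        : "m"(a), "m"(b)             // inputs continue the count: %2, %3
        : "%eax", "%edx"             // registers the asm overwrites
    );
#else
    // Assumption: not an x86 GCC/Clang build, so do the same thing in C++.
    first = b;
    second = a;
#endif
    return {first, second};
}
```

Registers written inside the asm string need the doubled `%%`, because a single `%` followed by a number is reserved for the numbered operands.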
The next thing that confused me was that xor instruction. I know the theory behind an xor, but I couldn't figure out why they were xor'ing a register with itself. I also couldn't figure out the divl thing. I knew from reading that the div had to do with division, and the 'l' suffix meant it was operating on a long (a 32-bit value). With regard to the div instruction, I couldn't figure out what was being divided by the %3 value. Ya know, 'cause normally in division you have value1/value2. So where was the magical first value?
That's when I found this page though, and the whole xor divl thing became instantly clear.
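For reference, here's a rough model of the xor/divl pair in plain C++. It relies on two facts about x86: xor'ing a register with itself sets it to zero, and a 32-bit div takes the 64-bit value held in edx:eax as its dividend, leaving the quotient in eax and the remainder in edx. The names `DivResult` and `divl_model` are made up for this sketch, and the divisor is assumed to be non-zero.

```cpp
#include <cstdint>

// Hypothetical result type: what ends up in eax and edx after the div.
struct DivResult {
    std::uint32_t quotient;   // left in eax
    std::uint32_t remainder;  // left in edx
};

// Plain C++ model of "xor %edx, %edx; divl divisor" (not the author's code).
// Assumes divisor != 0; the real instruction raises a fault in that case.
DivResult divl_model(std::uint32_t eax, std::uint32_t divisor) {
    std::uint32_t edx = 0;  // xor %edx, %edx: clears the high half
    // The dividend is the 64-bit value edx:eax.
    std::uint64_t dividend = (static_cast<std::uint64_t>(edx) << 32) | eax;
    return {static_cast<std::uint32_t>(dividend / divisor),
            static_cast<std::uint32_t>(dividend % divisor)};
}
```

With edx cleared first, the pair works out to an ordinary 32-bit divide that yields both `eax / divisor` and `eax % divisor` in one go.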