╔══════════════════════════════════════════════════════════════════════════════════╗
║ 15-01-2026 ║
║ Committing Crimes Against Readable Assembly Part 3 ║
║ ║
║ Getting Something to Run ║
║ ║
╠══-----==[ Contents ]==---------------------------------------------------------══╣
║ ║
║ 1: the challenge ║
║ 2: bugs ║
║ a: indeterminate size arguments ║
║ b: protected segments of memory ║
║ c: dealing with >32 bit numbers using xor ║
║ d: wrong registers ║
║ e: addressing mistakes with indices ║
║ f: I made a program where 0000 is valid bytecode... ║
║ 3: a small summary ║
║ ║
╚══════════════════════════════════════════════════════════════════════════════════╝
╔══════════════════════════════════════════════════════════════════════════════════╗
╠══-----==[ 1 ]==----------------------------------------------------------------══╣
║ ║
║ Let's write some silly code ║
║ ║
║ We theoretically have the ability to do non-branching conditionals, so let's set ║
║ ourselves a challenge - can I make a simple xor-only assembly program that ║
║ actually compiles? ║
║ ║
║ As there are lots of moving parts, for my own sanity I'm going to refer to some ║
║ xor constructs by their abbreviated name, rather than the full list of ║
║ instructions that make them up. So, for example: ║
║ ║
║ xor rax, rax ║
║ xor rax, a ║
║ ║
║ Can be represented by the shorthand: ║
║ ║
║ xmov rax, a ║
║ ║
║ Similarly, for setting memory at a specific location pointed to by a register: ║
║ (sch is our scratch register) ║
║ ║
╠══------------------------------------------------------------------------------══╣
║ xor sch, sch ║
║ xor sch, [rax] ║
║ xor [rax], sch ║
║ xor [rax], 0x1 ║
╠══------------------------------------------------------------------------------══╣
║ ║
║ We will represent this as: ║
║ ║
║ xmem [rax], 0x1 ║
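║ ║
║ As an aside, the two shorthands map fairly naturally onto GNU as macros. This is ║
║ only a sketch and not what the listings in this post use: it assumes r10 as the ║
║ scratch register, register sources only, and operands written without spaces: ║
║ ║
╠══------------------------------------------------------------------------------══╣
║ .macro xmov dst, src ║
║ xor \dst, \dst # clear the destination ║
║ xor \dst, \src # 0 xor source = source ║
║ .endm ║
║ ║
║ .macro xmem addr, val ║
║ xor r10, r10 ║
║ xor r10, \addr # r10 = old contents of memory ║
║ xor \addr, r10 # old xor old = 0 ║
║ xor \addr, \val # 0 xor value = value ║
║ .endm ║
║ ║
║ xmov rax, rsp # expands to the two xors above ║
║ xmem [rax], rbx # store rbx at the address held in rax ║
╠══------------------------------------------------------------------------------══╣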
║ ║
║ In our previous blog post (still feels weird to say that...) we came across an ║
║ issue with a single add instruction in our conditional block. What I completely ║
║ forgot about was that a memory operand can do the addition for us: an address ║
║ like [x+rax] is summed by the CPU before xor ever sees it. In fact, I did it ║
║ earlier without thinking ║
║ ║
║ With that realisation, and the "shorthand" method above, we can simplify the ║
║ conditional non-branching code block to: ║
║ ║
╠══------------------------------------------------------------------------------══╣
║ here we are checking if rdx == n, returning b if TRUE and a if FALSE ║
║ ║
║ VALUE A ║
║ xmov rbx, a # load a ║
║ ║
║ VALUE B ║
║ xmov rcx, b # load b ║
║ ║
║ COMPARISON OF RDX TO N: ║
║ xmem [rdx], 0x0 # memory pointed to by rdx = 0x0 ║
║ ║
║ xmem [n], 0x1 # memory pointed to by n = 0x1 ║
║ ║
║ xmov rax, [rdx] # rax now contains 0x1 if rdx == n, or 0x0 otherwise ║
║ ║
║ INDEXED LOOKUP FOR A AND B: ║
║ xmem [x], a # load a into table ║
║ ║
║ xmem [x+1], b # load b into table+1 ║
║ ║
║ xor sch, sch ║
║ xor sch, [x+rax] ║
║ xor rax, rax ║
║ xor rax, sch # rax now contains b if rdx == n and a otherwise ║
╠══------------------------------------------------------------------------------══╣
║ ║
║ Let's actually code this for values: a = 0xdead, b = 0xcaff, n = 0x100 ║
║ ║
╠══------------------------------------------------------------------------------══╣
║ let our scratch register = r10 ║
║ ║
║ let our target register = rdx ║
║ ║
║ let our n register = r11 ║
║ ║
║ LOADING VALUES FOR A B AND N: ║
║ xor rbx, rbx ║
║ xor rbx, 0xdead ║
║ xor rcx, rcx ║
║ xor rcx, 0xcaff ║
║ xor r11, r11 ║
║ xor r11, 0x100 ║
║ ║
║ LOAD VALUE TO COMPARE TO N: ║
║ xor rdx, rdx ║
║ xor rdx, 0x100 ║
║ ║
║ COMPARISON OF RDX TO N: ║
║ xor r10, r10 ║
║ xor r10, [rdx] ║
║ xor [rdx], r10 ║
║ ║
║ xor r10, r10 ║
║ xor r10, [r11] ║
║ xor [r11], r10 ║
║ xor [r11], 0x1 ║
║ ║
║ xor rax, rax ║
║ xor rax, [rdx] ║
║ ║
║ INDEXED LOOKUP FOR A AND B STARTING AT N: ║
║ xor r10, r10 ║
║ xor r10, [r11] ║
║ xor [r11], r10 ║
║ xor [r11], rbx ║
║ ║
║ xor r10, r10 ║
║ xor r10, [r11+1] ║
║ xor [r11+1], r10 ║
║ xor [r11+1], rcx ║
║ ║
║ xor r10, r10 ║
║ xor r10, [r11+rax] ║
║ xor rax, rax ║
║ xor rax, r10 ║
╠══------------------------------------------------------------------------------══╣
╠══════════════════════════════════════════════════════════════════════════════════╣
╠══-----==[2:a]==----------------------------------------------------------------══╣
║ ║
║ Bug 1: ║
║ ║
║ When trying to compile we get an error: ║
║ ║
║ Error: ambiguous operand size for 'xor' ║
║ ║
║ Some googling seems to suggest that unless otherwise specified, the default size ║
║ for an addressing operand is set by the register (here 64 bit registers so 64 ║
║ bit size operands) ║
║ ║
║ I tried looking for this information in the Intel manuals, but they didn't seem ║
║ to mention the operand sizes - turns out that size directives belong to the ║
║ assembler's syntax rather than the instruction set, so the way you signal size ║
║ depends on which syntax you use. The best description of the size declaration I ║
║ found was on the following website (for Intel syntax): ║
║ ║
╠══------------------------------------------------------------------------------══╣
║ In general, the intended size of the data item at a given memory address can be ║
║ inferred from the assembly code instruction in which it is referenced. For ║
║ example, in all of the above instructions, the size of the memory regions could ║
║ be inferred from the size of the register operand. When we were loading a ║
║ 32-bit register, the assembler could infer that the region of memory we were ║
║ referring to was 4 bytes wide. When we were storing the value of a one byte ║
║ register to memory, the assembler could infer that we wanted the address to ║
║ refer to a single byte in memory. ║
║ ║
║ However, in some cases the size of a referred-to memory region is ambiguous. ║
║ Consider the instruction mov [ebx], 2. Should this instruction move the value 2 ║
║ into the single byte at address EBX? Perhaps it should move the 32-bit integer ║
║ representation of 2 into the 4-bytes starting at address EBX. Since either is a ║
║ valid possible interpretation, the assembler must be explicitly directed as to ║
║ which is correct. The size directives BYTE PTR, WORD PTR, and DWORD PTR serve ║
║ this purpose, indicating sizes of 1, 2, and 4 bytes respectively. ║
╠══------------------------------------------------------------------------------══╣
║ ║
║ This might be the first time I prefer the implementation in AT&T syntax over ║
║ Intel: there, the mnemonic (say mov) is just suffixed with one of the ║
║ following size labels: ║
║ ║
║ b = byte (8 bit) ║
║ s = single (32-bit floating point) ║
║ w = word (16 bit) ║
║ l = long (32 bit integer or 64-bit floating point) ║
║ q = quad (64 bit) ║
║ t = ten bytes (80-bit floating point) ║
║ ║
║ So the instruction: ║
║ ║
║ mov esi, [rax] ║
║ ║
║ Becomes in AT&T (note that the operand order is swapped as well): ║
║ ║
║ movl (%rax), %esi ║
║ ║
║ Instead of the Intel: ║
║ ║
║ mov esi, DWORD PTR [rax] ║
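║ ║
║ Applied to our xor, a rough sketch of which forms the assembler accepts (the ║
║ choice of r11 and rbx here is just for illustration, and the rejected form is ║
║ left commented out): ║
║ ║
╠══------------------------------------------------------------------------------══╣
║ # xor [r11], 0x1 # rejected: nothing says how wide [r11] is ║
║ xor BYTE PTR [r11], 0x1 # flips bit 0 of the single byte at r11 ║
║ xor QWORD PTR [r11], 0x1 # same bit, but [r11] is treated as 8 bytes ║
║ xor [r11], rbx # fine: size is inferred from rbx (8 bytes) ║
╠══------------------------------------------------------------------------------══╣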
║ ║
║ Regardless, I'm sticking with Intel for the moment so let's amend our assembly ║
║ code to include a size directive wherever one is needed (only one place in our ║
║ case) and try again ║
║ ║
║ -> gcc -nostdlib -nostartfiles -no-pie asm.s -o asm ║
║ ║
║ Nice, we get a compiled file - that when we run, SegFaults... ║
║ ║
╠══════════════════════════════════════════════════════════════════════════════════╣
╠══-----==[2:b]==----------------------------------------------------------------══╣
║ ║
║ Bug 2: ║
║ ║
║ I had a feeling this would happen. Let's try to step through the execution ║
║ using a debugger to see if trying to write to [0] is what's causing this, or ║
║ something else. Firstly, let's compile with debug symbols: ║
║ ║
║ -> gcc -g -nostdlib -nostartfiles -no-pie asm.s -o asm ║
║ ║
║ Stepping through the instructions, we get a SegFault at line 20 for: ║
║ ║
║ 20 -> xor r10, [rbx] ║
║ ║
║ And if we [info registers] we can see that: ║
║ ║
║ rax 0x0 0 ║
║ rbx 0xdead 57005 ║
║ rcx 0xcaff 51967 ║
║ rdx 0x100 256 ║
║ ║
║ Which are our intended values - but accessing the memory at the location 0xdead ║
║ causes a SegFault on our system, despite the compiler allowing it. Our issue is ║
║ that nothing is mapped at these low addresses: Linux keeps the bottom of the ║
║ virtual address space unmapped so that null-pointer dereferences and the like ║
║ fault instead of quietly working. ║
║ ║
║ What addresses are free to use then? ║
║ ║
║ The stack needs to be free for writing for sure, so anything from rsp onwards ║
║ (within reason) ought to be good. Using GDB, [info proc mappings] gives us the ║
║ addressing structure for our binary: ║
║ ║
║ [stack] ║
║ start: 0x00007ffffffde000 ║
║ end: 0x00007ffffffff000 ║
║ length: 0x21000 ║
║ ║
║ For the moment ignore the issues this presents us in terms of what hardcoded ║
║ values we are allowed to use, and see if we can get our xor code to ║
║ just run at all. Our new values are now: ║
║ ║
╠══------------------------------------------------------------------------------══╣
║ rdx/target: OLD-> 0x100 NEW-> [rsp+0] = 0x7fffffffdea0 ║
║ rbx/a: OLD-> 0xdead NEW-> [rsp+1] = 0x7fffffffdea1 ║
║ rcx/b: OLD-> 0xcaff NEW-> [rsp+2] = 0x7fffffffdea2 ║
║ r11/n: OLD-> 0x100 NEW-> [rsp+0] = 0x7fffffffdea0 ║
╠══------------------------------------------------------------------------------══╣
╠══════════════════════════════════════════════════════════════════════════════════╣
╠══-----==[2:c]==----------------------------------------------------------------══╣
║ ║
║ Bug 3: ║
║ ║
║ Compiling this gives us 4 new errors, one for each of our hardcoded variables... ║
║ each reads: ║
║ ║
║ Error: operand type mismatch for 'xor' ║
║ ║
║ So it turns out that the largest immediate that xor can take is an imm32 value, ║
║ and since the value we are trying to pass is 48 bits, it doesn't allow it. Of ║
║ course if we just placed this value in a register, then xored it with our ║
║ target, that would work just fine. We however cannot do this, as our xmov ║
║ instruction will always have the imm32 constraint somewhere in the chain ║
║ ║
║ Let's try fetching a >32 bit value from rsp using: ║
║ ║
╠══------------------------------------------------------------------------------══╣
║ xor rax, rax ║
║ xor rax, rsp ║
║ xor rax, 0x1 ║
║ ║
║ yields: ║
║ ║
║ rax 0x7fffffffdea1 140737488346785 ║
╠══------------------------------------------------------------------------------══╣
║ ║
║ Nice! Okay, so now I shall try making our program again, using the values from ║
║ above, and see if it runs ║
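║ ║
║ One caveat worth spelling out: xor only stands in for add here because the bits ║
║ being set are clear to begin with. A small sketch, assuming rsp is 16 byte ║
║ aligned when _start is entered (which is what the System V x86-64 ABI ║
║ specifies), so that its low four bits are 0: ║
║ ║
╠══------------------------------------------------------------------------------══╣
║ xor rax, rax ║
║ xor rax, rsp # rax = rsp, low four bits are 0 ║
║ xor rax, 0x3 # no shared bits, so rax = rsp+3 ║
║ xor rax, 0x1 # bit 0 was already set: rax = rsp+2, not rsp+4 ║
╠══------------------------------------------------------------------------------══╣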
║ ║
╠══════════════════════════════════════════════════════════════════════════════════╣
╠══-----==[2:d]==----------------------------------------------------------------══╣
║ ║
║ Bug 4: ║
║ ║
║ We got past our previous roadblock of assigning values, but still run into a ║
║ SegFault at line 42: ║
║ ║
║ 42 -> xor r10, [r11+rax] ║
║ ║
║ Let's see what r10, r11 and rax contain to see if we can understand our SegFault ║
║ ║
╠══------------------------------------------------------------------------------══╣
║ rax 0xffffffe16c000000 -131332046848 unexpected? ║
║ rbx 0x7fffffffdea1 140737488346785 expected ║
║ rcx 0x7fffffffdea2 140737488346786 expected ║
║ rdx 0x7fffffffdea0 140737488346784 expected ║
║ rsp 0x7fffffffdea0 0x7fffffffdea0 expected ║
║ ║
║ r10 0x0 0 ║
║ r11 0x7fffffffdea0 140737488346784 expected ║
╠══------------------------------------------------------------------------------══╣
║ ║
║ That seems odd, I'll try to retrace the steps that rax takes to get there: ║
║ ║
║ 28 -> xor rax, rax ║
║ 29 -> xor rax, [rbx] ║
║ ║
║ At this point, rax is 0 and rbx is ...5 - nothing wrong there. However, the ║
║ memory pointed to by rbx is incorrect. I think I've made a mistake somewhere... ║
║ ║
║ After a longer time than I'd like to admit, I found that some of the rbx/rcx ║
║ registers were mixed up ║
║ ║
╠══════════════════════════════════════════════════════════════════════════════════╣
╠══-----==[2:e]==----------------------------------------------------------------══╣
║ ║
║ Bug 5: ║
║ ║
║ Now when we run our program, we get the correct result for the first case, but ║
║ we get something a little odd for our second case. Here are the registers... ║
║ ║
╠══------------------------------------------------------------------------------══╣
║ for the first case: ║
║ rax 0x7fffffffdea1 140737488346785 ║
║ ║
║ for the second case: ║
║ rax 0x7fffffffdea1a2 36028797016777122 ║
╠══------------------------------------------------------------------------------══╣
║ ║
║ It looks like one write is overwriting part of the other, which it turns out, it ║
║ is. I made the mistake of thinking of each address in memory as having unlimited ║
║ size to put things into, when each address holds exactly one byte and a 64 bit ║
║ write at n spills over into n+1 through n+7. I think this misunderstanding came ║
║ from the way we are treating addresses as arrays, and that each index of an ║
║ array can have whatever you want in it ║
║ ║
║ Our memory would look something like: ║
║ ║
╠══------------------------------------------------------------------------------══╣
║ addresses (relative) ║
║ ║
║ 0 7f───────────────┐ ║
║ 1 ff 7f───┐ │ ║
║ 2 ff ff │ │ ║
║ 3 ff ff │ │ ║
║ 4 de ff │ │ ║
║ 5 a1──────de┄┄┄│─┬─┘ ║
║ 6 a2─┬─┘ └─ 2nd write ║
║ └─ 1st write ║
╠══------------------------------------------------------------------------------══╣
║ ║
║ Fixing this is as simple as multiplying the rax value by our chosen offset (here ║
║ I chose 8 for 64 bit numbers). The only caveat is that as we cannot use the imul ║
║ instruction, we need to do this via relative addressing, which only allows ║
║ scaling the index by 1, 2, 4 or 8 ║
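║ ║
║ To make the overlap concrete, here is a small sketch of the two layouts side by ║
║ side. It assumes r11 points at writable, zeroed memory and that rax holds 0 or ║
║ 1; the two halves are alternatives, not one program: ║
║ ║
╠══------------------------------------------------------------------------------══╣
║ # stride of 1 (broken) ║
║ xor [r11], rbx # writes 8 bytes: r11+0 to r11+7 ║
║ xor [r11+1], rcx # r11+1 to r11+8, overlaps 7 bytes of the first ║
║ ║
║ # stride of 8 (fixed) ║
║ xor [r11], rbx # slot 0: r11+0 to r11+7 ║
║ xor [r11+8], rcx # slot 1: r11+8 to r11+15, no overlap ║
║ xor r10, [r11+rax*8] # rax = 0 reads slot 0, rax = 1 reads slot 1 ║
╠══------------------------------------------------------------------------------══╣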
║ ║
╠══════════════════════════════════════════════════════════════════════════════════╣
╠══-----==[2:f]==----------------------------------------------------------------══╣
║ ║
║ Bug? 6: ║
║ ║
║ A good 30% of the time trying to get this program to work was wrestling with one ║
║ particular issue. When we step through our program in GDB to see what is going ║
║ on at each stage and what the registers are doing, GDB only seems to update the ║
║ registers page once you have stepped past an instruction. For every line of code ║
║ except for one, this is fine ║
║ ║
║ If, however, we step past the last instruction (the one that should show us ║
║ either a or b in rax) we get a very odd number in rax: ║
║ ║
║ rax 0x7df00 515840 ║
║ ║
║ This was greatly confusing, as the previous instruction showed that the ║
║ registers seemed in the correct state, with the correct a/b value ready to be ║
║ put into rax: ║
║ ║
║ rax 0x1 0 ║
║ r10 0x7fffffffdea1 140737488346785 ║
║ ║
║ And all that was listed afterwards was: ║
║ ║
║ 44 -> xor rax, r10 ║
║ ║
║ Another odd thing was that each time I compiled the program, this value would ║
║ change. This should have been a lightbulb moment for me, but I missed it and ║
║ kept plugging on along fruitless roads ║
║ ║
║ Eventually though, and with the help of some people from the OAlabs discord ║
║ (mainly Xusheng), we found out what was happening. My code wasn't very long, so ║
║ once compiled it was shorter than the 4k page size. This meant that the ║
║ remainder of the page was filled with 0s ║
║ ║
║ A funny quirk of x86 is that 0000 as bytecode executes to: ║
║ ║
║ add BYTE PTR [rax], al ║
║ ║
║ This would 99% of the time cause a SegFault, but since my program specifically ║
║ relies on loading values that are accessible memory addresses, dereferencing ║
║ rax actually executes. As the program never exits, stepping in GDB past the ║
║ last instruction lets the code flow run on into these "0000" sections, which ║
║ also have some non-zero values in there placed from when the rest of the page ║
║ is marked as "empty". These also, by some quirk of ELF page construction, ║
║ happen to correspond to valid x86 instructions! ║
║ ║
║ These instructions then modify the rax register a bunch of times, all the while ║
║ appearing as a single step in GDB ║
║ ║
║ Fixing this is thankfully somewhat easy: if we add a single nop to the end of ║
║ the program, then GDB doesn't have a fit stepping it (in our case, our nop ║
║ instruction could just be xor r10, r10). We could also just not use rax as our ║
║ register - but given that this problem will dissolve the moment our program ║
║ actually functions I'd rather keep it there ║
║ ║
╠══════════════════════════════════════════════════════════════════════════════════╣
╠══-----==[ 3 ]==----------------------------------------------------------------══╣
║ ║
║ So, now that all the bugs (I found) have been fumigated, we can summarise what ║
║ our executable program does: ║
║ ║
╠══------------------------------------------------------------------------------══╣
║ load a value to compare - let's call this x (in rdx) ║
║ ║
║ load a value to compare to x - let's call this n (in r11) ║
║ ║
║ load a value to display if we fail - let's call this a (in rbx) ║
║ ║
║ load a value to display if we succeed - let's call this b (in rcx) ║
║ ║
║ load into the memory pointed to by x, the value 0 ║
║ ║
║ load into the memory pointed to by n, the value 1 ║
║ ║
║ load the value pointed to by x, into rax (rax = 1 or 0) ║
║ ║
║ set up an array indexed from n ║
║ ║
║ load value a into the memory pointed to by n ║
║ ║
║ load value b into the memory pointed to by n + 8 ║
║ ║
║ retrieve the value pointed to by n + rax*8 (so n + 8 or n + 0) ║
║ ║
║ load either a or b into rax ║
╠══------------------------------------------------------------------------------══╣
║ ║
║ Here is the final source code: ║
║ ║
╠══------------------------------------------------------------------------------══╣
║ .intel_syntax noprefix ║
║ .global _start ║
║ ║
║ _start: ║
║ xor rdx, rdx ║
║ xor rdx, rsp # load rsp into rdx, this = val to compare ║
║ ║
║ xor rbx, rbx ║
║ xor rbx, rsp ║
║ xor rbx, 0x1 # load rsp+1 into rbx, this = a ║
║ ║
║ xor rcx, rcx ║
║ xor rcx, rsp ║
║ xor rcx, 0x2 # load rsp+2 into rcx, this = b ║
║ ║
║ xor r11, r11 ║
║ xor r11, rsp # load rsp into r11, this = n ║
║ ║
║ xor r10, r10 # r10 is our scratch ║
║ xor r10, [rdx] ║
║ xor [rdx], r10 # clear memory pointed to by rdx ║
║ ║
║ xor r10, r10 ║
║ xor r10, [r11] ║
║ xor [r11], r10 # clear memory pointed to by n ║
║ xor BYTE PTR [r11], 0x1 # load 1 into the memory pointed to by n ║
║ ║
║ xor rax, rax ║
║ xor rax, [rdx] # load 1 into rax if rdx == n, 0 if not ║
║ ║
║ xor r10, r10 ║
║ xor r10, [r11] ║
║ xor [r11], r10 ║
║ xor [r11], rbx # load a into memory pointed to by n ║
║ ║
║ xor r10, r10 ║
║ xor r10, [r11+8] ║
║ xor [r11+8], r10 ║
║ xor [r11+8], rcx # load b into memory pointed to by n+8 ║
║ ║
║ xor r10, r10 ║
║ xor r10, [r11+rax*8] # retrieve rax*8 to index for a/b ║
║ xor rax, rax ║
║ xor rax, r10 # load a/b into rax ║
║ ║
║ xor r10, r10 # padding ║
╠══------------------------------------------------------------------------------══╣
║ ║
║ Try it yourself if you like: ║
║ compiled using: gcc -g -nostdlib -nostartfiles -no-pie asm.s -o asm ║
║ ║
║ I think this satisfies our aim for this short part - next I'll have to work out ║
║ some way of transmuting this data into "strings" and a way to invoke syscalls ║
║ using xor. We shall see how easy that turns out to be... ║
║ ║
║ CWW out ║
╚══════════════════════════════════════════════════════════════════════════════════╝