╔══════════════════════════════════════════════════════════════════════════════════╗
║ 10-04-2026                                                                       ║
║ Committing Crimes Against Readable Assembly Part 4                               ║
║                                                                                  ║
║ Magic syscalls                                                                   ║
║                                                                                  ║
╠══-----==[ Contents ]==---------------------------------------------------------══╣
║                                                                                  ║
║ 1: systems, calls, and systemcalls                                               ║
║ 2: ideas                                                                         ║
║   a: direct bytecode                                                             ║
║   b: intentional faults                                                          ║
║   c: mmio/pmio                                                                   ║
║   d: intentional faults revisited                                                ║
║ 3: one small problem                                                             ║
║ 4: non syscall syscalls                                                          ║
║ 5: final touches and summary                                                     ║
║ 6: the program                                                                   ║
║                                                                                  ║
╚══════════════════════════════════════════════════════════════════════════════════╝
╔══════════════════════════════════════════════════════════════════════════════════╗
╠══-----==[ 1 ]==----------------------------------------------------------------══╣
║                                                                                  ║
║ We're back, it's been quite a while and it is time (for me) to suffer            ║
║                                                                                  ║
║ Last time we got a working binary that compared a value, and "output" a specific ║
║ value depending on the result. While it is true that there is still a lot of     ║
║ tweaking to do to make that model work for the types of values we would want to  ║
║ be comparing, for the moment let's leave that be - and work on the next          ║
║ conceptual piece (I have arbitrarily decided) we need: syscalls                  ║
║                                                                                  ║
║ At the moment, our binary is only a program by sheer definition. If we ran it    ║
║ outside a debugger, or without a breakpoint, execution would just continue       ║
║ careening into the rest of the empty ELF page. In order to exit properly, or do  ║
║ anything of note - we would need to invoke a syscall. But hang on a minute I     ║
║ hear you exclaim, what's a syscall?                                              ║
║                                                                                  ║
║ From the ol' Wikipedia:                                                          ║
╠══------------------------------------------------------------------------------══╣
║ "a system call (syscall) is the programmatic way in which a computer program     ║
║ requests a service from the operating system"                                    ║
╠══------------------------------------------------------------------------------══╣
║                                                                                  ║
║ As we have decided to try and produce code that runs within an operating         ║
║ system (in our case, Linux) we have to play by its rules. That means asking      ║
║ very politely when we require some functionality that is outside of our          ║
║ program's control                                                                ║
║                                                                                  ║
║ Such functionalities that we may want to be able to use include: exiting the     ║
║ program, talking to the file system, displaying to the screen and many more      ║
║                                                                                  ║
║ On older OSs and embedded devices, sometimes these sorts of things were          ║
║ accessible through MMIO (mapping device memory to virtual memory within the      ║
║ program's scope) or PMIO (mapping device I/O to ports)                           ║
║                                                                                  ║
║ Modern Linux, though, doesn't let us read hardware input (from, say, a keyboard) ║
║ straight out of memory by doing:                                                 ║
║                                                                                  ║
║       mov rax, [keyboard input address]                                          ║
║                                                                                  ║
║ Instead, it forces us to use "syscalls" (here with the syscall instruction):     ║
╠══------------------------------------------------------------------------------══╣
║       mov rax, arg1           # args for "keyboard" syscall                      ║
║       mov rdi, arg2                                                              ║
║       mov rsi, arg3                                                              ║
║       syscall                                                                    ║
╠══------------------------------------------------------------------------------══╣
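║                                                                                  ║
║ For comparison, here is a minimal C sketch of a real system call, assuming Linux ║
║ and glibc. The raw_getpid name is made up for illustration; syscall() is glibc's ║
║ generic wrapper, which loads the syscall number into rax and runs the syscall    ║
║ instruction for us:                                                              ║
╠══------------------------------------------------------------------------------══╣
```c
#define _GNU_SOURCE
#include <sys/syscall.h>  /* SYS_getpid */
#include <sys/types.h>
#include <unistd.h>       /* syscall(), getpid() */

/* Ask the kernel for our process id without the usual getpid() wrapper.
   raw_getpid is a hypothetical name used only for this example. */
long raw_getpid(void) {
    return syscall(SYS_getpid);
}
```
╠══------------------------------------------------------------------------------══╣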
║                                                                                  ║
║ Unfortunately I count syscall as an instruction, and since it's not spelt "xor"  ║
║ we aren't allowed to use it. So how do we go about achieving the functionality   ║
║ of system calls without ever using them?                                         ║
║                                                                                  ║
╠══════════════════════════════════════════════════════════════════════════════════╣
╠══-----==[2:a]==----------------------------------------------------------------══╣
║                                                                                  ║
║ Let's say that we want to write the following:                                   ║
╠══------------------------------------------------------------------------------══╣
║       mov rax, 0x1                                                               ║
║       mov rdi, 0x2                                                               ║
║       mov rsi, 0x3                                                               ║
║       syscall                                                                    ║
╠══------------------------------------------------------------------------------══╣
║                                                                                  ║
║ This assembly has some corresponding bytecode. When compiled with GCC (objdump   ║
║ -s -j .text syscall || hd), the snippet above becomes:                           ║
╠══------------------------------------------------------------------------------══╣
║  401000  48 c7 c0 01 00 00 00 48  c7 c7 02 00 00 00 48 c7  H......H......H.      ║
║  401010  c6 03 00 00 00 0f 05                              .......               ║
╠══------------------------------------------------------------------------------══╣
║                                                                                  ║
║ We can sort of see the instructions, and where they map onto our snippet:        ║
╠══------------------------------------------------------------------------------══╣
║       mov rax, 0x1  ->  48 c7 c0 01 00 00 00                                     ║
║       mov rdi, 0x2  ->  48 c7 c7 02 00 00 00                                     ║
║       mov rsi, 0x3  ->  48 c7 c6 03 00 00 00                                     ║
║       syscall       ->  0f 05                                                    ║
╠══------------------------------------------------------------------------------══╣
║                                                                                  ║
║ Is there then, a way to write these bytes into memory somewhere, and then        ║
║ execute them? That way we could do the writing with our xor instruction set, and ║
║ still invoke a syscall                                                           ║
║                                                                                  ║
║ This would require a section of addresses somewhere that was readable, writeable ║
║ and executable by our program. You can do this by explicitly forcing your        ║
║ compiler to link a segment with r...w...e permissions. This is generally         ║
║ considered a terrible idea for safety reasons, so it could be perfect for us!    ║
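║                                                                                  ║
║ As a rough illustration of the write-then-execute idea, here is a C sketch that  ║
║ asks for an rwx page at runtime with mmap (rather than linking one in), copies   ║
║ six bytes into it and calls them. It assumes x86-64 Linux; run_written_bytes and ║
║ the byte string (mov eax, 42; ret) are made up for this example:                 ║
╠══------------------------------------------------------------------------------══╣
```c
#define _GNU_SOURCE
#include <stddef.h>    /* NULL */
#include <string.h>    /* memcpy */
#include <sys/mman.h>  /* mmap, munmap */

/* x86-64 only. Hypothetical helper: copies a tiny hand-assembled function
   into a readable + writable + executable page, then calls it. */
int run_written_bytes(void) {
    static const unsigned char code[] = {
        0xb8, 0x2a, 0x00, 0x00, 0x00,  /* mov eax, 42 */
        0xc3                           /* ret         */
    };
    void *page = mmap(NULL, 4096, PROT_READ | PROT_WRITE | PROT_EXEC,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED)
        return -1;                     /* some systems forbid rwx mappings */

    memcpy(page, code, sizeof code);   /* the "writing" step */

    int (*fn)(void) = (int (*)(void))page;
    int result = fn();                 /* the "executing" step */

    munmap(page, 4096);
    return result;
}
```
╠══------------------------------------------------------------------------------══╣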
║                                                                                  ║
║ However, given that the bytecodes will be read as opcodes + arguments, we are    ║
║ essentially getting the CPU to execute non-xor instructions - even if no non-xor ║
║ instructions appear in our source asm code. The purist in me feels this          ║
║ gives up a core component of the challenge - at that point we are                ║
║ essentially just using xor to build a long list of numbers (bytecodes) which     ║
║ isn't that impressive                                                            ║
║                                                                                  ║
║ Let's shelve this idea for the moment then                                       ║
║                                                                                  ║
╠══════════════════════════════════════════════════════════════════════════════════╣
╠══-----==[2:b]==----------------------------------------------------------------══╣
║                                                                                  ║
║ Our previous program exited whenever it tried to access an address outside of    ║
║ the binary's readable segments. In a way, SegFault-ing was our way to halt the   ║
║ program's execution                                                              ║
║                                                                                  ║
║ As exceptions halt execution - we can substitute the syscall for exit with a     ║
║ purposeful exception invocation. This is a list of all the 64 bit mode           ║
║ exceptions for the xor instruction:                                              ║
║                                                                                  ║
╠══------------------------------------------------------------------------------══╣
║ #SS(0)          If a memory address referencing the SS segment is in a           ║
║                 non-canonical form                                               ║
║                                                                                  ║
║ #GP(0)          If the memory address is in a non-canonical form                 ║
║                                                                                  ║
║ #PF(fault-code) If a page fault occurs                                           ║
║                                                                                  ║
║ #AC(0)          If alignment checking is enabled and an unaligned memory         ║
║                 reference is made while the current privilege level is 3         ║
║                                                                                  ║
║ #UD             If the LOCK prefix is used but the destination is not a memory   ║
║                 operand                                                          ║
╠══------------------------------------------------------------------------------══╣
║                                                                                  ║
║ I've decided to use the lock prefix without a memory operand for purposeful      ║
║ fault invocation. Given the way our xor program flow works (see previous blogs)  ║
║ we are likely to see SegFaults when things go wrong, so let's not confuse        ║
║ ourselves by adding SegFaults when things go right. Instead this will display    ║
║ the IllegalInstruction error in console.                                         ║
║                                                                                  ║
║ So this is our pseudo-"exit" syscall:                                            ║
║                                                                                  ║
║       lock xor rax, rbx                                                          ║
║                                                                                  ║
║ Unfortunately for us, GCC's assembler knows this would result in a fault, and    ║
║ doesn't allow you to compile this:                                               ║
║                                                                                  ║
║       Error: expecting lockable instruction after `lock'                         ║
║                                                                                  ║
║ Instead, we need to directly compile from bytecode (eugh). We can include this   ║
║ at the end of our files:                                                         ║
║                                                                                  ║
║       .byte 0xF0, 0x48, 0x31, 0xD8        # bytecode for [lock xor rax, rbx]     ║
║                                                                                  ║
║ Ideally, this should show up as an xor instruction when disassembled. However,   ║
║ when opening the executable in Binary Ninja (henceforth Binja), we can see the   ║
║ following for the relevant .text section:                                        ║
╠══------------------------------------------------------------------------------══╣
║     00401000        f0 .. .. ..   ??                                             ║
║     00401001        .. 48 31 d8   H1.                                            ║
╠══------------------------------------------------------------------------------══╣
║                                                                                  ║
║ Interestingly it's decided to partition off the last three bytes, perhaps they   ║
║ are all utf-8 values? Regardless, it's clearly got this "wrong" - despite having ║
║ the correct bytecode to reconstruct the instructions, Binja has decided that     ║
║ this section is non-code                                                         ║
║                                                                                  ║
║ It turns out that _start is not assumed to be a legit entry point to disassemble ║
║ from, as sneaky malware authors could abuse that assumption to mess with the     ║
║ real code flow                                                                   ║
║                                                                                  ║
║ Hiding opcodes from disassemblers is always something interesting to play with,  ║
║ so we might return to this another time, but for the time being - let's see if   ║
║ we can get Binja to display the instructions                                     ║
║                                                                                  ║
║ By explicitly labelling _start as a function:                                    ║
║                                                                                  ║
║       .type _start, @function                                                    ║
║                                                                                  ║
║ We shall make sure Binja knows it's a function - checking the disassembly view,  ║
║ we can see that it isn't fixed at all. Hmmm                                      ║
║                                                                                  ║
║ It seems that Binja knows x86 better than I, and notices that locking an xor     ║
║ with two register operands is illegal, and so doesn't assume it is actual code.  ║
║ Interestingly, this heuristic allows us to put any regular opcodes and arguments ║
║ afterwards, and Binja will also not mark these up as instructions - since the    ║
║ CPU would have faulted previously                                                ║
║                                                                                  ║
║ As a tentative solution, I shall use the .byte chunk as a substitution for the   ║
║ exit syscall, despite the disassembly issues                                     ║
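║                                                                                  ║
║ To check that those four bytes really do raise SigIll, here is a small C sketch  ║
║ (x86-64 Linux and GCC inline asm assumed; the function names are hypothetical).  ║
║ It catches the signal and jumps back out instead of letting the program die:     ║
╠══------------------------------------------------------------------------------══╣
```c
#define _GNU_SOURCE
#include <setjmp.h>  /* sigsetjmp, siglongjmp */
#include <signal.h>  /* sigaction, SIGILL */
#include <stddef.h>  /* NULL */
#include <string.h>  /* memset */

static sigjmp_buf resume_point;

static void on_sigill(int signum) {
    (void)signum;
    siglongjmp(resume_point, 1);  /* jump back out, past the faulting bytes */
}

/* x86-64 only. Returns 1 if the bytes raised SigIll, 0 if they executed
   as a normal instruction, -1 if the handler could not be installed. */
int lock_xor_faults(void) {
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigill;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGILL, &sa, NULL) != 0)
        return -1;

    if (sigsetjmp(resume_point, 1) == 0) {
        /* same bytes as the .byte line: lock xor rax, rbx */
        __asm__ volatile(".byte 0xF0, 0x48, 0x31, 0xD8" : : : "rax", "cc");
        return 0;
    }
    return 1;
}
```
╠══------------------------------------------------------------------------------══╣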
║                                                                                  ║
║ As for trying to emulate other syscalls using fault behaviour, we aren't so      ║
║ lucky. Basically all of them crash the program, just with slightly different     ║
║ fault messages. However, the signal handler may well come in handy later on...   ║
║                                                                                  ║
╠══════════════════════════════════════════════════════════════════════════════════╣
╠══-----==[2:c]==----------------------------------------------------------------══╣
║                                                                                  ║
║ Perhaps, if we could set up an environment where the keyboard and screen control ║
║ that is usually handled by syscalls, was instead mapped to specific regions of   ║
║ virtual memory - we could just use relative addressing to use both of those      ║
║ features, just as some older systems used to.                                    ║
║                                                                                  ║
║ This runs counter to the whole compartmentalisation and security ethos of Linux  ║
║ so this might be a headache. As such, the only ways to achieve this require, at  ║
║ least, root access to a system                                                   ║
║                                                                                  ║
║ One solution might be to create some sort of an "emulator" or VM. Kernel         ║
║ programming - oh no                                                              ║
║                                                                                  ║
║ While this seems viable, I'd rather exhaust other options before spending weeks  ║
║ being aggrieved at my inability to code kernel modules                           ║
║                                                                                  ║
╠══════════════════════════════════════════════════════════════════════════════════╣
╠══-----==[2:d]==----------------------------------------------------------------══╣
║                                                                                  ║
║ It turns out, to my great fortune, that you can create custom signal handlers    ║
║ for your programs in Linux! They don't even need escalated privileges            ║
║                                                                                  ║
║ When our program raises a SigIll, the following happens:                         ║
╠══------------------------------------------------------------------------------══╣
║   the program raises an undefined instruction exception                          ║
║   |                                                                              ║
║   ├─> CPU switches to kernel mode                                                ║
║   |                                                                              ║
║   ├─> kernel detects the exception is a SigIll                                   ║
║   |                                                                              ║
║   ├─> the program's signal handlers table is accessed, and if one is registered, ║
║   |  it is pointed to                                                            ║
║   |                                                                              ║
║   ├─> before calling the handler, the kernel saves the CPU registers in a        ║
║   |  structure called ucontext                                                   ║
║   |                                                                              ║
║   ├─> the kernel sets up:                                                        ║
║   |    rdi -> signal id                                                          ║
║   |    rsi -> pointer to siginfo                                                 ║
║   |    rdx -> pointer to ucontext                                                ║
║   |                                                                              ║
║   ├─> the handler is called; when it returns, the kernel restores the registers  ║
║   |  from ucontext, including any changes the handler made to it                 ║
║   |                                                                              ║
║   └─> program resumes execution with those restored register values              ║
╠══------------------------------------------------------------------------------══╣
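║                                                                                  ║
║ The same flow can be watched from C. This is only a sketch: it goes through      ║
║ glibc's sigaction() wrapper rather than the raw syscall, uses raise() in place   ║
║ of a real fault, and the names on_signal and deliver_sigill are made up:         ║
╠══------------------------------------------------------------------------------══╣
```c
#define _GNU_SOURCE
#include <signal.h>  /* sigaction, siginfo_t, raise */
#include <stddef.h>  /* NULL */
#include <string.h>  /* memset */

static volatile sig_atomic_t seen_signum = 0;
static volatile sig_atomic_t got_ucontext = 0;

/* Three-argument handler: on x86-64 these arrive in rdi, rsi and rdx. */
static void on_signal(int signum, siginfo_t *info, void *ucontext) {
    (void)info;
    seen_signum = signum;
    got_ucontext = (ucontext != NULL);
}

/* Hypothetical helper: installs the handler, sends ourselves SigIll and
   reports the signal number the handler saw (-1 if something failed). */
int deliver_sigill(void) {
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_signal;
    sa.sa_flags = SA_SIGINFO;   /* ask for the siginfo and ucontext pointers */
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGILL, &sa, NULL) != 0)
        return -1;

    raise(SIGILL);              /* the handler runs before raise() returns */
    return got_ucontext ? (int)seen_signum : -1;
}
```
╠══------------------------------------------------------------------------------══╣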
║                                                                                  ║
║ I have no idea what the ucontext structure is actually like, so let's try and    ║
║ set up a test program and inspect the registers to see how it's parsed           ║
║                                                                                  ║
║ A basic program that sets various registers to some useful values to confirm in  ║
║ the debugger, then faults, will do just fine:                                    ║
╠══------------------------------------------------------------------------------══╣
║ .intel_syntax noprefix                                                           ║
║ .global _start                                                                   ║
║                                                                                  ║
║ _start:                                                                          ║
║                                                                                  ║
║       mov rax, 0x1            # test values                                      ║
║       mov rdi, 0x2                                                               ║
║       mov rsi, 0x3                                                               ║
║       mov rdx, 0x4                                                               ║
║       mov r10, 0x5                                                               ║
║       mov r8,  0x6                                                               ║
║       mov r9,  0x7                                                               ║
║                                                                                  ║
║       mov r11, [0x0]                                                             ║
╠══------------------------------------------------------------------------------══╣
║                                                                                  ║
║ As well as a basic signal handler:                                               ║
╠══------------------------------------------------------------------------------══╣
║                                                                                  ║
║ .intel_syntax noprefix                                                           ║
║ .global _sig_handler                                                             ║
║                                                                                  ║
║ _sig_handler:                                                                    ║
║                                                                                  ║
║       mov r12, 0xdead                                                            ║
║       ret                                                                        ║
╠══------------------------------------------------------------------------------══╣
║                                                                                  ║
║ Actually getting a separate program to take another as a signal handler seems a  ║
║ little tricky, so I'll include the _sig_handler inside the main .elf             ║
║                                                                                  ║
║ The syscall for registering a signal handler for a specific signal is sigaction, ║
║ which takes the following arguments:                                             ║
║                                                                                  ║
║     sigaction(signum, sigaction, sigaction_old)                                  ║
║                                                                                  ║
║ where the sigaction arguments are structs containing a pointer to the signal     ║
║ handler, flags to modify behaviour and any signals to block while running. The   ║
║ second sigaction argument is just a pointer to the old struct, we don't need to  ║
║ bother with that one, so that just = NULL                                        ║
║                                                                                  ║
║ Let's make a sigaction struct:                                                   ║
╠══------------------------------------------------------------------------------══╣
║ .section .data                                                                   ║
║ .align 8                                                                         ║
║ _sigaction:                                                                      ║
║       .quad _sig_handler      # pointer to the handler                           ║
║       .quad 0                 # no flags                                         ║
║       .quad 0                 # no restorer                                      ║
║       .zero 128               # no blockers                                      ║
╠══------------------------------------------------------------------------------══╣
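║                                                                                  ║
║ Read as C, that layout looks roughly like the struct below. The struct and field ║
║ names are invented for illustration, and 8-byte fields (x86-64) are assumed; the ║
║ point is just to show which byte offset each .quad ends up at:                   ║
╠══------------------------------------------------------------------------------══╣
```c
#include <stddef.h>  /* offsetof, size_t */
#include <stdint.h>  /* uint64_t, uint8_t */

/* Illustrative mirror of the _sigaction data above (names are made up). */
struct raw_sigaction {
    uint64_t handler;    /* .quad _sig_handler  -> offset 0  */
    uint64_t flags;      /* .quad 0             -> offset 8  */
    uint64_t restorer;   /* .quad 0             -> offset 16 */
    uint8_t  mask[128];  /* .zero 128           -> offset 24 */
};

/* Offsets as computed by the compiler, to compare against the comments. */
size_t flags_offset(void)    { return offsetof(struct raw_sigaction, flags); }
size_t restorer_offset(void) { return offsetof(struct raw_sigaction, restorer); }
size_t mask_offset(void)     { return offsetof(struct raw_sigaction, mask); }
```
╠══------------------------------------------------------------------------------══╣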
║                                                                                  ║
║ And then reference it in a syscall to sigaction:                                 ║
╠══------------------------------------------------------------------------------══╣
║ .intel_syntax noprefix                                                           ║
║ .global _start                                                                   ║
║                                                                                  ║
║ _start:                                                                          ║
║       mov rax, 0xd            # syscall for sigaction                            ║
║       mov rdi, 0xb            # signum for SigSegV                               ║
║       lea rsi, [rip + _sigaction]       # pointer to sigaction struct            ║
║       xor rdx, rdx            # pointer to old sigaction struct                  ║
║       syscall                                                                    ║
╠══------------------------------------------------------------------------------══╣
║                                                                                  ║
║ Once we smush all of that together, we get the following:                        ║
╠══------------------------------------------------------------------------------══╣
║ .intel_syntax noprefix                                                           ║
║ .global _start                                                                   ║
║ .global _sig_handler                                                             ║
║                                                                                  ║
║ .section .data                                                                   ║
║ .align 8                                                                         ║
║ _sigaction:                                                                      ║
║       .quad _sig_handler      # pointer to the handler                           ║
║       .quad 0                 # no flags                                         ║
║       .quad 0                 # no restorer                                      ║
║       .zero 128               # no blockers                                      ║
║                                                                                  ║
║ .section .text                                                                   ║
║ _sig_handler:                                                                    ║
║       mov r12, 0xdead         # test value                                       ║
║ loop:                                                                            ║
║       jmp loop                                                                   ║
║                                                                                  ║
║ _start:                                                                          ║
║       mov rax, 13             # syscall for sigaction                            ║
║       mov rdi, 11             # signum for SigSegV                               ║
║       lea rsi, [rip + _sigaction]       # pointer to sigaction struct            ║
║       xor rdx, rdx            # pointer to old sigaction struct                  ║
║       mov r10, 128            # size of the signal set (sigsetsize)              ║
║       syscall                                                                    ║
║                                                                                  ║
║       mov rax, 0x1            # load test values                                 ║
║       mov rdi, 0x2                                                               ║
║       mov rsi, 0x3                                                               ║
║       mov rdx, 0x4                                                               ║
║       mov r10, 0x5                                                               ║
║       mov r8,  0x6                                                               ║
║       mov r9,  0x7                                                               ║
║                                                                                  ║
║       mov r11, [0]            # cause SigSegV                                    ║
╠══------------------------------------------------------------------------------══╣
║                                                                                  ║
║ Compiling with gcc requires us to first make an object file:                     ║
║                                                                                  ║
║       gcc -c -g -o testprogram.o testprogram.s                                   ║
║                                                                                  ║
║ And then link it afterwards:                                                     ║
║                                                                                  ║
║       gcc -nostdlib -no-pie -o testprogram testprogram.o                         ║
║                                                                                  ║
║ gdb time (make sure to "handle SIGSEGV nostop pass" first!) :)                   ║
║                                                                                  ║
║ Running our testprogram yields a SegFault - that's a good sign, alongside the    ║
║ fact it compiled and linked as expected. This may be the smoothest one of these  ║
║ blogposts has gone (so far..).                                                   ║
║                                                                                  ║
║ If we hit a breakpoint on our _sig_handler, the rsp register should contain a    ║
║ pointer to the top of the signal frame - where our ucontext info is contained.   ║
║ Which, it doesn't. I knew it was too good to be true.                            ║
║                                                                                  ║
║ If we check the value in rax after we try to register a signal handler, we see   ║
║ that the value is -22, which is the EINVAL return. This means our registration   ║
║ is where we are failing, which is why the SegFault just terminates the program!  ║
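║                                                                                  ║
║ If you'd rather not memorise errno numbers, a raw return value like that can be  ║
║ decoded with a few lines of Python. This is just a sketch: it assumes the Linux  ║
║ convention that a failed raw syscall returns -errno (in the range -4095..-1),    ║
║ and decode_ret is a made-up helper name.                                         ║
╠══------------------------------------------------------------------------------══╣
```python
import errno


def decode_ret(rax):
    """Return the errno name for a raw syscall return value, or None on success."""
    # gdb may show rax as an unsigned 64-bit number, so fold it back to signed.
    if rax >= 1 << 63:
        rax -= 1 << 64
    # Assumption: Linux raw syscalls report failure as -errno in -4095..-1.
    if -4095 <= rax < 0:
        return errno.errorcode.get(-rax, f"errno {-rax}")
    return None


print(decode_ret(-22))
```
╠══------------------------------------------------------------------------------══╣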
║                                                                                  ║
║ Turns out, I needed to be more explicit with my handler, and specify the fault   ║
║ that it targets in the arguments - instead of leaving it blank and hoping that   ║
║ applied to all faults. Plus I had to change the test values a little, so that    ║
║ some duplicates that were showing up in GDB no longer appeared. There is now     ║
║ also a "restorer" function, and corresponding arguments.                         ║
║                                                                                  ║
║ This is the working test script, for creating and registering a signal handler:  ║
╠══------------------------------------------------------------------------------══╣
║ .intel_syntax noprefix                                                           ║
║ .global _start                                                                   ║
║ .global _sig_handler                                                             ║
║                                                                                  ║
║ .section .data                                                                   ║
║ .align 8                                                                         ║
║ _sigaction:                                                                      ║
║       .quad _sig_handler      # pointer to the handler                           ║
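║       .quad 0x04000000        # sa_flags: SA_RESTORER                            ║
║       .quad _restorer         # sa_restorer: runs when the handler returns       ║
║       .quad 0                 # sa_mask: block no extra signals                  ║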
║                                                                                  ║
║ .section .text                                                                   ║
║ _sig_handler:                                                                    ║
║       # your sighandler instructions of choice                                   ║
║       ret                                                                        ║
║                                                                                  ║
║ _restorer:                                                                       ║
║       mov rax, 15             # rt_sigreturn                                     ║
║       syscall                                                                    ║
║                                                                                  ║
║ _start:                                                                          ║
║       mov rax, 13             # syscall for rt_sigaction                         ║
║       mov rdi, 11             # signum for SigSegV                               ║
║       lea rsi, [rip + _sigaction]       # pointer to sigaction struct            ║
║       xor rdx, rdx            # pointer to old sigaction struct (NULL)           ║
║       mov r10, 8              # sigsetsize: sa_mask is 8 bytes                   ║
║       syscall                                                                    ║
║                                                                                  ║
║       mov rax, 0x111          # testing values                                   ║
║       mov rbx, 0x222                                                             ║
║       mov rcx, 0x333                                                             ║
║       mov rdx, 0x444                                                             ║
║       mov rsi, 0x555                                                             ║
║       mov rdi, 0x666                                                             ║
║       mov r8,  0x777                                                             ║
║       mov r9,  0x888                                                             ║
║       mov r10, 0x999                                                             ║
║       mov r11, 0xaaa                                                             ║
║       mov r12, 0xbbb                                                             ║
║       mov r13, 0xccc                                                             ║
║       mov r14, 0xddd                                                             ║
║       mov r15, 0xeee                                                             ║
║                                                                                  ║
║       mov r11, [0]                # cause SigSegV                                ║
╠══------------------------------------------------------------------------------══╣
║                                                                                  ║
║ So - if we break at the signal handler and inspect the first 32 QWORDs on the    ║
║ stack, this should be where the ucontext is stored (x/32gx $rsp):                ║
║                                                                                  ║
╠══------------------------------------------------------------------------------══╣
║ 0x7fffffffd8f8: 0x40100f |_restorer|    0x7                                      ║
║ 0x7fffffffd908: 0x0                     0x0                                      ║
║ 0x7fffffffd918: 0x2                     0x0                                      ║
║ 0x7fffffffd928: 0x777                   0x888                                    ║
║ 0x7fffffffd938: 0x999                   0xaaa                                    ║
║ 0x7fffffffd948: 0xbbb                   0xccc                                    ║
║ 0x7fffffffd958: 0xddd                   0xeee                                    ║
║ 0x7fffffffd968: 0x666                   0x555                                    ║
║ 0x7fffffffd978: 0x0                     0x222                                    ║
║ 0x7fffffffd988: 0x444                   0x111                                    ║
║ 0x7fffffffd998: 0x333                   0x7fffffffdea0                           ║
║ 0x7fffffffd9a8: 0x40109b |_start+131|   0x10246                                  ║
║ 0x7fffffffd9b8: 0x2b000000000033        0x4                                      ║
║ 0x7fffffffd9c8: 0xe                     0x0                                      ║
║ 0x7fffffffd9d8: 0x0                     0x7fffffffdac0                           ║
║ 0x7fffffffd9e8: 0x0                     0x0                                      ║
╠══------------------------------------------------------------------------------══╣
║                                                                                  ║
║ Our "test" values appear a bit scattered around the place. By counting the       ║
║ offsets from the stack pointer for each one, we can reconstruct what the         ║
║ ucontext structure looks like:                                                   ║
║                                                                                  ║
║ QWORD index from rsp ---┐  (multiply by 8 for bytes; 000 not shown)              ║
║                         ↓                                                        ║
║                                                                                  ║
║ 001|002|003|004|005|006|007|008|009|010|011|012|013|014|015|016|017|018|019|020| ║
║                      |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   ║
║                      |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   ║
║                    |r8 |r9 |r10|r11|r12|r13|r14|r15|rdi|rsi|   |rbx|rdx|rax|rcx| ║
║                                                                                  ║
║                                                      ↑                           ║
║                     Corresponding stored registers --┘                           ║
║                                                                                  ║
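║ The same counting can be written down as a tiny Python lookup (a sketch, not     ║
║ part of the program). The register order and the starting slot come from the     ║
║ dump above; the rbp, rsp and rip slots are assumed from Linux's sigcontext       ║
║ ordering (they had no test values), and saved_offset is a made-up name.          ║
╠══------------------------------------------------------------------------------══╣
```python
# Saved registers in the order they sit in the signal frame.
# rbp, rsp and rip are assumptions taken from Linux's sigcontext ordering.
SAVED_ORDER = ["r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15",
               "rdi", "rsi", "rbp", "rbx", "rdx", "rax", "rcx", "rsp", "rip"]

FIRST_SLOT = 6  # r8 showed up at rsp + 6*8 in the dump


def saved_offset(reg):
    """Byte offset from rsp (on handler entry) to the saved copy of reg."""
    return (FIRST_SLOT + SAVED_ORDER.index(reg)) * 8


for reg in ("rax", "rbx", "rcx"):
    print(f"{reg}: rsp + {saved_offset(reg) // 8}*8")
```
╠══------------------------------------------------------------------------------══╣
║                                                                                  ║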
║ This allows us to pass information from the faulting program (our xor script)    ║
║ directly to the signal handler. For instance, let's say we wanted to invoke any  ║
║ given systemcall - we could load the arguments for that into rax, rbx and rcx    ║
║ and then cause a fault. The signal handler then could read the areas from the    ║
║ ucontext structure corresponding to the saved registers (rsp+19*8 for rax,       ║
║ rsp+17*8 for rbx and rsp+20*8 for rcx) and then execute the syscall. Importantly ║
║ it also allows our handler to communicate with the faulting script, as the       ║
║ ucontext structure is used to restore registers when passing back execution      ║
║ flow.                                                                            ║
║                                                                                  ║
║ Let's test this idea, by creating a signal handler that writes "0xdead" to       ║
║ rsp+19*8 (rax) and inspecting rax once code flow returns to the faulting         ║
║ program. We can do this by adding:                                               ║
║                                                                                  ║
║       mov qword ptr [rsp + 19*8], 0xdead                                         ║
║                                                                                  ║
║ To the signal handler code.                                                      ║
║                                                                                  ║
║ Once in GDB, we set a breakpoint at the signal handler, and when it hits we can  ║
║ step through the instructions until the flow returns to our faulting program.    ║
║ At this point, inspecting the registers shows that rax contains 0xdead!          ║
║                                                                                  ║
╠══════════════════════════════════════════════════════════════════════════════════╣
╠══-----==[ 3 ]==----------------------------------------------------------------══╣
║                                                                                  ║
║ Now that I (partially) understand how the handler works, let's try to make one   ║
║ that does our syscalls for us.                                                   ║
║                                                                                  ║
║ There is one big problem in our way, which is that the signal handler's job is   ║
║ to actually handle the faults - we currently do no such thing. When our program  ║
║ hits a fault, and the signal handler is done, we are just dumped back into our   ║
║ program at the very same faulting instruction, where we fault all over again.    ║
║                                                                                  ║
║ Fair warning, if you tell your GDB to ignore SegFaults, and register a           ║
║ _sig_handler that doesn't do anything, it may spam error messages until it       ║
║ crashes your system (it did to mine...)                                          ║
║                                                                                  ║
║ How does the kernel know where to put us? Turns out, it uses ucontext again:     ║
║ the restore syscall (rt_sigreturn) reads the saved instruction pointer (rip)     ║
║ from it, so we get plopped out at the same place in the instruction flow.        ║
║                                                                                  ║
║ After some searching, I found where rip is stored in ucontext (rsp + 22*8).      ║
║ Thankfully, although we cannot write to rip directly, nothing stops us from      ║
║ writing to the copy saved in ucontext!                                           ║
║                                                                                  ║
║ In our example script, the faulting instruction:                                 ║
║                                                                                  ║
║       mov r11, [0]                                                               ║
║                                                                                  ║
║ is encoded with the following bytecode:                                          ║
║                                                                                  ║
║       0x4c    0x8b    0x1c    0x25    0x00    0x00    0x00    0x00               ║
║                                                                                  ║
║ So, in theory, if we know that rip points to the start of this instruction, then ║
║ rip + 8 would point to the instruction directly afterwards, allowing the flow of ║
║ execution to continue past it.                                                   ║
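║                                                                                  ║
║ As a sanity check on that 8, here is the arithmetic as a Python sketch. The      ║
║ address is the saved rip from the earlier dump, used purely as sample input,     ║
║ and resume_rip is a made-up name. The +8 only holds for this exact encoding;     ║
║ a different faulting instruction would need its own length.                      ║
╠══------------------------------------------------------------------------------══╣
```python
# Encoding of "mov r11, [0]" as listed above.
FAULTING = bytes([0x4C, 0x8B, 0x1C, 0x25, 0x00, 0x00, 0x00, 0x00])


def resume_rip(saved_rip, encoding=FAULTING):
    """Address of the instruction that follows the faulting one."""
    return saved_rip + len(encoding)


# 0x40109B is only sample input (the saved rip seen in the earlier dump).
print(hex(resume_rip(0x40109B)))
```
╠══------------------------------------------------------------------------------══╣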
║                                                                                  ║
║ Let's make a program to test this assumption:                                    ║
╠══------------------------------------------------------------------------------══╣
║ .intel_syntax noprefix                                                           ║
║ .global _start                                                                   ║
║ .global _sig_handler                                                             ║
║                                                                                  ║
║ .section .data                                                                   ║
║ .align 8                                                                         ║
║ _sigaction:                                                                      ║
║       .quad _sig_handler      # pointer to the handler                           ║
║       .quad 0x04000000                                                           ║
║       .quad _restorer                                                            ║
║       .quad 0                                                                    ║
║                                                                                  ║
║ .section .text                                                                   ║
║ _sig_handler:                                                                    ║
║       xor rax, rax            # clear rax                                        ║
║       add rax, [rsp + 22*8]   # rax = stored rip                                 ║
║       add rax, 0x8            # rax = stored rip+8                               ║
║       mov [rsp + 22*8], rax   # move rip+8 to stored rip                         ║
║       ret                                                                        ║
║                                                                                  ║
║ _restorer:                                                                       ║
║       mov rax, 15             # rt_sigreturn                                     ║
║       syscall                                                                    ║
║                                                                                  ║
║ _start:                                                                          ║
║       mov rax, 13             # syscall for sigaction                            ║
║       mov rdi, 11             # signum for SigSegV                               ║
║       lea rsi, [rip + _sigaction]       # pointer to sigaction struct            ║
║       xor rdx, rdx            # pointer to old sigaction struct                  ║
║       mov r10, 8                                                                 ║
║       syscall                                                                    ║
║                                                                                  ║
║                                                                                  ║
║       mov r11, [0]            # cause SigSegV                                    ║
║       mov rax, 0xcaff         # new landing instruction                          ║
╠══------------------------------------------------------------------------------══╣
║                                                                                  ║
║ Once again, we break on _sig_handler and step through until we are put back into ║
║ our _start function, directly at our intended landing instruction. Bingo.        ║
║                                                                                  ║
╠══------------------------------------------------------------------------------══╣
║ Breakpoint 2, _sig_handler () at testing_controlflow.s:15                        ║
║ 15          xor rax, rax                        # clear rax                      ║
║ (gdb) si                                                                         ║
║ 16           add rax, [rsp + 22*8]               # rax = stored rip              ║
║ (gdb) si                                                                         ║
║ 17           add rax, 0x8                        # rax = stored rip+8            ║
║ (gdb) si                                                                         ║
║ 18           mov [rsp + 22*8], rax               # move rip+8 to stored rip      ║
║ (gdb) si                                                                         ║
║ 19           ret                                                                 ║
║ (gdb) si                                                                         ║
║ _restorer () at testing_controlflow.s:22                                         ║
║ 22           mov rax, 15                         # rt_sigreturn                  ║
║ (gdb) si                                                                         ║
║ 23           syscall                                                             ║
║ (gdb) si                                                                         ║
║ _start () at testing_controlflow.s:35                                            ║
║ 35           mov rax, 0xcaff                     # new landing instruction       ║
╠══------------------------------------------------------------------------------══╣
╠══════════════════════════════════════════════════════════════════════════════════╣
╠══-----==[ 4 ]==----------------------------------------------------------------══╣
║                                                                                  ║
║ I think we have all the tools required to write a syscall-less script that       ║
║ invokes one. As the final aim for this blogpost, I'll try to craft a PoC.        ║
║                                                                                  ║
║ Let's start with a basic script idea: print to the screen and then exit. I shall ║
║ leave any conditional statements for later - as they will add a lot of visual    ║
║ clutter without much logical complexity (beyond just combining the methods).     ║
║                                                                                  ║
║ For additional simplicity, I'll use an 8-letter word: TRIANGLE. This way it can  ║
║ fit entirely within one register. The little-endian ASCII encoding for TRIANGLE  ║
║ is:                                                                              ║
║                                                                                  ║
║       0x454C474E41495254                                                         ║
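║                                                                                  ║
║ That constant can be double-checked in Python (a sketch; pack_le is a made-up    ║
║ helper name). The first character ends up in the lowest byte:                    ║
╠══------------------------------------------------------------------------------══╣
```python
def pack_le(text):
    """Pack an ASCII string into one little-endian integer."""
    return int.from_bytes(text.encode("ascii"), "little")


print(f"{pack_le('TRIANGLE'):#018x}")
```
╠══------------------------------------------------------------------------------══╣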
║                                                                                  ║
║ We will be using the write syscall, and so need to populate the following        ║
║ registers with the correct args:                                                 ║
╠══------------------------------------------------------------------------------══╣
║       rax = 1 (syscall for write)                                                ║
║       rdi = 1 (we are writing to stdout)                                         ║
║       rsi = char buffer                                                          ║
║       rdx = len of chars                                                         ║
╠══------------------------------------------------------------------------------══╣
║                                                                                  ║
║ And for the exit "syscall", we will just use the SigIll instruction from before. ║
║                                                                                  ║
║ Okay, this took way longer than I thought it would. Here I go again              ║
║ underestimating the complexity of "small" tasks. Some fun excerpts include:      ║
╠══------------------------------------------------------------------------------══╣
║       forgetting that xor cannot take imm 64 values - so only 4 characters max   ║
║                                                                                  ║
║       having to spread TRIANGLE over two registers instead                       ║
║                                                                                  ║
║       issues with getting a pointer to our string using xor only                 ║
║                                                                                  ║
║       trying to put the pointer logic inside the _sig_handler instead            ║
║                                                                                  ║
║       resorting to using the stack, and stack pointers                           ║
║                                                                                  ║
║       forgetting that using the stack will offset the whole ucontext struct      ║
║                                                                                  ║
║       having to offset the ucontext relative addressing by 8                     ║
║                                                                                  ║
║       reordering the _sig_handler so I don't have to offset the addressing       ║
║                                                                                  ║
║       problems with the length arg, so offloading that to the _sig_handler too   ║
║                                                                                  ║
║       found out the issue was my addressing - so moved it back to _start         ║
║                                                                                  ║
║       cleaning up the stack to ensure the ucontext rip is found correctly        ║
╠══------------------------------------------------------------------------------══╣
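║                                                                                  ║
║ The first two hiccups boil down to the following, sketched in Python. xor only   ║
║ takes a 32-bit immediate, which is sign-extended to 64 bits; ASCII chunks stay   ║
║ below 0x80000000, so the upper half stays zero. imm32_chunks is a made-up name.  ║
╠══------------------------------------------------------------------------------══╣
```python
def imm32_chunks(text):
    """Split an ASCII string into little-endian 4-byte values, one per xor."""
    data = text.encode("ascii")
    # ASCII bytes are all below 0x80, so the top bit of every chunk is clear
    # and sign-extending the immediate leaves the upper 32 bits at zero.
    return [int.from_bytes(data[i:i + 4], "little")
            for i in range(0, len(data), 4)]


print([hex(value) for value in imm32_chunks("TRIANGLE")])
```
╠══------------------------------------------------------------------------------══╣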
║                                                                                  ║
║ All that being said, I did manage to get a working program that I am happy with. ║
║ Looking at this stuff really brings into focus how much better more modern code  ║
║ conveys dense information; assembly really just starts to resemble grains of     ║
║ sand once you stare at it for too long. As such, I've included a "high" level    ║
║ abstraction of the program below:                                                ║
║                                                                                  ║
╠══------------------------------------------------------------------------------══╣
║ .PREFIX_STUFF                                                                    ║
║                                                                                  ║
║ _SIGACTION_STRUCT:                                                               ║
║       struct stuff                                                               ║
║                                                                                  ║
║ _SIGNAL_HANDLER:                                                                 ║
║                                                                                  ║
║       move saved rax -> rax                                                      ║
║       move saved rdi -> rdi                                                      ║
║       move saved rdx -> rdx                                                      ║
║       move saved r12 -> r12                                                      ║
║                                                                                  ║
║       reserve 8 bytes on the stack                                               ║
║       put r12 onto the stack                                                     ║
║       move pointer to r12 -> rsi                                                 ║
║                                                                                  ║
║       syscall                                                                    ║
║                                                                                  ║
║       clean up the stack                                                         ║
║                                                                                  ║
║       move saved rip -> rax                                                      ║
║       add 8 to rax                                                               ║
║       move rax -> saved rip location                                             ║
║                                                                                  ║
║       ret                                                                        ║
║                                                                                  ║
║ _RESTORER_FUNCTION:                                                              ║
║       restorer syscall                                                           ║
║                                                                                  ║
║ _START:                                                                          ║
║                                                                                  ║
║       setup signal handler                                                       ║
║                                                                                  ║
║       store TRIA in r12                                                          ║
║       move 1 -> rax                   # 1 = syscall for write                    ║
║       move 1 -> rdi                   # 1 = arg for "to console"                 ║
║       move len of TRIA -> rdx                                                    ║
║                                                                                  ║
║       SegFault                                                                   ║
║                                                                                  ║
║       store NGLE in r12               # we reuse args here                       ║
║                                                                                  ║
║       SegFault                                                                   ║
║                                                                                  ║
║       SigIll                          # illegal instruction exit                 ║
╠══------------------------------------------------------------------------------══╣
╠══════════════════════════════════════════════════════════════════════════════════╣
╠══-----==[ 5 ]==----------------------------------------------------------------══╣
║                                                                                  ║
║ While this is entirely "functional" (for our purposes), there are still some     ║
║ small tradeoffs this approach forces us to deal with. Although the logic in      ║
║ _start is now free of non-xor instructions, we still have to include at least    ║
║ three syscalls: one in the code that sets up the _sig_handler, one in the        ║
║ _sig_handler itself and another in the _restorer.                                ║
║                                                                                  ║
║ In writing the asm for this, I defaulted to my usual (and unforgivable) habit of ║
║ using mov instructions. Going through and replacing them was not too tricky,     ║
║ only requiring me to remember that xor cannot take two memory operands at once.  ║
║                                                                                  ║
║ The final non-xor instruction count comes to 8 (which I am happy with):          ║
╠══------------------------------------------------------------------------------══╣
║       1x sub                                                                     ║
║       1x ret                                                                     ║
║       1x lea                                                                     ║
║       2x add                                                                     ║
║       3x syscall                                                                 ║
╠══------------------------------------------------------------------------------══╣
║                                                                                  ║
║ The logical next step would be to combine this functionality with our            ║
║ non-branching conditional logic from previous blogs to make a more interesting   ║
║ program. I'm undecided on what that should be - a full calculator seems a little ║
║ tricky, but interesting, perhaps a simple text adventure?                        ║
║                                                                                  ║
║ Feel free to ping me some ideas, though I reserve the right to refuse!           ║
║                                                                                  ║
╚══════════════════════════════════════════════════════════════════════════════════╝
╔══════════════════════════════════════════════════════════════════════════════════╗
╠══-----==[ 6 ]==----------------------------------------------------------------══╣
║                                                                                  ║
║ Of those who have been sufficiently interested to read thus far, there may be a  ║
║ further subset who would be curious to see the full code for this program,       ║
║ so I'll put that here:                                                           ║
║                                                                                  ║
║ (Please excuse my comments, I appear to not be able to follow a rubric, and      ║
║ each time I program they turn out different, nonetheless hopefully they help     ║
║ explain what each part does)                                                     ║
╠══------------------------------------------------------------------------------══╣
║ .intel_syntax noprefix                                                           ║
║ .global _start                                                                   ║
║ .global _sig_handler                                                             ║
║                                                                                  ║
║ .section .data                                                                   ║
║ .align 8                                                                         ║
║ _sigaction:                                                                      ║
║       .quad _sig_handler                  # pointer to the handler               ║
║       .quad 0x04000000                                                           ║
║       .quad _restorer                                                            ║
║       .quad 0                                                                    ║
║                                                                                  ║
║ .section .text                                                                   ║
║ _sig_handler:                                                                    ║
║                                           # [rsp]=ret addr, ucontext at rsp+8    ║
║       xor rax, rax                                                               ║
║       xor rax, [rsp + 19*8]               # move saved rax to rax                ║
║       xor rdi, rdi                                                               ║
║       xor rdi, [rsp + 14*8]               # move saved rdi to rdi                ║
║       xor rdx, rdx                                                               ║
║       xor rdx, [rsp + 18*8]               # move saved rdx to rdx                ║
║       xor r12, r12                                                               ║
║       xor r12, [rsp + 10*8]               # move saved r12 to r12                ║
║                                                                                  ║
║       sub rsp, 8                          # reserve some stack space for chars   ║
║                                                                                  ║
║       xor r11, r11                                                               ║
║       xor r11, [rsp]                                                             ║
║       xor [rsp], r11                      # zeroing out *rsp                     ║
║                                                                                  ║
║       xor qword ptr [rsp], r12            # store r12's chars on the stack       ║
║       xor rsi, rsi                                                               ║
║       xor rsi, rsp                        # rsi = pointer to those chars         ║
║                                           # note: ucontext offsets now off by 8  ║
║       syscall                             # execute the syscall                  ║
║                                                                                  ║
║       add rsp, 0x8                        # clean stack                          ║
║                                                                                  ║
║       xor rax, rax                        # clear rax                            ║
║       xor rax, [rsp + 22*8]               # rax = stored rip                     ║
║       add rax, 0x8                        # rax = rip+8, skips the 8-byte xor    ║
║                                                                                  ║
║       xor r11, r11                                                               ║
║       xor r11, [rsp + 22*8]                                                      ║
║       xor [rsp + 22*8], r11               # zeroing out *[rsp + 22*8]            ║
║                                                                                  ║
║       xor [rsp + 22*8], rax               # move rip+8 to stored rip             ║
║                                                                                  ║
║       ret                                 # returns into _restorer               ║
║                                                                                  ║
║ _restorer:                                                                       ║
║       xor rax, rax                                                               ║
║       xor rax, 15                         # rt_sigreturn                         ║
║       syscall                                                                    ║
║                                                                                  ║
║ _start:                                                                          ║
║       xor rax, rax                                                               ║
║       xor rax, 13                         # syscall number for rt_sigaction      ║
║       xor rdi, rdi                                                               ║
║       xor rdi, 11                         # signum for SigSegV                   ║
║       lea rsi, [rip + _sigaction]         # pointer to sigaction struct          ║
║       xor rdx, rdx                        # oldact pointer = NULL (unused)       ║
║       xor r10, r10                                                               ║
║       xor r10, 8                          # sigsetsize = 8 bytes                 ║
║       syscall                                                                    ║
║                                                                                  ║
║       xor r12, r12                                                               ║
║       xor r12, 0x41495254                 # r12 = TRIA                           ║
║       xor rax, rax                                                               ║
║       xor rax, 0x1                        # rax = syscall for write              ║
║       xor rdi, rdi                                                               ║
║       xor rdi, 0x1                        # rdi = output, stdout                 ║
║                                           # rsi (buffer) is set by sig_handler   ║
║       xor rdx, rdx                                                               ║
║       xor rdx, 0x4                        # len of TRIA (4 bytes)                ║
║                                                                                  ║
║       xor r11, [0]                        # cause SigSegV                        ║
║                                                                                  ║
║       xor r12, r12                                                               ║
║       xor r12, 0x454C474E                 # r12 = NGLE                           ║
║                                                                                  ║
║       xor r11, [0]                        # cause SigSegV                        ║
║                                                                                  ║
║       xor r12, r12                                                               ║
║       xor r12, 0x0a                       # newline char                         ║
║       xor rdx, rdx                                                               ║
║       xor rdx, 0x1                        # length arg = 1                       ║
║                                                                                  ║
║       xor r11, [0]                        # cause SigSegV                        ║
║                                           # reuse other register values          ║
║                                                                                  ║
║       .byte 0xF0, 0x48, 0x31, 0xD8        # illegal instruction exit             ║
╠══------------------------------------------------------------------------------══╣
║                                                                                  ║
║ Until next time, CWW out                                                         ║
╚══════════════════════════════════════════════════════════════════════════════════╝