Windows x64 Shellcode
Recently I have been rewriting several pieces of shellcode that I have implemented for x86 Windows into x64 and have had a hard time finding resources online that aided in my endeavors. I wanted to write a blog post (my first one) in order to hopefully help someone that is or will be in the position that I was in while trying to port over shellcode.
There are already several tutorials on the internet that help with learning shellcode, and I am not going to go over that material. I am not going to touch much on the basics of assembly, although I will talk about registers, register clobbering and calling conventions.
Refer to papers such as Skape’s Understanding Windows Shellcode, or resources like project-shellcode, for in-depth shellcode writing tutorials.
I will go over the differences between 32-bit and 64-bit assembly that I have noticed and how to work with them, as well as some of the structures Windows uses that are useful to know about for shellcode in the 64-bit environment. I will also introduce two tools that I have created to help my exploit development process.
Lastly, before I get started I want to mention that I am still in the fairly early stages of exploit development, and for the purpose of this tutorial I am only going to target Windows 7 x64 machines. I am also going to use the term Win32 to refer to x86 Windows builds and Win64 to refer to x64 builds.
Registers
x86
Normally on an x86 processor, there are 8 general purpose registers that are all 32 bits wide.
- eax - Accumulator Register
- ecx - Counter Register
- edx - Data Register
- ebx - Base Register
- esp - Stack Pointer
- ebp - Base Pointer
- esi - Source Index
- edi - Destination Index
and the instruction pointer . . .
- eip - Instruction Pointer
For backwards compatibility reasons, 4 of those registers (eax, ebx, ecx and edx) can be broken down into 16-bit and 8-bit varieties.
- AX - Low 16 bits of EAX.
- AH - High 8 bits of AX.
- AL - Low 8 bits of AX.
- BX - Low 16 bits of EBX.
- BH - High 8 bits of BX.
- BL - Low 8 bits of BX.
The same goes for ECX and EDX: take the middle letter (C or D) and suffix it with X, H or L.
x64
64-bit processors extend the above 8 registers to 64 bits; the wide forms are named by replacing the leading “E” with an “R”.
RAX, RCX, RDX.. etc. It is important to note that all other addressing forms are still the same (eax, ax, al… can still be used).
Also introduced are 8 new registers: r8, r9, r10, r11, r12, r13, r14 and r15. These registers can also be broken down into 32, 16 and 8 bit versions.
- r# = 64 bit
- r#d = low 32 bits
- r#w = low 16 bits
- r#b = low 8 bits
Unfortunately, unlike registers such as eax, where the high 8 bits of the low 16 bits can be addressed (ah), this is not possible with these extended registers.
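To make the naming scheme concrete, here is a minimal Python sketch that models the different views of one 64-bit register with bit masks. The function name subregisters and the sample value are my own picks for illustration; nothing here touches real registers.

```python
def subregisters(value64):
    """Return the views of a single 64-bit general purpose register.

    Hypothetical helper for illustration: it only applies bit masks.
    """
    value64 &= 0xFFFFFFFFFFFFFFFF
    return {
        "r64": value64,                  # rax / r8
        "r32": value64 & 0xFFFFFFFF,     # eax / r8d
        "r16": value64 & 0xFFFF,         # ax  / r8w
        "low8": value64 & 0xFF,          # al  / r8b
        # ah, bh, ch and dh only; r8 through r15 have no such view
        "high8": (value64 >> 8) & 0xFF,
    }


views = subregisters(0x1122334455667788)
for name, value in views.items():
    print(f"{name:>5} = {value:#x}")
```

The "high8" entry is the one that has no equivalent on r8 through r15.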
Clobber Registers
Clobber registers are registers that can be overwritten in a function (such as those in the Windows API). These registers are volatile and should not be relied on to hold a value across a call, although they can still be used if the API function of interest is tested to see which registers are actually clobbered.
In the Win32 API . . . EAX, ECX and EDX are clobber registers. In the Win64 API . . . RBP, RBX, RDI, RSI, R12, R13, R14 and R15 are not clobber registers, all others are.
EAX (x86) and RAX (x64) are used to return a value from a function.
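While writing shellcode I find it handy to have the Win64 split written down as a lookup. The following is only an illustrative Python sketch: the names WIN64_NONVOLATILE, WIN64_VOLATILE and survives_call are made up for this post, and rsp is left out on purpose because it is the stack pointer rather than a scratch register.

```python
# Registers a Win64 API call is expected to preserve (from the list above).
WIN64_NONVOLATILE = frozenset(
    {"rbp", "rbx", "rdi", "rsi", "r12", "r13", "r14", "r15"}
)
# The remaining general purpose registers, which a call may overwrite.
WIN64_VOLATILE = frozenset({"rax", "rcx", "rdx", "r8", "r9", "r10", "r11"})


def survives_call(register):
    """Return True if the register should still hold its value after a call."""
    name = register.lower()
    if name in WIN64_NONVOLATILE:
        return True
    if name in WIN64_VOLATILE:
        return False
    raise ValueError(f"not a general purpose register in this sketch: {register}")


for reg in ("r12", "r15", "rax", "r11"):
    print(reg, "is safe across a call" if survives_call(reg) else "may be clobbered")
```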
Calling Convention
x86
Win32 uses the stdcall calling convention and passes arguments on the stack in reverse order, so the last argument is pushed first.
A call to a function foo with the arguments int x and int y
would need to be passed on the stack as such
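As a rough model of that ordering, the Python sketch below lists the order in which the caller pushes the arguments of foo(x, y) and where each one ends up relative to esp just before the call instruction executes. The helper names are hypothetical and 4-byte stack slots are assumed.

```python
def stdcall_push_order(args):
    """Order in which the caller pushes the arguments: last argument first."""
    return list(reversed(args))


def stack_before_call(args, slot_size=4):
    """(offset from esp, argument) pairs once every argument has been pushed.

    The stack grows downwards, so the argument pushed last (the first one
    in the prototype) ends up at the lowest address, [esp+0].
    """
    return [(index * slot_size, arg) for index, arg in enumerate(args)]


print("push order:", stdcall_push_order(["x", "y"]))
for offset, arg in stack_before_call(["x", "y"]):
    print(f"[esp+{offset}] = {arg}")
```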
x64
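In Win64 the calling convention is different and is similar to the Win32 fastcall convention in that arguments are passed in registers. The first four arguments are passed in RCX, RDX, R8 and R9 respectively, with additional arguments stored on the stack. Keep in mind that the registers are assigned in prototype order, left to right: the first (leftmost) parameter goes in RCX and the fourth goes in R9. The caller must also reserve 32 bytes of stack space (the shadow space) for those four register arguments and keep RSP 16-byte aligned at the call.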
A call to the MessageBox function in the Windows API for example is declared as follows:
In the Win64 convention the arguments would be:
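The mapping can be sketched in a few lines of Python. This is a simplified model under my own assumptions: it covers integer and pointer arguments only (floating point arguments use XMM registers instead), and the helper name assign_win64_args is made up for illustration. The parameter names are the ones from the documented MessageBox prototype (hWnd, lpText, lpCaption, uType).

```python
WIN64_ARG_REGISTERS = ("rcx", "rdx", "r8", "r9")


def assign_win64_args(params):
    """Split parameters into register arguments and stack arguments.

    The first four parameters, read left to right, go in rcx, rdx, r8, r9.
    Anything after that is passed on the stack.
    """
    in_registers = dict(zip(WIN64_ARG_REGISTERS, params))
    on_stack = list(params[len(WIN64_ARG_REGISTERS):])
    return in_registers, on_stack


registers, stack = assign_win64_args(["hWnd", "lpText", "lpCaption", "uType"])
for reg, param in registers.items():
    print(f"{reg} = {param}")
print("stack arguments:", stack)
```

MessageBox has exactly four parameters, so nothing is left over for the stack.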
Shellcode
Let’s Get Started
Now that the key differences have been established for Win64 shellcode, let’s write something!
In order to demonstrate the ability to run Win64 shellcode, I am going to pop a MessageBox. Once I have the code base written to display a MessageBox, I will inject the code into calc with a tool I wrote to ensure that it works within another process.
Notes:
I am using NASM for my assembler. Also, for linking Win64 object files I am using golink, written by Jeremy Gordon.
Open your favorite text editor, mine is Notepad++ for Windows, and start typing!
Starting
1.) Declare the NASM directives.
2.) Set up the stack
3.) Let’s get the base address of Kernel32.dll.
In order to do this, a difference in the location of the PEB must be discussed.
In Win32, a pointer to the PEB lives at [fs:30h], whereas in Win64 it is at [gs:60h].
While the PEB struct has changed dramatically, we only care about the LDR list, which can be seen by using the “!peb” command in Windbg.
Notice how in the Windbg output of the PEB, the Ldr.InMemoryOrderModuleList contained kernel32.dll and it was the third entry. This list shows where PE files are in memory (consisting of both executables and dynamically linked libraries).
By displaying the PEB structure in Windbg, the location of the Ldr list is determined.
Ldr is at offset 0x18 in the PEB.
So far we know that we need to
1.) Go to the PEB by accessing [gs:60h]
2.) Go to the LDR list by going to offset 0x18 in the PEB.
Going further into the LDR struct, we need to access the InMemoryOrderModuleList. This is at offset 0x20 in the LDR struct, as shown in the output below.
3.) At offset 0x20 is the InMemoryOrderModuleList.
From the figure that showed the InMemoryOrderModuleList, we can see that Kernel32.dll is the 3rd entry. It is useful to know how the _LIST_ENTRY struct works so that the base address of Kernel32 can be determined.
Each entry contains a forward and a backward pointer, and the list is circular: the last entry points back to the list head.
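To picture how such a list behaves, here is a small Python model of a circular doubly linked list. It is only a sketch: the class and function names are mine, "image.exe" is a placeholder for whatever executable owns the process, and real _LIST_ENTRY nodes hold raw pointers rather than Python objects.

```python
class ListEntry:
    """Toy stand-in for _LIST_ENTRY: one forward and one backward link."""

    def __init__(self, name):
        self.name = name
        self.flink = self  # an empty list head points at itself
        self.blink = self


def insert_tail(head, entry):
    """Link a new entry in just before the head, keeping the list circular."""
    last = head.blink
    last.flink = entry
    entry.blink = last
    entry.flink = head
    head.blink = entry


def walk(head):
    """Follow flink from the head until the traversal arrives back at it."""
    names = []
    node = head.flink
    while node is not head:
        names.append(node.name)
        node = node.flink
    return names


head = ListEntry("<list head>")
for module in ("image.exe", "ntdll.dll", "kernel32.dll"):
    insert_tail(head, ListEntry(module))

print(walk(head))
```

The stop condition matters: because the list is circular, a traversal that does not compare against the head never terminates.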
In Windbg, !list allows the traversal of these lists. With !list, -x can be used to give a command to run for each element located. Let’s use that to go to offset 0x20 in the _PEB_LDR_DATA struct and parse through the _LIST_ENTRY elements.
This will list every entry in the InMemoryOrderModuleList and display the related _LDR_DATA_TABLE_ENTRY.
Note that in this struct, InLoadOrderLinks points to the next element, DllBase is the base address of the module and FullDllName is the Unicode string of it.
Because we know Kernel32.dll is the 3rd entry in this list, let’s go to it.
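We now know that the base address of a loaded module is at offset 0x30 in this struct. Keep in mind that the pointers in the InMemoryOrderModuleList point at the InMemoryOrderLinks field, which sits at offset 0x10 of the _LDR_DATA_TABLE_ENTRY, so relative to a pointer taken from that list DllBase is 0x20 bytes further on.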
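The offsets quoted so far (0x18, 0x20 and 0x30) can be double checked without a debugger by describing the start of each struct with Python's ctypes module. This is a sketch under assumptions: only the leading fields are declared, the class names with a _PREFIX suffix are mine, pointers are modelled as 64-bit integers, and it expects a 64-bit host where 8-byte fields are 8-byte aligned.

```python
import ctypes


class LIST_ENTRY(ctypes.Structure):
    _fields_ = [("Flink", ctypes.c_uint64), ("Blink", ctypes.c_uint64)]


class UNICODE_STRING(ctypes.Structure):
    _fields_ = [("Length", ctypes.c_uint16),
                ("MaximumLength", ctypes.c_uint16),
                ("Buffer", ctypes.c_uint64)]


class PEB_PREFIX(ctypes.Structure):
    # Leading fields of the x64 _PEB only.
    _fields_ = [("InheritedAddressSpace", ctypes.c_uint8),
                ("ReadImageFileExecOptions", ctypes.c_uint8),
                ("BeingDebugged", ctypes.c_uint8),
                ("BitField", ctypes.c_uint8),
                ("Mutant", ctypes.c_uint64),
                ("ImageBaseAddress", ctypes.c_uint64),
                ("Ldr", ctypes.c_uint64)]


class PEB_LDR_DATA_PREFIX(ctypes.Structure):
    _fields_ = [("Length", ctypes.c_uint32),
                ("Initialized", ctypes.c_uint8),
                ("SsHandle", ctypes.c_uint64),
                ("InLoadOrderModuleList", LIST_ENTRY),
                ("InMemoryOrderModuleList", LIST_ENTRY),
                ("InInitializationOrderModuleList", LIST_ENTRY)]


class LDR_DATA_TABLE_ENTRY_PREFIX(ctypes.Structure):
    _fields_ = [("InLoadOrderLinks", LIST_ENTRY),
                ("InMemoryOrderLinks", LIST_ENTRY),
                ("InInitializationOrderLinks", LIST_ENTRY),
                ("DllBase", ctypes.c_uint64),
                ("EntryPoint", ctypes.c_uint64),
                ("SizeOfImage", ctypes.c_uint32),
                ("FullDllName", UNICODE_STRING),
                ("BaseDllName", UNICODE_STRING)]


print("PEB.Ldr                      ", hex(PEB_PREFIX.Ldr.offset))
print("Ldr.InMemoryOrderModuleList  ",
      hex(PEB_LDR_DATA_PREFIX.InMemoryOrderModuleList.offset))
print("Entry.InMemoryOrderLinks     ",
      hex(LDR_DATA_TABLE_ENTRY_PREFIX.InMemoryOrderLinks.offset))
print("Entry.DllBase                ",
      hex(LDR_DATA_TABLE_ENTRY_PREFIX.DllBase.offset))
```

The difference between the last two offsets is the distance from an InMemoryOrderModuleList pointer to DllBase.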
So far we know that we need to
1.) Go to the PEB by accessing [gs:60h]
2.) Go to the LDR list by going to offset 0x18 in the PEB.
3.) At offset 0x20 is the InMemoryOrderModuleList.
4.) The 3rd element in the InMemoryOrderModuleList is Kernel32, and offset 0x30 of its _LDR_DATA_TABLE_ENTRY is the base address of the module.
5.) We are going to want to call ExitProcess, which is actually RtlExitUserProcess from ntdll.dll… Ntdll.dll is the 2nd entry in the InMemoryOrderModuleList and I will also grab the base address of it and store it in r15 for later use. I find this method easier and more reliable than relying on Kernel32 to properly execute a function in ntdll.
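Before turning the steps above into assembly, it can help to dry run the pointer chase against a fake memory image. The Python sketch below does that with a dictionary standing in for memory. Every address and base value in it is invented for the example, and find_module_bases is a hypothetical helper, not part of any tool mentioned in this post. Note that the list pointers land on InMemoryOrderLinks (offset 0x10 of each entry), so DllBase (offset 0x30) is read at +0x20 from the list pointer.

```python
PEB_LDR_OFFSET = 0x18                # _PEB.Ldr
LDR_IN_MEMORY_ORDER_OFFSET = 0x20    # _PEB_LDR_DATA.InMemoryOrderModuleList
DLLBASE_FROM_LIST_POINTER = 0x20     # 0x30 (DllBase) - 0x10 (InMemoryOrderLinks)


def find_module_bases(read_qword, peb):
    """Return (ntdll base, kernel32 base) by following steps 1.) to 5.)."""
    ldr = read_qword(peb + PEB_LDR_OFFSET)
    first = read_qword(ldr + LDR_IN_MEMORY_ORDER_OFFSET)  # 1st entry: the executable
    second = read_qword(first)                            # 2nd entry: ntdll.dll
    third = read_qword(second)                            # 3rd entry: kernel32.dll
    return (read_qword(second + DLLBASE_FROM_LIST_POINTER),
            read_qword(third + DLLBASE_FROM_LIST_POINTER))


# Invented addresses for the fake memory image.
PEB, LDR = 0x7000, 0x8000
EXE_ENTRY, NTDLL_ENTRY, K32_ENTRY = 0x9000, 0xA000, 0xB000
EXE_BASE, NTDLL_BASE, KERNEL32_BASE = 0x400000, 0x77000000, 0x76000000

memory = {
    PEB + 0x18: LDR,
    LDR + 0x20: EXE_ENTRY + 0x10,          # list head Flink
    EXE_ENTRY + 0x10: NTDLL_ENTRY + 0x10,  # InMemoryOrderLinks.Flink
    NTDLL_ENTRY + 0x10: K32_ENTRY + 0x10,
    K32_ENTRY + 0x10: LDR + 0x20,          # back to the head (circular)
    EXE_ENTRY + 0x30: EXE_BASE,            # DllBase fields
    NTDLL_ENTRY + 0x30: NTDLL_BASE,
    K32_ENTRY + 0x30: KERNEL32_BASE,
}

ntdll, kernel32 = find_module_bases(memory.__getitem__, PEB)
print(f"ntdll    = {ntdll:#x}")
print(f"kernel32 = {kernel32:#x}")
```

Each read_qword call corresponds to one memory load in the eventual assembly.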
Output from dependency walker showing that ExitProcess simply points to Ntdll.RtlExitUserProcess.
Now to assembly!
Notice I put Kernel32 into r12, which is not a clobber register! This address needs to be kept for the duration of the execution of the shellcode.
Now that Kernel32 is found, it can be used to load other libraries into our process and get the addresses of functions.
LoadLibraryA will be used to load a library into our process because we cannot rely on any DLL already being loaded in the target process, and shellcode needs to be position independent. In our case user32.dll is going to get loaded.
In order to use the LoadLibraryA function, it must be found in kernel32.dll. . . this is where GetProcAddress comes in.
This function takes two arguments, the handle to the module that contains the function we want and the function name.
Once we know where LoadLibraryA lives, we can use it to load user32.dll.
The “0xec0e4e8e” number and the following numbers that are moved into rdx before the call to GetProcAddress are hashed forms of function names.
0xec0e4e8e is LoadLibraryA when, for each letter, the running 32-bit sum is rotated right by 13 bits and the letter is added to it. This is common in shellcode that I have examined and is used in projects such as Metasploit. I have written a small C program to perform these hashes for me.
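For readers who want to reproduce the hash values, here is a minimal Python sketch of the rotate-and-add scheme. It is not the C program mentioned above. The function names ror32 and ror13_hash are mine, and it assumes the classic variant that hashes only the ASCII function name, without the terminating null and without the module name.

```python
def ror32(value, count):
    """Rotate a 32-bit value right by count bits."""
    count %= 32
    return ((value >> count) | (value << (32 - count))) & 0xFFFFFFFF


def ror13_hash(name):
    """Rotate the running sum right by 13, then add the next character."""
    result = 0
    for byte in name.encode("ascii"):
        result = (ror32(result, 13) + byte) & 0xFFFFFFFF
    return result


print(hex(ror13_hash("LoadLibraryA")))  # 0xec0e4e8e
```

Comparing 32-bit hashes instead of strings keeps the function names out of the shellcode and makes the lookup loop shorter.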
Now load User32.dll
Now we can get the address of the MessageBox function that was described before.
and call it
and exit the process cleanly with the ExitProcess function.
Note that this is the prototype for the Kernel32 function, but we are going to use RtlExitUserProcess.
The finished shellcode with the GetProcAddress function I keep calling:
Note: I have adjusted all of the “lea” instructions with call/pop implementations for the final form. I simply used “lea” above for demonstration.
For information on the magic of the GetProcAddress function, refer to Skape’s paper.
Now that our shellcode is complete, let’s assemble it and test it.
This ran our shellcode as a binary.. we want to use it as pure shellcode.
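Getting from raw assembled bytes to text that can be pasted around is just a formatting step. The Python sketch below shows two common notations. The helper names are hypothetical, and the three sample bytes (nop, nop, ret) are placeholders rather than the shellcode from this post.

```python
def to_hex_string(blob):
    """Plain hex digits, for example for pasting into a text box."""
    return blob.hex()


def to_c_escaped(blob):
    """C style \\xNN escapes, for embedding in a string literal."""
    return "".join(f"\\x{byte:02x}" for byte in blob)


sample = bytes([0x90, 0x90, 0xC3])  # placeholder bytes: nop, nop, ret
print(to_hex_string(sample))
print(to_c_escaped(sample))
```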
Taking all of the hex bytes returned, let’s go to another little program I wrote because I wanted to be able to fire shellcode against a target, calc, to make sure it would work in a remote process. Please note this application is still more or less in beta form, and I mostly wrote it because I wanted to play around with an open source disassembly project, BeaEngine.
Fire up the application, insert the bytes into the left text box and select the assembly version we are using (x64). Afterwards, hit the disassemble button and the disassembly will appear on the right. I do this to make sure that the assembly is still intact and because I wanted to be able to recover undocumented shellcode that I had (oops).
Afterwards, hit “fire” and the application will run “calc” and inject a thread into it which will run the shellcode.
Success!
EOF
I hope that this blog post has helped in aiding the development of Win64 shellcode… I am just getting started with writing what I have learned in my research and am hopefully going to continue to write/document on my website.
To download the applications I used I have zipped them up here: Resources
Update 3/18/2015: I have open sourced my shellcode Tester and put the repository on my github page here.