[VULNCON 2021] - IPS
Introduction
Last December (which is a month ago), I learnt that there was a Linux kernel CTF challenge, called IPS, unsolved during VULNCON 2021.
At that moment, I was struggling exploiting the first bug that we eventually used to pwn GKE for kCTF and I thought it would be great to solve a CTF challenge and regain some confidence in kernel exploitation.
However, I struggled a little bit again with the CTF challenge (because I was and still am dumb) and finally solved it with an unintended solution.
Even though my solution was much harder than the intended solution, it was the first blood for this challenge.
Challenge
Now let me briefly discribe what kind of challenge it is.
Different from normal Linux kernel challenges where players deal with compiled kernel modules, this challenge implements a new system call ips
and the source code of this system call is given. The source code is quite short, interested readers can find the source code here.
Basically, it implements a service that allows users to allocate/remove/edit/copy some data in kernel space (just like heap challenges in userspace lol).
The logic looks like this:
1 | SYSCALL_DEFINE2(ips, int, choice, userdata *, udata) { |
The data, called chunk
, stored in kernel space is described as follow:
1 | typedef struct { |
Although there are several attributes such as next
, idx
, and priority
, which makes it complicated, they actually have any no meaningful impact in the challenge. The only thing interesting is data
, which is some content fully controlled by users. (I later learnt that @kileak used the next
pointer for info leak, you can find his blog post here).
Although the next
pointer defined in chunk
makes it appear to be stored as a list, it is actually stored in an array defined as a global variable: chunk *chunks[MAX] = {NULL};
(MAX
is 16)
Each time the user calls alloc_storage
, it will pick the first index where chunks[idx]
is NULL
for use using the funciton get_idx
. Each time the user wants to remove/edit/copy a certain chunk, it will make sure the index is valid by calling the check_idx
function.
1 | int get_idx(void) { |
Vulnerabilities
Bug 1
As a veteran in CTFs, my instinct told me there must be something wrong with the copy
functionality, or it should be just alloc/edit/remove.
It turned out I was right.
1 | int copy_storage(int idx) { |
Looking at the code, I immediately noticed that the copy
functionality basically copies a user-decided pointer (idx
is user input) to a new index. It is likely that there can be a UAF situation if the the copyed pointer is not properly handled during remove
.
1 | int remove_storage(int idx) { |
But no, remove_storage
properly handles the copied pointers by clearing the copied versions before freeing the target pointer, so no UAF :(
So, what’s the issue with copy_storage
? Looking closely, I realized that get_idx
can return -1
if it is called when the chunk array is full, which becomes chunks[-1] = chunks[idx]
.
This is an underflow in global variables, good, right? But not really.
The bug does not help the exploitation because 1. it does not overwrite any important data 2. all indices passed to copy/remove/edit
will be sanitizied and alloc
does not take index argument at all.
Bug 2
I got stuck here for a while and didn’t know how to proceed. I eventually decided to give up and started doing my research work. Luckily, I somehow managed to finish that day’s work earlier than I expected, so I picked up the challenge again and immediately noticed something fishy.edit_storage
is defined as follow:
1 | int edit_storage(int idx, char *data) { |
It appears that it also sanitizes the idx
argument at the first glance. But look closer……. WTF??? It does not return even if the check fails???
Astonishment aside, this bug allows attackers to use the stored pointer at index -1
from Bug 1.
Exploitation
Primitive
Combining Bug 1&2 together, it is clearly a UAF scenario.
One can copy a pointer, let’s say chunks[idx]
, to chunks[-1]
, free chunks[idx]
such that chunks[-1]
becomes a dangling pointer, and then finally edit chunks[-1]
to achieve UAF-write in kmalloc-128
(the cache that chunk
belongs to)
Protections
Once I found the usable primitive, I started checking the protections applied in the challenge and deciding the exploitation stategies.
I checked the challenge start script and found out that KASLR
is on and the author appends +smap +smep
to the kernel arguments. However, the author does not specify cpu choices, which means QEMU will use its own emulated cpu to run the kernel. And by default, the cpu does not support SMEP
/SMAP
. So even though the kernel will try to enable SMEP
/SMAP
, it cannot succeed on a cpu that does not support them. As a result, the kernel will be run without these two protections.
After understanding the protection, the exploitation plan is clear: leak a kernel pointer to defeat KASLR and then overwrite a function pointer to perform ret2usr.
OOB-Read
With the UAF primitive in hand, I decided to try out the msg_msg
attack first to get info leak.
This is an attack that has been quite popular among Linux kernel exploit writers for a while.
It provides immediate OOB-Read and Arbitrary-Free primitive (source1, source2). It has also been demonstrated to be able to used to gain Arbitrary-Read/Write in some settings (source1, source2).
I’ll only briefly talk about how this attack works here, interested readers can follow the links provided above for further readings.
struct msg_msg
is defined like this:
1 | struct msg_msg { |
They look like this in memory:
1 | 0xffff96daced9ba80: 0xffff96daced5a8c0 0xffff96daced5a8c0 <- m_list |
struct msg_msg
is a header object that describes a message in the System V message IPC system.
It is allocated together with the actual content of the message. When the message itself is too long, the message content will be segmented: the first segment will be right after the header, the other segments will be chained as a linked list using the next
attribute.
In this blog, I will refer to struct msg_msg
as “message header” and the header+content as “msg_msg object” since they are allocated together.
The allocation code looks like this:
1 | static struct msg_msg *alloc_msg(size_t len) |
In particular, m_ts
stores the length of the message. Overwriting it will make the kernel to send back more data to userspace.
At this point, the leak plan is quite obvious. I freed a chunk
object and then allocated a short msg_msg
(next
is NULL
) to occupy the slot. And then I enlarged its m_ts
using the UAF-write. Finally, I called msgrcv
so that data after the msg_msg
object would be sent back to me in userspace. This gave me a reliable OOB-read primitive.
Info Leak
Now the question is what to leak.
Since I didn’t know many good objects in kmalloc-128
, I decided to do it in another cache. I went for the classic struct file
object, which is in kmalloc-256
.
The idea is quite simple: spray many struct file
objects so that a new kmalloc-256
slab full of struct file
will be allocated right after the target kmalloc-128
slab. Since I could leak as many bytes as I wanted, I decided to leak two pages to ensure that all content in the next page will be leaked to userspace. Because of how the page allocator works, this spray is actually quite reliable.
1 | kmalloc-128 |
In memory, a struct file
object looks like this:
1 | 0xffff96daced9c000: 0x0000000000000000 0x0000000000000000 |
This can give us a lot of precious information:
- kernel base: by leaking
struct file->fops
(0xffffffffba229500
in the memory dump) - UAF object address:
0xffff96daced9c058
points back to itself, which means we can infer the address of thestruct file
object. With a little bit of calculation, we can get the heap address of the UAF object!
This two information is enough for us to fully defeat KASLR.
Now it’s time for the fun part.
arbitrary-free
At this point, overwriting fops
(which is a pointer to an array of function pointers) would be enough to provide me PC control but I didn’t see how.
This is because I totally missed the point that this challenge uses a pretty new kernel and it no longer stores the freelist pointer at the start of the slot. Instead, it stores the freelist pointer in the middle of the slot. Since I missed this point, I thought there was no way to hijack the freelist (the UAF-write starts at offset 0xe and I thought the freelist pointer was at offset 0).
It turned out @kileak used this approach and this approach was also the intended approach. A caveat is that this kernel is also compiled with heap cookie, a little bit of calculation is needed to hijack the freelist.
Anyway, I didn’t think about hijacking freelist so I came up with another approach: use the arbitrary-free primitive provided by msg_msg
attack.
Alexander Popov has shown that if the message in msg_msg
object is too long, the message will be segmented and the unfit part will be stored into a linked list referenced by next
. Overwriting this next
pointer provides the attacker arbitrary-free primitive when the kernel tries to clean up the message (source).
This approach works in long msg_msg
where next
is initially not NULL
. But in my exploit, the msg_msg
object is very short, and next
is NULL
, will the kernel free my faked next
pointer as well or it will just free my msg_msg
object? By reading the source code of free_msg
, it turns out the kernel uses next
pointer without any checks.
1 | void free_msg(struct msg_msg *msg) |
This means overwriting next
also provides arbitrary-free primitive even in short msg_msg
object!
object overlapping
At this point, the most natural thing to do is freeing a struct file
(the address was leaked already) and then performing a heap spray to overwrite its fops
pointer.
But somehow I didn’t think of it, I went for a weird but easier to implement (and also very interesting) approach: free a misaligned address right before the target struct file
and reoccupy it with a chunk
object.
1 | 0xffff96daced9bfc0: 0xf0dcae9bb74c91ea 0x0000000000000000 <- the last slot in kmalloc-128 |
In other words, my goal was to overwrite the first struct file
in kmalloc-256
(at 0xffff96daced9c000
).
What I did was overwriting the next
pointer of msg_msg
to 0xffff96daced9bff0
(0xffff96daced9c000-0x10
) and then free it.
Interestingly, although it was not a normal slot address, SLUB did not complain about it.
I could reclaim it with chunk
object and easily overwrite fops
in kmalloc-256
using the edit_storage
function.
ret2usr
LPE is easy now. It only requires overwriting fops
to a controlled region and make kernel jump to userspace and execute shellcode there (since SMEP and SMAP are disabled by mistake).
Bypassing SMEP and SMAP is not hard either, it only requires pivoting stack to heap and executing ROP chain there.
Conclusion
This challenge is not hard. Yet, I still struggled a little bit. Anyway, I like this challenge because I learnt something from it :D
The full exploit can be found here.
Fun Fact
I was supposed to write this blog a month ago since the exploitation is interesing. But I completely forgot about it because my attention was drawn to kCTF.
Until yesterday, the author of the challenge @BitFriends contacted me, asking about the details of my unintended solution, I was reminded that I still had a blog to finish. And now here it is. Thank you for reminding me @BitFriends and thank you for the challenge!