Summary of "No, Seriously. AI is REALLY Good at Hacking Now"

Core story

A Claude-like AI wrote a fully functional exploit for an existing FreeBSD kernel vulnerability (referenced as 20264747).
The vulnerability itself is a classic pre-auth stack-based buffer overflow in an RPC daemon: attacker-controlled data is memcpy’d into a fixed-size stack buffer using a length field taken from the request, without a proper bounds check.
A researcher tested the AI and published the prompts they used. It reportedly took roughly 20 prompts to obtain a working end-to-end exploit (Python script + write-up).
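
The bug class described above can be illustrated with a small, deliberately abstract model. This is not the FreeBSD code: the message layout (a 4-byte big-endian length followed by the payload), `BUF_SIZE`, and `parse_record` are hypothetical names chosen for illustration. The sketch shows the kind of check the summary says was missing, namely validating the sender-supplied length against the destination size before copying.

```python
import struct

BUF_SIZE = 128  # hypothetical fixed-size destination, standing in for the stack buffer


def parse_record(message: bytes) -> bytes:
    """Copy a length-prefixed payload into a fixed-size buffer, with bounds checks."""
    if len(message) < 4:
        raise ValueError("message too short to contain a length field")

    # The length field is supplied by the sender and must not be trusted.
    (claimed_len,) = struct.unpack(">I", message[:4])
    payload = message[4:]

    # Check 1: the claimed length must fit in the destination buffer.
    if claimed_len > BUF_SIZE:
        raise ValueError("claimed length exceeds destination buffer")
    # Check 2: the claimed length must not exceed the bytes actually received.
    if claimed_len > len(payload):
        raise ValueError("claimed length exceeds received payload")

    buf = bytearray(BUF_SIZE)
    buf[:claimed_len] = payload[:claimed_len]
    return bytes(buf[:claimed_len])
```

In C, omitting the first check is what lets a copy run past the end of the buffer and over the saved return address; Python itself is memory-safe, so the model only demonstrates the validation logic.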

The AI automatically constructed a ROP (return-oriented programming) chain from kernel gadgets. Because the stack is non-executable (NX), injected code cannot be run from it directly; ROP sidesteps this by reusing instructions that already live in executable kernel memory.
ROP overview:
- Overwrite the return address (program counter) via the stack overflow.
- Chain short instruction sequences (gadgets ending in RET) already present in kernel memory to perform arbitrary operations.
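
The control flow of a ROP chain can be sketched with a toy stack machine. Everything here is hypothetical: the "addresses" are arbitrary dictionary keys, the gadgets are ordinary Python functions, and nothing touches real memory. The sketch only shows how a sequence of return addresses on the stack, interleaved with data, drives a series of small operations.

```python
from typing import Callable, Dict, List


class Machine:
    """Toy CPU state: one register and a stack (top of stack = end of list)."""

    def __init__(self, stack: List[int]) -> None:
        self.stack = stack
        self.rax = 0


def pop_rax(m: Machine) -> None:
    # Models "pop rax; ret": load the next stack value into the register.
    m.rax = m.stack.pop()


def inc_rax(m: Machine) -> None:
    # Models "inc rax; ret".
    m.rax += 1


def double_rax(m: Machine) -> None:
    # Models "add rax, rax; ret".
    m.rax *= 2


# Hypothetical gadget "addresses" mapped to the behaviour they model.
GADGETS: Dict[int, Callable[[Machine], None]] = {
    0x1000: pop_rax,
    0x2000: inc_rax,
    0x3000: double_rax,
}


def run_chain(chain: List[int]) -> int:
    """Execute a chain given in the order it is consumed from the stack."""
    m = Machine(list(reversed(chain)))
    while m.stack:
        address = m.stack.pop()  # models RET: pop the next address into the PC
        GADGETS[address](m)
    return m.rax


# pop_rax loads 20, inc_rax makes it 21, double_rax makes it 42.
print(run_chain([0x1000, 20, 0x2000, 0x3000]))  # prints 42
```

Each gadget does very little on its own; the computation lives in the order of addresses and data placed on the stack, which is why controlling the stack is enough to control execution.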

Exploit steps:
- Establish a connection to the vulnerable RPC service and trigger the overflow.
- Overwrite the return address to jump into ROP gadgets.
- Use ROP to call a kernel page-protection function (pmap_change_protection / pmap_protect-like) to set the kernel BSS page permissions to RWX.
- Use ROP writes of 8-byte quadwords (reportedly about 32 bytes at a time) to copy shellcode into the now-executable BSS.
- Jump to the shellcode in BSS to execute it in kernel context.
- Clean exit: after running the payload, call kthread_exit so the exploited thread terminates cleanly and the kernel does not panic.
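
The arithmetic behind the quadword writes is straightforward: a quadword on x86-64 is 8 bytes, so a payload has to be padded and split into 8-byte little-endian values. The sketch below only does that bookkeeping on placeholder bytes. `to_quadwords` and `rounds_needed` are hypothetical helper names, and the 32-byte default is an assumption taken from the "about 32 bytes at a time" figure in the summary.

```python
import struct
from typing import List

QWORD = 8  # bytes in a quadword on x86-64


def to_quadwords(data: bytes) -> List[int]:
    """Zero-pad data to a multiple of 8 bytes and split it into little-endian quadwords."""
    padded = data + b"\x00" * (-len(data) % QWORD)
    return [value for (value,) in struct.iter_unpack("<Q", padded)]


def rounds_needed(n_bytes: int, bytes_per_round: int = 32) -> int:
    """Ceiling division: how many rounds are needed to deliver n_bytes."""
    return -(-n_bytes // bytes_per_round)


placeholder = b"A" * 100  # stand-in data, not a real payload
print(len(to_quadwords(placeholder)))   # prints 13 (100 bytes padded to 104 = 13 * 8)
print(rounds_needed(len(placeholder)))  # prints 4 (ceil(100 / 32))
```

Under that assumption, a 100-byte payload would take 13 quadword writes spread over 4 rounds.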

Payload outcome: spawns a shell (/bin/sh, heard as "bensh") running as root, launched from kernel context.
Implementation details: the AI-generated exploit is a Python script containing gadget addresses and kernel base addresses tied to a specific FreeBSD kernel build/version. The exploit’s feasibility was helped by kernel ASLR not being enabled by default in the tested FreeBSD configuration.
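
Why hardcoded addresses depend on the absence of kernel ASLR can be shown with a little address arithmetic. The base address, gadget offset, page size, and entropy below are hypothetical values for illustration, not figures from the write-up. The sketch models an address as `base + offset` and shows that a randomized base invalidates any absolute address computed ahead of time.

```python
import secrets

PAGE = 0x1000  # assumed page size for the model


def resolve(base: int, offset: int) -> int:
    """Absolute address of a symbol or gadget located `offset` bytes past the load base."""
    return base + offset


def randomized_base(default_base: int, entropy_bits: int = 16) -> int:
    """Model ASLR by sliding the base up by a random, page-aligned amount."""
    slide_pages = secrets.randbelow(1 << entropy_bits)
    return default_base + slide_pages * PAGE


DEFAULT_BASE = 0xFFFFFFFF80000000  # hypothetical fixed load address
GADGET_OFFSET = 0x1234             # hypothetical offset of some gadget

hardcoded = resolve(DEFAULT_BASE, GADGET_OFFSET)

# Without randomization, the precomputed address is always correct.
print(hardcoded == resolve(DEFAULT_BASE, GADGET_OFFSET))  # prints True

# With randomization, it matches only if the slide happens to be zero.
actual = resolve(randomized_base(DEFAULT_BASE), GADGET_OFFSET)
print(hex(hardcoded), hex(actual))
```

With a fixed base, addresses can be read once from a matching kernel build and reused; with a randomized base, the same script would first need some way to learn the slide.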

The case demonstrates that AI can both reason about advanced exploitation techniques (like ROP) and generate working, complex kernel exploits quickly.

AI will accelerate both attackers and defenders.
The presenter expects a transient period of increased automated exploitation, but also notes AI tooling can scale security testing and ultimately help make code safer.

Factors that made the exploit practical:
- Availability of gadget addresses (absence of kernel ASLR in the tested setup).
- Ability to change page protections from kernel context.
- The AI’s ability to chain cleanup calls (kthread_exit) to avoid crashing the system. Avoiding a kernel panic is critical for practical kernel exploitation.

Resources mentioned:
- Full exploit: AI-generated Python script and write-up (including the prompts, gadget addresses, and pmap/kthread function references).
- Researcher’s published prompts and write-up (the prompt sequence used to obtain the exploit).
- Tools referenced: reverse-engineering workflows (e.g., Ghidra), classic exploitation techniques (ROP), and the Python exploit implementation.

Flare (sponsor): a threat-exposure management platform that:
- Ingests cybercrime / dark-web data to detect when organization credentials or data are being sold or leaked.
- Alerts on compromised credentials; integrates with Entra and plans integration with Okta.
- Positions itself to shorten detection/remediation windows (noting claimed averages such as ~48 minutes to compromise, while remediation can take much longer).