Binary Exploit Discovery Review
Notes on state-of-the-art research into the automated discovery of vulnerabilities in binaries.
Leveraging LLM and Crash Reuse for Embedded Bug Unearthing
The authors make two contributions. First, they use LLM-based seed generation to produce the initial input seeds for mutation-based, coverage-guided fuzzing. Second, they introduce a crash reuse strategy to speed up fuzzing across variants present in different targets. Notably, these techniques can be used without altering the target’s source code or compilation process by running AFL++ in QEMU mode.
The main takeaway from this paper is that using LLMs to generate initial seeds leads to improved findings by the fuzzer. AFL++ was used as the fuzzer and ChatGPT 4.0 was used to generate the seeds.
Generating Seeds
For example, the prompt for fuzzing the awk applet in BusyBox was:
"role": "system",
"content": "You are an initial seed generator for a fuzzer that has to fuzz BusyBox awk applet. In response only provide the list of awk scripts"
"role": "user",
"content": "Generate inital seed to fuzz BusyBox awk applet"
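A minimal sketch of how a prompt like the one above could be generalized to other applets, and how the returned scripts could be written out as one seed file each. The helper names `build_messages` and `write_seeds`, the file naming scheme, and the idea that the reply has already been split into a list of strings are all assumptions for illustration; the paper does not prescribe them, and no LLM API call is shown.

```python
from pathlib import Path


def build_messages(applet, seed_kind):
    """Build a chat prompt modeled on the awk example (hypothetical helper)."""
    return [
        {
            "role": "system",
            "content": (
                "You are an initial seed generator for a fuzzer that has to "
                f"fuzz BusyBox {applet} applet. In response only provide the "
                f"list of {seed_kind}"
            ),
        },
        {
            "role": "user",
            "content": f"Generate initial seed to fuzz BusyBox {applet} applet",
        },
    ]


def write_seeds(seeds, corpus_dir, suffix=".txt"):
    """Write each non-empty seed to its own file and return the paths."""
    corpus = Path(corpus_dir)
    corpus.mkdir(parents=True, exist_ok=True)
    paths = []
    # Skip blank entries so the corpus does not contain empty inputs.
    non_empty = [seed for seed in seeds if seed.strip()]
    for index, seed in enumerate(non_empty):
        path = corpus / f"seed_{index:03d}{suffix}"
        path.write_text(seed)
        paths.append(path)
    return paths
```

Keeping one seed per file matters because the corpus tools that follow treat each file in the input directory as a separate test case.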
After gathering results from the LLM, run afl-cmin to clean up the corpus:
root@kali:~/# afl-cmin -i in -o out -- ./ @@
corpus minimization tool for afl-fuzz by <lcamtuf@google.com>
[*] Testing the target binary...
[+] OK, 82 tuples recorded.
[*] Obtaining traces for input files in 'in'...
Processing file 1/1...
[*] Sorting trace sets (this may take a while)...
[+] Found 82 unique tuples across 1 files.
[*] Finding best candidates for each tuple...
Processing file 1/1...
[*] Sorting candidate list (be patient)...
[*] Processing candidates and writing output files...
Processing tuple 82/82...
[!] WARNING: All test cases had the same traces, check syntax!
[+] Narrowed down to 1 files, saved in 'out'.
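You can also run afl-tmin to strip bytes from a test case that do not affect its execution path. Note that afl-tmin minimizes one file per invocation, so -i and -o take file paths rather than directories:
afl-tmin -i afl_in/<testcase> -o afl_out/<testcase> -- ./ @@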
AFL QEMU mode
To fuzz binaries that were not instrumented by AFL, we can use QEMU mode: just add the -Q flag to the usual command.
afl-fuzz -Q -i afl_in -o afl_out -- <binary>
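The corpus minimization step also has to execute the target, so for an uninstrumented binary afl-cmin needs the same -Q flag as afl-fuzz. The sketch below only assembles the two argument lists; the helper names and the `./target @@` command line are hypothetical placeholders, and nothing is executed.

```python
def _afl_cmd(tool, target_argv, in_dir, out_dir, qemu):
    """Assemble an AFL++ command line as an argv list (hypothetical helper)."""
    cmd = [tool]
    if qemu:
        # -Q selects binary-only instrumentation through QEMU mode.
        cmd.append("-Q")
    cmd += ["-i", in_dir, "-o", out_dir, "--", *target_argv]
    return cmd


def cmin_cmd(target_argv, in_dir, out_dir, qemu=True):
    """Command for minimizing the corpus in in_dir into out_dir."""
    return _afl_cmd("afl-cmin", target_argv, in_dir, out_dir, qemu)


def fuzz_cmd(target_argv, in_dir, out_dir, qemu=True):
    """Command for fuzzing the target with seeds taken from in_dir."""
    return _afl_cmd("afl-fuzz", target_argv, in_dir, out_dir, qemu)


if __name__ == "__main__":
    # "./target" is a placeholder; "@@" is replaced by AFL with the input file.
    target = ["./target", "@@"]
    print(" ".join(cmin_cmd(target, "seeds", "afl_in")))
    print(" ".join(fuzz_cmd(target, "afl_in", "afl_out")))
```

Each list can be passed to `subprocess.run` once the placeholder target is replaced with a real command line.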
Reference
Binary Libification
Writing exploits often boils down to three steps: reaching, triggering, and exploiting. This paper focuses on the second step, triggering, by modifying the ELF headers of an executable to turn it into a shared library. This elegantly enables you to call arbitrary functions within an ELF or even turn the entire ELF into a callable API.
This paper introduces the Witchcraft Linker and the Witchcraft Shell. The linker transforms ET_EXEC or ET_DYN ELF binaries into a shared library. The shell allows you to invoke arbitrary C/C++ functions within libraries created by the linker. The main advantage of these tools is that they let researchers manually invoke a possibly vulnerable function without needing to craft user input that traverses the application’s call graph.
Installing
Make sure to install the prerequisites. On Debian-based systems run:
sudo apt install -y clang libbfd-dev uthash-dev libelf-dev libcapstone-dev libreadline-dev libiberty-dev libgsl-dev build-essential git debootstrap file
Then grab a copy from GitHub and make sure the submodules are up to date.
git clone https://github.com/endrazine/wcc.git
cd wcc
git submodule init
git submodule update
Once everything is installed, you can use make to build and install.
make
sudo make install
WLD
WLD is the Witchcraft linker. Using it is straightforward: copy the binary you want to convert, then pass the copy to wld.
cp /bin/ls /tmp/ls.so
wld --libify /tmp/ls.so
Currently wld only works on ELF binaries; however, the underlying architecture or operating system is irrelevant. This means that wld could process executables from Android, Linux, or BSD and transform them into non-relocatable shared libraries.
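Once a binary has been libified, any dlopen-based loader can in principle map it and call into it. The sketch below uses Python's ctypes as a stand-in for the Witchcraft Shell. It assumes that the function of interest is visible as a dynamic symbol and that it can run without the program's normal start-up state; `call_function` and the names in the usage comment are hypothetical.

```python
import ctypes


def call_function(library_path, symbol, restype, argtypes, *args):
    """Load a shared object and call one of its functions by name.

    restype and argtypes must match the real C prototype; ctypes cannot
    verify them, and a mismatch can crash the interpreter.
    """
    lib = ctypes.CDLL(library_path)   # dlopen() the shared object
    func = getattr(lib, symbol)       # dlsym() lookup by symbol name
    func.restype = restype
    func.argtypes = list(argtypes)
    return func(*args)


# Hypothetical usage against a libified binary (symbol name is made up):
# call_function("/tmp/ls.so", "some_internal_function",
#               ctypes.c_int, [ctypes.c_char_p], b"argument")
```

Calling a function this way bypasses main and any initialization it performs, so a crash may be caused by missing program state rather than by a real bug and should be confirmed against the original executable.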
Keep in mind that you should only need to convert executable files. You can check this by running the readelf command and looking at the Type field.
~$ file /bin/ls
~$ readelf -h /bin/ls
ELF Header:
Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
Class: ELF64
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: EXEC (Executable file) <--------------- Needs wld
Machine: Advanced Micro Devices X86-64
Version: 0x1
Entry point address: 0x404890
Start of program headers: 64 (bytes into file)
Start of section headers: 108288 (bytes into file)
Flags: 0x0
Size of this header: 64 (bytes)
Size of program headers: 56 (bytes)
Number of program headers: 9
Size of section headers: 64 (bytes)
Number of section headers: 28
Section header string table index: 27
A binary that doesn’t need to be converted would be Apache, for example:
~$ readelf -h /usr/sbin/apache2
ELF Header:
Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
Class: ELF64
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: DYN (Shared object file) <----------- No need for wld
Machine: Advanced Micro Devices X86-64
Version: 0x1
Entry point address: 0x37156
Start of program headers: 64 (bytes into file)
Start of section headers: 635736 (bytes into file)
Flags: 0x0
Size of this header: 64 (bytes)
Size of program headers: 56 (bytes)
Number of program headers: 9
Size of section headers: 64 (bytes)
Number of section headers: 28
Section header string table index: 27
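The same check can be scripted when many binaries have to be triaged. The Type field shown by readelf is the 16-bit `e_type` value at byte offset 16 of the ELF header, stored in the byte order given by the `EI_DATA` byte at offset 5. The following is a minimal sketch; `elf_type` and `needs_wld` are hypothetical helper names, and `needs_wld` simply encodes the rule of thumb from these notes.

```python
import struct

# e_type values defined by the ELF specification.
ET_NAMES = {1: "REL", 2: "EXEC", 3: "DYN", 4: "CORE"}


def elf_type(header):
    """Return the ELF type name for the first bytes of a file."""
    if len(header) < 18 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF header")
    # EI_DATA: 1 means little endian, 2 means big endian.
    if header[5] == 1:
        fmt = "<H"
    elif header[5] == 2:
        fmt = ">H"
    else:
        raise ValueError("unknown ELF data encoding")
    (e_type,) = struct.unpack_from(fmt, header, 16)
    return ET_NAMES.get(e_type, hex(e_type))


def needs_wld(path):
    """True when the file is ET_EXEC, the case flagged as needing wld."""
    with open(path, "rb") as handle:
        return elf_type(handle.read(18)) == "EXEC"
```

For the two headers shown above, `elf_type` would report `EXEC` for the first and `DYN` for the second.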
Future Research
- Automatically load individual functions and scan for exploitable instructions.
References