SAOC LLDB D integration: 1st Weekly Update
Hi D community!
I’m here to describe what I’ve done during the first week of the Symmetry Autumn of Code.
While discussing the milestone plan with my mentor, I decided to get a head start and wrote a simple C API around the D runtime demangler, exposing it through a C interface. In the future, this would make it possible to implement an LLDB language plugin in LLVM. The source code is available on GitHub: liblldbd.
In the meantime, we decided to focus on porting the libiberty demangler codebase to the upstream LLVM repository, since that approach offers more benefits and is more likely to be accepted upstream. liblldbd is therefore plan B, in case the libiberty port is not accepted by the LLVM team.
Right after we finished the plan, which you can follow here, I started porting libiberty and integrating the code into the LLVM core. Following the example of the Rust demangler, I studied some patches on the LLVM review platform and the awesome documentation that LLVM provides.
This ended up being relatively easy to plug into the LLVM codebase, since most of the demangler logic is isolated in one file; thanks to Iain (@ibuclaw) for the excellent code. Because I didn’t expect it to be so plug-and-play, I decided to test the code extensively using the robust test suite that LLVM provides.
First, I ported the libiberty test suite for D demangling, and right after that I wrote some libFuzzer tests and ran them with the address sanitizer and the UB sanitizer. The libFuzzer results took some time to show up, but they produced some interesting output. The most interesting finding was a heap/stack buffer overflow. I also found a null pointer dereference. With a maliciously crafted mangled name, both can trigger a segmentation fault or undefined behaviour by reading from or writing to protected memory.
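For readers who have not written a libFuzzer test before, here is a minimal sketch of what such a harness can look like. It is not the harness used for this work: `demangleUnderTest` is a hypothetical stand-in that only checks the `_D` prefix, and in a real harness it would be replaced by the demangler entry point being tested. Only the `LLVMFuzzerTestOneInput` signature is libFuzzer's own.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>
#include <string>

// Hypothetical stand-in for the demangler under test (an assumption for this
// sketch). It only checks the "_D" prefix and returns a heap-allocated copy
// of the input that the caller must free().
char *demangleUnderTest(const char *Mangled) {
  if (std::strncmp(Mangled, "_D", 2) != 0)
    return nullptr; // not a D symbol
  std::size_t Len = std::strlen(Mangled);
  char *Out = static_cast<char *>(std::malloc(Len + 1));
  if (Out)
    std::memcpy(Out, Mangled, Len + 1); // copy including the terminating NUL
  return Out;
}

// libFuzzer calls this entry point repeatedly with generated inputs.
extern "C" int LLVMFuzzerTestOneInput(const std::uint8_t *Data,
                                      std::size_t Size) {
  // The fuzzer's buffer is not NUL-terminated, so copy it into a std::string
  // before handing it to an API that expects a C string.
  std::string Input(reinterpret_cast<const char *>(Data), Size);
  char *Result = demangleUnderTest(Input.c_str());
  std::free(Result); // free(nullptr) is a no-op
  return 0;          // libFuzzer expects 0 from a normal run
}
```

Assuming a Clang build with libFuzzer support, a file like this (the name `harness.cpp` is illustrative) is typically compiled with `clang++ -g -fsanitize=fuzzer,address,undefined harness.cpp`, which links in the fuzzing driver and enables both sanitizers mentioned above.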
I wrote patches to fix both issues and contacted MITRE to follow the standard vulnerability reporting procedure, since GCC is widely used and these bugs could potentially cause problems. I sent the patches to the GCC mailing list and am currently waiting for review. You can check the two patches here and here.
After patching the code, I ran the fuzzer again, and after some hours it reported a timeout with a huge number of recursive calls. I carefully analyzed the mangled name the fuzzer had generated and found that it is very repetitive. From a superficial analysis, those recursive calls lead to exponential time complexity and can make the demangler take hours or even days to complete. I believe this could also be used maliciously to cause a denial of service, although I haven’t had much time to profile it yet.
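To illustrate the shape of the problem, here is a toy sketch, not taken from the demangler itself. `parseToy` is a hypothetical routine in which every level recurses twice, so the number of calls doubles with each extra level of nesting. The `Budget` type shows one common mitigation, a fixed work budget per run; this is an assumption on my part and not necessarily the fix that will be adopted.

```cpp
#include <cstdint>

// Hypothetical work budget shared by all recursive calls of a single run.
struct Budget {
  std::uint64_t Remaining;
  // Consume one unit of work; returns false once the budget is exhausted.
  bool spend() {
    if (Remaining == 0)
      return false;
    --Remaining;
    return true;
  }
};

// Toy stand-in for a recursive parsing routine. Each level recurses twice,
// so an unbounded run at depth N makes 2^(N+1) - 1 calls in total.
// Calls counts how many invocations were actually paid for from the budget.
bool parseToy(unsigned Depth, Budget &B, std::uint64_t &Calls) {
  if (!B.spend())
    return false; // give up instead of running for hours
  ++Calls;
  if (Depth == 0)
    return true;
  return parseToy(Depth - 1, B, Calls) && parseToy(Depth - 1, B, Calls);
}
```

With a generous budget the toy parser finishes normally, while a deeply nested input with a small budget fails fast and reports an error to the caller instead of hanging.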
To start a discussion about this, I’m going to create a thread on the GCC security mailing list and propose some ways to mitigate these problems, such as integrating part of the codebase into OSS-Fuzz.
Before that, I’m waiting for a reply to the message I sent to MITRE, which was forwarded to the Red Hat security team for further review.
I’m not sure whether this is important to share right now, but I saved the fuzzer results in case anyone is interested in exploring more crafted mangled names to feed to the address/UB sanitizers.
The last task I worked on (today) was finalizing the LLDB integration. I still need to write some tests, but the most important thing is that it already works! My LLDB tree can successfully pretty-print the mangled names. My fork is available on my GitHub, here.
From the first time I built LLVM I found out that compiling it with debug
information is extremely costly in terms of memory usage, since linking all
those symbols at once can consume a lot of RAM. I recommend you build it with
Here is my cmake config so far, in case someone wants to test my work at any point:
cmake -S llvm -B build -G Ninja \
  -DLLVM_ENABLE_PROJECTS="clang;libcxx;libcxxabi;lldb" \
  -DCMAKE_BUILD_TYPE=Release \
  -DLLDB_EXPORT_ALL_SYMBOLS=0 \
  -DLLVM_ENABLE_ASSERTIONS=ON \
  -DLLVM_CCACHE_BUILD=ON \
  -DLLVM_LINK_LLVM_DYLIB=ON \
  -DCLANG_LINK_CLANG_DYLIB=ON
To build LLDB, you can do something like:
cmake --build build -- lldb -j$(nproc --all)
Next week, I’m going to look into the time complexity problem and try to solve it, restructure the code to look a bit more C++-ish, and finish the LLDB test suite so I can finally start upstreaming my changes. This may take a while, though, since there is a challenge described in the plan: dual-licensing the code from the GCC codebase for the LLVM codebase. This is being handled cooperatively by Mathias (my mentor), Iain, and the GCC team.
Read about the next week.