My personal website source code
Log | Files | Refs | Submodules | README | LICENSE

commit fe9f044c621ec6956744891a4a8d7f128b0f4788
parent 2609773618f3c434879b04ae53735bab47c0d95c
Author: Luís Ferreira <>
Date:   Thu, 28 Oct 2021 02:01:49 +0100

posts: Add 'SAOC LLDB D integration: 6th Weekly Update'

Signed-off-by: Luís Ferreira <>

Mcontent/posts/ | 3++-
Acontent/posts/ | 140+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 142 insertions(+), 1 deletion(-)

diff --git a/content/posts/ b/content/posts/ @@ -87,4 +87,5 @@ on the LLDB integration, since that is what the milestone is focused on. I'm going to clean up the idea I have of the language plugin and hopefully show some interesting practical results. -Read about the [previous week](../d-saoc-2021-04/). +Read about the [previous week](../d-saoc-2021-04/) and [next +week](../d-saoc-2021-05/). diff --git a/content/posts/ b/content/posts/ @@ -0,0 +1,140 @@ +--- +title: 'SAOC LLDB D integration: 6th Weekly Update' +date: '2021-10-28T00:38:00+01:00' +tags: ['saoc', 'saoc2021', 'dlang', 'llvm', 'lldb', 'debug', 'debugging', 'dwarf'] +description: "This post describes what I've done on the 6th week of the +Symmetry Autumn of Code 2021, including follow up on LLVM patches, +implementation of the array and string slices formatters on the D language +plugin and minor fixes and refactoring" +--- + +Hi D community! + +I'm here again, to describe what I've done during the sixth week of Symmetry +Autumn of Code. + +## LLVM Patches follow up + +The first two patches were merged into the LLVM tree! + +- +- + +Hopefully we can now proceed with merging the demangling patches as the next +step. + +## LLDB D Plugin + +This week I primarily worked on getting the D plugin working. I added two +features to the plugin which includes handling D slices generically and the +special case of string slices. They are now formatted as a D string literal, +depending on its encoding. + +This is a reduced example of what the LLDB can show to the user, with the D +plugin. + +``` +* thread #1, name = 'app', stop reason = signal SIGSEGV: invalid address (fault address: 0xdeadbeef) + frame #0: 0x0000555555555edc app`app.foobar(p=0x00000000deadbeef, a=([0] = 1, [1] = 2, [2] = 3), ...) at app.d:43:2 + 40 immutable(dchar)[] sh = "double atum"d.dup; + 41 const(wchar)[] si = "wide atum"w.dup; + 42 +-> 43 return *p; + 44 } + 45 + 46 class CFoo { +(lldb) fr v +(int *) p = 0x00000000deadbeef +(int[]) a = ([0] = 1, [1] = 2, [2] = 3) +(long double) c = 123.122999999999999998 +(Foo) f = {} +(string) sa = "atum" +(wstring) sb = "wide atum"w +(dstring) sc = "double atum"d +(char[]) sd = "atum" +(dchar[]) se = "double atum"d +(wchar[]) sf = "wide atum"w +(const(char)[]) sg = "atum" +(dstring) sh = "double atum"d +(const(wchar)[]) si = "wide atum"w +``` + +If you are excited to test it by yourself, checkout +[this]( branch and +compile lldb. I suggest the following steps: + +```bash +# To use clang to compiler LLVM +export CC=clang +export CXX=clang++ + +# CMake flags (compile to different target if you are not using x86) +cmake -S llvm -B build -G Ninja \ + -DLLVM_ENABLE_PROJECTS="clang;lldb" \ + -DCMAKE_BUILD_TYPE=Debug \ + -DLLDB_EXPORT_ALL_SYMBOLS=OFF \ + -DLLVM_OPTIMIZED_TABLEGEN=ON \ + -DLLVM_ENABLE_ASSERTIONS=ON \ + -DLLDB_ENABLE_PYTHON=ON \ + -DLLVM_TARGETS_TO_BUILD="X86" \ + -DLLVM_CCACHE_BUILD=ON \ + -DLLVM_LINK_LLVM_DYLIB=ON \ + -DCLANG_LINK_CLANG_DYLIB=ON + +ninja -C build lldb lldb-server +ldc2 -g app.d +./build/bin/lldb app +``` + +You can also use +[this]( file, +which is what I use to test the D plugin and used to show the above example. + +### Issues + +During the plugin development and testing, I found out that LLDB was not +properly showing UTF8 strings when using `char8_t` types with different names +so I made a patch to fix it: . An issue was +also created to cross reference the fix + . This is particularly an issue for +the D formatter if the compiler exports types with different type names, which +they should. Debuggers should be able to read encoding DWARF tags and rely on +that first, instead of hardcoding the formatters. LLDB does that but this +somehow got skipped on . + +While reading how plugin are built with their internal C++ interface, I found +very repetitive code and decide to patch it: . + +I also happened to reproduce +[this]( issue that Mathias reported +a while ago and decided to investigate on it since it indirectly affects the +behaviour on D side. I got some conclusions and I believe this is a regression +introduced in 2015. Please read the issue for more context. + +I found other issues on the LDC side and DMD side that I already added to my +task list, including: +- DMD should use wchar and dchar type names instead of `wchar_t`: This triggers + the hardcoded formatters to format char pointers wrongly. Furthermore this is + wrongly typed since `wchar_t` is not exactly UTF16, according to the C + standard. +- DMD also reports other types as C style naming instead of D style +- LDC reports hardcoded const(char) type instead of a DWARF type modifier + +### Mailing list announcement + +As discussed erlier in a LLDB bug, I decided to write to the `llvm-dev` and +`lldb-dev` mailing list to discuss about upstreaming the D language plugin. You +can follow up the thread +[here]( + +## What is next? + +Next week, I'm going to try to fix the above listed issues on either DMD and +LDC trees. I need to be careful with these changes to make sure I don't break +GDB behaviour, if they are relying on the hardcoded types. If that is the case +I'll try to patch it too. I'm going to also finish my DWARF refactor on the +backend to handle DWARF abbreviations correctly. The objective of the second +milestone is finished but I'm going to try to study more features to improve +pretty printing. + +Read about the [previous week](../d-saoc-2021-05/).