---
title: 'SAOC LLDB D integration: 6th Weekly Update'
date: '2021-10-28T00:38:00+01:00'
tags: ['saoc', 'saoc2021', 'dlang', 'llvm', 'lldb', 'debug', 'debugging', 'dwarf']
description: "This post describes what I've done during the 6th week of the
Symmetry Autumn of Code 2021, including follow up on LLVM patches,
implementation of the array and string slices formatters on the D language
plugin and minor fixes and refactoring"
---

Hi D community!

I'm here again, to describe what I've done during the sixth week of Symmetry
Autumn of Code.

## LLVM Patches follow up

The first two patches were merged into the LLVM tree!

- https://reviews.llvm.org/D111947
- https://reviews.llvm.org/D111948

Hopefully we can now proceed with merging the demangling patches as the next
step.

## LLDB D Plugin

This week I primarily worked on getting the D plugin working. I added two
features to the plugin: generic handling of D slices, and the special case of
string slices, which are now formatted as D string literals according to their
encoding.

This is a reduced example of what LLDB can show to the user with the D plugin.

```
* thread #1, name = 'app', stop reason = signal SIGSEGV: invalid address (fault address: 0xdeadbeef)
    frame #0: 0x0000555555555edc app`app.foobar(p=0x00000000deadbeef, a=([0] = 1, [1] = 2, [2] = 3), ...) at app.d:43:2
   40       immutable(dchar)[] sh = "double atum"d.dup;
   41       const(wchar)[] si = "wide atum"w.dup;
   42
-> 43       return *p;
   44   }
   45
   46   class CFoo {
(lldb) fr v
(int *) p = 0x00000000deadbeef
(int[]) a = ([0] = 1, [1] = 2, [2] = 3)
(long double) c = 123.122999999999999998
(Foo) f = {}
(string) sa = "atum"
(wstring) sb = "wide atum"w
(dstring) sc = "double atum"d
(char[]) sd = "atum"
(dchar[]) se = "double atum"d
(wchar[]) sf = "wide atum"w
(const(char)[]) sg = "atum"
(dstring) sh = "double atum"d
(const(wchar)[]) si = "wide atum"w
```

If you are excited to test it yourself, check out
[this](https://github.com/ljmf00/llvm-project/commits/llvm-plugin-d) branch and
compile lldb. I suggest the following steps:

```bash
# Use clang to compile LLVM
export CC=clang
export CXX=clang++

# CMake flags (build for a different target if you are not using x86)
cmake -S llvm -B build -G Ninja \
    -DLLVM_ENABLE_PROJECTS="clang;lldb" \
    -DCMAKE_BUILD_TYPE=Debug \
    -DLLDB_EXPORT_ALL_SYMBOLS=OFF \
    -DLLVM_OPTIMIZED_TABLEGEN=ON \
    -DLLVM_ENABLE_ASSERTIONS=ON \
    -DLLDB_ENABLE_PYTHON=ON \
    -DLLVM_TARGETS_TO_BUILD="X86" \
    -DLLVM_CCACHE_BUILD=ON \
    -DLLVM_LINK_LLVM_DYLIB=ON \
    -DCLANG_LINK_CLANG_DYLIB=ON

# Build the debugger, compile the test program with debug info and debug it
ninja -C build lldb lldb-server
ldc2 -g app.d
./build/bin/lldb app
```

You can also use
[this](../../public/assets/posts/d-saoc-2021-06/app.d) file,
which is what I use to test the D plugin and what produced the example above.

### Issues

During the plugin development and testing, I found out that LLDB was not
properly showing UTF-8 strings when using `char8_t` types with different names,
so I made a patch to fix it: https://reviews.llvm.org/D112564 . An issue was
also created to cross reference the fix:
https://bugs.llvm.org/show_bug.cgi?id=52324 . This is particularly an issue for
the D formatter if the compiler exports types with different type names, which
it should. Debuggers should be able to read the DWARF encoding tags and rely on
them first, instead of hardcoding the formatters. LLDB does that, but this
somehow got skipped in https://reviews.llvm.org/D66447 .

While reading how plugins are built with their internal C++ interface, I found
very repetitive code and decided to patch it: https://reviews.llvm.org/D112658 .

I also happened to reproduce
[this](https://bugs.llvm.org/show_bug.cgi?id=45856) issue that Mathias reported
a while ago and decided to investigate it, since it indirectly affects the
behaviour on the D side. I reached some conclusions and I believe this is a
regression introduced in 2015. Please read the issue for more context.

I found other issues on the LDC and DMD sides that I have already added to my
task list, including:

- DMD should use the `wchar` and `dchar` type names instead of `wchar_t`: this
  triggers the hardcoded formatters, which then format char pointers wrongly.
  Furthermore, the type is wrong, since `wchar_t` is not exactly UTF-16
  according to the C standard.
- DMD also reports other types with C style naming instead of D style naming.
- LDC reports a hardcoded `const(char)` type instead of a DWARF type modifier.

### Mailing list announcement

As discussed earlier in an LLDB bug, I decided to write to the `llvm-dev` and
`lldb-dev` mailing lists to discuss upstreaming the D language plugin. You can
follow the thread
[here](https://lists.llvm.org/pipermail/lldb-dev/2021-October/017101.html).

## What is next?

Next week, I'm going to try to fix the issues listed above in the DMD and LDC
trees. I need to be careful with these changes to make sure I don't break GDB
behaviour, in case it relies on the hardcoded types. If that is the case, I'll
try to patch it too. I'm also going to finish my DWARF refactor on the backend
to handle DWARF abbreviations correctly. The objective of the second milestone
is complete, but I'm going to study more features to improve pretty printing.

You can also read this on the D programming language forum,
[here](https://forum.dlang.org/thread/mailman.409.1635399049.11670.digitalmars-d@puremagic.com),
and discuss there.

Read about the [previous week](../d-saoc-2021-05/) and the [next
week](../d-saoc-2021-07/).