This is the second digest of what happened in the development of Chrysalide during the last month.
The following news is based on commit df579a2, so you can give this version of Chrysalide a try by running:
git clone http://git.0xdeadc0de.fr/chrysalide.git
cd chrysalide
git checkout df579a2
And then follow the installation procedure.
Graphical view
This part is one of the most interesting changes this month!

Drawing graphical nodes relies on first computing logical nodes, so the whole core has been improved:
- basic block ranks are now computed correctly (commit b642749).
- computing ranks produced many infinite loops. In some cases, the root cause was that some instructions were not seen as return points as expected (for instance "throw" in Dalvik code (commit 1aac673) or "pop { pc }" in ARM code (commit 570ae6b)).
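
To illustrate why cycles matter when ranking blocks, here is a minimal sketch of a rank computation over a tiny control-flow graph. It is not Chrysalide's code: the `block_t` layout, the `MAX_BLOCKS` limit, the `compute_ranks()` name and the definition of a rank as "length of the longest cycle-free path from the entry block" are all assumptions made for the example. An instruction wrongly treated as falling through (instead of being a return point) adds bogus edges, and a walk that does not skip edges leading back to a block already on the current path would never terminate.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_BLOCKS 16   /* hypothetical limit, for this sketch only */
#define MAX_SUCCS 2

typedef struct {
    int succs[MAX_SUCCS];   /* indexes of the successor blocks */
    size_t succ_count;      /* 0 for a block ending with a return point */
} block_t;

/* Pushes 'rank' down the graph, skipping edges which close a cycle. */
static void visit(const block_t *blocks, int idx, int rank,
                  int *ranks, int *on_stack)
{
    size_t i;

    /* A rank at least as deep is already known: nothing to improve. */
    if (ranks[idx] >= rank)
        return;

    ranks[idx] = rank;
    on_stack[idx] = 1;

    for (i = 0; i < blocks[idx].succ_count; i++) {
        int next = blocks[idx].succs[i];

        /* An edge to a block on the current path is a loop: skip it. */
        if (!on_stack[next])
            visit(blocks, next, rank + 1, ranks, on_stack);
    }

    on_stack[idx] = 0;
}

/* Block 0 is assumed to be the entry point; unreachable blocks get -1. */
void compute_ranks(const block_t *blocks, size_t count, int *ranks)
{
    int on_stack[MAX_BLOCKS] = { 0 };
    size_t i;

    assert(count > 0 && count <= MAX_BLOCKS);

    for (i = 0; i < count; i++)
        ranks[i] = -1;

    visit(blocks, 0, 0, ranks, on_stack);
}
```

With the four blocks 0 → 1, 1 → {2, 3}, 2 → 1 and a final block 3 without successor, this gives the ranks 0, 1, 2 and 2. A single block looping on itself simply gets rank 0.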
Lots of old code in "src/gtkext/graph/" has been removed and replaced by new graph logic in the "cluster.[ch]" files (commit 3628caa).
This work makes it possible to introduce new graphs, with well-drawn edges between well-placed nodes. Improvements have been released (commit 3cf2601); more code still needs to be written, but the current situation looks promising.

It was a long fight with Cairo's API to render graphic nodes with nice shadows, and a good way to learn how operators, groups, masks and paths work (commits ef174ce and 0fdcf7a).
If possible, the final graph view is centered on the screen. Quite simple, but this had never been done with the previous version of the graph engine (commit fbf9c54).
As an anecdote: an invisible 1x1 widget is required to extend the GtkFixed support in order to display all drawn shadows.
A lot of things are still on the TODO list for the graphical view: no crossing edges, loop support, and so on. Stay tuned!

Dex and Dalvik support
All Dex pool items are now loaded using several threads, which speeds up the whole loading process. This has been made possible by switching the Flex/Bison pair to a reentrant mode (commit 0b6e87d).
Moreover, due to a small mistake, the Dex strings were truncated to the size of the variable containing their length; this has been fixed (commit f3e8472).
The switch and fill-array data pseudo-instructions are now handled in the core program, without the need for an external plugin (commit 8dff3da).
Switch cases with or without (commit 722539f) fallthrough are recognized and supported, and comments provide information about the value linked to each case (commit 4c5f0e1).
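
As an illustration of where the value linked to each case comes from, the sketch below decodes one case out of a switch payload. The layouts follow the Dalvik bytecode format (a 16-bit ident, 0x0100 for packed-switch and 0x0200 for sparse-switch, a 16-bit size, then 32-bit little-endian values). The `get_switch_case()` function and its helpers are hypothetical names for this example, not Chrysalide's API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PACKED_SWITCH_IDENT 0x0100
#define SPARSE_SWITCH_IDENT 0x0200

/* Reads a little-endian unsigned 16-bit value. */
static uint16_t read_u16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Reads a little-endian signed 32-bit value. */
static int32_t read_s32(const uint8_t *p)
{
    uint32_t v = (uint32_t)p[0] | ((uint32_t)p[1] << 8)
               | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);

    return (int32_t)v;
}

/* Fills the key and target of case 'index'; returns 0, or -1 on error. */
int get_switch_case(const uint8_t *data, size_t len, uint16_t index,
                    int32_t *key, int32_t *target)
{
    uint16_t ident, size;

    assert(data != NULL && key != NULL && target != NULL);

    if (len < 4)
        return -1;

    ident = read_u16(data);
    size = read_u16(data + 2);

    if (index >= size)
        return -1;

    if (ident == PACKED_SWITCH_IDENT) {
        /* ident + size (4 bytes), first_key (4), then 'size' targets. */
        if (len < 8 + (size_t)size * 4)
            return -1;

        /* Keys are consecutive: first_key, first_key + 1, ... */
        *key = (int32_t)((uint32_t)read_s32(data + 4) + index);
        *target = read_s32(data + 8 + (size_t)index * 4);
        return 0;
    }

    if (ident == SPARSE_SWITCH_IDENT) {
        /* ident + size (4 bytes), 'size' keys, then 'size' targets. */
        if (len < 4 + (size_t)size * 8)
            return -1;

        *key = read_s32(data + 4 + (size_t)index * 4);
        *target = read_s32(data + 4 + (size_t)size * 4 + (size_t)index * 4);
        return 0;
    }

    return -1;
}
```

A disassembler could call such a function once per case and print the returned key in the comment attached to the matching branch target.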
Heavy load
Chrysalide is not yet ready to load big binaries (here "big" means a few MB)... A lot of allocations are made to handle all instructions, and all rendering lines produce a lot of allocations too.
So the system may run out of memory quickly and may freeze because of Chrysalide. This kind of behavior was reported early on by @Julien_Legras, thanks to him!
The optimization guide of GNOME describes how to use Massif to track memory allocations and the results are a good start to improve things.
The work has begun and the memory footprint has already been a little bit reduced by:
- using tricks: storing disassembling hooks in the .data section rather than duplicating them in heap (commit 0f0cb56).
- removing fields in structures, for instance in GArchInstruction (commits 2c70e33, 8c71b36 and 38e455e).
- telling the compiler to use less memory for enumerations (commit 945f58c).
- fixing memory leaks (commit df579a2).
- reorganizing the code to use fewer structures: the buffer segment object has thus disappeared. It was in charge of collecting all the information needed to print content (color, text, and so on) and has been replaced by a lighter structure shared between all places with common content. That was a huge change, reached step by step (commits 40886e0, fa30b0f, 3f05bac, 56f7524).
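
One possible way to make the compiler use less memory for enumerations is the `packed` type attribute supported by GCC and Clang. The sketch below only assumes this is the kind of change involved; the commit may rely on something else (a compiler flag, for instance). All type and field names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Plain enumeration: usually stored as an int (4 bytes on common ABIs). */
typedef enum { PLAIN_NONE, PLAIN_CODE, PLAIN_DATA } plain_kind_t;

/* GCC/Clang extension: use the smallest integer type fitting all values. */
typedef enum __attribute__((packed)) {
    PACKED_NONE, PACKED_CODE, PACKED_DATA
} packed_kind_t;

/* Hypothetical object holding two enumerations and some flags. */
typedef struct {
    void *owner;
    plain_kind_t kind;
    plain_kind_t state;
    unsigned char flags;
} plain_item_t;

/* Same fields, but the small enumerations now share a single word. */
typedef struct {
    void *owner;
    packed_kind_t kind;
    packed_kind_t state;
    unsigned char flags;
} packed_item_t;

/* Bytes saved over 'instances' objects by using the packed layout. */
size_t packed_saving(size_t instances)
{
    /* Sanity check of the assumption this sketch relies on. */
    assert(sizeof(packed_item_t) <= sizeof(plain_item_t));

    return instances * (sizeof(plain_item_t) - sizeof(packed_item_t));
}
```

The gain per object is small, but it is multiplied by the number of instances, which is why it matters for structures allocated hundreds of thousands of times.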
A small "quick and dirty" patch is available to track GLib instances and to get a better view of the evolution. Please note that the patch tracks only allocations; released memory is not taken into account!
The Android application "Toilettes à Paris" has been used as a reference for the tests: it has a 2.4 MB Dex file, and Chrysalide needs about 1.5 GB of memory to load it.
Here are some statistics for some main internal structures:
GLib object | instances | old size (bytes) | old consumption (bytes) | new size (bytes) | new consumption (bytes) | saved memory (bytes) |
---|---|---|---|---|---|---|
GDalvikInstruction | 281027 | 296 | 83 183 992 | 168 | 47 212 536 | 35 971 456 |
GRawInstruction | 403453 | 304 | 122 649 712 | 160 | 64 552 480 | 58 097 232 |
GImmOperand | 952195 | 96 | 91 410 720 | 64 | 60 940 480 | 30 470 240 |
GBufferLine | 781259 | 248 | 193 752 232 | 256 | 200 002 304 | -6 250 072 |
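
The last column of the table can be recomputed from the others, as the number of instances multiplied by the size difference. The small helper below is only a sketch of that arithmetic; `saved_bytes()` is a name chosen for this example.

```c
#include <assert.h>

/* Bytes saved for one object type: instances x (old size - new size).
 * The result is negative when the new structure is larger. */
long long saved_bytes(long long instances, long long old_size,
                      long long new_size)
{
    assert(instances >= 0 && old_size >= 0 && new_size >= 0);

    return instances * (old_size - new_size);
}
```

For GDalvikInstruction, `saved_bytes(281027, 296, 168)` gives 35 971 456, and for GBufferLine, `saved_bytes(781259, 248, 256)` gives -6 250 072, which matches the table.
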
The new version of GBufferLine failed to reduce memory consumption, but the changes are required to support the shared segment mechanism. And sharing segments saves, in this case, 33 438 193 bytes of raw text, plus 102 825 024 bytes for already allocated segments (9 127 606 segments are used when disassembling the selected Android application).
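
The general idea behind sharing segments can be pictured as interning: identical content is allocated once and reference-counted. The following sketch is not the actual Chrysalide structure; the `seg_pool_t` and `shared_seg_t` types, the fixed bucket count and the hash function are assumptions made for the example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define POOL_BUCKETS 64   /* hypothetical size; a real pool would grow */

typedef struct shared_seg {
    struct shared_seg *next;   /* next segment in the same bucket */
    unsigned int refs;         /* number of users of this segment */
    char text[];               /* content, stored only once */
} shared_seg_t;

typedef struct {
    shared_seg_t *buckets[POOL_BUCKETS];
    size_t unique;             /* number of distinct segments allocated */
} seg_pool_t;

/* Simple multiplicative string hash, reduced to a bucket index. */
static size_t hash_text(const char *text)
{
    size_t h = 5381;

    while (*text != '\0')
        h = h * 33 + (unsigned char)*text++;

    return h % POOL_BUCKETS;
}

/* Returns the segment for 'text', allocating it only the first time. */
shared_seg_t *seg_pool_get(seg_pool_t *pool, const char *text)
{
    shared_seg_t *seg;
    size_t slot, len;

    assert(pool != NULL && text != NULL);

    slot = hash_text(text);

    for (seg = pool->buckets[slot]; seg != NULL; seg = seg->next)
        if (strcmp(seg->text, text) == 0) {
            seg->refs++;
            return seg;
        }

    len = strlen(text) + 1;
    seg = malloc(sizeof(*seg) + len);
    if (seg == NULL)
        return NULL;

    memcpy(seg->text, text, len);
    seg->refs = 1;
    seg->next = pool->buckets[slot];
    pool->buckets[slot] = seg;
    pool->unique++;

    return seg;
}

/* Drops one reference; frees the segment once nobody uses it anymore. */
void seg_pool_put(seg_pool_t *pool, shared_seg_t *seg)
{
    shared_seg_t **link;

    assert(pool != NULL && seg != NULL && seg->refs > 0);

    if (--seg->refs > 0)
        return;

    link = &pool->buckets[hash_text(seg->text)];

    for (; *link != NULL; link = &(*link)->next)
        if (*link == seg) {
            *link = seg->next;
            pool->unique--;
            free(seg);
            return;
        }
}
```

With such a pool, two lines displaying the same mnemonic hold the same pointer, so the text and its bookkeeping are paid for once, whatever the number of lines.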
There are still a lot of improvements to make, but fortunately there are also a lot of ideas to lower the memory footprint.
For instance, the GLib signal system will be avoided: a connected signal allocates its handler, which costs 40 bytes. And code buffers currently register two signals for each line they contain, so about 31 MB of memory (781259 lines x 40 bytes) are used only for line signal processing.
Misc
Chrysalide was crashing from time to time when loading binaries: realloc() may change base addresses and does not initialize new memory. The related code has been fixed (commit d800cb1).
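
Both pitfalls are easy to hit with any growing array. The sketch below shows one defensive pattern, using a hypothetical `int_array_t` type which has nothing to do with the structures fixed in the commit: the result of realloc() is checked before replacing the old pointer, the new tail is cleared explicitly, and callers are expected to keep indexes rather than pointers across a resize.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    int *items;        /* may move each time the array grows */
    size_t capacity;   /* number of allocated (and initialized) items */
} int_array_t;

/* Grows the array to 'wanted' items; returns 0 on success, -1 on error. */
int int_array_reserve(int_array_t *array, size_t wanted)
{
    int *moved;

    assert(array != NULL);

    if (wanted <= array->capacity)
        return 0;

    /* Keep the old pointer until realloc() succeeds, to avoid a leak. */
    moved = realloc(array->items, wanted * sizeof(int));
    if (moved == NULL)
        return -1;

    /* realloc() does not initialize the extra space: clear it here. */
    memset(moved + array->capacity, 0,
           (wanted - array->capacity) * sizeof(int));

    /* The block may have moved: pointers into the old one are stale. */
    array->items = moved;
    array->capacity = wanted;

    return 0;
}
```

After a call to `int_array_reserve()`, an element should be reached again through `array->items[index]`, never through a pointer computed before the call.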
Mouse clicks in the GUI when no binary is loaded yet may lead to crashes. Some bugs have been fixed (commit 3a259b1); once again, many thanks to @Julien_Legras for the bug report!
Two bugs have been fixed with a tiny 1-byte long change each time:
- a mistake when decoding sparse-switch and packed-switch payloads (commit 4ff85e3).
- even the first basic block can have a loop to itself (commit 3e6c0fb).
There is still a pending bug when trying to detect loops... It can be avoided by commenting out the call to rank_routine_blocks() in "src/analysis/disass/routines.c", but this call is needed to get graphs... Work in progress!
Posted on October 30, 2016 at 14:51.