This is the second digest of what happened in the development of Chrysalide during the last month.

The following news is based on commit df579a2, so you can give this version of Chrysalide a try by running:

git clone http://git.0xdeadc0de.fr/chrysalide.git
cd chrysalide
git checkout df579a2

And then follow the installation procedure.

Graphical view

This part is one of the most interesting changes this month!

Drawing graphical nodes relies on computing logical nodes first, so the whole core has been improved:

Lots of old code in "src/gtkext/graph/" has been removed and replaced by a new graphic logic in the "cluster.[ch]" files (commit 3628caa).

This work makes it possible to introduce new graphs, with well-drawn edges between well-placed nodes. Improvements have been released (commit 3cf2601), and more code needs to be written, but the current situation looks promising.

It was a long fight with Cairo's API to render graph nodes with nice shadows, and a good way to learn how operators, groups, masks and paths work (commits ef174ce and 0fdcf7a).

If possible, the final graph view is centered on the screen. Quite simple, but this had never been done with the previous version of the graph engine (commit fbf9c54).
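As a rough illustration of the centering rule, here is a minimal sketch. It assumes that "if possible" means "when the graph is smaller than the visible area"; the function name centered_offset() is hypothetical and is not taken from Chrysalide's code.

```c
#include <assert.h>

/* Hypothetical helper: offset to apply along one axis so that some content
 * of `content_size` pixels appears centered in a view of `view_size` pixels.
 * Assumption: when the content does not fit, no centering is attempted. */
int centered_offset(int view_size, int content_size)
{
    assert(view_size >= 0 && content_size >= 0);

    if (content_size >= view_size)
        return 0;

    return (view_size - content_size) / 2;
}
```

The same function would be called once for the horizontal axis and once for the vertical one.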

As an anecdote: an invisible 1x1 widget is required to extend the GtkFixed container so that all drawn shadows get displayed.

A lot of things are still on the TODO list for the graphical view: no crossing edges, loop support, and so on. Stay tuned!

Dex and Dalvik support

All Dex pool items are now loaded using several threads, which speeds up the whole loading process. This has been made possible by switching the Flex/Bison pair to a reentrant mode (commit 0b6e87d).
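One ingredient of such parallel loading is splitting the pool indexes between workers. The sketch below only shows that partitioning step; worker_range() is a hypothetical name, and the way Chrysalide really distributes the work between its threads may well differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: compute the half-open range [*begin, *end) of pool
 * items handled by worker number `index` out of `workers`.
 * The first (total % workers) workers get one extra item each. */
void worker_range(size_t total, size_t workers, size_t index,
                  size_t *begin, size_t *end)
{
    size_t base;
    size_t extra;

    assert(workers > 0 && index < workers);

    base = total / workers;
    extra = total % workers;

    *begin = index * base + (index < extra ? index : extra);
    *end = *begin + base + (index < extra ? 1 : 0);
}
```

Each thread then loads only the items in its own range, which is safe only if the parser it calls keeps no global state, hence the need for a reentrant Flex/Bison pair.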

Moreover, due to a small mistake, the Dex strings were truncated to the size of the variable containing their length. This is now fixed (commit f3e8472).
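This class of mistake is easy to reproduce: using sizeof(length) where length itself was meant. The sketch below is only an illustration and not the code from the commit; both function names are hypothetical. It also simplifies the format by treating the ULEB128 prefix as a byte count, which only holds for plain ASCII content (real Dex strings are MUTF-8, NUL-terminated, and the prefix counts UTF-16 code units).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Decode an unsigned LEB128 value (at most 5 bytes for 32 bits).
 * Returns the number of bytes consumed, or 0 on truncated input. */
size_t read_uleb128(const uint8_t *data, size_t avail, uint32_t *value)
{
    uint32_t result = 0;
    size_t i;

    assert(data != NULL && value != NULL);

    for (i = 0; i < avail && i < 5; i++) {
        result |= (uint32_t)(data[i] & 0x7f) << (7 * i);
        if ((data[i] & 0x80) == 0) {
            *value = result;
            return i + 1;
        }
    }

    return 0;
}

/* Hypothetical helper: copy a length-prefixed string into a new buffer.
 * Returns NULL on malformed input or allocation failure. */
char *copy_prefixed_string(const uint8_t *data, size_t avail)
{
    uint32_t length;
    size_t used = read_uleb128(data, avail, &length);
    char *copy;

    if (used == 0 || length > avail - used)
        return NULL;

    copy = malloc((size_t)length + 1);
    if (copy == NULL)
        return NULL;

    /* The pitfall: sizeof(length) here would always copy 4 bytes. */
    memcpy(copy, data + used, length);
    copy[length] = '\0';

    return copy;
}
```

With sizeof(length) in the memcpy() call, every string longer than four bytes would come out truncated, which matches the symptom described above.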

The switch and fill-array data pseudo-instructions are now handled in the core program, without the need of an external plugin (commit 8dff3da).

Switch cases with or without (commit 722539f) fallthrough are recognized and supported, and comments provide information about the value linked to each case (commit 4c5f0e1).

Heavy load

Chrysalide is not yet ready to load big binaries (here "big" means a few MB)... A lot of allocations are made to handle all instructions, and all rendering lines produce a lot of allocations too.

So the system may quickly run out of memory and freeze because of Chrysalide. This kind of behavior was reported early on by @Julien_Legras, thanks to him!

GNOME's optimization guide describes how to use Massif to track memory allocations, and the results are a good starting point for improving things.

The work has begun and the memory footprint has already been a little bit reduced by:

A small "quick and dirty" patch is available to track GLib instances and to get a better view of the evolution. Please note that the patch only tracks allocations: released memory is not taken into account!

The Android application "Toilettes à Paris" has been used as a reference for the tests: it has a 2.4 MB Dex file, and Chrysalide needs about 1.5 GB of memory to load it.

Here are some statistics for some main internal structures:

GLib object         instances  old size  old consumption  new size  new consumption  saved space in memory
GDalvikInstruction     281027       296       83 183 992       168       47 212 536             35 971 456
GRawInstruction        403453       304      122 649 712       160       64 552 480             58 097 232
GImmOperand            952195        96       91 410 720        64       60 940 480             30 470 240
GBufferLine            781259       248      193 752 232       256      200 002 304             -6 250 072
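Each consumption figure above is simply the number of instances multiplied by the size of one instance, in bytes. A small sketch to re-derive the numbers (the function names are hypothetical, chosen for this example only):

```c
#include <assert.h>
#include <stdint.h>

/* Total bytes used by `instances` objects of `size` bytes each. */
int64_t consumption(int64_t instances, int64_t size)
{
    assert(instances >= 0 && size >= 0);

    return instances * size;
}

/* Bytes saved when an object shrinks from `old_size` to `new_size`
 * (the result is negative if the object actually grew). */
int64_t saved_bytes(int64_t instances, int64_t old_size, int64_t new_size)
{
    return consumption(instances, old_size) - consumption(instances, new_size);
}
```

For example, saved_bytes(281027, 296, 168) gives the 35 971 456 bytes listed for GDalvikInstruction, and saved_bytes(781259, 248, 256) gives the -6 250 072 bytes of GBufferLine. Summed over the four rows, the net saving is 118 288 856 bytes.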

The new version of GBufferLine failed to reduce memory consumption, but the changes are required to support the shared segment mechanism. And sharing segments saves in this case 33 438 193 bytes of raw text, plus 102 825 024 bytes for already allocated segments (9 127 606 segments are used when disassembling the selected Android application).

There are still many improvements to make, but fortunately there are plenty of ideas to lower the memory footprint.

For instance, the GLib signal system will be avoided: each connected signal allocates a handler, which costs 40 bytes. Code buffers currently register two signals for each line they contain, so about 31 MB of memory (781259 lines x 40 bytes) is used only for line signal processing.

Misc

Chrysalide was crashing from time to time when loading binaries: realloc() may change base addresses and does not initialize new memory. The related code has been fixed (commit d800cb1).
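Both realloc() pitfalls can be handled in one place. The following is a minimal sketch, not the code from the commit; the int_array type and int_array_reserve() are hypothetical names used for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable array, used only for this illustration. */
typedef struct {
    int *items;
    size_t capacity;
} int_array;

/* Grow the array to hold at least `wanted` items.
 * Returns 0 on success, -1 on failure (the old block stays valid). */
int int_array_reserve(int_array *arr, size_t wanted)
{
    int *grown;

    assert(arr != NULL);

    if (wanted <= arr->capacity)
        return 0;

    if (wanted > SIZE_MAX / sizeof(int))
        return -1;

    grown = realloc(arr->items, wanted * sizeof(int));
    if (grown == NULL)
        return -1;

    /* realloc() leaves the added bytes uninitialized: clear them. */
    memset(grown + arr->capacity, 0,
           (wanted - arr->capacity) * sizeof(int));

    /* The block may have moved: always refresh the base address. */
    arr->items = grown;
    arr->capacity = wanted;

    return 0;
}
```

Callers should keep indexes rather than pointers into the array, since any pointer obtained before a call to int_array_reserve() may be dangling afterwards.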

Mouse clicks in the GUI when no binary is loaded yet could lead to crashes. Some bugs have been fixed (commit 3a259b1), once again many thanks to @Julien_Legras for the bug report!

Two bugs have been fixed with a tiny 1-byte long change each time:

There is still a pending bug when trying to detect loops... It can be avoided by commenting out the call to rank_routine_blocks() in "src/analysis/disass/routines.c", but this call is needed to get graphs... Work in progress!


Posted on October 30, 2016 at 14:51.