Python tutorial
There are two ways to use Chrysalide from Python:
- as a standalone extension from the Python interpreter.
- as an embedded extension when running the GUI.
Here are some basic steps to introduce both of these usages.
Loading a binary file
First case: known format
If the binary format is known and fixed, the loading process is straightforward:
```python
from pychrysalide.features import *

cnt = FileContent('/path/to/binary_file')
fmt = ElfFormat(cnt)
binary = LoadedBinary(fmt)
binary.analyze_and_wait()
```
- line 1: import all items available from the Chrysalide bindings.
  For a more selective loading, the following lines import only the module items actually used:

  ```python
  from pychrysalide.analysis.contents import FileContent
  from pychrysalide.analysis import LoadedBinary
  from pychrysalide.format.elf import ElfFormat
  ```

- line 3: load content from a binary file.
  Other sources could have been memory or content encapsulated in another content.
- line 4: set up a matching file format.
  ElfFormat could have been replaced by DexFormat depending on the case.
- line 5: create an abstract layer to deal with all high-level analysis requests.
  The related ArchProcessor class can be retrieved from this loaded binary if needed.
- line 6: analysis of the file format and code instructions starts here.
  Execution will block until the analysis ends.
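When the format is not known for certain, the choice between ElfFormat and DexFormat can be made from the first bytes of the file. The following is a minimal sketch that does not call Chrysalide at all: guess_format_name is a hypothetical helper, and it only returns the name of the class one would then instantiate. The magic values are the standard ELF and DEX signatures.

```python
# Hypothetical helper, not part of the Chrysalide bindings.
ELF_MAGIC = b'\x7fELF'   # standard ELF signature
DEX_MAGIC = b'dex\n'     # standard DEX signature (a version string follows)


def guess_format_name(head):
    """Return the name of a matching format class, or None if unknown."""
    if head.startswith(ELF_MAGIC):
        return 'ElfFormat'
    if head.startswith(DEX_MAGIC):
        return 'DexFormat'
    return None


def guess_format_name_from_file(path):
    """Read the first bytes of a file and guess its format."""
    with open(path, 'rb') as fd:
        return guess_format_name(fd.read(4))
```

For anything beyond these two cases, the project-based loading described in the next section lets Chrysalide resolve the format itself.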
Base for other cases
Here is the general process to load binaries, which remains quite simple:
```python
from pychrysalide.features import *

prj = StudyProject()

cnt = FileContent('/path/to/binary_file')
prj.discover(cnt)
wait_for_all_global_works()

binary = prj.contents[0]
```
- line 3: create a fresh, empty project, which will act as a placeholder for the loaded binary.
- line 5: load content from a binary file.
  Other sources could have been memory or content encapsulated in another content.
- line 6: two processes are launched at once:
  - one to explore the provided binary and its inner contents (for targets like APK files).
  - another one to resolve the discovered contents (to match a given file format, such as ELF).
- line 7: both processes must finish before the loaded binary contents can be used.
- line 9: for a simple file format without inner binaries, only one format is resolved and only one binary content is loaded and analyzed.
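The steps above can be wrapped in a small function. This is only a sketch: load_binaries is a hypothetical name, the project and the wait function are passed as arguments so that the sketch does not depend on the imports, and both the iteration over contents and the check for an empty result are assumptions about how the bindings behave.

```python
def load_binaries(project, content, wait):
    """Discover content through a project and return the loaded binaries.

    project: an object with discover() and contents, such as StudyProject()
    content: the binary content to explore, such as FileContent(path)
    wait:    a callable blocking until background work ends,
             such as wait_for_all_global_works
    """
    project.discover(content)

    # Exploration and resolution run in the background: wait for both.
    wait()

    # Assumption: contents can be iterated like a regular sequence.
    binaries = list(project.contents)

    # Assumption: an unrecognized file leaves the project empty.
    if not binaries:
        raise ValueError('no binary could be loaded from the content')

    return binaries
```

With the bindings imported as above, a call would look like load_binaries(StudyProject(), FileContent('/path/to/binary_file'), wait_for_all_global_works). A container such as an APK file may yield several entries instead of one.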
Working with binaries
Browsing symbols
Strings are a special kind of symbol, which can be handled with code like:
```python
for s in binary.format.symbols:

    if not(s.stype in [ SymbolType.RO_STRING, SymbolType.DYN_STRING ]):
        continue

    print('0x%04x - %s' % (s.range.addr.phys, s.label))

    origin = binary.content.read_raw(s.range.addr, s.range.length)
    print(' -> origin:', origin)

    print(' -> raw:', s.raw)
    print(' -> utf8:', s.utf8)

    assert(s.stype == SymbolType.DYN_STRING or s.utf8 == origin.decode('utf-8'))
```
- line 3: even if it belongs to the BinSymbol class, the SymbolType enumeration can be accessed directly here, thanks to the initial import of all the features.
  RO_STRING is the type of strings loaded directly from the original binary content.
  DYN_STRING marks strings rebuilt during analysis.
- line 8: if the string has been encrypted using a xor-like algorithm, the original source data can still be read from the binary content.
- line 11: s.raw provides the raw content of the current string value.
  If the string was decrypted by Chrysalide, then the displayed bytes are the plain text values.
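To see why the original bytes and the decoded string can differ, here is a toy illustration of a xor-like scheme in plain Python. It does not reproduce Chrysalide's own processing; the single-byte key and the xor_bytes name are assumptions made for the example.

```python
def xor_bytes(data, key):
    """Apply a single-byte XOR key to every byte of data."""
    return bytes(b ^ key for b in data)


plain = 'hello'.encode('utf-8')

# What would be stored in the file: unreadable as-is.
stored = xor_bytes(plain, 0x2a)

# XOR is its own inverse: applying the same key again restores the text.
restored = xor_bytes(stored, 0x2a)

print(stored != plain, restored.decode('utf-8'))
```

In the loop above, origin plays the role of stored for a DYN_STRING symbol, which is why the final assertion only compares origin with s.utf8 for the other strings.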
Analysing instructions
Disassembled instructions can be accessed using the processor attribute of loaded binaries, but also from basic blocks:
```python
biggest = None

for s in binary.format.symbols:

    if s.stype != SymbolType.ROUTINE:
        continue

    if biggest is None or s.basic_blocks.count > biggest.basic_blocks.count:
        biggest = s


def show_block(blk, grp):

    first, last = blk.boundaries

    print('Block @ 0x%x: %s - %s'
          % (first.range.addr.phys, first.keyword, last.keyword), end='')

    for db, dt in blk.destinations:

        desc = str(dt)

        # desc is for instance: 'InstructionLinkType.JUMP_IF_TRUE'
        desc = desc[desc.find('.') + 1:]

        print(' |-> 0x%x (%s)' % (db.boundaries[0].range.addr.phys, desc), end='')

    print()


for bb in biggest.basic_blocks:
    show_block(bb, biggest.basic_blocks)
```
- line 1: basic blocks are stored in the routine owning them.
  This first loop only looks for the biggest routine, according to the number of basic blocks.
- line 8: basic_blocks is an iterable object, and it also provides a count attribute giving the number of blocks directly.
- line 14: block boundaries refer to a start instruction and an ending instruction.
- line 16: instructions are objects with attributes such as keywords and locations.
- line 19: for sources and destinations, each link between code blocks is a pair made of the linked block and the link type.
- line 26: each instruction link type (such as JUMP_IF_TRUE) is defined using a constant value.
  For Python, these values are exported in the ArchInstruction class using the InstructionLinkType enumeration.
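The string slicing used to shorten the link type can also be written as a small helper. The sketch below uses a stand-in enumeration, since the bindings are not imported here: FakeLinkType and link_label are hypothetical names, the numeric values are arbitrary, and the assumption is that the real InstructionLinkType values print as 'InstructionLinkType.NAME', as the comment in the code above indicates.

```python
import enum


class FakeLinkType(enum.Enum):
    """Stand-in for the real enumeration, for illustration only."""
    JUMP_IF_TRUE = 1    # arbitrary value
    JUMP_IF_FALSE = 2   # arbitrary value


def link_label(link_type):
    """Return the short name of a link type, without the class prefix."""
    desc = str(link_type)
    # Keep what follows the last dot, or the whole text if there is none.
    return desc.rpartition('.')[2]


print(link_label(FakeLinkType.JUMP_IF_TRUE))
```

Going through str() keeps the helper independent of how the enumeration is implemented, at the cost of relying on its printed form.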
GUI integration
Custom panel
There are plenty of panels in the GUI main window.
Creating a new plugin is a way to build a new kind of panel. The first step is to set up a directory for this new plugin:
```shell
mkdir hellopanel
echo "from hellopanel.core import HelloPlugin as AutoLoad" > hellopanel/__init__.py
```
The directory containing the hellopanel directory has to be in $PYTHONPATH.
At startup, Chrysalide will load the HelloPlugin class as a plugin because AutoLoad is an alias for it.
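The same setup can be scripted from Python, which may be convenient when several plugins are generated. This is a sketch under assumptions: create_plugin_skeleton is a hypothetical helper, and it only writes the directory and the __init__.py file described above.

```python
from pathlib import Path


def create_plugin_skeleton(parent, name='hellopanel'):
    """Create the plugin directory and its __init__.py under parent."""
    plugin_dir = Path(parent) / name
    plugin_dir.mkdir(parents=True, exist_ok=True)

    # AutoLoad is the alias looked for when the plugin gets loaded.
    init_file = plugin_dir / '__init__.py'
    init_file.write_text(
        'from %s.core import HelloPlugin as AutoLoad\n' % name)

    return init_file
```

The parent directory given to the helper is the one that has to be listed in $PYTHONPATH.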
The content of the hellopanel/core.py file is the following one:
```python
from pychrysalide.features import *
from .panel import HelloPanel

class HelloPlugin(PluginModule):
    """Simple demo plugin to build a GUI panel."""

    def __init__(self):
        """Initialize the plugin for Chrysalide."""

        interface = {
            'name' : 'HelloPanel',
            'desc' : 'Say hello in the main GUI',

            'version' : '0.1',

            'actions' : ( )

        }

        super(HelloPlugin, self).__init__(**interface)

        p = HelloPanel()

        register_panel(p)
```
- line 2: import the class providing the widget used for the panel.
- line 10: a dictionary is defined to provide the plugin description.
- line 20: the call to super().__init__() actually builds the new object.
- line 24: the built panel is registered as a legitimate panel for Chrysalide.
The last file is hellopanel/panel.py; its content is:
```python
from gi.repository import Gtk
from pychrysalide.features import *

DEFAULT_MSG = 'No active loaded binary'

class HelloPanel(PanelItem):

    def __init__(self):
        """Initialize the GUI panel."""

        self._label = Gtk.Label()
        self._label.set_text(DEFAULT_MSG)

        params = {

            'name' : 'Hello',
            'widget' : self._label,

            'personality' : PanelItem.PIP_SINGLETON,
            'lname' : 'Hello panel description',
            'dock' : True,
            'path' : 'MEN'

        }

        super(HelloPanel, self).__init__(**params)

    def _change_content(self, old, new):
        """Get notified about loaded content change."""

        if type(new) is LoadedBinary:

            count = len(list(new.processor.instrs))

            self._label.set_text('Loaded binary with %u instructions' % count)

        else:

            self._label.set_text(DEFAULT_MSG)
```
A few last words:
- line 6: the panel has to be a subclass of the PanelItem class.
- line 26: some properties are transmitted to the parent constructor:
  - name: label for the GUI tab.
  - widget: GTK widget for the panel.
  - personality: defines how many panels can be created at the same time.
  - lname: long description for tooltips.
  - dock: True if the panel should be displayed at startup.
  - path: location of the panel in the tiled grid (M = Main area, E = East, N = North, and so on).
- line 28: if defined, the _change_content method will be called each time a new content becomes active in the GUI.
With the 'MEN' path used here, the panel will be docked at the upper right corner of the main window.
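The way a path string maps to a location can be spelled out with a few lines of plain Python. This is only an illustration, not how Chrysalide parses the value: describe_path is a hypothetical helper, and the W and S letters are assumed by analogy, since this tutorial only lists M, E and N.

```python
# Letters documented above, plus two assumed by analogy (marked below).
PATH_LETTERS = {
    'M': 'Main area',
    'E': 'East',
    'N': 'North',
    'W': 'West',    # assumption: not listed in this tutorial
    'S': 'South',   # assumption: not listed in this tutorial
}


def describe_path(path):
    """Spell out each letter of a panel path such as 'MEN'."""
    return [PATH_LETTERS[letter] for letter in path]


print(' > '.join(describe_path('MEN')))
```

Read from left to right, 'MEN' goes to the east side of the main area, then to the north part of that side, which matches the upper right corner.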
Other resources
This tutorial does not explain all the Python API yet.
The reference documentation provides information about all available features.
The snippets repository contains more advanced examples. Some real-world plugins are also available in Chrysalide's source code.