DEV IN PROGRESS

Module pychrysalide.analysis

Documentation

This module provides bindings for all Chrysalide analysis-related features.

Sub modules

Classes

Class BinContent

A BinContent is an abstract object which handles access to a given binary content.

All of its implementations are located in the contents module. The main implementation is the FileContent class.

The following methods have to be defined for new implementations:

Hierarchy

builtins.object
 ╰── gi._gi.GObject
      ╰── pychrysalide.analysis.BinContent

Implements: pychrysalide.analysis.storage.SerializableObject

Known subclasses:

Methods

_describe(self, full)

Abstract method used to build a (full?) description of the binary content.

The description is returned as a string.

_read_raw(self, addr, length)

Abstract method used to provide the bytes read from a given position.

The location of the data to read is a vmpa instance.

_read_u16(self, addr, endian)

Abstract method used to read two unsigned bytes from a given position.

The location of the data to read is a vmpa instance. The endianness of the data can be provided using SourceEndian values.

The returned value is the read data or None in case of error.

_read_u32(self, addr, endian)

Abstract method used to read four unsigned bytes from a given position.

The location of the data to read is a vmpa instance. The endianness of the data can be provided using SourceEndian values.

The returned value is the read data or None in case of error.

_read_u64(self, addr, endian)

Abstract method used to read eight unsigned bytes from a given position.

The location of the data to read is a vmpa instance. The endianness of the data can be provided using SourceEndian values.

The returned value is the read data or None in case of error.

_read_u8(self, addr)

Abstract method used to read an unsigned byte from a given position.

The location of the data to read is a vmpa instance.

The returned value is the read data or None in case of error.
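As a rough illustration of the contract described by the abstract readers above, the following sketch uses a plain Python class instead of a real BinContent subclass. Everything in it is an assumption made for the example: the class name BytesBackedContent, the integer offsets standing in for vmpa instances, and the strings 'little' and 'big' standing in for SourceEndian values.

```python
import struct


class BytesBackedContent:
    """Hypothetical stand-in mimicking the BinContent abstract readers."""

    def __init__(self, data):
        self._data = bytes(data)

    def _read_raw(self, addr, length):
        # Return the requested bytes, or None if the range is out of bounds.
        if addr < 0 or length < 0 or addr + length > len(self._data):
            return None
        return self._data[addr:addr + length]

    def _read_unsigned(self, addr, endian, code, size):
        # Shared helper: decode 'size' bytes with the given struct code.
        raw = self._read_raw(addr, size)
        if raw is None:
            return None
        prefix = '<' if endian == 'little' else '>'
        return struct.unpack(prefix + code, raw)[0]

    def _read_u8(self, addr):
        # A single byte has no endianness.
        raw = self._read_raw(addr, 1)
        return None if raw is None else raw[0]

    def _read_u16(self, addr, endian):
        return self._read_unsigned(addr, endian, 'H', 2)

    def _read_u32(self, addr, endian):
        return self._read_unsigned(addr, endian, 'I', 4)

    def _read_u64(self, addr, endian):
        return self._read_unsigned(addr, endian, 'Q', 8)


content = BytesBackedContent(b'\x01\x02\x03\x04\x05\x06\x07\x08')
print(hex(content._read_u16(0, 'little')), hex(content._read_u16(0, 'big')))
```

The stand-in returns None for any read crossing the end of the data, which matches the error convention stated for the readers.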

describe(self, full=False)

Get a (full ?) description of the binary content.

read_raw(self, addr, length)

Read bytes from a given position.

read_u16(self, addr, endian)

Read two unsigned bytes from a given position.

The location of the data to read is a vmpa instance. The endianness of the data can be provided using SourceEndian values.

The returned value is the read data or None in case of error.

read_u32(self, addr, endian)

Read four unsigned bytes from a given position.

The location of the data to read is a vmpa instance. The endianness of the data can be provided using SourceEndian values.

The returned value is the read data or None in case of error.

read_u64(self, addr, endian)

Read eight unsigned bytes from a given position.

The location of the data to read is a vmpa instance. The endianness of the data can be provided using SourceEndian values.

The returned value is the read data or None in case of error.

read_u8(self, addr)

Read an unsigned byte from a given position.

The location of the data to read is a vmpa instance.

The returned value is the read data or None in case of error.

Attributes

attributes

Provide or define the attributes linked to the binary content.

checksum

Compute a SHA256 hash as checksum of handled data.
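For reference, a SHA256 digest of some bytes can be computed with the standard hashlib module as sketched below. The helper name sha256_checksum is made up for the example, and rendering the digest as a lowercase hexadecimal string is an assumption: this page does not state the exact form taken by the checksum attribute.

```python
import hashlib


def sha256_checksum(data):
    """Hypothetical helper: SHA256 digest of some bytes, as a hex string."""
    return hashlib.sha256(data).hexdigest()


print(sha256_checksum(b'example bytes'))
```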

data

Provide all the content bytes at once.

end_pos

Provide the ending position of the binary content.

root

Provide, as a BinContent instance, the root content leading to the current content.

This property is relevant only for EncapsulatedContent objects.

size

Compute the quantity of readable bytes.

start_pos

Provide the starting position of the binary content.

Constants

MemoryDataSize

Size of processed data.

0= 0x0
1= 0x1
2= 0x2
3= 0x3
4= 0x4
5= 0x5
129= 0x81
130= 0x82
131= 0x83
132= 0x84
133= 0x85
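The values listed above fall into two groups: 0 to 5, and the values 1 to 5 again with bit 0x80 set. The sketch below only makes that bit pattern explicit. The names FLAG_BIT and split_size_value are invented for the example, and the meaning of the high bit (a signedness marker would be a plausible guess) is not stated on this page.

```python
FLAG_BIT = 0x80  # hypothetical name for the high bit seen in 0x81..0x85


def split_size_value(value):
    """Split a raw value into (flag_is_set, base_value)."""
    return bool(value & FLAG_BIT), value & ~FLAG_BIT


for value in (0x1, 0x5, 0x81, 0x85):
    print(hex(value), split_size_value(value))
```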

SourceEndian

Endianness of handled data.

0= 0
1= 1
2= 2
3= 3

Class BinRoutine

BinRoutine is an object for a function in a binary.

Instances can be created using the following constructor:

    BinRoutine()

As routines can be built from demangling, with no information other than a name at first glance, the usual process is to create a routine object and to define its core properties (namely a location range and a symbol type) after this operation.

The object can be compared using rich methods (like <= or !=) and produce an "informal" string representation of itself with a call to str().

Hierarchy

builtins.object
 ╰── gi._gi.GObject
      ╰── pychrysalide.format.BinSymbol
           ╰── pychrysalide.analysis.BinRoutine

Implements:

Known subclasses:

Attributes

args

Arguments for the routine, provided as a tuple of BinVariable instances.

basic_blocks

Basic blocks for the routine.

This list is managed by a BlockList instance.

name

String for the raw name of the routine, or None if there is none.

namespace

Namespace of the routine, provided as a DataType instance, or None if there is none.

return_type

Return type of the routine, provided as a DataType instance, or None if there is none.

typed_name

Typed name of the routine, provided as a DataType instance, or None if there is none.

When a routine is built from a demangling operation, its final name carries some type information. This kind of information can be retrieved thanks to this attribute.

Class BinVariable

PyChrysalide binary variable

The object can produce an "informal" string representation of itself with a call to str().

Hierarchy

builtins.object
 ╰── gi._gi.GObject
      ╰── pychrysalide.analysis.BinVariable

Attributes

name

Name of the current variable.

type

Type of the current variable.

Class BlockList

PyChrysalide basic block

Hierarchy

builtins.object
 ╰── gi._gi.GObject
      ╰── pychrysalide.analysis.BlockList

Methods

__iter__(self)

Implement iter(self).

find_by_addr(self, addr)

Find a code block containing a given address.

Attributes

count

Quantity of code blocks included in the list.

Class CodeBlock

PyChrysalide code block

Hierarchy

builtins.object
 ╰── gi._gi.GObject
      ╰── pychrysalide.analysis.CodeBlock

Known subclass: pychrysalide.analysis.disass.BasicBlock

Attributes

destinations

List of destination blocks.

Each item of the resulting tuple is a pair made of a CodeBlock instance and an InstructionLinkType value.

index

Index of the code block in the parent list, if any.

rank

Rank of the code block.

sources

List of source blocks.

Each item of the resulting tuple is a pair made of a CodeBlock instance and an InstructionLinkType value.

Class ContentAttributes

ContentAttributes is a set of values used at binary content loading.

Such parameters are useful to transmit a password for encrypted contents, for instance. These parameters can be accessed like dictionary items:

    password = attributes['password']
    attributes['password'] = 'updated'

Instances can be created using the following constructor:

    ContentAttributes(path)

Where path is a list of parameters: '[...]&key0=value0&key1=value1...'

The constructor returns a tuple containing a ContentAttributes instance and the original target filename.
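To make the path format more concrete, here is a minimal pure-Python sketch of how such a string could be split into a filename and a set of key/value pairs. It is not the ContentAttributes implementation: the function name split_attribute_path is hypothetical, and the sketch assumes that the leading '[...]' part is the target filename and that no escaping of '&' or '=' is involved.

```python
def split_attribute_path(path):
    """Hypothetical parser for 'filename&key0=value0&key1=value1' strings."""
    filename, *pairs = path.split('&')
    attributes = {}
    for pair in pairs:
        key, sep, value = pair.partition('=')
        # Ignore malformed chunks lacking a key or an '=' sign.
        if key and sep:
            attributes[key] = value
    return attributes, filename


attributes, filename = split_attribute_path('archive.zip&password=secret')
print(filename, attributes['password'])
```

The return order mirrors the tuple described above: the attributes first, then the filename.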

The object can provide some sequence methods (such as len() or [n]).

Hierarchy

builtins.object
 ╰── gi._gi.GObject
      ╰── pychrysalide.analysis.ContentAttributes

Attributes

keys

Keys of all attributes contained in a set of values.

Class ContentExplorer

PyChrysalide content explorer

Hierarchy

builtins.object
 ╰── gi._gi.GObject
      ╰── pychrysalide.analysis.ContentExplorer

Methods

populate_group(self, wid, content)

Push a new binary content into the list to explore.

Class ContentResolver

PyChrysalide content resolver

Hierarchy

builtins.object
 ╰── gi._gi.GObject
      ╰── pychrysalide.analysis.ContentResolver

Methods

add_detected(self, wid, loaded)

Add a binary content as loaded content ready to get analyzed.

Class DataType

The DataType object is the base class for all data types.

Instances can be created using the following constructor:

    DataType()

The following methods have to be defined for new classes:

Some extra method definitions are optional for new classes:

The object can produce an "informal" string representation of itself with a call to str().

Hierarchy

builtins.object
 ╰── gi._gi.GObject
      ╰── pychrysalide.analysis.DataType

Implements: pychrysalide.analysis.storage.SerializableObject

Known subclasses:

Methods

_dup(self)

Abstract method used to create a copy of a data type.

The returned value has to be a new instance of the DataType class.

_handle_namespaces(self)

Abstract method used to state if the type handles namespaces or not.

The returned value is a boolean. If this method does not exist, the True value is assumed.

_hash(self)

Abstract method used to create a hash of the data type.

The returned value has to be a 32-bit integer.

_is_pointer(self)

Abstract method used to state if the type points to another type or not.

The returned value is a boolean. If this method does not exist, the False value is assumed.

_to_string(self, include)

Abstract method used to provide the string representation of a data type.

The include argument defines if the type namespace has to get prepended, if it exists.

The returned value has to be a string.
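The contract of these abstract methods can be sketched with a plain Python class, as below. This is only a stand-in that does not derive from DataType: the class name NamedType, its constructor arguments, the '::' separator and the use of zlib.crc32 to obtain a 32-bit value are all choices made for the example.

```python
import zlib


class NamedType:
    """Hypothetical stand-in following the DataType abstract method contract."""

    def __init__(self, name, namespace=None, separator='::'):
        self._name = name
        self._namespace = namespace
        self._separator = separator

    def _dup(self):
        # A copy has to be a new, independent instance.
        return NamedType(self._name, self._namespace, self._separator)

    def _handle_namespaces(self):
        return True

    def _is_pointer(self):
        return False

    def _to_string(self, include):
        # Prepend the namespace only if requested and if it exists.
        if include and self._namespace is not None:
            return self._namespace + self._separator + self._name
        return self._name

    def _hash(self):
        # crc32 is one simple way to get a value fitting in 32 bits.
        return zlib.crc32(self._to_string(True).encode('utf-8')) & 0xFFFFFFFF


sample = NamedType('string', 'std')
print(sample._to_string(True), sample._to_string(False))
```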

dup(self)

Create a copy of a data type.

The returned value has to be a new instance of the DataType class.

Attributes

handle_namespaces

True if the type handles namespaces, False otherwise.

hash

Hash value for the type, as a 32-bit integer.

Each property change implies a hash change.

is_pointer

True if the type is a pointer, False otherwise.

is_reference

True if the type is a reference, False otherwise.

namespace

Namespace for the type, or None if there is none.

This property carries a tuple of two values:

  • a namespace, as a TypeQualifier instance;
  • a namespace separator, as a string.

qualifiers

Qualifier for the data type, or TypeQualifier.NONE if there is none.

This property carries a TypeQualifier value.

Constants

TypeQualifier

Qualifier for a data type.

0= 0
1= 1
2= 2
4= 4
7= 7
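The listed values (0, 1, 2, 4 and 7) look like a bit mask, with 7 being the combination of 1, 2 and 4. The sketch below models this with the standard enum module. Only the NONE name appears on this page; the class name QualifierSketch and the member names BIT0, BIT1, BIT2 and ALL are placeholders, not the real constant names.

```python
from enum import IntFlag


class QualifierSketch(IntFlag):
    """Hypothetical mirror of the listed values; names are placeholders."""
    NONE = 0
    BIT0 = 1
    BIT1 = 2
    BIT2 = 4
    ALL = 7


def has_qualifier(value, flag):
    """Tell if every bit of flag is set in value."""
    return (value & flag) == flag


combined = QualifierSketch.BIT0 | QualifierSketch.BIT2
print(has_qualifier(combined, QualifierSketch.BIT2))
```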

Class LoadedBinary

PyChrysalide loaded binary

Hierarchy

builtins.object
 ╰── gi._gi.GObject
      ╰── pychrysalide.analysis.LoadedContent
           ╰── pychrysalide.analysis.LoadedBinary

Methods

add_to_collection(self, item)

Ask a server to include the given item into the update database.

The server type (internal or remote) depends on the collection type linked to the item and the user configuration.

The item has to be a subclass of DbItem.

The method returns True if the item has been successfully forwarded to a server, False otherwise.

find_collection(self, feature)

Provide the collection managing a given database feature.

The feature is a value of type DbItemFlags.

get_client(self)

Provide the client connected to an internal or remote server if defined, or return None otherwise.

The returned object is an AnalystClient instance or None.

set_last_active(self, timestamp)

Define the timestamp of the last active item in the collection and return the status of the request transmission.

Attributes

collections

List of all collections of database items linked to the binary.

disassembly_cache

Give access to the disassembly graphical cache, which is a BufferCache instance or None.

In graphical mode, the cache is built by default. Otherwise, the build depends on the cache argument provided at the analysis call (please refer to the LoadedContent interface for more information about this kind of call).

format

Format recognized in the binary content.

processor

Handler for the current binary processor.

Class LoadedContent

The LoadedContent object is an intermediary level of abstraction for all loaded binary contents to analyze.

No matter if the loaded content comes from an ELF file or XML data, some basic features are available here.

A typical class declaration for a new implementation looks like:

    class NewImplem(GObject.Object, LoadedContent):
        ...

The following methods have to be defined for new implementations:

Hierarchy

builtins.object
 ╰── gi._gi.GObject
      ╰── pychrysalide.analysis.LoadedContent

Known subclass: pychrysalide.analysis.LoadedBinary

Methods

_analyze(self, connect, cache, gid, status)

Abstract method used to start the analysis of the loaded binary.

The connect parameter defines if connections to database servers (internal and/or remote) will be established. The default value depends on the running mode: if the analysis is run from the GUI, the binary will get connected to servers; in batch mode, no connection will be made.

The cache parameter rules the build of the cache for rendering lines. The same behavior relative to the running mode applies.

The gid identifier refers to the working queue used to process the analysis. A reference to the main status bar may also be provided, as a StatusStack instance if running in graphical mode or None otherwise.

_describe(self, full)

Abstract method used to describe the loaded content.

The boolean full parameter shapes the size of the returned string.

This method is mainly used to provide a label (or a tooltip text) for tabs in the graphical main window.

_get_content(self)

Abstract method used to get the binary content linked to the loaded content. The result is provided as a BinContent instance.

_get_content_class(self, human)

Abstract method used to provide the nature of the loaded content.

The description associated to a loaded ARM Elf binary is for instance 'elf-armv7', or 'Elf, ARMv7' for the human version.

analyze(self, connect=?, cache=?)

Start the analysis of the loaded binary and send an analyzed signal when done.

The connect parameter defines if connections to database servers (internal and/or remote) will be established. The default value depends on the running mode: if the analysis is run from the GUI, the binary will get connected to servers; in batch mode, no connection will be made.

The cache parameter rules the build of the cache for rendering lines. The same behavior relative to the running mode applies.

All these operations can be forced by providing True values as parameters.

analyze_and_wait(self, connect=?, cache=?)

Run the analysis of the loaded binary and wait for its completion.

The final analysis status is returned as boolean.

The connect parameter defines if connections to database servers (internal and/or remote) will be established. The default value depends on the running mode: if the analysis is run from the GUI, the binary will get connected to servers; in batch mode, no connection will be made.

The cache parameter rules the build of the cache for rendering lines. The same behavior relative to the running mode applies.

All these operations can be forced by providing True values as parameters.

describe(self, full)

Describe the loaded content.

The boolean full parameter shapes the size of the returned string.

This method is mainly used to provide a label (or a tooltip text) for tabs in the graphical main window.

detect_obfuscators(self, version)

List all detected obfuscators.

If the version parameter is equal to True, the operation tries to resolve obfuscator versions too.

The result is a tuple of strings or an empty tuple.

Attributes

content

Binary content, provided as a BinContent instance.

content_class

Nature of the loaded content.

The description associated to a loaded ARM Elf binary is for instance 'elf-armv7'.

content_class_for_human

Human-readable version of the nature of the loaded content.

The description associated to a loaded ARM Elf binary is for instance 'Elf, ARMv7'.

Class StudyProject

PyChrysalide study project

Hierarchy

builtins.object
 ╰── gi._gi.GObject
      ╰── pychrysalide.analysis.StudyProject

Methods

attach(self, loaded)

Add a loaded content to the project.

discover(self, content, cache, filter)

Explore a new binary content for the project.

save(self, filename)

Save the project into a given file.

Attributes

contents

List of all loaded contents for the project.