Module pychrysalide.format
Documentation
This module contains the basic definitions required for dealing with file formats.
Support for specific formats (such as ELF files) needs extra definitions in a dedicated module.
Sub modules
Classes
Class BinFormat
The BinFormat class is the major part of binary format support. It is the core class used when loading most binary files.
One item has to be defined as a class attribute in the final class:
_endianness: a SourceEndian value indicating the endianness of the format.
Calls to the __init__ constructor of this abstract object expect no particular argument.
Hierarchy
builtins.object ╰── gi._gi.GObject ╰── pychrysalide.format.KnownFormat ╰── pychrysalide.format.BinFormat
Implements: pychrysalide.analysis.storage.SerializableObject
Known subclass: pychrysalide.format.ExeFormat
Methods
add_error(self, type, addr, desc)
Extend the list of detected errors linked to the format.
The type of error has to be one of the BinaryFormatError flags. The location of the error is a vmpa instance, and a one-line description should give some details about what has failed.
add_symbol(self, symbol)
Register a new symbol for the format.
The symbol has to be a BinSymbol instance.
find_next_symbol_at(self, addr)
Find the symbol following the one located at a given address, provided as a vmpa instance.
The result is a BinSymbol instance, or None if no symbol was found.
find_symbol_at(self, addr)
Find the symbol located at a given address, provided as a vmpa instance.
The result is a BinSymbol instance, or None if no symbol was found.
find_symbol_by_label(self, label)
Find the symbol with a given label, provided as a string.
The result is a BinSymbol instance, or None if no symbol was found.
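The lookup methods above return None when nothing is found, so callers need to check the result before reading attributes such as label. The following is a minimal sketch of that pattern; FakeFormat, FakeSymbol and label_at are hypothetical stand-ins written for this example (addresses are plain integers rather than vmpa instances), not part of pychrysalide.

```python
class FakeSymbol:
    """Hypothetical stand-in exposing only the `label` attribute."""

    def __init__(self, label):
        self.label = label


class FakeFormat:
    """Hypothetical stand-in exposing only `find_symbol_at`."""

    def __init__(self, table):
        self._table = table

    def find_symbol_at(self, addr):
        # Mirrors the documented contract: a symbol, or None if nothing is found.
        return self._table.get(addr)


def label_at(fmt, addr, default="<unknown>"):
    """Return the label of the symbol at addr, or a default when there is none."""
    sym = fmt.find_symbol_at(addr)
    return default if sym is None else sym.label


fmt = FakeFormat({0x1000: FakeSymbol("main")})
found = label_at(fmt, 0x1000)
missing = label_at(fmt, 0x2000)
```

With a real BinFormat, the same helper would presumably be called with a vmpa instance as the address.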
has_flag(self, flag)
Test if a binary format has a given property.
This property is one of the values listed in the FormatFlag enumeration.
The result is a boolean value.
register_code_point(self, point, level)
Register a virtual address as entry point or basic point.
The point is an integer value for the virtual memory location of the new (entry) point. The type of this entry has to be a DisassPriorityLevel value.
remove_symbol(self, symbol)
Unregister a symbol from the format.
The symbol has to be a BinSymbol instance.
resolve_symbol(self, addr, strict)
Search for a position inside a symbol by a given address.
The result is a (BinSymbol, offset) pair, or None if no symbol was found. The offset is the distance between the start location of the symbol and the location provided as argument.
If the search is run in strict mode, then the offset is always 0 upon success.
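The difference between strict and non-strict resolution can be illustrated with a small model. The sketch below does not use pychrysalide: symbols are hypothetical (label, start, length) tuples and addresses are plain integers. It assumes that strict mode only matches the exact start of a symbol, which is one reading of the description above rather than confirmed behaviour.

```python
from typing import Optional, Sequence, Tuple

# Hypothetical stand-in for a symbol: (label, start address, length in bytes).
Symbol = Tuple[str, int, int]


def resolve_symbol(symbols: Sequence[Symbol], addr: int,
                   strict: bool) -> Optional[Tuple[Symbol, int]]:
    """Return (symbol, offset) for the symbol matching addr, or None."""
    for sym in symbols:
        _label, start, length = sym
        if strict:
            # Assumed strict behaviour: only the start of a symbol matches.
            if addr == start:
                return sym, 0
        elif start <= addr < start + length:
            # The offset is the distance from the start of the symbol.
            return sym, addr - start
    return None


symbols = [("main", 0x1000, 0x40), ("helper", 0x1040, 0x10)]
inside = resolve_symbol(symbols, 0x1008, strict=False)
exact = resolve_symbol(symbols, 0x1040, strict=True)
missed = resolve_symbol(symbols, 0x1008, strict=True)
```

In this model, an address in the middle of a symbol resolves only in non-strict mode, and a strict match always carries an offset of 0.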
set_flag(self, flag)
Add a property to a binary format.
This property is one of the values listed in the FormatFlag enumeration.
If the flag was not set before the operation, True is returned, else the result is False.
unset_flag(self, flag)
Remove a property from a binary format.
This property is one of the values listed in the FormatFlag enumeration.
If the flag was not set before the operation, False is returned, else the result is True.
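The return values of set_flag and unset_flag report whether the call actually changed the state. The sketch below models that contract with a hypothetical FlagHolder class and a placeholder bit value; it is not the pychrysalide implementation, only an illustration of the documented semantics.

```python
class FlagHolder:
    """Hypothetical stand-in mirroring the documented flag semantics."""

    def __init__(self) -> None:
        self.flags = 0

    def has_flag(self, flag: int) -> bool:
        return (self.flags & flag) == flag

    def set_flag(self, flag: int) -> bool:
        # True only if the flag was not set before the call.
        changed = not self.has_flag(flag)
        self.flags |= flag
        return changed

    def unset_flag(self, flag: int) -> bool:
        # True only if the flag was set before the call.
        changed = self.has_flag(flag)
        self.flags &= ~flag
        return changed


FLAG = 0x1  # placeholder bit, standing in for a FormatFlag value

holder = FlagHolder()
results = [
    holder.set_flag(FLAG),    # first set: state changes
    holder.set_flag(FLAG),    # second set: nothing to do
    holder.unset_flag(FLAG),  # first unset: state changes
    holder.unset_flag(FLAG),  # second unset: nothing to do
]
```

A caller can therefore use the result to detect redundant updates without calling has_flag first.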
Attributes
endianness
Endianness of the format. The return value is of type SourceEndian.
errors
List of all detected errors which occurred while loading the binary.
The result is a tuple of (BinaryFormatError, vmpa, string) values, providing a location and a description for each error.
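Since each entry is a three-element tuple, the errors can be unpacked directly in a loop. The sketch below assumes only the tuple layout described above; summarize_errors is a hypothetical helper, and the object passed to it is a stand-in built with SimpleNamespace, with a plain string in place of a vmpa location.

```python
from types import SimpleNamespace


def summarize_errors(fmt) -> list:
    """Build one text line per (error type, location, description) entry."""
    lines = []
    for kind, addr, desc in fmt.errors:
        lines.append(f"[{kind}] {addr}: {desc}")
    return lines


# Stand-in object: a real BinFormat would provide BinaryFormatError values
# and vmpa locations instead of an integer and a string.
fake_format = SimpleNamespace(errors=((4, "0x400", "truncated header"),))
report = summarize_errors(fake_format)
```

How a vmpa instance renders inside an f-string depends on the library, so the exact output with a real format may differ.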
flags
Provide all the flags set for a format. The return value is of type FormatFlag.
symbols
Iterable list of all symbols found in the binary format.
The returned iterator is a SymIterator instance and remains valid as long as the list of symbols in the format does not change.
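Because the iterator stops being valid once the symbol list changes, code that removes symbols while walking the list should copy the symbols first. The sketch below shows that pattern with a hypothetical FakeFormat stand-in, where symbols are plain strings; with the real library the predicate would more likely inspect attributes such as label.

```python
class FakeFormat:
    """Hypothetical stand-in exposing `symbols` and `remove_symbol` only."""

    def __init__(self, labels):
        self._labels = list(labels)

    @property
    def symbols(self):
        return iter(self._labels)

    def remove_symbol(self, symbol):
        self._labels.remove(symbol)


def remove_matching(fmt, predicate) -> int:
    """Remove every symbol accepted by predicate; return how many were removed."""
    # Copy first: removing a symbol would invalidate a live iterator.
    snapshot = list(fmt.symbols)
    removed = 0
    for sym in snapshot:
        if predicate(sym):
            fmt.remove_symbol(sym)
            removed += 1
    return removed


fmt = FakeFormat(["main", "tmp_1", "tmp_2", "helper"])
count = remove_matching(fmt, lambda s: s.startswith("tmp_"))
remaining = list(fmt.symbols)
```

The snapshot costs one extra list, which is usually acceptable compared with the risk of iterating over a list that is being modified.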
Constants
BinaryFormatError
Flags for errors occurring while loading a binary format.
4 | = 4 |
FormatFlag
Extra indications for formats.
1 | = 0x1 |
Class BinSymbol
BinSymbol represents all kinds of symbols, such as strings, routines or objects. If something can be linked to a physical or virtual location, it can be a symbol.
Instances can be created using the following constructor:
BinSymbol(range, stype)
Where range is a memory space defined by mrange and stype a SymbolType value.
The following methods have to be defined for new classes:
Objects can be compared using rich comparison methods (such as <= or !=).
Hierarchy
builtins.object ╰── gi._gi.GObject ╰── pychrysalide.format.BinSymbol
Implements:
Known subclasses:
Methods
_get_label(self)
Abstract method used to provide the default label for a symbol.
The returned value has to be a string.
has_flag(self, flag)
Test if a binary symbol has a given property.
This property is one of the values listed in the SymbolFlag enumeration.
The result is a boolean value.
set_flag(self, flag)
Add a property to a binary symbol.
This property is one of the values listed in the SymbolFlag enumeration.
If the flag was not set before the operation, True is returned, else the result is False.
unset_flag(self, flag)
Remove a property from a binary symbol.
This property is one of the values listed in the SymbolFlag enumeration.
If the flag was not set before the operation, False is returned, else the result is True.
Attributes
flags
Provide all the flags set for a symbol. The return value is of type SymbolFlag.
label
Label of the symbol, provided by the internal component or by the user.
nm_prefix
Single-byte string for an optional nm prefix, or None if there is none.
range
Memory range covered by the symbol.
This property is a mrange instance.
status
Status of the symbol's visibility, as a value of type SymbolStatus.
stype
Type of the current symbol, as a value of type SymbolType.
Constants
SymbolFlag
Extra indications for symbols.
1 | = 0x1 |
SymbolStatus
Status of a symbol's visibility.
0 | = 0 |
1 | = 1 |
2 | = 2 |
3 | = 3 |
SymbolType
Available values for symbol types.
0 | = 0 |
1 | = 1 |
2 | = 2 |
3 | = 3 |
4 | = 4 |
5 | = 5 |
6 | = 6 |
7 | = 7 |
Class ExeFormat
PyChrysalide executable format
Hierarchy
builtins.object ╰── gi._gi.GObject ╰── pychrysalide.format.KnownFormat ╰── pychrysalide.format.BinFormat ╰── pychrysalide.format.ExeFormat
Implements: pychrysalide.analysis.storage.SerializableObject
Known subclasses:
- pychrysalide.format.FlatFormat
- pychrysalide.format.dex.DexFormat
- pychrysalide.format.elf.ElfFormat
- pychrysalide.format.pe.PeFormat
Methods
register_user_portion(self, portion)
Remember a given user-defined binary portion as part of the executable format content.
translate_address_into_vmpa(self, addr)
Translate a virtual address to a full location.
translate_offset_into_vmpa(self, off)
Translate a physical offset to a full location.
Class FlatFormat
FlatFormat is suitable for all executable contents without a proper file format, such as shellcode or eBPF programs.
Instances can be created using the following constructor:
FlatFormat(content, machine, endian)
Where content is a BinContent object, machine defines the target architecture and endian provides the right endianness of the data.
Hierarchy
builtins.object ╰── gi._gi.GObject ╰── pychrysalide.format.KnownFormat ╰── pychrysalide.format.BinFormat ╰── pychrysalide.format.ExeFormat ╰── pychrysalide.format.FlatFormat
Implements: pychrysalide.analysis.storage.SerializableObject
Class KnownFormat
KnownFormat is a small class providing basic features for recognized formats.
One item has to be defined as a class attribute in the final class:
_key: a string providing a small name used to identify the format.
The following methods have to be defined for new classes:
The following method may also be defined for new classes:
Calls to the __init__ constructor of this abstract object expect only one argument: a binary content, provided as a BinContent instance.
Hierarchy
builtins.object ╰── gi._gi.GObject ╰── pychrysalide.format.KnownFormat
Implements: pychrysalide.analysis.storage.SerializableObject
Known subclasses:
Methods
_analyze(self, gid, status)
Abstract method used to start the analysis of the known format and return its status.
The identifier refers to the working queue used to process the analysis. A reference to the main status bar may also be provided, as a StatusStack instance if running in graphical mode or None otherwise.
The expected result of the call is a boolean.
_complete_analysis(self, gid, status)
Abstract method used to complete an analysis of a known format.
The identifier refers to the working queue used to process the analysis. A reference to the main status bar may also be provided, as a StatusStack instance if running in graphical mode or None otherwise.
_get_description(self)
Abstract method used to build a description of the format.
The result is expected to be a string.
analyze(self, gid, status)
Start the analysis of the known format and return its status.
Once this analysis is done, a few early symbols and the mapped sections are expected to be defined, if any.
The identifier refers to the working queue used to process the analysis. A reference to the main status bar may also be provided, as a StatusStack instance if running in graphical mode or None otherwise.
The return value is a boolean status of the operation.
complete_analysis(self, gid, status)
Complete an analysis of a known format.
This process is usually done once the disassembling process is completed.
The identifier refers to the working queue used to process the analysis. A reference to the main status bar may also be provided, as a StatusStack instance if running in graphical mode or None otherwise.
The return value is a boolean status of the operation.
Attributes
content
Binary content linked to the known format.
The result is a BinContent instance.
description
Human-readable description of the known format, as a string.
key
Internal name of the known format, provided as a (tiny) string.
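The pieces a subclass has to supply (the _key class attribute and the underscore-prefixed methods) can be sketched as follows. To keep the example runnable without the library, it uses a hypothetical KnownFormatStandIn base in place of pychrysalide.format.KnownFormat and raw bytes in place of a BinContent instance. The way key, description and analyze() delegate to _key, _get_description() and _analyze() is an assumption drawn from the descriptions above, and TinyFormat with its "TINY" magic is invented for illustration.

```python
class KnownFormatStandIn:
    """Stand-in for pychrysalide.format.KnownFormat (assumed delegation only)."""

    _key = None

    def __init__(self, content):
        self.content = content

    @property
    def key(self):
        return self._key

    @property
    def description(self):
        return self._get_description()

    def analyze(self, gid, status):
        return self._analyze(gid, status)


class TinyFormat(KnownFormatStandIn):
    """Hypothetical format recognized by a 4-byte magic number."""

    _key = "tiny"  # small name used to identify the format

    def _get_description(self):
        return "Tiny demonstration format"

    def _analyze(self, gid, status):
        # `status` is None outside graphical mode, so it is ignored here.
        return self.content[:4] == b"TINY"


fmt = TinyFormat(b"TINY\x00\x01")
ok = fmt.analyze(0, None)
```

With the real base class, the content would be a BinContent instance and the magic check would go through its own reading interface rather than byte slicing.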
Class PreloadInfo
The PreloadInfo object stores all kinds of disassembling information available from the analysis of a file format itself.
Instances can be created using the following constructor:
PreloadInfo()
Hierarchy
builtins.object ╰── gi._gi.GObject ╰── pychrysalide.format.PreloadInfo
Known subclass: pychrysalide.arch.ProcContext
Class StrSymbol
StrSymbol is a special symbol object dedicated to strings.
Instances can be created using one of the following constructors:
StrSymbol(encoding, format=KnownFormat, range=mrange)
StrSymbol(encoding, string=string, addr=vmpa)
The first constructor is intended for read-only strings available from the raw data of the analyzed binary. The format provides the raw content, and the memory range specifies the location of the string.
The second constructor is useful for strings which cannot be extracted directly from the original content, such as obfuscated strings. A dynamic string is provided in this case, together with the start point of this string.
In both cases, the encoding remains the first argument, as a StringEncodingType value.
Hierarchy
builtins.object ╰── gi._gi.GObject ╰── pychrysalide.format.BinSymbol ╰── pychrysalide.format.StrSymbol
Implements:
- pychrysalide.analysis.storage.SerializableObject
- pychrysalide.arch.operands.ProxyFeeder
- pychrysalide.glibext.LineGenerator
Attributes
encoding
Encoding of the string, provided as a StringEncodingType value.
raw
Raw data of the string, provided as bytes.
structural
True if the string symbol is linked to the file structure, else False.
utf8
String content as UTF-8 data.
Constants
StringEncodingType
Kinds of encoding for strings.
0 | = 0 |
1 | = 1 |
2 | = 2 |
3 | = 3 |
4 | = 4 |
Class SymIterator
Iterator for Chrysalide symbols registered in a given format.
This iterator is built when accessing the symbols field.
Hierarchy
builtins.object ╰── pychrysalide.format.SymIterator
Methods
__iter__(self)
Implement iter(self).
__next__(self)
Implement next(self).