Rules definition
Editing an initial rule
Converting existing rules
Scanning from command line
Found patterns in details
Embedding in Python scripts
Automate a scan
Rules definition
ROST is a content scanner relying on rule(s) describing how to raise a match status.
Extended rule definitions are detailed in the next pages of the documentation; the current page shows some basics as introduction.
Editing an initial rule
The only mandatory part inside a ROST rule is the match condition.
A simple rule can thus be really short:
From Shell:
# Create an always matching rule
$ echo "rule Simple { condition: true }" > first_rule.rost
# Use this simple rule with ROST to scan itself
$ rost ./first_rule.rost /proc/self/exe
Rule 'Simple' has matched!
Converting existing rules
The ROST's grammar slightly differs from the YARA's one. The documentation later lists some of these differences.
As there are plenty of rules for YARA available on the Internet, it may be interesting to reuse them. A tool named yara2rost
can translate this kind of rules effortlessly:
From Shell:
$ echo "rule MyShell { condition: filesize == $( stat -c '%s' /bin/bash ) }" | yara2rost
rule MyShell {
condition:
datasize == 1265648
}
As ROST supports reading rules from its standard input, translated rules can be used on the fly:
From Shell:
$ echo "rule MyShell { condition: filesize == $( stat -c '%s' /bin/bash ) }" | yara2rost | rost - /bin/bash
Rule 'MyShell' has matched!
Note: the dash is optional in this case and remains here for clarity.
Scanning from command line
Found patterns in details
ROST handles a -s
switch when run from command line. For each found pattern, this argument provides the relative locations inside the scanned content:
From Shell:
$ echo "rule Bash { bytes: \$magic = \"\\x7fELF\" \$name = \"GNU bash\" fullword condition: true }" | \
rost -s /bin/bash
0x0:$magic: \x7fELF
0xf0004:$name: GNU bash
0xf4f45:$name: GNU bash
Rule 'Bash' has matched!
Output patterns are sorted by declarations first, offsets next.
ROST also handles a -j
switch which provides a JSON output, suitable for further processing.
For instance, here are the full results stored for the first processed rule ([0]
):
From Shell:
$ echo "rule Bash { bytes: \$magic = \"\\x7fELF\" \$name = \"GNU bash\" fullword condition: true }" | \
rost -j /bin/bash | jq .[0]
{
"name": "Bash",
"bytes_patterns": [
{
"name": "$magic",
"match_count": 1,
"matches": [
{
"offset": 0,
"offset_hex": "0x0",
"content": "\u007fELF",
"content_str": "\\x7fELF",
"length": 4,
"length_hex": "0x4"
}
]
},
{
"name": "$name",
"match_count": 2,
"matches": [
{
"offset": 983044,
"offset_hex": "0xf0004",
"content": "GNU bash",
"content_str": "GNU bash",
"length": 8,
"length_hex": "0x8"
},
{
"offset": 1003333,
"offset_hex": "0xf4f45",
"content": "GNU bash",
"content_str": "GNU bash",
"length": 8,
"length_hex": "0x8"
}
]
}
],
"matched": true
}
Embedding in Python scripts
Automate a scan
Scan can be driven from Python code. Here is a minimal snippet reproducing the previous behavior:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 | from pychrysalide.analysis.contents import FileContent from pychrysalide.analysis.scan import ContentScanner from pychrysalide.analysis.scan import ScanOptions from pychrysalide.analysis.scan.patterns.backends import AcismBackend if __name__ == '__main__': options = ScanOptions() options.backend_for_data = AcismBackend content = FileContent('/bin/bash') rule = ''' rule Bash { bytes: $magic = "\x7fELF" $name = "GNU bash" fullword condition: $magic and $name } ''' scanner = ContentScanner(rule) ctx = scanner.analyze(options, content) print(ctx.has_match_for_rule('Bash')) |
A few comments:
- line 1: all import lines can be replaced by a single one:
from pychrysalide.features import *
. - line 9: options are defined to tune the scan process. For instance, the Aho-Corasick parallel string search algorithm, using interleaved arrays, is selected as the search backend here.
- line 11: the file
/bin/bash
is expected as scan target, so the FileContent class is suitable here; a MemoryContent object could have been involved in case of memory scan for instance. - line 26: the
scanner
variable may be set toNone
if a syntax error is raised while parsing the rule.
The Python script can be run using a classic interpreter; it outputs the expected boolean status:
From Shell:
$ python3 ./scan-bash.py
True
To achieve a JSON processing, the following lines can be appended to the script, replacing the old status output:
1 2 3 4 5 6 7 8 9 | if ctx.has_match_for_rule('Bash'): import json data = scanner.convert_to_json(ctx) jdata = json.loads(data) for p in jdata[0]['bytes_patterns']: print('Match count for %s: %u' % (p['name'], p['match_count'])) |
Running the Python interpreter confirms the previous results:
From Shell:
$ python3 ./scan-bash.py
Match count for $magic: 1
Match count for $name: 2