DEV IN PROGRESS

Rules definition

ROST is a content scanner that relies on rules describing when a match is reported.

Extended rule definitions are detailed in the next pages of the documentation; this page covers some basics as an introduction.

Editing an initial rule

The only mandatory part inside a ROST rule is the match condition.

A simple rule can thus be really short:

From Shell:

# Create an always matching rule
$ echo "rule Simple { condition: true }" > first_rule.rost

# Use this simple rule with ROST to scan itself
$ rost ./first_rule.rost /proc/self/exe
Rule 'Simple' has matched!
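
Since only the condition is mandatory, rule text is easy to generate from a script. The following is a minimal sketch in Python: the helper name make_rule is hypothetical (it is not part of ROST), and it only emits the bytes: and condition: sections that appear on this page. Pattern definitions are inserted verbatim, so quoting and escaping remain the caller's responsibility.

```python
def make_rule(name, condition="true", patterns=None):
    """Render a ROST rule as text (hypothetical helper, not part of ROST).

    `patterns` maps a pattern name such as "$magic" to the literal text
    placed after the equals sign, e.g. '"GNU bash" fullword'.
    """
    lines = ["rule %s {" % name]
    if patterns:
        # Optional section: only emitted when patterns are provided.
        lines.append("    bytes:")
        for pattern_name, definition in patterns.items():
            lines.append("        %s = %s" % (pattern_name, definition))
    # Mandatory section: every rule needs a condition.
    lines.append("    condition:")
    lines.append("        %s" % condition)
    lines.append("}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # The always-matching rule from the shell example above.
    print(make_rule("Simple"), end="")
```

The generated text can be written to a file such as first_rule.rost and passed to rost as shown above.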

Converting existing rules

The ROST grammar differs slightly from the YARA one. The documentation later lists some of these differences.

As plenty of YARA rules are available on the Internet, reusing them can be worthwhile. A tool named yara2rost translates such rules:

From Shell:

$ echo "rule MyShell { condition: filesize == $( stat -c '%s' /bin/bash ) }" | yara2rost
rule MyShell {

    condition:
        datasize == 1265648

}

As ROST supports reading rules from its standard input, translated rules can be used on the fly:

From Shell:

$ echo "rule MyShell { condition: filesize == $( stat -c '%s' /bin/bash ) }" | yara2rost | rost - /bin/bash
Rule 'MyShell' has matched!

Note: the dash is optional in this case and remains here for clarity.
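
The same on-the-fly usage can be scripted. The sketch below, in Python, feeds rule text to rost through its standard input with subprocess.run. The helper names are hypothetical, and placing extra switches before the dash is an assumption. The exit status of rost is not described on this page, so it is deliberately left unchecked.

```python
import shutil
import subprocess


def rost_command(target, flags=()):
    """Build the argument list for a scan that reads rules from stdin."""
    # The explicit dash mirrors the `rost - /bin/bash` form shown above.
    return ["rost", *flags, "-", target]


def scan_with_rule_text(rule_text, target, flags=()):
    """Feed `rule_text` to rost on stdin and return its standard output.

    Returns None when no `rost` executable is found on the PATH.
    """
    if shutil.which("rost") is None:
        return None
    proc = subprocess.run(
        rost_command(target, flags),
        input=rule_text,      # rule text is sent on standard input
        capture_output=True,
        text=True,
        check=False,          # exit status semantics are not assumed
    )
    return proc.stdout


if __name__ == "__main__":
    print(rost_command("/bin/bash"))
    print(scan_with_rule_text("rule Simple { condition: true }", "/bin/bash"))
```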

Scanning from command line

Found patterns in detail

ROST accepts a -s switch when run from the command line. For each pattern found, this switch prints the offsets of its matches, relative to the start of the scanned content:

From Shell:

$ echo "rule Bash { bytes: \$magic = \"\\x7fELF\" \$name = \"GNU bash\" fullword condition: true }" | \
    rost -s /bin/bash
0x0:$magic: \x7fELF
0xf0004:$name: GNU bash
0xf4f45:$name: GNU bash
Rule 'Bash' has matched!

Output patterns are sorted by declaration order first, then by offset.
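
When this text output has to be consumed by a script, each match line can be split into its three fields. The following Python sketch assumes the line layout seen in the listing above (hexadecimal offset, pattern name, content, separated by colons); the function name parse_match_line is hypothetical, and the layout is inferred from this single example rather than from a format specification.

```python
def parse_match_line(line):
    """Split one `-s` output line into (offset, pattern name, content).

    Returns None for lines that do not look like a match line, such as
    the final "Rule '...' has matched!" status.
    """
    # Split on the first two colons only, so the content may contain colons.
    parts = line.rstrip("\n").split(":", 2)
    if len(parts) != 3 or not parts[0].startswith("0x"):
        return None
    offset_text, name, content = parts
    try:
        offset = int(offset_text, 16)
    except ValueError:
        return None
    # Drop the single space that follows the second colon.
    if content.startswith(" "):
        content = content[1:]
    return offset, name, content


# Copy of the output shown above, used here as sample input.
SAMPLE = """\
0x0:$magic: \\x7fELF
0xf0004:$name: GNU bash
0xf4f45:$name: GNU bash
Rule 'Bash' has matched!
"""

if __name__ == "__main__":
    for sample_line in SAMPLE.splitlines():
        parsed = parse_match_line(sample_line)
        if parsed is not None:
            print(parsed)
```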

ROST also accepts a -j switch, which produces JSON output suitable for further processing.

For instance, here are the full results stored for the first processed rule ([0]):

From Shell:

$ echo "rule Bash { bytes: \$magic = \"\\x7fELF\" \$name = \"GNU bash\" fullword condition: true }" | \
    rost -j /bin/bash | jq .[0]
{
  "name": "Bash",
  "bytes_patterns": [
    {
      "name": "$magic",
      "match_count": 1,
      "matches": [
        {
          "offset": 0,
          "offset_hex": "0x0",
          "content": "\u007fELF",
          "content_str": "\\x7fELF",
          "length": 4,
          "length_hex": "0x4"
        }
      ]
    },
    {
      "name": "$name",
      "match_count": 2,
      "matches": [
        {
          "offset": 983044,
          "offset_hex": "0xf0004",
          "content": "GNU bash",
          "content_str": "GNU bash",
          "length": 8,
          "length_hex": "0x8"
        },
        {
          "offset": 1003333,
          "offset_hex": "0xf4f45",
          "content": "GNU bash",
          "content_str": "GNU bash",
          "length": 8,
          "length_hex": "0x8"
        }
      ]
    }
  ],
  "matched": true
}
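
As an illustration of such further processing, the Python sketch below turns a report into byte ranges for each pattern. It relies only on the keys visible in the output above and assumes the top level is a list of rules, as the jq .[0] index suggests. The sample data is a trimmed copy of that output, and the function name match_spans is hypothetical.

```python
import json

# Trimmed copy of the `rost -j` output shown above (one rule, two patterns).
SAMPLE = r'''
[
  {
    "name": "Bash",
    "bytes_patterns": [
      {"name": "$magic", "match_count": 1,
       "matches": [{"offset": 0, "length": 4}]},
      {"name": "$name", "match_count": 2,
       "matches": [{"offset": 983044, "length": 8},
                   {"offset": 1003333, "length": 8}]}
    ],
    "matched": true
  }
]
'''


def match_spans(report):
    """Map (rule name, pattern name) to a list of (start, end) offsets.

    `end` is exclusive: it is the match offset plus the match length.
    """
    spans = {}
    for rule in report:
        for pattern in rule["bytes_patterns"]:
            key = (rule["name"], pattern["name"])
            spans[key] = [
                (m["offset"], m["offset"] + m["length"])
                for m in pattern["matches"]
            ]
    return spans


if __name__ == "__main__":
    for key, ranges in match_spans(json.loads(SAMPLE)).items():
        print(key, ranges)
```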

Embedding in Python scripts

Automate a scan

A scan can also be driven from Python code. Here is a minimal snippet reproducing the previous behavior:

from pychrysalide.analysis.contents import FileContent
from pychrysalide.analysis.scan import ContentScanner
from pychrysalide.analysis.scan import ScanOptions
from pychrysalide.analysis.scan.patterns.backends import AcismBackend

if __name__ == '__main__':

    options = ScanOptions()
    options.backend_for_data = AcismBackend

    content = FileContent('/bin/bash')

    rule = '''
rule Bash {

    bytes:
        $magic = "\x7fELF"
        $name = "GNU bash" fullword

    condition:
        $magic and $name

}
'''

    scanner = ContentScanner(rule)
    ctx = scanner.analyze(options, content)

    print(ctx.has_match_for_rule('Bash'))

A few comments:

  • line 1: all the import lines can be replaced by a single one: from pychrysalide.features import *.
  • line 9 (options.backend_for_data): options tune the scan process. Here, the Aho-Corasick parallel string search algorithm, using interleaved arrays, is selected as the search backend.
  • line 11 (FileContent): the scan target is the file /bin/bash, so the FileContent class is suitable; a MemoryContent object could be used instead for a memory scan, for instance.
  • line 26 (ContentScanner): the scanner variable may be set to None if a syntax error is raised while parsing the rule.
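
The last remark suggests checking the scanner before using it. A minimal sketch of such a guard follows; the helper name rule_matched is hypothetical, and it only calls the two methods already used in the snippet above (analyze and has_match_for_rule). Raising ValueError is an arbitrary choice made for this example.

```python
def rule_matched(scanner, options, content, rule_name):
    """Run a scan and report whether `rule_name` matched.

    `scanner`, `options` and `content` are the objects built in the
    script above. A None scanner is treated as a rule parsing failure.
    """
    if scanner is None:
        raise ValueError("the rule text could not be parsed")
    ctx = scanner.analyze(options, content)
    return ctx.has_match_for_rule(rule_name)
```

In the script above, this would replace the final two statements with print(rule_matched(scanner, options, content, 'Bash')).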

The Python script can be run using a classic interpreter; it outputs the expected boolean status:

From Shell:

$ python3 ./scan-bash.py

True

To process the results as JSON, the following lines can be appended to the script, replacing the previous status output:

    if ctx.has_match_for_rule('Bash'):

        import json

        data = scanner.convert_to_json(ctx)
        jdata = json.loads(data)

        for p in jdata[0]['bytes_patterns']:
            print('Match count for %s: %u' % (p['name'], p['match_count']))

Running the Python interpreter confirms the previous results:

From Shell:

$ python3 ./scan-bash.py

Match count for $magic: 1

Match count for $name: 2