DEV IN PROGRESS

Scan rule body

A ROST rule is usually divided into the following sections:

  • the meta section, which provides some context for the expected matches;
  • the bytes section, which declares the search patterns to look for;
  • the condition section, which defines the final status of a scan: whether there is a match or not.

The condition section is the only one which is required in order to define a valid rule.

A rule file can contain several rule definitions; each of them must have a unique name, though.

Identity of the definition

The meta section aims to provide extra information about a defined rule.

This section lists key/value pairs, with arbitrary identifiers as keys and text, integers or booleans as values.

Example:

rule MetadataExample
{
    meta:
        identifier_0 = "Some string data"
        identifier_1 = 123
        identifier_2 = true

    condition:
        true
}

A common good practice is to document the following items as additional information:

  • author: name, email address, Twitter handle;
  • description: overview of the rule's purpose;
  • dates: creation, update;
  • reference: link to an article or a Tweet which was the source of inspiration for the rule's definition;
  • hashes: MD5, SHA1 or SHA256 of the samples for which the rule has been validated;
  • TLP: rule's sharing boundaries according to the Traffic Light Protocol designations.

Example:

rule RuleWithGoodMetadata
{
    meta:
        author = "@me"
        description = "Detection of samples from the XXX operation of APTn"
        reference = "https://site.com/technical-analysis/"
        created = "2023-09-19"
        last_modified = "2023-09-20"
        hash = "5177a58dc65c8a14dc90c69db3bf3dd2"
        hash = "aaf9ff488e0767da5ea1d56118e6f65a16c5633b0cefc1fa089bd3ab1810613d"
        TLP = "CLEAR"

    condition:
        true
}

Metadata is used for documentation only: the key/value pairs cannot be used in the condition section.

Search patterns

The bytes section is described in more detail in its own documentation page.

There are three kinds of search patterns:

  • text strings, providing human-readable patterns;
  • hexadecimal bytes, defining raw sequences of content;
  • regular expressions, for complex search patterns.

While any text can be translated into hexadecimal bytes, the opposite is not always possible.

Likewise, raw bytes can be translated into regular expressions, but not all regular expressions can be rewritten as raw bytes.
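These two one-way translations can be illustrated outside of ROST with a short Python sketch. The helper names text_to_hex and bytes_to_regex are hypothetical, and the regular expression is compiled by Python's re module, not by the ROST engine, so this only shows the principle:

```python
import re


def text_to_hex(text: str) -> str:
    """Render an ASCII text pattern as space-separated hexadecimal bytes."""
    return text.encode("ascii").hex(" ")


def bytes_to_regex(raw: bytes) -> re.Pattern:
    """Build a regular expression matching exactly the given raw bytes."""
    # re.escape() neutralizes bytes which would otherwise act as regex operators.
    return re.compile(re.escape(raw))


# Text to hexadecimal bytes: always possible.
print(text_to_hex("MZ"))  # 4d 5a

# Raw bytes to regular expression: always possible.
pattern = bytes_to_regex(bytes.fromhex("4d 5a 90 00"))
print(pattern.search(b"\x00MZ\x90\x00\x03") is not None)  # True

# The opposite directions can fail: these raw bytes are not ASCII text,
# and a regular expression such as "a+" stands for infinitely many byte sequences.
try:
    bytes.fromhex("4d 5a 90 00").decode("ascii")
except UnicodeDecodeError:
    print("not representable as plain text")
```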

Match condition

The condition section is described in more detail in its own documentation page.

A rule matches when the boolean expression computed in this section evaluates to true.

This status relies on found patterns (expected bytes as well as forbidden ones) or on properties of the scanned content. Other rules can be referenced, too.

Example:

global rule ConsiderOnlySmallSamples
{
    condition:
        datasize < 1MB
}

rule RuleExample
{
    bytes:
        $forbidden = "__do_not_include_me__"

    condition:
        ConsiderOnlySmallSamples and #forbidden == 0
}
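For readers who prefer an imperative view, the logic of this example can be mirrored in Python. This is only a sketch of how the two conditions read, not of how ROST evaluates them: the function names are hypothetical, "1MB" is assumed to mean 1024 * 1024 bytes, and #forbidden is assumed to count the occurrences of the $forbidden pattern.

```python
FORBIDDEN = b"__do_not_include_me__"
ONE_MB = 1024 * 1024  # assumption: "1MB" read as 1024 * 1024 bytes


def consider_only_small_samples(content: bytes) -> bool:
    """Mirror of 'datasize < 1MB'."""
    return len(content) < ONE_MB


def rule_example(content: bytes) -> bool:
    """Mirror of 'ConsiderOnlySmallSamples and #forbidden == 0'."""
    # bytes.count() ignores overlapping occurrences, which does not matter
    # here since the result is only compared with zero.
    forbidden_count = content.count(FORBIDDEN)
    return consider_only_small_samples(content) and forbidden_count == 0


print(rule_example(b"harmless content"))             # True
print(rule_example(b"xx __do_not_include_me__ xx"))  # False
```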

Rule properties

Private rules

The private property applied to a rule tells the scanner not to report the patterns matched by this rule inside the scanned content.

This kind of behavior can seem strange at first glance, but it is useful to hide results when a rule is only used as a match condition for another rule: output verbosity is not increased by basic matches, and only high-value matches get reported.

A rule is declared as private with the following syntax:

Example:

private rule SimplePrivateRule
{
    condition:
        true
}

Global rules

The global property applied to a rule tells the scanner to treat this rule as a required condition for the success of every other rule.

This behavior makes it possible, for instance, to automatically skip contents larger than a given size:

Example:

global rule ConsiderOnlySmallSamples
{
    condition:
        datasize < 1MB
}

Note: the private and global modifiers can both be applied to the same rule, which then acts as a silent prerequisite for every other scan rule.
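The interaction between the two modifiers can be modelled in a few lines of Python. This is a minimal sketch of the behavior described in this section, not ROST's implementation: the Rule class and the scan function are hypothetical names, and reporting a matching global rule unless it is also private is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Rule:
    name: str
    condition: Callable[[bytes], bool]
    is_global: bool = False
    is_private: bool = False


def scan(rules: List[Rule], content: bytes) -> List[str]:
    """Return the names of the rules to report for the given content."""
    # Every global rule is a prerequisite: if one fails, nothing can match.
    if not all(r.condition(content) for r in rules if r.is_global):
        return []
    # Private rules are evaluated like the others but never reported.
    return [r.name for r in rules if not r.is_private and r.condition(content)]


rules = [
    Rule("ConsiderOnlySmallSamples", lambda c: len(c) < 1024 * 1024,
         is_global=True, is_private=True),
    Rule("HasMarker", lambda c: b"marker" in c),
]

print(scan(rules, b"some marker"))                     # ['HasMarker']
print(scan(rules, b"marker" + bytes(1024 * 1024)))     # []
```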

Rule tags

Tags can be seen as labels creating categories of rules: a rule can carry no tag at all (default case, for unsorted rules), or one or more tags.

Such tags are listed after a rule name:

Example:

rule RuleWithTwoTags : tag1 Another_Tag
{
    condition:
        true
}

The main advantage of tags is the ability to filter the output without updating rules. Such a selection is performed with the -t / --tag switch in command line mode.

Note

When the output follows a JSON format, all rules are printed: ROST does not filter this output according to tags, so the selection has to be performed by parsing the JSON instead (with tools like jq or python, for instance).
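Such a selection could look like the Python sketch below. The JSON layout used here (a list of objects with "name" and "tags" keys) is purely hypothetical and is not taken from the ROST documentation; the key names have to be adapted to the output actually produced by the scanner.

```python
import json
from typing import List


def select_by_tag(json_text: str, tag: str) -> List[str]:
    """Return the names of the rules carrying the given tag."""
    entries = json.loads(json_text)
    # "name" and "tags" are assumed key names, not the documented format.
    return [e["name"] for e in entries if tag in e.get("tags", [])]


# Hypothetical scanner output, for illustration only.
sample_output = """
[
    {"name": "RuleWithTwoTags", "tags": ["tag1", "Another_Tag"]},
    {"name": "UnsortedRule", "tags": []}
]
"""

print(select_by_tag(sample_output, "tag1"))  # ['RuleWithTwoTags']
```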

Preprocessing

Include files

Like many languages, the rule grammar allows common definitions shared between several rules to be stored in a single location. This common code can then be referenced using the include keyword:

Examples:

include "common.rost"
include "/repository/rules/base.rost"
include "../rules/malwares/ransomware.rost"

The extra rule definitions can be included using two kinds of paths:

  • absolute paths;
  • relative paths, computed from the location of the main rule.

Warning

Rules loaded from memory are not linked to any filename and thus can only include extra definitions from absolute paths.
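The resolution logic described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than ROST's actual code: resolve_include is a hypothetical helper, "/repository/rules/main.rost" is a made-up location for the main rule file, and POSIX-style paths are assumed.

```python
import posixpath
from typing import Optional


def resolve_include(include_path: str, main_rule_file: Optional[str] = None) -> str:
    """Compute the file targeted by an include directive.

    main_rule_file is None when the rules were loaded from memory.
    """
    if posixpath.isabs(include_path):
        return posixpath.normpath(include_path)
    if main_rule_file is None:
        # No filename to start from: relative paths cannot be resolved.
        raise ValueError("rules loaded from memory can only include absolute paths")
    base_dir = posixpath.dirname(main_rule_file)
    return posixpath.normpath(posixpath.join(base_dir, include_path))


main = "/repository/rules/main.rost"  # hypothetical main rule file

print(resolve_include("common.rost", main))
# /repository/rules/common.rost
print(resolve_include("../rules/malwares/ransomware.rost", main))
# /repository/rules/malwares/ransomware.rost
print(resolve_include("/repository/rules/base.rost"))
# /repository/rules/base.rost
```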

A good practice is to place all inclusions at the beginning of a rule definition. However, this is not mandatory.

As a rule gets loaded only if all its definitions are valid, every included file needs to be valid as well; otherwise the whole loading fails.