width and endian

Snort minimum required version: v3.6.2.0

The width and endian content modifiers let rule writers match multibyte character strings (e.g., UTF-16LE and UTF-16BE) in a clearer way. This is useful, for example, when detecting wide-character strings present in compiled Windows executables. Without these two modifiers, rule writers would have to insert one or more |00| sequences before or after each character to detect these wide strings.

For example, to detect the string "Hello World" encoded with two-byte characters (in little endian), a rule writer would previously write this content match like so:

content:"H|00|e|00|l|00|l|00|o|00| |00|W|00|o|00|r|00|l|00|d|00|";

However, with these new modifiers, that same content match can be written in a much more readable fashion:

content:"Hello World",width 16,endian little;

Note that these modifiers do not allow content matches to contain characters outside the set of printable ASCII characters. Instead, they simply add one or more null bytes before or after each character in the content match before evaluating it against a given piece of traffic.
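
The padding described above can be sketched in Python. This is a minimal illustration of the documented byte layout, not Snort's actual implementation; the function name pad_content and its restriction to printable ASCII input are assumptions made for the example.

```python
def pad_content(text: str, width: int = 8, endian: str = "big") -> bytes:
    """Return the byte pattern that a padded content match would look for.

    Hypothetical helper for illustration only; it is not part of Snort.
    """
    if width not in (8, 16, 32):
        raise ValueError("width must be 8, 16, or 32")
    if endian not in ("big", "little"):
        raise ValueError("endian must be 'big' or 'little'")

    # Each character occupies width/8 bytes; all but one of them are null.
    padding = b"\x00" * (width // 8 - 1)
    out = bytearray()
    for ch in text:
        code = ord(ch)
        # Assumption: only printable ASCII is accepted in this sketch.
        if not 0x20 <= code <= 0x7E:
            raise ValueError(f"not a printable ASCII character: {ch!r}")
        char_byte = bytes([code])
        # Big endian puts the null bytes first; little endian puts them last.
        out += padding + char_byte if endian == "big" else char_byte + padding
    return bytes(out)
```

For printable ASCII input, pad_content("Hello World", 16, "little") yields the same bytes as the hand-written |00| version of the content match shown earlier.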

width

The width content modifier tells Snort how many bits to check for each character in the content match. This can be set to 8, 16, or 32 to match 8-bit, 16-bit, and 32-bit encoded strings, respectively. If not set explicitly, it defaults to 8.

Format:

width {8|16|32}

Examples:

# Match "test" encoded with 16 bits per character in big endian (\x00t\x00e\x00s\x00t)
content:"test",width 16;

# Match "hello" encoded with 32 bits per character in big endian (\x00\x00\x00h\x00\x00\x00e\x00\x00\x00l\x00\x00\x00l\x00\x00\x00o)
content:"hello",width 32;
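
As a rough cross-check of the width examples above: for ASCII text, the big-endian padded pattern coincides with Python's BOM-less utf-16-be and utf-32-be encodings. The sketch below relies on that coincidence. The helper name big_endian_pattern is hypothetical, and the code illustrates the documented byte layout rather than how Snort builds the pattern internally.

```python
# Hypothetical helper for illustration only; it is not part of Snort.
CODEC_FOR_WIDTH = {8: "ascii", 16: "utf-16-be", 32: "utf-32-be"}

def big_endian_pattern(text: str, width: int = 8) -> bytes:
    """Encode ASCII text with `width` bits per character, big endian."""
    if width not in CODEC_FOR_WIDTH:
        raise ValueError("width must be 8, 16, or 32")
    # Reject non-ASCII input up front (raises UnicodeEncodeError).
    # Note: control characters still pass this check in this sketch.
    text.encode("ascii")
    return text.encode(CODEC_FOR_WIDTH[width])

print(big_endian_pattern("test", 16))        # b'\x00t\x00e\x00s\x00t'
print(len(big_endian_pattern("hello", 32)))  # 20 (5 characters * 4 bytes)
```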

endian

Multibyte character strings can be represented in either big or little endian, so the endian modifier sets the endianness of the content match. Set it to little for little endian or to big for big endian. If not set explicitly, it defaults to big.

Format:

endian {big|little}

Examples:

# Match "test" encoded with 16 bits per character in little endian (t\x00e\x00s\x00t\x00)
content:"test",width 16,endian little;

# Match "Talos" encoded with 32 bits per character in little endian (T\x00\x00\x00a\x00\x00\x00l\x00\x00\x00o\x00\x00\x00s\x00\x00\x00)
content:"Talos",width 32,endian little;

# Match "Snort" encoded with 16 bits per character in big endian (\x00S\x00n\x00o\x00r\x00t)
content:"Snort",width 16,endian big;
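
To see where the endian setting places the null padding, the Python sketch below renders a string in the older hand-written |00| notation for either byte order. The helper legacy_content is hypothetical, and it accepts only ASCII letters, digits, and spaces so that it does not have to deal with characters that need escaping in a rule.

```python
def legacy_content(text: str, width: int = 8, endian: str = "big") -> str:
    """Render text as a hand-padded content option string.

    Hypothetical helper for illustration only; it is not part of Snort.
    """
    if width not in (8, 16, 32):
        raise ValueError("width must be 8, 16, or 32")
    if endian not in ("big", "little"):
        raise ValueError("endian must be 'big' or 'little'")

    null_count = width // 8 - 1
    # Hex bytes go between pipes, e.g. |00| or |00 00 00|.
    pad = "|" + " ".join(["00"] * null_count) + "|" if null_count else ""

    parts = []
    for ch in text:
        # Simplification: avoid characters that would need escaping.
        if not (ch.isascii() and (ch.isalnum() or ch == " ")):
            raise ValueError(f"unsupported character in this sketch: {ch!r}")
        # Big endian: padding before the character; little endian: after.
        parts.append(pad + ch if endian == "big" else ch + pad)
    return 'content:"' + "".join(parts) + '";'

print(legacy_content("test", 16, "little"))  # content:"t|00|e|00|s|00|t|00|";
print(legacy_content("Snort", 16, "big"))    # content:"|00|S|00|n|00|o|00|r|00|t";
```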