Searching in Malcat

As a hexadecimal editor, Malcat encourages you to explore files, and searching for patterns is a big part of exploration. Malcat lets you search for several types of items, both in the current file and in a set of local files.

Searching in current file

You can initiate a search inside the current file at any time in Malcat using either the keyboard shortcut Ctrl+F or the menus. After a very short time, search results will be listed under the Search results entry of the data tab (left panel). Malcat will also jump to the first result and change the current view if needed.

The total number of matches will be reported in the status bar. If you are inside either the Hexadecimal view or the Structure/text view, matching patterns will be highlighted, provided you have the matching highlight enabled (cf. Highlighting). To navigate through all search hits, you can use the following shortcuts:

Shortcut        Action
Ctrl+N          Go to next search result
Ctrl+Shift+N    Go to previous search result
Ctrl+F          Start a new search

We list below all the ways you can initiate a search under Malcat.

Searching for selected bytes

Sometimes, you stumble on an interesting pattern and you just want to find more of the same. This is really easy: after Selecting what you want to search for, just use the selection context menu: RightClick ‣ Search in current file. The selected pattern will be immediately searched across the current file.

../_images/search1.png

Searching for all occurrences of the selected pattern inside the current file

Note that Excluding bytes from selection works: excluded bytes will be replaced by wildcards in the search pattern.

Searching for arbitrary bytes

If you open the search dialog (Ctrl+F) and choose Raw bytes, you will be able to search for an arbitrary byte sequence inside the current file. In the text box, you have to enter bytes in their hexadecimal representation. You can wildcard any number of nibbles using the symbol ?. Hit OK and you’re good to go.

../_images/find_bytes.png

Searching for a byte sequence
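To make the nibble wildcard semantics concrete, here is a minimal Python sketch that is independent of Malcat. It translates a hexadecimal pattern containing ? nibbles into a regular expression over bytes. The function names hex_pattern_to_regex and find_all are made up for this illustration, and how Malcat implements its own matcher is not described here.

```python
import re


def _byte(value):
    # Regex escape for a single byte value, e.g. 0x4d -> \x4d
    return "\\x%02x" % value


def hex_pattern_to_regex(pattern):
    """Compile a hex pattern such as '4D 5A ?? 0?' into a bytes regex.

    Hypothetical helper: '?' stands for one unknown nibble.
    """
    nibbles = "".join(pattern.split()).lower()
    if len(nibbles) % 2:
        raise ValueError("expected an even number of nibbles")
    parts = []
    for i in range(0, len(nibbles), 2):
        hi, lo = nibbles[i], nibbles[i + 1]
        if hi == "?" and lo == "?":
            parts.append(".")  # any byte (DOTALL makes '.' match 0x0a too)
        elif hi == "?":
            # Low nibble fixed: 0x0L, 0x1L, ..., 0xFL
            low = int(lo, 16)
            parts.append("[" + "".join(_byte((h << 4) | low) for h in range(16)) + "]")
        elif lo == "?":
            # High nibble fixed: contiguous range 0xH0..0xHF
            base = int(hi, 16) << 4
            parts.append("[%s-%s]" % (_byte(base), _byte(base | 0x0F)))
        else:
            parts.append(_byte(int(hi + lo, 16)))
    return re.compile("".join(parts).encode("ascii"), re.DOTALL)


def find_all(data, pattern):
    """Return the offsets of all non-overlapping matches of pattern in data."""
    return [m.start() for m in hex_pattern_to_regex(pattern).finditer(data)]
```

Note that this sketch reports non-overlapping matches only, which may differ from what the search dialog reports.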

Searching for text

In the same vein, you can search for any character sequence by choosing the Text category of the search dialog. The text mode displays additional options:

  • Ascii string: search for the ascii representation of the given text

  • Utf8 string: search for the utf-8 representation of the given text

  • Utf16 string: search for the utf-16le representation of the given text

  • Case sensitive: Search is case sensitive

  • Only referenced strings: Search hits which have no incoming references are discarded

Note that you may enter Unicode characters in the search field; just make sure to uncheck Ascii string in this case.

../_images/find_text.png

Searching for text using Malcat
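The encoding options boil down to searching for different byte representations of the same text. The following Python sketch, which does not use Malcat, illustrates this under a few assumptions: the find_text helper and its parameters are hypothetical, and case-insensitive matching only folds ASCII letters here.

```python
import re

# Option name -> Python codec (utf-16 is little-endian, as in the list above)
ENCODINGS = {"ascii": "ascii", "utf8": "utf-8", "utf16": "utf-16-le"}


def find_text(data, text, encodings=("ascii",), case_sensitive=True):
    """Return sorted (offset, encoding) pairs where text occurs in data.

    Hypothetical helper. Encoding non-ASCII text as 'ascii' raises
    UnicodeEncodeError, which mirrors the advice to uncheck that option.
    """
    flags = 0 if case_sensitive else re.IGNORECASE
    hits = set()
    for name in encodings:
        needle = text.encode(ENCODINGS[name])
        for match in re.finditer(re.escape(needle), data, flags):
            hits.add((match.start(), name))
    return sorted(hits)
```

For instance, the text "Malcat" is 6 bytes long in ASCII but 12 bytes long in UTF-16LE, so a file containing only the wide variant is missed unless the corresponding option is enabled.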

Searching for numbers

The search dialog also has a number search tab. There you can enter any number you are looking for, and select the number encoding options: endianness, width, etc.

../_images/find_number.png

Looking for numbers
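Endianness and width decide which bytes actually end up in the file for a given number. As a rough illustration outside of Malcat, the Python sketch below encodes a number with the standard int.to_bytes method and then looks for the resulting bytes. The helper names and parameters are assumptions made for this example.

```python
def encode_number(value, width=4, little_endian=True, signed=False):
    """Return the byte representation of value for the given encoding options."""
    byteorder = "little" if little_endian else "big"
    return value.to_bytes(width, byteorder, signed=signed)


def find_number(data, value, **options):
    """Return all offsets (overlapping included) where the encoded number occurs."""
    needle = encode_number(value, **options)
    offsets = []
    pos = data.find(needle)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(needle, pos + 1)
    return offsets
```

The value 0x4200 stored as a 16-bit little-endian integer is the byte pair 00 42, while the same value stored as a 32-bit big-endian integer is 00 00 42 00, so the two searches can return different hits.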

Searching for regular expressions

Malcat makes heavy use of the PCRE2 regular expression engine internally. If you choose the Regular expression tab in the search dialog, you may enter a PCRE2 pattern to look for, as well as a few options. Note that your regular expressions will be compiled using PCRE2’s just-in-time compiler for extra performance!

../_images/find_regexp.png

Leveraging the PCRE2 regular expression engine
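If you want to prototype a pattern before typing it into the dialog, a regular expression over raw bytes can be tried in a few lines of Python. Keep in mind that this sketch uses Python's built-in re module, not PCRE2, so syntax and options differ in places. The regex_search helper is hypothetical.

```python
import re


def regex_search(data, pattern, ignore_case=False):
    """Return (offset, matched bytes) pairs for a bytes regex applied to data."""
    flags = re.DOTALL  # let '.' match every byte, including 0x0a
    if ignore_case:
        flags |= re.IGNORECASE  # folds ASCII letters only for bytes patterns
    return [(m.start(), m.group()) for m in re.finditer(pattern, data, flags)]


# Example pattern: http(s) URLs made of printable, non-space ASCII characters
URL_PATTERN = rb"https?://[\x21-\x7e]+"
```

Because binary files contain arbitrary bytes, explicit byte ranges such as [\x21-\x7e] tend to be more predictable than shorthand classes like \w.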

Searching for structure fields

Sometimes, you only want to look for a specific value or field within a structure identified by Malcat’s File parsers. For instance, where was this EntryPointRva field defined in the PE header already? Or, where does this value 0x4200 come from, is it defined in some header? Don’t worry, we have your back with the Structure field tab in the search dialog. You can look for numbers or strings in field values, field names or field comments.

../_images/find_struct.png

Looking for native imports inside a .NET sample

Searching in the corpus

As a malware analyst or detection engineer, you often need to search for a pattern or yara-scan a large number of files in a timely manner. Some of the most frequent use cases are:

  • threat attribution: look for samples sharing a piece of code, a string or a Yara rule with the current file

  • false positive remediation: check whether the selected string is a good candidate for your new Yara rule by searching in your clean files set

  • malware analysis: want to compare the current analyzed malware against previous versions? Search your corpus for all previous samples of the current family (using a Yara rule for instance)

Malcat allows you to perform such searches directly from within the interface. Not only is it fast, but it lets you list and open matching files directly from the same window. This makes the whole process of finding the right file a lot easier!

Note

Local corpus search is done using multiple threads. You can change the number of threads in Edit ‣ Preferences ‣ Analysis Setup ‣ Number of threads. Disk I/O plays an important part in performance too. While your first corpus search may be slow, subsequent searches should be a lot faster once most of the files are in cache!

Organising your local corpus directories

If you want to take advantage of Malcat’s local corpus search features, you first need to organize your Corpus collection. A corpus is simply a labelled directory containing a bunch of files.

../_images/corpus_prefs.png

Organizing your corpus sets

Using the preferences dialog (Edit ‣ Preferences ‣ Intelligence ‣ Local samples corpus), you can add, edit and remove corpora:

  • First click on the Add/remove corpus button (1), this will open a dialog that allows you to organize your corpus labels

  • Once you have added (2) one or more labels, you can assign a directory (3) to each corpus label. Note that sub-directories will be searched recursively.

  • Click on OK (4), and you’re good to go!

If you’re on Windows, you can for instance add a “CleanSystem32” corpus that points to C:\Windows\System32. This allows you to scan your Yara rules against Microsoft clean files. This is of course not enough to avoid Yara false positives, but it’s a good start!

Once you are happy with your local corpus configuration, you can start your first corpus searches.
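To give an idea of what such a search involves, here is a simplified Python sketch of a labelled, recursive, multi-threaded pattern search. It is not Malcat's implementation: the search_corpus and count_hits functions, the label-to-directory mapping and the shape of the returned rows are all assumptions made for this example.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def count_hits(path, needle):
    """Count non-overlapping occurrences of needle in one file (0 if unreadable)."""
    try:
        with open(path, "rb") as handle:
            return handle.read().count(needle)  # reads the whole file in memory
    except OSError:
        return 0


def search_corpus(corpora, needle, threads=4):
    """Search every file below each corpus directory for needle.

    corpora maps a corpus label to its root directory (hypothetical layout).
    Returns sorted (relative path, hit count, label) rows for matching files.
    """
    jobs = []
    for label, root in corpora.items():
        for dirpath, _dirnames, filenames in os.walk(root):  # recursive walk
            for name in filenames:
                jobs.append((label, root, os.path.join(dirpath, name)))
    with ThreadPoolExecutor(max_workers=threads) as pool:
        counts = list(pool.map(lambda job: count_hits(job[2], needle), jobs))
    rows = [
        (os.path.relpath(path, root), hits, label)
        for (label, root, path), hits in zip(jobs, counts)
        if hits
    ]
    return sorted(rows)
```

A call such as search_corpus({"CleanSystem32": r"C:\Windows\System32"}, b"some pattern") would then list every file under that directory containing the pattern. Threads are a reasonable fit in this sketch because most of the time is spent waiting on disk I/O.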

Performing corpus searches

Yara scan

You can also scan your corpus of files against a single Yara rule. This is useful, for instance, to test a newly created Yara detection rule. First go to the Yara editor / browser, select the rule you want to scan with in the rules list and open its context menu. You have two options:

  • Scan corpus: simply scan the corpus and report every file matching the rule

  • Scan corpus (partial matches allowed): scan the corpus and report every file matching the rule AND every file where at least one string of the Yara rule was found

This will initiate a parallel search of the selected Yara rule across all files within your corpus directories.
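The difference between the two scan modes can be pictured with a toy model written in plain Python. This is not Yara: the "rule" below is just a dictionary of byte strings with an implicit "all of them" condition, whereas real Yara conditions can be arbitrary expressions. The scan function and its return values are assumptions for illustration.

```python
def scan(data, strings):
    """Classify data against a toy rule.

    strings maps a string identifier (e.g. '$a') to the bytes to look for.
    Returns ('match' | 'partial' | 'none', sorted list of identifiers found).
    """
    found = sorted(name for name, needle in strings.items() if needle in data)
    if strings and len(found) == len(strings):
        return "match", found      # every string present: the toy rule matches
    if found:
        return "partial", found    # at least one string present
    return "none", found


# Hypothetical rule with two strings
RULE = {"$a": b"CreateRemoteThread", "$b": b"VirtualAllocEx"}
```

In this model, a plain corpus scan would only report files classified as "match", while a scan with partial matches allowed would also report those classified as "partial".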

Displaying results

Once the corpus search has finished (you can monitor the progress through the status bar gauge control), Malcat will automatically open the corpus view. It is a three-column grid view that displays the results of the pattern search / Yara scan across local and remote corpus sets.

Local corpus hits

Under the Local corpus hits category, you’ll find all the files located inside one of your local corpus directories (cf. Organising your local corpus directories) that match the searched pattern or the selected Yara rule.

../_images/corpus.png

Corpus view displaying pattern search results

For local corpus hits, the grid columns have the following meaning:

  • the first column (Object) displays the path to the file relative to the corpus root directory.

  • the second column (# Hits) displays how many times the selected pattern was found in the file, or the number of string matches in the file for a Yara rule

  • the third column (Corpus) displays the label of the corpus where the file was found

Double-clicking on a row will open the selected file as a new project in Malcat. All the matching patterns (or matching Yara strings in case of a Yara scan) will be automatically highlighted in the newly opened file, so that you can easily inspect them and see if it is indeed the file you were looking for. Hitting Ctrl+N or Ctrl+Shift+N lets you cycle through all the matches, like when Searching in current file.

Virustotal hits

Some time ago, Virustotal introduced a great feature named VTGrep. VTGrep allows premium Virustotal users to perform fast pattern searches in Virustotal’s malware corpus. This feature is also integrated inside Malcat’s own corpus view, in addition to the local corpus search.

Note

Currently, you can only search for patterns on Virustotal. Yara scans on VT are sadly too slow, and we had to deactivate the functionality.

After selecting a string, a function or any arbitrary data range in Malcat and starting a corpus search (cf. Searching in the corpus), you will be able to list all files containing this pattern in Virustotal:

../_images/vtgrep.png

Corpus view displaying VTGrep results

For Virustotal’s hits, the grid columns have the following meaning:

  • the first column (Object) displays the initial name of the file in Virustotal. Note that files may have been uploaded several times using different names on Virustotal.

  • the second column (# Hits) displays the number of antivirus detections on Virustotal for the matching file

  • the third column (Corpus) displays the type of the matching file as reported by Virustotal

Double-clicking on a row will open the Virustotal report for the selected file in a new browser tab.

MalwareBazaar hits

MalwareBazaar queries are not implemented yet, stay tuned!