Searching in Malcat
As a hexadecimal editor, Malcat encourages you to explore files, and searching for patterns is a big part of exploration. Malcat lets you search for several types of items, both in the current file and in a set of local files.
Searching in current file
You can initiate a search inside the current file at any time in Malcat using either the keyboard shortcut Ctrl+F or the menus. After a very short time, search results will be listed under the Search results entry of the data tab (left panel). Malcat will also jump to the first result and change the current view if needed.
The total number of matches will be reported in the status bar. If you are inside either the Hexadecimal view or the Structure/text view, matching patterns will be highlighted, provided you have the matching highlight enabled (cf. Highlighting). To navigate through all search hits, you can use the following shortcuts:
Goto next search result
Goto previous search result
Start a new search
We list below all the ways you can initiate a search under Malcat.
Searching for selected bytes
Sometimes, you stumble on an interesting pattern and you just want to find more of the same. This is really easy: after Selecting what you want to search for, just use the selection context menu. The selected pattern will be immediately searched across the current file.
Note that Excluding bytes from selection works: excluded bytes will be replaced by wild cards in the search pattern.
Searching for arbitrary bytes
If you open the search dialog (Ctrl+F) and choose Raw bytes, you will be able to search for an arbitrary byte sequence inside the current file. In the text box, enter the bytes in their hexadecimal representation. You can wild card any number of nibbles using the symbol ?. Hit OK and you’re good to go.
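To make the nibble wild card idea concrete, here is a minimal Python sketch that searches a buffer for a hexadecimal pattern in which ? stands for any nibble. It is only an illustration of the concept: the function names are hypothetical, and it says nothing about how Malcat implements the search internally.

```python
import re

def hex_pattern_to_regex(pattern):
    """Compile a hex pattern with '?' nibble wild cards into a bytes regex."""
    nibbles = "".join(pattern.split()).lower()
    if len(nibbles) % 2:
        raise ValueError("pattern must contain an even number of nibbles")
    parts = []
    for i in range(0, len(nibbles), 2):
        hi, lo = nibbles[i], nibbles[i + 1]
        # Enumerate every byte value compatible with the fixed nibble(s).
        his = range(16) if hi == "?" else [int(hi, 16)]
        los = range(16) if lo == "?" else [int(lo, 16)]
        values = [(h << 4) | l for h in his for l in los]
        parts.append(b"[" + b"".join(b"\\x%02x" % v for v in values) + b"]")
    return re.compile(b"".join(parts))

def find_all(data, pattern):
    """Return the offsets of all non-overlapping matches of pattern in data."""
    return [m.start() for m in hex_pattern_to_regex(pattern).finditer(data)]

data = bytes.fromhex("00 4d 5a 90 00 4d 5a 41 07")
print(find_all(data, "4D 5A ?? 0?"))
```

Each byte of the pattern becomes a character class listing the byte values it may take, so a half-wild byte such as 0? accepts the sixteen values 0x00 to 0x0f.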
Searching for text
In the same vein, you can search for any character sequence by choosing the Text category of the search dialog. The text mode displays additional options:
Ascii string: search for the ascii representation of the given text
Utf8 string: search for the utf-8 representation of the given text
Utf16 string: search for the utf-16le representation of the given text
Case sensitive: Search is case sensitive
Only referenced strings: Search hits which have no incoming references are discarded
Note that you may enter unicode characters in the search field, just make sure to uncheck Ascii string in this case.
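The three string options correspond to three different byte sequences for the same text. The Python sketch below (hypothetical helper names, not Malcat code) shows what each encoding looks like on disk and why a non-ASCII query cannot be searched as an Ascii string.

```python
def text_needles(text, encodings=("ascii", "utf-8", "utf-16-le")):
    """Map each encoding name to the byte sequence to look for."""
    needles = {}
    for enc in encodings:
        try:
            needles[enc] = text.encode(enc)
        except UnicodeEncodeError:
            # Non-ASCII characters have no ASCII representation: skip it.
            continue
    return needles

def search_text(data, text):
    """Return {encoding: [offsets]} for every encoding that yields a hit."""
    hits = {}
    for enc, needle in text_needles(text).items():
        offsets = []
        pos = data.find(needle)
        while pos != -1:
            offsets.append(pos)
            pos = data.find(needle, pos + 1)
        if offsets:
            hits[enc] = offsets
    return hits

data = b"\x00\x00kernel32" + "kernel32".encode("utf-16-le")
print(search_text(data, "kernel32"))
print(text_needles("\u00e9"))
```

For pure ASCII text, the ASCII and UTF-8 needles are identical, which is why both report the same offset here; the UTF-16LE needle interleaves a zero byte after each character and therefore only matches the second copy.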
Searching for numbers
The search dialog also has a number search tab. There you can enter any number you are looking for, and select the number encoding options: endianness, width, etc.
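The encoding options matter because the same number can be stored as several different byte sequences. As a rough illustration (plain Python, with a hypothetical helper name), the sketch below lists the encodings of a value for the usual widths and both byte orders.

```python
def number_needles(value, widths=(1, 2, 4, 8)):
    """Return {(width, byte order): bytes} for every encoding the value fits in."""
    needles = {}
    for width in widths:
        for order in ("little", "big"):
            try:
                needles[(width, order)] = value.to_bytes(
                    width, order, signed=value < 0
                )
            except OverflowError:
                # The value does not fit in this width: skip it.
                continue
    return needles

for (width, order), needle in number_needles(0x4200).items():
    print(width, order, needle.hex())
```

Searching for 0x4200 as a 16-bit little-endian number therefore means looking for the bytes 00 42, while the 32-bit big-endian form is 00 00 42 00.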
Searching for regular expression
Malcat makes heavy use of the PCRE2 regular expression engine internally. If you choose the Regular expression tab in the search dialog, you may enter a PCRE2 pattern to look for, as well as a few options. Note that your regular expressions will be compiled using PCRE2’s just-in-time compiler for extra performance!
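As an example of what a regular expression search over binary data looks like, the sketch below extracts URL-like strings from a buffer. Note the assumption: it uses Python's built-in re module, not PCRE2, so it only illustrates the idea; this particular pattern uses syntax common to both engines, but more advanced PCRE2 features may differ.

```python
import re

# "http://" or "https://" followed by at least 4 printable, non-space bytes.
URL_RE = re.compile(rb"https?://[\x21-\x7e]{4,}")

def find_urls(data):
    """Return (offset, match) pairs for every URL-like string in data."""
    return [(m.start(), m.group()) for m in URL_RE.finditer(data)]

data = b"\x00\x01http://example.com/a\x00junk https://evil.test/x.exe\xff"
for offset, url in find_urls(data):
    print(hex(offset), url.decode("ascii"))
```

Working on bytes rather than text means the pattern can mix printable characters with raw byte ranges such as \x21-\x7e.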
Searching for structure fields
Sometimes, you only want to look for a specific value or field within a structure identified by Malcat’s File parsers. For instance, where was this EntryPointRva field defined in the PE header already? Or, where does this value 0x4200 come from? Is it defined in some header? Don’t worry, we have your back with the Structure field tab in the search dialog. You can look for numbers or strings in field values, field names or field comments.
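Conceptually, a structure field search filters the parsed fields by name, value or comment instead of scanning raw bytes. The following Python sketch illustrates this with a purely hypothetical data model (the Field class, its attributes and the sample offsets are assumptions made for the example, not Malcat's actual objects).

```python
from dataclasses import dataclass

@dataclass
class Field:
    offset: int        # file offset of the field (made-up values below)
    name: str
    value: object      # number or string
    comment: str = ""

def search_fields(fields, query, in_names=True, in_values=True, in_comments=True):
    """Return the fields matching a number (exact) or a string (substring)."""
    hits = []
    for f in fields:
        if isinstance(query, int):
            # Numbers are only compared against numeric field values.
            matched = in_values and isinstance(f.value, int) and f.value == query
        else:
            needle = query.lower()
            matched = (
                (in_names and needle in f.name.lower())
                or (in_values and isinstance(f.value, str) and needle in f.value.lower())
                or (in_comments and needle in f.comment.lower())
            )
        if matched:
            hits.append(f)
    return hits

fields = [
    Field(0xA8, "EntryPointRva", 0x4200, "entry point address"),
    Field(0xB4, "ImageBase", 0x400000),
    Field(0x1F8, "Name", ".text", "section name"),
]
print([f.name for f in search_fields(fields, 0x4200)])
print([f.name for f in search_fields(fields, "entrypoint")])
```

Both queries in the example lead back to the same EntryPointRva field, once through its value and once through its name.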
Searching in the corpus
As a malware analyst or detection engineer, you often need to search for a pattern or yara-scan a large number of files in a timely manner. Some of the most frequent use cases are:
threat attribution: look for samples sharing a piece of code, a string or a Yara rule with the current file
false positive remediation: look if the selected string is a good candidate for your new Yara rule by searching in your clean files set
malware analysis: want to compare the current analyzed malware against previous versions? Search your corpus for all previous samples of the current family (using a Yara rule for instance)
Malcat allows you to perform such searches directly from within the interface. Not only is it fast, but it lets you list and open matching files directly from the same window. This makes the whole process of finding the right file a lot easier!
Local corpus search is done using multiple threads, and the number of threads is configurable. Disk I/O plays an important part in performance too. While your first corpus search may be slow, subsequent searches should be a lot faster once most of the files are in cache!
Organising your local corpus directories
If you want to take advantage of Malcat’s local corpus search features, you first need to organize your Corpus collection. A corpus is simply a labeled directory containing a bunch of files.
Using the preferences dialog, you can add, edit and remove corpora:
First click on the Add/remove corpus button (1); this will open a dialog that allows you to organize your corpus labels
Once you have added (2) one or more labels, you can assign a directory (3) to each corpus label. Note that sub-directories will be searched recursively.
Click on OK (4), and you’re good to go!
If you’re on Windows, you can for instance add a “CleanSystem32” corpus that points to C:\Windows\System32. This allows you to scan your Yara rules against Microsoft clean files. This is of course not enough to avoid Yara false positives, but it’s a good start!
Once you are happy with your local corpus configuration, you can start your first corpus searches.
Performing corpus searches
In the Hexadecimal view or the Structure/text view, you can select a pattern that you are interested in (cf. Selecting). For instance, is this string a good candidate for a Yara rule? Or did I see this pattern in a malware? Feel free to wild card some of the bytes if you wish (cf. Excluding bytes from selection).
Once you are done, start a corpus search from the selection context menu. This will initiate a parallel search of the selected pattern across all files within your corpus directories.
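For readers who want to picture what such a search does, here is a minimal Python sketch of a multi-threaded pattern search over labeled corpus directories. It is an assumption-laden illustration (hypothetical function names, whole files read into memory, fixed byte patterns only), not Malcat's implementation.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

def count_hits(path, needle):
    """Count non-overlapping occurrences of needle in one file."""
    try:
        return path.read_bytes().count(needle)
    except OSError:
        return 0  # unreadable files are simply reported as having no hit

def search_corpus(corpora, needle, threads=4):
    """corpora maps a label to a root directory.

    Returns (path relative to the corpus root, number of hits, label) rows.
    """
    jobs = []
    for label, root in corpora.items():
        root = Path(root)
        for path in root.rglob("*"):  # sub-directories are walked recursively
            if path.is_file():
                jobs.append((label, root, path))
    with ThreadPoolExecutor(max_workers=threads) as pool:
        counts = pool.map(lambda job: count_hits(job[2], needle), jobs)
        return [
            (str(path.relative_to(root)), hits, label)
            for (label, root, path), hits in zip(jobs, counts)
            if hits
        ]
```

A call such as search_corpus({"CleanSystem32": r"C:\Windows\System32"}, b"MZ") would return one row per matching file. Since most of the time is spent reading files, the second run over the same directories usually benefits from the operating system's file cache.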
In the same vein, you can scan your corpus of files against a single Yara rule. This is useful to test your newly created Yara detection rule for instance. First go to the Yara editor / browser, select the rule you want to scan with in the rules list and open its context menu. You have two options:
Scan corpus: simply scan the corpus and report every file matching the rule
Scan corpus (partial matches allowed): scan the corpus and report every file matching the rule AND every file where at least one string of the Yara rule was found
This will initiate a parallel search of the selected Yara rules across all files within your corpus directories.
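The difference between the two scan modes can be sketched with a deliberately simplified model. The Python snippet below assumes a toy rule made only of plain byte strings with the condition "all of them"; real Yara conditions can be arbitrary boolean expressions, and the function name is hypothetical.

```python
def classify(data, rule_strings):
    """Classify a buffer against a toy rule whose condition is 'all of them'.

    rule_strings maps a string identifier (e.g. "$a") to the bytes to find.
    Returns "match", "partial" (at least one string found) or "none".
    """
    found = {name for name, needle in rule_strings.items() if needle in data}
    if found and len(found) == len(rule_strings):
        return "match"      # reported by both scan modes
    if found:
        return "partial"    # only reported when partial matches are allowed
    return "none"

rule = {"$a": b"evil.dll", "$b": b"CreateRemoteThread"}
print(classify(b"..evil.dll..CreateRemoteThread..", rule))
print(classify(b"..evil.dll..", rule))
print(classify(b"harmless", rule))
```

Partial matches are handy while a rule is still being written: they show which files contain some of the rule's strings even though the full condition is not met yet.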
Once the corpus search has finished (you can monitor the progress through the status bar gauge control), Malcat will automatically open the corpus view. It is a three-column grid view that displays the results of the pattern search / Yara scan across local and remote corpus sets.
Local corpus hits
Under the Local corpus hits category, you’ll find all the files located inside one of your local corpus directories (cf. Organising your local corpus directories) that match the searched pattern or the selected Yara rule.
For local corpus hits, the grid columns have the following meaning:
the first column (Object) displays the path to the file relative to the corpus root directory.
the second column (# Hits) displays how many times the selected pattern was found in the file, or the number of string matches in the file for a Yara rule
the third column (Corpus) displays the label of the corpus where the file was found
Double-clicking on a row will open the selected file as a new project in Malcat. All the matching patterns (or matching Yara strings in case of a Yara scan) will be automatically highlighted in the newly opened file, so that you can easily inspect them and see if it is indeed the file you were looking for. Hitting Ctrl+N or Ctrl+Shift+N lets you cycle through all the matches, like when Searching in current file.
Some time ago, Virustotal introduced a great feature named VTGrep. VTGrep allows premium Virustotal users to perform fast pattern searches in Virustotal’s malware corpus. This feature is also integrated inside Malcat’s own corpus view, in addition to the local corpus search.
Currently, you can only search for patterns on Virustotal. Yara scans on VT are sadly too slow, and we had to deactivate the functionality.
After selecting a string, a function or any arbitrary data range in Malcat and starting a corpus search (cf. Searching in the corpus), you will be able to list all files containing this pattern in Virustotal.
For Virustotal’s hits, the grid columns have the following meaning:
the first column (Object) displays the initial name of the file in Virustotal. Note that files may have been uploaded several times using different names on Virustotal.
the second column (# Hits) displays the number of antivirus detections on Virustotal for the matching file
the third column (Corpus) displays the type of the matching file as reported by Virustotal
Double-clicking on a row will open the Virustotal report for the selected file in a new browser tab.
MalwareBazaar queries are not implemented yet, stay tuned!