Anomaly scanner

Malcat features a powerful anomaly scanner which leverages all of Malcat’s analyses to highlight suspicious items in the file. It is present in all Full & Pro versions of Malcat.

What are anomalies?

Anomalies are small heuristics computed by Python functions located in data/anomalies. These functions use Malcat’s Python Analysis object (analysis) to inspect the file from every angle. Each anomaly found in the file is tagged, and its location and size are reported in the Summary view (see below).

[Figure: The anomaly screen in the Summary view]

Each anomaly heuristic is represented by a Python Anomaly class featuring an Anomaly.scan() function. This function only has to yield the location and size of each place where it has found the anomaly. One Anomaly instance is responsible for locating one particular type of anomaly. Here is an example of an anomaly for the PE file format:

class SectionTableAfterFirstSection(Anomaly):
    """The section table is located after the first section's physical address"""
    level = Anomaly.WARNING
    category = "headers"
    filetype = "PE"     # <-- will only be run against PE files

    def scan(self, malcat):
        first_nonzero_region_offset = len(malcat.file)
        for r in malcat.map:
            if r.phys_size and r.phys:
                first_nonzero_region_offset = min(first_nonzero_region_offset, r.phys)
        sections = malcat.struct.Sections
        if sections.offset >= first_nonzero_region_offset:
            # this is odd, tag the whole section table as the anomaly
            yield sections.address, sections.size

And here is another one that you can find in data/anomalies/xref.py:

class UnreferencedImports(Anomaly):
    """More than half of the imports are not referenced, it could mean that the APIs are just decoys, or that the file is packed"""
    level = Anomaly.ODD
    category = "imports"

    def scan(self, malcat):
        notref = []
        referenced = 0
        for s in malcat.syms:
            if s.type != bindings.Symbol.IMPORT:
                continue
            if s.address not in malcat.xref:
                notref.append((s.address, len(s)))
            else:
                referenced += 1
        if len(notref) > referenced:
            # tag every non-referenced import
            yield from notref

Despite being written in Python, the anomaly scanner runs relatively fast, since most of the work has already been done by Malcat’s C++ Analysis engine (doc in progress).

Warning

While some of the anomalies may seem quite expressive and strong, we discourage their use as a standalone malware detector. They have mainly been designed to help the human analyst get to the point faster, not to achieve a perfect detection rate.

Found anomalies are displayed in the Summary view and are additionally available to scripts through the Anomalies (analysis.anomalies) object.

Write your own anomaly

Writing

Once you have set up a User data directory, you can start creating new anomalies by creating a Python file under <user data dir>/anomalies/. Note that you can also add anomalies directly under <malcat install dir>/data/anomalies, but you run the risk that they get overwritten by the next Malcat update.

Once you have your Python file ready, writing a new anomaly is as simple as adding to it a Python class that inherits from malcat.Anomaly and features a malcat.Anomaly.scan() function. The malcat.Anomaly.scan() function takes a malcat.Analysis object as parameter, which is well documented. You also need to specify two class attributes, as in the examples above: level (for instance Anomaly.ODD or Anomaly.WARNING) and category (a short string such as "headers" or "imports").

You should also write a short docstring for your class and, of course, choose a meaningful name. And that’s it! We plan to write a beginner-friendly tutorial on anomaly creation very soon. In the meantime, you can get some inspiration by looking at Malcat’s list of 200+ anomalies.
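As a minimal sketch of the steps above, the snippet below defines a made-up anomaly with the two class attributes, a docstring and a scan() function. The Anomaly stub and the SimpleNamespace fake analysis are assumptions that exist only so the snippet runs outside Malcat; inside Malcat the class would inherit from the real malcat.Anomaly and receive a real malcat.Analysis object. The heuristic itself (a section table larger than a quarter of the file) is hypothetical and was chosen only because it relies on attributes already used in the examples above.

```python
from types import SimpleNamespace


class Anomaly:
    """Stand-in for malcat.Anomaly so this sketch runs outside Malcat (assumption)."""
    ODD = "odd"          # placeholder values, the real constants live in Malcat
    WARNING = "warning"


class OversizedSectionTable(Anomaly):
    """The section table takes up more than a quarter of the file (hypothetical heuristic)"""
    level = Anomaly.ODD
    category = "headers"
    filetype = "PE"      # only run against PE files, as in the first example

    def scan(self, malcat):
        sections = malcat.struct.Sections
        if sections.size * 4 > len(malcat.file):
            # tag the whole section table as the anomaly
            yield sections.address, sections.size


# Fake analysis object exposing only the attributes used above (assumption).
fake_analysis = SimpleNamespace(
    file=bytes(0x400),
    struct=SimpleNamespace(Sections=SimpleNamespace(address=0x400200, size=0x200)),
)
hits = list(OversizedSectionTable().scan(fake_analysis))
```

Here hits holds a single (location, size) pair covering the fake section table. With a larger fake file the generator yields nothing, which is how an anomaly reports "not found".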

Testing

As for all hackable things in Malcat, you can test your anomaly directly from within the user interface by hitting Ctrl+R or selecting Analysis ‣ Reanalyse current file from the menu. The anomaly list will be reloaded and rerun.

If you have an error in your code, you will see an exclamation mark in Malcat’s status bar. Hover the mouse over the exclamation mark, or look at the Console window (F8) to see the error in detail. Fix it, hit Ctrl+R and have a look at the result. It is that simple!

A note on performance

Despite being written in Python, the anomaly scanner runs relatively fast, since most of the work has already been done by Malcat’s C++ Analysis engine (doc in progress). Nonetheless, the constant switching between C++ and Python has a cost. In particular, if your anomalies need to inspect all functions, all strings or all basic blocks of the file, this cost can add up quickly. It is also a bit wasteful: if two different anomalies each need to inspect all N functions of a file, the C++ function objects have to be converted to Python 2N times in total.

To speed up anomaly scanning, you can (and should) override one or more of the following functions in your anomaly, instead of or in addition to malcat.Anomaly.scan():

  • malcat.Anomaly.scan_function(self, analysis, function): this function will be called for every function found by the analysis. The argument function is an instance of malcat.Function

  • malcat.Anomaly.scan_structure(self, analysis, structure): this function will be called for every structure found by the analysis. The argument structure is an instance of malcat.StructAccess

  • malcat.Anomaly.scan_string(self, analysis, string): this function will be called for every string found by the analysis. The argument string is an instance of malcat.String

  • malcat.Anomaly.scan_object(self, analysis, carved_file): this function will be called for every carved file found by the analysis. The argument carved_file is an instance of malcat.CarvedFile

  • malcat.Anomaly.scan_bb(self, analysis, bb): this function will be called for every basic block found by the analysis. The argument bb is an instance of malcat.BasicBlock

This way, Malcat can optimise how anomalies are called and ensure that each C++ object (function, string, carved file, etc.) is converted to Python only once, no matter how many anomalies inspect it.
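The idea can be illustrated with a toy dispatcher. This is only a sketch under stated assumptions, not Malcat's actual implementation: the Anomaly stub, the analysis.strings attribute, the address, size and text attributes of the string objects, and the run_string_anomalies() helper are all hypothetical names. It also assumes that scan_string() yields (location, size) pairs in the same way as scan().

```python
from types import SimpleNamespace


class Anomaly:
    """Stand-in for malcat.Anomaly so the sketch runs outside Malcat (assumption)."""
    ODD = "odd"


class VeryLongString(Anomaly):
    """A string is longer than 1000 characters (hypothetical heuristic)"""
    level = Anomaly.ODD
    category = "strings"

    def scan_string(self, analysis, string):
        # `text`, `address` and `size` are hypothetical attribute names
        if len(string.text) > 1000:
            yield string.address, string.size


class UrlString(Anomaly):
    """A string looks like a URL (hypothetical heuristic)"""
    level = Anomaly.ODD
    category = "strings"

    def scan_string(self, analysis, string):
        if string.text.startswith(("http://", "https://")):
            yield string.address, string.size


def run_string_anomalies(analysis, anomalies):
    """Toy dispatcher: each string is visited once and handed to every anomaly."""
    hits = []
    for string in analysis.strings:        # one pass over the strings ...
        for anomaly in anomalies:          # ... shared by all anomalies
            for address, size in anomaly.scan_string(analysis, string):
                hits.append((type(anomaly).__name__, address, size))
    return hits


# Fake analysis with two strings (assumption, for illustration only).
fake_analysis = SimpleNamespace(strings=[
    SimpleNamespace(address=0x1000, size=18, text="http://example.com"),
    SimpleNamespace(address=0x2000, size=1500, text="A" * 1500),
])
hits = run_string_anomalies(fake_analysis, [VeryLongString(), UrlString()])
```

Both anomalies see every string, yet the outer loop walks the string list only once. With a plain scan() override, each anomaly would have to iterate over all strings on its own.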

Note

As an added benefit, Malcat also makes sure that these functions are only called a reasonable number of times. If a program contains millions of strings, for instance, Malcat will stop calling your anomaly’s malcat.Anomaly.scan_string() function after a certain number of strings have been processed. No need to worry that your anomaly slows everything down!
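The effect of such a cap can be sketched as follows. The limit of 3, the CountingAnomaly class and the run_capped() helper are hypothetical and serve only as illustration: the text above does not state the actual limit Malcat uses or how it is enforced.

```python
from itertools import islice

MAX_CALLS = 3  # hypothetical limit, Malcat's real value is not stated above


class CountingAnomaly:
    """Toy anomaly that only counts how often scan_string() is called."""

    def __init__(self):
        self.calls = 0

    def scan_string(self, analysis, string):
        self.calls += 1
        return ()  # reports nothing


def run_capped(analysis, strings, anomaly, limit=MAX_CALLS):
    # islice stops handing out strings once `limit` of them have been processed
    for string in islice(strings, limit):
        for _ in anomaly.scan_string(analysis, string):
            pass


anomaly = CountingAnomaly()
# A lazy generator stands in for a file with a million strings.
run_capped(None, (f"string {i}" for i in range(1_000_000)), anomaly)
```

After the call, anomaly.calls equals the limit rather than one million, so the cost of a per-string hook stays bounded however large the file is.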