Control Flow Graph (analysis.cfg)

analysis.cfg: malcat.CFG

The analysis.cfg object is a malcat.CFG instance that gives you access to all the basic blocks identified by the CFG reconstruction algorithm.

Note that in addition to this documentation, you can find usage examples in the sample script which is loaded when you hit F8.

Basic blocks definition

Definition

The CFG, or control flow graph, divides executable code into a graph of basic blocks. Basic blocks are contiguous file ranges that satisfy the following properties:

  • control flow always starts at the beginning of the block for every possible execution of the program

  • control flow always goes to the end of the block for every possible execution of the program, i.e. no branching/jump except for the last instruction (with a special case for the EXCEPTION edge, see below)

  • the basic block is located in a single region

  • basic blocks have incoming and outgoing edges, which can be of 4 types:

    • STEP: normal control flow, next instruction will be executed

    • JUMP: a conditional or unconditional jump

    • CALL: a call

    • EXCEPTION: a conditional jump, the condition being that an exception happens when executing any instruction inside the basic block.

Note

Contrary to some other tools or papers, we consider that a call instruction ends a basic block. Such exotic basic blocks have their BasicBlock.exotic property set to True, to help users who are used to the other definition.
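The definition above, including the rule that a call ends a basic block, can be illustrated with a small stand-alone sketch. It does not use the Malcat API: Insn, its kind values and split_blocks are hypothetical names, and the classic "leader" approach shown here is only an assumption about how such a split can be done, not a description of Malcat's reconstruction algorithm. EXCEPTION edges and regions are ignored.

```python
from typing import List, NamedTuple, Optional, Tuple


class Insn(NamedTuple):
    """Toy instruction, NOT a malcat object."""
    address: int
    size: int
    kind: str                      # "plain", "jump", "call" or "ret" (toy categories)
    target: Optional[int] = None   # branch/call target, if statically known


def split_blocks(insns: List[Insn]) -> List[Tuple[int, int]]:
    """Return the [start, end) ranges of the basic blocks of a contiguous toy listing."""
    if not insns:
        return []
    first = insns[0].address
    end = insns[-1].address + insns[-1].size
    leaders = {first}
    for insn in insns:
        if insn.kind in ("jump", "call", "ret"):
            # any control transfer ends the block, calls included
            leaders.add(insn.address + insn.size)
        if insn.target is not None:
            # a known target starts a block
            leaders.add(insn.target)
    # keep only leaders that fall inside the listing
    starts = sorted(a for a in leaders if first <= a < end)
    return list(zip(starts, starts[1:] + [end]))


listing = [
    Insn(0x1000, 2, "plain"),
    Insn(0x1002, 5, "call", 0x2000),   # ends a block: this block would be "exotic"
    Insn(0x1007, 2, "plain"),
    Insn(0x1009, 2, "jump", 0x1000),
    Insn(0x100B, 1, "ret"),
]
blocks = split_blocks(listing)
for start, stop in blocks:
    print(f"{start:#x}-{stop:#x}")
```

Under the other definition mentioned in the note, the first two blocks of this toy listing would be merged into a single one.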

Code blocks and data blocks

To simplify the interface and make code easier to read, non-code regions (i.e. data) also belong to special basic blocks named data blocks, which have neither incoming nor outgoing edges. Every byte of the effective address space therefore belongs to either a code block or a data block.
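The idea that code and data blocks together cover the whole address space can be sketched as follows. This is a stand-alone illustration with a hypothetical partition helper, not Malcat code. It assumes non-overlapping code ranges and ignores the rule that a block stays inside a single region.

```python
from typing import List, Tuple


def partition(code_ranges: List[Tuple[int, int]], space_end: int) -> List[Tuple[int, int, str]]:
    """Cover [0, space_end) with the given code ranges plus data ranges filling every gap."""
    result = []
    cursor = 0
    for start, end in sorted(code_ranges):
        if start > cursor:
            # gap before this code block: it becomes a data block
            result.append((cursor, start, "data"))
        result.append((start, end, "code"))
        cursor = end
    if cursor < space_end:
        # trailing bytes after the last code block
        result.append((cursor, space_end, "data"))
    return result


layout = partition([(0x10, 0x20), (0x20, 0x28), (0x40, 0x50)], 0x60)
for start, end, kind in layout:
    print(f"{start:#04x}-{end:#04x} {kind}")
```

Because every address falls in exactly one range, a lookup by address never has to handle a "no block here" case, which is the simplification the paragraph above describes.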

Basic blocks

Blocks

A basic block is a BasicBlock instance offering the following python methods and properties:

class malcat.BasicBlock
address: int (effective address)

the start of the basic block

start: int (effective address)

same as address

end: int (effective address)

last effective address inside the basic block + 1

size: int

size in bytes of the basic block: end - start

__len__()

return the size in bytes of the basic block: end - start

Return type:

int

__iter__()

Iterate over the basic block’s instructions (even if data). Shortcut for analysis.asm[bb.start:bb.end].

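# read the entry point RVA from the optional header
ep_rva = analysis.struct['OptionalHeader']['AddressOfEntryPoint']
# convert the RVA to an effective address and fetch the block that contains it
ep_bb = analysis.cfg[analysis.map.from_rva(ep_rva)]
# print every instruction of the entry point's basic block
for insn in ep_bb:
    print(insn)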
Return type:

iterator over the sequence of instructions (Instruction)

__contains__(ea)

return True iff the effective address ea is within the basic block boundaries

Parameters:

ea (int) – address to query

Return type:

bool
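The relationship between start, end, size, __len__ and __contains__ can be summed up with a stand-in class. ToyBlock is hypothetical and only mirrors the properties documented above. Treating end as excluded from the block is an assumption, derived from end being described as the last address inside the block + 1.

```python
from dataclasses import dataclass


@dataclass
class ToyBlock:
    """Stand-in mirroring the documented properties, NOT malcat.BasicBlock."""
    start: int
    end: int   # last effective address inside the block + 1

    @property
    def address(self) -> int:
        # documented as "same as start"
        return self.start

    @property
    def size(self) -> int:
        return self.end - self.start

    def __len__(self) -> int:
        return self.size

    def __contains__(self, ea: int) -> bool:
        # half-open range: start is inside, end is not
        return self.start <= ea < self.end


bb = ToyBlock(0x401000, 0x401010)
print(len(bb), 0x40100F in bb, 0x401010 in bb)
```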

code: bool

True iff the basic block contains code

data: bool

True iff the basic block contains data

entry: bool

True iff the basic block was an entry node in the CFG reconstruction algorithm

exotic: bool

True iff the last instruction of the basic block is a call

incoming: List[BasicBlockEdge]

list of incoming edges going to this basic block

outgoing: List[BasicBlockEdge]

list of outgoing edges departing from this basic block
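The boolean properties above lend themselves to quick statistics over a set of blocks. The sketch below uses a hypothetical summarize helper that only reads the documented code, data, entry, exotic and outgoing attributes. It is exercised here on stand-in objects, since how you obtain the list of blocks inside Malcat is not covered by this example.

```python
from collections import Counter
from types import SimpleNamespace


def summarize(blocks) -> Counter:
    """Count block categories using only the documented BasicBlock properties."""
    stats = Counter()
    for bb in blocks:
        if bb.data:
            stats["data"] += 1
            continue
        stats["code"] += 1
        if bb.entry:
            stats["entry"] += 1
        if bb.exotic:
            stats["exotic"] += 1          # block ends with a call
        if not bb.outgoing:
            stats["no_outgoing"] += 1     # e.g. a block ending with a return
    return stats


def toy(code, entry=False, exotic=False, outgoing=()):
    # stand-in for a basic block, NOT a malcat object
    return SimpleNamespace(code=code, data=not code, entry=entry,
                           exotic=exotic, outgoing=list(outgoing))


toy_blocks = [
    toy(True, entry=True, exotic=True, outgoing=["edge"]),
    toy(True, outgoing=["edge", "edge"]),
    toy(True),
    toy(False),
]
stats = summarize(toy_blocks)
print(dict(stats))
```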

Edges

An edge links two basic blocks and is represented by a BasicBlockEdge python object:

class malcat.BasicBlockEdge
address: int (effective address)

for incoming edges, this is the source address; for outgoing edges, this is the target address

type: BasicBlockEdge.Type

edge type

conditional: bool

True iff there is a condition attached to the edge. Condition interface will be added later.
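Since the meaning of address depends on the direction of the edge, small helpers can make scripts easier to read. The functions below are hypothetical and only rely on the documented incoming, outgoing, address and conditional attributes. They are run here on stand-in objects rather than on real Malcat blocks.

```python
from types import SimpleNamespace


def predecessor_addresses(bb):
    """For incoming edges, edge.address is the source of the edge."""
    return [edge.address for edge in bb.incoming]


def successor_addresses(bb):
    """For outgoing edges, edge.address is the target of the edge."""
    return [edge.address for edge in bb.outgoing]


def conditional_targets(bb):
    """Targets of the outgoing edges that carry a condition."""
    return [edge.address for edge in bb.outgoing if edge.conditional]


# stand-in block with one incoming and two outgoing edges (toy values)
toy_bb = SimpleNamespace(
    incoming=[SimpleNamespace(address=0x1000, conditional=False)],
    outgoing=[
        SimpleNamespace(address=0x1020, conditional=True),
        SimpleNamespace(address=0x1010, conditional=False),
    ],
)
print([hex(a) for a in predecessor_addresses(toy_bb)])
print([hex(a) for a in successor_addresses(toy_bb)])
```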

class malcat.BasicBlockEdge.Type

This enum describes the type of BasicBlockEdge

CALL

a call edge

JUMP

a jump edge: conditional jump or (in)direct jump

EXCEPTION

flow switch because of exception structure (try->catch, try->finally or catch->finally)

STEP

links to contiguous basic blocks, no jump / call, just normal execution flow
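Edge types are mostly useful for deciding which edges to follow. As a rough sketch, the traversal below collects the blocks reachable from a starting block through STEP and JUMP edges only, which approximates staying inside one function. EdgeType, reachable and the dictionary of blocks are stand-ins and not the Malcat API. Inside Malcat you would compare against BasicBlockEdge.Type and look blocks up through analysis.cfg. That an outgoing edge address can be used directly as a block key is an assumption of this toy model.

```python
from collections import deque
from enum import Enum, auto
from types import SimpleNamespace


class EdgeType(Enum):
    """Stand-in for malcat.BasicBlockEdge.Type."""
    CALL = auto()
    JUMP = auto()
    EXCEPTION = auto()
    STEP = auto()


def reachable(blocks, start, follow=(EdgeType.STEP, EdgeType.JUMP)):
    """Breadth-first walk over outgoing edges whose type is in `follow`."""
    seen = {start}
    todo = deque([start])
    while todo:
        bb = blocks[todo.popleft()]
        for edge in bb.outgoing:
            target = edge.address
            if edge.type in follow and target in blocks and target not in seen:
                seen.add(target)
                todo.append(target)
    return seen


def edge(edge_type, address):
    # stand-in for an outgoing BasicBlockEdge
    return SimpleNamespace(type=edge_type, address=address)


# toy graph keyed by block start address
toy_cfg = {
    0x10: SimpleNamespace(outgoing=[edge(EdgeType.CALL, 0x100), edge(EdgeType.STEP, 0x20)]),
    0x20: SimpleNamespace(outgoing=[edge(EdgeType.JUMP, 0x10), edge(EdgeType.STEP, 0x30)]),
    0x30: SimpleNamespace(outgoing=[]),
    0x100: SimpleNamespace(outgoing=[]),
}
local = reachable(toy_cfg, 0x10)
with_calls = reachable(toy_cfg, 0x10, follow=tuple(EdgeType))
print(sorted(hex(a) for a in local))
```

Passing a different follow tuple changes the scope of the walk: adding CALL also descends into callees, while adding EXCEPTION would include handler blocks.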