Control Flow Graph (analysis.cfg)

analysis.cfg: malcat.CFG

The analysis.cfg object is a malcat.CFG instance that gives you access to all the basic blocks identified by the CFG reconstruction algorithm.

Note that in addition to this documentation, you can find usage examples in the sample script which is loaded when you hit F8.

Basic blocks definition

Definition

The CFG, or control flow graph, divides executable code into a graph of basic blocks. Basic blocks are contiguous file ranges that satisfy the following conditions:

  • control flow always starts at the beginning of the block for every possible execution of the program

  • control flow always goes to the end of the block for every possible execution of the program, i.e. no branching/jump except for the last instruction (with a special case for the EXCEPTION edge, see below)

  • the basic block is located in a single region

  • basic blocks have incoming and outgoing edges, which can be of 4 types:

    • STEP: normal control flow, next instruction will be executed

    • JUMP: a conditional or unconditional jump

    • CALL: a call

    • EXCEPTION: a conditional jump, the condition being that an exception happens when executing any instruction inside the basic block.

Note

Contrary to some other tools or papers, we consider that a call instruction ends a basic block. Such exotic basic blocks have their BasicBlock.exotic property set to True, to help users accustomed to the other definition.
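The definition above can be illustrated with a small, self-contained sketch. It is not Malcat's CFG reconstruction algorithm: it is the textbook "leader" approach applied to a toy instruction list, with the convention that a call ends a basic block. The Insn class, its kind field and split_basic_blocks are hypothetical names introduced only for this illustration, and the listing is assumed to be contiguous and sorted by address.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Insn:
    """Hypothetical toy instruction, not a Malcat type."""
    address: int
    size: int
    kind: str                     # "other", "jump", "cjump", "call" or "ret"
    target: Optional[int] = None  # destination of a jump or call, if known


def split_basic_blocks(insns: List[Insn]) -> List[Tuple[int, int]]:
    """Return (start, end) ranges, end being exclusive (last address + 1)."""
    if not insns:
        return []
    addresses = {insn.address for insn in insns}
    leaders = {insns[0].address}
    for insn in insns:
        if insn.kind == "other":
            continue
        # Any control-flow instruction (calls included) ends its block,
        # so the next instruction starts a new one.
        following = insn.address + insn.size
        if following in addresses:
            leaders.add(following)
        # A jump or call destination inside the listing also starts a block.
        if insn.target in addresses:
            leaders.add(insn.target)
    blocks = []
    start = insns[0].address
    for insn in insns[1:]:
        if insn.address in leaders:
            blocks.append((start, insn.address))
            start = insn.address
    last = insns[-1]
    blocks.append((start, last.address + last.size))
    return blocks


listing = [
    Insn(0x00, 1, "other"),
    Insn(0x01, 5, "call", 0x100),  # the call ends the first block
    Insn(0x06, 2, "other"),
    Insn(0x08, 2, "cjump", 0x0C),  # conditional jump over the next instruction
    Insn(0x0A, 2, "other"),
    Insn(0x0C, 1, "ret"),
]
blocks = split_basic_blocks(listing)
for start, end in blocks:
    print(f"block {start:#04x}-{end:#04x}")
```

With the other convention (calls do not end blocks), the first two ranges would be merged into one.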

Code blocks and data blocks

To simplify the interface and make code easier to read, non-code regions (i.e. data) also belong to special basic blocks named data blocks, which have neither incoming nor outgoing edges. Every byte of the effective address space therefore belongs to either a code block or a data block.
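Because every byte falls into exactly one kind of block, a script can tally code and data sizes in a single pass. The sketch below uses a hypothetical FakeBlock stand-in that only mirrors the documented start, end and code fields, so that it runs outside Malcat; how the real blocks are enumerated is not shown here and is left to the caller.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass
class FakeBlock:
    """Hypothetical stand-in mirroring the documented start, end and code fields."""
    start: int
    end: int
    code: bool


def coverage(blocks: Iterable[FakeBlock]) -> Tuple[int, int]:
    """Return (code_bytes, data_bytes) summed over the given blocks."""
    code_bytes = 0
    data_bytes = 0
    for bb in blocks:
        size = bb.end - bb.start
        if bb.code:
            code_bytes += size
        else:
            data_bytes += size
    return code_bytes, data_bytes


sample_blocks = [
    FakeBlock(0x1000, 0x1010, code=True),
    FakeBlock(0x1010, 0x1040, code=False),  # a data block between two code blocks
    FakeBlock(0x1040, 0x1048, code=True),
]
code_bytes, data_bytes = coverage(sample_blocks)
print(f"code: {code_bytes} bytes, data: {data_bytes} bytes")
```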

Basic blocks

Blocks

A basic block is a BasicBlock instance offering the following python methods and properties:

class malcat.BasicBlock
address: int (effective address)

the start of the basic block

start: int (effective address)

same as address

end: int (effective address)

last effective address inside the basic block + 1

size: int

size in bytes of the basic block: end - start

function: malcat.Function

the function this basic block belongs to (or None if no function, e.g. for data blocks)

__len__()

return the size in bytes of the basic block: end - start

Return type:

int

__iter__()

Iterate over the basic block’s instructions (even if data). Shortcut for analysis.asm[bb.start:bb.end].

# print every instruction of the basic block located at the PE entry point
ep_rva = analysis.struct['OptionalHeader']['AddressOfEntryPoint']
ep_bb = analysis.cfg[analysis.map.from_rva(ep_rva)]
for insn in ep_bb:
    print(insn)

Return type:

iterator over the sequence of instructions (Instruction)

__contains__(ea)

return True iff the effective address ea is within the basic block boundaries

Parameters:

ea (int) – address to query

Return type:

bool
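Since end is documented as the last effective address inside the block plus one, the block boundaries behave like a half-open range. The sketch below models that reading with a hypothetical BlockRange class; it assumes that __contains__ follows the same half-open convention as __len__, which the text above suggests but does not state outright.

```python
from dataclasses import dataclass


@dataclass
class BlockRange:
    """Hypothetical model of a block's boundaries, not the malcat.BasicBlock class."""
    start: int
    end: int  # last address inside the block + 1

    def __len__(self) -> int:
        return self.end - self.start

    def __contains__(self, ea: int) -> bool:
        # Assumption: start is included, end is excluded.
        return self.start <= ea < self.end


bb = BlockRange(0x401000, 0x401010)
print(len(bb), 0x40100F in bb, 0x401010 in bb)
```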

disasm(use_hexadecimal=True, resolve_symbols=True, resolve_functions=True, resolve_strings=True, resolve_structures=True)

Disassemble this basic block following the given formatting

Parameters:
  • use_hexadecimal (bool) – display immediates in hexadecimal base

  • resolve_symbols (bool) – known symbol addresses will be replaced by their symbol names

  • resolve_functions (bool) – known function start addresses will be replaced by their function name

  • resolve_strings (bool) – known string addresses will be replaced by the string content

  • resolve_structures (bool) – known structure/fields addresses will be replaced by their structure/field name

Return type:

str

hex(exclude_off=False, exclude_disp=False, exclude_reg=False, exclude_imm=False)

Get the hex bytes of the basic block, with the bytes of the excluded operand kinds masked out as ??, e.g. 558BEC68????????8374??45..

Parameters:
  • exclude_off (bool) – exclude absolute offsets in valid instructions, e.g. mov eax, off_15bf34

  • exclude_disp (bool) – exclude displacements in valid instructions, e.g. mov eax, [ecx+128]

  • exclude_reg (bool) – exclude registers in valid instructions, e.g. push eax

  • exclude_imm (bool) – exclude immediates in valid instructions, e.g. mov eax, 0x1223

Return type:

str
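One possible use of such a masked string is to search for similar blocks in raw bytes. The sketch below converts a masked hex string into a bytes regular expression using only the Python standard library. The helper name masked_hex_to_regex is hypothetical, and the code assumes that masking always covers whole bytes (pairs of ?), as in the example above.

```python
import re


def masked_hex_to_regex(pattern: str) -> "re.Pattern[bytes]":
    """Compile a masked hex string such as '558BEC68????????' into a bytes regex."""
    if len(pattern) % 2:
        raise ValueError("pattern must contain an even number of characters")
    parts = []
    for i in range(0, len(pattern), 2):
        pair = pattern[i:i + 2]
        if pair == "??":
            parts.append(b".")  # masked byte: match any value
        else:
            # int() raises ValueError on half-masked pairs such as '5?'
            parts.append(re.escape(bytes([int(pair, 16)])))
    # DOTALL so that a masked byte also matches 0x0A
    return re.compile(b"".join(parts), re.DOTALL)


rx = masked_hex_to_regex("558BEC68????????8374??45")
same_shape = bytes.fromhex("558BEC680010400083742445")
other_opcode = bytes.fromhex("558BEC680010400083752445")
print(rx.fullmatch(same_shape) is not None, rx.fullmatch(other_opcode) is not None)
```

Use rx.search or rx.finditer instead of fullmatch to locate the pattern inside a larger buffer.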

code: bool

True iff the basic block contains code

data: bool

True iff the basic block contains data

entry: bool

True iff the basic block was an entry node in the CFG reconstruction algorithm

exotic: bool

True iff the last instruction of the basic block is a call

incoming: List[BasicBlockEdge]

list of incoming edges going to this basic block

outgoing: List[BasicBlockEdge]

list of outgoing edges departing from this basic block

inrefs: List[malcat.Reference]

list of all data and code locations (as malcat.Reference objects) that reference the start of this basic block

bb = analysis.cfg[analysis.v2a(0x18000127c)]
for inref in bb.inrefs:
    print(f"basic block {analysis.ppa(bb.start)} is referenced by {analysis.ppa(inref.address)} ({inref.type})")
>> "basic block 0x18000127c (sub_18000127c) is referenced by 0x1800015f3 (sub_1800014c4+12f) (Type.CODE)"

outrefs: List[malcat.Reference]

list of all data and code (a list of malcat.Reference objects) referenced by any code instruction of this basic block

bb = analysis.cfg[analysis.v2a(0x18000127c)]
for outref in bb.outrefs:
    print(f"An instruction of basicblock {bb} references {analysis.ppa(outref.address)}")
>> "An instruction of basicblock <malcat.BasicBlock object at 0x000000000A95B670> references 0x18002bc58 (.data:3c58)"
>> "An instruction of basicblock <malcat.BasicBlock object at 0x000000000A95B670> references 0x18000129b (sub_18000127c+1f)"

Edges

An edge links two basic blocks and is represented by a BasicBlockEdge python object:

class malcat.BasicBlockEdge
address: int (effective address)

for incoming edges, this is the source address; for outgoing edges, this is the target address

type: BasicBlockEdge.Type

edge type

conditional: bool

True iff there is a condition attached to the edge. Condition interface will be added later.

class malcat.BasicBlockEdge.Type

This enum describes the type of BasicBlockEdge

CALL

a call edge

JUMP

a jump edge: conditional jump or (in)direct jump

EXCEPTION

flow switch because of exception structure (try->catch, try->finally or catch->finally)

STEP

links to contiguous basic blocks, no jump / call, just normal execution flow
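Edge types make it easy to restrict a graph traversal. The sketch below collects the blocks reachable from an entry block while ignoring CALL edges, which roughly stays inside one function. It runs on hypothetical stand-ins (EdgeType, FakeEdge, FakeBlock) that only mirror the documented address, type, conditional and outgoing fields. It also assumes that the address of an outgoing edge is the start of the target block, and that a block ending in a call has both a CALL edge and a STEP edge.

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import Dict, List, Set


class EdgeType(Enum):
    """Hypothetical mirror of the documented BasicBlockEdge.Type members."""
    CALL = auto()
    JUMP = auto()
    EXCEPTION = auto()
    STEP = auto()


@dataclass
class FakeEdge:
    address: int            # target address, since these are outgoing edges
    type: EdgeType
    conditional: bool = False


@dataclass
class FakeBlock:
    start: int
    outgoing: List[FakeEdge] = field(default_factory=list)


def reachable(blocks: Dict[int, FakeBlock], entry: int) -> Set[int]:
    """Return the start addresses reachable from entry without following calls."""
    seen: Set[int] = set()
    todo = [entry]
    while todo:
        ea = todo.pop()
        if ea in seen or ea not in blocks:
            continue
        seen.add(ea)
        for edge in blocks[ea].outgoing:
            if edge.type is not EdgeType.CALL:
                todo.append(edge.address)
    return seen


graph = {
    0x10: FakeBlock(0x10, [FakeEdge(0x30, EdgeType.JUMP, conditional=True),
                           FakeEdge(0x20, EdgeType.STEP)]),
    0x20: FakeBlock(0x20, [FakeEdge(0x100, EdgeType.CALL),
                           FakeEdge(0x30, EdgeType.STEP)]),
    0x30: FakeBlock(0x30),    # ends with a return: no outgoing edge
    0x100: FakeBlock(0x100),  # callee, only reached through the CALL edge
}
visited = reachable(graph, 0x10)
print(sorted(hex(ea) for ea in visited))
```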