File structures (analysis.struct)

analysis.struct: FileStructure

The analysis.struct object is a malcat.FileStructure instance that gives you access to all the structures identified by the File parsers.

Note that in addition to this documentation, you can find usage examples in the sample script which is loaded when you hit F8.

Structures, fields and values

In Malcat, structures are named objects (the names you see in the Structure/text view for instance) and are composed of a single field (aka the root field) which can be:

  • an atomic field : an integer, a string, a timestamp, etc

  • a record field : like a typical C structure, a sequence of named fields

  • an array field : like a typical C array, a sequence of identical (unnamed) fields

  • a bitfield : a sequence of Bit fields

A record or array can itself be composed of other sub-records or sub-arrays; there is no limit on the depth of fields. Most of the time, a structure’s level-0 field is a record or an array, but it can be an atomic field too in some cases. The root object containing all level-0 fields is the analysis.struct variable of type FileStructure.

Fields are python objects of type StructAccess, which have, among other properties, a value. Atomic fields have a pythonic value, e.g. a datetime.datetime will be used for the value of a Timestamp field, or a str for fields of type String. Records, Arrays and Bitfields have no value per se: accessing their value attribute returns their StructAccess instance.

Accessing structures

class malcat.FileStructure

This class contains all structures identified by the File parsers. Note that all addresses used in this class are effective addresses. See Addressing in Malcat for more details.

__iter__()

Iterate over the file’s identified structures

for s in analysis.struct:
    print("#{:x}: {}".format(analysis.map.a2p(s.address), s.name))
Return type:

iterator over the list of structures (StructAccess)

__getitem__(interval)

Iterate over all the structures contained in the interval (effective address):

for s in analysis.struct[0x100:]:
    print("#{:x}: {}".format(analysis.map.a2p(s.address), s.name))
Parameters:

interval (slice) – effective address interval

Return type:

iterator over the list of structures (StructAccess)
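As a sketch of how the slice form might be wrapped, the helper below collects the address and name of every structure in a bounded interval. The name structures_between is hypothetical (it is not part of Malcat), and the sketch assumes that a slice with both bounds behaves like the open-ended slice shown above.

```python
def structures_between(struct, start_ea, end_ea):
    """Return (address, name) pairs for the structures in a slice.

    `struct` is expected to behave like analysis.struct. This helper is
    hypothetical: it only relies on slicing by effective address and on
    the documented `address` and `name` attributes.
    """
    return [(s.address, s.name) for s in struct[start_ea:end_ea]]


# Possible use inside a Malcat script (not executed here):
#   for ea, name in structures_between(analysis.struct, 0x100, 0x1000):
#       print("{:x}: {}".format(ea, name))
```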

__getitem__(name)

return the value of the root field for the first structure named name

ep_rva = analysis.struct['OptionalHeader']['AddressOfEntryPoint']
print(f"EntryPoint: 0x{analysis.map.imagebase + ep_rva:x}")
Parameters:

name (str) – name of the structure

Return type:

StructAccess for aggregate fields, the python type for atomic fields

Raises:

KeyError if no structure can be found

__getattr__(name)

return the root field for the first structure named name. Example:

optional_header = analysis.struct.OptionalHeader
Parameters:

name (str) – name of the structure

Return type:

StructAccess

Raises:

KeyError if no structure can be found

at(name)

return an accessor to the root field for the first structure named name. Useful if name contains non-alphanumerical characters and __getattr__() can’t be used.

Parameters:

name (str) – name of the structure

Return type:

StructAccess

Raises:

KeyError if no structure can be found
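Since at() raises KeyError for unknown names, a script may prefer a lookup that returns None instead. The following is a minimal sketch combining the in test and at(); structure_or_none is a hypothetical helper name, not a Malcat function.

```python
def structure_or_none(struct, name):
    """Return the root field of the first structure called `name`, or None.

    `struct` is expected to behave like analysis.struct. Useful for names
    such as "#US" that cannot be written as attribute access.
    """
    if name in struct:
        return struct.at(name)
    return None


# Possible use inside a Malcat script (not executed here):
#   user_strings = structure_or_none(analysis.struct, "#US")
#   if user_strings is not None:
#       print(user_strings.size)
```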

__contains__(name)

return True iff a structure named name exists

if "#US" not in analysis.struct:
    raise ValueError("No user strings!")
Parameters:

name (str) – name of the structure

Return type:

bool

find(ea)

return the structure’s root field which starts at or contains the effective address ea, or None if none can be found.

Parameters:

ea (int) – effective address for the query

Return type:

StructAccess or None

find_forward(ea)

return the structure’s root field which starts at or contains the effective address ea or starts directly after ea, or None if no structure is defined beyond ea.

Parameters:

ea (int) – effective address for the query

Return type:

StructAccess or None

find_backward(ea)

return the structure’s root field which starts at or contains the effective address ea, or the first one that starts before ea, or None if no structure is defined before ea.

Parameters:

ea (int) – effective address for the query

Return type:

StructAccess or None
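The three lookup methods can be combined to tell whether an address falls inside a structure or in a gap before one. The sketch below is one possible way to do this; locate and its relation labels are hypothetical and only use find() and find_forward() as documented above.

```python
def locate(struct, ea):
    """Classify effective address `ea` relative to the identified structures.

    Returns a (relation, field) tuple where relation is:
      "inside" - a structure starts at or contains ea
      "next"   - no structure contains ea, `field` is the next one after it
      "none"   - no structure contains ea or is defined beyond it
    """
    field = struct.find(ea)
    if field is not None:
        return ("inside", field)
    field = struct.find_forward(ea)
    if field is not None:
        return ("next", field)
    return ("none", None)
```

The same pattern would work with find_backward() to report the closest preceding structure instead.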

__len__()

return the number of identified structures

if len(analysis.struct) == 0:
    raise ValueError("No structure found!")
Return type:

int

User types

force(ea, type_name)

Force a new type definition at the given effective address. The name of the type should be one that you can find in the user type dialog (cf. Apply a custom type). The method will return false iff a type is already defined at the address. If it succeeds, the operation will be registered in the Undo/redo manager.

address = analysis.v2a(0x401000)
if address not in analysis.struct and analysis.struct.force(address, "windows.STARTUPINFOA"):
    print("New type defined at {}".format(analysis.ppa(address)))
Parameters:
  • ea (int) – effective address where the type should be defined (i.e. first byte of the structure)

  • type_name (str) – the name of a static or dynamic type (cf. Apply a custom type)

Return type:

bool

Note

This method invalidates several analyses, which need to be rerun. If you Run a script from the user interface, the UI will take care of this for you at the end of the script. But if you Run Malcat from your python interpreter, you will have to call the method malcat.Analysis.run() at some point to take your change into account.

unforce(ea)

Un-force a custom type definition previously made via force(). The address given should be the same as for the call to force(). The method will return false iff a type was not forced at the address. If it succeeds, the unforce operation will be registered in the Undo/redo manager.

Parameters:

ea (int) – effective address given to a previous call to force() (start of type definition)

Return type:

bool
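Because force() and unforce() both report success through their return value, they can be paired to apply several types in an all-or-nothing fashion. This is a sketch only: force_all is a hypothetical helper, and the type names passed in must exist in the user type dialog.

```python
def force_all(struct, placements):
    """Force every (ea, type_name) pair in `placements`, all or nothing.

    Returns True if every force() call succeeded. On the first failure,
    the types forced so far are removed again with unforce() and False
    is returned.
    """
    done = []
    for ea, type_name in placements:
        if not struct.force(ea, type_name):
            # Roll back in reverse order so the file is left untouched.
            for forced_ea in reversed(done):
                struct.unforce(forced_ea)
            return False
        done.append(ea)
    return True
```

When run outside the user interface, the note above about calling malcat.Analysis.run() after force() still applies.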

Inspecting fields

The FileStructure object has many methods which give access to a structure’s root field. Fields are python objects of class StructAccess and have the following methods and attributes:

class malcat.StructAccess

This class represents a field, which can be the root field of a FileStructure, or a child of an aggregate field, e.g. a record/structure field or an array field.

Attributes

All types of fields have the following attributes and methods:

value: depends on the field type

the value of the field. For aggregate fields, this would return itself (aka a StructAccess instance) since aggregate fields have no value per se. For atomic fields, the returned type depends on the field type: int, str, datetime, etc.

name: str

the name of the field. Example:

print(analysis.struct["Directories"][0].name)
>> Directories[0]
print(analysis.struct["Directories"][0].StreamSize.name)
>> StreamSize
address: int

the effective address of the field.

start: int

same as address

end: int

same as address + size

offset: int

the physical address of the field. Fields can only be defined on file-backed memory, so they always have a valid physical address.

size: int

the number of bytes the field takes on disk

__len__()

return the field’s size

Return type:

int

bytes: bytes

raw bytes of the field on disk. The returned bytes object has a length of size

has_enum()

some atomic fields have a fixed set of values. If so, has_enum will be true

Return type:

bool

enum: str

the textual representation of the field’s value if the field has an enum defined (i.e. has_enum() is True). If the field is not an enum, the empty string is returned. Example:

print(analysis.struct.PE.Machine.value)
>> 332
print(analysis.struct.PE.Machine.enum)
>> IMAGE_FILE_MACHINE_I386
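The attributes above can be combined into a one-line summary of a field. The following is a sketch rather than a Malcat facility: describe is a hypothetical helper, and it is mainly meant for atomic fields, since the value of an aggregate is the accessor itself.

```python
def describe(field):
    """Build a one-line summary of a field from its documented attributes.

    Shows the enum label when the field has one, the raw value otherwise.
    """
    shown = field.enum if field.has_enum() else field.value
    return "{} @ ea 0x{:x} (file offset 0x{:x}, {} bytes): {}".format(
        field.name, field.address, field.offset, field.size, shown)


# Possible use inside a Malcat script (not executed here):
#   print(describe(analysis.struct.PE.Machine))
```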

Aggregate fields

For structures/records, arrays and bitfields, you have access to the following additional methods:

count: int

number of sub-fields/members of the aggregate. For atomic fields this attribute is still defined, but it will always be 1

__iter__()

Iterate over all the aggregate’s members, i.e. all the bits of a bitfield, all the rows of an array or all members of a record field

sections = analysis.struct["Sections"]
for s in sections:
    print("{}: #{:x}".format(s["Name"], s["PointerToRawData"]))  # PointerToRawData is already a file offset
Returns:

iterator over the list of field members

Return type:

iterator over StructAccess instances

Raises:

Error for atomic fields
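Since aggregates can nest to any depth, a recursive generator is a natural way to visit a whole structure. The sketch below makes one assumption that should be checked against your Malcat version: a field is treated as an aggregate when its value has the same type as the field itself, following the rule that aggregates return a StructAccess from value. The name walk is hypothetical.

```python
def walk(field, depth=0):
    """Yield (depth, field) for `field` and all of its descendants.

    Assumption: aggregates are recognised by `value` having the same type
    as the field; atomic fields return a plain python value instead and
    are never iterated (iterating them would raise an error).
    """
    yield depth, field
    if type(field.value) is type(field):
        for member in field:
            yield from walk(member, depth + 1)


# Possible use inside a Malcat script (not executed here):
#   for depth, f in walk(analysis.struct.PE):
#       print("  " * depth + f.name)
```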

__getitem__(interval)

Iterate from the ith to the jth sub-element of the array/record/bitfield

for s in analysis.struct.Sections[1:]:
    print("#{:x}: {}".format(analysis.map.a2p(s.address), s.name))
Parameters:

interval (slice) – index interval

Return type:

iterator over the list of members (StructAccess)

__getitem__(i)

return the value of the ith aggregate member.

first_section = analysis.struct.Sections[0]
Parameters:

i (int) – index of the sub-field to retrieve

Return type:

StructAccess for aggregate fields, the python type for atomic fields

Raises:

KeyError if the aggregate field has no member at index i

__getitem__(name)

return the value of the aggregate member named name. Note that this method is not valid for arrays, since array elements don’t have names.

is_executable = analysis.struct.PE.Characteristics['ExecutableImage']
Parameters:

name (str) – name of the member

Return type:

StructAccess for aggregate fields, the python type for atomic fields

Raises:

KeyError if no member named name can be found

at(name)

return an accessor to the first aggregate member named name. Note that this method is not valid for arrays, since array’s elements don’t have names.

is_executable = analysis.struct.PE.Characteristics.at('ExecutableImage').value
Parameters:

name (str) – name of the member

Return type:

StructAccess

Raises:

KeyError if no member named name can be found

__getattr__(name)

return an accessor to the first aggregate member named name. Note that this method is not valid for arrays, since array’s elements don’t have names.

is_executable = analysis.struct.PE.Characteristics.ExecutableImage.value
# equivalent
is_executable = analysis.struct.PE.Characteristics.at('ExecutableImage').value
# equivalent
is_executable = analysis.struct.PE.Characteristics['ExecutableImage']
Parameters:

name (str) – name of the member

Return type:

StructAccess

Raises:

KeyError if no member named name can be found

at(i)

return an accessor to the ith aggregate member

is_executable = analysis.struct.PE.Characteristics.at(1).value
Parameters:

i (int) – position of the aggregate member to query

Return type:

StructAccess

Raises:

KeyError if the aggregate field has no member at index i
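When a member may be absent (optional fields, shorter arrays), the KeyError documented above can be turned into a default value. This is a minimal sketch: member_value is a hypothetical helper, and catching IndexError as well is only a precaution, not documented behaviour.

```python
def member_value(aggregate, key, default=None):
    """Return the value of member `key` (a name or an index), or `default`.

    Relies on the documented behaviour that a missing member raises
    KeyError; IndexError is caught too as a precaution.
    """
    try:
        return aggregate[key]
    except (KeyError, IndexError):
        return default


# Possible use inside a Malcat script (not executed here):
#   is_dll = member_value(analysis.struct.PE.Characteristics, "Dll", False)
```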

Editing

The same way you access an aggregate field’s value, you can also edit it. See the examples below:

# all of these lines do the same thing: set the ExecutableImage bit of the PE characteristics bitfield to True:
analysis.struct.PE.Characteristics['ExecutableImage'] = True
analysis.struct.PE.Characteristics.at('ExecutableImage').value = True
analysis.struct.PE.Characteristics.ExecutableImage.value = True
analysis.struct.PE.Characteristics[1] = True
analysis.struct.PE.Characteristics.at(1).value = True

# setting an int value
analysis.struct.MZ.InitialSS.value = 5

# changing a time / date value
import datetime
analysis.struct["PE"]["TimeDateStamp"] = datetime.datetime.now()

# changing a string value. Make sure that the string is not bigger than 8 chars!
analysis.struct["Sections"][0]["Name"] = "newname"

Note

When editing a field, you’ll have to make sure that the new value does not take more space on disk than the old one. Otherwise, an error will be thrown. You’ll also have to make sure to assign the correct python type and value, like a non-negative int less than 65536 for a UInt16 field.
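The constraints in this note can be checked before assigning, so that a script fails with a clear message of its own. The helpers below are a sketch under stated assumptions: set_string_checked and check_uint are hypothetical names, the string member is assumed to be a fixed-size field, and its on-disk encoding is assumed to be ASCII unless another encoding is passed.

```python
def set_string_checked(record, member_name, text, encoding="ascii"):
    """Assign `text` to a string member only if it fits in the field.

    Assumptions: the member is a fixed-size string field and `encoding`
    matches its on-disk encoding; verify both for the field at hand.
    """
    field = record.at(member_name)
    encoded_size = len(text.encode(encoding))
    if encoded_size > field.size:
        raise ValueError("{!r} needs {} bytes, field {} has {}".format(
            text, encoded_size, field.name, field.size))
    field.value = text


def check_uint(value, bits):
    """Return `value` if it fits an unsigned integer of `bits` bits."""
    if not isinstance(value, int) or not 0 <= value < (1 << bits):
        raise ValueError("{!r} does not fit in UInt{}".format(value, bits))
    return value


# Possible use inside a Malcat script (not executed here):
#   set_string_checked(analysis.struct["Sections"][0], "Name", "newname")
#   analysis.struct.MZ.InitialSS.value = check_uint(5, 16)
```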