File structures (analysis.struct)
- analysis.struct: FileStructure
The analysis.struct object is a malcat.FileStructure instance that gives you access to all the structures identified by the File parsers.
Note that in addition to this documentation, you can find usage examples in the sample script which is loaded when you hit F8.
Structures, fields and values
In Malcat, structures are named objects (the names you see in the Structure/text view for instance) and are composed of a single field (aka the root field) which can be:
an atomic field : an integer, a string, a timestamp, etc
a record field : like a typical C structure, a sequence of named fields
an array field : like a typical C array, a sequence of identical (unnamed) fields
a bitfield : a sequence of bits
A record or array can itself be composed of other sub-records or sub-arrays, and there is no limit on the nesting depth of fields. Most of the time, a structure’s level-0 field is a record or an array, but in some cases it can be an atomic field too. The root object containing all level-0 fields is the analysis.struct variable of type FileStructure.
Fields are Python objects of type StructAccess, which have, among other properties, a value. Atomic fields have a pythonic value, e.g. a datetime.datetime is used for the value of a Timestamp field, and a str for fields of type String. Records, Arrays and Bitfields have no value per se: accessing their value attribute returns their own StructAccess instance.
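As an illustration of the nesting described above, here is a minimal sketch of a depth-first walk over a field tree. It relies only on the documented name and value attributes and on iteration over aggregate members. The helper names (is_aggregate, walk, dump) are hypothetical, and the aggregate test is a duck-typing assumption: a field is treated as aggregate when its value is itself field-like rather than a plain Python value.

```python
from typing import Any, Iterator, Tuple


def is_aggregate(field: Any) -> bool:
    """Heuristic (assumption): the value of an aggregate field is a field-like
    object, while the value of an atomic field is a plain Python value."""
    value = field.value
    return hasattr(value, "offset") and hasattr(value, "name")


def walk(field: Any, depth: int = 0) -> Iterator[Tuple[int, Any]]:
    """Yield (depth, field) for a field and all of its descendants, depth-first."""
    yield depth, field
    if is_aggregate(field):
        for member in field:
            yield from walk(member, depth + 1)


def dump(root: Any) -> str:
    """Render the field tree as indented text; only atomic fields show a value."""
    lines = []
    for depth, field in walk(root):
        suffix = "" if is_aggregate(field) else f" = {field.value!r}"
        lines.append(f"{'  ' * depth}{field.name}{suffix}")
    return "\n".join(lines)
```

Inside Malcat, something like print(dump(analysis.struct.PE)) would then print one line per field, indented by depth.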
Accessing structures
Accessing structures in Malcat tries to follow the dictionary access model. Variables of type Record, Array and Bitfield, as well as the main FileStructure object, can be indexed either by name or by integer in order to access their sub-fields. Here is a short example:
# [] is for field value access: it returns the structure field value (if the field is a leaf, i.e. not a struct, bitfield or array)
print(f"EntryPoint RVA: 0x{analysis.struct['OptionalHeader']['AddressOfEntryPoint']:x}")
>>> EntryPoint RVA: 0xdca30
# You can also use the [index] syntax to access the value directly. AddressOfEntryPoint is at index 6 in the structure (indices start at 0):
print(f"EntryPoint: 0x{analysis.struct.OptionalHeader[6]:x}")
>>> EntryPoint: 0xdca30
# . is for detailed field access: you get access to the value, offset, size, bytes, name, etc
ep_field = analysis.struct.OptionalHeader.AddressOfEntryPoint
print(f"{ep_field.name}: 0x{ep_field.value:x} (at effective address {ep_field.address:x}, aka offset {ep_field.offset:x}), size of field: {ep_field.size}, bytes: {ep_field.bytes}")
>>> AddressOfEntryPoint: 0xdca30 (at effective address 170, aka offset 170), size of field: 4, bytes: b'0\xca\r\x00'
# .at(field_name) is a synonym for .field_name.
print(f"EntryPoint: {analysis.struct.OptionalHeader.at('AddressOfEntryPoint').value:x}")
>>> EntryPoint: dca30
# enums
machine = analysis.struct.PE.Machine
print(f"PE.Machine = {machine.value}, has_enum = {machine.has_enum}, enum = {machine.enum}")
>>> PE.Machine = 332, has_enum = True, enum = IMAGE_FILE_MACHINE_I386
# arrays
print(f"has exports ? {analysis.struct['OptionalHeader']['DataDirectory'][0]['Size'] > 0}")
>>> has exports ? False
print(f"number of data directories: {analysis.struct['OptionalHeader']['DataDirectory'].count}")
>>> number of data directories: 16
for i, dd in enumerate(analysis.struct["OptionalHeader"]["DataDirectory"]):
    if "Offset" in dd:
        print(f" DataDirectory[{i}]: #{dd['Offset']:x}-#{dd['Offset'] + dd['Size']:x}")
    else:
        print(f" DataDirectory[{i}]: 0x{dd['Rva']:x}-0x{dd['Rva'] + dd['Size']:x}")
>>> DataDirectory[0]: 0x0-0x0
DataDirectory[1]: 0x15af70-0x15b074
DataDirectory[2]: 0x166000-0x1741038
DataDirectory[3]: 0x0-0x0
...
The FileStructure object
Every access starts from the analysis.struct object, a malcat.FileStructure instance:
- class malcat.FileStructure
This class contains all structures identified by the File parsers. Note that all addresses used in this class are effective addresses. See Addressing in Malcat for more details.
- __iter__()
Iterate over the file’s identified structures
for s in analysis.struct: print("#{:x}: {}".format(analysis.map.a2p(s.address), s.name))
- Return type:
iterator over the list of structures (StructAccess)
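Since lookups by name only return the first structure with a given name, it can help to check whether a name occurs more than once. The sketch below uses only iteration and the name attribute. Both helper names are hypothetical, and the functions accept any iterable of objects with a name, such as analysis.struct.

```python
from collections import Counter
from typing import Any, Iterable, List


def count_structures_by_name(structures: Iterable[Any]) -> Counter:
    """Count how many identified structures share each name."""
    return Counter(s.name for s in structures)


def duplicated_names(structures: Iterable[Any]) -> List[str]:
    """Return the sorted list of names used by more than one structure."""
    counts = count_structures_by_name(structures)
    return sorted(name for name, n in counts.items() if n > 1)
```

For names reported by duplicated_names(analysis.struct), iterating over analysis.struct is the way to reach every instance rather than just the first one.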
- __getitem__(interval)
Iterate over all the structures contained in the interval (effective address):
for s in analysis.struct[0x100:]: print("#{:x}: {}".format(analysis.map.a2p(s.address), s.name))
- Parameters:
interval (slice) – effective address interval
- Return type:
iterator over the list of structures (StructAccess)
- __getitem__(name)
return the value of the root field for the first structure named name
ep_rva = analysis.struct['OptionalHeader']['AddressOfEntryPoint']
print(f"EntryPoint: 0x{analysis.map.imagebase + ep_rva:x}")
- Parameters:
name (str) – name of the structure
- Return type:
StructAccess for aggregate fields, the python type for atomic fields
- Raises:
KeyError if no structure can be found
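Because a missing name raises KeyError, scripts that run against many file types may prefer a lookup that falls back to a default. The following is a minimal sketch of such a helper. The name get_path is hypothetical, and it works on any object that supports [] access, including plain dictionaries.

```python
from typing import Any


def get_path(root: Any, *keys: Any, default: Any = None) -> Any:
    """Follow a chain of [] lookups and return `default` if any step fails.

    KeyError covers a missing name or index; TypeError covers the case where
    an earlier step already returned an atomic value that cannot be indexed.
    """
    current = root
    for key in keys:
        try:
            current = current[key]
        except (KeyError, TypeError):
            return default
    return current
```

Usage inside Malcat could look like ep_rva = get_path(analysis.struct, "OptionalHeader", "AddressOfEntryPoint", default=0).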
- __getattr__(name)
return the root field for the first structure named name. Example:
optional_header = analysis.struct.OptionalHeader
- Parameters:
name (str) – name of the structure
- Return type:
StructAccess
- Raises:
KeyError if no structure can be found
- at(name)
return an accessor to the root field for the first structure named name. Useful if name contains non-alphanumerical characters and __getattr__() can’t be used.
- Parameters:
name (str) – name of the structure
- Return type:
StructAccess
- Raises:
KeyError if no structure can be found
- __contains__(name)
return True iff a structure named name exists
if "#US" not in analysis.struct:
    raise ValueError("No user strings!")
- Parameters:
name (str) – name of the structure
- Return type:
bool
- find(ea)
return the root field of the structure which starts at or contains the effective address ea, or None if none can be found.
- Parameters:
ea (int) – effective address for the query
- Return type:
StructAccess or None
- find_forward(ea)
return the structure’s root field which starts at or contains the effective address ea or starts directly after ea, or None if no structure is defined beyond ea.
- Parameters:
ea (int) – effective address for the query
- Return type:
StructAccess or None
- find_backward(ea)
return the root field of the structure which starts at or contains the effective address ea, or of the first one that starts before ea, or None if no structure is defined before ea.
- Parameters:
ea (int) – effective address for the query
- Return type:
StructAccess or None
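The three lookup methods above can be combined to describe where an address falls relative to the identified structures. This is a minimal sketch under the documented semantics (each method returns a field or None). The function name locate and the wording of its results are hypothetical.

```python
from typing import Any


def locate(struct: Any, ea: int) -> str:
    """Describe where effective address `ea` falls relative to known structures."""
    hit = struct.find(ea)
    if hit is not None:
        # ea lies inside a structure: report the displacement from its start
        return f"inside {hit.name} (+0x{ea - hit.address:x})"
    before = struct.find_backward(ea)
    after = struct.find_forward(ea)
    if before is not None and after is not None:
        return f"between {before.name} and {after.name}"
    if before is not None:
        return f"after {before.name}"
    if after is not None:
        return f"before {after.name}"
    return "no structure found"
```

In a script, print(locate(analysis.struct, ea)) gives a quick orientation for an arbitrary effective address.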
- __len__()
return the number of identified structures
if len(analysis.struct) == 0: raise ValueError("No structure found!")
- Return type:
int
User types
- force(ea, type_name)
Force a new type definition at the given effective address. The name of the type should be one that you can find in the user type dialog (cf. Apply a custom type). The method will return false iff a type is already defined at the address. If it succeeds, the operation will be registered in the Undo/redo manager.
address = analysis.v2a(0x401000)
if address not in analysis.struct and analysis.struct.force(address, "windows.STARTUPINFOA"):
    print("New type defined at {}".format(analysis.ppa(address)))
- Parameters:
ea (int) – effective address where the type should be defined (i.e. first byte of the structure)
type_name (str) – the name of a static or dynamic type (cf. Apply a custom type)
- Return type:
bool
Note
This method invalidates several analyses, which need to be rerun. If you Run a script from the user interface, the UI will take care of this for you at the end of the script. But if you Run Malcat from your python interpreter, you will have to call the method malcat.Analysis.run() at some point to take your change into account.
- unforce(ea)
Un-force a custom type definition previously made via force(). The address given should be the same as for the call to force(). The method will return false iff a type was not forced at the address. If it succeeds, the unforce operation will be registered in the Undo/redo manager.
- Parameters:
ea (int) – effective address given to a previous call to force() (start of type definition)
- Return type:
bool
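Since both force() and unforce() report success through their return value, several type definitions can be applied as a unit. The sketch below forces a list of types and undoes the ones already applied as soon as one request fails. The helper name force_all_or_nothing is hypothetical, and it assumes nothing beyond the documented boolean return values.

```python
from typing import Any, Iterable, List, Tuple


def force_all_or_nothing(struct: Any, requests: Iterable[Tuple[int, str]]) -> bool:
    """Force every (ea, type_name) pair; roll back the applied ones if any fails."""
    applied: List[int] = []
    for ea, type_name in requests:
        if struct.force(ea, type_name):
            applied.append(ea)
        else:
            # a type was already defined at `ea`: undo our changes in reverse order
            for done in reversed(applied):
                struct.unforce(done)
            return False
    return True
```

When run outside the user interface, the note above still applies: malcat.Analysis.run() has to be called afterwards for the new types to be taken into account.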
Inspecting fields
The FileStructure object has many methods which give access to a structure’s root field. Fields are Python objects of class StructAccess and have the following methods and attributes:
- class malcat.StructAccess
This class represents a field, which can be the root field of a FileStructure, or a child of an aggregate field, e.g. a record/structure field or an array field.
Attributes
All types of fields have the following attributes and methods:
- value: depends on the field type
the value of the field. For aggregate fields, this returns the field itself (a StructAccess instance), since aggregate fields have no value per se. For atomic fields, the returned type depends on the field type: int, str, datetime, etc.
- name: str
the name of the field. Example:
print(analysis.struct["Directories"][0].name)
>> Directories[0]
print(analysis.struct["Directories"][0].StreamSize.name)
>> StreamSize
- address: int
the effective address of the field.
- offset: int
the physical address of the field. Fields can only be defined on file-backed memory, so they always have a valid physical address.
- size: int
the number of bytes the field takes on disk
- has_enum()
some atomic fields have a fixed set of values. If so, has_enum will be true
- Return type:
bool
- enum: str
the textual representation of the field’s value if the field has an enum defined (i.e. has_enum() is True). If the field is not an enum, the empty string is returned. Example:
print(analysis.struct.PE.Machine.value)
>> 332
print(analysis.struct.PE.Machine.enum)
>> IMAGE_FILE_MACHINE_I386
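The attributes above are enough to build a one-line summary of any field. The sketch below is one possible formatting. The helper name describe is hypothetical, and it relies on the documented behaviour that enum is the empty string when no enum is defined.

```python
from typing import Any


def describe(field: Any) -> str:
    """One-line summary of a field: name, value, enum label (if any), location."""
    # `enum` is documented to be "" when the field has no enum, so truthiness suffices
    label = f" ({field.enum})" if field.enum else ""
    return (f"{field.name} = {field.value!r}{label} "
            f"@ offset 0x{field.offset:x}, {field.size} byte(s)")
```

Inside Malcat this could be called as print(describe(analysis.struct.PE.Machine)).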
Aggregate fields
For structures/records, arrays and bitfields, you have access to the following additional methods:
- count: int
number of sub-fields/members of the aggregate. For atomic fields this attribute is still defined, but it is always 1
- __iter__()
Iterate over all the aggregate’s members, i.e. all the bits of a bitfield, all the rows of an array or all members of a record field
sections = analysis.struct["Sections"]
for s in sections:
    print("{}: #{:x}".format(s["Name"], analysis.map.a2p(s["PointerToRawData"])))
- Returns:
iterator over the list of field members
- Return type:
iterator over StructAccess instances
- Raises:
Error for atomic fields
- __getitem__(interval)
Iterate from the ith to the jth sub-element of the array/record/bitfield
for s in analysis.struct.Sections[1:]: print("#{:x}: {}".format(analysis.map.a2p(s.address), s.name))
- Parameters:
interval (slice) – index interval
- Return type:
iterator over the list of members (StructAccess)
- __getitem__(i)
return the value of the ith aggregate member.
first_section = analysis.struct.Sections[0]
- Parameters:
i (int) – index of the sub-field to retrieve
- Return type:
StructAccess for aggregate fields, the python type for atomic fields
- Raises:
KeyError if the aggregate field has no member at index i
- __getitem__(name)
return the value of the aggregate member named name. Note that this method is not valid for arrays, since array elements don’t have names.
is_executable = analysis.struct.PE.Characteristics['ExecutableImage']
- Parameters:
name (str) – name of the member
- Return type:
StructAccess for aggregate fields, the python type for atomic fields
- Raises:
KeyError if no member named name can be found
- at(name)
return an accessor to the first aggregate member named name. Note that this method is not valid for arrays, since array’s elements don’t have names.
is_executable = analysis.struct.PE.Characteristics.at('ExecutableImage').value
- Parameters:
name (str) – name of the member
- Return type:
StructAccess
- Raises:
KeyError if no member named name can be found
- __getattr__(name)
return an accessor to the first aggregate member named name. Note that this method is not valid for arrays, since array’s elements don’t have names.
is_executable = analysis.struct.PE.Characteristics.ExecutableImage.value
# equivalent
is_executable = analysis.struct.PE.Characteristics.at('ExecutableImage').value
# equivalent
is_executable = analysis.struct.PE.Characteristics['ExecutableImage']
- Parameters:
name (str) – name of the member
- Return type:
StructAccess
- Raises:
KeyError if no member named name can be found
- at(i)
return an accessor to the ith aggregate member
is_executable = analysis.struct.PE.Characteristics.at(1).value
- Parameters:
i (int) – position of the aggregate member to query
- Return type:
StructAccess
- Raises:
KeyError if the aggregate field has no member at index i
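Iteration over aggregate members makes it easy to turn a record or bitfield into plain Python data. The following sketch shows two hypothetical helpers that use only the name and value attributes of each member. It assumes that the members of a bitfield have truthy values when the bit is set, as the ExecutableImage examples suggest.

```python
from typing import Any, Dict, Iterable, List


def record_to_dict(record: Iterable[Any]) -> Dict[str, Any]:
    """Shallow conversion: map each member name to its value.

    Aggregate members are not expanded; their value is the field object itself.
    """
    return {member.name: member.value for member in record}


def set_flags(bitfield: Iterable[Any]) -> List[str]:
    """Return the names of all bitfield members whose value is truthy."""
    return [bit.name for bit in bitfield if bit.value]
```

For instance, set_flags(analysis.struct.PE.Characteristics) would list the names of the characteristics bits that are set.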
Editing
The same way you access an aggregate field’s value, you can also edit it. See the examples below:
import datetime  # needed for the timestamp example below

# all of these lines do the same thing: set the ExecutableImage bit of the PE characteristics bitfield to True:
analysis.struct.PE.Characteristics['ExecutableImage'] = True
analysis.struct.PE.Characteristics.at('ExecutableImage').value = True
analysis.struct.PE.Characteristics.ExecutableImage.value = True
analysis.struct.PE.Characteristics[1] = True
analysis.struct.PE.Characteristics.at(1).value = True
# setting an int value
analysis.struct.MZ.InitialSS.value = 5
# changing a time / date value
analysis.struct["PE"]["TimeDateStamp"] = datetime.datetime.now()
# changing a string value. Make sure that the string is not longer than 8 chars!
analysis.struct["Sections"][0]["Name"] = "newname"
Note
When editing a field, you’ll have to make sure that the new value does not take more space on disk than the old one. Otherwise, an error will be thrown. You’ll also have to make sure to assign the correct Python type and value, like a non-negative int less than 65536 for a UInt16 field.
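The constraints in this note can be checked before assigning, so that a script fails with a clear message instead of a generic error. The sketch below shows two hypothetical helpers. They assume unsigned integer fields and a single-byte string encoding (ASCII by default), and Malcat’s own validation may be stricter or differ in detail.

```python
def fits_unsigned(value: int, size: int) -> bool:
    """True if `value` is representable as an unsigned integer of `size` bytes."""
    if isinstance(value, bool) or not isinstance(value, int):
        return False
    return 0 <= value < (1 << (8 * size))


def fits_string(text: str, size: int, encoding: str = "ascii") -> bool:
    """True if `text`, once encoded, occupies at most `size` bytes."""
    try:
        return len(text.encode(encoding)) <= size
    except UnicodeEncodeError:
        # the text cannot be represented in the assumed encoding at all
        return False
```

A script could then guard an edit with a check such as fits_string("newname", field.size), where field is the accessor of the field about to be modified.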