File structures (analysis.struct)
- analysis.struct: FileStructure
The analysis.struct object is a malcat.FileStructure instance that gives you access to all the structures identified by the File parsers.
Note that in addition to this documentation, you can find usage examples in the sample script which is loaded when you hit F8.
Structures, fields and values
In Malcat, structures are named objects (the names you see in the Structure/text view for instance) and are composed of a single field (aka the root field) which can be:
an atomic field : an integer, a string, a timestamp, etc
a record field : like a typical C structure, a sequence of named fields
an array field : like a typical C array, a sequence of identical (unnamed) fields
a bitfield : a sequence of Bit fields
A record or array can itself be composed of other sub-records or sub-arrays; there is no limit on the depth of fields. Most of the time, a structure’s level-0 field is a record or an array, but it can be an atomic field too in some cases. The root object containing all level-0 fields is the analysis.struct variable of type FileStructure.
Fields are python objects of type StructAccess, which have, among other properties, a value. Atomic fields have a pythonic value, e.g. a datetime.datetime will be used for the value of a Timestamp field, or a str for fields of type String. Records, Arrays and Bitfields have no value per se: accessing their value attribute returns their own StructAccess instance.
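The field model described above can be illustrated with a short recursive walk. This is only a sketch under stated assumptions, not Malcat code: FakeField is a hypothetical stand-in that mimics just the name and value attributes and the iteration behaviour described on this page, so that the snippet runs outside Malcat. The identity test in is_aggregate() (also a hypothetical helper) assumes that an aggregate returns the very same object as its value. If Malcat returns a distinct wrapper, an isinstance check against malcat.StructAccess would be the safer test.

```python
class FakeField:
    """Hypothetical stand-in for a StructAccess-like field (not part of Malcat)."""

    def __init__(self, name, value=None, members=None):
        self.name = name
        self._members = members  # None for atomic fields
        # Aggregates return themselves as their value, as described above.
        self.value = self if members is not None else value

    def __iter__(self):
        if self._members is None:
            raise TypeError("atomic fields cannot be iterated")
        return iter(self._members)


def is_aggregate(field):
    """Assumption: a field is an aggregate when its value is the field itself."""
    return field.value is field


def walk(field, depth=0):
    """Yield (depth, name, value) for every atomic field below `field`."""
    if is_aggregate(field):
        for member in field:
            yield from walk(member, depth + 1)
    else:
        yield depth, field.name, field.value


# A record holding one atomic field and one bitfield with two bits.
header = FakeField("Header", members=[
    FakeField("Magic", 0x5A4D),
    FakeField("Flags", members=[FakeField("Bit0", True), FakeField("Bit1", False)]),
])
leaves = list(walk(header))
for depth, name, value in leaves:
    print("  " * depth + "{} = {!r}".format(name, value))
```

In a Malcat script, the same walk() could presumably be applied to a root field such as analysis.struct.at("OptionalHeader") instead of the stand-in object.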
Accessing structures
- class malcat.FileStructure
This class contains all structures identified by the File parsers. Note that all addresses used in this class are effective addresses. See Addressing in Malcat for more details.
- __iter__()
Iterate over the file’s identified structures
for s in analysis.struct: print("#{:x}: {}".format(analysis.map.a2p(s.address), s.name))
- Return type:
iterator over the list of structures (StructAccess)
- __getitem__(interval)
Iterate over all the structures contained in the interval (effective address):
for s in analysis.struct[0x100:]: print("#{:x}: {}".format(analysis.map.a2p(s.address), s.name))
- Parameters:
interval (slice) – effective address interval
- Return type:
iterator over the list of structures (StructAccess)
- __getitem__(name)
return the value of the root field for the first structure named name
ep_rva = analysis.struct['OptionalHeader']['AddressOfEntryPoint']
print(f"EntryPoint: 0x{analysis.map.imagebase + ep_rva:x}")
- Parameters:
name (str) – name of the structure
- Return type:
StructAccess for aggregate fields, the python type for atomic fields
- Raises:
KeyError if no structure can be found
- __getattr__(name)
return the root field for the first structure named name. Example:
optional_header = analysis.struct.OptionalHeader
- Parameters:
name (str) – name of the structure
- Return type:
StructAccess
- Raises:
KeyError if no structure can be found
- at(name)
return the root field for the first structure named name. Useful if name contains non-alphanumerical characters and __getattr__() can’t be used.
- Parameters:
name (str) – name of the structure
- Return type:
StructAccess
- Raises:
KeyError if no structure can be found
- __contains__(name)
return True iff a structure named name exists
if "#US" not in analysis.struct:
    raise ValueError("No user strings!")
- Parameters:
name (str) – name of the structure
- Return type:
bool
- find(ea)
return the structure’s root field which starts at or contains the effective address ea, or None if none can be found.
- Parameters:
ea (int) – effective address for the query
- Return type:
StructAccess or None
- find_forward(ea)
return the structure’s root field which starts at or contains the effective address ea or, failing that, the first one that starts after ea. Returns None if no structure is defined beyond ea.
- Parameters:
ea (int) – effective address for the query
- Return type:
StructAccess or None
- find_backward(ea)
return the structure’s root field which starts at or contains the effective address ea or, failing that, the first one that starts before ea. Returns None if no structure is defined before ea.
- Parameters:
ea (int) – effective address for the query
- Return type:
StructAccess or None
- __len__()
return the number of identified structures
if len(analysis.struct) == 0: raise ValueError("No structure found!")
- Return type:
int
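The lookup methods above raise KeyError when a structure is missing, so scripts that handle several file formats often want a lookup that returns None instead. The sketch below combines the documented membership test and at() call into one helper. root_field_or_none is a hypothetical helper name, and FakeStructure is a hypothetical dict-backed stand-in for analysis.struct so that the snippet runs outside Malcat.

```python
def root_field_or_none(struct, name):
    """Return the root field of the first structure called `name`, or None."""
    # `name in struct` and `struct.at(name)` are the documented calls.
    # at() is used instead of attribute access so names like "#US" work.
    if name in struct:
        return struct.at(name)
    return None


class FakeStructure:
    """Hypothetical stand-in for analysis.struct, backed by a plain dict."""

    def __init__(self, roots):
        self._roots = dict(roots)

    def __contains__(self, name):
        return name in self._roots

    def __len__(self):
        return len(self._roots)

    def at(self, name):
        return self._roots[name]  # raises KeyError, like the documented at()


fake = FakeStructure({"#US": "user-strings-root", "PE": "pe-root"})
found = root_field_or_none(fake, "#US")
missing = root_field_or_none(fake, "#Blob")
print(found, missing, len(fake))
```

Inside Malcat, the call would presumably read root_field_or_none(analysis.struct, "#US").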
User types
- force(ea, type_name)
Force a new type definition at the given effective address. The name of the type should be one that you can find in the user type dialog (cf. Apply a custom type). The method will return false iff a type is already defined at the address. If it succeeds, the operation will be registered in the Undo/redo manager.
address = analysis.v2a(0x401000)
if address not in analysis.struct and analysis.struct.force(address, "windows.STARTUPINFOA"):
    print("New type defined at {}".format(analysis.ppa(address)))
- Parameters:
ea (int) – effective address where the type should be defined (i.e. first byte of the structure)
type_name (str) – the name of a static or dynamic type (cf. Apply a custom type)
- Return type:
bool
Note
This method invalidates several analyses, which need to be rerun. If you Run a script from the user interface, the UI will take care of this for you at the end of the script. But if you Run Malcat from your python interpreter, you will have to call the method malcat.Analysis.run() at some point to take your change into account.
- unforce(ea)
Un-force a custom type definition previously made via force(). The address given should be the same as for the call to force(). The method will return false iff a type was not forced at the address. If it succeeds, the unforce operation will be registered in the Undo/redo manager.
- Parameters:
ea (int) – effective address given to a previous call to force() (start of type definition)
- Return type:
bool
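Because force() and unforce() are meant to be called with the same address, pairing them in a context manager is one way to apply a type only temporarily. The following is a sketch rather than a Malcat feature: temporary_type is a hypothetical helper, and FakeStructure is a hypothetical stand-in that only mimics the documented return values of force() and unforce().

```python
from contextlib import contextmanager


@contextmanager
def temporary_type(struct, ea, type_name):
    """Force `type_name` at `ea` for the duration of a with-block."""
    forced = struct.force(ea, type_name)  # False if a type already exists there
    try:
        yield forced
    finally:
        if forced:
            struct.unforce(ea)  # same address as the force() call


class FakeStructure:
    """Hypothetical stand-in recording force()/unforce() calls."""

    def __init__(self):
        self.forced = {}

    def force(self, ea, type_name):
        if ea in self.forced:
            return False
        self.forced[ea] = type_name
        return True

    def unforce(self, ea):
        return self.forced.pop(ea, None) is not None


fake = FakeStructure()
with temporary_type(fake, 0x1000, "windows.STARTUPINFOA") as ok:
    inside = (ok, dict(fake.forced))
after = dict(fake.forced)
print(inside, after)
```

When driving Malcat from a python interpreter rather than from the UI, the note above still applies: malcat.Analysis.run() has to be called after forcing the type for the new structure to become visible.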
Inspecting fields
The FileStructure object has many methods which give access to a structure’s root field. Fields are python objects of class StructAccess and have the following methods and attributes:
- class malcat.StructAccess
This class represents a field, which can be the root field of a structure in the FileStructure, or a child of an aggregate field, e.g. a record field or an array field.
Attributes
All types of fields have the following attributes and methods:
- value: depends on the field type
the value of the field. For aggregate fields, this would return itself (aka a StructAccess instance) since aggregate fields have no value per se. For atomic fields, the returned type depends on the field type: int, str, datetime, etc.
- name: str
the name of the field. Example:
print(analysis.struct["Directories"][0].name)
>> Directories[0]
print(analysis.struct["Directories"][0].StreamSize.name)
>> StreamSize
- address: int
the effective address of the field.
- offset: int
the physical address of the field. Fields can only be defined on file-backed memory, so they always have a valid physical address.
- size: int
the number of bytes the field takes on disk
- has_enum()
some atomic fields have a fixed set of values. If so, has_enum() returns True
- Return type:
bool
- enum: str
the textual representation of the field’s value if the field has an enum defined (i.e. has_enum() is True). Example:
print(analysis.struct.PE.Machine.value)
>> 332
print(analysis.struct.PE.Machine.enum)
>> IMAGE_FILE_MACHINE_I386
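A common use of has_enum() and enum is to print a readable label whenever one exists and fall back to the raw value otherwise. The sketch below shows that pattern. display_value is a hypothetical helper, and FakeField is a hypothetical stand-in exposing only the value, enum and has_enum() members documented above. What enum contains when has_enum() is False is not specified here, so the helper never reads it in that case.

```python
def display_value(field):
    """Prefer the enum label when the field defines one, else the raw value."""
    if field.has_enum():
        return "{} ({})".format(field.enum, field.value)
    return str(field.value)


class FakeField:
    """Hypothetical stand-in exposing value, enum and has_enum() only."""

    def __init__(self, value, enum=None):
        self.value = value
        self.enum = enum

    def has_enum(self):
        return self.enum is not None


machine = display_value(FakeField(332, "IMAGE_FILE_MACHINE_I386"))
plain = display_value(FakeField(5))
print(machine)
print(plain)
```

In a Malcat script, display_value(analysis.struct.PE.Machine) would be the presumed equivalent call.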
Aggregate fields
For records, arrays and bitfields, you have access to the following additional methods:
- count: int
number of sub-fields/members of the aggregate. For atomic fields this attribute is still defined, but it is always 1
- __iter__()
Iterate over all the aggregate’s members, i.e. all the bits of a bitfield, all the rows of an array or all members of a record field
sections = analysis.struct["Sections"]
for s in sections:
    print("{}: #{:x}".format(s["Name"], analysis.map.a2p(s["PointerToRawData"])))
- Returns:
iterator over the list of field members
- Return type:
iterator over StructAccess instances
- Raises:
Error for atomic fields
- __getitem__(interval)
Iterate over the sub-elements of the array/record/bitfield, from the ith to the jth
for s in analysis.struct.Sections[1:]: print("#{:x}: {}".format(analysis.map.a2p(s.address), s.name))
- Parameters:
interval (slice) – index interval
- Return type:
iterator over the list of members (StructAccess)
- __getitem__(name)
return the value of the aggregate member named name. Note that this method is not valid for arrays, since array elements don’t have names.
is_executable = analysis.struct.PE.Characteristics['ExecutableImage']
- Parameters:
name (str) – name of the member
- Return type:
StructAccess for aggregate fields, the python type for atomic fields
- Raises:
KeyError if no member named name can be found
- at(name)
return the first aggregate member named name. Note that this method is not valid for arrays, since array’s elements don’t have names.
is_executable = analysis.struct.PE.Characteristics.at('ExecutableImage').value
- Parameters:
name (str) – name of the member
- Return type:
StructAccess
- Raises:
KeyError if no member named name can be found
- __getattr__(name)
return the first aggregate member named name. Note that this method is not valid for arrays, since array’s elements don’t have names.
is_executable = analysis.struct.PE.Characteristics.ExecutableImage.value
# equivalent
is_executable = analysis.struct.PE.Characteristics.at('ExecutableImage').value
# equivalent
is_executable = analysis.struct.PE.Characteristics['ExecutableImage']
- Parameters:
name (str) – name of the member
- Return type:
StructAccess
- Raises:
KeyError if no member named name can be found
- __getitem__(i)
return the value of the ith aggregate member of the field
is_executable = analysis.struct.PE.Characteristics[1]
- Parameters:
i (int) – position of the aggregate member to query
- Return type:
StructAccess for aggregate fields, the python type for atomic fields
- Raises:
KeyError if the aggregate field has no member at position i
- at(i)
return the ith aggregate member
is_executable = analysis.struct.PE.Characteristics.at(1).value
- Parameters:
i (int) – position of the aggregate member to query
- Return type:
StructAccess
- Raises:
KeyError if the aggregate field has no member at position i
Editing
The same way you access an aggregate field’s value, you can also edit it. See the examples below:
# all of these lines do the same thing: set the ExecutableImage bit of the PE characteristics bitfield to True:
analysis.struct.PE.Characteristics['ExecutableImage'] = True
analysis.struct.PE.Characteristics.at('ExecutableImage').value = True
analysis.struct.PE.Characteristics.ExecutableImage.value = True
analysis.struct.PE.Characteristics[1] = True
analysis.struct.PE.Characteristics.at(1).value = True
# setting an int value
analysis.struct.MZ.InitialSS.value = 5
# changing a time / date value (needs "import datetime" at the top of the script)
analysis.struct["PE"]["TimeDateStamp"] = datetime.datetime.now()
# changing a string value. Make sure that the string is not bigger than 8 chars!
analysis.struct["Sections"][0]["Name"] = "newname"
Note
When editing a field, you’ll have to make sure that the new value does not take more space on disk than the old one. Otherwise, an error will be thrown. You’ll also have to make sure to assign the correct python type and value, like a non-negative int less than 65536 for a UInt16 field.
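The constraints in the note above can be checked before assigning, so that a script fails with a clear message instead of relying on the error raised during the edit. The two helpers below are a sketch in plain Python: check_uint and check_string are hypothetical names, and the ascii encoding is an assumption, since the encoding Malcat uses for a given String field is not stated here.

```python
def check_uint(value, bits):
    """Raise if `value` is not an int that fits in `bits` unsigned bits."""
    # bool is a subclass of int in python, so it is rejected explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError("expected an int, got {}".format(type(value).__name__))
    if not 0 <= value < (1 << bits):
        raise ValueError("{} does not fit in UInt{}".format(value, bits))
    return value


def check_string(text, max_size, encoding="ascii"):
    """Raise if the encoded string would need more than `max_size` bytes."""
    encoded = text.encode(encoding)
    if len(encoded) > max_size:
        raise ValueError("{!r} needs {} bytes, field holds {}".format(
            text, len(encoded), max_size))
    return text


print(check_uint(5, 16))
print(check_string("newname", 8))
```

Both helpers return their argument, so they can wrap the right-hand side of an assignment, e.g. analysis.struct["Sections"][0]["Name"] = check_string("newname", 8). Passing the field's size attribute as max_size avoids hard-coding the limit.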