This module converts information represented as binary data (Python bytes objects) into Python objects.
There are three components to the data typing system in LibForensics:
- Describing the type and structure of information (i.e. a ctypes wrapper)
- Translating a bytes object into Python objects (Data Access Layer or DAL)
- A standard protocol for using the ctypes wrappers and the DAL.
The ctypes module is used to extract various data types (e.g. integers, floating point values, structures, etc.) from a stream of bytes. To facilitate this, LibForensics provides a (restricted) wrapper for the ctypes module.
LibForensics supports several primitive data types, including 8/16/32/64 bit signed and unsigned integers, 32/64 bit floating point values, records (a.k.a. structs, data structures), and arrays.
Records (structures) are a data type that is composed of one or more primitive data types. The description of a record is a class, where the class attributes represent the fields of the record. At runtime, a metaclass is used to translate a Record class into a ctypes object. The ctypes object is accessible as the _ctype_ attribute.
The following snippet shows how to create a record with three fields:
>>> from lf.dtypes import LERecord, uint8, int16, uint32
>>> class SomeStruct(LERecord):
... field1 = uint8
... field2 = uint32
... field3 = int16
...
>>> ctypes_obj = SomeStruct._ctype_
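The class-to-ctypes translation described above can be sketched with a small metaclass. This is only an illustration built on the standard ctypes module, not the LibForensics implementation: RecordMeta and SketchRecord are hypothetical names, and the real wrapper may handle packing and field types differently.

```python
import ctypes


class RecordMeta(type):
    """Hypothetical metaclass: gathers ctypes-typed class attributes into a Structure."""

    def __new__(mcls, name, bases, namespace):
        cls = super().__new__(mcls, name, bases, namespace)
        # Class bodies preserve definition order, so fields keep their declared order.
        fields = [
            (attr, value)
            for attr, value in namespace.items()
            if isinstance(value, type) and issubclass(value, ctypes._SimpleCData)
        ]

        class _Ctype(ctypes.LittleEndianStructure):
            _fields_ = fields

        # Expose the generated ctypes class the same way the text describes.
        cls._ctype_ = _Ctype
        return cls


class SketchRecord(metaclass=RecordMeta):
    field1 = ctypes.c_uint8
    field2 = ctypes.c_uint32
    field3 = ctypes.c_int16


ctypes_obj = SketchRecord._ctype_
field_names = [name for name, _ in ctypes_obj._fields_]
```

Here field_names lists the three fields in the order they were declared in the class body.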
Primitive data types are data types that can be used by themselves, or grouped together to create Composite data types.
LibForensics provides 2 types of Primitive data types:
Basic data types are “basic building blocks”. They can be used by themselves, or can be grouped together to create Composite data types. Basic data types however are not further decomposable.
LibForensics provides 3 types of Basic data types:
Composite data types are data types composed of one or more Primitive data types. This includes Basic data types, as well as other Composite data types.
LibForensics provides 2 types of Composite data types:
- Record (a.k.a. structs, data structures) – Composite data types where the elements do not need to be the same type.
- Arrays – Composite data types where the elements are identical types.
One of the design goals of the ctypes wrapper provided by LibForensics is to have data types that are not data-dependent. This means that regardless of the value of the bytes used, extracting Python values with data types (created with the ctypes wrapper) should not fail.
The advantage of this design is that any pattern of 1s and 0s can be used. This is useful in several situations, such as (blindly) looking for a particular data structure in unallocated space or slack space, or when little other structural information is available. Additionally, if a data structure has been partially overwritten (e.g. it was in a file that was deleted, and then part of the file was reallocated and overwritten), NULL bytes can be used to make up the missing part of the data structure.
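The NULL-byte padding idea can be sketched with plain ctypes. Header and parse_padded are hypothetical names used only for this illustration; they are not part of LibForensics.

```python
import ctypes


class Header(ctypes.LittleEndianStructure):
    # Hypothetical 8-byte record: two 16-bit fields followed by one 32-bit field.
    _fields_ = [
        ("magic", ctypes.c_uint16),
        ("version", ctypes.c_uint16),
        ("length", ctypes.c_uint32),
    ]


def parse_padded(ctype, data):
    """Parse data as ctype, filling any missing trailing bytes with NULLs."""
    size = ctypes.sizeof(ctype)
    padded = data[:size].ljust(size, b"\x00")
    return ctype.from_buffer_copy(padded)


# Only 5 of the 8 bytes survived; the remainder is padded with zeros.
partial = b"\x34\x12\x02\x00\x78"
header = parse_padded(Header, partial)
```

Because none of the fields depends on its value, parsing succeeds for any input, and the padded bytes simply read as zero.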
The disadvantage of this approach is that the ctypes wrapper does not support any data type whose use depends on its value. A common example is a pointer data type. The value of a pointer is a location (address), and an invalid value can cause problems for automated solutions. For example:
>>> from ctypes import Structure, c_int8, POINTER
>>> class Struct(Structure):
... _fields_ = [("field1", POINTER(c_int8))]
...
>>> values = Struct.from_buffer_copy(b"\x01\x00\x00\x00\x00\x00\x00\x00")
>>> values.field1.contents
Segmentation fault
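For contrast, the same eight bytes can be read as a plain integer, which never dereferences the address. This is a sketch of one possible workaround using standard ctypes; the text above does not say that LibForensics represents pointers this way, and SafeStruct is a hypothetical name.

```python
import ctypes


class SafeStruct(ctypes.LittleEndianStructure):
    # The address is stored as a 64-bit integer rather than POINTER(c_int8).
    _fields_ = [("field1", ctypes.c_uint64)]


values = SafeStruct.from_buffer_copy(b"\x01\x00\x00\x00\x00\x00\x00\x00")
address = values.field1  # just a number; nothing is dereferenced
```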
Depending on the situation, there are several different options for dealing with invalid data.
The DAL is built around two concepts: Value objects and Entities (a.k.a. reference objects). The primary difference between a value object and an entity is how equality is determined. Value objects are considered equal if their values (or the values of their attributes) are equal. Entity objects are considered equal if their identities (memory addresses, unique identifiers, etc.) are equal.
In LibForensics, value objects are subclasses of the Structuple class (usually ActiveStructuple). Entities, however, are regular user-defined classes.
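The difference in equality semantics can be shown with standard Python only. In this sketch a namedtuple stands in for a value object and a plain class stands in for an entity; Timestamp and FileEntry are hypothetical names, not LibForensics classes.

```python
from collections import namedtuple

# Value object: equality compares the field values.
Timestamp = namedtuple("Timestamp", ["seconds", "nanoseconds"])


class FileEntry:
    """Entity: the default equality compares identity, not attribute values."""

    def __init__(self, name):
        self.name = name


same_value = Timestamp(10, 500) == Timestamp(10, 500)
same_entity = FileEntry("a.txt") == FileEntry("a.txt")
```

Two Timestamp objects with the same contents compare equal, while two FileEntry objects with the same name do not, because they are distinct objects.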
Some data types have equivalents in the Python standard library. For instance, a 64-bit FILETIME timestamp can be represented by a datetime.datetime object. Converter classes fill the role of converting from bytes to a standard Python object. Converter classes are subclasses of the Converter class.
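As an illustration of what such a conversion involves, the sketch below turns eight little-endian bytes into a datetime.datetime using only the standard library. A FILETIME counts 100-nanosecond intervals since 1601-01-01 UTC. The function name filetime_to_datetime and the choice of a timezone-aware result are assumptions made for this example, not the LibForensics Converter API.

```python
import struct
from datetime import datetime, timedelta, timezone

# FILETIME values count 100-nanosecond ticks from this instant.
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_datetime(raw):
    """Convert 8 little-endian bytes holding a FILETIME into a datetime."""
    (ticks,) = struct.unpack("<Q", raw[:8])
    # datetime resolves microseconds, so the final 100 ns digit is dropped.
    return FILETIME_EPOCH + timedelta(microseconds=ticks // 10)


# 116444736000000000 ticks separate 1601-01-01 from 1970-01-01.
unix_epoch = filetime_to_datetime(struct.pack("<Q", 116444736000000000))
```

Note that this kind of conversion is data-dependent: very large tick counts fall outside the range datetime can represent, so the addition can raise an exception for some inputs.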
To reduce the learning curve for the many data structures (Record data types) used throughout LibForensics, there is a standard approach to naming, locating, and using the ctypes wrappers and the DAL. The rules are:
- The definitions of the data types are placed in a file called dtypes.py.
- A convenience module called ctypes.py contains the _ctype_ attribute from each class defined in the dtypes.py file.
- Classes that represent value objects are descendants of the Structuple class.
- Classes that represent entities are regular user-defined classes (i.e. they are not descendants of Structuple).
- Classes that are used to translate a bytes object to a standard Python object inherit from the Converter class.
- Value objects, entities, and converters are placed in a file called objects.py.
These are data types that are the “basic building blocks”. They can be used to compose other data types, but are not themselves composed of other data types.
These are data types that have native support in the ctypes module.
LibForensics provides support for bit-oriented data types using the BitType, bits, and bit classes.
Represents one or more bits.
Parameter: | size (int) – The number of bits in the data type. |
---|
A container class for bits. This class is used to allow bits to be used as a Primitive class.
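To illustrate what a bit-oriented data type extracts from a value, here is a sketch using plain shifts and masks. The helper extract_bits is hypothetical, and counting bit positions from the least significant bit is an assumption of this example rather than a documented property of BitType, bits, or bit.

```python
def extract_bits(value, offset, size):
    """Return `size` bits of `value`, starting `offset` bits above the least significant bit."""
    mask = (1 << size) - 1
    return (value >> offset) & mask


flags = 0b10110100
low_two = extract_bits(flags, 0, 2)       # bits 0-1
middle_three = extract_bits(flags, 2, 3)  # bits 2-4
top_bit = extract_bits(flags, 7, 1)       # bit 7
```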
The following BitType subclasses can be used as Primitive data types.
Composite data types are data types that are composed of one or more data types. LibForensics supports two types of composite data types: arrays and records (a.k.a. data structures, structs, tuples, etc.).
Arrays are an arrangement of multiple copies of a single data type. In the LibForensics data typing system, arrays are represented by lists. The first element of the list is the data type of the elements of the array. The size of the list denotes the number of elements in the array.
For example, a data structure with a field that holds ten unsigned 8-bit integers:
>>> from lf.dtypes import LERecord, uint8
>>> class SomeStruct(LERecord):
... field1 = [uint8] * 10
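For comparison, this is how an equivalent array field looks in plain ctypes, where an array type is written as the element type multiplied by a length. It is a sketch of the underlying ctypes idiom, not of what the LibForensics metaclass necessarily generates; ArrayStruct is a hypothetical name.

```python
import ctypes


class ArrayStruct(ctypes.LittleEndianStructure):
    # Ten consecutive unsigned 8-bit integers.
    _fields_ = [("field1", ctypes.c_uint8 * 10)]


parsed = ArrayStruct.from_buffer_copy(bytes(range(10)))
elements = list(parsed.field1)
```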
Records are a data structure similar to arrays, except that the elements (called fields) of the record do not all have to be the same data type. Records are represented by the Record class, usually through a subclass.
Base class for data types that can be composed of other data types. Since this is a Primitive class, subclasses can be used both to compose data types and to be composed of other classes.
Fields are implemented as class attributes. For instance:
>>> from lf.dtypes import LERecord, int8, uint8
>>> class SomeStruct(LERecord):
... field1 = int8
... field2 = uint8
...
>>>
This creates a class called SomeStruct, with two fields called field1 and field2.
Composite objects can also inherit from each other, adding the new fields to the old ones. Continuing the previous example:
>>> class AnotherStruct(SomeStruct):
... field3 = uint8
...
>>>
This creates a class called AnotherStruct, with three fields called field1, field2, and field3.
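Plain ctypes structures follow the same pattern: a subclass that defines _fields_ gets the base class fields first, followed by its own. The sketch below shows this with the standard library only; BaseStruct and DerivedStruct are hypothetical stand-ins for the two record classes above.

```python
import ctypes


class BaseStruct(ctypes.LittleEndianStructure):
    _fields_ = [("field1", ctypes.c_int8), ("field2", ctypes.c_uint8)]


class DerivedStruct(BaseStruct):
    # Only the new field is listed; the base class fields come first.
    _fields_ = [("field3", ctypes.c_uint8)]


parsed = DerivedStruct.from_buffer_copy(b"\xff\x01\x02")
```

The first byte is read as a signed 8-bit value, so parsed.field1 is negative, while the other two fields are unsigned.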
The DAL provides three types of functionality: Structuple, Converter, and Reader classes.
Factory function to create new Structuple classes.
Parameters: |
|
---|---|
Return type: | |
Returns: | The newly created class. |
Base class for creating tuples with named attribute access.
Parameter: | iterable – An optional iterator (or object that supports iteration) to provide initial values. |
---|
Note
When inheriting from this class (or a subclass) the _fields_ attribute in subclasses appends to the _fields_ attribute of the parent(s). This means that subclasses will have all of the fields of the parent(s).
The caveat is that if a subclass defines a field or alias that is already defined in the parent class, then the field is kept in the position specified by the subclass.
For example:
>>> from lf.dtypes import Structuple
>>> class ParentClass(Structuple):
... _fields_ = ("field0", "field1", "field2")
...
>>> class SubClass1(ParentClass):
... _fields_ = ("field3", "field4", "field5")
...
>>> class SubClass2(ParentClass):
... _fields_ = ("field6", "field1", "field7")
...
>>> ParentClass._fields_
('field0', 'field1', 'field2')
>>> SubClass1._fields_
('field0', 'field1', 'field2', 'field3', 'field4', 'field5')
>>> SubClass2._fields_
('field0', 'field2', 'field6', 'field1', 'field7')
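The merging rule shown above can be expressed as a small standalone function. This is a sketch that reproduces the documented results, not the Structuple implementation; merge_fields is a hypothetical helper.

```python
def merge_fields(parent_fields, child_fields):
    """Combine field names: parent fields first, except those the child redefines."""
    redefined = set(child_fields)
    kept = tuple(name for name in parent_fields if name not in redefined)
    return kept + tuple(child_fields)


parent = ("field0", "field1", "field2")
sub1 = merge_fields(parent, ("field3", "field4", "field5"))
sub2 = merge_fields(parent, ("field6", "field1", "field7"))
```

A field that the subclass redefines is dropped from the parent portion and kept where the subclass placed it, which is why field1 moves in the second case.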
Base class for value objects.
Parameter: | iterable – An optional iterator (or object that supports iteration) to provide initial values. |
---|
Creates an ActiveStructuple from a bytes object.
Note
This method is available if _takes_stream is True.
Parameter: | bytes (bytes) – A bytes object to read from. |
---|---|
Return type: | ActiveStructuple |
Returns: | The corresponding ActiveStructuple |
Creates an ActiveStructuple from an IStream object.
Note
This method is available if _takes_stream is True.
Parameters: |
|
---|---|
Return type: | |
Returns: | The corresponding ActiveStructuple |
Creates an ActiveStructuple from a ctypes object.
Note
This method is available if _takes_ctype is True.
Parameter: | ctype (ctypes._CData) – A ctypes object that describes the values of the attributes. |
---|---|
Return type: | ActiveStructuple |
Returns: | The corresponding ActiveStructuple. |
An ActiveStructuple that is a wrapper around a ctypes object. This class provides from_stream(), from_bytes(), and from_ctype() methods.
The way this class is designed, from_stream() and from_bytes() depend on from_ctype(). Therefore, just overriding from_ctype() in a subclass will affect from_stream() and from_bytes().
Parameter: | bytes (bytes) – A bytes object to read from. |
---|---|
Return type: | CtypesWrapper |
Returns: | The corresponding CtypesWrapper class. |
Parameters: |
|
---|---|
Return type: | |
Returns: | The corresponding CtypesWrapper object. |
Parameter: | ctype (ctypes._CData) – A ctypes object that describes the values of the attributes. |
---|---|
Return type: | CtypesWrapper |
Returns: | The corresponding CtypesWrapper. |
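The delegation described for this class, where from_stream() and from_bytes() both end up in from_ctype(), can be sketched as follows. Everything here is an assumption made for illustration: PairCtype, PairWrapper, and SwappedPair are hypothetical, a tuple subclass stands in for ActiveStructuple, and io.BytesIO stands in for an IStream.

```python
import ctypes
import io


class PairCtype(ctypes.LittleEndianStructure):
    _fields_ = [("low", ctypes.c_uint8), ("high", ctypes.c_uint8)]


class PairWrapper(tuple):
    """Hypothetical wrapper: every constructor funnels through from_ctype()."""

    _ctype_ = PairCtype

    @classmethod
    def from_ctype(cls, ctype):
        return cls((ctype.low, ctype.high))

    @classmethod
    def from_bytes(cls, data):
        return cls.from_ctype(cls._ctype_.from_buffer_copy(data))

    @classmethod
    def from_stream(cls, stream, offset=0):
        stream.seek(offset)
        return cls.from_bytes(stream.read(ctypes.sizeof(cls._ctype_)))


class SwappedPair(PairWrapper):
    @classmethod
    def from_ctype(cls, ctype):
        # Overriding this one method changes all three constructors.
        return cls((ctype.high, ctype.low))


plain = PairWrapper.from_bytes(b"\x01\x02")
swapped = SwappedPair.from_stream(io.BytesIO(b"\x00\x01\x02"), offset=1)
```

Only from_ctype() is overridden in SwappedPair, yet the result of from_stream() changes as well.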
Base class to convert data into a native Python object.
Creates a Python object from a bytes object.
Note
This method is available if _takes_stream is True.
Parameter: | bytes (bytes) – A bytes object to read from. |
---|---|
Return type: | object |
Returns: | The corresponding Python object. |
Creates a Python object from an IStream object.
Note
This method is available if _takes_stream is True.
Parameters: |
|
---|---|
Return type: | object |
Returns: | The corresponding Python object. |
Creates a Python object from a ctypes object.
Note
This method is available if _takes_ctype is True.
Parameter: | ctype (ctypes._CData) – A ctypes object that describes the values of the attributes. |
---|---|
Return type: | object |
Returns: | The corresponding Python object. |
Reader classes and objects read BuiltIn data types from streams. This type of operation occurs fairly often.
Convenience class to read BuiltIn data types from a stream.
Reads a signed 8-bit integer from a stream.
Parameters: |
|
---|---|
Raises ValueError: | |
if stream (starting at offset) is too small. |
|
Return type: | int |
Returns: | The corresponding value. |
Reads an unsigned 8-bit integer from a stream.
Parameters: |
|
---|---|
Raises ValueError: | |
if stream (starting at offset) is too small. |
|
Return type: | int |
Returns: | The corresponding value. |
Reads a signed 16-bit integer (little endian) from a stream.
Parameters: |
|
---|---|
Raises ValueError: | |
if stream (starting at offset) is too small. |
|
Return type: | int |
Returns: | The corresponding value. |
Reads an unsigned 16-bit integer (little endian) from a stream.
Parameters: |
|
---|---|
Raises ValueError: | |
if stream (starting at offset) is too small. |
|
Return type: | int |
Returns: | The corresponding value. |
Reads a signed 16-bit integer (big endian) from a stream.
Parameters: |
|
---|---|
Raises ValueError: | |
if stream (starting at offset) is too small. |
|
Return type: | int |
Returns: | The corresponding value. |
Reads an unsigned 16-bit integer (big endian) from a stream.
Parameters: |
|
---|---|
Raises ValueError: | |
if stream (starting at offset) is too small. |
|
Return type: | int |
Returns: | The corresponding value. |
Reads a signed 32-bit integer (little endian) from a stream.
Parameters: |
|
---|---|
Raises ValueError: | |
if stream (starting at offset) is too small. |
|
Return type: | int |
Returns: | The corresponding value. |
Reads an unsigned 32-bit integer (little endian) from a stream.
Parameters: |
|
---|---|
Raises ValueError: | |
if stream (starting at offset) is too small. |
|
Return type: | int |
Returns: | The corresponding value. |
Reads a signed 32-bit integer (big endian) from a stream.
Parameters: |
|
---|---|
Raises ValueError: | |
if stream (starting at offset) is too small. |
|
Return type: | int |
Returns: | The corresponding value. |
Reads an unsigned 32-bit integer (big endian) from a stream.
Parameters: |
|
---|---|
Raises ValueError: | |
if stream (starting at offset) is too small. |
|
Return type: | int |
Returns: | The corresponding value. |
Reads a signed 64-bit integer (little endian) from a stream.
Parameters: |
|
---|---|
Raises ValueError: | |
if stream (starting at offset) is too small. |
|
Return type: | int |
Returns: | The corresponding value. |
Reads an unsigned 64-bit integer (little endian) from a stream.
Parameters: |
|
---|---|
Raises ValueError: | |
if stream (starting at offset) is too small. |
|
Return type: | int |
Returns: | The corresponding value. |
Reads a signed 64-bit integer (big endian) from a stream.
Parameters: |
|
---|---|
Raises ValueError: | |
if stream (starting at offset) is too small. |
|
Return type: | int |
Returns: | The corresponding value. |
Reads an unsigned 64-bit integer (big endian) from a stream.
Parameters: |
|
---|---|
Raises ValueError: | |
if stream (starting at offset) is too small. |
|
Return type: | int |
Returns: | The corresponding value. |
Reads a 32-bit floating point number (little endian) from a stream.
Parameters: |
|
---|---|
Raises ValueError: | |
if stream (starting at offset) is too small. |
|
Return type: | float |
Returns: | The corresponding value. |
Reads a 32-bit floating point number (big endian) from a stream.
Parameters: |
|
---|---|
Raises ValueError: | |
if stream (starting at offset) is too small. |
|
Return type: | float |
Returns: | The corresponding value. |
Reads a 64-bit floating point number (little endian) from a stream.
Parameters: |
|
---|---|
Raises ValueError: | |
if stream (starting at offset) is too small. |
|
Return type: | float |
Returns: | The corresponding value. |
Reads a 64-bit floating point number (big endian) from a stream.
Parameters: |
|
---|---|
Raises ValueError: | |
if stream (starting at offset) is too small. |
|
Return type: | float |
Returns: | The corresponding value. |
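The kind of read these methods perform can be sketched with the standard struct module. This is not the Reader API: the function name read_value, the kind strings, the offset default of 0, and the use of io.BytesIO in place of an IStream are all assumptions made for this example.

```python
import io
import struct

# struct format codes for a few of the data types listed above.
FORMATS = {
    "int8": "<b",
    "uint16_le": "<H",
    "uint16_be": ">H",
    "float64_le": "<d",
}


def read_value(stream, kind, offset=0):
    """Read one value of the given kind from a seekable binary stream."""
    fmt = FORMATS[kind]
    size = struct.calcsize(fmt)
    stream.seek(offset)
    data = stream.read(size)
    if len(data) < size:
        raise ValueError("stream (starting at offset) is too small")
    return struct.unpack(fmt, data)[0]


stream = io.BytesIO(b"\xff\x34\x12")
signed_byte = read_value(stream, "int8")
little = read_value(stream, "uint16_le", offset=1)
big = read_value(stream, "uint16_be", offset=1)
```

The same two bytes give different integers depending on the byte order, and a read that runs past the end of the stream raises ValueError, mirroring the behaviour documented above.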
A Reader that is bound to an IStream.
Parameter: | stream (IStream) – A stream that contains the values to read. |
---|
Reads a signed 8-bit integer.
Parameter: | offset (int or None) – The start of the integer. |
---|---|
Raises ValueError: | |
if stream (starting at offset) is too small. | |
Return type: | int |
Returns: | The corresponding value. |
Reads an unsigned 8-bit integer.
Parameter: | offset (int or None) – The start of the integer. |
---|---|
Raises ValueError: | |
if stream (starting at offset) is too small. | |
Return type: | int |
Returns: | The corresponding value. |
Reads a signed 16-bit integer (little endian).
Parameter: | offset (int or None) – The start of the integer. |
---|---|
Raises ValueError: | |
if stream (starting at offset) is too small. | |
Return type: | int |
Returns: | The corresponding value. |
Reads an unsigned 16-bit integer (little endian).
Parameter: | offset (int or None) – The start of the integer. |
---|---|
Raises ValueError: | |
if stream (starting at offset) is too small. | |
Return type: | int |
Returns: | The corresponding value. |
Reads a signed 16-bit integer (big endian).
Parameter: | offset (int or None) – The start of the integer. |
---|---|
Raises ValueError: | |
if stream (starting at offset) is too small. | |
Return type: | int |
Returns: | The corresponding value. |
Reads an unsigned 16-bit integer (big endian).
Parameter: | offset (int or None) – The start of the integer. |
---|---|
Raises ValueError: | |
if stream (starting at offset) is too small. | |
Return type: | int |
Returns: | The corresponding value. |
Reads a signed 32-bit integer (little endian).
Parameter: | offset (int or None) – The start of the integer. |
---|---|
Raises ValueError: | |
if stream (starting at offset) is too small. | |
Return type: | int |
Returns: | The corresponding value. |
Reads an unsigned 32-bit integer (little endian).
Parameter: | offset (int or None) – The start of the integer. |
---|---|
Raises ValueError: | |
if stream (starting at offset) is too small. | |
Return type: | int |
Returns: | The corresponding value. |
Reads a signed 32-bit integer (big endian).
Parameter: | offset (int or None) – The start of the integer. |
---|---|
Raises ValueError: | |
if stream (starting at offset) is too small. | |
Return type: | int |
Returns: | The corresponding value. |
Reads an unsigned 32-bit integer (big endian).
Parameter: | offset (int or None) – The start of the integer. |
---|---|
Raises ValueError: | |
if stream (starting at offset) is too small. | |
Return type: | int |
Returns: | The corresponding value. |
Reads a signed 64-bit integer (little endian).
Parameter: | offset (int or None) – The start of the integer. |
---|---|
Raises ValueError: | |
if stream (starting at offset) is too small. | |
Return type: | int |
Returns: | The corresponding value. |
Reads an unsigned 64-bit integer (little endian).
Parameter: | offset (int or None) – The start of the integer. |
---|---|
Raises ValueError: | |
if stream (starting at offset) is too small. | |
Return type: | int |
Returns: | The corresponding value. |
Reads a signed 64-bit integer (big endian).
Parameter: | offset (int or None) – The start of the integer. |
---|---|
Raises ValueError: | |
if stream (starting at offset) is too small. | |
Return type: | int |
Returns: | The corresponding value. |
Reads an unsigned 64-bit integer (big endian).
Parameter: | offset (int or None) – The start of the integer. |
---|---|
Raises ValueError: | |
if stream (starting at offset) is too small. | |
Return type: | int |
Returns: | The corresponding value. |
Reads a 32-bit floating point number (little endian).
Parameter: | offset (int or None) – The start of the floating point number. |
---|---|
Raises ValueError: | |
if stream (starting at offset) is too small. | |
Return type: | float |
Returns: | The corresponding value. |
Reads a 32-bit floating point number (big endian).
Parameter: | offset (int or None) – The start of the floating point number. |
---|---|
Raises ValueError: | |
if stream (starting at offset) is too small. | |
Return type: | float |
Returns: | The corresponding value. |
Reads a 64-bit floating point number (little endian).
Parameter: | offset (int or None) – The start of the floating point number. |
---|---|
Raises ValueError: | |
if stream (starting at offset) is too small. | |
Return type: | float |
Returns: | The corresponding value. |
Reads a 64-bit floating point number (big endian).
Parameter: | offset (int or None) – The start of the floating point number. |
---|---|
Raises ValueError: | |
if stream (starting at offset) is too small. | |
Return type: | float |
Returns: | The corresponding value. |
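A reader bound to a single stream can be sketched as a small class that stores the stream once and takes only an offset per call. The class name BoundReaderSketch, its method names, and the treatment of offset=None as "continue from the current position" are assumptions of this example; io.BytesIO stands in for an IStream.

```python
import io
import struct


class BoundReaderSketch:
    """Hypothetical reader bound to one stream for its whole lifetime."""

    def __init__(self, stream):
        self._stream = stream

    def _read(self, fmt, offset):
        # Assumption: offset=None means "continue from the current position".
        if offset is not None:
            self._stream.seek(offset)
        size = struct.calcsize(fmt)
        data = self._stream.read(size)
        if len(data) < size:
            raise ValueError("stream (starting at offset) is too small")
        return struct.unpack(fmt, data)[0]

    def uint8(self, offset=None):
        return self._read("<B", offset)

    def uint32_le(self, offset=None):
        return self._read("<I", offset)


reader = BoundReaderSketch(io.BytesIO(b"\x07\x78\x56\x34\x12"))
first = reader.uint8(0)
second = reader.uint32_le()  # continues right after the first byte
```

Binding the stream up front keeps call sites short when many values are read from the same source.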