147 lines
5.4 KiB
Plaintext
147 lines
5.4 KiB
Plaintext
Metasm source code organisation
|
|
===============================
|
|
|
|
The metasm source code takes advantage of the ruby language facilities,
|
|
which allows splitting the definition of a single class in multiple files.
|
|
|
|
Each file in the source tree holds code related to a particular feature of
|
|
the framework.
|
|
|
|
Directories
|
|
-----------
|
|
|
|
The top-level directories are :
|
|
|
|
* `doc/`: this documentation
|
|
* `metasm/`: the framework core
|
|
* `samples/`: a set of sample scripts showing various functionnalities of the framework
|
|
* `tests/`: a few unit tests (too few..)
|
|
* `misc/`: misc ruby scripts, not directly related to metasm
|
|
|
|
The core
|
|
--------
|
|
|
|
The `metasm/` directory holds most of the code of the framework, along with the
|
|
main `metasm.rb` file in the top directory.
|
|
|
|
The top-level `metasm.rb` has code to load parts of the framework source on demand
|
|
in the ruby interpreter, which is implemented with ruby's <const_missing.txt>
|
|
|
|
|
|
Executable formats
|
|
##################
|
|
|
|
The `exe_format/` subdirectory contains the implementations of the various
|
|
binary file formats supported in the framework.
|
|
|
|
Three files have a special meaning here:
|
|
|
|
* `main.rb`: it defines the <core/ExeFormat.txt> class
|
|
* `serialstruct.rb`: here you'll find the definitions of <core/SerialStruct.txt>
|
|
* `autoexe.rb`: the implementation of <core/AutoExe.txt>, which allows the recognition of arbitrary files from their binary signature.
|
|
|
|
The `main.rb` file is included in all other formats, as all file classes
|
|
are subclasses of `ExeFormat`.
|
|
|
|
The `serialstruct.rb` implements a helper class to ease the description of
|
|
binary structures, and generate parsing/encoding functions for those.
|
|
|
|
All other files implement a specific file format handler. The bigger files
|
|
(`ELF` and `PE/COFF`) are split between the parsing/encoding functions and
|
|
decoding/disassembly.
|
|
|
|
|
|
CPUs
|
|
####
|
|
|
|
All supported architectures have a dedicated subdirectory, and a helper file
|
|
that will simply include all the arch-specific files.
|
|
|
|
All those files will contribute to add functions to the same class implementing
|
|
the CPU interface. Not all CPUs implement all those features. They are:
|
|
|
|
* `main.rb`: inner classes definitions (for registers etc), generic functions
|
|
* `opcodes.rb`: initializes the opcode list for the architecture
|
|
* `encode.rb`: methods to encode instructions
|
|
* `decode.rb`: methods to decode/emulate instructions
|
|
* `parse.rb`: methods to parse asm instructions from a source file
|
|
* `render.rb`: methods to output an instruction to a string
|
|
* `compile_c.rb`: the C compiler implementation
|
|
* `decompile.rb`: the arch-specific part of the generic decompiler
|
|
* `debug.rb`: arch-specific information used when debugging target of this architecture
|
|
|
|
In some cases the files are small enough to be all merged into the `main.rb` file.
|
|
|
|
|
|
Operating systems
|
|
#################
|
|
|
|
The `os/` subdirectory holds the code used to abstract an operating systems.
|
|
|
|
The files here define an API allowing to enumerate running processes, and interact
|
|
with them in various ways. The <core/Debugger.txt> class and subclasses are
|
|
defined there.
|
|
|
|
Those files also holds the list of known functions and in which system libraries
|
|
they can be found (see <core/WindowsExports.txt> or <core/GNUExports.txt>), which
|
|
are used when linking executable files.
|
|
|
|
|
|
Graphical user-interface
|
|
########################
|
|
|
|
The `gui/` subdirectory contains the code needed by the metasm graphical user-interfaces.
|
|
|
|
Currently those include the disassembler and the debugger (see the *samples* section).
|
|
|
|
Those GUI elements are implemented using a custom GUI abstraction, and reside in the
|
|
various `dasm_*.rb` and `debug.rb`.
|
|
|
|
The actual implementation of the GUI are found in:
|
|
|
|
* `win32.rb`: the native Win32 API backend
|
|
* `gtk.rb`: a Gtk2 backend, intended for unix platforms
|
|
* `qt.rb`: a Qt backend experiment
|
|
|
|
Please note that the Qt backend does not work *at all*.
|
|
|
|
The `gui.rb` file in the main directory is used to chose among the available GUI backend
|
|
the most appropriate for the current session.
|
|
|
|
|
|
Others
|
|
######
|
|
|
|
The other files directly in the `metasm/` directory are either support files
|
|
(eg `encode.rb`, `parse.rb`) that hold generic functions to be used by
|
|
specific cpu/exeformat instances, or implement arch-agnostic features.
|
|
Those include:
|
|
|
|
* `preprocessor.rb`: the C/asm preprocessor/lexer
|
|
* `parse_c.rb`: this is the implementation of the C parser
|
|
* `compile_c.rb`: this is a C precompiler, it generates a very simplified C from a standard source
|
|
* `decompile.rb`: the generic decompiler code, it uses arch-specific functions defined in the arch folder
|
|
* `dynldr.rb`: this module is used when interacting directly with the host operating system through <core/DynLdr.txt>
|
|
|
|
|
|
The samples
|
|
-----------
|
|
|
|
The `samples/` directory contains a lot of small files that intend to be
|
|
exemples of how to use the framework. It also holds experiments and
|
|
work-in-progress for features that may later be integrated into the main
|
|
framework.
|
|
|
|
The comment at the beginning of the file should be clear about the purpose
|
|
of the script, and the scripts are expected to be copy/pasted and tweaked
|
|
for the specific task needed by the user (that's you).
|
|
|
|
Some of those files however are full-featured applications:
|
|
|
|
* `exeencode.rb`: a shellcode compiler, with its `peencode.rb`, `elfencode.rb`, `machoencode.rb` counterparts
|
|
* `disassemble.rb`: a disassembler
|
|
* `disassemble-gui.rb`: the graphical disassembler / debugger
|
|
|
|
The `samples/dasm-plugins/` subdirectory holds various plugins for the disassembler.
|
|
|