metasploit-framework/dev/documentation/devguide/developers_guide.tex

636 lines
28 KiB
TeX
Executable File

\documentclass{report}
\usepackage{graphicx}
\usepackage{color}
\usepackage[colorlinks,urlcolor=blue,linkcolor=black,citecolor=blue]{hyperref}
\begin{document}
\title{Metasploit 3.0 Developer's Guide}
\author{skape}
\begin{titlepage}
\begin{center}
\huge{{Metasploit 3.0 Developer's Guide}} \\[150mm]
\rule{10cm}{1pt} \\[8mm]
\small\bf{skape} \\
\small\bf{mmiller@hick.org} \\[4mm]
\textit{Last modified: \small{11/24/2003}}
\end{center}
\end{titlepage}
\tableofcontents
\setlength{\parindent}{0pt} \setlength{\parskip}{8pt}
\chapter{Introduction}
\par
The Metasploit framework is an open-source exploitation framework
that is designed to provide security researches and pen-testers with
a uniform model that allows for the rapid development of exploits,
payloads, encoders, NOP generators, and reconnaissance tools. The
framework provides exploit writers with the ability to re-use large
chunks of code that would otherwise have to be copied or
re-implemented on a per-exploit basis. To help further this cause,
the Metasploit staff is proud to present the next major evolution of
the exploitation framework: version 3.0.
\par
The 3.0 version of the framework is a re-factoring of the 2.x branch
which has been written entirely in Ruby. The primary goal of the
3.0 branch is to make the framework easy to use and extend from a
programmatic aspect. This goal encompasses not only the development
of framework modules, such as exploits, but also to the development
of third party tools and plugins that can be used to increase the
functionality of the entire suite. By developing an easy to use
framework at a programmatic level, it follows that exploits and
other extensions should be easier to understand and implement than
those provided in earlier versions of the framework.
\par
This document will provide the reader with an explanation of the
design goals, methodologies, and implementation details of the 3.0
version of the framework. Henceforth, the 3.0 version of the
framework will simply be referred to as \textit{the framework}.
\section{Why Ruby?}
\par
During the development of the framework, the one recurring question
that the Metasploit staff was continually asked was why Ruby was
selected as the programming language. To avoid having to answer
this question on an individual basis, the authors have opted for
explaining their reasons in this document.
\par
The Ruby programming language was selected over other choices, such
as python, perl, and C++ for quite a few reasons. The first, and
primary, reason that Ruby was selected was because it's a language
that the Metasploit staff enjoyed writing in. After spending time
analyzing other languages and factoring in past experiences, the
Ruby programming language was found to offer both a simple and
powerful approach to an interpreted language. The degree of
introspection and the object-oriented aspects provided by Ruby was
something that fit very nicely with some of the requirements of the
framework, where automated class construction and for the purpose of
code re-use was a very key concern, and it was one that perl was not
very well suited to offer. On top of this, the syntax is incredibly
simplistic and provides the same level of language features that
other more accepted languages have, like perl.
\par
The second reason Ruby was selected was because of its platform
independent support for threading. While a number of limitations
have been encountered during the development of the framework under
this model, the Metasploit staff has observed a marked performance
and usability improvement over the 2.x branch. Future versions of
Ruby (the 1.9 series) will back the existing threading API with
native threads for the operating system the interpreter is compiled
against which will solve a number of existing issues with the
current implementation, such as permitting the use of blocking
operations. In the meantime, the existing threading model has been
found to be far superior to a forking model, especially on platforms
that lack a native fork implementation like Windows.
\par
Another reason that Ruby was selected was because of the supported
existence of a native interpreter for the Windows platform. While
perl has a cygwin version and an ActiveState version, both are
plagued by usability problems. The fact that the Ruby interpreter
can be compiled and executed natively on Windows drastically
improves performance. Furthermore, the interpreter is also very
small and can be easily modified in the event that there is a bug.
\par
The Python programming language was also a language candidate. The
reason the Metasploit staff opted for Ruby instead of Python was for
a few different reasons. The primary reason is a general distaste
for some of the syntactical annoyances forced by Python, such as
block-indention. While many would argue the benefits of such an
approach, some members of the Metasploit staff find it to be an
unnecessary restriction. Other issues with Python center around
limitations in parent class method calling and backward
compatibility of interpreters.
\par
The C/C++ programming languages were also very seriously considered,
but in the end it was obvious that attempting to deploy a portable
and usable framework in a non-interpreted language was something
that would not be feasible. Furthermore, the development time-line
for this language selection would most likely be much longer.
\par
Even though the 2.x branch of the framework has been quite
successful, the Metasploit staff encountered a number of limitations
and annoyances with perl's object-oriented programming model, or
lack thereof. The fact that the perl interpreter is part of the
default install on many distributions is not something that the
Metasploit staff felt was worth detouring the language selection.
\chapter{Architecture and Design}
\par
The framework was designed to be as modular as possible as to
encourage the re-use of code across various projects. The most
fundamental piece of the architecture is the \textit{Rex} library
which is short for the \texttt{Ruby Extension Library}\footnote{This
library has many similarities to the 2.x Pex library}. Some of the
components provided by Rex are a wrapper socket subsystem,
implementations of protocol clients and servers, a logging
subsystem, exploitation utility classes, and a number of other
useful classes. Rex itself is designed to have no dependencies
other than what comes with the default Ruby install. In the event
that a Rex class depends on something that is not included in the
default install, the failure to find such a dependency should not
lead to the inability to use Rex.
\par
The framework itself is broken down into a few different pieces, the
most low-level being the \textit{framework core}. The framework
core is responsible for implementing all of the required interfaces
that allow for interacting with exploit modules, sessions, and
plugins. This core library is extended by the \textit{framework
base} which is designed to provide simpler wrapper routines for
dealing with the framework core as well as providing utility classes
for dealing with different aspects of the framework, such as
serializing module state to different output formats. Finally, the
base library is extended by the \textit{framework ui} which
implements support for the different types of user interfaces to the
framework itself, such as the command console and the web interface.
\par
Separate from the framework are the modules and plugins that it's
designed to support. A framework module is defined as being one of
an exploit, payload, encoder, NOP generator, or recon tool. These
modules have a well-defined structure and interface for being loaded
into the framework. A framework plugin is very loosely defined as
something that extends the functionality of the framework or
augments an existing feature to make it act in a different manner.
Plugins can add new commands to user interfaces, log all network
traffic, or perform whatever other action might be useful.
\par
Figure \ref{fig-arch-pkg} illustrates the framework's inter-package
dependencies. The following sections will elaborate on each of the
packages described above and the various important subsystems found
within each package. Full documentation of the classes and APIs
mentioned in this document can be found in the auto-generated API
level documentation found on the Metasploit website.
\begin{figure}[h]
\begin{center}
\includegraphics[height=4in,width=4in]{dev_guide_arch_packages}
\caption{Framework 3.0 package dependencies} \label{fig-arch-pkg}
\end{center}
\end{figure}
\section{Rex}
\par
The \textit{Rex} library is a collection of classes and modules that
may be useful to more than one project. The most useful classes
provided by the library are documented in the following subsections.
\subsection{Assembly}
\par
When writing exploits it is often necessary to have to generate
assembly instructions on the fly with variable operands, such as
immediate values, registers, and so on. To support this
requirement, the Rex library provides classes under the
\texttt{Rex::Arch} namespace that implement architecture-dependent
opcode generation routines as well as other architecture-specific
methods, like integer packing.
\subsubsection{Integer packing}
\par
Packing an integer depends on the byte-ordering of the target
architecture, whether it be big endian or little endian. The
\texttt{Rex::Arch.pack\_addr} method supports packing an integer
using the supplied architecture type (\texttt{ARCH\_XXX}) as an
indication of which byte-ordering to use.
\subsubsection{Stack pointer adjustment}
\par
Some exploits require that the stack pointer be adjusted prior to
the execution of a payload that modifies the stack in order to
prevent corruption of the payload itself. To support this, the
\texttt{Rex::Arch.adjust\_stack\_pointer} method provides a way to
generate the opcodes that lead to adjusting the stack pointer of a
given architecture by the supplied adjustment value. The adjustment
value can be positive or negative.
\subsubsection{Architecture-specific opcode generation}
\par
Each architecture that currently has support for dynamically
generating instruction opcodes has a class under the
\texttt{Rex::Arch} namespace, such as \texttt{Rex::Arch::X86}. The
x86 class has support for generating \texttt{jmp}, \texttt{call},
\texttt{push}, \texttt{mov}, \texttt{add}, and \texttt{sub}
instructions.
\subsection{Encoding}
\par
Encoding buffers using algorithms like XOR can sometimes be useful
outside the context of an exploit. For that reason, the Rex library
provides a basic set of classes that implement different types of
XOR encoders, such as variable length key XOR encoders and additive
feedback XOR encoders. These classes are used by the framework to
implement different types of basic encoders that can be used by
encoder modules. The classes for encoding buffers can be found in
the \texttt{Rex::Encoding} namespace.
\subsection{Exploitation}
\par
Often times vulnerabilities will share a common attack vector or
will require a specific order of operations in order to achieve the
end-goal of code execution. To assist in that matter, the Rex
library has a set of classes that implement some of the common
necessities that exploits have.
\subsubsection{Egghunter}
\par
In some cases the exploitation of a vulnerability may be limited by
the amount of payload space that exists in the area of the overflow.
This can sometimes prevent normal methods of exploitation from being
possible due to the inability to fit a standard payload in the
amount of room that is available. To solve this problem, an exploit
writer can make use of an \textit{egghunting} payload that searches
the target process' address space for an egg that is prefixed to a
larger payload. This requires that an attacker have the ability to
stick the larger payload somewhere else in memory prior to
exploitation. In the event that an egghunter is necessary, the
\texttt{Rex::Exploitation::Egghunter} class can be used.
\subsubsection{SEH record generation}
\par
One attack vector that is particularly common on the Windows
platform is what is referred to as an SEH overwrite. When this
occurs, an SEH registration record is overwritten on the stack with
user-controlled data. To leverage this, the handler address of the
registration record is point to an address that will either directly
or indirectly lead to control of execution flow. To make this work,
most attackers will point the handler address at the location of a
\texttt{pop/pop/ret} instruction set somewhere in the address space.
This action returns four bytes before the location of the handler
address on the stack. In most cases, attackers will set two of the
four bytes to be equivalent a short jump instruction that hops over
the handler address and into the payload controlled by the attacker.
\par
While the common approach works fine, there is plenty of room for
improvement. The \texttt{Rex::Exploitation::Seh} class provides
support for generating the normal (static) SEH registration record
via the \texttt{generate\_static\_seh\_record} method. However, it
also supports the generation of a dynamic registration record that
has a random short jump length and random padding between the end of
the registration record and the actual payload. This can be used to
make the exploit harder to signature in an IDS environment. The
generation of a dynamic registration record is provided by
\texttt{generate\_dynamic\_seh\_record}. Both methods are by the
\texttt{generate\_seh\_record} method that decides which of the two
methods to use based on evasion levels.
\subsection{Logging}
\par
The Rex library provides support for the basic logging of strings to
arbitrary log sinks, such as a flat file or a database. The logging
interface is exposed to programmers through a set of
globally-defined methods: \texttt{dlog}, \texttt{ilog},
\texttt{wlog}, \texttt{elog}, and \texttt{rlog}. These methods
represent debug logging, information logging, warning logging, error
logging, and raw logging respectively. Each method can be passed a
log message, a log source (the name of the component or package that
the message came from), and a log level which is a number between
zero and three. Log sources can be registered on the fly by
\texttt{register\_log\_source} and their log level can be set by
\texttt{set\_log\_level}.
\par
The log levels are meant to make it easy to hide verbose log
messages when they are not necessary. The use of the three log
levels is defined below:
\subsubsection{LEV\_0 - Default}
This log level is the default log level if none is specified. It
should be used when a log message should always be displayed when
logging is enabled. Very few log messages should occur at this level
aside from necessary information logging and error/warning logging.
Debug logging at level zero is not advised.
\subsubsection{LEV\_1 - Extra}
This log level should be used when extra information may be needed
to understand the cause of an error or warning message or to get
debugging information that might give clues as to why something is
happening. This log level should be used only when information may
be useful to understanding the behavior of something at a basic
level. This log level should not be used in an exhaustively verbose
fashion.
\subsubsection{LEV\_2 - Verbose}
This log level should be used when verbose information may be needed
to analyze the behavior of the framework. This should be the
default log level for all detailed information not falling into
LEV\_0 or LEV\_1. It is recommended that this log level be used by
default if you are unsure.
\subsubsection{LEV\_3 - Insanity}
This log level should contain very verbose information about the
behavior of the framework, such as detailed information about variable
states at certain phases including, but not limited to, loop iterations,
function calls, and so on. This log level will rarely be displayed,
but when it is the information provided should make it easy to analyze
any problem.
\subsection{Post-exploitation}
\par
The rex library provides client-side implementations for some
advanced post-exploitation, such as DispatchNinja and Meterpreter.
These two post-exploitation client interfaces are designed to be
usable outside of the context of an exploit. The \texttt{Rex::Post}
namespace provides a set of classes at its root that are meant to
act as a generalized interface to remote systems via the
post-exploitation clients, if supported. These classes allow
programmers to write automated tools that can operate upon remote
machines in a platform-independent manner. While it's true that
platforms may lack analogous feature sets for some actions, the
majority of the common set of actions will have functional
equivalents.
\subsection{Protocols}
\par
Support for some of the more common protocols, such as HTTP and SMB,
is included in the rex library to help support the development of
protocol-specific exploits and to allow for ease of use in other
projects. Each protocol implementation exists under the
\texttt{Rex::Proto} namespace.
\subsubsection{DCERC}
\par
The rex library supports a fairly robust implementation of a subset
of the DCERPC feature-set and includes support for doing invasive
actions such as multi-context bind and packet fragmentation. The
classes that support the DCERPC client interface can be found in the
\texttt{Rex::Proto::DCERPC} namespace.
\subsubsection{HTTP}
\par
Minimal support for an HTTP client and server are provided in the
rex library. While similar protocol class implementations are
provided both in webrick and in other areas of the ruby default
standard library set, it was deemed that the current implementations
were not well suited for general purpose use due to the existence of
blocking request parsing and other such things. The rex-provided
HTTP library also provides classes for parsing HTTP requests and
responses. The HTTP protocol classes can be found under the
\texttt{Rex::Proto::Http} namespace.
\subsubsection{SMB}
\par
Robust support for the SMB protocol is provided by the classes in
the \texttt{Rex::Proto::SMB} namespace. These classes support
connecting to SMB servers and performing logins and other
SMB-exposed actions like transacting a named pipe and performing
other specific commands. The SMB classes are particularly useful
for exploits that require communicating with an SMB server.
\subsection{Services}
\par
One of the limitations identified in the 2.x branch of the framework
was that it was not possible to share listeners on the local machine
when attempting to perform two different exploits that both needed
to listen on the same port. To solve this problem, the 3.0 version
of the framework provides the concept of \textit{services} which are
registered listeners that are initialized once and then subsequently
shared by future requires to allocate the same service. This makes
it possible to do things like have two exploits waiting for an HTTP
request on port 80 without having any sort of specific conflicts.
This is especially useful because it makes it possible to not have
to worry about firewall restrictions on outbound ports that would
normally only permit connections to port 80, thus making it possible
to try multiple client-side exploits against a host with all the
different exploit instances listening on the same port for requests.
\par
Aside from the sharing of HTTP-like services, the service subsystem
also provides a way to relay connections from a local TCP port to an
already existing stream. This support is offered through the
\texttt{Rex::Services::LocalRelay} class.
\subsection{Socket}
\par
One of the most important features of the rex library is the set of
classes that wrapper sockets. The socket subsystem provides an
interface for creating sockets of a given protocol using what is
referred to as a \texttt{Comm} factory class. The purpose of the
Comm factory class is to make the underlying transport and classes
used to establish the connection for a given socket opaque. This
makes it possible for socket connections to be established using the
local socket facilities as well as by using some sort of tunneled
socket proxying system as is the case with Meterpreter connection
pivoting.
\par
Sockets are created using the socket \texttt{Parameter} class which
is initialized either directly or through the use of a hash. The
hash initialization of the Parameters class is much the same as
perl's socket initialization. The hash attributes supported by the
Parameter class are documented in the constructor of the Parameter
class.
\par
There are a few different ways to create sockets. The first way is
to simply call \texttt{Rex::Socket.create} with a hash that will be
used to create a socket of the appropriate type using the supplied
or default Comm factory class. A second approach that can be used
is to call the \texttt{Rex::Socket::create\_param} method which
takes an initialized Parameter instance as an argument. The
remaining approaches involve using protocol-specific factory
methods, such as \texttt{create\_tcp}, \texttt{create\_tcp\_server},
and \texttt{create\_udp}. All three of these methods take a hash as
a parameter that is translated into a Parameter instance and passed
on for actual creation.
\par
All sockets have five major attributes that are shared in common,
though some may not always be initialized. The first attributes
provide information about the remote host and port and are exposed
through the \texttt{peerhost} and \texttt{peerport} attributes,
respectively. The second attributes provide information the local
host and port and are exposed through the \texttt{localhost} and
\texttt{localport} attributes, respectively. Finally, every socket
has a hash of contextual information that was used during it's
creation which is exposed through the \texttt{context} attribute.
While most exploits will have an empty hash, some exploits may have
a hash that contains state information that can be used to track the
originator of the socket. The framework makes use of this feature
to associate sockets with framework, exploit, and payload instances.
\subsubsection{Comm classes}
\par
The \texttt{Comm} interface used in the library has one simple
method called \texttt{create} which takes a \texttt{Parameter}
instance. The purpose of this factory approach is to provide a
location and transport independent way of creating compatible socket
object instances using a generalized factory method. For
connections being established directly from the local box, the
\texttt{Rex::Socket::Comm::Local} class is used. For connections be
established through another machine, a medium specific Comm factory
is used, such as the Meterpreter Comm class.
\par
The \texttt{Comm} interface also supports registered event
notification handlers for when certain things occur, like prior to
and after the creation of a new socket. This can be used by
external projects to augment the feature set of a socket or to
change its default behavior.
\subsubsection{TCP sockets}
\par
TCP sockets in the Rex library are implemented as a mixin,
\texttt{Rex::Socket::Tcp}, that extends the built-in ruby Socket
base class when the local Comm factory is used. This mixin also
includes the \texttt{Rex::IO::Stream} and \texttt{Rex::Socket}
mixins. For TCP servers, the \texttt{Rex::Socket::TcpServer} class
should be used.
\subsubsection{SSL sockets}
\par
SSL sockets are implemented on top of the normal Rex TCP socket
mixin and makes use of the OpenSSL Ruby support. The module used
for SSL TCP sockets is \texttt{Rex::Socket::SslTcp}.
\subsubsection{Switch board routing table}
\par
One of the advancements in the 3.0 version of the framework is the
concept of a local routing table that controls which Comm factory is
used for a particular route. The reason this is useful is for
scenarios where a box is compromised that straddles an internal
network that can't be directly reached. By adjusting the switch
board routing table to point the local subnet through a Meterpreter
Comm running the host that straddles the network, it is possible for
force the socket library to automatically use the Meterpreter Comm
factory when anything tries to communicate with hosts on the local
subnet. This support is implemented through the
\texttt{Rex::Socket::SwitchBoard} class.
\subsubsection{Subnet walking}
\par
The \texttt{Rex::Socket::SubnetWalker} class provides a way of
enumerating all the IP addresses in a subnet as described by a
subnet address and a netmask.
\subsection{Synchronization}
\par
Due to the use of multi-threading, the Rex library provides extra
classes that don't exist by default in the Ruby standard library.
These classes provide extra synchronization primitives.
\subsubsection{Notification events}
\par
While Ruby does have the concept of \texttt{ConditionVariable}'s, it
lacks the complete concept of notification events. Notification
events are used extensively on platforms like Windows. These events
can be waited on and signaled, either permanently or temporarily.
For more information. This support is provided by the
\texttt{Rex::Sync::Event} class.
\subsubsection{Reader/Writer locks}
\par
A common threading primitive is the reader/writer lock.
Reader/writer locks are used to make it possible for multiple
threads to be reading a resource concurrently while only permitting
exclusive access to one thread when write operations are necessary.
This primitive is especially useful for resources that are not
updated very often as it can drastically reduce lock contentions.
While it may be overkill to have such a synchronization primitive in
the library, it's still cool.
\par
The reader/writer lock implementation is provided by the
\texttt{Rex::ReadWriteLock} class. To lock the resource for read,
the \texttt{lock\_read} method can be used. To lock the resource
for write access, the \texttt{lock\_write} method can be used.
\subsubsection{Reference counting}
\par
In some cases it is necessary to reference count an instance in a
synchronized fashion so that it is not cleaned up or destroyed until
the last reference is gone. For this purpose, the \texttt{Rex::Ref}
class can be used with the \texttt{refinit} method for initializing
references to 1 and the \texttt{ref} and \texttt{deref} methods that
do what their names imply. When the reference count drops to zero,
the \texttt{cleanup} method is called on the object instance to give
it a chance to restore things back to normal.
\subsubsection{Thread-safe operations}
\par
Some of the built-in functions in Ruby are not thread safe in that
they can block other ruby threads from being scheduled on certain
conditions. To solve this problem, the functions that have issues
have been wrappered with implementations that ensure that not all
ruby threads will block. The specific methods that required change
were \texttt{select} and \texttt{sleep}.
\section{Framework Core}
\subsection{Event Notifications}
\subsection{Framework Managers}
\section{Framework Base}
\subsection{Configuration}
\subsection{Logging}
\subsection{Serialization}
\subsection{Simplified Framework}
\section{Framework Ui}
\subsection{Console}
\subsection{Web}
\chapter{Framework Modules}
\section{Encoder}
\section{Exploit}
\subsection{Stances}
\subsection{Types}
\subsection{Mixins}
\section{Nop}
\section{Payload}
\subsection{Single}
\subsection{Stage}
\subsection{Stager}
\section{Recon}
\subsection{Discovery}
\subsection{Analyzer}
\chapter{Framework Plugins}
\section{User-interface Plugins}
\chapter{Methodologies}
\section{Writing an Exploit}
\chapter{Conclusion}
\end{document}