Upgrading metasm...

git-svn-id: file:///home/svn/framework3/trunk@6312 4d416f70-5f16-0410-b530-b9f4589650da
unstable
HD Moore 2009-03-07 22:58:19 +00:00
parent 2b2c6b983e
commit aa4274a3bb
107 changed files with 86 additions and 29433 deletions


@ -1,3 +0,0 @@
71374080fcf5e7be3322ce56f062c29c984c577b sstic07
f3bcc93471bf9186ed62edc1bef90bbe6614a0a3 metasm-0.1-rc1
13bead20e76be749ecdb67096d9cb0847d69ad59 version 0.1


@ -1,17 +0,0 @@
N: Yoann GUILLOT
E: yoann at ofjj.net
D: Lead developer
N: Julien TINNES
E: julien at cr0.org
D: Senior Product Manager
D: Ideas, bug hunting, Yoann-slapping
D: Metasploit integration
N: Arnaud CORNET
E: arnaud.cornet at gmail.com
D: Initial ELF support
N: Raphael RIGO
E: raphael at cr0.org
D: Initial MIPS support and misc stuff


@ -1,458 +0,0 @@
GNU LESSER GENERAL PUBLIC LICENSE
Version 2.1, February 1999
Copyright (C) 1991, 1999 Free Software Foundation, Inc.
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
Everyone is permitted to copy and distribute verbatim copies
of this license document, but changing it is not allowed.
[This is the first released version of the Lesser GPL. It also counts
as the successor of the GNU Library Public License, version 2, hence
the version number 2.1.]
Preamble
The licenses for most software are designed to take away your
freedom to share and change it. By contrast, the GNU General Public
Licenses are intended to guarantee your freedom to share and change
free software--to make sure the software is free for all its users.
This license, the Lesser General Public License, applies to some
specially designated software packages--typically libraries--of the
Free Software Foundation and other authors who decide to use it. You
can use it too, but we suggest you first think carefully about whether
this license or the ordinary General Public License is the better
strategy to use in any particular case, based on the explanations below.
When we speak of free software, we are referring to freedom of use,
not price. Our General Public Licenses are designed to make sure that
you have the freedom to distribute copies of free software (and charge
for this service if you wish); that you receive source code or can get
it if you want it; that you can change the software and use pieces of
it in new free programs; and that you are informed that you can do
these things.
To protect your rights, we need to make restrictions that forbid
distributors to deny you these rights or to ask you to surrender these
rights. These restrictions translate to certain responsibilities for
you if you distribute copies of the library or if you modify it.
For example, if you distribute copies of the library, whether gratis
or for a fee, you must give the recipients all the rights that we gave
you. You must make sure that they, too, receive or can get the source
code. If you link other code with the library, you must provide
complete object files to the recipients, so that they can relink them
with the library after making changes to the library and recompiling
it. And you must show them these terms so they know their rights.
We protect your rights with a two-step method: (1) we copyright the
library, and (2) we offer you this license, which gives you legal
permission to copy, distribute and/or modify the library.
To protect each distributor, we want to make it very clear that
there is no warranty for the free library. Also, if the library is
modified by someone else and passed on, the recipients should know
that what they have is not the original version, so that the original
author's reputation will not be affected by problems that might be
introduced by others.
Finally, software patents pose a constant threat to the existence of
any free program. We wish to make sure that a company cannot
effectively restrict the users of a free program by obtaining a
restrictive license from a patent holder. Therefore, we insist that
any patent license obtained for a version of the library must be
consistent with the full freedom of use specified in this license.
Most GNU software, including some libraries, is covered by the
ordinary GNU General Public License. This license, the GNU Lesser
General Public License, applies to certain designated libraries, and
is quite different from the ordinary General Public License. We use
this license for certain libraries in order to permit linking those
libraries into non-free programs.
When a program is linked with a library, whether statically or using
a shared library, the combination of the two is legally speaking a
combined work, a derivative of the original library. The ordinary
General Public License therefore permits such linking only if the
entire combination fits its criteria of freedom. The Lesser General
Public License permits more lax criteria for linking other code with
the library.
We call this license the "Lesser" General Public License because it
does Less to protect the user's freedom than the ordinary General
Public License. It also provides other free software developers Less
of an advantage over competing non-free programs. These disadvantages
are the reason we use the ordinary General Public License for many
libraries. However, the Lesser license provides advantages in certain
special circumstances.
For example, on rare occasions, there may be a special need to
encourage the widest possible use of a certain library, so that it becomes
a de-facto standard. To achieve this, non-free programs must be
allowed to use the library. A more frequent case is that a free
library does the same job as widely used non-free libraries. In this
case, there is little to gain by limiting the free library to free
software only, so we use the Lesser General Public License.
In other cases, permission to use a particular library in non-free
programs enables a greater number of people to use a large body of
free software. For example, permission to use the GNU C Library in
non-free programs enables many more people to use the whole GNU
operating system, as well as its variant, the GNU/Linux operating
system.
Although the Lesser General Public License is Less protective of the
users' freedom, it does ensure that the user of a program that is
linked with the Library has the freedom and the wherewithal to run
that program using a modified version of the Library.
The precise terms and conditions for copying, distribution and
modification follow. Pay close attention to the difference between a
"work based on the library" and a "work that uses the library". The
former contains code derived from the library, whereas the latter must
be combined with the library in order to run.
GNU LESSER GENERAL PUBLIC LICENSE
TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
0. This License Agreement applies to any software library or other
program which contains a notice placed by the copyright holder or
other authorized party saying it may be distributed under the terms of
this Lesser General Public License (also called "this License").
Each licensee is addressed as "you".
A "library" means a collection of software functions and/or data
prepared so as to be conveniently linked with application programs
(which use some of those functions and data) to form executables.
The "Library", below, refers to any such software library or work
which has been distributed under these terms. A "work based on the
Library" means either the Library or any derivative work under
copyright law: that is to say, a work containing the Library or a
portion of it, either verbatim or with modifications and/or translated
straightforwardly into another language. (Hereinafter, translation is
included without limitation in the term "modification".)
"Source code" for a work means the preferred form of the work for
making modifications to it. For a library, complete source code means
all the source code for all modules it contains, plus any associated
interface definition files, plus the scripts used to control compilation
and installation of the library.
Activities other than copying, distribution and modification are not
covered by this License; they are outside its scope. The act of
running a program using the Library is not restricted, and output from
such a program is covered only if its contents constitute a work based
on the Library (independent of the use of the Library in a tool for
writing it). Whether that is true depends on what the Library does
and what the program that uses the Library does.
1. You may copy and distribute verbatim copies of the Library's
complete source code as you receive it, in any medium, provided that
you conspicuously and appropriately publish on each copy an
appropriate copyright notice and disclaimer of warranty; keep intact
all the notices that refer to this License and to the absence of any
warranty; and distribute a copy of this License along with the
Library.
You may charge a fee for the physical act of transferring a copy,
and you may at your option offer warranty protection in exchange for a
fee.
2. You may modify your copy or copies of the Library or any portion
of it, thus forming a work based on the Library, and copy and
distribute such modifications or work under the terms of Section 1
above, provided that you also meet all of these conditions:
a) The modified work must itself be a software library.
b) You must cause the files modified to carry prominent notices
stating that you changed the files and the date of any change.
c) You must cause the whole of the work to be licensed at no
charge to all third parties under the terms of this License.
d) If a facility in the modified Library refers to a function or a
table of data to be supplied by an application program that uses
the facility, other than as an argument passed when the facility
is invoked, then you must make a good faith effort to ensure that,
in the event an application does not supply such function or
table, the facility still operates, and performs whatever part of
its purpose remains meaningful.
(For example, a function in a library to compute square roots has
a purpose that is entirely well-defined independent of the
application. Therefore, Subsection 2d requires that any
application-supplied function or table used by this function must
be optional: if the application does not supply it, the square
root function must still compute square roots.)
These requirements apply to the modified work as a whole. If
identifiable sections of that work are not derived from the Library,
and can be reasonably considered independent and separate works in
themselves, then this License, and its terms, do not apply to those
sections when you distribute them as separate works. But when you
distribute the same sections as part of a whole which is a work based
on the Library, the distribution of the whole must be on the terms of
this License, whose permissions for other licensees extend to the
entire whole, and thus to each and every part regardless of who wrote
it.
Thus, it is not the intent of this section to claim rights or contest
your rights to work written entirely by you; rather, the intent is to
exercise the right to control the distribution of derivative or
collective works based on the Library.
In addition, mere aggregation of another work not based on the Library
with the Library (or with a work based on the Library) on a volume of
a storage or distribution medium does not bring the other work under
the scope of this License.
3. You may opt to apply the terms of the ordinary GNU General Public
License instead of this License to a given copy of the Library. To do
this, you must alter all the notices that refer to this License, so
that they refer to the ordinary GNU General Public License, version 2,
instead of to this License. (If a newer version than version 2 of the
ordinary GNU General Public License has appeared, then you can specify
that version instead if you wish.) Do not make any other change in
these notices.
Once this change is made in a given copy, it is irreversible for
that copy, so the ordinary GNU General Public License applies to all
subsequent copies and derivative works made from that copy.
This option is useful when you wish to copy part of the code of
the Library into a program that is not a library.
4. You may copy and distribute the Library (or a portion or
derivative of it, under Section 2) in object code or executable form
under the terms of Sections 1 and 2 above provided that you accompany
it with the complete corresponding machine-readable source code, which
must be distributed under the terms of Sections 1 and 2 above on a
medium customarily used for software interchange.
If distribution of object code is made by offering access to copy
from a designated place, then offering equivalent access to copy the
source code from the same place satisfies the requirement to
distribute the source code, even though third parties are not
compelled to copy the source along with the object code.
5. A program that contains no derivative of any portion of the
Library, but is designed to work with the Library by being compiled or
linked with it, is called a "work that uses the Library". Such a
work, in isolation, is not a derivative work of the Library, and
therefore falls outside the scope of this License.
However, linking a "work that uses the Library" with the Library
creates an executable that is a derivative of the Library (because it
contains portions of the Library), rather than a "work that uses the
library". The executable is therefore covered by this License.
Section 6 states terms for distribution of such executables.
When a "work that uses the Library" uses material from a header file
that is part of the Library, the object code for the work may be a
derivative work of the Library even though the source code is not.
Whether this is true is especially significant if the work can be
linked without the Library, or if the work is itself a library. The
threshold for this to be true is not precisely defined by law.
If such an object file uses only numerical parameters, data
structure layouts and accessors, and small macros and small inline
functions (ten lines or less in length), then the use of the object
file is unrestricted, regardless of whether it is legally a derivative
work. (Executables containing this object code plus portions of the
Library will still fall under Section 6.)
Otherwise, if the work is a derivative of the Library, you may
distribute the object code for the work under the terms of Section 6.
Any executables containing that work also fall under Section 6,
whether or not they are linked directly with the Library itself.
6. As an exception to the Sections above, you may also combine or
link a "work that uses the Library" with the Library to produce a
work containing portions of the Library, and distribute that work
under terms of your choice, provided that the terms permit
modification of the work for the customer's own use and reverse
engineering for debugging such modifications.
You must give prominent notice with each copy of the work that the
Library is used in it and that the Library and its use are covered by
this License. You must supply a copy of this License. If the work
during execution displays copyright notices, you must include the
copyright notice for the Library among them, as well as a reference
directing the user to the copy of this License. Also, you must do one
of these things:
a) Accompany the work with the complete corresponding
machine-readable source code for the Library including whatever
changes were used in the work (which must be distributed under
Sections 1 and 2 above); and, if the work is an executable linked
with the Library, with the complete machine-readable "work that
uses the Library", as object code and/or source code, so that the
user can modify the Library and then relink to produce a modified
executable containing the modified Library. (It is understood
that the user who changes the contents of definitions files in the
Library will not necessarily be able to recompile the application
to use the modified definitions.)
b) Use a suitable shared library mechanism for linking with the
Library. A suitable mechanism is one that (1) uses at run time a
copy of the library already present on the user's computer system,
rather than copying library functions into the executable, and (2)
will operate properly with a modified version of the library, if
the user installs one, as long as the modified version is
interface-compatible with the version that the work was made with.
c) Accompany the work with a written offer, valid for at
least three years, to give the same user the materials
specified in Subsection 6a, above, for a charge no more
than the cost of performing this distribution.
d) If distribution of the work is made by offering access to copy
from a designated place, offer equivalent access to copy the above
specified materials from the same place.
e) Verify that the user has already received a copy of these
materials or that you have already sent this user a copy.
For an executable, the required form of the "work that uses the
Library" must include any data and utility programs needed for
reproducing the executable from it. However, as a special exception,
the materials to be distributed need not include anything that is
normally distributed (in either source or binary form) with the major
components (compiler, kernel, and so on) of the operating system on
which the executable runs, unless that component itself accompanies
the executable.
It may happen that this requirement contradicts the license
restrictions of other proprietary libraries that do not normally
accompany the operating system. Such a contradiction means you cannot
use both them and the Library together in an executable that you
distribute.
7. You may place library facilities that are a work based on the
Library side-by-side in a single library together with other library
facilities not covered by this License, and distribute such a combined
library, provided that the separate distribution of the work based on
the Library and of the other library facilities is otherwise
permitted, and provided that you do these two things:
a) Accompany the combined library with a copy of the same work
based on the Library, uncombined with any other library
facilities. This must be distributed under the terms of the
Sections above.
b) Give prominent notice with the combined library of the fact
that part of it is a work based on the Library, and explaining
where to find the accompanying uncombined form of the same work.
8. You may not copy, modify, sublicense, link with, or distribute
the Library except as expressly provided under this License. Any
attempt otherwise to copy, modify, sublicense, link with, or
distribute the Library is void, and will automatically terminate your
rights under this License. However, parties who have received copies,
or rights, from you under this License will not have their licenses
terminated so long as such parties remain in full compliance.
9. You are not required to accept this License, since you have not
signed it. However, nothing else grants you permission to modify or
distribute the Library or its derivative works. These actions are
prohibited by law if you do not accept this License. Therefore, by
modifying or distributing the Library (or any work based on the
Library), you indicate your acceptance of this License to do so, and
all its terms and conditions for copying, distributing or modifying
the Library or works based on it.
10. Each time you redistribute the Library (or any work based on the
Library), the recipient automatically receives a license from the
original licensor to copy, distribute, link with or modify the Library
subject to these terms and conditions. You may not impose any further
restrictions on the recipients' exercise of the rights granted herein.
You are not responsible for enforcing compliance by third parties with
this License.
11. If, as a consequence of a court judgment or allegation of patent
infringement or for any other reason (not limited to patent issues),
conditions are imposed on you (whether by court order, agreement or
otherwise) that contradict the conditions of this License, they do not
excuse you from the conditions of this License. If you cannot
distribute so as to satisfy simultaneously your obligations under this
License and any other pertinent obligations, then as a consequence you
may not distribute the Library at all. For example, if a patent
license would not permit royalty-free redistribution of the Library by
all those who receive copies directly or indirectly through you, then
the only way you could satisfy both it and this License would be to
refrain entirely from distribution of the Library.
If any portion of this section is held invalid or unenforceable under any
particular circumstance, the balance of the section is intended to apply,
and the section as a whole is intended to apply in other circumstances.
It is not the purpose of this section to induce you to infringe any
patents or other property right claims or to contest validity of any
such claims; this section has the sole purpose of protecting the
integrity of the free software distribution system which is
implemented by public license practices. Many people have made
generous contributions to the wide range of software distributed
through that system in reliance on consistent application of that
system; it is up to the author/donor to decide if he or she is willing
to distribute software through any other system and a licensee cannot
impose that choice.
This section is intended to make thoroughly clear what is believed to
be a consequence of the rest of this License.
12. If the distribution and/or use of the Library is restricted in
certain countries either by patents or by copyrighted interfaces, the
original copyright holder who places the Library under this License may add
an explicit geographical distribution limitation excluding those countries,
so that distribution is permitted only in or among countries not thus
excluded. In such case, this License incorporates the limitation as if
written in the body of this License.
13. The Free Software Foundation may publish revised and/or new
versions of the Lesser General Public License from time to time.
Such new versions will be similar in spirit to the present version,
but may differ in detail to address new problems or concerns.
Each version is given a distinguishing version number. If the Library
specifies a version number of this License which applies to it and
"any later version", you have the option of following the terms and
conditions either of that version or of any later version published by
the Free Software Foundation. If the Library does not specify a
license version number, you may choose any version ever published by
the Free Software Foundation.
14. If you wish to incorporate parts of the Library into other free
programs whose distribution conditions are incompatible with these,
write to the author to ask for permission. For software which is
copyrighted by the Free Software Foundation, write to the Free
Software Foundation; we sometimes make exceptions for this. Our
decision will be guided by the two goals of preserving the free status
of all derivatives of our free software and of promoting the sharing
and reuse of software generally.
NO WARRANTY
15. BECAUSE THE LIBRARY IS LICENSED FREE OF CHARGE, THERE IS NO
WARRANTY FOR THE LIBRARY, TO THE EXTENT PERMITTED BY APPLICABLE LAW.
EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR
OTHER PARTIES PROVIDE THE LIBRARY "AS IS" WITHOUT WARRANTY OF ANY
KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE
LIBRARY IS WITH YOU. SHOULD THE LIBRARY PROVE DEFECTIVE, YOU ASSUME
THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
16. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN
WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY
AND/OR REDISTRIBUTE THE LIBRARY AS PERMITTED ABOVE, BE LIABLE TO YOU
FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR
CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE
LIBRARY (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING
RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A
FAILURE OF THE LIBRARY TO OPERATE WITH ANY OTHER SOFTWARE), EVEN IF
SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH
DAMAGES.
END OF TERMS AND CONDITIONS


@ -1,128 +0,0 @@
Metasm, the Ruby assembly manipulation suite
============================================
* You have some samples in samples/
* LICENCE is LGPL
Author: Yoann Guillot <yoann at ofjj.net>
Basic overview:
Metasm allows you to interact with executable formats (ExeFormat):
PE, ELF, Shellcode, etc.
There are three ways to approach an ExeFormat:
- compiling one from scratch ( -> source)
- disassembling an existing one ( -> blocks)
- manipulating the file structure ( -> encoded)
Assembly:
When compiling, you start from a source text (a ruby String, consisting
mostly of a sequence of instructions/data/padding directives), then you parse
it.
The string is handed to a Preprocessor (which handles #if, #ifdef, #include,
#define, comments etc, almost 100% compatible with gcc -E), which is
encapsulated in an AsmPreprocessor (which handles asm macro definitions, equ and
asm comments).
This AsmPreprocessor returns tokens to the ExeFormat, which parses them as Data,
Padding, Labels or parser directives. Parser directives always start with a dot.
They can be generic (.pad, .offset...) or ExeFormat-specific (.section,
.import...).
If the ExeFormat does not recognize a word, it hands it to its CPU instance,
which is responsible for parsing Instructions or raising an exception.
All these tokens are stored in one or more arrays in the @source attribute of
the ExeFormat (Shellcode's @source is an Array, for PE/ELF it is a hash of
section name => Array)
Every immediate value can be an arbitrary Expression (see later).
You can then assemble the source to binary sections.
ExeFormat has a constructor to do that: ExeFormat.assemble(cpu, source)
It parses the source, assembles it, and returns the ExeFormat instance.
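The token dispatch described above can be illustrated in plain Ruby, without Metasm. This is only a rough sketch under stated assumptions: `classify_line` and `TOY_MNEMONICS` are hypothetical names invented for the illustration, and the real parser works on preprocessor tokens rather than whole lines.

```ruby
# Hypothetical sketch of the dispatch described above; NOT Metasm's parser.
# TOY_MNEMONICS stands in for the opcode list a real CPU class would provide.
TOY_MNEMONICS = %w[nop ret jmp mov].freeze

# Classifies one source line by its first word.
def classify_line(line)
  word = line.strip.split(/\s+/, 2).first.to_s
  if word.empty?
    :blank
  elsif word.start_with?('.')
    :directive                      # parser directives always start with a dot
  elsif word.end_with?(':')
    :label
  elsif TOY_MNEMONICS.include?(word.downcase)
    :instruction                    # recognised by the (toy) CPU
  else
    raise ArgumentError, "unknown token: #{word.inspect}"   # the CPU rejects it
  end
end

# ';' is used here as a line separator, purely for compactness.
source = 'start: ; mov eax, 1 ; .pad 4 ; ret'
tokens = source.split(';').map { |l| [l.strip, classify_line(l)] }
tokens.each { |text, kind| puts "#{kind}: #{text}" }
```

The point of the sketch is the order of the checks: directives and labels are claimed by the format-level parser first, and only the remaining words reach the CPU, which either accepts them as instructions or raises.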
EncodedData:
In Metasm all binary data is stored as an EncodedData.
EncodedData has 3 main attributes:
- @data which holds the raw binary data (generally a ruby String, but see
VirtualString)
- @export which is a hash associating an export name (label name) to an offset
within @data
- @reloc which is a hash whose keys are offsets within @data, and whose values
are Relocation objects.
A Relocation object has an endianness (:little/:big), a sign (:signed/:unsigned/:any),
a size (in bits) and a target.
The target is an arbitrary arithmetic/logic Expression.
EncodedData also has a @virtualsize (for e.g. .bss sections), and a @ptr (used
when decoding things)
You can fix up an EncodedData with a Hash of variable name => value (the value
should be an Expression or a numeric value). When you do that, each relocation's
target is bound using the binding, and if the result is calculable (no external
variable name remains in the Expression), the result is encoded using the
relocation's size/sign/endianness information. If it overflows (e.g. storing 128
in an 8-bit signed relocation), an EncodeError exception is raised.
If the relocation's target is not numeric, the target is unchanged if you use
EncodedData#fixup, or it is replaced with the bound target with #fixup! .
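The fixup behaviour can be modelled with a small stand-alone Ruby sketch. Everything in it is an assumption made for illustration: `ToyReloc`, `ToyEncodeError`, `reloc_range` and `toy_fixup!` are hypothetical names, targets are plain label names rather than Expressions, partially bound targets are not modelled, and the range accepted for `:any` is a guess at a reasonable reading.

```ruby
# Hypothetical model of the fixup behaviour described above; NOT Metasm's classes.
# A relocation target is a plain label name (String) instead of an Expression.
ToyReloc = Struct.new(:target, :size, :sign, :endianness)

class ToyEncodeError < StandardError; end

# Range of integers that fit in `size` bits for the given sign.
def reloc_range(size, sign)
  case sign
  when :signed   then (-(1 << (size - 1)))..((1 << (size - 1)) - 1)
  when :unsigned then 0..((1 << size) - 1)
  else                (-(1 << (size - 1)))..((1 << size) - 1)  # :any (assumed): either reading
  end
end

# Patches every relocation whose target is bound; unbound ones stay in `relocs`.
def toy_fixup!(data, relocs, binding)
  relocs.delete_if do |offset, rel|
    value = binding[rel.target]
    next false if value.nil?                       # unresolved: keep the relocation
    unless reloc_range(rel.size, rel.sign).cover?(value)
      raise ToyEncodeError, "#{value} does not fit in #{rel.size} bits (#{rel.sign})"
    end
    bytes = Array.new(rel.size / 8) { |i| (value >> (8 * i)) & 0xff }  # little-endian order
    bytes.reverse! if rel.endianness == :big
    data[offset, bytes.size] = bytes.pack('C*')
    true                                           # resolved: drop the relocation
  end
  data
end

# One opcode byte followed by a 32-bit placeholder that refers to 'target'.
data   = "\x90\x00\x00\x00\x00".b
relocs = { 1 => ToyReloc.new('target', 32, :unsigned, :little) }
toy_fixup!(data, relocs, 'target' => 0x11223344)
```

Storing 128 through an 8-bit `:signed` relocation in this model raises `ToyEncodeError`, mirroring the overflow case mentioned above.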
Disassembly: (experimental)
When disassembling, you start from a decoded ExeFormat (you need to be able to
say what data is at which virtual address) and specify an entry point (a
virtual address or an export name). The ExeFormat then starts disassembling
instructions. When it encounters an Opcode marked as :setip, it calls the CPU
to find the jump destination, and backtracks instructions until it finds the
numeric value.
The disassembled code is stored as InstructionBlocks, each of which holds a list
of DecodedInstructions plus @from and @to lists (arrays of block addresses).
A DecodedInstruction has an Instruction, an Opcode and a bin_length (to allow
printing the hex dump).
(experimental for now: does not handle external calls, does not handle
subfunctions well, should only be used on small shellcodes)
Constructor: Shellcode.disassemble(cpu, binary)
ExeFormat manipulation:
You can encode/decode an ExeFormat (ie decode sections, imports, headers etc)
Constructor: ExeFormat.decode_file(str), ExeFormat.decode_file_header(str)
Methods: ExeFormat#encode_file(filename), ExeFormat#encode_string
VirtualString:
A VirtualString is a String-like object: you can read, and possibly write,
slices of it. It can be used as @data in an EncodedData, and thus allows
virtualization of most Metasm algorithms.
You cannot change a VirtualString length.
Taking a slice of a VirtualString can return either a String (length smaller
than 4096) or another VirtualString. You can force getting a small VirtualString
using the #dup(from, length) method.
Any unimplemented method called on it is forwarded to a frozen String which is
a full copy of the VirtualString (this should generally be avoided).
There are currently 3 VirtualStrings implemented:
- VirtualFile, which loads a file in 4096-byte chunks, on demand,
- WindowsRemoteString, which maps another process' virtual memory (uses the
Windows debug API)
- LinuxRemoteString, which maps another process' virtual memory (needs ptrace
rights; memory reading is done using /proc/pid/mem)
The Win/Lin versions are quite powerful, and allow things like live process
disassembly/patching easily (use LoadedPE/LoadedELF as ExeFormat)
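The on-demand chunk loading idea can be sketched in plain Ruby. This is a minimal illustration, not Metasm's VirtualFile: the class name `ToyVirtualFile`, its `loaded_chunks` helper and the use of a StringIO in place of a real file are all assumptions made for the example.

```ruby
require 'stringio'

# Hypothetical sketch of on-demand chunked reads; NOT Metasm's VirtualFile.
class ToyVirtualFile
  CHUNK = 4096

  attr_reader :length

  # `io` is any seekable object responding to #size, #seek and #read.
  def initialize(io)
    @io = io
    @length = io.size
    @cache = {}                     # chunk index => bytes, filled lazily
  end

  # Indexes of the chunks that have actually been read so far.
  def loaded_chunks
    @cache.keys.sort
  end

  # Returns up to `len` bytes starting at `from`, loading only the chunks touched.
  def [](from, len)
    raise ArgumentError, 'negative offset' if from < 0
    len = [len, @length - from].min # clamp to the end of the file
    out = ''.b
    while len > 0
      idx, off = from.divmod(CHUNK)
      piece = chunk(idx)[off, len]  # may stop at the chunk boundary
      out << piece
      from += piece.size
      len  -= piece.size
    end
    out
  end

  private

  def chunk(idx)
    @cache[idx] ||= begin
      @io.seek(idx * CHUNK)
      @io.read(CHUNK) || ''.b
    end
  end
end

io = StringIO.new(('a' * 4096 + 'b' * 4096 + 'c' * 10).b)
vf = ToyVirtualFile.new(io)
slice = vf[4094, 4]                 # straddles the first chunk boundary
```

A read that straddles a chunk boundary loads only the two chunks it touches; the rest of the file stays unread until some later slice needs it.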
Things planned:
Write a C parser (at least for headers), and add syntax to support C structs
in assembly.
Write a good disassembler, supporting external calls through C header parsing,
and recognizing/handling subfunctions.
Write a UI for dasm


@ -1,19 +0,0 @@
disasm:
  find a way to recognize non-returning subfunction (eg thunk_exit)
  DecodedData (dword, string, array, structs? ...)
  make exe.decode generate DecodedData ? (for elf symbols, import names etc)
  handle function-local stack space (esp+XX) -> private, nobacktrace
  handle function-local labels (also rename local stack vars offsets)
  forward register tracking ? with weak values ?
  path-specific backtracking ( foo: call a ; a: jmp retloc ; bar: call b ; b: jmp retloc ; retloc: ret ; call foo ; ret : last ret trackback should only reach a:)
  function signatures (a la FLIRT?)
decompiler: make one
ia32: emu fpu
encode: SplitReloc for pseudo-instrs (mips li => reloc high :a16 + reloc low :a16), use Reloc.encode(edata, off) or sumthin for edata.fixup
mips: find a way to have a 'li' instruction that resolve as 'loadlow' or 'loadhigh+orlow'
optimizer/deoptimizer (asm/dasm): reorder instructions
compile: optimize (jmp -> jmp, non-volatile vars, ..), support intrinsics?
debug: unify windows/linux API, support hw dbg registers uses (bpx/r/w..)
gui: debugger, hexedit, C code navigation
elf: symbol versions


@ -1,70 +0,0 @@
#!/usr/bin/env ruby
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
require 'metasm'
class String
@@cpu = Metasm::Ia32.new
class << self
def cpu() @@cpu end
def cpu=(c) @@cpu=c end
end
# encodes the current string as a Shellcode, returns the resulting EncodedData
def encode_edata
s = Metasm::Shellcode.assemble @@cpu, self
s.encoded
end
# encodes the current string as a Shellcode, returns the resulting binary String
# outputs warnings on unresolved relocations
def encode
ed = encode_edata
if not ed.reloc.empty?
puts 'W: encoded string has unresolved relocations: ' + ed.reloc.map { |o, r| r.target.inspect }.join(', ')
end
ed.fill
ed.data
end
# decodes the current string as a Shellcode, with specified base address
# returns the resulting Disassembler
def decode_blocks(base_addr=0, eip=base_addr)
sc = Metasm::Shellcode.decode(self, @@cpu)
sc.base_addr = base_addr
sc.disassemble(eip)
end
# decodes the current string as a Shellcode, with specified base address
# returns the equivalent asm source
def decode(base_addr=0, eip=base_addr)
decode_blocks(base_addr, eip).to_s
end
end
# get in interactive assembler mode
def asm
puts 'type "exit" or "quit" to quit', 'use ";" for newline', ''
while (print "asm> " ; $stdout.flush ; l = gets)
break if %w[quit exit].include? l.chomp
begin
data = l.gsub(';', "\n")
next if data.strip.empty?
data = data.encode
puts '"' + data.unpack('C*').map { |c| '\\x%02x' % c }.join + '"'
rescue Metasm::Exception => e
puts "Error: #{e.class} #{e.message}"
end
end
puts
end
if __FILE__ == $0
asm
end
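
The interactive loop above assembles each input line and prints the bytes as a C-style escaped string. The formatting step alone can be sketched without Metasm. `hex_escape` is a hypothetical helper name used only for illustration; it is not part of the library, and the two bytes are arbitrary x86 opcodes.

```ruby
# Hypothetical helper (not part of Metasm): render raw bytes the way the
# asm> prompt prints them, as a double-quoted string of \xNN escapes.
def hex_escape(bytes)
  '"' + bytes.unpack('C*').map { |c| '\\x%02x' % c }.join + '"'
end

# Two x86 opcodes chosen for illustration: nop (0x90) and int3 (0xcc).
puts hex_escape([0x90, 0xcc].pack('C*'))
```

Unpacking with `'C*'` treats the input as unsigned bytes regardless of the string's encoding, which is why the original can format the assembled data directly.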

View File

@ -1,36 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
# temporarily put the current file directory in the ruby include path
metasmdir = File.dirname(__FILE__)
if $:.include? metasmdir
metasmdir = nil
else
$: << metasmdir
end
# cpu architectures
%w[ia32 mips].each { |f|
require "metasm/#{f}/render"
require "metasm/#{f}/parse"
require "metasm/#{f}/encode"
require "metasm/#{f}/decode"
require "metasm/#{f}/compile_c"
}
# executable formats
%w[mz elf_encode elf_decode pe coff_encode coff_decode shellcode a_out xcoff autoexe].each { |f|
require "metasm/exe_format/#{f}"
}
# os-specific features
%w[windows linux].each { |f|
require "metasm/os/#{f}"
}
require 'metasm/parse_c'
require 'metasm/compile_c'
# cleanup include path
$:.delete metasmdir if metasmdir

View File

@ -1,40 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
require 'metasm/mips/opcodes'
require 'metasm/encode'
module Metasm
class MIPS
private
def encode_instr_op(section, instr, op)
base = op.bin
set_field = proc { |f, v|
base |= (v & @fields_mask[f]) << @fields_shift[f]
}
val, mask, shift = 0, 0, 0
op.args.zip(instr.args).each { |sym, arg|
case sym
when :rs, :rt, :rd
set_field[sym, arg.i]
when :ft
set_field[sym, arg.i]
when :rs_i16
set_field[:rs, arg.base.i]
val, mask, shift = arg.offset, @fields_mask[:i16], @fields_shift[:i16]
when :sa, :i16
val, mask, shift = arg, @fields_mask[sym], @fields_shift[sym]
when :i26
val, mask, shift = Expression[arg, :>>, 2], @fields_mask[sym], @fields_shift[sym]
end
}
# F%SK&*cks PE base relocation detection
Expression[base, :+, [[val, :&, mask], :<<, shift]].encode(:u32, @endianness)
end
end
end
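
`encode_instr_op` above builds an instruction word by OR-ing each operand, masked and shifted, into the opcode's base bits. A minimal sketch of that packing, assuming purely numeric operands (the original also carries unresolved Expressions and relocations, which this ignores). `pack_mips_word` is a hypothetical name; the masks and shifts are copied from `init_mips32` in this commit.

```ruby
# Field layout copied from init_mips32 in the opcode table of this commit.
FIELDS_MASK  = { rs: 0x1f, rt: 0x1f, rd: 0x1f, sa: 0x1f, i16: 0xffff }
FIELDS_SHIFT = { rs: 21,   rt: 16,   rd: 11,   sa: 6,    i16: 0 }

# Hypothetical helper: OR each field value, masked and shifted, into the
# opcode's base bits, then serialize the 32-bit word.
def pack_mips_word(base, fields, endianness = :big)
  word = fields.inject(base) { |acc, (name, value)|
    acc | ((value & FIELDS_MASK.fetch(name)) << FIELDS_SHIFT.fetch(name))
  }
  # 'N' is a 32-bit big-endian word, 'V' its little-endian counterpart.
  [word].pack(endianness == :big ? 'N' : 'V')
end

# addiu $8, $0, 1 (base opcode 0b001001 << 26, as in the table)
word = pack_mips_word(0b001001 << 26, { rt: 8, rs: 0, i16: 1 })
puts word.unpack('C*').map { |b| '%02x' % b }.join(' ')
```

Masking before shifting keeps an out-of-range operand from spilling into a neighbouring field, at the cost of silently truncating it.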

View File

@ -1,38 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
require 'metasm/main'
module Metasm
class ARM < CPU
class Reg
class << self
attr_reader :s_to_i
end
@s_to_i = { 'sp' => 13, 'lr' => 14, 'pc' => 15 }
(0..15).each { |i| @s_to_i["r#{i}"] = @s_to_i["$r#{i}"] = i }
attr_reader :i
def initialize(i)
@i = i
end
end
# class Memref
# attr_reader :base, :offset
# def initialize(base, offset)
# @base, @offset = base, offset
# end
# end
def initialize(endianness = :little)
super()
@endianness = endianness
@size = 32
init
end
end
end

View File

@ -1,419 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
require 'metasm/mips/main'
module Metasm
class MIPS
private
def addop(name, bin, *args)
o = Opcode.new(self, name)
o.bin = bin
#o.args.concat(args & @valid_args)
o.args.concat(args & @fields_mask.keys)
(args & @valid_props).each { |p| o.props[p] = true }
(args & @fields_mask.keys).each { |f|
o.fields[f] = [@fields_mask[f], @fields_shift[f]]
}
@opcode_list << o
end
def init_mips32_obsolete
addop 'beql', 0b010100 << 26, :rt, :rs, :i16, :modip # == , exec delay slot only if jump taken
addop 'bnel', 0b010101 << 26, :rt, :rs, :i16, :modip # !=
addop 'blezl',0b010110 << 26, :rt_z, :rs, :i16, :modip # <= 0
addop 'bgtzl',0b010111 << 26, :rt_z, :rs, :i16, :modip # > 0
end
def init_mips32_reserved
addop 'future111011', 0b111011 << 26, :i26
%w[011000 011001 011010 011011 100111 101100 101101 110100 110111 111100 111111].each { |b|
addop "reserved#{b}", b.to_i(2) << 26, :i26
}
addop 'ase_jalx', 0b011101 << 26, :i26
addop 'ase011110', 0b011110 << 26, :i26
# TODO add all special/regimm/...
end
def init_mips32
@fields_mask.update :rs => 0x1f, :rt => 0x1f, :rd => 0x1f, :sa => 0x1f,
:i16 => 0xffff, :i26 => 0x3ffffff, :rs_i16 => 0x3e0ffff, :it => 0x1f,
:ft => 0x1f, :i32 => 0
@fields_shift.update :rs => 21, :rt => 16, :rd => 11, :sa => 6,
:i16 => 0, :i26 => 0, :rs_i16 => 0, :it => 16,
:ft => 16, :i32 => 0
init_mips32_obsolete
init_mips32_reserved
addop 'j', 0b000010 << 26, :i26, :setip, :stopexec # sets the program counter to (i26 << 2) | ((pc+4) & 0xf0000000), i.e. i26*4 in the 256M-aligned section containing the instruction in the delay slot
addop 'jal', 0b000011 << 26, :i26, :setip, :stopexec # same thing, saves return addr in r31
addop 'mov', 0b001000 << 26, :rt, :rs # rt <- rs+0
addop 'addi', 0b001000 << 26, :rt, :rs, :i16 # add rt <- rs+i
addop 'addiu',0b001001 << 26, :rt, :rs, :i16 # add unsigned
addop 'slti', 0b001010 << 26, :rt, :rs, :i16 # set on less than
addop 'sltiu',0b001011 << 26, :rt, :rs, :i16 # set on less than unsigned
addop 'andi', 0b001100 << 26, :rt, :rs, :i16 # and
addop 'ori', 0b001101 << 26, :rt, :rs, :i16 # or
addop 'xori', 0b001110 << 26, :rt, :rs, :i16 # xor
addop 'lui', 0b001111 << 26, :rt, :i16 # load upper
addop 'li', 0b001111 << 26, :rt, :i32 # pseudoinstruction
addop 'beq', 0b000100 << 26, :rt, :rs, :i16, :modip # ==
addop 'bne', 0b000101 << 26, :rt, :rs, :i16, :modip # !=
addop 'blez', 0b000110 << 26, :rs, :i16, :modip # <= 0
addop 'bgtz', 0b000111 << 26, :rs, :i16, :modip # > 0
addop 'lb', 0b100000 << 26, :rt, :rs_i16 # load byte rt <- [rs+i]
addop 'lh', 0b100001 << 26, :rt, :rs_i16 # load halfword
addop 'lwl', 0b100010 << 26, :rt, :rs_i16 # load word left
addop 'lw', 0b100011 << 26, :rt, :rs_i16 # load word
addop 'lbu', 0b100100 << 26, :rt, :rs_i16 # load byte unsigned
addop 'lhu', 0b100101 << 26, :rt, :rs_i16 # load halfword unsigned
addop 'lwr', 0b100110 << 26, :rt, :rs_i16 # load word right
addop 'sb', 0b101000 << 26, :rt, :rs_i16 # store byte
addop 'sh', 0b101001 << 26, :rt, :rs_i16 # store halfword
addop 'swl', 0b101010 << 26, :rt, :rs_i16 # store word left
addop 'sw', 0b101011 << 26, :rt, :rs_i16 # store word
addop 'swr', 0b101110 << 26, :rt, :rs_i16 # store word right
addop 'll', 0b110000 << 26, :rt, :rs_i16 # load linked word (read for atomic r/modify/w, sc does the w)
addop 'sc', 0b111000 << 26, :rt, :rs_i16 # store conditional word
addop 'lwc1', 0b110001 << 26, :ft, :rs_i16 # load word in fpreg low
addop 'swc1', 0b111001 << 26, :ft, :rs_i16 # store low fpreg word
addop 'lwc2', 0b110010 << 26, :rt, :rs_i16 # load word to copro2 register low
addop 'swc2', 0b111010 << 26, :rt, :rs_i16 # store low coproc2 register
addop 'ldc1', 0b110101 << 26, :ft, :rs_i16 # load dword in fpreg low
addop 'sdc1', 0b111101 << 26, :ft, :rs_i16 # store fpreg
addop 'ldc2', 0b110110 << 26, :rt, :rs_i16 # load dword to copro2 register
addop 'sdc2', 0b111110 << 26, :rt, :rs_i16 # store coproc2 register
addop 'pref', 0b110011 << 26, :it, :rs_i16 # prefetch (it = %w[load store r2 r3 load_streamed store_streamed load_retained store_retained
# r8 r9 r10 r11 r12 r13 r14 r15 r16 r17 r18 r19 r20 r21 r22 r23 r24 writeback_invalidate
# id26 id27 id28 id29 prepare_for_store id31]
addop 'cache',0b101111 << 26, :it, :rs_i16 # do things with the proc cache
# special
addop 'nop', 0
addop 'ssnop',1<<6
addop 'ehb', 3<<6
addop 'sll', 0b000000, :rd, :rt, :sa
addop 'movf', 0b000001, :rd, :rs, :cc
addop 'movt', 0b000001 | (1<<16), :rd, :rs, :cc
addop 'srl', 0b000010, :rd, :rt, :sa
addop 'rotr', 0b000010 | (1<<21), :rd, :rt, :sa
addop 'sra', 0b000011, :rd, :rt, :sa
addop 'sllv', 0b000100, :rd, :rt, :rs
addop 'srlv', 0b000110, :rd, :rt, :rs
addop 'rotrv',0b000110 | (1<<6), :rd, :rt, :rs
addop 'srav', 0b000111, :rd, :rt, :rs
addop 'jr', 0b001000, :rs, :setip # hint field ?
addop 'jr.hb',0b001000 | (1<<10), :rs, :setip
addop 'jalr', 0b001001 | (31<<11), :rs, :setip # rd = r31 implicit
addop 'jalr', 0b001001, :rd, :rs, :setip
addop 'jalr.hb', 0b001001 | (1<<10) | (31<<11), :rs, :setip
addop 'jalr.hb', 0b001001 | (1<<10), :rd, :rs, :setip
addop 'movz', 0b001010, :rd, :rs, :rt # rt == 0 ? rd <- rs
addop 'movn', 0b001011, :rd, :rs, :rt
addop 'syscall', 0b001100, :i20
addop 'break',0b001101, :i20
addop 'sync', 0b001111 # type 0 implicit
addop 'sync', 0b001111, :sa
addop 'mfhi', 0b010000, :rd # copies special reg HI to reg
addop 'mthi', 0b010001, :rs # copies reg to special reg HI
addop 'mflo', 0b010010, :rd # copies special reg LO to reg
addop 'mtlo', 0b010011, :rs # copies reg to special reg LO
addop 'mult', 0b011000, :rs, :rt # multiplies the registers and store the result in HI:LO
addop 'multu',0b011001, :rs, :rt
addop 'div', 0b011010, :rs, :rt
addop 'divu', 0b011011, :rs, :rt
addop 'add', 0b100000, :rd, :rs, :rt
addop 'addu', 0b100001, :rd, :rs, :rt
addop 'sub', 0b100010, :rd, :rs, :rt
addop 'subu', 0b100011, :rd, :rs, :rt
addop 'and', 0b100100, :rd, :rs, :rt
addop 'or', 0b100101, :rd, :rs, :rt
addop 'xor', 0b100110, :rd, :rs, :rt
addop 'nor', 0b100111, :rd, :rs, :rt
addop 'slt', 0b101010, :rd, :rs, :rt # rs<rt ? rd<-1 : rd<-0
addop 'sltu', 0b101011, :rd, :rs, :rt
addop 'tge', 0b110000, :rs, :rt # rs >= rt ? trap
addop 'tgeu', 0b110001, :rs, :rt
addop 'tlt', 0b110010, :rs, :rt
addop 'tltu', 0b110011, :rs, :rt
addop 'teq', 0b110100, :rs, :rt
addop 'tne', 0b110110, :rs, :rt
end
def init_mips32r2
init_mips32
end
end
end
__END__
def macro_addop_regimm(name, bin, field2, *aprops)
flds = [ :rs, field2 ]
addop name, :regimm, bin, "rs, #{field2}", flds, *aprops
end
def macro_addop_cop1(name, bin, *aprops)
flds = [ :rt, :fs ]
addop name, :cop1, bin, 'rt, fs', flds, *aprops
end
def macro_addop_cop1_precision(name, type, bin, fmt, *aprops)
flds = [ :ft, :fs, :fd ]
addop name+'.'+(type.to_s[5,7]), type, bin, fmt, flds, *aprops
end
public
# Initialize the instruction set with the MIPS32 Instruction Set
def init_mips32
:cc => [7, 18, :fpcc],
:op => [0x1F, 16, :op ], :cp2_rt => [0x1F, 16, :cp2_reg ],
:stype => [0x1F, 6, :imm ],
:code => [0xFFFFF, 6, :code ],
:sel => [3, 0, :sel ]})
# ---------------------------------------------------------------
# REGIMM opcode encoding of function field
# ---------------------------------------------------------------
macro_addop_regimm 'bltz', 0b00000, :off
macro_addop_regimm 'bgez', 0b00001, :off
macro_addop_regimm 'bltzl', 0b00010, :off
macro_addop_regimm 'bgezl', 0b00011, :off
macro_addop_regimm 'tgei', 0b01000, :imm
macro_addop_regimm 'tgeiu', 0b01001, :imm
macro_addop_regimm 'tlti', 0b01010, :imm
macro_addop_regimm 'tltiu', 0b01011, :imm
macro_addop_regimm 'teqi', 0b01100, :imm
macro_addop_regimm 'tnei', 0b01110, :imm
macro_addop_regimm 'bltzal', 0b10000, :off
macro_addop_regimm 'bgezal', 0b10001, :off
macro_addop_regimm 'bltzall', 0b10010, :off
macro_addop_regimm 'bgezall', 0b10011, :off
# ---------------------------------------------------------------
# SPECIAL2 opcode encoding of function field
# ---------------------------------------------------------------
macro_addop_special2 'madd', 0b000000, 'rs, rt', :rd_zero
macro_addop_special2 'maddu', 0b000001, 'rs, rt', :rd_zero
macro_addop_special2 'mul', 0b000010, 'rd, rs, rt'
macro_addop_special2 'msub', 0b000100, 'rs, rt', :rd_zero
macro_addop_special2 'msubu', 0b000101, 'rs, rt', :rd_zero
macro_addop_special2 'clz', 0b100000, 'rd, rs'
macro_addop_special2 'clo', 0b100001, 'rd, rs'
addop 'sdbbp', :special2, 0b111111, 'rs, rt', [ :code ]
# ---------------------------------------------------------------
# COP0, field rs
# ---------------------------------------------------------------
addop 'mfc0', :cop0, 0b00000, 'rt, rd, sel', [ :rt, :rd, :sel ]
addop 'mtc0', :cop0, 0b00100, 'rt, rd, sel', [ :rt, :rd, :sel ]
# ---------------------------------------------------------------
# COP0 when rs=C0
# ---------------------------------------------------------------
macro_addop_cop0_c0 'tlbr', 0b000001
macro_addop_cop0_c0 'tlbwi', 0b000010
macro_addop_cop0_c0 'tlbwr', 0b000110
macro_addop_cop0_c0 'tlbp', 0b001000
macro_addop_cop0_c0 'eret', 0b011000
macro_addop_cop0_c0 'deret', 0b011111
macro_addop_cop0_c0 'wait', 0b100000
# ---------------------------------------------------------------
# COP1, field rs
# ---------------------------------------------------------------
macro_addop_cop1 'mfc1', 0b00000
macro_addop_cop1 'cfc1', 0b00010
macro_addop_cop1 'mtc1', 0b00100
macro_addop_cop1 'ctc1', 0b00110
addop "bc1f", :cop1, 0b01000, 'cc, off', [ :cc, :off ], :diff_bits, [ 16, 3, 0 ]
addop "bc1fl", :cop1, 0b01000, 'cc, off', [ :cc, :off ], :diff_bits, [ 16, 3, 2 ]
addop "bc1t", :cop1, 0b01000, 'cc, off', [ :cc, :off ], :diff_bits, [ 16, 3, 1 ]
addop "bc1tl", :cop1, 0b01000, 'cc, off', [ :cc, :off ], :diff_bits, [ 16, 3, 3 ]
# ---------------------------------------------------------------
# COP1, field rs=S/D
# ---------------------------------------------------------------
[ :cop1_s, :cop1_d ].each do |type|
type_str = type.to_s[5,7]
macro_addop_cop1_precision 'add', type, 0b000000, 'fd, fs, ft'
macro_addop_cop1_precision 'sub', type, 0b000001, 'fd, fs, ft'
macro_addop_cop1_precision 'mul', type, 0b000010, 'fd, fs, ft'
macro_addop_cop1_precision 'abs', type, 0b000101, 'fd, fs', :ft_zero
macro_addop_cop1_precision 'mov', type, 0b000110, 'fd, fs', :ft_zero
macro_addop_cop1_precision 'neg', type, 0b000111, 'fd, fs', :ft_zero
macro_addop_cop1_precision 'movz', type, 0b010010, 'fd, fs, ft'
macro_addop_cop1_precision 'movn', type, 0b010011, 'fd, fs, ft'
addop "movf.#{type_str}", type, 0b010001, 'fd, fs, cc', [ :cc, :fs, :fd ], :diff_bits, [ 16, 1, 0 ]
addop "movt.#{type_str}", type, 0b010001, 'fd, fs, cc', [ :cc, :fs, :fd ], :diff_bits, [ 16, 1, 1 ]
%w(f un eq ueq olt ult ole ule sf ngle seq ngl lt nge le ngt).each_with_index do |cond, index|
addop "c.#{cond}.#{type_str}", type, 0b110000+index, 'cc, fs, ft',
[ :ft, :fs, :cc ]
end
end
# S and D Without PS
[:cop1_s, :cop1_d].each do |type|
macro_addop_cop1_precision 'div', type, 0b000011, 'fd, fs, ft'
macro_addop_cop1_precision 'sqrt', type, 0b000100, 'fd, fs', :ft_zero
macro_addop_cop1_precision 'round.w', type, 0b001100, 'fd, fs', :ft_zero
macro_addop_cop1_precision 'trunc.w', type, 0b001101, 'fd, fs', :ft_zero
macro_addop_cop1_precision 'ceil.w', type, 0b001110, 'fd, fs', :ft_zero
macro_addop_cop1_precision 'floor.w', type, 0b001111, 'fd, fs', :ft_zero
end
# COP2 is not decoded (pretty useless)
[:cop1_d,:cop1_w].each { |type| macro_addop_cop1_precision 'cvt.s', type, 0b100000, 'fd, fs', :ft_zero }
[:cop1_s,:cop1_w].each { |type| macro_addop_cop1_precision 'cvt.d', type, 0b100001, 'fd, fs', :ft_zero }
[:cop1_s,:cop1_d].each { |type| macro_addop_cop1_precision 'cvt.w', type, 0b100100, 'fd, fs', :ft_zero }
[ :normal, :special, :regimm, :special2, :cop0, :cop0_c0, :cop1, :cop1_s,
:cop1_d, :cop1_w ].each \
{ |t| @@opcodes_by_class[t] = opcode_list.find_all { |o| o.type == t } }
end
# Initialize the instruction set with the MIPS32 Instruction Set Release 2
def init_mips64
init_mips32
#SPECIAL
macro_addop_special "rotr", 0b000010, 'rd, rt, sa', :diff_bits, [ 26, 1, 1 ]
macro_addop_special "rotrv", 0b000110, 'rd, rt, rs', :diff_bits, [ 6, 1, 1 ]
# REGIMM
addop "synci", :regimm, 0b11111, '', {:base => [5,21], :off => [16, 0] }
# ---------------------------------------------------------------
# SPECIAL3 opcode encoding of function field
# ---------------------------------------------------------------
addop "ext", :special3, 0b00000, 'rt, rs, pos, size', { :rs => [5, 21], :rt => [5, 16],
:msbd => [5, 11], :lsb => [5, 6] }
addop "ins", :special3, 0b00100, 'rt, rs, pos, size', { :rs => [5, 21], :rt => [5, 16],
:msb => [5, 11], :lsb => [5, 6] }
addop "rdhwr", :special3, 0b111011, 'rt, rd', { :rt => [5, 16], :rd => [5, 11] }
addop "wsbh", :bshfl, 0b00010, 'rd, rt', { :rt => [5, 16], :rd => [5, 11] }
addop "seb", :bshfl, 0b10000, 'rd, rt', { :rt => [5, 16], :rd => [5, 11] }
addop "seh", :bshfl, 0b11000, 'rd, rt', { :rt => [5, 16], :rd => [5, 11] }
# ---------------------------------------------------------------
# COP0
# ---------------------------------------------------------------
addop "rdpgpr", :cop0, 0b01010, 'rt, rd', {:rt => [5, 16], :rd => [5, 11] }
addop "wrpgpr", :cop0, 0b01110, 'rt, rd', {:rt => [5, 16], :rd => [5, 11] }
addop "di", :cop0, 0b01011, '', {}, :diff_bits, [ 5, 1 , 0]
addop "ei", :cop0, 0b01011, '', {}, :diff_bits, [ 5, 1 , 1]
# ---------------------------------------------------------------
# COP1, field rs
# ---------------------------------------------------------------
macro_addop_cop1 "mfhc1", 0b00011
macro_addop_cop1 "mthc1", 0b00111
# Floating point
[:cop1_s, :cop1_d].each do |type|
macro_addop_cop1_precision 'round.l', type, 0b001000, 'fd, fs', :ft_zero
macro_addop_cop1_precision 'trunc.l', type, 0b001001, 'fd, fs', :ft_zero
macro_addop_cop1_precision 'ceil.l', type, 0b001010, 'fd, fs', :ft_zero
macro_addop_cop1_precision 'floor.l', type, 0b001011, 'fd, fs', :ft_zero
macro_addop_cop1_precision 'recip', type, 0b010101, 'fd, fs', :ft_zero
macro_addop_cop1_precision 'rsqrt', type, 0b010110, 'fd, fs', :ft_zero
macro_addop_cop1_precision 'cvt.l', type, 0b100101, 'fd, fs', :ft_zero
end
macro_addop_cop1_precision 'cvt.ps', :cop1_s, 0b100110, 'fd, fs', :ft_zero
macro_addop_cop1_precision 'cvt.s', :cop1_l, 0b100000, 'fd, fs', :ft_zero
macro_addop_cop1_precision 'cvt.d', :cop1_l, 0b100001, 'fd, fs', :ft_zero
macro_addop_cop1_precision 'add', :cop1_ps, 0b000000, 'fd, fs, ft'
macro_addop_cop1_precision 'sub', :cop1_ps, 0b000001, 'fd, fs, ft'
macro_addop_cop1_precision 'mul', :cop1_ps, 0b000010, 'fd, fs, ft'
macro_addop_cop1_precision 'abs', :cop1_ps, 0b000101, 'fd, fs', :ft_zero
macro_addop_cop1_precision 'mov', :cop1_ps, 0b000110, 'fd, fs', :ft_zero
macro_addop_cop1_precision 'neg', :cop1_ps, 0b000111, 'fd, fs', :ft_zero
macro_addop_cop1_precision 'movz', :cop1_ps, 0b010010, 'fd, fs, ft'
macro_addop_cop1_precision 'movn', :cop1_ps, 0b010011, 'fd, fs, ft'
addop "movf.#{:cop1_ps_str}", :cop1_ps, 0b010001, 'fd, fs, cc', [ :cc, :fs, :fd ]
addop "movt.#{:cop1_ps_str}", :cop1_ps, 0b010001, 'fd, fs, cc', [ :cc, :fs, :fd ]
%w(f un eq ueq olt ult ole ule sf ngle seq ngl lt nge le ngt).each_with_index do |cond, index|
addop "c.#{cond}.ps", :cop1_cond, 0b110000+index, 'cc, fs, ft',
[ :ft, :fs, :cc ]
# TODO: COP1X
[ :special3, :bshfl, :cop1_l, :cop1_ps ].each \
{ |t| @@opcodes_by_class[t] = opcode_list.find_all { |o| o.type == t } }
end
end
# Reset all instructions
def reset
metaprops_allowed.clear
args_allowed.clear
props_allowed.clear
fields_spec.clear
opcode_list.clear
end
end
# Array containing all the supported opcodes
attr_reader :opcode_list
init_mips32
end
end
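
The comment on the `j` opcode earlier in this file describes how the 26-bit field becomes an absolute address. A small sketch of that arithmetic follows; `mips_jump_target` is a hypothetical name introduced here, and the sample address is an arbitrary illustration.

```ruby
# Hypothetical helper: compute the absolute target of a MIPS `j`/`jal`
# located at address pc, given the 26-bit index field of the instruction.
def mips_jump_target(pc, i26)
  region = (pc + 4) & 0xf000_0000      # top 4 bits of the delay-slot address
  region | ((i26 & 0x3ff_ffff) << 2)   # word index scaled to a byte address
end

printf("0x%08x\n", mips_jump_target(0x9fc0_0000, 0x3f0_0000))
```

Because only the low 28 bits come from the instruction, a `j` cannot leave the 256 MB region that contains its delay slot.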

View File

@ -1,40 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
require 'metasm/arm/opcodes'
require 'metasm/parse'
module Metasm
class ARM
def parse_arg_valid?(op, sym, arg)
# special case for lw reg, imm32(reg)? (pseudo-instruction: needs to be converted to 'lui t0, up imm32 ; ori t0, down imm32 ; add t0, reg ; lw reg, 0(t0)')
case sym
when :rs, :rt, :rd; arg.class == Reg
when :sa, :i16, :i26; arg.kind_of? Expression
when :rs_i16; arg.class == Memref
when :ft; arg.class == FpReg
end
end
def parse_argument(pgm)
if Reg.s_to_i[pgm.nexttok]
arg = Reg.new Reg.s_to_i[pgm.readtok]
elsif FpReg.s_to_i[pgm.nexttok]
arg = FpReg.new FpReg.s_to_i[pgm.readtok]
else
arg = Expression.parse pgm
if arg and pgm.nexttok == :'('
pgm.readtok
raise pgm, "Invalid base #{pgm.nexttok}" unless Reg.s_to_i[pgm.nexttok]
base = Reg.new Reg.s_to_i[pgm.readtok]
raise pgm, "Invalid memory reference, ')' expected" if pgm.readtok != :')'
arg = Memref.new base, arg
end
end
arg
end
end
end
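
`parse_argument` above accepts a register, an expression, or an expression followed by a parenthesised base register. The same three shapes can be sketched on plain strings. Everything named here (`REG_NUMBERS`, `SketchReg`, `SketchMemref`, `parse_operand`) is a hypothetical stand-in, and the sketch only accepts integer literals where the original parses a full Expression from a token stream.

```ruby
# Register names as built in the Reg class of this commit's ARM main file:
# r0..r15 (with or without a leading '$') plus the sp/lr/pc aliases.
REG_NUMBERS = { 'sp' => 13, 'lr' => 14, 'pc' => 15 }
(0..15).each { |i| REG_NUMBERS["r#{i}"] = REG_NUMBERS["$r#{i}"] = i }

# Hypothetical stand-ins for the Reg and Memref classes.
SketchReg    = Struct.new(:i)
SketchMemref = Struct.new(:base, :offset)

# Parse "reg", "imm" or "imm(reg)". Only decimal and 0x-prefixed integers
# are accepted as immediates.
def parse_operand(text)
  text = text.strip
  return SketchReg.new(REG_NUMBERS[text]) if REG_NUMBERS.key?(text)
  m = text.match(/\A(-?(?:0x\h+|\d+))(?:\((\$?\w+)\))?\z/)
  raise ArgumentError, "cannot parse operand #{text.inspect}" unless m
  offset = m[1].include?('0x') ? m[1].hex : m[1].to_i
  return offset unless m[2]
  base = REG_NUMBERS.fetch(m[2]) { raise ArgumentError, "Invalid base #{m[2]}" }
  SketchMemref.new(SketchReg.new(base), offset)
end

p parse_operand('r3')
p parse_operand('8(sp)')
```

Checking the register table before trying the immediate pattern mirrors the order of the original, so a name such as `sp` is never mistaken for a symbol in an expression.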

File diff suppressed because it is too large

File diff suppressed because it is too large

View File

@ -1,310 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
require 'metasm/main'
require 'metasm/parse'
module Metasm
class ExeFormat
# encodes an Array of source (Label/Data/Instruction etc) to an EncodedData
# resolves ambiguities using +assemble_resolve+
def assemble_sequence(seq, cpu)
# an array of edata or sub-array of ambiguous edata
# its last element is always an edata
ary = [EncodedData.new]
seq.each { |e|
case e
when Label; ary.last.add_export(e.name, ary.last.virtsize)
when Data; ary.last << e.encode(cpu.endianness)
when Align, Padding
e.fillwith = e.fillwith.encode(cpu.endianness) if e.fillwith and not e.fillwith.kind_of? EncodedData
ary << e << EncodedData.new
when Offset; ary << e << EncodedData.new
when Instruction
case i = cpu.encode_instruction(self, e)
when Array
if i.length == 1
ary.last << i.first
else
# ambiguity !
ary << i << EncodedData.new
end
else
ary.last << i
end
end
}
edata = (ary.length > 1) ? assemble_resolve(ary) : ary.shift
edata.fixup edata.binding
edata
end
# choose among multiple possible sub-EncodedData
# assumes all ambiguous edata have equivalent relocations in the same order
def assemble_resolve(ary)
startlabel = new_label('section_start')
# create two bindings where all elements are the shortest/longest possible
minbinding = {}
minoff = 0
maxbinding = {}
maxoff = 0
ary.each { |elem|
case elem
when Array
elem.each { |e|
e.export.each { |label, off|
minbinding[label] = Expression[startlabel, :+, minoff + off]
maxbinding[label] = Expression[startlabel, :+, maxoff + off]
}
}
minoff += elem.map { |e| e.virtsize }.min
maxoff += elem.map { |e| e.virtsize }.max
when EncodedData
elem.export.each { |label, off|
minbinding[label] = Expression[startlabel, :+, minoff + off]
maxbinding[label] = Expression[startlabel, :+, maxoff + off]
}
minoff += elem.virtsize
maxoff += elem.virtsize
when Align
minoff += 0
maxoff += elem.val - 1
when Padding
# find the surrounding Offsets and compute the largest/shortest edata sizes to determine min/max length for the padding
prevoff = ary[0..ary.index(elem)].grep(Offset).last
nextoff = ary[ary.index(elem)..-1].grep(Offset).first
raise elem, 'need .offset after .pad' if not nextoff
# find all elements between the surrounding Offsets
previdx = prevoff ? ary.index(prevoff) + 1 : 0
surround = ary[previdx..ary.index(nextoff)-1]
surround.delete elem
if surround.find { |nelem| nelem.kind_of? Padding }
raise elem, 'need .offset between two .pad'
end
if surround.find { |nelem| nelem.kind_of? Align and ary.index(nelem) > ary.index(elem) }
raise elem, 'cannot .align after a .pad' # XXX really ?
end
# lenmin/lenmax are the extreme lengths of the Padding
nxt = Expression[nextoff.val]
ext = nxt.externals
raise elem, "bad offset #{nxt}" if ext.length > 1 or (ext.length == 1 and not minbinding[ext.first])
nxt = Expression[nxt, :-, startlabel] if not nxt.bind(minbinding).reduce.kind_of? ::Integer
prv = Expression[prevoff ? prevoff.val : 0]
ext = prv.externals
raise elem, "bad offset #{prv}" if ext.length > 1 or (ext.length == 1 and not minbinding[ext.first])
prv = Expression[prv, :-, startlabel] if not prv.bind(minbinding).reduce.kind_of? ::Integer
lenmin = Expression[nxt.bind(minbinding), :-, prv.bind(maxbinding)].reduce
lenmax = Expression[nxt.bind(maxbinding), :-, prv.bind(minbinding)].reduce
raise elem, "bad labels: #{lenmin}" if not lenmin.kind_of? ::Integer or not lenmax.kind_of? ::Integer
surround.each { |nelem|
case nelem
when Array
lenmin -= nelem.map { |e| e.virtsize }.max
lenmax -= nelem.map { |e| e.virtsize }.min
when EncodedData
lenmin -= nelem.virtsize
lenmax -= nelem.virtsize
when Align
lenmin -= nelem.val - 1
lenmax -= 0
end
}
raise elem, "no room for .pad before '.offset #{nextoff.val}' at #{Backtrace.backtrace_str(nextoff.backtrace)}, need at least #{-lenmax} more bytes" if lenmax < 0
minoff += [lenmin, 0].max
maxoff += lenmax
when Offset
# nothing to do for now
else
raise "Internal error: bad object #{elem.inspect} in assemble_resolve"
end
}
# checks an expression linearity
check_linear = proc { |expr|
expr = expr.reduce if expr.kind_of? Expression
while expr.kind_of? Expression
case expr.op
when :*
if expr.lexpr.kind_of? Numeric; expr = expr.rexpr
elsif expr.rexpr.kind_of? Numeric; expr = expr.lexpr
else break
end
when :/, :>>, :<<
if expr.rexpr.kind_of? Numeric; expr = expr.lexpr
else break
end
when :+, :-
if not expr.lexpr; expr = expr.rexpr
elsif expr.lexpr.kind_of? Numeric; expr = expr.rexpr
elsif expr.rexpr.kind_of? Numeric; expr = expr.lexpr
else
break if not check_linear[expr.rexpr]
expr = expr.lexpr
end
else break
end
end
not expr.kind_of? Expression
}
# now we can resolve all relocations
# for linear expressions of internal variables (ie differences of labels from the ary):
# - calc target numeric bounds, and reject relocs not accepting worst case value
# - else reject all but largest place available
# then choose the shortest overall EData left
ary.map! { |elem|
case elem
when Array
# for each external, compute numeric target values using minbinding[external] and maxbinding[external]
# this gives us all extreme values for linear expressions
target_bounds = {}
rec_checkminmax = proc { |idx, target, binding, extlist|
if extlist.empty?
(target_bounds[idx] ||= []) << target.bind(binding).reduce
else
rec_checkminmax[idx, target, binding.merge(extlist.last => minbinding[extlist.last]), extlist[0...-1]]
rec_checkminmax[idx, target, binding.merge(extlist.last => maxbinding[extlist.last]), extlist[0...-1]]
end
}
# largest size available for this relocation (for non-linear/external)
wantsize = {}
elem.each { |e|
e.reloc.sort.each_with_index { |(o, r), i|
# has external ref
if not r.target.bind(minbinding).reduce.kind_of?(Numeric) or not check_linear[r.target]
# find the biggest relocation type for the current target
wantsize[i] = elem.map { |edata|
edata.reloc.sort[i][1].type
}.sort_by { |type| Expression::INT_SIZE[type] }.last # XXX do not use rel.length
else
rec_checkminmax[i, r.target, {}, r.target.externals]
end
}
}
# reject candidates with reloc type too small
acceptable = elem.find_all { |edata|
r = edata.reloc.sort
(0...r.length).all? { |i|
if wantsize[i]
r[i][1].type == wantsize[i]
else
target_bounds[i].all? { |b| Expression.in_range?(b, r[i][1].type) }
end
}
}
raise EncodeError, "cannot find candidate in #{elem.inspect}, immediate too big #{wantsize.inspect} #{target_bounds.inspect}" if acceptable.empty?
# keep the shortest
acceptable.sort_by { |edata| edata.virtsize }.first
else
elem
end
}
# assemble all parts, resolve padding sizes, check offset directives
edata = EncodedData.new
# fills edata with repetitions of data until targetsize
fillwith = proc { |targetsize, data|
if data
while edata.virtsize + data.virtsize <= targetsize
edata << data
end
if edata.virtsize < targetsize
edata << data[0, targetsize - edata.virtsize]
end
else
edata.virtsize = targetsize
end
}
ary.each { |elem|
case elem
when EncodedData
edata << elem
when Align
fillwith[EncodedData.align_size(edata.virtsize, elem.val), elem.fillwith]
when Offset
raise EncodeError, "could not enforce .offset #{elem.val} #{elem.backtrace}: offset now #{edata.virtsize}" if edata.virtsize != Expression[elem.val].bind(edata.binding(0)).reduce
when Padding
nextoff = ary[ary.index(elem)..-1].grep(Offset).first
targetsize = Expression[nextoff.val].bind(edata.binding(0)).reduce
ary[ary.index(elem)+1..ary.index(nextoff)-1].each { |nelem| targetsize -= nelem.virtsize }
raise EncodeError, "no room for .pad #{elem.backtrace_str} before .offset #{nextoff.val}, would be #{targetsize-edata.length} bytes long" if targetsize < edata.length
fillwith[targetsize, elem.fillwith]
else raise "Internal error: #{elem.inspect}"
end
}
edata
end
end
class Expression
def encode(type, endianness, backtrace=nil)
case val = reduce
when Integer; EncodedData.new Expression.encode_immediate(val, type, endianness, backtrace)
else EncodedData.new(0.chr*(INT_SIZE[type]/8), :reloc => {0 => Relocation.new(self, type, endianness, backtrace)})
end
end
def self.encode_immediate(val, type, endianness, backtrace=nil)
raise "unsupported endianness #{endianness.inspect}" unless [:big, :little].include? endianness
raise(EncodeError, "immediate overflow 0x#{val.to_s 16} #{(Backtrace::backtrace_str(backtrace) if backtrace)}") if not in_range?(val, type)
s = (0...INT_SIZE[type]/8).map { |i| (val >> (8*i)) & 0xff }.pack('C*')
endianness != :little ? s.reverse : s
end
end
class Data
def encode(endianness)
edata = case @data
when :uninitialized
EncodedData.new('', :virtsize => Expression::INT_SIZE[INT_TYPE[@type]]/8)
when String
# db 'foo' => 'foo' # XXX could be optimised, but should not be significant
# dw 'foo' => "f\0o\0o\0" / "\0f\0o\0o"
@data.unpack('C*').inject(EncodedData.new) { |ed, chr| ed << Expression.encode_immediate(chr, INT_TYPE[@type], endianness, @backtrace) }
when Expression
@data.encode INT_TYPE[@type], endianness
when Array
@data.inject(EncodedData.new) { |ed, d| ed << d.encode(endianness) }
end
# n times
(0...@count).inject(EncodedData.new) { |ed, cnt| ed << edata }
end
end
class CPU
# returns an EncodedData or an ary of them
# uses +#parse_arg_valid?+ to find the opcode whose signature matches with the instruction
# uses +encode_instr_op+ (arch-specific)
def encode_instruction(program, i)
oplist = opcode_list_byname[i.opname].to_a.find_all { |o|
o.args.length == i.args.length and
o.args.zip(i.args).all? { |f, a| parse_arg_valid?(o, f, a) }
}
raise EncodeError, "no matching opcode found for #{i}" if oplist.empty?
oplist.map { |op| encode_instr_op(program, i, op) }.flatten.each { |ed| ed.reloc.each_value { |v| v.backtrace = i.backtrace } }
end
end
end
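
For each ambiguous instruction, `assemble_resolve` above rejects the candidates whose relocation type cannot hold every possible target value, then keeps the shortest one left. The sketch below reduces that rule to plain integers. `RANGES`, `Candidate` and `pick_shortest` are hypothetical names, and the two x86 jump forms are only an illustration; the original works on EncodedData objects and derives the bounds from its min/max label bindings.

```ruby
# Signed ranges of two illustrative relocation types.
RANGES = { i8: -0x80..0x7f, i32: -0x8000_0000..0x7fff_ffff }

# Hypothetical candidate encoding: its size in bytes and its relocation type.
Candidate = Struct.new(:name, :size, :type)

# Keep the candidates whose type accepts every bound, then take the shortest.
def pick_shortest(candidates, target_bounds)
  acceptable = candidates.select { |c|
    target_bounds.all? { |b| RANGES.fetch(c.type).cover?(b) }
  }
  raise ArgumentError, 'immediate too big for every candidate' if acceptable.empty?
  acceptable.min_by(&:size)
end

jumps = [Candidate.new('jmp short', 2, :i8), Candidate.new('jmp near', 5, :i32)]
puts pick_shortest(jumps, [10, 100]).name    # both bounds fit in a signed byte
puts pick_shortest(jumps, [100, 300]).name   # worst case 300 needs 32 bits
```

Testing against the worst-case bound rather than the final value is what lets the choice be made before label addresses are known, since shrinking one instruction can only move the real value inside the computed bounds.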

View File

@ -1,287 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
require 'metasm/exe_format/main'
require 'metasm/encode'
require 'metasm/decode'
module Metasm
class AOut < ExeFormat
MAGIC = { 0407 => 'OMAGIC', 0410 => 'NMAGIC', 0413 => 'ZMAGIC',
0314 => 'QMAGIC', 0421 => 'CMAGIC'
}
MACHINE_TYPE = { 0 => 'OLDSUN2', 1 => '68010', 2 => '68020',
3 => 'SPARC', 100 => 'PC386', 134 => 'I386', 135 => 'M68K',
136 => 'M68K4K', 137 => 'NS32532', 138 => 'SPARC',
139 => 'PMAX', 140 => 'VAX', 141 => 'ALPHA', 142 => 'MIPS',
143 => 'ARM6', 151 => 'MIPS1', 152 => 'MIPS2', 300 => 'HP300',
0x20B => 'HPUX800', 0x20C => 'HPUX'
}
FLAGS = { 0x10 => 'PIC', 0x20 => 'DYNAMIC' }
SYMBOL_TYPE = { 0 => 'UNDF', 1 => 'ABS', 2 => 'TEXT',
3 => 'DATA', 4 => 'BSS', 5 => 'INDR', 6 => 'SIZE',
9 => 'COMM', 10=> 'SETA', 11=> 'SETT', 12=> 'SETD',
13=> 'SETB', 14=> 'SETV', 15=> 'FN'
}
attr_accessor :endianness, :header, :text, :data, :symbols, :textrel, :datarel
class Header
attr_accessor :magic, :machtype, :flags
attr_accessor :text, :data, :bss, :syms, :entry, :trsz, :drsz
def set_info(aout, info)
@magic = aout.int_to_hash(info & 0xffff, MAGIC)
@machtype = aout.int_to_hash((info >> 16) & 0xff, MACHINE_TYPE)
@flags = aout.bits_to_hash((info >> 24) & 0xff, FLAGS)
end
def get_info(aout)
(aout.int_from_hash(@magic, MAGIC) & 0xffff) |
((aout.int_from_hash(@machtype, MACHINE_TYPE) & 0xff) << 16) |
((aout.bits_from_hash(@flags, FLAGS) & 0xff) << 24)
end
def decode(aout)
set_info(aout, aout.decode_word)
case @magic
when 'OMAGIC', 'NMAGIC', 'ZMAGIC', 'QMAGIC'
else raise InvalidExeFormat
end
@text = aout.decode_word
@data = aout.decode_word
@bss = aout.decode_word
@syms = aout.decode_word
@entry= aout.decode_word
@trsz = aout.decode_word
@drsz = aout.decode_word
end
def encode(aout)
set_default_values aout
EncodedData.new <<
aout.encode_word(get_info(aout)) <<
aout.encode_word(@text) <<
aout.encode_word(@data) <<
aout.encode_word(@bss ) <<
aout.encode_word(@syms) <<
aout.encode_word(@entry)<<
aout.encode_word(@trsz) <<
aout.encode_word(@drsz)
end
def set_default_values(aout)
@magic ||= 'QMAGIC'
@machtype ||= 'PC386'
@flags ||= 0
@text ||= aout.text ? aout.text.length + (@magic == 'QMAGIC' ? 32 : 0) : 0
@data ||= aout.data ? aout.data.length : 0
@bss ||= 0
@syms ||= 0
@entry||= 0
@trsz ||= 0
@drsz ||= 0
end
end
class Relocation
attr_accessor :address, :symbolnum, :pcrel, :length, :extern,
:baserel, :jmptable, :relative, :rtcopy
def get_info(aout)
(@symbolnum & 0xffffff) |
((@pcrel ? 1 : 0) << 24) |
(({1=>0, 2=>1, 4=>2, 8=>3}[@length] || 0) << 25) |
((@extern ? 1 : 0) << 27) |
((@baserel ? 1 : 0) << 28) |
((@jmptable ? 1 : 0) << 29) |
((@relative ? 1 : 0) << 30) |
((@rtcopy ? 1 : 0) << 31)
end
def set_info(aout, info)
@symbolnum = info & 0xffffff
@pcrel = (info[24] == 1)
@length = 1 << ((info >> 25) & 3)
@extern = (info[27] == 1)
@baserel = (info[28] == 1)
@jmptable = (info[29] == 1)
@relative = (info[30] == 1)
@rtcopy = (info[31] == 1)
end
def encode(aout)
EncodedData.new <<
aout.encode_word(@address) <<
aout.encode_word(get_info(aout))
end
def decode(aout)
@address = aout.decode_word
set_info(aout, aout.decode_word)
end
def set_default_values(aout)
@address ||= 0
@length ||= 4
end
end
class Symbol
attr_accessor :name_p, :type, :extern, :stab, :other, :desc, :value
attr_accessor :name
def get_type(aout)
(extern ? 1 : 0) |
((aout.int_from_hash(@type, SYMBOL_TYPE) & 0xf) << 1) |
((@stab & 7) << 5)
end
def set_type(aout, type)
@extern = (type[0] == 1)
@type = aout.int_to_hash((type >> 1) & 0xf, SYMBOL_TYPE)
@stab = (type >> 5) & 7
end
def decode(aout, strings=nil)
@name_p = aout.decode_word
set_type(aout, aout.decode_byte)
@other = aout.decode_byte
@desc = aout.decode_half
@value = aout.decode_word
if strings
@name = strings[@name_p...(strings.index(0, @name_p))]
end
end
def encode(aout, strings=nil)
set_default_values aout, strings
EncodedData.new <<
aout.encode_word(@name_p) <<
aout.encode_byte(get_type(aout)) <<
aout.encode_byte(@other) <<
aout.encode_half(@desc) <<
aout.encode_word(@value)
end
def set_default_values(aout, strings=nil)
if strings and @name and @name != ''
if not @name_p or strings[@name_p, @name.length] != @name
@name_p = strings.length
strings << @name << 0
end
else
@name_p ||= 0
end
@type ||= 0
@stab ||= 0
@other ||= 0
@desc ||= 0
@value ||= 0
end
end
def decode_byte(edata = @encoded) edata.decode_imm(:u8 , @endianness) end
def decode_half(edata = @encoded) edata.decode_imm(:u16, @endianness) end
def decode_word(edata = @encoded) edata.decode_imm(:u32, @endianness) end
def encode_byte(w) Expression[w].encode(:u8 , @endianness) end
def encode_half(w) Expression[w].encode(:u16, @endianness) end
def encode_word(w) Expression[w].encode(:u32, @endianness) end
def initialize(cpu = nil)
@endianness = cpu ? cpu.endianness : :little
@header = Header.new
@text = EncodedData.new
@data = EncodedData.new
super
end
def decode_header
@encoded.ptr = 0
@header.decode(self)
end
def decode
decode_header
tlen = @header.text
case @header.magic
when 'ZMAGIC'; @encoded.ptr = 1024
when 'QMAGIC'; tlen -= 32 # header is included in .text
end
@text = EncodedData.new << @encoded.read(tlen)
@data = EncodedData.new << @encoded.read(@header.data)
textrel = @encoded.read @header.trsz
datarel = @encoded.read @header.drsz
syms = @encoded.read @header.syms
strings = @encoded.read
end
def encode
# non mmapable on linux anyway
# could support OMAGIC..
raise EncodeError, 'cannot encode non-QMAGIC a.out' if @header.magic and @header.magic != 'QMAGIC'
# data must be 4096-aligned
# 32 bytes of header included in .text
@text.virtsize = (@text.virtsize + 32 + 4096 - 1) / 4096 * 4096 - 32
if @data.rawsize % 4096 != 0
@data[(@data.rawsize + 4096 - 1) / 4096 * 4096 - 1] = 0
end
@header.text = @text.length+32
@header.data = @data.rawsize
@header.bss = @data.virtsize - @data.rawsize
@encoded = EncodedData.new
@encoded << @header.encode(self)
binding = @text.binding(4096+32).merge @data.binding(4096 + @header.text)
@encoded << @text << @data
@encoded.fixup! binding
@encoded.data
end
def parse_init
@textsrc ||= []
@datasrc ||= []
@cursource ||= @textsrc
super
end
def parse_parser_instruction(instr)
case instr.raw.downcase
when '.text'; @cursource = @textsrc
when '.data'; @cursource = @datasrc
when '.entrypoint'
# ".entrypoint <somelabel/expression>" or ".entrypoint" (here)
@lexer.skip_space
if tok = @lexer.nexttok and tok.type == :string
raise instr if not entrypoint = Expression.parse(@lexer)
else
entrypoint = new_label('entrypoint')
@cursource << Label.new(entrypoint, instr.backtrace.dup)
end
@header.entry = entrypoint
else super
end
end
def assemble
@text << assemble_sequence(@textsrc, @cpu)
@textsrc.clear
@data << assemble_sequence(@datasrc, @cpu)
@datasrc.clear
self
end
def each_section
tva = 0
tva = 4096+32 if @header.magic == 'QMAGIC'
yield @text, tva
yield @data, tva + @text.virtsize
end
end
end
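
The a.out header above packs three fields into its first 32-bit word: the magic in the low 16 bits, the machine type in bits 16-23, and the flags in bits 24-31. The standalone sketch below mirrors the shifts and masks of `Header#get_info` and `Header#set_info` on raw integers. The helper names `pack_aout_info` and `unpack_aout_info` are hypothetical and not part of Metasm, and the sketch skips the name lookups in `MAGIC`, `MACHINE_TYPE` and `FLAGS`.

```ruby
# Sketch only: the field layout is taken from Header#get_info / #set_info,
# the helper names are made up for illustration.

# Combine magic (16 bits), machine type (8 bits) and flags (8 bits)
# into the first header word.
def pack_aout_info(magic, machtype, flags)
  (magic & 0xffff) | ((machtype & 0xff) << 16) | ((flags & 0xff) << 24)
end

# Split the first header word back into its three numeric fields.
def unpack_aout_info(info)
  {
    magic:    info & 0xffff,
    machtype: (info >> 16) & 0xff,
    flags:    (info >> 24) & 0xff
  }
end

info = pack_aout_info(0314, 100, 0x10) # QMAGIC, PC386, PIC
puts format('0x%08x', info)
p unpack_aout_info(info)
```

Working on plain integers keeps the round trip easy to check; the real class additionally maps each number to its symbolic name.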

View File

@ -1,38 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
require 'metasm/exe_format/main'
module Metasm
# special class that decodes a PE or ELF file from its signature
# does not support other exeformats (for now)
class AutoExe < ExeFormat
class UnknownSignature < InvalidExeFormat ; end
def self.load(str, *a)
s = str
s = str.data if s.kind_of? EncodedData
execlass_from_signature(s).load(str, *a)
end
def self.execlass_from_signature(raw)
if raw[0, 4] == "\x7fELF"; ELF
elsif off = raw[0x3c, 4].to_s.unpack('V').first and raw[off, 4] == "PE\0\0"; PE
else raise UnknownSignature, 'unrecognized executable file format'
end
end
def self.orshellcode(cpu=Ia32.new)
# create an anonymous subclass of AutoExe whose execlass_from_signature returns a Shellcode (instead of raising) when no signature is recognized
c = Class.new(self)
# yeeehaa
class << c ; self ; end.send(:define_method, :execlass_from_signature) { |raw|
begin
super
rescue UnknownSignature
Shellcode.withcpu(cpu)
end
}
c
end
end
end
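
The signature test in `execlass_from_signature` can be tried without Metasm. The sketch below applies the same two checks, the ELF magic at offset 0 and the `PE\0\0` marker at the offset stored at 0x3c, and returns a symbol instead of a class. The function name `exe_kind_from_signature` and the `:unknown` fallback are assumptions made for this illustration, and the sketch explicitly guards against inputs too short to hold the 4-byte offset.

```ruby
# Sketch only: the function name and return values are hypothetical;
# the checks follow AutoExe.execlass_from_signature.
def exe_kind_from_signature(raw)
  raw = raw.b                                # compare bytes, not characters
  return :elf if raw[0, 4] == "\x7fELF".b

  lfanew = raw[0x3c, 4]                      # little-endian offset of the PE header
  if lfanew && lfanew.bytesize == 4
    off = lfanew.unpack1('V')
    return :pe if raw[off, 4] == "PE\0\0".b
  end

  :unknown
end

# Minimal fake PE image: DOS stub padded to 0x3c, offset 0x40, then the marker.
fake_pe = "MZ".ljust(0x3c, "\0") + [0x40].pack('V') + "PE\0\0"
p exe_kind_from_signature(fake_pe)
p exe_kind_from_signature("\x7fELF\x01\x01\x01")
p exe_kind_from_signature("\x90\x90\xcc")
```

Returning `:unknown` rather than raising is a simplification; it plays the role that the `Shellcode` fallback plays in `orshellcode`.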

View File

@ -1,344 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
require 'metasm/exe_format/main'
module Metasm
# the COFF object file format
# mostly used on windows (PE/COFF)
class COFF < ExeFormat
CHARACTERISTIC_BITS = {
0x0001 => 'RELOCS_STRIPPED', 0x0002 => 'EXECUTABLE_IMAGE',
0x0004 => 'LINE_NUMS_STRIPPED', 0x0008 => 'LOCAL_SYMS_STRIPPED',
0x0010 => 'AGGRESSIVE_WS_TRIM', 0x0020 => 'LARGE_ADDRESS_AWARE',
0x0040 => 'x16BIT_MACHINE', 0x0080 => 'BYTES_REVERSED_LO',
0x0100 => 'x32BIT_MACHINE', 0x0200 => 'DEBUG_STRIPPED',
0x0400 => 'REMOVABLE_RUN_FROM_SWAP', 0x0800 => 'NET_RUN_FROM_SWAP',
0x1000 => 'SYSTEM', 0x2000 => 'DLL',
0x4000 => 'UP_SYSTEM_ONLY', 0x8000 => 'BYTES_REVERSED_HI'
}
MACHINE = {
0x0 => 'UNKNOWN', 0x184 => 'ALPHA', 0x1c0 => 'ARM',
0x1d3 => 'AM33', 0x8664=> 'AMD64', 0xebc => 'EBC',
0x9041=> 'M32R', 0x1f1 => 'POWERPCFP',
0x284 => 'ALPHA64', 0x14c => 'I386', 0x200 => 'IA64',
0x268 => 'M68K', 0x266 => 'MIPS16', 0x366 => 'MIPSFPU',
0x466 => 'MIPSFPU16', 0x1f0 => 'POWERPC', 0x162 => 'R3000',
0x166 => 'R4000', 0x168 => 'R10000', 0x1a2 => 'SH3',
0x1a3 => 'SH3DSP', 0x1a6 => 'SH4', 0x1a8 => 'SH5',
0x1c2 => 'THUMB', 0x169 => 'WCEMIPSV2'
}
# PE+ is for 64bits address spaces
SIGNATURE = { 0x10b => 'PE', 0x20b => 'PE+', 0x107 => 'ROM' }
SUBSYSTEM = {
0 => 'UNKNOWN', 1 => 'NATIVE', 2 => 'WINDOWS_GUI',
3 => 'WINDOWS_CUI', 5 => 'OS/2_CUI', 7 => 'POSIX_CUI',
8 => 'WIN9X_DRIVER', 9 => 'WINDOWS_CE_GUI',
10 => 'EFI_APPLICATION',
11 => 'EFI_BOOT_SERVICE_DRIVER', 12 => 'EFI_RUNTIME_DRIVER',
13 => 'EFI_ROM', 14 => 'XBOX'
}
DLL_CHARACTERISTIC_BITS = {
0x40 => 'DYNAMIC_BASE', 0x80 => 'FORCE_INTEGRITY', 0x100 => 'NX_COMPAT',
0x200 => 'NO_ISOLATION', 0x400 => 'NO_SEH', 0x800 => 'NO_BIND',
0x2000 => 'WDM_DRIVER', 0x8000 => 'TERMINAL_SERVER_AWARE'
}
BASE_RELOCATION_TYPE = { 0 => 'ABSOLUTE', 1 => 'HIGH', 2 => 'LOW', 3 => 'HIGHLOW',
4 => 'HIGHADJ', 5 => 'MIPS_JMPADDR', 9 => 'MIPS_JMPADDR16', 10 => 'DIR64'
}
RELOCATION_TYPE = Hash.new({}).merge(
'x64' => { 0 => 'ABSOLUTE', 1 => 'ADDR64', 2 => 'ADDR32', 3 => 'ADDR32NB',
4 => 'REL32', 5 => 'REL32_1', 6 => 'REL32_2', 7 => 'REL32_3',
8 => 'REL32_4', 9 => 'REL32_5', 10 => 'SECTION', 11 => 'SECREL',
12 => 'SECREL7', 13 => 'TOKEN', 14 => 'SREL32', 15 => 'PAIR',
16 => 'SSPAN32' },
'arm' => { 0 => 'ABSOLUTE', 1 => 'ADDR32', 2 => 'ADDR32NB', 3 => 'BRANCH24',
4 => 'BRANCH11', 14 => 'SECTION', 15 => 'SECREL' },
'I386' => { 0 => 'ABSOLUTE', 1 => 'DIR16', 2 => 'REL16', 6 => 'DIR32',
7 => 'DIR32NB', 9 => 'SEG12', 10 => 'SECTION', 11 => 'SECREL',
12 => 'TOKEN', 13 => 'SECREL7', 20 => 'REL32' }
)
# lsb of symbol type, unused
SYMBOL_TYPE = { 0 => 'NULL', 1 => 'VOID', 2 => 'CHAR', 3 => 'SHORT',
4 => 'INT', 5 => 'LONG', 6 => 'FLOAT', 7 => 'DOUBLE', 8 => 'STRUCT',
9 => 'UNION', 10 => 'ENUM', 11 => 'MOE', 12 => 'BYTE', 13 => 'WORD',
14 => 'UINT', 15 => 'DWORD'}
# msb of symbol type, only 0x20 used
SYMBOL_DTYPE = { 0 => 'NULL', 1 => 'POINTER', 2 => 'FUNCTION', 3 => 'ARRAY' }
DEBUG_TYPE = { 0 => 'UNKNOWN', 1 => 'COFF', 2 => 'CODEVIEW', 3 => 'FPO', 4 => 'MISC',
5 => 'EXCEPTION', 6 => 'FIXUP', 7 => 'OMAP_TO_SRC', 8 => 'OMAP_FROM_SRC',
9 => 'BORLAND', 10 => 'RESERVED10', 11 => 'CLSID' }
DIRECTORIES = %w[export_table import_table resource_table exception_table certificate_table
base_relocation_table debug architecture global_ptr tls_table load_config
bound_import iat delay_import com_runtime reserved]
SECTION_CHARACTERISTIC_BITS = {
0x20 => 'CONTAINS_CODE', 0x40 => 'CONTAINS_DATA', 0x80 => 'CONTAINS_UDATA',
0x100 => 'LNK_OTHER', 0x200 => 'LNK_INFO', 0x800 => 'LNK_REMOVE',
0x1000 => 'LNK_COMDAT', 0x8000 => 'GPREL',
0x20000 => 'MEM_PURGEABLE|16BIT', 0x40000 => 'MEM_LOCKED', 0x80000 => 'MEM_PRELOAD',
0x100000 => 'ALIGN_1BYTES', 0x200000 => 'ALIGN_2BYTES',
0x300000 => 'ALIGN_4BYTES', 0x400000 => 'ALIGN_8BYTES',
0x500000 => 'ALIGN_16BYTES', 0x600000 => 'ALIGN_32BYTES',
0x700000 => 'ALIGN_64BYTES', 0x800000 => 'ALIGN_128BYTES',
0x900000 => 'ALIGN_256BYTES', 0xA00000 => 'ALIGN_512BYTES',
0xB00000 => 'ALIGN_1024BYTES', 0xC00000 => 'ALIGN_2048BYTES',
0xD00000 => 'ALIGN_4096BYTES', 0xE00000 => 'ALIGN_8192BYTES',
0x01000000 => 'LNK_NRELOC_OVFL', 0x02000000 => 'MEM_DISCARDABLE',
0x04000000 => 'MEM_NOT_CACHED', 0x08000000 => 'MEM_NOT_PAGED',
0x10000000 => 'MEM_SHARED', 0x20000000 => 'MEM_EXECUTE',
0x40000000 => 'MEM_READ', 0x80000000 => 'MEM_WRITE'
}
# NRELOC_OVFL means there are more than 0xffff reloc
# the reloc count must be set to 0xffff, and the real reloc count
# is the VA of the first relocation
ORDINAL_REGEX = /^Ordinal_(\d+)$/
class Header
attr_accessor :machine, :num_sect, :time, :ptr_sym, :num_sym, :size_opthdr, :characteristics
end
# present in linked files (exe/dll/kmod)
class OptionalHeader
attr_accessor :signature, :link_ver_maj, :link_ver_min, :code_size, :idata_size, :udata_size, :entrypoint, :base_of_code,
:base_of_data, # not in PE+
# NT-specific fields
:image_base, :sect_align, :file_align, :os_ver_maj, :os_ver_min, :img_ver_maj, :img_ver_min, :subsys_maj, :subsys_min, :reserved,
:image_size, :headers_size, :checksum, :subsystem, :dll_characts, :stack_reserve, :stack_commit, :heap_reserve, :heap_commit, :ldrflags, :numrva
end
# contains the name of dynamic libraries required by the program, and the function to import from them
class ImportDirectory
attr_accessor :libname, :timestamp, :firstforwarder, :libname_p
attr_accessor :imports, :iat, :iat_p, :ilt_p
class Import
attr_accessor :ordinal, :hint, :hintname_p, :name, :target, :thunk
end
end
# lists the functions/addresses exported to the OS (counterpart of ImportDirectory)
class ExportDirectory
attr_accessor :reserved, :timestamp, :ver_maj, :ver_min, :libname, :ordinal_base, :libname_p
attr_accessor :exports
class Export
attr_accessor :forwarder_lib, :forwarder_ordinal, :forwarder_name, :target, :name_p, :name, :ordinal
end
end
# array of relocations to apply to an executable file when it is loaded at an address that is not its preferred_base_address
class RelocationTable
attr_accessor :base_addr
attr_accessor :relocs
class Relocation
attr_accessor :offset, :type
end
end
# section table information, + raw section content (EncodedData)
class Section
attr_accessor :name, :virtsize, :virtaddr, :rawsize, :rawaddr, :relocaddr, :linenoaddr, :relocnr, :linenonr, :characteristics
attr_accessor :encoded
end
# the 'load configuration' directory
class LoadConfig
attr_accessor :signature, :timestamp, :major_version, :minor_version, :globalflags, :critsec_timeout,
:decommitblock, :decommittotal, :lockpfxtable, :maxalloc, :maxvirtmem, :process_affinity_mask, :process_heap_flags,
:servicepackid, :reserved, :editlist,
:security_cookie, :sehtable_p, :sehcount
attr_accessor :safeseh
end
class TLSDirectory
attr_accessor :start_va, :end_va, :index_addr, :callback_p, :zerofill_sz, :characteristics, :callbacks
end
# tree-like structure, holds all misc data the program might need (icons, cursors, version information)
# conventionally structured in a 3-level depth structure:
# I resource type (icon/cursor/etc, see +TYPES+)
# II resource id (icon n1, icon 'toto', ...)
# III language-specific version (icon n1 en, icon n1 en-dvorak...)
# for the icon, the one that appears in the explorer is
# (NT) the one with the lowest ID
# (98) the first to appear in the table
class ResourceDirectory
attr_accessor :characteristics, :timestamp, :major_version, :minor_version
attr_accessor :entries
attr_accessor :curoff_label # internal use, in encoder
class Entry
attr_accessor :name_p, :name, :name_w,
:id, :subdir_p, :subdir, :dataentry_p,
:data_p, :data, :codepage, :reserved
end
def to_hash(depth=0)
map = case depth
when 0; TYPE
when 1; {} # resource-id
when 2; {} # lang
else {}
end
@entries.inject({}) { |h, e|
k = e.id ? map.fetch(e.id, e.id) : e.name ? e.name : e.name_w
v = e.subdir ? e.subdir.to_hash(depth+1) : e.data
h.update k => v
}
end
def self.from_hash(h, depth=0)
map = case depth
when 0; TYPE
when 1; {} # resource-id
when 2; {} # lang
else {}
end
ret = new
ret.entries = h.map { |k, v|
e = Entry.new
k.kind_of?(Integer) ? (e.id = k) : map.index(k) ? (e.id = map.index(k)) : (e.name = k) # name_w ?
v.kind_of?(Hash) ? (e.subdir = from_hash(v, depth+1)) : (e.data = v)
e
}
ret
end
# returns a string with the to_hash key tree
def to_s
to_s_a(0).join("\n")
end
def to_s_a(depth)
@entries.map { |e|
ar = []
ar << if e.id
if depth == 0 and TYPE.has_key?(e.id); "#{e.id.to_s} (#{TYPE[e.id]})".ljust(18)
else e.id.to_s.ljust(5)
end
else (e.name || e.name_w).inspect
end
if e.subdir
sa = e.subdir.to_s_a(depth+1)
if sa.length == 1
ar.last << " | #{sa.first}"
else
ar << sa.map { |s| ' ' + s }
end
elsif e.data.length > 16
ar.last << " #{e.data[0, 8].inspect}... <#{e.data.length} bytes>"
else
ar.last << ' ' << e.data.inspect
end
ar
}.flatten
end
TYPE = {
1 => 'CURSOR', 2 => 'BITMAP', 3 => 'ICON', 4 => 'MENU',
5 => 'DIALOG', 6 => 'STRING', 7 => 'FONTDIR', 8 => 'FONT',
9 => 'ACCELERATOR', 10 => 'RCADATA', 11 => 'MESSAGETABLE',
12 => 'GROUP_CURSOR', 14 => 'GROUP_ICON', 16 => 'VERSION',
17 => 'DLGINCLUDE', 19 => 'PLUGPLAY', 20 => 'VXD',
21 => 'ANICURSOR', 22 => 'ANIICON', 23 => 'HTML',
24 => 'MANIFEST'
}
ACCELERATOR_BITS = {
1 => 'VIRTKEY', 2 => 'NOINVERT', 4 => 'SHIFT', 8 => 'CTRL',
16 => 'ALT', 128 => 'LAST'
}
# cursor = raw data, cursor_group = header; same for icons
class Cursor
attr_accessor :xhotspot, :yhotspot, :data
end
end
attr_accessor :header, :optheader, :directory, :sections, :endianness, :export, :imports,
:relocations, :resource, :certificates, :delayimports, :loadconfig, :tls
def initialize(cpu=nil)
@directory = {} # DIRECTORIES.key => [rva, size]
@sections = []
@export = @imports = @relocations = @resource = @certificates = @delayimports = nil
@endianness = cpu ? cpu.endianness : :little
@header = Header.new
@optheader = OptionalHeader.new
@header.machine = case cpu
when nil; 'UNKNOWN'
when Ia32; 'I386'
else 'UNKNOWN'
end
super(cpu)
end
end
# the COFF archive file format
# may be used in .lib files (they hold binary import information for libraries)
class COFFArchive < ExeFormat
class Member
attr_accessor :name, :date, :uid, :gid, :mode, :size, :eoh
attr_accessor :offset
end
class ImportHeader
attr_accessor :sig1, :sig2, :version, :machine, :timestamp, :size_of_data, :hint, :type, :name_type, :reserved
attr_accessor :symname, :libname
end
attr_accessor :members, :signature, :first_linker, :second_linker
end
end
__END__
class Symbols
attr_reader :name, :value, :sectionnumber, :type, :storageclass, :nbaux, :aux
# name: if the first 4 bytes are null, the 4 next are the index to the name in the string table
def initialize(raw, offset)
@name = raw[offset..offset+7].delete("\0")
@value = bin(raw[offset+8 ..offset+11])
@sectionnumber = bin(raw[offset+12..offset+13])
@type = bin(raw[offset+14..offset+15])
@storageclass = raw[offset+16]
@nbaux = raw[offset+17]
@aux = Array.new
@nbaux.times { @aux << raw[offset..offset+17] ; offset += 18 }
end
end
class Strings < Array
attr_reader :size
def initialize(raw, offset)
@size = bin(raw[offset..offset+3])
endoffset = offset + @size
puts "String table: 0x%.8x .. 0x%.8x" % [offset, endoffset]
curstring = ''
while (offset < endoffset)
if raw[offset] != 0
curstring << raw[offset]
else
self << curstring
curstring = ''
end
offset += 1
end
end
end
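
The comment in `Symbols` notes that a COFF symbol whose first 4 name bytes are null stores an index into the string table in the next 4 bytes. A minimal sketch of that lookup follows. It assumes, as in the PE/COFF format as I understand it, that the index is counted from the start of the string table, whose first 4 bytes hold the table size. The helper names `cstring_at` and `coff_symbol_name` are hypothetical and do not appear in Metasm.

```ruby
# Sketch only: helper names are made up; offsets are assumed to be relative
# to the start of the string table, including its 4-byte size prefix.

# Return the NUL-terminated byte string starting at +offset+, or nil if
# the offset lies outside +bytes+.
def cstring_at(bytes, offset)
  bytes = bytes.b
  return nil if offset > bytes.bytesize
  stop = bytes.index("\0", offset) || bytes.bytesize
  bytes.byteslice(offset, stop - offset)
end

# Resolve a symbol name from its raw 8-byte name field.
def coff_symbol_name(name_field, string_table)
  zeroes, offset = name_field.unpack('VV')
  if zeroes.zero?
    cstring_at(string_table, offset)   # long name: stored in the string table
  else
    cstring_at(name_field, 0)          # short name: stored inline, NUL padded
  end
end

long_name = "a_very_long_symbol_name\0"
table = [4 + long_name.bytesize].pack('V') + long_name
p coff_symbol_name([0, 4].pack('VV'), table)
p coff_symbol_name(".text\0\0\0", table)
```

An inline name that uses all 8 bytes has no terminator, which is why `cstring_at` falls back to the end of the buffer when no NUL byte is found.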

View File

@ -1,753 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
require 'metasm/exe_format/coff'
require 'metasm/decode'
module Metasm
class COFF
class Header
# decodes a COFF header from coff.cursection
def decode(coff)
@machine = coff.int_to_hash(coff.decode_half, MACHINE)
@num_sect = coff.decode_half
@time = coff.decode_word
@ptr_sym = coff.decode_word
@num_sym = coff.decode_word
@size_opthdr = coff.decode_half
@characteristics = coff.bits_to_hash(coff.decode_half, CHARACTERISTIC_BITS)
end
end
class OptionalHeader
# decodes a COFF optional header from coff.cursection
# also decodes directories in coff.directory
def decode(coff)
@signature = coff.int_to_hash(coff.decode_half, SIGNATURE)
@link_ver_maj = coff.decode_uchar
@link_ver_min = coff.decode_uchar
@code_size = coff.decode_word
@idata_size = coff.decode_word
@udata_size = coff.decode_word
@entrypoint = coff.decode_word
@base_of_code = coff.decode_word
@base_of_data = coff.decode_word if @signature != 'PE+'
@image_base = coff.decode_xword
@sect_align = coff.decode_word
@file_align = coff.decode_word
@os_ver_maj = coff.decode_half
@os_ver_min = coff.decode_half
@img_ver_maj= coff.decode_half
@img_ver_min= coff.decode_half
@subsys_maj = coff.decode_half
@subsys_min = coff.decode_half
@reserved = coff.decode_word
@image_size = coff.decode_word
@headers_size = coff.decode_word
@checksum = coff.decode_word
@subsystem = coff.int_to_hash(coff.decode_half, SUBSYSTEM)
@dll_characts = coff.bits_to_hash(coff.decode_half, DLL_CHARACTERISTIC_BITS)
@stack_reserve = coff.decode_xword
@stack_commit = coff.decode_xword
@heap_reserve = coff.decode_xword
@heap_commit = coff.decode_xword
@ldrflags = coff.decode_word
@numrva = coff.decode_word
nrva = @numrva
if @numrva > DIRECTORIES.length
puts "W: COFF: Invalid directories count #{@numrva}" if $VERBOSE
nrva = DIRECTORIES.length
end
coff.directory = {}
DIRECTORIES[0, nrva].each { |dir|
rva = coff.decode_word
sz = coff.decode_word
if rva != 0 or sz != 0
coff.directory[dir] = [rva, sz]
end
}
end
end
class Section
# decodes a COFF section header from coff.cursection
def decode(coff)
@name = coff.cursection.encoded.read(8)
@name = @name[0, @name.index(0)] if @name.index(0)
@virtsize = coff.decode_word
@virtaddr = coff.decode_word
@rawsize = coff.decode_word
@rawaddr = coff.decode_word
@relocaddr = coff.decode_word
@linenoaddr = coff.decode_word
@relocnr = coff.decode_half
@linenonr = coff.decode_half
@characteristics = coff.bits_to_hash(coff.decode_word, SECTION_CHARACTERISTIC_BITS)
end
end
class ExportDirectory
# decodes a COFF export table from coff.cursection
def decode(coff)
@reserved = coff.decode_word
@timestamp = coff.decode_word
@ver_maj = coff.decode_half
@ver_min = coff.decode_half
@libname_p = coff.decode_word
@ordinal_base = coff.decode_word
num_exports = coff.decode_word
num_names = coff.decode_word
func_p = coff.decode_word
names_p = coff.decode_word
ord_p = coff.decode_word
if coff.sect_at_rva(@libname_p)
@libname = coff.decode_strz
end
if coff.sect_at_rva(func_p)
@exports = []
addrs = []
num_exports.times { |i| addrs << coff.decode_word }
num_exports.times { |i|
e = Export.new
e.ordinal = i + @ordinal_base
addr = addrs[i]
if addr >= coff.directory['export_table'][0] and addr < coff.directory['export_table'][0] + coff.directory['export_table'][1] and coff.sect_at_rva(addr)
name = coff.decode_strz
e.forwarder_lib, name = name.split('.', 2)
if name[0] == ?#
e.forwarder_ordinal = name[1..-1].to_i
else
e.forwarder_name = name
end
else
e.target = addr
end
@exports << e
}
end
if coff.sect_at_rva(names_p)
namep = []
num_names.times { namep << coff.decode_word }
end
if coff.sect_at_rva(ord_p)
ords = []
num_names.times { ords << coff.decode_half }
end
if namep and ords
namep.zip(ords).each { |np, oi|
@exports[oi].name_p = np
if coff.sect_at_rva(np)
@exports[oi].name = coff.decode_strz
end
}
end
end
end
class ImportDirectory
# decodes all COFF import directories from coff.cursection
def self.decode(coff)
ret = []
loop do
idata = new
idata.decode_header(coff)
break if [idata.ilt_p, idata.libname_p, idata.iat_p].all? { |p| p == 0 }
ret << idata
end
ret.each { |idata| idata.decode_inner(coff) }
ret
end
# decode a COFF import table from coff.cursection
def decode_header(coff)
@ilt_p = coff.decode_word
@timestamp = coff.decode_word
@firstforwarder = coff.decode_word
@libname_p = coff.decode_word
@iat_p = coff.decode_word
end
# decode the tables referenced
def decode_inner(coff)
if coff.sect_at_rva(@libname_p)
@libname = coff.decode_strz
end
if coff.sect_at_rva(@ilt_p) || coff.sect_at_rva(@iat_p)
addrs = []
while (a = coff.decode_xword) != 0
addrs << a
end
@imports = []
ord_mask = 1 << (coff.optheader.signature == 'PE+' ? 63 : 31)
addrs.each { |a|
i = Import.new
if (a & ord_mask) != 0
i.ordinal = a & (~ord_mask)
else
i.hintname_p = a
if coff.sect_at_rva(a)
i.hint = coff.decode_half
i.name = coff.decode_strz
end
end
@imports << i
}
end
if coff.sect_at_rva(@iat_p)
@iat = []
while (a = coff.decode_xword) != 0
@iat << a
end
end
end
end
class DelayImportDirectory
def self.decode(coff)
ret = []
loop do
didata = new
didata.decode_header coff
break if [didata.libname_p, didata.handle_p, didata.diat_p].all? { |p| p == 0 }
ret << didata
end
ret.each { |didata| didata.decode_inner(coff) }
ret
end
def decode_header(coff)
@attributes = coff.decode_word
@libname_p = coff.decode_word
@handle_p = coff.decode_word # the loader stores the handle at the location pointed by this field at runtime
@diat_p = coff.decode_word
@dint_p = coff.decode_word
@bdiat_p = coff.decode_word
@udiat_p = coff.decode_word
@timestamp = coff.decode_word
end
def decode_inner(coff)
if coff.sect_at_rva(@libname_p)
@libname = coff.decode_strz
end
end
end
class RelocationTable
# decodes a relocation table from coff.encoded.ptr
def decode(coff)
@base_addr = coff.decode_word
@relocs = []
len = coff.decode_word
if len < 8 or len % 2 != 0
puts "W: COFF: Invalid relocation table length #{len}" if $VERBOSE
return
end
len -= 8
len /= 2
len.times {
raw = coff.decode_half
r = Relocation.new
r.offset = raw & 0xfff
r.type = coff.int_to_hash(((raw >> 12) & 15), BASE_RELOCATION_TYPE)
@relocs << r
}
end
end
class ResourceDirectory
def decode(coff, edata = coff.cursection.encoded, startptr = edata.ptr)
@characteristics = coff.decode_word(edata)
@timestamp = coff.decode_word(edata)
@major_version = coff.decode_half(edata)
@minor_version = coff.decode_half(edata)
nrnames = coff.decode_half(edata)
nrid = coff.decode_half(edata)
@entries = []
(nrnames+nrid).times {
e = Entry.new
e_id = coff.decode_word(edata)
e_ptr = coff.decode_word(edata)
tmp = edata.ptr
if (e_id >> 31) == 1
if $DEBUG
nrnames -= 1
puts "W: COFF: rsrc has invalid id #{e_id}" if nrnames < 0
end
e.name_p = e_id & 0x7fff_ffff
edata.ptr = startptr + e.name_p
namelen = coff.decode_half(edata)
e.name_w = edata.read(2*namelen)
if (chrs = e.name_w.unpack('v*')).all? { |c| c >= 0 and c <= 255 }
e.name = chrs.pack('C*')
end
else
if $DEBUG
puts "W: COFF: rsrc has invalid id #{e_id}" if nrnames > 0
end
e.id = e_id
end
if (e_ptr >> 31) == 1 # subdir
e.subdir_p = e_ptr & 0x7fff_ffff
if startptr + e.subdir_p >= edata.length
puts 'invalid resource structure: directory too far' if $VERBOSE
else
edata.ptr = startptr + e.subdir_p
e.subdir = ResourceDirectory.new
e.subdir.decode coff, edata, startptr
end
else
e.dataentry_p = e_ptr
edata.ptr = startptr + e.dataentry_p
e.data_p = coff.decode_word(edata)
sz = coff.decode_word(edata)
e.codepage = coff.decode_word(edata)
e.reserved = coff.decode_word(edata)
if coff.sect_at_rva(e.data_p)
e.data = coff.cursection.encoded.read(sz)
end
end
edata.ptr = tmp
@entries << e
}
end
end
class DebugDirectory
def decode(coff)
@characteristics = coff.decode_word
@timestamp = coff.decode_word
@major_version = coff.decode_half
@minor_version = coff.decode_half
@type = coff.int_to_hash(coff.decode_word, DEBUG_TYPE)
@size_of_data = coff.decode_word
@addr = coff.decode_word
@pointer = coff.decode_word
end
end
class TLSDirectory
def decode(coff)
@start_va = coff.decode_xword # must have a .reloc
@end_va = coff.decode_xword
@index_addr = coff.decode_xword # va ? rva ?
@callback_p = coff.decode_xword # ptr to 0-terminated x?word callback ptrs
@zerofill_sz = coff.decode_word # nr of 0 bytes to append to the template (start_va)
@characteristics = coff.decode_word
if coff.sect_at_va(@callback_p)
@callbacks = []
while (ptr = coff.decode_xword) != 0
# __stdcall void (*ptr)(void* dllhandle, dword reason, void* reserved)
# (same as dll entrypoint)
@callbacks << (ptr - coff.optheader.image_base)
end
end
end
end
class LoadConfig
def decode(coff)
@signature = coff.decode_word
@timestamp = coff.decode_word
@major_version = coff.decode_half
@minor_version = coff.decode_half
@globalflags = coff.decode_word
@critsec_timeout = coff.decode_word
@decommitblock = coff.decode_xword
@decommittotal = coff.decode_xword
@lockpfxtable = coff.decode_xword # VA of ary of instruction using LOCK prefix, to be nopped on singleproc machine (wtf?)
@maxalloc = coff.decode_xword
@maxvirtmem = coff.decode_xword
@process_affinity_mask = coff.decode_xword
@process_heap_flags = coff.decode_word
@servicepackid = coff.decode_half
@reserved = coff.decode_half
@editlist = coff.decode_xword
@security_cookie = coff.decode_xword
@sehtable_p = coff.decode_xword # VA
@sehcount = coff.decode_xword
# @sehcount is really the count ?
if @sehcount >= 0 and @sehcount < 100 and (@signature == 0x40 or @signature == 0x48) and coff.sect_at_va(@sehtable_p)
@safeseh = []
@sehcount.times { @safeseh << coff.decode_xword }
end
end
end
attr_accessor :cursection
def decode_uchar(edata = @cursection.encoded) ; edata.decode_imm(:u8, @endianness) end
def decode_half( edata = @cursection.encoded) ; edata.decode_imm(:u16, @endianness) end
def decode_word( edata = @cursection.encoded) ; edata.decode_imm(:u32, @endianness) end
def decode_xword(edata = @cursection.encoded) ; edata.decode_imm((@optheader.signature == 'PE+' ? :u64 : :u32), @endianness) end
def decode_strz( edata = @cursection.encoded) ; if i = edata.data.index(0, edata.ptr) ; edata.read(i+1-edata.ptr).chop ; end ; end
# converts an RVA (offset from base address of file when loaded in memory) to the section containing it using the section table
# updates @cursection and @cursection.encoded.ptr to point to the specified address
# may return self when rva points to the coff header
# returns nil if none match, 0 never matches
def sect_at_rva(rva)
return if not rva or rva <= 0
if sections and not @sections.empty?
if s = @sections.find { |s| s.virtaddr <= rva and s.virtaddr + s.virtsize > rva }
s.encoded.ptr = rva - s.virtaddr
@cursection = s
elsif rva < @sections.map { |s| s.virtaddr }.min
@encoded.ptr = rva
@cursection = self
end
elsif rva <= @encoded.length
@encoded.ptr = rva
@cursection = self
end
end
def sect_at_va(va)
sect_at_rva(va - @optheader.image_base)
end
def label_rva(name)
if name.kind_of? Integer
name
elsif s = @sections.find { |s| s.encoded.export[name] }
s.virtaddr + s.encoded.export[name]
else
@encoded.export[name]
end
end
def each_section
base = @optheader.image_base
base = 0 if not base.kind_of? Integer
yield @encoded[0, @optheader.headers_size], base
@sections.each { |s| yield s.encoded, base + s.virtaddr }
end
# decodes the COFF header, optional header, section headers
# marks entrypoint and directories as edata.export
def decode_header
@cursection ||= self
@encoded.ptr ||= 0
@header.decode(self)
optoff = @encoded.ptr
@optheader.decode(self)
@cursection.encoded.ptr = optoff + @header.size_opthdr
@header.num_sect.times {
s = Section.new
s.decode self
@sections << s
decode_section_body(s)
}
if sect_at_rva(@optheader.entrypoint)
@cursection.encoded.add_export new_label('entrypoint')
end
(DIRECTORIES - ['certificate_table']).each { |d|
if @directory and @directory[d] and sect_at_rva(@directory[d][0])
@cursection.encoded.add_export new_label(d)
end
}
end
# decodes a section content (allows simpler LoadedPE override)
def decode_section_body(s)
s.encoded = @encoded[s.rawaddr, [s.rawsize, s.virtsize].min]
s.encoded.virtsize = s.virtsize
end
# decodes COFF export table from directory
# mark exported names as encoded.export
def decode_exports
if @directory and @directory['export_table'] and sect_at_rva(@directory['export_table'][0])
@export = ExportDirectory.new
@export.decode(self)
@export.exports.to_a.each { |e|
if e.name and sect_at_rva(e.target)
e.target = @cursection.encoded.add_export e.name
end
}
end
end
# decodes COFF import tables from directory
# mark iat entries as encoded.export
def decode_imports
if @directory and @directory['import_table'] and sect_at_rva(@directory['import_table'][0])
@imports = ImportDirectory.decode(self)
iatlen = (@optheader.signature == 'PE+' ? 8 : 4)
@imports.each { |id|
if sect_at_rva(id.iat_p)
ptr = @cursection.encoded.ptr
id.imports.each { |i|
if i.name
r = Metasm::Relocation.new(Expression[i.name], :u32, @endianness)
@cursection.encoded.reloc[ptr] = r
@cursection.encoded.add_export 'iat_'+i.name, ptr, true
end
ptr += iatlen
}
end
}
end
end
# decode TLS directory, including tls callback table
def decode_tls
if @directory and @directory['tls_table'] and sect_at_rva(@directory['tls_table'][0])
@tls = TLSDirectory.new
@tls.decode(self)
if s = sect_at_va(@tls.callback_p)
s.encoded.add_export 'tls_callback_table'
@tls.callbacks.each_with_index { |cb, i|
@tls.callbacks[i] = @cursection.encoded.add_export "tls_callback_#{i}" if sect_at_rva(cb)
}
end
end
end
# decode COFF relocation tables from directory
def decode_relocs
if @directory and @directory['base_relocation_table'] and sect_at_rva(@directory['base_relocation_table'][0])
end_ptr = @cursection.encoded.ptr + @directory['base_relocation_table'][1]
@relocations = []
while @cursection.encoded.ptr < end_ptr
rt = RelocationTable.new
rt.decode self
@relocations << rt
end
# interpret as EncodedData relocations
relocfunc = ('decode_reloc_' << @header.machine.downcase).to_sym
if not respond_to? relocfunc
puts "W: COFF: unsupported relocs for architecture #{@header.machine}" if $VERBOSE
return
end
@relocations.each { |rt|
rt.relocs.each { |r|
if s = sect_at_rva(rt.base_addr + r.offset)
e, p = s.encoded, s.encoded.ptr
rel = send(relocfunc, r)
e.reloc[p] = rel if rel
end
}
}
end
end
# decodes an I386 COFF relocation pointing to encoded.ptr
def decode_reloc_i386(r)
case r.type
when 'ABSOLUTE'
when 'HIGHLOW'
addr = decode_word
if s = sect_at_va(addr)
label = label_at(s.encoded, s.encoded.ptr, 'xref_%04x' % addr)
Metasm::Relocation.new(Expression[label], :u32, @endianness)
end
when 'DIR64'
addr = decode_xword
if s = sect_at_va(addr)
label = label_at(s.encoded, s.encoded.ptr, 'xref_%04x' % addr)
Metasm::Relocation.new(Expression[label], :u64, @endianness)
end
else puts "W: COFF: Unsupported i386 relocation #{r.inspect}" if $VERBOSE
end
end
# decodes resources from directory
def decode_resources
if @directory and @directory['resource_table'] and sect_at_rva(@directory['resource_table'][0])
@resource = ResourceDirectory.new
@resource.decode self
end
end
# decodes certificate table
def decode_certificates
if @directory and ct = @directory['certificate_table']
@encoded.ptr = ct[0]
@certificates = (0...(ct[1]/8)).map { @encoded.data[decode_word(@encoded), decode_word(@encoded)] }
end
end
# decodes the load configuration directory
def decode_loadconfig
if @directory and lc = @directory['load_config'] and sect_at_rva(lc[0])
@loadconfig = LoadConfig.new
@loadconfig.decode(self)
end
end
# decodes a COFF file (headers/exports/imports/relocs/sections)
# starts at encoded.ptr
def decode
decode_header
decode_exports
decode_imports
decode_resources
decode_certificates
decode_tls
decode_relocs
end
# returns a metasm CPU object corresponding to +header.machine+
def cpu_from_headers
case @header.machine
when 'I386'; Ia32.new
else raise 'unknown cpu'
end
end
# returns an array including the PE entrypoint and the exported functions entrypoints
# TODO filter out exported data, include safeseh ?
def get_default_entrypoints
ep = []
ep.concat @tls.callbacks.to_a if tls
ep << (@optheader.image_base + label_rva(@optheader.entrypoint))
@export.exports.each { |e|
ep << (@optheader.image_base + label_rva(e.target)) if not e.forwarder_lib
} if @export
ep
end
def dump_section_header(addr, edata)
s = @sections.find { |s| s.virtaddr == addr-@optheader.image_base }
s ? "\n.section #{s.name.inspect} base=#{Expression[addr]}" :
addr == @optheader.image_base ? "// exe header at #{Expression[addr]}" : super
end
end
class COFFArchive
def self.decode(str)
ar = new
ar.encoded = EncodedData.new << str
ar.signature = ar.encoded.read(8)
raise InvalidExeFormat, "Invalid COFF Archive signature #{ar.signature.inspect}" if ar.signature != "!<arch>\n"
ar.members = []
while ar.encoded.ptr < ar.encoded.virtsize
ar.decode_member
end
ar.decode_first_linker
ar.decode_second_linker
ar.fixup_names
ar
end
class Member
def decode(ar)
@offset = ar.encoded.ptr
@name = ar.encoded.read(16).strip
@date = ar.encoded.read(12).to_i
@uid = ar.encoded.read(6).to_i
@gid = ar.encoded.read(6).to_i
@mode = ar.encoded.read(8).to_i 8
@size = ar.encoded.read(10).to_i
@eoh = ar.encoded.read(2) # end-of-header marker, should be "`\n"
end
end
class ImportHeader
def decode(ar)
@sig1 = ar.encoded.decode_imm(:u16, :little)
@sig2 = ar.encoded.decode_imm(:u16, :little)
@version = ar.encoded.decode_imm(:u16, :little)
@machine = ar.encoded.decode_imm(:u16, :little)
@timestamp = ar.encoded.decode_imm(:u32, :little)
@size_of_data = ar.encoded.decode_imm(:u32, :little)
@hint = ar.encoded.decode_imm(:u16, :little)
type = ar.encoded.decode_imm(:u16, :little)
@type = ar.int_to_hash((type >> 14) & 3, IMPORT_TYPE)
@name_type = ar.int_to_hash((type >> 11) & 7, NAME_TYPE)
@reserved = type & 0x7ff
@symname = ar.encoded.data[ar.encoded.ptr...ar.encoded.data.index(0, ar.encoded.ptr)]
ar.encoded.ptr += @symname.length + 1
@libname = ar.encoded.data[ar.encoded.ptr...ar.encoded.data.index(0, ar.encoded.ptr)]
end
end
def decode_member_header
h = Member.new
h.decode self
@members << h
end
def decode_member
decode_member_header
m = @members.last
m.encoded = @encoded[@encoded.ptr, m.size]
@encoded.ptr += m.size
end
def decode_first_linker
m = @members[0]
m.encoded.ptr = 0
numsym = m.encoded.decode_imm(:u32, :big)
offsets = []
numsym.times { offsets << m.encoded.decode_imm(:u32, :big) }
names = []
numsym.times {
names << ''
while (c = m.encoded.get_byte) != 0
names.last << c
end
}
# names[42] is found in object at file offset offsets[42]
# offsets are sorted by object index (all syms from 1st object, then 2nd etc)
@first_linker = names.zip(offsets).inject({}) { |h, (n, o)| h.update n => o }
end
def decode_second_linker
m = @members[1]
m.encoded.ptr = 0
nummb = m.encoded.decode_imm(:u32, :big)
mboffsets = []
nummb.times { mboffsets << m.encoded.decode_imm(:u32, :big) }
numsym = m.encoded.decode_imm(:u32, :big)
indices = []
numsym.times { indices << m.encoded.decode_imm(:u16, :big) }
names = []
numsym.times {
names << ''
while (c = m.encoded.get_byte) != 0
names.last << c
end
}
# names[42] is found in object at file offset mboffsets[indices[42]]
# symbols sorted by symbol name (supposed to be more efficient, but no index into string table...)
@second_linker = names.zip(indices).inject({}) { |h, (n, i)| h.update n => mboffsets[i] }
end
# set real name to archive members: look it up in the name table member if needed, or just remove the trailing /
def fixup_names
@members.each { |m|
case m.name
when '/'
when '//'
when /\/(\d+)/
m.name = @members[2].encoded.data[$1.to_i, @members[2].size]
m.name = m.name[0, m.name.index(0)]
else m.name.chomp! "/"
end
}
end
end
end
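
The layout read by `decode_first_linker` above (a big-endian symbol count, that many big-endian member offsets, then that many NUL-terminated names) can be sketched with plain `String#unpack`. This is a minimal standalone illustration, not Metasm code: the helper name `parse_first_linker` is hypothetical, and the input is assumed to be the raw body of the first archive member.

```ruby
# Hypothetical helper, not part of Metasm: parses the body of the first
# linker member of a COFF archive into a { symbol_name => member_offset } hash.
def parse_first_linker(body)
  # u32 big-endian symbol count
  numsym = body.unpack('N').first
  # numsym u32 big-endian file offsets, one per symbol
  offsets = body[4, 4 * numsym].unpack('N' * numsym)
  # numsym NUL-terminated names follow the offset table
  names = body[(4 + 4 * numsym)..-1].unpack('Z*' * numsym)
  Hash[names.zip(offsets)]
end

# synthetic member body: 2 symbols, found in members at offsets 0x100 and 0x200
body = [2, 0x100, 0x200].pack('N3') + "_foo\0_bar\0"
table = parse_first_linker(body)
table.each { |name, off| puts "#{name} -> 0x#{off.to_s(16)}" }
```

As in the original method, the n-th name is paired with the n-th offset, so a symbol that appears twice keeps only its last offset.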

File diff suppressed because it is too large

View File

@ -1,600 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
require 'metasm/exe_format/main'
module Metasm
class ELF < ExeFormat
CLASS = { 0 => 'NONE', 1 => '32', 2 => '64', 200 => '64_icc' }
DATA = { 0 => 'NONE', 1 => 'LSB', 2 => 'MSB' }
VERSION = { 0 => 'INVALID', 1 => 'CURRENT' }
ABI = { 0 => 'SYSV', 1 => 'HPUX', 2 => 'NETBSD', 3 => 'LINUX',
6 => 'SOLARIS', 7 => 'AIX', 8 => 'IRIX', 9 => 'FREEBSD',
10 => 'TRU64', 11 => 'MODESTO', 12 => 'OPENBSD', 97 => 'ARM',
255 => 'STANDALONE'}
TYPE = { 0 => 'NONE', 1 => 'REL', 2 => 'EXEC', 3 => 'DYN', 4 => 'CORE' }
TYPE_LOPROC = 0xff00
TYPE_HIPROC = 0xffff
MACHINE = {
0 => 'NONE', 1 => 'M32', 2 => 'SPARC', 3 => '386',
4 => '68K', 5 => '88K', 6 => '486', 7 => '860',
8 => 'MIPS', 9 => 'S370', 10 => 'MIPS_RS3_LE',
15 => 'PARISC',
17 => 'VPP500',18 => 'SPARC32PLUS', 19 => '960',
20 => 'PPC', 21 => 'PPC64', 22 => 'S390',
36 => 'V800', 37 => 'FR20', 38 => 'RH32', 39 => 'MCORE',
40 => 'ARM', 41 => 'ALPHA_STD', 42 => 'SH', 43 => 'SPARCV9',
44 => 'TRICORE', 45 => 'ARC', 46 => 'H8_300', 47 => 'H8_300H',
48 => 'H8S', 49 => 'H8_500', 50 => 'IA_64', 51 => 'MIPS_X',
52 => 'COLDFIRE', 53 => '68HC12', 54 => 'MMA', 55 => 'PCP',
56 => 'NCPU', 57 => 'NDR1', 58 => 'STARCORE', 59 => 'ME16',
60 => 'ST100', 61 => 'TINYJ', 62 => 'X86_64', 63 => 'PDSP',
66 => 'FX66', 67 => 'ST9PLUS',
68 => 'ST7', 69 => '68HC16', 70 => '68HC11', 71 => '68HC08',
72 => '68HC05',73 => 'SVX', 74 => 'ST19', 75 => 'VAX',
76 => 'CRIS', 77 => 'JAVELIN',78 => 'FIREPATH', 79 => 'ZSP',
80 => 'MMIX', 81 => 'HUANY', 82 => 'PRISM', 83 => 'AVR',
84 => 'FR30', 85 => 'D10V', 86 => 'D30V', 87 => 'V850',
88 => 'M32R', 89 => 'MN10300',90 => 'MN10200',91 => 'PJ',
92 => 'OPENRISC', 93 => 'ARC_A5', 94 => 'XTENSA',
99 => 'PJ',
0x9026 => 'ALPHA'
}
FLAGS = Hash.new({}).merge(
'SPARC' => {0x100 => '32PLUS', 0x200 => 'SUN_US1',
0x400 => 'HAL_R1', 0x800 => 'SUN_US3',
0x8000_0000 => 'LEDATA'},
'SPARCV9' => {0 => 'TSO', 1 => 'PSO', 2 => 'RMO'}, # XXX not a flag
'MIPS' => {1 => 'NOREORDER', 2 => 'PIC', 4 => 'CPIC',
8 => 'XGOT', 16 => '64BIT_WHIRL', 32 => 'ABI2',
64 => 'ABI_ON32'}
)
DYNAMIC_TAG = { 0 => 'NULL', 1 => 'NEEDED', 2 => 'PLTRELSZ', 3 =>
'PLTGOT', 4 => 'HASH', 5 => 'STRTAB', 6 => 'SYMTAB', 7 => 'RELA',
8 => 'RELASZ', 9 => 'RELAENT', 10 => 'STRSZ', 11 => 'SYMENT',
12 => 'INIT', 13 => 'FINI', 14 => 'SONAME', 15 => 'RPATH',
16 => 'SYMBOLIC', 17 => 'REL', 18 => 'RELSZ', 19 => 'RELENT',
20 => 'PLTREL', 21 => 'DEBUG', 22 => 'TEXTREL', 23 => 'JMPREL',
24 => 'BIND_NOW',
25 => 'INIT_ARRAY', 26 => 'FINI_ARRAY',
27 => 'INIT_ARRAYSZ', 28 => 'FINI_ARRAYSZ',
29 => 'RUNPATH', 30 => 'FLAGS', 31 => 'ENCODING',
32 => 'PREINIT_ARRAY', 33 => 'PREINIT_ARRAYSZ',
0x6fff_fdf5 => 'GNU_PRELINKED',
0x6fff_fdf6 => 'GNU_CONFLICTSZ', 0x6fff_fdf7 => 'LIBLISTSZ',
0x6fff_fdf8 => 'CHECKSUM', 0x6fff_fdf9 => 'PLTPADSZ',
0x6fff_fdfa => 'MOVEENT', 0x6fff_fdfb => 'MOVESZ',
0x6fff_fdfc => 'FEATURE_1', 0x6fff_fdfd => 'POSFLAG_1',
0x6fff_fdfe => 'SYMINSZ', 0x6fff_fdff => 'SYMINENT',
0x6fff_fef5 => 'GNU_HASH',
0x6fff_fef6 => 'TLSDESC_PLT', 0x6fff_fef7 => 'TLSDESC_GOT',
0x6fff_fef8 => 'GNU_CONFLICT', 0x6fff_fef9 => 'GNU_LIBLIST',
0x6fff_fefa => 'CONFIG', 0x6fff_fefb => 'DEPAUDIT',
0x6fff_fefc => 'AUDIT', 0x6fff_fefd => 'PLTPAD',
0x6fff_fefe => 'MOVETAB', 0x6fff_feff => 'SYMINFO',
0x6fff_fff0 => 'VERSYM', 0x6fff_fff9 => 'RELACOUNT',
0x6fff_fffa => 'RELCOUNT', 0x6fff_fffb => 'FLAGS_1',
0x6fff_fffc => 'VERDEF', 0x6fff_fffd => 'VERDEFNUM',
0x6fff_fffe => 'VERNEED', 0x6fff_ffff => 'VERNEEDNUM'
}
DYNAMIC_TAG_LOPROC = 0x7000_0000
DYNAMIC_TAG_HIPROC = 0x7fff_ffff
DYNAMIC_FLAGS = { 1 => 'ORIGIN', 2 => 'SYMBOLIC', 4 => 'TEXTREL',
8 => 'BIND_NOW', 0x10 => 'STATIC_TLS' }
DYNAMIC_FLAGS_1 = { 1 => 'NOW', 2 => 'GLOBAL', 4 => 'GROUP',
8 => 'NODELETE', 0x10 => 'LOADFLTR', 0x20 => 'INITFIRST',
0x40 => 'NOOPEN', 0x80 => 'ORIGIN', 0x100 => 'DIRECT',
0x200 => 'TRANS', 0x400 => 'INTERPOSE', 0x800 => 'NODEFLIB',
0x1000 => 'NODUMP', 0x2000 => 'CONFALT', 0x4000 => 'ENDFILTEE',
0x8000 => 'DISPRELDNE', 0x10000 => 'DISPRELPND' }
DYNAMIC_FEATURE_1 = { 1 => 'PARINIT', 2 => 'CONFEXP' }
DYNAMIC_POSFLAG_1 = { 1 => 'LAZYLOAD', 2 => 'GROUPPERM' }
PH_TYPE = { 0 => 'NULL', 1 => 'LOAD', 2 => 'DYNAMIC', 3 => 'INTERP',
4 => 'NOTE', 5 => 'SHLIB', 6 => 'PHDR', 7 => 'TLS',
0x6474e550 => 'GNU_EH_FRAME', 0x6474e551 => 'GNU_STACK',
0x6474e552 => 'GNU_RELRO' }
PH_TYPE_LOPROC = 0x7000_0000
PH_TYPE_HIPROC = 0x7fff_ffff
PH_FLAGS = { 1 => 'X', 2 => 'W', 4 => 'R' }
SH_TYPE = { 0 => 'NULL', 1 => 'PROGBITS', 2 => 'SYMTAB', 3 => 'STRTAB',
4 => 'RELA', 5 => 'HASH', 6 => 'DYNAMIC', 7 => 'NOTE',
8 => 'NOBITS', 9 => 'REL', 10 => 'SHLIB', 11 => 'DYNSYM',
14 => 'INIT_ARRAY', 15 => 'FINI_ARRAY', 16 => 'PREINIT_ARRAY',
17 => 'GROUP', 18 => 'SYMTAB_SHNDX',
0x6fff_fff6 => 'GNU_HASH', 0x6fff_fff7 => 'GNU_LIBLIST',
0x6fff_fff8 => 'GNU_CHECKSUM',
0x6fff_fffd => 'GNU_verdef', 0x6fff_fffe => 'GNU_verneed',
0x6fff_ffff => 'GNU_versym' }
SH_TYPE_LOOS = 0x6000_0000
SH_TYPE_HIOS = 0x6fff_ffff
SH_TYPE_LOPROC = 0x7000_0000
SH_TYPE_HIPROC = 0x7fff_ffff
SH_TYPE_LOUSER = 0x8000_0000
SH_TYPE_HIUSER = 0xffff_ffff
SH_FLAGS = { 1 => 'WRITE', 2 => 'ALLOC', 4 => 'EXECINSTR',
0x10 => 'MERGE', 0x20 => 'STRINGS', 0x40 => 'INFO_LINK',
0x80 => 'LINK_ORDER', 0x100 => 'OS_NONCONFORMING',
0x200 => 'GROUP', 0x400 => 'TLS' }
SH_FLAGS_MASKPROC = 0xf000_0000
SH_INDEX = { 0 => 'UNDEF',
0xfff1 => 'ABS', 0xfff2 => 'COMMON',
0xffff => 'XINDEX', }
SH_INDEX_LORESERVE = 0xff00
SH_INDEX_LOPROC = 0xff00
SH_INDEX_HIPROC = 0xff1f
SH_INDEX_LOOS = 0xff20
SH_INDEX_HIOS = 0xff3f
SH_INDEX_HIRESERVE = 0xffff
SYMBOL_BIND = { 0 => 'LOCAL', 1 => 'GLOBAL', 2 => 'WEAK' }
SYMBOL_BIND_LOPROC = 13
SYMBOL_BIND_HIPROC = 15
SYMBOL_TYPE = { 0 => 'NOTYPE', 1 => 'OBJECT', 2 => 'FUNC',
3 => 'SECTION', 4 => 'FILE', 5 => 'COMMON', 6 => 'TLS' }
SYMBOL_TYPE_LOPROC = 13
SYMBOL_TYPE_HIPROC = 15
SYMBOL_VISIBILITY = { 0 => 'DEFAULT', 1 => 'INTERNAL', 2 => 'HIDDEN', 3 => 'PROTECTED' }
RELOCATION_TYPE = Hash.new({}).merge( # keys are in MACHINE.values
'386' => { 0 => 'NONE', 1 => '32', 2 => 'PC32', 3 => 'GOT32',
4 => 'PLT32', 5 => 'COPY', 6 => 'GLOB_DAT',
7 => 'JMP_SLOT', 8 => 'RELATIVE', 9 => 'GOTOFF',
10 => 'GOTPC', 11 => '32PLT', 12 => 'TLS_GD_PLT',
13 => 'TLS_LDM_PLT', 14 => 'TLS_TPOFF', 15 => 'TLS_IE',
16 => 'TLS_GOTIE', 17 => 'TLS_LE', 18 => 'TLS_GD',
19 => 'TLS_LDM', 20 => '16', 21 => 'PC16', 22 => '8',
23 => 'PC8', 24 => 'TLS_GD_32', 25 => 'TLS_GD_PUSH',
26 => 'TLS_GD_CALL', 27 => 'TLS_GD_POP',
28 => 'TLS_LDM_32', 29 => 'TLS_LDM_PUSH',
30 => 'TLS_LDM_CALL', 31 => 'TLS_LDM_POP',
32 => 'TLS_LDO_32', 33 => 'TLS_IE_32',
34 => 'TLS_LE_32', 35 => 'TLS_DTPMOD32',
36 => 'TLS_DTPOFF32', 37 => 'TLS_TPOFF32' },
'ARM' => { 0 => 'NONE', 1 => 'PC24', 2 => 'ABS32', 3 => 'REL32',
4 => 'PC13', 5 => 'ABS16', 6 => 'ABS12',
7 => 'THM_ABS5', 8 => 'ABS8', 9 => 'SBREL32',
10 => 'THM_PC22', 11 => 'THM_PC8', 12 => 'AMP_VCALL9',
13 => 'SWI24', 14 => 'THM_SWI8', 15 => 'XPC25',
16 => 'THM_XPC22', 20 => 'COPY', 21 => 'GLOB_DAT',
22 => 'JUMP_SLOT', 23 => 'RELATIVE', 24 => 'GOTOFF',
25 => 'GOTPC', 26 => 'GOT32', 27 => 'PLT32',
100 => 'GNU_VTENTRY', 101 => 'GNU_VTINHERIT',
250 => 'RSBREL32', 251 => 'THM_RPC22', 252 => 'RREL32',
253 => 'RABS32', 254 => 'RPC24', 255 => 'RBASE' },
'IA_64' => { 0 => 'NONE',
0x21 => 'IMM14', 0x22 => 'IMM22', 0x23 => 'IMM64',
0x24 => 'DIR32MSB', 0x25 => 'DIR32LSB',
0x26 => 'DIR64MSB', 0x27 => 'DIR64LSB',
0x2a => 'GPREL22', 0x2b => 'GPREL64I',
0x2c => 'GPREL32MSB', 0x2d => 'GPREL32LSB',
0x2e => 'GPREL64MSB', 0x2f => 'GPREL64LSB',
0x32 => 'LTOFF22', 0x33 => 'LTOFF64I',
0x3a => 'PLTOFF22', 0x3b => 'PLTOFF64I',
0x3e => 'PLTOFF64MSB', 0x3f => 'PLTOFF64LSB',
0x43 => 'FPTR64I', 0x44 => 'FPTR32MSB',
0x45 => 'FPTR32LSB', 0x46 => 'FPTR64MSB',
0x47 => 'FPTR64LSB',
0x48 => 'PCREL60B', 0x49 => 'PCREL21B',
0x4a => 'PCREL21M', 0x4b => 'PCREL21F',
0x4c => 'PCREL32MSB', 0x4d => 'PCREL32LSB',
0x4e => 'PCREL64MSB', 0x4f => 'PCREL64LSB',
0x52 => 'LTOFF_FPTR22', 0x53 => 'LTOFF_FPTR64I',
0x54 => 'LTOFF_FPTR32MSB', 0x55 => 'LTOFF_FPTR32LSB',
0x56 => 'LTOFF_FPTR64MSB', 0x57 => 'LTOFF_FPTR64LSB',
0x5c => 'SEGREL32MSB', 0x5d => 'SEGREL32LSB',
0x5e => 'SEGREL64MSB', 0x5f => 'SEGREL64LSB',
0x64 => 'SECREL32MSB', 0x65 => 'SECREL32LSB',
0x66 => 'SECREL64MSB', 0x67 => 'SECREL64LSB',
0x6c => 'REL32MSB', 0x6d => 'REL32LSB',
0x6e => 'REL64MSB', 0x6f => 'REL64LSB',
0x74 => 'LTV32MSB', 0x75 => 'LTV32LSB',
0x76 => 'LTV64MSB', 0x77 => 'LTV64LSB',
0x79 => 'PCREL21BI', 0x7a => 'PCREL22',
0x7b => 'PCREL64I', 0x80 => 'IPLTMSB',
0x81 => 'IPLTLSB', 0x85 => 'SUB',
0x86 => 'LTOFF22X', 0x87 => 'LDXMOV',
0x91 => 'TPREL14', 0x92 => 'TPREL22',
0x93 => 'TPREL64I', 0x96 => 'TPREL64MSB',
0x97 => 'TPREL64LSB', 0x9a => 'LTOFF_TPREL22',
0xa6 => 'DTPMOD64MSB', 0xa7 => 'DTPMOD64LSB',
0xaa => 'LTOFF_DTPMOD22', 0xb1 => 'DTPREL14',
0xb2 => 'DTPREL22', 0xb3 => 'DTPREL64I',
0xb4 => 'DTPREL32MSB', 0xb5 => 'DTPREL32LSB',
0xb6 => 'DTPREL64MSB', 0xb7 => 'DTPREL64LSB',
0xba => 'LTOFF_DTPREL22' },
'M32' => { 0 => 'NONE', 1 => '32', 2 => '32_S', 3 => 'PC32_S',
4 => 'GOT32_S', 5 => 'PLT32_S', 6 => 'COPY',
7 => 'GLOB_DAT', 8 => 'JMP_SLOT', 9 => 'RELATIVE',
10 => 'RELATIVE_S' },
'MIPS' => {
0 => 'NONE', 1 => '16', 2 => '32', 3 => 'REL32',
4 => '26', 5 => 'HI16', 6 => 'LO16', 7 => 'GPREL16',
8 => 'LITERAL', 9 => 'GOT16', 10 => 'PC16',
11 => 'CALL16', 12 => 'GPREL32',
16 => 'SHIFT5', 17 => 'SHIFT6', 18 => '64',
19 => 'GOT_DISP', 20 => 'GOT_PAGE', 21 => 'GOT_OFST',
22 => 'GOT_HI16', 23 => 'GOT_LO16', 24 => 'SUB',
25 => 'INSERT_A', 26 => 'INSERT_B', 27 => 'DELETE',
28 => 'HIGHER', 29 => 'HIGHEST', 30 => 'CALL_HI16',
31 => 'CALL_LO16', 32 => 'SCN_DISP', 33 => 'REL16',
34 => 'ADD_IMMEDIATE', 35 => 'PJUMP', 36 => 'RELGOT',
37 => 'JALR', 38 => 'TLS_DTPMOD32', 39 => 'TLS_DTPREL32',
40 => 'TLS_DTPMOD64', 41 => 'TLS_DTPREL64',
42 => 'TLS_GD', 43 => 'TLS_LDM', 44 => 'TLS_DTPREL_HI16',
45 => 'TLS_DTPREL_LO16', 46 => 'TLS_GOTTPREL',
47 => 'TLS_TPREL32', 48 => 'TLS_TPREL64',
49 => 'TLS_TPREL_HI16', 50 => 'TLS_TPREL_LO16',
51 => 'GLOB_DAT', 52 => 'NUM' },
'PPC' => { 0 => 'NONE',
1 => 'ADDR32', 2 => 'ADDR24', 3 => 'ADDR16',
4 => 'ADDR16_LO', 5 => 'ADDR16_HI', 6 => 'ADDR16_HA',
7 => 'ADDR14', 8 => 'ADDR14_BRTAKEN', 9 => 'ADDR14_BRNTAKEN',
10 => 'REL24', 11 => 'REL14',
12 => 'REL14_BRTAKEN', 13 => 'REL14_BRNTAKEN',
14 => 'GOT16', 15 => 'GOT16_LO',
16 => 'GOT16_HI', 17 => 'GOT16_HA',
18 => 'PLTREL24', 19 => 'COPY',
20 => 'GLOB_DAT', 21 => 'JMP_SLOT',
22 => 'RELATIVE', 23 => 'LOCAL24PC',
24 => 'UADDR32', 25 => 'UADDR16',
26 => 'REL32', 27 => 'PLT32',
28 => 'PLTREL32', 29 => 'PLT16_LO',
30 => 'PLT16_HI', 31 => 'PLT16_HA',
32 => 'SDAREL16', 33 => 'SECTOFF',
34 => 'SECTOFF_LO', 35 => 'SECTOFF_HI',
36 => 'SECTOFF_HA', 67 => 'TLS',
68 => 'DTPMOD32', 69 => 'TPREL16',
70 => 'TPREL16_LO', 71 => 'TPREL16_HI',
72 => 'TPREL16_HA', 73 => 'TPREL32',
74 => 'DTPREL16', 75 => 'DTPREL16_LO',
76 => 'DTPREL16_HI', 77 => 'DTPREL16_HA',
78 => 'DTPREL32', 79 => 'GOT_TLSGD16',
80 => 'GOT_TLSGD16_LO', 81 => 'GOT_TLSGD16_HI',
82 => 'GOT_TLSGD16_HA', 83 => 'GOT_TLSLD16',
84 => 'GOT_TLSLD16_LO', 85 => 'GOT_TLSLD16_HI',
86 => 'GOT_TLSLD16_HA', 87 => 'GOT_TPREL16',
88 => 'GOT_TPREL16_LO', 89 => 'GOT_TPREL16_HI',
90 => 'GOT_TPREL16_HA', 101 => 'EMB_NADDR32',
102 => 'EMB_NADDR16', 103 => 'EMB_NADDR16_LO',
104 => 'EMB_NADDR16_HI', 105 => 'EMB_NADDR16_HA',
106 => 'EMB_SDAI16', 107 => 'EMB_SDA2I16',
108 => 'EMB_SDA2REL', 109 => 'EMB_SDA21',
110 => 'EMB_MRKREF', 111 => 'EMB_RELSEC16',
112 => 'EMB_RELST_LO', 113 => 'EMB_RELST_HI',
114 => 'EMB_RELST_HA', 115 => 'EMB_BIT_FLD',
116 => 'EMB_RELSDA' },
'SPARC' => { 0 => 'NONE', 1 => '8', 2 => '16', 3 => '32',
4 => 'DISP8', 5 => 'DISP16', 6 => 'DISP32',
7 => 'WDISP30', 8 => 'WDISP22', 9 => 'HI22',
10 => '22', 11 => '13', 12 => 'LO10', 13 => 'GOT10',
14 => 'GOT13', 15 => 'GOT22', 16 => 'PC10',
17 => 'PC22', 18 => 'WPLT30', 19 => 'COPY',
20 => 'GLOB_DAT', 21 => 'JMP_SLOT', 22 => 'RELATIVE',
23 => 'UA32', 24 => 'PLT32', 25 => 'HIPLT22',
26 => 'LOPLT10', 27 => 'PCPLT32', 28 => 'PCPLT22',
29 => 'PCPLT10', 30 => '10', 31 => '11', 32 => '64',
33 => 'OLO10', 34 => 'HH22', 35 => 'HM10', 36 => 'LM22',
37 => 'PC_HH22', 38 => 'PC_HM10', 39 => 'PC_LM22',
40 => 'WDISP16', 41 => 'WDISP19', 42 => 'GLOB_JMP',
43 => '7', 44 => '5', 45 => '6', 46 => 'DISP64',
47 => 'PLT64', 48 => 'HIX22', 49 => 'LOX10', 50 => 'H44',
51 => 'M44', 52 => 'L44', 53 => 'REGISTER', 54 => 'UA64',
55 => 'UA16', 56 => 'TLS_GD_HI22', 57 => 'TLS_GD_LO10',
58 => 'TLS_GD_ADD', 59 => 'TLS_GD_CALL',
60 => 'TLS_LDM_HI22', 61 => 'TLS_LDM_LO10',
62 => 'TLS_LDM_ADD', 63 => 'TLS_LDM_CALL',
64 => 'TLS_LDO_HIX22', 65 => 'TLS_LDO_LOX10',
66 => 'TLS_LDO_ADD', 67 => 'TLS_IE_HI22',
68 => 'TLS_IE_LO10', 69 => 'TLS_IE_LD',
70 => 'TLS_IE_LDX', 71 => 'TLS_IE_ADD',
72 => 'TLS_LE_HIX22', 73 => 'TLS_LE_LOX10',
74 => 'TLS_DTPMOD32', 75 => 'TLS_DTPMOD64',
76 => 'TLS_DTPOFF32', 77 => 'TLS_DTPOFF64',
78 => 'TLS_TPOFF32', 79 => 'TLS_TPOFF64' },
'X86_64' => { 0 => 'NONE',
1 => '64', 2 => 'PC32', 3 => 'GOT32', 4 => 'PLT32',
5 => 'COPY', 6 => 'GLOB_DAT', 7 => 'JMP_SLOT',
8 => 'RELATIVE', 9 => 'GOTPCREL', 10 => '32',
11 => '32S', 12 => '16', 13 => 'PC16', 14 => '8',
15 => 'PC8', 16 => 'DTPMOD64', 17 => 'DTPOFF64',
18 => 'TPOFF64', 19 => 'TLSGD', 20 => 'TLSLD',
21 => 'DTPOFF32', 22 => 'GOTTPOFF', 23 => 'TPOFF32' }
)
class Header
attr_accessor :type, :machine, :version, :entry, :phoff, :shoff, :flags, :ehsize, :phentsize, :phnum, :shentsize, :shnum, :shstrndx
attr_accessor :magic, :e_class, :data, :i_version, :abi, :abi_version, :ident
def self.size elf
x = elf.bitsize >> 3
40 + 3*x
end
end
class Segment
attr_accessor :type, :offset, :vaddr, :paddr, :filesz, :memsz, :flags, :align
attr_accessor :encoded
def self.size elf
x = elf.bitsize >> 3
8 + 6*x
end
end
class Section
attr_accessor :name_p, :name, :type, :flags, :addr, :offset, :size, :link, :info, :addralign, :entsize
attr_accessor :encoded
def self.size elf
x = elf.bitsize >> 3
16 + 6*x
end
end
class Symbol
attr_accessor :name_p, :name, :size, :bind, :value, :type, :other, :shndx
attr_accessor :thunk
def self.size elf
x = elf.bitsize >> 3
8 + 2*x # Elf32_Sym is 16 bytes, Elf64_Sym is 24 bytes
end
def set_info(elf, info)
@bind = elf.int_to_hash((info >> 4) & 15, SYMBOL_BIND)
@type = elf.int_to_hash(info & 15, SYMBOL_TYPE)
end
def get_info(elf)
((elf.int_from_hash(@bind, SYMBOL_BIND) & 15) << 4) |
(elf.int_from_hash(@type, SYMBOL_TYPE) & 15)
end
end
class Relocation
attr_accessor :offset, :type, :symbol, :addend
def self.size elf
x = elf.bitsize >> 3
2*x
end
def self.size_a elf
x = elf.bitsize >> 3
3*x
end
def set_info(elf, info, symtab)
v = (elf.bitsize == 32 ? 8 : 32)
@type = elf.int_to_hash((info & ((1 << v) - 1)), RELOCATION_TYPE[elf.header.machine])
@symbol = (info >> v) & 0xffff_ffff
@symbol = symtab[@symbol] if symtab[@symbol]
end
def get_info(elf, symtab)
v = (elf.bitsize == 32 ? 8 : 32)
s = symbol || 0
s = symtab.index(s) if s.kind_of? Symbol
(s << v) |
(elf.int_from_hash(@type, RELOCATION_TYPE[elf.header.machine]) & ((1 << v)-1))
end
end
def self.hash_symbol_name(name)
name.unpack('C*').inject(0) { |hash, char|
break hash if char == 0
hash <<= 4
hash += char
hash ^= (hash >> 24) & 0xf0
hash &= 0x0fff_ffff
}
end
def self.gnu_hash_symbol_name(name)
name.unpack('C*').inject(5381) { |hash, char|
break hash if char == 0
hash *= 33
hash += char
hash &= 0xffff_ffff
}
end
attr_accessor :header, :segments, :sections, :tag, :symbols, :relocations, :endianness, :bitsize
def initialize(cpu=nil)
@header = Header.new
@tag = {}
@symbols = [Symbol.new]
@symbols.first.shndx = 'UNDEF'
@relocations = []
@sections = [Section.new]
@sections.first.type = 'NULL'
@segments = []
if cpu
@endianness = cpu.endianness
@bitsize = cpu.size
case cpu
when Ia32; @header.machine = '386'
end
else
@endianness = :little
@bitsize = 32
end
super
end
end
end
# TODO symbol version info
__END__
/*
* Version structures. There are three types of version structure:
*
* o A definition of the versions within the image itself.
* Each version definition is assigned a unique index (starting from
* VER_NDX_BGNDEF) which is used to cross-reference symbols associated to
* the version. Each version can have one or more dependencies on other
* version definitions within the image. The version name, and any
* dependency names, are specified in the version definition auxiliary
* array. Version definition entries require a version symbol index table.
*
* o A version requirement on a needed dependency. Each needed entry
* specifies the shared object dependency (as specified in DT_NEEDED).
* One or more versions required from this dependency are specified in the
* version needed auxiliary array.
*
* o A version symbol index table. Each symbol indexes into this array
* to determine its version index. Index values of VER_NDX_BGNDEF or
* greater indicate the version definition to which a symbol is associated.
* (the size of a symbol index entry is recorded in the sh_info field).
*/
#ifndef _ASM
typedef struct { /* Version Definition Structure. */
Elf32_Half vd_version; /* this structures version revision */
Elf32_Half vd_flags; /* version information */
Elf32_Half vd_ndx; /* version index */
Elf32_Half vd_cnt; /* no. of associated aux entries */
Elf32_Word vd_hash; /* version name hash value */
Elf32_Word vd_aux; /* no. of bytes from start of this */
/* verdef to verdaux array */
Elf32_Word vd_next; /* no. of bytes from start of this */
} Elf32_Verdef; /* verdef to next verdef entry */
typedef struct { /* Verdef Auxiliary Structure. */
Elf32_Word vda_name; /* first element defines the version */
/* name. Additional entries */
/* define dependency names. */
Elf32_Word vda_next; /* no. of bytes from start of this */
} Elf32_Verdaux; /* verdaux to next verdaux entry */
typedef struct { /* Version Requirement Structure. */
Elf32_Half vn_version; /* this structures version revision */
Elf32_Half vn_cnt; /* no. of associated aux entries */
Elf32_Word vn_file; /* name of needed dependency (file) */
Elf32_Word vn_aux; /* no. of bytes from start of this */
/* verneed to vernaux array */
Elf32_Word vn_next; /* no. of bytes from start of this */
} Elf32_Verneed; /* verneed to next verneed entry */
typedef struct { /* Verneed Auxiliary Structure. */
Elf32_Word vna_hash; /* version name hash value */
Elf32_Half vna_flags; /* version information */
Elf32_Half vna_other;
Elf32_Word vna_name; /* version name */
Elf32_Word vna_next; /* no. of bytes from start of this */
} Elf32_Vernaux; /* vernaux to next vernaux entry */
typedef Elf32_Half Elf32_Versym; /* Version symbol index array */
typedef struct {
Elf32_Half si_boundto; /* direct bindings - symbol bound to */
Elf32_Half si_flags; /* per symbol flags */
} Elf32_Syminfo;
#if (defined(_LP64) || ((__STDC__ - 0 == 0) && (!defined(_NO_LONGLONG))))
typedef struct {
Elf64_Half vd_version; /* this structures version revision */
Elf64_Half vd_flags; /* version information */
Elf64_Half vd_ndx; /* version index */
Elf64_Half vd_cnt; /* no. of associated aux entries */
Elf64_Word vd_hash; /* version name hash value */
Elf64_Word vd_aux; /* no. of bytes from start of this */
/* verdef to verdaux array */
Elf64_Word vd_next; /* no. of bytes from start of this */
} Elf64_Verdef; /* verdef to next verdef entry */
typedef struct {
Elf64_Word vda_name; /* first element defines the version */
/* name. Additional entries */
/* define dependency names. */
Elf64_Word vda_next; /* no. of bytes from start of this */
} Elf64_Verdaux; /* verdaux to next verdaux entry */
typedef struct {
Elf64_Half vn_version; /* this structures version revision */
Elf64_Half vn_cnt; /* no. of associated aux entries */
Elf64_Word vn_file; /* name of needed dependency (file) */
Elf64_Word vn_aux; /* no. of bytes from start of this */
/* verneed to vernaux array */
Elf64_Word vn_next; /* no. of bytes from start of this */
} Elf64_Verneed; /* verneed to next verneed entry */
typedef struct {
Elf64_Word vna_hash; /* version name hash value */
Elf64_Half vna_flags; /* version information */
Elf64_Half vna_other;
Elf64_Word vna_name; /* version name */
Elf64_Word vna_next; /* no. of bytes from start of this */
} Elf64_Vernaux; /* vernaux to next vernaux entry */
typedef Elf64_Half Elf64_Versym;
typedef struct {
Elf64_Half si_boundto; /* direct bindings - symbol bound to */
Elf64_Half si_flags; /* per symbol flags */
} Elf64_Syminfo;
#endif /* (defined(_LP64) || ((__STDC__ - 0 == 0) ... */
#endif
/*
* Versym symbol index values. Values greater than VER_NDX_GLOBAL
* and less than VER_NDX_LORESERVE associate symbols with user
* specified version descriptors.
*/
#define VER_NDX_LOCAL 0 /* symbol is local */
#define VER_NDX_GLOBAL 1 /* symbol is global and assigned to */
/* the base version */
#define VER_NDX_LORESERVE 0xff00 /* beginning of RESERVED entries */
#define VER_NDX_ELIMINATE 0xff01 /* symbol is to be eliminated */
/*
* Verdef and Verneed (via Veraux) flags values.
*/
#define VER_FLG_BASE 0x1 /* version definition of file itself */
#define VER_FLG_WEAK 0x2 /* weak version identifier */
/*
* Verdef version values.
*/
#define VER_DEF_NONE 0 /* Ver_def version */
#define VER_DEF_CURRENT 1
#define VER_DEF_NUM 2
/*
* Verneed version values.
*/
#define VER_NEED_NONE 0 /* Ver_need version */
#define VER_NEED_CURRENT 1
#define VER_NEED_NUM 2
/*
* Syminfo flag values
*/
#define SYMINFO_FLG_DIRECT 0x0001 /* direct bound symbol */
#define SYMINFO_FLG_PASSTHRU 0x0002 /* pass-thru symbol for translator */
#define SYMINFO_FLG_COPY 0x0004 /* symbol is a copy-reloc */
#define SYMINFO_FLG_LAZYLOAD 0x0008 /* symbol bound to object to be lazy */
/* loaded */
/*
* key values for Syminfo.si_boundto
*/
#define SYMINFO_BT_SELF 0xffff /* symbol bound to self */
#define SYMINFO_BT_PARENT 0xfffe /* symbol bound to parent */
#define SYMINFO_BT_LOWRESERVE 0xff00 /* beginning of reserved entries */
/*
* Syminfo version values.
*/
#define SYMINFO_NONE 0 /* Syminfo version */
#define SYMINFO_CURRENT 1
#define SYMINFO_NUM 2
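
As a possible starting point for the symbol version TODO, the `Elf32_Verdef` layout quoted above (four halves followed by three words, 20 bytes in total) could be walked as follows. This is only a sketch under stated assumptions: the `Verdef` struct and `walk_verdefs` helper are hypothetical names, the data is assumed little-endian, and a `vd_next` of 0 is assumed to end the chain.

```ruby
# Hypothetical names, not part of Metasm. Fields follow Elf32_Verdef:
# vd_version, vd_flags, vd_ndx, vd_cnt (halves), vd_hash, vd_aux, vd_next (words).
Verdef = Struct.new(:version, :flags, :ndx, :cnt, :name_hash, :aux, :next_off)

# Walks a little-endian Elf32_Verdef chain starting at byte offset +off+.
# Assumes the last entry has vd_next == 0.
def walk_verdefs(data, off = 0)
  defs = []
  loop do
    chunk = data[off, 20]
    raise ArgumentError, "truncated verdef at #{off}" if chunk.nil? || chunk.bytesize < 20
    vd = Verdef.new(*chunk.unpack('v4V3'))
    defs << vd
    break if vd.next_off == 0
    off += vd.next_off # vd_next is relative to the start of this verdef
  end
  defs
end

# two synthetic entries; the first is followed by one 8-byte Verdaux
first  = [1, 1, 1, 1, 0x1234, 20, 28].pack('v4V3') + [0, 0].pack('V2')
second = [1, 0, 2, 1, 0x5678, 20, 0].pack('v4V3')
defs = walk_verdefs(first + second)
defs.each { |vd| puts "version index #{vd.ndx}, name hash 0x#{vd.name_hash.to_s(16)}" }
```

A real decoder would also follow `vd_aux` to read the `Verdaux` entries and resolve `vda_name` through the string table.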
P
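
The two symbol-name hashes defined by `ELF.hash_symbol_name` and `ELF.gnu_hash_symbol_name` can be tried outside Metasm with the standalone sketch below. The function names `sysv_elf_hash` and `gnu_elf_hash` are hypothetical; the arithmetic mirrors the methods in this file, with results masked to 28 and 32 bits respectively.

```ruby
# Hypothetical standalone versions of the two ELF symbol hashes.

# System V hash, as used by the DT_HASH table
def sysv_elf_hash(name)
  h = 0
  name.each_byte do |c|
    break if c == 0
    h = (h << 4) + c
    h ^= (h >> 24) & 0xf0 # fold bits 28..31 back into bits 4..7
    h &= 0x0fff_ffff      # then drop them
  end
  h
end

# GNU hash (multiply by 33, add the byte), as used by DT_GNU_HASH
def gnu_elf_hash(name)
  h = 5381
  name.each_byte do |c|
    break if c == 0
    h = (h * 33 + c) & 0xffff_ffff
  end
  h
end

puts '0x%08x' % sysv_elf_hash('printf')
puts '0x%08x' % gnu_elf_hash('printf')
```

In a `DT_HASH` lookup the bucket index is the hash modulo the bucket count, which is how `check_symbols_hash` would locate each named global symbol.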

View File

@ -1,725 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
require 'metasm/decode'
require 'metasm/exe_format/elf'
module Metasm
class ELF
class Header
# decodes the elf header, pointed to by elf.encoded.ptr
def decode elf
@ident = elf.encoded.read 16
@magic = @ident[0, 4]
raise InvalidExeFormat, "E: ELF: invalid ELF signature #{@magic.inspect}" if @magic != "\x7fELF"
@e_class = elf.int_to_hash(@ident[4], CLASS)
case @e_class
when '32'; elf.bitsize = 32
when '64', '64_icc'; elf.bitsize = 64
else raise InvalidExeFormat, "E: ELF: unsupported class #{@e_class}"
end
@data = elf.int_to_hash(@ident[5], DATA)
case @data
when 'LSB'; elf.endianness = :little
when 'MSB'; elf.endianness = :big
else raise InvalidExeFormat, "E: ELF: unsupported endianness #{@data}"
end
# from there we can use elf.decode_word etc
@version = elf.int_to_hash(@ident[6], VERSION)
case @version
when 'CURRENT'
else raise InvalidExeFormat, "E: ELF: unsupported ELF version #{@version}"
end
@abi = elf.int_to_hash(@ident[7], ABI)
@abi_version = @ident[8]
# decodes the architecture-dependent part
@type = elf.int_to_hash(elf.decode_half, TYPE)
@machine = elf.int_to_hash(elf.decode_half, MACHINE)
@version = elf.int_to_hash(elf.decode_word, VERSION)
@entry = elf.decode_addr
@phoff = elf.decode_off
@shoff = elf.decode_off
@flags = elf.bits_to_hash(elf.decode_word, FLAGS[@machine])
@ehsize = elf.decode_half
@phentsize = elf.decode_half
@phnum = elf.decode_half
@shentsize = elf.decode_half
@shnum = elf.decode_half
@shstrndx = elf.decode_half
end
end
class Section
# decodes the section header pointed to by elf.encoded.ptr
def decode elf
@name_p = elf.decode_word
@type = elf.int_to_hash(elf.decode_word, SH_TYPE)
@flags = elf.bits_to_hash(elf.decode_xword, SH_FLAGS)
@addr = elf.decode_addr
@offset = elf.decode_off
@size = elf.decode_xword
@link = elf.decode_word
@info = elf.decode_word
@addralign = elf.decode_xword
@entsize = elf.decode_xword
end
end
class Segment
# decodes the program header pointed to by elf.encoded.ptr
def decode elf
@type = elf.int_to_hash(elf.decode_word, PH_TYPE)
@flags = elf.bits_to_hash(elf.decode_word, PH_FLAGS) if elf.bitsize == 64
@offset = elf.decode_off
@vaddr = elf.decode_addr
@paddr = elf.decode_addr
@filesz = elf.decode_xword
@memsz = elf.decode_xword
@flags = elf.bits_to_hash(elf.decode_word, PH_FLAGS) if elf.bitsize == 32
@align = elf.decode_xword
end
end
class Symbol
# decodes the symbol pointed to by elf.encoded.ptr
# read the symbol name from strtab
def decode elf, strtab=nil
case elf.bitsize
when 32
@name_p = elf.decode_word
@value = elf.decode_addr
@size = elf.decode_word
set_info(elf, elf.decode_uchar)
@other = elf.decode_uchar
@shndx = elf.int_to_hash(elf.decode_half, SH_INDEX)
when 64
@name_p = elf.decode_word
set_info(elf, elf.decode_uchar)
@other = elf.decode_uchar
@shndx = elf.int_to_hash(elf.decode_half, SH_INDEX)
@value = elf.decode_addr
@size = elf.decode_xword
end
@name = elf.readstr(strtab, @name_p) if strtab
end
end
class Relocation
# decodes the relocation with no explicit addend (REL) pointed to by elf.encoded.ptr
# the symbol is taken from symtab if possible, otherwise the raw symbol index is kept
def decode(elf, symtab)
@offset = elf.decode_addr
set_info(elf, elf.decode_xword, symtab)
end
# same as +decode+, but with explicit addend (RELA)
def decode_addend(elf, symtab)
decode(elf, symtab)
@addend = elf.decode_sxword
end
end
# basic immediates decoding functions
def decode_uchar(edata = @encoded) edata.decode_imm(:u8, @endianness) end
def decode_half( edata = @encoded) edata.decode_imm(:u16, @endianness) end
def decode_word( edata = @encoded) edata.decode_imm(:u32, @endianness) end
def decode_sword(edata = @encoded) edata.decode_imm(:i32, @endianness) end
def decode_xword(edata = @encoded) edata.decode_imm((@bitsize == 32 ? :u32 : :u64), @endianness) end
def decode_sxword(edata= @encoded) edata.decode_imm((@bitsize == 32 ? :i32 : :i64), @endianness) end
alias decode_addr decode_xword
alias decode_off decode_xword
def readstr(str, off)
if off > 0 and i = str.index(0, off) rescue false # LoadedElf with arbitrary pointer...
str[off...i]
end
end
# transforms a virtual address to a file offset, using the addresses of the LOAD segments
def addr_to_off addr
s = @segments.find { |s| s.type == 'LOAD' and s.vaddr <= addr and s.vaddr + s.memsz > addr } if addr
addr - s.vaddr + s.offset if s
end
# make an export of +self.encoded+, returns the label name if successful
def add_label(name, addr)
if not o = addr_to_off(addr)
puts "W: Elf: #{name} points to unmmaped space #{'0x%08X' % addr}" if $VERBOSE
else
l = new_label(name)
@encoded.add_export l, o
end
l
end
# decodes the elf header, section & program header
def decode_header(off = 0)
@encoded.ptr = off
@header.decode self
raise InvalidExeFormat, "Invalid elf header size: #{@header.ehsize}" if Header.size(self) != @header.ehsize
if @header.phoff != 0
decode_program_header(@header.phoff+off)
end
if @header.shoff != 0
decode_section_header(@header.shoff+off)
end
end
# decodes the section header
# section names are read from shstrndx if possible
def decode_section_header(off = @header.shoff)
raise InvalidExeFormat, "Invalid elf section header size: #{@header.shentsize}" if Section.size(self) != @header.shentsize
@encoded.add_export new_label('section_header'), off
@encoded.ptr = off
@sections.clear
@header.shnum.times {
s = Section.new
s.decode(self)
@sections << s
}
# read sections name
if @header.shstrndx != 0 and str = @sections[@header.shstrndx] and str.encoded = @encoded[str.offset, str.size]
# LoadedElf may not have shstr mmaped
@sections[1..-1].each { |s|
s.name = readstr(str.encoded.data, s.name_p)
add_label("section_#{s.name}", s.addr) if s.name and s.addr > 0
}
end
end
# decodes the program header table
# marks the elf entrypoint as an export of +self.encoded+
def decode_program_header(off = @header.phoff)
raise InvalidExeFormat, "Invalid elf program header size: #{@header.phentsize}" if Segment.size(self) != @header.phentsize
@encoded.add_export new_label('program_header'), off
@encoded.ptr = off
@segments.clear
@header.phnum.times {
s = Segment.new
s.decode(self)
@segments << s
}
if @header.entry != 0
add_label('entrypoint', @header.entry)
end
end
# reads the dynamic symbol hash table, and checks that every global named symbol is reachable through it
# outputs a warning for each symbol that is not, if $VERBOSE is set
def check_symbols_hash(off = @tag['HASH'])
return if not @encoded.ptr = off
hash_bucket_len = decode_word
sym_count = decode_word
hash_bucket = [] ; hash_bucket_len.times { hash_bucket << decode_word }
hash_table = [] ; sym_count.times { hash_table << decode_word }
@symbols.each { |s|
next if not s.name or s.bind != 'GLOBAL' or s.shndx == 'UNDEF'
found = false
h = ELF.hash_symbol_name(s.name)
off = hash_bucket[h % hash_bucket_len]
sym_count.times { # to avoid DoS by loop
break if off == 0
if ss = @symbols[off] and ss.name == s.name
found = true
break
end
off = hash_table[off]
}
if not found
puts "W: Elf: Symbol #{s.name.inspect} not found in hash table" if $VERBOSE
end
}
end
# checks every symbol's accessibility through the gnu_hash table
def check_symbols_gnu_hash(off = @tag['GNU_HASH'])
return if not @encoded.ptr = off
# when present: the first symndx symbols are not sorted (SECTION/LOCAL/FILE/etc); symtable[symndx] is the first sorted symbol
# the sorted symbols are ordered by [gnu_hash_symbol_name(symbol.name) % hash_bucket_len]
hash_bucket_len = decode_word
symndx = decode_word # index of first sorted symbol in symtab
maskwords = decode_word # number of words in the second part of the ghash section (32 or 64 bits)
shift2 = decode_word # used in the bloom filter
bloomfilter = [] ; maskwords.times { bloomfilter << decode_xword }
# "bloomfilter[N] has bit B cleared if there is no M (M > symndx) which satisfies (C = @header.class)
# ((gnu_hash(sym[M].name) / C) % maskwords) == N &&
# ((gnu_hash(sym[M].name) % C) == B ||
# ((gnu_hash(sym[M].name) >> shift2) % C) == B)"
# bloomfilter may be [~0]
hash_bucket = [] ; hash_bucket_len.times { hash_bucket << decode_word }
# bucket[N] contains the lowest M for which
# gnu_hash(sym[M]) % nbuckets == N
# or 0 if none
symcount = 0 # XXX how do we get symcount ?
part4 = [] ; (symcount - symndx).times { part4 << decode_word }
# part4[N] contains
# (gnu_hash(sym[N].name) & ~1) | (N == dynsymcount-1 || (gnu_hash(sym[N].name) % nbucket) != (gnu_hash(sym[N+1].name) % nbucket))
# that's the hash, with its lower bit replaced by the bool [1 if i am the last sym having my hash as hash]
# TODO
end
# read dynamic tags array
def decode_tags(off = nil)
if not off
if s = @segments.find { |s| s.type == 'DYNAMIC' }
# this way it also works with LoadedELF
off = addr_to_off(s.vaddr)
elsif s = @sections.find { |s| s.type == 'DYNAMIC' }
# if no DYNAMIC segment, assume we decode an ET_REL from file
off = s.offset
end
end
return if not @encoded.ptr = off
@tag = {}
loop do
tag = decode_sxword
val = decode_xword
case tag = int_to_hash(tag, DYNAMIC_TAG)
when 'NULL'
@tag[tag] = val
break
when Integer
puts "W: Elf: unknown dynamic tag 0x#{tag.to_s 16}" if $VERBOSE
@tag[tag] ||= []
@tag[tag] << val
when 'NEEDED' # here, list of tags for which multiple occurrences are allowed
@tag[tag] ||= []
@tag[tag] << val
when 'POSFLAG_1'
puts "W: Elf: ignoring dynamic tag modifier #{tag} #{int_to_hash(val, DYNAMIC_POSFLAG_1)}" if $VERBOSE
else
if @tag[tag]
puts "W: Elf: ignoring re-occurrence of dynamic tag #{tag} (value #{'0x%08X' % val})" if $VERBOSE
else
@tag[tag] = val
end
end
end
end
# interprets tags (convert flags, arrays etc), mark them as self.encoded.export
def decode_segments_tags_interpret
if @tag['STRTAB']
if not sz = @tag['STRSZ']
puts "W: Elf: no string table size tag" if $VERBOSE
else
if l = add_label('dynamic_strtab', @tag['STRTAB'])
@tag['STRTAB'] = l
strtab = @encoded[l, sz].data
end
end
end
@tag.keys.each { |k|
case k
when Integer
when 'NEEDED'
# array of strings
if not strtab
puts "W: Elf: no string table, needed for tag #{k}" if $VERBOSE
next
end
@tag[k].map! { |v| readstr(strtab, v) }
when 'SONAME', 'RPATH', 'RUNPATH'
# string
if not strtab
puts "W: Elf: no string table, needed for tag #{k}" if $VERBOSE
next
end
@tag[k] = readstr(strtab, @tag[k])
when 'INIT', 'FINI', 'PLTGOT', 'HASH', 'GNU_HASH', 'SYMTAB', 'RELA', 'REL', 'JMPREL'
@tag[k] = add_label('dynamic_' + k.downcase, @tag[k]) || @tag[k]
when 'INIT_ARRAY', 'FINI_ARRAY', 'PREINIT_ARRAY'
next if not l = add_label('dynamic_' + k.downcase, @tag[k])
if not sz = @tag.delete(k+'SZ')
puts "W: Elf: tag #{k} has no corresponding size tag" if $VERBOSE
next
end
tab = @encoded[l, sz]
tab.ptr = 0
@tag[k] = []
while tab.ptr < tab.length
a = decode_addr(tab)
@tag[k] << (add_label("dynamic_#{k.downcase}_#{@tag[k].length}", a) || a)
end
when 'PLTREL'; @tag[k] = int_to_hash(@tag[k], DYNAMIC_TAG)
when 'FLAGS'; @tag[k] = bits_to_hash(@tag[k], DYNAMIC_FLAGS)
when 'FLAGS_1'; @tag[k] = bits_to_hash(@tag[k], DYNAMIC_FLAGS_1)
when 'FEATURES_1'; @tag[k] = bits_to_hash(@tag[k], DYNAMIC_FEATURES_1)
end
}
end
# reads the symbol table, and marks all symbols found as exports of self.encoded
# table locations are found in self.tag
# XXX symbol count is found from the hash table, this may not work with GNU_HASH-only binaries
def decode_segments_symbols
return unless @tag['STRTAB'] and @tag['STRSZ'] and @tag['SYMTAB'] and (@tag['HASH'] or @tag['GNU_HASH'])
raise "E: ELF: unsupported symbol entry size: #{@tag['SYMENT']}" if @tag['SYMENT'] != Symbol.size(self)
# find number of symbols
if @tag['HASH']
@encoded.ptr = @tag['HASH'] # assume tag already interpreted (would need addr_to_off otherwise)
decode_word
sym_count = decode_word
else
raise 'metasm internal error: TODO find sym_count from gnu_hash'
@encoded.ptr = @tag['GNU_HASH']
decode_word
sym_count = decode_word # non hashed symbols
# XXX UNDEF symbols are not hashed
end
strtab = @encoded[@tag['STRTAB'], @tag['STRSZ']].data
@encoded.ptr = @tag['SYMTAB']
@symbols.clear
sym_count.times {
s = Symbol.new
s.decode self, strtab
@symbols << s
# mark in @encoded.export
if s.name and s.shndx != 'UNDEF' and %w[NOTYPE OBJECT FUNC].include?(s.type)
if not o = addr_to_off(s.value)
# allow to point to end of segment
if not seg = @segments.find { |seg| seg.type == 'LOAD' and seg.vaddr + seg.memsz == s.value } # check end
puts "W: Elf: symbol points to unmmaped space (#{s.inspect})" if $VERBOSE and s.shndx != 'ABS'
next
end
# LoadedELF would have returned an addr_to_off = addr
o = s.value - seg.vaddr + seg.offset
end
name = s.name
while @encoded.export[name] and @encoded.export[name] != o
puts "W: Elf: symbol #{name} already seen at #{'%X' % @encoded.export[name]} - now at #{'%X' % o} (may be a different version definition)" if $VERBOSE
name += '_' # do not modify inplace
end
@encoded.add_export name, o
end
}
check_symbols_hash if $VERBOSE
check_symbols_gnu_hash if $VERBOSE
end
# decodes relocation tables (REL, RELA, JMPREL) from @tag
def decode_segments_relocs
@relocations.clear
if @encoded.ptr = @tag['REL']
raise "E: ELF: unsupported rel entry size #{@tag['RELENT']}" if @tag['RELENT'] != Relocation.size(self)
p_end = @encoded.ptr + @tag['RELSZ']
while @encoded.ptr < p_end
r = Relocation.new
r.decode self, @symbols
@relocations << r
end
end
if @encoded.ptr = @tag['RELA']
raise "E: ELF: unsupported rela entry size #{@tag['RELAENT'].inspect}" if @tag['RELAENT'] != Relocation.size_a(self)
p_end = @encoded.ptr + @tag['RELASZ']
while @encoded.ptr < p_end
r = Relocation.new
r.decode_addend self, @symbols
@relocations << r
end
end
if @encoded.ptr = @tag['JMPREL']
case reltype = @tag['PLTREL']
when 'REL'; msg = :decode
when 'RELA'; msg = :decode_addend
else raise "E: ELF: unsupported plt relocation type #{reltype}"
end
p_end = @encoded.ptr + @tag['PLTRELSZ']
while @encoded.ptr < p_end
r = Relocation.new
r.send(msg, self, @symbols)
@relocations << r
end
end
end
# use relocations as self.encoded.reloc
def decode_segments_relocs_interpret
relocproc = "arch_decode_segments_reloc_#{@header.machine.to_s.downcase}"
if not respond_to? relocproc
puts "W: Elf: relocs for arch #{@header.machine} unsupported" if $VERBOSE
@relocations.each { |r| puts Expression[r.offset] }
return
end
@relocations.each { |r|
next if r.offset == 0
if not o = addr_to_off(r.offset)
puts "W: Elf: relocation in unmmaped space (#{r.inspect})" if $VERBOSE
next
end
if @encoded.reloc[o]
puts "W: Elf: not rerelocating address #{'%08X' % r.offset}" if $VERBOSE
next
end
@encoded.ptr = o
if rel = send(relocproc, r)
@encoded.reloc[o] = rel
end
}
end
# returns the Metasm::Relocation that should be applied for reloc
# self.encoded.ptr must point to the location that will be relocated (for implicit addends)
def arch_decode_segments_reloc_386(reloc)
if reloc.symbol and n = reloc.symbol.name and reloc.symbol.shndx == 'UNDEF' and @sections and
s = @sections.find { |s| s.name and s.offset <= @encoded.ptr and s.offset + s.size > @encoded.ptr }
@encoded.add_export(new_label("#{s.name}_#{n}"), @encoded.ptr, true)
end
# decode addend if needed
case reloc.type
when 'NONE', 'COPY', 'GLOB_DAT', 'JMP_SLOT' # no addend
else addend = reloc.addend || decode_sword
end
case reloc.type
when 'NONE'
when 'RELATIVE'
# base = @segments.find_all { |s| s.type == 'LOAD' }.map { |s| s.vaddr }.min & 0xffff_f000
# compiled to be loaded at seg.vaddr
target = addend
if o = addr_to_off(target)
if not label = @encoded.inv_export[o]
label = new_label('xref_%04x' % target)
@encoded.add_export label, o
end
target = label
else
puts "W: Elf: relocation pointing out of mmaped space #{reloc.inspect}" if $VERBOSE
end
when 'GLOB_DAT', 'JMP_SLOT', '32', 'PC32', 'TLS_TPOFF', 'TLS_TPOFF32'
# XXX use versioned version
# lazy jmp_slot ?
target = 0
target = reloc.symbol.name if reloc.symbol.kind_of?(Symbol) and reloc.symbol.name
target = Expression[target, :-, reloc.offset] if reloc.type == 'PC32'
target = Expression[target, :+, addend] if addend and addend != 0
target = Expression[target, :+, 'tlsoffset'] if reloc.type == 'TLS_TPOFF'
target = Expression[:-, [target, :+, 'tlsoffset']] if reloc.type == 'TLS_TPOFF32'
when 'COPY'
# mark the address pointed as a copy of the relocation target
if not reloc.symbol or not name = reloc.symbol.name
puts "W: Elf: symbol to COPY has no name: #{reloc.inspect}" if $VERBOSE
name = ''
end
name = new_label("copy_of_#{name}")
@encoded.add_export name, @encoded.ptr
target = nil
else
puts "W: Elf: unhandled 386 reloc #{reloc.inspect}" if $VERBOSE
target = nil
end
Metasm::Relocation.new(Expression[target], :u32, @endianness) if target
end
# returns the Metasm::Relocation that should be applied for reloc
# self.encoded.ptr must point to the location that will be relocated (for implicit addends)
def arch_decode_segments_reloc_mips(reloc)
if reloc.symbol and n = reloc.symbol.name and reloc.symbol.shndx == 'UNDEF' and @sections and
s = @sections.find { |s| s.name and s.offset <= @encoded.ptr and s.offset + s.size > @encoded.ptr }
@encoded.add_export(new_label("#{s.name}_#{n}"), @encoded.ptr, true)
end
# decode addend if needed
case reloc.type
when 'NONE' # no addend
else addend = reloc.addend || decode_sword
end
case reloc.type
when 'NONE'
when '32', 'REL32'
target = 0
target = reloc.symbol.name if reloc.symbol.kind_of?(Symbol) and reloc.symbol.name
target = Expression[target, :-, reloc.offset] if reloc.type == 'REL32'
target = Expression[target, :+, addend] if addend and addend != 0
else
puts "W: Elf: unhandled MIPS reloc #{reloc.inspect}" if $VERBOSE
target = nil
end
Metasm::Relocation.new(Expression[target], :u32, @endianness) if target
end
# decodes the ELF dynamic tags, interpret them, and decodes symbols and relocs
def decode_segments_dynamic
return if not dynamic = @segments.find { |s| s.type == 'DYNAMIC' }
@encoded.ptr = add_label('dynamic_tags', dynamic.vaddr)
decode_tags
decode_segments_tags_interpret
decode_segments_symbols
decode_segments_relocs
decode_segments_relocs_interpret
end
# decodes the dynamic segment, fills segments.encoded
def decode_segments
decode_segments_dynamic
@segments.each { |s|
case s.type
when 'LOAD', 'INTERP'
s.encoded = @encoded[s.offset, s.filesz]
s.encoded.virtsize = s.memsz if s.memsz > s.encoded.virtsize
end
}
end
# decodes sections, interprets symbols/relocs, fills sections.encoded
def decode_sections
@sections.each { |s|
case s.type
when 'PROGBITS', 'NOBITS'
when 'TODO' # TODO
end
}
@sections.find_all { |s| s.type == 'PROGBITS' or s.type == 'NOBITS' }.each { |s|
if s.flags.include? 'ALLOC'
if s.type == 'NOBITS'
s.encoded = EncodedData.new :virtsize => s.size
else
s.encoded = @encoded[s.offset, s.size] || EncodedData.new
s.encoded.virtsize = s.size
end
end
}
end
# decodes the elf header, and depending on the elf type, decode segments or sections
def decode
decode_header
case @header.type
when 'DYN', 'EXEC'; decode_segments
when 'REL'; decode_sections
when 'CORE'
end
end
def each_section
@segments.each { |s| yield s.encoded, s.vaddr if s.type == 'LOAD' }
# @sections ?
end
# returns a metasm CPU object corresponding to +header.machine+
def cpu_from_headers
case @header.machine
when '386'; Ia32.new
when 'MIPS'; MIPS.new @endianness
else raise "unknown cpu #{@header.machine}"
end
end
# returns an array including the ELF entrypoint (if not null) and the FUNC symbols addresses
# TODO include init/init_array
def get_default_entrypoints
ep = []
ep << @header.entry if @header.entry != 0
@symbols.each { |s|
ep << s.value if s.shndx != 'UNDEF' and s.type == 'FUNC'
} if @symbols
ep
end
def dump_section_header(addr, edata)
if s = @segments.find { |s| s.vaddr == addr }
"\n// ELF segment at #{Expression[addr]}, flags = #{s.flags.sort.join(', ')}"
else super
end
end
# returns a disassembler with a special decodedfunction for dlsym, __libc_start_main, and a default function (i386 only)
def init_disassembler
d = super
d.backtrace_maxblocks_data = 4
case @cpu
when Ia32
old_cp = d.c_parser
d.c_parser = nil
d.parse_c 'void *dlsym(int, char *);'
d.parse_c 'void __libc_start_main(void(*)(), int, int, void(*)(), void(*)()) __attribute__((noreturn));'
dls = @cpu.decode_c_function_prototype(d.c_parser, 'dlsym')
main = @cpu.decode_c_function_prototype(d.c_parser, '__libc_start_main')
d.c_parser = old_cp
dls.btbind_callback = proc { |dasm, bind, funcaddr, calladdr, expr, origin, maxdepth|
sz = @cpu.size/8
raise 'dlsym call error' if not dasm.decoded[calladdr]
fnaddr = dasm.backtrace(Indirection.new(Expression[:esp, :+, 2*sz], sz, calladdr), calladdr, :include_start => true, :maxdepth => maxdepth)
if fnaddr.kind_of? ::Array and fnaddr.length == 1 and s = dasm.get_section_at(fnaddr.first) and fn = s[0].read(64) and i = fn.index(0) and i > sz # try to avoid ordinals
bind = bind.merge :eax => Expression[fn[0, i]]
end
bind
}
d.function[Expression['dlsym']] = dls
d.function[Expression['__libc_start_main']] = main
df = d.function[:default] = @cpu.disassembler_default_func
df.backtrace_binding[:esp] = Expression[:esp, :+, 4]
df.btbind_callback = nil
when MIPS
(d.address_binding[@header.entry] ||= {})[:$t9] ||= Expression[@header.entry]
@symbols.each { |s|
next if s.shndx == 'UNDEF' or s.type != 'FUNC'
(d.address_binding[s.value] ||= {})[:$t9] ||= Expression[s.value]
}
d.function[:default] = @cpu.disassembler_default_func
end
d
end
end
class LoadedELF < ELF
attr_accessor :load_address
def addr_to_off(addr)
@load_address ||= 0
addr >= @load_address ? addr - @load_address : addr if addr
end
# decodes the dynamic segment, fills segments.encoded
def decode_segments
decode_segments_dynamic
@segments.each { |s|
if s.type == 'LOAD'
s.encoded = @encoded[s.vaddr, s.memsz]
end
}
end
# do not try to decode the section header by default
def decode_header(off = 0)
@encoded.ptr = off
@header.decode self
decode_program_header(@header.phoff+off)
end
end
end
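# A minimal standalone sketch of the bucket/chain walk that check_symbols_hash performs, kept outside the Metasm classes.
# It assumes the usual System V ELF hash function. sysv_elf_hash, build_hash_table and hash_lookup are
# illustrative names (not Metasm methods), and the symbol table is reduced to an array of names
# in which index 0 stands for the null symbol.

```ruby
# System V ELF symbol hash, kept within 32 bits
def sysv_elf_hash(name)
  h = 0
  name.each_byte { |c|
    h = ((h << 4) + c) & 0xffff_ffff
    g = h & 0xf000_0000
    h ^= g >> 24
    h &= ~g
  }
  h
end

# builds the bucket and chain arrays for a list of symbol names
# index 0 is reserved, and doubles as the end-of-chain marker
def build_hash_table(names, nbucket)
  buckets = Array.new(nbucket, 0)
  chains  = Array.new(names.length, 0)
  names.each_with_index { |name, idx|
    next if idx == 0
    b = sysv_elf_hash(name) % nbucket
    chains[idx] = buckets[b]   # previous head of this bucket becomes our successor
    buckets[b] = idx
  }
  [buckets, chains]
end

# returns the symbol index for wanted, or nil if it is not reachable through the table
def hash_lookup(names, buckets, chains, wanted)
  idx = buckets[sysv_elf_hash(wanted) % buckets.length]
  chains.length.times {        # bounded walk, so a corrupt chain cannot loop forever
    return nil if idx == 0
    return idx if names[idx] == wanted
    idx = chains[idx]
  }
  nil
end

names = ['', 'printf', 'puts', 'dlsym']
buckets, chains = build_hash_table(names, 2)
```

# The bounded loop mirrors the sym_count.times guard in check_symbols_hash: a chain can never be longer than the symbol table.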

File diff suppressed because it is too large

View File

@ -1,172 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
require 'metasm/main'
require 'metasm/parse'
require 'metasm/encode'
require 'metasm/decode'
module Metasm
class ExeFormat
attr_accessor :cpu, :encoded
# creates a new instance, populates self.encoded with the supplied string
def self.load(str, *a)
e = new(*a)
e.encoded << str
e
end
# same as +load+, but from a file
# uses VirtualFile if available
def self.load_file(path, *a)
if defined? VirtualFile
load(VirtualFile.read(path), *a)
else
File.open(path, 'rb') { |fd| load(fd.read, *a) }
end
end
# +load_file+ then decode
def self.decode_file(path, *a)
e = load_file(path, *a)
e.decode
e
end
# +load_file+ then decode header
def self.decode_file_header(path, *a)
e = load_file(path, *a)
e.decode_header
e
end
def self.decode(raw, *a)
e = load(raw, *a)
e.decode
e
end
def self.decode_header(raw, *a)
e = load(raw, *a)
e.decode_header
e
end
# creates a new object using the specified cpu, parses the asm source, and assemble
def self.assemble(cpu, source, file=nil, lineno=nil)
caller.first =~ /^(.*?):(\d+)/
lineno ||= file ? 1 : $2.to_i+1
file ||= $1
e = new(cpu)
puts 'parsing asm' if $VERBOSE
e.parse(source, file, lineno)
puts 'assembling' if $VERBOSE
e.assemble
e
end
def self.assemble_file(cpu, filename)
assemble(cpu, File.read(filename), filename, 1)
end
# creates a new object using the specified cpu, parse/compile/assemble the C source
def self.compile_c(cpu, source, file=nil, lineno=nil)
caller.first =~ /^(.*?):(\d+)/
lineno ||= file ? 1 : $2.to_i+1
file ||= $1
e = new(cpu)
cp = cpu.new_cparser
puts 'parsing C' if $VERBOSE
cp.parse(source, file, lineno)
puts 'compiling C' if $VERBOSE
asm_source = cpu.new_ccompiler(cp, e).compile
puts 'parsing asm' if $VERBOSE
e.parse(asm_source, 'C compiler output', 1)
puts 'assembling' if $VERBOSE
e.assemble
e.c_set_default_entrypoint
e
end
def self.compile_c_file(cpu, filename)
compile_c(cpu, File.read(filename), filename, 1)
end
# add directive to change the current assembler section to the assembler source +src+
def compile_setsection(src, section)
src << section
end
def c_set_default_entrypoint
end
attr_accessor :disassembler
# returns the exe disassembler
# if it does not exist, creates one, and feeds it with the exe sections
def init_disassembler
@cpu ||= cpu_from_headers
@disassembler = Disassembler.new(self)
each_section { |edata, base| @disassembler.add_section edata, base }
@disassembler
end
# disassembles the specified entrypoints
# initializes the disassembler if needed
# uses get_default_entrypoints if the argument list is empty
# returns the disassembler
def disassemble(*entrypoints)
init_disassembler if not disassembler
entrypoints = get_default_entrypoints if entrypoints.empty?
@disassembler.disassemble(*entrypoints)
end
# returns a list of entrypoints to disassemble (program entrypoint, exported functions...)
def get_default_entrypoints
[]
end
# encodes the executable as a string, checks that all relocations are
# resolved, and returns the raw string version
def encode_string(*a)
encode(*a)
raise ["Unresolved relocations:", @encoded.reloc.map { |o, r| "#{r.target} " + (Backtrace.backtrace_str(r.backtrace) if r.backtrace).to_s }].join("\n") if not @encoded.reloc.empty?
@encoded.data
end
# saves the result of +encode_string+ in the specified file
# overwrites the file if it already exists (the existence check below is disabled)
def encode_file(path, *a)
#raise Errno::EEXIST, path if File.exist? path # race, but cannot use O_EXCL, as O_BINARY is not defined in ruby
encode_string(*a)
File.open(path, 'wb') { |fd| fd.write(@encoded.data) }
end
# converts a constant name to its numeric value using the hash
# {1 => 'toto', 2 => 'tata'}: 'toto' => 1, 42 => 42, 'tutu' => raise
def int_from_hash(val, hash)
val.kind_of?(Integer) ? val : hash.index(val) or raise "unknown constant #{val.inspect}"
end
# converts an array of flag constants to its numeric value using the hash
# {1 => 'toto', 2 => 'tata'}: ['toto', 'tata'] => 3, 'toto' => 1, 42 => 42
def bits_from_hash(val, hash)
val.kind_of?(Array) ? val.inject(0) { |val, bitname| val | int_from_hash(bitname, hash) } : int_from_hash(val, hash)
end
# converts a numeric value to the corresponding constant name using the hash
# {1 => 'toto', 2 => 'tata'}: 1 => 'toto', 42 => 42, 'tata' => 'tata', 'tutu' => raise
def int_to_hash(val, hash)
val.kind_of?(Integer) ? hash.fetch(val, val) : (hash.index(val) ? val : raise("unknown constant #{val.inspect}"))
end
# converts a numeric value to the corresponding array of constant flag names using the hash
# {1 => 'toto', 2 => 'tata'}: 5 => ['toto', 4]
def bits_to_hash(val, hash)
(val.kind_of?(Integer) ? (hash.find_all { |k, v| val & k == k and val &= ~k }.map { |k, v| v } << val) : val.kind_of?(Array) ? val.map { |e| int_to_hash(e, hash) } : [int_to_hash(val, hash)]) - [0]
end
end
end
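# A minimal standalone sketch of the flag conversions documented above (bits_to_hash / bits_from_hash).
# flag_names and flag_value are illustrative names, not Metasm methods. The sketch assumes a current Ruby
# and therefore uses Hash#key where the code above uses the older Hash#index.

```ruby
FLAGS = { 1 => 'toto', 2 => 'tata' }

# splits an integer into known flag names, keeping any unknown bits as a trailing integer
def flag_names(val, table)
  names = []
  table.each { |bit, name|
    next if val & bit != bit
    names << name
    val &= ~bit                # clear the bit so that only unknown bits remain
  }
  names << val if val != 0
  names
end

# the reverse operation: ORs together flag names (and raw integers) into one value
def flag_value(names, table)
  names.inject(0) { |acc, n|
    bit = n.kind_of?(Integer) ? n : table.key(n)
    raise ArgumentError, "unknown constant #{n.inspect}" if not bit
    acc | bit
  }
end

p flag_names(5, FLAGS)
p flag_value(['toto', 'tata'], FLAGS)
```

# Keeping the unknown bits as an integer lets the value round-trip: flag_value(flag_names(v, t), t) gives back v.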

View File

@ -1,185 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
require 'metasm/exe_format/main'
require 'metasm/encode'
require 'metasm/decode'
module Metasm
class MZ < ExeFormat
class Header
Fields = [:magic, :cblp, :cp, :crlc, :cparhdr, :minalloc, :maxalloc,
:ss, :sp, :csum, :ip, :cs, :lfarlc, :ovno]
attr_accessor(*Fields)
def encode(mz, relocs)
h = EncodedData.new
set_default_values mz, h, relocs
h << @magic
Fields[1..-1].each { |m| h << mz.encode_word(send(m)) }
h.align 16
h
end
def set_default_values mz, h, relocs
@magic ||= 'MZ'
@cblp ||= Expression[[mz.label_at(mz.body, mz.body.virtsize), :-, mz.label_at(h, 0)], :%, 512] # number of bytes used in last page
@cp ||= Expression[[mz.label_at(mz.body, mz.body.virtsize), :-, mz.label_at(h, 0)], :/, 512] # number of pages used
@crlc ||= relocs.virtsize/4
@cparhdr ||= Expression[[mz.label_at(relocs, 0), :-, mz.label_at(h, 0)], :/, 16] # header size in paragraphs (16 bytes each)
@minalloc ||= ((mz.body.virtsize - mz.body.rawsize) + 15) / 16
@maxalloc ||= @minalloc
@ss ||= 0
@sp ||= 0 # ss:sp points at 1st byte of body => works if body does not reach end of segment (or maybe the overflow makes the stack go to header space)
@csum ||= 0
@ip ||= 0
@cs ||= 0
@lfarlc ||= Expression[mz.label_at(relocs, 0), :-, mz.label_at(h, 0)]
@ovno ||= 0
end
def decode(mz)
@magic = mz.encoded.read 2
raise InvalidExeFormat, "Invalid MZ signature #{@magic.inspect}" if @magic != 'MZ'
Fields[1..-1].each { |m| send("#{m}=", mz.decode_word) }
end
end
class Relocation
attr_accessor :segment, :offset
def encode(mz)
mz.encode_word(@offset) << mz.encode_word(@segment)
end
def decode(mz)
@offset = mz.decode_word
@segment = mz.decode_word
end
end
# encodes a word in 16 bits
def encode_word(val) Expression[val].encode(:u16, @endianness) end
# decodes a 16bits word from self.encoded
def decode_word(edata = @encoded) edata.decode_imm(:u16, @endianness) end
attr_accessor :endianness, :header, :source
# the EncodedData representing the content of the file
attr_accessor :body
# an array of Relocations - quite obscure
attr_accessor :relocs
def initialize(cpu=nil)
@endianness = cpu ? cpu.endianness : :little
@relocs = []
@header = Header.new
@body = EncodedData.new
@source = []
super(cpu)
end
# assembles the source in the body, clears the source
def assemble
@body << assemble_sequence(@source, @cpu)
@body.fixup @body.binding
# XXX should create @relocs here
@source.clear
end
# sets up @cursource
def parse_init
@cursource = @source
super
end
# encodes the header and the relocation table, returns them in an array, along with the body
def pre_encode
relocs = @relocs.inject(EncodedData.new) { |edata, r| edata << r.encode(self) }
header = @header.encode self, relocs
[header, relocs, @body]
end
# defines the exe-specific parser instructions:
# .entrypoint [<label>]: defines the program entrypoint to label (or create a new label at this location)
def parse_parser_instruction(instr)
case instr.raw.downcase
when '.entrypoint'
# ".entrypoint <somelabel/expression>" or ".entrypoint" (here)
@lexer.skip_space
if tok = @lexer.nexttok and tok.type == :string
raise instr, 'syntax error' if not entrypoint = Expression.parse(@lexer)
else
entrypoint = new_label('entrypoint')
@cursource << Label.new(entrypoint, instr.backtrace.dup)
end
@header.ip = Expression[entrypoint, :-, label_at(@body, 0, 'body')]
@lexer.skip_space
raise instr, 'eol expected' if t = @lexer.nexttok and t.type != :eol
end
end
# concats the header, relocation table and body
def encode
pre_encode.inject(@encoded) { |edata, pe| edata << pe }
@encoded.fixup @encoded.binding
end
# returns the raw content of the mz file, with updated checksum
def encode_string
super
encode_fix_checksum
@encoded.data
end
# sets the file checksum (untested)
def encode_fix_checksum
@encoded.ptr = 0
decode_header
mzlen = @header.cp * 512 + @header.cblp
@encoded.ptr = 0
csum = -@header.csum
(mzlen/2).times { csum += decode_word }
csum &= 0xffff
@encoded[2*Header::Fields.index(:csum), 2] = encode_word(csum)
end
# decodes the MZ header from the current offset in self.encoded
def decode_header
@header.decode self
end
# decodes the relocation table
def decode_relocs
@relocs.clear
@encoded.ptr = @header.lfarlc
@header.crlc.times {
r = Relocation.new
r.decode self
@relocs << r
}
end
# decodes the main part of the program
# mostly defines the 'start' export, to point to the MZ entrypoint
def decode_body
@body = @encoded[@header.cparhdr*16...@header.cp*512+@header.cblp]
@body.virtsize += @header.minalloc * 16
@body.add_export 'start', @header.cs * 16 + @header.ip
end
def decode
decode_header
decode_relocs
decode_body
end
def each_section
yield @body, 0
end
end
end
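# A minimal standalone sketch of the MZ header layout decoded above, using String#unpack instead of the
# Metasm EncodedData API. parse_mz_header and mz_body_range are illustrative names, not Metasm methods.
# It assumes a little-endian header made of the 2-byte magic followed by 13 16-bit words, in the order of
# Header::Fields, and computes the body range the same way decode_body does.

```ruby
MZ_FIELDS = [:magic, :cblp, :cp, :crlc, :cparhdr, :minalloc, :maxalloc,
             :ss, :sp, :csum, :ip, :cs, :lfarlc, :ovno]

# decodes the 28 header bytes into a field => value hash
def parse_mz_header(raw)
  values = raw.unpack('a2v13')   # 'a2' = 2 raw bytes, 'v' = 16-bit little-endian word
  raise ArgumentError, "invalid MZ signature #{values.first.inspect}" if values.first != 'MZ'
  raise ArgumentError, 'truncated MZ header' if values.include?(nil)
  Hash[MZ_FIELDS.zip(values)]
end

# file offsets covered by the program body, following decode_body:
# from the end of the header (in 16-byte paragraphs) up to cp*512 + cblp
def mz_body_range(hdr)
  (hdr[:cparhdr] * 16)...(hdr[:cp] * 512 + hdr[:cblp])
end

sample = ['MZ', 0x90, 3, 0, 4, 0, 0xffff, 0, 0xb8, 0, 0, 0, 0x40, 0].pack('a2v13')
hdr = parse_mz_header(sample)
p hdr[:lfarlc], mz_body_range(hdr)
```

# Note that the end offset follows the convention used in this file; other tools may count a partial last page differently.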

View File

@ -1,353 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
require 'metasm/exe_format/main'
require 'metasm/exe_format/mz'
require 'metasm/exe_format/coff_encode'
require 'metasm/exe_format/coff_decode'
module Metasm
class PE < COFF
PESIG = "PE\0\0"
attr_accessor :coff_offset, :signature, :mz
def initialize(cpu=nil)
super(cpu)
@mz = MZ.new(cpu).share_namespace(self)
end
# overrides COFF#decode_header
# simply sets the offset to the PE pointer before decoding the COFF header
# also checks the PE signature
def decode_header
@cursection ||= self
@encoded.ptr = 0x3c
@encoded.ptr = decode_word
@signature = @encoded.read(4)
raise InvalidExeFormat, "Invalid PE signature #{@signature.inspect}" if @signature != PESIG
@coff_offset = @encoded.ptr
if @mz.encoded.empty?
@mz.encoded << @encoded[0, @coff_offset-4]
@mz.encoded.ptr = 0
@mz.decode_header
end
super
end
# creates a default MZ file to be used in the PE header
# this one is specially crafted to fit in the 0x3c bytes before the signature
def encode_default_mz_header
# XXX use single-quoted source, to avoid ruby interpretation of \r\n
@mz.cpu = Ia32.new(386, 16)
@mz.parse <<'EOMZSTUB'
db "Needs Win32!\r\n$"
.entrypoint
push cs
pop ds
xor dx, dx ; ds:dx = addr of $-terminated string
mov ah, 9 ; output string
int 21h
mov ax, 4c01h ; exit with code in al
int 21h
EOMZSTUB
@mz.assemble
mzparts = @mz.pre_encode
# put stuff before 0x3c
@mz.encoded << mzparts.shift
raise 'OH NOES !!1!!!1!' if @mz.encoded.virtsize > 0x3c # MZ header is too long, cannot happen
until mzparts.empty?
break if mzparts.first.virtsize + @mz.encoded.virtsize > 0x3c
@mz.encoded << mzparts.shift
end
# set PE signature pointer
@mz.encoded.align 0x3c
@mz.encoded << encode_word('pesigptr')
# put last parts of the MZ program
until mzparts.empty?
@mz.encoded << mzparts.shift
end
# ensure the sig will be 8bytes-aligned
@mz.encoded.align 8
@mz.encoded.fixup 'pesigptr' => @mz.encoded.virtsize
@mz.encoded.fixup @mz.encoded.binding
@mz.encoded.fill
@mz.encode_fix_checksum
end
# encodes the PE header before the COFF header, uses a default mz header if none defined
# the MZ header must have 0x3c pointing just past its last byte, which should be 8-byte aligned
# the first 2 bytes of the MZ header should be 'MZ'
def encode_header(*a)
encode_default_mz_header if @mz.encoded.empty?
@encoded << @mz.encoded.dup
# append the PE signature
@signature ||= PESIG
@encoded << @signature
super
end
# returns a new PE with only minimal information copied:
# section name/perm/addr/content
# exports
# imports (with boundimport cleared)
# resources
def mini_copy(share_ns=true)
ret = self.class.new(@cpu)
ret.share_namespace(self) if share_ns
ret.header.machine = @header.machine
ret.optheader.entrypoint = @optheader.entrypoint
ret.optheader.image_base = @optheader.image_base
@sections.each { |s|
rs = Section.new
rs.name = s.name
rs.virtaddr = s.virtaddr
rs.characteristics = s.characteristics
rs.encoded = s.encoded
ret.sections << rs
}
ret.resource = resource
ret.tls = tls
if imports
ret.imports = @imports.map { |id| id.dup }
ret.imports.each { |id|
id.timestamp = id.firstforwarder =
id.ilt_p = id.libname_p = nil
}
end
ret.export = export
ret
end
def c_set_default_entrypoint
return if @optheader.entrypoint
if @sections.find { |s| s.encoded.export['main'] }
@optheader.entrypoint = 'main'
elsif @sections.find { |s| s.encoded.export['DllEntryPoint'] }
@optheader.entrypoint = 'DllEntryPoint'
elsif @sections.find { |s| s.encoded.export['DllMain'] }
cp = @cpu.new_cparser
cp.parse <<EOS
enum { DLL_PROCESS_DETACH, DLL_PROCESS_ATTACH, DLL_THREAD_ATTACH, DLL_THREAD_DETACH, DLL_PROCESS_VERIFIER };
__stdcall int DllMain(void *handle, unsigned long reason, void *reserved);
__stdcall int DllEntryPoint(void *handle, unsigned long reason, void *reserved) {
int ret = DllMain(handle, reason, reserved);
if (ret == 0 && reason == DLL_PROCESS_ATTACH)
DllMain(handle, DLL_PROCESS_DETACH, reserved);
return ret;
}
EOS
parse(@cpu.new_ccompiler(cp, self).compile)
assemble
@optheader.entrypoint = 'DllEntryPoint'
elsif @sections.find { |s| s.encoded.export['WinMain'] }
cp = @cpu.new_cparser
cp.parse <<EOS
#define GetCommandLine GetCommandLineA
#define GetModuleHandle GetModuleHandleA
#define GetStartupInfo GetStartupInfoA
#define STARTF_USESHOWWINDOW 0x00000001
#define SW_SHOWDEFAULT 10
typedef unsigned long DWORD;
typedef unsigned short WORD;
typedef struct {
DWORD cb; char *lpReserved, *lpDesktop, *lpTitle;
DWORD dwX, dwY, dwXSize, dwYSize, dwXCountChars, dwYCountChars, dwFillAttribute, dwFlags;
WORD wShowWindow, cbReserved2; char *lpReserved2;
void *hStdInput, *hStdOutput, *hStdError;
} STARTUPINFO;
__stdcall void *GetModuleHandleA(const char *lpModuleName);
__stdcall void GetStartupInfoA(STARTUPINFO *lpStartupInfo);
__stdcall void ExitProcess(unsigned int uExitCode);
__stdcall char *GetCommandLineA(void);
__stdcall int WinMain(void *hInstance, void *hPrevInstance, char *lpCmdLine, int nShowCmd);
int main(void) {
STARTUPINFO startupinfo;
startupinfo.cb = sizeof(STARTUPINFO);
char *cmd = GetCommandLine();
int ret;
if (*cmd == '"') {
cmd++;
while (*cmd && *cmd != '"') {
if (*cmd == '\\\\') cmd++;
cmd++;
}
if (*cmd == '"') cmd++;
} else
while (*cmd && *cmd != ' ') cmd++;
while (*cmd == ' ') cmd++;
GetStartupInfo(&startupinfo);
ret = WinMain(GetModuleHandle(0), 0, cmd, (startupinfo.dwFlags & STARTF_USESHOWWINDOW) ? (int)startupinfo.wShowWindow : (int)SW_SHOWDEFAULT);
ExitProcess((DWORD)ret);
return ret;
}
EOS
parse(@cpu.new_ccompiler(cp, self).compile)
assemble
@optheader.entrypoint = 'main'
end
end
# handles writes to fs:[0] -> dasm SEH handler (first only, does not follow the chain)
# TODO seh prototype (args => context)
# TODO hook on (non)resolution of :w xref
def get_xrefs_x(dasm, di)
if @cpu.kind_of? Ia32 and a = di.instruction.args.first and a.kind_of? Ia32::ModRM and a.seg and a.seg.val == 4 and
w = get_xrefs_rw(dasm, di).find { |type, ptr, len| type == :w and ptr.externals.include? 'segment_base_fs' } and
dasm.backtrace(Expression[w[1], :-, 'segment_base_fs'], di.address) == [Expression[0]]
sehptr = w[1]
sz = @cpu.size/8
sehptr = Indirection.new(Expression[Indirection.new(sehptr, sz, di.address), :+, sz], sz, di.address)
a = dasm.backtrace(sehptr, di.address, :include_start => true, :origin => di.address, :type => :x, :detached => true)
puts "backtrace seh from #{di} => #{a.map { |addr| Expression[addr] }.join(', ')}" if $VERBOSE
a.each { |aa|
next if aa == Expression::Unknown
l = dasm.auto_label_at(aa, 'seh', 'loc', 'sub')
dasm.addrs_todo << [aa]
}
super
else
super
end
end
# returns a disassembler with a special decodedfunction for GetProcAddress (i386 only), and the default func
def init_disassembler
d = super
d.backtrace_maxblocks_data = 4
if @cpu.kind_of? Ia32
old_cp = d.c_parser
d.c_parser = nil
d.parse_c '__stdcall void *GetProcAddress(int, char *);'
gpa = @cpu.decode_c_function_prototype(d.c_parser, 'GetProcAddress')
d.c_parser = old_cp
@getprocaddr_unknown = []
gpa.btbind_callback = proc { |dasm, bind, funcaddr, calladdr, expr, origin, maxdepth|
break bind if @getprocaddr_unknown.include? [dasm, calladdr] or not Expression[expr].externals.include? :eax
sz = @cpu.size/8
break bind if not dasm.decoded[calladdr]
fnaddr = dasm.backtrace(Indirection.new(Expression[:esp, :+, 2*sz], sz, calladdr), calladdr, :include_start => true, :maxdepth => maxdepth)
if fnaddr.kind_of? ::Array and fnaddr.length == 1 and s = dasm.get_section_at(fnaddr.first) and fn = s[0].read(64) and i = fn.index(0) and i > sz # try to avoid ordinals
bind = bind.merge :eax => Expression[fn[0, i]]
else
@getprocaddr_unknown << [dasm, calladdr]
puts "unknown func name for getprocaddress from #{Expression[calladdr]}" if $VERBOSE
end
bind
}
d.function[Expression['GetProcAddress']] = gpa
d.function[:default] = @cpu.disassembler_default_func
end
d
end
end
# an instance of a PE file, loaded in memory
# just change the rva_to_off and the section content decoding methods
class LoadedPE < PE
attr_accessor :load_address
# use the virtualaddr/virtualsize fields of the section header
def decode_section_body(s)
s.encoded = @encoded[s.virtaddr, s.virtsize]
end
# reads a loaded PE from memory, returns a PE object
# dumps the header, optheader and all sections, then tries to rebuild the IAT (see #memdump_imports)
def self.memdump(memory, baseaddr, entrypoint = nil)
loaded = LoadedPE.load memory[baseaddr, 0x1000_0000]
loaded.load_address = baseaddr
loaded.decode
dump = PE.new(loaded.cpu_from_headers)
dump.share_namespace loaded
dump.optheader.image_base = baseaddr
dump.optheader.entrypoint = (entrypoint || loaded.optheader.entrypoint + baseaddr) - baseaddr
dump.directory['resource_table'] = loaded.directory['resource_table']
loaded.sections.each { |s|
ss = Section.new
ss.name = s.name
ss.virtaddr = s.virtaddr
ss.encoded = s.encoded
ss.characteristics = s.characteristics
dump.sections << ss
}
loaded.memdump_imports(memory, dump)
dump
end
# rebuilds an IAT from the loaded pe and the memory
# for each loaded iat, find the matching dll in memory
# for each loaded iat entry, retrieve the exported name from the loaded dll
# then build the dump iat
# scans page by page backward from the first iat address for the loaded dll (must not be forwarded)
# TODO bound imports
def memdump_imports(memory, dump)
dump.imports ||= []
@imports.each { |id|
next if not id.iat or not id.iat.first
addr = id.iat.first & ~0xffff
256.times { break if memory[addr, 2] == 'MZ' ; addr -= 0x10000 }
next if memory[addr, 2] != 'MZ'
loaded_dll = LoadedPE.load memory[addr, 0x1000_0000]
loaded_dll.load_address = addr
loaded_dll.decode_header
loaded_dll.decode_exports
next if not loaded_dll.export
dump_id = ImportDirectory.new
dump_id.libname = loaded_dll.export.libname
dump_id.imports = []
dump_id.iat_p = id.iat_p
id.iat.each { |ptr|
if not e = loaded_dll.export.exports.find { |e| e.target == ptr - loaded_dll.load_address }
# check for forwarder
# XXX won't handle forwarder to forwarder
addr = ptr & ~0xffff
256.times { break if memory[addr, 2] == 'MZ' ; addr -= 0x10000 }
if memory[addr, 2] == 'MZ'
f_dll = LoadedPE.load memory[addr, 0x1000_0000]
f_dll.decode_header ; f_dll.decode_exports
if f_dll.export and ee = f_dll.export.exports.find { |ee| ee.target == ptr - addr }
e = loaded_dll.export.exports.find { |e| e.forwarder_name == ee.name }
end
end
if not e
dump_id = nil
break
end
end
dump_id.imports << ImportDirectory::Import.new
if e.name
dump_id.imports.last.name = e.name
else
dump_id.imports.last.ordinal = e.ordinal
end
}
dump.imports << dump_id if dump_id
} if @imports
end
end
end
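
The import rebuilding in memdump_imports finds the DLL that owns an IAT pointer by masking the pointer down to a 64 KiB boundary and stepping backward, at most 256 times, until it reads an 'MZ' signature. Below is a minimal standalone sketch of that scan. FakeMemory and find_module_base are hypothetical names used only for illustration; they are not part of metasm, and the memory object is assumed to answer memory[addr, len] the way the original code uses it.

```ruby
# Hypothetical stand-in for a process memory object: maps base address => bytes.
class FakeMemory
  def initialize(regions)
    @regions = regions
  end

  # Returns len bytes at addr, or nil when addr is not mapped.
  def [](addr, len)
    @regions.each do |base, data|
      off = addr - base
      return data[off, len] if off >= 0 && off < data.length
    end
    nil
  end
end

# Walks backward in 64 KiB steps from ptr, at most max_steps times,
# and returns the first address holding an 'MZ' signature (nil if none).
def find_module_base(memory, ptr, max_steps = 256)
  addr = ptr & ~0xffff
  max_steps.times do
    break if memory[addr, 2] == 'MZ'
    addr -= 0x10000
  end
  memory[addr, 2] == 'MZ' ? addr : nil
end

# One fake module mapped at 0x7c800000, a bit over 192 KiB long.
mem = FakeMemory.new({ 0x7c80_0000 => 'MZ' + "\0" * 0x30000 })
base = find_module_base(mem, 0x7c82_1234)
puts(base ? format('module base: 0x%x', base) : 'no module found')
```

The step count bounds the search to 16 MiB below the pointer, so a pointer that does not belong to any mapped module ends with nil instead of scanning forever.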


@ -1,96 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
require 'metasm/exe_format/main'
module Metasm
# a shellcode is a simple sequence of instructions
class Shellcode < ExeFormat
# the array of source elements (Instr/Data etc)
attr_accessor :source
# the base address of the shellcode (nil if unspecified)
attr_accessor :base_addr
def initialize(cpu=nil, base_addr=nil)
@base_addr = base_addr
@source = []
super(cpu)
end
def parse_init
@cursource = @source
super
end
# allows definition of the base address
def parse_parser_instruction(instr)
case instr.raw.downcase
when '.base_addr'
# ".base_addr <expression>"
# expression should #reduce to integer
@lexer.skip_space
raise instr, 'syntax error' if not @base_addr = Expression.parse(@lexer).reduce
raise instr, 'syntax error' if tok = @lexer.nexttok and tok.type != :eol
else super
end
end
def get_section_at(addr)
base = @base_addr || 0
if not addr.kind_of? Integer
[@encoded, addr] if @encoded.ptr = @encoded.export[addr]
elsif addr >= base and addr < base + @encoded.virtsize
@encoded.ptr = addr - base
[@encoded, addr]
end
end
def each_section
yield @encoded, (@base_addr || 0)
end
# encodes the source found in self.source
# appends it to self.encoded
# clears self.source
# the optional parameter may contain a binding used to fixup! self.encoded
# uses self.base_addr if it exists
def assemble(binding={})
@encoded << assemble_sequence(@source, @cpu)
@source.clear
@encoded.fixup! binding
@encoded.fixup @encoded.binding(@base_addr)
@encoded.fill @encoded.rawsize
self
end
alias encode assemble
def decode
end
def self.disassemble(cpu, str, eip=0)
sc = decode(str, cpu)
sc.disassemble(eip)
end
def compile_setsection(src, section)
end
def dump_section_header(addr, edata)
''
end
def get_default_entrypoints
[@base_addr || 0]
end
# returns an anonymous subclass of Shellcode whose cpu_from_headers will return cpu
def self.withcpu(cpu)
c = Class.new(self)
c.send(:define_method, :cpu_from_headers) { cpu }
c
end
end
end
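
Shellcode.withcpu builds a subclass on the fly whose cpu_from_headers returns a fixed value captured from the caller. The sketch below shows the same Class.new plus define_method pattern on a toy class. ToyFormat is a hypothetical name, and a plain symbol stands in for a real CPU object.

```ruby
# Hypothetical base class standing in for Shellcode.
class ToyFormat
  # Subclasses created by withcpu override this.
  def cpu_from_headers
    raise NotImplementedError, 'no cpu information available'
  end

  # Returns an anonymous subclass whose cpu_from_headers returns cpu.
  # The block passed to define_method closes over the cpu argument.
  def self.withcpu(cpu)
    c = Class.new(self)
    c.send(:define_method, :cpu_from_headers) { cpu }
    c
  end
end

klass = ToyFormat.withcpu(:ia32)
puts klass.new.cpu_from_headers.inspect
puts klass.superclass
```

Because each call returns a fresh class, two formats bound to different CPUs can coexist without touching the base class.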


@ -1,272 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
require 'metasm/exe_format/main'
require 'metasm/encode'
require 'metasm/decode'
module Metasm
class XCoff < ExeFormat
FLAGS = { 1 => 'RELFLG', 2 => 'EXEC', 4 => 'LNNO',
0x200 => 'AR32W', 0x400 => 'PATCH', 0x1000 => 'DYNLOAD',
0x2000 => 'SHROBJ', 0x4000 => 'LOADONLY' }
SECTION_FLAGS = { 8 => 'PAD', 0x20 => 'TEXT', 0x40 => 'DATA', 0x80 => 'BSS',
0x100 => 'EXCEPT', 0x200 => 'INFO', 0x1000 => 'LOADER',
0x2000 => 'DEBUG', 0x4000 => 'TYPCHK', 0x8000 => 'OVRFLO' }
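# optheader, sections and symbols are read through accessors by Header and Section
attr_accessor :header, :optheader, :sections, :segments, :symbols, :relocs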
class Header
attr_accessor :nsec, :timdat, :symptr, :nsym, :opthdr, :flags
attr_accessor :endianness, :intsize
def decode(xcoff)
@endianness, @intsize = case xcoff.encoded.read(2)
when "\1\xdf"; [:big, 32]
when "\xdf\1"; [:little, 32]
when "\1\xef"; [:big, 64]
when "\xef\1"; [:little, 64]
else raise InvalidExeFormat, "invalid XCOFF signature"
end
@nsec = xcoff.decode_half
@timdat = xcoff.decode_word
@symptr = xcoff.decode_xword
@nsym = xcoff.decode_word
@opthdr = xcoff.decode_half
@flags = xcoff.bits_to_hash(xcoff.decode_half, FLAGS)
end
def encode(xcoff)
set_default_values xcoff
EncodedData.new <<
xcoff.encode_half(@intsize == 32 ? 0x1df : 0x1ef) <<
xcoff.encode_half(@nsec) <<
xcoff.encode_word(@timdat) <<
xcoff.encode_xword(@symptr) <<
xcoff.encode_word(@nsym) <<
xcoff.encode_half(@opthdr) <<	# 16-bit fields, must match decode_half in #decode
xcoff.encode_half(xcoff.bits_from_hash(@flags, FLAGS))
end
def set_default_values(xcoff)
@endianness ||= xcoff.cpu ? xcoff.cpu.endianness : :little
@intsize ||= xcoff.cpu ? xcoff.cpu.size : 32
@nsec ||= xcoff.sections.size
@timdat ||= 0
@symptr ||= xcoff.symbols ? xcoff.new_label('symptr') : 0
@nsym ||= xcoff.symbols ? xcoff.symbols.length : 0
@opthdr ||= xcoff.optheader ? OptHeader.size(xcoff) : 0
@flags ||= 0
end
end
class OptHeader
attr_accessor :magic, :vstamp, :tsize, :dsize, :bsize, :entry, :text_start,
:data_start, :toc, :snentry, :sntext, :sndata, :sntoc, :snloader, :snbss,
:algntext, :algndata, :modtype, :cpu, :maxstack, :maxdata, :debugger, :res
def self.size(xcoff)
xcoff.header.intsize == 32 ? 2*2+7*4+10*2+2*4+2+8 : 2*2+7*8+10*2+2*8+2+120
end
def decode(xcoff)
@magic = xcoff.decode_half
@vstamp = xcoff.decode_half
@tsize = xcoff.decode_xword
@dsize = xcoff.decode_xword
@bsize = xcoff.decode_xword
@entry = xcoff.decode_xword
@text_start = xcoff.decode_xword
@data_start = xcoff.decode_xword
@toc = xcoff.decode_xword
@snentry = xcoff.decode_half
@sntext = xcoff.decode_half
@sndata = xcoff.decode_half
@sntoc = xcoff.decode_half
@snloader = xcoff.decode_half
@snbss = xcoff.decode_half
@algntext = xcoff.decode_half
@algndata = xcoff.decode_half
@modtype = xcoff.decode_half
@cpu = xcoff.decode_half
@maxstack = xcoff.decode_xword
@maxdata = xcoff.decode_xword
@debugger = xcoff.decode_word
@res = xcoff.read(xcoff.header.intsize == 32 ? 8 : 120)
end
def encode(xcoff)
set_default_values xcoff
EncodedData.new <<
xcoff.encode_half(@magic) <<
xcoff.encode_half(@vstamp) <<
xcoff.encode_xword(@tsize) <<
xcoff.encode_xword(@dsize) <<
xcoff.encode_xword(@bsize) <<
xcoff.encode_xword(@entry) <<
xcoff.encode_xword(@text_start) <<
xcoff.encode_xword(@data_start) <<
xcoff.encode_xword(@toc) <<
xcoff.encode_half(@snentry) <<
xcoff.encode_half(@sntext) <<
xcoff.encode_half(@sndata) <<
xcoff.encode_half(@sntoc) <<
xcoff.encode_half(@snloader) <<
xcoff.encode_half(@snbss) <<
xcoff.encode_half(@algntext) <<
xcoff.encode_half(@algndata) <<
xcoff.encode_half(@modtype) <<
xcoff.encode_half(@cpu) <<
xcoff.encode_xword(@maxstack) <<
xcoff.encode_xword(@maxdata) <<
xcoff.encode_word(@debugger) <<
@res
end
def set_default_values(xcoff)
@magic ||= 0
@vstamp ||= 1
@tsize ||= 0
@dsize ||= 0
@bsize ||= 0
@entry ||= 0
@text_start ||= 0
@data_start ||= 0
@toc ||= 0
@snentry ||= 1
@sntext ||= 1
@sndata ||= 2
@sntoc ||= 3
@snloader ||= 4
@snbss ||= 5
@algntext ||= 0
@algndata ||= 0
@modtype ||= 0
@res ||= 0.chr * (xcoff.header.intsize == 32 ? 8 : 120)
end
end
class Section
attr_accessor :name, :paddr, :vaddr, :size, :scnptr, :relptr, :lnnoptr, :nreloc, :nlnno, :flags
attr_accessor :encoded
def decode(xcoff)
@name = xcoff.read(8)
@name = @name[0, @name.index(0)] if @name.index(0)	# strip NUL padding
@paddr = xcoff.decode_xword
@vaddr = xcoff.decode_xword
@size = xcoff.decode_xword
@scnptr = xcoff.decode_xword
@relptr = xcoff.decode_xword
@lnnoptr = xcoff.decode_xword
xhalf = xcoff.header.intsize == 32 ? 'decode_half' : 'decode_word'
@nreloc = xcoff.send xhalf
@nlnno = xcoff.send xhalf
@flags = xcoff.bits_to_hash(xcoff.send(xhalf), SECTION_FLAGS)
end
def encode(xcoff)
set_default_values xcoff
n = EncodedData.new << @name
raise "name #@name too long" if n.virtsize > 8
n.virtsize = 8
xhalf = xcoff.header.intsize == 32 ? 'half' : 'word'
n <<
xcoff.encode_xword(@paddr) <<
xcoff.encode_xword(@vaddr) <<
xcoff.encode_xword(@size) <<
xcoff.encode_xword(@scnptr) <<
xcoff.encode_xword(@relptr) <<
xcoff.encode_xword(@lnnoptr) <<
xcoff.send("encode_#{xhalf}", @nreloc) <<
xcoff.send("encode_#{xhalf}", @nlnno) <<
xcoff.send("encode_#{xhalf}", xcoff.bits_from_hash(@flags, SECTION_FLAGS))
end
def set_default_values(xcoff)
@name ||= @flags.kind_of?(::Array) ? ".#{@flags.first.to_s.downcase}" : ''
@vaddr ||= @paddr ? @paddr : @encoded ? xcoff.label_at(@encoded, 0, 's_vaddr') : 0
@paddr ||= @vaddr
@size ||= @encoded ? @encoded.size : 0
@scnptr ||= xcoff.new_label('s_scnptr')
@relptr ||= 0
@lnnoptr||= 0
@nreloc ||= 0
@nlnno ||= 0
@flags ||= 0
end
end
# basic immediates decoding functions
def decode_half( edata = @encoded) edata.decode_imm(:u16, @header.endianness) end
def decode_word( edata = @encoded) edata.decode_imm(:u32, @header.endianness) end
def decode_xword(edata = @encoded) edata.decode_imm((@header.intsize == 32 ? :u32 : :u64), @header.endianness) end
def encode_half(w) Expression[w].encode(:u16, @header.endianness) end
def encode_word(w) Expression[w].encode(:u32, @header.endianness) end
def encode_xword(w) Expression[w].encode((@header.intsize == 32 ? :u32 : :u64), @header.endianness) end
def initialize(cpu=nil)
@header = Header.new
@sections = []
super
end
def decode_header(off = 0)
@encoded.ptr = off
@header.decode(self)
if @header.opthdr != 0
@optheader = OptHeader.new
@optheader.decode(self)
end
@header.nsec.times {
s = Section.new
s.decode(self)
@sections << s
}
end
def decode
decode_header
@sections.each { |s|
s.encoded = @encoded[s.scnptr, s.size]
}
end
def encode
@encoded = EncodedData.new
@encoded << @header.encode(self)
@encoded << @optheader.encode(self) if @optheader
@sections.each { |s|
@encoded << s.encode(self)
}
va = @encoded.size
binding = {}
@sections.each { |s|
if s.scnptr.kind_of? ::String
binding[s.scnptr] = @encoded.size
else
raise 'scnptr too low' if @encoded.virtsize > s.scnptr
@encoded.virtsize = s.scnptr
end
va = (va + 4096 - 1)/4096*4096
if s.vaddr.kind_of? ::String
binding[s.vaddr] = va
else
va = s.vaddr
end
binding.update s.encoded.binding(va)
va += s.encoded.size
@encoded << s.encoded
}
@encoded.fixup!(binding)
@encoded.data
end
end
end
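
Header#decode derives both the byte order and the word size from the first two bytes of the file, then reads every later field with that byte order. The sketch below reproduces the idea with plain Ruby strings. xcoff_magic and decode_half are hypothetical free functions written for this illustration; they only mirror the four magic values listed in the original case statement and do not use any metasm API.

```ruby
# Hypothetical helper: maps the 2-byte magic to [endianness, word size],
# following the same four cases as Header#decode.
def xcoff_magic(raw)
  case raw.b[0, 2]
  when "\x01\xDF".b then [:big, 32]
  when "\xDF\x01".b then [:little, 32]
  when "\x01\xEF".b then [:big, 64]
  when "\xEF\x01".b then [:little, 64]
  else raise ArgumentError, 'invalid XCOFF signature'
  end
end

# Decodes an unsigned 16-bit field at offset off with the given endianness.
# 'n' is big-endian and 'v' is little-endian in String#unpack.
def decode_half(raw, off, endianness)
  raw.b[off, 2].unpack1(endianness == :big ? 'n' : 'v')
end

# Magic followed by a section count of 3, both big-endian.
header = "\x01\xDF\x00\x03".b
endianness, intsize = xcoff_magic(header)
nsec = decode_half(header, 2, endianness)
puts "#{endianness}-endian, #{intsize}-bit, #{nsec} sections"
```

Comparing binary-encoded strings on both sides avoids encoding mismatches on Ruby versions where string literals default to UTF-8.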


@ -1,372 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
require 'gtk2'
require 'metasm/gui/gtk_listing'
require 'metasm/gui/gtk_graph'
module Metasm
module GtkGui
class DisasmWidget < Gtk::VBox
attr_accessor :dasm, :entrypoints, :views
def initialize(dasm, ep=[])
super()
@dasm = dasm
@entrypoints = ep
@views = []
@pos_history = []
gui_update_counter = 0
dasm_working = false
@gtk_idle_handle = Gtk.idle_add {
# metasm disassembler loop
# update gui once in a while
dasm_working = true if not @entrypoints.empty? or not @dasm.addrs_todo.empty?
if dasm_working
begin
if not @dasm.disassemble_mainiter(@entrypoints)
dasm_working = false
gui_update_counter = 10000
end
rescue
messagebox [$!, $!.backtrace].join("\n")
end
gui_update_counter += 1
if gui_update_counter > 100
gui_update_counter = 0
gui_update
end
end
true
}
#pack_start iconbar, dasm_working_flag ?
@notebook = Gtk::Notebook.new
# hex view
pack_start @notebook
@notebook.show_border = false
@notebook.show_tabs = false
@views << AsmListingWidget.new(@dasm, self)
@views << GraphViewWidget.new(@dasm, self)
@notebook.append_page(@views[0], Gtk::Label.new('listing'))
@notebook.append_page(@views[1], Gtk::Label.new('graph'))
@notebook.focus_child = curview
end
def terminate
Gtk.idle_remove @gtk_idle_handle
end
def curview
@views[@notebook.page]
end
include Gdk::Keyval
def keypress(ev)
case ev.keyval
when GDK_Return, GDK_KP_Enter
focus_addr curview.hl_word
when GDK_Escape
focus_addr_back
when GDK_c # disassemble from this point
# if points to a call, make it return
return if not addr = curview.current_address
if di = @dasm.decoded[addr] and di.kind_of? DecodedInstruction and di.opcode.props[:saveip] and not @dasm.decoded[addr + di.bin_length]
di.block.add_to_subfuncret(addr+di.bin_length)
@dasm.addrs_todo << [addr + di.bin_length, addr, true]
else
@dasm.addrs_todo << [addr]
end
when GDK_f # list functions
list = [['name', 'addr']]
@dasm.function.keys.each { |f|
addr = @dasm.normalize(f)
next if not @dasm.decoded[addr]
list << [@dasm.prog_binding.index(addr), Expression[addr]]
}
title = "list of functions"
listwindow(title, list) { |i| focus_addr i[1] }
when GDK_g # jump to address
inputbox('address to go') { |v| focus_addr v }
when GDK_h # parses a C header
openfile('open C header') { |f|
@dasm.parse_c_file(f) rescue messagebox("#{$!}\n#{$!.backtrace}")
}
when GDK_n # name/rename a label
if not curview.hl_word or not addr = @dasm.prog_binding[curview.hl_word]
return if not addr = curview.current_address
end
if old = @dasm.prog_binding.index(addr)
inputbox("new name for #{old}") { |v| @dasm.rename_label(old, v) ; gui_update }
else
inputbox("label name for #{Expression[addr]}") { |v| @dasm.set_label_at(addr, @dasm.program.new_label(v)) ; gui_update }
end
when GDK_p # pause/play disassembler
@dasm_pause ||= []
if @dasm_pause.empty? and @dasm.addrs_todo.empty?
elsif @dasm_pause.empty?
@dasm_pause = @dasm.addrs_todo.dup
@dasm.addrs_todo.clear
puts "dasm paused (#{@dasm_pause.length})"
else
@dasm.addrs_todo.concat @dasm_pause
@dasm_pause.clear
puts "dasm restarted (#{@dasm.addrs_todo.length})"
end
when GDK_v # toggle verbose flag
$VERBOSE = ! $VERBOSE
puts "verbose #$VERBOSE"
when GDK_x # show xrefs to the current address
return if not addr = @dasm.prog_binding[curview.hl_word] || curview.current_address
list = [['address', 'type', 'instr']]
@dasm.each_xref(addr) { |xr|
list << [Expression[xr.origin], "#{xr.type}#{xr.len}"]
if di = @dasm.decoded[xr.origin] and di.kind_of? DecodedInstruction
list.last << di.instruction
end
}
title = "list of xrefs to #{Expression[addr]}"
if list.length == 1
messagebox "no xref to #{Expression[addr]}"
else
listwindow(title, list) { |i| focus_addr(i[0], nil, true) }
end
when GDK_space
focus_addr(curview.current_address, 1-@notebook.page)
when 0x20..0x7e # normal kbd (use ascii code)
# quiet
return false
when GDK_Shift_L, GDK_Shift_R, GDK_Control_L, GDK_Control_R,
GDK_Alt_L, GDK_Alt_R, GDK_Meta_L, GDK_Meta_R,
GDK_Super_L, GDK_Super_R, GDK_Menu
# quiet
return false
else
c = Gdk::Keyval.constants.find { |c| Gdk::Keyval.const_get(c) == ev.keyval }
p [:unknown_keypress, ev.keyval, c, ev.state]
return false
end
true
end
def focus_addr(addr, page=nil, quiet=false)
page ||= @notebook.page
case addr
when ::String
if @dasm.prog_binding[addr]
addr = @dasm.prog_binding[addr]
elsif (?0..?9).include? addr[0]
addr = '0x' + addr[0...-1] if addr[-1] == ?h
begin
addr = Integer(addr)
rescue ::ArgumentError
messagebox "Invalid address #{addr}" if not quiet
return
end
else
messagebox "Invalid address #{addr}" if not quiet
return
end
when nil; return
end
return if page == @notebook.page and addr == curview.current_address
oldpos = [@notebook.page, curview.get_cursor_pos]
@notebook.page = page
if curview.focus_addr(addr) or (0...@views.length).find { |v|
o_p = @views[v].get_cursor_pos
if @views[v].focus_addr(addr)
@notebook.page = v
true
else
@views[v].set_cursor_pos o_p
false
end
}
@pos_history << oldpos
true
else
messagebox "Invalid address #{Expression[addr]}" if not quiet
focus_addr_back oldpos
false
end
end
def focus_addr_back(val = @pos_history.pop)
return if not val
@notebook.page = val[0]
curview.set_cursor_pos val[1]
true
end
def gui_update
@views.each { |v| v.gui_update }
end
def messagebox(str)
MessageBox.new(toplevel, str)
end
def inputbox(str, &b)
InputBox.new(toplevel, str, &b)
end
def openfile(title, &b)
OpenFile.new(toplevel, title, &b)
end
def listwindow(title, list, &b)
ListWindow.new(toplevel, title, list, &b)
end
end
class MessageBox < Gtk::MessageDialog
# shows a message box (non-modal)
def initialize(owner, str)
owner ||= Gtk::Window.toplevels.first
super(owner, Gtk::Dialog::DESTROY_WITH_PARENT, INFO, BUTTONS_CLOSE, str)
signal_connect('response') { destroy }
show_all
present # bring the window to the foreground & set focus
end
end
class InputBox < Gtk::Dialog
# shows a simplistic input box (i.e. a window with a 1-line textbox + OK button), yields the text
# TODO history, dropdown, autocomplete, contexts, 3D stereo surround, etc
def initialize(owner, str)
owner ||= Gtk::Window.toplevels.first
super(nil, owner, Gtk::Dialog::DESTROY_WITH_PARENT,
[Gtk::Stock::OK, Gtk::Dialog::RESPONSE_ACCEPT], [Gtk::Stock::CANCEL, Gtk::Dialog::RESPONSE_REJECT])
label = Gtk::Label.new(str)
text = Gtk::TextView.new
text.signal_connect('key_press_event') { |w, ev|
case ev.keyval
when Gdk::Keyval::GDK_Escape; response(RESPONSE_REJECT) ; true
when Gdk::Keyval::GDK_Return, Gdk::Keyval::GDK_KP_Enter; response(RESPONSE_ACCEPT) ; true
end
}
signal_connect('response') { |win, id|
if id == RESPONSE_ACCEPT
text = text.buffer.text
destroy
yield text
else
destroy
end
true
}
vbox.pack_start label, false, false, 8
vbox.pack_start text, false, false, 8
show_all
present
end
end
class OpenFile < Gtk::FileChooserDialog
# shows an asynchronous FileChooser window, yields the chosen filename
# TODO save last path
def initialize(owner, title)
owner ||= Gtk::Window.toplevels.first
super(title, owner, Gtk::FileChooser::ACTION_OPEN, nil,
[Gtk::Stock::CANCEL, Gtk::Dialog::RESPONSE_CANCEL], [Gtk::Stock::OPEN, Gtk::Dialog::RESPONSE_ACCEPT])
signal_connect('response') { |win, id|
if id == Gtk::Dialog::RESPONSE_ACCEPT
file = filename
destroy
yield file
else
destroy
end
true
}
show_all
present
end
end
class ListWindow < Gtk::Dialog
# shows a window with a list of items
# the list is an array of arrays, displayed as String
# the first array is the column names
# each item double-clicked yields the block with the selected iterator
def initialize(owner, title, list)
owner ||= Gtk::Window.toplevels.first
super(title, owner, Gtk::Dialog::DESTROY_WITH_PARENT)
cols = list.shift
treeview = Gtk::TreeView.new
treeview.model = Gtk::ListStore.new(*[String]*cols.length)
treeview.selection.mode = Gtk::SELECTION_NONE
cols.each_with_index { |col, i|
crt = Gtk::CellRendererText.new
tvc = Gtk::TreeViewColumn.new(col, crt)
tvc.set_cell_data_func(crt) { |_tvc, _crt, model, iter| _crt.text = iter[i] }
treeview.append_column tvc
}
list.each { |e|
iter = treeview.model.append
e.each_with_index { |v, i| iter[i] = v.to_s }
}
treeview.model.set_sort_column_id(0)
treeview.signal_connect('cursor_changed') { |x|
if iter = treeview.selection.selected
yield iter
end
}
remove vbox
add Gtk::ScrolledWindow.new.add(treeview)
toplevel.set_default_size 200, 100
show_all
present
# so that the 1st line is not selected by default
treeview.selection.mode = Gtk::SELECTION_SINGLE
end
end
class MainWindow < Gtk::Window
attr_accessor :dasm_widget
def initialize(title = 'metasm disassembler')
super()
self.title = title
@dasm_widget = nil
end
def display(dasm, ep=[])
if @dasm_widget
@dasm_widget.terminate
remove @dasm_widget
end
@dasm_widget = DisasmWidget.new(dasm, ep)
add @dasm_widget
set_default_size 700, 500
show_all
end
end
end
end
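
DisasmWidget#focus_addr accepts either a label name or a numeric string, and treats a trailing 'h' as a hexadecimal suffix. The sketch below isolates that parsing step for current Ruby versions, where indexing a String yields a one-character String instead of an Integer as in Ruby 1.8. parse_address and its labels argument are hypothetical; in the widget the lookup goes through @dasm.prog_binding and failures open a message box instead of returning nil.

```ruby
# Hypothetical helper: resolves a user-typed address string to an Integer.
# labels maps label names to addresses; returns nil when str is not valid.
def parse_address(str, labels = {})
  return labels[str] if labels[str]
  # Only strings starting with a digit are treated as numbers.
  return nil unless ('0'..'9').include?(str[0])
  # "401000h" is rewritten to "0x401000" before conversion.
  str = '0x' + str[0...-1] if str[-1] == 'h'
  Integer(str)
rescue ArgumentError
  nil
end

labels = { 'entrypoint' => 0x401000 }
%w[entrypoint 401000h 0x10 4096 12zz zz].each do |s|
  puts "#{s} => #{parse_address(s, labels).inspect}"
end
```

Integer() is strict about trailing garbage, which is what lets the rescue clause report malformed input.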

File diff suppressed because it is too large


@ -1,521 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
require 'gtk2'
module Metasm
module GtkGui
class AsmListingWidget < Gtk::HBox
attr_accessor :hl_word
# construction method
def initialize(dasm, parent_widget)
@dasm = dasm
@parent_widget = parent_widget
@arrows = [] # array of [linefrom, lineto] (may be :up or :down for offscreen)
@line_address = {}
@line_text = {}
@hl_word = nil
@caret_x = @caret_y = 0 # caret position in characters coordinates (column/line)
@oldcaret_x = @oldcaret_y = 42
@layout = Pango::Layout.new Gdk::Pango.context
@color = {}
super()
@arrows_widget = Gtk::DrawingArea.new
@listing_widget = Gtk::DrawingArea.new
@vscroll = Gtk::VScrollbar.new
pack_start @arrows_widget, false, false
pack_start @listing_widget
pack_end @vscroll, false, false
# TODO listing hscroll (viewport?)
@arrows_widget.set_size_request 40, 0 # TODO resizer
@vscroll.adjustment.lower = @dasm.sections.keys.min
@vscroll.adjustment.upper = @dasm.sections.keys.max + @dasm.sections[@dasm.sections.keys.max].length
@vscroll.adjustment.step_increment = 1
@vscroll.adjustment.page_increment = 10
@vscroll.adjustment.value = @dasm.prog_binding['entrypoint'] || @vscroll.adjustment.lower
set_font 'courier 10'
# receive mouse/kbd events
@listing_widget.set_events Gdk::Event::ALL_EVENTS_MASK
set_can_focus true
# callbacks
@arrows_widget.signal_connect('expose_event') { paint_arrows ; true }
@listing_widget.signal_connect('expose_event') { paint_listing ; true }
@listing_widget.signal_connect('button_press_event') { |w, ev|
case ev.event_type
when Gdk::Event::Type::BUTTON_PRESS
case ev.button
when 1; click(ev)
end
when Gdk::Event::Type::BUTTON2_PRESS
case ev.button
when 1; doubleclick(ev)
end
end
}
@listing_widget.signal_connect('size_allocate') { |w, alloc| # resize
lines = alloc.height / @font_height
cols = alloc.width / @font_width
@caret_y = lines-1 if @caret_y >= lines
@caret_x = cols-1 if @caret_x >= cols
@vscroll.adjustment.page_increment = lines/2
}
@vscroll.adjustment.signal_connect('value_changed') { |adj|
# align on @decoded boundary
addr = adj.value.to_i
if off = (0..16).find { |off| di = @dasm.decoded[addr-off] and di.respond_to? :bin_length and di.bin_length > off } and off != 0
@vscroll.adjustment.value = addr-off
else
@line_address.clear # make paint_listing call update_caret when done (hl_word etc)
redraw
end
}
signal_connect('key_press_event') { |w, ev| # keyboard
keypress(ev)
}
signal_connect('scroll_event') { |w, ev| # mouse wheel
mouse_wheel(ev)
}
signal_connect('realize') { # one-time initialize
# raw color declaration
{ :white => 'fff', :palegrey => 'ddd', :black => '000', :grey => '444',
:red => 'f00', :darkred => '800', :palered => 'fcc',
:green => '0f0', :darkgreen => '080', :palegreen => 'cfc',
:blue => '00f', :darkblue => '008', :paleblue => 'ccf',
:yellow => 'ff0', :darkyellow => '440', :paleyellow => 'ffc',
}.each { |tag, val|
@color[tag] = Gdk::Color.new(*val.unpack('CCC').map { |c| (c.chr*4).hex })
}
# register colors
@color.each_value { |c| window.colormap.alloc_color(c, true, true) }
# map functionality => color
set_color_association :comment => :darkblue, :label => :darkgreen, :text => :black,
:instruction => :black, :address => :blue, :caret => :black,
:listing_bg => :white, :cursorline_bg => :paleyellow, :hl_word => :palered,
:arrows_bg => :palegrey,
:arrow_up => :darkblue, :arrow_dn => :darkyellow, :arrow_hl => :red
}
end
#
# methods used as Gtk callbacks
#
# TODO right click
def click(ev)
@caret_x = (ev.x-1).to_i / @font_width
@caret_y = ev.y.to_i / @font_height
update_caret
end
def doubleclick(ev)
@parent_widget.focus_addr(@hl_word)
end
def mouse_wheel(ev)
case ev.direction
when Gdk::EventScroll::Direction::UP
# TODO scroll up exactly win_height/2 lines
# at least cache page_down addresses
@vscroll.adjustment.value -= @vscroll.adjustment.page_increment
true
when Gdk::EventScroll::Direction::DOWN
pgdown = @line_address[@line_address.keys.max.to_i/2] || @vscroll.adjustment.value
pgdown += @vscroll.adjustment.page_increment if pgdown == @vscroll.adjustment.value
@vscroll.adjustment.value = pgdown
true
end
end
# renders the disassembler in the @listing_widget using @vscroll.adjustment.value
# creates the @arrows needed by #paint_arrows
def paint_listing
w = @listing_widget.window
gc = Gdk::GC.new(w)
a = @listing_widget.allocation
w_w, w_h = a.x + a.width, a.y + a.height
# draw caret line background
gc.set_foreground @color[:cursorline_bg]
w.draw_rectangle(gc, true, 0, @caret_y*@font_height, w_w, @font_height)
# TODO scroll line-by-line when an addr is displayed on multiple lines (eg labels/comments)
# TODO selection & current word highlight
curaddr = @vscroll.adjustment.value.to_i
want_update_caret = true if @line_address == {}
# map lineno => address shown
@line_address = Hash.new(-1)
# map lineno => raw text
@line_text = Hash.new('')
# current line text buffer
fullstr = ''
# current line number
line = 0
# current window position
x = 1
y = 0
# list of arrows to draw ([addr_from, addr_to])
arrows_addr = []
# renders a string at current cursor position with a color
# must not include newline
render = proc { |str, color|
# function ends when we write under the bottom of the listing
next if y >= w_h or x >= w_w
fullstr << str
# TODO selection
if @hl_word and str =~ /^(.*)(\b#{Regexp.escape @hl_word}\b)/
s1, s2 = $1, $2
@layout.text = s1
pre_x = @layout.pixel_size[0]
@layout.text = s2
hl_x = @layout.pixel_size[0]
gc.set_foreground @color[:hl_word]
w.draw_rectangle(gc, true, x+pre_x, y, hl_x, @font_height)
end
@layout.text = str
gc.set_foreground @color[color]
w.draw_layout(gc, x, y, @layout)
x += @layout.pixel_size[0]
}
# newline: current line is fully rendered, update @line_address/@line_text etc
nl = proc {
next if y >= w_h
@line_text[line] = fullstr
@line_address[line] = curaddr
fullstr = ''
line += 1
x = 1
y += @font_height
}
# draw text until screen is full
# builds arrows_addr with addresses
while y < w_h
if di = @dasm.decoded[curaddr] and di.kind_of? DecodedInstruction
# a decoded instruction : check if it's a block start
if di.block.list.first == di
# render dump_block_header, add a few colors
b_header = '' ; @dasm.dump_block_header(di.block) { |l| b_header << l ; b_header << ?\n if b_header[-1] != ?\n }
b_header.each { |l| l.chomp!
col = :comment
col = :label if l[0, 2] != '//' and l[-1] == ?:
render[l, col]
nl[]
}
di.block.each_from_samefunc(@dasm) { |addr|
addr = @dasm.normalize addr
next if not addr.kind_of? ::Integer or (@dasm.decoded[addr].kind_of? DecodedInstruction and addr + @dasm.decoded[addr].bin_length == curaddr)
arrows_addr << [addr, curaddr]
}
end
if di.block.list.last == di
di.block.each_to_samefunc(@dasm) { |addr|
addr = @dasm.normalize addr
next if not addr.kind_of? ::Integer or (addr == curaddr + di.bin_length and
(not di.opcode.props[:saveip] or di.block.to_subfuncret))
arrows_addr << [curaddr, addr]
}
end
render[Expression[di.address].to_s + ' ', :address]
render[di.instruction.to_s.ljust(di.comment ? 24 : 0), :instruction]
render[' ; ' + di.comment.join(' '), :comment] if di.comment
nl[]
# instr overlapping
if off = (1...di.bin_length).find { |off| @dasm.decoded[curaddr + off] }
nl[]
curaddr += off
render["// ------ overlap (#{di.bin_length - off}) ------", :comment]
nl[]
else
curaddr += di.bin_length
end
elsif curaddr < @vscroll.adjustment.upper
# TODO real data display (dwords, xrefs, strings..)
if label = @dasm.prog_binding.index(curaddr) and @dasm.xrefs[curaddr]
render[Expression[curaddr].to_s + ' ', :address]
render[label + ' ', :label]
else
if label
render[label+':', :label]
nl[]
end
render[Expression[curaddr].to_s + ' ', :address]
end
s = @dasm.get_section_at(curaddr)
render['db '+((s and s[0].rawsize > s[0].ptr) ? Expression[s[0].read(1)[0]].to_s : '?'), :instruction]
nl[]
curaddr += 1
else
nl[]
end
end
# draw caret
# TODO selection
gc.set_foreground @color[:caret]
cx = @caret_x*@font_width+1
cy = @caret_y*@font_height
w.draw_line(gc, cx, cy, cx, cy+@font_height-1)
# convert arrows_addr to @arrows (with line numbers)
# updates @arrows_widget if @arrows changed
prev_arrows = @arrows
addr_line = @line_address.sort.inject({}) { |h, (l, a)| h.update a => l } # addr => last line (di)
@arrows = arrows_addr.uniq.sort.map { |from, to|
[(addr_line[from] || (from < curaddr ? :up : :down)),
(addr_line[ to ] || ( to < curaddr ? :up : :down))]
}
@arrows_widget.window.invalidate Gdk::Rectangle.new(0, 0, 100000, 100000), false if prev_arrows != @arrows
update_caret if want_update_caret
end
# draws the @arrows defined in paint_listing
def paint_arrows
return if @arrows.empty? or @line_address[@caret_y] == -1
w = @arrows_widget.window
gc = Gdk::GC.new(w)
w_w, w_h = @arrows_widget.allocation.width, @arrows_widget.allocation.height
slot_alloc = {} # [y1, y2] => x slot -- y1 <= y2
# find a free x slot for the vertical side of the arrow
max = (w_w-6)/3
find_free = proc { |y1, y2|
y1, y2 = y2, y1 if y2 < y1
slot_alloc[[y1, y2]] = (0...max).find { |off|
not slot_alloc.find { |(oy1, oy2), oo|
# return true if this slot cannot share with off
next if oo != off # not same slot => ok
next if oy1 == y1 and y1 != 0 # same upbound & in window
next if oy2 == y2 and y2 != w_h-1 # same lowbound & in window
# check overlapping segment
(y1 >= oy1 and y1 <= oy2) or
(y2 >= oy1 and y2 <= oy2) or
(oy1 >= y1 and oy1 <= y2) or
(oy2 >= y1 and oy2 <= y2)
}
} || (max-1)
}
# alloc slots for arrows, starting with the shortest
arrs = { :arrow_dn => [], :arrow_up => [], :arrow_hl => [] }
@arrows.sort_by { |from, to|
if from.kind_of? Numeric and to.kind_of? Numeric
(from-to).abs
else
100000
end
}.each { |from, to|
y1 = case from
when :up; 0
when :down; w_h-1
else from * @font_height + @font_height/2 - 1
end
y2 = case to
when :up; 0
when :down; w_h-1
else to * @font_height + @font_height/2 - 1
end
if y1 <= y2
y1 += 2 if y1 != 0
else
y1 -= 2 if y1 != w_h-1
end
col = :arrow_dn
col = :arrow_up if y1 > y2
col = :arrow_hl if @line_address[from] == @line_address[@caret_y] or @line_address[to] == @line_address[@caret_y]
arrs[col] << [y1, y2, find_free[y1, y2]]
}
slot_w = (w_w-4)/slot_alloc.values.uniq.length
# draw arrows (hl last to overwrite)
[:arrow_dn, :arrow_up, :arrow_hl].each { |col|
gc.set_foreground @color[col]
arrs[col].each { |y1, y2, slot|
x1 = w_w-1
x2 = w_w-4 - slot*slot_w - slot_w/2
w.draw_line(gc, x1, y1, x2, y1) if y1 != 0 and y1 != w_h-1
w.draw_line(gc, x2, y1, x2, y2)
w.draw_line(gc, x2, y2, x1, y2) if y2 != 0 and y2 != w_h-1
w.draw_line(gc, x1, y2, x1-3, y2-3) if y2 != 0 and y2 != w_h-1
w.draw_line(gc, x1, y2, x1-3, y2+3) if y2 != 0 and y2 != w_h-1
}
}
end
include Gdk::Keyval
# keyboard binding
# basic navigation (arrows, pgup etc)
# dasm navigation
# enter => go to label definition
# esc => jump back
# dasm interaction
# c => start disassembling from here
# g => prompt for an address to jump to
# h => prompt for a C header file to read
# n => rename a label
# p => pause/play disassembler
# x => show xrefs
#
def keypress(ev)
case ev.keyval
when GDK_Left
if @caret_x >= 1
@caret_x -= 1
update_caret
end
when GDK_Up
if @caret_y > 1 or (@caret_y == 1 and @vscroll.adjustment.value == @vscroll.adjustment.lower)
@caret_y -= 1
else
@vscroll.adjustment.value -= 1
end
update_caret
when GDK_Right
if @caret_x <= @line_text.values.map { |s| s.length }.max
@caret_x += 1
update_caret
end
when GDK_Down
if @caret_y < @line_text.length-2 or (@caret_y < @line_text.length - 1 and @vscroll.adjustment.value == @vscroll.adjustment.upper)
@caret_y += 1
else
off = 1
if a = @line_address[0] and @dasm.decoded[a].kind_of? DecodedInstruction
off = @dasm.decoded[a].bin_length
end
@vscroll.adjustment.value += off
end
update_caret
when GDK_Page_Up
@vscroll.adjustment.value -= @vscroll.adjustment.page_increment
update_caret
when GDK_Page_Down
pgdown = @line_address[@line_address.length/2] || @vscroll.adjustment.value
pgdown = @vscroll.adjustment.value + @vscroll.adjustment.page_increment if pgdown == @vscroll.adjustment.value
@vscroll.adjustment.value = pgdown
update_caret
when GDK_Home
@caret_x = 0
update_caret
when GDK_End
@caret_x = @line_text[@caret_y].length
update_caret
when GDK_r # reload this file
load __FILE__
redraw
puts 'reloaded'
return @parent_widget.keypress(ev)
else
return @parent_widget.keypress(ev)
end
true
end
def get_cursor_pos
[@vscroll.adjustment.value, @caret_x, @caret_y]
end
def set_cursor_pos(p)
@vscroll.adjustment.value, @caret_x, @caret_y = p
update_caret
end
# change the font of the listing
# arg is a Pango::FontDescription string (eg 'courier 10')
def set_font(descr)
@layout.font_description = Pango::FontDescription.new(descr)
@layout.text = 'x'
@font_width, @font_height = @layout.pixel_size
redraw
end
# change the color association
# arg is a hash function symbol => color symbol
# color must be allocated
# check #initialize/sig('realize') for initial function/color list
def set_color_association(hash)
hash.each { |k, v| @color[k] = @color[v] }
@listing_widget.modify_bg Gtk::STATE_NORMAL, @color[:listing_bg]
@arrows_widget.modify_bg Gtk::STATE_NORMAL, @color[:arrows_bg]
redraw
end
# redraw the whole widget
def redraw
return if not @listing_widget.window
@listing_widget.window.invalidate Gdk::Rectangle.new(0, 0, 100000, 100000), false
@arrows_widget.window.invalidate Gdk::Rectangle.new(0, 0, 100000, 100000), false
end
# hint that the caret moved
# redraws the caret, changes the highlighted word, redraws the listing if needed
def update_caret
return if not l = @line_text[@caret_y]
word = l[0...@caret_x].to_s[/\w*$/] << l[@caret_x..-1].to_s[/^\w*/]
word = nil if word == ''
if @hl_word != word or @oldcaret_y != @caret_y
@hl_word = word
redraw
else
return if @oldcaret_x == @caret_x and @oldcaret_y == @caret_y
x = @oldcaret_x*@font_width+1
y = @oldcaret_y*@font_height
@listing_widget.window.invalidate Gdk::Rectangle.new(x-1, y, x+1, y+@font_height), false
x = @caret_x*@font_width+1
y = @caret_y*@font_height
@listing_widget.window.invalidate Gdk::Rectangle.new(x-1, y, x+1, y+@font_height), false
if @arrows.find { |f, t| f == @caret_y or t == @caret_y or f == @oldcaret_y or t == @oldcaret_y }
@arrows_widget.window.invalidate Gdk::Rectangle.new(0, 0, 100000, 100000), false
end
end
@oldcaret_x = @caret_x
@oldcaret_y = @caret_y
end
# focus on addr
# addr may be a dasm label, dasm address, dasm address in string form (eg "0DEADBEEFh")
# may scroll the window
# returns true on success (address exists)
def focus_addr(addr)
if l = @line_address.index(addr) and l < @line_address.keys.max - 4
@caret_y, @caret_x = @line_address.keys.find_all { |k| @line_address[k] == addr }.max, 0
elsif addr >= @vscroll.adjustment.lower and addr <= @vscroll.adjustment.upper
@vscroll.adjustment.value, @caret_x, @caret_y = addr, 0, 0
else
return
end
update_caret
true
end
# returns the address of the data under the cursor
def current_address
@line_address[@caret_y]
end
def gui_update
redraw
end
end
end
end
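
The slot search in paint_arrows above can be sketched in isolation as follows. This is a simplified illustration, not the Metasm code: `allocate_slots` is a hypothetical helper, and it leaves out the original's rule that lets two arrows sharing an in-window endpoint reuse the same slot. Segments are processed shortest first, and each takes the lowest-numbered column that no overlapping segment already occupies.

```ruby
# Hypothetical helper (not part of Metasm): give each vertical segment
# [y1, y2] the lowest-numbered slot not used by an overlapping segment.
# Falls back to the last slot when every slot is taken, as the original does.
def allocate_slots(segments, max_slots)
  allocated = [] # entries are [y1, y2, slot], with y1 <= y2
  segments.sort_by { |y1, y2| (y1 - y2).abs }.map { |y1, y2|
    y1, y2 = y2, y1 if y2 < y1
    slot = (0...max_slots).find { |s|
      # a slot is usable if no segment already in it overlaps [y1, y2]
      allocated.none? { |oy1, oy2, os| os == s and y1 <= oy2 and oy1 <= y2 }
    } || (max_slots - 1)
    allocated << [y1, y2, slot]
    [y1, y2, slot]
  }
end

slots = allocate_slots([[0, 100], [10, 20], [15, 30], [40, 52]], 4)
slots.each { |y1, y2, slot| puts "#{y1}..#{y2} -> slot #{slot}" }
```

Sorting by length first keeps short arrows close to the listing and pushes long ones outward, which is the same ordering the original applies before calling its find_free proc.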

File diff suppressed because it is too large

View File

@ -1,788 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
require 'metasm/ia32/opcodes'
require 'metasm/decode'
module Metasm
class Ia32
class ModRM
def self.decode(edata, byte, endianness, adsz, opsz, seg=nil, regclass=Reg)
m = (byte >> 6) & 3
rm = byte & 7
if m == 3
return regclass.new(rm, opsz)
end
sum = Sum[adsz][m][rm]
s, i, b, imm = nil
sum.each { |a|
case a
when Integer
if not b
b = Reg.new(a, adsz)
else
s = 1
i = Reg.new(a, adsz)
end
when :sib
sib = edata.get_byte.to_i
ii = ((sib >> 3) & 7)
if ii != 4
s = 1 << ((sib >> 6) & 3)
i = Reg.new(ii, adsz)
end
bb = sib & 7
if bb == 5 and m == 0
imm = Expression[edata.decode_imm("i#{adsz}".to_sym, endianness)]
else
b = Reg.new(bb, adsz)
end
when :i8, :i16, :i32
imm = Expression[edata.decode_imm(a, endianness)]
end
}
new adsz, opsz, s, i, b, imm, seg
end
end
class Farptr
def self.decode(edata, endianness, adsz)
addr = Expression[edata.decode_imm("u#{adsz}".to_sym, endianness)]
seg = Expression[edata.decode_imm(:u16, endianness)]
new seg, addr
end
end
def build_opcode_bin_mask(op)
# bit = 0 if it can be mutated by a field value, 1 if fixed by the opcode
op.bin_mask = Array.new(op.bin.length, 0)
op.fields.each { |f, (oct, off)|
op.bin_mask[oct] |= (@fields_mask[f] << off)
}
op.bin_mask.map! { |v| 255 ^ v }
end
def build_bin_lookaside
# sets up a hash byte value => list of opcodes that may match
# opcode.bin_mask is built here
lookaside = Array.new(256) { [] }
@opcode_list.each { |op|
build_opcode_bin_mask op
b = op.bin[0]
msk = op.bin_mask[0]
for i in b..(b | (255^msk))
next if i & msk != b & msk
lookaside[i] << op
end
}
lookaside
end
def decode_prefix(instr, byte)
# XXX check multiple occurrences ?
instr.prefix ||= {}
(instr.prefix[:list] ||= []) << byte
case byte
when 0x66; instr.prefix[:opsz] = true
when 0x67; instr.prefix[:adsz] = true
when 0xF0; instr.prefix[:lock] = true
when 0xF2; instr.prefix[:rep] = :nz
when 0xF3; instr.prefix[:rep] = :z # postprocessed by decode_instr
when 0x26, 0x2E, 0x36, 0x3E, 0x64, 0x65
if byte & 0x40 == 0
v = (byte >> 3) & 3
else
v = byte & 7
end
instr.prefix[:seg] = SegReg.new(v)
instr.prefix[:jmphint] = ((byte & 0x10) == 0x10)
else
return false
end
true
end
# tries to find the opcode encoded at edata.ptr
# if no match, tries to match a prefix (update di.instruction.prefix)
# on match, edata.ptr points to the first byte of the opcode (after prefixes)
def decode_findopcode(edata)
di = DecodedInstruction.new self
while edata.ptr < edata.data.length
pfx = di.instruction.prefix || {}
return di if di.opcode = @bin_lookaside[edata.data[edata.ptr]].find { |op|
# fetch the relevant bytes from edata
bseq = edata.data[edata.ptr, op.bin.length].unpack('C*')
# check against full opcode mask
op.bin.zip(bseq, op.bin_mask).all? { |b1, b2, m| b2 and ((b1 & m) == (b2 & m)) } and
# check special cases
!(
# fail if any of those is true
(fld = op.fields[:seg2A] and (bseq[fld[0]] >> fld[1]) & @fields_mask[:seg2A] == 1) or
(fld = op.fields[:seg3A] and (bseq[fld[0]] >> fld[1]) & @fields_mask[:seg3A] < 4) or
(fld = op.fields[:seg3A] || op.fields[:seg3] and (bseq[fld[0]] >> fld[1]) & @fields_mask[:seg3] > 5) or
(fld = op.fields[:modrmA] and (bseq[fld[0]] >> fld[1]) & 0xC0 == 0xC0) or
(sz = op.props[:opsz] and ((pfx[:opsz] and @size != 48-sz) or
(not pfx[:opsz] and @size != sz))) or
(np = op.props[:needpfx] and not (pfx[:list] || []).include? np) # do not clobber pfx, it is still the prefix hash
)
}
break if not decode_prefix(di.instruction, edata.get_byte)
di.bin_length += 1
end
end
def decode_instr_op(edata, di)
before_ptr = edata.ptr
op = di.opcode
di.instruction.opname = op.name
bseq = edata.read(op.bin.length).unpack('C*') # decode_findopcode ensures that data >= op.length
pfx = di.instruction.prefix || {}
field_val = proc { |f|
if fld = op.fields[f]
(bseq[fld[0]] >> fld[1]) & @fields_mask[f]
end
}
if field_val[:w] == 0
opsz = 8
elsif pfx[:opsz]
opsz = 48 - @size
else
opsz = @size
end
if pfx[:adsz]
adsz = 48 - @size
else
adsz = @size
end
op.args.each { |a|
mmxsz = ((op.props[:xmmx] && pfx[:opsz]) ? 128 : 64)
di.instruction.args << case a
when :reg; Reg.new field_val[a], opsz
when :eeec; CtrlReg.new field_val[a]
when :eeed; DbgReg.new field_val[a]
when :seg2, :seg2A, :seg3, :seg3A; SegReg.new field_val[a]
when :regfp; FpReg.new field_val[a]
when :regmmx; SimdReg.new field_val[a], mmxsz
when :regxmm; SimdReg.new field_val[a], 128
when :farptr; Farptr.decode edata, @endianness, adsz
when :i8, :u8, :u16; Expression[edata.decode_imm(a, @endianness)]
when :i; Expression[edata.decode_imm("#{op.props[:unsigned_imm] ? 'a' : 'i'}#{opsz}".to_sym, @endianness)]
when :mrm_imm; ModRM.decode edata, (adsz == 16 ? 6 : 5), @endianness, adsz, opsz, pfx[:seg]
when :modrm, :modrmA; ModRM.decode edata, field_val[a], @endianness, adsz, (op.props[:argsz] || opsz), pfx[:seg]
when :modrmmmx; ModRM.decode edata, field_val[:modrm], @endianness, adsz, mmxsz, pfx[:seg], SimdReg
when :modrmxmm; ModRM.decode edata, field_val[:modrm], @endianness, adsz, 128, pfx[:seg], SimdReg
when :imm_val1; Expression[1]
when :imm_val3; Expression[3]
when :reg_cl; Reg.new 1, 8
when :reg_eax; Reg.new 0, opsz
when :reg_dx; Reg.new 2, 16
when :regfp0; FpReg.new nil # implicit?
else raise SyntaxError, "Internal error: invalid argument #{a} in #{op.name}"
end
}
di.bin_length += edata.ptr - before_ptr
if op.name == 'movsx' or op.name == 'movzx'
if opsz == 8
di.instruction.args[1].sz = 8
else
di.instruction.args[1].sz = 16
end
if pfx[:opsz]
di.instruction.args[0].sz = 48 - @size
else
di.instruction.args[0].sz = @size
end
end
pfx.delete :seg
case r = pfx.delete(:rep)
when :nz
if di.opcode.props[:strop]
pfx[:rep] = 'rep'
elsif di.opcode.props[:stropz]
pfx[:rep] = 'repnz'
end
when :z
if di.opcode.props[:stropz]
pfx[:rep] = 'repz'
end
end
di
end
# converts relative jump/call offsets to absolute addresses
# adds the eip delta to the offset +off+ of the instruction (may be an Expression) + its bin_length
# do not call twice on the same di !
def decode_instr_interpret(di, addr)
if di.opcode.props[:setip] and di.instruction.args.last.kind_of? Expression and di.instruction.opname[0, 3] != 'ret'
delta = di.instruction.args.last.reduce
arg = Expression[[addr, :+, di.bin_length], :+, delta].reduce
di.instruction.args[-1] = Expression[arg]
end
di
end
# interprets a condition code (in an opcode name) as an expression involving backtracked eflags
# eflag_p is never computed, and this returns Expression::Unknown for this flag
# ex: 'z' => Expression[:eflag_z]
def decode_cc_to_expr(cc)
case cc
when 'o'; Expression[:eflag_o]
when 'no'; Expression[:'!', :eflag_o]
when 'b', 'nae'; Expression[:eflag_c]
when 'nb', 'ae'; Expression[:'!', :eflag_c]
when 'z', 'e'; Expression[:eflag_z]
when 'nz', 'ne'; Expression[:'!', :eflag_z]
when 'be', 'na'; Expression[:eflag_c, :|, :eflag_z]
when 'nbe', 'a'; Expression[:'!', [:eflag_c, :|, :eflag_z]]
when 's'; Expression[:eflag_s]
when 'ns'; Expression[:'!', :eflag_s]
when 'p', 'pe'; Expression::Unknown
when 'np', 'po'; Expression::Unknown
when 'l', 'nge'; Expression[:eflag_s, :'!=', :eflag_o]
when 'nl', 'ge'; Expression[:eflag_s, :==, :eflag_o]
when 'le', 'ng'; Expression[[:eflag_s, :'!=', :eflag_o], :|, :eflag_z]
when 'nle', 'g'; Expression[[:eflag_s, :==, :eflag_o], :&, [:'!', :eflag_z]]
end
end
def backtrace_binding(di)
a = di.instruction.args.map { |arg|
case arg
when ModRM; arg.symbolic(di.address)
when Reg, SimdReg; arg.symbolic
else arg
end
}
# XXX TODO opsz override ?
opsz = @size
opsz = 48 - opsz if di.instruction.prefix and di.instruction.prefix[:opsz]
mask = (1 << opsz)-1 # 32bits => 0xffff_ffff
binding =
case op = di.opcode.name
when 'mov', 'movsx', 'movzx', 'movd', 'movq'; { a[0] => Expression[a[1]] }
when 'lea'; { a[0] => a[1].target }
when 'xchg'; { a[0] => Expression[a[1]], a[1] => Expression[a[0]] }
when 'add', 'sub', 'or', 'xor', 'and', 'pxor', 'adc', 'sbb'
e_op = { 'add' => :+, 'sub' => :-, 'or' => :|, 'and' => :&, 'xor' => :^, 'pxor' => :^, 'adc' => :+, 'sbb' => :- }[op]
ret = Expression[a[0], e_op, a[1]]
ret = Expression[ret, e_op, :eflag_c] if op == 'adc' or op == 'sbb'
# optimises :eax ^ :eax => 0
# avoid hiding memory accesses (to not hide possible fault)
ret = Expression[ret.reduce] if not a[0].kind_of? Indirection
{ a[0] => ret }
when 'inc'; { a[0] => Expression[a[0], :+, 1] }
when 'dec'; { a[0] => Expression[a[0], :-, 1] }
when 'not'; { a[0] => Expression[a[0], :^, mask] }
when 'neg'; { a[0] => Expression[:-, a[0]] }
when 'rol', 'ror'
inv_op = (op[2] == ?r ? :<< : :>>)
e_op = (op[2] == ?r ? :>> : :<<)
sz = [a[1], :%, opsz]
isz = [[opsz, :-, a[1]], :%, opsz]
# ror a, b => (a >> b) | (a << (32-b))
{ a[0] => Expression[[[a[0], e_op, sz], :|, [a[0], inv_op, isz]], :&, mask] }
when 'sar', 'shl', 'sal'; { a[0] => Expression[a[0], (op[-1] == ?r ? :>> : :<<), [a[1], :%, opsz]] }
when 'shr'; { a[0] => Expression[[a[0], :&, mask], :>>, [a[1], :%, opsz]] }
when 'cdq'; { :edx => Expression[0xffff_ffff, :*, [[:eax, :>>, opsz-1], :&, 1]] }
when 'push', 'push.i16'
{ :esp => Expression[:esp, :-, opsz/8],
Indirection[:esp, opsz/8, di.address] => Expression[a[0]] }
when 'pop'
{ :esp => Expression[:esp, :+, opsz/8],
a[0] => Indirection[:esp, opsz/8, di.address] }
when 'pushfd'
# TODO Unknown per bit
efl = Expression[0x202]
bts = proc { |pos, v| efl = Expression[efl, :|, [[v, :&, 1], :<<, pos]] }
bts[0, :eflag_c]
bts[6, :eflag_z]
bts[7, :eflag_s]
bts[11, :eflag_o]
{ :esp => Expression[:esp, :-, opsz/8], Indirection[:esp, opsz/8, di.address] => efl }
when 'popfd'
bt = proc { |pos| Expression[[Indirection[:esp, opsz/8, di.address], :>>, pos], :&, 1] }
{ :esp => Expression[:esp, :+, opsz/8], :eflag_c => bt[0], :eflag_z => bt[6], :eflag_s => bt[7], :eflag_o => bt[11] }
when 'sahf'
bt = proc { |pos| Expression[[:eax, :>>, pos], :&, 1] }
{ :eflag_c => bt[0], :eflag_z => bt[6], :eflag_s => bt[7] }
when 'lahf'
efl = Expression[2]
bts = proc { |pos, v| efl = Expression[efl, :|, [[v, :&, 1], :<<, pos]] }
bts[0, :eflag_c]
#bts[2, :eflag_p]
#bts[4, :eflag_a]
bts[6, :eflag_z]
bts[7, :eflag_s]
{ :eax => efl }
when 'pushad'
ret = {}
st_off = 0
[:eax, :ecx, :edx, :ebx, :esp, :ebp, :esi, :edi].reverse_each { |r|
ret[Indirection[Expression[:esp, :+, st_off].reduce, opsz/8, di.address]] = Expression[r]
st_off += opsz/8
}
ret[:esp] = Expression[:esp, :-, st_off]
ret
when 'popad'
ret = {}
st_off = 0
[:eax, :ecx, :edx, :ebx, :esp, :ebp, :esi, :edi].reverse_each { |r|
ret[r] = Indirection[Expression[:esp, :+, st_off].reduce, opsz/8, di.address]
st_off += opsz/8
}
ret
when 'call'
{ :esp => Expression[:esp, :-, opsz/8],
Indirection[:esp, opsz/8, di.address] => Expression[Expression[di.address, :+, di.bin_length].reduce] }
when 'ret'; { :esp => Expression[:esp, :+, [opsz/8, :+, a[0] || 0]] }
when 'loop', 'loopz', 'loopnz'; { :ecx => Expression[:ecx, :-, 1] }
when 'enter'
depth = a[1].reduce % 32
b = { Indirection[:esp, opsz/8, di.address] => Expression[:ebp], :ebp => Expression[:esp, :-, opsz/8],
:esp => Expression[:esp, :-, a[0].reduce + ((opsz/8) * depth)] }
(1..depth).each { |i| # XXX test me !
b[Indirection[[:esp, :-, i*opsz/8], opsz/8, di.address]] = Indirection[[:ebp, :-, i*opsz/8], opsz/8, di.address] }
b
when 'leave'; { :ebp => Indirection[[:ebp], opsz/8, di.address], :esp => Expression[:ebp, :+, opsz/8] }
when 'aaa'; { :eax => Expression::Unknown }
when 'imul'
if a[2]; e = Expression[a[1], :*, a[2]]
else e = Expression[[a[0], :*, a[1]], :&, (1 << (di.instruction.args.first.sz || opsz)) - 1]
end
{ a[0] => e }
when 'rdtsc'; { :eax => Expression::Unknown, :edx => Expression::Unknown }
when /^(stos|movs)([bwd])$/
e_op = $1
sz = { 'b' => 1, 'w' => 2, 'd' => 4 }[$2]
dir = :+
dir = :- if di.block and (di.block.list.find { |ddi| ddi.opcode.name == 'std' } rescue nil)
pesi = Indirection[:esi, sz, di.address]
pedi = Indirection[:edi, sz, di.address]
pfx = di.instruction.prefix || {}
case e_op
when 'movs'
case pfx[:rep]
when nil; { pedi => pesi, :esi => Expression[:esi, dir, sz], :edi => Expression[:edi, dir, sz] }
else { pedi => pesi, :esi => Expression::Unknown, :edi => Expression::Unknown } # repz/repnz..
end
when 'stos'
case pfx[:rep]
when nil; { pedi => Expression[:eax], :edi => Expression[:edi, dir, sz] }
else { pedi => Expression[:eax], :edi => Expression[:edi, dir, [sz, :*, :ecx]] } # XXX create an xref at edi+sz*ecx ?
end
end
when 'clc'; { :eflag_c => Expression[0] }
when 'stc'; { :eflag_c => Expression[1] }
when 'cmc'; { :eflag_c => Expression[:'!', :eflag_c] }
when 'cld'; { :eflag_d => Expression[0] }
when 'std'; { :eflag_d => Expression[1] }
when 'setalc'; { :eax => Expression[:eflag_c, :*, 0xff] }
when /^set(.*)/
cd = decode_cc_to_expr($1)
{ a[0] => Expression[cd] }
when /^j(.*)/
binding = { 'dummy_metasm_0' => Expression[a[0]] }
if fl = decode_cc_to_expr($1)
binding['dummy_metasm_1'] = fl # mark eflags as read
end
binding
when 'nop', 'pause', 'wait', 'cmp', 'test'; {}
else
puts "unhandled instruction to backtrace: #{di}" if $VERBOSE
# assume nothing except the 1st arg
case a[0]
when Indirection, Symbol; { a[0] => Expression::Unknown }
else {}
end
end
# eflags side-effects
sign = proc { |v| Expression[[[v, :&, mask], :>>, opsz-1], :'!=', 0] }
case op
when 'adc', 'add', 'and', 'cmp', 'or', 'sbb', 'sub', 'xor', 'test'
e_op = { 'adc' => :+, 'add' => :+, 'and' => :&, 'cmp' => :-, 'or' => :|, 'sbb' => :-, 'sub' => :-, 'xor' => :^, 'test' => :& }[op]
res = Expression[[a[0], :&, mask], e_op, [a[1], :&, mask]]
res = Expression[res, e_op, :eflag_c] if op == 'adc' or op == 'sbb'
binding[:eflag_z] = Expression[[res, :&, mask], :==, 0]
binding[:eflag_s] = sign[res]
binding[:eflag_c] = case e_op
when :+; Expression[res, :>, mask]
when :-; Expression[[a[0], :&, mask], :<, [a[1], :&, mask]]
else Expression[0]
end
binding[:eflag_o] = case e_op
when :+; Expression[[sign[a[0]], :==, sign[a[1]]], :'&&', [sign[a[0]], :'!=', sign[res]]]
when :-; Expression[[sign[a[0]], :==, [:'!', sign[a[1]]]], :'&&', [sign[a[0]], :'!=', sign[res]]]
else Expression[0]
end
when 'inc', 'dec', 'neg', 'shl', 'shr', 'sar', 'ror', 'rol', 'rcr', 'rcl', 'shld', 'shrd'
res = binding[a[0]]
binding[:eflag_z] = Expression[[res, :&, mask], :==, 0]
binding[:eflag_s] = sign[res]
case op
when 'neg'; binding[:eflag_c] = Expression[[res, :&, mask], :'!=', 0]
when 'inc', 'dec' # don't touch carry flag
else binding[:eflag_c] = Expression::Unknown
end
binding[:eflag_o] = case op
when 'inc'; Expression[[a[0], :&, mask], :==, mask >> 1]
when 'dec'; Expression[[res , :&, mask], :==, mask >> 1]
when 'neg'; Expression[[a[0], :&, mask], :==, (mask+1) >> 1]
else Expression::Unknown # TODO someday
end
when 'imul', 'mul', 'idiv', 'div'
binding[:eflag_z] = binding[:eflag_s] = binding[:eflag_c] = binding[:eflag_o] = Expression::Unknown
end
binding
end
def get_xrefs_x(dasm, di)
return [] if not di.opcode.props[:setip]
case di.opcode.name
when 'ret'; return [Indirection[:esp, @size/8, di.address]]
when 'jmp'
a = di.instruction.args.first
if a.kind_of? ModRM and a.imm and a.s == @size/8 and not a.b and s = dasm.get_section_at(Expression[a.imm, :-, 3*@size/8])
# jmp table
ret = [Expression[a.symbolic(di.address)]]
v = -3
loop do
diff = Expression[s[0].decode_imm("u#@size".to_sym, @endianness), :-, di.address].reduce
if diff.kind_of? ::Integer and diff.abs < 4096
ret << Indirection[[a.imm, :+, v*@size/8], @size/8, di.address]
elsif v > 0
break
end
v += 1
end
return ret
end
end
case tg = di.instruction.args.first
when ModRM
tg.sz ||= @size if tg.kind_of? ModRM
[Expression[tg.symbolic(di.address)]]
when Reg
[Expression[tg.symbolic]]
when Expression, ::Integer
[Expression[tg]]
else
puts "unhandled setip at #{di.address} #{di.instruction}" if $DEBUG
[]
end
end
# checks if expr is a valid return expression matching the :saveip instruction
def backtrace_is_function_return(expr, di=nil)
expr = Expression[expr].reduce_rec
expr.kind_of? Indirection and expr.len == @size/8 and expr.target == Expression[:esp]
end
# updates the function backtrace_binding
# XXX assume retaddrlist is either a list of addr of ret or a list with a single entry which is an external function name (thunk)
def backtrace_update_function_binding(dasm, faddr, f, retaddrlist)
b = f.backtrace_binding
# XXX handle retaddrlist for multiple/mixed thunks
if retaddrlist and not dasm.decoded[retaddrlist.first] and di = dasm.decoded[faddr]
# no return instruction, must be a thunk : find the last instruction (to backtrace from it)
done = []
while ndi = dasm.decoded[di.block.to_subfuncret.to_a.first] || dasm.decoded[di.block.to_normal.to_a.first] and ndi.kind_of? DecodedInstruction and not done.include? ndi.address
done << ndi.address
di = ndi
end
if not di.block.to_subfuncret.to_a.first and di.block.to_normal and di.block.to_normal.length > 1
thunklast = di.block.list.last.address
end
end
bt_val = proc { |r|
next if not retaddrlist
bt = []
retaddrlist.each { |retaddr|
bt |= dasm.backtrace(Expression[r], (thunklast ? thunklast : retaddr),
:include_start => true, :snapshot_addr => faddr, :origin => retaddr, :from_subfuncret => thunklast)
}
if bt.length != 1
b[r] = Expression::Unknown
else
b[r] = bt.first
end
}
[:eax, :ebx, :ecx, :edx, :esi, :edi, :ebp, :esp].each(&bt_val)
return if f.need_finalize
sz = @size/8
if b[:ebp] != Expression[:ebp]
# may be a custom 'enter' function (eg recent Visual Studio)
# TODO put all memory writes in the binding ?
[[:ebp], [:esp, :+, 1*sz], [:esp, :+, 2*sz], [:esp, :+, 3*sz]].each { |ptr|
ind = Indirection[ptr, sz, faddr]
bt_val[ind]
b.delete(ind) if b[ind] and not [:ebx, :edx, :esi, :edi, :ebp].include? b[ind].reduce_rec
}
end
if dasm.funcs_stdabi
if b[:ebp] == Expression::Unknown
puts "update_func_bind: #{Expression[faddr]} has ebp -> unknown, presume it is preserved" if $DEBUG
b[:ebp] = Expression[:ebp]
end
if b[:esp] == Expression::Unknown and not f.btbind_callback
puts "update_func_bind: #{Expression[faddr]} has esp -> unknown, use dynamic callback" if $DEBUG
f.btbind_callback = disassembler_default_btbind_callback
end
else
if b[:esp] != prevesp and not Expression[b[:esp], :-, :esp].reduce.kind_of?(::Integer)
puts "update_func_bind: #{Expression[faddr]} has esp -> #{b[:esp]}" if $DEBUG
end
end
# rename some functions
# TODO database and real signatures
rename =
if Expression[b[:eax], :-, faddr].reduce == 0
'geteip' # metasm pic linker
elsif Expression[b[:eax], :-, :eax].reduce == 0 and Expression[b[:ebx], :-, Indirection[:esp, sz, nil]].reduce == 0
'get_pc_thunk_ebx' # elf pic convention
elsif Expression[b[:esp], :-, [:esp, :-, [Indirection[[:esp, :+, 2*sz], sz, nil], :+, 0x18]]].reduce == 0
'__SEH_prolog'
elsif Expression[b[:esp], :-, [:ebp, :+, sz]].reduce == 0 and Expression[b[:ebx], :-, Indirection[[:esp, :+, 4*sz], sz, nil]].reduce == 0
'__SEH_epilog'
end
dasm.auto_label_at(faddr, rename, 'loc', 'sub') if rename
b
end
# returns true if the expression is an address on the stack
def backtrace_is_stack_address(expr)
Expression[expr].expr_externals.include? :esp
end
# updates an instruction's argument replacing an expression with another (eg label renamed)
def replace_instr_arg_immediate(i, old, new)
i.args.map! { |a|
case a
when Expression; a == old ? new : Expression[a.bind(old => new).reduce]
when ModRM
a.imm = (a.imm == old ? new : Expression[a.imm.bind(old => new).reduce]) if a.imm
a
else a
end
}
end
# returns a DecodedFunction from a parsed C function prototype
# TODO rebacktrace already decoded functions (load a header file after dasm finished)
# TODO walk structs args
def decode_c_function_prototype(cp, sym, orig=nil)
sym = cp.toplevel.symbol[sym] if sym.kind_of?(::String)
df = DecodedFunction.new
orig ||= Expression[sym.name]
new_bt = proc { |expr, rlen|
df.backtracked_for << BacktraceTrace.new(expr, orig, expr, rlen ? :r : :x, rlen)
}
# return instr emulation
new_bt[Indirection[:esp, @size/8, orig], nil] if not sym.attributes.to_a.include? 'noreturn'
# register dirty (XXX assume standard ABI)
df.backtrace_binding.update :eax => Expression::Unknown, :ecx => Expression::Unknown, :edx => Expression::Unknown
# emulate ret <n>
al = cp.typesize[:ptr]
if sym.attributes.to_a.include? 'stdcall'
argsz = sym.type.args.to_a.inject(al) { |sum, a| sum += (cp.sizeof(a) + al - 1) / al * al }
df.backtrace_binding[:esp] = Expression[:esp, :+, argsz]
else
df.backtrace_binding[:esp] = Expression[:esp, :+, al]
end
# scan args for function pointers
# TODO walk structs/unions..
stackoff = al
sym.type.args.to_a.each { |a|
if a.type.untypedef.kind_of? C::Pointer
pt = a.type.untypedef.type.untypedef
if pt.kind_of? C::Function
new_bt[Indirection[[:esp, :+, stackoff], al, orig], nil]
df.backtracked_for.last.detached = true
elsif pt.kind_of? C::Struct
new_bt[Indirection[[:esp, :+, stackoff], al, orig], al]
else
new_bt[Indirection[[:esp, :+, stackoff], al, orig], cp.sizeof(nil, pt)]
end
end
stackoff += (cp.sizeof(a) + al - 1) / al * al
}
df
end
# the proc for the :default backtrace_binding callback of the disassembler
# tries to determine the stack offset of unprototyped functions
# working:
# checks that origin is a ret, that expr is an indirection from :esp and that expr.origin is the ret
# bt_walk from calladdr until we find a call into us, and assume it is the current function start
# TODO handle foo: call bar ; bar: pop eax ; call <withourcallback> ; ret -> bar is not the function start (foo is)
# then backtrace expr from calladdr to funcstart (snapshot), using esp -> esp+<stackoffvariable>
# from the result, compute stackoffvariable (only if trivial)
# will not work if the current function calls any other unknown function (unless all are __cdecl)
# will not work if the current function is framed (ebp leave ret): in this case the function will return, but its :esp will be unknown
# TODO remember failed attempts and rebacktrace them if we find our stackoffset later ? (other funcs may depend on us)
# if the stack offset is found and funcaddr is a string, fixup the static binding and remove the dynamic binding
# TODO dynamise thunks
def disassembler_default_btbind_callback
proc { |dasm, bind, funcaddr, calladdr, expr, origin, maxdepth|
@dasm_func_default_off ||= {}
if off = @dasm_func_default_off[[dasm, calladdr]]
bind = bind.merge(:esp => Expression[:esp, :+, off])
break bind
end
break bind if not odi = dasm.decoded[origin] or odi.opcode.name != 'ret'
expr = expr.reduce_rec if expr.kind_of? Expression
break bind unless expr.kind_of? Indirection and expr.origin == origin
break bind unless expr.externals.reject { |e| e =~ /^autostackoffset_/ } == [:esp]
# scan from calladdr for the probable parent function start
func_start = nil
dasm.backtrace_walk(true, calladdr, false, false, nil, maxdepth) { |ev, foo, h|
if ev == :up and h[:sfret] != :subfuncret and di = dasm.decoded[h[:to]] and di.opcode.name == 'call'
func_start = h[:from]
break
elsif ev == :end
# entrypoints are functions too
func_start = h[:addr]
break
end
}
break bind if not func_start
puts "automagic #{funcaddr}: found func start for #{dasm.decoded[origin]} at #{Expression[func_start]}" if dasm.debug_backtrace
s_off = "autostackoffset_#{Expression[funcaddr]}_#{Expression[calladdr]}"
list = dasm.backtrace(expr.bind(:esp => Expression[:esp, :+, s_off]), calladdr, :include_start => true, :snapshot_addr => func_start, :maxdepth => maxdepth, :origin => origin)
e_expr = list.find { |e_expr|
# TODO cleanup this
e_expr = Expression[e_expr].reduce_rec
next if not e_expr.kind_of? Indirection
off = Expression[[:esp, :+, s_off], :-, e_expr.target].reduce
off.kind_of? Integer and off >= @size/8 and off < 10*@size/8 and (off % (@size/8)) == 0
} || list.first
e_expr = e_expr.rexpr if e_expr.kind_of? Expression and e_expr.op == :+ and not e_expr.lexpr
break bind unless e_expr.kind_of? Indirection
off = Expression[[:esp, :+, s_off], :-, e_expr.target].reduce
case off
when Expression
bd = off.externals.grep(/^stackoff=/).inject({}) { |bd, xt| bd.update xt => @size/8 }
bd.delete s_off
# all __cdecl
off = @size/8 if off.bind(bd).reduce == @size/8
when Integer
if off < @size/8 or off > 20*@size/8 or (off % (@size/8)) != 0
puts "autostackoffset: ignoring off #{off} for #{Expression[funcaddr]} from #{dasm.decoded[calladdr]}" if $VERBOSE
off = :unknown
end
end
bind = bind.merge :esp => Expression[:esp, :+, off] if off != :unknown
if funcaddr != :default
if not off.kind_of? ::Integer
#XXX we allow the current function to return, so we should handle the func backtracking its :esp
#(and other register that are saved and restored in epilog)
puts "stackoff #{dasm.decoded[calladdr]} | #{Expression[func_start]} | #{expr} | #{e_expr} | #{off}" if dasm.debug_backtrace
else
puts "autostackoffset: found #{off} for #{Expression[funcaddr]} from #{dasm.decoded[calladdr]}" if $VERBOSE
dasm.function[funcaddr].btbind_callback = nil
dasm.function[funcaddr].backtrace_binding = bind
# rebacktrace the return address, so that other unknown funcs that depend on us are solved
dasm.backtrace(Indirection[:esp, @size/8, origin], origin, :origin => origin)
end
else
if off.kind_of? ::Integer and dasm.decoded[calladdr]
puts "autostackoffset: found #{off-@size/8} for #{dasm.decoded[calladdr]}" if $VERBOSE
di = dasm.decoded[calladdr]
di.comment.delete_if { |c| c =~ /^stackoff=/ } if di.comment
di.add_comment "stackoff=#{off-@size/8}"
@dasm_func_default_off[[dasm, calladdr]] = off
dasm.backtrace(Indirection[:esp, @size/8, origin], origin, :origin => origin)
elsif cachedoff = @dasm_func_default_off[[dasm, calladdr]]
bind[:esp] = Expression[:esp, :+, cachedoff]
elsif off.kind_of? ::Integer
dasm.decoded[calladdr].add_comment "stackoff=#{off-@size/8}"
end
puts "stackoff #{dasm.decoded[calladdr]} | #{Expression[func_start]} | #{expr} | #{e_expr} | #{off}" if dasm.debug_backtrace
end
bind
}
end
# the :default backtracked_for callback
# returns an empty list unless funcaddr is not :default, or calladdr is a call or a jmp
def disassembler_default_btfor_callback
proc { |dasm, btfor, funcaddr, calladdr|
if funcaddr != :default
btfor
elsif di = dasm.decoded[calladdr] and (di.opcode.name == 'call' or di.opcode.name == 'jmp')
btfor
else
[]
end
}
end
# returns a DecodedFunction suitable for :default
# uses disassembler_default_bt{for/bind}_callback
def disassembler_default_func
cp = new_cparser
cp.parse 'void stdfunc(void);'
f = decode_c_function_prototype(cp, 'stdfunc', :default)
f.backtrace_binding[:esp] = Expression[:esp, :+, :unknown]
f.btbind_callback = disassembler_default_btbind_callback
f.btfor_callback = disassembler_default_btfor_callback
f
end
end
end
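
The :default btbind callback above only keeps an integer stack offset when it is at least one machine word, at most twenty words, and word-aligned. A minimal standalone sketch of that range check, written outside Metasm (the `plausible_stack_offset?` name and its `size` parameter are illustrative assumptions, not part of the library):

```ruby
# Hypothetical helper, not part of Metasm: mirrors the Integer branch of the
# sanity check in the default btbind callback.
# size is the CPU word size in bits (32 for Ia32 in 32-bit mode).
def plausible_stack_offset?(off, size = 32)
  word = size / 8
  # non-integer offsets (Expression, :unknown) are never accepted here
  off.kind_of?(Integer) && off >= word && off <= 20 * word && (off % word) == 0
end

[4, 6, 80, 84, :unknown].each do |off|
  puts "#{off.inspect}: #{plausible_stack_offset?(off)}"
end
```

An offset of one word corresponds to a plain `ret` that only pops the return address, which is why the callback reports `off - @size/8` as the callee's own stack adjustment.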

@ -1,297 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
require 'metasm/ia32/opcodes'
require 'metasm/ia32/parse'
require 'metasm/encode'
module Metasm
class Ia32
class InvalidModRM < Exception ; end
class ModRM
# returns the byte representing the register encoded as modrm
# works with Reg/SimdReg
def self.encode_reg(reg, mregval = 0)
0xc0 | (mregval << 3) | reg.val
end
# The argument is an integer representing the 'reg' field of the mrm
#
# caller is responsible for setting the adsz
# returns an array, 1 element per possible immediate size (for un-reduce()able Expression)
def encode(reg = 0, endianness = :little)
case @adsz
when 16; encode16(reg, endianness)
when 32; encode32(reg, endianness)
end
end
private
def encode16(reg, endianness)
if not @b
# imm only
return [EncodedData.new << (6 | (reg << 3)) << @imm.encode(:u16, endianness)]
end
imm = @imm.reduce
imm = nil if imm == 0
ret = EncodedData.new
ret <<
case [@b.val, (@s.val if @s)]
when [3, 6], [6, 3]; 0
when [3, 7], [7, 3]; 1
when [5, 6], [6, 5]; 2
when [5, 7], [7, 5]; 3
when [6, nil]; 4
when [7, nil]; 5
when [5, nil]
imm ||= 0
6
when [3, nil]; 7
else raise InvalidModRM, 'invalid modrm16'
end
ret.data[0] |= reg << 3
if imm
case Expression.in_range?(imm, :i8)
when true
ret.data[0] |= 1 << 6
[ret << Expression.encode_immediate(imm, :i8, endianness)]
when false
ret.data[0] |= 2 << 6
[ret << Expression.encode_immediate(imm, :i16, endianness)]
when nil
retl = ret.dup
ret.data[0] |= 1 << 6
retl.data[0] |= 2 << 6
ret << @imm.encode(:i8, endianness)
retl << @imm.encode(:i16, endianness)
[retl, ret]
end
else
[ret]
end
end
def encode32(reg, endianness)
# 0 => [ [0 ], [1 ], [2 ], [3 ], [:sib ], [:i32 ], [6 ], [7 ] ], \
# 1 => [ [0, :i8 ], [1, :i8 ], [2, :i8 ], [3, :i8 ], [:sib, :i8 ], [5, :i8 ], [6, :i8 ], [7, :i8 ] ], \
# 2 => [ [0, :i32], [1, :i32], [2, :i32], [3, :i32], [:sib, :i32], [5, :i32], [6, :i32], [7, :i32] ]
#
# b => 0 1 2 3 4 5+i|i 6 7
# i => 0 1 2 3 nil 5 6 7
ret = EncodedData.new << (reg << 3)
if not self.b and not self.i
ret.data[0] |= 5
[ret << @imm.encode(:u32, endianness)]
elsif not self.b and self.s != 1
# sib with no b
raise EncodeError, "Invalid ModRM #{self}" if @i.val == 4
ret.data[0] |= 4
s = {8=>3, 4=>2, 2=>1}[@s]
imm = self.imm || Expression[0]
[ret << ((s << 6) | (@i.val << 3) | 5) << imm.encode(:a32, endianness)]
else
imm = @imm.reduce if self.imm
imm = nil if imm == 0
if not self.i or (not self.b and self.s == 1)
# no sib byte (except for [esp])
b = self.b || self.i
ret.data[0] |= b.val
ret << 0x24 if b.val == 4
else
# sib
ret.data[0] |= 4
i, b = @i, @b
b, i = i, b if @s == 1 and (i.val == 4 or b.val == 5)
raise EncodeError, "Invalid ModRM #{self}" if i.val == 4
s = {8=>3, 4=>2, 2=>1, 1=>0}[@s]
ret << ((s << 6) | (i.val << 3) | b.val)
end
imm ||= 0 if b.val == 5
if imm
case Expression.in_range?(imm, :i8)
when true
ret.data[0] |= 1 << 6
[ret << Expression.encode_immediate(imm, :i8, endianness)]
when false
ret.data[0] |= 2 << 6
[ret << Expression.encode_immediate(imm, :a32, endianness)]
when nil
rets = ret.dup
rets.data[0] |= 1 << 6
rets << @imm.encode(:i8, endianness)
ret.data[0] |= 2 << 6
ret << @imm.encode(:a32, endianness)
[ret, rets]
end
else
[ret]
end
end
end
end
class Farptr
def encode(endianness, atype)
@addr.encode(atype, endianness) << @seg.encode(:u16, endianness)
end
end
# returns all forms of the encoding of instruction i using opcode op
# program may be used to create a new label for relative jump/call
def encode_instr_op(program, i, op)
base = op.bin.pack('C*')
oi = op.args.zip(i.args)
set_field = proc { |base, f, v|
fld = op.fields[f]
base[fld[0]] |= v << fld[1]
}
#
# handle prefixes and bit fields
#
pfx = i.prefix.map { |k, v|
case k
when :jmp; {:jmp => 0x3e, :nojmp => 0x2e}[v]
when :lock; 0xf0
when :rep; {'repnz' => 0xf2, 'repz' => 0xf3, 'rep' => 0xf3}[v] # rep and repz share the 0xf3 prefix
end
}.pack 'C*'
pfx << op.props[:needpfx].pack('C*') if op.props[:needpfx]
# opsize override (:w field)
if op.name == 'movsx' or op.name == 'movzx'
case [i.args[0].sz, i.args[1].sz]
when [32, 16]
set_field[base, :w, 1]
pfx << 0x66 if @size == 16
when [16, 16]
set_field[base, :w, 1]
pfx << 0x66 if @size == 32
when [32, 8]
pfx << 0x66 if @size == 16
when [16, 8]
pfx << 0x66 if @size == 32
end
else
opsz = nil
oi.each { |oa, ia|
case oa
when :reg, :reg_eax, :modrm, :modrmA, :mrm_imm
raise EncodeError, "Incompatible arg size in #{i}" if (ia.sz and opsz and opsz != ia.sz) or (ia.sz == 8 and not op.fields[:w])
opsz = ia.sz
end
}
pfx << 0x66 if (opsz and ((opsz == 16 and @size == 32) or (opsz == 32 and @size == 16))) or (op.props[:opsz] and op.props[:opsz] != @size)
if op.props[:opsz] and @size == 48 - op.props[:opsz]
opsz = op.props[:opsz]
end
set_field[base, :w, 1] if op.fields[:w] and opsz != 8
end
opsz ||= @size
# addrsize override / segment override
if mrm = i.args.grep(ModRM).first
if (mrm.b and mrm.b.sz != @size) or (mrm.i and mrm.i.sz != @size)
pfx << 0x67
adsz = 48 - @size
end
pfx << "\x26\x2E\x36\x3E\x64\x65"[mrm.seg.val] if mrm.seg
end
adsz ||= @size
#
# encode embedded arguments
#
postponed = []
oi.each { |oa, ia|
case oa
when :reg, :seg3, :seg3A, :seg2, :seg2A, :eeec, :eeed, :regfp, :regmmx, :regxmm
# field arg
set_field[base, oa, ia.val]
pfx << 0x66 if oa == :regmmx and op.props[:xmmx] and ia.sz == 128
when :imm_val1, :imm_val3, :reg_cl, :reg_eax, :reg_dx, :regfp0
# implicit
else
postponed << [oa, ia]
end
}
if not (op.args & [:modrm, :modrmA, :modrmxmm, :modrmmmx]).empty?
# reg field of modrm
regval = (base[-1] >> 3) & 7
base.chop!
end
# convert label name for jmp/call/loop to relative offset
if op.props[:setip] and op.name[0, 3] != 'ret' and i.args.first.kind_of? Expression
postlabel = program.new_label('post'+op.name)
target = postponed.first[1]
target = target.rexpr if target.kind_of? Expression and target.op == :+ and not target.lexpr
postponed.first[1] = Expression[target, :-, postlabel]
end
#
# append other arguments
#
ret = EncodedData.new(pfx + base)
postponed.each { |oa, ia|
case oa
when :farptr; ed = ia.encode(@endianness, "a#{adsz}".to_sym)
when :modrm, :modrmA, :modrmmmx, :modrmxmm
if ia.kind_of? ModRM
ed = ia.encode(regval, @endianness)
if ed.kind_of?(::Array)
if ed.length > 1
# we know that no opcode can have more than 1 modrm
ary = []
ed.each { |m|
ary << (ret.dup << m)
}
ret = ary
next
else
ed = ed.first
end
end
else
ed = ModRM.encode_reg(ia, regval)
end
when :mrm_imm; ed = ia.imm.encode("a#{adsz}".to_sym, @endianness)
when :i8, :u8, :u16; ed = ia.encode(oa, @endianness)
when :i; ed = ia.encode("a#{opsz}".to_sym, @endianness)
else raise SyntaxError, "Internal error: want to encode field #{oa.inspect} as arg in #{i}"
end
if ret.kind_of?(::Array)
ret.each { |e| e << ed }
else
ret << ed
end
}
# we know that no opcode with setip accepts both a modrm and an immediate arg, so ret is not an ::Array
ret.add_export(postlabel, ret.virtsize) if postlabel
ret
end
end
end
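
The 32-bit path of ModRM#encode above packs three fields into one byte and picks the mod field from the displacement size. A minimal standalone sketch of that bit packing, independent of Metasm (the `modrm_byte` and `disp_mod` helpers are hypothetical names, and the SIB case, rm == 4, is deliberately left out):

```ruby
# Hypothetical helpers, not part of Metasm.
# ModRM layout: mod (bits 7-6) | reg (bits 5-3) | rm (bits 2-0)
def modrm_byte(mod, reg, rm)
  unless (0..3).cover?(mod) && (0..7).cover?(reg) && (0..7).cover?(rm)
    raise ArgumentError, 'field out of range'
  end
  (mod << 6) | (reg << 3) | rm
end

# Choose mod from an integer displacement:
# 0 => no displacement, 1 => signed 8-bit, 2 => 32-bit.
# rm == 5 (ebp) has no displacement-free form, so a zero disp8 is forced,
# as the original code does with `imm ||= 0 if b.val == 5`.
def disp_mod(disp, rm)
  disp ||= 0
  return 0 if disp == 0 && rm != 5
  (-128..127).cover?(disp) ? 1 : 2
end

# register-direct form, same shape as ModRM.encode_reg: 0xc0 | (reg << 3) | rm
puts format('%02x', modrm_byte(3, 0, 1))              # reg=0, rm=ecx(1)
# [ebp+8] with reg=edx(2): disp8 form
puts format('%02x', modrm_byte(disp_mod(8, 5), 2, 5))
```

When the displacement is a symbolic expression whose size is not yet known, the original returns both the disp8 and the disp32 encodings and lets a later pass choose; this sketch only covers the case where the displacement is already an integer.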

@ -1,154 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
require 'metasm/main'
module Metasm
class Ia32 < CPU
# some ruby magic
class Argument
@simple_list = []
@double_list = []
class << self
# for Argument
attr_reader :simple_list, :double_list
# for subclasses
attr_reader :i_to_s, :s_to_i
end
private
def self.simple_map(a)
Argument.simple_list << self
@i_to_s = Hash[*a.flatten]
@s_to_i = @i_to_s.invert
class_eval {
attr_accessor :val
def initialize(v)
raise Exception, "invalid #{self.class} #{v}" unless self.class.i_to_s[v]
@val = v
end
}
end
def self.double_map(h)
Argument.double_list << self
@i_to_s = h
@s_to_i = {} ; h.each { |sz, hh| hh.each_with_index { |r, i| @s_to_i[r] = [i, sz] } }
class_eval {
attr_accessor :val, :sz
def initialize(v, sz)
raise Exception, "invalid #{self.class} #{sz}/#{v}" unless self.class.i_to_s[sz] and self.class.i_to_s[sz][v]
@val = v
@sz = sz
end
}
end
end
class SegReg < Argument
simple_map((0..5).zip(%w(es cs ss ds fs gs)))
end
class DbgReg < Argument
simple_map [0, 1, 2, 3, 6, 7].map { |i| [i, "dr#{i}"] }
end
class CtrlReg < Argument
simple_map [0, 2, 3, 4].map { |i| [i, "cr#{i}"] }
end
class FpReg < Argument
simple_map((0..7).map { |i| [i, "ST(#{i})"] } << [nil, 'ST'])
end
class SimdReg < Argument
double_map 64 => (0..7).map { |n| "mm#{n}" },
128 => (0..7).map { |n| "xmm#{n}" }
def symbolic ; to_s.to_sym end
end
class Reg < Argument
double_map 8 => %w{ al cl dl bl ah ch dh bh},
16 => %w{ ax cx dx bx sp bp si di},
32 => %w{eax ecx edx ebx esp ebp esi edi}
#64 => %w{rax rcx rdx rbx rsp rbp rsi rdi}
Sym = @i_to_s[32].map { |s| s.to_sym }
def symbolic
if @sz == 8 and to_s[-1] == ?h
Expression[Sym[@val-4], :>>, 8]
else
Sym[@val]
end
end
end
class Farptr < Argument
attr_reader :seg, :addr
def initialize(seg, addr)
@seg, @addr = seg, addr
end
end
class ModRM < Argument
Sum = {
16 => {
0 => [ [3, 6], [3, 7], [5, 6], [5, 7], [6], [7], [:i16], [3] ],
1 => [ [3, 6, :i8 ], [3, 7, :i8 ], [5, 6, :i8 ], [5, 7, :i8 ], [6, :i8 ], [7, :i8 ], [5, :i8 ], [3, :i8 ] ],
2 => [ [3, 6, :i16], [3, 7, :i16], [5, 6, :i16], [5, 7, :i16], [6, :i16], [7, :i16], [5, :i16], [3, :i16] ]
},
32 => {
0 => [ [0], [1], [2], [3], [:sib], [:i32], [6], [7] ],
1 => [ [0, :i8 ], [1, :i8 ], [2, :i8 ], [3, :i8 ], [:sib, :i8 ], [5, :i8 ], [6, :i8 ], [7, :i8 ] ],
2 => [ [0, :i32], [1, :i32], [2, :i32], [3, :i32], [:sib, :i32], [5, :i32], [6, :i32], [7, :i32] ]
}
}
attr_accessor :adsz, :sz
attr_accessor :seg
attr_accessor :s, :i, :b, :imm
def initialize(adsz, sz, s, i, b, imm, seg = nil)
@adsz, @sz = adsz, sz
@s, @i = s, i if i
@b = b if b
@imm = imm if imm
@seg = seg if seg
end
def symbolic(orig=nil)
p = nil
p = Expression[p, :+, @b.symbolic] if b
p = Expression[p, :+, [@s, :*, @i.symbolic]] if i
p = Expression[p, :+, @imm] if imm
p = Expression["segment_base_#@seg", :+, p] if seg and seg.val != ((b && (@b.val == 4 || @b.val == 5)) ? 2 : 3)
Indirection[p.reduce, @sz/8, orig]
end
end
def initialize(family = :latest, size = 32)
super()
@endianness = :little
@size = size
send "init_#{family}"
end
def tune_cparser(cp)
super
cp.lexer.define('_M_IX86', 500) if not cp.lexer.definition['_M_IX86']
cp.lexer.define('_X86_') if not cp.lexer.definition['_X86_']
end
end
end
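
The `simple_map` helper above builds a two-way lookup (register number to name and back) and an initializer that rejects unknown values. A minimal standalone sketch of the same idea without the class-level metaprogramming (the `TinySegReg` class and its `from_name` method are illustrative assumptions, not Metasm classes):

```ruby
# Hypothetical stand-in for a class declared with Argument.simple_map.
class TinySegReg
  # [[0, 'es'], [1, 'cs'], ...] flattened into a Hash, as simple_map does
  I_TO_S = Hash[*(0..5).zip(%w(es cs ss ds fs gs)).flatten]
  # reverse table: name => number
  S_TO_I = I_TO_S.invert

  attr_reader :val

  def initialize(v)
    raise ArgumentError, "invalid #{self.class} #{v}" unless I_TO_S[v]
    @val = v
  end

  def to_s
    I_TO_S[@val]
  end

  # build an instance from a register name; raises KeyError if unknown
  def self.from_name(name)
    new(S_TO_I.fetch(name))
  end
end

puts TinySegReg.new(1)
puts TinySegReg.from_name('fs').val
```

The original keeps the tables in class-level instance variables (`@i_to_s`, `@s_to_i`) so that each subclass gets its own pair; constants are used here only to keep the sketch short.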

@ -1,787 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
require 'metasm/ia32/main'
module Metasm
class Ia32
def init_cpu_constants
@fields_mask.update :w => 1, :s => 1, :d => 1, :modrm => 0xc7,
:reg => 7, :eeec => 7, :eeed => 7, :seg2 => 3, :seg3 => 7,
:regfp => 7, :regmmx => 7, :regxmm => 7
@fields_mask[:seg2A] = @fields_mask[:seg2]
@fields_mask[:seg3A] = @fields_mask[:seg3]
@fields_mask[:modrmA] = @fields_mask[:modrm]
@valid_args.concat [:i, :i8, :u8, :u16, :reg, :seg2, :seg2A,
:seg3, :seg3A, :eeec, :eeed, :modrm, :modrmA, :mrm_imm,
:farptr, :imm_val1, :imm_val3, :reg_cl, :reg_eax,
:reg_dx, :regfp, :regfp0, :modrmmmx, :regmmx,
:modrmxmm, :regxmm] - @valid_args
@valid_props.concat [:strop, :stropz, :opsz, :argsz, :setip,
:stopexec, :saveip, :unsigned_imm, :random, :needpfx,
:xmmx] - @valid_props
end
# only the most common instructions from the 386 instruction set
# non-exhaustive list:
# no aaa, arpl, mov crX, call/jmp/ret far, in/out, bts, xchg...
def init_386_common_only
init_cpu_constants
addop_macro1 'adc', 2
addop_macro1 'add', 0
addop_macro1 'and', 4, :u
addop 'bswap', [0x0F, 0xC8], :reg
addop 'call', [0xE8], nil, {}, :stopexec, :setip, :i, :saveip
addop 'call', [0xFF], 2, {}, :stopexec, :setip, :saveip
addop('cbw', [0x98]) { |o| o.props[:opsz] = 16 }
addop('cdq', [0x99]) { |o| o.props[:opsz] = 32 }
addop_macro1 'cmp', 7
addop_macrostr 'cmps', [0xA6], :stropz
addop 'dec', [0x48], :reg
addop 'dec', [0xFE], 1, {:w => [0, 0]}
addop 'div', [0xF6], 6, {:w => [0, 0]}
addop 'enter', [0xC8], nil, {}, :u16, :u8
addop 'idiv', [0xF6], 7, {:w => [0, 0]}
addop 'imul', [0xF6], 5, {:w => [0, 0]}, :reg_eax
addop 'imul', [0x0F, 0xAF], :mrm
addop 'imul', [0x69], :mrm, {:s => [0, 1]}, :i
addop 'inc', [0x40], :reg
addop 'inc', [0xFE], 0, {:w => [0, 0]}
addop 'int', [0xCC], nil, {}, :imm_val3, :stopexec
addop 'int', [0xCD], nil, {}, :u8
addop_macrotttn 'j', [0x70], nil, {}, :setip, :i8
addop_macrotttn 'j', [0x0F, 0x80], nil, {}, :setip, :i
addop 'jmp', [0xE9], nil, {:s => [0, 1]}, :setip, :i, :stopexec
addop 'jmp', [0xFF], 4, {}, :setip, :stopexec
addop 'lea', [0x8D], :mrmA
addop 'leave', [0xC9]
addop_macrostr 'lods', [0xAC], :strop
addop 'loop', [0xE2], nil, {}, :setip, :i8
addop 'loopz', [0xE1], nil, {}, :setip, :i8
addop 'loope', [0xE1], nil, {}, :setip, :i8
addop 'loopnz',[0xE0], nil, {}, :setip, :i8
addop 'loopne',[0xE0], nil, {}, :setip, :i8
addop 'mov', [0xA0], nil, {:w => [0, 0], :d => [0, 1]}, :mrm_imm, :reg_eax
addop 'mov', [0x88], :mrmw,{:d => [0, 1]}
addop 'mov', [0xB0], :reg, {:w => [0, 3]}, :u
addop 'mov', [0xC6], 0, {:w => [0, 0]}, :u
addop_macrostr 'movs', [0xA4], :strop
addop 'movsx', [0x0F, 0xBE], :mrmw
addop 'movzx', [0x0F, 0xB6], :mrmw
addop 'mul', [0xF6], 4, {:w => [0, 0]}
addop 'neg', [0xF6], 3, {:w => [0, 0]}
addop 'nop', [0x90]
addop 'not', [0xF6], 2, {:w => [0, 0]}
addop_macro1 'or', 1, :u
addop 'pop', [0x58], :reg
addop 'pop', [0x8F], 0
addop 'push', [0x50], :reg
addop 'push', [0xFF], 6
addop('push.i16', [0x68], nil, {}, :i) { |o| o.props[:opsz] = 16 } # order matters !
addop 'push', [0x68], nil, {:s => [0, 1]}, :i # order matters !
addop('push.i32', [0x68], nil, {}, :i) { |o| o.props[:opsz] = 32 } # order matters !
addop 'ret', [0xC3], nil, {}, :stopexec, :setip
addop 'ret', [0xC2], nil, {}, :stopexec, :u16, :setip
addop_macro3 'rol', 0
addop_macro3 'ror', 1
addop_macro3 'sar', 7
addop_macro1 'sbb', 3
addop_macrostr 'scas', [0xAE], :stropz
addop_macrotttn('set', [0x0F, 0x90], 0) { |o| o.props[:argsz] = 8 }
addop_macro3 'shl', 4
addop_macro3 'sal', 4
addop 'shld', [0x0F, 0xA4], :mrm, {}, :u8
addop 'shld', [0x0F, 0xA5], :mrm, {}, :reg_cl
addop_macro3 'shr', 5
addop 'shrd', [0x0F, 0xAC], :mrm, {}, :u8
addop 'shrd', [0x0F, 0xAD], :mrm, {}, :reg_cl
addop_macrostr 'stos', [0xAA], :strop
addop_macro1 'sub', 5
addop 'test', [0x84], :mrmw
addop 'test', [0xA8], nil, {:w => [0, 0]}, :reg_eax, :u
addop 'test', [0xF6], 0, {:w => [0, 0]}, :u
addop_macro1 'xor', 6, :u
end
def init_386_only
init_cpu_constants
addop 'aaa', [0x37]
addop 'aad', [0xD5, 0x0A]
addop 'aam', [0xD4, 0x0A]
addop 'aas', [0x3F]
addop 'arpl', [0x63], :mrm
addop 'bound', [0x62], :mrmA
addop 'bsf', [0x0F, 0xBC], :mrm
addop 'bsr', [0x0F, 0xBD], :mrm
addop_macro2 'bt' , 0
addop_macro2 'btc', 3
addop_macro2 'btr', 2
addop_macro2 'bts', 1
addop 'call', [0x9A], nil, {}, :stopexec, :setip, :farptr, :saveip
addop 'callf', [0xFF], 3, {}, :stopexec, :setip, :saveip
addop 'clc', [0xF8]
addop 'cld', [0xFC]
addop 'cli', [0xFA]
addop 'clts', [0x0F, 0x06]
addop 'cmc', [0xF5]
addop 'cmpxchg',[0x0F, 0xB0], :mrmw
addop 'cpuid', [0x0F, 0xA2]
addop('cwd', [0x99]) { |o| o.props[:opsz] = 16 }
addop('cwde', [0x98]) { |o| o.props[:opsz] = 32 }
addop 'daa', [0x27]
addop 'das', [0x2F]
addop 'hlt', [0xF4], nil, {}, :stopexec
addop 'in', [0xE4], nil, {:w => [0, 0]}, :reg_eax, :u8
addop 'in', [0xE4], nil, {:w => [0, 0]}, :u8
addop 'in', [0xEC], nil, {:w => [0, 0]}, :reg_eax, :reg_dx
addop 'in', [0xEC], nil, {:w => [0, 0]}, :reg_eax
addop 'in', [0xEC], nil, {:w => [0, 0]}
addop_macrostr 'ins', [0x6C], :strop
addop 'into', [0xCE]
addop 'invd', [0x0F, 0x08]
addop 'invlpg',[0x0F, 0x01], 7
addop 'iret', [0xCF], nil, {}, :stopexec, :setip
addop 'iretd', [0xCF], nil, {}, :stopexec, :setip
addop('jcxz', [0xE3], nil, {}, :setip, :i8) { |o| o.props[:opsz] = 16 }
addop('jecxz', [0xE3], nil, {}, :setip, :i8) { |o| o.props[:opsz] = 32 }
addop 'jmp', [0xEA], nil, {}, :farptr, :stopexec
addop 'jmpf', [0xFF], 5, {}, :stopexec # reg ?
addop 'lahf', [0x9F]
addop 'lar', [0x0F, 0x02], :mrm
addop 'lds', [0xC5], :mrmA
addop 'les', [0xC4], :mrmA
addop 'lfs', [0x0F, 0xB4], :mrmA
addop 'lgs', [0x0F, 0xB5], :mrmA
addop 'lgdt', [0x0F, 0x01], 2
addop 'lidt', [0x0F, 0x01, 0x18], nil, {:modrmA => [2, 0]}, :modrmA
addop 'lldt', [0x0F, 0x00], 2
addop 'lmsw', [0x0F, 0x01], 6
# prefix addop 'lock', [0xF0]
addop 'lsl', [0x0F, 0x03], :mrm
addop 'lss', [0x0F, 0xB2], :mrmA
addop 'ltr', [0x0F, 0x00], 3
addop 'mov', [0x0F, 0x20, 0xC0], :reg, {:d => [1, 1], :eeec => [2, 3]}, :eeec
addop 'mov', [0x0F, 0x21, 0xC0], :reg, {:d => [1, 1], :eeed => [2, 3]}, :eeed
addop('mov', [0x8C], 0, {:d => [0, 1], :seg3 => [1, 3]}, :seg3) { |op| op.args.reverse! }
addop 'out', [0xE6], nil, {:w => [0, 0]}, :reg_eax, :u8
addop 'out', [0xE6], nil, {:w => [0, 0]}, :u8
addop 'out', [0xEE], nil, {:w => [0, 0]}, :reg_eax, :reg_dx
addop 'out', [0xEE], nil, {:w => [0, 0]}, :reg_eax # implicit arguments
addop 'out', [0xEE], nil, {:w => [0, 0]}
addop_macrostr 'outs', [0x6E], :strop
addop 'pop', [0x07], nil, {:seg2A => [0, 3]}, :seg2A
addop 'pop', [0x0F, 0x81], nil, {:seg3A => [1, 3]}, :seg3A
addop('popa', [0x61]) { |o| o.props[:opsz] = 16 }
addop('popad', [0x61]) { |o| o.props[:opsz] = 32 }
addop('popf', [0x9D]) { |o| o.props[:opsz] = 16 }
addop('popfd', [0x9D]) { |o| o.props[:opsz] = 32 }
addop 'push', [0x06], nil, {:seg2 => [0, 3]}, :seg2
addop 'push', [0x0F, 0x80], nil, {:seg3A => [1, 3]}, :seg3A
addop('pusha', [0x60]) { |o| o.props[:opsz] = 16 }
addop('pushad',[0x60]) { |o| o.props[:opsz] = 32 }
addop('pushf', [0x9C]) { |o| o.props[:opsz] = 16 }
addop('pushfd',[0x9C]) { |o| o.props[:opsz] = 32 }
addop_macro3 'rcl', 2
addop_macro3 'rcr', 3
addop 'rdmsr', [0x0F, 0x32]
addop 'rdpmc', [0x0F, 0x33]
addop 'rdtsc', [0x0F, 0x31], nil, {}, :random
addop 'retf', [0xCB], nil, {}, :stopexec, :setip
addop 'retf', [0xCA], nil, {}, :stopexec, :u16, :setip
addop 'rsm', [0x0F, 0xAA]
addop 'sahf', [0x9E]
addop 'sgdt', [0x0F, 0x01, 0x00], nil, {:modrmA => [2, 0]}, :modrmA
addop 'sidt', [0x0F, 0x01, 0x08], nil, {:modrmA => [2, 0]}, :modrmA
addop 'sldt', [0x0F, 0x00], 0
addop 'smsw', [0x0F, 0x01], 4
addop 'stc', [0xF9]
addop 'std', [0xFD]
addop 'sti', [0xFB]
addop 'str', [0x0F, 0x00], 1
addop 'ud2', [0x0F, 0x0B]
addop 'verr', [0x0F, 0x00], 4
addop 'verw', [0x0F, 0x00], 5
addop 'wait', [0x9B]
addop 'wbinvd',[0x0F, 0x09]
addop 'wrmsr', [0x0F, 0x30]
addop 'xadd', [0x0F, 0xC0], :mrmw
addop 'xchg', [0x90], :reg, {}, :reg_eax
addop('xchg', [0x90], :reg, {}, :reg_eax) { |o| o.args.reverse! } # xchg eax, ebx == xchg ebx, eax
addop 'xchg', [0x86], :mrmw
addop 'xlat', [0xD7]
# pfx: addrsz = 0x67, lock = 0xf0, opsz = 0x66, repnz = 0xf2, rep/repz = 0xf3
# cs/nojmp = 0x2E, ds/jmp = 0x3E, es = 0x26, fs = 0x64, gs = 0x65, ss = 0x36
# undocumented opcodes
# TODO put these in the right place (486/P6/...)
addop 'aam', [0xD4], nil, {}, :u8
addop 'aad', [0xD5], nil, {}, :u8
addop 'setalc', [0xD6]
addop 'salc', [0xD6]
addop 'icebp', [0xF1]
addop 'loadall',[0x0F, 0x07]
addop 'ud2', [0x0F, 0xB9]
addop 'umov', [0x0F, 0x10], :mrmw,{:d => [1, 1]}
end
def init_387_only
init_cpu_constants
addop 'f2xm1', [0xD9, 0xF0]
addop 'fabs', [0xD9, 0xE1]
addop_macrofpu1 'fadd', 0
addop 'faddp', [0xDE, 0xC0], :regfp
addop('fbld', [0xDF, 0x20], nil, {:modrmA => [1, 0]}, :modrmA, :regfp0) { |o| o.props[:argsz] = 80 }
addop('fbstp', [0xDF, 0x30], nil, {:modrmA => [1, 0]}, :modrmA, :regfp0) { |o| o.props[:argsz] = 80 }
addop 'fchs', [0xD9, 0xE0], nil, {}, :regfp0
addop 'fnclex', [0xDB, 0xE2]
addop 'fclex', [0x9B, 0xDB, 0xE2]
addop_macrofpu1 'fcom', 2
addop_macrofpu1 'fcomp', 3
addop 'fcompp',[0xDE, 0xD9]
addop 'fcomip',[0xDF, 0xF0], :regfp
addop 'fcos', [0xD9, 0xFF], nil, {}, :regfp0
addop 'fdecstp', [0xD9, 0xF6]
addop_macrofpu1 'fdiv', 6
addop_macrofpu1 'fdivr', 7
addop 'fdivp', [0xDE, 0xF8], :regfp
addop 'fdivrp',[0xDE, 0xF0], :regfp
addop 'ffree', [0xDD, 0xC0], nil, {:regfp => [1, 0]}, :regfp
addop_macrofpu2 'fiadd', 0
addop_macrofpu2 'fimul', 1
addop_macrofpu2 'ficom', 2
addop_macrofpu2 'ficomp',3
addop_macrofpu2 'fisub', 4
addop_macrofpu2 'fisubr',5
addop_macrofpu2 'fidiv', 6
addop_macrofpu2 'fidivr',7
addop 'fincstp', [0xD9, 0xF7]
addop 'fninit', [0xDB, 0xE3]
addop 'finit', [0x9B, 0xDB, 0xE3]
addop_macrofpu2 'fist', 2, 1
addop_macrofpu3 'fild', 0
addop_macrofpu3 'fistp',3
addop('fld', [0xD9, 0x00], nil, {:modrmA => [1, 0]}, :modrmA, :regfp0) { |o| o.props[:argsz] = 32 }
addop('fld', [0xDD, 0x00], nil, {:modrmA => [1, 0]}, :modrmA, :regfp0) { |o| o.props[:argsz] = 64 }
addop('fld', [0xDB, 0x28], nil, {:modrmA => [1, 0]}, :modrmA, :regfp0) { |o| o.props[:argsz] = 80 }
addop 'fld', [0xD9, 0xC0], :regfp
addop('fldcw', [0xD9, 0x28], nil, {:modrmA => [1, 0]}, :modrmA) { |o| o.props[:argsz] = 16 }
addop 'fldenv', [0xD9, 0x20], nil, {:modrmA => [1, 0]}, :modrmA
addop 'fld1', [0xD9, 0xE8]
addop 'fldl2t', [0xD9, 0xE9]
addop 'fldl2e', [0xD9, 0xEA]
addop 'fldpi', [0xD9, 0xEB]
addop 'fldlg2', [0xD9, 0xEC]
addop 'fldln2', [0xD9, 0xED]
addop 'fldz', [0xD9, 0xEE]
addop_macrofpu1 'fmul', 1
addop 'fmulp', [0xDE, 0xC8], :regfp
addop 'fnop', [0xD9, 0xD0]
addop 'fpatan', [0xD9, 0xF3]
addop 'fprem', [0xD9, 0xF8]
addop 'fprem1', [0xD9, 0xF5]
addop 'fptan', [0xD9, 0xF2]
addop 'frndint',[0xD9, 0xFC]
addop 'frstor', [0xDD, 0x20], nil, {:modrmA => [1, 0]}, :modrmA
addop 'fnsave', [0xDD, 0x30], nil, {:modrmA => [1, 0]}, :modrmA
addop 'fnstsw', [0xDF, 0xE0]
addop('fnstsw', [0xDD, 0x38], nil, {:modrmA => [1, 0]}, :modrmA) { |o| o.props[:argsz] = 16 }
addop 'fscale', [0xD9, 0xFD]
addop 'fsin', [0xD9, 0xFE]
addop 'fsincos',[0xD9, 0xFB]
addop 'fsqrt', [0xD9, 0xFA]
addop('fst', [0xD9, 0x10], nil, {:modrmA => [1, 0]}, :modrmA, :regfp0) { |o| o.props[:argsz] = 32 }
addop('fst', [0xDD, 0x10], nil, {:modrmA => [1, 0]}, :modrmA, :regfp0) { |o| o.props[:argsz] = 64 }
addop 'fst', [0xD9, 0xD0], :regfp
addop('fstp', [0xD9, 0x18], nil, {:modrmA => [1, 0]}, :modrmA, :regfp0) { |o| o.props[:argsz] = 32 }
addop('fstp', [0xDD, 0x18], nil, {:modrmA => [1, 0]}, :modrmA, :regfp0) { |o| o.props[:argsz] = 64 }
addop('fstp', [0xDB, 0x38], nil, {:modrmA => [1, 0]}, :modrmA, :regfp0) { |o| o.props[:argsz] = 80 }
addop 'fstp', [0xDD, 0xD8], :regfp
addop('fstcw', [0xD9, 0x38], nil, {:modrmA => [1, 0]}, :modrmA) { |o| o.props[:argsz] = 16 }
addop 'fstenv', [0xD9, 0x30], nil, {:modrmA => [1, 0]}, :modrmA
addop 'fstsw', [0x9B, 0xDF, 0xE0]
addop('fstsw', [0x9B, 0xDD, 0x38], nil, {:modrmA => [1, 0]}, :modrmA) { |o| o.props[:argsz] = 16 }
addop_macrofpu1 'fsub', 4
addop 'fsubp', [0xDE, 0xE8], :regfp
addop_macrofpu1 'fsubr', 5
addop 'fsubrp', [0xDE, 0xE0], :regfp
addop 'ftst', [0xD9, 0xE4]
addop 'fucom', [0xDD, 0xE0], :regfp
addop 'fucomp', [0xDD, 0xE8], :regfp
addop 'fucompp',[0xDA, 0xE9]
addop 'fucomi', [0xDB, 0xE8], :regfp
addop 'fxam', [0xD9, 0xE5]
addop 'fxch', [0xD9, 0xC8], :regfp
addop 'fxtract',[0xD9, 0xF4]
addop 'fyl2x', [0xD9, 0xF1]
addop 'fyl2xp1',[0xD9, 0xF9]
addop 'fwait', [0x9B]
end
def init_486_only
init_cpu_constants
# TODO add new segments (fs/gs) ?
end
def init_pentium_only
init_cpu_constants
addop 'cmpxchg8b', [0x0F, 0xC7], 1
# lock cmpxchg8b eax
#addop 'f00fbug', [0xF0, 0x0F, 0xC7, 0xC8]
# mmx
addop 'emms', [0x0F, 0x77]
addop('movd', [0x0F, 0x6E], :mrmmmx, {:d => [1, 4]}) { |o| o.args[o.args.index(:modrmmmx)] = :modrm }
addop('movq', [0x0F, 0x6F], :mrmmmx, {:d => [1, 4]}) { |o| o.args.reverse! } # TODO check other mrmmmx
addop 'packssdw', [0x0F, 0x6B], :mrmmmx
addop 'packsswb', [0x0F, 0x63], :mrmmmx
addop 'packuswb', [0x0F, 0x67], :mrmmmx
addop_macrogg 0..2, 'padd', [0x0F, 0xFC], :mrmmmx
addop_macrogg 0..1, 'padds', [0x0F, 0xEC], :mrmmmx
addop_macrogg 0..1, 'paddus',[0x0F, 0xDC], :mrmmmx
addop 'pand', [0x0F, 0xDB], :mrmmmx
addop 'pandn', [0x0F, 0xDF], :mrmmmx
addop_macrogg 0..2, 'pcmpeq',[0x0F, 0x74], :mrmmmx
addop_macrogg 0..2, 'pcmpgt',[0x0F, 0x64], :mrmmmx
addop 'pmaddwd', [0x0F, 0xF5], :mrmmmx
addop 'pmulhuw', [0x0F, 0xE4], :mrmmmx
addop 'pmulhw',[0x0F, 0xE5], :mrmmmx
addop 'pmullw',[0x0F, 0xD5], :mrmmmx
addop 'por', [0x0F, 0xEB], :mrmmmx
addop_macrommx 1..3, 'psll', 3
addop_macrommx 1..2, 'psra', 2
addop_macrommx 1..3, 'psrl', 1
addop_macrogg 0..2, 'psub', [0x0F, 0xF8], :mrmmmx
addop_macrogg 0..1, 'psubs', [0x0F, 0xE8], :mrmmmx
addop_macrogg 0..1, 'psubus',[0x0F, 0xD8], :mrmmmx
addop_macrogg 1..3, 'punpckh', [0x0F, 0x68], :mrmmmx
addop_macrogg 1..3, 'punpckl', [0x0F, 0x60], :mrmmmx
addop 'pxor', [0x0F, 0xEF], :mrmmmx
end
def init_p6_only
addop_macrotttn 'cmov', [0x0F, 0x40], :mrm
%w{b e be u}.each_with_index { |tt, i|
addop 'fcmov' +tt, [0xDA, 0xC0 | (i << 3)], :regfp
addop 'fcmovn'+tt, [0xDB, 0xC0 | (i << 3)], :regfp
}
addop 'fcomi', [0xDB, 0xF0], :regfp
addop('fxrstor', [0x0F, 0xAE, 0x08], nil, {:modrmA => [2, 0]}, :modrmA) { |o| o.props[:argsz] = 512*8 }
addop('fxsave', [0x0F, 0xAE, 0x00], nil, {:modrmA => [2, 0]}, :modrmA) { |o| o.props[:argsz] = 512*8 }
addop 'sysenter',[0x0F, 0x34]
addop 'sysexit', [0x0F, 0x35]
end
def init_3dnow_only
init_cpu_constants
[['pavgusb', 0xBF], ['pfadd', 0x9E], ['pfsub', 0x9A],
['pfsubr', 0xAA], ['pfacc', 0xAE], ['pfcmpge', 0x90],
['pfcmpgt', 0xA0], ['pfcmpeq', 0xB0], ['pfmin', 0x94],
['pfmax', 0xA4], ['pi2fd', 0x0D], ['pf2id', 0x1D],
['pfrcp', 0x96], ['pfrsqrt', 0x97], ['pfmul', 0xB4],
['pfrcpit1', 0xA6], ['pfrsqit1', 0xA7], ['pfrcpit2', 0xB6],
['pmulhrw', 0xB7]].each { |str, bin|
addop str, [0x0F, 0x0F, bin], :mrmmmx
}
# 3dnow prefix fallback
addop '3dnow', [0x0F, 0x0F], :mrmmmx, {}, :u8
addop 'femms', [0x0F, 0x0E]
addop 'prefetch', [0x0F, 0x0D, 0x00], nil, {:modrmA => [2, 0] }, :modrmA
addop 'prefetchw', [0x0F, 0x0D, 0x08], nil, {:modrmA => [2, 0] }, :modrmA
end
def init_sse_only
init_cpu_constants
addop_macrossps 'addps', [0x0F, 0xA8], :mrmxmm
addop 'andnps', [0x0F, 0xAA], :mrmxmm
addop 'andps', [0x0F, 0xA4], :mrmxmm
addop_macrossps 'cmpps', [0x0F, 0xC2], :mrmxmm
addop 'comiss', [0x0F, 0x2F], :mrmxmm
[['pi2ps', 0x2A], ['ps2pi', 0x2D], ['tps2pi', 0x2C]].each { |str, bin|
addop('cvt' << str, [0x0F, bin], :mrmxmm) { |o| o.args[o.args.index(:modrmxmm)] = :modrmmmx }
addop('cvt' << str.tr('p', 's'), [0x0F, bin], :mrmxmm) { |o| o.args[o.args.index(:modrmxmm)] = :modrm ; o.props[:needpfx] = 0xF3 }
}
addop_macrossps 'divps', [0x0F, 0x5E], :mrmxmm
addop 'ldmxcsr', [0x0F, 0xAE, 0x10], nil, {:modrmA => [2, 0]}, :modrmA
addop_macrossps 'maxps', [0x0F, 0x5F], :mrmxmm
addop_macrossps 'minps', [0x0F, 0x5D], :mrmxmm
addop 'movaps', [0x0F, 0x28], :mrmxmm, {:d => [1, 0]}
# movhlps(reg, reg){nomem} == movlps(reg, mrm){no restriction}...
addop 'movhlps', [0x0F, 0x12], :mrmxmm, {:d => [1, 0]}
addop 'movlps', [0x0F, 0x12], :mrmxmm, {:d => [1, 0]}
addop 'movlhps', [0x0F, 0x16], :mrmxmm, {:d => [1, 0]}
addop 'movhps', [0x0F, 0x16], :mrmxmm, {:d => [1, 0]}
addop 'movmskps',[0x0F, 0x50, 0xC0], nil, {:reg => [2, 3], :regxmm => [2, 0]}, :regxmm, :reg
addop('movss', [0x0F, 0x10], :mrmxmm, {:d => [1, 0]}) { |o| o.props[:needpfx] = 0xF3 }
addop 'movups', [0x0F, 0x10], :mrmxmm, {:d => [1, 0]}
addop_macrossps 'mulps', [0x0F, 0x59], :mrmxmm
addop 'orps', [0x0F, 0x56], :mrmxmm
addop_macrossps 'rcpps', [0x0F, 0x53], :mrmxmm
addop_macrossps 'rsqrtps',[0x0F, 0x52], :mrmxmm
addop 'shufps', [0x0F, 0xC6], :mrmxmm, {}, :u8
addop_macrossps 'sqrtps', [0x0F, 0x51], :mrmxmm
addop 'stmxcsr', [0x0F, 0xAE, 0x18], nil, {:modrmA => [2, 0]}, :modrmA
addop_macrossps 'subps', [0x0F, 0x5C], :mrmxmm
addop 'ucomiss', [0x0F, 0x2E], :mrmxmm
addop 'unpckhps',[0x0F, 0x15], :mrmxmm
addop 'unpcklps',[0x0F, 0x14], :mrmxmm
addop 'xorps', [0x0F, 0x57], :mrmxmm
# start of integer instructions (accept the opsz override prefix to access xmm)
addop('pavgb', [0x0F, 0xE0], :mrmmmx) { |o| o.props[:xmmx] = true }
addop('pavgw', [0x0F, 0xE3], :mrmmmx) { |o| o.props[:xmmx] = true }
# TODO addop('pextrw', [0x0F, 0xC5], :mrmmmx) { |o| o.fields[:reg] = o.fields.delete(:regmmx) } { |o| o.props[:xmmx] = true ; o.args << :u8 }
# addop('pinsrw', [0x0F, 0xC4], :mrmmmx) { |o| o.fields[:reg] = o.fields.delete(:regmmx) } { |o| o.props[:xmmx] = true ; o.args << :u8 }
addop('pmaxsw', [0x0F, 0xEE], :mrmmmx) { |o| o.props[:xmmx] = true }
addop('pmaxub', [0x0F, 0xDE], :mrmmmx) { |o| o.props[:xmmx] = true }
addop('pminsw', [0x0F, 0xEA], :mrmmmx) { |o| o.props[:xmmx] = true }
addop('pminub', [0x0F, 0xDA], :mrmmmx) { |o| o.props[:xmmx] = true }
# addop('pmovmskb',[0x0F, 0xD4], :mrmmmx) { |o| o.fields[:reg] = o.fields.delete(:regmmx) } ) { |o| o.props[:xmmx] = true } # no mem ref in the mrm
addop('pmulhuw', [0x0F, 0xE4], :mrmmmx) { |o| o.props[:xmmx] = true }
addop('psadbw', [0x0F, 0xF6], :mrmmmx) { |o| o.props[:xmmx] = true }
addop('pshufw', [0x0F, 0x70], :mrmmmx) { |o| o.props[:xmmx] = true ; o.args << :u8 }
addop('maskmovq',[0x0F, 0xF7], :mrmmmx) { |o| o.props[:xmmx] = true } # nomem
addop('movntq', [0x0F, 0xE7], :mrmmmx) { |o| o.props[:xmmx] = true }
addop 'movntps', [0x0F, 0x2B], :mrmxmm
addop 'prefetcht0', [0x0F, 0x18, 0x08], nil, {:modrmA => [2, 0]}, :modrmA
addop 'prefetcht1', [0x0F, 0x18, 0x10], nil, {:modrmA => [2, 0]}, :modrmA
addop 'prefetcht2', [0x0F, 0x18, 0x18], nil, {:modrmA => [2, 0]}, :modrmA
addop 'prefetchnta',[0x0F, 0x18, 0x00], nil, {:modrmA => [2, 0]}, :modrmA
addop 'sfence', [0x0F, 0xAE, 0xF8]
end
# XXX must be done after init_sse (patches :regmmx opcodes)
# TODO complete the list
def init_sse2_only
init_cpu_constants
@opcode_list.each { |o| o.props[:xmmx] = true if o.args.include? :regmmx and o.args.include? :modrmmmx }
# TODO <..blabla...integer...blabla..>
# nomem
addop 'clflush', [0x0F, 0xAE, 0x38], nil, {:modrm => [2, 0]}, :modrm # mrmA ?
addop('maskmovdqu', [0x0F, 0xF7], :mrmxmm) { |o| o.props[:needpfx] = 0x66 }
addop('movntpd', [0x0F, 0x2B], :mrmxmm) { |o| o.props[:needpfx] = 0x66 }
addop('movntdq', [0x0F, 0xE7], :mrmxmm) { |o| o.props[:needpfx] = 0x66 }
addop 'movnti', [0x0F, 0xC3], :mrm
addop('pause', [0x90]) { |o| o.props[:needpfx] = 0xF3 }
addop 'lfence', [0x0F, 0xAE, 0xE8]
addop 'mfence', [0x0F, 0xAE, 0xF0]
end
def init_sse3_only
init_cpu_constants
addop('addsubpd', [0x0F, 0xD0], :mrmxmm) { |o| o.props[:needpfx] = 0x66 }
addop('addsubps', [0x0F, 0xD0], :mrmxmm) { |o| o.props[:needpfx] = 0xF2 }
addop('haddpd', [0x0F, 0x7C], :mrmxmm) { |o| o.props[:needpfx] = 0x66 }
addop('haddps', [0x0F, 0x7C], :mrmxmm) { |o| o.props[:needpfx] = 0xF2 }
addop('hsubpd', [0x0F, 0x7D], :mrmxmm) { |o| o.props[:needpfx] = 0x66 }
addop('hsubps', [0x0F, 0x7D], :mrmxmm) { |o| o.props[:needpfx] = 0xF2 }
addop 'monitor', [0x0F, 0x01, 0xC8]
addop 'mwait', [0x0F, 0x01, 0xC9]
addop('fisttp', [0xDF, 0x08], nil, {:modrmA => [1, 0]}, :modrmA) { |o| o.props[:argsz] = 16 }
addop('fisttp', [0xDB, 0x08], nil, {:modrmA => [1, 0]}, :modrmA) { |o| o.props[:argsz] = 32 }
addop('fisttp', [0xDD, 0x08], nil, {:modrmA => [1, 0]}, :modrmA) { |o| o.props[:argsz] = 64 }
addop('lddqu', [0x0F, 0xF0], :mrmxmm) { |o| o.args[o.args.index(:modrmxmm)] = :modrmA ; o.props[:needpfx] = 0xF2 }
addop('movddup', [0x0F, 0x12], :mrmxmm) { |o| o.props[:needpfx] = 0xF2 }
addop('movshdup', [0x0F, 0x16], :mrmxmm) { |o| o.props[:needpfx] = 0xF3 }
addop('movsldup', [0x0F, 0x12], :mrmxmm) { |o| o.props[:needpfx] = 0xF3 }
end
def init_vmx_only
init_cpu_constants
addop 'vmcall', [0x0F, 0x01, 0xC1]
# 64bits only, if I trust intel manuals..
addop('vmclear', [0x66, 0x0F, 0xC7, 6<<3], nil, {:modrmA => [3, 0]}, :modrmA) { |o| o.props[:argsz] = 64 }
addop 'vmlaunch', [0x0F, 0x01, 0xC2]
addop 'vmresume', [0x0F, 0x01, 0xC3]
addop('vmptrld', [0x0F, 0xC7, 6<<3], nil, {:modrmA => [2, 0]}, :modrmA) { |o| o.props[:argsz] = 64 }
addop('vmptrst', [0x0F, 0xC7, 7<<3], nil, {:modrmA => [2, 0]}, :modrmA) { |o| o.props[:argsz] = 64 }
addop 'vmread', [0x0F, 0x78], :mrm
addop 'vmwrite', [0x0F, 0x79], :mrm
addop 'vmxoff', [0x0F, 0x01, 0xC4]
addop('vmxon', [0xF3, 0x0F, 0xC7, 6<<3], nil, {:modrmA => [3, 0]}, :modrmA) { |o| o.props[:argsz] = 64 }
end
#
# CPU family dependencies
#
def init_386_common
init_386_common_only
end
def init_386
init_386_common
init_386_only
end
def init_387
init_387_only
end
def init_486
init_386
init_387
init_486_only
end
def init_pentium
init_486
init_pentium_only
end
def init_3dnow
init_pentium
init_3dnow_only
end
def init_p6
init_pentium
init_p6_only
end
def init_sse
init_p6
init_sse_only
end
def init_sse2
init_sse
init_sse2_only
end
def init_sse3
init_sse2
init_sse3_only
end
def init_vmx
init_sse3
init_vmx_only
end
def init_all
init_vmx
init_3dnow_only
end
alias init_latest init_all
#
# addop_* macros
#
def addop_macro1(name, num, immtype=:i)
addop name, [(num << 3) | 4], nil, {:w => [0, 0]}, :reg_eax, immtype
addop name, [num << 3], :mrmw, {:d => [0, 1]}
addop name, [0x80], num, {:w => [0, 0], :s => [0, 1]}, immtype
end
def addop_macro2(name, num)
addop name, [0x0F, 0xBA], (4 | num), {}, :u8
addop(name, [0x0F, 0xA3 | (num << 3)], :mrm) { |op| op.args.reverse! }
end
def addop_macro3(name, num)
addop name, [0xD0], num, {:w => [0, 0]}, :imm_val1
addop name, [0xD2], num, {:w => [0, 0]}, :reg_cl
addop name, [0xC0], num, {:w => [0, 0]}, :u8
end
def addop_macrotttn(name, bin, hint, fields = {}, *props, &blk)
[%w{o}, %w{no}, %w{b nae}, %w{nb ae},
%w{z e}, %w{nz ne}, %w{be na}, %w{nbe a},
%w{s}, %w{ns}, %w{p pe}, %w{np po},
%w{l nge}, %w{nl ge}, %w{le ng}, %w{nle g}].each_with_index { |e, i|
b = bin.dup
if b[0] == 0x0F
b[1] |= i
else
b[0] |= i
end
e.each { |k| addop(name + k, b.dup, hint, fields.dup, *props, &blk) }
}
end
def addop_macrostr(name, bin, type)
# addop(name, bin.dup, {:w => [0, 0]}) { |o| o.props[type] = true } # TODO allow segment override
addop(name+'b', bin) { |o| o.props[:opsz] = 16 ; o.props[type] = true }
addop(name+'b', bin) { |o| o.props[:opsz] = 32 ; o.props[type] = true }
bin = bin.dup
bin[0] |= 1
addop(name+'w', bin) { |o| o.props[:opsz] = 16 ; o.props[type] = true }
addop(name+'d', bin) { |o| o.props[:opsz] = 32 ; o.props[type] = true }
end
def addop_macrofpu1(name, n)
addop(name, [0xD8, n<<3], nil, {:modrmA => [1, 0]}, :modrmA, :regfp0) { |o| o.props[:argsz] = 32 }
addop(name, [0xDC, n<<3], nil, {:modrmA => [1, 0]}, :modrmA, :regfp0) { |o| o.props[:argsz] = 64 }
addop name, [0xD8, 0xC0|(n<<3)], :regfp, {:d => [0, 2]}
end
def addop_macrofpu2(name, n, n2=0)
addop(name, [0xDE|n2, n<<3], nil, {:modrmA => [1, 0]}, :modrmA, :regfp0) { |o| o.props[:argsz] = 16 }
addop(name, [0xDA|n2, n<<3], nil, {:modrmA => [1, 0]}, :modrmA, :regfp0) { |o| o.props[:argsz] = 32 }
end
def addop_macrofpu3(name, n)
addop_macrofpu2 name, n, 1
addop(name, [0xDF, 0x28|(n<<3)], nil, {:modrmA => [1, 0]}, :modrmA, :regfp0) { |o| o.props[:argsz] = 64 }
end
def addop_macrogg(ggrng, name, bin, *args, &blk)
ggrng.each { |gg|
bindup = bin.dup
bindup[1] |= gg
sfx = %w(b w d q)[gg]
addop name+sfx, bindup, *args, &blk
}
end
def addop_macrommx(ggrng, name, val)
addop_macrogg ggrng, name, [0x0F, 0xC0 | (val << 4)], :mrmmmx
addop_macrogg ggrng, name, [0x0F, 0x70, 0xC0 | (val << 4)], nil, {:regmmx => [2, 0]}, :u8
end
def addop_macrossps(name, bin, hint)
# don't allow fields argument, as this will be modified by addop (.dup it if needed)
addop name, bin, hint
addop(name.tr('p', 's'), bin, hint) { |o| o.props[:needpfx] = 0xF3 }
end
# helper function: creates a new Opcode based on the arguments, optionally
# yields it for further customisation, and appends it to the instruction set
# also responsible for creating disambiguating opcodes if necessary (:s flag hardcoding)
def addop(name, bin, hint=nil, fields={}, *argprops)
op = Opcode.new name
op.bin = bin
op.fields.replace fields
case hint
when nil
when :mrm, :mrmw, :mrmA
h = (hint == :mrmA ? :modrmA : :modrm)
op.fields[:reg] = [bin.length, 3]
op.fields[h] = [bin.length, 0]
op.fields[:w] = [bin.length - 1, 0] if hint == :mrmw
argprops.unshift :reg, h
op.bin << 0
when :reg
op.fields[:reg] = [bin.length-1, 0]
argprops.unshift :reg
when :regfp
op.fields[:regfp] = [bin.length-1, 0]
argprops.unshift :regfp, :regfp0
when Integer # mod/m, reg == opcode extension = hint
op.fields[:modrm] = [bin.length, 0]
op.bin << (hint << 3)
argprops.unshift :modrm
when :mrmmmx
op.fields[:regmmx] = [bin.length, 3]
op.fields[:modrm] = [bin.length, 0]
bin << 0
argprops.unshift :regmmx, :modrmmmx
when :mrmxmm
op.fields[:regxmm] = [bin.length, 3]
op.fields[:modrm] = [bin.length, 0]
bin << 0
argprops.unshift :regxmm, :modrmxmm
else
raise SyntaxError, "invalid hint #{hint.inspect} for #{name}"
end
if argprops.index(:u)
argprops << :unsigned_imm
argprops[argprops.index(:u)] = :i
end
(argprops & @valid_props).each { |p| op.props[p] = true }
argprops -= @valid_props
op.args.concat(argprops & @valid_args)
argprops -= @valid_args
raise "Invalid opcode definition: #{name}: unknown #{argprops.inspect}" unless argprops.empty?
yield op if block_given?
argprops = (op.props.keys - @valid_props) + (op.args - @valid_args) + (op.fields.keys - @fields_mask.keys)
raise "Invalid opcode customisation: #{name}: #{argprops.inspect}" unless argprops.empty?
addop_post(op)
end
# this recursive method is in charge of Opcode duplication (eg to hardcode some flag)
def addop_post(op)
dupe = proc { |o|
dop = Opcode.new o.name.dup
dop.bin, dop.fields, dop.props, dop.args = o.bin.dup, o.fields.dup, o.props.dup, o.args.dup
dop
}
if df = op.fields.delete(:d)
# hardcode the bit
dop = dupe[op]
dop.args.reverse!
addop_post dop
op.bin[df[0]] |= 1 << df[1]
addop_post op
return
elsif sf = op.fields.delete(:s)
# add explicit choice versions, with lower precedence (so that disassembling will return the general version)
# eg "jmp", "jmp.i8", "jmp.i"
# also hardcode the bit
op32 = op
addop_post op32
op8 = dupe[op]
op8.bin[sf[0]] |= 1 << sf[1]
op8.args.map! { |arg| arg == :i ? :i8 : arg }
addop_post op8
op32 = dupe[op32]
op32.name << '.i'
addop_post op32
op8 = dupe[op8]
op8.name << '.i8'
addop_post op8
return
elsif op.args.include? :regfp0
dop = dupe[op]
dop.args.delete :regfp0
addop_post dop
end
@opcode_list << op
end
end
end
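The condition-code expansion done by addop_macrotttn above can be tried in isolation. The following is a minimal standalone sketch, not metasm code: expand_tttn, TTTN, JCC and SETCC are hypothetical names, and the result is a plain Hash instead of Opcode objects. It assumes, like the original, that the condition index is OR-ed into the first opcode byte, or into the second one when the first is the 0x0F escape.

```ruby
# Hypothetical standalone sketch of the tttn expansion; not part of metasm.
# Each entry lists the mnemonic suffixes sharing one 4-bit condition code.
TTTN = [%w{o}, %w{no}, %w{b nae}, %w{nb ae},
        %w{z e}, %w{nz ne}, %w{be na}, %w{nbe a},
        %w{s}, %w{ns}, %w{p pe}, %w{np po},
        %w{l nge}, %w{nl ge}, %w{le ng}, %w{nle g}]

# Returns a hash mnemonic => opcode bytes for every condition suffix.
def expand_tttn(name, bin)
  table = {}
  TTTN.each_with_index { |suffixes, i|
    b = bin.dup
    # two-byte opcodes (0x0F escape) carry the condition in the second byte
    idx = (b[0] == 0x0F ? 1 : 0)
    b[idx] |= i
    # every alias gets its own copy of the patched bytes
    suffixes.each { |s| table[name + s] = b.dup }
  }
  table
end

JCC   = expand_tttn('j', [0x70])
SETCC = expand_tttn('set', [0x0F, 0x90])
```

Here JCC['jz'] and JCC['je'] both map to the same byte sequence, which is why the original registers one opcode per alias rather than one per condition code.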

View File

@ -1,301 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
require 'metasm/ia32/opcodes'
require 'metasm/ia32/encode'
require 'metasm/parse'
module Metasm
class Ia32
class ModRM
# may return a SegReg
# must be called before SegReg parser (which could match only the seg part of a modrm)
def self.parse(lexer, otok)
tok = otok
# read operand size specifier
if tok and tok.type == :string and tok.raw =~ /^(?:byte|[dqo]?word|_(\d+)bits)$/
ptsz =
if $1
$1.to_i
else
case tok.raw
when 'byte'; 8
when 'word'; 16
when 'dword'; 32
when 'qword'; 64
when 'oword'; 128
else raise otok, 'mrm: bad ptr size'
end
end
lexer.skip_space
if tok = lexer.readtok and tok.type == :string and tok.raw == 'ptr'
lexer.skip_space
tok = lexer.readtok
end
end
# read segment selector
if tok and tok.type == :string and seg = SegReg.s_to_i[tok.raw]
lexer.skip_space
seg = SegReg.new(seg)
if not ntok = lexer.readtok or ntok.type != :punct or ntok.raw != ':'
raise otok, 'invalid modrm' if ptsz
lexer.unreadtok ntok
return seg
end
lexer.skip_space
tok = lexer.readtok
end
# ensure we have a modrm
if not tok or tok.type != :punct or tok.raw != '['
raise otok, 'invalid modrm' if ptsz or seg
return
end
lexer.skip_space_eol
# support fasm syntax [fs:eax] for segment selector
if tok = lexer.readtok and tok.type == :string and not seg and seg = SegReg.s_to_i[tok.raw]
raise otok, 'invalid modrm' if not ntok = lexer.readtok or ntok.type != :punct or ntok.raw != ':'
seg = SegReg.new(seg)
lexer.skip_space_eol
else
lexer.unreadtok tok
end
# read modrm content as generic expression
content = Expression.parse(lexer)
lexer.skip_space_eol
raise(otok, 'bad modrm') if not content or not ntok = lexer.readtok or ntok.type != :punct or ntok.raw != ']'
# converts matching externals to Regs in an expression
regify = proc { |o|
case o
when Expression
o.lexpr = regify[o.lexpr]
o.rexpr = regify[o.rexpr]
o
when String
if Reg.s_to_i.has_key? o
Reg.new(*Reg.s_to_i[o])
else o
end
else o
end
}
s = i = b = imm = nil
# assigns the Regs in the expression to base or index field of the modrm
walker = proc { |o|
case o
when nil
when Reg
if b
raise otok, 'mrm: too many regs' if i
i = o
s = 1
else
b = o
end
when Expression
if o.op == :* and (o.rexpr.kind_of? Reg or o.lexpr.kind_of? Reg)
# scaled index
raise otok, 'mrm: too many indexes' if i
s = o.lexpr
i = o.rexpr
s, i = i, s if s.kind_of? Reg
raise otok, 'mrm: bad scale' unless s.kind_of? Integer
elsif o.op == :+
# recurse
walker[o.lexpr]
walker[o.rexpr]
else
# found (a part of) the immediate
imm = Expression[imm, :+, o]
end
else
# found (a part of) the immediate
imm = Expression[imm, :+, o]
end
}
# do it
walker[regify[content.reduce]]
# ensure found immediate is really an immediate
raise otok, 'mrm: reg in imm' if imm.kind_of? Expression and not imm.externals.grep(Reg).empty?
# find default address size
adsz = b ? b.sz : i ? i.sz : lexer.program.cpu.size
# ptsz may be nil now, will be fixed up later (in parse_instruction_fixup) to match another instruction argument's size
new adsz, ptsz, s, i, b, imm, seg
end
end
# handles cpu-specific parser instruction, falls back to Ancestor's version if unknown keyword
# XXX changing the cpu size in the middle of the code may have baaad effects...
def parse_parser_instruction(lexer, instr)
case instr.raw.downcase
when '.mode', '.bits'
lexer.skip_space
if tok = lexer.readtok and tok.type == :string and (tok.raw == '16' or tok.raw == '32')
@size = tok.raw.to_i
lexer.skip_space
raise instr, 'syntax error' if ntok = lexer.nexttok and ntok.type != :eol
else
raise instr, 'invalid cpu mode'
end
else super
end
end
def parse_prefix(i, pfx)
# XXX check for redefinition ?
# implicit 'true' return value when an assignment occurs
i.prefix ||= {}
case pfx
when 'lock'; i.prefix[:lock] = true
when 'rep'; i.prefix[:rep] = 'rep'
when 'repe', 'repz'; i.prefix[:rep] = 'repz'
when 'repne', 'repnz'; i.prefix[:rep] = 'repnz'
end
end
# parses an arbitrary ia32 instruction argument
def parse_argument(lexer)
# reserved names (registers/segments etc)
@args_token ||= (Argument.double_list + Argument.simple_list).map { |a| a.s_to_i.keys }.flatten.inject({}) { |h, e| h.update e => true }
lexer.skip_space
return if not tok = lexer.readtok
if tok.type == :string and tok.raw == 'ST'
lexer.skip_space
if ntok = lexer.readtok and ntok.type == :punct and ntok.raw == '('
lexer.skip_space
if not nntok = lexer.readtok or nntok.type != :string or nntok.raw !~ /^[0-9]$/ or
not ntok = (lexer.skip_space; lexer.readtok) or ntok.type != :punct or ntok.raw != ')'
raise tok, 'invalid FP register'
else
tok.raw << '(' << nntok.raw << ')'
if FpReg.s_to_i.has_key? tok.raw
return FpReg.new(FpReg.s_to_i[tok.raw])
else
raise tok, 'invalid FP register'
end
end
else
lexer.unreadtok ntok
end
end
if ret = ModRM.parse(lexer, tok)
ret
elsif @args_token[tok.raw]
# most frequent first: standard register
Argument.double_list.each { |a|
return a.new(*a.s_to_i[tok.raw]) if a.s_to_i.has_key? tok.raw
}
Argument.simple_list.each { |a|
return a.new( a.s_to_i[tok.raw]) if a.s_to_i.has_key? tok.raw
}
raise tok, 'internal error'
else
lexer.unreadtok tok
expr = Expression.parse(lexer)
lexer.skip_space
# may be a farptr
if expr and ntok = lexer.readtok and ntok.type == :punct and ntok.raw == ':'
raise tok, 'invalid farptr' if not addr = Expression.parse(lexer)
Farptr.new expr, addr
else
lexer.unreadtok ntok
expr
end
end
end
# check if the argument matches the opcode's argument spec
def parse_arg_valid?(o, spec, arg)
return false if s = o.props[:argsz] and (arg.kind_of? Reg or arg.kind_of? ModRM) and arg.sz and s != arg.sz
case spec
when :reg
arg.class == Reg and
if not o.fields[:w] or o.name == 'movsx' or o.name == 'movzx'
# we know the prototype of movsx: :reg is the large param
# no al/bl/bh/etc allowed
arg.sz >= 16
else true
end
when :modrm
(arg.class == ModRM or arg.class == Reg) and
if not o.fields[:w]
!arg.sz or arg.sz >= 16
elsif o.name == 'movsx' or o.name == 'movzx'
# we know the prototype of movsx: :modrm is the small param
!arg.sz or arg.sz <= 16
else true
end
when :i; arg.kind_of? Expression
when :imm_val1; arg.kind_of? Expression and arg.reduce == 1
when :imm_val3; arg.kind_of? Expression and arg.reduce == 3
when :reg_eax; arg.class == Reg and arg.val == 0
when :reg_cl; arg.class == Reg and arg.val == 1 and arg.sz == 8
when :reg_dx; arg.class == Reg and arg.val == 2 and arg.sz == 16
when :seg3; arg.class == SegReg
when :seg3A; arg.class == SegReg and arg.val > 3
when :seg2; arg.class == SegReg and arg.val < 4
when :seg2A; arg.class == SegReg and arg.val < 4 and arg.val != 1
when :eeec; arg.class == CtrlReg
when :eeed; arg.class == DbgReg
when :modrmA; arg.class == ModRM
when :mrm_imm; arg.class == ModRM and not arg.s and not arg.i and not arg.b
when :farptr; arg.class == Farptr
when :regfp; arg.class == FpReg
when :regfp0; arg.class == FpReg and (arg.val == nil or arg.val == 0) # XXX optional argument
when :modrmmmx; arg.class == ModRM or (arg.class == SimdReg and (arg.sz == 64 or (arg.sz == 128 and o.props[:xmmx])))
when :regmmx; arg.class == SimdReg and (arg.sz == 64 or (arg.sz == 128 and o.props[:xmmx]))
when :modrmxmm; arg.class == ModRM or (arg.class == SimdReg and arg.sz == 128)
when :regxmm; arg.class == SimdReg and arg.sz == 128
when :i8, :u8, :u16
arg.kind_of? Expression and
Expression.in_range?(arg, spec) != false # true or nil allowed
else raise EncodeError, "Internal error: unknown argument specification #{spec.inspect}"
end
end
def parse_instruction_checkproto(i)
case i.opname
when 'imul'
if i.args.length == 2 and i.args.first.kind_of? Reg and i.args.last.kind_of? Expression
i.args.unshift i.args.first.dup
end
end
super
end
# fixup the ptsz of a modrm argument, defaults to other argument size or current cpu mode
def parse_instruction_fixup(i)
if m = i.args.grep(ModRM).first and not m.sz
if i.opname == 'movzx' or i.opname == 'movsx'
m.sz = 8
else
if r = i.args.grep(Reg).first
m.sz = r.sz
else
# this is also the size of ctrlreg/dbgreg etc
# XXX fpu/simd ?
m.sz = @size
end
end
end
end
end
end
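The walker proc in ModRM.parse above sorts the terms of an address expression into base, index, scale and immediate. The sketch below reproduces that logic without metasm, under loud assumptions: split_modrm and REGS are hypothetical names, registers are plain Symbols, and expression nodes are [lhs, op, rhs] arrays instead of Expression objects.

```ruby
# Hypothetical standalone sketch of the base/index/scale/imm split; not metasm code.
# Terms are Integers, register names (Symbols) or labels (Strings);
# inner nodes are [lhs, op, rhs] arrays.
REGS = [:eax, :ecx, :edx, :ebx, :esp, :ebp, :esi, :edi]

def split_modrm(expr)
  m = { :base => nil, :index => nil, :scale => nil, :imm => [] }
  walk = lambda { |o|
    if REGS.include?(o)
      # first bare register becomes the base, the second an index with scale 1
      if m[:base]
        raise ArgumentError, 'too many regs' if m[:index]
        m[:index], m[:scale] = o, 1
      else
        m[:base] = o
      end
    elsif o.kind_of?(Array) and o[1] == :* and (REGS.include?(o[0]) or REGS.include?(o[2]))
      # scaled index, accepted as either 4*esi or esi*4
      raise ArgumentError, 'too many indexes' if m[:index]
      s, i = o[0], o[2]
      s, i = i, s if REGS.include?(s)
      raise ArgumentError, 'bad scale' unless s.kind_of?(Integer)
      m[:scale], m[:index] = s, i
    elsif o.kind_of?(Array) and o[1] == :+
      # recurse into both sides of an addition
      walk[o[0]]
      walk[o[2]]
    else
      # anything else is part of the immediate displacement
      m[:imm] << o
    end
  }
  walk[expr]
  m
end
```

For example, split_modrm([[:ebx, :+, [4, :*, :esi]], :+, 8]) yields ebx as base, esi as index with scale 4, and 8 as the only immediate term. Unlike the original, the sketch collects immediate terms in an array rather than folding them into one expression.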

View File

@ -1,93 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
require 'metasm/ia32/opcodes'
require 'metasm/render'
# XXX move context in another file ?
module Metasm
class Ia32
class Argument
include Renderable
@simple_list.each { |c| c.class_eval {
def render ; [self.class.i_to_s[@val]] end
} }
@double_list.each { |c| c.class_eval {
def render ; [self.class.i_to_s[@sz][@val]] end
def context ; {'set sz' => proc { |s| @sz = s }} end
} }
end
class Farptr
def render
[@seg, ':', @addr]
end
end
class ModRM
def qualifier(sz)
{
8 => 'byte',
16 => 'word',
32 => 'dword',
64 => 'qword',
128 => 'oword'
}.fetch(sz) { |k| "_#{sz}bits" }
end
def render
r = []
# is 'dword ptr' needed ?
# if not instr or not instr.args.grep(Reg).find {|a| a.sz == @sz}
r << ( qualifier(@sz) << ' ptr ' )
# end
r << @seg << ':' if seg
e = nil
e = Expression[e, :+, @b] if b
e = Expression[e, :+, @imm.reduce] if imm
e = Expression[e, :+, (@s == 1 ? @i : [@s, :*, @i])] if s
r << '[' << e << ']'
end
def context
{'set targetsz' => proc { |s| @sz = s },
'set seg' => proc { |s| @seg = SegReg.new s }
}
end
end
def render_instruction(i)
r = []
r << 'lock ' if i.prefix and i.prefix[:lock]
r << i.prefix[:rep] << ' ' if i.prefix and i.prefix[:rep]
r << i.opname
i.args.each { |a|
r << (r.last == i.opname ? ' ' : ', ') << a
}
r
end
def instruction_context(i)
# XXX
h = {}
op = opcode_list_byname[i.opname].first
if i.prefix and i.prefix[:rep]
h['toggle repz'] = proc { i.prefix[:rep] = {'repnz' => 'repz', 'repz' => 'repnz'}[i.prefix[:rep]] } if op.props[:stropz]
h['rm rep'] = proc { i.prefix.delete :rep }
else
h['set rep'] = proc { (i.prefix ||= {})[:rep] = 'rep' } if op.props[:strop]
h['set rep'] = proc { (i.prefix ||= {})[:rep] = 'repz' } if op.props[:stropz]
end
if i.args.find { |a| a.kind_of? ModRM and a.seg }
h['rm seg'] = proc { i.args.find { |a| a.kind_of? ModRM and a.seg }.seg = nil }
end
h['toggle lock'] = proc { (i.prefix ||= {})[:lock] = !i.prefix[:lock] }
h
end
end
end
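The ModRM rendering above (size qualifier, optional segment, then base, immediate and scaled index between brackets) can be sketched without the Renderable machinery. This is only an illustration: ptr_qualifier and render_modrm are hypothetical helpers that take plain strings and integers and return a String, whereas the original returns an array of renderable parts.

```ruby
# Hypothetical standalone sketch of ModRM rendering; not metasm code.

# Maps a pointer size in bits to its keyword; unknown sizes fall back to
# the "_<n>bits" form, which the parser's size regexp also accepts.
def ptr_qualifier(sz)
  { 8 => 'byte', 16 => 'word', 32 => 'dword', 64 => 'qword', 128 => 'oword' }.fetch(sz) { "_#{sz}bits" }
end

# Builds e.g. "dword ptr fs:[ebx+8+4*esi]", keeping the original term order
# (base, immediate, index). Simplification: terms are joined with '+', so a
# negative immediate is not rendered specially.
def render_modrm(sz, seg, base, index, scale, imm)
  terms = []
  terms << base if base
  terms << imm.to_s if imm
  terms << (scale == 1 ? index : "#{scale}*#{index}") if index
  out = "#{ptr_qualifier(sz)} ptr "
  out << "#{seg}:" if seg
  out << '[' << terms.join('+') << ']'
end
```

Using Hash#fetch with a block keeps the table of known sizes small while still producing a name for unusual sizes such as 80-bit floating-point operands.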

View File

@ -1,946 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
module Metasm
VERSION = 0x0001 # major major minor minor
# superclass for all metasm exceptions
class Exception < RuntimeError ; end
# parse error
class ParseError < Exception ; end
# invalid exeformat signature
class InvalidExeFormat < Exception ; end
# cannot honor .offset specification, reloc fixup overflow
class EncodeError < Exception ; end
# holds context of a processor
# endianness, current mode, opcode list...
class CPU
attr_accessor :valid_args, :valid_props, :fields_mask, :opcode_list
attr_accessor :endianness, :size
attr_accessor :generate_PIC
def initialize
@fields_mask = {}
@valid_args = []
@valid_props = [:setip, :saveip, :stopexec]
@opcode_list = []
@generate_PIC = true
end
# returns a hash opcode_name => array of opcodes with this name
def opcode_list_byname
@opcode_list_byname ||= @opcode_list.inject({}) { |h, o| (h[o.name] ||= []) << o ; h }
end
# assume that all subfunction calls return (may fXXk up disasm backtracker)
def make_call_return
@opcode_list.each { |o| o.props.delete :stopexec if o.props[:saveip] }
end
# sets up the C parser : standard macro definitions, type model (size of int etc)
def tune_cparser(cp)
cp.send "ilp#@size"
cp.lexer.define('_STDC', 1) if not cp.lexer.definition['_STDC']
# TODO cp.lexer.define('BIGENDIAN')
# TODO gcc -dM -E - </dev/null
# TODO ExeFormat-specific definitions
end
# returns a new & tuned C::Parser
def new_cparser
cp = C::Parser.new
tune_cparser cp
cp
end
# returns a new C::Compiler
def new_ccompiler(parser, exe=ExeFormat.new)
exe.cpu ||= self
C::Compiler.new(parser, exe)
end
end
# generic CPU, with no instructions, just size/endianness
class UnknownCPU < CPU
def initialize(size, endianness)
super()
@size, @endianness = size, endianness
end
end
# a cpu instruction 'formal' description
class Opcode
# the name of the instruction
attr_accessor :name
# formal description of arguments (array of cpu-specific symbols)
attr_accessor :args
# binary encoding of the opcode (integer for risc, array of bytes for cisc)
attr_accessor :bin
# list of bit fields in the binary encoding
# hash field => position
# position is bit shift for risc, [byte index, bit shift] for cisc
# field is cpu-specific
attr_accessor :fields
# hash of opcode generic properties/restrictions (mostly property => true/false)
attr_accessor :props
# binary mask for decoding
attr_accessor :bin_mask
def initialize(name)
@name = name
@args = []
@fields = {}
@props = {}
end
end
# defines an attribute self.backtrace (array of filename/lineno)
# and a method backtrace_str which dumps this array to a human-readable form
module Backtrace
# array [file, lineno, file, lineno]
# if file 'A' does #include 'B' you'll get ['A', linenoA, 'B', linenoB]
attr_accessor :backtrace
# builds a readable string from self.backtrace
def backtrace_str
Backtrace.backtrace_str(@backtrace)
end
# builds a readable backtrace string from an array of [file, lineno, file, lineno, ..]
def self.backtrace_str(ary)
return '' if not ary
i = ary.length
bt = ''
while i > 0
bt << ",\n\tincluded from " if ary[i]
i -= 2
bt << "#{ary[i].inspect} line #{ary[i+1]}"
end
bt
end
def exception(msg='syntax error')
ParseError.new "at #{backtrace_str}: #{msg}"
end
end
# an instruction: opcode name + arguments
class Instruction
# arguments (cpu-specific objects)
attr_accessor :args
# hash of prefixes (unused in simple cpus)
attr_accessor :prefix
# name of the associated opcode
attr_accessor :opname
# reference to the cpu which issued this instruction (used for rendering)
attr_accessor :cpu
include Backtrace
def initialize(cpu, opname=nil, args=[], pfx=nil, backtrace=nil)
@cpu = cpu
@opname = opname
@args = args
@prefix = pfx if pfx
@backtrace = backtrace
end
# duplicates the argument list and prefix hash
def dup
Instruction.new(@cpu, (@opname.dup if opname), @args.dup, (@prefix.dup if prefix), (@backtrace.dup if backtrace))
end
end
# all kinds of data description (including repeated/uninitialized)
class Data
# maps data type to Expression parameters (signedness/bit size)
INT_TYPE = {'db' => :u8, 'dw' => :u16, 'dd' => :u32, 'dq' => :u64}
# an Expression, an Array of Data, a String, or :uninitialized
attr_accessor :data
# the data type, from INT_TYPE (TODO store directly Expression parameters ?)
attr_accessor :type
# the repetition count of the data parameter (dup constructs)
attr_accessor :count
include Backtrace
def initialize(type, data, count=1, backtrace=nil)
@data, @type, @count, @backtrace = data, type, count, backtrace
end
end
# a name for a location
class Label
attr_reader :name
include Backtrace
def initialize(name, backtrace=nil)
@name, @backtrace = name, backtrace
end
end
# alignment directive
class Align
# the size to align to
attr_accessor :val
# the Data used to pad
attr_accessor :fillwith
include Backtrace
def initialize(val, fillwith=nil, backtrace=nil)
@val, @fillwith, @backtrace = val, fillwith, backtrace
end
end
# padding directive
class Padding
# Data used to pad
attr_accessor :fillwith
include Backtrace
def initialize(fillwith=nil, backtrace=nil)
@fillwith, @backtrace = fillwith, backtrace
end
end
# offset directive
# can be used to fix padding length or to assert some code/data compiled length
class Offset
# the assembler will arrange to make this pseudo-instruction
# be at this offset from beginning of current section
attr_accessor :val
include Backtrace
def initialize(val, backtrace=nil)
@val, @backtrace = val, backtrace
end
end
# contiguous/uninterrupted sequence of instructions, chained to other blocks
# TODO
class InstructionBlock
end
# the superclass of all real executable formats
# main methods:
# self.decode(str) => decodes the file format (imports/relocs/etc), no asm disassembly
# parse(source) => parses assembler source, fills self.source
# assemble => assembles self.source in binary sections/segments/whatever
# encode => builds imports/relocs tables, put all this together, links everything in self.encoded
class ExeFormat
# array of Data/Instruction/Align/Padding/Offset/Label, populated in parse
attr_accessor :cursource
# contains the binary version of the compiled program (EncodedData)
attr_accessor :encoded
# reference to the current CPU used (may be nil)
attr_accessor :cpu
# array of labels generated by new_label
attr_accessor :unique_labels_cache
# initializes self.cpu, creates an empty self.encoded
def initialize(cpu=nil)
@cpu = cpu
@encoded = EncodedData.new
@unique_labels_cache = []
end
# return the label name corresponding to the specified offset of the encodeddata, creates it if necessary
def label_at(edata, offset, base = '')
if not l = edata.inv_export[offset]
edata.add_export(l = new_label(base), offset)
end
l
end
# creates a new label, that is guaranteed to never be returned again as long as this object (ExeFormat) exists
def new_label(base = '')
base = base.dup.tr('^a-zA-Z0-9_', '_')
# use %x instead of to_s(16) for negative values
base = (base << '_uuid' << ('%08x' % base.object_id)).freeze if base.empty? or @unique_labels_cache.include? base
@unique_labels_cache << base
base
end
# share self.unique_labels_cache with other, checks for conflicts, returns self
def share_namespace(other)
return self if other.unique_labels_cache.equal? @unique_labels_cache
raise "share_ns #{(other.unique_labels_cache & @unique_labels_cache).inspect}" if not (other.unique_labels_cache & @unique_labels_cache).empty?
@unique_labels_cache.concat other.unique_labels_cache
other.unique_labels_cache = @unique_labels_cache
self
end
end
# superclass for classes similar to Expression
# must define #bind, #reduce_rec, #match_rec, #externals
class ExpressionType
def +(o) Expression[self, :+, o].reduce end
def -(o) Expression[self, :-, o].reduce end
end
# handle immediate values, and arbitrary arithmetic/logic expression involving variables
# boolean values are treated as in C : true is 1, false is 0
# TODO replace #type with #size => bits + #type => [:signed/:unsigned/:any/:floating]
# TODO handle floats
class Expression < ExpressionType
INT_SIZE = {:u8 => 8, :u16 => 16, :u32 => 32, :u64 => 64,
:i8 => 8, :i16 => 16, :i32 => 32, :i64 => 64,
:a8 => 8, :a16 => 16, :a32 => 32, :a64 => 64
}
INT_MIN = {:u8 => 0, :u16 => 0, :u32 => 0, :u64 => 0,
:i8 =>-0x80, :i16 =>-0x8000, :i32 =>-0x80000000, :i64 => -0x8000_0000_0000_0000,
:a8 =>-0x80, :a16 =>-0x8000, :a32 =>-0x80000000, :a64 => -0x8000_0000_0000_0000
}
INT_MAX = {:u8 => 0xff, :u16 => 0xffff, :u32 => 0xffffffff, :u64 => 0xffff_ffff_ffff_ffff,
:i8 => 0x7f, :i16 => 0x7fff, :i32 => 0x7fffffff, :i64 => 0x7fff_ffff_ffff_ffff,
:a8 => 0xff, :a16 => 0xffff, :a32 => 0xffffffff, :a64 => 0xffff_ffff_ffff_ffff
}
# alternative constructor
# in operands order, and allows nesting using sub-arrays
# ex: Expression[[:-, 42], :*, [1, :+, [4, :*, 7]]]
# with a single argument, return it if already an Expression, else construct a new one (using unary +/-)
def self.[](l, op = nil, r = nil)
raise ArgumentError, 'invalid Expression[nil]' if not l and not r and not op
return l if l.kind_of? Expression and not op
l, op, r = nil, :-, -l if not op and l.kind_of? ::Numeric and l < 0
l, op, r = nil, :+, l if not op
l, op, r = nil, l, op if not r
l = self[*l] if l.kind_of? ::Array
r = self[*r] if r.kind_of? ::Array
new(op, r, l)
end
# checks if a given Expression/Integer is in the type range
# returns true if it is, false if it overflows, and nil if cannot be determined (eg unresolved variable)
def self.in_range?(val, type)
val = val.reduce if val.kind_of? self
return unless val.kind_of? ::Numeric
if INT_MIN[type]
val == val.to_i and
val >= INT_MIN[type] and val <= INT_MAX[type]
end
end
# the operator (symbol)
attr_accessor :op
# the lefthandside expression (nil for unary expressions)
attr_accessor :lexpr
# the righthandside expression
attr_accessor :rexpr
# basic constructor
# XXX funny args order, you should use +Expression[]+ instead
def initialize(op, rexpr, lexpr)
raise ArgumentError, "Expression: invalid arg order: #{[lexpr, op, rexpr].inspect}" if not op.kind_of? ::Symbol
@op, @lexpr, @rexpr = op, lexpr, rexpr
end
# recursive check of equality using #==
# will not match 1+2 and 2+1
def ==(o)
# shortcircuit recursion
o.object_id == object_id or (o.class == self.class and [o.op, o.rexpr, o.lexpr] == [@op, @rexpr, @lexpr])
end
# make it useable as Hash key (see +==+)
def hash
[@lexpr, @op, @rexpr].hash
end
alias eql? ==
# returns a new Expression with all variables found in the binding replaced with their value
# does not check the binding's key class except for numeric
# calls lexpr/rexpr #bind if they respond_to? it
def bind(binding = {})
if binding[self]
return binding[self].dup
end
l, r = @lexpr, @rexpr
if l and binding[l]
raise "internal error - bound #{l.inspect}" if l.kind_of? ::Numeric
l = binding[l]
elsif l.kind_of? ExpressionType
l = l.bind(binding)
end
if r and binding[r]
raise "internal error - bound #{r.inspect}" if r.kind_of? ::Numeric
r = binding[r]
elsif r.kind_of? ExpressionType
r = r.bind(binding)
end
Expression[l, @op, r]
end
# bind in place (replace self.lexpr/self.rexpr with the binding value)
# only recurse with Expressions (does not use respond_to?)
def bind!(binding = {})
if @lexpr.kind_of?(Expression)
@lexpr.bind!(binding)
elsif @lexpr
@lexpr = binding[@lexpr] || @lexpr
end
if @rexpr.kind_of?(Expression)
@rexpr.bind!(binding)
elsif @rexpr
@rexpr = binding[@rexpr] || @rexpr
end
self
end
# returns a simplified copy of self
# can return an +Expression+ or a +Numeric+, may return self
# see +reduce_rec+ for simplifications description
def reduce
case e = reduce_rec
when Expression, Numeric; e
else Expression[e]
end
end
# resolves logic operations (true || false, etc)
# computes numeric operations (1 + 3)
# expands subtractions to addition of the opposite
# reduces double negations (-(-1) => 1)
# reduces addition of 0 and unary +
# canonicalizes additions: put variables in the lhs, descend addition tree in the rhs => (a + (b + (c + 12)))
# cancels (a) and (-a) when both are found somewhere in the addition tree
def reduce_rec
l = @lexpr.kind_of?(ExpressionType) ? @lexpr.reduce_rec : @lexpr
r = @rexpr.kind_of?(ExpressionType) ? @rexpr.reduce_rec : @rexpr
v =
if r.kind_of?(::Numeric) and (l == nil or l.kind_of?(::Numeric))
# calculate numerics
if [:'&&', :'||', :'>', :'<', :'>=', :'<=', :'==', :'!='].include?(@op)
# bool expr
raise 'internal error' if not l
case @op
when :'&&'; (l != 0) && (r != 0)
when :'||'; (l != 0) || (r != 0)
when :'>' ; l > r
when :'>='; l >= r
when :'<' ; l < r
when :'<='; l <= r
when :'=='; l == r
when :'!='; l != r
end ? 1 : 0
elsif not l
case @op
when :'!'; (r == 0) ? 1 : 0
when :+; r
when :-; -r
when :~; ~r
end
else
# use ruby evaluator
l.send(@op, r)
end
# shortcircuit
elsif l == 0 and @op == :'&&'
0
elsif l.kind_of?(::Numeric) and l != 0 and @op == :'||'
1
elsif @op == :>> or @op == :<<
if l == 0; 0
elsif r == 0; l
elsif l.kind_of? Expression and l.op == @op
Expression[l.lexpr, @op, [l.rexpr, :+, r]].reduce_rec
# XXX (a >> 1) << 1 != a (lose low bit)
# XXX (a << 1) >> 1 != a (with real cpus, lose high bit)
end
elsif @op == :'!'
if r.kind_of? Expression and op = {:'==' => :'!=', :'!=' => :'==', :< => :>=, :> => :<=, :<= => :>, :>= => :<}[r.op]
Expression[r.lexpr, op, r.rexpr].reduce_rec
end
elsif @op == :^
if l == :unknown or r == :unknown; :unknown
elsif l == 0; r
elsif r == 0; l
elsif l == r; 0
elsif r == 1 and l.kind_of? Expression and [:'==', :'!=', :<, :>, :<=, :>=].include? l.op
Expression[nil, :'!', l].reduce_rec
elsif l.kind_of? Expression and l.op == :^
# (a^b)^c => a^(b^c)
Expression[l.lexpr, :^, [l.rexpr, :^, r]].reduce_rec
elsif r.kind_of? Expression and r.op == :^
# a^(b^a) => b, a^(a^b) => b
if r.rexpr == l; r.lexpr
elsif r.lexpr == l; r.rexpr
end
end
elsif @op == :&
if l == 0 or r == 0; 0
elsif r == 1 and l.kind_of? Expression and [:'==', :'!=', :<, :>, :<=, :>=].include? l.op
l
elsif l == r; l
elsif l.kind_of? Expression and l.op == :& and r.kind_of? Integer and l.rexpr.kind_of? Integer; Expression[l.lexpr, :&, r & l.rexpr].reduce_rec
elsif r.kind_of? ::Integer and l.kind_of? Expression and l.op == :|
# check for rol/ror composition
m = Expression[[['var', :sh_op, 'amt'], :|, ['var', :inv_sh_op, 'inv_amt']], :&, 'mask']
if vars = match(m, 'var', :sh_op, 'amt', :inv_sh_op, 'inv_amt', 'mask') and vars[:sh_op] == {:>> => :<<, :<< => :>>}[ vars[:inv_sh_op]] and
((vars['amt'].kind_of?(::Integer) and vars['inv_amt'].kind_of?(::Integer) and ampl = vars['amt'] + vars['inv_amt']) or
(vars['amt'].kind_of? Expression and vars['amt'].op == :% and vars['amt'].rexpr.kind_of? ::Integer and
vars['inv_amt'].kind_of? Expression and vars['inv_amt'].op == :% and vars['amt'].rexpr == vars['inv_amt'].rexpr and ampl = vars['amt'].rexpr)) and
vars['mask'].kind_of?(::Integer) and vars['mask'] == (1<<ampl)-1 and vars['var'].kind_of? Expression and # it's a rotation
ivars = vars['var'].match(m, 'var', :sh_op, 'amt', :inv_sh_op, 'inv_amt', 'mask') and ivars[:sh_op] == {:>> => :<<, :<< => :>>}[ivars[:inv_sh_op]] and
((ivars['amt'].kind_of?(::Integer) and ivars['inv_amt'].kind_of?(::Integer) and ampl = ivars['amt'] + ivars['inv_amt']) or
(ivars['amt'].kind_of? Expression and ivars['amt'].op == :% and ivars['amt'].rexpr.kind_of? ::Integer and
ivars['inv_amt'].kind_of? Expression and ivars['inv_amt'].op == :% and ivars['amt'].rexpr == ivars['inv_amt'].rexpr and ampl = ivars['amt'].rexpr)) and
ivars['mask'].kind_of?(::Integer) and ivars['mask'] == (1<<ampl)-1 and ivars['mask'] == vars['mask'] # it's a composed rotation
if ivars[:sh_op] != vars[:sh_op]
# ensure the rotations are the same orientation
ivars[:sh_op], ivars[:inv_sh_op] = ivars[:inv_sh_op], ivars[:sh_op]
ivars['amt'], ivars['inv_amt'] = ivars['inv_amt'], ivars['amt']
end
amt = Expression[[vars['amt'], :+, ivars['amt']], :%, ampl]
invamt = Expression[[vars['inv_amt'], :+, ivars['inv_amt']], :%, ampl]
Expression[[[ivars['var'], vars[:sh_op], amt], :|, [ivars['var'], vars[:inv_sh_op], invamt]], :&, vars['mask']].reduce_rec
end
end
elsif @op == :|
if l == 0; r
elsif r == 0; l
elsif l == -1 or r == -1; -1
elsif l == r; l
end
elsif @op == :*
if l == 0 or r == 0; 0
elsif l == 1; r
elsif r == 1; l
end
elsif @op == :-
if l == :unknown or r == :unknown; :unknown
elsif not l and r.kind_of? Expression and (r.op == :- or r.op == :+)
if r.op == :- # no lexpr (reduced)
# -(-x) => x
r.rexpr
else # :+ and lexpr (r is reduced)
# -(a+b) => (-a)+(-b)
Expression[[:-, r.lexpr], :+, [:-, r.rexpr]].reduce_rec
end
elsif l
# a-b => a+(-b)
Expression[l, :+, [:-, r]].reduce_rec
end
elsif @op == :+
if l == :unknown or r == :unknown; :unknown
elsif not l; r # +x => x
elsif r == 0; l # x+0 => x
elsif l.kind_of?(::Numeric)
if r.kind_of? Expression and r.op == :+
# 1+(x+y) => x+(y+1)
Expression[r.lexpr, :+, [r.rexpr, :+, l]].reduce_rec
else
# 1+a => a+1
Expression[r, :+, l].reduce_rec
end
elsif l.kind_of? Expression and l.op == :+
# (a+b)+foo => a+(b+foo)
Expression[l.lexpr, :+, [l.rexpr, :+, r]].reduce_rec
elsif l.kind_of? Expression and r.kind_of? Expression and l.op == :% and r.op == :% and l.rexpr.kind_of?(::Integer) and l.rexpr == r.rexpr
Expression[[l.lexpr, :+, r.lexpr], :%, l.rexpr].reduce_rec
else
# a+(b+(c+(-a))) => b+c+0
# a+((-a)+(b+c)) => 0+b+c
neg_l = l.rexpr if l.kind_of? Expression and l.op == :-
# recursive search & replace -lexpr by 0
simplifier = proc { |cur|
if (neg_l and neg_l == cur) or (cur.kind_of? Expression and cur.op == :- and not cur.lexpr and cur.rexpr == l)
# -l found
0
else
# recurse
if cur.kind_of? Expression and cur.op == :+
if newl = simplifier[cur.lexpr]
Expression[newl, cur.op, cur.rexpr].reduce_rec
elsif newr = simplifier[cur.rexpr]
Expression[cur.lexpr, cur.op, newr].reduce_rec
end
end
end
}
simplifier[r]
end
end
case v
when nil
# no dup if no new value
(r == :unknown or l == :unknown) ? :unknown :
((r == @rexpr and l == @lexpr) ? self : Expression[l, @op, r])
when Expression
(v.lexpr == :unknown or v.rexpr == :unknown) ? :unknown : v
else v
end
end
# a pattern-matching method
# Expression[42, :+, 28].match(Expression['any', :+, 28], 'any') => {'any' => 42}
# Expression[42, :+, 28].match(Expression['any', :+, 'any'], 'any') => false
# Expression[42, :+, 42].match(Expression['any', :+, 'any'], 'any') => {'any' => 42}
# vars can match anything except nil
def match(target, *vars)
match_rec(target, vars.inject({}) { |h, v| h.update v => nil })
end
def match_rec(target, vars)
return false if not target.kind_of? Expression
[target.lexpr, target.op, target.rexpr].zip([@lexpr, @op, @rexpr]) { |targ, exp|
if targ and vars[targ]
return false if exp != vars[targ]
elsif targ and vars.has_key? targ
return false if not vars[targ] = exp
elsif targ.kind_of? ExpressionType
return false if not exp.kind_of? ExpressionType or not exp.match_rec(targ, vars)
else
return false if targ != exp
end
}
vars
end
# returns the array of non-numeric members of the expression
# if a variable appears 3 times, it will be present 3 times in the returned array
def externals
[@rexpr, @lexpr].inject([]) { |a, e|
case e
when ExpressionType; a.concat e.externals
when nil, ::Numeric; a
else a << e
end
}
end
# returns the externals that appear in the expression, does not walk through other ExpressionType
def expr_externals
[@rexpr, @lexpr].inject([]) { |a, e|
case e
when Expression; a.concat e.expr_externals
when nil, ::Numeric, ExpressionType; a
else a << e
end
}
end
def inspect
"Expression[#{@lexpr.inspect.sub(/^Expression/, '') + ', ' if @lexpr}#{@op.inspect + ', ' if @lexpr or @op != :+}#{@rexpr.inspect.sub(/^Expression/, '')}]"
end
Unknown = self[:unknown]
end
# an EncodedData relocation, specifies a value to patch in
class Relocation
# the relocation value (an Expression)
attr_accessor :target
# the relocation expression type
attr_accessor :type
# the endianness of the relocation
attr_accessor :endianness
include Backtrace
def initialize(target, type, endianness, backtrace = nil)
raise ArgumentError, "bad args #{[target, type, endianness].inspect}" if not target.kind_of? Expression or not type.kind_of? ::Symbol or not endianness.kind_of? ::Symbol
@target, @type, @endianness, @backtrace = target, type, endianness, backtrace
end
# fixup the encodeddata with value (reloc starts at off)
def fixup(edata, off, value)
str = Expression.encode_immediate(value, @type, @endianness, @backtrace)
edata.fill off
edata.data[off, str.length] = str
end
# size of the relocation field, in bytes
def length
Expression::INT_SIZE[@type]/8
end
end
# a String-like, with export/relocation information added
class EncodedData
# string with raw data
attr_accessor :data
# hash, key = offset within data, value = +Relocation+
attr_accessor :reloc
# hash, key = export name, value = offset within data - use add_export to update
attr_accessor :export
# hash, key = offset, value = 1st export name
attr_accessor :inv_export
# virtual size of data (all 0 by default, see +fill+)
attr_accessor :virtsize
# arbitrary pointer, often used when decoding immediates
# may be initialized with an export value
attr_reader :ptr
def ptr=(p)
@ptr = @export[p] || p
end
# opts' keys in :reloc, :export, :virtsize, defaults to empty/empty/data.length
def initialize(data = '', opts={})
@data = data
@reloc = opts[:reloc] || {}
@export = opts[:export] || {}
@inv_export = @export.invert
@virtsize = opts[:virtsize] || @data.length
@ptr = 0
end
def add_export(label, off=@ptr, set_inv=false)
@export[label] = off
if set_inv or not @inv_export[off]
@inv_export[off] = label
end
end
# returns the size of raw data, that is [data.length, last relocation end].max
def rawsize
[@data.length, *@reloc.map { |off, rel| off + rel.length } ].max
end
# String-like
alias length virtsize
# String-like
alias size virtsize
def empty?
@virtsize == 0
end
# returns a copy of itself, with reloc/export duped (but not deep)
def dup
self.class.new @data.dup, :reloc => @reloc.dup, :export => @export.dup, :virtsize => @virtsize
end
# resolve relocations:
# calculate each reloc target using Expression#bind(binding)
# if numeric, replace the raw data with the encoding of this value (+fill+s preceding data if needed) and remove the reloc
# if replace_target is true, the reloc target is replaced with its bound counterpart
def fixup_choice(binding, replace_target)
@reloc.keys.each { |off|
val = @reloc[off].target.bind(binding).reduce
if val.kind_of? Integer
reloc = @reloc[off]
reloc.fixup(self, off, val)
@reloc.delete(off) # delete only if not overflowed
elsif replace_target
@reloc[off].target = val
end
}
end
# +fixup_choice+ binding, false
def fixup(binding)
fixup_choice(binding, false)
end
# +fixup_choice+ binding, true
def fixup!(binding)
fixup_choice(binding, true)
end
# returns a default binding suitable for use in +fixup+
# every export is expressed as base + offset
# base defaults to the first export name - its offset (so that base + offset == that export)
def binding(base = nil)
if not base
key = @export.keys.sort_by { |k| @export[k] }.first
return {} if not key
base = (@export[key] == 0 ? key : Expression[key, :-, @export[key]])
end
@export.inject({}) { |binding, (n, o)| binding.update n => Expression[base, :+, o] }
end
# returns the offset where the relocation for target t is to be applied
def offset_of_reloc(t)
t = Expression[t]
@reloc.keys.find { |off| @reloc[off].target == t }
end
# fill virtual space by repeating pattern (String) up to len
# expand self if len is larger than self.virtsize
def fill(len = @virtsize, pattern = 0.chr)
@virtsize = len if len > @virtsize
@data = @data.ljust(len, pattern) if len > @data.length
end
# rounds up virtsize to next multiple of len
def align(len)
@virtsize = EncodedData.align_size(@virtsize, len)
end
# returns the value val rounded up to next multiple of len
def self.align_size(val, len)
((val + len - 1) / len).to_i * len
end
# concatenation of another +EncodedData+ (or nil/Fixnum/anything supporting String#<<)
def << other
case other
when nil
when ::Fixnum
fill
@data = @data.realstring if defined? VirtualString and @data.kind_of? VirtualString
@data << other
@virtsize += 1
when EncodedData
fill if not other.data.empty?
other.reloc.each { |k, v| @reloc[k + @virtsize] = v }
cf = (other.export.keys & @export.keys).find_all { |k| other.export[k] != @export[k] - @virtsize }
raise "edata merge: label conflict #{cf.inspect}" if not cf.empty?
other.export.each { |k, v| @export[k] = v + @virtsize }
other.inv_export.each { |k, v| @inv_export[@virtsize + k] = v }
if @data.empty?; @data = other.data.dup
elsif defined? VirtualString and @data.kind_of? VirtualString; @data = @data.realstring << other.data
else @data << other.data
end
@virtsize += other.virtsize
else
fill
if @data.empty?; @data = other.dup
elsif defined? VirtualString and @data.kind_of? VirtualString; @data = @data.realstring << other
else @data << other
end
@virtsize += other.length
end
self
end
# equivalent to dup << other, but raises ArgumentError on Integers & nil
def + other
raise ArgumentError if not other or other.kind_of?(Integer)
dup << other
end
# slice
def [](from, len=nil)
if not len and from.kind_of? Range
b = from.begin
e = from.end
b = @export[b] if @export[b]
e = @export[e] if @export[e]
b = b + @virtsize if b < 0
e = e + @virtsize if e < 0
len = e - b
len += 1 if not from.exclude_end?
from = b
end
from = @export[from] if @export[from]
from = from + @virtsize if from < 0
return if from > @virtsize or from < 0
return @data[from] if not len
len = @virtsize - from if from+len > @virtsize
ret = EncodedData.new @data[from, len]
ret.virtsize = len
@reloc.each { |o, r|
ret.reloc[o - from] = r if o >= from and o + r.length <= from+len
}
@export.each { |e, o|
ret.export[e] = o - from if o >= from and o <= from+len # XXX include end ?
}
@inv_export.each { |o, e|
ret.inv_export[o-from] = e if o >= from and o <= from+len
}
ret
end
# slice replacement, supports size change (shifts following relocs/exports)
# discards old exports/relocs from the overwritten space
def []=(from, len, val=nil)
if not val
val = len
len = nil
end
if not len and from.kind_of? ::Range
b = from.begin
e = from.end
b = @export[b] if @export[b]
e = @export[e] if @export[e]
b = b + @virtsize if b < 0
e = e + @virtsize if e < 0
len = e - b
len += 1 if not from.exclude_end?
from = b
end
from = @export[from] || from
raise "invalid offset #{from}" if not from.kind_of? ::Integer
from = from + @virtsize if from < 0
if not len
val = val.chr if val.kind_of? ::Integer
len = val.length
end
raise "invalid slice length #{len}" if not len.kind_of? ::Integer or len < 0
if from >= @virtsize
len = 0
elsif from+len > @virtsize
len = @virtsize-from
end
val = EncodedData.new << val
# remove overwritten metadata
@export.delete_if { |name, off| off > from and off < from + len }
@reloc.delete_if { |off, rel| off + rel.length > from and off < from + len }
# shrink/grow
if val.length != len
diff = val.length - len
@export.keys.each { |name| @export[name] = @export[name] + diff if @export[name] > from }
@inv_export.keys.each { |off| @inv_export[off+diff] = @inv_export.delete(off) if off > from }
@reloc.keys.each { |off| @reloc[off + diff] = @reloc.delete(off) if off > from }
if @virtsize >= from+len
@virtsize += diff
end
end
@virtsize = from + val.length if @virtsize < from + val.length
if from + len < @data.length # patch real data
val.fill
@data[from, len] = val.data
elsif not val.data.empty? # patch end of real data
@data << (0.chr*(from-@data.length)) if @data.length < from
@data[from..-1] = val.data
else # patch end of real data with fully virtual
@data = @data[0, from]
end
val.export.each { |name, off| @export[name] = from + off }
val.inv_export.each { |off, name| @inv_export[from+off] = name }
val.reloc.each { |off, rel| @reloc[from + off] = rel }
end
# replace a portion of self
# from/to may be Integers (offsets) or labels (from self.export)
# content is a String or an EncodedData, which will be inserted in the specified location (padded if necessary)
# raise if the string does not fit in.
def patch(from, to, content)
from = @export[from] || from
raise "invalid offset specification #{from}" if not from.kind_of? Integer
to = @export[to] || to
raise "invalid offset specification #{to}" if not to.kind_of? Integer
raise EncodeError, 'cannot patch data: new content too long' if to - from < content.length
self[from, content.length] = content
end
end
end
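# Illustrative sketch, not part of Metasm: a standalone model of what
# Relocation#fixup and EncodedData#fill do together, assuming a 32-bit field.
# The names RelocSketch, encode_u32 and patch are hypothetical, and
# Array#pack stands in for Expression.encode_immediate.
module RelocSketch
  # pack a 32-bit value with the given endianness (:little or :big)
  def self.encode_u32(value, endianness)
    [value & 0xffff_ffff].pack(endianness == :little ? 'V' : 'N')
  end

  # zero-pad data so the field fits (like fill), then overwrite 4 bytes at off
  # returns a new String, the argument is left untouched
  def self.patch(data, off, value, endianness)
    data = data.ljust(off + 4, 0.chr)
    data[off, 4] = encode_u32(value, endianness)
    data
  end
end
# usage: RelocSketch.patch("AB", 4, 0x11223344, :little) pads "AB" with two
# zero bytes before writing the value, as fixup does through edata.fill(off)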

View File

@ -1,7 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
require 'metasm/mips/parse'
require 'metasm/compile_c'

View File

@ -1,223 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
require 'metasm/mips/opcodes'
require 'metasm/decode'
module Metasm
class MIPS
def build_opcode_bin_mask(op)
# bit = 0 if can be mutated by a field value, 1 if fixed by opcode
op.bin_mask = 0
op.args.each { |f|
op.bin_mask |= @fields_mask[f] << @fields_shift[f]
}
op.bin_mask = 0xffffffff ^ op.bin_mask
end
def build_bin_lookaside
lookaside = Array.new(256) { [] }
@opcode_list.each { |op|
build_opcode_bin_mask op
b = op.bin >> 24
msk = op.bin_mask >> 24
for i in b..(b | (255^msk))
next if i & msk != b & msk
lookaside[i] << op
end
}
lookaside
end
def decode_findopcode(edata)
return if edata.ptr >= edata.data.length
# TODO handle relocations !!
di = DecodedInstruction.new(self)
val = edata.decode_imm(:u32, @endianness)
edata.ptr -= 4
di if di.opcode = @bin_lookaside[val >> 24].find { |op|
(op.bin & op.bin_mask) == (val & op.bin_mask)
}
end
def decode_instr_op(edata, di)
# TODO handle relocations !!
before_ptr = edata.ptr
op = di.opcode
di.instruction.opname = op.name
val = edata.decode_imm(:u32, @endianness)
field_val = proc { |f|
r = (val >> @fields_shift[f]) & @fields_mask[f]
# XXX do that cleanly (Expr.decode_imm)
case f
when :sa, :i16, :it
((r >> 15) == 1) ? (r - (1 << 16)) : r
when :i20
((r >> 19) == 1) ? (r - (1 << 20)) : r
when :i26
((r >> 25) == 1) ? (r - (1 << 26)) : r
else r
end
}
op.args.each { |a|
di.instruction.args << case a
when :rs, :rt, :rd; Reg.new field_val[a]
when :sa, :i16, :i20, :i26, :it; Expression[field_val[a]]
when :rs_i16; Memref.new Reg.new(field_val[:rs]), Expression[field_val[:i16]]
when :ft; FpReg.new field_val[a]
when :idm1, :idb; Expression['unsupported']
else raise SyntaxError, "Internal error: invalid argument #{a} in #{op.name}"
end
}
di.bin_length += edata.ptr - before_ptr
di
end
# converts relative branch offsets to absolute addresses:
# the target is the instruction address +addr+ plus its length plus the offset << 2 (addr may be an Expression)
# assumes edata.ptr points just after the instruction (as decode_instr_op left it)
# do not call twice on the same di !
def decode_instr_interpret(di, addr)
if di.opcode.props[:setip] and di.instruction.args.last.kind_of? Expression and di.opcode.name[0] != ?t
delta = Expression[di.instruction.args.last, :<<, 2].reduce
arg = Expression[[addr, :+, di.bin_length], :+, delta].reduce
di.instruction.args[-1] = Expression[arg]
end
di
end
def backtrace_binding(di)
a = di.instruction.args.map { |arg|
case arg
when Memref; arg.symbolic(di.address)
when Reg; arg.symbolic
else arg
end
}
binding =
case op = di.opcode.name
when 'nop', 'j', 'jr'; {}
when 'lui'; { a[0] => Expression[a[1], :<<, 16] }
when 'add', 'addu', 'addi', 'addiu'; { a[0] => Expression[a[1], :+, a[2]] } # XXX addiu $sp, -40h should be addiu $sp, 0xffc0 from the books, but..
when 'sub', 'subu'; { a[0] => Expression[a[1], :-, a[2]] }
when 'slt', 'slti'; { a[0] => Expression[a[1], :<, a[2]] }
when 'and', 'andi'; { a[0] => Expression[a[1], :&, a[2]] }
when 'or', 'ori'; { a[0] => Expression[a[1], :|, a[2]] }
when 'nor'; { a[0] => Expression[:~, [a[1], :|, a[2]]] }
when 'xor'; { a[0] => Expression[a[1], :^, a[2]] }
when 'sll'; { a[0] => Expression[a[1], :<<, a[2]] }
when 'srl', 'sra'; { a[0] => Expression[a[1], :>>, a[2]] } # XXX sign-extend
when 'lw'; { a[0] => Expression[a[1]] }
when 'sw'; { a[1] => Expression[a[0]] }
when 'lh', 'lhu'; { a[0] => Expression[a[1]] } # XXX sign-extend
when 'sh'; { a[1] => Expression[a[0]] }
when 'lb', 'lbu'; { a[0] => Expression[a[1]] }
when 'sb'; { a[1] => Expression[a[0]] }
when 'mfhi'; { a[0] => Expression[:hi] }
when 'mflo'; { a[0] => Expression[:lo] }
when 'mult'; { :hi => Expression[[a[0], :*, a[1]], :>>, 32], :lo => Expression[[a[0], :*, a[1]], :&, 0xffff_ffff] }
when 'div'; { :hi => Expression[a[0], :%, a[1]], :lo => Expression[a[0], :/, a[1]] }
when 'jalr'; { :$ra => Expression[Expression[di.address, :+, 2*di.bin_length].reduce] }
else
if op[0] == ?b and di.opcode.props[:setip]
else
puts "unknown instruction to emu #{di}" if $VERBOSE
end
{}
end
binding.delete 0 # allow add $zero, 42 => nop
binding
end
def get_xrefs_x(dasm, di)
return [] if not di.opcode.props[:setip]
arg = di.instruction.args.last
[Expression[
case arg
when Memref; Indirection[[arg.base.to_s.to_sym, :+, arg.offset], @size/8, di.address]
when Reg; arg.to_s.to_sym
else arg
end]]
end
def backtrace_update_function_binding(dasm, faddr, f, retaddrlist)
retaddrlist.map! { |retaddr| dasm.decoded[retaddr] ? dasm.decoded[retaddr].block.list.last.address : retaddr }
b = f.backtrace_binding
bt_val = proc { |r|
bt = []
retaddrlist.each { |retaddr|
bt |= dasm.backtrace(Expression[r], retaddr,
:include_start => true, :snapshot_addr => faddr, :origin => retaddr)
}
b[r] = ((bt.length == 1) ? bt.first : Expression::Unknown)
}
Reg.i_to_s.values.map { |r| r.to_sym }.each(&bt_val)
puts "update_func_bind: #{Expression[faddr]} has sp -> #{b[:$sp]}" if not f.need_finalize and not Expression[b[:$sp], :-, :$sp].reduce.kind_of?(::Integer) if $VERBOSE
end
def backtrace_is_function_return(expr, di=nil)
expr.reduce_rec == :$ra
end
def backtrace_is_stack_address(expr)
Expression[expr].expr_externals.include? :$sp
end
def replace_instr_arg_immediate(i, old, new)
i.args.map! { |a|
case a
when Expression; a == old ? new : Expression[a.bind(old => new).reduce]
when Memref
a.offset = (a.offset == old ? new : Expression[a.offset.bind(old => new).reduce]) if a.offset
a
else a
end
}
end
# make the target of the call know the value of $t9 (specified by the ABI)
# XXX hackish
def backtrace_found_result(dasm, di, expr, type, len)
if di.opcode.name == 'jalr' and di.instruction.args == [:$t9]
expr = dasm.normalize(expr)
(dasm.address_binding[expr] ||= {})[:$t9] ||= expr
end
end
# branch.*likely has no delay slot
def delay_slot(di)
(di.opcode.name[0] == ?b and di.opcode.name[-1] == ?l) ? 0 : 1
end
def disassembler_default_func
df = DecodedFunction.new
# from http://www.cs.rpi.edu/~chrisc/COURSES/CSCI-4250/FALL-2004/MIPS-regs.html
df.backtrace_binding = %w[v0 v1 a0 a1 a2 a3 t0 t1 t2 t3 t4 t5 t6 t7 t8 t9 at k0 k1].inject({}) { |h, r| h.update "$#{r}".to_sym => Expression::Unknown }
df.backtracked_for = [BacktraceTrace.new(Expression[:$ra], :default, Expression[:$ra], :x)]
df.btfor_callback = proc { |dasm, btfor, funcaddr, calladdr|
if funcaddr != :default
btfor
elsif di = dasm.decoded[calladdr] and di.opcode.props[:saveip] and di.instruction.to_s != 'jr $ra'
btfor
else []
end
}
df
end
end
end
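# Illustrative sketch, not part of Metasm: a standalone model of the field
# extraction in decode_instr_op and the branch target computation in
# decode_instr_interpret. The names MipsDecodeSketch, field, simm16 and
# branch_target are hypothetical; the masks and shifts are copied from
# init_mips32, and a fixed 4-byte instruction length is assumed.
module MipsDecodeSketch
  FIELDS_MASK  = { :rs => 0x1f, :rt => 0x1f, :i16 => 0xffff }
  FIELDS_SHIFT = { :rs => 21,   :rt => 16,   :i16 => 0 }

  # raw (unsigned) value of field f in the 32-bit instruction word
  def self.field(word, f)
    (word >> FIELDS_SHIFT[f]) & FIELDS_MASK[f]
  end

  # 16-bit immediate, sign-extended
  def self.simm16(word)
    r = field(word, :i16)
    (r >> 15) == 1 ? r - (1 << 16) : r
  end

  # absolute target of a relative branch located at addr
  def self.branch_target(word, addr)
    addr + 4 + (simm16(word) << 2)
  end
end
# usage: 0x1000ffff is 'beq $zero, $zero, -1', which branches to itself, so
# MipsDecodeSketch.branch_target(0x1000ffff, addr) gives back addr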

View File

@ -1,50 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
require 'metasm/mips/opcodes'
require 'metasm/encode'
module Metasm
class MIPS
private
def encode_instr_op(exe, instr, op)
base = op.bin
set_field = proc { |f, v|
base |= (v & @fields_mask[f]) << @fields_shift[f]
}
val, mask, shift = 0, 0, 0
# convert label name for jmp/call/loop to relative offset
if op.props[:setip] and op.name[0] != ?t and instr.args.last.kind_of? Expression
postlabel = exe.new_label('jmp_offset')
instr = instr.dup
instr.args[-1] = Expression[[instr.args[-1], :-, postlabel], :>>, 2]
postdata = EncodedData.new '', :export => {postlabel => 0}
else
postdata = ''
end
op.args.zip(instr.args).each { |sym, arg|
case sym
when :rs, :rt, :rd
set_field[sym, arg.i]
when :ft
set_field[sym, arg.i]
when :rs_i16
set_field[:rs, arg.base.i]
val, mask, shift = arg.offset, @fields_mask[:i16], @fields_shift[:i16]
when :sa, :i16, :i20
val, mask, shift = arg, @fields_mask[sym], @fields_shift[sym]
when :i26
val, mask, shift = Expression[arg, :>>, 2], @fields_mask[sym], @fields_shift[sym]
end
}
Expression[base, :+, [[val, :&, mask], :<<, shift]].encode(:u32, @endianness) << postdata
end
end
end
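# Illustrative sketch, not part of Metasm: a standalone model of the set_field
# logic in encode_instr_op for an I-type instruction with a constant immediate.
# The names MipsEncodeSketch and encode_itype are hypothetical, and Array#pack
# stands in for Expression#encode; labels and relocations are not handled.
module MipsEncodeSketch
  FIELDS_MASK  = { :rs => 0x1f, :rt => 0x1f, :i16 => 0xffff }
  FIELDS_SHIFT = { :rs => 21,   :rt => 16,   :i16 => 0 }

  # OR each masked field value into the opcode base, then pack the word
  def self.encode_itype(opcode_bin, rt, rs, imm, endianness = :big)
    word = opcode_bin
    { :rt => rt, :rs => rs, :i16 => imm }.each { |f, v|
      word |= (v & FIELDS_MASK[f]) << FIELDS_SHIFT[f]
    }
    [word].pack(endianness == :big ? 'N' : 'V')
  end
end
# usage: MipsEncodeSketch.encode_itype(0b001001 << 26, 29, 29, -40) encodes
# 'addiu $sp, $sp, -40'; masking with 0xffff keeps the two's complement form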

View File

@ -1,66 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
require 'metasm/main'
module Metasm
class MIPS < CPU
class Reg
class << self
attr_reader :s_to_i, :i_to_s
end
@s_to_i = {}
@i_to_s = {}
(0..31).each { |i| @s_to_i["r#{i}"] = @s_to_i["$r#{i}"] = @s_to_i["$#{i}"] = i }
%w[zero at v0 v1 a0 a1 a2 a3
t0 t1 t2 t3 t4 t5 t6 t7
s0 s1 s2 s3 s4 s5 s6 s7
t8 t9 k0 k1 gp sp fp ra].each_with_index { |r, i| @s_to_i[r] = @s_to_i['$'+r] = i ; @i_to_s[i] = '$'+r }
attr_accessor :i
def initialize(i)
@i = i
end
Sym = @i_to_s.sort.map { |k, v| v.to_sym }
def symbolic ; @i == 0 ? 0 : Sym[@i] end
end
class FpReg
class << self
attr_reader :s_to_i
end
@s_to_i = (0..31).inject({}) { |h, i| h.update "f#{i}" => i, "$f#{i}" => i }
attr_accessor :i
def initialize(i)
@i = i
end
end
class Memref
attr_accessor :base, :offset
def initialize(base, offset)
@base, @offset = base, offset
end
def symbolic(orig)
p = nil
p = Expression[p, :+, @base.symbolic] if base
p = Expression[p, :+, @offset] if offset
Expression[Indirection.new(p, 4, orig)].reduce
end
end
def initialize(endianness = :big, family = :mips32r2)
super()
@endianness = endianness
@size = 32
@fields_shift = {}
send "init_#{family}"
end
end
end
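# Illustrative sketch, not part of Metasm: a standalone copy of the register
# name tables built in Reg above, to show which spellings map to which index.
# The names MipsRegSketch, S_TO_I and I_TO_S are hypothetical.
module MipsRegSketch
  NAMES = %w[zero at v0 v1 a0 a1 a2 a3
             t0 t1 t2 t3 t4 t5 t6 t7
             s0 s1 s2 s3 s4 s5 s6 s7
             t8 t9 k0 k1 gp sp fp ra]
  S_TO_I = {}
  I_TO_S = {}
  # numeric spellings: r29, $r29, $29
  (0..31).each { |i| S_TO_I["r#{i}"] = S_TO_I["$r#{i}"] = S_TO_I["$#{i}"] = i }
  # ABI names, with or without the leading '$'
  NAMES.each_with_index { |r, i|
    S_TO_I[r] = S_TO_I['$' + r] = i
    I_TO_S[i] = '$' + r
  }
end
# usage: MipsRegSketch::S_TO_I['$sp'] and MipsRegSketch::S_TO_I['r29'] both
# give 29; only the ABI name is used when converting an index back to a string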

View File

@ -1,438 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
require 'metasm/mips/main'
# TODO coprocessors, floating point, 64bits, thumb mode
module Metasm
class MIPS
private
def addop(name, bin, *args)
o = Opcode.new name
o.bin = bin
o.args.concat(args & @fields_mask.keys)
(args & @valid_props).each { |p| o.props[p] = true }
if $DEBUG
a = (args - @valid_props - @fields_mask.keys)
p ['mips unhandled args',a] if not a.empty?
end
@opcode_list << o
end
def init_mips32_obsolete
addop 'beql', 0b010100 << 26, :rt, :rs, :i16, :setip # == , exec delay slot only if jump taken
addop 'bnel', 0b010101 << 26, :rt, :rs, :i16, :setip # !=
addop 'blezl',0b010110 << 26, :rt_z, :rs, :i16, :setip # <= 0
addop 'bgtzl',0b010111 << 26, :rt_z, :rs, :i16, :setip # > 0
addop 'bltzl',1 << 26 | 0b00010 << 16, :rs, :i16, :setip
addop 'bgezl',1 << 26 | 0b00011 << 16, :rs, :i16, :setip
addop 'bltzall', 1 << 26 | 0b10010 << 16, :rs, :i16, :setip
addop 'bgezall', 1 << 26 | 0b10011 << 16, :rs, :i16, :setip
end
def init_mips32_reserved
addop 'future111011', 0b111011 << 26, :i26
%w[011000 011001 011010 011011 100111 101100 101101 110100 110111 111100 111111].each { |b|
addop "reserved#{b}", b.to_i(2) << 26, :i26
}
addop 'ase_jalx', 0b011101 << 26, :i26
addop 'ase011110', 0b011110 << 26, :i26
# TODO add all special/regimm/...
end
def init_mips32
@fields_mask.update :rs => 0x1f, :rt => 0x1f, :rd => 0x1f, :sa => 0x1f,
:i16 => 0xffff, :i26 => 0x3ffffff, :rs_i16 => 0x3e0ffff, :it => 0x1f,
:ft => 0x1f, :idm1 => 0x1f, :idb => 0x1f, :sel => 7, :i20 => 0xfffff #, :i32 => 0
@fields_shift.update :rs => 21, :rt => 16, :rd => 11, :sa => 6,
:i16 => 0, :i26 => 0, :rs_i16 => 0, :it => 16,
:ft => 16, :idm1 => 11, :idb => 11, :sel => 0, :i20 => 6 #, :i32 => 0
init_mips32_obsolete
init_mips32_reserved
addop 'j', 0b000010 << 26, :i26, :setip, :stopexec # sets the program counter to (i26 << 2) | ((pc+4) & 0xfc000000) ie i26*4 in the 256M-aligned section containing the instruction in the delay slot
addop 'jal', 0b000011 << 26, :i26, :setip, :stopexec, :saveip # same thing, saves return addr in r31
addop 'mov', 0b001000 << 26, :rt, :rs # rt <- rs+0
addop 'addi', 0b001000 << 26, :rt, :rs, :i16 # add rt <- rs+i
addop 'addiu',0b001001 << 26, :rt, :rs, :i16 # add unsigned
addop 'slti', 0b001010 << 26, :rt, :rs, :i16 # set on less than
addop 'sltiu',0b001011 << 26, :rt, :rs, :i16 # set on less than unsigned
addop 'andi', 0b001100 << 26, :rt, :rs, :i16 # and
addop 'ori', 0b001101 << 26, :rt, :rs, :i16 # or
addop 'xori', 0b001110 << 26, :rt, :rs, :i16 # xor
addop 'lui', 0b001111 << 26, :rt, :i16 # load upper
# addop 'li', 0b001111 << 26, :rt, :i32 # pseudoinstruction
addop 'beq', 0b000100 << 26, :rt, :rs, :i16, :setip # ==
addop 'bne', 0b000101 << 26, :rt, :rs, :i16, :setip # !=
addop 'blez', 0b000110 << 26, :rs, :i16, :setip # <= 0
addop 'bgtz', 0b000111 << 26, :rs, :i16, :setip # > 0
addop 'lb', 0b100000 << 26, :rt, :rs_i16 # load byte rt <- [rs+i]
addop 'lh', 0b100001 << 26, :rt, :rs_i16 # load halfword
addop 'lwl', 0b100010 << 26, :rt, :rs_i16 # load word left
addop 'lw', 0b100011 << 26, :rt, :rs_i16 # load word
addop 'lbu', 0b100100 << 26, :rt, :rs_i16 # load byte unsigned
addop 'lhu', 0b100101 << 26, :rt, :rs_i16 # load halfword unsigned
addop 'lwr', 0b100110 << 26, :rt, :rs_i16 # load word right
addop 'sb', 0b101000 << 26, :rt, :rs_i16 # store byte
addop 'sh', 0b101001 << 26, :rt, :rs_i16 # store halfword
addop 'swl', 0b101010 << 26, :rt, :rs_i16 # store word left
addop 'sw', 0b101011 << 26, :rt, :rs_i16 # store word
addop 'swr', 0b101110 << 26, :rt, :rs_i16 # store word right
addop 'll', 0b110000 << 26, :rt, :rs_i16 # load linked word (read for atomic r/modify/w, sc does the w)
addop 'sc', 0b111000 << 26, :rt, :rs_i16 # store conditional word
addop 'lwc1', 0b110001 << 26, :ft, :rs_i16 # load word in fpreg low
addop 'swc1', 0b111001 << 26, :ft, :rs_i16 # store low fpreg word
addop 'lwc2', 0b110010 << 26, :rt, :rs_i16 # load word to copro2 register low
addop 'swc2', 0b111010 << 26, :rt, :rs_i16 # store low coproc2 register
addop 'ldc1', 0b110101 << 26, :ft, :rs_i16 # load dword in fpreg low
addop 'sdc1', 0b111101 << 26, :ft, :rs_i16 # store fpreg
addop 'ldc2', 0b110110 << 26, :rt, :rs_i16 # load dword to copro2 register
addop 'sdc2', 0b111110 << 26, :rt, :rs_i16 # store coproc2 register
addop 'pref', 0b110011 << 26, :it, :rs_i16 # prefetch (it = %w[load store r2 r3 load_streamed store_streamed load_retained store_retained
# r8 r9 r10 r11 r12 r13 r14 r15 r16 r17 r18 r19 r20 r21 r22 r23 r24 writeback_invalidate
# id26 id27 id28 id29 prepare_for_store id31]
addop 'cache',0b101111 << 26, :it, :rs_i16 # do things with the proc cache
# special
addop 'nop', 0
addop 'ssnop',1<<6
addop 'ehb', 3<<6
addop 'sll', 0b000000, :rd, :rt, :sa
addop 'movf', 0b000001, :rd, :rs, :cc
addop 'movt', 0b000001 | (1<<16), :rd, :rs, :cc
addop 'srl', 0b000010, :rd, :rt, :sa
addop 'sra', 0b000011, :rd, :rt, :sa
addop 'sllv', 0b000100, :rd, :rt, :rs
addop 'srlv', 0b000110, :rd, :rt, :rs
addop 'srav', 0b000111, :rd, :rt, :rs
addop 'jr', 0b001000, :rs, :setip, :stopexec # hint field ?
addop 'jr.hb',0b001000 | (1<<10), :rs, :setip, :stopexec
addop 'jalr', 0b001001 | (31<<11), :rs, :setip, :stopexec, :saveip # rd = r31 implicit
addop 'jalr', 0b001001, :rd, :rs, :setip, :stopexec, :saveip
addop 'jalr.hb', 0b001001 | (1<<10) | (31<<11), :rs, :setip, :stopexec, :saveip
addop 'jalr.hb', 0b001001 | (1<<10), :rd, :rs, :setip, :stopexec, :saveip
addop 'movz', 0b001010, :rd, :rs, :rt # rt == 0 ? rd <- rs
addop 'movn', 0b001011, :rd, :rs, :rt
addop 'syscall', 0b001100, :i20
addop 'break',0b001101, :i20
addop 'sync', 0b001111 # type 0 implicit
addop 'sync', 0b001111, :sa
addop 'mfhi', 0b010000, :rd # copies special reg HI to reg
addop 'mthi', 0b010001, :rs # copies reg to special reg HI
addop 'mflo', 0b010010, :rd # copies special reg LO to reg
addop 'mtlo', 0b010011, :rs # copies reg to special reg LO
addop 'mult', 0b011000, :rs, :rt # multiplies the registers and store the result in HI:LO
addop 'multu',0b011001, :rs, :rt
addop 'div', 0b011010, :rs, :rt
addop 'divu', 0b011011, :rs, :rt
addop 'add', 0b100000, :rd, :rs, :rt
addop 'addu', 0b100001, :rd, :rs, :rt
addop 'sub', 0b100010, :rd, :rs, :rt
addop 'subu', 0b100011, :rd, :rs, :rt
addop 'and', 0b100100, :rd, :rs, :rt
addop 'or', 0b100101, :rd, :rs, :rt
addop 'xor', 0b100110, :rd, :rs, :rt
addop 'nor', 0b100111, :rd, :rs, :rt
addop 'slt', 0b101010, :rd, :rs, :rt # rs<rt ? rd<-1 : rd<-0
addop 'sltu', 0b101011, :rd, :rs, :rt
addop 'tge', 0b110000, :rs, :rt # rs >= rt ? trap
addop 'tgeu', 0b110001, :rs, :rt
addop 'tlt', 0b110010, :rs, :rt
addop 'tltu', 0b110011, :rs, :rt
addop 'teq', 0b110100, :rs, :rt
addop 'tne', 0b110110, :rs, :rt
# regimm
addop 'bltz', (1<<26) | (0b00000<<16), :rs, :i16, :setip
addop 'bgez', (1<<26) | (0b00001<<16), :rs, :i16, :setip
addop 'tgei', (1<<26) | (0b01000<<16), :rs, :i16, :setip
addop 'tgeiu',(1<<26) | (0b01001<<16), :rs, :i16, :setip
addop 'tlti', (1<<26) | (0b01010<<16), :rs, :i16, :setip
addop 'tltiu',(1<<26) | (0b01011<<16), :rs, :i16, :setip
addop 'teqi', (1<<26) | (0b01100<<16), :rs, :i16, :setip
addop 'tnei', (1<<26) | (0b01110<<16), :rs, :i16, :setip
addop 'bltzal', (1<<26) | (0b10000<<16), :rs, :i16, :setip
addop 'bgezal', (1<<26) | (0b10001<<16), :rs, :i16, :setip
# special2
addop 'madd', (0b011100<<26) | 0b000000, :rs, :rt
addop 'maddu',(0b011100<<26) | 0b000001, :rs, :rt
addop 'mul', (0b011100<<26) | 0b000010, :rd, :rs, :rt
addop 'msub', (0b011100<<26) | 0b000100, :rs, :rt
addop 'msubu',(0b011100<<26) | 0b000101, :rs, :rt
addop 'clz', (0b011100<<26) | 0b100000, :rd, :rs, :rt # must have rs == rt
addop 'clo', (0b011100<<26) | 0b100001, :rd, :rs, :rt # must have rs == rt
addop 'sdbbp',(0b011100<<26) | 0b111111, :i20
# cp0
addop 'mfc0', (0b010000<<26) | (0b00000<<21), :rt, :rd
addop 'mfc0', (0b010000<<26) | (0b00000<<21), :rt, :rd, :sel
addop 'mtc0', (0b010000<<26) | (0b00100<<21), :rt, :rd
addop 'mtc0', (0b010000<<26) | (0b00100<<21), :rt, :rd, :sel
addop 'tlbr', (0b010000<<26) | (1<<25) | 0b000001
addop 'tlbwi',(0b010000<<26) | (1<<25) | 0b000010
addop 'tlbwr',(0b010000<<26) | (1<<25) | 0b000110
addop 'tlbp', (0b010000<<26) | (1<<25) | 0b001000
addop 'eret', (0b010000<<26) | (1<<25) | 0b011000
addop 'deret',(0b010000<<26) | (1<<25) | 0b011111
addop 'wait', (0b010000<<26) | (1<<25) | 0b100000 # mode field ?
end
def init_mips32r2
init_mips32
addop 'rotr', 0b000010 | (1<<21), :rd, :rt, :sa
addop 'rotrv',0b000110 | (1<<6), :rd, :rt, :rs
addop 'synci',(1<<26) | (0b11111<<16), :rs_i16
# special3
addop 'ext', (0b011111<<26) | 0b000000, :rt, :rs, :sa, :idm1
addop 'ins', (0b011111<<26) | 0b000100, :rt, :rs, :sa, :idb
addop 'rdhwr',(0b011111<<26)| 0b111011, :rt, :rd
addop 'wsbh',(0b011111<<26) | (0b00010<<6) | 0b100000, :rd, :rt
addop 'seb', (0b011111<<26) | (0b10000<<6) | 0b100000, :rd, :rt
addop 'seh', (0b011111<<26) | (0b11000<<6) | 0b100000, :rd, :rt
# cp0
addop 'rdpgpr', (0b010000<<26) | (0b01010<<21), :rd, :rt
addop 'wrpgpr', (0b010000<<26) | (0b01110<<21), :rd, :rt
addop 'di', (0b010000<<26) | (0b01011<<21) | (0b01100<<11) | (0<<5)
addop 'di', (0b010000<<26) | (0b01011<<21) | (0b01100<<11) | (0<<5), :rt
addop 'ei', (0b010000<<26) | (0b01011<<21) | (0b01100<<11) | (1<<5)
addop 'ei', (0b010000<<26) | (0b01011<<21) | (0b01100<<11) | (1<<5), :rt
end
end
end
__END__
def macro_addop_cop1(name, bin, *aprops)
flds = [ :rt, :fs ]
addop name, :cop1, bin, 'rt, fs', flds, *aprops
end
def macro_addop_cop1_precision(name, type, bin, fmt, *aprops)
flds = [ :ft, :fs, :fd ]
addop name+'.'+(type.to_s[5,7]), type, bin, fmt, flds, *aprops
end
public
# Initialize the instruction set with the MIPS32 Instruction Set
def init_mips32
:cc => [7, 18, :fpcc],
:op => [0x1F, 16, :op ], :cp2_rt => [0x1F, 16, :cp2_reg ],
:stype => [0x1F, 6, :imm ],
:code => [0xFFFFF, 6, :code ],
:sel => [3, 0, :sel ]})
# ---------------------------------------------------------------
# COP0, field rs
# ---------------------------------------------------------------
addop 'mfc0', :cop0, 0b00000, 'rt, rd, sel', [ :rt, :rd, :sel ]
addop 'mtc0', :cop0, 0b00100, 'rt, rd, sel', [ :rt, :rd, :sel ]
# ---------------------------------------------------------------
# COP0 when rs=C0
# ---------------------------------------------------------------
macro_addop_cop0_c0 'tlbr', 0b000001
macro_addop_cop0_c0 'tlbwi', 0b000010
macro_addop_cop0_c0 'tlbwr', 0b000110
macro_addop_cop0_c0 'tlbp', 0b001000
macro_addop_cop0_c0 'eret', 0b011000
macro_addop_cop0_c0 'deret', 0b011111
macro_addop_cop0_c0 'wait', 0b100000
# ---------------------------------------------------------------
# COP1, field rs
# ---------------------------------------------------------------
macro_addop_cop1 'mfc1', 0b00000
macro_addop_cop1 'cfc1', 0b00010
macro_addop_cop1 'mtc1', 0b00100
macro_addop_cop1 'ctc1', 0b00110
addop "bc1f", :cop1, 0b01000, 'cc, off', [ :cc, :off ], :diff_bits, [ 16, 3, 0 ]
addop "bc1fl", :cop1, 0b01000, 'cc, off', [ :cc, :off ], :diff_bits, [ 16, 3, 2 ]
addop "bc1t", :cop1, 0b01000, 'cc, off', [ :cc, :off ], :diff_bits, [ 16, 3, 1 ]
addop "bc1tl", :cop1, 0b01000, 'cc, off', [ :cc, :off ], :diff_bits, [ 16, 3, 3 ]
# ---------------------------------------------------------------
# COP1, field rs=S/D
# ---------------------------------------------------------------
[ :cop1_s, :cop1_d ].each do |type|
type_str = type.to_s[5,7]
macro_addop_cop1_precision 'add', type, 0b000000, 'fd, fs, ft'
macro_addop_cop1_precision 'sub', type, 0b000001, 'fd, fs, ft'
macro_addop_cop1_precision 'mul', type, 0b000010, 'fd, fs, ft'
macro_addop_cop1_precision 'abs', type, 0b000101, 'fd, fs', :ft_zero
macro_addop_cop1_precision 'mov', type, 0b000110, 'fd, fs', :ft_zero
macro_addop_cop1_precision 'neg', type, 0b000111, 'fd, fs', :ft_zero
macro_addop_cop1_precision 'movz', type, 0b010010, 'fd, fs, ft'
macro_addop_cop1_precision 'movn', type, 0b010011, 'fd, fs, ft'
addop "movf.#{type_str}", type, 0b010001, 'fd, fs, cc', [ :cc, :fs, :fd ], :diff_bits, [ 16, 1, 0 ]
addop "movt.#{type_str}", type, 0b010001, 'fd, fs, cc', [ :cc, :fs, :fd ], :diff_bits, [ 16, 1, 1 ]
%w(f un eq ueq olt ult ole ule sf ngle seq ngl lt nge le ngt).each_with_index do |cond, index|
addop "c.#{cond}.#{type_str}", type, 0b110000+index, 'cc, fs, ft',
[ :ft, :fs, :cc ]
end
end
# S and D Without PS
[:cop1_s, :cop1_d].each do |type|
macro_addop_cop1_precision 'div', type, 0b000011, 'fd, fs, ft'
macro_addop_cop1_precision 'sqrt', type, 0b000100, 'fd, fs', :ft_zero
macro_addop_cop1_precision 'round.w', type, 0b001100, 'fd, fs', :ft_zero
macro_addop_cop1_precision 'trunc.w', type, 0b001101, 'fd, fs', :ft_zero
macro_addop_cop1_precision 'ceil.w', type, 0b001110, 'fd, fs', :ft_zero
macro_addop_cop1_precision 'floor.w', type, 0b001111, 'fd, fs', :ft_zero
end
# COP2 is not decoded (pretty useless)
[:cop1_d,:cop1_w].each { |type| macro_addop_cop1_precision 'cvt.s', type, 0b100000, 'fd, fs', :ft_zero }
[:cop1_s,:cop1_w].each { |type| macro_addop_cop1_precision 'cvt.d', type, 0b100001, 'fd, fs', :ft_zero }
[:cop1_s,:cop1_d].each { |type| macro_addop_cop1_precision 'cvt.w', type, 0b100100, 'fd, fs', :ft_zero }
[ :normal, :special, :regimm, :special2, :cop0, :cop0_c0, :cop1, :cop1_s,
:cop1_d, :cop1_w ].each \
{ |t| @@opcodes_by_class[t] = opcode_list.find_all { |o| o.type == t } }
end
# Initialize the instruction set with the MIPS32 Instruction Set Release 2
def init_mips64
init_mips32
#SPECIAL
macro_addop_special "rotr", 0b000010, 'rd, rt, sa', :diff_bits, [ 26, 1, 1 ]
macro_addop_special "rotrv", 0b000110, 'rd, rt, rs', :diff_bits, [ 6, 1, 1 ]
# REGIMM
addop "synci", :regimm, 0b11111, '', {:base => [5,21], :off => [16, 0] }
# ---------------------------------------------------------------
# SPECIAL3 opcode encoding of function field
# ---------------------------------------------------------------
addop "ext", :special3, 0b00000, 'rt, rs, pos, size', { :rs => [5, 21], :rt => [5, 16],
:msbd => [5, 11], :lsb => [5, 6] }
addop "ins", :special3, 0b00100, 'rt, rs, pos, size', { :rs => [5, 21], :rt => [5, 16],
:msb => [5, 11], :lsb => [5, 6] }
addop "rdhwr", :special3, 0b111011, 'rt, rd', { :rt => [5, 16], :rd => [5, 11] }
addop "wsbh", :bshfl, 0b00010, 'rd, rt', { :rt => [5, 16], :rd => [5, 11] }
addop "seb", :bshfl, 0b10000, 'rd, rt', { :rt => [5, 16], :rd => [5, 11] }
addop "seh", :bshfl, 0b11000, 'rd, rt', { :rt => [5, 16], :rd => [5, 11] }
# ---------------------------------------------------------------
# COP0
# ---------------------------------------------------------------
addop "rdpgpr", :cop0, 0b01010, 'rt, rd', {:rt => [5, 16], :rd => [5, 11] }
addop "wrpgpr", :cop0, 0b01110, 'rt, rd', {:rt => [5, 16], :rd => [5, 11] }
addop "di", :cop0, 0b01011, '', {}, :diff_bits, [ 5, 1 , 0]
addop "ei", :cop0, 0b01011, '', {}, :diff_bits, [ 5, 1 , 1]
# ---------------------------------------------------------------
# COP1, field rs
# ---------------------------------------------------------------
macro_addop_cop1 "mfhc1", 0b00011
macro_addop_cop1 "mthc1", 0b00111
# Floating point
[:cop1_s, :cop1_d].each do |type|
macro_addop_cop1_precision 'round.l', type, 0b001000, 'fd, fs', :ft_zero
macro_addop_cop1_precision 'trunc.l', type, 0b001001, 'fd, fs', :ft_zero
macro_addop_cop1_precision 'ceil.l', type, 0b001010, 'fd, fs', :ft_zero
macro_addop_cop1_precision 'floor.l', type, 0b001011, 'fd, fs', :ft_zero
macro_addop_cop1_precision 'recip', type, 0b010101, 'fd, fs', :ft_zero
macro_addop_cop1_precision 'rsqrt', type, 0b010110, 'fd, fs', :ft_zero
macro_addop_cop1_precision 'cvt.l', type, 0b100101, 'fd, fs', :ft_zero
end
macro_addop_cop1_precision 'cvt.ps', :cop1_s, 0b100110, 'fd, fs', :ft_zero
macro_addop_cop1_precision 'cvt.s', :cop1_l, 0b100000, 'fd, fs', :ft_zero
macro_addop_cop1_precision 'cvt.d', :cop1_l, 0b100001, 'fd, fs', :ft_zero
macro_addop_cop1_precision 'add', :cop1_ps, 0b000000, 'fd, fs, ft'
macro_addop_cop1_precision 'sub', :cop1_ps, 0b000001, 'fd, fs, ft'
macro_addop_cop1_precision 'mul', :cop1_ps, 0b000010, 'fd, fs, ft'
macro_addop_cop1_precision 'abs', :cop1_ps, 0b000101, 'fd, fs', :ft_zero
macro_addop_cop1_precision 'mov', :cop1_ps, 0b000110, 'fd, fs', :ft_zero
macro_addop_cop1_precision 'neg', :cop1_ps, 0b000111, 'fd, fs', :ft_zero
macro_addop_cop1_precision 'movz', :cop1_ps, 0b010010, 'fd, fs, ft'
macro_addop_cop1_precision 'movn', :cop1_ps, 0b010011, 'fd, fs, ft'
addop "movf.ps", :cop1_ps, 0b010001, 'fd, fs, cc', [ :cc, :fs, :fd ], :diff_bits, [ 16, 1, 0 ]
addop "movt.ps", :cop1_ps, 0b010001, 'fd, fs, cc', [ :cc, :fs, :fd ], :diff_bits, [ 16, 1, 1 ]
%w(f un eq ueq olt ult ole ule sf ngle seq ngl lt nge le ngt).each_with_index do |cond, index|
addop "c.#{cond}.ps", :cop1_cond, 0b110000+index, 'cc, fs, ft',
[ :ft, :fs, :cc ]
# TODO: COP1X
[ :special3, :bshfl, :cop1_l, :cop1_ps ].each \
{ |t| @@opcodes_by_class[t] = opcode_list.find_all { |o| o.type == t } }
end
end
# Reset all instructions
def reset
metaprops_allowed.clear
args_allowed.clear
props_allowed.clear
fields_spec.clear
opcode_list.clear
end
end
# Array containing all the supported opcodes
attr_reader :opcode_list
init_mips32
end
end
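
The live addop tables in this file (before __END__) pair a fixed bit pattern with a list of operand fields. A minimal sketch of how such a pattern could be combined with register numbers into a 32-bit word is given here. FIELD_SHIFT and encode are hypothetical names for illustration only, not Metasm's API; the shifts are the standard MIPS32 R-type field positions, which this file does not show.

```ruby
# Sketch only: FIELD_SHIFT and encode are hypothetical, not part of Metasm.
# Shifts follow the standard MIPS32 R-type layout (an assumption here,
# since the file's own field table is not shown in this section).
FIELD_SHIFT = { :rs => 21, :rt => 16, :rd => 11, :sa => 6 }

# OR each 5-bit register number into the fixed opcode bits.
def encode(base, fields)
  fields.inject(base) { |word, (name, regnum)|
    word | ((regnum & 0x1f) << FIELD_SHIFT.fetch(name))
  }
end

# Same constant as the 'mul' addop line: major opcode 0b011100, function 0b000010
MUL_BASE = (0b011100 << 26) | 0b000010

# mul $v0, $a0, $a1  (rd = 2, rs = 4, rt = 5)
word = encode(MUL_BASE, :rd => 2, :rs => 4, :rt => 5)
puts '%08x' % word   # prints 70851002
```

Keeping the fixed bits and the field list separate, as the addop calls do, lets one table serve both encoding and decoding.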

@ -1,51 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
require 'metasm/mips/opcodes'
require 'metasm/parse'
module Metasm
class MIPS
def parse_arg_valid?(op, sym, arg)
# special case for lw reg, imm32(reg) ? (pseudo-instr, need to convert to 'lui t0, up imm32 ; ori t0, down imm32 ; add t0, reg ; lw reg, 0(t0)')
case sym
when :rs, :rt, :rd; arg.class == Reg
when :sa, :i16, :i20, :i26; arg.kind_of? Expression
when :rs_i16; arg.class == Memref
when :ft; arg.class == FpReg
else raise "internal error: mips arg #{sym.inspect}"
end
end
def parse_argument(pgm)
pgm.skip_space
return if not tok = pgm.nexttok
if tok.type == :string and Reg.s_to_i[tok.raw]
pgm.readtok
arg = Reg.new Reg.s_to_i[tok.raw]
elsif tok.type == :string and FpReg.s_to_i[tok.raw]
pgm.readtok
arg = FpReg.new FpReg.s_to_i[tok.raw]
else
arg = Expression.parse pgm
pgm.skip_space
# check memory indirection: 'off(base reg)' # XXX scaled index ?
if arg and pgm.nexttok and pgm.nexttok.type == :punct and pgm.nexttok.raw == '('
pgm.readtok
pgm.skip_space_eol
ntok = pgm.readtok
raise tok, "Invalid base #{ntok}" unless ntok and ntok.type == :string and Reg.s_to_i[ntok.raw]
base = Reg.new Reg.s_to_i[ntok.raw]
pgm.skip_space_eol
ntok = pgm.readtok
raise tok, "Invalid memory reference, ')' expected" if not ntok or ntok.type != :punct or ntok.raw != ')'
arg = Memref.new base, arg
end
end
arg
end
end
end
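
The memory-indirection branch of parse_argument accepts 'off(base reg)'. A standalone sketch of the same idea, using a regular expression instead of Metasm's tokenizer, could look as follows. REGS and parse_memref are hypothetical names, the register list is the usual o32 naming (an assumption; Reg.s_to_i is not shown here), and unlike the original only integer literal offsets are handled, not full expressions.

```ruby
# Sketch only: REGS and parse_memref are hypothetical, not part of Metasm.
# Register names assume the conventional o32 ABI ordering.
REGS = %w[zero at v0 v1 a0 a1 a2 a3 t0 t1 t2 t3 t4 t5 t6 t7
          s0 s1 s2 s3 s4 s5 s6 s7 t8 t9 k0 k1 gp sp fp ra]

# Parses 'offset(base)' and returns [offset, base register number].
def parse_memref(str)
  m = /\A\s*(-?(?:0x[0-9a-fA-F]+|\d+))\s*\(\s*\$?(\w+)\s*\)\s*\z/.match(str)
  raise ArgumentError, "Invalid memory reference: #{str.inspect}" unless m
  base = REGS.index(m[2])
  raise ArgumentError, "Invalid base #{m[2]}" unless base
  [Integer(m[1]), base]
end

p parse_memref('8($sp)')        # prints [8, 29]
p parse_memref('-0x10( t0 )')   # prints [-16, 8]
```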

@ -1,43 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
require 'metasm/mips/opcodes'
require 'metasm/render'
module Metasm
class MIPS
class Reg
include Renderable
def render ; [self.class.i_to_s[@i]] end
end
class FpReg
include Renderable
def render ; [self.class.i_to_s[@i]] end
end
class Memref
include Renderable
def render ; [@offset, '(', @base, ')'] end
end
def render_instruction(i)
r = []
r << i.opname
if not i.args.empty?
r << ' '
if (a = i.args.first).kind_of? Expression and a.op == :- and a.lexpr.kind_of? String and a.rexpr.kind_of? String and opcode_list_byname[i.opname].first.props[:setip]
# jmp foo is stored as jmp foo - bar ; bar:
r << a.lexpr
else
i.args.each { |a|
r << a << ', '
}
r.pop
end
end
r
end
end
end

@ -1,479 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
require 'metasm/os/main'
module Metasm
class PTrace32
attr_reader :buf, :pid
def self.open(target)
ptrace = new(target)
return ptrace if not block_given?
ret = yield ptrace
ptrace.detach
ret
end
# creates a ptraced process (target = path)
# or opens a running process (target = pid)
def initialize(target)
@buf = [0].pack('l')
@bufptr = [@buf].pack('P').unpack('l').first
begin
@pid = Integer(target)
attach
rescue ArgumentError
if not @pid = fork
traceme
exec target
end
end
Process.wait(@pid)
puts "Ptrace: attached to #@pid" if $DEBUG
end
# interprets the returned value as a signed native long (unpack('l'))
def bufval
@buf.unpack('l').first
end
# reads a memory range
def readmem(off, len)
decal = off & 3
buf = ''
if decal > 0
off -= decal
peekdata(off)
off += 4
buf << @buf[decal..3]
end
offend = off + len
while off < offend
peekdata(off)
buf << @buf[0, 4]
off += 4
end
buf[0, len]
end
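# writes a string to memory, 4 bytes at a time
# unaligned head and tail are padded with the bytes already present in the target
def writemem(off, str)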
decal = off & 3
if decal > 0
off -= decal
peekdata(off)
str = @buf[0...decal] + str
end
decal = str.length & 3
if decal > 0
peekdata(off+str.length-decal)
str += @buf[decal..3]
end
i = 0
while i < str.length
pokedata(off+i, str[i, 4])
i += 4
end
end
# linux/ptrace.h
COMMAND = {
'TRACEME' => 0, 'PEEKTEXT' => 1,
'PEEKDATA' => 2, 'PEEKUSR' => 3,
'POKETEXT' => 4, 'POKEDATA' => 5,
'POKEUSR' => 6, 'CONT' => 7,
'KILL' => 8, 'SINGLESTEP' => 9,
'ATTACH' => 16, 'DETACH' => 17,
'SYSCALL' => 24,
# i486-asm/ptrace.h
# Arbitrarily choose the same ptrace numbers as used by the Sparc code.
'GETREGS' => 12, 'SETREGS' => 13,
'GETFPREGS' => 14, 'SETFPREGS' => 15,
'GETFPXREGS' => 18, 'SETFPXREGS' => 19,
'OLDSETOPTIONS' => 21, 'GET_THREAD_AREA' => 25,
'SET_THREAD_AREA' => 26, 'SYSEMU' => 31,
'SYSEMU_SINGLESTEP'=> 32,
# 0x4200-0x4300 are reserved for architecture-independent additions.
'SETOPTIONS' => 0x4200, 'GETEVENTMSG' => 0x4201,
'GETSIGINFO' => 0x4202, 'SETSIGINFO' => 0x4203
}
OPTIONS = {
# options set using PTRACE_SETOPTIONS
'TRACESYSGOOD' => 0x01, 'TRACEFORK' => 0x02,
'TRACEVFORK' => 0x04, 'TRACECLONE' => 0x08,
'TRACEEXEC' => 0x10, 'TRACEVFORKDONE'=> 0x20,
'TRACEEXIT' => 0x40
}
WAIT_EXTENDEDRESULT = {
# Wait extended result codes for the above trace options.
'EVENT_FORK' => 1, 'EVENT_VFORK' => 2,
'EVENT_CLONE' => 3, 'EVENT_EXEC' => 4,
'EVENT_VFORK_DONE' => 5, 'EVENT_EXIT' => 6
}
REGS_I386 = {
'EBX' => 0, 'ECX' => 1, 'EDX' => 2, 'ESI' => 3,
'EDI' => 4, 'EBP' => 5, 'EAX' => 6, 'DS' => 7,
'ES' => 8, 'FS' => 9, 'GS' => 10, 'ORIG_EAX' => 11,
'EIP' => 12, 'CS' => 13, 'EFL' => 14, 'UESP'=> 15,
'EFLAGS' => 14, 'ESP' => 15,
'SS' => 16, 'FRAME_SIZE' => 17,
# from ptrace.c in kernel source & asm-i386/user.h
'DR0' => 63, 'DR1' => 64, 'DR2' => 65, 'DR3' => 66,
'DR4' => 67, 'DR5' => 68, 'DR6' => 69, 'DR7' => 70
}
# this struct defines the way the registers are stored on the stack during a system call.
# struct pt_regs {
# long ebx; long ecx; long edx; long esi;
# long edi; long ebp; long eax; int xds;
# int xes; long orig_eax; long eip; int xcs;
# long eflags; long esp; int xss;
# };
SYSCALLNR = {
'restart_syscall' => 0, 'exit' => 1, 'fork' => 2, 'read' => 3,
'write' => 4, 'open' => 5, 'close' => 6, 'waitpid' => 7,
'creat' => 8, 'link' => 9, 'unlink' => 10, 'execve' => 11,
'chdir' => 12, 'time' => 13, 'mknod' => 14, 'chmod' => 15,
'lchown' => 16, 'break' => 17, 'oldstat' => 18, 'lseek' => 19,
'getpid' => 20, 'mount' => 21, 'umount' => 22, 'setuid' => 23,
'getuid' => 24, 'stime' => 25, 'ptrace' => 26, 'alarm' => 27,
'oldfstat' => 28, 'pause' => 29, 'utime' => 30, 'stty' => 31,
'gtty' => 32, 'access' => 33, 'nice' => 34, 'ftime' => 35,
'sync' => 36, 'kill' => 37, 'rename' => 38, 'mkdir' => 39,
'rmdir' => 40, 'dup' => 41, 'pipe' => 42, 'times' => 43,
'prof' => 44, 'brk' => 45, 'setgid' => 46, 'getgid' => 47,
'signal' => 48, 'geteuid' => 49, 'getegid' => 50, 'acct' => 51,
'umount2' => 52, 'lock' => 53, 'ioctl' => 54, 'fcntl' => 55,
'mpx' => 56, 'setpgid' => 57, 'ulimit' => 58, 'oldolduname' => 59,
'umask' => 60, 'chroot' => 61, 'ustat' => 62, 'dup2' => 63,
'getppid' => 64, 'getpgrp' => 65, 'setsid' => 66, 'sigaction' => 67,
'sgetmask' => 68, 'ssetmask' => 69, 'setreuid' => 70, 'setregid' => 71,
'sigsuspend' => 72, 'sigpending' => 73, 'sethostname' => 74, 'setrlimit' => 75,
'getrlimit' => 76, 'getrusage' => 77, 'gettimeofday' => 78, 'settimeofday' => 79,
'getgroups' => 80, 'setgroups' => 81, 'select' => 82, 'symlink' => 83,
'oldlstat' => 84, 'readlink' => 85, 'uselib' => 86, 'swapon' => 87,
'reboot' => 88, 'readdir' => 89, 'mmap' => 90, 'munmap' => 91,
'truncate' => 92, 'ftruncate' => 93, 'fchmod' => 94, 'fchown' => 95,
'getpriority' => 96, 'setpriority' => 97, 'profil' => 98, 'statfs' => 99,
'fstatfs' => 100, 'ioperm' => 101, 'socketcall' => 102, 'syslog' => 103,
'setitimer' => 104, 'getitimer' => 105, 'stat' => 106, 'lstat' => 107,
'fstat' => 108, 'olduname' => 109, 'iopl' => 110, 'vhangup' => 111,
'idle' => 112, 'vm86old' => 113, 'wait4' => 114, 'swapoff' => 115,
'sysinfo' => 116, 'ipc' => 117, 'fsync' => 118, 'sigreturn' => 119,
'clone' => 120, 'setdomainname' => 121, 'uname' => 122, 'modify_ldt' => 123,
'adjtimex' => 124, 'mprotect' => 125, 'sigprocmask' => 126, 'create_module' => 127,
'init_module' => 128, 'delete_module' => 129, 'get_kernel_syms' => 130, 'quotactl' => 131,
'getpgid' => 132, 'fchdir' => 133, 'bdflush' => 134, 'sysfs' => 135,
'personality' => 136, 'afs_syscall' => 137, 'setfsuid' => 138, 'setfsgid' => 139,
'_llseek' => 140, 'getdents' => 141, '_newselect' => 142, 'flock' => 143,
'msync' => 144, 'readv' => 145, 'writev' => 146, 'getsid' => 147,
'fdatasync' => 148, '_sysctl' => 149, 'mlock' => 150, 'munlock' => 151,
'mlockall' => 152, 'munlockall' => 153, 'sched_setparam' => 154, 'sched_getparam' => 155,
'sched_setscheduler' => 156, 'sched_getscheduler' => 157, 'sched_yield' => 158, 'sched_get_priority_max' => 159,
'sched_get_priority_min' => 160, 'sched_rr_get_interval' => 161, 'nanosleep' => 162, 'mremap' => 163,
'setresuid' => 164, 'getresuid' => 165, 'vm86' => 166, 'query_module' => 167,
'poll' => 168, 'nfsservctl' => 169, 'setresgid' => 170, 'getresgid' => 171,
'prctl' => 172, 'rt_sigreturn' => 173, 'rt_sigaction' => 174, 'rt_sigprocmask' => 175,
'rt_sigpending' => 176, 'rt_sigtimedwait' => 177, 'rt_sigqueueinfo' => 178, 'rt_sigsuspend' => 179,
'pread64' => 180, 'pwrite64' => 181, 'chown' => 182, 'getcwd' => 183,
'capget' => 184, 'capset' => 185, 'sigaltstack' => 186, 'sendfile' => 187,
'getpmsg' => 188, 'putpmsg' => 189, 'vfork' => 190, 'ugetrlimit' => 191,
'mmap2' => 192, 'truncate64' => 193, 'ftruncate64' => 194, 'stat64' => 195,
'lstat64' => 196, 'fstat64' => 197, 'lchown32' => 198, 'getuid32' => 199,
'getgid32' => 200, 'geteuid32' => 201, 'getegid32' => 202, 'setreuid32' => 203,
'setregid32' => 204, 'getgroups32' => 205, 'setgroups32' => 206, 'fchown32' => 207,
'setresuid32' => 208, 'getresuid32' => 209, 'setresgid32' => 210, 'getresgid32' => 211,
'chown32' => 212, 'setuid32' => 213, 'setgid32' => 214, 'setfsuid32' => 215,
'setfsgid32' => 216, 'pivot_root' => 217, 'mincore' => 218, 'madvise' => 219,
'getdents64' => 220, 'fcntl64' => 221, 'gettid' => 224, 'readahead' => 225,
'setxattr' => 226, 'lsetxattr' => 227, 'fsetxattr' => 228, 'getxattr' => 229,
'lgetxattr' => 230, 'fgetxattr' => 231, 'listxattr' => 232, 'llistxattr' => 233,
'flistxattr' => 234, 'removexattr' => 235, 'lremovexattr' => 236, 'fremovexattr' => 237,
'tkill' => 238, 'sendfile64' => 239, 'futex' => 240, 'sched_setaffinity' => 241,
'sched_getaffinity' => 242, 'set_thread_area' => 243, 'get_thread_area' => 244, 'io_setup' => 245,
'io_destroy' => 246, 'io_getevents' => 247, 'io_submit' => 248, 'io_cancel' => 249,
'fadvise64' => 250, 'exit_group' => 252, 'lookup_dcookie' => 253,
'epoll_create' => 254, 'epoll_ctl' => 255, 'epoll_wait' => 256, 'remap_file_pages' => 257,
'set_tid_address' => 258, 'timer_create' => 259, 'timer_settime' => 260, 'timer_gettime' => 261,
'timer_getoverrun' => 262, 'timer_delete' => 263, 'clock_settime' => 264, 'clock_gettime' => 265,
'clock_getres' => 266, 'clock_nanosleep' => 267, 'statfs64' => 268, 'fstatfs64' => 269,
'tgkill' => 270, 'utimes' => 271, 'fadvise64_64' => 272, 'vserver' => 273,
'mbind' => 274, 'get_mempolicy' => 275, 'set_mempolicy' => 276, 'mq_open' => 277,
'mq_unlink' => 278, 'mq_timedsend' => 279, 'mq_timedreceive' => 280, 'mq_notify' => 281,
'mq_getsetattr' => 282, 'kexec_load' => 283, 'waitid' => 284, 'sys_setaltroot' => 285,
'add_key' => 286, 'request_key' => 287, 'keyctl' => 288, 'ioprio_set' => 289,
'ioprio_get' => 290, 'inotify_init' => 291, 'inotify_add_watch' => 292, 'inotify_rm_watch' => 293,
'migrate_pages' => 294, 'openat' => 295, 'mkdirat' => 296, 'mknodat' => 297,
'fchownat' => 298, 'futimesat' => 299, 'fstatat64' => 300, 'unlinkat' => 301,
'renameat' => 302, 'linkat' => 303, 'symlinkat' => 304, 'readlinkat' => 305,
'fchmodat' => 306, 'faccessat' => 307, 'pselect6' => 308, 'ppoll' => 309,
'unshare' => 310, 'set_robust_list' => 311, 'get_robust_list' => 312, 'splice' => 313,
'sync_file_range' => 314, 'tee' => 315, 'vmsplice' => 316, 'move_pages' => 317,
'getcpu' => 318, 'epoll_pwait' => 319, 'utimensat' => 320, 'signalfd' => 321,
'timerfd' => 322, 'eventfd' => 323 }
def ptrace(req, pid, addr, data)
addr = [addr].pack('L').unpack('l').first if addr >= 0x8000_0000
Kernel.syscall(26, req, pid, addr, data)
end
def traceme
ptrace(COMMAND['TRACEME'], 0, 0, 0)
end
def peektext(addr)
ptrace(COMMAND['PEEKTEXT'], @pid, addr, @bufptr)
@buf
end
def peekdata(addr)
ptrace(COMMAND['PEEKDATA'], @pid, addr, @bufptr)
@buf
end
def peekusr(addr)
ptrace(COMMAND['PEEKUSR'], @pid, 4*addr, @bufptr)
bufval
end
def poketext(addr, data)
ptrace(COMMAND['POKETEXT'], @pid, addr, data.unpack('l').first)
end
def pokedata(addr, data)
ptrace(COMMAND['POKEDATA'], @pid, addr, data.unpack('l').first)
end
def pokeusr(addr, data)
ptrace(COMMAND['POKEUSR'], @pid, 4*addr, data)
end
def cont(sig = 0)
ptrace(COMMAND['CONT'], @pid, 0, sig)
end
def kill
ptrace(COMMAND['KILL'], @pid, 0, 0)
end
def singlestep(sig = 0)
ptrace(COMMAND['SINGLESTEP'], @pid, 0, sig)
end
def syscall
ptrace(COMMAND['SYSCALL'], @pid, 0, 0)
end
def attach
ptrace(COMMAND['ATTACH'], @pid, 0, 0)
end
def detach
ptrace(COMMAND['DETACH'], @pid, 0, 0)
end
end
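
readmem above builds an arbitrary byte range out of 4-byte aligned peeks. The sketch below shows that technique against an in-memory string rather than a traced process. FAKE_MEM, peek_word and read_range are hypothetical names, and the loop is a simplified single-pass variant, not the original code.

```ruby
# Sketch only: FAKE_MEM and peek_word stand in for the traced process
# and PTRACE_PEEKDATA; none of these names exist in Metasm.
FAKE_MEM = (0...32).map { |i| i.chr }.join   # bytes 0x00..0x1f

# Returns the 4 bytes at a word-aligned address.
def peek_word(addr)
  raise ArgumentError, 'unaligned peek' unless addr & 3 == 0
  FAKE_MEM[addr, 4]
end

# Reads len bytes starting at off using only aligned 4-byte peeks.
def read_range(off, len)
  skip = off & 3            # bytes to drop from the first word
  addr = off - skip         # round down to a word boundary
  buf = String.new
  while addr < off + len
    buf << peek_word(addr)
    addr += 4
  end
  buf[skip, len]            # trim the unaligned head and the extra tail
end

p read_range(5, 6).unpack('C*')   # prints [5, 6, 7, 8, 9, 10]
```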
class LinuxRemoteString < VirtualString
attr_accessor :pid, :readfd, :invalid_addr
attr_accessor :ptrace
# returns a virtual string proxying the specified process memory range
# reads are cached (4096 aligned bytes read at once), from /proc/pid/mem
# writes are done directly by ptrace
# XXX could fall back to ptrace if no /proc/pid...
def initialize(pid, addr_start=0, length=0xffff_ffff, ptrace=nil)
@pid = pid
@readfd = File.open("/proc/#@pid/mem")
@ptrace = ptrace if ptrace
@invalid_addr = false
super(addr_start, length)
end
def dup(addr = @addr_start, len = @length)
self.class.new(@pid, addr, len, ptrace)
end
def do_ptrace
if ptrace
yield @ptrace
else
PTrace32.open(@pid) { |ptrace| yield ptrace }
end
end
def rewrite_at(addr, data)
# target must be stopped
do_ptrace { |ptrace| ptrace.writemem(addr, data) }
end
def get_page(addr)
@readfd.pos = addr
# target must be stopped
do_ptrace {
begin
@readfd.read 4096
rescue Errno::EIO
nil
end
}
end
def realstring
super
@readfd.pos = @addr_start
do_ptrace { @readfd.read @length }
end
end
class GNUExports
# exported symbol name => exporting library name for common libraries
# used by ELF#automagic_symbols
EXPORT = {}
# see samples/elf_listexports for the generator of this data
data = <<EOL # XXX libraries do not support __END__/DATA...
libc.so.6
_IO_adjust_column _IO_adjust_wcolumn _IO_default_doallocate _IO_default_finish _IO_default_pbackfail _IO_default_uflow _IO_default_xsgetn _IO_default_xsputn
_IO_do_write _IO_do_write _IO_doallocbuf _IO_fclose _IO_fclose _IO_fdopen _IO_fdopen _IO_feof _IO_ferror _IO_fflush _IO_fgetpos _IO_fgetpos _IO_fgetpos64
_IO_fgetpos64 _IO_fgets _IO_file_attach _IO_file_attach _IO_file_close _IO_file_close_it _IO_file_close_it _IO_file_doallocate _IO_file_finish _IO_file_fopen
_IO_file_fopen _IO_file_init _IO_file_init _IO_file_open _IO_file_overflow _IO_file_overflow _IO_file_read _IO_file_seek _IO_file_seekoff _IO_file_seekoff
_IO_file_setbuf _IO_file_setbuf _IO_file_stat _IO_file_sync _IO_file_sync _IO_file_underflow _IO_file_underflow _IO_file_write _IO_file_write _IO_file_xsputn
_IO_file_xsputn _IO_flockfile _IO_flush_all _IO_flush_all_linebuffered _IO_fopen _IO_fopen _IO_fputs _IO_fread _IO_free_backup_area _IO_free_wbackup_area
_IO_fsetpos _IO_fsetpos _IO_fsetpos64 _IO_fsetpos64 _IO_ftell _IO_ftrylockfile _IO_funlockfile _IO_fwrite _IO_getc _IO_getline _IO_getline_info _IO_gets
_IO_init _IO_init_marker _IO_init_wmarker _IO_iter_begin _IO_iter_end _IO_iter_file _IO_iter_next _IO_least_wmarker _IO_link_in _IO_list_lock
_IO_list_resetlock _IO_list_unlock _IO_marker_delta _IO_marker_difference _IO_padn _IO_peekc_locked _IO_popen _IO_popen _IO_printf _IO_proc_close
_IO_proc_close _IO_proc_open _IO_proc_open _IO_putc _IO_puts _IO_remove_marker _IO_seekmark _IO_seekoff _IO_seekpos _IO_seekwmark _IO_setb _IO_setbuffer
_IO_setvbuf _IO_sgetn _IO_sprintf _IO_sputbackc _IO_sputbackwc _IO_sscanf _IO_str_init_readonly _IO_str_init_static _IO_str_overflow _IO_str_pbackfail
_IO_str_seekoff _IO_str_underflow _IO_sungetc _IO_sungetwc _IO_switch_to_get_mode _IO_switch_to_main_wget_area _IO_switch_to_wbackup_area
_IO_switch_to_wget_mode _IO_un_link _IO_ungetc _IO_unsave_markers _IO_unsave_wmarkers _IO_vfprintf _IO_vfscanf _IO_vsprintf _IO_wdefault_doallocate
_IO_wdefault_finish _IO_wdefault_pbackfail _IO_wdefault_uflow _IO_wdefault_xsgetn _IO_wdefault_xsputn _IO_wdo_write _IO_wdoallocbuf _IO_wfile_overflow
_IO_wfile_seekoff _IO_wfile_sync _IO_wfile_underflow _IO_wfile_xsputn _IO_wmarker_delta _IO_wsetb _Unwind_Find_FDE __adjtimex __argz_count __argz_next
__argz_stringify __asprintf __assert __assert_fail __assert_perror_fail __backtrace __backtrace_symbols __backtrace_symbols_fd __bsd_getpgrp __bzero
__chk_fail __clone __cmpdi2 __cmsg_nxthdr __confstr_chk __ctype_b_loc __ctype_tolower_loc __ctype_toupper_loc __cxa_atexit __cxa_finalize
__cyg_profile_func_enter __cyg_profile_func_exit __dcgettext __default_morecore __deregister_frame __deregister_frame_info __deregister_frame_info_bases
__dgettext __divdi3 __dup2 __duplocale __endmntent __errno_location __fbufsize __ffs __fgets_chk __fgets_unlocked_chk __fgetws_chk __fgetws_unlocked_chk
__finite __finitef __finitel __fixunsdfdi __fixunsxfdi __flbf __floatdidf __fork __fpending __fprintf_chk __fpurge __frame_state_for __freadable __freading
__freelocale __fsetlocking __fwprintf_chk __fwritable __fwriting __fxstat __fxstat64 __fxstat64 __fxstatat __fxstatat64 __gai_sigqueue __gconv_get_alias_db
__gconv_get_cache __gconv_get_modules_db __getcwd_chk __getdomainname_chk __getgroups_chk __gethostname_chk __getlogin_r_chk __getmntent_r __getpagesize
__getpgid __getpid __gets_chk __gettimeofday __getwd_chk __gmtime_r __h_errno_location __internal_endnetgrent __internal_getnetgrent_r __internal_setnetgrent
__isalnum_l __isalpha_l __isblank_l __iscntrl_l __isctype __isdigit_l __isgraph_l __isinf __isinff __isinfl __islower_l __isnan __isnanf __isnanl __isprint_l
__ispunct_l __isspace_l __isupper_l __iswalnum_l __iswalpha_l __iswblank_l __iswcntrl_l __iswctype __iswctype_l __iswdigit_l __iswgraph_l __iswlower_l
__iswprint_l __iswpunct_l __iswspace_l __iswupper_l __iswxdigit_l __isxdigit_l __ivaliduser __libc_allocate_rtsig __libc_allocate_rtsig_private __libc_calloc
__libc_current_sigrtmax __libc_current_sigrtmax_private __libc_current_sigrtmin __libc_current_sigrtmin_private __libc_dl_error_tsd __libc_dlclose
__libc_dlopen_mode __libc_dlsym __libc_fatal __libc_fork __libc_free __libc_freeres __libc_init_first __libc_longjmp __libc_mallinfo __libc_malloc
__libc_mallopt __libc_memalign __libc_msgrcv __libc_msgsnd __libc_pthread_init __libc_pvalloc __libc_pwrite __libc_realloc __libc_sa_len __libc_siglongjmp
__libc_start_main __libc_system __libc_thread_freeres __libc_valloc __lxstat __lxstat64 __lxstat64 __mbrlen __mbrtowc __mbsnrtowcs_chk __mbsrtowcs_chk
__mbstowcs_chk __memcpy_by2 __memcpy_by4 __memcpy_c __memcpy_chk __memcpy_g __memmove_chk __mempcpy __mempcpy_by2 __mempcpy_by4 __mempcpy_byn __mempcpy_chk
__mempcpy_small __memset_cc __memset_ccn_by2 __memset_ccn_by4 __memset_cg __memset_chk __memset_gcn_by2 __memset_gcn_by4 __memset_gg __moddi3 __modify_ldt
__monstartup __newlocale __nl_langinfo_l __nss_configure_lookup __nss_database_lookup __nss_disable_nscd __nss_group_lookup __nss_hostname_digits_dots
__nss_hosts_lookup __nss_lookup_function __nss_next __nss_passwd_lookup __nss_services_lookup __open_catalog __overflow __pipe __poll __pread64_chk
__pread_chk __printf_chk __printf_fp __profile_frequency __ptsname_r_chk __rawmemchr __read_chk __readlink_chk __readlinkat_chk __realpath_chk __recv_chk
__recvfrom_chk __register_atfork __register_frame __register_frame_info __register_frame_info_bases __register_frame_info_table
__register_frame_info_table_bases __register_frame_table __res_iclose __res_init __res_maybe_init __res_nclose __res_ninit __res_randomid __res_state
__rpc_thread_createerr __rpc_thread_svc_fdset __rpc_thread_svc_max_pollfd __rpc_thread_svc_pollfd __sbrk __sched_cpucount __sched_get_priority_max
__sched_get_priority_min __sched_getparam __sched_getscheduler __sched_setscheduler __sched_yield __secure_getenv __select __setmntent __setpgid __sigaddset
__sigdelset __sigismember __signbit __signbitf __signbitl __sigpause __sigsetjmp __sigsuspend __snprintf_chk __sprintf_chk __stack_chk_fail __statfs __stpcpy
__stpcpy_chk __stpcpy_g __stpcpy_small __stpncpy __stpncpy_chk __strcasecmp __strcasecmp_l __strcasestr __strcat_c __strcat_chk __strcat_g __strchr_c
__strchr_g __strchrnul_c __strchrnul_g __strcmp_gg __strcoll_l __strcpy_chk __strcpy_g __strcpy_small __strcspn_c1 __strcspn_c2 __strcspn_c3 __strcspn_cg
__strcspn_g __strdup __strerror_r __strfmon_l __strftime_l __strlen_g __strncasecmp_l __strncat_chk __strncat_g __strncmp_g __strncpy_by2 __strncpy_by4
__strncpy_byn __strncpy_chk __strncpy_gg __strndup __strpbrk_c2 __strpbrk_c3 __strpbrk_cg __strpbrk_g __strrchr_c __strrchr_g __strsep_1c __strsep_2c
__strsep_3c __strsep_g __strspn_c1 __strspn_c2 __strspn_c3 __strspn_cg __strspn_g __strstr_cg __strstr_g __strtod_internal __strtof_internal __strtok_r
__strtok_r_1c __strtol_internal __strtold_internal __strtoll_internal __strtoq_internal __strtoul_internal __strtoull_internal __strtouq_internal __strverscmp
__strxfrm_l __swprintf_chk __sysconf __sysctl __syslog_chk __sysv_signal __tolower_l __toupper_l __towctrans __towctrans_l __towlower_l __towupper_l
__ttyname_r_chk __ucmpdi2 __udivdi3 __uflow __umoddi3 __underflow __uselocale __vfork __vfprintf_chk __vfscanf __vfwprintf_chk __vprintf_chk __vsnprintf_chk
__vsprintf_chk __vswprintf_chk __vsyslog_chk __vwprintf_chk __waitpid __wcpcpy_chk __wcpncpy_chk __wcrtomb_chk __wcscasecmp_l __wcscat_chk __wcscoll_l
__wcscpy_chk __wcsftime_l __wcsncasecmp_l __wcsncat_chk __wcsncpy_chk __wcsnrtombs_chk __wcsrtombs_chk __wcstod_internal __wcstof_internal __wcstol_internal
__wcstold_internal __wcstoll_internal __wcstombs_chk __wcstoul_internal __wcstoull_internal __wcsxfrm_l __wctomb_chk __wctrans_l __wctype_l __wmemcpy_chk
__wmemmove_chk __wmempcpy_chk __wmemset_chk __woverflow __wprintf_chk __wuflow __wunderflow __xmknod __xmknodat __xpg_basename __xpg_strerror_r __xstat
__xstat64 __xstat64 _authenticate _dl_addr _dl_mcount_wrapper _dl_mcount_wrapper_check _dl_sym _dl_vsym _exit _mcleanup _mcount _nss_files_parse_grent
_nss_files_parse_pwent _nss_files_parse_spent _obstack_allocated_p _obstack_begin _obstack_begin_1 _obstack_free _obstack_memory_used _obstack_newchunk
_rpc_dtablesize _seterr_reply _setjmp _tolower _toupper a64l abort abs acct addseverity alarm alphasort alphasort64 alphasort64 argz_delete asctime atexit
atof atoi atol atoll authdes_create authdes_getucred authdes_pk_create authnone_create authunix_create authunix_create_default basename bcopy bdflush bind
bindresvport bsearch callrpc capget capset catclose catgets catopen cbc_crypt cfgetispeed cfgetospeed cfmakeraw cfsetispeed cfsetospeed cfsetspeed chflags
chown chown chroot clearerr clearerr_unlocked clnt_broadcast clnt_create clnt_pcreateerror clnt_perrno clnt_perror clnt_spcreateerror clnt_sperrno
clnt_sperror clntraw_create clnttcp_create clntudp_bufcreate clntudp_create clntunix_create clock closelog confstr creat64 create_module ctermid ctime ctime_r
cuserid daemon delete_module des_setparity difftime dirfd dirname div dprintf drand48 drand48_r dysize ecb_crypt ecvt ecvt_r endaliasent endfsent endgrent
endhostent endnetent endnetgrent endprotoent endpwent endrpcent endservent endspent endttyent endusershell endutxent envz_add envz_entry envz_get envz_merge
envz_remove envz_strip epoll_create epoll_ctl epoll_pwait epoll_wait erand48 err errx ether_aton ether_aton_r ether_hostton ether_line ether_ntoa ether_ntoa_r
ether_ntohost execl execle execlp execv execvp exit faccessat fattach fchflags fchmodat fchownat fclose fclose fcvt fcvt_r fdatasync fdetach fdopen fdopen
feof_unlocked ferror_unlocked fexecve fflush_unlocked ffs ffsll fgetgrent fgetpos fgetpos fgetpos64 fgetpos64 fgetpwent fgets_unlocked fgetspent fgetws
fgetws_unlocked fgetxattr fileno flistxattr fmemopen fmtmsg fnmatch fnmatch fopen fopen fopencookie fopencookie fprintf fputc fputc_unlocked fputs_unlocked
fputwc fputwc_unlocked fputws fputws_unlocked fread_unlocked free freeaddrinfo freeifaddrs fremovexattr freopen freopen64 fscanf fseek fseeko fseeko64 fsetpos
fsetpos fsetpos64 fsetpos64 fsetxattr fstatvfs ftello ftello64 ftime ftok fts_children fts_close fts_open fts_read fts_set ftw ftw64 futimens futimesat fwide
fwrite_unlocked fwscanf gai_strerror gcvt get_current_dir_name get_kernel_syms get_myaddress getaddrinfo getaliasbyname getaliasbyname_r getaliasbyname_r
getaliasent getaliasent_r getaliasent_r getchar getchar_unlocked getdate getdirentries getdirentries64 getdomainname getenv getfsent getfsfile getfsspec
getgrent getgrent_r getgrent_r getgrgid getgrgid_r getgrgid_r getgrnam getgrnam_r getgrnam_r getgrouplist gethostbyaddr gethostbyaddr_r gethostbyaddr_r
gethostbyname gethostbyname2 gethostbyname2_r gethostbyname2_r gethostbyname_r gethostbyname_r gethostent gethostent_r gethostent_r gethostid getifaddrs
getipv4sourcefilter getloadavg getlogin getlogin_r getmntent getmsg getnameinfo getnetbyaddr getnetbyaddr_r getnetbyaddr_r getnetbyname getnetbyname_r
getnetbyname_r getnetent getnetent_r getnetent_r getnetgrent getnetname getopt getopt_long getopt_long_only getpass getpgrp getpid getpmsg getpriority
getprotobyname getprotobyname_r getprotobyname_r getprotobynumber getprotobynumber_r getprotobynumber_r getprotoent getprotoent_r getprotoent_r getpublickey
getpwent getpwent_r getpwent_r getpwnam getpwnam_r getpwnam_r getpwuid getpwuid_r getpwuid_r getrlimit getrlimit getrlimit64 getrlimit64 getrpcbyname
getrpcbyname_r getrpcbyname_r getrpcbynumber getrpcbynumber_r getrpcbynumber_r getrpcent getrpcent_r getrpcent_r getrpcport getsecretkey getservbyname
getservbyname_r getservbyname_r getservbyport getservbyport_r getservbyport_r getservent getservent_r getservent_r getsid getsockname getsourcefilter getspent
getspent_r getspent_r getspnam getspnam_r getspnam_r getsubopt getttyent getttynam getusershell getutmp getutmpx getutxent getutxid getutxline getw getwchar
getwchar_unlocked getwd getxattr glob glob64 glob64 globfree globfree64 gmtime gnu_dev_major gnu_dev_makedev gnu_dev_minor grantpt gtty hcreate hcreate_r
hdestroy_r herror host2netname hsearch hsearch_r hstrerror htonl htons iconv iconv_close iconv_open if_freenameindex if_indextoname if_nameindex
if_nametoindex inet6_opt_append inet6_opt_find inet6_opt_finish inet6_opt_get_val inet6_opt_init inet6_opt_next inet6_opt_set_val inet6_option_alloc
inet6_option_append inet6_option_find inet6_option_init inet6_option_next inet6_option_space inet6_rth_add inet6_rth_getaddr inet6_rth_init inet6_rth_reverse
inet6_rth_segments inet6_rth_space inet_addr inet_lnaof inet_makeaddr inet_netof inet_network inet_nsap_addr inet_nsap_ntoa inet_ntoa inet_ntop inet_pton
init_module initgroups innetgr inotify_add_watch inotify_init inotify_rm_watch insque ioperm iopl iruserok iruserok_af isalnum isalpha isascii isastream
isblank iscntrl isdigit isfdtype isgraph islower isprint ispunct isspace isupper isxdigit jrand48 key_decryptsession key_decryptsession_pk key_encryptsession
key_encryptsession_pk key_gendes key_get_conv key_secretkey_is_set key_setnet key_setsecret killpg klogctl l64a labs lchmod lcong48 ldiv lfind lgetxattr
linkat listen listxattr llabs lldiv llistxattr localeconv localeconv localtime lockf lockf64 lrand48 lrand48_r lremovexattr lsearch lsetxattr lutimes madvise
malloc mblen mbstowcs mbtowc mcheck mcheck_check_all mcheck_pedantic memcmp memcpy memfrob memmem memmove mempcpy memset mincore mkdirat mkdtemp mkfifo
mkfifoat mkstemp mkstemp64 mktemp mktime mlock mlockall mprobe mrand48 mrand48_r msgctl msgctl msgget mtrace munlock munlockall muntrace netname2host
netname2user nfsservctl nftw nftw nftw64 nftw64 nice nl_langinfo nrand48 ntp_gettime obstack_free open_memstream open_wmemstream openlog parse_printf_format
passwd2des pclose pclose perror pivot_root pmap_getmaps pmap_getport pmap_rmtcall pmap_set pmap_unset popen popen posix_fadvise posix_fadvise64
posix_fadvise64 posix_fallocate posix_fallocate64 posix_fallocate64 posix_madvise posix_spawn posix_spawn_file_actions_addclose
posix_spawn_file_actions_adddup2 posix_spawn_file_actions_addopen posix_spawn_file_actions_destroy posix_spawn_file_actions_init posix_spawnattr_destroy
posix_spawnattr_getflags posix_spawnattr_getpgroup posix_spawnattr_getschedparam posix_spawnattr_getschedpolicy posix_spawnattr_getsigdefault
posix_spawnattr_getsigmask posix_spawnattr_init posix_spawnattr_setflags posix_spawnattr_setpgroup posix_spawnattr_setschedparam
posix_spawnattr_setschedpolicy posix_spawnattr_setsigdefault posix_spawnattr_setsigmask posix_spawnp ppoll printf printf_size printf_size_info psignal
pthread_attr_destroy pthread_attr_getdetachstate pthread_attr_getinheritsched pthread_attr_getschedparam pthread_attr_getschedpolicy pthread_attr_getscope
pthread_attr_init pthread_attr_init pthread_attr_setdetachstate pthread_attr_setinheritsched pthread_attr_setschedparam pthread_attr_setschedpolicy
pthread_attr_setscope pthread_cond_broadcast pthread_cond_broadcast pthread_cond_destroy pthread_cond_destroy pthread_cond_init pthread_cond_init
pthread_cond_signal pthread_cond_signal pthread_cond_timedwait pthread_cond_timedwait pthread_cond_wait pthread_cond_wait pthread_condattr_destroy
pthread_condattr_init pthread_equal pthread_exit pthread_getschedparam pthread_mutex_destroy pthread_mutex_init pthread_mutex_lock pthread_mutex_unlock
pthread_self pthread_setcanceltype pthread_setschedparam ptrace ptsname putc_unlocked putchar putchar_unlocked putenv putgrent putmsg putpmsg putpwent
putspent pututxline putw putwc putwc_unlocked putwchar putwchar_unlocked qecvt qecvt_r qfcvt qfcvt_r qgcvt qsort query_module quotactl raise rand rand_r rcmd
rcmd_af readdir64 readdir64 readdir64_r readdir64_r readlinkat realloc realpath realpath reboot regexec regexec registerrpc remove removexattr remque rename
renameat revoke rewind rewinddir rexec rexec_af rpmatch rresvport rresvport_af rtime ruserok ruserok_af ruserpass scandir scandir64 scandir64 scanf
sched_getaffinity sched_getaffinity sched_getcpu sched_setaffinity sched_setaffinity seed48 seekdir semctl semctl semget semop semtimedop sendfile sendfile64
setaliasent setbuf setdomainname setegid seteuid setfsent setfsgid setfsuid setgrent setgroups sethostent sethostid sethostname setipv4sourcefilter setjmp
setlinebuf setlocale setlogin setlogmask setnetent setnetgrent setpgrp setpriority setprotoent setpwent setrlimit setrlimit setrlimit64 setrpcent setservent
setsockopt setsourcefilter setspent setttyent setusershell setutxent setxattr sgetspent shmat shmctl shmctl shmdt shmget sigaddset sigandset sigdelset
sigemptyset sigfillset siggetmask sighold sigignore siginterrupt sigisemptyset sigismember sigorset sigpending sigrelse sigset sigstack sockatmark splice
sprintf srand48 sscanf sstk statvfs stime strcat strchr strcmp strcoll strcpy strcspn strerror strerror_l strfmon strfry strftime strlen strncat strncmp
strncpy strnlen strpbrk strptime strrchr strsignal strspn strstr strtoimax strtok strtol strtoll strtoul strtoull strtoumax strxfrm stty svc_exit svc_getreq
svc_getreq_common svc_getreq_poll svc_getreqset svc_register svc_run svc_sendreply svc_unregister svcerr_auth svcerr_decode svcerr_noproc svcerr_noprog
svcerr_progvers svcerr_systemerr svcerr_weakauth svcfd_create svcraw_create svctcp_create svcudp_bufcreate svcudp_create svcudp_enablecache svcunix_create
svcunixfd_create swab swprintf swscanf symlinkat sync sync_file_range syscall sysinfo syslog tcflow tcflush tcgetpgrp tcgetsid tcsendbreak tcsetattr tcsetpgrp
tee telldir tempnam time timegm tmpfile tmpfile tmpfile64 tmpnam tmpnam_r toascii tolower toupper towlower towupper tr_break truncate64 ttyname ttyslot ualarm
ungetwc unlinkat unlockpt unshare updwtmpx uselib user2netname usleep ustat utime utimensat utmpxname verr verrx versionsort versionsort64 versionsort64
vfprintf vhangup vlimit vm86 vm86 vmsplice vprintf vswscanf vsyslog vtimes vwarn vwarnx vwprintf vwscanf warn warnx wcschr wcscmp wcscpy wcscspn wcsdup
wcsftime wcsncat wcsncmp wcspbrk wcsrchr wcsspn wcsstr wcstoimax wcstok wcstol wcstoll wcstombs wcstoul wcstoull wcstoumax wcswidth wcsxfrm wctob wctomb
wcwidth wmemchr wmemcmp wmemset wordexp wordfree wprintf wscanf xdecrypt xdr_accepted_reply xdr_array xdr_authdes_cred xdr_authdes_verf xdr_authunix_parms
xdr_bool xdr_bytes xdr_callhdr xdr_callmsg xdr_char xdr_cryptkeyarg xdr_cryptkeyarg2 xdr_cryptkeyres xdr_des_block xdr_double xdr_enum xdr_float xdr_free
xdr_getcredres xdr_hyper xdr_int xdr_int16_t xdr_int32_t xdr_int64_t xdr_int8_t xdr_key_netstarg xdr_key_netstres xdr_keybuf xdr_keystatus xdr_long
xdr_longlong_t xdr_netnamestr xdr_netobj xdr_opaque xdr_opaque_auth xdr_pmap xdr_pmaplist xdr_pointer xdr_quad_t xdr_reference xdr_rejected_reply xdr_replymsg
xdr_rmtcall_args xdr_rmtcallres xdr_short xdr_sizeof xdr_string xdr_u_char xdr_u_hyper xdr_u_int xdr_u_long xdr_u_longlong_t xdr_u_quad_t xdr_u_short
xdr_uint16_t xdr_uint32_t xdr_uint64_t xdr_uint8_t xdr_union xdr_unixcred xdr_vector xdr_void xdr_wrapstring xdrmem_create xdrrec_create xdrrec_endofrecord
xdrrec_eof xdrrec_skiprecord xdrstdio_create xencrypt xprt_register xprt_unregister
EOL
curlibname = nil
data.each_line { |l|
list = l.split
curlibname = list.shift if l[0, 1] != ' '
list.each { |export| EXPORT[export] = curlibname }
}
end
end
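
The loop that closes this file fills EXPORT from a text table in which a line that does not start with a space begins with a library name, and every other word is a symbol exported by that library. The following is a minimal standalone sketch of that parsing step; the helper name `build_export_table` and the sample library and symbol names are invented for illustration and are not part of Metasm.

```ruby
# Hypothetical helper mirroring the loop above: a line that does not start
# with a space names a library; every other word on it (and on the indented
# lines that follow) is recorded as an export of that library.
def build_export_table(data)
  exports = {}
  curlibname = nil
  data.each_line { |l|
    list = l.split
    # only unindented lines carry a library name as their first word
    curlibname = list.shift if l[0, 1] != ' '
    list.each { |export| exports[export] = curlibname }
  }
  exports
end

# Sample data, invented for illustration only.
SAMPLE = "libfoo.so.1 foo_open foo_close\n foo_read foo_write\nlibbar.so.2 bar_init\n"
TABLE = build_export_table(SAMPLE)
```

With this layout the table maps each symbol back to the library that provides it, so `TABLE['foo_read']` looks up the owner of a symbol found on a continuation line.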

View File

@ -1,228 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
module Metasm
# This class implements an object that behaves like a regular string, but
# whose real data is dynamically fetched or generated on demand
# its size is immutable
# implements a page cache
# substrings are Strings (small substring) or a sub-VirtualString
# (a kind of 'window' on the original VString, when the substring length is > 4096)
class VirtualString
# formats parameters for reading
def [](from, len=nil)
if not len and from.kind_of? Range
b = from.begin
e = from.end
# negative indexes count from the end of the string (-1 is the last byte)
b = b + length if b < 0
e = e + length if e < 0
len = e - b
len += 1 if not from.exclude_end?
from = b
end
from = from + length if from < 0
return nil if from > length
len = length - from if len and from + len > length
read_range(from, len)
end
# formats parameters for overwriting portion of the string
def []=(from, len, val=nil)
raise TypeError, 'cannot modify frozen virtualstring' if frozen?
if not val
val = len
len = nil
end
if not len and from.kind_of? Range
b = from.begin
e = from.end
b = b + length if b < 0
e = e + length if e < 0
len = e - b
len += 1 if not from.exclude_end?
from = b
elsif not len
len = 1
val = val.chr
end
from = from + length if from < 0
raise IndexError, 'Index out of string' if from > length
raise IndexError, 'Cannot modify virtualstring length' if val.length != len or from + len > length
write_range(from, val)
end
# returns the full raw data (if not too big)
def realstring
raise 'realstring too big' if length > 0x1000000
end
# alias to realstring
# for bad people checking respond_to? :to_str (like String#<<)
# XXX alias does not work (not virtual (a la C++))
def to_str
realstring
end
# forwards unhandled messages to a frozen realstring
def method_missing(m, *args, &b)
if ''.respond_to? m
puts "Using VirtualString.realstring for #{m} from:", caller if $DEBUG
realstring.freeze.send(m, *args, &b)
else
super
end
end
# avoid triggering realstring from method_missing if possible
def length
raise "implement this!"
end
# avoid triggering realstring from method_missing if possible
def empty?
length == 0
end
# avoid triggering realstring from method_missing if possible
# heavily used to find 0-terminated strings in ExeFormats
def index(chr, base=0)
if i = self[base, 64].index(chr) or i = self[base, 4096].index(chr)
base + i
else
realstring.index(chr, base)
end
end
# implements a read page cache
# the real address of our first byte
attr_accessor :addr_start
# our length
attr_accessor :length
# array of [addr, raw data], most recently accessed page first
attr_accessor :pagecache
# maximum length of self.pagecache (number of cached pages)
attr_accessor :pagecache_len
def initialize(addr_start, length)
@addr_start = addr_start
@length = length
@pagecache = []
@pagecache_len = 4
end
# invalidates the page cache
def invalidate
@pagecache.clear
end
# returns the 4096-byte page starting at addr
# returns nil if the page is invalid/inaccessible
# addr is page-aligned by the caller
# addr is absolute
#def get_page(addr)
#end
# searches the cache for a page containing addr, updates if not found
def cache_get_page(addr)
addr &= 0xffff_ffff_ffff_f000
@pagecache.each { |c|
if addr == c[0]
@pagecache.unshift @pagecache.delete(c) if c != @pagecache[0]
return c
end
}
@pagecache.pop if @pagecache.length >= @pagecache_len
@pagecache.unshift [addr, get_page(addr) || 0.chr*4096]
@pagecache.first
end
# reads a range from the page cache
# returns a new VirtualString (using dup) if the request is bigger than 4096 bytes
def read_range(from, len)
from += @addr_start
base, page = cache_get_page(from)
if not len
page[from - base]
elsif len <= 4096
s = page[from - base, len]
if from+len-base > 4096 # request crosses a page boundary
base, page = cache_get_page(from+len)
s << page[0, from+len-base]
end
s
else
# big request: return a new virtual page
dup(from, len)
end
end
# rewrites a segment of data
# the length written is the length of the content (a VirtualString cannot grow/shrink)
def write_range(from, content)
invalidate
rewrite_at(from + @addr_start, content)
end
# overwrites a section of the original data
#def rewrite_at(addr, content)
#end
end
class VirtualFile < VirtualString
# returns a new VirtualFile of the whole file content (defaults readonly)
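# returns a String if the file is small (<= 4096 bytes) and opened readonly
def self.read(path, mode='rb')
raise 'no filename specified' if not path
# the assignment is parenthesized so that sz holds the file size, not the result of the comparison
if (sz = File.size(path)) <= 4096 and (mode == 'rb' or mode == 'r')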
File.open(path, mode) { |fd| fd.read }
else
File.open(path, mode) { |fd| new fd, 0, sz }
end
end
# the underlying file descriptor
attr_accessor :fd
# creates a new virtual mapping of a section of the file
# the file descriptor must be seekable
def initialize(fd, addr_start = 0, length = nil)
@fd = fd.dup
if not length
@fd.seek(0, File::SEEK_END)
length = @fd.tell - addr_start
end
super(addr_start, length)
end
def dup(addr = @addr_start, len = @length)
self.class.new(@fd, addr, len)
end
# reads an aligned page from the file, at file offset addr
def get_page(addr)
@fd.pos = addr
@fd.read 4096
end
# overwrite a section of the file
def rewrite_at(addr, data)
@fd.pos = addr
@fd.write data
end
# returns the full content of the file
def realstring
super
@fd.pos = @addr_start
@fd.read(@length)
end
end
end
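
The page cache in `VirtualString#cache_get_page` keeps a short list of pages ordered by most recent use and drops the last one when the list is full. The sketch below reproduces that policy in isolation, assuming a plain String as backing store; the class name `TinyPageCache` and its `misses` counter are hypothetical and exist only for this illustration.

```ruby
# Hypothetical stand-in for the page cache described above.
# The backing store is a plain String instead of a file or a process memory.
class TinyPageCache
  PAGE = 4096
  attr_reader :pagecache, :misses

  def initialize(backing, max_pages = 4)
    @backing = backing
    @max_pages = max_pages
    @pagecache = []   # array of [addr, data], most recently used first
    @misses = 0
  end

  # returns [page_addr, page_data] for the page containing addr
  def cache_get_page(addr)
    addr &= ~(PAGE - 1)   # align down to the start of the page
    if (hit = @pagecache.find { |a, _| a == addr })
      # move the hit to the front so that it is evicted last
      @pagecache.unshift(@pagecache.delete(hit)) unless hit.equal?(@pagecache[0])
      return hit
    end
    @misses += 1
    # cache full: drop the least recently used page (the last entry)
    @pagecache.pop if @pagecache.length >= @max_pages
    data = @backing[addr, PAGE] || ''
    # short or missing pages are padded with zero bytes
    @pagecache.unshift([addr, data.ljust(PAGE, 0.chr)])
    @pagecache.first
  end
end

CACHE = TinyPageCache.new('A' * 4096 + 'B' * 4096 + 'C' * 4096, 2)
CACHE.cache_get_page(10)      # miss, loads the page at 0
CACHE.cache_get_page(5000)    # miss, loads the page at 4096
CACHE.cache_get_page(20)      # hit, the page at 0 moves back to the front
CACHE.cache_get_page(9000)    # miss, evicts the page at 4096
```

Keeping the list sorted by recency makes eviction a single `pop`, at the cost of a linear scan on lookup, which is cheap for a cache of only a few pages.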

File diff suppressed because it is too large

View File

@ -1,748 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
require 'metasm/main'
require 'metasm/encode'
require 'metasm/preprocessor'
module Metasm
class Data
# keywords for data definition (used to recognize label names)
DataSpec = %w[db dw dd dq]
end
class CPU
# parses prefix/name/arguments
# returns an +Instruction+ or raises a ParseError
def parse_instruction(lexer)
i = Instruction.new self
# find prefixes, break on opcode name
while tok = lexer.readtok and parse_prefix(i, tok.raw)
lexer.skip_space_eol
end
# allow '.' in opcode name
tok = tok.dup
while ntok = lexer.nexttok and ntok.type == :punct and ntok.raw == '.'
tok.raw << lexer.readtok.raw
ntok = lexer.readtok
raise tok, 'invalid opcode name' if not ntok or ntok.type != :string
tok.raw << ntok.raw
end
raise tok, 'invalid opcode' if not opcode_list_byname[tok.raw]
i.opname = tok.raw
i.backtrace = tok.backtrace.dup
lexer.skip_space
# find arguments list
loop do
break if not ntok = lexer.nexttok
break if i.args.empty? and opcode_list_byname[ntok.raw] and opcode_list_byname[i.opname].find { |op| op.args.empty? }
break if not arg = parse_argument(lexer)
i.args << arg
lexer.skip_space
break if not ntok = lexer.nexttok or ntok.type != :punct or ntok.raw != ','
lexer.readtok
lexer.skip_space_eol
end
if not parse_instruction_checkproto(i)
raise tok, "invalid opcode arguments #{i.to_s.inspect}, allowed : #{opcode_list_byname[i.opname].to_a.map { |o| o.args }.inspect}"
end
parse_instruction_fixup(i)
i
end
def parse_instruction_checkproto(i)
opcode_list_byname[i.opname].to_a.find { |o|
o.args.length == i.args.length and o.args.zip(i.args).all? { |f, a| parse_arg_valid?(o, f, a) }
}
end
# called after the instruction is fully parsed
def parse_instruction_fixup(i)
end
# return false if not a prefix
def parse_prefix(i, word)
end
# returns a parsed argument
# add your own arguments parser here (registers, memory references..)
def parse_argument(lexer)
Expression.parse(lexer)
end
# handles .instructions
# XXX handle HLA here ?
def parse_parser_instruction(lexer, instr)
raise instr, 'unknown parser instruction'
end
end
# asm-specific preprocessor
# handles asm comments (; ... eol)
# asm macros (name macro args\nbody endm, name equ val)
# initializes token.value (reads integers in hex etc)
# merges consecutive space/eol
class AsmPreprocessor < Preprocessor
# an assembler macro, similar to preprocessor macro
# handles local labels
class Macro
attr_accessor :name, :args, :body, :labels
def initialize(name)
@name = name
@args, @body, @labels = [], [], []
end
# returns the array of tokens resulting from the application of the macro
# parses arguments if needed, handles macro-local labels
def apply(macro, lexer, program)
args = Preprocessor::Macro.parse_arglist(lexer).to_a
raise @name, 'invalid argument count' if args.length != @args.length
labels = @labels.inject({}) { |h, l| h.update l => program.new_label(l) }
args = @args.zip(args).inject({}) { |h, (fa, a)| h.update fa.raw => a }
# apply macro
@body.map { |t|
t = t.dup
t.backtrace += macro.backtrace[-2..-1] if not macro.backtrace.empty?
if labels[t.raw]
t.raw = labels[t.raw]
t
elsif args[t.raw]
# XXX update toks backtrace ?
args[t.raw]
else
t
end
}.flatten
end
# parses the argument list and the body from lexer
# recognize the local labels
# XXX add eax,
# toto db 42 ; zomg h4x
def parse_definition(lexer)
lexer.skip_space
while tok = lexer.nexttok and tok.type != :eol
# no preprocess argument list
raise @name, 'invalid arg definition' if not tok = lexer.readtok or tok.type != :string
@args << tok
lexer.skip_space
raise @name, 'invalid arg separator' if not tok = lexer.readtok or ((tok.type != :punct or tok.raw != ',') and tok.type != :eol)
break if tok.type == :eol
lexer.skip_space
end
lexer.skip_space_eol
while tok = lexer.readtok and (tok.type != :string or tok.raw != 'endm')
@body << tok
if @body[-2] and @body[-2].type == :string and @body[-1].raw == ':' and (not @body[-3] or @body[-3].type == :eol) and @body[-2].raw !~ /^[1-9][0-9]*$/
@labels << @body[-2].raw
elsif @body[-3] and @body[-3].type == :string and @body[-2].type == :space and Data::DataSpec.include?(@body[-1].raw) and (not @body[-4] or @body[-4].type == :eol)
@labels << @body[-3].raw
end
end
end
end
# the program (used to create new label names)
attr_accessor :program
# hash macro name => Macro
attr_accessor :macro
def initialize(program)
@program = program
@macro = {}
super()
end
def skip_space_eol
readtok while t = nexttok and (t.type == :space or t.type == :eol)
end
def skip_space
readtok while t = nexttok and t.type == :space
end
def nexttok
t = readtok
unreadtok t
t
end
# reads a token, handles macros/comments/integers/etc
# argument is for internal use
def readtok(rec = false)
tok = super()
# handle ; comments
if tok and tok.type == :punct and tok.raw == ';'
tok.type = :eol
begin
tok = tok.dup
while ntok = super() and ntok.type != :eol
tok.raw << ntok.raw
end
tok.raw << ntok.raw if ntok
rescue ParseError
# unterminated string
end
end
# aggregate space/eol
if tok and (tok.type == :space or tok.type == :eol)
if ntok = readtok(true) and ntok.type == :space
tok = tok.dup
tok.raw << ntok.raw
elsif ntok and ntok.type == :eol
tok = tok.dup
tok.raw << ntok.raw
tok.type = :eol
else
unreadtok ntok
end
end
# handle macros
# the rec parameter is used to avoid reading the whole text at once when reading ahead to check 'macro' keyword
if not rec and tok and tok.type == :string
if @macro[tok.raw]
@macro[tok.raw].apply(tok, self, @program).reverse_each { |t| unreadtok t }
tok = readtok
else
if ntok = readtok(true) and ntok.type == :space and nntok = readtok(true) and nntok.type == :string and (nntok.raw == 'macro' or nntok.raw == 'equ')
puts "W: asm: redefinition of macro #{tok.raw} at #{tok.backtrace_str}, previous definition at #{@macro[tok.raw].name.backtrace_str}" if @macro[tok.raw]
m = Macro.new tok
# XXX this allows nested macro definition..
if nntok.raw == 'macro'
m.parse_definition self
else
# equ
raise nntok if not etok = readtok
unreadtok etok
raise nntok if not v = Expression.parse(self)
etok = etok.dup
etok.type = :string
etok.value = v
etok.raw = v.to_s
m.body << etok
end
@macro[tok.raw] = m
tok = readtok
else
unreadtok nntok
unreadtok ntok
end
end
end
tok
end
end
class ExeFormat
# setup self.cursource here
def parse_init
@locallabels_bkw ||= {}
@locallabels_fwd ||= {}
end
# hash mapping local anonymous label number => unique name
# defined only while parsing
# usage:
# jmp 1f
# 1:
# jmp 1f
# jmp 1b
# 1:
# defined in #parse, replaced in use by Expression#parse
# no macro-scope (macros are gsub-like, and no special handling for those labels is done)
def locallabels_bkw(id)
@locallabels_bkw[id]
end
def locallabels_fwd(id)
@locallabels_fwd[id] ||= new_label("local_#{id}")
end
# parses an asm source file to an array of Instruction/Data/Align/Offset/Padding
def parse(text, file='<ruby>', lineno=0)
parse_init
@lexer ||= AsmPreprocessor.new(self)
@lexer.feed text, file, lineno
lasteol = true
while not @lexer.eos?
tok = @lexer.readtok
next if not tok
case tok.type
when :space
when :eol
lasteol = true
when :punct
case tok.raw
when '.'
tok = tok.dup
while ntok = @lexer.nexttok and ((ntok.type == :string) or (ntok.type == :punct and ntok.raw == '.'))
tok.raw << @lexer.readtok.raw
end
parse_parser_instruction tok
else raise tok, 'syntax error'
end
lasteol = false
when :string
ntok = nntok = nil
if lasteol and ((ntok = @lexer.readtok and ntok.type == :punct and ntok.raw == ':') or
(ntok and ntok.type == :space and nntok = @lexer.nexttok and nntok.type == :string and Data::DataSpec.include?(nntok.raw)))
if tok.raw =~ /^[1-9][0-9]*$/
# handle anonymous local labels
lname = @locallabels_bkw[tok.raw] = @locallabels_fwd.delete(tok.raw) || new_label('local_'+tok.raw)
else
lname = tok.raw
raise tok, "label redefinition" if new_label(lname) != lname
end
l = Label.new(lname)
l.backtrace = tok.backtrace.dup
@cursource << l
lasteol = false
else
lasteol = false
@lexer.unreadtok ntok
@lexer.unreadtok tok
if Data::DataSpec.include?(tok.raw)
@cursource << parse_data
else
@cursource << @cpu.parse_instruction(@lexer)
end
end
else
raise tok, 'syntax error'
end
end
puts "Undefined forward reference to anonymous labels #{@locallabels_fwd.keys.inspect}" if $VERBOSE and not @locallabels_fwd.empty?
self
end
# handles special directives (alignment, changing section, ...)
# special directives start with a dot
def parse_parser_instruction(tok)
case tok.raw.downcase
when '.align'
e = Expression.parse(@lexer).reduce
raise tok, 'need immediate alignment size' unless e.kind_of? ::Integer
@lexer.skip_space
if ntok = @lexer.readtok and ntok.type == :punct and ntok.raw == ','
@lexer.skip_space_eol
# allow single byte value or full data statement
if not ntok = @lexer.readtok or not ntok.type == :string or not Data::DataSpec.include?(ntok.raw)
@lexer.unreadtok ntok
type = 'db'
else
type = ntok.raw
end
fillwith = parse_data_data type
else
@lexer.unreadtok ntok
end
raise tok, 'syntax error' if ntok = @lexer.nexttok and ntok.type != :eol
@cursource << Align.new(e, fillwith, tok.backtrace.dup)
when '.pad'
@lexer.skip_space
if ntok = @lexer.readtok and ntok.type != :eol
# allow single byte value or full data statement
if not ntok.type == :string or not Data::DataSpec.include?(ntok.raw)
@lexer.unreadtok ntok
type = 'db'
else
type = ntok.raw
end
fillwith = parse_data_data(type)
else
@lexer.unreadtok ntok
end
raise tok, 'syntax error' if ntok = @lexer.nexttok and ntok.type != :eol
@cursource << Padding.new(fillwith, tok.backtrace.dup)
when '.offset'
e = Expression.parse(@lexer)
raise tok, 'syntax error' if ntok = @lexer.nexttok and ntok.type != :eol
@cursource << Offset.new(e, tok.backtrace.dup)
when '.padto'
e = Expression.parse(@lexer)
@lexer.skip_space
if ntok = @lexer.readtok and ntok.type == :punct and ntok.raw == ','
@lexer.skip_space
# allow single byte value or full data statement
if not ntok = @lexer.readtok or not ntok.type == :string or not Data::DataSpec.include?(ntok.raw)
@lexer.unreadtok ntok
type = 'db'
else
type = ntok.raw
end
fillwith = parse_data_data type
else
@lexer.unreadtok ntok
end
raise tok, 'syntax error' if ntok = @lexer.nexttok and ntok.type != :eol
@cursource << Padding.new(fillwith, tok.backtrace.dup) << Offset.new(e, tok.backtrace.dup)
else
@cpu.parse_parser_instruction(@lexer, tok)
end
end
def parse_data
raise ParseError, 'internal error' if not tok = @lexer.readtok
raise tok, 'invalid data type' if tok.type != :string or not Data::DataSpec.include?(tok.raw)
type = tok.raw
@lexer.skip_space_eol
arr = []
loop do
arr << parse_data_data(type)
@lexer.skip_space
if ntok = @lexer.readtok and ntok.type == :punct and ntok.raw == ','
@lexer.skip_space_eol
else
@lexer.unreadtok ntok
break
end
end
Data.new(type, arr, 1, tok.backtrace.dup)
end
def parse_data_data(type)
raise ParseError, 'need data content' if not tok = @lexer.readtok
if tok.type == :punct and tok.raw == '?'
Data.new type, :uninitialized, 1, tok.backtrace.dup
elsif tok.type == :quoted
Data.new type, tok.value, 1, tok.backtrace.dup
else
@lexer.unreadtok tok
raise tok, 'invalid data' if not i = Expression.parse(@lexer)
@lexer.skip_space
if ntok = @lexer.readtok and ntok.type == :string and ntok.raw.downcase == 'dup'
raise ntok, 'need immediate count expression' unless (count = i.reduce).kind_of? ::Integer
raise ntok, 'syntax error, ( expected' if not ntok = @lexer.readtok or ntok.type != :punct or ntok.raw != '('
content = []
loop do
content << parse_data_data(type)
@lexer.skip_space
if ntok = @lexer.readtok and ntok.type == :punct and ntok.raw == ','
@lexer.skip_space_eol
else
@lexer.unreadtok ntok
break
end
end
raise ntok, 'syntax error, ) expected' if not ntok = @lexer.readtok or ntok.type != :punct or ntok.raw != ')'
Data.new type, content, count, tok.backtrace.dup
else
@lexer.unreadtok ntok
Data.new type, i, 1, tok.backtrace.dup
end
end
end
end
class Expression
# key = operator, value = hash regrouping operators of lower precedence
OP_PRIO = [[:'||'], [:'&&'], [:|], [:^], [:&], [:'==', :'!='],
[:'<', :'>', :'<=', :'>='], [:<<, :>>], [:+, :-], [:*, :/, :%]
].inject({}) { |h, oplist|
lessprio = h.keys.inject({}) { |hh, op| hh.update op => true }
oplist.each { |op| h[op] = lessprio }
h }
class << self
# reads an operator from the lexer, returns the corresponding symbol or nil
def readop(lexer)
if not tok = lexer.readtok or tok.type != :punct
lexer.unreadtok tok
return
end
if tok.value
if OP_PRIO[tok.value]
return tok
else
lexer.unreadtok tok
return
end
end
op = tok
case op.raw
# may be followed by itself or '='
when '>', '<'
if ntok = lexer.readtok and ntok.type == :punct and (ntok.raw == op.raw or ntok.raw == '=')
op = op.dup
op.raw << ntok.raw
else
lexer.unreadtok ntok
end
# may be followed by itself
when '|', '&'
if ntok = lexer.readtok and ntok.type == :punct and ntok.raw == op.raw
op = op.dup
op.raw << ntok.raw
else
lexer.unreadtok ntok
end
# must be followed by '='
when '!', '='
if not ntok = lexer.readtok or ntok.type != :punct or ntok.raw != '='
lexer.unreadtok ntok
lexer.unreadtok tok
return
end
op = op.dup
op.raw << ntok.raw
# ok
when '^', '+', '-', '*', '/', '%'
# unknown
else
lexer.unreadtok tok
return
end
op.value = op.raw.to_sym
op
end
# parses floats/hex into tok.value, returns nothing
# does not parse unary operators (-/+/~)
def parse_num_value(lexer, tok)
if not tok.value and tok.raw =~ /^[a-f][0-9a-f]*h$/i
# warn on variable name like ffffh
puts "W: Parser: you may want to add a leading 0 to #{tok.raw.inspect} at #{tok.backtrace[-2]}:#{tok.backtrace[-1]}" if $VERBOSE
end
return if tok.value
return if tok.raw[0] != ?. and not (?0..?9).include? tok.raw[0]
case tr = tok.raw.downcase
when /^0b([01][01_]*)$/, /^([01][01_]*)b$/
tok.value = $1.to_i(2)
when /^(0[0-7][0-7_]*)$/
tok.value = $1.to_i(8)
when /^([0-9][a-f0-9_]*)h$/
tok.value = $1.to_i(16)
when /^0x([a-f0-9][a-f0-9_]*)(u?l?l?|l?l?u?|p([0-9][0-9_]*[fl]?)?)$/, '0x'
tok.value = $1.to_i(16) if $1
ntok = lexer.readtok
# check for C99 hex float
if not tr.include? 'p' and ntok and ntok.type == :punct and ntok.raw == '.'
if not nntok = lexer.readtok or nntok.type != :string
lexer.unreadtok nntok
lexer.unreadtok ntok
return
end
# read all pre-mantissa
tok.raw << ntok.raw
ntok = nntok
tok.raw << ntok.raw if ntok
raise tok, 'invalid hex float' if not ntok or ntok.type != :string or ntok.raw !~ /^[0-9a-f_]*p([0-9][0-9_]*[fl]?)?$/i
raise tok, 'invalid hex float' if tok.raw.delete('_').downcase[0,4] == '0x.p' # no digits
ntok = lexer.readtok
end
if not tok.raw.downcase.include? 'p'
# standard hex
lexer.unreadtok ntok
else
if tok.raw.downcase[-1] == ?p
# read signed exponent
tok.raw << ntok.raw if ntok
raise tok, 'invalid hex float' if not ntok or ntok.type != :punct or (ntok.raw != '+' and ntok.raw != '-')
ntok = lexer.readtok
tok.raw << ntok.raw if ntok
raise tok, 'invalid hex float' if not ntok or ntok.type != :string or ntok.raw !~ /^[0-9][0-9_]*[fl]?$/i
end
raise tok, 'internal error' if not tok.raw.delete('_').downcase =~ /^0x([0-9a-f]*)(?:\.([0-9a-f]*))?p([+-]?[0-9]+)[fl]?$/
b1, b2, b3 = $1.to_i(16), $2, $3.to_i
b2 = b2.to_i(16) if b2
tok.value = b1.to_f
# tok.value += 1/b2.to_f # TODO
puts "W: unhandled hex float #{tok.raw}" if $VERBOSE and b2 and b2 != 0
tok.value *= 2**b3
puts "hex float: #{tok.raw} => #{tok.value}" if $DEBUG
end
when /^([0-9][0-9_]*)(u?l?l?|l?l?u?|e([0-9][0-9_]*[fl]?)?)$/, '.'
tok.value = $1.to_i if $1
ntok = lexer.readtok
if tok.raw == '.' and (not ntok or ntok.type != :string)
lexer.unreadtok ntok
return
end
if not tr.include? 'e' and tr != '.' and ntok and ntok.type == :punct and ntok.raw == '.'
if not nntok = lexer.readtok or nntok.type != :string
lexer.unreadtok nntok
lexer.unreadtok ntok
return
end
# read up to '.'
tok.raw << ntok.raw
ntok = nntok
end
if not tok.raw.downcase.include? 'e' and tok.raw[-1] == ?.
# read fractional part
tok.raw << ntok.raw if ntok
raise tok, 'bad float' if not ntok or ntok.type != :string or ntok.raw !~ /^[0-9_]*(e[0-9_]*)?[fl]?$/i
ntok = lexer.readtok
end
if tok.raw.downcase[-1] == ?e
# read signed exponent
tok.raw << ntok.raw if ntok
raise tok, 'bad float' if not ntok or ntok.type != :punct or (ntok.raw != '+' and ntok.raw != '-')
ntok = lexer.readtok
tok.raw << ntok.raw if ntok
raise tok, 'bad float' if not ntok or ntok.type != :string or ntok.raw !~ /^[0-9][0-9_]*[fl]?$/i
ntok = lexer.readtok
end
lexer.unreadtok ntok
if tok.raw.delete('_').downcase =~ /^(?:(?:[0-9]+\.[0-9]*|\.[0-9]+)(?:e[+-]?[0-9]+)?|[0-9]+e[+-]?[0-9]+)[fl]?$/i
tok.value = tok.raw.to_f
else
raise tok, 'internal error' if tok.raw =~ /[e.]/i
end
else raise tok, 'invalid numeric constant'
end
end
# parses an integer/a float, sets its tok.value, consumes and aggregates the necessary following tokens (point, mantissa..)
# handles $/$$ special asm label name
# XXX for binary, use _ delimiter or 0b prefix, or start with 0 : 1b may conflict with backward local anonymous label reference
def parse_intfloat(lexer, tok)
if not tok.value and tok.raw == '$'
if not (l = lexer.program.cursource.last).kind_of? Label
l = Label.new(lexer.program.new_label('instr_start'))
l.backtrace = tok.backtrace.dup
lexer.program.cursource << l
end
tok.value = l.name
elsif not tok.value and tok.raw == '$$'
if not (l = lexer.program.cursource.first).kind_of? Label
l = Label.new(lexer.program.new_label('section_start'))
l.backtrace = tok.backtrace.dup
lexer.program.cursource.unshift l
end
tok.value = l.name
elsif not tok.value and tok.raw =~ /^([1-9][0-9]*)([fb])$/
case $2
when 'b'; tok.value = lexer.program.locallabels_bkw($1) # may fallback to binary parser
when 'f'; tok.value = lexer.program.locallabels_fwd($1)
end
end
parse_num_value(lexer, tok)
end
# returns the next value from lexer (parenthesised expression, immediate, variable, unary operators)
def parse_value(lexer)
nil while tok = lexer.readtok and tok.type == :space
return if not tok
case tok.type
when :string
parse_intfloat(lexer, tok)
val = tok.value || tok.raw
when :quoted
if tok.raw[0] != ?'
lexer.unreadtok tok
return
end
s = tok.value || tok.raw[1..-2] # raise tok, 'need ppcessing !'
s = s.reverse if lexer.respond_to? :program and lexer.program and lexer.program.cpu and lexer.program.cpu.endianness == :little
val = s.unpack('C*').inject(0) { |sum, c| (sum << 8) | c }
when :punct
case tok.raw
when '('
nil while ntok = lexer.readtok and (ntok.type == :space or ntok.type == :eol)
lexer.unreadtok ntok
val = parse(lexer)
nil while ntok = lexer.readtok and (ntok.type == :space or ntok.type == :eol)
raise tok, "syntax error, no ) found after #{val.inspect}, got #{ntok.inspect}" if not ntok or ntok.type != :punct or ntok.raw != ')'
when '!', '+', '-', '~'
nil while ntok = lexer.readtok and (ntok.type == :space or ntok.type == :eol)
lexer.unreadtok ntok
raise tok, 'need expression after unary operator' if not val = parse_value(lexer)
val = Expression[tok.raw.to_sym, val]
when '.'
parse_intfloat(lexer, tok)
if not tok.value
lexer.unreadtok tok
return
end
val = tok.value
else
lexer.unreadtok tok
return
end
else
lexer.unreadtok tok
return
end
nil while tok = lexer.readtok and tok.type == :space
lexer.unreadtok tok
val
end
# for boolean operators, true is 1 (or anything != 0), false is 0
def parse(lexer)
opstack = []
stack = []
return if not e = parse_value(lexer)
stack << e
while op = readop(lexer)
nil while ntok = lexer.readtok and (ntok.type == :space or ntok.type == :eol)
lexer.unreadtok ntok
until opstack.empty? or OP_PRIO[op.value][opstack.last]
stack << new(opstack.pop, stack.pop, stack.pop)
end
opstack << op.value
raise op, 'need rhs' if not e = parse_value(lexer)
stack << e
end
until opstack.empty?
stack << new(opstack.pop, stack.pop, stack.pop)
end
Expression[stack.first]
end
end
end
end
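
The parse method above resolves operator precedence with two stacks: one for operands and one for pending operators, reducing whenever the incoming operator does not bind tighter than the one on top. A minimal standalone sketch of that idea follows. The PRIO table and eval_infix are hypothetical names, not Metasm's OP_PRIO or API, and the sketch evaluates integers directly instead of building Expression objects.

```ruby
# Hypothetical precedence table (not Metasm's OP_PRIO): higher binds tighter.
PRIO = { :+ => 1, :- => 1, :* => 2, :/ => 2 }

# Evaluates a flat token list such as [1, :+, 2, :*, 3] using two stacks.
def eval_infix(tokens)
  values = [tokens[0]]
  ops = []
  # pops one operator and two operands, pushes the result back
  reduce = lambda {
    rhs = values.pop
    lhs = values.pop
    values << lhs.send(ops.pop, rhs)
  }
  tokens[1..-1].each_slice(2) { |op, operand|
    # reduce pending operators unless the new one binds strictly tighter
    reduce.call until (ops.empty? || PRIO[op] > PRIO[ops.last])
    ops << op
    values << operand
  }
  reduce.call until ops.empty?
  values.first
end
```

Reducing on equal priority is what makes operators such as `-` left-associative in this sketch: `[10, :-, 4, :-, 3]` is evaluated as `(10 - 4) - 3`.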

File diff suppressed because it is too large

View File

@ -1,42 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
require 'metasm/pic16c/opcode'
require 'metasm/decode'
module Metasm
class Pic16c
def build_opcode_bin_mask(op)
# bit = 0 if it can be mutated by a field value, 1 if fixed by opcode
op.bin_mask = Array.new(op.bin.length, 0)
op.fields.each { |f, (oct, off)|
op.bin_mask[oct] |= (@fields_mask[f] << off)
}
op.bin_mask.map! { |v| 255 ^ v }
end
def build_bin_lookaside
# sets up a hash byte value => list of opcodes that may match
# opcode.bin_mask is built here
lookaside = Array.new(256) { [] }
@opcode_list.each { |op|
build_opcode_bin_mask op
b = op.bin[0]
msk = op.bin_mask[0]
for i in b..(b | (255^msk))
next if i & msk != b & msk
lookaside[i] << op
end
}
lookaside
end
end
end
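
The lookaside table built above maps each possible first byte to the opcodes whose fixed bits agree with it, so the decoder only has to test a short candidate list. A simplified sketch of the same idea, assuming a hypothetical Op struct in place of Metasm's Opcode and scanning all 256 byte values rather than the narrower range the original loops over:

```ruby
# Hypothetical stand-in for an opcode: first byte pattern plus fixed-bit mask
# (mask bit = 1 means the bit is fixed by the opcode).
Op = Struct.new(:name, :bin, :mask)

# Returns a 256-entry array: byte value => opcodes whose fixed bits match it.
def build_lookaside(ops)
  table = Array.new(256) { [] }
  ops.each { |op|
    (0..255).each { |byte|
      table[byte] << op if (byte & op.mask) == (op.bin & op.mask)
    }
  }
  table
end

ops = [Op.new('a', 0x30, 0xf0), Op.new('b', 0x31, 0xff)]
table = build_lookaside(ops)
puts table[0x31].map { |op| op.name }.inspect
```

Here opcode 'a' fixes only the high nibble, so it is a candidate for every byte from 0x30 to 0x3f, while 'b' matches 0x31 alone.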

View File

@ -1,17 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
require 'metasm/main'
module Metasm
class Pic16c < CPU
def initialize(endianness = :big)
super()
@endianness = endianness
init
end
end
end

View File

@ -1,69 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
require 'metasm/pic16c/main'
module Metasm
class Pic16c
def addop(name, bin, *l)
o = Opcode.new self, name
o.bin = bin
l.each { |ll|
if @props_allowed[ll]
o.props[ll] = true
else
o.args << ll
o.fields[ll] = @fields_off[ll]
end
}
@opcode_list << o
end
def init
@fields_mask = {:f => 0x7f, :b => 0x7, :k => 0xff, :klong => 0x3ff, :d => 1 }
@props_allowed = {:setip => true, :saveip => true, :stopexec => true }
@fields_off = { :f => 0, :b => 7, :k => 0, :klong => 0, :d => 7 }
addop 'addwf', 0b00_0111_0000_0000, :f, :d
addop 'andwf', 0b00_0101_0000_0000, :f, :d
addop 'clrf', 0b00_0001_1000_0000, :f
addop 'clrw', 0b00_0001_0000_0000 # 00_0001_0xxx_xxxx
addop 'comf', 0b00_1001_0000_0000, :f, :d
addop 'decf', 0b00_0011_0000_0000, :f, :d
addop 'decfsz',0b00_1011_0000_0000, :f, :d
addop 'incf', 0b00_1010_0000_0000, :f, :d
addop 'incfsz',0b00_1111_0000_0000, :f, :d
addop 'iorwf', 0b00_0100_0000_0000, :f, :d
addop 'movf', 0b00_1000_0000_0000, :f, :d
addop 'movwf', 0b00_0000_1000_0000, :f
addop 'nop', 0b00_0000_0000_0000 # 00_0000_0xx0_0000
addop 'rlf', 0b00_1101_0000_0000, :f, :d
addop 'rrf', 0b00_1100_0000_0000, :f, :d
addop 'subwf', 0b00_0010_0000_0000, :f, :d
addop 'swapf', 0b00_1110_0000_0000, :f, :d
addop 'xorwf', 0b00_0110_0000_0000, :f, :d
addop 'bcf', 0b01_0000_0000_0000, :f, :b
addop 'bsf', 0b01_0100_0000_0000, :f, :b
addop 'btfsc', 0b01_1000_0000_0000, :f, :b, :setip
addop 'btfss', 0b01_1100_0000_0000, :f, :b, :setip
addop 'addlw', 0b11_1110_0000_0000, :k # 00_000x_0000_0000
addop 'andlw', 0b11_1001_0000_0000, :k
addop 'call', 0b10_0000_0000_0000, :klong, :setip, :stopexec, :saveip
addop 'clrwdt',0b00_0000_0110_0100
addop 'goto', 0b10_1000_0000_0000, :klong, :setip, :stopexec
addop 'iorlw', 0b11_1000_0000_0000, :k
addop 'movlw', 0b11_0000_0000_0000, :k # 00_00xx_0000_0000
addop 'retfie',0b00_0000_0000_1001, :setip, :stopexec
addop 'retlw', 0b11_0100_0000_0000, :k, :setip, :stopexec # 00_00xx_0000_0000
addop 'return',0b00_0000_0000_1000, :setip, :stopexec
addop 'sleep', 0b00_0000_0110_0011
addop 'sublw', 0b11_1100_0000_0000, :k # 00_000x_0000_0000
addop 'xorlw', 0b11_1010_0000_0000, :k
end
end
end

File diff suppressed because it is too large

View File

@ -1,77 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
require 'metasm/main'
module Metasm
# a Renderable element has a method #render that returns an array of [String or Renderable]
module Renderable
def to_s
render.join
end
end
class Instruction
include Renderable
def render
@cpu.render_instruction(self)
end
end
class Label
include Renderable
def render
[@name + ':']
end
end
class CPU
# renders an instruction
# may use instruction-global properties to render an argument (eg specify pointer size if not implicit)
def render_instruction(i)
r = []
r << i.opname
r << ' '
i.args.each { |a| r << a << ', ' }
r.pop
r
end
# ease debugging in irb
def inspect
"#<#{self.class}:#{'%x' % object_id} ... >"
end
end
class Expression
include Renderable
def render
return Expression[@lexpr, :-, -@rexpr].render if @op == :+ and @rexpr.kind_of?(::Numeric) and @rexpr < 0
l, r = [@lexpr, @rexpr].map { |e|
if e.kind_of? Integer
if e < 0
neg = true
e = -e
end
if e < 10; e = e.to_s
else e = '%xh' % e
end
e = '0' << e unless (?0..?9).include? e[0]
e = '-' << e if neg
end
e
}
nosq = {:* => [:*], :+ => [:+, :-, :*], :- => [:+, :-, :*]}
l = ['(', l, ')'] if @lexpr.kind_of? Expression and not nosq[@op].to_a.include?(@lexpr.op)
nosq[:-] = [:*]
r = ['(', r, ')'] if @rexpr.kind_of? Expression and not nosq[@op].to_a.include?(@rexpr.op)
op = @op if l or @op != :+
[l, op, r].compact
end
end
end
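
Expression#render above formats integer operands in an assembler-friendly way: decimal below 10, otherwise hexadecimal with an 'h' suffix, a leading 0 when the first hex digit is a letter, and a '-' prefix for negative values. A standalone sketch of just that formatting rule, written for current Ruby (render_int is a hypothetical helper, not part of Metasm):

```ruby
# Formats an integer following the rule described above.
def render_int(n)
  neg = n < 0
  n = -n if neg
  s = n < 10 ? n.to_s : '%xh' % n
  # a hex literal must start with a decimal digit to be told apart from a name
  s = '0' + s unless s =~ /\A[0-9]/
  neg ? '-' + s : s
end

puts render_int(255)
```

The leading 0 matters because a token such as ffh would otherwise be read back as a label name rather than a number.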

View File

@ -1,58 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
#
# shows the preprocessor path to find a specific line
# usage: ruby chdr-find.rb 'regex pattern' list of files.h
#
def gets
l = $ungets
$ungets = nil
l || super
end
def parse(root=false)
want = false
ret = []
while l = gets
case l = l.strip
when /^#if/
ret << l
r = parse(true)
if r.empty?
ret.pop
else
want = true
rr = r.pop
ret.concat r.map { |l| (l[0,3] == '#el' ? ' ' : ' ') << l }
ret << rr
end
when /^#el/
if not root
$ungets = l
break
end
ret << l
r = parse
want = true if not r.empty?
ret.concat r
when /^#endif/
if not root
$ungets = l
break
end
ret << l
break
when /#$srch/ #, /^#include/
want = true
ret << l
end
end
want ? ret : []
end
$srch = ARGV.shift
puts parse

View File

@ -1,74 +0,0 @@
#!/usr/bin/env ruby
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
synclen = 6
ctxlen = 16
file1 = ('x'*ctxlen) + File.read(ARGV.shift)
file2 = ('x'*ctxlen) + File.read(ARGV.shift)
count1 = count2 = ctxlen
# prints the string in 80 cols
# with the first column filled with +pfx+
def show(pfx, str)
loop do
if str.length > 79
len = 79 - str[0...79][/\S+$/].to_s.length
len = 79 if len == 0
puts pfx + str[0...len]
str = str[len..-1]
else
puts pfx + str
break
end
end
end
loop do
w1 = file1[count1]
w2 = file2[count2]
break if not w1 and not w2
if w1 == w2
count1 += 1
count2 += 1
else
diff1 = diff2 = nil
catch(:resynced) {
1000.times { |depth|
(-depth..depth).map { |d|
if d == 0
[depth, depth]
elsif d < 0
[depth, depth+d]
elsif d > 0
[depth-d, depth]
end
}.each { |dc1, dc2|
next if not (0...synclen).all? { |i| file1[count1 + dc1 + i] == file2[count2 + dc2 + i] }
puts "@#{(count1-ctxlen).to_s 16} #{(count2-ctxlen).to_s 16}"
show ' ', file1[count1-ctxlen, ctxlen].inspect
if dc1 > 0
show '-', file1[count1, dc1].inspect
end
if dc2 > 0
show '+', file2[count2, dc2].inspect
end
count1 += dc1
count2 += dc2
show ' ', file1[count1, ctxlen].inspect
puts
throw :resynced
}
}
raise 'nomatch..'
}
end
end

View File

@ -1,36 +0,0 @@
#!/usr/bin/env ruby
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
require 'enumerator'
class String
def hexdump
o = -16
lastl = []
lastdpl = false
unpack('C*').each_slice(16) { |s|
o += 16
if s != lastl
lastdpl = false
print '%04x ' % o
print s.map { |b| '%02x' % b }.join(' ').ljust(3*16-1) + ' '
print s.pack('C*').unpack('L*').map { |bb| '%08x' % bb }.join(' ').ljust(9*4-1) + ' '
puts s.map { |c| (32..126).include?(c) ? c : ?. }.pack('C*')
elsif not lastdpl
lastdpl = true
puts '*'
end
lastl = s
}
puts '%04x' % length
end
end
if $0 == __FILE__
File.open(ARGV.first, 'rb') { |fd| fd.read }.hexdump
end

View File

@ -1,10 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
require 'metasm'
require 'metasm-shell'
require 'pp'
include Metasm

View File

@ -1,47 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
# computes the difference between two ruby objects
# walks accessors, arrays and hashes
def Object.diff(o1, o2)
if o1.class == o2.class
h = {}
case o1
when Array, Hash
if o1.kind_of? Array
keys = (0...[o1.length, o2.length].max).to_a
else
keys = o1.keys | o2.keys
end
keys.each { |k|
d = diff(o1[k], o2[k])
h["[#{k.inspect}]"] = d if not d.empty?
}
else
a = (@@diff_accessor_cache ||= {})[o1.class] ||= (im = o1.class.public_instance_methods.grep(/^[a-z]/) ; (im & im.map { |m| m + '=' }).map { |m| m.chop })
if a.empty?
return o1 == o2 ? h : [o1, o2]
end
a.each { |k|
d = diff(o1.send(k), o2.send(k))
h['.' + k] = d if not d.empty?
}
end
# simplify tree
h.keys.each { |k|
if h[k].kind_of? Hash and h[k].length == 1
v = h.delete k
h[k + v.keys.first] = v.values.first
end
}
h
else
[o1, o2]
end
end

View File

@ -1,33 +0,0 @@
class Object
def scan_iter
case self
when ::Array
length.times { |i| yield self[i], "[#{i}]" }
when ::Hash
each { |k, v| yield v, "[#{k.inspect}]" ; yield k, "(key)" }
else
instance_variables.each { |i| yield instance_variable_get(i), ".#{i[1..-1]}" }
end
end
# dumps to stdout the path to find some targets ( array of objects to match with == )
def scan_for(targets, path, done={})
done[object_id] = self if done.empty?
t = nil
if targets.find { |t| self == t }
puts "found #{t} at #{path}"
end
scan_iter { |v, p|
case v
when Fixnum, Symbol; next
end
p = path+p
if done[v.object_id]
puts "loop #{p} -> #{done[v.object_id]}" if $VERBOSE
else
done[v.object_id] = p
v.scan_for(targets, p, done)
end
}
end
end

View File

@ -1,31 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
#
# here we build a simple a.out executable
#
require 'metasm'
Metasm::AOut.assemble(Metasm::Ia32.new, <<EOS).encode_file('m-a.out')
.text
.entrypoint
mov eax, 4
mov ebx, 1
.data
str db "kikoo\\n"
strend:
.text
mov ecx, str
mov edx, strend - str
int 80h // linux sys_write
mov eax, 1
mov ebx, 42
int 80h // linux sys_exit
ret
EOS

View File

@ -1,49 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
require 'metasm'
require 'metasm-shell'
# padding
edata = <<EOS.encode_edata
inc ebx
jmp toto
pre_pad:
.pad 90h ; this statement will be replaced by the right number of 0x90 to honor the next .offset directive
post_pad:
toto:
.offset 24 + ((3-12) >> 8) ; we are now at 24 bytes from the beginning of the shellcode (inc ebx)
; .offset accepts an arbitrary expression
mov eax, [ebx + ((kikoo<<1) - 4*lol)] ; all immediate value can be an arbitrary arithmetic/logic expression
.padto toto+38, db 3 dup(0b0110_0110) ; fill space with the specified data structure until 38 bytes after toto (same as .pad + .offset)
inc eax
.align 16, dw foobar + 42
ret
#ifdef BLABLA
you can also use any preprocessor directive (gcc-like syntax)
# elif defined(HOHOHO) && 42
# error 'infamous error message'
#else
#define test(ic) ((ic) - \
4)
#endif
EOS
edata.fixup 'foobar' => 1 # fixup the value of 'foobar'
newdata = 'somestring'
edata.patch 'pre_pad', 'post_pad', newdata # replace the (beginning of the) segment between the labels by a string
#edata.patch 'pre_pad', 'post_pad', 'waaaaaaaaaay tooooooooooooooooooooooooooooooooooooooooo big !!!!' # raise an error
edata.fixup 'kikoo' => 8, 'lol' => 42 # fixup the immediate values
p edata.data # show the resulting raw string

View File

@ -1,21 +0,0 @@
require 'metasm'
src = <<EOS
void foo(int);
void bla()
{
int i = 10;
while (--i)
foo(i);
}
EOS
cp = Metasm::C::Parser.parse src
puts cp, '', ' ----', ''
cp.precompile
puts cp, '', ' ----', ''
cp = Metasm::C::Parser.parse src
cpu = Metasm::Ia32.new
cpu.generate_PIC = false
puts cpu.new_ccompiler(cp).compile

View File

@ -1,55 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
#
# This script takes a C header or a path to a Visual Studio install and
# outputs a ruby source file defining StackOffsets, a hash used by the disassembler
# In verbose mode (ruby -v), instead dumps the parsed header (+ warnings)
#
require 'metasm'
filename = ARGV.shift
abort "usage: #$0 filename" if not File.exist? filename
# path to visual studio install directory
if File.directory? filename
src = <<EOS
// add the path to the visual studio std headers
#ifdef __METASM__
#pragma include_dir #{(filename+'/VC/platformsdk/include').inspect}
#pragma include_dir #{(filename+'/VC/include').inspect}
#pragma prepare_visualstudio
#pragma no_warn_redefinition
#endif
#define WIN32_LEAN_AND_MEAN
#include <windows.h>
EOS
else
# standalone header
src = File.read(filename)
end
include Metasm
cp = Ia32.new.new_cparser.parse(src)
if not $VERBOSE
funcs = cp.toplevel.symbol.values.grep(C::Variable).reject { |v| v.initializer or not v.type.kind_of? C::Function }
puts 'module Metasm'
puts 'StackOffsets = {'
align = proc { |val| (val + cp.typesize[:ptr] - 1) / cp.typesize[:ptr] * cp.typesize[:ptr] }
puts funcs.find_all { |f| f.attributes and f.attributes.include? 'stdcall' and f.type.args }.sort_by { |f| f.name }.map { |f|
"#{f.name.inspect} => #{f.type.args.inject(0) { |sum, arg| sum + align[cp.sizeof(arg)] }}"
}.join(",\n")
puts '}'
puts 'end'
else
# dump the full parsed header
puts cp.lexer.dump_macros(cp.lexer.definition.keys, false), '', '', cp
end
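
The StackOffsets values above are the sum of each stdcall argument's size rounded up to the pointer size. A small sketch of that arithmetic, assuming a 4-byte pointer by default (align_up and stack_bytes are hypothetical helper names):

```ruby
# Rounds val up to the next multiple of ptr_size.
def align_up(val, ptr_size = 4)
  (val + ptr_size - 1) / ptr_size * ptr_size
end

# Bytes of stack taken by arguments of the given sizes.
def stack_bytes(arg_sizes, ptr_size = 4)
  arg_sizes.inject(0) { |sum, size| sum + align_up(size, ptr_size) }
end

puts stack_bytes([4, 1, 8])
```

With sizes 4, 1 and 8 the one-byte argument still occupies a full 4-byte slot, so the total is 4 + 4 + 8.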

View File

@ -1,35 +0,0 @@
#!/usr/bin/env ruby
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
#
# quick demonstration that the disassembler's backtracker works
#
require 'metasm-shell'
puts <<EOS.encode.decode
.base_addr 0
; compute jump target
mov ebx, 0x12345678
mov eax, ((toto + 12) ^ 0x12345678)
xor eax, ebx
sub eax, 12
; jump
call eax
; trap
add eax, 42
; die, you vile reverser !
db 0e9h
; real target
toto:
mov eax, 28h
EOS
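
The obfuscated sequence above always lands on toto, which is what the backtracker has to work out. The same arithmetic can be replayed in plain Ruby; the address chosen for toto below is a hypothetical value used only for illustration.

```ruby
key  = 0x12345678
toto = 0x1b               # hypothetical address of the label

eax = (toto + 12) ^ key   # immediate assembled into the second mov
eax ^= key                # xor eax, ebx
eax -= 12                 # sub eax, 12

puts eax == toto
```

The xor with the same key cancels out and the subtraction undoes the added 12, so the call target does not depend on the key at all.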

View File

@ -1,318 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
#
# this is a little script to navigate in a disassembler dump
#
# copypasted from lindebug.rb
module Ansi
CursHome = "\e[H".freeze
ClearLineAfter = "\e[0K"
ClearLineBefore = "\e[1K"
ClearLine = "\e[2K"
ClearScreen = "\e[2J"
def self.set_cursor_pos(y=1,x=1) "\e[#{y};#{x}H" end
Reset = "\e[m"
Colors = [:black, :red, :green, :yellow, :blue, :magenta, :cyan, :white, :aoeu, :reset]
def self.color(*args)
fg = true
"\e[" << args.map { |a|
case a
when :bold; 2
when :negative; 7
when :normal; 22
when :positive; 27
else
if col = Colors.index(a)
add = (fg ? 30 : 40)
fg = false
col+add
end
end
}.compact.join(';') << 'm'
end
def self.hline(len) "\e(0"<<'q'*len<<"\e(B" end
TIOCGWINSZ = 0x5413
TCGETS = 0x5401
TCSETS = 0x5402
CANON = 2
ECHO = 8
def self.get_terminal_size
s = ''.ljust(8)
$stdin.ioctl(TIOCGWINSZ, s) >= 0 ? s.unpack('SS') : [80, 25]
end
def self.set_term_canon(bool)
tty = ''.ljust(256)
$stdin.ioctl(TCGETS, tty)
if bool
tty[12] &= ~(ECHO|CANON)
else
tty[12] |= ECHO|CANON
end
$stdin.ioctl(TCSETS, tty)
end
ESC_SEQ = {'A' => :up, 'B' => :down, 'C' => :right, 'D' => :left,
'1~' => :home, '2~' => :inser, '3~' => :suppr, '4~' => :end,
'5~' => :pgup, '6~' => :pgdown,
'P' => :f1, 'Q' => :f2, 'R' => :f3, 'S' => :f4,
'15~' => :f5, '17~' => :f6, '18~' => :f7, '19~' => :f8,
'20~' => :f9, '21~' => :f10, '23~' => :f11, '24~' => :f12,
'[A' => :f1, '[B' => :f2, '[C' => :f3, '[D' => :f4, '[E' => :f5,
'H' => :home, 'F' => :end,
}
def self.getkey
c = $stdin.getc
return c if c != ?\e
c = $stdin.getc
if c != ?[ and c != ?O
$stdin.ungetc c
return ?\e
end
seq = ''
loop do
c = $stdin.getc
seq << c
case c; when ?a..?z, ?A..?Z, ?~; break end
end
ESC_SEQ[seq] || seq
end
end
class Viewer
attr_accessor :text, :pos, :x, :y
Color = {
:normal => Ansi.color(:white, :black, :normal),
:comment => Ansi.color(:blue),
:label => Ansi.color(:green),
:hilight => Ansi.color(:yellow),
}
def initialize(text)
text = File.read(text) if File.exist? text rescue nil
@text = text.gsub("\t", " "*8).to_a.map { |l| l.chomp }
@pos = @posh = 0
@x = @y = 0
@mode = :navig
@searchtext = 'x'
@posstack = []
@h, @w = Ansi.get_terminal_size
@h -= 2
@w -= 1
if y = @text.index('entrypoint:')
view(0, y)
end
end
def main_loop
Ansi.set_term_canon(true)
$stdout.write Ansi::ClearScreen
begin
loop do
refresh if not s = IO.select([$stdin], nil, nil, 0)
handle_key(Ansi.getkey)
end
ensure
Ansi.set_term_canon(false)
$stdout.write Ansi.set_cursor_pos(@h+2, 0) + Ansi::ClearLineAfter
end
end
def refresh
case @mode
when :navig
refresh_navig
when :search
refresh_search
end
end
def refresh_navig
str = ''
#str << Ansi::ClearScreen
str << Ansi.set_cursor_pos(0, 0)
hl = readtext
(0..@h).each { |h|
l = @text[@pos+h] || ''
str << outline(l, hl) << Ansi::ClearLineAfter << "\n"
}
str << Ansi.set_cursor_pos(@y+1, @x+1)
$stdout.write str
end
def refresh_search
$stdout.write '' << Ansi.set_cursor_pos(@h+2, 1) << '/' << @searchtext << Ansi::ClearLineAfter
end
def outline(l, hl=nil)
l = l[@posh, @w] || ''
hlr = /\b#{Regexp.escape(hl)}\b/i if hl
case l
when /^\/\//; Color[:comment] + l + Color[:normal]
when /^\S+:$/; Color[:label] + l + Color[:normal]
when /^(.*)(;.*)$/
str = $1
cmt = $2
str.gsub!(hlr, Color[:hilight]+hl+Color[:normal]) if hl
str + Color[:comment] + cmt + Color[:normal]
else
l = l.gsub(hlr, Color[:hilight]+hl+Color[:normal]) if hl
l
end
end
def search_prev
return if @searchtext == ''
y = @pos+@y-1
loop do
y = @text.length-1 if not @text[y] or y < 0
if x = (@text[y] =~ /#@searchtext/i)
view(x, y)
return
end
y -= 1
break if y == @pos+@y
end
end
def search_next
return if @searchtext == ''
y = @pos+@y+1
loop do
y = 0 if not @text[y]
if x = (@text[y] =~ /#@searchtext/i)
view(x, y)
return
end
break if y == @pos+@y or (y >= @text.length and not @text[@pos+@y])
y += 1
end
end
def view(x, y)
@posh, @x = 0, x
if @x > @w
@posh = @x-@w
@x = @w
end
if @pos+@h < y
@y = @h/2-1
@pos = y-@y
elsif @pos > y
@y = 1
@pos = y-@y
else
@y = y-@pos
end
end
def readtext
return if not l = @text[@pos+@y]
x = (l.rindex(/\W/, [@posh+@x-1, 0].max) || -1)+1
t = l[x..-1][/^\w+/]
t if t and @posh+@x < x+t.length
end
def handle_key(k)
case @mode
when :navig
handle_key_navig(k)
when :search
handle_key_search(k)
end
end
def handle_key_search(k)
case k
when ?\n; @mode = :navig ; @posstack << [@posh, @pos, @x, @y] ; search_next
when 0x20..0x7e; @searchtext << k
when :backspace, 0x7f; @searchtext.chop!
end
end
def handle_key_navig(k)
case k
when :f1
if not @posstack.empty?
@posh, @pos, @x, @y = @posstack.pop
end
when ?\n
return if not label = readtext
return if label.empty? or not newy = @text.index(@text.find { |l| l[0, label.length] == label }) or newy == @pos+@y
@posstack << [@posh, @pos, @x, @y]
view(0, newy)
when :up
if @y > 0; @y -= 1
elsif @pos > 0; @pos -= 1
end
when :down
if @y < @h; @y += 1
elsif @pos < text.length-@h; @pos += 1
end
when :home
@x = @posh = 0
when :end
@x = @text[@pos+@y].length
@posh, @x = @x-@w, @w if @x > @w
when :left
x = @text[@pos+@y].rindex(/\W\w/, [@posh+@x-2, 0].max)
x = x ? x+1 : @posh+@x-1
x = @posh+@x-3 if x < @posh+@x-3
x = 0 if x < 0
if x < @posh; @posh, @x = x, 0
else @x = x-@posh
end
#if @x > 0; @x -= 1
#elsif @posh > 0; @posh -= 1
#end
when :right
x = @text[@pos+@y].index(/\W\w/, @posh+@x)
x = x ? x+1 : @posh+@x+1
x = @posh+@x+3 if x > @posh+@x+3
if x > @posh+@w; @posh, @x = x-@w, @w
else
@x = x-@posh
@posh, @x = @x-@w, @w if @x > @w
end
#if @x < @w; @x += 1
#elsif @posh+@w < (@text[@pos, @h].map { |l| l.length }.max); @posh += 1
#end
when :pgdown
if @y < @h/2; @y += @h/2
elsif @pos < @text.length-3*@h/2; @pos += @h/2 ; @y = @h
else @pos = [0, @text.length-@h].max ; @y = @h
end
when :pgup
if @y > @h/2; @y -= @h/2
elsif @pos > @h/2; @pos -= @h/2 ; @y = 0
else @pos = @y = 0
end
when ?q; exit
when ?o; @text.insert(@pos+@y+1, '')
when ?O; @text.insert(@pos+@y, '') ; handle_key_navig(:down)
when :suppr; @text.delete_at(@pos+@y) if @text[@pos+@y] == ''
when ?D; @text.delete_at(@pos+@y)
when ?/
@mode = :search
@searchtext = ''
when ?*
@searchtext = readtext || ''
search_next
when ?n; search_next
when ?N; search_prev
when :f5
ARGV << '--reload'
load $0
end
end
end
if $0 == __FILE__ and not ARGV.delete '--reload'
Viewer.new(ARGF.read).main_loop
end

View File

@ -1,54 +0,0 @@
#!/usr/bin/env ruby
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
#
# this script disassembles an executable (elf/pe) using the GTK front-end
#
require 'metasm'
require 'optparse'
require 'metasm/gui/gtk'
# parse arguments
opts = {}
OptionParser.new { |opt|
opt.banner = 'Usage: disassemble-gtk.rb [options] <executable> [<entrypoints>]'
opt.on('--no-data-trace', 'do not backtrace memory read/write accesses') { opts[:nodatatrace] = true }
opt.on('--debug-backtrace', 'enable backtrace-related debug messages (very verbose)') { opts[:debugbacktrace] = true }
opt.on('--custom <hookfile>', 'loads the ruby script hookfile and invokes "dasm_setup(exe, dasm)"') { |h| opts[:hookfile] = h }
opt.on('-c <header>', '--c-header <header>', 'read C function prototypes (for external library functions)') { |h| opts[:cheader] = h }
opt.on('-v', '--verbose') { $VERBOSE = true }
opt.on('-d', '--debug') { $DEBUG = true }
}.parse!(ARGV)
exename = ARGV.shift
if not exename
w = Metasm::GtkGui::OpenFile.new(nil, 'choose target binary') { |t| exename = t }
w.signal_connect('destroy') { Gtk.main_quit }
Gtk.main
exit if not exename
end
exe = Metasm::AutoExe.orshellcode.decode_file(exename)
dasm = exe.init_disassembler
dasm.parse_c_file opts[:cheader] if opts[:cheader]
dasm.backtrace_maxblocks_data = -1 if opts[:nodatatrace]
dasm.debug_backtrace = true if opts[:debugbacktrace]
if opts[:hookfile]
load opts[:hookfile]
dasm_setup(exe, dasm)
end
ep = ARGV.map { |arg| (?0..?9).include?(arg[0]) ? Integer(arg) : arg }
w = Metasm::GtkGui::MainWindow.new("#{exename} - metasm disassembler").display(dasm, ep)
w.dasm_widget.focus_addr ep.first if not ep.empty?
w.signal_connect('destroy') { Gtk.main_quit }
Gtk.main

View File

@ -1,91 +0,0 @@
#!/usr/bin/env ruby
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
#
# this script disassembles an executable (elf/pe) and dumps the output
# ruby -h for help
#
require 'metasm'
include Metasm
require 'optparse'
# parse arguments
opts = {}
OptionParser.new { |opt|
opt.banner = 'Usage: disassemble.rb [options] <executable> [<entrypoints>]'
opt.on('--no-data', 'do not display data bytes') { opts[:nodata] = true }
opt.on('--no-data-trace', 'do not backtrace memory read/write accesses') { opts[:nodatatrace] = true }
opt.on('--debug-backtrace', 'enable backtrace-related debug messages (very verbose)') { opts[:debugbacktrace] = true }
opt.on('-c <header>', '--c-header <header>', 'read C function prototypes (for external library functions)') { |h| opts[:cheader] = h }
opt.on('-o <outfile>', '--output <outfile>', 'save the assembly listing in the specified file (defaults to stdout)') { |h| opts[:outfile] = h }
opt.on('-s <addrlist>', '--stop <addrlist>', '--stopaddr <addrlist>', 'do not disassemble past these addresses') { |h| opts[:stopaddr] ||= [] ; opts[:stopaddr] |= h.split ',' }
opt.on('--custom <hookfile>', 'loads the ruby script hookfile and invokes "dasm_setup(exe, dasm)"') { |h| opts[:hookfile] = h }
opt.on('--benchmark') { opts[:benchmark] = true }
opt.on('-v', '--verbose') { $VERBOSE = true }
opt.on('-d', '--debug') { $DEBUG = true }
}.parse!(ARGV)
exename = ARGV.shift
t0 = Time.now if opts[:benchmark]
# load the file
exe = AutoExe.orshellcode.decode_file exename
# set options
d = exe.init_disassembler
makeint = proc { |addr|
case addr
when /^[0-9].*h/; addr.to_i(16)
when /^[0-9]/; Integer(addr)
else d.normalize(addr)
end
}
d.parse_c_file opts[:cheader] if opts[:cheader]
d.backtrace_maxblocks_data = -1 if opts[:nodatatrace]
d.debug_backtrace = true if opts[:debugbacktrace]
opts[:stopaddr].to_a.each { |addr| d.decoded[makeint[addr]] = true }
if opts[:hookfile]
load opts[:hookfile]
dasm_setup(exe, d)
end
t1 = Time.now if opts[:benchmark]
# do the work
begin
if ARGV.empty?
exe.disassemble
else
exe.disassemble(*ARGV.map { |addr| makeint[addr] })
end
rescue Interrupt
puts $!, $!.backtrace
end
t2 = Time.now if opts[:benchmark]
# output
if opts[:outfile]
File.open(opts[:outfile], 'w') { |fd|
d.dump(!opts[:nodata]) { |l| fd.puts l }
}
else
d.dump(!opts[:nodata])
end
t3 = Time.now if opts[:benchmark]
todate = proc { |f|
if f > 5400
"#{f.to_i/3600}h#{(f.to_i%3600)/60}mn"
elsif f > 90
"#{f.to_i/60}mn#{f.to_i%60}s"
else
"#{'%.02f' % f}s"
end
}
puts "durations\n load #{todate[t1-t0]}\n dasm #{todate[t2-t1]}\n output #{todate[t3-t2]}\n total #{todate[t3-t0]}" if opts[:benchmark]
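
The makeint proc above accepts addresses in several notations: a number with an 'h' suffix is read as hexadecimal, any other number goes through Integer() (so a 0x prefix also works), and everything else is treated as a label. A standalone sketch of that dispatch, where parse_addr is a hypothetical name and the label case simply returns the string instead of calling the disassembler's normalize:

```ruby
# Converts a command-line address argument to an Integer when it is numeric.
def parse_addr(arg)
  case arg
  when /^[0-9].*h/ then arg.to_i(16)   # e.g. 401000h (to_i stops at the 'h')
  when /^[0-9]/    then Integer(arg)   # e.g. 42 or 0x401000
  else arg                             # label name, left for the caller to resolve
  end
end

puts parse_addr('401000h')
```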

View File

@ -1,90 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
# Original script and idea by Alexandre GAZET
#
# Licence is LGPL, see LICENCE in the top-level directory
#
# this script will load an upx-packed windows executable, find its
# original entrypoint by disassembling the UPX stub, set breakpoint on it,
# run the program, and dump the loaded image to an executable PE.
#
# usage: dump_upx.rb <packed.exe> [<dumped.exe>]
#
require 'metasm'
include Metasm
# TerminateProcess prototype: 2 arguments (int, int) ; return value = int
WinAPI.new_api 'kernel32', 'TerminateProcess', 'II I'
class UPXUnpacker < WinDbg
# loads the file
# find the oep by disassembling
# run it until the oep
# dump the memory image
def initialize(file, dumpfile)
@dumpfile = dumpfile || 'upx-dumped.exe'
pe = PE.decode_file(file)
puts 'disassembling UPX loader...'
@oep = find_oep(pe)
puts "oep found at #{@oep.to_s 16}"
@baseaddr = pe.optheader.image_base
super(file.dup)
puts 'running...'
debugloop
end
# disassemble the upx stub to find a cross-section jump (to the real entrypoint)
def find_oep(pe)
dasm = pe.init_disassembler
dasm.backtrace_maxblocks_data = -1 # speed up dasm
dasm.disassemble 'entrypoint'
jmp = dasm.decoded.find { |addr, di|
di.instruction.opname == 'jmp' and
s = dasm.get_section_at(di.instruction.args[0]) and
s != dasm.get_section_at(addr)
}[1].instruction
dasm.normalize(jmp.args[0])
end
# when the initial thread is created, set a hardware breakpoint to the entrypoint
def handler_newthread(pid, tid, info)
super
puts "oep breakpoint set..."
ctx = get_context(pid, tid)
ctx[:dr0] = @oep
ctx[:dr6] = 0
ctx[:dr7] = 1
WinAPI::DBG_CONTINUE
end
# when our breakpoint hits, dump the file and terminate the process
def handler_exception(pid, tid, info)
if info.code == WinAPI::STATUS_SINGLE_STEP and
get_context(pid, tid)[:eip] == @oep
puts 'oep breakpoint hit !'
puts 'dumping...'
# dump the loaded pe to a genuine PE object
dump = LoadedPE.memdump @mem[pid], @baseaddr, @oep
# the UPX loader unpacks everything in the first section which is marked read-only in the PE header, we must make it writeable
dump.sections.first.characteristics = %w[MEM_READ MEM_WRITE MEM_EXECUTE]
# write the PE file
dump.encode_file @dumpfile
# kill the process
WinAPI.terminateprocess(@hprocess[pid], 0)
puts 'done.'
WinAPI::DBG_CONTINUE
else
super
end
end
end
if __FILE__ == $0
UPXUnpacker.new(ARGV.shift, ARGV.shift)
end

View File

@ -1,46 +0,0 @@
#!/usr/bin/env ruby
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
#
# this script reads a list of elf files, and lists their dependencies recursively
# libraries are searched in LD_LIBRARY_PATH, /usr/lib and /lib
# includes the elf interpreter
# can be useful when chrooting a binary
#
require 'metasm'
paths = ENV['LD_LIBRARY_PATH'].to_s.split(':') + %w[/usr/lib /lib]
todo = ARGV.map { |file| (file[0] == ?/) ? file : "./#{file}" }
done = []
while src = todo.shift
puts src
# could do a simple ELF.decode_file, but this is quicker
elf = Metasm::ELF.decode_file_header(src)
if s = elf.segments.find { |s| s.type == 'INTERP' }
interp = elf.encoded[s.offset, s.filesz].data.chomp("\0")
if not done.include? interp
puts interp
done << interp
end
end
elf.decode_tags
elf.decode_segments_tags_interpret
deps = elf.tag['NEEDED'].to_a - done
done.concat deps
deps.each { |dep|
if not path = paths.find { |path| File.exist? File.join(path, dep) }
$stderr.puts "cannot find #{dep} for #{src}"
else
todo << File.join(path, dep)
end
}
end
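
The loop above is a worklist traversal: files still to inspect wait in todo, and done keeps each dependency from being queued twice. A self-contained sketch of the same pattern, using a hypothetical in-memory dependency map in place of the NEEDED tags read from real ELF files; unlike the script, the sketch also seeds done with the roots so that a dependency cycle cannot list a root twice.

```ruby
# Hypothetical dependency map standing in for the NEEDED tags of ELF files.
DEPS = {
  'app'       => ['libfoo.so', 'libc.so'],
  'libfoo.so' => ['libc.so'],
  'libc.so'   => [],
}

# Returns every file reachable from roots, each listed once, in visit order.
def all_deps(roots, deps)
  todo = roots.dup
  done = roots.dup
  order = []
  while (src = todo.shift)
    order << src
    fresh = deps.fetch(src, []) - done   # only dependencies not seen yet
    done.concat fresh
    todo.concat fresh
  end
  order
end

puts all_deps(['app'], DEPS)
```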

View File

@ -1,29 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
#
# this script takes a list of library filenames as arguments, and outputs each library's
# soname, followed by the list of its exported symbol names, in a format usable
# by the Elf class autoimport functionality (see metasm/os/linux.rb)
#
require 'metasm'
ARGV.each { |f|
e = Metasm::ELF.decode_file(f)
next if not e.tag['SONAME']
puts e.tag['SONAME']
line = ''
e.symbols.find_all { |s|
s.name and s.type == 'FUNC' and s.shndx != 'UNDEF' and s.bind == 'GLOBAL'
}.map { |s| ' ' << s.name }.sort.each { |s|
if line.length + s.length >= 160
puts line
line = ''
end
line << s
}
puts line if not line.empty?
}
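
The output loop above packs the sorted symbol names into lines shorter than 160 characters. A minimal sketch of the same wrapping logic as a standalone function; the name `wrap_names` and the `width` parameter are illustrative assumptions, not part of the script:

```ruby
# Pack sorted names, each prefixed with a space, into lines shorter than
# `width` characters. `wrap_names` is a hypothetical helper name.
def wrap_names(names, width = 160)
  lines = []
  line = String.new
  names.map { |n| ' ' + n }.sort.each { |item|
    if line.length + item.length >= width
      lines << line
      line = String.new
    end
    line << item
  }
  lines << line unless line.empty?
  lines
end

puts wrap_names(%w[printf exit malloc free], 16)
```

As in the script, a line is flushed before the item that would reach the limit is appended, so no line reaches `width` characters unless a single name is that long by itself.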

View File

@ -1,21 +0,0 @@
#!/usr/bin/env ruby
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
require 'metasm'
elf = Metasm::ELF.compile_c(Metasm::Ia32.new, DATA.read)
elf.encode_file('sampelf-c')
__END__
int printf(char *fmt, ...);
void exit(int);
int main(void)
{
printf("Hello, %s !\n", "world");
exit(0x28);
}

View File

@ -1,43 +0,0 @@
#!/usr/bin/env ruby
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
require 'metasm'
elf = Metasm::ELF.assemble(Metasm::Ia32.new, DATA.read)
elf.encode_file('sampelf')
__END__
.interp '/lib/ld-linux.so.2'
.pt_gnu_stack rw
.data
toto db "world", 0
fmt db "Hello, %s !\n", 0
.text
.entrypoint
call metasm_intern_geteip
mov esi, eax
lea eax, [esi-metasm_intern_geteip+toto]
push eax
lea eax, [esi-metasm_intern_geteip+fmt]
push eax
call printf
add esp, 8
push 28h
call _exit
add esp, 4
ret
metasm_intern_geteip:
call 1f
1:
pop eax
add eax, metasm_intern_geteip - 1b
ret

View File

@ -1,88 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
#
# this example illustrates the use of the cparser/preprocessor #factorize functionality:
# it generates code that references the functions imported by a windows executable, and
# factorizes the windows headers through them
# usage: factorize-imports.rb <exe> <path to visual studio installation> [<additional func names>... !<func to exclude>]
#
require 'metasm'
include Metasm
require 'optparse'
opts = {}
OptionParser.new { |opt|
opt.on('--ddk') { opts[:ddk] = true }
opt.on('-o outfile') { |f| opts[:outfile] = f }
opt.on('-I additional_header') { |f| (opts[:add_hdrs] ||= []) << f }
opt.on('--exe executable', '--pe executable') { |f| opts[:pe] = f }
opt.on('--vs path', '--vspath path') { |f| opts[:vspath] = f }
}.parse!(ARGV)
pe = PE.decode_file_header(opts[:pe] || ARGV.shift)
opts[:vspath] ||= ARGV.shift
raise 'need a path to the headers' if not opts[:vspath]
opts[:vspath].chop! if opts[:vspath][-1, 1] == '/'
opts[:vspath][-3..-1] = '' if opts[:vspath][-3..-1] == '/VC'
pe.decode_imports
funcnames = pe.imports.map { |id| id.imports.map { |i| i.name } }.flatten.compact.uniq.sort
ARGV.each { |n|
if n[0] == ?!
funcnames.delete n[1..-1]
else
funcnames |= [n]
end
}
src = <<EOS + opts[:add_hdrs].to_a.map { |h| "#include <#{h}>\n" }.join
#define DDK #{opts[:ddk] ? 1 : 0}
#ifdef __METASM__
#if DDK
#pragma include_dir #{opts[:vspath].inspect}
#else
#pragma include_dir #{(opts[:vspath]+'/VC/platformsdk/include').inspect}
#pragma include_dir #{(opts[:vspath]+'/VC/include').inspect}
#endif
#pragma prepare_visualstudio
#pragma no_warn_redefinition
#define _WIN32_WINNT 0x0600 // vista
#endif
#if DDK
#define NO_INTERLOCKED_INTRINSICS
typedef struct _CONTEXT CONTEXT; // needed by ntddk.h, but this will pollute the factorized output..
typedef CONTEXT *PCONTEXT;
#define dllimport stdcall // wtff
#include <ntddk.h>
#include <stdio.h>
#else
#define WIN32_LEAN_AND_MEAN
#include <windows.h>
#include <winternl.h>
#endif
EOS
parser = Ia32.new.new_cparser
parser.factorize_init
parser.parse src
# delete imports not present in the header files
funcnames.delete_if { |f|
if not parser.toplevel.symbol[f]
puts "// #{f.inspect} is not defined in the headers"
true
end
}
parser.parse 'void *fnptr[] = { ' + funcnames.map { |f| '&'+f }.join(', ') + ' };'
outfd = (opts[:outfile] ? File.open(opts[:outfile], 'w') : $stdout)
outfd.puts parser.factorize_final
outfd.close
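
The extra command-line names are merged into the import list with a small convention: a bare name is added, a name prefixed with `!` is removed. A minimal sketch of that merge step in isolation; `apply_overrides` is a hypothetical name used only for illustration:

```ruby
# Merge command-line overrides into a list of function names:
#   "Name"  adds Name if it is not already present
#   "!Name" removes Name
# `apply_overrides` is a hypothetical helper, not part of the script.
def apply_overrides(funcnames, args)
  names = funcnames.dup
  args.each { |n|
    if n.start_with?('!')
      names.delete n[1..-1]
    else
      names |= [n]
    end
  }
  names
end

p apply_overrides(%w[CreateFileA ReadFile], %w[!ReadFile WriteFile])
```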

View File

@ -1,41 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
#
# this example illustrates the use of the cparser/preprocessor #factorize functionality:
# we write some code using standard headers, and the factorize call on CParser
# gives us back the macro/C definitions that we use in our code, so that we can
# get rid of the header
# Argument: C file to factorize, [path to visual studio installation]
# with a single argument, uses GCC standard headers
#
require 'metasm'
include Metasm
abort 'target needed' if not file = ARGV.shift
visualstudiopath = ARGV.shift
if visualstudiopath
stub = <<EOS
// add the path to the visual studio std headers
#ifdef __METASM__
#pragma include_dir #{(visualstudiopath+'/platformsdk/include').inspect}
#pragma include_dir #{(visualstudiopath+'/include').inspect}
#pragma prepare_visualstudio
#pragma no_warn_redefinition
#endif
EOS
else
stub = <<EOS
#ifdef __METASM__
#pragma prepare_gcc
#endif
EOS
end
# to trace only pp macros (using eg an asm source), use Preprocessor#factorize instead
puts Ia32.new.new_cparser.factorize(stub + File.read(file))

View File

@ -1,787 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
#
# this is a linux/x86 debugger with a console interface
#
require 'metasm'
module Ansi
CursHome = "\e[H".freeze
ClearLineAfter = "\e[0K"
ClearLineBefore = "\e[1K"
ClearLine = "\e[2K"
ClearScreen = "\e[2J"
def self.set_cursor_pos(y=1,x=1) "\e[#{y};#{x}H" end
Reset = "\e[m"
Colors = [:black, :red, :green, :yellow, :blue, :magenta, :cyan, :white, :aoeu, :reset]
def self.color(*args)
fg = true
"\e[" << args.map { |a|
case a
when :bold; 2
when :negative; 7
when :normal; 22
when :positive; 27
else
if col = Colors.index(a)
add = (fg ? 30 : 40)
fg = false
col+add
end
end
}.compact.join(';') << 'm'
end
def self.hline(len) "\e(0"<<'q'*len<<"\e(B" end
TIOCGWINSZ = 0x5413
TCGETS = 0x5401
TCSETS = 0x5402
CANON = 2
ECHO = 8
def self.get_terminal_size
s = ''.ljust(8)
$stdin.ioctl(TIOCGWINSZ, s) >= 0 ? s.unpack('SS') : [80, 25]
end
def self.set_term_canon(bool)
tty = ''.ljust(256)
$stdin.ioctl(TCGETS, tty)
if bool
tty[12] &= ~(ECHO|CANON)
else
tty[12] |= ECHO|CANON
end
$stdin.ioctl(TCSETS, tty)
end
ESC_SEQ = {'A' => :up, 'B' => :down, 'C' => :right, 'D' => :left,
'1~' => :home, '2~' => :inser, '3~' => :suppr, '4~' => :end,
'5~' => :pgup, '6~' => :pgdown,
'P' => :f1, 'Q' => :f2, 'R' => :f3, 'S' => :f4,
'15~' => :f5, '17~' => :f6, '18~' => :f7, '19~' => :f8,
'20~' => :f9, '21~' => :f10, '23~' => :f11, '24~' => :f12,
'[A' => :f1, '[B' => :f2, '[C' => :f3, '[D' => :f4, '[E' => :f5,
'H' => :home, 'F' => :end,
}
def self.getkey
c = $stdin.getc
return c if c != ?\e
c = $stdin.getc
if c != ?[ and c != ?O
$stdin.ungetc c
return ?\e
end
seq = ''
loop do
c = $stdin.getc
seq << c
case c; when ?a..?z, ?A..?Z, ?~; break end
end
ESC_SEQ[seq] || seq
end
end
class Indirect < Metasm::ExpressionType
attr_accessor :ptr, :sz
UNPACK_STR = {1 => 'C', 2 => 'S', 4 => 'L'}
def initialize(ptr, sz) @ptr, @sz = ptr, sz end
def bind(bd)
raw = bd['tracer_memory'][@ptr.bind(bd).reduce, @sz]
Metasm::Expression[raw.unpack(UNPACK_STR[@sz]).first]
end
def externals ; @ptr.externals end
end
class ExprParser < Metasm::Expression
def self.parse_intfloat(lex, tok)
case tok.raw
when 'byte', 'word', 'dword'
nil while ntok = lex.readtok and ntok.type == :space
nil while ntok = lex.readtok and ntok.type == :space if ntok and ntok.raw == 'ptr'
if ntok and ntok.raw == '['
tok.value = Indirect.new(parse(lex), {'byte' => 1, 'word' => 2, 'dword' => 4}[tok.raw])
nil while ntok = lex.readtok and ntok.type == :space
nil while ntok = lex.readtok and ntok.type == :space if ntok and ntok.raw == ']'
lex.unreadtok ntok
end
else super
end
end
def self.parse_value(lex)
nil while tok = lex.readtok and tok.type == :space
lex.unreadtok tok
if tok and tok.type == :punct and tok.raw == '['
tt = tok.dup
tt.type = :string
tt.raw = 'dword'
lex.unreadtok tt
end
super
end
end
class LinDebug
attr_accessor :win_data_height, :win_code_height, :win_prpt_height
def init_screen
Ansi.set_term_canon(true)
@win_data_height = 20
@win_code_height = 20
resize
end
def fini_screen
Ansi.set_term_canon(false)
end
def win_data_start; 2 end
def win_code_start; win_data_start+win_data_height end
def win_prpt_start; win_code_start+win_code_height end
Color = {:changed => Ansi.color(:cyan, :bold), :border => Ansi.color(:green),
:normal => Ansi.color(:white, :black, :normal), :hilight => Ansi.color(:blue, :white, :normal),
:status => Ansi.color(:black, :cyan)}
attr_accessor :dataptr, :codeptr, :rs, :promptlog
def initialize(rs)
@rs = rs
@rs.logger = self
@dataptr = 0
@datafmt = 'db'
@prompthistlen = 20
@prompthistory = []
@promptloglen = 200
@promptlog = []
@promptbuf = ''
@promptpos = 0
@log_off = 0
@focus = :prompt
@command = {}
load_commands
trap('WINCH') { resize }
stack = @rs[@rs.regs_cache['esp'], 0x1000].unpack('L*')
stack.shift # argc
stack.shift until stack.empty? or stack.first == 0 # argv
stack.shift
stack.shift until stack.empty? or stack.first == 0 # envp
stack.shift
stack.shift until stack.empty? or stack.shift == 3 # find PHDR ptr in auxv
if phdr = stack.shift
phdr &= 0xffff_f000
@rs.loadsyms phdr, phdr.to_s(16)
end
end
def main_loop
begin
begin
init_screen
main_loop_inner
rescue Errno::ESRCH
log "target does not exist anymore"
ensure
fini_screen
$stdout.print Ansi.set_cursor_pos(@console_height, 1)
end
rescue
$stdout.puts $!, $!.backtrace
end
$stdout.puts @promptlog.last
end
def update
csy, csx = @console_height-1, @promptpos+2
$stdout.write Ansi.set_cursor_pos(0, 0) + updateregs + updatedata + updatecode + updateprompt + Ansi.set_cursor_pos(csy, csx)
end
def updateregs
text = ''
text << ' '
x = 1
%w[eax ebx ecx edx eip].each { |r|
text << Color[:changed] if @rs.regs_cache[r] != @rs.oldregs[r]
text << r << ?=
text << ('%08X' % @rs.regs_cache[r])
text << Color[:normal] if @rs.regs_cache[r] != @rs.oldregs[r]
text << ' '
x += r.length + 11
}
text << (' '*(@console_width-x)) << "\n" << ' '
x = 1
%w[esi edi ebp esp].each { |r|
text << Color[:changed] if @rs.regs_cache[r] != @rs.oldregs[r]
text << r << ?=
text << ('%08X' % @rs.regs_cache[r])
text << Color[:normal] if @rs.regs_cache[r] != @rs.oldregs[r]
text << ' '
x += r.length + 11
}
Rubstop::EFLAGS.sort.each { |off, flag|
val = @rs.regs_cache['eflags'] & (1<<off)
flag = flag.upcase if val != 0
if val != @rs.oldregs['eflags'] & (1 << off)
text << Color[:changed]
text << flag
text << Color[:normal]
else
text << flag
end
text << ' '
x += 2
}
text << (' '*(@console_width-x)) << "\n"
end
def updatecode
if @codeptr
addr = @codeptr
elsif @rs.oldregs['eip'] and @rs.oldregs['eip'] < @rs.regs_cache['eip'] and @rs.oldregs['eip'] + 8 >= @rs.regs_cache['eip']
addr = @rs.oldregs['eip']
else
addr = @rs.regs_cache['eip']
end
@codeptr = addr
if @rs.findfilemap(addr) == '???'
base = addr & 0xffff_f000
8.times {
sig = @rs[base, 4]
if sig == "\x7fELF"
@rs.loadsyms(base, base.to_s(16))
break
end
base -= 0x1000
}
end
text = ''
text << Color[:border]
title = @rs.findsymbol(addr)
pre = [@console_width-100, 6].max
post = @console_width - (pre + title.length + 2)
text << Ansi.hline(pre) << ' ' << title << ' ' << Ansi.hline(post)
text << Color[:normal]
text << "\n"
cnt = @win_code_height
while (cnt -= 1) > 0
if @rs.symbols[addr]
text << (' ' << @rs.symbols[addr] << ?:) << Ansi::ClearLineAfter << "\n"
break if (cnt -= 1) <= 0
end
text << Color[:hilight] if addr == @rs.regs_cache['eip']
text << ('%04X' % @rs.regs_cache['cs']) << ':'
text << ('%08X' % addr)
di = @rs.mnemonic_di(addr)
di = nil if di and addr < @rs.regs_cache['eip'] and addr+di.bin_length > @rs.regs_cache['eip']
len = (di ? di.bin_length : 1)
text << ' '
text << @rs[addr, [len, 10].min].unpack('C*').map { |c| '%02X' % c }.join.ljust(22)
if di
text <<
if addr == @rs.regs_cache['eip']
"*#{di.instruction}".ljust(@console_width-37)
else
" #{di.instruction}" << Ansi::ClearLineAfter
end
else
text << ' <unk>' << Ansi::ClearLineAfter
end
text << Color[:normal] if addr == @rs.regs_cache['eip']
addr += len
text << "\n"
end
text
end
def updatedata
@dataptr &= 0xffff_ffff
addr = @dataptr
text = ''
text << Color[:border]
title = @rs.findsymbol(addr)
pre = [@console_width-100, 6].max
post = @console_width - (pre + title.length + 2)
text << Ansi.hline(pre) << ' ' << title << ' ' << Ansi.hline(post)
text << Color[:normal]
cnt = @win_data_height
while (cnt -= 1) > 0
raw = @rs[addr, 16]
text << ('%04X' % @rs.regs_cache['ds']) << ':' << ('%08X' % addr) << ' '
case @datafmt
when 'db'; text << raw[0,8].unpack('C*').map { |c| '%02x ' % c }.join << ' ' <<
raw[8,8].unpack('C*').map { |c| '%02x ' % c }.join
when 'dw'; text << raw.unpack('S*').map { |c| '%04x ' % c }.join
when 'dd'; text << raw.unpack('L*').map { |c| '%08x ' % c }.join
end
text << ' ' << raw.unpack('C*').map { |c| (0x20..0x7e).include?(c) ? c : ?. }.pack('C*')
text << Ansi::ClearLineAfter << "\n"
addr += 16
end
text
end
def updateprompt
text = ''
text << Color[:border] << Ansi.hline(@console_width) << Color[:normal] << "\n"
@log_off = @promptlog.length - 2 if @log_off >= @promptlog.length
@log_off = 0 if @log_off < 0
len = @win_prpt_height - 2
len.times { |i|
i += @promptlog.length - @log_off - len
text << ((@promptlog[i] if i >= 0) || '')
text << Ansi::ClearLineAfter << "\n"
}
text << ':' << @promptbuf << Ansi::ClearLineAfter << "\n"
text << Color[:status] << statusline.ljust(@console_width) << Color[:normal]
end
def statusline
' Enter a command (help for help)'
end
def resize
@console_height, @console_width = Ansi.get_terminal_size
@win_data_height = 1 if @win_data_height < 1
@win_code_height = 1 if @win_code_height < 1
if @win_data_height + @win_code_height + 5 > @console_height
@win_data_height = @console_height/2 - 4
@win_code_height = @console_height/2 - 4
end
@win_prpt_height = @console_height-(@win_data_height+@win_code_height+2) - 1
update
end
def log(str)
raise str.inspect if not str.kind_of? ::String
@promptlog << str
@promptlog.shift if @promptlog.length > @promptloglen
end
def puts(*s)
s.each { |s| log s.to_s }
update rescue nil
end
def mem_binding(expr)
b = @rs.regs_cache.dup
ext = expr.externals
(ext - @rs.regs_cache.keys).each { |ex|
if not s = @rs.symbols.index(ex)
log "unknown value #{ex}"
return {}
end
b[ex] = s
if @rs.symbols.values.grep(ex).length > 1
raise "multiple definitions found for #{ex}"
end
}
b['tracer_memory'] = @rs
b
end
def exec_prompt
@log_off = 0
log ':'+@promptbuf
return if @promptbuf == ''
lex = Metasm::Preprocessor.new.feed @promptbuf
@prompthistory << @promptbuf
@prompthistory.shift if @prompthistory.length > @prompthistlen
@promptbuf = ''
@promptpos = @promptbuf.length
argint = proc {
begin
raise if not e = ExprParser.parse(lex)
rescue
log 'syntax error'
return
end
e.bind(mem_binding(e)).reduce
}
cmd = lex.readtok
cmd = cmd.raw if cmd
nil while ntok = lex.readtok and ntok.type == :space
lex.unreadtok ntok
if @command.has_key? cmd
@command[cmd].call(lex, argint)
else
if cmd and (poss = @command.keys.find_all { |c| c[0, cmd.length] == cmd }).length == 1
@command[poss.first].call(lex, argint)
else
log 'unknown command'
end
end
end
def updatecodeptr
@codeptr ||= @rs.regs_cache['eip']
if @codeptr > @rs.regs_cache['eip'] or @codeptr < @rs.regs_cache['eip'] - 6*@win_code_height
@codeptr = @rs.regs_cache['eip']
elsif @codeptr != @rs.regs_cache['eip']
addr = @codeptr
addrs = []
while addr < @rs.regs_cache['eip']
addrs << addr
o = @rs.mnemonic_di(addr).bin_length
addr += ((o == 0) ? 1 : o)
end
if addrs.length > @win_code_height-4
@codeptr = addrs[-(@win_code_height-4)]
end
end
updatedataptr
end
def updatedataptr
@dataptr = @watch.bind(mem_binding(@watch)).reduce if @watch
end
def singlestep
@rs.singlestep
updatecodeptr
end
def stepover
@rs.stepover
updatecodeptr
end
def cont(*a)
@rs.cont(*a)
updatecodeptr
end
def stepout
@rs.stepout
updatecodeptr
end
def syscall
@rs.syscall
updatecodeptr
end
def main_loop_inner
@prompthistory = ['']
@histptr = nil
@running = true
update
while @running
if not IO.select [$stdin], nil, nil, 0
begin
update
rescue Errno::ESRCH
break
end
end
break if handle_keypress(Ansi.getkey)
end
@rs.checkbp
end
def handle_keypress(k)
case k
when 4; log 'exiting'; return true # eof
when ?\e; @focus = :prompt
when :f5; cont
when :f6
syscall
log Rubstop::SYSCALLNR.index(@rs.regs_cache['orig_eax']) || @rs.regs_cache['orig_eax'].to_s
when :f10; stepover
when :f11; singlestep
when :f12; stepout
when :up
case @focus
when :prompt
if not @histptr
@prompthistory << @promptbuf
@histptr = 2
else
@histptr += 1
@histptr = 1 if @histptr > @prompthistory.length
end
@promptbuf = @prompthistory[-@histptr].dup
@promptpos = @promptbuf.length
when :data
@dataptr -= 16
when :code
@codeptr ||= @rs.regs_cache['eip']
@codeptr -= (1..10).find { |off|
di = @rs.mnemonic_di(@codeptr-off)
di.bin_length == off if di
} || 10
end
when :down
case @focus
when :prompt
if not @histptr
@prompthistory << @promptbuf
@histptr = @prompthistory.length
else
@histptr -= 1
@histptr = @prompthistory.length if @histptr < 1
end
@promptbuf = @prompthistory[-@histptr].dup
@promptpos = @promptbuf.length
when :data
@dataptr += 16
when :code
@codeptr ||= @rs.regs_cache['eip']
di = @rs.mnemonic_di(@codeptr)
@codeptr += (di ? (di.bin_length || 1) : 1)
end
when :left; @promptpos -= 1 if @promptpos > 0
when :right; @promptpos += 1 if @promptpos < @promptbuf.length
when :home; @promptpos = 0
when :end; @promptpos = @promptbuf.length
when :backspace, 0x7f; @promptbuf[@promptpos-=1, 1] = '' if @promptpos > 0
when :suppr; @promptbuf[@promptpos, 1] = '' if @promptpos < @promptbuf.length
when :pgup
case @focus
when :prompt; @log_off += @win_prpt_height-3
when :data; @dataptr -= 16*(@win_data_height-1)
when :code
@codeptr ||= @rs.regs_cache['eip']
(@win_code_height-1).times {
@codeptr -= (1..10).find { |off|
di = @rs.mnemonic_di(@codeptr-off)
di.bin_length == off if di
} || 10
}
end
when :pgdown
case @focus
when :prompt; @log_off -= @win_prpt_height-3
when :data; @dataptr += 16*(@win_data_height-1)
when :code
@codeptr ||= @rs.regs_cache['eip']
(@win_code_height-1).times { @codeptr += (((o = @rs.mnemonic_di(@codeptr).bin_length) == 0) ? 1 : o) }
end
when ?\t
if not @promptbuf[0, @promptpos].include? ' '
poss = @command.keys.find_all { |c| c[0, @promptpos] == @promptbuf[0, @promptpos] }
if poss.length > 1
log poss.sort.join(' ')
elsif poss.length == 1
@promptbuf[0, @promptpos] = poss.first + ' '
@promptpos = poss.first.length+1
end
end
when ?\n; @histptr = nil ; exec_prompt rescue log "error: #$!"
when 0x20..0x7e
@promptbuf[@promptpos, 0] = k.chr
@promptpos += 1
else log "unknown key pressed #{k.inspect}"
end
nil
end
def load_commands
ntok = nil
@command['kill'] = proc { |lex, int|
@rs.kill
@running = false
log 'killed'
}
@command['quit'] = @command['detach'] = @command['exit'] = proc { |lex, int|
@rs.detach
@running = false
}
@command['closeui'] = proc { |lex, int|
@rs.logger = nil
@running = false
}
@command['bpx'] = proc { |lex, int|
addr = int[]
@rs.bpx addr
}
@command['bphw'] = proc { |lex, int|
type = lex.readtok.raw
addr = int[]
@rs.set_hwbp type, addr
}
@command['bl'] = proc { |lex, int|
log "bpx at #{@rs.findsymbol(@rs.wantbp)}" if @rs.wantbp.kind_of? ::Integer
@rs.breakpoints.sort.each { |addr, oct|
log "bpx at #{@rs.findsymbol(addr)}"
}
(0..3).each { |dr|
if @rs.regs_cache['dr7'] & (1 << (2*dr)) != 0
log "bphw #{{0=>'x', 1=>'w', 2=>'?', 3=>'r'}[(@rs.regs_cache['dr7'] >> (16+4*dr)) & 3]} at #{@rs.findsymbol(@rs.regs_cache["dr#{dr}"])}"
end
}
}
@command['bc'] = proc { |lex, int|
@rs.wantbp = nil if @rs.wantbp == @rs.regs_cache['eip']
@rs.breakpoints.each { |addr, oct| @rs[addr] = oct }
@rs.breakpoints.clear
if @rs.regs_cache['dr7'] & 0xff != 0
@rs.dr7 = 0
@rs.readregs
end
}
@command['bt'] = proc { |lex, int| @rs.backtrace.each { |t| puts t } }
@command['d'] = proc { |lex, int| @dataptr = int[] || return }
@command['db'] = proc { |lex, int| @datafmt = 'db' ; @dataptr = int[] || return }
@command['dw'] = proc { |lex, int| @datafmt = 'dw' ; @dataptr = int[] || return }
@command['dd'] = proc { |lex, int| @datafmt = 'dd' ; @dataptr = int[] || return }
@command['r'] = proc { |lex, int|
r = lex.readtok.raw
nil while ntok = lex.readtok and ntok.type == :space
if r == 'fl'
flag = ntok.raw
if i = Rubstop::EFLAGS.index(flag)
@rs.eflags ^= 1 << i
@rs.readregs
else
log "bad flag #{flag}"
end
elsif not @rs.regs_cache[r]
log "bad reg #{r}"
elsif ntok
lex.unreadtok ntok
newval = int[]
if newval and newval.kind_of? ::Integer
@rs.send r+'=', newval
@rs.readregs
end
else
log "#{r} = #{@rs.regs_cache[r]}"
end
}
@command['run'] = @command['cont'] = proc { |lex, int|
if tok = lex.readtok
lex.unreadtok tok
cont int[]
else cont
end
}
@command['syscall'] = proc { |lex, int| syscall }
@command['singlestep'] = proc { |lex, int| singlestep }
@command['stepover'] = proc { |lex, int| stepover }
@command['stepout'] = proc { |lex, int| stepout }
@command['g'] = proc { |lex, int| @rs.bpx int[], true ; cont }
@command['u'] = proc { |lex, int| @codeptr = int[] || break }
@command['has_pax'] = proc { |lex, int|
if tok = lex.readtok
lex.unreadtok tok
@rs.has_pax = (int[] != 0)
else @rs.has_pax = !@rs.has_pax
end
log "has_pax now #{@rs.has_pax}"
}
@command['loadsyms'] = proc { |lex, int| @rs.loadallsyms }
@command['scansyms'] = proc { |lex, int| @rs.scansyms }
@command['sym'] = proc { |lex, int|
sym = ''
sym << ntok.raw while ntok = lex.readtok
s = @rs.symbols.values.grep(/#{sym}/)
if s.empty?
log "unknown symbol #{sym}"
else
s = @rs.symbols.keys.find_all { |k| s.include? @rs.symbols[k] }
s.sort.each { |s| log "#{'%08x' % s} #{@rs.symbols_len[s].to_s.ljust 6} #{@rs.findsymbol(s)}" }
end
}
@command['delsym'] = proc { |lex, int|
addr = int[]
log "deleted #{@rs.symbols.delete addr}"
@rs.symbols_len.delete addr
}
@command['addsym'] = proc { |lex, int|
name = lex.readtok.raw
addr = int[]
if t = lex.readtok
lex.unreadtok t
@rs.symbols_len[addr] = int[]
else
@rs.symbols_len[addr] = 1
end
@rs.symbols[addr] = name
}
@command['help'] = proc { |lex, int|
log 'commands: (addr/values are things like dword ptr [ebp+(4*byte [eax])] ), type <tab> to see all commands'
log ' bpx <addr>'
log ' bphw [r|w|x] <addr>: debug register breakpoint'
log ' bl: list breakpoints'
log ' bc: clear breakpoints'
log ' cont [<signr>]: continue the target sending a signal'
log ' d/db/dw/dd [<addr>]: change data type/address'
log ' g <addr>: set a bp at <addr> and run'
log ' has_pax [0|1]: set has_pax flag (hwbp+0x60000000 instead of bpx)'
log ' r <reg> [<value>]: show/change register'
log ' r fl <flag>: toggle eflags bit'
log ' loadsyms: load symbol information from mapped files (from /proc and disk)'
log ' scansyms: scan memory for ELF headers'
log ' sym <symbol regex>: show symbol information'
log ' addsym <name> <addr> [<size>]'
log ' delsym <addr>'
log ' u <addr>: disassemble addr'
log ' reload: reload lindebug source'
log ' ruby <ruby code>: instance_evals ruby code in current instance'
log ' closeui: detach from the underlying RubStop'
log 'keys:'
log ' F5: continue'
log ' F6: syscall'
log ' F10: step over'
log ' F11: single step'
log ' F12: step out (til next ret)'
log ' pgup/pgdown: scroll the focused window (log, data or code)'
}
@command['reload'] = proc { |lex, int| load $0 ; load_commands }
@command['ruby'] = proc { |lex, int|
str = ''
str << ntok.raw while ntok = lex.readtok
instance_eval str
}
@command['maps'] = proc { |lex, int|
@rs.filemap.sort_by { |f, (b, e)| b }.each { |f, (b, e)|
log "#{f.ljust 20} #{'%08x' % b} - #{'%08x' % e}"
}
}
@command['resize'] = proc { |lex, int| resize }
@command['watch'] = proc { |lex, int| @watch = ExprParser.parse(lex) ; updatedataptr }
@command['wd'] = proc { |lex, int|
@focus = :data
if tok = lex.readtok
lex.unreadtok tok
@win_data_height = int[] || return
resize
end
}
@command['wc'] = proc { |lex, int|
@focus = :code
if tok = lex.readtok
lex.unreadtok tok
@win_code_height = int[] || return
resize
end
}
@command['wp'] = proc { |lex, int| @focus = :prompt }
@command['?'] = proc { |lex, int|
val = int[]
log "#{val} 0x#{val.to_s(16)} #{[val].pack('L').inspect}"
}
@command['.'] = proc { |lex, int| @codeptr = nil }
end
end
if $0 == __FILE__
begin
require 'samples/rubstop'
rescue LoadError
if not defined? Rubstop
$: << File.dirname(__FILE__)
require 'rubstop'
$:.pop
end
end
LinDebug.new(Rubstop.new(ARGV.join(' '))).main_loop
end
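
`LinDebug#initialize` locates the program headers by walking the initial process stack: argc, the argv pointers up to a null word, the envp pointers up to a null word, then the auxiliary vector as (type, value) pairs. A minimal sketch of that walk on a plain array of 32-bit words, assuming the standard Linux value AT_PHDR = 3; `find_phdr` and the sample addresses are hypothetical:

```ruby
AT_NULL = 0
AT_PHDR = 3   # auxv entry holding the address of the program headers

# `words` is the initial stack as an array of 32-bit integers, as returned
# by unpack('L*') in the debugger. Returns the AT_PHDR value, or nil.
def find_phdr(words)
  stack = words.dup
  stack.shift                                           # argc
  stack.shift until stack.empty? || stack.first == 0    # argv pointers
  stack.shift                                           # argv terminator
  stack.shift until stack.empty? || stack.first == 0    # envp pointers
  stack.shift                                           # envp terminator
  until stack.empty?
    type, value = stack.shift(2)                        # one auxv pair
    return value if type == AT_PHDR
    break if type == AT_NULL
  end
  nil
end

# made-up stack: argc=2, two argv pointers, one envp pointer, two auxv pairs
words = [2, 0xbf000100, 0xbf000110, 0, 0xbf000120, 0,
         33, 0xffffe000, AT_PHDR, 0x08048034, AT_NULL, 0]
phdr = find_phdr(words)
puts '%08x' % (phdr & 0xffff_f000)   # page holding the ELF header
```

Masking the result with `0xffff_f000` mirrors the debugger, which treats the page containing the program headers as the load address to pass to `loadsyms`.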

View File

@ -1,67 +0,0 @@
#!/usr/bin/ruby
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2008 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
#
# this example illustrates the use of the PTrace32 class to hijack a syscall in a running process
# the next syscall made is patched to run the syscall with the arguments of our choice, then
# run the original intended syscall
# Works on linux/x86
#
require 'metasm'
class SyscallHooker < Metasm::PTrace32
CTX = ['EBX', 'ECX', 'EDX', 'ESI', 'EDI', 'EAX', 'ESP', 'EBP', 'EIP', 'ORIG_EAX']
def inject(sysnr, *args)
sysnr = SYSCALLNR[sysnr] || sysnr
syscall
puts '[*] waiting syscall'
Process.waitpid(@pid)
savedctx = CTX.inject({}) { |ctx, reg| ctx.update reg => peekusr(REGS_I386[reg]) }
if readmem((savedctx['EIP'] - 2) & 0xffff_ffff, 2) != "\xcd\x80"
puts 'no int 80h seen, cannot replay orig syscall, aborting'
elsif args.length > 5
puts 'too many arguments, unsupported, aborting'
else
puts "[*] hooking #{SYSCALLNR.index(savedctx['ORIG_EAX'])}"
# stack pointer to store buffers to
esp_ptr = savedctx['ESP']
args.zip(CTX).map { |arg, reg|
# set syscall args, put buffers on the stack as needed
if arg.kind_of? String
esp_ptr -= arg.length
esp_ptr &= 0xffff_fff0
writemem(esp_ptr, arg)
arg = [esp_ptr].pack('L').unpack('l').first
end
pokeusr(REGS_I386[reg], arg)
}
# patch syscall number
pokeusr(REGS_I386['ORIG_EAX'], sysnr)
# run hooked syscall
syscall
Process.waitpid(@pid)
puts "[*] retval: #{'%X' % peekusr(REGS_I386['EAX'])}"
# restore eax & eip to run the orig syscall
savedctx['EIP'] -= 2
savedctx['EAX'] = savedctx['ORIG_EAX']
savedctx.each { |reg, val| pokeusr(REGS_I386[reg], val) }
end
cont
end
end
if $0 == __FILE__
SyscallHooker.new(ARGV.shift.to_i).inject('write', 2, "testic\n", 7)
end
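
When an argument is a string, `inject` reserves room for it below the saved stack pointer, aligns the address down to 16 bytes, and converts the address to a signed 32-bit value before handing it to `pokeusr`. A minimal sketch of that arithmetic alone, with no ptrace involved; `place_buffer` and the sample stack pointer are hypothetical:

```ruby
# Reserve room for `data` below `esp`, aligned down to 16 bytes.
# Returns [address, address as signed 32-bit integer].
# `place_buffer` is a hypothetical helper used only for illustration.
def place_buffer(esp, data)
  esp -= data.length
  esp &= 0xffff_fff0
  signed = [esp].pack('L').unpack('l').first   # reinterpret as signed 32-bit
  [esp, signed]
end

addr, signed = place_buffer(0xbffff000, "testic\n")
puts '%08x %d' % [addr, signed]
```

The pack/unpack round trip keeps the same 32-bit pattern and only changes how it is interpreted, which matters for addresses above `0x7fff_ffff`.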

View File

@ -1,69 +0,0 @@
#!/usr/bin/env ruby
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
#
# in this file, we open an existing PE, add some code in a new section and
# patch the entrypoint so that we are executed at program start
#
require 'metasm'
require 'metasm-shell'
# read original file
raise 'need a target filename' if not target = ARGV.shift
pe_orig = Metasm::PE.decode_file(target)
pe = pe_orig.mini_copy
pe.mz.encoded = pe_orig.encoded[0, pe_orig.coff_offset-4]
pe.mz.encoded.export = pe_orig.encoded[0, 512].export.dup
pe.header.time = pe_orig.header.time
has_mb = pe.imports.find { |id| id.imports.find { |i| i.name == 'MessageBoxA' } } ? 1 : 0
# hook code to run on start
newcode = <<EOS.encode_edata
hook_entrypoint:
pushad
#if ! #{has_mb}
push hook_libname
call [iat_LoadLibraryA]
push hook_funcname
push eax
call [iat_GetProcAddress]
#else
mov eax, [iat_MessageBoxA]
#endif
push 0
push hook_title
push hook_msg
push 0
call eax
popad
jmp entrypoint
.align 4
hook_msg db '(c) David Hasselhoff', 0
hook_title db 'Hooked on a feeling', 0
#if ! #{has_mb}
hook_libname db 'user32', 0
hook_funcname db 'MessageBoxA', 0
#endif
EOS
# append a new section holding the hook code
s = Metasm::PE::Section.new
s.name = '.hook'
s.encoded = newcode
s.characteristics = %w[MEM_READ MEM_WRITE MEM_EXECUTE]
s.encoded.fixup!('entrypoint' => pe.optheader.image_base + pe.optheader.entrypoint) # tell the original entrypoint address to our hook
pe.sections << s
# patch entrypoint
pe.optheader.entrypoint = 'hook_entrypoint'
# save
pe.encode_file(target.sub(/\.exe$/i, '-patch.exe'))

View File

@ -1,203 +0,0 @@
#!/usr/bin/env ruby
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
#
# this sample shows the compilation of a slightly more complex program
# it displays in a messagebox the result of CPUID
#
require 'metasm'
pe = Metasm::PE.assemble Metasm::Ia32.new, <<EOS
.text
m_cpuid macro nr
xor ebx, ebx
and ecx, ebx
and edx, ebx
mov eax, nr
cpuid
endm
.entrypoint
push ebx push ecx push edx
m_cpuid(0)
mov [cpuname], ebx
mov [cpuname+4], edx
mov [cpuname+8], ecx
and byte ptr [cpuname+12], 0
m_cpuid(0x8000_0000)
and eax, 0x8000_0000
jz extended_unsupported
m_str_cpuid macro nr
m_cpuid(0x8000_0002 + nr)
mov [cpubrand + 16*nr + 0], eax
mov [cpubrand + 16*nr + 4], ebx
mov [cpubrand + 16*nr + 8], ecx
mov [cpubrand + 16*nr + 12], edx
endm
m_str_cpuid(0)
m_str_cpuid(1)
m_str_cpuid(2)
extended_unsupported:
and byte ptr[cpubrand+48], 0
push cpubrand
push cpuname
push format
push buffer
call wsprintf
add esp, 4*4
push 0
push title
push buffer
push 0
call messagebox
pop edx pop ecx pop ebx
xor eax, eax
ret
.import user32 MessageBoxA messagebox
.import user32 wsprintfA wsprintf
#define PE_HOOK_TARGET
#ifdef PE_HOOK_TARGET
; import these to be a good target for pe-hook.rb
.import kernel32 LoadLibraryA
.import kernel32 GetProcAddress
#endif
.data
format db 'CPU: %s\\nBrandstring: %s', 0
title db 'cpuid', 0
.bss
buffer db 1025 dup(?)
.align 4
cpuname db 3*4+1 dup(?)
.align 4
cpubrand db 3*4*4+1 dup(?)
EOS
pe.encode_file('metasm-cpuid.exe')
__END__
// original C code (more complete)
#include <unistd.h>
#include <stdio.h>
static char *featureinfo[32] = {
"fpu", "vme", "de", "pse", "tsc", "msr", "pae", "mce", "cx8",
"apic", "unk10", "sep", "mtrr", "pge", "mca", "cmov", "pat",
"pse36", "psn", "clfsh", "unk20", "ds", "acpi", "mmx",
"fxsr", "sse", "sse2", "ss", "htt", "tm", "unk30", "pbe"
}, *extendinfo[32] = {
"sse3", "unk1", "unk2", "monitor", "ds-cpl", "unk5-vt", "unk6", "est",
"tm2", "unk9", "cnxt-id", "unk11", "unk12", "cmpxchg16b", "unk14", "unk15",
"unk16", "unk17", "unk18", "unk19", "unk20", "unk21", "unk22", "unk23",
"unk24", "unk25", "unk26", "unk27", "unk28", "unk29", "unk30", "unk31"
};
#define cpuid(id) __asm__( "cpuid" : "=a"(eax), "=b"(ebx), "=c"(ecx), "=d"(edx) : "a"(id), "b"(0), "c"(0), "d"(0))
#define b(val, base, end) ((val << (31-end)) >> (31-end+base))
int main(void)
{
unsigned long eax, ebx, ecx, edx;
unsigned long i, max;
int support_extended;
printf("%8s - %8s %8s %8s %8s\n", "query", "eax", "ebx", "ecx", "edx");
max = 0;
for (i=0 ; i<=max ; i++) {
cpuid(i);
if (!i)
max = eax;
printf("%.8lX - %.8lX %.8lX %.8lX %.8lX\n", i, eax, ebx, ecx, edx);
}
printf("\n");
max = 0x80000000;
for (i=0x80000000 ; i<=max ; i++) {
cpuid(i);
if (!(i << 1)) {
max = eax;
support_extended = eax >> 31;
}
printf("%.8lX - %.8lX %.8lX %.8lX %.8lX\n", i, eax, ebx, ecx, edx);
}
printf("\n");
cpuid(0);
printf("identification: \"%.4s%.4s%.4s\"\n", (char *)&ebx, (char *)&edx, (char *)&ecx);
printf("cpu information:\n");
cpuid(1);
printf(" family %ld model %ld stepping %ld efamily %ld emodel %ld\n",
b(eax, 8, 11), b(eax, 4, 7), b(eax, 0, 3), b(eax, 20, 27), b(eax, 16, 19));
printf(" brand %ld cflush sz %ld*8 nproc %ld apicid %ld\n",
b(ebx, 0, 7), b(ebx, 8, 15), b(ebx, 16, 23), b(ebx, 24, 31));
printf(" feature information:");
for (i=0 ; i<32 ; i++)
if (edx & (1 << i))
printf(" %s", featureinfo[i]);
printf("\n extended information:");
for (i=0 ; i<32 ; i++)
if (ecx & (1 << i))
printf(" %s", extendinfo[i]);
printf("\n");
if (!support_extended)
return 0;
printf("extended cpuid:\n");
cpuid(0x80000001);
printf(" %.8lX %.8lX %.8lX %.8lX + ", eax, ebx, ecx & ~1, edx & ~0x00800102);
if (ecx & 1)
printf(" lahf64");
if (edx & (1 << 11))
printf(" syscall64");
if (edx & (1 << 20))
printf(" nx");
if (edx & (1 << 29))
printf(" em64t");
char brandstring[48];
unsigned long *p = (unsigned long*)brandstring;
cpuid(0x80000002);
*p++ = eax;
*p++ = ebx;
*p++ = ecx;
*p++ = edx;
cpuid(0x80000003);
*p++ = eax;
*p++ = ebx;
*p++ = ecx;
*p++ = edx;
cpuid(0x80000004);
*p++ = eax;
*p++ = ebx;
*p++ = ecx;
*p++ = edx;
printf("\n brandstring: \"%.48s\"\n", brandstring);
return 0;
}
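
Note on the `b(val, base, end)` macro above: it extracts an inclusive bit range from a 32-bit register value. A minimal Ruby sketch of the same extraction follows; the helper name `bits` and the sample `eax` value are assumptions made for illustration, not part of the original sample.

```ruby
# Extract bits base..last (inclusive, bit 0 = least significant) from a
# 32-bit value, mirroring what the sample's b(val, base, end) macro computes.
def bits(val, base, last)
  width = last - base + 1
  (val >> base) & ((1 << width) - 1)
end

# Hypothetical cpuid(1) eax value, chosen only for illustration.
eax = 0x000006FB

fields = {
  'family'   => bits(eax, 8, 11),
  'model'    => bits(eax, 4, 7),
  'stepping' => bits(eax, 0, 3),
  'efamily'  => bits(eax, 20, 27),
  'emodel'   => bits(eax, 16, 19)
}
puts fields.map { |name, v| "#{name} #{v}" }.join(' ')
```

Masking after the shift avoids depending on the register width, which the shift-left/shift-right form of the macro does.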

View File

@ -1,34 +0,0 @@
#!/usr/bin/env ruby
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
#
# here we assemble a dummy MIPS PE
# TODO autodetect header.machine from cpu, find something to put in
# the MZ header, make a real mips sample program
#
require 'metasm'
cpu = Metasm::MIPS.new(:little)
prog = Metasm::PE.assemble(cpu, <<EOS)
.text
.entrypoint
lui r4, 0x42
jal toto
add r4, r1, r2
jr r31
nop
toto:
jr r31
;ldc1 fp12, 28(r4)
nop
.import 'foobar' 'baz'
EOS
prog.header.machine='R4000'
data = prog.encode_file 'mipspe.exe'

View File

@ -1,78 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
#
# here we will build an executable file that will shut down the machine
# when run
# the header part comes from the factorize sample script
#
require 'metasm'
cpu = Metasm::Ia32.new
cpu.generate_PIC = false
Metasm::PE.compile_c(cpu, DATA.read + <<EOS).encode_file('metasm-shutdown.exe')
int main(void) {
static HANDLE htok;
static TOKEN_PRIVILEGES tokpriv;
OpenProcessToken(GetCurrentProcess(), TOKEN_ADJUST_PRIVILEGES | TOKEN_QUERY, &htok);
LookupPrivilegeValue(NULL, SE_SHUTDOWN_NAME, &tokpriv.Privileges[0].Luid);
tokpriv.PrivilegeCount = 1U;
tokpriv.Privileges[0].Attributes = SE_PRIVILEGE_ENABLED;
AdjustTokenPrivileges(htok, 0, &tokpriv, 0U, NULL, NULL);
ExitWindowsEx(EWX_SHUTDOWN | EWX_FORCE, SHTDN_REASON_MAJOR_OPERATINGSYSTEM | SHTDN_REASON_MINOR_UPGRADE | SHTDN_REASON_FLAG_PLANNED);
return 0;
}
EOS
__END__
#define EWX_FORCE 0x00000004U
#define EWX_SHUTDOWN 0x00000001U
#define LookupPrivilegeValue LookupPrivilegeValueA
#define NULL ((void *)0)
#define SE_PRIVILEGE_ENABLED (0x00000002UL)
#define SHTDN_REASON_FLAG_PLANNED 0x80000000U
#define SHTDN_REASON_MAJOR_OPERATINGSYSTEM 0x00020000U
#define SHTDN_REASON_MINOR_UPGRADE 0x00000003U
#define TOKEN_ADJUST_PRIVILEGES (0x0020U)
#define TOKEN_QUERY (0x0008U)
#define __TEXT(quote) quote
#define TEXT(quote) __TEXT(quote)
#define SE_SHUTDOWN_NAME TEXT("SeShutdownPrivilege")
typedef int BOOL;
typedef char CHAR;
typedef unsigned long DWORD;
typedef void *HANDLE;
typedef long LONG;
typedef unsigned int UINT;
BOOL ExitWindowsEx __attribute__((dllimport)) __attribute__((stdcall))(UINT uFlags, DWORD dwReason);
HANDLE GetCurrentProcess __attribute__((dllimport)) __attribute__((stdcall))(void);
typedef const CHAR *LPCSTR;
typedef DWORD *PDWORD;
typedef HANDLE *PHANDLE;
struct _LUID {
DWORD LowPart;
LONG HighPart;
};
typedef struct _LUID LUID;
BOOL OpenProcessToken __attribute__((dllimport)) __attribute__((stdcall))(HANDLE ProcessHandle, DWORD DesiredAccess, PHANDLE TokenHandle);
typedef struct _LUID *PLUID;
BOOL LookupPrivilegeValueA __attribute__((dllimport)) __attribute__((stdcall))(LPCSTR lpSystemName, LPCSTR lpName, PLUID lpLuid);
struct _LUID_AND_ATTRIBUTES {
LUID Luid;
DWORD Attributes;
};
typedef struct _LUID_AND_ATTRIBUTES LUID_AND_ATTRIBUTES;
struct _TOKEN_PRIVILEGES {
DWORD PrivilegeCount;
LUID_AND_ATTRIBUTES Privileges[1];
};
typedef struct _TOKEN_PRIVILEGES *PTOKEN_PRIVILEGES;
typedef struct _TOKEN_PRIVILEGES TOKEN_PRIVILEGES;
BOOL AdjustTokenPrivileges __attribute__((dllimport)) __attribute__((stdcall))(HANDLE TokenHandle, BOOL DisableAllPrivileges, PTOKEN_PRIVILEGES NewState, DWORD BufferLength, PTOKEN_PRIVILEGES PreviousState, PDWORD ReturnLength);

View File

@ -1,51 +0,0 @@
#!/usr/bin/env ruby
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
#
# in this sample we compile 2 PE files, one executable and one dll
# with the same base address, to check if the base relocation table
# of the dll is correctly encoded
#
require 'metasm'
cpu = Metasm::Ia32.new
pe = Metasm::PE.assemble cpu, <<EOS
.image_base 0x50000
.section '.text' r w x ; allows merging iat/data/etc
.entrypoint
call foobarplt
xor eax, eax
ret
.import 'pe-foolib' foobar foobarplt
EOS
pe.encode_file('pe-testreloc.exe', 'exe')
dll = Metasm::PE.assemble cpu, <<EOS
.image_base 0x50000
.section '.text' r w x
foobar:
push 0
push msg ; use non-position-independent code
push title
push 0
call msgbox
xor eax, eax
ret
.align 4
msg db 'foo', 0
title db 'bar', 0
.import user32 MessageBoxA msgbox
.export foobar
.libname 'pe-foolib'
EOS
dll.encode_file('pe-foolib.dll', 'dll')

View File

@ -1,24 +0,0 @@
#!/usr/bin/env ruby
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
#
# compiles a PE file with the specified resource directory
# TODO build an icon or something
#
require 'metasm'
pe = Metasm::PE.assemble Metasm::Ia32.new, <<EOS
.entrypoint
xor eax, eax
ret
EOS
rsrc = { 1 => { 1 => { 2 => 'xxx' }, 'toto' => { 12 => 'tata' } } }
pe.resource = Metasm::COFF::ResourceDirectory.from_hash rsrc
pe.encode_file('pe-testrsrc.exe')

View File

@ -1,31 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
#
# this script takes a list of dll filenames as arguments, and outputs each lib export
# libname, followed by the list of the exported symbol names, in a format usable
# by the PE class autoimport functionality (see metasm/os/windows.rb)
#
require 'metasm'
ARGV.each { |f|
pe = Metasm::PE.decode_file_header(f)
pe.decode_exports
next if not pe.export or not pe.export.libname
puts pe.export.libname.sub(/\.dll$/i, '')
line = ''
pe.export.exports.each { |e|
next if not e.name
# next if not e.target # allow forwarders ? (may change name)
e = ' ' << e.name
if line.length + e.length > 160
puts line
line = ''
end
line << e
}
puts line if not line.empty?
}
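
Note on the output format: the script above prints the library name, then the export names packed into lines of at most 160 characters, each name preceded by one space. A self-contained sketch of that packing step, without Metasm, might look like this; `wrap_names` and the sample names are hypothetical.

```ruby
# Pack names into lines of at most `limit` characters, each name preceded
# by a single space, the way the export lister builds its output lines.
def wrap_names(names, limit = 160)
  lines = []
  line = String.new
  names.each { |name|
    entry = " #{name}"
    if line.length + entry.length > limit
      # current line is full: emit it and start a new one
      lines << line
      line = String.new
    end
    line << entry
  }
  lines << line if not line.empty?
  lines
end

# Hypothetical library and export names, for illustration only.
puts 'foolib'
puts wrap_names(%w[CreateFoo DeleteFoo FooBar], 20)
```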

View File

@ -1,19 +0,0 @@
#!/usr/bin/env ruby
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
#
# this file takes preprocessor files as arguments
# it preprocesses their content and dumps the result to stdout
# it also dumps all macro definitions
#
require 'metasm/preprocessor'
p = Metasm::Preprocessor.new
p.feed(ARGF.read)
raw = p.dump
puts p.dump_macros(p.definition.keys, false)
puts raw

View File

@ -1,321 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
#
# this example illustrates the use of the PTrace32 class to implement pytstop-like functionality
# Works on linux/x86
#
require 'metasm'
class Rubstop < Metasm::PTrace32
EFLAGS = {0 => 'c', 2 => 'p', 4 => 'a', 6 => 'z', 7 => 's', 9 => 'i', 10 => 'd', 11 => 'o'}
# define accessors for registers
%w[eax ebx ecx edx ebp esp edi esi eip orig_eax eflags dr0 dr1 dr2 dr3 dr6 dr7 cs ds es fs gs].each { |reg|
define_method(reg) { peekusr(REGS_I386[reg.upcase]) & 0xffffffff }
define_method(reg+'=') { |v|
@regs_cache[reg] = v
v = [v].pack('L').unpack('l').first if v >= 0x8000_0000
pokeusr(REGS_I386[reg.upcase], v)
}
}
def cont(signal=0)
@ssdontstopbp = nil
singlestep(true) if @wantbp
super
::Process.waitpid(@pid)
return if child.exited?
@oldregs.update @regs_cache
readregs
checkbp
end
def singlestep(justcheck=false)
super()
::Process.waitpid(@pid)
return if child.exited?
case @wantbp
when ::Integer; bpx @wantbp ; @wantbp = nil
when ::String; self.dr7 |= 1 << (2*@wantbp[2, 1].to_i) ; @wantbp = nil
end
return if justcheck
@oldregs.update @regs_cache
readregs
checkbp
end
def stepover
if curinstr.opcode and curinstr.opcode.name == 'call'
eaddr = @regs_cache['eip'] + curinstr.bin_length
bpx eaddr, true
cont
else
singlestep
end
end
def stepout
# XXX @regs_cache..
stepover until curinstr.opcode.name == 'ret'
singlestep
end
def syscall
@ssdontstopbp = nil
singlestep(true) if @wantbp
super
::Process.waitpid(@pid)
return if child.exited?
@oldregs.update @regs_cache
readregs
checkbp
end
attr_accessor :pgm, :regs_cache, :breakpoints, :singleshot, :wantbp,
:symbols, :symbols_len, :filemap, :has_pax, :oldregs
def initialize(*a)
super
@pgm = Metasm::ExeFormat.new Metasm::Ia32.new
@pgm.encoded = Metasm::EncodedData.new Metasm::LinuxRemoteString.new(@pid)
@pgm.encoded.data.ptrace = self
@regs_cache = {}
@oldregs = {}
readregs
@oldregs.update @regs_cache
@breakpoints = {}
@singleshot = {}
@wantbp = nil
@symbols = {}
@symbols_len = {}
@filemap = {}
@has_pax = false
end
def readregs
%w[eax ebx ecx edx esi edi esp ebp eip orig_eax eflags dr0 dr1 dr2 dr3 dr6 dr7 cs ds].each { |r| @regs_cache[r] = send(r) }
@curinstr = nil if @regs_cache['eip'] != @oldregs['eip']
@pgm.encoded.data.invalidate
end
def curinstr
@curinstr ||= mnemonic_di
end
def child
$?
end
def checkbp
::Process::waitpid(@pid, ::Process::WNOHANG) if not child
return if not child
if not child.stopped?
if child.exited?; log "process exited with status #{child.exitstatus}"
elsif child.signaled?; log "process exited due to signal #{child.termsig} (#{Signal.list.index child.termsig})"
else log "process in unknown status #{child.inspect}"
end
return
elsif child.stopsig != ::Signal.list['TRAP']
log "process stopped due to signal #{child.stopsig} (#{Signal.list.index child.stopsig})"
return # do not check 0xcc at eip-1 ! ( if curinstr.bin_length == 1 )
end
ccaddr = @regs_cache['eip']-1
if @breakpoints[ccaddr] and self[ccaddr] == 0xcc
if @ssdontstopbp != ccaddr
self[ccaddr] = @breakpoints.delete ccaddr
self.eip = ccaddr
@wantbp = ccaddr if not @singleshot.delete ccaddr
@ssdontstopbp = ccaddr
else
@ssdontstopbp = nil
end
elsif @regs_cache['dr6'] & 15 != 0
dr = (0..3).find { |dr| @regs_cache['dr6'] & (1 << dr) != 0 }
@wantbp = "dr#{dr}" if not @singleshot.delete @regs_cache['eip']
self.dr6 = 0
self.dr7 = @regs_cache['dr7'] & (0xffff_ffff ^ (3 << (2*dr)))
readregs
end
end
def bpx(addr, singleshot=false)
@singleshot[addr] = singleshot
return if @breakpoints[addr]
if @has_pax
set_hwbp 'x', addr
else
begin
@breakpoints[addr] = self[addr]
self[addr] = 0xcc
rescue Errno::EIO
log 'i/o error when setting breakpoint, switching to PaX mode'
@has_pax = true
@breakpoints.delete addr
bpx(addr, singleshot)
end
end
end
def mnemonic_di(addr = eip)
@pgm.encoded.ptr = addr
di = @pgm.cpu.decode_instruction(@pgm.encoded, addr)
@curinstr = di if addr == @regs_cache['eip']
di
end
def mnemonic(addr=eip)
mnemonic_di(addr).instruction
end
def regs_dump
[%w[eax ebx ecx edx orig_eax], %w[ebp esp edi esi eip]].map { |l|
l.map { |reg| "#{reg}=#{'%08x' % @regs_cache[reg]}" }.join(' ')
}.join("\n")
end
def findfilemap(s)
@filemap.keys.find { |k| @filemap[k][0] <= s and @filemap[k][1] > s } || '???'
end
def findsymbol(k)
file = findfilemap(k) + '!'
if s = @symbols.keys.find { |s| s <= k and s + @symbols_len[s] > k }
file + @symbols[s] + (s == k ? '' : "+#{(k-s).to_s(16)}")
else
file + ('%08x' % k)
end
end
def set_hwbp(type, addr, len=1)
dr = (0..3).find { |dr| @regs_cache['dr7'] & (1 << (2*dr)) == 0 and @wantbp != "dr#{dr}" }
if not dr
log 'no debug reg available :('
return false
end
@regs_cache['dr7'] &= 0xffff_ffff ^ (0xf << (16+4*dr))
case type
when 'x'; addr += 0x6000_0000 if @has_pax
when 'r'; @regs_cache['dr7'] |= (((len-1)<<2)|3) << (16+4*dr)
when 'w'; @regs_cache['dr7'] |= (((len-1)<<2)|1) << (16+4*dr)
end
send("dr#{dr}=", addr)
self.dr6 = 0
self.dr7 = @regs_cache['dr7'] | (1 << (2*dr))
readregs
true
end
def loadsyms(baseaddr, name)
@loadedsyms ||= {}
return if @loadedsyms[name] or self[baseaddr, 4] != "\x7fELF"
@loadedsyms[name] = true
e = Metasm::LoadedELF.load self[baseaddr, 0x100_0000]
e.load_address = baseaddr
begin
e.decode
#e = Metasm::ELF.decode_file name rescue return # read from disk
rescue
log "failed to load symbols from #{name}: #$!"
($!.backtrace - caller).each { |l| log l.chomp }
@filemap[baseaddr.to_s(16)] = [baseaddr, baseaddr+0x1000]
return
end
if e.tag['SONAME']
name = e.tag['SONAME']
return if name and @loadedsyms[name]
@loadedsyms[name] = true
end
last_s = e.segments.reverse.find { |s| s.type == 'LOAD' }
vlen = last_s.vaddr + last_s.memsz
vlen -= baseaddr if e.header.type == 'EXEC'
@filemap[name] = [baseaddr, baseaddr + vlen]
oldsyms = @symbols.length
e.symbols.each { |s|
next if not s.name or s.shndx == 'UNDEF'
sname = s.name
sname = 'weak_'+sname if s.bind == 'WEAK'
sname = 'local_'+sname if s.bind == 'LOCAL'
v = s.value
v = baseaddr + v if v < baseaddr
@symbols[v] = sname
@symbols_len[v] = s.size
}
if e.header.type == 'EXEC'
@symbols[e.header.entry] = 'entrypoint'
@symbols_len[e.header.entry] = 1
end
log "loaded #{@symbols.length-oldsyms} symbols from #{name} at #{'%08x' % baseaddr}"
end
def loadallsyms
File.read("/proc/#{@pid}/maps").each { |l|
name = l.split[5]
loadsyms l.to_i(16), name if name and name[0] == ?/
}
end
def scansyms
addr = 0
fd = @pgm.encoded.data.readfd
while addr <= 0xffff_f000
addr = 0xc000_0000 if @has_pax and addr == 0x6000_0000
log "scansym: #{'%08x' % addr}" if addr & 0x0fff_ffff == 0
fd.pos = addr
loadsyms(addr, '%08x'%addr) if (fd.read(4) == "\x7fELF" rescue false)
addr += 0x1000
end
end
def backtrace
bt = []
bt << findsymbol(@regs_cache['eip'])
fp = @regs_cache['ebp']
while fp >= @regs_cache['esp'] and fp <= @regs_cache['esp']+0x10000
bt << findsymbol(self[fp+4, 4].unpack('L').first)
fp = self[fp, 4].unpack('L').first
end
bt
end
def [](addr, len=nil)
@pgm.encoded.data[addr, len]
end
def []=(addr, len, str=nil)
@pgm.encoded.data[addr, len] = str
end
attr_accessor :logger
def log(s)
@logger ||= $stdout
@logger.puts s
end
end
if $0 == __FILE__
# start debugging
rs = Rubstop.new(ARGV.shift)
begin
while rs.child.stopped? and rs.child.stopsig == Signal.list['TRAP']
if $VERBOSE
puts "#{'%08x' % rs.eip} #{rs.mnemonic}"
rs.singlestep
else
rs.syscall ; rs.syscall # wait return of syscall
puts "#{rs.orig_eax.to_s.ljust(3)} #{Rubstop::SYSCALLNR.index rs.orig_eax}"
end
end
p rs.child
puts rs.regs_dump
rescue Interrupt
rs.detach rescue nil
puts 'interrupted!'
rescue Errno::ESRCH
end
end
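
Note on `set_hwbp` above: it arms an x86 debug register by rewriting DR7, with the enable bit at `2*dr` and a 4-bit type/length field at `16+4*dr`. The sketch below isolates that arithmetic as a pure function so it can be checked without ptrace; the name `dr7_arm` is an assumption, and the PaX address adjustment of the original is deliberately left out.

```ruby
# Compute a new DR7 value that arms debug register `dr` (0..3), following
# the bit layout used by set_hwbp: enable bit at 2*dr, 4-bit type/length
# field at 16+4*dr.
def dr7_arm(dr7, dr, type, len = 1)
  shift = 16 + 4*dr
  # clear the previous type/length field for this register
  dr7 &= 0xffff_ffff ^ (0xf << shift)
  case type
  when 'r'; dr7 |= (((len-1) << 2) | 3) << shift   # break on read or write
  when 'w'; dr7 |= (((len-1) << 2) | 1) << shift   # break on write only
  end
  # type 'x' (execute) keeps the field at 0
  dr7 | (1 << (2*dr))                              # set the enable bit
end

puts '%08x' % dr7_arm(0, 0, 'w', 4)
puts '%08x' % dr7_arm(0, 1, 'x')
```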

View File

@ -1,54 +0,0 @@
#!/usr/bin/env ruby
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
#
# this script scans directories recursively for ELF files whose PT_GNU_STACK segment is rwe or absent
# usage : scan_pt_gnu_stack.rb <dir> [<dir>]
#
require 'metasm'
def _puts(a)
puts a.to_s.ljust(60)
end
def _printadv(a)
$stderr.print a.to_s.ljust(60)[-60, 60] + "\r"
end
# the recursive scanning procedure
iter = proc { |f|
if File.symlink? f
elsif File.directory? f
# show where we are & recurse
_printadv f
Dir[ File.join(f, '*') ].each { |ff|
iter[ff]
}
else
# interpret any file as an ELF
begin
elf = Metasm::ELF.decode_file_header(f)
next if not elf.segments or elf.header.type == 'REL'
seg = elf.segments.find { |seg| seg.type == 'GNU_STACK' }
if not seg
_puts "PT_GNU_STACK absent : #{f}"
elsif seg.flags.include? 'X'
_puts "PT_GNU_STACK RWE : #{f}"
else
_puts "#{f} : #{seg.inspect}" if $VERBOSE
end
rescue
# the file is not a valid ELF
_puts "E: #{f} #{$!}" if $VERBOSE
end
end
}
# go
ARGV.each { |dir| iter[dir] }
_printadv ''

View File

@ -1,62 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
#
# this script scans a directory for PE files which export a given symbol name (regexp case-insensitive)
# usage : ruby scanpeexports.rb <dir> <pattern>
#
require 'metasm'
if not base = ARGV.shift
puts 'base dir ?'
base = gets.chomp
end
if not match = ARGV.shift
puts 'pattern ?'
match = gets.chomp
puts 'searching...'
end
def _puts(a)
puts a.to_s.ljust(60)
end
def _printadv(a)
$stderr.print a.to_s.ljust(60)[-60, 60] + "\r"
end
# the recursive scanning procedure
iter = proc { |f, match|
if File.directory? f
# show where we are & recurse
_printadv f
Dir[ File.join(f, '*') ].each { |ff|
iter[ff, match]
}
else
# interpret any file as a PE
begin
pe = Metasm::PE.decode_file_header(f)
pe.decode_exports
next if not pe.export
# scan the export directory for the symbol pattern, excluding forwarders
pe.export.exports.each { |exp|
if exp.name =~ /#{match}/i and not exp.forwarder_lib
_puts f + " : " + exp.name
end
}
rescue
# the file is not a valid PE
end
end
}
# go
iter[base, match]
if RUBY_PLATFORM =~ /win32/i
_puts "press [enter] to exit"
gets
end

View File

@ -1,40 +0,0 @@
#!/usr/bin/env ruby
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
#
# in this example we write a shellcode using a C function
#
require 'metasm'
# parse the asm stub that jumps to the C function
sc = Metasm::Shellcode.new(Metasm::Ia32.new)
sc.parse <<EOS
jmp c_func
some_func:
mov eax, 42
ret
EOS
cp = sc.cpu.new_cparser
cp.parse <<EOS
void some_func(void);
/* __declspec(naked) */ void c_func() {
int i;
for (i=0 ; i<10 ; ++i)
some_func();
}
EOS
asm = sc.cpu.new_ccompiler(cp, sc).compile
sc.parse asm
sc.assemble
sc.encode_file 'shellcode.raw'
puts Metasm::Shellcode.load_file('shellcode.raw', Metasm::Ia32.new).disassemble

View File

@ -1,30 +0,0 @@
sys_write equ 4
sys_exit equ 1
stdout equ 1
syscall macro nr
mov eax, nr // the syscall number goes in eax
int 80h
endm
nop nop
call foobar
toto_str db "toto\n"
toto_str_len equ $ - toto_str
foobar:
; setup write arguments
mov ebx, stdout ; fd
call got_eip
got_eip: pop ecx
add ecx, toto_str - got_eip // buf
mov edx, toto_str_len ; buf_len
syscall(sys_write)
/*
; hang forever
jmp $
*/
xor ebx, ebx
syscall(sys_exit)

View File

@ -1,32 +0,0 @@
#!/usr/bin/env ruby
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
#
# a sample application
#
require 'metasm'
pe = Metasm::PE.assemble Metasm::Ia32.new, <<EOS
.entrypoint
push 0
push title
push message
push 0
call messagebox
xor eax, eax
ret
.import 'user32' MessageBoxA messagebox
.data
message db 'kikoo lol', 0
title db 'blaaa', 0
EOS
pe.encode_file 'testpe.exe'

View File

@ -1,45 +0,0 @@
#!/usr/bin/env ruby
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
# usage: test.rb < source.asm
require 'metasm'
dump = ARGV.delete '--dump'
source = ARGF.read
cpu = Metasm::Ia32.new
shellcode = Metasm::Shellcode.assemble(cpu, source).encode_string
shellstring = shellcode.unpack('C*').map { |b| '\\x%02x' % b }.join
if dump
puts shellstring
exit
end
File.open('test-testraw.c', 'w') { |fd|
fd.puts <<EOS
unsigned char sc[] = "#{shellstring}";
int main(void)
{
((void (*)())sc)();
return 42;
}
EOS
}
system 'gcc -W -Wall -o test-testraw test-testraw.c'
system 'chpax -psm test-testraw'
puts "running"
system './test-testraw'
puts "done"
#File.unlink 'test-testraw'
File.unlink 'test-testraw.c'
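
Note on `shellstring` above: the raw shellcode bytes are rendered as `\xNN` escapes so they can be pasted into a C string literal. A minimal sketch of that step on its own, with a hypothetical helper name and a made-up three-byte payload:

```ruby
# Render raw bytes as the body of a C string literal made of \xNN escapes.
def c_escape(bytes)
  bytes.unpack('C*').map { |b| '\\x%02x' % b }.join
end

# Hypothetical 3-byte payload, for illustration only.
raw = "\x90\x90\xc3".b
puts %(unsigned char sc[] = "#{c_escape(raw)}";)
```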

View File

@ -1,172 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
#
# in this example we will patch a process specified on the commandline (pid or part of the image name)
# we will retrieve the user32.dll library mapped, and hook every exported function.
# each hook will redirect the code flow to our shellcode, which will display the hooked function
# name in a messagebox.
# The hook is this time a real hook: we overwrite the first instructions with a jump to our code,
# and run those overwritten instructions again before giving control back to the original function.
#
# usage: ruby w32hook-advance.rb notepad
# use ruby -d to impress your friends :)
#
require 'metasm'
include Metasm
include WinAPI
# open target
WinAPI.get_debug_privilege
if not pr = WinAPI.find_process((Integer(ARGV.first) rescue ARGV.first))
# display list of running processes and exit
puts WinAPI.list_processes.sort_by { |pr| pr.pid }.map { |pr| "#{pr.pid}: #{File.basename(pr.modules.first.path) rescue nil}" }
exit
end
raise 'cannot open target process' if not handle = WinAPI.openprocess(PROCESS_ALL_ACCESS, 0, pr.pid)
# virtual mapping of remote process memory
remote_mem = WindowsRemoteString.new(handle)
# the main shellcode
sc = Shellcode.assemble Ia32.new, <<EOS
main_hook:
pushfd ; save registers
pushad
mov eax, dword ptr [in_hook] ; check if we are in the hook (yay threadsafe)
test eax, eax
jnz main_hook_done
mov dword ptr [in_hook], 1
mov eax, dword ptr [esp+4+4*9] ; get the function name (1st argument)
push 0
push eax
push eax
push 0
call messageboxw
mov dword ptr [in_hook], 0
main_hook_done:
popad
popfd
ret 4
.align 4
in_hook dd 0 ; l33t mutex
EOS
# this is where we store every function hook
hooks = {}
prepare_hook = proc { |mpe, base, export|
hooklabel = sc.new_label('hook')
namelabel = sc.new_label('name')
# this will overwrite the function entrypoint
target = base + export.target
hooks[target] = Shellcode.new(sc.cpu).share_namespace(sc).parse("jmp #{hooklabel}").assemble.encoded
# backup the overwritten instructions
# retrieve instructions until their length is >= our hook length
mpe.encoded.ptr = export.target
sz = 0
overwritten = []
while sz < hooks[target].length
di = sc.cpu.decode_instruction mpe.encoded, target
if not di or not di.opcode or not di.instruction
puts "W: unknown instruction in #{export.name} !"
break
end
overwritten << di.instruction
sz += di.bin_length
end
puts "overwritten at #{export.name}:", overwritten, '' if $DEBUG
resumeaddr = target + sz
# append the call-specific shellcode to the main hook code
sc.cursource << Label.new(hooklabel)
sc.parse <<EOS
push #{namelabel}
call main_hook ; log the call
; rerun the overwritten instructions
#{overwritten.join("\n")}
jmp #{resumeaddr} ; get back to original code flow
EOS
sc.cursource << Label.new(namelabel)
sc.parse "dw #{export.name.inspect}, 0"
}
msgboxw = nil
# decode interesting libraries from address space
pr.modules[1..-1].each { |m|
# search for messageboxw
if m.path =~ /user32/i
mpe = LoadedPE.load remote_mem[m.addr, 0x1000000]
mpe.decode_header
mpe.decode_exports
mpe.export.exports.each { |e| msgboxw = m.addr + mpe.label_rva(e.target) if e.name == 'MessageBoxW' }
end
# prepare hooks
next if m.path !~ /user32/i # filter interesting libraries
puts "handling #{File.basename m.path}" if $VERBOSE
if not mpe
mpe = LoadedPE.load remote_mem[m.addr, 0x1000000]
mpe.decode_header
mpe.decode_exports
end
next if not mpe.export or not mpe.export.exports
# discard exported data
text = mpe.sections.find { |s| s.name == '.text' }
mpe.export.exports.each { |e|
next if not e.target or not e.name
next if e.name =~ /(?:Translate|Get|Dispatch)Message|CallNextHookEx|TranslateAccelerator/
# ensure we have an offset and not a label name
e.target = mpe.label_rva(e.target)
# ensure the exported thing is in the .text section
next if e.target < text.virtaddr or e.target >= text.virtaddr + text.virtsize
# prepare the hook
prepare_hook[mpe, m.addr, e]
}
}
raise 'Did not find MessageBoxW !' if not msgboxw
puts 'linking...'
sc.assemble
puts 'done'
# allocate memory for our code
raise 'remote allocation failed' if not injected_addr = WinAPI.virtualallocex(handle, 0, sc.encoded.length, MEM_COMMIT|MEM_RESERVE, PAGE_EXECUTE_READWRITE)
puts "Injecting hooks at #{'%x' % injected_addr}"
# fixup & inject our code
binding = { 'messageboxw' => msgboxw }
hooks.each { |addr, edata| binding.update edata.binding(addr) }
binding.update sc.encoded.binding(injected_addr)
# fixup
sc.encoded.fixup(binding)
# inject
remote_mem[injected_addr, sc.encoded.data.length] = sc.encoded.data
# now overwrite entry points
hooks.each { |addr, edata|
edata.fixup(binding)
remote_mem[addr, edata.data.length] = edata.data
}
puts 'done'
WinAPI.closehandle(handle)

View File

@ -1,100 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
#
# in this example we will patch a process specified on the commandline (pid or part of image name)
# the IAT entry matching /WriteFile/ will be replaced by a pointer to a malicious code we inject,
# which calls back the original function.
# Our shellcode will display the first bytes of the data to be written, using MessageBoxW (whose
# pointer is also retrieved from the target IAT)
#
# usage: ruby w32hook.rb notepad ; then go in notepad, type some words and save to a file
#
require 'metasm'
include Metasm
include WinAPI
# open target
WinAPI.get_debug_privilege
if not pr = WinAPI.find_process((Integer(ARGV.first) rescue ARGV.first))
# display list of running processes if no target found
puts WinAPI.list_processes.sort_by { |pr| pr.pid }.map { |pr| "#{pr.pid}: #{File.basename(pr.modules.first.path) rescue nil}" }
exit
end
raise 'cannot open target process' if not handle = WinAPI.openprocess(PROCESS_ALL_ACCESS, 0, pr.pid)
# virtual mapping of remote process memory
remote_mem = WindowsRemoteString.new(handle)
# read the target PE structure
pe = LoadedPE.load remote_mem[pr.modules[0].addr, 0x1000000]
pe.decode_header
pe.decode_imports
# find iat entries
target = nil
target_p = nil
msgboxw_p = nil
iat_entry_len = pe.encode_xword(0).length # 64bits portable ! (shellcode probably won't work)
pe.imports.each { |id|
id.imports.each_with_index { |i, idx|
case i.name
when 'MessageBoxW'
msgboxw_p = pr.modules[0].addr + id.iat_p + iat_entry_len * idx
when /WriteFile/
target_p = pr.modules[0].addr + id.iat_p + iat_entry_len * idx
target = id.iat[idx]
end
}
}
raise "iat entries not found" if not target or not msgboxw_p
# here we write our shellcode (no need for position-independent code)
sc = Shellcode.assemble(Ia32.new, <<EOS)
pushad
mov esi, dword ptr [esp+20h+8] ; 2nd arg = buffer
mov edi, message
mov ecx, 19
xor eax, eax
copy_again:
lodsb
stosw
loop copy_again
push 0
push title
push message
push 0
call [msgboxw]
popad
jmp target
.align 4
; strings to display
message dw 20 dup(?)
title dw 'I see what you did there...', 0
EOS
# alloc some space in the remote process to put our shellcode
raise 'remote allocation failed' if not injected = WinAPI.virtualallocex(handle, 0, sc.encoded.length, MEM_COMMIT|MEM_RESERVE, PAGE_EXECUTE_READWRITE)
puts "injected malicious code at %x" % injected
# fixup the shellcode with its known base address, and with the addresses it will need from the IAT
sc.base_addr = injected
sc.encoded.fixup! 'msgboxw' => msgboxw_p, 'target' => target
raw = sc.encode_string
# inject the shellcode
remote_mem[injected, raw.length] = raw
# rewrite iat entry
iat_h = pe.encode_xword(injected).data
remote_mem[target_p, iat_h.length] = iat_h
# done
WinAPI.closehandle(handle)
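
Note on the `lodsb`/`stosw` loop above: it copies 19 bytes of the write buffer into `message`, widening each byte to a 16-bit unit so that MessageBoxW can display it. The Ruby sketch below reproduces only that widening; the helper `widen` is hypothetical, and the explicit trailing NUL unit stands in for the 20th slot of `message`, which the shellcode never writes.

```ruby
# Widen each byte of `buf` to a little-endian 16-bit unit, as the
# lodsb/stosw loop does, then append a terminating NUL unit.
def widen(buf, count = 19)
  buf.b[0, count].unpack('C*').pack('v*') + "\0\0".b
end

wide = widen('hello')
puts wide.bytes.inspect
```

Unlike this sketch, the loop in the shellcode always copies 19 bytes, whatever the length of the data actually written.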

View File

@ -1,36 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
require 'metasm'
require 'metasm-shell'
include Metasm
include WinAPI
# open target
WinAPI.get_debug_privilege
if not pr = WinAPI.find_process((Integer(ARGV.first) rescue ARGV.first))
puts WinAPI.list_processes.sort_by { |pr| pr.pid }.map { |pr| "#{pr.pid}: #{File.basename(pr.modules.first.path) rescue nil}" }
exit
end
# virtual mapping of remote process memory
remote_mem = WindowsRemoteString.open_pid(pr.pid)
# retrieve the pe load address
baseaddr = pr.modules[0].addr
# decode the COFF headers
pe = Metasm::LoadedPE.load remote_mem[baseaddr, 0x100000]
pe.decode_header
# get the entrypoint address
eip = baseaddr + pe.label_rva(pe.optheader.entrypoint)
# use degraded disasm mode: assume all calls will return
String.cpu.make_call_return # String.cpu is the Ia32 cpu set up by metasm-shell
# disassemble & dump opcodes
puts pe.encoded[pe.optheader.entrypoint, 0x100].data.decode(eip)

View File

@ -1,88 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
#
# this is a simple executable tracer for Windows using the
# Metasm windows debug api abstraction
# all callbacks are full ruby, so this is extremely slow !
#
require 'metasm'
require 'metasm-shell'
class Tracer < Metasm::WinDbg
def initialize(*a)
super
@label = {}
@prog = Metasm::ExeFormat.new(Metasm::Ia32.new)
debugloop
puts 'finished'
end
def handler_newprocess(pid, tid, info)
ret = super
# need to call super first
# super calls newthread
hide_debugger(pid, tid, info)
ret
end
def handler_newthread(pid, tid, info)
ret = super
do_singlestep(pid, tid)
ret
end
def handler_exception(pid, tid, info)
do_singlestep(pid, tid) if @hthread[pid] and @hthread[pid][tid]
case info.code
when Metasm::WinAPI::STATUS_SINGLE_STEP
Metasm::WinAPI::DBG_CONTINUE
else super
end
end
def handler_loaddll(pid, tid, info)
# update @label with exported symbols
pe = Metasm::LoadedPE.load(@mem[pid][info.imagebase, 0x1000000])
pe.decode_header
pe.decode_exports
libname = read_str_indirect(pid, info.imagename, info.unicode)
pe.export.exports.each { |e|
next if not r = pe.label_rva(e.target)
@label[info.imagebase + r] = libname + '!' + (e.name || "ord_#{e.ordinal}")
}
super
end
# dumps the opcode at eip, sets the trace flag
def do_singlestep(pid, tid)
ctx = get_context(pid, tid)
eip = ctx[:eip]
if l = @label[eip]
puts l + ':'
end
if $VERBOSE
bin = @mem[pid][eip, 16]
di = @prog.cpu.decode_instruction(Metasm::EncodedData.new(bin), eip)
puts "#{'%08X' % eip} #{di.instruction}"
end
ctx[:eflags] |= 0x100
end
# resets the BeingDebugged field of the PEB (the flag read by IsDebuggerPresent)
def hide_debugger(pid, tid, info)
peb = @mem[pid][info.threadlocalbase + 0x30, 4].unpack('L').first
@mem[pid][peb + 2, 2] = [0].pack('S')
end
end
if $0 == __FILE__
Metasm::WinAPI.get_debug_privilege
Tracer.new ARGV.shift.dup
end

View File

@ -1,7 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
Dir['tests/*.rb'].each { |f| require f }

View File

@ -1,132 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
require 'test/unit'
require 'metasm/exe_format/shellcode'
class TestEncodedData < Test::Unit::TestCase
def compile(src)
p = Metasm::Shellcode.assemble(Metasm::UnknownCPU.new(32, :little), src)
p.encoded
end
def test_basic
e = compile <<EOS
toto db 42
tutu db 48
dd bla
EOS
assert_equal(6, e.virtsize)
assert_equal(2, e.export.keys.length)
assert_equal(0, e.export['toto'])
assert_equal(1, e.reloc.keys.length)
assert_equal('bla', e.reloc[2].target.reduce.rexpr)
end
def test_slice
e = compile <<EOS
db 4 dup(1)
toto:
db 4 dup(2)
db 4 dup(?)
foo:
dd bla
tutu:
EOS
e1 = e[4, 8]
e2 = e[4..11]
e3 = e[4...12]
e4 = e['toto', 8]
e5 = e['toto'...'foo']
assert_equal([e1.data, e1.virtsize], [e2.data, e2.virtsize])
assert_equal([e1.data, e1.virtsize], [e3.data, e3.virtsize])
assert_equal([e1.data, e1.virtsize], [e4.data, e4.virtsize])
assert_equal([e1.data, e1.virtsize], [e5.data, e5.virtsize])
assert_equal(nil, e[53, 12])
assert_equal(2, e[2, 2].export['toto'])
assert_equal(4, e[0, 4].export['toto'])
assert_equal(1, e[0, 16].reloc.length)
assert_equal(0, e[0, 15].reloc.length)
assert_equal(0, e[13, 8].reloc.length)
assert_equal(1, e[12, 4].reloc.length)
assert_equal(16, e[0, 50].virtsize)
assert_equal(1, e[15, 50].virtsize)
e.align 5
assert_equal(20, e.virtsize)
e.align 5
assert_equal(20, e.virtsize)
e.fill 30
assert_equal(30, e.virtsize)
end
def test_slice2
e = compile <<EOS
db '1'
toto:
.pad
tutu:
db '0'
.offset toto+11
EOS
assert_equal(12, e.virtsize)
assert_equal(11, e.export['tutu'])
e[1..10] = 'abcdefghij'
assert_equal(12, e.virtsize)
assert_equal(2, e.export.length)
e[1, 10] = 'jihgfedcba'
assert_equal(12, e.virtsize)
e[1...11] = 'abcdefghij'
assert_equal(12, e.virtsize)
e.patch('toto', 'tutu', 'xxx')
assert_equal('1xxxdefghij0', e.data)
e[1..10] = 'z'
assert_equal(3, e.virtsize)
assert_equal(2, e.export['tutu'])
assert_raise(Metasm::EncodeError) { e.patch('toto', 'tutu', 'toolong') }
e = compile <<EOS
db '1'
dd rel
db '2'
EOS
assert_equal(1, e.reloc.length)
assert_equal(1, e[1, 4].reloc.length)
assert_equal(1, e[1..4].reloc.length)
assert_equal(1, e[1...5].reloc.length)
assert_equal(0, e[2, 8].reloc.length)
assert_equal(0, e[1, 3].reloc.length)
end
def test_fixup
e = compile <<EOS
db 1
db toto + tata
dd tutu
EOS
assert_equal(2, e.reloc.length)
e.fixup!('toto' => 42)
assert_raise(Metasm::EncodeError) { e.fixup('tata' => 192349129) }
e.fixup('tata' => -12)
assert_equal(30, e.data[1])
assert_equal(1, e.reloc.length)
assert_equal(2, e.offset_of_reloc('tutu'))
assert_equal(2, e.offset_of_reloc(Metasm::Expression[:+, 'tutu']))
e.fixup('tutu' => 1024)
assert_equal("\1\x1e\0\4\0\0", e.data)
ee = Metasm::Expression[:+, 'bla'].encode(:u16, :big)
ee.fixup('bla' => 1024)
assert_equal("\4\0", ee.data)
eee = compile <<EOS
db abc - def
def:
db 12 dup(?, 3 dup('x'))
abc:
EOS
assert_equal(12*4, eee.data[0])
end
end
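
The byte values that test_fixup expects can be re-derived without Metasm. The sketch below is not part of the deleted file; it uses only Ruby's standard `Array#pack` to reproduce the arithmetic. The variable names are illustrative, and the explanation given for the `EncodeError` (the value does not fit in one byte) is an assumption rather than something the test states.

```ruby
# Plain-Ruby check of the values test_fixup asserts; no Metasm calls are made.
toto = 42
tata = -12

# `db toto + tata` stores the sum in a single byte: 42 - 12 = 30 (0x1e).
db_byte = (toto + tata) & 0xff

# `dd tutu` with tutu = 1024 on the little-endian CPU: 32 bits, low byte first.
dd_le = [1024].pack('V')

# Expression['bla'] encoded as :u16, :big with bla = 1024: 16 bits, high byte first.
u16_be = [1024].pack('n')

# Assumption: fixup('tata' => 192349129) raises because the sum cannot fit in a byte.
fits_in_byte = (-128..255).cover?(toto + 192349129)

# Full 6-byte image after all fixups: db 1, db 30, dd 1024.
image = [1, db_byte].pack('C2') + dd_le

# `db abc - def` spans 12 repetitions of (one reserved byte + 3 'x' bytes).
span = 12 * (1 + 3)
```

The resulting `image` bytes correspond to the string literal "\1\x1e\0\4\0\0" used in the assertion, and `span` matches the `12*4` expected for the last case.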

View File

@ -1,111 +0,0 @@
# This file is part of Metasm, the Ruby assembly manipulation suite
# Copyright (C) 2007 Yoann GUILLOT
#
# Licence is LGPL, see LICENCE in the top-level directory
require 'test/unit'
require 'metasm'
class TestMips < Test::Unit::TestCase
def test_enc
sc = Metasm::Shellcode.assemble(Metasm::MIPS.new(:big), <<EOS)
;
; MIPS nul-free xor decoder
;
; (C) 2006 Julien TINNES
; <julien at cr0.org>
;
; The first four bytes in encoded shellcode must be the xor key
; This means that you have to put the xor key right after
; this xor decoder
; This key will be considered part of the encoded shellcode
; by this decoder and will be xored, thus becoming 4 NULs, meaning nop
;
; This is Linux-only because I use the cacheflush system call
;
; You can use shellforge to assemble this, but be sure to discard all
; the nul bytes at the end (everything after \\x01\\x4a\\x54\\x0c)
;
; change 2 bytes in the first instruction's opcode with the number of passes
; the number of passes is the number of xor operations to apply, which should be
; 1 (for the key) + the number of 4-bytes words you have in your shellcode
; you must encode ~number_of_passes, i.e. -(number_of_passes + 1) (to ensure that you're nul-free)
;.text
;.align 2
;.globl main
;.ent main
;.type main,@function
main:
li macro reg, imm
; lui reg, ((imm) >> 16) & 0ffffh
; ori reg, reg, (imm) & 0ffffh
addiu reg, $0, imm ; sufficient if imm.abs <= 0x7fff
endm
li( $14, -5) ; 4 passes
nor $14, $14, $0 ; put number of passes in $14
li( $11,-73) ; ~72: addend to calculated PC is 72
;.set noreorder
next:
bltzal $8, next
;.set reorder
slti $8, $0, 0x8282
nor $11, $11, $0 ; addend in $11
addu $25, $31, $11 ; $25 points to encoded shellcode +4
; addu $16, $31, $11 ; $16 too (enable if you want to pass correct parameters to cacheflush)
; lui $2, 0xDDDD ; first part of the xor (old method)
slti $23, $0, 0x8282 ; store 0 in $23 (our counter)
; ori $17, $2, 0xDDDD ; second part of the xor (old method)
lw $17, -4($25) ; load xor key in $17
li( $13, -5)
nor $13, $13, $0 ; 4 in $13
addi $15, $13, -3 ; 1 in $15
loop:
lw $8, -4($25)
addu $23, $23, $15 ; increment counter
xor $3, $8, $17
sltu $30, $23, $14 ; enough loops?
sw $3, -4($25)
addi $6, $13, -1 ; 3 in $6 (for cacheflush)
bne $0, $30, loop
addu $25, $25, $13 ; next instruction to decode :)
; addiu $4, $16, -4 ; not checked by Linux
; li $5,40 ; not checked by Linux
; li $6,3 ; $6 is set above
; .set noreorder
li( $2, 4147) ; cacheflush
;.ascii "\\x01JT\\x0c" ; nul-free syscall
syscall 0x52950
; .set reorder
; write last decoder opcode and decoded shellcode
; li $4,1 ; stdout
; addi $5, $16, -8
; li $6,40 ; how much to write
; .set noreorder
; li $2, 4004 ; write
; syscall
; .set reorder
nop ; encoded shellcode must be here (xor key right here ;)
; $t9 (aka $25) points here
EOS
assert_equal("\x24\x0e\xff\xfb\x01\xc0\x70\x27\x24\x0b\xff\xb7\x05\x10\xff\xff\x28\x08\x82\x82\x01\x60\x58\x27\x03\xeb\xc8\x21\x28\x17\x82\x82\x8f\x31\xff\xfc\x24\x0d\xff\xfb\x01\xa0\x68\x27\x21\xaf\xff\xfd\x8f\x28\xff\xfc\x02\xef\xb8\x21\x01\x11\x18\x26\x02\xee\xf0\x2b\xaf\x23\xff\xfc\x21\xa6\xff\xff\x17\xc0\xff\xf9\x03\x2d\xc8\x21\x24\x02\x10\x33\x01\x4a\x54\x0c\0\0\0\0", sc.encoded.data)
end
end
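
The decoder comments describe two pieces of arithmetic: immediates are loaded as their bitwise complement and restored with `nor reg, reg, $0` (so `-5` yields 4 passes and `-73` yields an addend of 72), and the payload is stored as the raw key followed by the payload words XORed with that key. A minimal plain-Ruby sketch of both follows. It is not part of the deleted file, all helper names are hypothetical, and it assumes big-endian 32-bit words as in `Metasm::MIPS.new(:big)`.

```ruby
# Sketch of the decoder's arithmetic; helper names are hypothetical.

# 16-bit immediate such that `nor reg, reg, $0` leaves +value+ in the register.
def nor_immediate(value)
  ~value & 0xffff
end

# True when neither byte of the 16-bit immediate is NUL.
def nul_free?(imm16)
  [imm16].pack('n').bytes.none?(&:zero?)
end

# Layout expected by the decoder: the key in the clear, then each
# big-endian 32-bit payload word XORed with the key.
def xor_encode(payload, key)
  raise ArgumentError, 'payload must be a multiple of 4 bytes' unless (payload.bytesize % 4).zero?
  ([key] + payload.unpack('N*').map { |w| w ^ key }).pack('N*')
end

# What the decoder loop does: XOR every word, key included, with the key.
# The key word becomes 0, which MIPS executes as a nop.
def xor_decode(encoded)
  words = encoded.unpack('N*')
  key = words.first
  words.map { |w| w ^ key }.pack('N*')
end

passes_imm = nor_immediate(4)   # 0xfffb, i.e. -5 as a signed 16-bit value
addend_imm = nor_immediate(72)  # 0xffb7, i.e. -73 as a signed 16-bit value
```

The number of passes for a given payload is one (for the key) plus the payload's word count, which is the length of `encoded.unpack('N*')`. The sketch does not check that the encoded bytes themselves are nul-free; choosing a key that guarantees this is left to the caller.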

Some files were not shown because too many files have changed in this diff