metasploit-framework/lib/metasm/doc/core/Expression.txt

221 lines
6.9 KiB
Plaintext

Expression
==========
Metasm uses this class to represent arbitrary symbolic arithmetic expressions, e.g.
* `42`
* `eax + 12`
* `loc_4228h + 4*ebx - 12`
These expressions can include `Integers`, `Symbols`, and `Strings`.
The symbols and strings represent arbitrary variables, with the convention that
strings represent fixed quantities (eg addresses, labels), whereas symbols
represent more variable stuff (eg register values).
There is also a special symbol that may be used, `:unknown`, to represent a
value that is known to be unknown. See the `reduce` section.
See also <core/Indirection.txt>.
The Expression class holds all methods relative to Integer binary manipulation,
that is `encoding` and `decoding` from/to a binary blob (see also
<core/EncodedData.txt>)
Members
-------
Expressions hold exactly 3 members:
* `lexpr`, the left-hand side of the expression
* `rexpr`, the right-hand side
* `op`, the operator
`lexpr` and `rexpr` can be any value, most often String, Symbol, Integer or
Expression. For unary operators, `lexpr` is `nil`.
`op` is a Symbol representing the operation.
It should be from the list:
* arithmetic: `+ - / * >> << & | ^`
* boolean: `|| && == != > >= < <=`
* unary: `+ - ~ !`
Instantiation
-------------
In ruby code, use the class method `[]`. It takes 1 to 3 arguments, `lexpr`,
`op`, and `rexpr`. `lexpr` defaults to `nil`, and `op` defaults to `:+` (except
for negative numeric values, which is stored with `op` == `:-` and `rexpr` ==
abs).
If `lexpr` or `rexpr` are an `Array`, the `[]` constructor is called
recursively, to ease the definition of nested Expressions.
Exemples:
Expression[42]
Expression[:eax, :+, 12]
Expression[:-, 'my_var']
Expression[[:eax, :-, 4], :*, [:ebx, :+, 0x12]]
The Expression class also includes a parser, to allow creating an expression
from a string. `parse_string!` will create an Expression and update its
argument to point after the last part read successfully into the expr.
The parser handles standard C operator precedence.
str = "1 + var"
Expression.parse_string!(str) # => Expression[1, :+, "var"]
str = "42 bla"
Expression.parse_string!(str) # => Expression[42]
str # => "bla"
Use `parse_string` without the ! to parse the string without updating it.
External variables
------------------
The `externals` method will return all non-integer members of the Expression.
Expression[[:eax, :+, 42], :-, "bla"].externals # => [:eax, "bla"]
Pattern matching
----------------
The `match` method allows to check an Expression against a pattern without
having to check individual members. The pattern should be an Expression,
whose variable members should be Strings or Symbols, which are also passed as
arguments to the match function. On successful match, the correspondance
between variable patterns and their actual value matched is returned as a Hash.
Expression[1, :+, 2].match(Expression['var', :+, 2], 'var')
# => { 'var' => 1 }
Expression[1, :+, 2].match(Expression['var', :+, 'var'], 'var')
# => nil
Expression[1, :+, 1].match(Expression['var', :op, 'var'], 'var', :op)
# => { 'var' => 1, :op => :+ }
Reduction
---------
Metasm Expressions include a basic symbolic computation engine, that allows
some simple transformations of the Expression. The reduction will also
compute numerical values whenever possible. If the final result is fully
numeric, an Integer is returned, otherwise a new Expression is returned.
In this context, the special value `:unknown` has a particular meaning.
Expression[1, :+, 2].reduce
# => 3
Expression[:eax, :+, [:ebx, :-, :eax]].reduce
# => Expression[:ebx]
Expression[1, :+, [:eax, :+, 2]].reduce
# => Expression[:eax, :+, 3]
Expression[:unknown, :+, :eax].reduce
# => Expression[:unknown]
The symbolic engine operates mostly on addition/substractions, and
no-operations (eg shift by 0). It also handles some boolean composition.
The detail can be found in the #replace_rec method body, in `metasm/main.rb`.
The reduce method can also take a block argument, which will be called at
every step in the recursive reduction, for custom operations. If the block
returns nil, the result is unchanged, otherwise the new value is used as
replacement. For exemple, if you operate on 32-bit values and want to get rid
of `bla & 0xffffffff`, use
some_expr.reduce { |e|
if e.kind_of?(Expression) and e.op == :& and e.rexpr == 0xffff_ffff
e.lexpr
end
}
Binding
-------
An expression involving variable externals can be bound using a Hash. This will
replace any occurence of a key of the Hash by its value in the expression
members. The `bind` method will return a new Expression with the substitutions,
and the `bind!` method will update the Expression in-place.
Expression['val', :+, 'stuff'].bind('val' => 4, 'stuff' => 8).reduce
# => 12
Expression[:eax, :+, :ebx].bind(:ebx => 42)
# Expression[:eax, :+, 42]
Expression[:eax, :+, :ebx].bind(:ebx => :ecx)
# Expression[:eax, :+, :ecx]
You can use Expressions as keys, but they will only be used on perfect matches.
Binary packing
--------------
Encoding
########
The `encode` method will generate an EncodedData holding the expression, either
as binary if it can reduce to an integral value, or as a relocation.
The arguments are the relocation type and the endianness, plus an optional
backtrace (to notify the user where an overflowing relocation comes from).
The `encode_imm` class method will generate a raw String for a given
integral value, a type and an endianness.
The type can be given as a byte size.
Expression.encode_imm(42, :u8, :little) # => "*"
Expression.encode_imm(42, 1, :big) # => "*"
Expression.encode_imm(256, :u8, :little) # raise EncodeError
On overflows (value cannot be encoded in the bit field) an EncodeError
exception is raised.
Decoding
########
The `decode_imm` class method can be used to read a binary value into an
Integer, with an optional offset into the binary string.
Expression.decode_imm("*", :u8, :little) # => 42
Expression.decode_imm("bla\xfe\xff", :i16, :little, 3) # => -2
Arithmetic coercion
-------------------
Expression implement the `:+` and `:-` ruby methods, so that `expr + 4`
works as expected. The result is reduced.
Integer methods
---------------
The Expression class offers a few methods to work with integers.
make_signed
###########
`make_signed` will convert a raw unsigned to its equivalent signed value,
given a bit size.
Expression.make_signed(1, 16) # => 1
Expression.make_signed(0xffff, 16) # => -1
in_range?
#########
`in_range?` can check if a given numeric value would fit in a particular
<core/Relocation.txt> field. The method can return true or false if it
fits or not, or `nil` if the result is unknown (eg the expr has no numeric
value).
Expression.in_range?(42, :i8) # => true
Expression.in_range?(128, :i8) # => false
Expression.in_range?(-128, :i8) # => true
Expression.in_range?(Expression['bla'], :u32) # => nil