Expression
==========

Metasm uses this class to represent arbitrary symbolic arithmetic expressions, e.g.
* `42`
* `eax + 12`
* `loc_4228h + 4*ebx - 12`

These expressions can include `Integers`, `Symbols`, and `Strings`.

Symbols and strings represent arbitrary variables, with the convention that
strings represent fixed quantities (e.g. addresses, labels), whereas symbols
represent quantities that vary (e.g. register values).

The special symbol `:unknown` may also be used, to represent a value that is
known to be unknown. See the Reduction section.

See also <core/Indirection.txt>.

The Expression class also holds the methods for Integer binary manipulation,
that is, encoding an Integer to a binary blob and decoding it back (see also
<core/EncodedData.txt>).


Members
-------

Expressions hold exactly 3 members:
* `lexpr`, the left-hand side of the expression
* `rexpr`, the right-hand side
* `op`, the operator

`lexpr` and `rexpr` can be any value, most often String, Symbol, Integer or
Expression. For unary operators, `lexpr` is `nil`.

`op` is a Symbol representing the operation.
It should be from the list:
* arithmetic: `+ - / * >> << & | ^`
* boolean: `|| && == != > >= < <=`
* unary: `+ - ~ !`

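To make the three-member layout concrete, here is a minimal stand-alone sketch in plain Ruby. `Node` and `render` are hypothetical names used only for illustration; they are not part of Metasm and merely mirror the `lexpr`/`op`/`rexpr` layout described above.

```ruby
# Hypothetical stand-in for the lexpr/op/rexpr layout; not Metasm's class.
Node = Struct.new(:lexpr, :op, :rexpr)

# Render a tree as text. Unary nodes have a nil lexpr.
def render(e)
  return e.to_s unless e.is_a?(Node)
  return "#{e.op}#{render(e.rexpr)}" if e.lexpr.nil?
  "(#{render(e.lexpr)} #{e.op} #{render(e.rexpr)})"
end

# loc_4228h + 4*ebx - 12, built by hand as nested nodes
tree = Node.new(Node.new('loc_4228h', :+, Node.new(4, :*, :ebx)), :-, 12)
unary = Node.new(nil, :-, 'my_var')

puts render(tree)    # ((loc_4228h + (4 * ebx)) - 12)
puts render(unary)   # -my_var
```
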

Instantiation
-------------

In Ruby code, use the class method `[]`. It takes 1 to 3 arguments: `lexpr`,
`op`, and `rexpr`. `lexpr` defaults to `nil`, and `op` defaults to `:+` (except
for negative numeric values, which are stored with `op` == `:-` and `rexpr` ==
the absolute value).

If `lexpr` or `rexpr` is an `Array`, the `[]` constructor is called
recursively, to ease the definition of nested Expressions.

Examples:

    Expression[42]
    Expression[:eax, :+, 12]
    Expression[:-, 'my_var']
    Expression[[:eax, :-, 4], :*, [:ebx, :+, 0x12]]

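The argument rules above can be sketched in plain Ruby as follows. This is only an illustration of the defaults as described in this section, not Metasm's actual code; `Node` and `build` are hypothetical names.

```ruby
# Hypothetical illustration of the argument rules above; not Metasm's code.
Node = Struct.new(:lexpr, :op, :rexpr)

def build(*args)
  # Nested Arrays are built recursively.
  args = args.map { |a| a.is_a?(Array) ? build(*a) : a }
  case args.length
  when 1
    v = args[0]
    # A negative number is stored as unary minus applied to its absolute value.
    v.is_a?(Integer) && v < 0 ? Node.new(nil, :-, -v) : Node.new(nil, :+, v)
  when 2 then Node.new(nil, args[0], args[1])      # op, rexpr
  when 3 then Node.new(args[0], args[1], args[2])  # lexpr, op, rexpr
  else raise ArgumentError, 'expected 1 to 3 arguments'
  end
end

p build(42).to_a             # [nil, :+, 42]
p build(-42).to_a            # [nil, :-, 42]
p build(:-, 'my_var').to_a   # [nil, :-, "my_var"]

nested = build([:eax, :-, 4], :*, [:ebx, :+, 0x12])
p nested.lexpr.to_a          # [:eax, :-, 4]
```
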
The Expression class also includes a parser, which creates an expression
from a string. `parse_string!` creates an Expression and updates its
argument so that it starts just after the last part that was successfully
read into the expression.
The parser handles standard C operator precedence.

    str = "1 + var"
    Expression.parse_string!(str)   # => Expression[1, :+, "var"]
    str = "42 bla"
    Expression.parse_string!(str)   # => Expression[42]
    str                             # => "bla"

Use `parse_string` without the ! to parse the string without updating it.

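The difference between the destructive and non-destructive variants can be illustrated with a much simpler stand-in. The sketch below only reads a leading decimal integer; `take_integer!` and `take_integer` are hypothetical helpers and have nothing to do with Metasm's real parser.

```ruby
# Hypothetical helper, not part of Metasm: reads a leading decimal integer and
# removes the consumed text (with surrounding blanks) from the argument.
def take_integer!(str)
  consumed = str.slice!(/\A\s*\d+\s*/)
  consumed && consumed.to_i
end

# Non-destructive variant: works on a copy, so the caller's string is untouched.
def take_integer(str)
  take_integer!(str.dup)
end

s = '42 bla'.dup
p take_integer!(s)   # 42
p s                  # "bla"

t = '7 rest'.dup
p take_integer(t)    # 7
p t                  # "7 rest"
```
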
External variables
------------------

The `externals` method returns all non-integer members of the Expression.

    Expression[[:eax, :+, 42], :-, "bla"].externals   # => [:eax, "bla"]

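As a rough sketch of the idea, the walk below collects the non-integer leaves of a tree. It uses nested `[lexpr, op, rexpr]` Arrays as a stand-in for Expressions, and `collect_externals` is a hypothetical name; the traversal order of Metasm's own method is not assumed here.

```ruby
# Hypothetical sketch: nested [lexpr, op, rexpr] Arrays stand in for Expressions.
def collect_externals(e)
  case e
  when Array
    lexpr, _op, rexpr = e
    collect_externals(lexpr) + collect_externals(rexpr)
  when Integer, nil
    []          # integers and the missing lexpr of a unary node are skipped
  else
    [e]         # any other leaf (Symbol, String, ...) is an external
  end
end

p collect_externals([[:eax, :+, 42], :-, 'bla'])   # [:eax, "bla"]
p collect_externals([nil, :-, 12])                 # []
```
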

Pattern matching
----------------

The `match` method checks an Expression against a pattern without
having to check individual members. The pattern should be an Expression
whose variable members are Strings or Symbols; these variables are also passed
as arguments to the match function. On a successful match, the correspondence
between each pattern variable and the value it matched is returned as a Hash.

    Expression[1, :+, 2].match(Expression['var', :+, 2], 'var')
    # => { 'var' => 1 }
    Expression[1, :+, 2].match(Expression['var', :+, 'var'], 'var')
    # => nil
    Expression[1, :+, 1].match(Expression['var', :op, 'var'], 'var', :op)
    # => { 'var' => 1, :op => :+ }

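The matching rules shown above (a variable binds to whatever it faces, and must bind to the same value everywhere it appears) can be sketched in plain Ruby. This is an illustration on nested `[lexpr, op, rexpr]` Arrays with a hypothetical `match_pattern` function, not Metasm's implementation.

```ruby
# Hypothetical sketch of pattern matching on [lexpr, op, rexpr] Arrays.
# vars lists the pattern members that act as variables.
def match_pattern(expr, pattern, vars, found = {})
  if vars.include?(pattern)
    # A variable must bind to the same value everywhere it appears.
    return nil if found.key?(pattern) && found[pattern] != expr
    found[pattern] = expr
    found
  elsif pattern.is_a?(Array) && expr.is_a?(Array) && pattern.size == expr.size
    pattern.zip(expr).each do |p, e|
      return nil unless match_pattern(e, p, vars, found)
    end
    found
  else
    pattern == expr ? found : nil
  end
end

p match_pattern([1, :+, 2], ['var', :+, 2], ['var'])          # {"var"=>1}
p match_pattern([1, :+, 2], ['var', :+, 'var'], ['var'])      # nil
p match_pattern([1, :+, 1], ['var', :op, 'var'], ['var', :op])
```

The last call binds both the operand and the operator, giving `'var' => 1` and `:op => :+`.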

Reduction
---------

Metasm Expressions include a basic symbolic computation engine, which allows
some simple transformations of the Expression. The reduction also
computes numerical values whenever possible. If the final result is fully
numeric, an Integer is returned; otherwise a new Expression is returned.

In this context, the special value `:unknown` has a particular meaning: for
instance, adding anything to `:unknown` yields `:unknown`.

    Expression[1, :+, 2].reduce
    # => 3
    Expression[:eax, :+, [:ebx, :-, :eax]].reduce
    # => Expression[:ebx]
    Expression[1, :+, [:eax, :+, 2]].reduce
    # => Expression[:eax, :+, 3]
    Expression[:unknown, :+, :eax].reduce
    # => Expression[:unknown]

The symbolic engine operates mostly on additions/subtractions and
no-operations (e.g. shift by 0). It also handles some boolean composition.

The details can be found in the #replace_rec method body, in `metasm/main.rb`.

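To give a feel for this kind of bottom-up simplification, here is a deliberately tiny sketch on nested `[lexpr, op, rexpr]` Arrays. It is not Metasm's engine: `fold` is a hypothetical function that only folds integer-only subtrees, drops an operation with 0 on the right, and, as a simplifying assumption, treats every operation involving `:unknown` as `:unknown`. It does not reorder or cancel terms the way the examples above do.

```ruby
# Hypothetical sketch of bottom-up constant folding; a tiny subset only.
FOLDABLE       = %i[+ - * & | ^ << >>].freeze
NOOP_WITH_ZERO = %i[+ - | ^ << >>].freeze

def fold(e)
  return e unless e.is_a?(Array)
  l  = fold(e[0])
  op = e[1]
  r  = fold(e[2])
  # Simplifying assumption: :unknown absorbs every operation.
  return :unknown if l == :unknown || r == :unknown
  if l.is_a?(Integer) && r.is_a?(Integer) && FOLDABLE.include?(op)
    l.public_send(op, r)          # both sides numeric: compute the value
  elsif !l.nil? && r == 0 && NOOP_WITH_ZERO.include?(op)
    l                             # no-operation such as a shift by 0
  else
    [l, op, r]                    # nothing to simplify at this node
  end
end

p fold([1, :+, 2])               # 3
p fold([[2, :*, 3], :+, :eax])   # [6, :+, :eax]
p fold([:eax, :<<, 0])           # :eax
p fold([:unknown, :+, :eax])     # :unknown
```
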
The reduce method can also take a block argument, which is called at
every step of the recursive reduction, for custom operations. If the block
returns nil, the result is unchanged; otherwise the returned value is used as
a replacement. For example, if you operate on 32-bit values and want to get rid
of `bla & 0xffffffff`, use

    some_expr.reduce { |e|
        if e.kind_of?(Expression) and e.op == :& and e.rexpr == 0xffff_ffff
            e.lexpr
        end
    }


Binding
-------

An expression involving variable externals can be bound using a Hash. This
replaces any occurrence of a key of the Hash by its value in the expression
members. The `bind` method returns a new Expression with the substitutions,
and the `bind!` method updates the Expression in-place.

    Expression['val', :+, 'stuff'].bind('val' => 4, 'stuff' => 8).reduce
    # => 12
    Expression[:eax, :+, :ebx].bind(:ebx => 42)
    # => Expression[:eax, :+, 42]
    Expression[:eax, :+, :ebx].bind(:ebx => :ecx)
    # => Expression[:eax, :+, :ecx]

You can use Expressions as keys, but they are only substituted on perfect matches.

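The substitution, including the "perfect match" rule for whole sub-expressions used as keys, can be sketched as follows. Nested `[lexpr, op, rexpr]` Arrays stand in for Expressions, and `substitute` is a hypothetical name, not a Metasm method.

```ruby
# Hypothetical sketch of binding on [lexpr, op, rexpr] Arrays.
# Returns a new tree; the input is left untouched (like the non-bang `bind`).
def substitute(e, table)
  return table[e] if table.key?(e)   # whole-value (perfect) match
  return e unless e.is_a?(Array)
  [substitute(e[0], table), e[1], substitute(e[2], table)]
end

p substitute([:eax, :+, :ebx], :ebx => 42)     # [:eax, :+, 42]
p substitute([:eax, :+, :ebx], :ebx => :ecx)   # [:eax, :+, :ecx]

# A sub-expression used as a key is replaced only when it matches exactly.
tree = [[:eax, :+, 4], :*, 2]
p substitute(tree, [:eax, :+, 4] => :ecx)      # [:ecx, :*, 2]
p substitute(tree, [:eax, :+, 5] => :ecx)      # [[:eax, :+, 4], :*, 2]
```
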

Binary packing
--------------

Encoding
########

The `encode` method generates an EncodedData holding the expression, either
as binary if it reduces to an integral value, or as a relocation.
The arguments are the relocation type and the endianness, plus an optional
backtrace (to tell the user where an overflowing relocation comes from).

The `encode_imm` class method generates a raw String from an
integral value, a type and an endianness.
The type can also be given as a byte size.

    Expression.encode_imm(42, :u8, :little)     # => "*"
    Expression.encode_imm(42, 1, :big)          # => "*"
    Expression.encode_imm(256, :u8, :little)    # raises EncodeError

On overflow (the value cannot be encoded in the bit field) an EncodeError
exception is raised.

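For comparison, the same kind of raw byte string can be produced with Ruby's own `Array#pack`. The sketch below is independent of Metasm; `PACK_DIRECTIVES` and `pack_unsigned` are hypothetical names, and the explicit range check is there because `pack` itself silently truncates values that are too large.

```ruby
# Plain Ruby, independent of Metasm. Maps [byte size, endianness] to the
# Array#pack directive for an unsigned integer of that size.
PACK_DIRECTIVES = {
  [1, :little] => 'C',  [1, :big] => 'C',
  [2, :little] => 'S<', [2, :big] => 'S>',
  [4, :little] => 'L<', [4, :big] => 'L>'
}.freeze

def pack_unsigned(value, size, endianness)
  limit = 1 << (8 * size)
  unless value.is_a?(Integer) && value >= 0 && value < limit
    raise RangeError, "#{value} does not fit in #{size} byte(s)"
  end
  [value].pack(PACK_DIRECTIVES.fetch([size, endianness]))
end

p pack_unsigned(42, 1, :little)            # "*"
p pack_unsigned(0x1234, 2, :big).bytes     # [18, 52]
p pack_unsigned(0x1234, 2, :little).bytes  # [52, 18]
```
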
Decoding
########

The `decode_imm` class method reads a binary value into an
Integer, with an optional offset into the binary string.

    Expression.decode_imm("*", :u8, :little)                # => 42
    Expression.decode_imm("bla\xfe\xff", :i16, :little, 3)  # => -2

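The two results above can be cross-checked with plain Ruby, without Metasm, using `String#unpack1`. Here `C` reads an unsigned 8-bit value and `s<` a signed little-endian 16-bit value; the offset is applied with `byteslice`.

```ruby
# Plain Ruby cross-check of the two decodings shown above.
data = "bla\xfe\xff".b            # binary copy of the 5-byte string

byte  = '*'.unpack1('C')                      # unsigned 8-bit
short = data.byteslice(3, 2).unpack1('s<')    # signed 16-bit, little-endian, offset 3

p byte    # 42
p short   # -2
```
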

Arithmetic coercion
-------------------

Expression implements the `:+` and `:-` Ruby methods, so that `expr + 4`
works as expected. The result is reduced.


Integer methods
---------------

The Expression class offers a few methods to work with integers.

make_signed
###########

`make_signed` converts a raw unsigned value to its equivalent signed value,
given a bit size.

    Expression.make_signed(1, 16)       # => 1
    Expression.make_signed(0xffff, 16)  # => -1

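The conversion is ordinary two's-complement arithmetic: a value with its top bit set has `2**bits` subtracted from it. A minimal sketch in plain Ruby, with the hypothetical name `to_signed` (this is not Metasm's code):

```ruby
# Two's-complement reinterpretation of an unsigned value of the given width.
def to_signed(value, bits)
  value >= (1 << (bits - 1)) ? value - (1 << bits) : value
end

p to_signed(1, 16)        # 1
p to_signed(0xffff, 16)   # -1
p to_signed(0x80, 8)      # -128
p to_signed(0x7f, 8)      # 127
```
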

in_range?
#########

`in_range?` checks whether a given numeric value fits in a particular
<core/Relocation.txt> field. The method returns `true` if the value fits,
`false` if it does not, or `nil` if the result is unknown (e.g. the expr has no
numeric value).

    Expression.in_range?(42, :i8)                   # => true
    Expression.in_range?(128, :i8)                  # => false
    Expression.in_range?(-128, :i8)                 # => true
    Expression.in_range?(Expression['bla'], :u32)   # => nil
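
The three-valued result can be sketched in plain Ruby. The sketch assumes that a type name such as `:i8` or `:u32` means a signed or unsigned field of that many bits, which is what the examples above suggest; `fits?` is a hypothetical name and not part of Metasm.

```ruby
# Hypothetical sketch: true/false for integers, nil when the value is not numeric.
# Assumes type names of the form :i<bits> (signed) or :u<bits> (unsigned).
def fits?(value, type)
  return nil unless value.is_a?(Integer)
  name = type.to_s
  bits = name[1..-1].to_i
  if name.start_with?('i')
    value >= -(1 << (bits - 1)) && value < (1 << (bits - 1))
  else
    value >= 0 && value < (1 << bits)
  end
end

p fits?(42, :i8)       # true
p fits?(128, :i8)      # false
p fits?(-128, :i8)     # true
p fits?('bla', :u32)   # nil
```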