For most hashes where MD5 is used, building a dedicated md5 format is likely
not the best approach.  Writing a format is not trivial.  It requires
maintenance, and will likely require specific enhancements to get it to
perform optimally on all hardware.  There will usually need to be 'generic' C
code written first, followed by code that ties it into CPU-specific
optimizations such as SSE, MMX, intrinsic SSE, GPU, and so on.  This also
means that, to stay up to date, the format will require ongoing work and
maintenance.

However, there is one format which may reduce a lot of this maintenance
work to very little.  That format itself will need to be kept up to
date, but any formats that are built upon its internal workings will not
need that work themselves.  That format is md5-gen.  In this 'format', there
is a scripting language, where a format developer only needs to describe the
actual operations properly, and the format is 'done', and working.

This document will go over how to 'build' a format that uses this md5-gen
format, how to optimize it to work faster, and how to build a 'thin'
quasi format which insulates the end user from the md5-gen format line
building.

**** Introduction ****

To start off with, here is a little background on 'how' and 'where' to build
the scripts that run md5-gen, and on what internal data structures are
available to be used.

The 'where':  the easiest way for a format developer to build a format into
john is to add a new md5-gen format 'script' to the john.ini file (john.conf).
This file is usually located in the directory john is run from (the
--config=file option can override this default).  Within john.conf, a new
'section' can be added for an md5 generic format.  The new 'section' is
named as follows:

[List.Generic:md5_gen(NUM)]

Replace NUM with the sub-format number (from 1001 to 9999).  Pick a number
that is not already used.

Within this 'section', multiple lines will be added.  These lines
are primarily of the form:  Type=Value

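As a rough illustration of the 'Type=Value' layout, the following Python
sketch splits a section into ordered (Type, Value) pairs.  It is not how john
itself reads john.conf; parse_section is a hypothetical helper, and the sample
lines are simply the md5_gen(1030) section shown in this document.  A list is
used rather than a dictionary because the same Type (such as Func) can appear
many times, and the order of those lines matters.

```python
def parse_section(lines):
    """Split 'Type=Value' lines into (type, value) pairs, preserving order."""
    entries = []
    for raw in lines:
        line = raw.strip()
        # Skip blanks, '#' comments, and the [section] header itself.
        if not line or line.startswith("#") or line.startswith("["):
            continue
        # Split on the FIRST '=' only; the value is kept verbatim.
        kind, _, value = line.partition("=")
        entries.append((kind, value))
    return entries


sample = """\
[List.Generic:md5_gen(1030)]
Expression=md5_gen(1030): md5($p.$p)
Func=MD5GenBaseFunc__clean_input
Func=MD5GenBaseFunc__append_keys
Func=MD5GenBaseFunc__append_keys
Func=MD5GenBaseFunc__crypt
""".splitlines()

entries = parse_section(sample)
funcs = [value for kind, value in entries if kind == "Func"]
print(funcs)
```
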
The actual contents of these scripts will be addressed later.  That is
the 'how', and it is outside the scope of this introduction.

The 'data' and runtime information is as follows:

Inside the md5-gen format, there are 2 input buffers, input1 and input2
(actually ALL data is held in arrays of 128 of each buffer type).  The main
operations on these buffers are to clear them and to append data, building
the strings which will later be md5 hashed.

There are also 2 output buffers.  These buffers receive the md5 hashes of
the 2 input buffers.  NOTE, when the format processing is complete, the
results MUST be in the output1 buffer.  This is the buffer that all of the
comparison functions check against.

In the format, there is a salt (if the format is salted).  There may also be
a second salt value.

There are also the 'keys' value(s).  These are the passwords being tested at
a given time.

There are also 8 'constant' strings which can be used within a format.  A
format such as md5-po has a couple of constants within it.

There are also numerous optimization 'flags' which do special things when
loading keys or salts, and there are numerous special 'optimization' primitive
functions within the format, for speedup of certain operations.

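To make the data layout concrete, here is a small Python model of a single
slot of this working data (md5-gen itself keeps 128 of each).  This is only a
sketch:  the class name Md5GenState and its field and method names are
invented for illustration, and storing the output as 16 raw bytes is an
assumption, not something taken from the md5-gen source.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Md5GenState:
    """Toy stand-in for one slot of md5-gen's data (names are illustrative)."""
    key: bytes = b""      # the candidate password ($p)
    salt: bytes = b""     # the salt ($s), if the format is salted
    salt2: bytes = b""    # the optional second salt
    input1: bytearray = field(default_factory=bytearray)
    input2: bytearray = field(default_factory=bytearray)
    output1: bytes = b""  # the final result MUST end up here
    output2: bytes = b""

    def clean_input(self):
        # Clear input1 so a new string can be built.
        self.input1.clear()

    def append_keys(self):
        # Append the candidate password to input1.
        self.input1 += self.key

    def crypt(self):
        # md5 of input1 goes to output1 (assumed here: 16 raw bytes).
        self.output1 = hashlib.md5(bytes(self.input1)).digest()


state = Md5GenState(key=b"test1")
state.clean_input()
state.append_keys()
state.crypt()
print(state.output1.hex())
```
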
**** Simple format building ****

We will start out with a few simple formats, and simply 'show' how to build
a straightforward script.  The scripts may or may not be optimal.  Later
we will optimize these somewhat.  When building the formats here, there will
be comments interspersed, listing just what is being done, and why.

We will build these formats:
   md5_gen(1030)  md5($p.$p)
   md5_gen(1031)  md5($s.md5($p).$p)
   md5_gen(1032)  md5(md5($s).md5($p).$p)

[List.Generic:md5_gen(1030)]
Expression=md5_gen(1030): md5($p.$p)
Func=MD5GenBaseFunc__clean_input
Func=MD5GenBaseFunc__append_keys
Func=MD5GenBaseFunc__append_keys
Func=MD5GenBaseFunc__crypt
Test=md5_gen(1030)42b72f913c3201fc62660d512f5ac746:test1

Here is the exact same format, with some comments added, describing the
sub-sections, and exactly what is being done.

#first line is the section name.  It MUST be of the format shown.
[List.Generic:md5_gen(1030)]
#
#the next line is a required line.  It serves 2 purposes.  It is output
#by john when the format 'starts'.  Also, the md5_gen(#) part is used
#to distinguish this exact format (so the command line of --sub=md5_gen(1030)
#would specify this and only this format)
#
Expression=md5_gen(1030): md5($p.$p)
#
#This is the set of functions.  This is the ONLY section of the format
#where order IS important.  The functions ARE handled one after the
#other, from top to bottom, to perform the string operations and md5
#operations which are needed to perform the hash of this format.
#The functions ARE a required part of the format.
#
#first step, clean the input.  All work for this format is done using
#only the input 1 and output 1 buffers.
Func=MD5GenBaseFunc__clean_input
#
#Step 2, append the keys.  Note, the buffer is clean, so this is simply
#the same as Input=keys (but requires 2 steps, the clean and the append keys).
Func=MD5GenBaseFunc__append_keys
#
#Step 3, append keys again (the format is md5($p.$p), or keys appended to keys).
Func=MD5GenBaseFunc__append_keys
#
#Step 4, final step performs md5 of $p.$p  This will properly leave the
#results in output1
Func=MD5GenBaseFunc__crypt
#
#This is a test string.  These ARE required.  You can provide more than
#one.  5 or 6 are best, to make sure the format is valid.
#
Test=md5_gen(1030)42b72f913c3201fc62660d512f5ac746:test1

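The top-to-bottom handling of the Func= lines can be mimicked with a tiny
Python dispatcher.  This is a sketch only:  run_script and its internal state
dictionary are hypothetical, and just the three primitives used by
md5_gen(1030) are modelled.  If the model is faithful, the printed value for
the password 'test1' should equal the hash in the Test= line above; that
comparison is deliberately left to the reader rather than asserted here.

```python
import hashlib


def run_script(funcs, key):
    """Run a list of primitive names top to bottom; return output1 as hex."""
    state = {"input1": bytearray(), "output1": b""}

    def clean_input():
        state["input1"].clear()

    def append_keys():
        state["input1"] += key

    def crypt():
        state["output1"] = hashlib.md5(state["input1"]).digest()

    # Map the primitive names used in the script to the toy implementations.
    primitives = {
        "MD5GenBaseFunc__clean_input": clean_input,
        "MD5GenBaseFunc__append_keys": append_keys,
        "MD5GenBaseFunc__crypt": crypt,
    }
    for name in funcs:          # order matters, exactly as in the script
        primitives[name]()
    return state["output1"].hex()


script_1030 = [
    "MD5GenBaseFunc__clean_input",
    "MD5GenBaseFunc__append_keys",
    "MD5GenBaseFunc__append_keys",
    "MD5GenBaseFunc__crypt",
]
result = run_script(script_1030, b"test1")
print(result)
```
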
Ok, here is the second format.  The format being done is md5($s.md5($p).$p)
Here are a few comments about this format:
  1. There is a Flag= value.  This is because this is a salted format.  This
     REQUIRES the MGF_SALTED flag.
  2. We only use input 1 and output 1.
  3. There are a couple of calls to crypt (md5).  The first simply gets
     md5($p) and puts it into output1, which will later be appended in
     base-16 format as we build our string.
  4. After the first crypt (md5), we clear our input buffer, then put
     the salt in, append the base-16 of md5($p), and then append $p
  5. Finally, a call to crypt is done, which leaves the results in
     output1, so the rest of the md5-gen format can properly compare it.

[List.Generic:md5_gen(1031)]
Expression=md5_gen(1031): md5($s.md5($p).$p)
Flag=MGF_SALTED
Func=MD5GenBaseFunc__clean_input
Func=MD5GenBaseFunc__append_keys
Func=MD5GenBaseFunc__crypt
Func=MD5GenBaseFunc__clean_input
Func=MD5GenBaseFunc__append_salt
Func=MD5GenBaseFunc__append_from_last_output_as_base16
Func=MD5GenBaseFunc__append_keys
Func=MD5GenBaseFunc__crypt
Test=md5_gen(1031)a459f60614498dbdd9a79dcc9c538749$aabbccdd:test1

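For comparison, the same sequence of steps for md5($s.md5($p).$p) can be
written out in Python.  This is a sketch under two assumptions that are not
stated in this document:  that 'base-16' means lowercase hex, and that the
Test= line is laid out as hash, then '$', then salt, then ':' and the
password.  The function names are made up for the example.

```python
import hashlib


def md5_hex(data: bytes) -> bytes:
    """Return md5(data) as 32 lowercase hex characters (as bytes)."""
    return hashlib.md5(data).hexdigest().encode("ascii")


def md5_gen_1031(salt: bytes, password: bytes) -> str:
    inner = md5_hex(password)            # first crypt: md5($p)
    buf = bytearray()                    # clean_input
    buf += salt                          # append_salt
    buf += inner                         # append_from_last_output_as_base16
    buf += password                      # append_keys
    return hashlib.md5(buf).hexdigest()  # second crypt


# Salt and password as read (by assumption) from the Test= line.
digest = md5_gen_1031(b"aabbccdd", b"test1")
print(digest)
```
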
Now, here is the final format:   md5(md5($s).md5($p).$p)

[List.Generic:md5_gen(1032)]
Expression=md5_gen(1032): md5(md5($s).md5($p).$p)
Flag=MGF_SALTED
Func=MD5GenBaseFunc__clean_input
Func=MD5GenBaseFunc__append_salt
Func=MD5GenBaseFunc__crypt
Func=MD5GenBaseFunc__clean_input2
Func=MD5GenBaseFunc__append_keys2
Func=MD5GenBaseFunc__crypt2
Func=MD5GenBaseFunc__clean_input
Func=MD5GenBaseFunc__append_from_last_output_as_base16
Func=MD5GenBaseFunc__append_from_last_output2_to_input1_as_base16
Func=MD5GenBaseFunc__append_keys
Func=MD5GenBaseFunc__crypt
Test=md5_gen(1032)042d1f15ed57929a2ac8ee4f0a924679$aabbccdd:test1

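This format is the first one to use the second input/output pair.  The Python
sketch below follows the script line by line, with the matching primitive
named in each comment.  As before, lowercase hex for 'base-16' is an
assumption, and md5_gen_1032 is just an illustrative name.

```python
import hashlib


def md5_gen_1032(salt: bytes, password: bytes) -> str:
    input1, input2 = bytearray(), bytearray()

    input1 += salt                            # clean_input, append_salt
    output1 = hashlib.md5(input1).digest()    # crypt  -> md5($s)

    input2 += password                        # clean_input2, append_keys2
    output2 = hashlib.md5(input2).digest()    # crypt2 -> md5($p)

    input1.clear()                            # clean_input
    input1 += output1.hex().encode("ascii")   # append_from_last_output_as_base16
    input1 += output2.hex().encode("ascii")   # ..._output2_to_input1_as_base16
    input1 += password                        # append_keys
    return hashlib.md5(input1).hexdigest()    # crypt (the 3rd md5 call)


digest_1032 = md5_gen_1032(b"aabbccdd", b"test1")
print(digest_1032)
```

Keeping md5($s) in output1 and md5($p) in output2 is what allows both
intermediate hashes to be appended to input1 for the final crypt.
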
Ok, now that these have been built, here are a few 'benchmarks' showing
that they are WORKING, and at what speed they run:

Here is the MinGW 'x86' build:

john_x86 -test -for=md5-gen -sub=md5_gen(1030)
Benchmarking: md5_gen(1030) md5_gen(1030): md5($p.$p) [128x1 (MD5_Go)]... DONE
Raw:    3530K c/s

john_x86 -test -for=md5-gen -sub=md5_gen(1031)
Benchmarking: md5_gen(1031) md5_gen(1031): md5($s.md5($p).$p) [128x1 (MD5_Go)]... DONE
Many salts:     1945K c/s
Only one salt:  1890K c/s

john_x86 -test -for=md5-gen -sub=md5_gen(1032)
Benchmarking: md5_gen(1032) md5_gen(1032): md5(md5($s).md5($p).$p) [128x1 (MD5_Go)]... DONE
Many salts:     1016K c/s
Only one salt:  1031K c/s

Here is the MinGW SSE2 build:

john_sse2 -test -for=md5-gen -sub=md5_gen(1030)
Benchmarking: md5_gen(1030) md5_gen(1030): md5($p.$p) SSE2 [SSE2 32x4 (.S)]... DONE
Raw:    7250K c/s

john_sse2 -test -for=md5-gen -sub=md5_gen(1031)
Benchmarking: md5_gen(1031) md5_gen(1031): md5($s.md5($p).$p) SSE2 [SSE2 32x4 (.S)]... DONE
Many salts:     5065K c/s
Only one salt:  4436K c/s

john_sse2 -test -for=md5-gen -sub=md5_gen(1032)
Benchmarking: md5_gen(1032) md5_gen(1032): md5(md5($s).md5($p).$p) SSE2 [SSE2 32x4 (.S)]... FAILED (get_hash[0](0))

Here are some timings to compare against:

john_x86 -test -for=md5-gen -sub=md5_gen(0)
Benchmarking: md5_gen(0): md5($p)  (raw-md5) [128x1 (MD5_Go)]... DONE
Raw:    4005K c/s

john_sse2 -test -for=md5-gen -sub=md5_gen(0)
Benchmarking: md5_gen(0): md5($p)  (raw-md5) SSE2 [SSE2 32x4 (.S)]... DONE
Raw:    10740K c/s

**** Optimizations of prior formats ****

For format 1030, the speed should be very close to that of md5_gen(0).
In both formats, there is only 1 call to md5().  However, we see that
(1030) is slower than (0).  The explanation is that the (0) format uses
an optimization which we can not use in (1030).  The (1030) format is likely
about as optimal as it can be made in the current md5-gen format.  The
optimization for format (0) is:  Flag=MGF_KEYS_INPUT   What that does is
place the keys directly into the input field, and then later, when john gets
the keys back (it does this if a hash is cracked), john gets them from the
input.  In the (1030) format, we load the keys into the 'keys' arrays.  We
then have to call a function to clean input buffer 1, and to append the keys
(twice).  Thus, what we have is additional memory movement, and that slows
things down.  However, to use the MGF_KEYS_INPUT optimization, we would have
had to keep the input1 buffer pristine and ONLY put in the keys (passwords).
Since we had to append the keys twice, we 'blew' that requirement, and thus
could NOT use it.  At a later time, we will show a format WHERE we can use
this optimization.

For format 1031, there also appear to be no optimizations available.

For 1032, there are optimizations.  In this format, we notice that we have
this sub-expression:  md5($s).  There is an optimization (the flag
MGF_SALT_AS_HEX) which, when the input file is loaded, converts all salts
into md5($s) and uses that value instead.  So, at startup time, we perform
md5 hashes of all salts, but at runtime, we simply place the salt into the
string being built, instead of performing an MD5 on the salt.  In 1032, we
had 3 calls to crypt.  By using this optimization, we can reduce that to 2
crypts.  The starting format is:  md5(md5($s).md5($p).$p)   This optimization
makes the format 'behave' at runtime as if it were md5($s.md5($p).$p), which
was format 1031.  Note, after we make this optimization, the timings move
toward the 1031 timings (the benchmarks further down still show 1042 somewhat
slower than 1031, presumably because its final string is longer).  Also note,
the Test strings for 1032 and 1042 are exactly the same.  These are the same
formats.  It is just that 1042 performs fewer crypt calls per test.

Also note, in the 'original' run of SSE2, the 1032 format failed.  This
failure is due to the SSE2 / MMX code only working for strings up to 54 bytes
(optimization reason).  The length of this string:  md5($s).md5($p)  is 64
bytes by itself, and we also append $p to that.  Thus, our string is OVER 54
bytes in length, and can not be used in SSE2 mode.  We do have a couple of
workarounds for this, to get it working properly on SSE2 builds.  We can use
a flag which simply stops SSE2 dead in its tracks (and performs all work
using x86 code).  This is the flag MGF_NOTSSE2Safe.

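The length argument can be checked with a little arithmetic.  The sketch below
takes the 54 byte limit directly from the text above and assumes each md5 in
base-16 is 32 characters; the helper names are invented for the example.

```python
SSE2_MAX_LEN = 54   # limit quoted in the text above
HEX_MD5_LEN = 32    # one md5 digest written in base-16


def final_len_1031(salt_len: int, password_len: int) -> int:
    """Length of $s.md5($p).$p, the string hashed by the last crypt."""
    return salt_len + HEX_MD5_LEN + password_len


def final_len_1032(password_len: int) -> int:
    """Length of md5($s).md5($p).$p, the string hashed by the last crypt."""
    return HEX_MD5_LEN + HEX_MD5_LEN + password_len


def fits_sse2(length: int) -> bool:
    return length <= SSE2_MAX_LEN


# 1031 with the 8 character test salt and the 5 character password 'test1'
print(final_len_1031(8, 5), fits_sse2(final_len_1031(8, 5)))   # 45 True
# 1032 is already at 64 bytes even with an empty password
print(final_len_1032(0), fits_sse2(final_len_1032(0)))         # 64 False
print(final_len_1032(5), fits_sse2(final_len_1032(5)))         # 69 False
```

So no password length can bring the 1032 string under the limit, while 1031
with an 8 character salt stays within it for passwords of up to 14 characters.
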
[List.Generic:md5_gen(1042)]
Expression=md5_gen(1042): md5(md5($s).md5($p).$p)
Flag=MGF_SALTED
Flag=MGF_SALT_AS_HEX
Flag=MGF_NOTSSE2Safe
Func=MD5GenBaseFunc__clean_input
Func=MD5GenBaseFunc__append_keys
Func=MD5GenBaseFunc__crypt
Func=MD5GenBaseFunc__clean_input
Func=MD5GenBaseFunc__append_salt
Func=MD5GenBaseFunc__append_from_last_output_as_base16
Func=MD5GenBaseFunc__append_keys
Func=MD5GenBaseFunc__crypt
Test=md5_gen(1042)042d1f15ed57929a2ac8ee4f0a924679$aabbccdd:test1

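The reason 1032 and 1042 can share a Test= line is that hashing the salt once
at load time gives the same final string as hashing it for every candidate.
The Python sketch below checks that equivalence.  The function names are
hypothetical, the second salt value is made up for the example, and lowercase
hex is again assumed for 'base-16'.

```python
import hashlib


def md5_hex(data: bytes) -> bytes:
    """Return md5(data) as 32 lowercase hex characters (as bytes)."""
    return hashlib.md5(data).hexdigest().encode("ascii")


def slow_1032(salt: bytes, password: bytes) -> bytes:
    # 3 md5 calls for every (salt, password) pair
    return md5_hex(md5_hex(salt) + md5_hex(password) + password)


def load_salts(raw_salts):
    """Done once, at load time: replace each salt by md5(salt) in hex."""
    return [md5_hex(s) for s in raw_salts]


def fast_1042(hexed_salt: bytes, password: bytes) -> bytes:
    # 2 md5 calls for every (salt, password) pair
    return md5_hex(hexed_salt + md5_hex(password) + password)


raw = [b"aabbccdd", b"11223344"]
prepared = load_salts(raw)
matches = [slow_1032(s, b"test1") == fast_1042(h, b"test1")
           for s, h in zip(raw, prepared)]
print(matches)   # [True, True]
```
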
Once the above changes have been done, here are the speeds:

john_x86 -test=5 -for=md5-gen -sub=md5_gen(1031)
Benchmarking: md5_gen(1031) md5_gen(1031): md5($s.md5($p).$p) [128x1 (MD5_Go)]... DONE
Many salts:     2007K c/s
Only one salt:  1913K c/s

john_x86 -test=5 -for=md5-gen -sub=md5_gen(1032)
Benchmarking: md5_gen(1032) md5_gen(1032): md5(md5($s).md5($p).$p) [128x1 (MD5_Go)]... DONE
Many salts:     1052K c/s
Only one salt:  1030K c/s

john_x86 -test=5 -for=md5-gen -sub=md5_gen(1042)
Benchmarking: md5_gen(1042) md5_gen(1042): md5(md5($s).md5($p).$p) [128x1 (MD5_Go)]... DONE
Many salts:     1420K c/s
Only one salt:  1372K c/s

john_sse2 -test=5 -for=md5-gen -sub=md5_gen(1042)
Benchmarking: md5_gen(1042) md5_gen(1042): md5(md5($s).md5($p).$p) SSE2 [128x1 (MD5_Go)]... DONE
Many salts:     1416K c/s
Only one salt:  1372K c/s

We can perform even more optimizations in the format.  As in format 1042,
we md5 the salt when we first load the file, so the salts which john works
with are really md5($s).  Then we use a different flag, which puts the
base-16 md5($p) at offset 32 of input1 (where we want it).  The script then
simply overwrites the start of input1 with the salt (which is md5($s) in
base-16 format), force sets the length to 64, appends the keys, and crypts.

[List.Generic:md5_gen(1052)]
Expression=md5_gen(1052): md5(md5($s).md5($p).$p)
Flag=MGF_SALTED
Flag=MGF_SALT_AS_HEX
Flag=MGF_KEYS_BASE16_IN1_Offset32
Flag=MGF_NOTSSE2Safe
Func=MD5GenBaseFunc__overwrite_salt_to_input1_no_size_fix
Func=MD5GenBaseFunc__set_input_len_64
Func=MD5GenBaseFunc__append_keys
Func=MD5GenBaseFunc__crypt
Test=md5_gen(1052)042d1f15ed57929a2ac8ee4f0a924679$aabbccdd:test1

Here are the benchmarks for the above format:

john_x86 -test=5 -for=md5-gen -sub=md5_gen(1052)
Benchmarking: md5_gen(1052) md5_gen(1052): md5(md5($s).md5($p).$p) [128x1 (MD5_Go)]... DONE
Many salts:     2251K c/s
Only one salt:  1369K c/s

john_sse2 -test=5 -for=md5-gen -sub=md5_gen(1052)
Benchmarking: md5_gen(1052) md5_gen(1052): md5(md5($s).md5($p).$p) SSE2 [128x1 (MD5_Go)]... DONE
Many salts:     2251K c/s
Only one salt:  1369K c/s

Now, note the speed for 'many salts'.  It is actually faster than the speed of (1031).
This is the speed john will have in a normal password cracking run, where you have dozens (or
hundreds, or 1000's) of password hashes to crack.

To understand WHY this format is this much faster (the 'Many salts' figure is the normal way to
benchmark the speed of a salted hash), we need to understand what is happening under the hood
within john's 'crypt all' loop.

while (!feof(password_file)) {
    /* load the next batch of candidate passwords */
    for (i = 0; i < max_num_passwords; i++)
        SetKey(i, getnextpassword(password_file));
    if (salted)
    {
        z = 0;
        while (z<salt_count)
        {
            SetSalt(salt[z]);
            crypt_all();
            /* pseudo-code: for each binary (hash) stored under salt[z] */
            for (all_binaries_for_salt[z])
                CheckForMatched(binary);
            ++z;
        }
    }
}

The above code is certainly not 'exact', but it shows the algorithm used within john
closely enough.  Now, the algorithm as used within md5-gen will be shown (specifically for the
flag MGF_KEYS_BASE16_IN1_Offset32).

- SetKey() is called numerous times.  This will set a 'dirty flag' for the keys inside of md5-gen.
- SetSalt() will be called.  The salt handed to us is actually md5($s), since MGF_SALT_AS_HEX is set.
  The SetSalt() calls are happening within the 'while(z<salt_count)' loop in john.
- crypt_all is called.
  Within crypt_all, md5-gen knows that we want the base-16 md5($p) to be placed at offset 32
  within input1.  So the first call to crypt_all (for the first salt) will cause the md5($p)
  to be computed, and to be placed at offset 32.
  Then the script will overwrite the starting bytes of input1 with the 32 bytes of the salt,
  then the length is set to 64, then the key is appended, then a crypt, and then comparisons.
- NOW, we are at the next iteration of the 'while(z<salt_count)' loop.
- Then john loads the next salt [ SetSalt() ].
- Then john calls crypt_all.
  At this time, there have been NO additional SetKey() calls.  Thus, md5-gen knows that the
  base-16 text of md5($p) is STILL located at offset 32 of input1.  So, the format DOES NOT
  perform this crypt again (until new SetKey() function calls happen).
- This SetSalt .. crypt_all .. compare continues until all salts are tested.  However, there
  will be no crypt calls to md5($p) again, UNTIL the working code within john calls SetKey()
  again (when starting with new passwords, after all salts have been checked).

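The effect of the 'dirty flag' on the number of md5 calls can be seen in a
small Python simulation.  This is a toy model, not md5-gen's code:  the class
Offset32Cache, its methods and the sample passwords and salts are all invented
for the example.  With K passwords and S salts, this model performs K + K*S
md5 calls, against 2*K*S when md5($p) is redone for every salt.

```python
import hashlib


class Offset32Cache:
    """Toy model: keep base-16 md5($p) cached until the keys change."""

    def __init__(self):
        self.keys = {}          # index -> candidate password
        self.cached = {}        # index -> base-16 md5($p)
        self.dirty = True
        self.md5_calls = 0

    def _md5_hex(self, data: bytes) -> bytes:
        self.md5_calls += 1
        return hashlib.md5(data).hexdigest().encode("ascii")

    def set_key(self, index: int, key: bytes):
        self.keys[index] = key
        self.dirty = True       # md5($p) must be recomputed on next crypt_all

    def crypt_all(self, hexed_salt: bytes):
        if self.dirty:          # true only after new set_key() calls
            self.cached = {i: self._md5_hex(k) for i, k in self.keys.items()}
            self.dirty = False
        # One md5 per candidate: md5( md5($s) . md5($p) . $p )
        return [self._md5_hex(hexed_salt + self.cached[i] + self.keys[i])
                for i in sorted(self.keys)]


fmt = Offset32Cache()
passwords = [b"test1", b"secret", b"letmein"]
# Salts are hashed once at load time, so they are not counted per candidate.
salts = [hashlib.md5(s).hexdigest().encode("ascii")
         for s in (b"aa", b"bb", b"cc", b"dd")]

for i, p in enumerate(passwords):
    fmt.set_key(i, p)
for salt in salts:
    fmt.crypt_all(salt)

print(fmt.md5_calls)   # 3 + 3*4 = 15, instead of 2*3*4 = 24
```

As the number of salts grows, the cost per candidate approaches one md5 call
rather than two, which is in line with the 'Many salts' figure above.
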
Now, in the final format, we start from 1042, and do NOT turn off the SSE2 code for the whole
format.  What we do is turn off SSE2 only where it is not valid.  In x86 (generic) builds this
runs exactly the same as 1042 (the 2 function calls of MD5GenBaseFunc__SSEtoX86_switch_output1
and MD5GenBaseFunc__X86toSSE_switch_output1 are no-ops in x86 builds).  However, in SSE2 builds,
the first crypt will be done using SSE2.  Thus, as the benchmark shows, the speed went from
1420k up to almost 1800k.  But note, this is NOT as fast as format 1052 for 'many' salts.

[List.Generic:md5_gen(1062)]
Expression=md5_gen(1062): md5(md5($s).md5($p).$p)
Flag=MGF_SALTED
Flag=MGF_SALT_AS_HEX
Func=MD5GenBaseFunc__clean_input
Func=MD5GenBaseFunc__append_keys
Func=MD5GenBaseFunc__crypt
Func=MD5GenBaseFunc__SSEtoX86_switch_output1
Func=MD5GenBaseFunc__clean_input
Func=MD5GenBaseFunc__append_salt
Func=MD5GenBaseFunc__append_from_last_output_as_base16
Func=MD5GenBaseFunc__append_keys
Func=MD5GenBaseFunc__crypt
Func=MD5GenBaseFunc__X86toSSE_switch_output1
Test=md5_gen(1062)042d1f15ed57929a2ac8ee4f0a924679$aabbccdd:test1

john_sse2 -test=5 -for=md5-gen -sub=md5_gen(1062)
Benchmarking: md5_gen(1062) md5_gen(1062): md5(md5($s).md5($p).$p) SSE2 [SSE2 32x4 (.S)]... DONE
Many salts:     1792K c/s
Only one salt:  1715K c/s

So all in all, 1032, 1042, 1052 and 1062 are all equivalent (strictly, 1032 is not, since it
fails in SSE2 builds, but that was 'fixed' in 1042).  They all run using differing sets of flags
and differing sets of function primitives, and have different runtime speeds.  However, in the
end, they all compute the same hash.

Now, the above format 1062 is slower than 1052 for many salts.  In both formats the final crypt
still has to be done in x86 mode.  However, in 1062 we compute md5($p) for EVERY password for
each salt, so there is almost no speed gain between many salts and 1 salt.  Yes, in 1062 the
md5($p) IS done using SSE2, which is much faster, but in 1052, when there are multiple salts,
the slower x86 md5($p) is done only 1 time per password.

Now, the flag MGF_KEYS_BASE16_IN1_Offset32 (or other flags like it) CAN be used in SSE2 to
get much faster behavior.  However, it has to be in a format that IS SSE2 friendly.  Here
is an example:

md5(md5($p).$s)   In this format, we CAN build an SSE2 friendly format that is VERY fast.
For this test, we will set the salt length to a fixed size of 12.

Here is a very easy to read, but also very far from optimal format for the above type:
[List.Generic:md5_gen(1033)]
Expression=md5_gen(1033): md5(md5($p).$s)
Flag=MGF_SALTED
Func=MD5GenBaseFunc__clean_input
Func=MD5GenBaseFunc__append_keys
Func=MD5GenBaseFunc__crypt
Func=MD5GenBaseFunc__clean_input
Func=MD5GenBaseFunc__append_from_last_output_as_base16
Func=MD5GenBaseFunc__append_salt
Func=MD5GenBaseFunc__crypt
Test=md5_gen(1033)e9fb44106edf60419d26a10b5439d0c7$aabbccddeeff:test1
SaltLen=12

john_x86 -test -format=md5-gen -subf=md5_gen(1033)
Benchmarking: md5_gen(1033) md5_gen(1033): md5(md5($p).$s) [128x1 (MD5_Go)]... DONE
Many salts:     1918K c/s
Only one salt:  1889K c/s

john_sse2 -test -format=md5-gen -subf=md5_gen(1033)
Benchmarking: md5_gen(1033) md5_gen(1033): md5(md5($p).$s) SSE2 [SSE2 32x4 (.S)]... DONE
Many salts:     5479K c/s
Only one salt:  4922K c/s

Here is a MUCH more optimal version (1043).  This version uses the flag
MGF_KEYS_BASE16_IN1 to load the base-16 md5($p) into input 1, at the start of that string.
That will ONLY be done if there has been a SetKey() change.  Then we simply set the input
length to 32, append the salt, and call crypt.

[List.Generic:md5_gen(1043)]
Expression=md5_gen(1043): md5(md5($p).$s)
Flag=MGF_SALTED
Flag=MGF_KEYS_BASE16_IN1
Func=MD5GenBaseFunc__set_input_len_32
Func=MD5GenBaseFunc__append_salt
Func=MD5GenBaseFunc__crypt
Test=md5_gen(1043)e9fb44106edf60419d26a10b5439d0c7$aabbccddeeff:test1
SaltLen=12

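The 'set length to 32, append salt, crypt' pattern can be pictured as reusing
a 32 byte prefix of the input buffer.  The Python sketch below illustrates
the idea and checks it against the plain formula.  The function name and the
second salt are made up for the example, and lowercase hex is assumed for
'base-16'.

```python
import hashlib


def crack_batch(password: bytes, salts):
    """md5(md5($p).$s) for one password against many salts, reusing the prefix."""
    # Done once per password (the MGF_KEYS_BASE16_IN1 step described above).
    input1 = bytearray(hashlib.md5(password).hexdigest().encode("ascii"))
    results = []
    for salt in salts:
        del input1[32:]       # set_input_len_32: drop the previous salt
        input1 += salt        # append_salt
        results.append(hashlib.md5(input1).hexdigest())   # crypt
    return results


salts = [b"aabbccddeeff", b"001122334455"]   # fixed length of 12, as above
fast = crack_batch(b"test1", salts)

# Plain formula, recomputing md5($p) for every salt.
naive = [hashlib.md5(hashlib.md5(b"test1").hexdigest().encode("ascii") + s).hexdigest()
         for s in salts]
print(fast == naive)   # True
```

With a 12 byte salt the string hashed by the final crypt is 32 + 12 = 44 bytes
long, which is within the 54 byte SSE2 limit mentioned earlier.
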
john_x86 -test -format=md5-gen -subf=md5_gen(1043)
Benchmarking: md5_gen(1043) md5_gen(1043): md5(md5($p).$s) [128x1 (MD5_Go)]... DONE
Many salts:     4128K c/s
Only one salt:  1890K c/s

john_sse2 -test -format=md5-gen -subf=md5_gen(1043)
Benchmarking: md5_gen(1043) md5_gen(1043): md5(md5($p).$s) SSE2 [SSE2 32x4 (.S)]... DONE
Many salts:     13096K c/s
Only one salt:  4834K c/s

So in this case, we see that the 'only 1 salt' speed is pretty much a wash.  However, the
'many salts' speed has gone from 1900k to 4100k for non-SSE, and from 5500k to 13100k for SSE2.

NOTE, the above format is actually the md5_gen(6) (also md5_gen(7)) format.