These are "public" functions that you can use to support UTF-8 in your
own formats for JtR. All of them are in unicode.c.

If you are not familiar with UCS-2, just consider it a synonym for UTF-16.
For our purposes, they are the same.


Convert from ISO-8859-1 or UTF-8 to UTF-16LE:
=============================================
The source encoding depends on whether the --utf8 flag was given to john.
If the length is exceeded or malformed data is found, the function returns a
negative number telling you how much of the source was read. That is, a
return code of -32 means you have an UNKNOWN number of UCS-2 characters [use
strlen16()] in the destination buffer and you should truncate your
"saved_plain" (if applicable) at 32.

#include "unicode.h"
int plaintowcs(UTF16 *dst, int maxdstlen, const UTF8 *src, int srclen);



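For the ISO-8859-1 case (no --utf8), the conversion amounts to widening each
byte, because ISO-8859-1 byte values equal the first 256 Unicode code points.
A minimal sketch of that branch follows. latin1_to_utf16() and the typedefs
are hypothetical stand-ins, not the code in unicode.c. The sketch assumes a
non-negative return means "number of units written", writes host-order 16-bit
units, and needs room in dst for maxdstlen units plus a terminator.

```c
#include <stdint.h>

/* Stand-ins for the types declared in unicode.h (assumed widths). */
typedef uint16_t UTF16;
typedef uint8_t UTF8;

/*
 * Hypothetical helper, not part of unicode.c: widen ISO-8859-1 bytes to
 * 16-bit units. Returns the number of units written, or minus the number
 * of source bytes consumed if dst filled up first.
 * dst must have room for maxdstlen units plus the terminator.
 */
int latin1_to_utf16(UTF16 *dst, int maxdstlen, const UTF8 *src, int srclen)
{
    int i;

    for (i = 0; i < srclen; i++) {
        if (i >= maxdstlen) {
            dst[i] = 0;
            return -i;      /* only the first i bytes were used */
        }
        dst[i] = src[i];    /* code point == byte value */
    }
    dst[i] = 0;
    return i;
}
```

With a 2-unit destination and a 3-byte source, this sketch returns -2, which
is the point at which the caller would truncate its "saved_plain".

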
Convert UTF-8 to UTF-16LE:
==========================
The source is always UTF-8. This is optimised for speed. If the length is
exceeded or malformed data is found, the function returns a negative number
telling you how much of the source was read. That is, a return code of -32
means you have an UNKNOWN number of UCS-2 characters [use strlen16()] in the
destination buffer and you should truncate your "saved_plain" at 32.

#include "unicode.h"
int utf8towcs(UTF16 *target, int maxtargetlen,
              const UTF8 *source, int sourcelen);



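The negative-return convention can be sketched as follows. This is a
self-contained illustration, not the code in unicode.c: sketch_utf8towcs(),
count_units16() and set_key_sketch() are hypothetical names, the converter
handles only 1- to 3-byte sequences (no overlong or surrogate checks), it
writes host-order units, and it assumes dst has room for maxdstlen units plus
a terminator. That a non-negative return is the number of units written is
also an assumption.

```c
#include <stdint.h>
#include <string.h>

/* Stand-ins for the types declared in unicode.h (assumed widths). */
typedef uint16_t UTF16;
typedef uint8_t UTF8;

/* Stand-in for strlen16(): count 16-bit units before the terminator. */
static int count_units16(const UTF16 *s)
{
    int n = 0;

    while (s[n])
        n++;
    return n;
}

/*
 * Hypothetical converter with the contract described above: returns the
 * number of units written, or minus the number of source bytes consumed
 * when dst is full or the input is malformed.
 */
int sketch_utf8towcs(UTF16 *dst, int maxdstlen, const UTF8 *src, int srclen)
{
    int in = 0, out = 0;

    while (in < srclen) {
        unsigned int c = src[in], cp;
        int need, i;    /* need = number of continuation bytes */

        if (c < 0x80) { cp = c; need = 0; }
        else if ((c & 0xE0) == 0xC0) { cp = c & 0x1F; need = 1; }
        else if ((c & 0xF0) == 0xE0) { cp = c & 0x0F; need = 2; }
        else break;     /* malformed lead byte, or beyond this sketch */

        if (out >= maxdstlen || in + need >= srclen)
            break;      /* dst full, or sequence cut short */
        for (i = 1; i <= need; i++) {
            if ((src[in + i] & 0xC0) != 0x80)
                break;  /* bad continuation byte */
            cp = (cp << 6) | (src[in + i] & 0x3F);
        }
        if (i <= need)
            break;
        dst[out++] = (UTF16)cp;
        in += need + 1;
    }
    dst[out] = 0;
    return in == srclen ? out : -in;
}

/* Convert a key; on a negative return, cut saved_plain to what was read. */
int set_key_sketch(UTF16 *dst, int maxdstlen, char *saved_plain)
{
    int len = sketch_utf8towcs(dst, maxdstlen, (const UTF8 *)saved_plain,
                               (int)strlen(saved_plain));

    if (len < 0) {
        saved_plain[-len] = 0;      /* e.g. -32 means truncate at 32 */
        len = count_units16(dst);   /* count is unknown, so measure it */
    }
    return len;
}
```

With a 3-unit destination, the 6-byte key "abécd" makes the converter return
-4 (two ASCII bytes plus the two-byte "é"), so the wrapper cuts saved_plain
after "abé" and measures the destination to get 3.

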
Convert UTF-16LE to UTF-8:
==========================
Currently used in NT_fmt.c in order to avoid using a saved_plain buffer.

#include "unicode.h"
extern char *utf16toutf8(const UTF16 *source);



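A format that stores only the UTF-16 key can rebuild the plaintext on demand
instead of keeping a second copy. The sketch below shows the idea with a
hypothetical stand-in, sketch_utf16toutf8(), which is not the code in
unicode.c. The prototype above takes no destination argument, so this sketch
assumes the result lives in a static buffer (overwritten by the next call);
the buffer size is arbitrary. It reads host-order units and encodes each unit
on its own, so surrogate pairs are not combined.

```c
#include <stdint.h>

/* Stand-in for the type declared in unicode.h (assumed width). */
typedef uint16_t UTF16;

/*
 * Hypothetical stand-in: encode 16-bit units as 1 to 3 UTF-8 bytes into
 * a static buffer and return a pointer to it. Input that does not fit
 * is silently cut off.
 */
char *sketch_utf16toutf8(const UTF16 *source)
{
    static char buf[384];
    char *p = buf;

    /* Keep room for a 3-byte sequence plus the terminator. */
    while (*source && p + 3 < buf + sizeof(buf)) {
        unsigned int c = *source++;

        if (c < 0x80) {
            *p++ = (char)c;
        } else if (c < 0x800) {
            *p++ = (char)(0xC0 | (c >> 6));
            *p++ = (char)(0x80 | (c & 0x3F));
        } else {
            *p++ = (char)(0xE0 | (c >> 12));
            *p++ = (char)(0x80 | ((c >> 6) & 0x3F));
            *p++ = (char)(0x80 | (c & 0x3F));
        }
    }
    *p = 0;
    return buf;
}
```

A get_key()-style function could then return sketch_utf16toutf8(saved_key)
directly, where saved_key is whatever UTF-16 buffer the format already keeps.

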
Return length (in characters) of a UTF16 string:
================================================
The number of octets is the result * sizeof(UTF16).

#include "unicode.h"
int strlen16(const UTF16 *str);



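The relation between character count and octet count can be sketched as
below. sketch_strlen16() and utf16_octets() are hypothetical stand-ins, not
the code in unicode.c; the sketch assumes the string is terminated by a
zero unit and that the terminator is not counted.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the type declared in unicode.h (assumed width). */
typedef uint16_t UTF16;

/* Hypothetical stand-in: number of 16-bit units before the terminator. */
int sketch_strlen16(const UTF16 *str)
{
    int len = 0;

    while (str[len])
        len++;
    return len;
}

/* Octets occupied by the string, terminator excluded. */
size_t utf16_octets(const UTF16 *str)
{
    return (size_t)sketch_strlen16(str) * sizeof(UTF16);
}
```

The octet count is what you would pass to a hash function that takes a byte
length, for example when hashing the UTF-16 form of a password.

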
Create an NT hash:
==================
This will convert from ISO-8859-1 or UTF-8 depending on the --utf8 option.
The function will use Alain's fast NT hashing if the length is <= 27
characters, otherwise it will use Solar's MD4. Lengths up to
MAX_PLAINTEXT_LENGTH are thus supported with no hassle for you. If the length
is exceeded or malformed data is found, it will return a negative number
telling you how much of the source was read. That is, a return code of -32
means you should truncate your "saved_plain" at 32.

#include "unicode.h"
int E_md4hash(const UTF8 *passwd, int len, unsigned char *p16);



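The 27-character cut-off is consistent with MD4's block layout: MD4 works on
64-byte blocks and its padding adds at least one 0x80 byte plus an 8-byte
length, so 27 characters at two octets each (54 octets) are the most that
still fit in a single block. The arithmetic is sketched below. This is an
inference from the numbers, not a description of how the fast path is
implemented; md4_blocks() and nt_md4_blocks() are hypothetical helpers, and
one 16-bit unit per character is assumed.

```c
/*
 * Number of 64-byte MD4 blocks needed for msg_bytes of input:
 * message + one 0x80 byte + 8-byte length field, rounded up to 64.
 */
unsigned int md4_blocks(unsigned int msg_bytes)
{
    return (msg_bytes + 1 + 8 + 63) / 64;
}

/* Same, for a password of 'chars' characters stored as UTF-16. */
unsigned int nt_md4_blocks(unsigned int chars)
{
    return md4_blocks(chars * 2);
}
```

Here nt_md4_blocks(27) is 1 and nt_md4_blocks(28) is 2, which matches the
point where the text above switches from the fast path to the generic MD4.

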
NOTE that there are more functions available in ConvertUTF.c.original
(ConvertUTF.h.original). If we need any of them, we should copy them to
unicode.c/h and simplify/optimize them.