brute forcer

Kevin McGuire · Post by **Kevin McGuire** » Mon Jun 25, 2007 4:34 pm

Try:
int inline __fastcall do_next_sequence(char *sequence, int charset_length, int length);

Kevin McGuire · Post by **Kevin McGuire** » Mon Jun 25, 2007 4:47 pm

    while( do_next_sequence(sequence, sizeof(charset), len) )
    {
        //sprintf(message, "%c", charset[sequence[0]]);
        //for(x = 1; x < len; ++x)
        //{
        //    sprintf(temp, "%c", charset[sequence[x]]);
        //    strcat(message, temp);
        //}
        for(x = 0; x < len; ++x)
        {
             message[x] = charset[sequence[x]];
        }
        message[x] = 0;
       
        //   START md5 string: input message
        md5_init(&state);
        md5_append(&state, (const md5_byte_t *)message, len);
        md5_finish(&state, digest);
        //   END md5 string: output digest
       
        loops++;
        if( memcmp( raw_inhash, digest, 16 ) == 0 )
        {
            end = clock();
            dif = (end - start);
            dif /= CLOCKS_PER_SEC;
            printf("Collision Found!: %s is : %s\n  Cracking took: %.2lfs\n  Average h/s: %.2f h/s\n", argv[1], message, dif, (loops / dif));
            system("pause");
            return 0;
        }   
       
    }

GLneo · Post by **GLneo** » Mon Jun 25, 2007 7:06 pm

w00t!, 62% faster, 710,638.64 hashes / second!, the only way to get faster is probably in the md5 lib I'm using..., does anyone know of a faster one?, I'm using: http://sourceforge.net/project/showfile ... p_id=42360

p.s. the fastest any md5 cracker has run on my computer is: 4,841,410 h/s, so I'm at least getting closer!

thx!

Kevin McGuire · Post by **Kevin McGuire** » Mon Jun 25, 2007 7:33 pm

Try post computing a array of sequences.

Code: Select all

while(1)
{
 for(uint32_t x = 0; x < PRECOMPUTE_COUNT; ++x)
 {
    // compute sequences and store in array on stack.
 }
 for(uint32_t x = 0; x < PRECOMPUTE_COUNT; ++x)
 {
   // check each sequence
 }
}

Of course try altering the PRECOMPUTE_COUNT so that you may find a optimal number/size that fits into cache. The idea should be to keep the processor from having to predict many jumps in between computing the sequence and computing the MD5 hash so that it may pipeline more.

os64dev · Post by **os64dev** » Tue Jun 26, 2007 1:26 am

GLneo wrote:w00t!, 62% faster, 710,638.64 hashes / second!, the only way to get faster is probably in the md5 lib I'm using..., does anyone know of a faster one?, I'm using: http://sourceforge.net/project/showfile ... p_id=42360

p.s. the fastest any md5 cracker has run on my computer is: 4,841,410 h/s, so I'm at least getting closer!

thx!

Well your memcmp should take some time to. A little faster might be:

Code: Select all

if(*(int *)raw_inhash == *(int *)digest) {
if( memcmp( raw_inhash, digest, 16 ) == 0 ) 
        { 
            end = clock(); 
            dif = (end - start); 
            dif /= CLOCKS_PER_SEC; 
            printf("Collision Found!: %s is : %s\n  Cracking took: %.2lfs\n  Average h/s: %.2f h/s\n", argv[1], message, dif, (loops / dif)); 
            system("pause"); 
            return 0; 
        }
}

further more you for loop are counting wrong (alway compare to zero)

Code: Select all

for(x = len - 1; x >= 0; x--) 
        { 
             message[x] = charset[sequence[x]]; 
        } 
        message[len] = 0;

note these problably will not improve the performance much but they are basics.

GLneo · Post by **GLneo** » Tue Jun 26, 2007 2:36 pm

well, one slow down is the hashing algorithm ( which i didn't write ), you use it by making a state and a place to put the final hash:

Code: Select all

md5_state_t state;
md5_byte_t digest[16];

then init the state:

Code: Select all

md5_init(&state);

add to the state:

Code: Select all

md5_append(&state, (const md5_byte_t *)message, len);

then finish to get the output:

Code: Select all

md5_finish(&state, digest);

this works fine for summing a file:

Code: Select all

while ( x = readfile() )
   md5_append(&state, (md5_byte_t *)x, len)

but for my case i need something that gets a message and makes the hash in one step:

Code: Select all

md5_byte_t *md5hash( char message, int len, md5_byte_t *buf );

using the attached code as a base. does any one have any idea's of how i can combine the 3 functions in to one?

thx!

B.E · Post by **B.E** » Tue Jun 26, 2007 7:32 pm

try removing the function calls in the following (i.e remove sprintf() and strcat()). Also you may be able to remove the loop altogether and put it in your do_next_sequence() function. Also in your do_next_sequence(), you could reduce the amount of iterations, by starting at the end and working back untill no change is required.

Code: Select all

        sprintf(message, "%c", charset[sequence[0]]);
        for(x = 1; x < len; ++x){
       
            sprintf(temp, "%c", charset[sequence[x]]);
            strcat(message, temp);
        }

EDIT: also osDev made a point, you could write something like this. Because the MD5 string is constant and is divisable by 4 (if the machine is 64 use unsigned long long and remove every second line in the if statment).

Code: Select all

   if((*(unsigned long *)(raw_inhash) == *(unsigned long *)(digest)) &&
      (*(unsigned long *)(raw_inhash+4) == *(unsigned long *)(digest+4)) &&
      (*(unsigned long *)(raw_inhash+8) == *(unsigned long *)(digest+8)) &&
      (*(unsigned long *)(raw_inhash+12) == *(unsigned long *)(digest+12))) {
            end = clock();
            dif = (end - start);
            dif /= CLOCKS_PER_SEC;
            printf("Collision Found!: %s is : %s\n  Cracking took: %.2lfs\n  Average h/s: %.2f h/s\n", argv[1], message, dif, (loops / dif));
            system("pause");
            return 0;
        }

Kevin McGuire · Post by **Kevin McGuire** » Tue Jun 26, 2007 8:40 pm

I tried, but I can not spot my mistake. Maybe you guys can find it. I will have to try it some more tomorrow.

Its close, but something is wrong.

Code: Select all

#include <stdlib.h>
#include <stdio.h>
#include <stdint.h>
#include <string.h>
#include <math.h>


#define ROTATE_LEFT(x, n) (((x) << (n)) | ((x) >> (32-(n))))
#define F(x,y,z) (((x) & (y)) | ((~(x)) & (z)))
#define G(x,y,z) (((x) & (z)) | ((y) & (~(z))))
#define H(x,y,z) ((x) ^ (y) ^ (z))
#define I(x,y,z) ((y) ^ ((x) | (~(z))))

void md5_genT(uint32_t *T)
{
	uint_fast32_t i;
	for(i = 0; i < 64; ++i)
	{
		T[i] = 4294967296.0 * fabs(sin((float)(i+1)));
		//printf("%08x\n", T[i]);
		//sleep(1);
	}
}
void md5_hash(uint8_t *message, uint_fast32_t mlength, uint32_t *T)
{
	uint8_t *nm, nmlength;
	uint32_t __s1[] = {7, 12, 17, 22};
	uint32_t __s2[] = {5, 9, 14, 20};
	uint32_t __s3[] = {4, 11, 16, 23};
	uint32_t __s4[] = {6, 10, 15, 21};
	uint32_t __k1[] = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15};
	uint32_t __k2[] = {1, 6, 11, 0, 5, 10, 15, 4, 9, 14, 3, 8, 13, 2, 7, 12};
	uint32_t __k3[] = {5, 8, 11, 14, 1, 4, 7, 10, 13, 0, 3, 6, 9, 12, 15, 2};
	uint32_t __k4[] = {0, 7, 14, 5, 12, 3, 10, 1, 8, 15, 6, 13, 4, 11, 2, 9};
	uint32_t X[16];
	uint32_t i, j, ii, k;
	uint32_t AA, BB, CC, DD;
	uint32_t A, B, C, D;
	// do padding so message is a multiple of 512 bits (64 bytes)
	if((mlength%64) == 0)
	{
		nmlength = mlength + 64;
	}else{
		nmlength = mlength + (64 - (mlength%64));
	}
	nm = (uint8_t*)malloc(nmlength);
	memset(nm, 0, nmlength);
	memcpy(nm, message, mlength);
	printf("nmlength:%u mlength:%u\n", nmlength, mlength);
	nm[mlength] = 0x80;
	((uint32_t*)&nm[nmlength - 8])[0] = mlength << 3;
	((uint32_t*)&nm[nmlength - 8])[1] = 0;
	
	// hash padded message (nmessage)
	AA = 0x67452301; BB = 0xefcdab89; CC = 0x98badcfe; DD = 0x10325476;
	for(i = 0; i < (nmlength/64); ++i)
	{
		// create table
		for(j = 0; j < 16; ++j)
		{
			X[j] = 	 nm[i*64+j*4+0] | 
					(nm[i*64+j*4+1] << 8) |
					(nm[i*64+j*4+2] << 16) |
					(nm[i*64+j*4+3] << 24);
		}
		A = AA;
		B = BB;
		C = CC;
		D = DD;
		// round one
		ii = 0;
		for(k = 0; k < 4; ++k)
		{
			A = B + ROTATE_LEFT((A + F(B, C, D) + X[__k1[k*4+0]] + T[ii+0]), __s1[0]);
			D = A + ROTATE_LEFT((D + F(A, B, C) + X[__k1[k*4+1]] + T[ii+1]), __s1[1]);
			C = D + ROTATE_LEFT((C + F(D, A, B) + X[__k1[k*4+2]] + T[ii+2]), __s1[2]);
			B = C + ROTATE_LEFT((B + F(C, D, A) + X[__k1[k*4+3]] + T[ii+3]), __s1[3]);
			printf("X[..]: %u\n", __k1[k*4+0]);
			ii += 4;
		}
		// round two
		for(k = 0; k < 4; ++k)
		{
			A = B + ROTATE_LEFT((A + G(B, C, D) + X[__k2[k*4+0]] + T[ii+0]), __s2[0]);
			D = A + ROTATE_LEFT((D + G(A, B, C) + X[__k2[k*4+1]] + T[ii+1]), __s2[1]);
			C = D + ROTATE_LEFT((C + G(D, A, B) + X[__k2[k*4+2]] + T[ii+2]), __s2[2]);
			B = C + ROTATE_LEFT((B + G(C, D, A) + X[__k2[k*4+3]] + T[ii+3]), __s2[3]);
			ii += 4;
		}
		// round three
		for(k = 0; k < 4; ++k)
		{
			A = B + ROTATE_LEFT((A + H(B, C, D) + X[__k3[k*4+0]] + T[ii+0]), __s3[0]);
			D = A + ROTATE_LEFT((D + H(A, B, C) + X[__k3[k*4+1]] + T[ii+1]), __s3[1]);
			C = D + ROTATE_LEFT((C + H(D, A, B) + X[__k3[k*4+2]] + T[ii+2]), __s3[2]);
			B = C + ROTATE_LEFT((B + H(C, D, A) + X[__k3[k*4+3]] + T[ii+3]), __s3[3]);
			ii += 4;
		}
		// round four
		for(k = 0; k < 4; ++k)
		{
			A = B + ROTATE_LEFT((A + I(B, C, D) + X[__k4[k*4+0]] + T[ii+0]), __s4[0]);
			D = A + ROTATE_LEFT((D + I(A, B, C) + X[__k4[k*4+1]] + T[ii+1]), __s4[2]);
			C = D + ROTATE_LEFT((C + I(D, A, B) + X[__k4[k*4+2]] + T[ii+2]), __s4[3]);
			B = C + ROTATE_LEFT((B + I(C, D, A) + X[__k4[k*4+3]] + T[ii+3]), __s4[4]);
			ii += 4;
		}
		printf("----\n");
		printf("ii:%u\n", ii);
		AA = AA + A;
		BB = BB + B;
		CC = CC + C;
		DD = DD + D;
	}
	printf("digest: ");
	printf("%02x%02x%02x%02x,", AA>>24, AA>>16&0xFF, AA>>8&0xFF, AA&0xFF);
	printf("%02x%02x%02x%02x,", BB>>24, BB>>16&0xFF, BB>>8&0xFF, BB&0xFF);
	printf("%02x%02x%02x%02x,", CC>>24, CC>>16&0xFF, CC>>8&0xFF, CC&0xFF);
	printf("%02x%02x%02x%02x,", DD>>24, DD>>16&0xFF, DD>>8&0xFF, DD&0xFF);
	printf("\n");
	return;
}

int main(int argc, char *argv[])
{
	uint8_t *message = (uint8_t*)"abc";
	uint32_t T[64];
	md5_genT(&T[0]);
	md5_hash(message, strlen(message), &T[0]);
	return 1;
}

B.E · Post by **B.E** » Tue Jun 26, 2007 10:07 pm

GLneo wrote:well, one slow down is the hashing algorithm ( which i didn't write ), you use it by making a state and a place to put the final hash:

take a look at the FreeBSD implementation. Don't know if it's faster (could not be bothered testing it either). Also you may get some preformace increase by placeing some variables in registers, (i.e register int var_name).

GLneo · Post by **GLneo** » Fri Jun 29, 2007 4:15 pm

wow!, that looks good, but i cant figure out whats wrong, and y are you printing the results that way? have you seen the little wikipedia example:

Code: Select all

//Note: All variables are unsigned 32 bits and wrap modulo 2^32 when calculating
var int[64] r, k

//r specifies the per-round shift amounts
r[ 0..15] := {7, 12, 17, 22,  7, 12, 17, 22,  7, 12, 17, 22,  7, 12, 17, 22} 
r[16..31] := {5,  9, 14, 20,  5,  9, 14, 20,  5,  9, 14, 20,  5,  9, 14, 20}
r[32..47] := {4, 11, 16, 23,  4, 11, 16, 23,  4, 11, 16, 23,  4, 11, 16, 23}
r[48..63] := {6, 10, 15, 21,  6, 10, 15, 21,  6, 10, 15, 21,  6, 10, 15, 21}

//Use binary integer part of the sines of integers as constants:
for i from 0 to 63
    k[i] := floor(abs(sin(i + 1)) Ã— (2 pow 32))

//Initialize variables:
var int h0 := 0x67452301
var int h1 := 0xEFCDAB89
var int h2 := 0x98BADCFE
var int h3 := 0x10325476

//Pre-processing:
append "1" bit to message
append "0" bits until message length in bits â‰¡ 448 (mod 512)
append bit (bit, not byte) length of unpadded message as 64-bit little-endian integer to message

//Process the message in successive 512-bit chunks:
for each 512-bit chunk of message
    break chunk into sixteen 32-bit little-endian words w[i], 0 â‰¤ i â‰¤ 15

    //Initialize hash value for this chunk:
    var int a := h0
    var int b := h1
    var int c := h2
    var int d := h3

    //Main loop:
    for i from 0 to 63
        if 0 â‰¤ i â‰¤ 15 then
            f := (b and c) or ((not b) and d)
            g := i
        else if 16 â‰¤ i â‰¤ 31
            f := (d and b) or ((not d) and c)
            g := (5Ã—i + 1) mod 16
        else if 32 â‰¤ i â‰¤ 47
            f := b xor c xor d
            g := (3Ã—i + 5) mod 16
        else if 48 â‰¤ i â‰¤ 63
            f := c xor (b or (not d))
            g := (7Ã—i) mod 16
 
        temp := d
        d := c
        c := b
        b := b + leftrotate((a + f + k[i] + w[g]) , r[i])
        a := temp

    //Add this chunk's hash to result so far:
    h0 := h0 + a
    h1 := h1 + b 
    h2 := h2 + c
    h3 := h3 + d

var int digest := h0 append h1 append h2 append h3 //(expressed as little-endian)

thx!

Ninjarider · Post by **Ninjarider** » Fri Jun 29, 2007 9:47 pm

that would take a long time to convert to asm. unless you were willing to do it from a command prompt. what exactly are the specs for you password cracker.

GLneo · Post by **GLneo** » Sat Jun 30, 2007 3:13 pm

I'm not really sure what you are saying?, converting it all to asm should be kinda easy. and what do you mean by specs?

p.s. I've switched my md5 lib, and now it runs at ~1,185,394 h/s, for comparison Cain is ~3,300,000 h/s

hopefully Kevin McGuire finds whats wrong with his lib and i can start using it and converting it to asm.

also I've been experimenting with Pthreads, but i cant find an efficient way to break up the work into threads. What i've tried is to break it up by length, were 1 thread works on 7 char long passes, and another on 8, etc... but i'm still confused on how if one thread finds a pass it can signal the others to stop and all that other IPC stuff, or if it is even useful to split the work on a single CPU?

thx!

Ninjarider · Post by **Ninjarider** » Sun Jul 01, 2007 3:30 pm

unless you would like to see it run on a multi core processor then i would leave threads alone.

Ninjarider · Post by **Ninjarider** » Sun Jul 01, 2007 3:35 pm

reson i asked about that is because knowledge of c very limited. that and i was not understanding how the hashes are created. the hash length. seeing psuedo code would help me out most.

converting it to asm would not be hard. just very tedious.

an alternative to threads. when you convert to asm you should be able to have the cpu and the fpu both do computations at the same time. not sure if it would help or if it would be more of a pain. when using the fpu. the fpu registers are 80-bits

GLneo · Post by **GLneo** » Mon Jul 02, 2007 11:28 am

but before any of this i need to find whats wrong with Kevin McGuire's code, then on to asm!

OSDev.org

brute forcer

spec's