brute forcer
-
- Member
- Posts: 62
- Joined: Fri Jun 29, 2007 8:36 pm
I'm not sure if it would be possible to do this in C; I know it is done in asm. Pentium-class x86 processors have a U pipe and a V pipe. Certain instructions come in on the U pipe, others on the V pipe, and some can go in on either pipe. There might be a way to arrange the instructions in the loops so that the opcodes pair up, possibly making the loops up to twice as fast.
Starting work on an asm module. It might take a while to get it on here since I don't have internet at the house.
-
- Member
- Posts: 62
- Joined: Fri Jun 29, 2007 8:36 pm
As far as compatibility goes, there shouldn't be any issue using the pipelines on an Intel 32-bit processor. It would not have the same speed going from an Intel to an AMD. It will not exactly double the speed, but it gives the possibility of increasing speed up to double for any loops.
Not to mention that when implementing something like that you can run into computation errors, because the V pipe has executed an instruction before the U pipe and the U pipe required a value the V pipe changed.
- Brynet-Inc
- Member
- Posts: 2426
- Joined: Tue Oct 17, 2006 9:29 pm
- Libera.chat IRC: brynet
- Location: Canada
- Contact:
Ninjarider wrote:As far as compatibility goes, there shouldn't be any issue using the pipelines on an Intel 32-bit processor. It would not have the same speed going from an Intel to an AMD. It will not exactly double the speed, but it gives the possibility of increasing speed up to double for any loops.
Not to mention that when implementing something like that you can run into computation errors, because the V pipe has executed an instruction before the U pipe and the U pipe required a value the V pipe changed.
I think he meant it would work on "any" 32-bit CPU. You do know there are more than just x86 processors, right?
-
- Member
- Posts: 62
- Joined: Fri Jun 29, 2007 8:36 pm
segfault
Found myself 5 minutes to look at it and found the source of the problem.
Please change
Code: Select all
sscanf(&argv[1][t*2], "%2x", (unsigned int *)&raw_inhash[t]);
to
Code: Select all
unsigned int tmp;
sscanf(&argv[1][t*2], "%2x", &tmp);
raw_inhash[t] = tmp;
Keep up the good work. I love watching projects like this.
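A minimal sketch of why this change matters: `%2x` makes `sscanf` store a whole `unsigned int`, so writing it through a cast pointer into a 16-byte `unsigned char` array overruns the array on its last elements. The helper name `parse_hex_digest` is hypothetical and not taken from the posted source; it only illustrates the temporary-variable approach with some input checking added.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from the posted source): parse 32 hex
 * characters into 16 raw bytes. Returns 0 on success, -1 on bad input. */
static int parse_hex_digest(const char *hex, unsigned char out[16])
{
    size_t t;

    if (strlen(hex) != 32)
        return -1;

    for (t = 0; t < 16; ++t) {
        unsigned int tmp;   /* %2x stores a whole unsigned int */

        if (sscanf(&hex[t * 2], "%2x", &tmp) != 1)
            return -1;
        out[t] = (unsigned char)tmp;   /* keep only the low byte */
    }
    return 0;
}
```

Called as `parse_hex_digest(argv[1], raw_inhash)`, this would also reject hashes of the wrong length before any hashing starts.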
Re: segfault
jhawthorn wrote:Found myself 5 minutes to look at it and found the source of the problem.
Please change
Code: Select all
sscanf(&argv[1][t*2], "%2x", (unsigned int *)&raw_inhash[t]);
to
Code: Select all
unsigned int tmp; sscanf(&argv[1][t*2], "%2x", &tmp); raw_inhash[t] = tmp;
Keep up the good work. I love watching projects like this.
Doh, that error was kind of obvious. I didn't look into it because I didn't rearrange the code, but thanks. Of course we will continue; I have to beat Cain.
Author of COBOS
An additional 5%, though it is now getting problematic to increase speed. The MD5 hash is almost entirely done in registers. Now I will use this version as a basis for multithreading.
old version:
$ time ./brute.exe d6a6bc0db10694a2d90e3a69648f3a03 6
Collision Found!
hash[d6a6bc0db10694a2d90e3a69648f3a03] = 'hacker'
- time: 48.81s
- avg. hash/s: 4178977.22 h/s
real 0m48.938s
user 0m48.859s
sys 0m0.000s
new version:
$ time ./brute.exe d6a6bc0db10694a2d90e3a69648f3a03 6
Collision Found!
hash[d6a6bc0db10694a2d90e3a69648f3a03] = 'hacker'
- time: 22.98s
- avg. hash/s: 4371943.88 h/s
real 0m23.047s
user 0m23.015s
sys 0m0.015s
Author of COBOS
Latest changes have put me up to ~4050000 h/s on my AMD64 3000+. I have to disagree with changing digest, raw_inhash, and charset into global variables. Digest will almost certainly be used individually by each thread eventually. charset will, hopefully, not always be a constant. Moreover, all the functions are being inlined, so there shouldn't (assuming your compiler is half sane) be a big performance hit from passing them an additional argument.
I look forward to seeing the threaded and then distributed versions of this piece of code.
Well, why even use digest? Now that the comparison is made in the MD5 hasher, why do we still use that variable?
Code: Select all
int __attribute__((__always_inline__)) md5_hash(unsigned char *message, unsigned int mlength, unsigned char input[16])
{
uint32_t AA, BB, CC, DD;
uint32_t *X;
uint32_t A, B, C, D;
uint32_t i;
AA = 0x67452301;
BB = 0xefcdab89;
CC = 0x98badcfe;
DD = 0x10325476;
for(i = 0; i < (mlength / 64); ++i)
{
A = AA;
B = BB;
C = CC;
D = DD;
X = (uint32_t *)&message[i * 64];
/// round one (unrolled)
A = B + ROTATE_LEFT((A + F(B, C, D) + X[ 0] + 0xd76aa478), 7);
D = A + ROTATE_LEFT((D + F(A, B, C) + X[ 1] + 0xe8c7b756), 12);
C = D + ROTATE_LEFT((C + F(D, A, B) + X[ 2] + 0x242070db), 17);
B = C + ROTATE_LEFT((B + F(C, D, A) + X[ 3] + 0xc1bdceee), 22);
A = B + ROTATE_LEFT((A + F(B, C, D) + X[ 4] + 0xf57c0faf), 7);
D = A + ROTATE_LEFT((D + F(A, B, C) + X[ 5] + 0x4787c62a), 12);
C = D + ROTATE_LEFT((C + F(D, A, B) + X[ 6] + 0xa8304613), 17);
B = C + ROTATE_LEFT((B + F(C, D, A) + X[ 7] + 0xfd469501), 22);
A = B + ROTATE_LEFT((A + F(B, C, D) + X[ 8] + 0x698098d8), 7);
D = A + ROTATE_LEFT((D + F(A, B, C) + X[ 9] + 0x8b44f7af), 12);
C = D + ROTATE_LEFT((C + F(D, A, B) + X[10] + 0xffff5bb1), 17);
B = C + ROTATE_LEFT((B + F(C, D, A) + X[11] + 0x895cd7be), 22);
A = B + ROTATE_LEFT((A + F(B, C, D) + X[12] + 0x6b901122), 7);
D = A + ROTATE_LEFT((D + F(A, B, C) + X[13] + 0xfd987193), 12);
C = D + ROTATE_LEFT((C + F(D, A, B) + X[14] + 0xa679438e), 17);
B = C + ROTATE_LEFT((B + F(C, D, A) + X[15] + 0x49b40821), 22);
/// round two (unrolled)
A = B + ROTATE_LEFT((A + G(B, C, D) + X[ 1] + 0xf61e2562), 5);
D = A + ROTATE_LEFT((D + G(A, B, C) + X[ 6] + 0xc040b340), 9);
C = D + ROTATE_LEFT((C + G(D, A, B) + X[11] + 0x265e5a51), 14);
B = C + ROTATE_LEFT((B + G(C, D, A) + X[ 0] + 0xe9b6c7aa), 20);
A = B + ROTATE_LEFT((A + G(B, C, D) + X[ 5] + 0xd62f105d), 5);
D = A + ROTATE_LEFT((D + G(A, B, C) + X[10] + 0x02441453), 9);
C = D + ROTATE_LEFT((C + G(D, A, B) + X[15] + 0xd8a1e681), 14);
B = C + ROTATE_LEFT((B + G(C, D, A) + X[ 4] + 0xe7d3fbc8), 20);
A = B + ROTATE_LEFT((A + G(B, C, D) + X[ 9] + 0x21e1cde6), 5);
D = A + ROTATE_LEFT((D + G(A, B, C) + X[14] + 0xc33707d6), 9);
C = D + ROTATE_LEFT((C + G(D, A, B) + X[ 3] + 0xf4d50d87), 14);
B = C + ROTATE_LEFT((B + G(C, D, A) + X[ 8] + 0x455a14ed), 20);
A = B + ROTATE_LEFT((A + G(B, C, D) + X[13] + 0xa9e3e905), 5);
D = A + ROTATE_LEFT((D + G(A, B, C) + X[ 2] + 0xfcefa3f8), 9);
C = D + ROTATE_LEFT((C + G(D, A, B) + X[ 7] + 0x676f02d9), 14);
B = C + ROTATE_LEFT((B + G(C, D, A) + X[12] + 0x8d2a4c8a), 20);
/// round three (unrolled)
A = B + ROTATE_LEFT((A + H(B, C, D) + X[ 5] + 0xfffa3942), 4);
D = A + ROTATE_LEFT((D + H(A, B, C) + X[ 8] + 0x8771f681), 11);
C = D + ROTATE_LEFT((C + H(D, A, B) + X[11] + 0x6d9d6122), 16);
B = C + ROTATE_LEFT((B + H(C, D, A) + X[14] + 0xfde5380c), 23);
A = B + ROTATE_LEFT((A + H(B, C, D) + X[ 1] + 0xa4beea44), 4);
D = A + ROTATE_LEFT((D + H(A, B, C) + X[ 4] + 0x4bdecfa9), 11);
C = D + ROTATE_LEFT((C + H(D, A, B) + X[ 7] + 0xf6bb4b60), 16);
B = C + ROTATE_LEFT((B + H(C, D, A) + X[10] + 0xbebfbc70), 23);
A = B + ROTATE_LEFT((A + H(B, C, D) + X[13] + 0x289b7ec6), 4);
D = A + ROTATE_LEFT((D + H(A, B, C) + X[ 0] + 0xeaa127fa), 11);
C = D + ROTATE_LEFT((C + H(D, A, B) + X[ 3] + 0xd4ef3085), 16);
B = C + ROTATE_LEFT((B + H(C, D, A) + X[ 6] + 0x04881d05), 23);
A = B + ROTATE_LEFT((A + H(B, C, D) + X[ 9] + 0xd9d4d039), 4);
D = A + ROTATE_LEFT((D + H(A, B, C) + X[12] + 0xe6db99e5), 11);
C = D + ROTATE_LEFT((C + H(D, A, B) + X[15] + 0x1fa27cf8), 16);
B = C + ROTATE_LEFT((B + H(C, D, A) + X[ 2] + 0xc4ac5665), 23);
/// round four (unrolled)
A = B + ROTATE_LEFT((A + I(B, C, D) + X[ 0] + 0xf4292244), 6);
D = A + ROTATE_LEFT((D + I(A, B, C) + X[ 7] + 0x432aff97), 10);
C = D + ROTATE_LEFT((C + I(D, A, B) + X[14] + 0xab9423a7), 15);
B = C + ROTATE_LEFT((B + I(C, D, A) + X[ 5] + 0xfc93a039), 21);
A = B + ROTATE_LEFT((A + I(B, C, D) + X[12] + 0x655b59c3), 6);
D = A + ROTATE_LEFT((D + I(A, B, C) + X[ 3] + 0x8f0ccc92), 10);
C = D + ROTATE_LEFT((C + I(D, A, B) + X[10] + 0xffeff47d), 15);
B = C + ROTATE_LEFT((B + I(C, D, A) + X[ 1] + 0x85845dd1), 21);
A = B + ROTATE_LEFT((A + I(B, C, D) + X[ 8] + 0x6fa87e4f), 6);
D = A + ROTATE_LEFT((D + I(A, B, C) + X[15] + 0xfe2ce6e0), 10);
C = D + ROTATE_LEFT((C + I(D, A, B) + X[ 6] + 0xa3014314), 15);
B = C + ROTATE_LEFT((B + I(C, D, A) + X[13] + 0x4e0811a1), 21);
A = B + ROTATE_LEFT((A + I(B, C, D) + X[ 4] + 0xf7537e82), 6);
D = A + ROTATE_LEFT((D + I(A, B, C) + X[11] + 0xbd3af235), 10);
C = D + ROTATE_LEFT((C + I(D, A, B) + X[ 2] + 0x2ad7d2bb), 15);
B = C + ROTATE_LEFT((B + I(C, D, A) + X[ 9] + 0xeb86d391), 21);
AA += A;
BB += B;
CC += C;
DD += D;
}
/* Compare as 32-bit words: unsigned long is 8 bytes on LP64 targets, which would break this check. */
if((*(uint32_t *)(input) == (AA)) &&
(*(uint32_t *)(input+4) == (BB)) &&
(*(uint32_t *)(input+8) == (CC)) &&
(*(uint32_t *)(input+12) == (DD)))
return 1;
return 0;
}
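The listing relies on `F`, `G`, `H`, `I` and `ROTATE_LEFT`, none of which appear in the post. The definitions below are the standard ones from RFC 1321; that the attached source uses exactly these forms is an assumption.

```c
#include <stdint.h>

/* Auxiliary functions and rotation as given in RFC 1321.
 * Assumed to match the macros in the attached source. */
#define F(x, y, z) (((x) & (y)) | (~(x) & (z)))
#define G(x, y, z) (((x) & (z)) | ((y) & ~(z)))
#define H(x, y, z) ((x) ^ (y) ^ (z))
#define I(x, y, z) ((y) ^ ((x) | ~(z)))

/* Rotate a 32-bit value left by n bits, 0 < n < 32. */
#define ROTATE_LEFT(x, n) (((x) << (n)) | ((x) >> (32 - (n))))
```

All arguments are expected to be `uint32_t`, as they are in the listing; with a wider type the rotate would no longer wrap at 32 bits.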
jhawthorn wrote:Latest changes have put me up to ~4050000 h/s on my AMD64 3000+. I have to disagree with changing digest, raw_inhash, and charset into global variables. Digest will almost certainly be used individually by each thread eventually. charset will, hopefully, not always be a constant. Moreover, all the functions are being inlined, so there shouldn't (assuming your compiler is half sane) be a big performance hit from passing them an additional argument.
I look forward to seeing the threaded and then distributed versions of this piece of code.
Well, you can disagree, but I just did it to gain performance. The md5 hash function now uses registers for the whole md5 processing. The additional argument did cost a few percent for the same reason as above. I should test it on 64-bit, because then the global variable accesses will be converted to RIP-relative addressing.
I added a new version because there was a bug in the previous versions (try aaa as a password). However, it seems to have slowed down a bit.
The process takes longer now, but the hashes per second are still high. The total time is bogus anyway; for instance, test with the passwords zzzzza and zzzzzz.
@glneo
$ time ./brute.exe d6a6bc0db10694a2d90e3a69648f3a03 6
Collision Found!
hash[d6a6bc0db10694a2d90e3a69648f3a03] = 'hacker'
- time: 47.14s
- avg. hash/s: 4327197.45 h/s
real 0m47.297s
user 0m47.187s
sys 0m0.030s
I tested that and it produced about the same result. I even made an MD5 hash that doesn't have the length parameter either, based on the assumption that a password is generally shorter than or equal to 64-9 = 55 characters (so the message, the 0x80 padding byte and the 8-byte length all fit in one 64-byte block), but even that didn't improve much.
@all
I've been trying to get the multithreaded function running and succeeded, however the performance didn't even get close to the single-threaded version, so I am puzzled. I think I'll leave the MT version for Kevin.
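The 64-9 = 55 figure follows from MD5 padding: one mandatory 0x80 byte plus an 8-byte bit length must fit in the same 64-byte block as the message. A minimal sketch of building that single block is below; `md5_pad_single_block` is a hypothetical name and is not the routine from the attached source.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: build the single 64-byte MD5 block for a
 * message of at most 55 bytes. Returns 0 on success, -1 if too long. */
static int md5_pad_single_block(const unsigned char *msg, size_t len,
                                unsigned char block[64])
{
    uint64_t bits = (uint64_t)len * 8;
    int i;

    if (len > 55)
        return -1;              /* 0x80 + 8 length bytes would not fit */

    memset(block, 0, 64);
    memcpy(block, msg, len);
    block[len] = 0x80;          /* mandatory padding byte */

    for (i = 0; i < 8; ++i)     /* bit length, little-endian, bytes 56..63 */
        block[56 + i] = (unsigned char)(bits >> (8 * i));
    return 0;
}
```

With the block fixed at 64 bytes, the outer `for` loop over `mlength / 64` in the hasher runs exactly once, which is the simplification described above.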
Author of COBOS
Ahh, for such sweet moments we live. I am glad to announce that multi-threading is working. And for your pleasure, here it is. I know that some of you are eager for the stats. I limited the sequence to 32 characters.
$ g++ brute-mt.cc -foptimize-register-move -finline-functions -fno-exceptions -fno-rtti -fomit-frame-pointer -O3 -march=i686 -o brute.exe
$ time ./brute.exe d6a6bc0db10694a2d90e3a69648f3a03 6 2
threadList[t].sequence[0]: 0
threadList[t].sequence[0]: 13
Collision Found!
hash[d6a6bc0db10694a2d90e3a69648f3a03] = 'hacker'
time: 23.23s
- avg. hash/s: 7106282.43 h/s
#done.
real 0m23.454s
user 0m46.655s
sys 0m0.015s
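The two `sequence[0]` values in the log (0 and 13) are consistent with splitting a 26-letter charset between two threads by first character. That is an inference from the output, not something confirmed by the attached source, and `thread_first_index` below is a hypothetical name for such a split.

```c
#include <stddef.h>

/* Hypothetical helper: index into the charset at which thread t
 * (0-based) of nthreads starts its first character. Thread t then
 * covers first characters in [start(t), start(t + 1)). */
static size_t thread_first_index(size_t charset_len, size_t t, size_t nthreads)
{
    return (charset_len * t) / nthreads;
}
```

Because each thread owns a disjoint range of first characters, the threads need no shared state apart from a "found" flag.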
Author of COBOS
Well, I have a Core 2 Duo running at 1.4 GHz and I have some results for the code os64dev posted above.
2 Threads
Code: Select all
$ time ./brute d6a6bc0db10694a2d90e3a69648f3a03 6 2
threadList[t].sequence[0]: 0
threadList[t].sequence[0]: 13
Collision Found!
hash[d6a6bc0db10694a2d90e3a69648f3a03] = 'hacker'
time: 38.87s
- avg. hash/s: 4493377.81 h/s
#done.
real 0m38.977s
user 1m14.256s
sys 0m0.093s
1 Thread
Code: Select all
$ time ./brute d6a6bc0db10694a2d90e3a69648f3a03 6 1
threadList[t].sequence[0]: 0
Collision Found!
hash[d6a6bc0db10694a2d90e3a69648f3a03] = 'hacker'
time: 26.98s
- avg. hash/s: 3084769.79 h/s
#done.
real 0m27.098s
user 0m25.864s
sys 0m0.139s
EDIT: Cain takes 52 seconds on my computer when I set it to min 6 max 6 and the lowercase alpha charset. It says 4050000 pass/s.
- Kevin McGuire
- Member
- Posts: 843
- Joined: Tue Nov 09, 2004 12:00 am
- Location: United States
- Contact:
Does this one do any better? I am too afraid to report my findings since the last time I pursued what I thought was faster was not.
Compile
gcc md5.c -o md5 -O3
Options
md5 [hash] [minimum-length] [maximum-length] [thread-count]
Try it with the hash:
d6a6bc0db10694a2d90e3a69648f3a03 = hacker (longer run time)
Also, multiple threads on a uniprocessor (UNI) machine can shorten the time to find a match, because each thread starts at a different offset in the message space.
- Attachments
-
- md5.c
- (11.55 KiB) Downloaded 127 times