Active7 years, 10 months ago
I am doing an assignment for class which I have to create a brute force password cracker in java. Write a function using Recursion to crack a password. The password is of unknown length (maximum 10) and is made up of capital letters and digits. The brute-force attack would likely start at one-digit passwords before moving to two-digit passwords and so on, trying all possible combinations until one works. A “dictionary attack” is similar and tries words in a dictionary — or a list of common passwords — instead of all possible passwords. There are different types of attacks. For example, the brute force attack would simply try all possible password combinations starting with “0” and followed with “1”, “2”,, all the way to “ZZZZZZZZZ” or whatever the last character or special symbol there is instead of the “Z” in the chosen character set. I'm currently working on a Brute Force Password Cracker in C++ that can crack passwords 8 times as fast! ( Averaging 16 million passwords per second ) 11 months ago.
I'm working on my 10th grade science fair project right now and I've kind of hit a wall. My project is testing the effect of parallelism on the efficiency of brute forcing md5 password hashes. I'll be calculating the # of password combinations/second it tests to see how efficient it is, using 1, 4,16,32,64,128,512,and 1024 threads. I'm not sure if I'll do dictionary brute force or pure brute force. I figure that dictionary would be easier to parallelize; just split the list up into equal parts for each thread. I haven't written much code yet; I'm just trying to plan it out before I start coding.
My questions are:
- Is calculating the password combinations tested/second the best way to determine the performance based on # of threads?
- Dictionary or pure brute force? If pure brute force, how would you split up the task into a variable number of threads?
- Any other suggestions?
68k2323 gold badges183183 silver badges220220 bronze badges
Andy HadenAndy Haden
4 Answers
I'm not trying to dampen your enthusiasm, but this is already quite a well understood problem. I'll try to explain what to expect below. But maybe it would be better to do your project in another area. How's about 'Maximising MD5 hashing throughput' then you wouldn't be restricted to just looking at threading.
I think that when you write up your project, you'll need to offer some kind of analysis as to when parallel processing is appropriate and when it isn't.
Each time that your CPU changes to another thread, it has to persist the current thread context and load the new thread context. This overhead does not occur in a single-threaded process (except for managed services like garbage collection). So all else equal, adding threads won't improve performance because it must do the original workload plus all of the context switching.
But if you have multiple CPUs (cores) at your disposal, creating one thread per CPU will mean that you can parallelize your calculations without incurring context switching costs. If you have more threads than CPUs then context switching will become an issue.
There are 2 classes of computation: IO-bound and compute-bound. An IO-bound computation can spend large amounts of CPU cycles waiting for a response from some hardware like a network card or a hard disk. Because of this overhead, you can increase the number of threads to the point where the CPU is maxed out again, and this can cancel out the cost of context switching. However there is a limit to the number of threads, beyond which context switching will take up more time than the threads spend blocking for IO.
Compute-bound computations simply require CPU time for number crunching. This is the kind of computation used by a password cracker. Compute-bound operations do not get blocked, so adding more threads than CPUs will slow down your overall throughput.
The C# ThreadPool already takes care of all of this for you - you just add tasks, and it queues them until a Thread is available. New Threads are only created when a thread is blocked. That way, context switches are minimised.
How To Crack Passwords With Kali Linux
I have a quad-core machine - breaking the problem into 4 threads, each executing on its own core, will be more or less as fast as my machine can brute force passwords.
To seriously parallelize this problem, you're going to need a lot of CPUs. I've read about using the GPU of a graphics card to attack this problem.
There's an analysis of attack vectors that I wrote up here if it's any use to you. Rainbow tables and the processor/memory trade offs would be another interesting area to do a project in.
sheikhjabootiesheikhjabootie6,40722 gold badges2929 silver badges4040 bronze badges
To answer your question:1) There is nothing like the best way to test thread performance. Different problems scale differently with threads, depending on how independent each operation in the target problem is. So you can try the dictionary thing. But, when you analyse the results, the results that you get might not be applicable on all problems. One very popular example however, is that people try a shared counter, where the counter is increased by a fixed number of times by each thread.
2) Brute force will cover a large number of cases. In fact, by brute force, there can be an infinite number of possibilities. So, you might have to limit your password by some constraints like the maximum length of the password and so on. One way to distribute brute force is to assign each thread a different starting character for the password. The thread then tests all possible passwords for that starting character. Once the thread finishes its work, it gets another starting character till you use all possible starting symbols.
3) One suggestion that I would like to give you is to test on a little smaller number of threads. You are going upto 1024 threads. That is not a good idead. The number of cores on a machine is generally 4 to 10. So, try not to exceed the number of threads by a huge number than the number of cores. Because, a processor cannot run multiple threads at the same time. Its one thread per processor at any given time. Instead, try to measure performace for different schemes for assigning the problem to different threads.
Let me know if this helps!
![Algorithm Algorithm](/uploads/1/2/6/4/126464948/371136228.png)
3,01666 gold badges4040 silver badges7373 bronze badges
One solution that will work for both a dictionary and a brute-force of all possible passwords is to use a approach based around dividing the job up into work units. Have a shared object responsible for dividing the problem space up into units of work - ideally, something like 100ms to 5 seconds worth of work each - and give a reference to this object to each thread you start. Each thread then operates in a loop like this:
The advantage of this over just parcelling up the whole workspace into one chunk per thread up-front is that if one thread works faster than others, it won't run out of work and just sit idle - it'll pick up more chunks.
Hard To Crack Passwords
Ideally your work item generator would have an interface that, when called, returns an iterator, which itself returns individual passwords to test. The dictionary-based one, then, selects a range from the dictionary, while the brute force one selects a prefix to test for each batch. You'll need to use synchronization primitives to stop races between different threads trying to grab work units, of course.
Nick JohnsonNick Johnson95.5k1616 gold badges119119 silver badges192192 bronze badges
In both the dictionary and brute force methods, the problem is Embarrassingly Parallel.To divide the problem for brute force with n threads, just say, the first two (or three) letters (the 'prefix') into n pieces. Then, each thread has a set of assigned prefixes, like 'aa - fz' where it is responsible only for testing everything that follows its prefixes.
Dictionary is usually statistically slightly better in practice for cracking more passwords, but brute force, since it covers everything, cannot miss a password within the target length.
VoidStarVoidStar