I was wondering if anyone has looked into using fuzzing to detect side-channel vulnerabilities, and if there would be any interest in this group in pursuing that?
The basic idea is to split your fuzzing input into two parts, and run the two input-parts against your program. The fitness function of the fuzzer is then modified so that, in addition to coverage, it's also trying to maximize the difference in time, memory, and instructions between the two input parts. The research suggests that this can find real-world timing vulnerabilities. More information on the research can be found here: https://arxiv.org/abs/1811.07005
Sounds cool! I have never tried it! But there might be some others who did! Even though I never saw anything like this pass here.
Note that this can detect only a limited set of side-channel vulnerabilities. It's a worthwhile task, but won't catch things like Hamming-weight-based vulnerabilities.
This seems much alike to dudeCT statistical tests. dudeCT is also based on a two input-part approach, where you flip a coin for each part whether it should be random or not (you can surely also randomize lengths to test for padding-based timing side channels, instead of just comparisons though I haven't done that myself yet). DifFuzz, at first glance, also seems limited to timing-based side-channel vulnerabilities (please correct me if I'm wrong), and I can't quite tell why I should use that instead of dudeCT. Is there something I've missed?
Edit: I'm interested in this. I've been working on some side-channel testing myself for a library, and maybe I could use this for testing as well.
I could be wrong, but I think the main difference is that for dudeCT you need to hand-craft the input variables and it will tell you if there is a statistical difference between the two runs. With DiffFuzz, the inputs are dynamically discovered using a genetic algorithm. I think you could combine dudeCT and DiffFuzz together where DiffFuzz is used to find the inputs, and then dudeCT screens the results of DiffFuzz to determine wether the results are "real" or just statistical noise.
@brycx, I'm considering going ahead and building a proof-of-concept crate that uses a genetic-algorithm crate (there seems to be a few good options out there) and PAPI (https://github.com/LutzCle/papi-rs). If you're interested in this space as well, it might be worth while coordinating on approach before I get too deep into it.
Then it's somewhat in the lines of what I suspected. I still don't know whether it is limited to timing side-channels or if it could measure anything else. I actually thought about whether it was something to be combined, because finding input for dudeCT that produces the largest t-values could be handy. On the other hand, some crypto algorithms are very straightforward to craft such inputs for, such as comparisons. I'm not familiar with PAPI, but the crate seems to only support (only tested on) Linux of the major platforms. If DifFuzz only supports timing-based side-channels, it would be a shame to limit the fuzzer to only Linux machines (if the crate doesn't work on other major platforms), when the CPU's cycle counter could just as well have been used.
This would be my first time being involved building something like this, so it's probably best if you just go ahead on the PoC. This way I can follow along and get a more solid understanding of what this project would consist of. I'm still happy to discuss it though.
Also for dudect, you don't actually need to hand-craft, you can use random ones as well. I'm sure hand-crafted work better in some instances though.
@brycx , thanks for your input. I would opt to use PAPI mainly because my skillset is not very well suited to low-level work. I'll try to get the prototype working with PAPI, then open it up to other contributors who might have the skillset to interface with the hardware more directly. I'll let the group here know when I have something working. :)
No problem, and that's understandable. Looking forward to it!
brycx , thanks for your input. I would opt to use PAPI mainly because my skillset is not very well suited to low-level work. I'll try to get the prototype working with PAPI, then open it up to other contributors who might have the skillset to interface with the hardware more directly. I'll let the group here know when I have something working. :)
I would actually suggest that you keep using papi-rs for as long as you possibly can; no need to reinvent the wheel when someone is giving a perfectly good one!
I wanted to give an update on fuzzing for side-channel vulnerabilities.
I've got a basic proof-of-concept up and running here:
I've got a few example fuzzing targets here:
Summary: It works! It can successfully tell the difference between constant-time and non-constant time functions by fuzzing inputs. It operates in two steps:
Step 1. Use a genetic algorithm to find candidate input pairs that look like they might have large differenced in timing (similar to DiffFuzz)
Step 2. Uses statistics to check t-values to confirm that there is really a difference (similar to DudeCT).
Right now I'm counting cpu-cycles as my primary measurement. It should work on all x64 platforms. Using cpu-cycles is a bit noisy, so I still want to use PAPI behind a feature-flag for linux users. Please take it for a run and give feedback. There's still lots of cleanup to do in terms of documentation etc etc.
The next steps are:
1. Clean up the code and document.
2. Start making fuzzing-targets against real crates!
That sounds pretty cool! Good work man! I hope I can give it a try somewhere today and report back here!
I took another look at the README and stuff like that! I think it really looks good! The targets repository is nice as well.
Just the other day I created a list of Rust related security projects here: https://github.com/rust-secure-code/rustsec-projects
If you want I can add your fuzzer here as well? Or if you want you can create a PR / issue yourself on that repo :)
Keep the good work up at least!
Is the motivation that no such tool currently exists for any language, or just for Rust? Or is it something I've missed?
There’s a similar tool for Java here: https://github.com/isstac/diffuzz
However, SideFuzz is not a strict port of the above, and is trying to combine the best parts of both Java’s DiffFuzz and dudect. In that regard it might be unique.
Another update on side-channel fuzzing.
You can check out the sidefuzz tool here: https://github.com/phayes/sidefuzz
It is "basically done" modulo a few minor improvements and publishing to crates.io . I made a major pivot in the architecture of the whole thing. Instead of counting CPU cycles, it now requires that targets be compiled to wasm, and counts individual instructions as it executes the target inside a wasm interpreter. This has some advantages and disadvantages:
1. Major advantage is that using wasm makes the fuzzer WAY more accurate. It is now easily able to pick out side-channel vulnerabilities, and consistently evolves inputs that maximize timing differences. More importantly, there are no more false-positives. It works exactly as it should, and it works really well.
2. Major disadvantage is that we now require the targets to compile to wasm. Targets like ring don't compile to wasm, so we can't fuzz them. However, I expect this disadvantage to slowly diminish over time as more and more cryptography crates add wasm support.
3. Bonus: we can now fuzz non-rust code for side-channel vulnerabilities. Go, C, and C++ can all be used with sidefuzz if they can compile to wasm.
Please take a look at provide feedback. The tool has already found it's first side-channel vulnerability, and I will be reporting that to it the relevant crate shortly.
Fist potential side-channel vulnerability found reported: https://github.com/RustCrypto/RSA/issues/19
Sounds cool! Let's hope most projects make the transition to WASM such that we can fuzz most things.