SoftwareSecurity2013/Group 42/Code Scanning Reflection
Reflection on RIPS
RIPS is a code analyzer written specifically for PHP applications. Its main focus is detecting where tainted user input reaches sensitive sinks, i.e. potentially vulnerable functions. These properties make it, in our view, well suited to our task: scanning an open source web application written in PHP, focusing on ASVS V5 (Input validation).
Setting it up is straightforward and quick. Scanning the entire MediaWiki code took just over half an hour. The first thing that drew our attention was the message stating that RIPS does not support object-oriented code. This is a pity, since it does its job quite well (more on that below) and most of MediaWiki's code is object-oriented.
While viewing the results we immediately found the analyzer far too verbose. For example, it commonly reports the use of a potentially vulnerable function A within function B, even when no path from actual user input to function B exists. (Erik: Is this a blanket warning that RIPS provides, without even checking whether user input can reach B, or is it an inaccuracy in its data flow analysis, which overestimates flows and hence incorrectly concludes that user input can reach B?) Such a report may point to bad design, but in many cases the flagged code is the most sensible design. In any case it is not a vulnerability, and we would have highly appreciated an option to suppress these results in order to quickly dive into the more serious threats.
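A minimal sketch of the pattern described above. The function name and its call site are hypothetical and not taken from MediaWiki; echo stands in for the potentially vulnerable function A, and renderBanner for function B.

```php
<?php
// Function B (hypothetical): passes its argument to echo (the sensitive
// function A in this sketch) without escaping it.
function renderBanner($text) {
    echo "<div class=\"banner\">" . $text . "</div>\n";
}

// The only call site passes a constant string, so no user input can
// reach the sink through this function.
renderBanner("Welcome to the wiki");
```

Whether a report on such code is useful depends on the goal: it flags a function that would become dangerous if a future caller passed user input, but it is not a vulnerability as the code stands.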
The very nice feature of RIPS, and its underlying strength, is its traces. A trace lets the user immediately see where in the code user input is taken, which intermediate functions are called on this data, and in which potentially vulnerable function it eventually ends up. When analyzing code by hand it is easy to lose the overview of all the data flows. RIPS automates this, making it very easy to spot where things go wrong; when they do, the place to patch the code also becomes clear almost immediately.
As stated, the output is far too verbose: most of the reported results are not vulnerabilities and hence false positives. This need not be a bad thing when analyzing a small project and when the user is also interested in spotting bad design. However, we truly missed a feature that lets the user organize, filter and customize the reports. The output in its current form seems too rigid.
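The kind of flow such a trace makes visible can be sketched as follows. All names are hypothetical and the code is not taken from MediaWiki or from RIPS output; it only shows the three steps (source, intermediate function, sink) and where a patch would go.

```php
<?php
// Intermediate function: reshapes the data but does not sanitize it.
function formatGreeting($name) {
    return "Hello, " . trim($name) . "!";
}

// Patched variant: escaping the value before it reaches the sink.
function formatGreetingSafe($name) {
    return "Hello, " . htmlspecialchars(trim($name), ENT_QUOTES, 'UTF-8') . "!";
}

// Source: user-controlled input. The fallback value is an assumption
// that keeps the sketch runnable from the command line.
$name = isset($_GET['name']) ? $_GET['name'] : '<script>alert(1)</script>';

// Sink: echo writes the value into the HTML response.
echo formatGreeting($name), "\n";      // tainted value reaches the sink unescaped
echo formatGreetingSafe($name), "\n";  // same flow, sanitized before the sink
```

Following the flow from the sink back to the source shows that the intermediate function is the natural place to add the escaping.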
Reflection on Fortify
Fortify is a very extensive commercial code analyzer that is not targeted at a specific purpose or language. It attempts to be as universal as possible and boasts an impressive number of options. When we first started using it we were a bit sceptical about its usefulness for our purpose because of its universality. However, it turned out to be quite an impressive analyzer after all.
Setting it up was not very painful, although 3 GB for a code analyzer seems quite bloated. Once we got the analyzer up and running we pointed it to the MediaWiki source code and started the analysis. This is where the pain started. The analyzer ran for about ten minutes, only to report a small number of not very serious vulnerabilities and to state that it had run out of memory. It was also unable to parse the following statement within a class definition:
static $attribsRegex;
It turns out that the PHP parser included in Fortify requires the code to assign a value to a (static) class variable. The unassigned declaration is nevertheless valid PHP: a static class variable declared without a value simply defaults to null, so assigning null explicitly does not change the behaviour of the code. The requirement does have a parallel in, for example, C++, where static class variables must be defined explicitly, and initializing them can be regarded as good coding convention in PHP as well.
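A minimal sketch of the two forms side by side. The class names are placeholders, and null as the assigned value is an assumption; only the name $attribsRegex comes from the statement above.

```php
<?php
class WithoutInitializer {
    static $attribsRegex;          // the form the Fortify parser rejected
}

class WithInitializer {
    static $attribsRegex = null;   // the form with an explicitly assigned value
}

// Both reads yield null, so adding the initializer does not alter
// what the code does at run time.
var_dump(WithoutInitializer::$attribsRegex);
var_dump(WithInitializer::$attribsRegex);
```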
This and similar parse errors were easily fixed, but it was quite annoying that the errors only popped up after ten minutes of analysis. The same holds for the out-of-memory error, which was solved by passing "-Xmx4G" "-64" as arguments to the code analyzer.
When digging through the reported issues it was easy to focus on our objective, input validation, as the issues are conveniently organized in groups. We have not looked into all reported issues due to lack of time, but found that most of the issues we examined are either false positives or duplicates. (Erik: Again, on how many out of how many warnings do you base this conclusion?) Although Fortify presumably offers a way of flagging patterns of detected vulnerabilities as safe, we did not look extensively for such functionality, since most of the duplicates are grouped together and hence easily skipped over.
We were impressed by Fortify's graphical diagram, which displays a trace of how tainted input reaches possibly vulnerable functions unsanitized. This is a very convenient and intuitive way of showing at a glance how possibly malicious data flows through the program. The diagram can also be clicked, resulting in a jump to the corresponding code.
We must say that Fortify detected quite an impressive number of issues, although an overwhelming majority was about hardcoded passwords. Fortify triggers such a warning whenever it finds the word "password" in a string. Apart from very rare cases this does not help the user at all. However, it is hard to come up with a smarter way of detecting hardcoded passwords, and at least the cases where a password really is hardcoded do get detected. Fortunately we could do away with all these warnings by simply collapsing this category of issues.
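To illustrate why a check on the word "password" produces so many hits, here is a naive sketch modelled on the behaviour described above. It is not Fortify's actual rule; the function name and the example strings are hypothetical.

```php
<?php
// Naive check: flags any string literal that mentions "password",
// regardless of whether it contains a secret.
function mentionsPassword($literal) {
    return stripos($literal, 'password') !== false;
}

$literals = array(
    'Please enter your password:',   // UI label, harmless
    'user_password',                 // column name, harmless
    'db_password=hunter2',           // an actual hardcoded credential
    'Welcome back',                  // unrelated string
);

foreach ($literals as $literal) {
    printf("%-30s %s\n", $literal, mentionsPassword($literal) ? 'flagged' : 'ok');
}
```

The first three strings are flagged although only the third contains a secret, which matches the ratio of useful to useless warnings we observed in this category.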
All in all, it seems safe to say that Fortify is far ahead of open source alternatives in many respects. Due to all its features it has a somewhat steep learning curve, although this is not too bad and the tool is targeted at security specialists anyway. More unfortunate are the vast amount of resources required to run the tool and the errors that only pop up after the analysis is complete.