
5.1 Metrics

An ideal bug detector would find every existing bug without flagging any correct code as incorrect. The initial output from cqual is a list of warnings, each indicating a type error somewhere in the program. Some of these correspond to real bugs; others are false positives stemming from our conservative tainting approach (and the lack of full polymorphism). False negatives are also of interest: we would like every vulnerability to show up as a warning. One complicating factor is that many warnings can result from the same bug. For example, if many functions that read network data call a single function containing a format string bug, all of those warnings may go away once that one bug is fixed.
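The many-warnings-from-one-bug situation can be illustrated with a minimal C sketch. The names log_msg, handle_login, and handle_query are hypothetical and do not come from any of the tested programs; the sketch only assumes that two callers forward network-derived data to one shared helper. In the buggy form of the helper (shown in the comment), each caller's data reaches a format argument, so one would expect a warning for each path; changing the single call in the helper removes the cause of all of them.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical shared helper. The buggy form would be
 *
 *     return snprintf(out, n, msg);
 *
 * which lets untrusted data act as the format string. The fixed form
 * below passes the data as an argument to a constant "%s" format, so
 * any '%' characters in msg are copied literally. */
static int log_msg(char *out, size_t n, const char *msg)
{
    assert(out != NULL && msg != NULL && n > 0);
    return snprintf(out, n, "%s", msg);
}

/* Two hypothetical callers that both forward network-derived data.
 * With the buggy helper, each of these paths carries untrusted data
 * to a format argument; fixing log_msg fixes both at once. */
static int handle_login(char *out, size_t n, const char *net_data)
{
    return log_msg(out, n, net_data);
}

static int handle_query(char *out, size_t n, const char *net_data)
{
    return log_msg(out, n, net_data);
}
```

With the fixed helper, calling handle_login(buf, sizeof buf, "%x%n") simply copies the four characters of the input into buf instead of interpreting them as conversion specifiers.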

We chose the following metrics, measured per-program:

Figure 6: Results of our experimental evaluation of the tool. The size of the program is measured unpreprocessed and preprocessed, in thousands of lines of code, excluding comments. Time is the wall clock time for a run of cqual. Warnings counts the total number of warnings issued by cqual after the GUI's recommendations were followed, and Bugs is the number of real vulnerabilities found.
\begin{figure*}\begin{tabular}{lllrrrrr}
\hline
Name & Version & Description & L...
...dentification service &0.2k &1.2k &3s &0 &0\\
\hline
\end{tabular}\end{figure*}


Umesh Shankar 2001-05-16