Scalability of program model checkers such as JPF encompasses two aspects:
- How large a program can be model checked, and
- Once a defect has been detected, how readily meaningful debugging information can be derived from trace.
How large a program JPF can model check is highly dependent on the structure and amount of nondeterminism in the application. For example, UI applications or certain Web applications which have a central control loop (i.e., reactive state machines) that reacts to user input or signals from the environment are amenable to model checking. We have applied JPF successfully to multithreaded Mars Rover software which was around ten thousand lines of code (10 KLOC.) We have also applied JPF successfully to multithreaded UI programs with 33 KLOC of application source plus many more lines of code in the libraries. At this point, the largest program checked by JPF is a commercial Web application of several hundred thousand lines of code, in which JPF found a deadlock involving about 20 threads; JPF executed more than 200 million byte code instructions of the target application in about 4 hours.
The second aspect of scalability is the ability to derive meaningful debugging information from an execution trace that manifests a defect. This sense of scalability can represent a severe limitation if the defect occurs long into the program execution. JPF provides limited capabilities here—for example, JPF’s DeadlockAnalyzer and ChoiceTracker, which are used in user interface applications to generate trace reports showing only user-input events. They are used also in analyzing Unified Modeling Language (UML) to limit trace reports to showing only state-machine environment events.