To Read or to Run: Two Ways to Understand Source Code Evidence

Daniel W. Steinbrook

Technology Strategy & Analysis

Client Bulletin

April 19, 2021

Back in kindergarten, when given the freedom to run around at recess, I was much more likely to read a book. A few decades later, I’m still faced with this choice: when given a collection of source code to analyze, should I read it or run it?

Computer scientists refer to these options as static analysis and dynamic analysis. There are many tools to assist with both, but most are geared towards finding bugs or security holes. By contrast, we’re often assessing IP claims. We’re interested in knowing exactly what parts of the code are run, and when. Is this an activity for the library or the playground?

When to read

Source: https://xkcd.com/1513/

In a software patent lawsuit, reading the source code is essential. That’s because software patents generally cover specific algorithms or architectures of a software system. The source code lays out how the software is implemented and organized.

If authorship is in question, then reading is also most likely the best approach. Version control systems like Git can make it trivial to determine who last modified a given file or line of code.
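As a rough illustration of how line-level history can be read, the sketch below parses the output of `git blame --line-porcelain`, which repeats the commit details before every line of a file. The sample text is abbreviated and invented for illustration (real output carries more fields per line), and the function name `last_modifiers` is hypothetical. Note that the command reports the last commit to touch each line, which is not necessarily the person who first wrote it.

```python
# Abbreviated, made-up sample in the shape of `git blame --line-porcelain` output.
# Each source line is preceded by commit details and is itself prefixed with a tab.
SAMPLE_BLAME = (
    "1111111111111111111111111111111111111111 1 1 1\n"
    "author Alice Example\n"
    "author-time 1600000000\n"
    "summary Add running total\n"
    "filename cart.c\n"
    "\tint total = 0;\n"
    "2222222222222222222222222222222222222222 2 2 1\n"
    "author Bob Example\n"
    "author-time 1610000000\n"
    "summary Apply price to total\n"
    "filename cart.c\n"
    "\ttotal += price;\n"
)


def last_modifiers(porcelain):
    """Pair each source line with the author of the last commit that touched it."""
    results = []
    author = None
    for line in porcelain.splitlines():
        if line.startswith("author "):
            # Remember the author named in the most recent commit header.
            author = line[len("author "):]
        elif line.startswith("\t"):
            # A leading tab marks the actual content of the source line.
            results.append((author, line[1:]))
    return results


for author, code in last_modifiers(SAMPLE_BLAME):
    print(f"{author}: {code}")
```

In practice the input would come from running the command against the produced repository rather than from a hard-coded string.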

Finally, reading code can also identify relevant licenses. These are often included as code comments, or as separate text files alongside the source code.
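A first pass over a production for license information might look like the sketch below. It assumes, for illustration only, that some files carry an `SPDX-License-Identifier` tag in a comment and that license texts sit in conventionally named files. The file names, contents, and the helper `license_hints` are hypothetical. Many codebases state their licenses as full-text headers instead, which a scan like this would miss.

```python
import re

# Matches tags such as "SPDX-License-Identifier: MIT" inside a comment.
SPDX_TAG = re.compile(r"SPDX-License-Identifier:\s*([^\r\n*]+)")

# Common names for standalone license files (an illustrative, incomplete set).
LICENSE_FILENAMES = {"LICENSE", "LICENSE.txt", "LICENSE.md", "COPYING", "NOTICE"}


def license_hints(files):
    """Map each path to a license hint found in its name or its comments."""
    hints = {}
    for path, text in files.items():
        name = path.rsplit("/", 1)[-1]
        if name in LICENSE_FILENAMES:
            hints[path] = "license file"
            continue
        match = SPDX_TAG.search(text)
        if match:
            hints[path] = match.group(1).strip()
    return hints


# Hypothetical production: path -> file contents.
production = {
    "LICENSE": "Permission is hereby granted...",
    "src/a.c": "/* SPDX-License-Identifier: MIT */\nint x;\n",
    "src/b.py": "# SPDX-License-Identifier: Apache-2.0 OR MIT\n",
    "src/c.py": "print('hi')\n",
}

for path, hint in license_hints(production).items():
    print(path, "->", hint)
```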

When to run

Running a complex codebase can help to identify which parts of the codebase are actively used and which are not. A block of source code that never runs is known as dead code. While an expert can say with some confidence whether a particular piece of code is run by reading a program, it can be much quicker to identify dead code by just running the program. When reading code, it’s often not readily apparent whether a particular section will ever execute. The order in which lines of a program run, called the control flow, rarely exactly matches the order in which lines appear in a source code file. Indeed, in some environments, various sections can execute spontaneously in response to mouse clicks or keystrokes.
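The idea can be sketched in a few lines of Python. The toy program below is invented for illustration: it defines three functions, one of which is never called. The helper `record_calls` (a hypothetical name) uses Python's `sys.settrace` hook to note which functions actually start executing during a run.

```python
import sys


def format_report(total):
    return f"total={total}"


def legacy_export(total):
    # Nothing in this program calls this function.
    return f"<total>{total}</total>"


def main():
    return format_report(3)


def record_calls(entry_point):
    """Run entry_point and return the names of functions that were entered."""
    called = set()

    def tracer(frame, event, arg):
        if event == "call":
            called.add(frame.f_code.co_name)
        return None  # no need to trace individual lines

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        entry_point()
    finally:
        sys.settrace(previous)  # restore whatever tracer was active before
    return called


defined = {f.__name__ for f in (main, format_report, legacy_export)}
observed = record_calls(main)
never_observed = defined - observed
print(sorted(never_observed))
```

A function that does not appear in one run is only a candidate for dead code. A different input, or a mouse click that this run never simulated, might still reach it, so the result is best treated as a lead to confirm by reading.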

Determining the visual appearance of an application is another task where running can help. Some user interfaces (UIs) are stored as text-based specifications. These describe the exact coordinates of every button and box on the screen. In practice, such a specification is decipherable only by loading it into a program that can render it as a graphical representation.

The rendered appearance (left) and machine-generated source code (right) of a user interface.[1]
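To give a feel for the gap between a coordinate-based specification and what it looks like on screen, here is a toy sketch. The specification format, the widget names, and the `render` function are all invented for illustration and do not correspond to any real UI toolkit. It places each labeled button on a character grid at its stated coordinates.

```python
# Hypothetical text-based UI specification: one entry per button,
# with x (column) and y (row) coordinates on a character grid.
SPEC = [
    {"label": "OK", "x": 2, "y": 0},
    {"label": "Cancel", "x": 8, "y": 0},
    {"label": "Help", "x": 2, "y": 2},
]


def render(widgets, width, height):
    """Draw each widget as [label] on a width-by-height character grid."""
    grid = [[" "] * width for _ in range(height)]
    for widget in widgets:
        text = "[" + widget["label"] + "]"
        for offset, ch in enumerate(text):
            x = widget["x"] + offset
            y = widget["y"]
            if 0 <= x < width and 0 <= y < height:
                grid[y][x] = ch
    # Trim trailing spaces so each row ends at its last drawn character.
    return "\n".join("".join(row).rstrip() for row in grid)


print(render(SPEC, width=20, height=3))
```

Even in this tiny case, the layout is far easier to judge from the rendered output than from the list of coordinates.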

Unfortunately, the source code evidence produced in discovery is not always straightforward to run. Incomplete productions can leave out module dependencies. Build files that indicate how to assemble the source code are also sometimes necessary but missing. Experts can develop tools to work around these limitations, or identify when additional discovery requests are required.
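One way to spot a missing module dependency without running anything is to list what each file imports and compare that against what was actually produced. The sketch below does this for Python sources using the standard `ast` module. The file names, their contents, and the set of modules assumed to be available from the language's standard library are all hypothetical.

```python
import ast
from pathlib import PurePosixPath


def imported_modules(source):
    """Return the top-level module names that a Python source file imports."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # level == 0 skips relative imports such as "from . import x"
            names.add(node.module.split(".")[0])
    return names


def missing_modules(produced, available_elsewhere):
    """List imports satisfied neither by the production nor by known libraries."""
    provided = {PurePosixPath(path).stem for path in produced}
    needed = set()
    for source in produced.values():
        needed |= imported_modules(source)
    return needed - provided - available_elsewhere


# Hypothetical production: path -> file contents.
produced_files = {
    "app/main.py": "import os\nimport billing\nfrom reports import summary\n",
    "app/reports.py": "import json\n",
}

gaps = missing_modules(produced_files, available_elsewhere={"os", "json"})
print(sorted(gaps))
```

Here `billing` is imported but was neither produced nor available elsewhere, which is the kind of gap that could support an additional discovery request.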

Conclusion

Source code is unique among evidentiary productions. Unlike other documents that are preserved and reviewed as-is, source code can be run and manipulated to understand it more fully. Reading is often relied upon more heavily, but running the code can lead to insights that wouldn’t otherwise be apparent.

[1] D. Draheim, C. Lutteroth & G. Weber. “Graphical user interfaces as documents.” Proceedings of the 7th ACM SIGCHI New Zealand Chapter’s International Conference on Computer-Human Interaction: Design Centered HCI, Christchurch, New Zealand, July 6-7, 2006, pp. 67-74. doi:10.1145/1152760.1152769.