When working in QA or test automation, you are much more likely to be confronted with a legacy application that has hundreds of thousands of lines of code, missing documentation and missing test cases than to find a well-documented one with high test coverage and beautiful code. In many cases the legacy code has to be refactored to improve testability. One critical skill is therefore the ability to work with legacy code.
When I start reading source code, I start from a bird's-eye perspective. I first want to know how big the project I am looking at is.
A lines-of-code (LOC) count comes in handy in this situation to get a first general impression of the scale of the code base.
Measuring Lines Of Code (LOC)
sudo apt-get install sloccount
sloccount openerp-server-5.0.0_rc3
SLOC    Directory       SLOC-by-Language (Sorted)
58241   addons-extra    python=50644,php=7434,sh=163
53168   addons          python=53126,php=42
22288   server          python=22287,sh=1
15599   client          python=15599
12334   web             python=12205,sh=129
72      top_dir         python=72

Totals grouped by language (dominant language first):
python:      153933 (95.20%)
php:           7476 (4.62%)
sh:             293 (0.18%)
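If sloccount is not available, a rough count is easy to script. The following is only a minimal sketch of the idea, not a reimplementation of sloccount: the `EXTENSIONS` mapping and the `count_sloc` function are names I made up for illustration, and the comment detection is deliberately crude (it only skips lines starting with `#`), so the numbers will not match the output above.

```python
import os
from collections import Counter

# Hypothetical mapping from file extension to language name (illustration only).
EXTENSIONS = {".py": "python", ".php": "php", ".sh": "sh"}


def count_sloc(root):
    """Count non-blank lines that do not start with '#', grouped by language."""
    totals = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            language = EXTENSIONS.get(os.path.splitext(name)[1])
            if language is None:
                continue  # unknown file type, ignore it
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="replace") as handle:
                for line in handle:
                    stripped = line.strip()
                    # Crude heuristic: skip blank lines and '#' comment lines.
                    if stripped and not stripped.startswith("#"):
                        totals[language] += 1
    return totals


if __name__ == "__main__":
    # Print the languages found under the current directory, largest first.
    for language, lines in count_sloc(".").most_common():
        print(f"{language}: {lines}")
```

For a first impression of scale this is usually good enough. For anything you want to report, stick with the real tool.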
Measuring complexity (McCabe)
Another very useful code metric in the situation described above is the McCabe (cyclomatic) complexity metric. A tool such as PyMetrics, which computes it, helps you to identify the most complex code. The complex areas need the most attention when it comes to quality assurance measures (documentation, testing, etc.). These areas usually contain the most defects, too.
# manually extract the tarball first
PyMetrics.py program.py
Understanding the code structure using call graphs
Almost every time I approach an unfamiliar code base, I am looking for one particular piece of functionality that I am interested in. As soon as I understand it, I move on to the next one, and so on until I understand everything I need to. A call graph has proved very handy for identifying the relevant parts of the code. The graph "lists" all the relevant modules and functions and shows in which order they are called. I always use the call graph as a "map" that helps me navigate the unfamiliar code base.
Installation of the pycallgraph tool (in Ubuntu)
sudo apt-get install pycallgraph
Instead of starting your program with the Python interpreter, you use pycallgraph to execute your script. For example, to run the scotch recording proxy within the pycallgraph tool:
pycallgraph run-recording-proxy -i 'scotch.*' -e '*.*'
Call graphs get big very quickly because a program usually calls many library functions. For that reason I included only the scotch module in the diagram and excluded everything else (the -i and -e options).
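The include/exclude filtering is easier to understand once you see how little is needed to record a call graph yourself. The following is a minimal sketch built on the standard `sys.setprofile` hook. It is not pycallgraph's own implementation, and `trace_calls` and `include_prefix` are hypothetical names chosen for this example. It records caller-to-callee edges between Python functions and keeps only callees whose module name starts with the given prefix.

```python
import sys
from collections import Counter


def trace_calls(func, include_prefix=""):
    """Run func() and count caller -> callee edges between Python functions.

    Only callees whose module name starts with include_prefix are recorded.
    """
    edges = Counter()

    def profiler(frame, event, arg):
        if event != "call":
            return  # ignore returns and calls into C functions
        module = frame.f_globals.get("__name__", "")
        if not module.startswith(include_prefix):
            return  # callee is outside the modules we care about
        caller = frame.f_back
        caller_name = caller.f_code.co_name if caller else "<root>"
        edges[(caller_name, frame.f_code.co_name)] += 1

    sys.setprofile(profiler)
    try:
        func()
    finally:
        sys.setprofile(None)  # always remove the hook again
    return edges


if __name__ == "__main__":
    def leaf():
        return 1

    def middle():
        return leaf() + leaf()

    # Print each recorded edge with the number of times it was taken.
    for (caller, callee), count in trace_calls(middle).items():
        print(f"{caller} -> {callee} ({count}x)")
```

With an empty prefix every Python-level call is recorded, which shows quickly why restricting the graph to a single module is worth it.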

Printing the call graph
Call graphs are usually big, especially if you are working with a non-trivial module. If you do not have a plotter, the program Dia (diagrams, UML, etc.) comes in very handy for printing them. To install it on an Ubuntu box, just type:
sudo apt-get install dia
Dia helps you to split a huge graphic into multiple pages that you can then print separately.