Call graphs in Python part 2

I’ve made some improvements to the program discussed last month in Call graphs in Python. It’s had a significant rewrite in program analysis, paying more attention to how names are used. I’ve also experimented a little more with rendering the call graphs as diagrams.

The program pyan.py now creates distinct nodes for each object encountered, e.g. Message.send and Socket.send are no longer considered the same thing. In that particular case, the two send methods could not even be said to be implementations of any meaningful abstract method. The first sends the subject (this), the second sends an object (a parameter) using the subject.

The program tries to infer which is being used in each particular instance; if it can’t do that, it will add use-edges to every possible match. For example, in this program:

x = Message()
x.send()

it can infer a reference to Message.send, after making certain reasonable assumptions, such as that Message is a class. But it is possible that Message is a function that just happens to begin with capital letter. This is all very heuristical; so can’t be relied upon as proper static analysis. But it does clean up the call graphs and makes them better approximate the program in most cases.

Analysing program structure using types

The main goal of this call/usage analysis is to identify, for each variable used in a function, which possible objects that variable could be bound to when the function is executed. This is done using a static analysis: the target program is not executed itself, but properties of its code are analysed to determine roughly how it will behave.

One of the major concepts in program analysis is types. Types are the sets of values that expressions (e.g. variables, and the results of operations on variables) can take. Types can be formal, for instance when there are a fixed set of abstract types , e.g. integer, string, and reference to window; and each expression has one of those types. That can also be informal, and implicit in the program behaviour, for instance the type of integers greater than 0, or the set of window references that are non-NULL).

Ultimately all program behaviour in terms of correctness can be defined in terms of informal types. Programs are functions, and functions have types, too– it’s often desirable to prove that the program is in the type of all functions that correct implement a given specification. Formal types are much more limited, and are a course-grained way to specify the types of parts of the program. In static languages, roughly speaking, the formal types of expressions are independent of the program’s input, and in dynamic languages they are less independent.

Thinking about languages I’ve used (and without researching the official terminology) I can identify five approximate levels of type “staticness”:

Static typing, in which each variable and expression is assigned a distinct type in the syntax of the program. That variable will only be assigned to objects of that type during execution. Types are disjoint — there is no inheritance or interfaces. C is an example of this kind of language. Every variable has a precise type — otherwise the compiler does not know how to compile machine code for storing and manipulating it. (C also casts that can allow an expression of one type to be assigned to a variable of a another; and unions that allow several variables of different types to occupy the same space. But all expressions still have distinct types.) Note that it is possible for types to be inferred in a static language.
Interface typing, in which a hierarchy of interfaces (or classes) are defined, and each variable is assigned a distinct interface. The variable may be assigned to any object whose type is a subtype of that interface. Subtyping is a partially ordered relation. A language where this occurs is Java; for example …
Inferential typing, in which the types of individual variables may be unknown, but are assigned type variables. Two variables with the same type variable must be the same type, although that type is known. Variables may also have their types inferred using standard type inference such as (a -> b) being the type of functions from a to b (whatever types a and b may actually be); if we know that x is of type Int, and we know f is of type a -> [a], then can infer that f(x) is of type [Int]. An example of this Haskell.
Dynamic typing, in which variables are merely bindings to objects, and can be rebound to objects of other types. Types pertain to objects, not variables or expressions. For example, Python has types (built in primitive types, any number of user-defined classes). But an expression in a program could be of one type at one point, and a completely different type at another. Even when the expression is always the same type, it is difficult to determine which type that is without running the program, or conducting a sophisticated analysis.
Untyped or “stringly-typed” programs. In these all objects are of the same type, which is made to represent all kinds of data. Typing here is an issue for the programmer’s mental model and arguably for the runtime validity of operations when applied to certain values, but the language model does not provide enough conceptual support for automatic analysis. An example would be shell scripting, in which strings are used for most data.

pyan.py analyses the abstract syntax tree of its target in a single pass. It can use assignments to a name on one line as an indication of the type of that name on subsequent lines.

There are a variety of erroneous analyses it does. One earlier example is this:

class IRC:
    def join(self, channel):
        pass

print ','.join(range(5))

In this case the program will confuse the join method on strings with the join method on the IRC class. This was fixed by adding code to analyse constants. The type of ‘,’ is str, which means join on that line is a reference to str.join, not to a generic join method that could apply to any class.

Diagramming

I’ve continued to look for neat ways to draw the resulting graph.

The two programs I’ve experimented with are GraphViz and yEd. GraphViz is well known and easy to automate. It consists of a set of command line tools, with a common, text-based graph input format. It’s major drawback is that it only has a few layout modes (though they can be heavily tweaked). It works great for simple or highly hierarchical graphs.

yEd is newer and is a very refined GUI program. Unfortunately it is not open source, but the yEd editor is free. (I understand that graph layout libraries are available for separate (i.e. automated) use, for a fee. The layout libraries that come with yEd are obfuscated to prevent external use — the same company also sells the Java obfuscator!) I have restricted myself to laying out the graph using the options in the yEd GUI.

Updated call graph for my problematic program. Laid out by yEd (not as compactly as I’d like though). The objects from each module are colourised (manually, since the pyan.py export format is TGF which means Trivial Graph Format — no room for colour information in it).

Here is how yEd has laid out the program:

Call/usage graph for a Python program

The objects are grouped into modules, and the layout algorithm attempts to keep the groups separate.

I’ve coloured each module’s objects manually. pyan.py exports to yEd using the Trivial Graph Format, which consists only of a list of vertices and a list of edges, each with optional names — there’s no room for formatting information. If I write an exporter for a more advanced format such as yEd’s GraphML, it should be possible for pyan.py to calculate colours for each node and to pass them through.

5 Responses to Call graphs in Python part 2

Juha Jeronen says:

August 17, 2012 at 8:57 pm

Hi,

Thanks for a great tool!

I was looking for a tool to make call graphs for Python, and pyan fits the bill almost exactly. Combined with the “fdp” (spring model) filter of GraphViz, and the interactive xdot viewer, it’s a nice help for refactoring (specifically, looking for mostly-independent subsets of functions for splitting a large module).

Inspired by your screenshot at the end of the post, I made a modified version of pyan with some added features that may help when working with large projects.

The modified version has command-line options to:

– Color nodes automatically. A HSL color model is used, picking the hue based on the top-level namespace (effectively, the module). The colors start out light, and then darken for each level of nesting. Seven different hues are available, cycled automatically.

– Group nodes in the same namespace. GraphViz clusters are used for this. The namespace name is used as the cluster label. Groups can be created as standalone (always inside top-level graph) or nested. The nested mode follows the namespace structure of the code.

– Disable generation of links for “defines” relationships. This can make the resulting graph look much clearer, when there are a lot of “uses” relationships. This is especially useful for layout with “fdp”.

– Disable generation of links for “uses” relationships. Can be useful for visualizing just where functions are defined.

Also, now the “defines” relations are drawn with gray arrows, so that it’s easier to visually tell them apart from the “uses” relations when there are a lot of edges of both types in the graph.

Nodes are now always filled (white if color disabled), and made translucent to clearly show arrows passing underneath them. This is useful for large graphs with the fdp filter.

With the exception of the disable switches, which support both output formats, all modifications are for GraphViz output only.

The modified version is available at:

https://yousource.it.jyu.fi/jjrandom2/miniprojects/blobs/master/refactoring/pyan.py

As it stands, the added code is a bit messy. Tell me if you want to integrate the changes, and we can see if it can be cleaned up sufficiently. :)

- ejrh says:
  
  August 18, 2012 at 3:26 pm
  
  Wow, that’s great work! The analysis code was already quite messy and unmaintainable. I was kind of hoping to see it all rewritten in your version but it’s understandable that it wasn’t — even I don’t know how it works any more.
  
  The .tgf format is obsolete, I just picked it because I wanted to see how effective the yEd layouts were, and .tgf was easiest to implement. I should replace it with .graphml format, which would support the additional options you’ve added for .dot.
  
  You can keep it in your git repository, since you are more likely to write more improvements — I’ll just copy the pyan.py changes for now. If I make any major improvements myself I’ll post about them here.
  
  - Juha Jeronen says:
    
    August 19, 2012 at 7:26 pm
    
    Thanks. I think the analysis code already works pretty well, so I didn’t look into it much… (I picked pyan particularly, because it understands classes.)
    
    The only case where I’ve gotten it to trip up was to (out of curiosity) feed it pyan.py and xdot.py at the same time – both of them happen to contain a class named Node, and this generates some spurious links between the two pieces of software that should be completely disconnected. It’s not exactly a real-world use case, and I didn’t have any immediate ideas as to how to fix it (hmm – maybe prefer nodes within the same namespace hierarchy?), so I decided to leave it as-is.
    
    Ok, I’ll keep it in my repository.
    
    And thanks for the new post with screenshots :)
Pingback: Coloured call graphs! | EJRH
Akshay Virdhe says:

August 31, 2013 at 11:39 am

Hi,

I was trying to use your tool to get callgraph for a large python project. However, it was taking way too long. I noticed that it was stuck in the cull_inherited() method. When I commented line 493 (v.cull_inherited()), it ran to completion in a shorter length of time. However, the output was a graph with more than 4 million edges with many incorrect edges. I tried to understand your code by reading it but I could not be certain of how it works. Could you, please, elaborate what exactly contract_nonexistents(), expand_unknowns() and cull_inherited() do?

Thanks,
Akshay