Monday, May 20, 2019

Linux perf and KCachegrind

If you occassionally do performance profiling as I do, you probably know Valgrind's Callgrind and the related UI KCachegrind. While Callgrind is a pretty powerful tool, running it takes quite a while (not exactly fun to do with something as big as e.g. LibreOffice).

Recently I finally gave Linux perf a try. Not quite sure why I didn't use it before, IIRC when I tried it somewhen long ago, it was probably difficult to set up or something. Using perf record has very little overhead, but I wasn't exactly thrilled by perf report. I mean, it's text UI, and it just gives a list of functions, so if I want to see anything close to a call graph, I have to manually expand one function, expand another function inside it, expand yet another function inside that, and so on. Not that it wouldn't work, but compared to just looking at what KCachegrind shows and seeing ...

When figuring out how to use perf, while watching a talk from Milian Wolff, on one slide I noticed a mention of a Callgrind script. Of course I had to try it. It was a bit slow, but hey, I could finally look at perf results without feeling like that's an effort. Well, and then I improved the part of the script that was slow, so I guess I've just put the effort elsewhere :).

And I thought this little script might be useful for others. After mailing Milian, it turns out he just created the script as a proof of concept and wasn't interested in it anymore, instead developing Hotspot as UI for perf. Fair enough, but I think I still prefer KCachegrind, I'm used to this, and I don't have to switch the UI when switching between perf and callgrind. So, with his agreement, I've submitted the script to KCachegrind. If you would find it useful, just download this do something like:

$ perf record -g ...
$ perf script -s perf2calltree.py > perf.out
$ kcachegrind perf.out



No comments:

Post a Comment