
Performance Profiling

Valgrind

Useful Valgrind tools:

  • Memcheck
  • Callgrind/Cachegrind

Basics

To profile a C/C++ application, compile it with the -g flag so that the profiler can attribute costs to function names and source lines.

Callgrind

Ref: https://valgrind.org/docs/manual/cl-manual.html

Base command

sh
valgrind --tool=callgrind <executable> <arguments>

This command runs the executable under Callgrind. After the program terminates, a file named callgrind.out.<pid> is created, where <pid> is the actual process ID. To print a per-function summary of that file, use callgrind_annotate:

sh
callgrind_annotate callgrind.out.<pid> --inclusive=yes --tree=both | less

KCachegrind

KCachegrind is a graphical tool for analyzing the output of profiling tools such as Valgrind.

Once Callgrind has written its output file, open that file in KCachegrind to evaluate the profiler results.

Understanding the output

  • Ir (instruction reads) is effectively the number of machine instructions executed.

Tips

  • Focus on functions that are called often or are otherwise computationally expensive; rarely used functions are seldom worth optimizing.
  • Incl. (inclusive cost) is the cost of a function including the cost of every function it calls, whereas Self (self or exclusive cost) is the cost of the instructions in the function body itself.
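
The difference between Incl. and Self can be illustrated with a small sketch. The function names (sum_of_squares, run_batches) are hypothetical and chosen only for this example; they are not part of Valgrind or any library. Assuming the code is built with -g and without inlining, one would expect run_batches to show a large Incl. cost but a small Self cost, because nearly all of its work happens inside sum_of_squares.

```cpp
#include <cstdint>

// Hypothetical leaf function: it calls nothing else, so its
// inclusive cost and its self cost are the same.
std::uint64_t sum_of_squares(std::uint64_t n) {
    std::uint64_t total = 0;
    for (std::uint64_t i = 1; i <= n; ++i) {
        total += i * i;  // the hot loop: most instructions are executed here
    }
    return total;
}

// Hypothetical caller: its own body is only a short loop (small Self),
// but its inclusive cost also contains every call to sum_of_squares.
// Note: with optimization enabled the compiler may inline the callee,
// which merges both costs into this function. Building with -O0 while
// experimenting keeps the two functions separate in the profile.
std::uint64_t run_batches(std::uint64_t batches, std::uint64_t n) {
    std::uint64_t total = 0;
    for (std::uint64_t b = 0; b < batches; ++b) {
        total += sum_of_squares(n);
    }
    return total;
}
```

Calling run_batches from main with a large n and profiling the resulting binary should list both functions in the annotated output. Sorting by Self then points at sum_of_squares as the place where optimization effort pays off.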

Gprof2dot

gprof2dot converts Callgrind output into a Graphviz dot file that describes the call graph. In the command below, 9000 stands for the process ID of your run.

sh
gprof2dot --format=callgrind --output=out.dot callgrind.out.9000

Then render the graph with dot, for example as a PNG or an SVG:

sh
dot -Tpng out.dot -o graph.png
dot -Tsvg out.dot -o graph.svg

Linux Perf

Ref: https://perf.wiki.kernel.org/index.php/Main_Page

Google Benchmark

Ref: https://github.com/google/benchmark

ADDITIONAL RESOURCES