Sunday, April 18, 2021

The effect of CPU, link-time (LTO) and profile-guided (PGO) optimizations on the compiler itself

 In other words, how much faster will a compiler be after it's been built with various optimizations?

Given the recent Clang12 release, I've decided to update my local build of Clang11 that I've been using for building LibreOffice. I switched to using my own Clang build instead of openSUSE packages somewhen in the past because it was faster. I've meanwhile forgot how much faster :), and openSUSE packages now build with LTO, so I've built Clang12 in several different ways to test the effect and this is it:

The file compiled is LO Calc's document.cxx, a fairly large source file, in a debug LO build. The compilation of the file is always the same, the only thing that differs is the compiler used and whether LO's PCH support is enabled. And the items are:

  1. Base - A release build of Clang12, with (more or less) the default options.
  2. CPU - As above, with -march=native -mtune=native added.
  3. LTO - As above, with link-time optimization used. Building Clang this way takes longer.
  4. LTO+PGO - As above, also with profile-guided optimization used. Building Clang this way takes even longer, as it needs two extra Clang builds to collect the PGO data.
  5. Base PCH - As Base, and the file is built with PCH used.
  6. LTO+PGO PCH - As LTO+PGO, again with PCH used.

Or, if you want this as numbers, then with Base being 100%, CPU is 85%, LTO is 78%, LTO+PGO is 59%, Base PCH is 37% and LTO+PGO PCH is 25%. Not bad.

Mind you, this is just for one randomly selected file. YMMV. For the build from the video from the last time, the original time of 4m39s with Clang11 LTO PCH goes down to 3m31s for Clang12 LTO+PGO PCH, which is 76%, which is consistent with the LTO->LTO+PGO change above.



  1. Any idea why the distro clang is not built with LTO and PGO?

    These two seem like a "worthwhile" price to pay at the distro level?

  2. The openSUSE Clang is built using LTO, the ~whole distro is ( and you can grep for "-flto" in So if your distro doesn't, you need to ask them. But presumably the answer is that LTO and/or PGO need more complicated setup, disk space and CPU time for building.