TIP – Optimizing IR

To allow the user to add passes at every extension point, you need to add the preceding code snippet for each extension point.

  1. Now is a good time to try out the different pass manager options. With the --debug-pass-manager option, you can follow which passes are executed in which order. You can also print the IR before or after each pass using the --print-before-all and --print-after-all options. If you created your own pass pipeline, then you can insert the print pass at points of interest. For example, try the --passes="print,inline,print" option. Furthermore, to identify which pass changes the IR code, you can use the --print-changed option, which prints the IR code only if it has changed compared to the result of the preceding pass. The greatly reduced output makes it much easier to follow IR transformations.

The PassBuilder class has a nested OptimizationLevel class to represent the six different optimization levels. Instead of using the "default<O?>" pipeline description as an argument to the parsePassPipeline() method, we can also call the buildPerModuleDefaultPipeline() method, which builds the default optimization pipeline for the requested level – except for level O0. This optimization level means that no optimization is performed.

Consequently, no passes are added to the pass manager. If we still want to run a certain pass, then we can add it to the pass manager manually. A simple pass to run at this level is the AlwaysInliner pass, which inlines a function marked with the always_inline attribute into the caller. After translating the command-line option value for the optimization level into the corresponding member of the OptimizationLevel class, we can implement this as follows:
    PassBuilder::OptimizationLevel OLevel = …;
    if (OLevel == PassBuilder::OptimizationLevel::O0)
      MPM.addPass(AlwaysInlinerPass());
    else
      MPM = PB.buildPerModuleDefaultPipeline(OLevel, DebugPM);

Of course, it is possible to add more than one pass to the pass manager in this fashion. PassBuilder also uses the addPass() method when constructing the pass pipeline.

Running extension point callbacks

Because the pass pipeline is not populated for optimization level O0, the registered extension points are not called. If you use the extension points to register passes that should also run at O0 level, this is problematic. You can call the runRegisteredEPCallbacks() method to run the registered extension point callbacks, resulting in a pass manager populated only with the passes that were registered through the extension points.

By adding the optimization pipeline to tinylang, you created an optimizing compiler similar to clang. The LLVM community works on improving the optimizations and the optimization pipeline with each release. Due to this, it is very seldom that the default pipeline is not used. Most often, new passes are added to implement certain semantics of the programming language.

Summary

In this chapter, you learned how to create a new pass for LLVM. You ran the pass using a pass pipeline description and an extension point. You extended your compiler with the construction and execution of a pass pipeline similar to clang, turning tinylang into an optimizing compiler. The pass pipeline allows the addition of passes at extension points, and you learned how you can register passes at these points. This allows you to extend the optimization pipeline with your developed passes or existing passes.

In the next chapter, you will learn the basics of the TableGen language, which is used extensively in LLVM and clang to significantly reduce manual programming.

Creating an optimization pipeline – Optimizing IR-2

  1. Now, we must replace the existing emit() function with a new version. Additionally, we must declare the required PassBuilder instance at the top of the function:

    bool emit(StringRef Argv0, llvm::Module *M,
              llvm::TargetMachine *TM,
              StringRef InputFilename) {
      PassBuilder PB(TM);

  2. To implement the support for pass plugins given on the command line, we must loop through the list of plugin libraries given by the user and try to load each plugin. We'll emit an error message if this fails; otherwise, we'll register the passes:

    for (auto &PluginFN : PassPlugins) {
      auto PassPlugin = PassPlugin::Load(PluginFN);
      if (!PassPlugin) {
        WithColor::error(errs(), Argv0)
            << "Failed to load passes from '" << PluginFN
            << "'. Request ignored.\n";
        continue;
      }
      PassPlugin->registerPassBuilderCallbacks(PB);
    }
  3. The information from the static plugin registry is used in a similar way to register those plugins with our PassBuilder instance:

    #define HANDLE_EXTENSION(Ext) \
      get##Ext##PluginInfo().RegisterPassBuilderCallbacks(PB);
    #include "llvm/Support/Extension.def"

  4. Now, we need to declare variables for the different analysis managers. The only parameter is the debug flag:

    LoopAnalysisManager LAM(DebugPM);
    FunctionAnalysisManager FAM(DebugPM);
    CGSCCAnalysisManager CGAM(DebugPM);
    ModuleAnalysisManager MAM(DebugPM);
  5. Next, we must populate the analysis managers with calls to the respective register method on the PassBuilder instance. Through these calls, each analysis manager is populated with the default analysis passes, and the registration callbacks are run. We must also make sure that the function analysis manager uses the default alias-analysis pipeline and that all analysis managers know about each other:

    FAM.registerPass(
        [&] { return PB.buildDefaultAAPipeline(); });
    PB.registerModuleAnalyses(MAM);
    PB.registerCGSCCAnalyses(CGAM);
    PB.registerFunctionAnalyses(FAM);
    PB.registerLoopAnalyses(LAM);
    PB.crossRegisterProxies(LAM, FAM, CGAM, MAM);
  6. The MPM module pass manager holds the pass pipeline that we construct. The instance is initialized with the debug flag:

    ModulePassManager MPM(DebugPM);
  7. Now, we need to implement two different ways to populate the module pass manager with the pass pipeline. If the user provided a pass pipeline on the command line – that is, they used the --passes option – then we use this as the pass pipeline:

    if (!PassPipeline.empty()) {
      if (auto Err = PB.parsePassPipeline(
              MPM, PassPipeline)) {
        WithColor::error(errs(), Argv0)
            << toString(std::move(Err)) << "\n";
        return false;
      }
    }
  8. Otherwise, we use the chosen optimization level to determine the pass pipeline to construct. The name of the default pass pipeline is default, and it takes the optimization level as a parameter:

    else {
      StringRef DefaultPass;
      switch (OptLevel) {
      case 0: DefaultPass = "default<O0>"; break;
      case 1: DefaultPass = "default<O1>"; break;
      case 2: DefaultPass = "default<O2>"; break;
      case 3: DefaultPass = "default<O3>"; break;
      case -1: DefaultPass = "default<Os>"; break;
      case -2: DefaultPass = "default<Oz>"; break;
      }
      if (auto Err = PB.parsePassPipeline(
              MPM, DefaultPass)) {
        WithColor::error(errs(), Argv0)
            << toString(std::move(Err)) << "\n";
        return false;
      }
    }
  9. With that, the pass pipeline to run transformations on the IR code has been set up. After this step, we need an open file to write the result to. The system assembler and LLVM IR output are text-based, so we should set the OF_Text flag for them:

    std::error_code EC;
    sys::fs::OpenFlags OpenFlags = sys::fs::OF_None;
    CodeGenFileType FileType = codegen::getFileType();
    if (FileType == CGFT_AssemblyFile)
      OpenFlags |= sys::fs::OF_Text;
    auto Out = std::make_unique<llvm::ToolOutputFile>(
        outputFilename(InputFilename), EC, OpenFlags);
    if (EC) {
      WithColor::error(errs(), Argv0)
          << EC.message() << '\n';
      return false;
    }

SPECIFYING A PASS PIPELINE – Optimizing IR

With the --passes option, you can not only name a single pass, but you can also describe a whole pipeline. For example, the default pipeline for optimization level 2 is named default<O2>. You can run the ppprofiler pass before the default pipeline with the --passes="ppprofiler,default<O2>" argument. Please note that the pass names in such a pipeline description must be of the same type.

Now, let’s turn to using the new pass with clang.

Plugging the new pass into clang

In the previous section, you learned how you can run a single pass using opt. This is useful if you need to debug a pass, but for a real compiler, the steps should not be that involved.

To achieve the best result, a compiler needs to run the optimization passes in a certain order. The LLVM pass manager has a default order for pass execution. This is also called the default pass pipeline. Using opt, you can specify a different pass pipeline with the --passes option. This is flexible but also complicated for the user. It also turns out that most of the time, you just want to add a new pass at very specific points, such as before the optimization passes are run or at the end of the loop optimizations. These points are called extension points. The PassBuilder class allows you to register a pass at an extension point. For example, you can call the registerPipelineStartEPCallback() method to add a pass to the beginning of the optimization pipeline. This is exactly the place we need for the ppprofiler pass. During optimization, functions may be inlined, and the pass would miss those inlined functions. Running the pass before the optimization passes instead guarantees that all functions are instrumented.

To use this approach, you need to extend the RegisterCB() function in the pass plugin. Add the following code to the function:
  PB.registerPipelineStartEPCallback(
      [](ModulePassManager &PM, OptimizationLevel Level) {
        PM.addPass(PPProfilerIRPass());
      });

Whenever the pass manager populates the default pass pipeline, it calls all the callbacks for the extension points. We simply add the new pass here.

To load the plugin into clang, you can use the -fpass-plugin option. Creating the instrumented executable of the hello.c file now becomes almost trivial:
$ clang -fpass-plugin=./PPProfiler.so hello.c runtime.c

Please run the executable and verify that the run creates the ppprofiler.csv file.

Using the ppprofiler pass with LLVM tools – Optimizing IR-2

Often, the runtime support for a feature is more complicated than adding the feature to the compiler itself. This is also true in this case. When the __ppp_enter() and __ppp_exit() functions are called, you can view this as an event. To analyze the data later, it is necessary to save the events. The basic data you would like to get is the type of the event, the name of the function and its address, and a timestamp. Without tricks, this is not as easy as it seems. Let's give it a try.
Create a file called runtime.c with the following content:

  1. You need file I/O, standard functions, and time support. This is provided by the following includes:

    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

  2. For the file, a file descriptor is needed. Moreover, when the program finishes, that file descriptor should be closed properly:

    static FILE *FileFD = NULL;
    static void cleanup() {
      if (FileFD != NULL) {
        fclose(FileFD);
        FileFD = NULL;
      }
    }

  3. To simplify the runtime, only a fixed name for the output file is used. If the file is not open, then open the file and register the cleanup function:

    static void init() {
      if (FileFD == NULL) {
        FileFD = fopen("ppprofile.csv", "w");
        atexit(&cleanup);
      }
    }

  4. You can call the clock_gettime() function to get a timestamp. The CLOCK_PROCESS_CPUTIME_ID parameter returns the time consumed by this process. Please note that not all systems support this parameter. You can use one of the other clocks, such as CLOCK_REALTIME, if necessary:

    typedef unsigned long long Time;
    static Time get_time() {
      struct timespec ts;
      clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &ts);
      return 1000000000L * ts.tv_sec + ts.tv_nsec;
    }

  5. Now, it is easy to define the __ppp_enter() function. Just make sure the file is open, get the timestamp, and write the event:

    void __ppp_enter(const char *FnName) {
      init();
      Time T = get_time();
      void *Frame = __builtin_frame_address(1);
      fprintf(FileFD,
              // "enter|name|clock|frame"
              "enter|%s|%llu|%p\n", FnName, T, Frame);
    }

  6. The __ppp_exit() function only differs in terms of the event type:

    void __ppp_exit(const char *FnName) {
      init();
      Time T = get_time();
      void *Frame = __builtin_frame_address(1);
      fprintf(FileFD,
              // "exit|name|clock|frame"
              "exit|%s|%llu|%p\n", FnName, T, Frame);
    }

That concludes a very simple implementation of the runtime support. Before we try it, some remarks should be made about the implementation, as it should be obvious that there are several problematic parts.
First of all, the implementation is not thread-safe since there is only one file descriptor, and access to it is not protected. Trying to use this runtime implementation with a multithreaded program will most likely lead to disturbed data in the output file.
In addition, we omitted checking the return value of the I/O-related functions, which can result in data loss.
But most importantly, the timestamp of the event is not precise. Calling a function already adds overhead, but performing I/O operations in that function makes it even worse. In principle, you can match the enter and exit events for a function and calculate the runtime of the function. However, this value is inherently flawed because it may include the time required for I/O. In summary, do not trust the times recorded here.
Despite all the flaws, this small runtime file allows us to produce some output. Compile the bitcode of the instrumented file together with the file containing the runtime code and run the resulting executable:

$ clang hello_inst.bc runtime.c
$ ./a.out

This results in a new file called ppprofile.csv in the current directory, with the following content:

$ cat ppprofile.csv
enter|main|3300868|0x1
exit|main|3760638|0x1

Cool – the new pass and the runtime seem to work!

Fully integrating the pass into the pass registry – Optimizing IR

To fully integrate the new pass into LLVM, the source of the plugin needs to be structured slightly differently. The main reason for this is that the constructor of the pass class is called from the pass registry, which requires the class interface to be put into a header file.
Like before, you must put the new pass into the Transforms component of LLVM. Begin the implementation by creating the llvm-project/llvm/include/llvm/Transforms/PPProfiler/PPProfiler.h header file. The content of that file is the class definition; put it into the llvm namespace. No other changes are required:

#ifndef LLVM_TRANSFORMS_PPPROFILER_PPPROFILER_H
#define LLVM_TRANSFORMS_PPPROFILER_PPPROFILER_H
#include "llvm/IR/PassManager.h"

namespace llvm {
class PPProfilerIRPass
    : public llvm::PassInfoMixin<PPProfilerIRPass> {
public:
  llvm::PreservedAnalyses
  run(llvm::Module &M, llvm::ModuleAnalysisManager &AM);

private:
  void instrument(llvm::Function &F,
                  llvm::Function *EnterFn,
                  llvm::Function *ExitFn);
};
} // namespace llvm
#endif

Next, copy the source file of the pass plugin, PPProfiler.cpp, into the new directory, llvm-project/llvm/lib/Transforms/PPProfiler. This file needs to be updated in the following way:

  1. Since the class definition is now in a header file, you must remove the class definition from this file. At the top, add the include directive for the header file:

    #include "llvm/Transforms/PPProfiler/PPProfiler.h"

  2. The llvmGetPassPluginInfo() function must be removed because the pass isn't built into a shared library of its own.

As before, you also need to provide a CMakeLists.txt file for the build. You must declare the new pass as a new component:

add_llvm_component_library(LLVMPPProfiler
PPProfiler.cpp
LINK_COMPONENTS
Core
Support
)

Afterward, as in the previous section, you need to include the new source directory by adding the following line to the CMakeLists.txt file in the parent directory:

add_subdirectory(PPProfiler)

Inside LLVM, the available passes are kept in the llvm/lib/Passes/PassRegistry.def database file. You need to update this file. The new pass is a module pass, so we need to search inside the file for the section in which module passes are defined, for example, by searching for the MODULE_PASS macro. Inside this section, add the following line:

MODULE_PASS("ppprofiler", PPProfilerIRPass())

This database file is used by the llvm/lib/Passes/PassBuilder.cpp file, which needs to include your new header file:

#include "llvm/Transforms/PPProfiler/PPProfiler.h"

These are all required source changes based on the plugin version of the new pass.
Since you created a new LLVM component, it is also necessary to add a link dependency in the llvm/lib/Passes/CMakeLists.txt file. Under the LINK_COMPONENTS keyword, you need to add a line with the name of the new component:

PPProfiler

Et voilà – you are ready to build and install LLVM. The new pass, ppprofiler, is now available to all LLVM tools. It has been compiled into the libLLVMPPProfiler.a library and is available in the build system as the PPProfiler component.
So far, we have talked about how to create a new pass. In the next section, we will examine how to use the ppprofiler pass.