
Extending the pass pipeline – Optimizing IR

In the previous section, we used the PassBuilder class to create a pass pipeline, either from a user-provided description or a predefined name. Now, let’s look at another way to customize the pass pipeline: using extension points.
During the construction of the pass pipeline, the pass builder allows passes contributed by the user to be added. These places are called extension points. A couple of extension points exist, as follows:
• The pipeline start extension point, which allows us to add passes at the beginning of the pipeline
• The peephole extension point, which allows us to add passes after each instance of the instruction combiner pass
Other extension points exist too. To employ an extension point, you must register a callback. During the construction of the pass pipeline, your callback is run at the defined extension point and can add passes to the given pass manager.
To register a callback for the pipeline start extension point, you must call the registerPipelineStartEPCallback() method of the PassBuilder class. For example, to add our PPProfiler pass to the beginning of the pipeline, you add it to the module pass manager inside the callback. Because PPProfilerIRPass is a module pass, it can be added directly; a function pass would first have to be adapted with a call to the createModuleToFunctionPassAdaptor() template function:

PB.registerPipelineStartEPCallback(
    [](ModulePassManager &MPM) {
      MPM.addPass(PPProfilerIRPass());
    });

You can add this snippet in the pass pipeline setup code anywhere before the pipeline is created – that is, before the parsePassPipeline() method is called.
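
To see why the callback has to be registered before the pipeline is created, the following minimal sketch models an extension point in plain C++ without any LLVM headers. ToyPassBuilder, ToyPassManager, and demoPipelineOrder() are hypothetical names invented for this illustration; only the method name registerPipelineStartEPCallback() mirrors the real API, and the real PassBuilder does considerably more:

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a pass manager: it only records pass names.
struct ToyPassManager {
  std::vector<std::string> Passes;
  void addPass(std::string Name) { Passes.push_back(std::move(Name)); }
};

// Toy stand-in for a pass builder with a single extension point.
class ToyPassBuilder {
  using Callback = std::function<void(ToyPassManager &)>;
  std::vector<Callback> PipelineStartCallbacks;

public:
  // Only remembers the callback; nothing runs yet.
  void registerPipelineStartEPCallback(Callback C) {
    PipelineStartCallbacks.push_back(std::move(C));
  }

  // Builds a pipeline: the registered callbacks run first, then the
  // regular passes of the pipeline are added.
  ToyPassManager
  buildPipeline(const std::vector<std::string> &Regular) const {
    ToyPassManager PM;
    for (const Callback &C : PipelineStartCallbacks)
      C(PM);
    for (const std::string &Name : Regular)
      PM.addPass(Name);
    return PM;
  }
};

// Registers a profiler pass at the pipeline start and returns the
// resulting pass order.
std::vector<std::string> demoPipelineOrder() {
  ToyPassBuilder PB;
  PB.registerPipelineStartEPCallback(
      [](ToyPassManager &PM) { PM.addPass("ppprofiler"); });
  return PB.buildPipeline({"instcombine", "simplifycfg"}).Passes;
}
```

Calling demoPipelineOrder() yields ppprofiler first, followed by the two regular passes. A callback registered after buildPipeline() has run would have no effect on that pipeline, which is why the registration has to happen early.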
A very natural extension to what we did in the previous section is to let the user pass a pipeline description for an extension point on the command line. The opt tool allows this too. Let’s do this for the pipeline start extension point. Add the following code to the tools/driver/Driver.cpp file:

  1. First, we must declare a new command-line option for the user to specify the pipeline description. Again, we take the option name from the opt tool:

static cl::opt<std::string> PipelineStartEPPipeline(
    "passes-ep-pipeline-start",
    cl::desc("Pipeline start extension point"));

  2. Using a lambda function as a callback is the most convenient way to do this. To parse the pipeline description, we must call the parsePassPipeline() method of the PassBuilder instance. The passes are added to the PM pass manager, which is given as an argument to the lambda function. If an error occurs, we only print an error message without stopping the application. You can add this snippet after the call to the crossRegisterProxies() method:

    PB.registerPipelineStartEPCallback(
        [&PB, Argv0](ModulePassManager &PM) {
          if (auto Err = PB.parsePassPipeline(
                  PM, PipelineStartEPPipeline)) {
            WithColor::error(errs(), Argv0)
                << "Could not parse pipeline "
                << PipelineStartEPPipeline.ArgStr << ": "
                << toString(std::move(Err)) << "\n";
          }
        });
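
A pipeline description is, in its simplest form, a comma-separated list of pass names. The following sketch shows the idea of parsing such a list and reporting an error without aborting. It is a toy in standard C++ and not LLVM's parser, which also understands nested descriptions such as function(...). The function parseToyPipeline() and its set of known pass names are assumptions made for this example:

```cpp
#include <optional>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Splits a comma-separated description such as "a,b,c" into pass names
// and appends them to Out. Returns an error message for the first unknown
// name, or std::nullopt on success. On error, Out is left unchanged.
// Nested descriptions are deliberately not supported in this sketch.
std::optional<std::string>
parseToyPipeline(const std::string &Description,
                 const std::set<std::string> &KnownPasses,
                 std::vector<std::string> &Out) {
  std::vector<std::string> Parsed;
  std::istringstream In(Description);
  std::string Name;
  while (std::getline(In, Name, ',')) {
    if (KnownPasses.count(Name) == 0)
      return std::string("unknown pass name '") + Name + "'";
    Parsed.push_back(Name);
  }
  Out.insert(Out.end(), Parsed.begin(), Parsed.end());
  return std::nullopt;
}
```

A caller can treat the returned message the same way the callback above treats the Error value: print it and carry on, so that a broken description for one extension point does not stop the whole compilation.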

Creating an optimization pipeline – Optimizing IR-2

  1. Now, we must replace the existing emit() function with a new version. Additionally, we must declare the required PassBuilder instance at the top of the function:

bool emit(StringRef Argv0, llvm::Module *M,
          llvm::TargetMachine *TM,
          StringRef InputFilename) {
  PassBuilder PB(TM);

  2. To implement the support for pass plugins given on the command line, we must loop through the list of plugin libraries given by the user and try to load each plugin. We’ll emit an error message if this fails; otherwise, we’ll register the passes:

    for (auto &PluginFN : PassPlugins) {
      auto PassPlugin = PassPlugin::Load(PluginFN);
      if (!PassPlugin) {
        WithColor::error(errs(), Argv0)
            << "Failed to load passes from '" << PluginFN
            << "'. Request ignored.\n";
        continue;
      }
      PassPlugin->registerPassBuilderCallbacks(PB);
    }
  3. The information from the static plugin registry is used in a similar way to register those plugins with our PassBuilder instance:

    #define HANDLE_EXTENSION(Ext)                          \
      get##Ext##PluginInfo().RegisterPassBuilderCallbacks( \
          PB);
    #include "llvm/Support/Extension.def"

  4. Now, we need to declare variables for the different analysis managers. The only parameter is the debug flag:

    LoopAnalysisManager LAM(DebugPM);
    FunctionAnalysisManager FAM(DebugPM);
    CGSCCAnalysisManager CGAM(DebugPM);
    ModuleAnalysisManager MAM(DebugPM);
  5. Next, we must populate the analysis managers with calls to the respective register method on the PassBuilder instance. Through this call, the analysis manager is populated with the default analysis passes and also runs registration callbacks. We must also make sure that the function analysis manager uses the default alias-analysis pipeline and that all analysis managers know about each other:

    FAM.registerPass(
        [&] { return PB.buildDefaultAAPipeline(); });
    PB.registerModuleAnalyses(MAM);
    PB.registerCGSCCAnalyses(CGAM);
    PB.registerFunctionAnalyses(FAM);
    PB.registerLoopAnalyses(LAM);
    PB.crossRegisterProxies(LAM, FAM, CGAM, MAM);
  6. The MPM module pass manager holds the pass pipeline that we constructed. The instance is initialized with the debug flag:

    ModulePassManager MPM(DebugPM);
  7. Now, we need to implement two different ways to populate the module pass manager with the pass pipeline. If the user provided a pass pipeline on the command line, that is, they have used the --passes option, then we use this as the pass pipeline:

    if (!PassPipeline.empty()) {
      if (auto Err = PB.parsePassPipeline(
              MPM, PassPipeline)) {
        WithColor::error(errs(), Argv0)
            << toString(std::move(Err)) << "\n";
        return false;
      }
    }
  8. Otherwise, we use the chosen optimization level to determine the pass pipeline to construct. The name of the default pass pipeline is default, and it takes the optimization level as a parameter in angle brackets:

    else {
      StringRef DefaultPass;
      switch (OptLevel) {
      case 0: DefaultPass = "default<O0>"; break;
      case 1: DefaultPass = "default<O1>"; break;
      case 2: DefaultPass = "default<O2>"; break;
      case 3: DefaultPass = "default<O3>"; break;
      case -1: DefaultPass = "default<Os>"; break;
      case -2: DefaultPass = "default<Oz>"; break;
      }
      if (auto Err = PB.parsePassPipeline(
              MPM, DefaultPass)) {
        WithColor::error(errs(), Argv0)
            << toString(std::move(Err)) << "\n";
        return false;
      }
    }
  9. With that, the pass pipeline to run transformations on the IR code has been set up. After this step, we need an open file to write the result to. The system assembler and LLVM IR output are text-based, so we should set the OF_Text flag for them:

    std::error_code EC;
    sys::fs::OpenFlags OpenFlags = sys::fs::OF_None;
    CodeGenFileType FileType = codegen::getFileType();
    if (FileType == CGFT_AssemblyFile)
      OpenFlags |= sys::fs::OF_Text;
    auto Out = std::make_unique<llvm::ToolOutputFile>(
        outputFilename(InputFilename), EC, OpenFlags);
    if (EC) {
      WithColor::error(errs(), Argv0)
          << EC.message() << '\n';
      return false;
    }
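
The step that picks the default pipeline maps a numeric optimization level to a textual pipeline description. As a small, self-contained sketch of that mapping, the helper below returns the description as a string. The name defaultPipelineFor() is hypothetical, the description strings are assumed to follow the default<O2> form accepted by the --passes option of opt, and -1 and -2 are assumed to stand for the size levels Os and Oz:

```cpp
#include <optional>
#include <string>

// Maps the driver's numeric optimization level to a pipeline description.
// Assumption: -1 and -2 encode the size-optimizing levels Os and Oz.
// Unknown levels yield std::nullopt instead of an empty description.
std::optional<std::string> defaultPipelineFor(int OptLevel) {
  switch (OptLevel) {
  case 0:
    return std::string("default<O0>");
  case 1:
    return std::string("default<O1>");
  case 2:
    return std::string("default<O2>");
  case 3:
    return std::string("default<O3>");
  case -1:
    return std::string("default<Os>");
  case -2:
    return std::string("default<Oz>");
  default:
    return std::nullopt;
  }
}
```

Returning an optional makes the unknown-level case explicit, so the caller can report it before handing an empty description to the pipeline parser.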

NOTE – Optimizing IR

The runtime.c file is not instrumented because the pass checks that the special functions are not yet declared in a module.
This already looks better, but does it scale to larger programs? Let’s assume you want to build an instrumented binary of the tinylang compiler for Chapter 5. How would you do this?
You can pass compiler and linker flags on the CMake command line, which is exactly what we need. The flags for the C++ compiler are given in the CMAKE_CXX_FLAGS variable. Thus, specifying the following on the CMake command line adds the new pass to all compiler runs:

-DCMAKE_CXX_FLAGS="-fpass-plugin=<PATH>/PPProfiler.so"

Please replace <PATH> with the absolute path to the shared library.
Similarly, specifying the following adds the runtime.o file to each linker invocation. Again, please replace <PATH> with the absolute path to a compiled version of runtime.c:

-DCMAKE_EXE_LINKER_FLAGS="<PATH>/runtime.o"

Of course, this requires clang as the build compiler. The fastest way to make sure clang is used as the build compiler is to set the CC and CXX environment variables accordingly:

export CC=clang
export CXX=clang++

With these additional options, the CMake configuration from Chapter 5 should run as usual.
After building the tinylang executable, you can run it with the example Gcd.mod file. The ppprofile.csv file will also be written, this time with more than 44,000 lines!
Of course, having such a dataset raises the question of whether you can get something useful out of it. For example, getting a list of the 10 most often called functions, together with the call count and the time spent in the function, would be useful information. Luckily, on a Unix system, you have a couple of tools that can help. Let’s build a short pipeline that matches enter events with exit events, counts the functions, and displays the top 10 functions. The awk Unix tool helps with most of these steps.
To match an enter event with an exit event, the enter event must be stored in the record associative map. When an exit event is matched, the stored enter event is looked up, and the new record is written. The emitted line contains the timestamp from the enter event, the timestamp from the exit event, and the difference between both. We must put this into the join.awk file:

BEGIN { FS = "|"; OFS = "|" }
/enter/ { record[$2] = $0 }
/exit/ { split(record[$2],val,"|")
         print val[2], val[3], $3, $3-val[3], val[4] }
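
For readers who prefer to see the matching logic outside of awk, here is a rough C++ sketch of the same idea. It assumes each input line has the form event|function|timestamp|extra with an integer timestamp, which is what the field numbers in join.awk suggest; the names splitFields() and joinEvents() are made up for this example. Unlike the awk patterns, the sketch compares the first field exactly instead of matching enter or exit anywhere in the line:

```cpp
#include <sstream>
#include <string>
#include <unordered_map>
#include <vector>

// Splits Line at every '|' character.
std::vector<std::string> splitFields(const std::string &Line) {
  std::vector<std::string> Fields;
  std::istringstream In(Line);
  std::string Field;
  while (std::getline(In, Field, '|'))
    Fields.push_back(Field);
  return Fields;
}

// Pairs each exit event with the stored enter event of the same function
// and emits "function|enter|exit|duration|extra" for every pair.
std::vector<std::string>
joinEvents(const std::vector<std::string> &Lines) {
  std::unordered_map<std::string, std::vector<std::string>> Pending;
  std::vector<std::string> Joined;
  for (const std::string &Line : Lines) {
    std::vector<std::string> F = splitFields(Line);
    if (F.size() < 4)
      continue; // Skip malformed lines.
    if (F[0] == "enter") {
      Pending[F[1]] = F; // Remember the enter event by function name.
    } else if (F[0] == "exit") {
      auto It = Pending.find(F[1]);
      if (It == Pending.end())
        continue; // Exit event without a stored enter event.
      const std::vector<std::string> &E = It->second;
      long long Duration = std::stoll(F[2]) - std::stoll(E[2]);
      Joined.push_back(E[1] + "|" + E[2] + "|" + F[2] + "|" +
                       std::to_string(Duration) + "|" + E[3]);
    }
  }
  return Joined;
}
```

Like the awk script, the sketch keeps only one stored enter event per function name, so a recursive call overwrites the enter event of its caller. This is one more reason to treat the computed durations with caution.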

To count the function calls and the execution time, two associative maps, count and sum, are used. In count, the function calls are counted, while in sum, the execution time is added. In the end, the maps are dumped. You can put this into the avg.awk file:

BEGIN { FS = "|"; count[""] = 0; sum[""] = 0 }
{ count[$1]++; sum[$1] += $4 }
END { for (i in count) {
        if (i != "") {
          print count[i], sum[i], sum[i]/count[i], i }
      } }
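
The aggregation step can be sketched in C++ as well. The code below assumes input lines of the form function|enter|exit|duration|extra, which is the format that join.awk prints, and it uses the hypothetical names FunctionStats and aggregate(). It is an illustration of the counting logic, not a replacement for the script:

```cpp
#include <map>
#include <sstream>
#include <string>
#include <vector>

struct FunctionStats {
  unsigned long long Count = 0; // Number of recorded calls.
  long long TotalTime = 0;      // Sum of all durations.

  // Average duration per call, or 0 if no call was recorded.
  double average() const {
    return Count ? static_cast<double>(TotalTime) / Count : 0.0;
  }
};

// Accumulates call count and total duration per function name. The
// duration is taken from the fourth '|'-separated field of each line.
std::map<std::string, FunctionStats>
aggregate(const std::vector<std::string> &JoinedLines) {
  std::map<std::string, FunctionStats> Stats;
  for (const std::string &Line : JoinedLines) {
    std::istringstream In(Line);
    std::string Name, Enter, Exit, Duration;
    if (!std::getline(In, Name, '|') || !std::getline(In, Enter, '|') ||
        !std::getline(In, Exit, '|') || !std::getline(In, Duration, '|'))
      continue; // Skip lines with fewer than four fields.
    FunctionStats &S = Stats[Name];
    ++S.Count;
    S.TotalTime += std::stoll(Duration);
  }
  return Stats;
}
```

Keeping the count and the sum in one struct per function plays the role of the two awk maps, and the average is computed on demand from both values.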

After running these two scripts, the result can be sorted in descending order, and then the top 10 lines can be taken from the file. However, one improvement is still possible: the function names passed to __ppp_enter() and __ppp_exit() are mangled and are therefore difficult to read. Using the llvm-cxxfilt tool, the names can be demangled. The demangle.awk script is as follows:

{ cmd = "llvm-cxxfilt " $4
  (cmd) | getline name
  close(cmd); $4 = name; print }
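
If you would rather demangle names inside a C++ program than spawn llvm-cxxfilt for every line, the C++ runtime shipped with GCC and Clang offers abi::__cxa_demangle() in the cxxabi.h header. The wrapper below is a minimal sketch under the assumption that such a runtime is available (the header does not exist with MSVC); the function name demangle() is our own choice:

```cpp
#include <cstdlib>
#include <cxxabi.h>
#include <string>

// Returns the demangled form of Mangled, or Mangled itself if the
// runtime reports that it cannot demangle the name.
std::string demangle(const std::string &Mangled) {
  int Status = 0;
  char *Result =
      abi::__cxa_demangle(Mangled.c_str(), nullptr, nullptr, &Status);
  if (Status != 0 || Result == nullptr)
    return Mangled;
  std::string Demangled(Result);
  std::free(Result); // The buffer was allocated with malloc().
  return Demangled;
}
```

For example, passing _ZN8charinfo7isASCIIEc to demangle() gives charinfo::isASCII(char), the same readable form that the awk pipeline produces.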

To get the top 10 function calls, you can run the following:

$ cat ppprofile.csv | awk -f join.awk | awk -f avg.awk |\
  sort -nr | head -10 | awk -f demangle.awk

Here are some sample lines from the output:

446 1545581 3465.43 charinfo::isASCII(char)
409 826261 2020.2 llvm::StringRef::StringRef()
382 899471 2354.64 tinylang::Token::is(tinylang::tok::TokenKind) const
171 1561532 9131.77 charinfo::isIdentifierHead(char)

The first number is the call count of the function, the second is the cumulative execution time, and the third number is the average execution time. As explained previously, do not trust the time values, though the call counts should be accurate.
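
The sort and head steps of the shell pipeline also have a compact C++ counterpart. The sketch below keeps the rows with the highest call counts; ProfileRow and topByCalls() are names invented for this illustration, and the rows are assumed to come from an aggregation step like the one performed by avg.awk:

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

struct ProfileRow {
  std::string Name;
  unsigned long long Calls;
  long long TotalTime;
};

// Returns the N rows with the highest call counts, largest first.
// If fewer than N rows exist, all of them are returned in sorted order.
std::vector<ProfileRow> topByCalls(std::vector<ProfileRow> Rows,
                                   std::size_t N) {
  N = std::min(N, Rows.size());
  auto Middle = Rows.begin() + static_cast<std::ptrdiff_t>(N);
  std::partial_sort(Rows.begin(), Middle, Rows.end(),
                    [](const ProfileRow &A, const ProfileRow &B) {
                      return A.Calls > B.Calls;
                    });
  Rows.erase(Middle, Rows.end());
  return Rows;
}
```

Using std::partial_sort avoids ordering the whole dataset when only the first few entries are of interest, which is a reasonable choice for a file with tens of thousands of lines.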
So far, we’ve implemented a new instrumentation pass, either as a plugin or as an addition to LLVM, and we used it in some real-world scenarios. In the next section, we’ll explore how to set up an optimization pipeline in our compiler.

Using the ppprofiler pass with LLVM tools – Optimizing IR-1

Recall the ppprofiler pass that we developed as a plugin out of the LLVM tree in the Developing the ppprofiler pass as a plugin section. Here, we’ll learn how to use this pass with LLVM tools, such as opt and clang, as they can load plugins.
Let’s look at opt first.
Run the pass plugin in opt
To play around with the new plugin, you need a file containing LLVM IR. The easiest way to do this is to translate a C program, such as a basic “Hello World” style program:

#include <stdio.h>

int main(int argc, char *argv[]) {
  puts("Hello");
  return 0;
}

Compile this file, hello.c, with clang:

$ clang -S -emit-llvm -O1 hello.c

You will get a very simple IR file called hello.ll that contains the following code:

$ cat hello.ll
@.str = private unnamed_addr constant [6 x i8] c"Hello\00",
        align 1
define dso_local i32 @main(
    i32 noundef %0, ptr nocapture noundef readnone %1) {
  %3 = tail call i32 @puts(
      ptr noundef nonnull dereferenceable(1) @.str)
  ret i32 0
}

This is enough to test the pass.
To run the pass, you have to provide a couple of arguments. First, you need to tell opt to load the shared library via the --load-pass-plugin option. To run a single pass, you must specify the --passes option. Using the hello.ll file as input, you can run the following:

$ opt --load-pass-plugin=./PPProfiler.so \
  --passes="ppprofiler" --stats hello.ll -o hello_inst.bc

If statistic generation is enabled, you will see the following output:

===--------------------------------------------------------===
            ... Statistics Collected ...
===--------------------------------------------------------===
1 ppprofiler - Number of instrumented functions.

Otherwise, you will be informed that statistic collection is not enabled:

Statistics are disabled. Build with asserts or with
-DLLVM_FORCE_ENABLE_STATS

The bitcode file, hello_inst.bc, is the result. You can turn this file into readable IR with the llvm-dis tool. As expected, you will see the calls to the __ppp_enter() and __ppp_exit() functions and a new constant for the name of the function:

$ llvm-dis hello_inst.bc -o -
@.str = private unnamed_addr constant [6 x i8] c"Hello\00",
        align 1
@0 = private unnamed_addr constant [5 x i8] c"main\00",
     align 1
define dso_local i32 @main(i32 noundef %0,
                           ptr nocapture noundef readnone %1) {
  call void @__ppp_enter(ptr @0)
  %3 = tail call i32 @puts(
      ptr noundef nonnull dereferenceable(1) @.str)
  call void @__ppp_exit(ptr @0)
  ret i32 0
}

This already looks good! It would be even better if we could turn this IR into an executable and run it. For this, you need to provide implementations for the called functions.

Fully integrating the pass into the pass registry – Optimizing IR

To fully integrate the new pass into LLVM, the source of the plugin needs to be structured slightly differently. The main reason for this is that the constructor of the pass class is called from the pass registry, which requires the class interface to be put into a header file.
Like before, you must put the new pass into the Transforms component of LLVM. Begin the implementation by creating the llvm-project/llvm/include/llvm/Transforms/PPProfiler/PPProfiler.h header file. The content of that file is the class definition; put it into the llvm namespace. No other changes are required:

#ifndef LLVM_TRANSFORMS_PPPROFILER_PPPROFILER_H
#define LLVM_TRANSFORMS_PPPROFILER_PPPROFILER_H

#include "llvm/IR/PassManager.h"

namespace llvm {
class PPProfilerIRPass
    : public llvm::PassInfoMixin<PPProfilerIRPass> {
public:
  llvm::PreservedAnalyses
  run(llvm::Module &M, llvm::ModuleAnalysisManager &AM);

private:
  void instrument(llvm::Function &F,
                  llvm::Function *EnterFn,
                  llvm::Function *ExitFn);
};
} // namespace llvm

#endif

Next, copy the source file of the pass plugin, PPProfiler.cpp, into the new directory, llvm-project/llvm/lib/Transforms/PPProfiler. This file needs to be updated in the following way:

  1. Since the class definition is now in a header file, you must remove the class definition from this file. At the top, add the include directive for the header file:

#include "llvm/Transforms/PPProfiler/PPProfiler.h"

  2. The llvmGetPassPluginInfo() function must be removed because the pass is no longer built into a shared library of its own.

As before, you also need to provide a CMakeLists.txt file for the build. You must declare the new pass as a new component:

add_llvm_component_library(LLVMPPProfiler
PPProfiler.cpp
LINK_COMPONENTS
Core
Support
)

Then, as in the previous section, you need to include the new source directory by adding the following line to the CMakeLists.txt file in the parent directory:

add_subdirectory(PPProfiler)

Inside LLVM, the available passes are kept in the llvm/lib/Passes/PassRegistry.def database file. You need to update this file. The new pass is a module pass, so we need to search inside the file for the section in which module passes are defined, for example, by searching for the MODULE_PASS macro. Inside this section, add the following line:

MODULE_PASS("ppprofiler", PPProfilerIRPass())

This database file is used in the llvm/lib/Passes/PassBuilder.cpp file. This file needs to include your new header file:

#include "llvm/Transforms/PPProfiler/PPProfiler.h"

These are all required source changes based on the plugin version of the new pass.
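
If you wonder how a single MODULE_PASS line can be enough, the registry relies on a preprocessor technique often called X-macros: the including file defines the macro, expands the list, and undefines the macro again, possibly several times with different definitions. The following self-contained sketch imitates this with a macro list instead of a separate .def file so that it fits into one translation unit. All names in it (TOY_PASS_REGISTRY, PPProfilerToyPass, VerifierToyPass, registeredPassNames(), and describePass()) are invented for illustration and do not exist in LLVM:

```cpp
#include <string>
#include <vector>

// Stand-in for a registry file: one entry per pass, consisting of the
// textual name and an expression that constructs the pass.
#define TOY_PASS_REGISTRY(MODULE_PASS)                                   \
  MODULE_PASS("ppprofiler", PPProfilerToyPass())                         \
  MODULE_PASS("toy-verifier", VerifierToyPass())

struct PPProfilerToyPass {
  static const char *description() { return "instruments functions"; }
};
struct VerifierToyPass {
  static const char *description() { return "checks the module"; }
};

// First expansion: collect the names of all registered passes.
std::vector<std::string> registeredPassNames() {
  std::vector<std::string> Names;
#define ADD_NAME(NAME, CREATE_PASS) Names.push_back(NAME);
  TOY_PASS_REGISTRY(ADD_NAME)
#undef ADD_NAME
  return Names;
}

// Second expansion: find a pass by name, construct it, and use it.
// Returns an empty string if the name is not registered.
std::string describePass(const std::string &Name) {
#define MATCH_NAME(NAME, CREATE_PASS)                                    \
  if (Name == NAME)                                                      \
    return CREATE_PASS.description();
  TOY_PASS_REGISTRY(MATCH_NAME)
#undef MATCH_NAME
  return "";
}
```

Because the second expansion evaluates the constructor expression from the registry entry, the class must be visible at that point. This mirrors why the file that includes the registry also has to include the header of the new pass.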
Since you created a new LLVM component, it is also necessary to add a link dependency in the llvm/lib/Passes/CMakeLists.txt file. Under the LINK_COMPONENTS keyword, you need to add a line with the name of the new component:

PPProfiler

Et voilà – you are ready to build and install LLVM. The new pass, ppprofiler, is now available to all LLVM tools. It has been compiled into the libLLVMPPProfiler.a library and is available in the build system as the PPProfiler component.
So far, we have talked about how to create a new pass. In the next section, we will examine how to use the ppprofiler pass.