
Code generation changes to support JIT compilation via LLJIT – JIT Compilation

Now, let’s take a brief look at some of the changes we have made within CodeGen.cpp to support our JIT-based calculator:

  1. As previously mentioned, the code generation class has two important methods: one to compile the user-defined function into LLVM IR and print the IR to the console, and another to prepare the calculation evaluation function, calc_expr_func, which contains a call to the original user-defined function for evaluation. This second function also prints the resulting IR to the user:
    void CodeGen::compileToIR(AST *Tree, Module *M,
                              StringMap<size_t> &JITtedFunctions) {
      ToIRVisitor ToIR(M, JITtedFunctions);
      ToIR.run(Tree);
      M->print(outs(), nullptr);
    }
    void CodeGen::prepareCalculationCallFunc(AST *FuncCall,
                                             Module *M, llvm::StringRef FnName,
                                             StringMap<size_t> &JITtedFunctions) {
      ToIRVisitor ToIR(M, JITtedFunctions);
      ToIR.genFuncEvaluationCall(FuncCall);
      M->print(outs(), nullptr);
    }
  2. As noted in the preceding source, these code generation functions define a ToIRVisitor instance that takes our module and the JITtedFunctions map as constructor arguments:
    class ToIRVisitor : public ASTVisitor {
      Module *M;
      IRBuilder<> Builder;
      StringMap<size_t> &JITtedFunctionsMap;
      . . .
    public:
      ToIRVisitor(Module *M,
                  StringMap<size_t> &JITtedFunctions)
          : M(M), Builder(M->getContext()),
            JITtedFunctionsMap(JITtedFunctions) {
  3. Ultimately, this information is used to either generate IR or evaluate the function that the IR was previously generated for. When generating the IR, the code generator expects to see a DefDecl node, which represents the definition of a new function. The function name, along with the number of arguments it is defined with, is stored within the function definitions map:
    virtual void visit(DefDecl &Node) override {
      llvm::StringRef FnName = Node.getFnName();
      llvm::SmallVector<llvm::StringRef, 8> FunctionVars = Node.getVars();
      JITtedFunctionsMap[FnName] = FunctionVars.size();
  4. Afterward, the actual function definition is created by the genUserDefinedFunction() call:
    Function *DefFunc = genUserDefinedFunction(FnName);
  5. Within genUserDefinedFunction(), the first step is to check if the function already exists within the module. If it does not, we check that the function prototype exists within our map data structure. Then, we use the name and the argument count to construct a function that takes the number of arguments the user defined and returns a single integer value:
    Function *genUserDefinedFunction(llvm::StringRef Name) {
      if (Function *F = M->getFunction(Name))
        return F;
      Function *UserDefinedFunction = nullptr;
      auto FnNameToArgCount = JITtedFunctionsMap.find(Name);
      if (FnNameToArgCount != JITtedFunctionsMap.end()) {
        std::vector<Type *> IntArgs(FnNameToArgCount->second, Int32Ty);
        FunctionType *FuncType = FunctionType::get(Int32Ty, IntArgs, false);
        UserDefinedFunction =
            Function::Create(FuncType, GlobalValue::ExternalLinkage, Name, M);
      }
      return UserDefinedFunction;
    }
  6. After generating the user-defined function, a new basic block is created within it, and the builder's insertion point is set to that block. Each function argument is also associated with a name that was defined by the user, so we set the names for all function arguments accordingly and then generate the mathematical operations that operate on the arguments within the function:
    BasicBlock *BB = BasicBlock::Create(M->getContext(), "entry", DefFunc);
    Builder.SetInsertPoint(BB);
    unsigned FIdx = 0;
    for (auto &FArg : DefFunc->args()) {
      nameMap[FunctionVars[FIdx]] = &FArg;
      FArg.setName(FunctionVars[FIdx++]);
    }
    Node.getExpr()->accept(this);
    };
  7. When evaluating the user-defined function, the AST that is expected in our example is called a FuncCallFromDef node. First, we define the evaluation function and name it calc_expr_func (taking in zero arguments and returning one result):
    virtual void visit(FuncCallFromDef &Node) override {
      llvm::StringRef CalcExprFunName = "calc_expr_func";
      FunctionType *CalcExprFunTy = FunctionType::get(Int32Ty, {}, false);
      Function *CalcExprFun = Function::Create(
          CalcExprFunTy, GlobalValue::ExternalLinkage, CalcExprFunName, M);
  8. Next, we create a new basic block to insert calc_expr_func into:
    BasicBlock *BB = BasicBlock::Create(M->getContext(), "entry", CalcExprFun);
    Builder.SetInsertPoint(BB);
  9. Similar to before, the user-defined function is retrieved by genUserDefinedFunction(), and we pass the numerical parameters of the function call into the original function that we have just regenerated:
    llvm::StringRef CalleeFnName = Node.getFnName();
    Function *CalleeFn = genUserDefinedFunction(CalleeFnName);
  10. Once we have the actual llvm::Function instance available, we utilize IRBuilder to create a call to the defined function and also return the result so that it is accessible when the result is printed to the user in the end:
    auto CalleeFnVars = Node.getArgs();
    llvm::SmallVector<Value *> IntParams;
    for (unsigned i = 0, end = CalleeFnVars.size(); i != end; ++i) {
      int ArgsToIntType;
      CalleeFnVars[i].getAsInteger(10, ArgsToIntType);
      Value *IntParam = ConstantInt::get(Int32Ty, ArgsToIntType, true);
      IntParams.push_back(IntParam);
    }
    Builder.CreateRet(Builder.CreateCall(CalleeFn, IntParams, "calc_expr_res"));
    };

LLVM’s overall JIT implementation and use cases – JIT Compilation

So far, we have only looked at ahead-of-time (AOT) compilers. These compilers compile the whole application. The application can only run after the compilation is finished. If the compilation is performed at the runtime of the application, then the compiler is a JIT compiler. A JIT compiler has interesting use cases:

  • Implementation of a virtual machine: A programming language can be translated to byte code with an AOT compiler. At runtime, a JIT compiler is used to compile the byte code to machine code. The advantage of this approach is that the byte code is hardware-independent, and thanks to the JIT compiler, there is no performance penalty compared to an AOT compiler. Java and C# use this model today, but this is not a new idea: the UCSD Pascal compiler from 1977 already used a similar approach.
  • Expression evaluation: A spreadsheet application can compile often-executed expressions with a JIT compiler. For example, this can speed up financial simulations. The lldb LLVM debugger uses this approach to evaluate source expressions at debug time.
  • Database queries: A database creates an execution plan from a database query. The execution plan describes operations on tables and columns, which leads to a query answer when executed. A JIT compiler can be used to translate the execution plan into machine code, which speeds up the execution of the query.

The static compilation model of LLVM is not as far away from the JIT model as one may think. The llc LLVM static compiler compiles LLVM IR into machine code and saves the result as an object file on disk. If the object file is not stored on disk but in memory, would the code be executable? Not directly, as references to global functions and global data use relocations instead of absolute addresses. Conceptually, a relocation describes how to calculate the address – for example, as an offset to a known address. If we resolve relocations into addresses, as the linker and the dynamic loader do, then we can execute the object code. Running the static compiler to compile IR code into an object file in memory, performing a link step on the in-memory object file, and running the code gives us a JIT compiler. The JIT implementation in the LLVM core libraries is based on this idea.
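
To make this idea concrete, here is a minimal sketch of compiling and running an in-memory module through LLVM's ORC LLJIT API. This is not the calculator's actual driver code; the function name calc_expr_func is taken from the earlier section, error handling is reduced to aborting via cantFail(), and the lookup/cast step may need small adjustments depending on your LLVM version:

#include "llvm/ExecutionEngine/Orc/LLJIT.h"
#include "llvm/ExecutionEngine/Orc/ThreadSafeModule.h"
#include "llvm/IR/Module.h"
#include "llvm/Support/Error.h"
#include "llvm/Support/TargetSelect.h"

using namespace llvm;
using namespace llvm::orc;

// Compile, link, and run a module entirely in memory.
int runCalcExpr(std::unique_ptr<Module> M, std::unique_ptr<LLVMContext> Ctx) {
  InitializeNativeTarget();
  InitializeNativeTargetAsmPrinter();
  // The builder assembles the default compile and link layers for the host.
  auto JIT = cantFail(LLJITBuilder().create());
  cantFail(JIT->addIRModule(ThreadSafeModule(std::move(M), std::move(Ctx))));
  // Symbol resolution plays the role of the linker and dynamic loader here.
  auto Addr = cantFail(JIT->lookup("calc_expr_func"));
  auto *Fn = Addr.toPtr<int (*)()>(); // ExecutorAddr::toPtr in recent LLVM
  return Fn();
}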

During the development history of LLVM, there were several JIT implementations with different feature sets. The latest JIT API is the On-Request Compilation (ORC) engine. In case you were curious about the acronym, it was the lead developer's intention to invent yet another acronym from Tolkien's universe, after the Executable and Linking Format (ELF) and the DWARF debugging format were already in use.

The ORC engine builds on and extends the idea of using the static compiler and a dynamic linker on the in-memory object file. The implementation uses a layered approach. The two basic levels are the compile layer and the link layer. On top of this sits a layer providing support for lazy compilation. A transformation layer can be stacked on top or below the lazy compilation layer, allowing the developer to add arbitrary transformations or simply to be notified of certain events. Moreover, this layered approach has the advantage that the JIT engine is customizable for diverse requirements. For example, a high-performance virtual machine may choose to compile everything upfront and make no use of the lazy compilation layer. On the other hand, other virtual machines will emphasize startup time and responsiveness to the user and will achieve this with the help of the lazy compilation layer.

The older MCJIT engine is still available, and its API is derived from an even older, already-removed JIT engine. Over time, this API gradually became bloated, and it lacks the flexibility of the ORC API. The goal is to remove this implementation, as the ORC engine now provides all the functionality of the MCJIT engine, and new developments should use the ORC API.

In the next section, we look at lli, the LLVM interpreter and dynamic compiler, before we dive into implementing a JIT compiler.

Drawbacks of TableGen – The TableGen Language

Performance of the token filter

Using a plain binary search for the keyword filter does not give better performance than the implementation based on the llvm::StringMap type. To beat the performance of the current implementation, you need to generate a perfect hash function.

The classic algorithm from Czech, Havas, and Majewski is easy to implement and gives very good performance. It is described in An optimal algorithm for generating minimal perfect hash functions, Information Processing Letters, Volume 43, Issue 5, 1992. See https://www.sciencedirect.com/science/article/abs/pii/002001909290220P.

A state-of-the-art algorithm is PTHash from Pibiri and Trani, described in PTHash: Revisiting FCH Minimal Perfect Hashing, SIGIR ’21. See https://arxiv.org/pdf/2104.10402.pdf.

Both algorithms are good candidates for generating a token filter that is actually faster than llvm::StringMap.
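
To illustrate the shape such a generated filter could take, here is a small sketch. The perfectHash() function below is a toy stand-in for the hash function that the CHM or PTHash construction would emit, the table mirrors the keyword flags from the KeywordC.td example in this chapter, and everything here is illustrative rather than generated output:

#include "llvm/ADT/StringRef.h"
#include <iterator>

struct KeywordEntry {
  llvm::StringRef Spelling;
  unsigned Flags;
};

// One row per keyword, ordered by the value of perfectHash(Spelling).
static const KeywordEntry KeywordTable[] = {
    {"auto", 0xF},     // KEYALL
    {"inline", 0xB},   // KEYC99 | KEYCXX | KEYGNU
    {"restrict", 0x1}, // KEYC99
};

// A toy minimal perfect hash for just these three keywords: the first
// character already distinguishes them. Real generators (CHM, PTHash)
// produce something similar in spirit, but for hundreds of keys.
static unsigned perfectHash(llvm::StringRef Str) {
  if (Str.empty())
    return ~0u;
  switch (Str.front()) {
  case 'a': return 0;
  case 'i': return 1;
  case 'r': return 2;
  default:  return ~0u;
  }
}

unsigned lookupKeyword(llvm::StringRef Str) {
  unsigned Idx = perfectHash(Str);
  // A perfect hash is only collision-free for the keys it was built from,
  // so the candidate entry must still be compared against the input.
  if (Idx < std::size(KeywordTable) && KeywordTable[Idx].Spelling == Str)
    return KeywordTable[Idx].Flags;
  return 0; // Not a keyword.
}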

Drawbacks of TableGen

Here are a few drawbacks of TableGen:

  • The TableGen language is built on a simple concept. As a consequence, it does not have the same computing capabilities as other DSLs. Obviously, some programmers would like to replace TableGen with a different, more powerful language, and this topic comes up from time to time in the LLVM discussion forum.
  • With the possibility of implementing your own backends, the TableGen language is very flexible. However, it also means that the semantics of a given definition are hidden inside the backend. Thus, you can create TableGen files that are basically not understandable by other developers.
  • And last, the backend implementation can be very complex if you try to solve a non-trivial task. It is reasonable to expect that this effort would be lower if the TableGen language were more powerful.

Even if not all developers are happy with the capabilities of TableGen, the tool is used widely in LLVM, and for a developer, it is important to understand it.

Summary

In this chapter, you first learned the main idea behind TableGen. Then, you defined your first classes and records in the TableGen language, and you acquired knowledge of the syntax of TableGen. Finally, you developed a TableGen backend emitting fragments of C++ source code, based on the TableGen classes you defined.

In the next chapter, we examine another unique feature of LLVM: generating and executing code in one step, also known as Just-In-Time (JIT) compilation.

Creating a new TableGen tool – The TableGen Language-5

  1. The only missing part now is a way to call this implementation, for which you define a global function, EmitTokensAndKeywordFilter(). The emitSourceFileHeader() function declared in the llvm/TableGen/TableGenBackend.h header emits a comment at the top of the generated file:

void EmitTokensAndKeywordFilter(RecordKeeper &RK,
                                raw_ostream &OS) {
  emitSourceFileHeader("Token Kind and Keyword Filter "
                       "Implementation Fragment",
                       OS);
  TokenAndKeywordFilterEmitter(RK).run(OS);
}

With that, you finished the implementation of the source emitter in the TokenEmitter.cpp file. Overall, the coding is not too complicated.
The TableGenBackends.h header file only contains the declaration of the EmitTokensAndKeywordFilter() function. To avoid including other files, you use forward declarations for the raw_ostream and RecordKeeper classes:

#ifndef TABLEGENBACKENDS_H
#define TABLEGENBACKENDS_H

namespace llvm {
class raw_ostream;
class RecordKeeper;
} // namespace llvm

void EmitTokensAndKeywordFilter(llvm::RecordKeeper &RK,
                                llvm::raw_ostream &OS);

#endif

The missing part is the implementation of the driver. Its task is to parse the TableGen file and emit the records according to the command-line options. The implementation is in the TableGen.cpp file:

  1. As usual, the implementation begins with including the required headers. The most important one is llvm/TableGen/Main.h because this header declares the frontend of TableGen:

#include "TableGenBackends.h"
#include "llvm/Support/CommandLine.h"
#include "llvm/Support/PrettyStackTrace.h"
#include "llvm/Support/Signals.h"
#include "llvm/TableGen/Main.h"
#include "llvm/TableGen/Record.h"

  2. To simplify coding, the llvm namespace is imported:

using namespace llvm;

  3. The user can choose one action. The ActionType enumeration contains all possible actions:

enum ActionType {
PrintRecords,
DumpJSON,
GenTokens,
};

  4. A single command-line option object called Action is used. The user needs to specify the --gen-tokens option to emit the token filter you implemented. The other two options, --print-records and --dump-json, are standard options to dump read records. Note that the object is in an anonymous namespace:

namespace {
cl::opt<ActionType> Action(
    cl::desc("Action to perform:"),
    cl::values(
        clEnumValN(PrintRecords, "print-records",
                   "Print all records to stdout (default)"),
        clEnumValN(DumpJSON, "dump-json",
                   "Dump all records as "
                   "machine-readable JSON"),
        clEnumValN(GenTokens, "gen-tokens",
                   "Generate token kinds and keyword "
                   "filter")));

  5. The Main() function performs the requested action based on the value of Action. Most importantly, your EmitTokensAndKeywordFilter() function is called if --gen-tokens was specified on the command line. After the end of the function, the anonymous namespace is closed:

bool Main(raw_ostream &OS, RecordKeeper &Records) {
switch (Action) {
case PrintRecords:
OS << Records; // No argument, dump all contents
break;
case DumpJSON:
EmitJSON(Records, OS);
break;
case GenTokens:
EmitTokensAndKeywordFilter(Records, OS);
break;
}
return false;
}
} // namespace

  6. And lastly, you define a main() function. After setting up the stack trace handler and parsing the command-line options, the TableGenMain() function is called to parse the TableGen file and create records. That function also calls your Main() function if there are no errors:

int main(int argc, char **argv) {
sys::PrintStackTraceOnErrorSignal(argv[0]);
PrettyStackTraceProgram X(argc, argv);
cl::ParseCommandLineOptions(argc, argv);
llvm_shutdown_obj Y;
return TableGenMain(argv[0], &Main);
}

Your own TableGen tool is now implemented. After compiling, you can run it with the KeywordC.td sample input file as follows:

$ tinylang-tblgen --gen-tokens -o TokenFilter.inc KeywordC.td

The generated C++ source code is written to the TokenFilter.inc file.
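
As a usage sketch, the generated file is meant to be included wherever a guarded fragment is needed. The guard macro names depend on the emitter; GET_TOKEN_KIND_DEFINITION is the one used by the emitter shown in the next section:

// Pull the token-kind definitions from the generated file into a lexer
// source file. GET_TOKEN_KIND_DEFINITION matches the guard macro emitted by
// the backend; other fragments use their own guard macros.
#define GET_TOKEN_KIND_DEFINITION
#include "TokenFilter.inc"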

Creating a new TableGen tool – The TableGen Language-3

  1. Next, the function declarations are emitted. This is only a constant string, so nothing exciting happens. This finishes emitting the declarations:
    OS << "  const char *getTokenName(TokenKind Kind) "
          "LLVM_READNONE;\n"
       << "  const char *getPunctuatorSpelling(TokenKind "
          "Kind) LLVM_READNONE;\n"
       << "  const char *getKeywordSpelling(TokenKind "
          "Kind) "
          "LLVM_READNONE;\n"
       << "}\n"
       << "#endif\n";
  2. Now, let’s turn to emitting the definitions. Again, this generated code is guarded by a macro called GET_TOKEN_KIND_DEFINITION. First, the token names are emitted into a TokNames array, and the getTokenName() function uses that array to retrieve the name. Please note that the quote symbol must be escaped as \" when used inside a string:
    OS << "#ifdef GET_TOKEN_KIND_DEFINITION\n";
    OS << "#undef GET_TOKEN_KIND_DEFINITION\n";
    OS << "static const char * const TokNames[] = {\n";
    for (Record *CC : Records.getAllDerivedDefinitions("Token")) {
      OS << "  \"" << CC->getValueAsString("Name")
         << "\",\n";
    }
    OS << "};\n\n";
    OS << "const char *tok::getTokenName(TokenKind Kind) "
          "{\n"
       << "  if (Kind <= tok::NUM_TOKENS)\n"
       << "    return TokNames[Kind];\n"
       << "  llvm_unreachable(\"unknown TokenKind\");\n"
       << "  return nullptr;\n"
       << "};\n\n";
  3. Next, the getPunctuatorSpelling() function is emitted. The only notable difference to the other parts is that the loop goes over all records derived from the Punctuator class. Also, a switch statement is generated instead of an array:
    OS << "const char "
          "*tok::getPunctuatorSpelling(TokenKind "
          "Kind) {\n"
       << "  switch (Kind) {\n";
    for (Record *CC : Records.getAllDerivedDefinitions("Punctuator")) {
      OS << "    case " << CC->getValueAsString("Name")
         << ": return \"" << CC->getValueAsString("Spelling") << "\";\n";
    }
    OS << "  default: break;\n"
       << "  }\n"
       << "  return nullptr;\n"
       << "};\n\n";
  4. And finally, the getKeywordSpelling() function is emitted. The coding is similar to emitting getPunctuatorSpelling(). This time, the loop goes over all records of the Keyword class, and the name is again prefixed with kw_:
    OS << "const char *tok::getKeywordSpelling(TokenKind "
          "Kind) {\n"
       << "  switch (Kind) {\n";
    for (Record *CC : Records.getAllDerivedDefinitions("Keyword")) {
      OS << "    case kw_" << CC->getValueAsString("Name")
         << ": return \"" << CC->getValueAsString("Name")
         << "\";\n";
    }
    OS << "  default: break;\n"
       << "  }\n"
       << "  return nullptr;\n"
       << "};\n\n";
    OS << "#endif\n";
    }
  5. The emitKeywordFilter() method is more complex than the previous methods since emitting the filter requires collecting some data from the records. The generated source code uses the std::lower_bound() function, thus implementing a binary search (a sketch of the generated code's shape follows after this step).
    Now, let's take a shortcut. There can be several records of the TokenFilter class defined in the TableGen file. For demonstration purposes, just emit at most one token filter method:
    std::vector<Record *> AllTokenFilter =
        Records.getAllDerivedDefinitionsIfDefined(
            "TokenFilter");
    if (AllTokenFilter.empty())
      return;
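
For orientation, the emitted filter ends up looking roughly like the following sketch. The table contents, flag values, and exact signature are illustrative only; the real output is produced from the TokenFilter record and the Keyword records in the .td file:

#include "llvm/ADT/StringRef.h"
#include <algorithm>
#include <iterator>

// Illustrative shape of the generated lookupKeyword(): a table sorted by
// spelling, searched with std::lower_bound(). The flag values correspond to
// the Flag records defined in KeywordC.td.
unsigned lookupKeyword(llvm::StringRef Str) {
  struct Entry {
    unsigned Flags;
    llvm::StringRef Spelling;
  };
  static const Entry Table[] = {
      {0xF, "auto"}, {0xB, "inline"}, {0x1, "restrict"}};
  const Entry *E = std::lower_bound(
      std::begin(Table), std::end(Table), Str,
      [](const Entry &A, llvm::StringRef S) { return A.Spelling < S; });
  if (E != std::end(Table) && E->Spelling == Str)
    return E->Flags;
  return 0; // Not a keyword.
}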

Implementing a TableGen backend – The TableGen Language-1

Since parsing and creation of records are done through an LLVM library, you only need to care about the backend implementation, which consists mostly of generating C++ source code fragments based on the information in the records. First, you need to be clear about what source code to generate before you can put it into the backend.

Sketching the source code to be generated

The output of the TableGen tool is a single file containing C++ fragments. The fragments are guarded by macros. The goal is to replace the TokenKinds.def database file. Based on the information in the TableGen file, you can generate the following:

  1. The enumeration members used to define flags. The developer is free to name the type; however, it should be based on the unsigned type. If the generated file is named TokenKinds.inc, then the intended use is this:

enum Flags : unsigned {
#define GET_TOKEN_FLAGS
#include "TokenKinds.inc"
};

  2. The TokenKind enumeration, and the prototypes and definitions of the getTokenName(), getPunctuatorSpelling(), and getKeywordSpelling() functions. This code replaces the TokenKinds.def database file, most of the TokenKinds.h include file, and the TokenKinds.cpp source file.
  3. A new lookupKeyword() function that can be used instead of the current implementation based on the llvm::StringMap type. This is the function you want to optimize.

Knowing what you want to generate, you can now turn to implementing the backend.

Creating a new TableGen tool

A simple structure for your new tool is to have a driver that evaluates the command-line options and calls the generation functions, with the actual generator functions in a different file. Let's call the driver file TableGen.cpp and the file containing the generator TokenEmitter.cpp. You also need a TableGenBackends.h header file. Let's begin the implementation with the generation of the C++ code in the TokenEmitter.cpp file:

  1. As usual, the file begins with including the required headers. The most important one is llvm/TableGen/Record.h, which defines a Record class, used to hold records generated by parsing the .td file:

#include "TableGenBackends.h"
#include "llvm/Support/Format.h"
#include "llvm/TableGen/Record.h"
#include "llvm/TableGen/TableGenBackend.h"
#include <algorithm>

  2. To simplify coding, the llvm namespace is imported:

using namespace llvm;

  3. The TokenAndKeywordFilterEmitter class is responsible for generating the C++ source code. The emitFlagsFragment(), emitTokenKind(), and emitKeywordFilter() methods emit the source code, as described in the previous section, Sketching the source code to be generated. The only public method, run(), calls all the code-emitting methods; a possible implementation is sketched after the class outline below. The records are held in an instance of RecordKeeper, which is passed as a parameter to the constructor. The class is inside an anonymous namespace:

namespace {
class TokenAndKeywordFilterEmitter {
RecordKeeper &Records;
public:
explicit TokenAndKeywordFilterEmitter(RecordKeeper &R)
: Records(R) {}
void run(raw_ostream &OS);
private:
void emitFlagsFragment(raw_ostream &OS);
void emitTokenKind(raw_ostream &OS);
void emitKeywordFilter(raw_ostream &OS);
};
} // End anonymous namespace
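
Here is a minimal sketch of how run() could simply chain the three emitters. The real method may add extras such as timers, so treat this as an assumption about its overall shape rather than the exact code:

void TokenAndKeywordFilterEmitter::run(raw_ostream &OS) {
  // Emit each guarded fragment in turn into the single output file.
  emitFlagsFragment(OS);
  emitTokenKind(OS);
  emitKeywordFilter(OS);
}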

Defining data in the TableGen language – The TableGen Language

The TokenKinds.def database file defines three different macros. The TOK macro is used for tokens that do not have a fixed spelling – for example, for integer literals. The PUNCTUATOR macro is used for all kinds of punctuation marks and includes a preferred spelling. Lastly, the KEYWORD macro defines a keyword that is made up of a literal and a flag, which is used to indicate at which language level this literal is a keyword. For example, the thread_local keyword was added to C++11.
One way to express this in the TableGen language is to create a Token class that holds all the data. You can then add subclasses of that class to make the usage more comfortable. You also need a Flag class for flags defined together with a keyword. And last, you need a class to define a keyword filter. These classes define the basic data structure and can be potentially reused in other projects. Therefore, you create a Keyword.td file for it. Here are the steps:

  1. A flag is modeled as a name and an associated value. This makes it easy to generate an enumeration from this data:

class Flag<string name, int val> {
  string Name = name;
  int Val = val;
}

  2. The Token class is used as the base class. It just carries a name. Please note that this class has no parameters:

class Token {
  string Name;
}

  3. The Tok class has the same function as the corresponding TOK macro from the database file. It represents a token without a fixed spelling. It derives from the base class, Token, and just adds initialization for the name:

class Tok<string name> : Token {
  let Name = name;
}

  4. In the same way, the Punctuator class resembles the PUNCTUATOR macro. It adds a field for the spelling of the token:

class Punctuator<string name, string spelling> : Token {
  let Name = name;
  string Spelling = spelling;
}

  5. And last, the Keyword class needs a list of flags:

class Keyword<string name, list<Flag> flags> : Token {
  let Name = name;
  list<Flag> Flags = flags;
}

  6. With these definitions in place, you can now define a class for the keyword filter, called TokenFilter. It takes a list of tokens as a parameter:

class TokenFilter<list<Token> tokens> {
  string FunctionName;
  list<Token> Tokens = tokens;
}

With these class definitions, you are certainly able to capture all the data from the TokenKinds.def database file. The TinyLang language does not utilize the flags, since there is only this version of the language. Real-world languages such as C and C++ have undergone a couple of revisions, and they usually require flags. Therefore, we use keywords from C and C++ as an example. Let’s create a KeywordC.td file, as follows:

  1. First, you include the class definitions created earlier:

include "Keyword.td"

  2. Next, you define flags. The value is the binary value of the flag. Note how the !or operator is used to create a value for the KEYALL flag:

def KEYC99 : Flag<"KEYC99", 0x1>;
def KEYCXX : Flag<"KEYCXX", 0x2>;
def KEYCXX11: Flag<"KEYCXX11", 0x4>;
def KEYGNU : Flag<"KEYGNU", 0x8>;
def KEYALL : Flag<"KEYALL", !or(KEYC99.Val, KEYCXX.Val, KEYCXX11.Val, KEYGNU.Val)>;

  3. There are tokens without a fixed spelling – for example, a comment:

def : Tok<"comment">;

  4. Operators are defined using the Punctuator class, as in this example:

def : Punctuator<"plus", "+">;
def : Punctuator<"minus", "-">;

  5. Keywords need to use different flags:

def kw_auto: Keyword<"auto", [KEYALL]>;
def kw_inline: Keyword<"inline", [KEYC99,KEYCXX,KEYGNU]>;
def kw_restrict: Keyword<"restrict", [KEYC99]>;

  6. And last, here's the definition of the keyword filter:

def : TokenFilter<[kw_auto, kw_inline, kw_restrict]>;

Of course, this file does not include all tokens from C and C++. However, it demonstrates all possible usages of the defined TableGen classes.
Based on these TableGen files, you’ll implement a TableGen backend in the next section.

Understanding the TableGen language – The TableGen Language

LLVM comes with its own domain-specific language (DSL) called TableGen. It is used to generate C++ code for a wide range of use cases, thus reducing the amount of code a developer has to produce. The TableGen language is not a full-fledged programming language. It is only used to define records, which is a fancy word for a collection of names and values. To understand why such a restricted language is useful, let’s examine two examples.

Typical data you need to define one machine instruction of a CPU is:

  • The mnemonic of the instruction
  • The bit pattern
  • The number and types of operands
  • Possible restrictions or side effects

It is easy to see that this data can be represented as a record. For example, a field named asmstring could hold the value of the mnemonic; say, “add”. Also, a field named opcode could hold the binary representation of the instruction. Together, the record would describe an add instruction. Each LLVM backend describes the instruction set in this way.

Records are such a general concept that you can describe a wide variety of data with them. Another example is the definition of command-line options. A command-line option:

  • Has a name
  • May have an optional argument
  • Has a help text
  • May belong to a group of options

Again, this data can be easily seen as a record. Clang uses this approach for the command-line options of the Clang driver.

The TableGen language

In LLVM, the TableGen language is used for a variety of tasks. Large parts of a backend are written in the TableGen language; for example, the definition of a register file, all instructions with mnemonics and binary encodings, calling conventions, patterns for instruction selection, and scheduling models for instruction scheduling. Other uses of TableGen in LLVM are the definition of intrinsic functions, the definition of attributes, and the definition of command-line options.

You’ll find the Programmer’s Reference at https://llvm.org/docs/TableGen/ProgRef.html and the Backend Developer’s Guide at https://llvm.org/docs/TableGen/BackGuide.html.

To achieve this flexibility, the parsing and the semantics of the TableGen language are implemented in a library. To generate C++ code from the records, you need to create a tool that takes the parsed records and generates C++ code from them. In LLVM, that tool is called llvm-tblgen, and in Clang, it is called clang-tblgen. Those tools contain the code generators required by the respective project. But they can also be used to learn more about the TableGen language, which is what we'll do in the next section.

Experimenting with the TableGen language

Very often, beginners feel overwhelmed by the TableGen language. But as soon as you start experimenting with the language, it becomes much easier.
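
For example, assuming llvm-tblgen is on your path and the Keyword.td and KeywordC.td files shown earlier in this chapter are in the current directory, you can inspect the records without any custom backend involved:

$ llvm-tblgen --print-records KeywordC.td
$ llvm-tblgen --dump-json KeywordC.td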

TIP – Optimizing IR

To allow the user to add passes at every extension point, you need to add the preceding code snippet for each extension point.

  1. Now is a good time to try out the different pass manager options. With the --debug-pass-manager option, you can follow which passes are executed in which order. You can also print the IR before or after each pass, which is invoked with the --print-before-all and --print-after-all options. If you created your own pass pipeline, then you can insert the print pass in points of interest. For example, try the --passes="print,inline,print" option. Furthermore, to identify which pass changes the IR code, you can use the --print-changed option, which will only print the IR code if it has changed compared to the result from the pass before. The greatly reduced output makes it much easier to follow IR transformations.

The PassBuilder class has a nested OptimizationLevel class to represent the six different optimization levels. Instead of using the "default<O?>" pipeline description as an argument to the parsePassPipeline() method, we can also call the buildPerModuleDefaultPipeline() method, which builds the default optimization pipeline for the requested level – except for level O0. This optimization level means that no optimization is performed.

Consequently, no passes are added to the pass manager. If we still want to run a certain pass, then we can add it to the pass manager manually. A simple pass to run at this level is the AlwaysInliner pass, which inlines a function marked with the always_inline attribute into the caller. After translating the command-line option value for the optimization level into the corresponding member of the OptimizationLevel class, we can implement this as follows:
    PassBuilder::OptimizationLevel OLevel = …;
    if (OLevel == PassBuilder::OptimizationLevel::O0)
      MPM.addPass(AlwaysInlinerPass());
    else
      MPM = PB.buildPerModuleDefaultPipeline(OLevel, DebugPM);

Of course, it is possible to add more than one pass to the pass manager in this fashion. PassBuilder also uses the addPass() method when constructing the pass pipeline.

Running extension point callbacks

Because the pass pipeline is not populated for optimization level O0, the registered extension points are not called. If you use the extension points to register passes that should also run at O0 level, this is problematic. You can call the runRegisteredEPCallbacks() method to run the registered extension point callbacks, resulting in a pass manager populated only with the passes that were registered through the extension points.
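
A sketch of how this could fit into the O0 branch shown earlier follows. The exact parameter list of runRegisteredEPCallbacks() has changed between LLVM releases, so check PassBuilder.h for your version; the two-argument form used here is an assumption:

if (OLevel == PassBuilder::OptimizationLevel::O0) {
  MPM.addPass(AlwaysInlinerPass());
  // Assumed signature; lets EP-registered passes run even at O0.
  PB.runRegisteredEPCallbacks(MPM, OLevel);
} else {
  MPM = PB.buildPerModuleDefaultPipeline(OLevel, DebugPM);
}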

By adding the optimization pipeline to tinylang, you created an optimizing compiler similar to clang. The LLVM community works on improving the optimizations and the optimization pipeline with each release. Due to this, it is very seldom that the default pipeline is not used. Most often, new passes are added to implement certain semantics of the programming language.

Summary

In this chapter, you learned how to create a new pass for LLVM. You ran the pass using a pass pipeline description and an extension point. You extended your compiler with the construction and execution of a pass pipeline similar to clang, turning tinylang into an optimizing compiler. The pass pipeline allows the addition of passes at extension points, and you learned how you can register passes at these points. This allows you to extend the optimization pipeline with your developed passes or existing passes.

In the next chapter, you will learn the basics of the TableGen language, which is used extensively in LLVM and clang to significantly reduce manual programming.

Extending the pass pipeline – Optimizing IR

In the previous section, we used the PassBuilder class to create a pass pipeline, either from a user-provided description or a predefined name. Now, let’s look at another way to customize the pass pipeline: using extension points.

During the construction of the pass pipeline, the pass builder allows passes contributed by the user to be added. These places are called extension points. A couple of extension points exist, as follows:

  • The pipeline start extension point, which allows us to add passes at the beginning of the pipeline
  • The peephole extension point, which allows us to add passes after each instance of the instruction combiner pass

Other extension points exist too. To employ an extension point, you must register a callback. During the construction of the pass pipeline, your callback is run at the defined extension point and can add passes to the given pass manager.

To register a callback for the pipeline start extension point, you must call the registerPipelineStartEPCallback() method of the PassBuilder class. For example, to add our PPProfiler pass to the beginning of the pipeline, you would adapt the pass to be used as a module pass with a call to the createModuleToFunctionPassAdaptor() template function and then add the pass to the module pass manager:

PB.registerPipelineStartEPCallback(
[](ModulePassManager &MPM) {
MPM.addPass(PPProfilerIRPass());
});

You can add this snippet in the pass pipeline setup code anywhere before the pipeline is created – that is, before the parsePassPipeline() method is called.
A very natural extension to what we did in the previous section is to let the user pass a pipeline description for an extension point on the command line. The opt tool allows this too. Let’s do this for the pipeline start extension point. Add the following code to the tools/driver/Driver.cpp file:

  1. First, we must add a new command-line option for the user to specify the pipeline description. Again, we take the option name from the opt tool:

static cl::opt<std::string> PipelineStartEPPipeline(
    "passes-ep-pipeline-start",
    cl::desc("Pipeline start extension point"));

  2. Using a lambda function as a callback is the most convenient way to do this. To parse the pipeline description, we must call the parsePassPipeline() method of the PassBuilder instance. The passes are added to the PM pass manager, which is given as an argument to the lambda function. If an error occurs, we only print an error message without stopping the application. You can add this snippet after the call to the crossRegisterProxies() method:
    PB.registerPipelineStartEPCallback(
        [&PB, Argv0](ModulePassManager &PM) {
          if (auto Err = PB.parsePassPipeline(
                  PM, PipelineStartEPPipeline)) {
            WithColor::error(errs(), Argv0)
                << "Could not parse pipeline "
                << PipelineStartEPPipeline.ArgStr << ": "
                << toString(std::move(Err)) << "\n";
          }
        });