Project Stage 1 Overview
In this project, we will extend the compiler with automatic features that let a single binary work on more platforms. In the first stage, we add a basic GCC pass that walks over the code being compiled, prints each function's name, counts the basic blocks in each function, and reports the total number of GIMPLE statements in each function.
Step 1: Environment Setup
We will use three separate directories to manage our GCC setup and the modifications made in lab4:
git/gcc Directory – This is the source code directory where the GCC source files are stored.
gcc-build-001 Directory – This is the build directory where the compilation process takes place.
gcc-test-001 Directory – This is the installation directory where the final GCC binaries and related files are placed after a successful build.
Step 2: Understanding the Files and Procedure to Add Our Own Pass:
To add a custom GCC pass, we need to modify four key files and include some additional configurations:
tree-your-own-pass.cc – Defines the logic of our GCC pass, specifying how it manipulates or analyzes the code during the compilation process.
passes.def – Registers the pass with the compiler so it gets executed at the correct point in the compilation pipeline.
tree-pass.h – Declares the function that creates our pass so that the rest of the compiler can see it. The position of the declaration within the file does not matter, but the declaration itself is required.
Makefile.in – Adds our pass to the build system so that the new file is compiled and linked during the GCC build.
Note: To decide where our pass should run in the GCC pipeline, we need to understand when different kinds of transformations happen:
1. Early Passes (Source Code and Initial Transformations)
At this stage, the code is still close to its original form or in an early intermediate representation. To analyze or change the code this early, we would insert our pass near the start of the pipeline. However, GIMPLE only exists once the front end's output has been lowered (gimplified), so a GIMPLE pass cannot run before that point.
2. GIMPLE Representation
GIMPLE is a simplified form of the code that GCC uses for optimization, and it is much easier to work with than the original source code. A pass that analyzes or changes GIMPLE must be added after GIMPLE has been generated but before the later optimization steps.
3. Late Passes (After Optimizations)
If we want our pass to see the code after optimizations such as loop unrolling or hoisting, we need to add it later in the pipeline. We should not place it too late, though, because the code may already be in its final form and harder to change.
4. Avoid Final Stages
If we place our pass at the point where the code is being translated to RTL or machine code, it could cause problems. At these final stages the code is close to what the machine will run, and modifying it may not be useful.
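The placement rules above can be made concrete with a small stand-alone model. The sketch below is not GCC code: toy_pass, the TOY_PROP_* flags, and the four-pass pipeline are hypothetical stand-ins, loosely inspired by the properties_required and properties_provided fields in the pass_data structure shown later in this post. It finds the earliest slot at which everything a pass requires has been established by earlier passes.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-ins for property flags (names and values are hypothetical).
enum toy_prop : unsigned {
  TOY_PROP_GIMPLE = 1u << 0,  // code has been lowered to GIMPLE
  TOY_PROP_CFG    = 1u << 1,  // basic blocks / control-flow graph exist
  TOY_PROP_RTL    = 1u << 2   // code has been lowered to RTL
};

struct toy_pass {
  std::string name;
  unsigned required;  // properties that must already hold
  unsigned provided;  // properties this pass establishes
};

// Return the earliest index at which `pass` could be inserted so that all of
// its required properties were provided by earlier passes.
// Returns pipeline.size() + 1 if no position satisfies the requirement.
std::size_t earliest_slot(const std::vector<toy_pass> &pipeline,
                          const toy_pass &pass) {
  unsigned available = 0;
  for (std::size_t i = 0; i <= pipeline.size(); ++i) {
    if ((available & pass.required) == pass.required)
      return i;  // insert before pipeline[i] (or at the end)
    if (i < pipeline.size())
      available |= pipeline[i].provided;
  }
  return pipeline.size() + 1;
}

// A hypothetical four-pass pipeline used only for illustration.
std::vector<toy_pass> toy_pipeline() {
  return {
    {"lower_to_gimple", 0, TOY_PROP_GIMPLE},
    {"build_cfg", TOY_PROP_GIMPLE, TOY_PROP_CFG},
    {"optimize", TOY_PROP_CFG, 0},
    {"expand_to_rtl", TOY_PROP_CFG, TOY_PROP_RTL},
  };
}
```

In this model, a pass that requires TOY_PROP_CFG can go no earlier than slot 2, directly after the hypothetical build_cfg pass, while a pass with no requirements could go at slot 0.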
Step 3: Creating a New GCC Pass
To create a new GCC pass, we need to follow these steps:
1. Navigate to the GCC Source Code:
First, navigate to the directory where the GCC source code is located by running the command:
cd ~/git/gcc/gcc
2. Create Our Own Pass File:
Create a new file for our pass using a text editor like nano. This is where we will define the logic for our custom pass. Run the command to create the file:
nano tree_zwang331_pass.cc
Note: In GCC, "trees" are special data structures that the compiler uses to represent the program in a simpler form as it works through the compilation steps. Source files whose names begin with tree generally contain code that works at this level, which is why our pass file follows the same naming pattern.
3. Build Our Pass Logic Inside the File:
Now, within the tree_zwang331_pass.cc file, we define the actual logic of the pass:
#include "config.h"
#include "system.h"
#include "coretypes.h"
#include "backend.h"
#include "tree.h"
#include "gimple.h"
#include "pass_manager.h"
#include "context.h"
#include "diagnostic-core.h"
#include "tree-pass.h"
#include "ssa.h"
#include "tree-pretty-print.h"
#include "internal-fn.h"
#include "gimple-iterator.h"
#include "gimple-walk.h"
#include "tree-core.h"
#include "basic-block.h"
// Added headers:
#include "gimple-ssa.h"
#include "cgraph.h"
#include "attribs.h"
#include "pretty-print.h"
#include "tree-inline.h"
#include "intl.h"
#include "dumpfile.h"
#include "builtins.h"

namespace {

// Static description of the pass, read by the pass manager
const pass_data pass_data_zwang331 = {
  GIMPLE_PASS,   /* type */
  "zwang331",    /* name (also used in -fdump-tree-zwang331) */
  OPTGROUP_NONE, /* optinfo_flags */
  TV_NONE,       /* tv_id */
  PROP_cfg,      /* properties_required */
  0,             /* properties_provided */
  0,             /* properties_destroyed */
  0,             /* todo_flags_start */
  0,             /* todo_flags_finish */
};

class pass_zwang331 : public gimple_opt_pass {
public:
  pass_zwang331(gcc::context *ctxt)
    : gimple_opt_pass(pass_data_zwang331, ctxt) {}

  bool gate(function *) final override {
    return true; // Always run the pass
  }

  unsigned int execute(function *fun) final override;
}; // class pass_zwang331

// Called once per function; all output goes to the dump file (if enabled)
unsigned int
pass_zwang331::execute(function *fun) {
  int bb_count = 0;
  int gimple_stmt_count = 0;

  if (dump_file) {
    fprintf(dump_file, "=== Function: %s ===\n", function_name(fun));
  }

  // Iterate over basic blocks
  basic_block bb;
  FOR_EACH_BB_FN(bb, fun) {
    bb_count++;
    int bb_gimple_count = 0;

    // Count the GIMPLE statements in this block
    for (gimple_stmt_iterator gsi = gsi_start_bb(bb); !gsi_end_p(gsi); gsi_next(&gsi)) {
      bb_gimple_count++;
    }
    gimple_stmt_count += bb_gimple_count;

    // bb_count is a running ordinal (1, 2, 3, ...), not GCC's internal block index
    if (dump_file) {
      fprintf(dump_file, "Basic Block %d contains %d GIMPLE statements.\n", bb_count, bb_gimple_count);
    }
  }

  if (dump_file) {
    fprintf(dump_file, "Total Basic Blocks: %d\n", bb_count);
    fprintf(dump_file, "Total GIMPLE Statements: %d\n", gimple_stmt_count);
  }
  return 0;
}

} // anonymous namespace

// Factory function: the only symbol from this file visible to the rest of GCC
gimple_opt_pass *
make_pass_zwang331 (gcc::context *ctxt)
{
  return new pass_zwang331 (ctxt);
}
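Note: FOR_EACH_BB_FN, used above, walks the basic blocks of a single function. We do not need to loop over functions ourselves, because the pass manager calls execute() once for each function. For comparison, FOR_EACH_FUNCTION is a macro found in cgraph.h that iterates through all the function nodes in the call graph, beginning with the first function and moving to the next one by using the symbol table (symtab).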
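The counting logic in execute() can be tried outside GCC with a small stand-in model. The sketch below assumes nothing from the GCC headers: toy_function, toy_block, toy_stmt, and count_function are hypothetical names, with plain vectors playing the role of the basic-block list and the statement iterator.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-ins: a "statement" is just text, a "block" is a list of
// statements, and a "function" is a name plus a list of blocks.
using toy_stmt = std::string;
using toy_block = std::vector<toy_stmt>;

struct toy_function {
  std::string name;
  std::vector<toy_block> blocks;
};

struct toy_counts {
  int blocks = 0;
  int stmts = 0;
};

// Mirror of the pass's nested loops: the outer loop visits every block,
// the inner loop visits every statement in that block.
toy_counts count_function(const toy_function &fun) {
  toy_counts totals;
  for (const toy_block &bb : fun.blocks) {  // role of FOR_EACH_BB_FN
    ++totals.blocks;
    for (const toy_stmt &stmt : bb) {       // role of the gsi loop
      (void) stmt;                          // only counting, not inspecting
      ++totals.stmts;
    }
  }
  return totals;
}
```

An empty block still counts as a block in this model, just as an empty basic block would increase bb_count in the pass without adding to the statement total.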
Step 4: Include Our New Pass in passes.def
The passes.def file lists the passes in the order they run, using macros such as NEXT_PASS. We need to decide where our pass belongs in that sequence. Very early passes deal with code close to the source, while later passes deal more with the output code. We will place our pass after most of the optimizations but before the final wrap-up stage. That way it can analyze and modify the code after the important optimizations have been applied, but before it is too late to make useful changes.
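In passes.def, an entry for our pass would take the form NEXT_PASS (pass_zwang331); placed on its own line at the chosen position. To show how a macro list like this can define an ordering, here is a toy, self-contained sketch. It is not how GCC itself processes passes.def (the real machinery is more involved), and pass_toy_before and pass_toy_after are hypothetical neighbours, not real GCC passes.

```cpp
#include <string>
#include <vector>

// A toy pass list in the style of passes.def. Only pass_zwang331 comes from
// this post; the two neighbours are hypothetical placeholders.
#define TOY_PASS_LIST          \
  NEXT_PASS (pass_toy_before)  \
  NEXT_PASS (pass_zwang331)    \
  NEXT_PASS (pass_toy_after)

// Expand the list into an ordered vector of pass names. Each NEXT_PASS entry
// becomes one push_back, so the order in the list is the order in the vector.
std::vector<std::string> toy_pass_order() {
  std::vector<std::string> order;
#define NEXT_PASS(NAME) order.push_back(#NAME);
  TOY_PASS_LIST
#undef NEXT_PASS
  return order;
}
```

The point of the sketch is that moving one line in the list is all it takes to move a pass earlier or later in the sequence.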
Step 5: Include Our New Pass in tree-pass.h
We need to declare our pass's factory function, make_pass_zwang331, in tree-pass.h so that the rest of the compiler can see it and the pass can be registered and executed during compilation.
Note: The order in which we add the declaration in tree-pass.h and the registration in passes.def doesn't matter much, as long as both steps are completed correctly.
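Based on the definition at the end of our pass file, the declaration in tree-pass.h would be a single line: extern gimple_opt_pass *make_pass_zwang331 (gcc::context *ctxt); The reason a declaration is needed at all is that the pass class lives in an anonymous namespace, so the factory function is the only way other files can reach it. The sketch below illustrates that pattern with hypothetical toy_ types in place of GCC's classes.

```cpp
#include <string>

// --- What a shared header would contain (tree-pass.h plays this role) ---
class toy_opt_pass {
public:
  virtual ~toy_opt_pass() = default;
  virtual std::string name() const = 0;
};

// Declaration only: other files can call this without seeing the class.
toy_opt_pass *make_toy_pass_zwang331();

// --- What the pass's own .cc file would contain ---
namespace {
// Hidden from every other translation unit by the anonymous namespace.
class toy_pass_zwang331 : public toy_opt_pass {
public:
  std::string name() const override { return "zwang331"; }
};
} // anonymous namespace

// The factory is the single externally visible entry point.
toy_opt_pass *make_toy_pass_zwang331() {
  return new toy_pass_zwang331();
}
```

A caller only ever holds a pointer to the base class, which is why the header needs the factory declaration but not the pass class itself.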
Step 6: Update the Build System
Now we have to modify the build system by updating Makefile.in so that the new pass is compiled and linked during the GCC build.
Note: We don't need to write a dedicated build rule for the new file. Once its object file (the source file name with .o in place of .cc) is listed among the compiler's objects in Makefile.in, it is built automatically along with everything else.
Step 7: Recreating the Makefile
Now we need to go to our build directory and recreate the Makefile, since GCC doesn't automatically detect changes made to Makefile.in. This step ensures that the modifications we made to the build system, such as adding our new pass, are recognized.
Remove the existing Makefile:
rm Makefile
Then build GCC again, saving the output to a log:
time make -j 20 |& tee rebuild.log
Finally, add the bin directory of our gcc-test-001 folder to the front of the PATH environment variable:
PATH="$HOME/gcc-test-001/bin:$PATH"
Note: For more information on building GCC, you can click here.
Step 8: Testing our Pass
Now we can compile our hello.c test file with the new compiler:
gcc -g -O0 -fno-builtin -fdump-tree-zwang331 -o hello hello.c
Note:
-O0 Disables optimization, so the generated assembly code closely matches the original C source code.
-fno-builtin Disables the use of builtin functions provided by GCC (e.g., printf, memcpy, strlen).
-fdump-tree-zwang331 Tells GCC to write a dump file for our zwang331 pass. Our pass only prints when dump_file is set, so without this option it produces no output.
(hello.c file)
Output Result:
(Output in aarch64 server)
Reflection
While configuring and building GCC, I ran into a few challenges. One of the biggest was an error message saying that CC (the C compiler) and CXX (the C++ compiler) had changed since the previous run. I still do not know what caused it, but it stopped the build from continuing, and at first I didn't know how to fix it. I found that I needed to start over by cleaning the build directory and removing the old configuration files. Running make distclean reset the build process and got rid of whatever might have been causing the problem. In the end, solving these problems was a great learning experience that gave me a deeper understanding of how GCC works.