Posts

SPO600 Closing Remarks (Update to Function Clone Pruning Analysis)

Image
With the fall semester coming to an end and my SPO600 class coming to a close, here are some updates as to what I've been able to accomplish since my previous blog post. Updates to Function Clone Pruning Analysis Until now, I haven't been able to find a good way to properly compare function clones with my method of identifying functions through accumulating their GIMPLE statement types. This is because for each call to execute() on a function, the GIMPLE body of all functions in the call graph changes, so I am unable to compare their bodies accurately. What I've noticed is that when looping through all function bodies, only the function in question that was passed to execute() has the full body GIMPLE body ready while all other functions look incomplete. The images I show below are snippets from my pass dump on an example of a test program whose clones should be pretty much identical, and so should be marked for "pruning." Here is the pass dump for the ...

Identifying the Essence of a Function Clone

Image
The first step in developing Automatic Function Multi-versioning (AFMV) is to identify whether the clones of a function are essentially the same or not. The easiest way we could go about doing this is to try to go through each function clone's GIMPLE statements line by line and compare them against other clones. If no differences are found, then we can assume that these clones are essentially the same. The Issue One issue we have with this though is that functions that are supposed to be "essentially" the same semantically look different after being converted to GIMPLE. As you can see here, these lines look semantically different because they are using different variables names, etc.; however, when you look a bit closer you may notice that these lines are pretty much logically the same. This "logical essence" is what we want to try compare between clones, but how do we identify this? The Solution After thinking and doing some experimentation, I...

Understanding Compiler Passes

Contributing to the GCC compiler requires that I understand the compilation process and the many source code passes it makes in order to produce the final executable. I only have a basic idea about the compilation process, so it was quite tricky for me to understand how compilation passes work. Our professor directed us to try out the -fdump-tree-all and -fdump-rtl-all compiler options, so I did just that. From what I understand so far (and with the help of ChatGPT) the source code goes through a lot of compilation passes before it turns into the fully compiled executable. We can use the -fdump-tree-all option to see the intermediate representation (IR) of our code during each compilation pass. We can also use -fdump-tree-<pass> to get just the singular dump of a pass let's take this simple "Hello World" program in C for example. #include <stdio.h> int main() { printf("Hello World!\n"); } and compile it with gcc using the -fdum...

Building GCC

In my last blog, I talked about contributing to the GCC compiler by helping out with the Automatic Function Multi-versioning (AFMV) for aarch64. In order to do that I must first learn how to build GCC myself. Unfortunately, since I do not have an aarch64 machine, I have to rely on the aarch64 systems my professor has setup in order to build/work with GCC. The system I decided to start my GGC build is an ARMv8 machine with 4 cores and 16 GiB ram. Now to start, the first step we must do is to clone the GCC git repository. Following the steps from my class wiki page I cloned the repo to my spo600 directory using the following command. $ git clone git://gcc.gnu.org/git/gcc.git ~/spo600/ After cloning, the next step is to create a build directory and configure our build to use that directory. Here are the commands I used to do so. $ mkdir ~/spo600/gcc-build-001 $ cd ~/spo600/gcc-build-001 $ ~/spo600/gcc/configure --prefix=$HOME/spo600/gcc-build-001 Now we can start the build ...

Assembly in the real world

Image
For the past couple of weeks, I have been working on various assembly labs for my Software Portability and Optimization (SPO600) class. We started with 6502 assembly then moved to 64-bit assembly for the aarch64 and x86_64 architectures. While those labs were fun, and albeit very challenging, it was time to take these skills and apply them to actual real world problems. The GCC compiler is a popular open source compiler that was initially made for the C programming language but now has evolved to support other languages. It has a lot of optimization and porting capabilities that allow a program to run an various architectures. A couple of these porting features include indirect functions (IFUNC) and function multi-versioning (FMV) . IFUNC is a feature that allows you to create various implementations of a function and a resolver function to specify which functions will actually get compiled during the compilation process. This allows us to create different versions of the same ...

Lab 04 - Diving into 64-bit Assembly

Image
After having worked with 6502 Assembly, we now move on to the more complex 64-bit Assembly that is being used by the majority of computers today. In this lab we explore the 64-bit assembly language used in both aarch64 (ARM) and x86_64 (Intel/AMD) architectures. We will explore their differences, similarities, as well as the tradeoffs for using either. The task for this lab was to develop a simple program that prints the following output: Loop: 0 Loop: 1 Loop: 2 Loop: 3 Loop: 4 Loop: 5 Loop: 6 Loop: 7 Loop: 8 Loop: 9 Loop: 10 Loop: 11 Loop: 12 Loop: 13 Loop: 14 Loop: 15 Loop: 16 Loop: 17 Loop: 18 Loop: 19 Loop: 20 Loop: 21 Loop: 22 Loop: 23 Loop: 24 Loop: 25 Loop: 26 Loop: 27 Loop: 28 Loop: 29 Here is program I developed to do this in Aarch64 assembly. .text .globl _start min = 0 max = 30 _start: mov x19, min // Store to x19 the value of min (0) loop: /* ... body of the loop ... do something useful here ... */ // write() syscall takes 3 args; file descr...

Lab 03 - Breakout Game

Image
This lab is quite special compared to previous ones, as it is more subjective and taps into our creativity. We were tasked with creating any program we want, as long as it meets these criteria: The program must work in the 6502 Emulator It must output to the character screen as well as the graphics (bitmapped) screen. It must accept user input from the keyboard in some form. It must use some arithmetic/math instructions (to add, subtract, do bitwise operations, or rotate/shift) Some suggestions given by our professor were to make either a simple game or a simple calculator/converter. With that in mind, I decided to remake the 1976 arcade game Breakout. What is Breakout? Based on its Wikipedia description , Breakout is a 1967 arcade game developed and published by Atari, Inc. It involves some bricks, a paddle, and a bouncing ball. The goal is to destroy all bricks with a bouncing ball, using the paddle as a sort of tool to keep the ball in play. You lose when the ...