Be an LLVM contributor: Writing an optimization pattern for LLVM

LLVM’s Optimizations Are Truly Impressive

LLVM performs aggressive optimizations. One of my favorites is Data Structure Elimination. For example, suppose we want to 1) compute the sum of two integers (the get_sum function), and 2) select one of two characters based on a condition argument (the get_char function):

```cpp
#include <vector>
#include <string>
using namespace std;

int get_sum(int a, int b) {
    vector<int> vec1;          // Create the first vector vec1
    vector<int> vec2;          // Create the second vector vec2
    vec1.push_back(a);         // Push the first argument to vec1
    vec2.push_back(b);         // Push the second argument to vec2
    return vec1[0] + vec2[0];  // Return the sum of them
}

// Create a similar logic for string values
char get_char(char a, char b, bool cond) {
    string str1{a};
    string str2{b};
    return cond ? str1[0] : str2[0];
}
```

The generated assembly:

```asm
get_sum(int, int):
        lea     eax, [rdi + rsi]
        ret
get_char(char, char, bool):
        mov     eax, edi
        test    edx, edx
        cmove   eax, esi
        ret
```

For get_sum, the compiler uses the lea instruction to perform the addition. For get_char, it uses cmove, which compilers usually prefer here in order to avoid potential branch misprediction penalties. Even though the vector's storage is allocated on the heap, LLVM is able to optimize both data structures away, whereas GCC fails to do so for the vector but succeeds for the string values. This is because small strings are stored on the stack (as explained in Raymond Chen’s excellent post), while the vector’s data is immediately allocated on the heap. (Strings of up to 15 characters (16 - 1 for the null character) are stored on the stack; beyond 15 characters, the string is moved to the heap.) ...

December 8, 2024 · 8 min · Khagan Khan Karimov

Do parentheses matter for better performance?

Missed optimizations in LLVM

The book “Computer Systems: A Programmer’s Perspective” warns readers: “When in doubt, put in parentheses!” Although the authors say this in the context of precedence issues, the advice applies to other cases as well. If you have ever written in a Lisp-like language, you must love them! But what happens if we put some unnecessary parentheses in our code? Let us consider the following C function (sum.c): ...

August 29, 2024 · 8 min · Khagan Khan Karimov