Software Engineering at Meta
I live in London, UK
As I mentioned in my previous post about C++, I am learning C++. It has been a bumpy ride so far, and C++ is certainly not an easy programming language to pick up! So, I thought: what better way to reinforce the learning than blogging about my journey and pinning down my experience? You now know that the reason this post exists is a bit selfish, but I am hoping it will be helpful to other folks who are going through the same thing, while also acknowledging that everyone's mental model is different. So, YMMV.
In this post, I want to share my experience of incorporating a 3rd party dependency into my own program, and understanding what goes on under the hood during the compilation and linking phases of the build process. For this purpose, I will be aiming to incorporate the gflags C++ library into my own program. This library provides support for defining and parsing command line flags.
The way to state a library dependency in your own C++ code is through the #include directive, by specifying the header file that you want to take a dependency on. As we probably know by now, the header file doesn't actually contain the implementation; it only declares the contract between the library and the consumer. We will shortly touch on how the header file gets tied to its implementation.
As we can see in the gflags documentation, the header file we want to work with is called gflags/gflags.h. That immediately raised some questions for me, and I am sure it will for you too if you happen to be a newbie in the C++ world like me. The biggest one of all is what the gflags/ folder is relative to. That will become clearer when we get to the building part. So, for now, let's assume it's magic™.
Now that we know how to take a dependency on this library within the code, here is what our sample program looks like:
#include <iostream>
#include <gflags/gflags.h>
DEFINE_string(name, "Tugberk", "Name of the person to greet");
int main(int argc, char *argv[]) {
    gflags::ParseCommandLineFlags(&argc, &argv, true);
    std::cout << "Hello " << FLAGS_name << std::endl;
}
Nothing fancy, and you can see the gflags documentation for the specifics of our usage here. The purpose of this post is not to explain that. The only reason we are using gflags here is to demonstrate how to take a dependency on an external library, and it is an easy-to-use one that won't be hard to explain.
However, one thing that's worth noting is the usage of gflags:: before the ParseCommandLineFlags function call. The gflags being referred to here is the namespace, which we are betting will be declared within the gflags.h header file. gflags::ParseCommandLineFlags is the fully-qualified reference to the function we want to invoke.
Alternatively, we could have imported the entire gflags namespace with a using directive and called ParseCommandLineFlags directly, without a namespace qualifier. As the following shows, this means that you can use anything under that namespace directly:
#include <iostream>
#include <gflags/gflags.h>
using namespace gflags;
DEFINE_string(name, "Tugberk", "Name of the person to greet");
int main(int argc, char *argv[]) {
    ParseCommandLineFlags(&argc, &argv, true);
    std::cout << "Hello " << FLAGS_name << std::endl;
}
Based on my understanding, there is nothing wrong with this in terms of the performance of the program or the compiler (I could be wrong, don't quote me on this). However, this will likely increase your chances of running into name collisions, and it will also make the code a bit harder to read (i.e. it's not immediately clear where ParseCommandLineFlags is coming from).
One other alternative is to write a using declaration only for the name you want to use:
#include <iostream>
#include <gflags/gflags.h>
using gflags::ParseCommandLineFlags;
DEFINE_string(name, "Tugberk", "Name of the person to greet");
int main(int argc, char *argv[]) {
    ParseCommandLineFlags(&argc, &argv, true);
    std::cout << "Hello " << FLAGS_name << std::endl;
}
Although this still suffers from the same problems I listed above to a certain extent, it is a bit better, especially when you are planning to use that name a few times within the same file.
The final thing I want to note within this code is the use of DEFINE_string. It's also defined within the same header file. However, that's a macro, and it is not tied to a namespace: macros are expanded by the preprocessor, which works on text before namespaces come into play. I don't have much info about macros at this stage, but wanted to touch on why it's being used in this way.
We have our implementation, which should give us a command line program where we can call hello-world --name Bob and have it print out Hello Bob for us. To be able to demonstrate different build variations, I am going to run the build within a Docker container. The configuration for this is going to be very simple. The code we have seen above will be inside the main.cpp file. To start with, we will also have a build.sh file with the following content:
#!/bin/bash
g++ -v ./main.cpp -o hello-world
-v is here to give verbose output from the compiler, which will be handy when it comes to understanding what goes on under the hood. The Dockerfile content will be as follows:
FROM ubuntu
RUN apt-get update && apt-get -y install build-essential
WORKDIR /opt/
RUN mkdir app
WORKDIR /opt/app
COPY ./ ./
RUN ./build.sh
CMD ["./hello-world", "--name=Bob"]
When I run docker build . with this setup, I'm getting an error:
...
...
Step 7/7 : RUN ./build.sh
---> Running in 39ce491a452e
Using built-in specs.
COLLECT_GCC=g++
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/9/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none:hsa
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 9.3.0-17ubuntu1~20.04' --with-bugurl=file:///usr/share/doc/gcc-9/README.Bugs --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --prefix=/usr --with-gcc-major-version-only --program-suffix=-9 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-plugin --enable-default-pie --with-system-zlib --with-target-system-zlib=auto --enable-objc-gc=auto --enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-offload-targets=nvptx-none=/build/gcc-9-HskZEa/gcc-9-9.3.0/debian/tmp-nvptx/usr,hsa --without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 9.3.0 (Ubuntu 9.3.0-17ubuntu1~20.04)
COLLECT_GCC_OPTIONS='-v' '-o' 'hello-world' '-shared-libgcc' '-mtune=generic' '-march=x86-64'
/usr/lib/gcc/x86_64-linux-gnu/9/cc1plus -quiet -v -imultiarch x86_64-linux-gnu -D_GNU_SOURCE ./main.cpp -quiet -dumpbase main.cpp -mtune=generic -march=x86-64 -auxbase main -version -fasynchronous-unwind-tables -fstack-protector-strong -Wformat -Wformat-security -fstack-clash-protection -fcf-protection -o /tmp/ccebxWeM.s
GNU C++14 (Ubuntu 9.3.0-17ubuntu1~20.04) version 9.3.0 (x86_64-linux-gnu)
compiled by GNU C version 9.3.0, GMP version 6.2.0, MPFR version 4.0.2, MPC version 1.1.0, isl version isl-0.22.1-GMP
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
ignoring duplicate directory "/usr/include/x86_64-linux-gnu/c++/9"
ignoring nonexistent directory "/usr/local/include/x86_64-linux-gnu"
ignoring nonexistent directory "/usr/lib/gcc/x86_64-linux-gnu/9/include-fixed"
ignoring nonexistent directory "/usr/lib/gcc/x86_64-linux-gnu/9/../../../../x86_64-linux-gnu/include"
#include "..." search starts here:
#include <...> search starts here:
/usr/include/c++/9
/usr/include/x86_64-linux-gnu/c++/9
/usr/include/c++/9/backward
/usr/lib/gcc/x86_64-linux-gnu/9/include
/usr/local/include
/usr/include/x86_64-linux-gnu
/usr/include
End of search list.
GNU C++14 (Ubuntu 9.3.0-17ubuntu1~20.04) version 9.3.0 (x86_64-linux-gnu)
compiled by GNU C version 9.3.0, GMP version 6.2.0, MPFR version 4.0.2, MPC version 1.1.0, isl version isl-0.22.1-GMP
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
Compiler executable checksum: 466f818abe2f30ba03783f22bd12d815
./main.cpp:2:10: fatal error: gflags/gflags.h: No such file or directory
2 | #include <gflags/gflags.h>
| ^~~~~~~~~~~~~~~~~
compilation terminated.
The command '/bin/sh -c ./build.sh' returned a non-zero code: 1
There are a few important things to call out here:
- The compiler searches for header files under /usr/include, /usr/local/include and a few others right after hitting the #include directives during its preprocessing stage.
- The error message is gflags/gflags.h: No such file or directory. That's giving us an indication that the header file with the path of gflags/gflags.h wasn't found in any of the include directories which the compiler was searching under.
This is an expected error at this stage: gflags is a 3rd party library, this is a fresh box, and we didn't install that library.
Let's pause a bit and learn some fundamentals. I kept mentioning compilation like it's a black box where you give it some input and get a compiled object back as output. Most of the time, this type of thinking will get us where we want to be. However, my aim here is to understand what's going on under the hood a bit more. When I dug a bit deeper into the build process for C++, I found out that it is broken down into three independent steps:
- Preprocessing: the preprocessor handles the preprocessor directives, such as #include and #define. After the processing of these directives, the preprocessor produces a single output.
- Compilation: the compiler takes the preprocessor's output and produces an object file from it.
- Linking: the linker takes the object files produced by the compiler and produces either a library or an executable file.
You can check out this incredible Stack Overflow answer on this topic, which explains the compilation steps of a C++ program in more depth; I copied most of what I mentioned in this section from there.
Let's install gflags according to the installation guidelines of this library, and rerun the compilation:
diff --git a/1-dependency/Dockerfile b/1-dependency/Dockerfile
index fbaeba8..58215ea 100644
--- a/1-dependency/Dockerfile
+++ b/1-dependency/Dockerfile
@@ -1,6 +1,7 @@
FROM ubuntu
RUN apt-get update && apt-get -y install build-essential
+RUN apt-get -y install libgflags-dev
WORKDIR /opt/
RUN mkdir app
If I run the docker build . command again, it still gives me an error, but this time the error is different:
COLLECT_GCC_OPTIONS='-v' '-o' 'hello-world' '-shared-libgcc' '-mtune=generic' '-march=x86-64'
/usr/lib/gcc/x86_64-linux-gnu/9/collect2 -plugin /usr/lib/gcc/x86_64-linux-gnu/9/liblto_plugin.so -plugin-opt=/usr/lib/gcc/x86_64-linux-gnu/9/lto-wrapper -plugin-opt=-fresolution=/tmp/ccjLVDaH.res -plugin-opt=-pass-through=-lgcc_s -plugin-opt=-pass-through=-lgcc -plugin-opt=-pass-through=-lc -plugin-opt=-pass-through=-lgcc_s -plugin-opt=-pass-through=-lgcc --build-id --eh-frame-hdr -m elf_x86_64 --hash-style=gnu --as-needed -dynamic-linker /lib64/ld-linux-x86-64.so.2 -pie -z now -z relro -o hello-world /usr/lib/gcc/x86_64-linux-gnu/9/../../../x86_64-linux-gnu/Scrt1.o /usr/lib/gcc/x86_64-linux-gnu/9/../../../x86_64-linux-gnu/crti.o /usr/lib/gcc/x86_64-linux-gnu/9/crtbeginS.o -L/usr/lib/gcc/x86_64-linux-gnu/9 -L/usr/lib/gcc/x86_64-linux-gnu/9/../../../x86_64-linux-gnu -L/usr/lib/gcc/x86_64-linux-gnu/9/../../../../lib -L/lib/x86_64-linux-gnu -L/lib/../lib -L/usr/lib/x86_64-linux-gnu -L/usr/lib/../lib -L/usr/lib/gcc/x86_64-linux-gnu/9/../../.. /tmp/ccg11F4K.o -lstdc++ -lm -lgcc_s -lgcc -lc -lgcc_s -lgcc /usr/lib/gcc/x86_64-linux-gnu/9/crtendS.o /usr/lib/gcc/x86_64-linux-gnu/9/../../../x86_64-linux-gnu/crtn.o
/usr/bin/ld: /tmp/ccg11F4K.o: in function `main':
main.cpp:(.text+0x27): undefined reference to `google::ParseCommandLineFlags(int*, char***, bool)'
/usr/bin/ld: /tmp/ccg11F4K.o: in function `__static_initialization_and_destruction_0(int, int)':
main.cpp:(.text+0x12e): undefined reference to `google::FlagRegisterer::FlagRegisterer<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >(char const*, char const*, char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*)'
collect2: error: ld returned 1 exit status
The command '/bin/sh -c ./build.sh' returned a non-zero code: 1
We are still not quite there yet. However, as a software engineer, you know that this is a great feeling! You made some progress, and the changes that you have just made moved you forward.
What has happened here is that the compiler was able to find the header file and preprocess the #include directives. However, where did it find it? We can look for the gflags.h file inside the container and see where it's located:
# find / -iname gflags.h
/usr/include/gflags/gflags.h
This makes more sense now, as /usr/include is one of the directories the compiler searches for header files.
The error we have received this time is coming from ld, the linker, and it indicates that there are undefined references to several objects and functions under the google namespace.
/usr/bin/ld: /tmp/ccg11F4K.o: in function `main':
main.cpp:(.text+0x27): undefined reference to `google::ParseCommandLineFlags(int*, char***, bool)'
It's worth noting where this google:: namespace comes from. This library seems to be exposed under two namespaces: google and gflags. However, it seems like any usage under the gflags namespace is eventually redirected to google.
This error is also expected, as we haven't yet told the compiler which library we want to link against. Such a library file is also known as an archive, or a static library. By convention, the filenames of static libraries start with lib and end with .a (archive, static library) on Unix/Linux (see this post for reference). We can use the -l command line option of the g++ compiler, which eventually passes it on to ld to add the archive file to the list of files to link. This option may be used any number of times. ld will search its path-list for occurrences of lib{archive}.a for every {archive} specified. Note that on Linux, ld also looks for a shared library named lib{archive}.so in each directory before falling back to lib{archive}.a.
The error output above might be confusing, since it seems like /usr/lib/gcc/x86_64-linux-gnu/9/collect2 is invoked directly, not ld. A quick search suggests that collect2 eventually calls ld, but I am not sure at this stage why and how the compiler located collect2 in the first place, and decided to call it instead of calling ld directly. For simplicity, I will ignore collect2 for the rest of the post, and only mention ld.
With this in mind, we should be able to complete our compilation journey by passing the -lgflags option to the g++ compiler:
#!/bin/bash
g++ -v ./main.cpp -lgflags -o hello-world
Now, let's run docker build . with this setup:
...
...
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
ignoring duplicate directory "/usr/include/x86_64-linux-gnu/c++/9"
ignoring nonexistent directory "/usr/local/include/x86_64-linux-gnu"
ignoring nonexistent directory "/usr/lib/gcc/x86_64-linux-gnu/9/include-fixed"
ignoring nonexistent directory "/usr/lib/gcc/x86_64-linux-gnu/9/../../../../x86_64-linux-gnu/include"
#include "..." search starts here:
#include <...> search starts here:
/usr/include/c++/9
/usr/include/x86_64-linux-gnu/c++/9
/usr/include/c++/9/backward
/usr/lib/gcc/x86_64-linux-gnu/9/include
/usr/local/include
/usr/include/x86_64-linux-gnu
/usr/include
End of search list.
GNU C++14 (Ubuntu 9.3.0-17ubuntu1~20.04) version 9.3.0 (x86_64-linux-gnu)
compiled by GNU C version 9.3.0, GMP version 6.2.0, MPFR version 4.0.2, MPC version 1.1.0, isl version isl-0.22.1-GMP
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
Compiler executable checksum: 466f818abe2f30ba03783f22bd12d815
COLLECT_GCC_OPTIONS='-v' '-o' 'hello-world' '-shared-libgcc' '-mtune=generic' '-march=x86-64'
as -v --64 -o /tmp/ccZZScyH.o /tmp/ccvjxfiH.s
GNU assembler version 2.34 (x86_64-linux-gnu) using BFD version (GNU Binutils for Ubuntu) 2.34
COMPILER_PATH=/usr/lib/gcc/x86_64-linux-gnu/9/:/usr/lib/gcc/x86_64-linux-gnu/9/:/usr/lib/gcc/x86_64-linux-gnu/:/usr/lib/gcc/x86_64-linux-gnu/9/:/usr/lib/gcc/x86_64-linux-gnu/
LIBRARY_PATH=/usr/lib/gcc/x86_64-linux-gnu/9/:/usr/lib/gcc/x86_64-linux-gnu/9/../../../x86_64-linux-gnu/:/usr/lib/gcc/x86_64-linux-gnu/9/../../../../lib/:/lib/x86_64-linux-gnu/:/lib/../lib/:/usr/lib/x86_64-linux-gnu/:/usr/lib/../lib/:/usr/lib/gcc/x86_64-linux-gnu/9/../../../:/lib/:/usr/lib/
COLLECT_GCC_OPTIONS='-v' '-o' 'hello-world' '-shared-libgcc' '-mtune=generic' '-march=x86-64'
/usr/lib/gcc/x86_64-linux-gnu/9/collect2 -plugin /usr/lib/gcc/x86_64-linux-gnu/9/liblto_plugin.so -plugin-opt=/usr/lib/gcc/x86_64-linux-gnu/9/lto-wrapper -plugin-opt=-fresolution=/tmp/cc10fZHH.res -plugin-opt=-pass-through=-lgcc_s -plugin-opt=-pass-through=-lgcc -plugin-opt=-pass-through=-lc -plugin-opt=-pass-through=-lgcc_s -plugin-opt=-pass-through=-lgcc --build-id --eh-frame-hdr -m elf_x86_64 --hash-style=gnu --as-needed -dynamic-linker /lib64/ld-linux-x86-64.so.2 -pie -z now -z relro -o hello-world /usr/lib/gcc/x86_64-linux-gnu/9/../../../x86_64-linux-gnu/Scrt1.o /usr/lib/gcc/x86_64-linux-gnu/9/../../../x86_64-linux-gnu/crti.o /usr/lib/gcc/x86_64-linux-gnu/9/crtbeginS.o -L/usr/lib/gcc/x86_64-linux-gnu/9 -L/usr/lib/gcc/x86_64-linux-gnu/9/../../../x86_64-linux-gnu -L/usr/lib/gcc/x86_64-linux-gnu/9/../../../../lib -L/lib/x86_64-linux-gnu -L/lib/../lib -L/usr/lib/x86_64-linux-gnu -L/usr/lib/../lib -L/usr/lib/gcc/x86_64-linux-gnu/9/../../.. /tmp/ccZZScyH.o -lgflags -lstdc++ -lm -lgcc_s -lgcc -lc -lgcc_s -lgcc /usr/lib/gcc/x86_64-linux-gnu/9/crtendS.o /usr/lib/gcc/x86_64-linux-gnu/9/../../../x86_64-linux-gnu/crtn.o
COLLECT_GCC_OPTIONS='-v' '-o' 'hello-world' '-shared-libgcc' '-mtune=generic' '-march=x86-64'
Removing intermediate container ce5a3c257fe2
---> 455abaa9d2d9
Step 9/9 : CMD ["./hello-world", "--name=Bob"]
---> Running in 2b17e00b3210
Removing intermediate container 2b17e00b3210
---> cc8ae20c8aa8
Successfully built cc8ae20c8aa8
Build passed! If we look at the compiler output, we can see that the -lgflags option is passed to the linker: it appears in the collect2 invocation, right after the temporary object file (/tmp/ccZZScyH.o -lgflags).
Based on the information we have about the linker, and with the -lgflags option being passed to it now, we know that the linker is looking for the libgflags.a static library file to use as part of the linking process. Where did it find it though, and how did it know to look there in the first place? Let's look for that file within the container:
$ docker run -it cc8ae20c8aa8 /bin/sh
# find / -iname libgflags.a
/usr/lib/x86_64-linux-gnu/libgflags.a
That file exists under the /usr/lib/x86_64-linux-gnu folder. This is the folder where architecture-specific libraries live on Ubuntu. If we also look at what's being passed to the linker through the -L command line option, which adds a path to the list of paths that ld will search for archive libraries and ld control scripts, we will see that /usr/lib/x86_64-linux-gnu is already being passed.
Nice, the C++ build process is now making more sense to me.
Just to make sure things are working as expected, I will run the container I have just built.
$ docker run cc8ae20c8aa8
Hello Bob
$ docker run cc8ae20c8aa8 ./hello-world --name=Alice
Hello Alice
It works as expected.
These are the resources I benefited from while writing this post. It's only fair that I give them some credit. They might not be entirely beneficial to you though: