Basic C++
In this blog post I will collect some notes on the basic concepts of C++ programming. I will try to keep these notes brief, and guide the reader to the appropriate resources for further details. I will keep updating this post to include more concepts and examples, as I write more C++ code.
I am using a Jupyter notebook to write this blog post, as it allows me to keep the text, code and commands in one place, and render it to this HTML page using Quarto. I use the following IPython magic commands to write files and execute shell commands:
%%writefile <path-to-file>.cpp
%%sh
Note that I am executing the commands in a Linux environment, and these commands may be different in a Windows environment. I use Windows Subsystem for Linux (WSL)1 on my Windows machine.
1 For details on WSL: https://learn.microsoft.com/en-us/windows/wsl
Hello, World!
As with any programming language, I start by printing Hello, World! to the command-line interface (CLI), using the code below:
In this blog post, I will introduce different concepts as I go, alongside the example code. I start by discussing the code above, line by line:
- Line 1: I create a file with the extension “.cpp”. You might also come across files with the extension “.cc”. However, throughout this blog I will be using the former.
- Line 2: I include the header <iostream>2 from the C++ standard library. It gives us the definition of std::cout, which I use to print to the CLI, and std::endl, which I use to print a newline character (it also flushes the output stream). std here refers to a namespace, which is a container that collects names of variables, classes, functions, etc., that are defined within it.
- Line 3: I define a function called main with the return type int, which stands for integer. I have to declare the type explicitly, as C++ is a statically typed language. This main function is where the program starts to execute. The parentheses () after main contain the list of parameters (in this case none) that are passed into the function. Next come the curly brackets {}, which contain all the statements of the function.
- Line 4: I pass the string Hello, World! followed by std::endl to the object std::cout, which then prints the contents passed into it to the CLI. The statement ends with a semicolon ;, which is how statements in C++ are terminated.
- Line 5: I return the integer 0 from the function. return is a keyword in C++, which means it is reserved by the language.
2 For details on <iostream>: https://en.cppreference.com/w/cpp/header/iostream.html
Before I can execute the code above, I need to compile it into a program. This makes C++ a compiled language. Here I compile the code and execute the program, as shown below:
Again, discussing line by line:
- Line 2: I ask g++ to take the file source/hello.cpp and output build/hello after the full compilation process.
- Line 3: I execute the program, and it prints “Hello, World!” to the CLI.
- Line 4: I check the return code of the program, which in this case is the integer 0, as I defined in the program.
Compilation Process
The compilation process has multiple steps. These include:
- Pre-processing: handles lines that start with #, such as #include and #define.
- Compiling: translates the pre-processed code to assembly code.
- Assembling: translates the assembly code to machine code, producing object files.
- Linking: links the object files and libraries (e.g., the C++ standard library that implements what <iostream> declares) to produce the final executable program.
In the pre-processing stage, the preprocessor finds the headers I have included and replaces each include directive (e.g., #include <iostream>) with the contents of that header file. I can view the result of this process by stopping the compilation process after the preprocessing stage, as shown below:
# 0 "source/hello.cpp"
# 0 "<built-in>"
# 0 "<command-line>"
# 1 "/usr/include/stdc-predef.h" 1 3 4
# 0 "<command-line>" 2
# 1 "source/hello.cpp"
# 1 "/usr/include/c++/13/iostream" 1 3
# 36 "/usr/include/c++/13/iostream" 3
# 37 "/usr/include/c++/13/iostream" 3
# 1 "/usr/include/c++/13/bits/requires_hosted.h" 1 3
# 31 "/usr/include/c++/13/bits/requires_hosted.h" 3
# 1 "/usr/include/x86_64-linux-gnu/c++/13/bits/c++config.h" 1 3
# 306 "/usr/include/x86_64-linux-gnu/c++/13/bits/c++config.h" 3
# 306 "/usr/include/x86_64-linux-gnu/c++/13/bits/c++config.h" 3
namespace std
{
typedef long unsigned int size_t;
In the shell command above, I am using the compiler g++, which belongs to the GNU Compiler Collection (GCC)3. Let’s break down the shell commands, line by line:
3 For details on GCC: https://gcc.gnu.org/
- Line 2: I ask g++ to take the file source/hello.cpp and output build/hello.ii after the preprocessing stage. The flag for asking g++ to output a file is -o and the flag for asking g++ to stop after the preprocessing stage is -E. You can view a list of available flag options using the command g++ --help.
- Line 3: I am displaying only the first 20 lines of the output file, as the file is very long. I recommend taking a look at the full file in the GitHub repository. It shows that the include statement was replaced with the contents of the header file. The third-to-last line printed above shows the definition of namespace std. Every name defined within the curly brackets {} after it belongs to the namespace std.
Similarly, I can also check the assembly code after the compiling stage. The assembly code is specific to the architecture that the code is being compiled for. In my case, it is the x86-64 architecture. Here I ask the compiler to stop after the compiling stage, as shown below:
.file "hello.cpp"
.text
#APP
.globl _ZSt21ios_base_library_initv
.section .rodata
.LC0:
.string "Hello, World!"
#NO_APP
.text
.globl main
.type main, @function
main:
.LFB1988:
.cfi_startproc
endbr64
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
Again, I am showing the first 20 lines of the output file here. The rest of the file can be found on the GitHub repository.
- Line 2: I ask g++ to take the file source/hello.cpp and output build/hello.asm after the compiling stage. The flag for asking g++ to stop after the compiling stage is -S.
If I have only the executable file and not the source file, I can still view the assembly code by disassembling the executable file, as shown below:
build/hello: file format elf64-x86-64
Disassembly of section .init:
0000000000001000 <_init>:
1000: f3 0f 1e fa endbr64
1004: 48 83 ec 08 sub rsp,0x8
1008: 48 8b 05 e1 2f 00 00 mov rax,QWORD PTR [rip+0x2fe1] # 3ff0 <__gmon_start__@Base>
100f: 48 85 c0 test rax,rax
1012: 74 02 je 1016 <_init+0x16>
1014: ff d0 call rax
1016: 48 83 c4 08 add rsp,0x8
101a: c3 ret
Disassembly of section .plt:
0000000000001020 <.plt>:
1020: ff 35 8a 2f 00 00 push QWORD PTR [rip+0x2f8a] # 3fb0 <_GLOBAL_OFFSET_TABLE_+0x8>
- Line 2: I ask objdump to take the machine code in build/hello and output the assembly code to build/hello_2.asm by disassembling it. The flag for asking objdump to disassemble the executable sections of the file is -d. The flag -M passes options to the disassembler. Here I am asking it to use Intel syntax instead of the default AT&T syntax.