Basic C++
In this blog post I compile notes on basic C++ programming. My goal is to supplement my Parallel C++ blog post with more detail on the C++ programming language itself. This way, I can keep the Parallel C++ blog post focused on parallel programming concepts and refer to this post for C++ language details. I cover only the basic C++ concepts that are necessary for understanding parallel programming in C++.
I do not intend to write a complete C++ tutorial here. There are many great C++ tutorials available online. If you are new to C++, learncpp.com is an excellent resource with detailed tutorials which are easy to follow. If you are looking for a C++ reference, cppreference.com is one of the best C++ references available online.
I am using a Jupyter notebook to write this blog post, as it allows me to keep the text, code and commands in one place and compile them to this HTML page using Quarto. I am developing in a Linux environment using Windows Subsystem for Linux (WSL)1. Note that some commands may be specific to the Linux environment and may need to be adapted for other environments.
1 For details on WSL: https://learn.microsoft.com/en-us/windows/wsl
Getting Started
I start by compiling a simple C++ program to ensure that my development environment is set up correctly. I create a file named hello.cpp stored in the subdirectory source. I write this file using the IPython magic command %%writefile as follows:
I am including the header <iostream>2 from the C++ standard library. It gives my program the declarations of std::cout, which I use to print to the CLI, and std::endl, which I use to write a newline character and flush the output. Here std refers to a namespace, which is a container that collects the names of variables, classes, functions, etc. that are defined within it.
2 For details on <iostream>: https://en.cppreference.com/w/cpp/header/iostream.html
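To illustrate how a namespace keeps names apart, here is a small sketch. The namespaces english and french and the functions greeting and both are made up for this example and are not part of any library.

```cpp
#include <string>

// Two hypothetical namespaces, each defining a function called greeting().
// The identical function names do not clash because each lives in its own
// namespace.
namespace english {
    std::string greeting() { return "hello"; }
}

namespace french {
    std::string greeting() { return "bonjour"; }
}

// The scope resolution operator :: selects which greeting() is meant,
// in the same way that std::cout selects cout from the namespace std.
std::string both() {
    return english::greeting() + " / " + french::greeting();
}
```

Calling both() joins the two results, which shows that each qualified name resolves to a different function.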
C++ is a compiled language: before I can run the program, I need to compile it using a C++ compiler. The compilation process has multiple steps. These include:
- Pre-processing: handles lines that start with #, such as #include and #define.
- Compiling: translates the pre-processed code to assembly code.
- Assembling: translates the assembly code to machine code, producing object files.
- Linking: links the object files to produce the final executable program.
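To make the pre-processing step more concrete, here is a small sketch. The macro names BUFFER_SIZE and SQUARE and the function names are made up for illustration. The pre-processor works on text only, so the later stages never see the macro names, only what they were replaced with.

```cpp
#include <cstddef>  // the pre-processor pastes this header's text in here

// Object-like macro: every later occurrence of BUFFER_SIZE becomes 64.
#define BUFFER_SIZE 64

// Function-like macro: SQUARE(e) is rewritten to ((e) * (e)) as plain text.
#define SQUARE(x) ((x) * (x))

// After pre-processing this reads: static char buffer[64];
static char buffer[BUFFER_SIZE];

// Reports the size of the buffer in bytes.
std::size_t buffer_size() { return sizeof(buffer); }

// After pre-processing this reads: return ((n + 1) * (n + 1));
int square_of_next(int n) { return SQUARE(n + 1); }
```

The parentheses in the SQUARE macro matter precisely because the substitution is textual: without them, n + 1 * n + 1 would be evaluated instead.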
Machine code is low-level instructions that the computer’s CPU can execute directly and takes the form of binary code (0s and 1s). Machine code is specific to the architecture of the CPU (e.g., x86, ARM), hence a program compiled for one architecture may not run on another.
Assembly code is a low-level programming language that is one step above machine code. It uses human-readable mnemonics to represent machine instructions, each of which corresponds directly to a specific machine code instruction for a given CPU architecture. Hence, assembly code is also architecture-specific.
The C++ programming language provides an abstraction over assembly and machine code that is portable across different CPU architectures. A C++ compiler translates C++ code to assembly and machine code specific to the target architecture during the compilation process.
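One way to observe the target architecture from within portable C++ code is to inspect macros that the compiler predefines for its target. The sketch below assumes GCC or Clang, which predefine __x86_64__, __aarch64__, __i386__ and __arm__ for the respective targets; the function name target_architecture is made up for illustration.

```cpp
#include <string>

// Returns a label for the architecture this code was compiled for.
// The macros are predefined by GCC and Clang; other compilers may use
// different names, in which case the fallback branch is taken.
std::string target_architecture() {
#if defined(__x86_64__)
    return "x86-64";
#elif defined(__aarch64__)
    return "ARM64";
#elif defined(__i386__)
    return "x86 (32-bit)";
#elif defined(__arm__)
    return "ARM (32-bit)";
#else
    return "unknown";
#endif
}
```

The source stays the same on every platform, but only one of the return statements survives pre-processing, so the resulting machine code differs per target.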
I use the g++ compiler to compile C++ programs. It is part of the GNU Compiler Collection (GCC)3 and is widely used for compiling C++ code. I compile and run the hello.cpp program as follows:
3 For details on GCC: https://gcc.gnu.org/
I am doing three things here:
- Line 2: I ask g++ to take the file source/hello.cpp and output build/hello after the full compilation process.
- Line 3: I execute the program, and it prints “Hello, World!” to the CLI.
- Line 4: I check the return code of the program, which in this case is the integer 0, as I defined in my program.
Next, it is instructive to view the output of the compilation process after each step. This can be done by stopping the compilation process after each step using the appropriate flags. I can view a list of available flag options using the command g++ --help. First, I view the output after the pre-processing stage, as follows:
# 0 "source/hello.cpp"
# 0 "<built-in>"
# 0 "<command-line>"
# 1 "/usr/include/stdc-predef.h" 1 3 4
# 0 "<command-line>" 2
# 1 "source/hello.cpp"
# 1 "/usr/include/c++/13/iostream" 1 3
# 36 "/usr/include/c++/13/iostream" 3
# 37 "/usr/include/c++/13/iostream" 3
# 1 "/usr/include/c++/13/bits/requires_hosted.h" 1 3
# 31 "/usr/include/c++/13/bits/requires_hosted.h" 3
# 1 "/usr/include/x86_64-linux-gnu/c++/13/bits/c++config.h" 1 3
# 306 "/usr/include/x86_64-linux-gnu/c++/13/bits/c++config.h" 3
# 306 "/usr/include/x86_64-linux-gnu/c++/13/bits/c++config.h" 3
namespace std
{
typedef long unsigned int size_t;
Here I have two commands:
- Line 2: I ask g++ to take the file source/hello.cpp and output build/hello.ii after the pre-processing stage. The flag for asking g++ to stop after the pre-processing stage is -E.
- Line 3: I display only the first 20 lines of the output file, as the file is very long. I recommend taking a look at the full file in the GitHub repository.
The output shows that the #include <iostream> statement in the C++ code is replaced with the contents of the header file.
The third-to-last line shows the start of the definition of namespace std. Every name defined within the curly brackets {} after it belongs to the namespace std.
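The last line of the output shows one of the first names placed in this namespace: size_t, declared as an alias for long unsigned int on this platform. A minimal sketch of how that name is used from C++ code follows. The function name count_chars is hypothetical, and the underlying type of std::size_t may differ on other platforms.

```cpp
#include <cstddef>      // std::size_t
#include <type_traits>  // std::is_same

// std::size_t is the type that the sizeof operator yields.
static_assert(std::is_same<std::size_t, decltype(sizeof(int))>::value,
              "std::size_t is the result type of sizeof");

// Counts the characters of a null-terminated string, excluding the
// terminating '\0'.
std::size_t count_chars(const char* text) {
    std::size_t n = 0;
    while (text[n] != '\0') {
        ++n;
    }
    return n;
}
```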
Next, I will view the output after the compilation stage, as follows:
.file "hello.cpp"
.text
#APP
.globl _ZSt21ios_base_library_initv
.section .rodata
.LC0:
.string "Hello, World!"
#NO_APP
.text
.globl main
.type main, @function
main:
.LFB1988:
.cfi_startproc
endbr64
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
Again, I am showing the first 20 lines of the output file here. The rest of the file can be found on the GitHub repository.
- Line 2: I ask g++ to take the file source/hello.cpp and output build/hello.asm after the compiling stage. The flag for asking g++ to stop after the compiling stage is -S.
The assembly code is specific to the target architecture, which by default is that of the machine it is compiled on. In my case, it is the x86-64 architecture.
Note that if I have only the executable file and not the source file, I can still view the assembly code by disassembling the executable file, as shown below:
build/hello: file format elf64-x86-64
Disassembly of section .init:
0000000000001000 <_init>:
1000: f3 0f 1e fa endbr64
1004: 48 83 ec 08 sub rsp,0x8
1008: 48 8b 05 e1 2f 00 00 mov rax,QWORD PTR [rip+0x2fe1] # 3ff0 <__gmon_start__@Base>
100f: 48 85 c0 test rax,rax
1012: 74 02 je 1016 <_init+0x16>
1014: ff d0 call rax
1016: 48 83 c4 08 add rsp,0x8
101a: c3 ret
Disassembly of section .plt:
0000000000001020 <.plt>:
1020: ff 35 8a 2f 00 00 push QWORD PTR [rip+0x2f8a] # 3fb0 <_GLOBAL_OFFSET_TABLE_+0x8>
I use the objdump utility available in Linux for disassembling executable files.
- Line 2: I ask objdump to take the machine code file build/hello and output the assembly code build/hello_2.asm by disassembling it. The flag for asking objdump to disassemble the executable sections of the file is -d. The flag -M passes options to the disassembler. Here I am asking it to use Intel syntax instead of the default AT&T syntax.
Going forward, it will be helpful to automate the compilation process instead of typing the commands every time. To keep things simple, I use tasks in VS Code by creating a tasks.json file in the .vscode subdirectory of my project directory, as follows:
{
"version": "2.0.0",
"tasks": [
{
"label": "build",
"type": "shell",
"command": "g++ source/${fileBasename} -o build/${fileBasenameNoExtension}.exe",
"group": {
"kind": "build",
"isDefault": true
}
},
{
"label": "run",
"type": "shell",
"command": "./build/${fileBasenameNoExtension}.exe",
"group": {
"kind": "test",
"isDefault": true
}
}
]
}
I have created tasks to build and run the program. I use these in my development process, but for this blog post I will continue to show the commands in this notebook.