
Data types
What's the point of having data types at all? Why can't we program in C++ using some var keyword to declare variables and forget about types such as short, long, int, char, wchar_t, and so on? Well, C++ does support a similar construct, the auto keyword that we have already used previously in this chapter, a so-called placeholder type specifier. It's named a placeholder because it merely stands in for the real type, which the compiler deduces and fixes when the variable is declared. We cannot (and we must not ever be able to) declare a variable and then change its type during runtime. The following code might be valid JavaScript code, but it is definitely not valid C++ code:
var a = 12;
a = "Hello, World!";
a = 3.14;
Imagine the C++ compiler could compile this code. How many bytes of memory should be allocated for the a variable? When declaring var a = 12;, the compiler could deduce its type to be int and reserve (typically) 4 bytes of memory, but when the variable changes its value to Hello, World!, the compiler has to reallocate the space or invent a new hidden variable named a1 of type std::string. Then the compiler has to find every place in the code that accesses the variable as a string, and not as an integer or a double, and replace the variable with the hidden a1. The compiler might just quit and start to ask itself the meaning of life.
We can declare something similar to the preceding code in C++ as follows:
auto a = 12;
auto b = "Hello, World!";
auto c = 3.14;
The difference between the previous two examples is that the second example declares three different variables of three different types. The previous non-C++ code declared just one variable and then assigned values of different types to it. You can't change the type of a variable in C++, but the compiler allows you to use the auto placeholder and deduces the type of the variable from the value used to initialize it.
It is crucial to understand that the type is deduced at compile time, while languages such as JavaScript determine the type of a value at runtime. The latter is possible because such programs are run in environments such as virtual machines, while the only environment that runs the C++ program is the OS. The C++ compiler must generate a valid executable file that could be copied into the memory and run without a support system. This forces the compiler to know beforehand the actual size of the variable. Knowing the size is important to generate the final machine code, because accessing a variable requires its address and size, and allocating memory space for a variable requires the number of bytes that it should take.
The C++ type system classifies types into two major categories:
- Fundamental types (int, double, char, void)
- Compound types (pointers, arrays, classes)
The standard library even provides special type traits, std::is_fundamental and std::is_compound, to find out the category of a type, for example:
#include <iostream>
#include <type_traits>

struct Point {
    float x;
    float y;
};

int main() {
    std::cout << std::is_fundamental_v<Point> << " "
              << std::is_fundamental_v<int> << " "
              << std::is_compound_v<Point> << " "
              << std::is_compound_v<int> << std::endl;
}
We used std::is_fundamental_v and std::is_compound_v helper variable templates, defined as follows:
template <class T>
inline constexpr bool is_fundamental_v = is_fundamental<T>::value;

template <class T>
inline constexpr bool is_compound_v = is_compound<T>::value;
The program outputs: 0 1 1 0.
Most of the fundamental types are arithmetic types such as int or double; even the char type is arithmetic. It actually holds a number rather than a character, for example:
char ch = 65;
std::cout << ch; // prints A
A char variable holds 1 byte of data, which means it can represent 256 different values (because 1 byte is 8 bits, and 8 bits can be arranged in 2^8 = 256 ways). What if we use one of the bits as a sign bit, for example, allowing the type to support negative values as well? That leaves us with 7 bits for representing the actual value, and following the same logic, it allows us to represent 2^7 different values, that is, 128 non-negative values (0 to 127) and the same number of negative values. With the two's complement representation used by mainstream platforms (and required since C++20), this gives the range -128 to +127 for the signed char. This signed versus unsigned representation applies to almost all integral types.
So whenever you encounter that, for example, the size of an int is 4 bytes, which is 32 bits, you should already know that it is possible to represent the numbers 0 to 2^32 - 1 in an unsigned representation, and the values -2^31 to 2^31 - 1 in a signed representation.