This is the third and largest part of the small book named 'Understanding C++ data types'. This part covers the 'integer' data type.
Integers
What are they
If you look in a dictionary for a definition of integers you will find something like this: “whole number: any positive or negative whole number or zero”. This is the definition you probably had in your elementary school textbooks.
I hope that after the above definition, you agree that integers are numbers, like 12, 3, 2134, 0, -230, etc.
There are five types of integers: int, short, long, char and bool. All of them except bool can also be of two types, signed or unsigned. That makes eight signed and unsigned variants, plus bool, for you to use.
Data types don’t always have the same size, because the size depends on the machine and the compiler. We talked about this in the introduction of this book, but I’m reminding you: if you use the code in this book with a compiler other than Microsoft Visual C++ 5 or 6, or on an operating system other than Windows (NT, 95, 98, Me, 2000, XP, 2003), you might get different results and end up confused. If you compile most of our example code with an older compiler like Borland C++ 3.1, you will notice big differences. We will mention some of them, but not all.
Before we move on
The following lessons will help you understand and use the different data types. To follow them, you must know what bits and bytes are and, more generally, how the computer stores data.
int
The int data type is used often. It stores whole numbers only. It has a size of 4 bytes on Windows with VC++ 5.0 and 6.0. On the old Borland C++ 3.1 for DOS, int has a size of 2 bytes. However, that’s not our subject here.
As you should know, a byte holds 8 bits. An int occupies 4 bytes, which means 32 bits (4 bytes * 8 bits). Again, as you should already know, 32 bits give 4294967296 different combinations, because each bit can be 0 or 1:
number of combinations with 32 bits = 2^32
(that means ‘2 to the power of 32’, a product of 32 twos). And the result is:
2^32 = 2*2*...*2 (32 factors) = 4294967296
That means that you can store any number from 0 to 4294967295 in an int data type. Why 4294967295 and not 4294967296? Because zero is a number too, and it occupies one of the places in this int data type. When programming, you must get used to counting 0 everywhere, just like 1 and 12 and any other number.
For example, let’s imagine there would be a data type that can store 6 numbers. That means that it can store the numbers from 0 to 5.
0, 1, 2, 3, 4, 5 – these are 6 numbers
With int we can store numbers from 0 to 4294967295. Yet, not exactly. Remember, there are negative and positive numbers. If we use int for storing numbers from 0 to 4294967295 and we need a negative number, what shall we do?
Positive / Negative
We have reached the signed / unsigned chapter, in which you’ll learn how negative numbers are represented. I need an example to demonstrate how signed / unsigned data types work. Because you have just learned about the int data type, let’s use it.
We said that an int occupies 4 bytes, that is 32 bits, and 32 bits mean 2^32 different values. Moreover, 2^32 equals 4294967296, so we get 4294967296 numbers starting from zero (zero is counted with the positive ones). Earlier we asked ourselves ‘How can we represent negative numbers?’. Well, let’s think… negative numbers are the same as positive numbers, except for the ‘-’ (minus) sign in front of them. 12 is the same as -12 except for the ‘-’ sign, right? What if we take one bit from the 32 that an int has, and use it to denote whether the number is negative or positive? That is, in essence, how programming languages tell negative and positive numbers apart. If we take that one bit, we have 31 left (32 bits - 1 bit) for representing the number itself. Because a bit can have two values (1 and 0, true and false…) we have 2^31 different values for each sign.
2^31 = 2*2*...*2 (31 factors) = 2147483648
That is half of what we had earlier. But now we have two groups of 2147483648 values each, one with plus and one with minus, because the bit we reserved says which sign applies. Therefore, we can hold values from -2147483648 to +2147483647. If you are asking yourself why not +2147483648 rather than +2147483647, remember what we just said: 0 is counted with the positive numbers, so it takes one of their 2147483648 places. Hope we cleared that up once and for all.
Finally, you can see we have again the same amount of 4294967296 values, but now they are divided in two equal parts, positive and negative.
Signed / Unsigned
You just learned what signed and unsigned data types are. “I did?” you may ask… Yes, you did. Data types that hold both negative and positive values are signed; data types that hold only positive values are unsigned. Let’s review.
Signed values are the ones that have that sign bit, + or -, which splits the 4294967296 values into two halves. The int, short and long types are signed by default, so there is no need to specify it. Plain char is the exception: whether it is signed depends on the compiler.
Unsigned data types are the ones that hold only positive values. An unsigned int can hold 4294967296 values (numbers from 0 to 4294967295). Why don’t unsigned data types need a sign bit to denote that they are positive, so that they can use all 32 bits for the number? Because if a number is not negative, what else can it be? Positive, of course. You usually don’t write a + in front of positive numbers… because numbers without a sign (unsigned) are positive by default, agree?
Unlike signed, unsigned is not the default in programming languages. If you want your data type to be unsigned, just specify it. You will see in the next lesson how this can be done.
Unsigned int
I just said that int is signed by default, which means that we don’t need to add anything to the declaration:
int x;
We have just defined a signed int variable named x. It’s signed because we didn’t specify otherwise, and signed is the default for the int type. Note that the type name itself cannot be left out: writing only x; does not declare a variable in C++.
Of course, if you want, you can specify this, but it’s definitely a waste of code and time:
signed int x;
This variable can hold any number from -2147483648 to +2147483647.
However, in this lesson we talk about how to define an unsigned int variable. I believe you already guessed how this is done:
unsigned int y;
Because int is assumed when you write only signed or unsigned, you can also write:
unsigned y;
This variable, y, can hold any number from 0 to 4294967295.
INT_MIN / INT_MAX / sizeof
How can you find the minimum and maximum numbers an int data type can hold? You might say that you already know: a signed int can hold a minimum of -2147483648 and a maximum of +2147483647, and an unsigned int has the minimum 0 and the maximum 4294967295. However, this is true only on PCs with Windows and newer compilers, like VC++ 5 or 6, or Borland C++ Builder 6. What if you use a different compiler, OS, or even some other type of computer? We write a little program that tells us the minimum and maximum values for that data type. For int, we have INT_MIN and INT_MAX, defined in the <climits> header. Let’s see the complete code for our little program. The program was run on a PC with Windows XP after being compiled with VC++ 5.0, VC++ 6.0, Borland C++ Builder 6 and the old Borland C++ 3.1 for DOS. Here you will see some major differences:
#include <climits>
#include <iostream>
using namespace std;

int main()
{
    cout << "int has " << sizeof(int) << " bytes.\n";
    cout << "Minimum int value = " << INT_MIN << "\n";
    cout << "Maximum int value = " << INT_MAX << "\n";
    return 0;
}
Now it’s time to see the result of the compiled code on different compilers. In Visual C++ 5 and 6, and in Borland C++ Builder 6, we get the same output, because all of them are 32-bit compilers:
int has 4 bytes.
Minimum int value = -2147483648
Maximum int value = 2147483647
Compile this with Borland C++ 3.1 for DOS and note the difference:
int has 2 bytes.
Minimum int value = -32768
Maximum int value = 32767
That’s because Borland C++ 3.1 is a 16-bit compiler, on which the int data type has 2 bytes.
I believe that now you can see the importance of using INT_MIN and INT_MAX when dealing with a compiler that you are not familiar with. You never know what sizes the compiler uses, and you may run into problems whose origin is hard to find. Often the cause turns out to be that the compiler gives the data type a different size than you expected. In that case the problem you see is an overflow…
Didn’t I forget something? Yes, sizeof. sizeof is an operator that returns the size in bytes of a type or variable. With a variable, you can use it like this:
sizeof x;
Where x is a variable, for example.
However, if you use it to find the size of a data type directly, you must put the type name in parentheses, like in the following example:
sizeof(int);
In our above example we used the second option.
Overflows
Let’s see again what the dictionary says about this word: “flow or pour over: to pour out over the limits or edge of a container because the container is too full of liquid”. Pretty much the same thing happens in a program. An overflow is the result of assigning a number that is too big for the data type. For the data type you have just learned, int, let’s use a value bigger than 2147483647, say 3424783547.
#include <iostream>
using namespace std;

int main()
{
    int x = 3424783547;
    cout << "variable x holds value " << x << ".\n";
    return 0;
}
The VC++ 6.0 compiler gives the following output:
variable x holds value -870183749.
This is clearly wrong, and the cause is an overflow: the value does not fit in a signed int, so it wraps around to 3424783547 - 4294967296 = -870183749.
By the way, if the int is unsigned, like this:
#include <iostream>
using namespace std;

int main()
{
    unsigned int x = 3424783547;
    cout << "variable x holds value " << x << ".\n";
    return 0;
}
The output is now the right one:
variable x holds value 3424783547.
short
Even its name denotes that short is a data type ‘shorter’ than int. That means it can store less. Actually, the full name is short int (a shorter int).
On VC++ 5 and 6, a short is 2 bytes long, which means 16 bits. Therefore, if it is unsigned, it can hold values from 0 to 65535 (65536 different values). By default, like int, it is signed, and it can hold numbers ranging from -32768 to +32767. To find out the minimum and maximum values a short can hold on different compilers, you can use:
SHRT_MIN / SHRT_MAX
Same as INT_MIN and INT_MAX but for the short data type.
#include <climits>
#include <iostream>
using namespace std;

int main()
{
    cout << "short has " << sizeof(short) << " bytes.\n";
    cout << "Minimum short value = " << SHRT_MIN << "\n";
    cout << "Maximum short value = " << SHRT_MAX << "\n";
    return 0;
}
This is the complete code we use for testing the size of short. Again, let’s see the result, this time only in VC++ 5 and 6, because you saw earlier how the output differs when compiling with other (usually older) compilers. Moreover, I told you at the beginning of the book that we’ll work with VC++ only.
short has 2 bytes.
Minimum short value = -32768
Maximum short value = 32767
This is the output VC++ 5 and 6 shows in a DOS box under Windows XP.
long
It stored more than int, and therefore much more than short. Just like short, it has a full name: long int (a longer int).
Why did I say ‘It stored more…’? Because it no longer does. In the old days of DOS, int was 2 bytes long and long was really long, 4 bytes. Now int is 4 bytes, and long is… 4 bytes too, at least on compilers like VC++ 5 or 6. Therefore, on these compilers it doesn’t make a difference whether you use int or long, because they are the same size. What’s funny is that in the old days of DOS, short was the same size as int; now short is smaller than int, and long is the same size as int.
LONG_MIN / LONG_MAX
Same as INT_MIN and INT_MAX, and SHRT_MIN and SHRT_MAX, but for the long data type. On many new compilers, they give the same values as for the int data type.
#include <climits>
#include <iostream>
using namespace std;

int main()
{
    cout << "long has " << sizeof(long) << " bytes.\n";
    cout << "Minimum long value = " << LONG_MIN << "\n";
    cout << "Maximum long value = " << LONG_MAX << "\n";
    return 0;
}
And the output on VC++ 5 and 6:
long has 4 bytes.
Minimum long value = -2147483648
Maximum long value = 2147483647
This is the output VC++ 5 and 6 shows in a DOS box under Windows XP.
char
Until now we have stored only numbers. What about characters? For those we use char. The char data type is meant especially for characters. A char always occupies exactly 1 byte.
The 8 bits of a char allow it to have 256 different values. This means 256 characters, which is enough for English and for many other alphabets, though not for every language in the world.
Computers use numerical codes for storing characters. Actually, they use integers. Every character has its own numerical code. You probably heard about the ASCII standard. American Standard Code for Information Interchange is the most frequently used character set.
The ‘A’ character has the code 65 in ASCII. ‘B’ character has the code 66. Uppercase characters are completely different from lowercase. Therefore, ‘a’ has a separate code, 97, while ‘A’ is 65.
Signs are also considered characters. Therefore, “#”, “%”, “&”, “?” are also characters. Even the space we leave between words is a character, that means “ ” is a character, and, by the way, it has the ASCII value 32.
Manipulating char
This code defines a variable of type char that holds the character ‘X’. The cout outputs the character that the variable holds:
#include <iostream>
using namespace std;

int main()
{
    char x = 'X';
    cout << "Variable x holds character: " << x << "\n";
    return 0;
}
The following code will ask you to enter a character and then it will display it:
#include <iostream>
using namespace std;

int main()
{
    char x;
    cout << "Enter a character:\n";
    cin >> x;
    cout << "You typed " << x << ".\n";
    return 0;
}
You can also type the ASCII code of a character and make the output to be the character that is represented by that code. Alternatively, in reverse, type the character and see the ASCII code for it. Let’s see example code:
#include <iostream>
using namespace std;

int main()
{
    int x;
    cout << "Enter ASCII character code:\n";
    cin >> x;
    // char(x) converts the code to the character it stands for
    cout << "Character is " << char(x) << ".\n";
    return 0;
}
First, an ordinary int variable is declared. Then the user is asked to type a character code, which is stored in the int variable named x. In the second output, char(x) converts the value of the variable to char. Therefore, char takes the code (let’s suppose you typed 65) and treats 65 as the ASCII code of a character (A).
Now, to type a character and see the ASCII code:
#include <iostream>
using namespace std;

int main()
{
    char x;
    cout << "Enter character:\n";
    cin >> x;
    cout << "ASCII code for the letter is " << int(x) << ".\n";
    return 0;
}
bool
It’s a relatively new C++ data type, and a very useful one. It is named in honor of the great mathematician George Boole.
There are only two boolean values, true and false. One bit would be enough to store them (0 and 1, false and true), but in practice a bool occupies at least 1 byte, because a byte is the smallest amount of memory a variable can have.
bool x = true;
bool y = false;
A bool value can be assigned to an int:
int x = true;
int y = false;
Now x is 1 and y is 0.
Also, any value other than 0, positive or negative, is considered true. Only 0 represents a false value.
bool x = 1;   // true
bool y = 235; // true
bool z = -53; // true
bool q = 0;   // false