Search

4.3 — Object sizes and the sizeof operator

Object sizes

As you learned in the lesson 4.1 -- Introduction to fundamental data types, memory on modern machines is typically organized into byte-sized units, with each byte of memory having a unique address. Up to this point, it has been useful to think of memory as a bunch of cubbyholes or mailboxes where we can put and retrieve information, and variables as names for accessing those cubbyholes or mailboxes.

However, this analogy is not quite correct in one regard -- most objects actually take up more than 1 byte of memory. A single object may use 2, 4, 8, or even more consecutive memory addresses. The amount of memory that an object uses is based on its data type.

Because we typically access memory through variable names (and not directly via memory addresses), the compiler is able to hide the details of how many bytes a given object uses from us. When we access some variable x, the compiler knows how many bytes of data to retrieve (based on the type of variable x), and can handle that task for us.

Even so, there are several reasons it is useful to know how much memory an object uses.

First, the more memory an object uses, the more information it can hold.

A single bit can hold 2 possible values, a 0, or a 1:

bit 0
0
1

2 bits can hold 4 possible values:

bit 0 bit 1
0 0
0 1
1 0
1 1

3 bits can hold 8 possible values:

bit 0 bit 1 bit 2
0 0 0
0 0 1
0 1 0
0 1 1
1 0 0
1 0 1
1 1 0
1 1 1

To generalize, an object with n bits can hold 2n (2 to the power of n, also commonly written 2^n) unique values. Therefore, with an 8-bit byte, a byte-sized object can hold 28 (256) different values. An object that uses 2 bytes can hold 2^16 (65536) different values!

Thus, the size of the object puts a limit on the amount of unique values it can store -- objects that utilize more bytes can store a larger number of unique values. We will explore this further when we talk more about integers.

Second, computers have a finite amount of free memory. Every time we define an object, a small portion of that free memory is used for as long as the object is in existence. Because modern computers have a lot of memory, this impact is usually negligible. However, for programs that need a large amount of objects or data (e.g. a game that is rendering millions of polygons), the difference between using 1 byte and 8 byte objects can be significant.

Key insight

New programmers often focus too much on optimizing their code to use as little memory as possible. On modern machines, this simply isn’t necessary. Focus on writing maintainable code, and optimize only when and where necessary.

Fundamental data sizes

The obvious next question is “how much memory do variables of different data types take?”. You may be surprised to find that the size of a given data type is dependent on the compiler and/or the computer architecture!

C++ only guarantees that each fundamental data types will have a minimum size:

Category Type Minimum Size Note
boolean bool 1 byte
character char 1 byte Always exactly 1 byte
wchar_t 1 byte
char16_t 2 bytes C++11 type
char32_t 4 bytes C++11 type
integer short 2 bytes
int 2 bytes
long 4 bytes
long long 8 bytes C99/C++11 type
floating point float 4 bytes
double 8 bytes
long double 8 bytes

However, the actual size of the variables may be different on your machine (particularly int, which is more often 4 bytes).

Key insight

You shouldn’t assume that variables are larger than the specified minimum size, even if they are on your machine.

The sizeof operator

In order to determine the size of data types on a particular machine, C++ provides an operator named sizeof. The sizeof operator is a unary operator that takes either a type or a variable, and returns its size in bytes. You can compile and run the following program to find out how large some of your data types are:

Here is the output from the author’s x64 machine, using Visual Studio:

bool:           1 bytes
char:           1 bytes
wchar_t:        2 bytes
char16_t:       2 bytes
char32_t:       4 bytes
short:          2 bytes
int:            4 bytes
long:           4 bytes
long long:      8 bytes
float:          4 bytes
double:         8 bytes
long double:    8 bytes

Your results may vary if you are using a different type of machine, or a different compiler. Note that you can not use the sizeof operator on the void type, since it has no size (doing so will cause a compile error).

For advanced readers

If you’re wondering what ‘\t’ is in the above program, it’s a special symbol that inserts a tab (in the example, we’re using it to align the output columns). We will cover ‘\t’ and other special symbols in lesson 4.11 -- Chars.

You can also use the sizeof operator on a variable name:

x is 4 bytes

4.4 -- Signed integers
Index
4.2 -- Void

80 comments to 4.3 — Object sizes and the sizeof operator

  • Vex

    If you fail to compile demo code, like this:

        ‘char16_t’ was not declared in this scope|
        ‘char32_t’ was not declared in this scope|

    or

        error C2065: 'char16_t' : undeclared identifier
        error C2070: ''unknown-type'': illegal sizeof operand
        error C2065: 'char32_t' : undeclared identifier
        error C2070: ''unknown-type'': illegal sizeof operand

    Please use following command to compile it:

        g++ -std=c++0x -o bin/Debug/2 main.cpp

  • Nahin

    Why is sizeof() considered an operator and not a built-in function?
    How can we distinguish between a function and an operator?

  • cpplx

    this line: "Because modern computers have a lot of memory, this often isn’t a problem, especially if only declaring a few variables."
    I would like you to delete it and instead suggest the would be programmers to always optimize their code.
    "A jug fills drop by drop." - The Buddha
    the future is in you hands. make it... right.

    • Alex

      Actually, best practice is to NOT prematurely optimize your code. You should code for readability, debugability, and consistency, and then optimize if and when necessary.

      In the book Computer Programming as an Art (1974), Donald Knuth (a renowned Computer Scientist) said, "The real problem is that programmers have spent far too much time worrying about efficiency in the wrong places and at the wrong times; premature optimization is the root of all evil (or at least most of it) in programming."

      For example, you _could_ pack every one of your boolean variables into bit flags to save memory, but for 99.99% of programs this simply isn't necessary, and it adds a lot of complexity, degrades understanding, and increases the chance of error.

      It's much better to use a little more memory than increase the risk of your program producing the wrong output or crashing.

Leave a Reply

You can use these HTML tags

<a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code class="" title="" data-url=""> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong> <pre class="" title="" data-url=""> <span class="" title="" data-url="">