Since C++ is based on C, you must
be familiar with the syntax of C in order to program in C++, just as you
must be reasonably fluent in algebra in
order to tackle calculus.
If you’ve never seen
C before, this chapter will give
you a decent background in the style of C used in C++. If you are familiar with
the style of C described in the first edition of Kernighan & Ritchie (often
called K&R C), you will find some new and different
features in C++ as well as in Standard C. If you are familiar with Standard C,
you should skim through this chapter looking for features that are particular to
C++. Note that there are some fundamental C++ features introduced here, which
are basic ideas that are akin to the features in C or often modifications to the
way that C does things. The more sophisticated C++ features will not be
introduced until later chapters.
This chapter is a fairly fast tour of
C constructs and an introduction to some basic C++ constructs, with the
understanding that you’ve had some experience programming in another
language. A gentler introduction to C is found on the
CD ROM packaged in the back of this book, titled
Thinking in C: Foundations for Java & C++ by Chuck Allison (published
by MindView, Inc., and also available at www.MindView.net). This is a seminar on
a CD ROM with the goal of taking you carefully through the fundamentals of the C
language. It focuses on the knowledge necessary for you to be able to move on to
the C++ or Java languages rather than trying to make you an expert in all the
dark corners of C (one of the reasons for using a higher-level language like C++
or Java is precisely so we can avoid many of these dark corners). It also
contains exercises and guided solutions. Keep in mind that because this chapter
goes beyond the Thinking in C CD, the CD is not a replacement for this
chapter, but should be used instead as a preparation for this chapter and for
the
book.
In old (pre-Standard) C, you could call a
function with any number or type of arguments and the compiler wouldn’t
complain. Everything seemed fine until you ran the program. You got mysterious
results (or worse, the program crashed) with no hints as to why. The lack of
help with argument passing and the enigmatic bugs that resulted are probably one
reason why C was dubbed a “high-level assembly
language.” Pre-Standard C programmers just adapted
to it.
Standard C and C++ use a feature called
function prototyping.
With function prototyping, you must use a description of the types of
arguments when declaring and defining a function. This description is the
“prototype.” When the function is called, the compiler uses the
prototype to ensure that the proper arguments are passed in and that the return
value is treated correctly. If the programmer makes a mistake when calling the
function, the compiler catches the mistake.
Essentially, you learned about function
prototyping (without naming it as such) in the previous chapter, since the form
of function declaration in C++ requires proper prototyping. In a function
prototype, the argument list contains the types of arguments that must be passed
to the function and (optionally for the declaration) identifiers for the
arguments. The order and type of the arguments must match in the declaration,
definition, and function call. Here’s an example of a function prototype
in a declaration:
int translate(float x, float y, float z);
You do not use the same form when
declaring variables in function prototypes as you do in ordinary variable
definitions. That is, you cannot say: float x, y, z. You must indicate
the type of each argument. In a function declaration, the following form
is also acceptable:
int translate(float, float, float);
Since the compiler doesn’t do
anything but check for types when the function is called, the identifiers are
only included for clarity when someone is reading the code.
In the function definition, names are
required because the arguments are referenced inside the
function:
int translate(float x, float y, float z) {
  x = y = z;
  // ...
}
It turns out this rule applies only to C.
In C++, an argument may be unnamed
in the argument list of the
function definition. Since it is unnamed, you cannot use it in the function
body, of course. Unnamed arguments are allowed to give the programmer a way to
“reserve space in the argument list.” Whoever uses the function must
still call the function with the proper arguments. However, the person creating
the function can then use the argument in the future without forcing
modification of code that calls the function. This option of ignoring an
argument in the list is also possible if you leave the name in, but you will get
an annoying warning message about the value being unused every time you compile
the function. The warning is eliminated if you remove the name.
C and C++ have two other ways to declare
an argument list. If you have an
empty
argument list, you can declare it as func( ) in C++, which tells the
compiler there are exactly zero arguments. You should be aware that this only
means an empty argument list in C++. In C it means “an indeterminate
number of arguments” (which is a “hole” in C since it disables type
checking in that case). In both C and C++, the declaration func(void);
means an empty argument list. The
void keyword means
“nothing” in this case (it can also mean “no type” in
the case of pointers, as you’ll see later in this
chapter).
The other option for argument lists
occurs when you don’t know how many arguments or what type of arguments
you will have; this is called a variable argument
list.
This “uncertain argument list” is represented by ellipses
(...). Defining a function with a variable
argument list is significantly more complicated than defining a regular
function. You can use a variable argument list for a function that has a fixed
set of arguments if (for some reason) you want to disable the error checks of
function prototyping. Because of this, you should restrict your use of variable
argument lists to C and avoid them in C++ (in which, as you’ll learn,
there are much better alternatives). Handling variable argument lists is
described in the library section of your local C
guide.
A C++ function prototype must specify the
return value type of the function (in C, if you leave off the return value type
it defaults to int). The return type specification precedes the function
name. To specify that no value is returned, use the
void keyword. This will
generate an error if you try to return a value from the function. Here are some
complete function prototypes:
int f1(void); // Returns an int, takes no arguments
int f2(); // Like f1() in C++ but not in Standard C!
float f3(float, int, char, double); // Returns a float
void f4(void); // Takes no arguments, returns nothing
To return a value from a function, you
use the
return
statement. return exits the function back to the point right after the
function call. If return has an argument, that argument becomes the
return value of the function. If a function says that it will return a
particular type, then each return statement must return that type. You
can have more than one return statement in a function
definition:
//: C03:Return.cpp
// Use of "return"
#include <iostream>
using namespace std;
char cfunc(int i) {
  if(i == 0)
    return 'a';
  if(i == 1)
    return 'g';
  if(i == 5)
    return 'z';
  return 'c';
}

int main() {
  cout << "type an integer: ";
  int val;
  cin >> val;
  cout << cfunc(val) << endl;
} ///:~
In cfunc( ), the first
if that evaluates to true exits the function via the return
statement. Notice that a function declaration is not necessary because the
function definition appears before it is used in main( ), so the
compiler knows about it from that function
definition.
All the functions in your local C
function library are available while you are programming in C++. You should look
hard at the function library before defining your own function –
there’s a good chance that someone has already solved your problem for
you, and probably given it a lot more thought and debugging.
A word of caution, though: many compilers
include a lot of extra functions that make life even easier and are tempting to
use, but are not part of the Standard C library. If you are certain you will
never want to move the application to another platform (and who is certain of
that?), go ahead: use those functions and make your life easier. If you
want your application to be portable, you should restrict yourself to Standard
library functions. If you must perform platform-specific activities, try to
isolate that code in one spot so it can be changed easily when porting to
another platform. In C++, platform-specific activities are often encapsulated in
a class, which is the ideal solution.
The formula for using a library function
is as follows: first, find the function in your programming reference (many
programming references will index the function by category as well as
alphabetically). The description of the function should include a section that
demonstrates the syntax of the code. The top of this section usually has at
least one #include line, showing you the header file containing the
function prototype. Duplicate this #include line in your file so the
function is properly
declared.
Now you can call the function in the same way it appears in the syntax section.
If you make a mistake, the compiler will discover it by comparing your function
call to the function prototype in the header and tell you about your error. The
linker searches the Standard library by default, so that’s all you need to
do: include the header file and call the
function.
You can collect your own functions
together into a library. Most programming packages come with a librarian that
manages groups of object modules. Each librarian has its own commands, but the
general idea is this: if you want to create a library, make a header file
containing the function prototypes for all the functions in your library. Put
this header file somewhere in the preprocessor’s search path, either in
the local directory (so it can be found by #include "header") or in the
include directory (so it can be found by #include <header>). Now
take all the object modules and hand them to the librarian along with a name for
the finished library (most librarians require a common extension, such as
.lib or .a). Place the finished library where the other libraries
reside so the linker can find it. When you use your library, you will have to
add something to the command line so the linker knows to
search the library for the functions you call. You must find all the details in
your local manual, since they vary from system to
system.
This section covers the execution control
statements in C++. You must be familiar with these statements before you can
read and write C or C++ code.
C++ uses all of C’s execution
control statements. These include if-else, while, do-while,
for, and a selection statement called switch. C++ also allows the
infamous goto, which will be avoided in this
book.
All conditional statements use the truth
or falsehood of a conditional expression to determine the execution path. An
example of a conditional expression is A == B. This uses the relational
operator == to see if the variable A is equivalent to the variable
B. The expression produces a Boolean true
or false (these are keywords only in C++; in C an expression is
“true” if it evaluates to a nonzero value). Other relational
operators are >, <, >=, etc. Conditional
statements are covered more fully later in this chapter.
The if-else statement can exist in
two forms: with or without the else. The two forms are:
if(expression)
  statement
or
if(expression)
  statement
else
  statement
The “expression” evaluates to
true or false. The “statement” means either a simple
statement terminated by a semicolon or a compound statement, which is a group of
simple statements enclosed in braces. Any time the word “statement”
is used, it always implies that the statement is simple or compound. Note that
this statement can also be another if, so they can be strung
together.
//: C03:Ifthen.cpp
// Demonstration of if and if-else conditionals
#include <iostream>
using namespace std;
int main() {
  int i;
  cout << "type a number and 'Enter'" << endl;
  cin >> i;
  if(i > 5)
    cout << "It's greater than 5" << endl;
  else
    if(i < 5)
      cout << "It's less than 5 " << endl;
    else
      cout << "It's equal to 5 " << endl;

  cout << "type a number and 'Enter'" << endl;
  cin >> i;
  if(i < 10)
    if(i > 5) // "if" is just another statement
      cout << "5 < i < 10" << endl;
    else
      cout << "i <= 5" << endl;
  else // Matches "if(i < 10)"
    cout << "i >= 10" << endl;
} ///:~
It is conventional to indent the body of
a control flow statement so the reader may easily determine where it begins and
ends[30].
while, do-while, and
for control looping. A statement repeats until the controlling expression
evaluates to false. The form of a while loop is
while(expression)
  statement
The expression is evaluated once at the
beginning of the loop and again before each further iteration of the
statement.
This example stays in the body of the
while loop until you type the secret number or press
control-C.
//: C03:Guess.cpp
// Guess a number (demonstrates "while")
#include <iostream>
using namespace std;
int main() {
  int secret = 15;
  int guess = 0;
  // "!=" is the "not-equal" conditional:
  while(guess != secret) { // Compound statement
    cout << "guess the number: ";
    cin >> guess;
  }
  cout << "You guessed it!" << endl;
} ///:~
The while’s
conditional expression is not restricted to a simple test as in the example
above; it can be as complicated as you like as long as it produces a true
or false result. You will even see code where the loop has no body, just
a bare semicolon:
while(/* Do a lot here */) ;
In these cases, the programmer has
written the conditional expression not only to perform the test but also to do
the
work.
The form of do-while
is
do
  statement
while(expression);
The do-while is different from the
while because the statement always executes at least once, even if the
expression evaluates to false the first time. In a regular while, if the
conditional is false the first time the statement never
executes.
If a do-while is used in
Guess.cpp, the variable guess does not need an initial dummy
value, since it is initialized by the cin statement before it is
tested:
//: C03:Guess2.cpp
// The guess program using do-while
#include <iostream>
using namespace std;
int main() {
  int secret = 15;
  int guess; // No initialization needed here
  do {
    cout << "guess the number: ";
    cin >> guess; // Initialization happens
  } while(guess != secret);
  cout << "You got it!" << endl;
} ///:~
A for loop performs initialization
before the first iteration. Then it performs conditional testing and, at the end
of each iteration, some form of “stepping.” The form of the
for loop is:
for(initialization; conditional; step) statement
Any of the expressions
initialization,
conditional, or step
may be empty. The initialization code executes once at the very
beginning. The conditional is tested before each iteration (if it
evaluates to false at the beginning, the statement never executes). At the end
of each loop, the step executes.
for loops are usually used for
“counting” tasks:
//: C03:Charlist.cpp
// Display all the ASCII characters
// Demonstrates "for"
#include <iostream>
using namespace std;
int main() {
  for(int i = 0; i < 128; i = i + 1)
    if (i != 26) // ANSI Terminal Clear screen
      cout << " value: " << i
           << " character: "
           << char(i) // Type conversion
           << endl;
} ///:~
You may notice that the variable i
is defined at the point where it is used, instead of at the beginning of the
block denoted by the open curly brace ‘{’. This is in
contrast to traditional procedural languages (including C), which require that
all variables be defined at the beginning of the block. This will be discussed
later in this
chapter.
Inside the body of any of the looping
constructs while, do-while, or for, you can control
the flow of the loop using break and
continue. break quits the loop without
executing the rest of the statements in the loop. continue stops the
execution of the current iteration and goes back to the beginning of the loop to
begin a new iteration.
As an example of break and
continue, this program is a very simple menu system:
//: C03:Menu.cpp
// Simple menu program demonstrating
// the use of "break" and "continue"
#include <iostream>
using namespace std;
int main() {
  char c; // To hold response
  while(true) {
    cout << "MAIN MENU:" << endl;
    cout << "l: left, r: right, q: quit -> ";
    cin >> c;
    if(c == 'q')
      break; // Out of "while(true)"
    if(c == 'l') {
      cout << "LEFT MENU:" << endl;
      cout << "select a or b: ";
      cin >> c;
      if(c == 'a') {
        cout << "you chose 'a'" << endl;
        continue; // Back to main menu
      }
      if(c == 'b') {
        cout << "you chose 'b'" << endl;
        continue; // Back to main menu
      }
      else {
        cout << "you didn't choose a or b!"
             << endl;
        continue; // Back to main menu
      }
    }
    if(c == 'r') {
      cout << "RIGHT MENU:" << endl;
      cout << "select c or d: ";
      cin >> c;
      if(c == 'c') {
        cout << "you chose 'c'" << endl;
        continue; // Back to main menu
      }
      if(c == 'd') {
        cout << "you chose 'd'" << endl;
        continue; // Back to main menu
      }
      else {
        cout << "you didn't choose c or d!"
             << endl;
        continue; // Back to main menu
      }
    }
    cout << "you must type l or r or q!" << endl;
  }
  cout << "quitting menu..." << endl;
} ///:~
If the user selects ‘q’ in
the main menu, the break keyword is used to quit; otherwise, the program
just continues to execute indefinitely. After each of the sub-menu selections,
the continue keyword is used to pop back up to the beginning of the while
loop.
The while(true) statement is the
equivalent of saying “do this loop forever.” The break
statement allows you to break out of this infinite while loop when the user
types a ‘q.’
A switch statement selects from
among pieces of code based on the value of an integral expression. Its form
is:
switch(selector) {
  case integral-value1 : statement; break;
  case integral-value2 : statement; break;
  case integral-value3 : statement; break;
  case integral-value4 : statement; break;
  case integral-value5 : statement; break;
  (...)
  default: statement;
}
Selector is an expression that
produces an integral value. The switch compares the result of
selector to each integral value. If it finds a match, the
corresponding statement (simple or compound) executes. If no match occurs, the
default statement
executes.
You will notice in the definition above
that each case ends with a
break, which causes execution to jump to the end of the switch
body (the closing brace that completes the switch). This is the
conventional way to build a switch statement, but the break is
optional. If it is missing, your case “drops through” to the
one after it. That is, the code for the following case statements executes
until a break is encountered. Although you don’t usually want this
kind of behavior, it can be useful to an experienced
programmer.
The switch statement is a clean
way to implement multi-way
selection (i.e., selecting from among a number of different execution paths),
but it requires a selector that evaluates to an integral value and case labels that are integral constants known at compile time.
If you want to use, for example, a string object as a selector, it
won’t work in a switch statement. For a string selector, you
must instead use a series of if statements and compare the string
inside the conditional.
The menu example shown above provides a
particularly nice example of a switch:
//: C03:Menu2.cpp
// A menu using a switch statement
#include <iostream>
using namespace std;
int main() {
  bool quit = false; // Flag for quitting
  while(quit == false) {
    cout << "Select a, b, c or q to quit: ";
    char response;
    cin >> response;
    switch(response) {
      case 'a' : cout << "you chose 'a'" << endl;
                 break;
      case 'b' : cout << "you chose 'b'" << endl;
                 break;
      case 'c' : cout << "you chose 'c'" << endl;
                 break;
      case 'q' : cout << "quitting menu" << endl;
                 quit = true;
                 break;
      default  : cout << "Please use a,b,c or q!"
                 << endl;
    }
  }
} ///:~
The quit flag is a
bool, short for
“Boolean,” which is a type you’ll find only in C++. It can
have only the keyword values true or false. Selecting
‘q’ sets the quit flag to true. The next time the
selector is evaluated, quit == false returns false so the body of
the while does not
execute.
The
goto keyword is supported
in C++, since it exists in C. Using goto is often dismissed as poor
programming style, and most of the time it is. Anytime you use goto, look
at your code and see if there’s another way to do it. On rare occasions,
you may discover goto can solve a problem that can’t be solved
otherwise, but still, consider it carefully. Here’s an example that might
make a plausible candidate:
//: C03:gotoKeyword.cpp
// The infamous goto is supported in C++
#include <iostream>
using namespace std;
int main() {
  long val = 0;
  for(int i = 1; i < 1000; i++) {
    for(int j = 1; j < 100; j += 10) {
      val = i * j;
      if(val > 47000)
        goto bottom;
        // Break would only go to the outer 'for'
    }
  }
  bottom: // A label
  cout << val << endl;
} ///:~
The alternative would be to set a Boolean
that is tested in the outer for loop, and then do a break from the
inner for loop. However, if you have several levels of for or
while this could get awkward.
Recursion is an interesting and sometimes
useful programming technique whereby you call the function that you’re in.
Of course, if this is all you do, you’ll keep calling the function
you’re in until you run out of memory, so there must be some way to
“bottom out” the recursive call. In the following example, this
“bottoming out” is accomplished by simply saying that the recursion
will go only until the cat exceeds
‘Z’:[31]
//: C03:CatsInHats.cpp
// Simple demonstration of recursion
#include <iostream>
using namespace std;
void removeHat(char cat) {
  for(char c = 'A'; c < cat; c++)
    cout << " ";
  if(cat <= 'Z') {
    cout << "cat " << cat << endl;
    removeHat(cat + 1); // Recursive call
  } else
    cout << "VOOM!!!" << endl;
}

int main() {
  removeHat('A');
} ///:~
In removeHat( ), you can see
that as long as cat is less than or equal to ‘Z’,
removeHat( ) will be called from within
removeHat( ), thus effecting the recursion. Each time
removeHat( ) is called, its argument is one greater than the current
cat so the argument keeps increasing.
Recursion is often used when evaluating
some sort of arbitrarily complex problem, since you aren’t restricted to a
particular “size” for the solution – the function can just
keep recursing until it’s reached the end of the problem.
You can think of operators as a special
type of function (you’ll learn that C++ operator overloading treats
operators precisely that way). An operator takes one or more arguments and
produces a new value. The arguments are in a different form than ordinary
function calls, but the effect is the same.
From your previous programming
experience, you should be reasonably comfortable with the operators that have
been used so far. The concepts of addition (+), subtraction and unary
minus (-), multiplication (*), division (/), and
assignment (=) all have essentially the same meaning in any programming
language. The full set of operators is enumerated later in this
chapter.
Operator precedence defines the order in
which an expression evaluates when several different operators are present. C
and C++ have specific rules to determine the order of evaluation. The easiest to
remember is that multiplication and division happen before addition and
subtraction. After that, if an expression isn’t transparent to you it
probably won’t be for anyone reading the code, so you should use
parentheses to make the order of evaluation explicit. For
example:
A = X + Y - 2/2 + Z;
has a very different meaning from the
same statement with a particular grouping of parentheses:
A = X + (Y - 2)/(2 + Z);
C, and therefore C++, is full of
shortcuts. Shortcuts can make code much easier to type, and sometimes much
harder to read. Perhaps the C language designers thought it would be easier to
understand a tricky piece of code if your eyes didn’t have to scan as
large an area of print.
One of the nicer shortcuts is the
auto-increment and auto-decrement
operators. You often use these to change loop variables, which control the
number of times a loop executes.
The
auto-decrement operator is ‘--’ and means “decrease by
one unit.” The auto-increment operator is ‘++’ and
means “increase by one unit.” If A is an int, for
example, the expression ++A is equivalent to (A = A + 1).
Auto-increment and auto-decrement operators produce the value of the variable as
a result. If the operator appears before the variable, (i.e., ++A), the
operation is first performed and the resulting value is produced. If the
operator appears after the variable (i.e., A++), the current value is
produced, and then the operation is performed. For example:
//: C03:AutoIncrement.cpp
// Shows use of auto-increment
// and auto-decrement operators.
#include <iostream>
using namespace std;
int main() {
  int i = 0;
  int j = 0;
  cout << ++i << endl; // Pre-increment
  cout << j++ << endl; // Post-increment
  cout << --i << endl; // Pre-decrement
  cout << j-- << endl; // Post-decrement
} ///:~
Data types define the way you use
storage (memory) in the programs you write. By specifying a data type, you tell
the compiler how to create a particular piece of storage, and also how to
manipulate that storage.
Data types can be
built-in or abstract. A built-in
data type is one that the compiler intrinsically
understands, one that is wired directly into the compiler. The types of built-in
data are almost identical in C and C++. In contrast, a user-defined data
type is one that you or another
programmer create as a class. These are commonly referred to as abstract data
types. The compiler knows how to handle built-in types
when it starts up; it “learns” how to handle abstract data types by
reading header files containing class declarations
(you’ll learn about this in later
chapters).
The Standard C specification for built-in
types (which C++ inherits) doesn’t say how many bits each of the built-in
types must contain. Instead, it stipulates the minimum and maximum values that
the built-in type must be able to hold. When a machine is based on binary, this
maximum value can be directly translated into a minimum number of bits necessary
to hold that value. However, if a machine uses, for example, binary-coded
decimal (BCD) to represent numbers, then the amount of space in the machine
required to hold the maximum numbers for each data type will be different. The
minimum and maximum values that can be stored in the various data types are
defined in the system header files limits.h and
float.h (in C++ you will generally #include
<climits> and <cfloat> instead).
C and C++ have four basic built-in data
types, described here for binary-based machines. A
char is for character
storage and uses a minimum of 8 bits (one byte) of storage, although it may be
larger. An int stores an
integral number and uses a minimum of two bytes of storage. The
float and
double types store floating-point
numbers, usually in IEEE floating-point
format. float is for single-precision floating
point and double is for double-precision floating
point.
As mentioned previously, you can define
variables anywhere in a scope, and you can define and initialize
them at the same time.
Here’s how to define variables using the four basic data
types:
//: C03:Basic.cpp
// Defining the four basic data
// types in C and C++
int main() {
  // Definition without initialization:
  char protein;
  int carbohydrates;
  float fiber;
  double fat;
  // Simultaneous definition & initialization:
  char pizza = 'A', pop = 'Z';
  int dongdings = 100, twinkles = 150,
    heehos = 200;
  float chocolate = 3.14159;
  // Exponential notation:
  double fudge_ripple = 6e-4;
} ///:~
The first part of the program defines
variables of the four basic data types without initializing them. If you
don’t initialize a variable, the Standard says that its contents are
undefined (usually, this means they contain garbage). The second part of the
program defines and initializes variables at the same time (it’s always
best, if possible, to provide an initialization value at the point of
definition). Notice the use of exponential notation in
the constant 6e-4, meaning “6 times 10 to the minus fourth
power.”
Before bool became part of
Standard C++, everyone tended to use different techniques in order to produce
Boolean-like behavior.
These
produced portability problems and could introduce subtle
errors.
The Standard C++ bool type can
have two states expressed by the built-in constants true (which converts
to an integral one) and false (which converts to an integral zero). All
three names are keywords. In addition, some language elements have been
adapted:
| Element | Usage with bool |
|---|---|
| && \|\| ! | Take bool arguments and produce bool results. |
| < > <= >= == != | Produce bool results. |
| if, for, while, do | Conditional expressions convert to bool values. |
| ? : | First operand converts to bool value. |
Because there’s a lot of existing
code that uses an int to represent a flag, the compiler will implicitly
convert from an int to a bool (nonzero values will produce true
while zero values produce false). Ideally, the compiler will
give you a warning as a suggestion to correct the situation.
An idiom that falls under “poor
programming style” is the use of ++ to set a flag to true. This was
allowed but deprecated in the original C++ Standard, and it was removed
from the language in C++17. The problem is that
you’re making an implicit type conversion from bool to int,
incrementing the value (perhaps beyond the range of the normal bool
values of zero and one), and then implicitly converting it back
again.
Pointers (which will be introduced later
in this chapter) will also be automatically converted to bool when
necessary.
Specifiers modify the meanings of the
basic built-in types and expand them to a much larger set. There are four
specifiers: long,
short,
signed, and
unsigned.
long and short modify the
maximum and minimum values that a data type will hold. A plain int must
be at least the size of a short. The size hierarchy for integral types
is: short int, int, long int. All the sizes
could conceivably be the same, as long as they satisfy the minimum/maximum value
requirements. On a machine with a 64-bit word, for instance, all the data types
might be 64 bits.
The size hierarchy for floating point
numbers is: float,
double, and
long double.
“long float” is not a legal type. There are
no short floating-point numbers.
The signed and unsigned
specifiers tell the compiler how to use the sign bit with integral types and
characters (floating-point numbers always contain a sign). An unsigned
number does not keep track of the sign and thus has an extra bit available, so
it can store positive numbers twice as large as the positive numbers that can be
stored in a signed number. signed is the default and is only
necessary with char;
char may or may not default to signed. By specifying
signed char, you
force the sign bit to be used.
The following example shows the size of
the data types in bytes by using the
sizeof operator, introduced
later in this chapter:
//: C03:Specify.cpp
// Demonstrates the use of specifiers
#include <iostream>
using namespace std;
int main() {
  char c;
  unsigned char cu;
  int i;
  unsigned int iu;
  short int is;
  short iis; // Same as short int
  unsigned short int isu;
  unsigned short iisu;
  long int il;
  long iil; // Same as long int
  unsigned long int ilu;
  unsigned long iilu;
  float f;
  double d;
  long double ld;
  cout
    << "\n char = " << sizeof(c)
    << "\n unsigned char = " << sizeof(cu)
    << "\n int = " << sizeof(i)
    << "\n unsigned int = " << sizeof(iu)
    << "\n short = " << sizeof(is)
    << "\n unsigned short = " << sizeof(isu)
    << "\n long = " << sizeof(il)
    << "\n unsigned long = " << sizeof(ilu)
    << "\n float = " << sizeof(f)
    << "\n double = " << sizeof(d)
    << "\n long double = " << sizeof(ld)
    << endl;
} ///:~
Be aware that the results you get by
running this program will probably be different from one machine/operating
system/compiler to the next, since (as mentioned previously) the only thing that
must be consistent is that each different type hold the minimum and maximum
values specified in the Standard.
Whenever you run a program, it is first
loaded (typically from disk) into the computer’s memory. Thus, all
elements of your program are located somewhere in memory.
Memory is typically laid out as a sequential series of
memory locations; we usually refer to these locations as eight-bit
bytes but actually the size of each space depends
on the architecture of the particular machine and is usually called that
machine’s word size.
Each space can be uniquely distinguished from all other spaces by its
address. For the purposes of this discussion,
we’ll just say that all machines use bytes that have sequential addresses
starting at zero and going up to however much memory you have in your
computer.
Since your program lives in memory while
it’s being run, every element of your program has an address. Suppose we
start with a simple program:
//: C03:YourPets1.cpp
#include <iostream>
using namespace std;
int dog, cat, bird, fish;
void f(int pet) {
  cout << "pet id number: " << pet << endl;
}
int main() {
  int i, j, k;
} ///:~
Each of the elements in this program has
a location in storage when the program is running. Even the function occupies
storage. As you’ll see, it turns out that what an element is and the way
you define it usually determines the area of memory where that element is
placed.
There is an operator in C and C++ that
will tell you the address of an element. This is the
‘&’ operator. All you do is precede the identifier name
with ‘&’ and it will produce the address of that
identifier. YourPets1.cpp can be modified to print out the addresses of
all its elements, like this:
//: C03:YourPets2.cpp
#include <iostream>
using namespace std;
int dog, cat, bird, fish;
void f(int pet) {
  cout << "pet id number: " << pet << endl;
}
int main() {
  int i, j, k;
  cout << "f(): " << (long)&f << endl;
  cout << "dog: " << (long)&dog << endl;
  cout << "cat: " << (long)&cat << endl;
  cout << "bird: " << (long)&bird << endl;
  cout << "fish: " << (long)&fish << endl;
  cout << "i: " << (long)&i << endl;
  cout << "j: " << (long)&j << endl;
  cout << "k: " << (long)&k << endl;
} ///:~
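The (long) is a
cast. It says
“Don’t treat this as if it’s its normal type; instead, treat it as
a long.” The cast isn’t essential, but if it weren’t
there, the addresses would have been printed out in hexadecimal instead, so
casting to a long makes things a little more
readable. (On a platform where a long is narrower than an address, the
cast discards the upper bits, and some compilers reject it outright.)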
The results of this program will vary
depending on your computer, OS, and all sorts of other factors, but it will
always give you some interesting insights. For a single run on my computer, the
results looked like this:
f(): 4198736
dog: 4323632
cat: 4323636
bird: 4323640
fish: 4323644
i: 6684160
j: 6684156
k: 6684152
You can see how the variables that are
defined inside main( ) are in a different area than the variables
defined outside of main( ); you’ll understand why as you learn
more about the language. Also, f( ) appears to be in its own area;
code is typically separated from data in memory.
Another interesting thing to note is that
variables defined one right after the other appear to be placed contiguously in
memory. They are separated by the number of bytes that are required by their
data type. Here, the only data type used is int, and cat is four
bytes away from dog, bird is four bytes away from cat, etc.
So it would appear that, on this machine, an int is four bytes
long.
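The spacing seen in that experiment is not something the language promises for separately defined variables; only the elements of an array are guaranteed to sit side by side. The following sketch measures the distance between two neighboring array elements. The names pets, bytes_between_neighbors, and check_neighbors are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: distance in bytes between two neighboring
// elements of an int array. Unlike separately defined variables,
// array elements are guaranteed to be contiguous.
std::size_t bytes_between_neighbors() {
  int pets[4] = {0, 0, 0, 0};
  const char* first = reinterpret_cast<const char*>(&pets[0]);
  const char* second = reinterpret_cast<const char*>(&pets[1]);
  return static_cast<std::size_t>(second - first);
}

// The distance always equals the size of one int on this machine.
void check_neighbors() {
  assert(bytes_between_neighbors() == sizeof(int));
}
```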
Other than this interesting experiment
showing how memory is mapped out, what can you do with an address? The most
important thing you can do is store it inside another variable for later use. C
and C++ have a special type of variable that holds an address. This variable is
called a pointer.
The
operator that defines a pointer is the same as the one used for multiplication:
‘*’. The compiler knows that it isn’t multiplication
because of the context in which it is used, as you will see.
When you define a pointer, you must
specify the type of variable it points to. You start out by giving the type
name, then instead of immediately giving an identifier for the variable, you say
“Wait, it’s a pointer” by inserting a star between the type
and the identifier. So a pointer to an int looks like
this:
int* ip; // ip points to an int variable
The association of the
‘*’ with the type looks sensible and reads easily, but it can
actually be a bit deceiving. Your inclination might be to say
“intpointer” as if it is a single discrete type. However, with an
int or other basic data type, it’s possible to
say:
int a, b, c;
whereas with a pointer, you’d
like to say:
int* ipa, ipb, ipc;
C syntax (and by inheritance, C++ syntax)
does not allow such sensible expressions. In the definitions above, only
ipa is a pointer, but ipb and ipc are ordinary ints
(you can say that “* binds more tightly to the identifier”).
Consequently, the best results can be achieved by using only one definition per
line; you still get the sensible syntax without the confusion:
int* ipa;
int* ipb;
int* ipc;
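If you want the compiler to confirm how the multi-variable form is parsed, here is a small sketch. It assumes a C++11 compiler, because decltype, static_assert, and the <type_traits> header are not part of the language described in this chapter.

```cpp
#include <type_traits>

int* ipa, ipb, ipc;  // Only ipa is a pointer

// Each line compiles only if the stated type is the real one.
static_assert(std::is_same<decltype(ipa), int*>::value, "ipa is an int*");
static_assert(std::is_same<decltype(ipb), int>::value, "ipb is a plain int");
static_assert(std::is_same<decltype(ipc), int>::value, "ipc is a plain int");
```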
Since a general guideline for C++
programming is that you should always initialize a variable at the point of
definition, this form actually works better. For example, the variables above
are not initialized to any particular value; they hold garbage. It’s much
better to say something like:
int a = 47;
int* ipa = &a;
Now both a and ipa have
been initialized, and ipa holds the address of a.
Once you have an initialized pointer, the
most basic thing you can do with it is to use it to modify the value it points
to. To access a variable through a pointer, you
dereference the pointer using the same operator
that you used to define it, like this:
*ipa = 100;
Now a contains the value 100
instead of 47.
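Put together, the steps above look like the following sketch. The function name modify_through_pointer is hypothetical, and assert (from <cassert>) stands in for printing the values.

```cpp
#include <cassert>

// Hypothetical helper: change a variable through a pointer to it.
int modify_through_pointer() {
  int a = 47;
  int* ipa = &a;       // ipa holds the address of a
  assert(*ipa == 47);  // Reading through the pointer yields a's value
  *ipa = 100;          // Writing through the pointer changes a itself
  return a;
}
```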
These are the basics of pointers: you can
hold an address, and you can use that address to modify the original variable.
But the question still remains: why do you want to modify one variable using
another variable as a proxy?
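For this introductory view of pointers,
we can put the answer into two broad categories: changing “outside
objects” from within a function, which is perhaps the most basic use of
pointers and is examined here; and a variety of other programming techniques,
which you’ll see later in the book.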
Ordinarily, when you pass an argument to
a function, a copy of that argument is made inside the function. This is
referred to as
pass-by-value. You
can see the effect of pass-by-value in the following program:
//: C03:PassByValue.cpp
#include <iostream>
using namespace std;
void f(int a) {
  cout << "a = " << a << endl;
  a = 5;
  cout << "a = " << a << endl;
}
int main() {
  int x = 47;
  cout << "x = " << x << endl;
  f(x);
  cout << "x = " << x << endl;
} ///:~
In f( ), a is a
local variable, so it
exists only for the duration of the function call to f( ). Because
it’s a function argument,
the value of a is initialized by the arguments that are passed when the
function is called; in main( ) the argument is x, which has a
value of 47, so this value is copied into a when f( ) is
called.
When you run this program you’ll
see:
x = 47
a = 47
a = 5
x = 47
Initially, of course, x is 47.
When f( ) is called, temporary space is created to hold the variable
a for the duration of the function call, and a is initialized by
copying the value of x, which is verified by printing it out. Of course,
you can change the value of a and show that it is changed. But when
f( ) is completed, the temporary space that was created for a
disappears, and we see that the only connection that ever existed between
a and x happened when the value of x was copied into
a.
When you’re inside
f( ), x is the
outside object (my
terminology), and changing the local variable does not affect the outside
object, naturally enough, since they are two separate locations in storage. But
what if you do want to modify the outside object? This is where pointers
come in handy. In a sense, a pointer is an alias for another variable. So if we
pass a pointer into a function instead of an ordinary value, we are
actually passing an alias to the outside object, enabling the function to modify
that outside object, like this:
//: C03:PassAddress.cpp
#include <iostream>
using namespace std;
void f(int* p) {
  cout << "p = " << p << endl;
  cout << "*p = " << *p << endl;
  *p = 5;
  cout << "p = " << p << endl;
}
int main() {
  int x = 47;
  cout << "x = " << x << endl;
  cout << "&x = " << &x << endl;
  f(&x);
  cout << "x = " << x << endl;
} ///:~
Now f( ) takes a pointer as
an argument and dereferences the pointer during assignment, and this causes the
outside object x to be modified. The output is:
x = 47
&x = 0065FE00
p = 0065FE00
*p = 47
p = 0065FE00
x = 5
Notice that the value contained in p
is the same as the address of x – the pointer p does
indeed point to x. If that isn’t convincing enough, when p
is dereferenced to assign the value 5, we see that the value of x is now
changed to 5 as well.
Thus, passing a pointer into a function
will allow that function to modify the outside object. You’ll see plenty
of other uses for pointers later, but this is arguably the most basic and
possibly the most common
use.
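A familiar use of this technique is a function that exchanges two values. The sketch below is only an illustration; the names swap_ints and check_swap_ints are hypothetical.

```cpp
#include <cassert>

// Hypothetical helper: exchange two outside objects via their addresses.
void swap_ints(int* a, int* b) {
  int temp = *a;
  *a = *b;
  *b = temp;
}

void check_swap_ints() {
  int x = 1, y = 2;
  swap_ints(&x, &y);  // The caller passes the addresses explicitly
  assert(x == 2 && y == 1);
}
```

With pass-by-value this function could not work, because it would exchange only its own local copies.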
Pointers work roughly the same in C and
in C++, but C++ adds an additional way to pass an address into a function. This
is pass-by-reference; it exists in several
other programming languages, so it was not a C++ invention.
Your initial perception of references may
be that they are unnecessary, that you could write all your programs without
references. In general, this is true, with the exception of a few important
places that you’ll learn about later in the book. You’ll also learn
more about references later, but the basic idea is the same as the demonstration
of pointer use above: you can pass the address of an argument using a reference.
The difference between references and pointers is that
calling a function that takes references is cleaner, syntactically, than
calling a function that takes pointers (and it is exactly this syntactic
difference that makes references essential in certain situations). If
PassAddress.cpp is modified to use references, you can see the difference
in the function call in main( ):
//: C03:PassReference.cpp
#include <iostream>
using namespace std;
void f(int& r) {
  cout << "r = " << r << endl;
  cout << "&r = " << &r << endl;
  r = 5;
  cout << "r = " << r << endl;
}
int main() {
  int x = 47;
  cout << "x = " << x << endl;
  cout << "&x = " << &x << endl;
  f(x); // Looks like pass-by-value,
        // is actually pass by reference
  cout << "x = " << x << endl;
} ///:~
In f( )’s argument
list, instead of saying int* to pass a pointer, you say int&
to pass a reference. Inside f( ), if you just say
‘r’ (which would produce the address if r were a
pointer) you get the value in the variable that r references. If
you assign to r, you actually assign to the variable that r
references. In fact, the only way to get the address that’s held inside
r is with the ‘&’ operator.
In main( ), you can see the
key effect of references in the syntax of the call to f( ), which is
just f(x). Even though this looks like an ordinary pass-by-value, the
effect of the reference is that it actually takes the address and passes it in,
rather than making a copy of the value. The output is:
x = 47
&x = 0065FE00
r = 47
&r = 0065FE00
r = 5
x = 5
So you can see that pass-by-reference
allows a function to modify the outside object, just like passing a pointer does
(you can also observe that the reference obscures the fact that an address is
being passed; this will be examined later in the book). Thus, for this simple
introduction you can assume that references are just a syntactically different
way (sometimes referred to as “syntactic sugar”) to accomplish the
same thing that pointers do: allow functions to change outside
objects.
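To see how much tidier the call becomes, consider a function that exchanges two values through references. This is a sketch only; the names swap_refs and check_swap_refs are hypothetical.

```cpp
#include <cassert>

// Hypothetical helper: exchange two outside objects through references.
void swap_refs(int& a, int& b) {
  int temp = a;  // No dereferencing syntax inside the function
  a = b;
  b = temp;
}

void check_swap_refs() {
  int x = 1, y = 2;
  swap_refs(x, y);  // No address-of operator at the call site
  assert(x == 2 && y == 1);
}
```

Neither the call nor the function body contains a ‘*’ or an ‘&’ operator; only the argument list reveals that addresses are being passed.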
So far, you’ve seen the basic data
types char, int, float, and double, along with the
specifiers signed, unsigned, short, and long, which
can be used with the basic data types in almost any combination. Now we’ve
added pointers and references that are orthogonal to the basic data types and
specifiers, so the possible combinations have just
tripled:
//: C03:AllDefinitions.cpp
// All possible combinations of basic data types,
// specifiers, pointers and references
#include <iostream>
using namespace std;
void f1(char c, int i, float f, double d);
void f2(short int si, long int li, long double ld);
void f3(unsigned char uc, unsigned int ui,
  unsigned short int usi, unsigned long int uli);
void f4(char* cp, int* ip, float* fp, double* dp);
void f5(short int* sip, long int* lip,
  long double* ldp);
void f6(unsigned char* ucp, unsigned int* uip,
  unsigned short int* usip,
  unsigned long int* ulip);
void f7(char& cr, int& ir, float& fr, double& dr);
void f8(short int& sir, long int& lir,
  long double& ldr);
void f9(unsigned char& ucr, unsigned int& uir,
  unsigned short int& usir,
  unsigned long int& ulir);
int main() {} ///:~
Pointers and references also work when
passing objects into and out of functions; you’ll learn about this in a
later chapter.
There’s one other type that works
with pointers: void. If you state that a pointer is
a
void*, it means that any
type of address at all can be assigned to that pointer (whereas if you have an
int*, you can assign only the address of an int variable to that
pointer). For example:
//: C03:VoidPointer.cpp
int main() {
  void* vp;
  char c;
  int i;
  float f;
  double d;
  // The address of ANY type can be
  // assigned to a void pointer:
  vp = &c;
  vp = &i;
  vp = &f;
  vp = &d;
} ///:~
Once you assign to a void* you
lose any information about what type it is. This means that before you can use
the pointer, you must cast it to the correct type:
//: C03:CastFromVoidPointer.cpp
int main() {
  int i = 99;
  void* vp = &i;
  // Can't dereference a void pointer:
  // *vp = 3; // Compile-time error
  // Must cast back to int before dereferencing:
  *((int*)vp) = 3;
} ///:~
The cast (int*)vp takes the
void* and tells the compiler to treat it as an int*, and thus it
can be successfully dereferenced. You might observe that this syntax is ugly,
and it is, but it’s worse than that – the void* introduces a
hole in the language’s type system. That is, it allows, or even promotes,
the treatment of one type as another type. In the example above, I treat an
int as an int by casting vp to an int*, but
there’s nothing that says I can’t cast it to a char* or
double*, which would modify a different amount of storage than had been
allocated for the int, possibly crashing the program. In general,
void pointers should be avoided, and used only in rare special cases, the
likes of which you won’t be ready to consider until significantly later in
the book.
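When a void* cannot be avoided, it helps to keep the cast in one small, clearly named place. The sketch below uses static_cast, one of C++’s named casts, in place of the parenthesized cast shown above; the names store_int and check_store_int are hypothetical. The compiler still cannot verify that the pointer really refers to an int, so that remains the caller’s responsibility.

```cpp
#include <cassert>

// Hypothetical helper: stores through a void* that the caller
// promises really points to an int.
void store_int(void* vp, int value) {
  int* ip = static_cast<int*>(vp);  // Explicit cast back to the true type
  *ip = value;
}

void check_store_int() {
  int i = 99;
  void* vp = &i;     // The type information is lost here
  store_int(vp, 3);  // ...and restored inside store_int
  assert(i == 3);
}
```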
Scoping rules tell you where a variable
is valid, where it is created, and where it gets destroyed (i.e., goes out of
scope). The
scope of a variable extends from the point where it is defined to the first
closing brace that matches the closest opening brace before the variable was
defined. That is, a scope is defined by its “nearest” set of braces.
To illustrate:
//: C03:Scope.cpp
// How variables are scoped
int main() {
  int scp1;
  // scp1 visible here
  {
    // scp1 still visible here
    //.....
    int scp2;
    // scp2 visible here
    //.....
    {
      // scp1 & scp2 still visible here
      //..
      int scp3;
      // scp1, scp2 & scp3 visible here
      // ...
    } // <-- scp3 destroyed here
    // scp3 not available here
    // scp1 & scp2 still visible here
    // ...
  } // <-- scp2 destroyed here
  // scp3 & scp2 not available here
  // scp1 still visible here
  //..
} // <-- scp1 destroyed here
///:~
The example above shows when variables
are visible and when they are unavailable (that is, when they go out of
scope). A variable can be used only when inside its scope. Scopes can be
nested, indicated by matched pairs of braces inside other
matched pairs of braces. Nesting means that you can access a variable in a scope
that encloses the scope you are in. In the example above, the variable
scp1 is available inside all of the other scopes, while scp3 is
available only in the innermost
scope.
As noted earlier in this chapter, there
is a significant difference between C and C++ when defining
variables.
Both languages require that variables be defined before they are used, but C
(before the C99 standard, like many other traditional procedural languages)
forces you to define all the
variables at the beginning of a scope, so that when the compiler creates a block
it can allocate space for those variables.
While reading C code, a block of variable
definitions is usually the first thing you see when entering a scope. Declaring
all variables at the beginning of the block requires the
programmer to write in a particular way because of the implementation details of
the language. Most people don’t know all the variables they are going to
use before they write the code, so they must keep jumping back to the beginning
of the block to insert new variables, which is awkward and causes errors. These
variable definitions don’t usually mean much to the reader, and they
actually tend to be confusing because they appear apart from the context in
which they are used.
C++ (unlike C before C99) allows you to define
variables anywhere in a scope, so you can define a
variable right before you use it. In addition, you can initialize the variable
at the point you define it, which prevents a certain class of errors. Defining
variables this way makes the code much easier to write and reduces the errors
you get from being forced to jump back and forth within a scope. It makes the
code easier to understand because you see a variable defined in the context of
its use. This is especially important when you are defining and initializing a
variable at the same time – you can see the meaning of the initialization
value by the way the variable is used.
You can also define variables inside the
control expressions of for loops and
while loops, inside the conditional of an
if statement, and inside the selector statement of
a switch. Here’s an example showing
on-the-fly variable definitions:
//: C03:OnTheFly.cpp
// On-the-fly variable definitions
#include <iostream>
using namespace std;
int main() {
  //..
  { // Begin a new scope
    int q = 0; // C requires definitions here
    //..
    // Define at point of use:
    for(int i = 0; i < 100; i++) {
      q++; // q comes from a larger scope
      // Definition at the end of the scope:
      int p = 12;
    }
    int p = 1; // A different p
  } // End scope containing q & outer p
  cout << "Type characters:" << endl;
  while(char c = cin.get() != 'q') {
    cout << c << " wasn't it" << endl;
    if(char x = c == 'a' || c == 'b')
      cout << "You typed a or b" << endl;
    else
      cout << "You typed " << x << endl;
  }
  cout << "Type A, B, or C" << endl;
  switch(int i = cin.get()) {
    case 'A': cout << "Snap" << endl; break;
    case 'B': cout << "Crackle" << endl; break;
    case 'C': cout << "Pop" << endl; break;
    default: cout << "Not A, B or C!" << endl;
  }
} ///:~
In the innermost scope, p is
defined right before the scope ends, so it is really a useless gesture (but it
shows you can define a variable anywhere). The p in the outer scope is in
the same situation.
The definition of i in the control
expression of the for loop is an example of being able to define a
variable exactly at the point you need it (you can do this only in C++).
The scope of i is the scope of the expression controlled by the
for loop, so you can turn around and re-use i in the next
for loop. This is a convenient and commonly-used idiom in C++; i
is the classic name for a loop counter and you don’t have to keep
inventing new names.
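Here is a small sketch of that idiom, with the same name used for the counter of two successive loops. The function name sum_twice is hypothetical.

```cpp
#include <cassert>

// Hypothetical helper: two loops, each with its own counter named i.
int sum_twice() {
  int total = 0;
  for (int i = 0; i < 3; i++)    // First i takes the values 0, 1, 2
    total += i;
  for (int i = 10; i < 12; i++)  // A new, unrelated i: 10, 11
    total += i;
  assert(total == 0 + 1 + 2 + 10 + 11);
  return total;
}
```

Each i goes out of scope when its loop ends, so the second definition does not clash with the first.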
Although the example also shows variables
defined within while, if, and switch statements, this kind
of definition is much less common than those in for expressions, possibly
because the syntax is so constrained. For example, you cannot have any
parentheses. That is, you cannot say:
while((char c = cin.get()) != 'q')
The addition of the extra parentheses
would seem like an innocent and useful thing to do, and because you cannot use
them, the results are not what you might like. The problem occurs because
‘!=’ has a higher precedence than ‘=’, so
the char c ends up containing a bool converted to
char. When that’s printed, on many terminals you’ll see a
smiley-face character.
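The precedence problem, and one common way around it, can be sketched as follows. The work-around is simply to define the variable before the loop, where the parentheses are allowed. The names precedence_trap and count_before_q are hypothetical, and a std::istringstream stands in for cin so that the input is predictable.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Shows what the unparenthesized definition really stores.
char precedence_trap(char input) {
  char c = input != 'q';  // Parsed as: char c = (input != 'q');
  return c;               // Holds 1 or 0, not the input character
}

// Hypothetical helper: count the characters read before the first 'q'
// (or before the end of the input).
int count_before_q(const std::string& text) {
  std::istringstream in(text);
  int count = 0;
  int c;  // Defined before the loop so the assignment can be parenthesized
  while ((c = in.get()) != 'q' && c != std::char_traits<char>::eof())
    ++count;
  return count;
}
```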
In general, you can consider the ability
to define variables within while, if, and switch statements
as being there for completeness, but the only place you’re likely to use
this kind of variable definition is in a for loop (where you’ll use
it quite
often).
When creating a variable, you have a
number of options to specify the lifetime of the variable, how the storage is
allocated for that variable, and how the variable is treated by the
compiler.
Global variables are defined outside all
function bodies and are available to all parts of the program (even code in
other files). Global variables are unaffected by scopes and are always available
(i.e., the lifetime of a global variable lasts until the program ends). If the
existence of a global variable in one file is declared using the
extern keyword in another
file, the data is available for use by the second file. Here’s an example
of the use of global variables:
//: C03:Global.cpp
//{L} Global2
// Demonstration of global variables
#include <iostream>
using namespace std;
int globe;
void func();
int main() {
  globe = 12;
  cout << globe << endl;
  func(); // Modifies globe
  cout << globe << endl;
} ///:~
Here’s a file that accesses
globe as an extern:
//: C03:Global2.cpp {O}
// Accessing external global variables
extern int globe;
// (The linker resolves the reference)
void func() {
  globe = 47;
} ///:~
Storage for the variable globe is
created by the definition in Global.cpp, and that same variable is
accessed by the code in Global2.cpp. Since the code in Global2.cpp
is compiled separately from the code in Global.cpp, the compiler must be
informed that the variable exists elsewhere by the declaration
extern int globe;
When you run the program, you’ll
see that the call to func( ) does indeed affect the single global
instance of globe.
In Global.cpp, you can see a special comment tag:
//{L} Global2
This says that to create the final
program, the object file with the name Global2 must be linked in (there
is no extension because the extension names of object files differ from one
system to the next). In Global2.cpp, the first line has another special
comment tag {O}, which says “Don’t try to create an
executable out of this file, it’s being compiled so that it can be linked
into some other executable.” The ExtractCode.cpp program in Volume
2 of this book (downloadable at www.BruceEckel.com) reads these tags and
creates the appropriate makefile so everything compiles properly
(you’ll learn about makefiles at the end of this
chapter).
Local variables occur within a scope;
they are “local” to a function. They are often called
automatic variables because
they automatically come into being when the scope is entered and automatically
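go away when the scope closes. In C, and in C++ before C++11, the keyword
auto makes this explicit,
but local variables default to auto so it is never necessary to declare
something as an auto. (Since C++11, auto instead asks the compiler to deduce
a variable’s type from its initializer.)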
A register variable is a type of local
variable. The register keyword tells the compiler
“Make accesses to this variable as fast as possible.” Increasing the
access speed is implementation dependent, but, as the name suggests, it is often
done by placing the variable in a register. There is no guarantee that the
variable will be placed in a register or even that the access speed will
increase. It is a hint to the compiler.
There are restrictions to the use of
register variables. In C, you cannot take or compute the address of a
register variable (C++ permits it, but doing so makes it even less likely
that the variable will live in a register). A register variable can be declared only within
a block (you cannot have global or static register variables). You
can, however, use a register variable as a formal argument in a function
(i.e., in the argument list).
In general, you shouldn’t try to
second-guess the compiler’s optimizer, since it will probably do a better
job than you can. Thus, the register keyword is best
avoided; it was deprecated in C++11 and removed as a storage class specifier in C++17.
The
static keyword has several
distinct meanings. Normally, variables defined local to a function disappear at
the end of the function scope. When you call the function again, storage for the
variables is created anew and the values are re-initialized. If you want a value
to be extant throughout the life of a program, you can define a function’s
local variable to be static and give it an initial value. The
initialization is performed only the first time the function is called, and the
data retains its value between function calls. This way, a function can
“remember” some piece of information between function
calls.
You may wonder why a global variable
isn’t used instead. The beauty of a static variable is that it is
unavailable outside the scope of the function, so it can’t be
inadvertently changed. This localizes errors.
Here’s an example of the use of
static variables:
//: C03:Static.cpp
// Using a static variable in a function
#include <iostream>
using namespace std;
void func() {
  static int i = 0;
  cout << "i = " << ++i << endl;
}
int main() {
  for(int x = 0; x < 10; x++)
    func();
} ///:~
Each time func( ) is called
in the for loop, it prints a different value. If the keyword static is
not used, the value printed will always be ‘1’.
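The difference between the two cases can be put side by side in a sketch like the following. The names next_id, fresh_each_call, and check_counters are hypothetical.

```cpp
#include <cassert>

// Hypothetical helper: the static local keeps its value between calls.
int next_id() {
  static int id = 0;  // Initialized only the first time through
  return ++id;
}

// The same code without static starts over on every call.
int fresh_each_call() {
  int id = 0;  // Re-created and re-initialized each time
  return ++id;
}

void check_counters() {
  int first = next_id();
  int second = next_id();
  assert(second == first + 1);     // The value was remembered
  assert(fresh_each_call() == 1);
  assert(fresh_each_call() == 1);  // Nothing was remembered
}
```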
The second meaning of static is
related to the first in the “unavailable outside a certain scope”
sense. When static is applied to a function name or to a variable that is
outside of all functions, it means “This name is unavailable outside of
this file.” The function name or variable is local to the file; we say it
has file
scope.
As a demonstration, compiling and linking the following two files will cause a
linker error:
//: C03:FileStatic.cpp
// File scope demonstration. Compiling and
// linking this file with FileStatic2.cpp
// will cause a linker error
// File scope means only available in this file:
static int fs;
int main() {
  fs = 1;
} ///:~
Even though the variable fs is
claimed to exist as an extern in the following
file, the linker won’t find it because it has been declared static
in FileStatic.cpp.
//: C03:FileStatic2.cpp {O}
// Trying to reference fs
extern int fs;
void func() {
  fs = 100;
} ///:~
The static specifier may also be
used inside a class. This explanation will be delayed until you learn to
create classes, later in the
book.
The
extern keyword has already
been briefly described and demonstrated. It tells the compiler that a variable
or a function exists, even if the compiler hasn’t yet seen it in the file
currently being compiled. This variable or function may be defined in another
file or further down in the current file. As an example of the
latter:
//: C03:Forward.cpp
// Forward function & data declarations
#include <iostream>
using namespace std;
// This is not actually external, but the
// compiler must be told it exists somewhere:
extern int i;
extern void func();
int main() {
  i = 0;
  func();
}
int i; // The data definition
void func() {
  i++;
  cout << i;
} ///:~
When the compiler encounters the
declaration ‘extern int i’, it knows that the definition for
i must exist somewhere as a global variable. When the compiler reaches
the definition of i, no other declaration is visible, so it knows it has
found the same i declared earlier in the file. If you were to define
i as static, you would be telling the compiler that i is
defined globally (via the extern), but it also has file
scope (via the static), so the compiler will
generate an error.
To understand the behavior of C and C++
programs, you need to know about linkage. In an executing program, an
identifier is represented by storage in memory that holds a variable or a
compiled function body. Linkage describes this storage as it is seen by the
linker. There are two types of linkage: internal
linkage and external
linkage.
Internal linkage means that storage is
created to represent the identifier only for the file being compiled. Other
files may use the same identifier name with internal linkage, or for a global
variable, and no conflicts will be found by the linker – separate storage
is created for each identifier. Internal linkage is specified by the keyword
static in C and C++.
External linkage means that a single
piece of storage is created to represent the identifier for all files being
compiled. The storage is created once, and the linker must resolve all other
references to that storage. Global variables and function names have external
linkage. These are accessed from other files by declaring them with the keyword
extern. Variables defined outside all functions (with the exception of
const in C++) and function definitions default to external linkage. You
can specifically force them to have internal linkage using the static
keyword. You can explicitly state that an identifier has external linkage by
defining it with the extern keyword. Defining a variable or function with
extern is not necessary in C, but it is sometimes necessary for
const in C++.
Automatic (local) variables exist only
temporarily, on the stack, while a function is being called. The linker
doesn’t know about automatic
variables, and so these have no
linkage.
In old (pre-Standard) C, if you wanted to make a constant, you had to use the
preprocessor:
#define PI 3.14159
Everywhere you used PI, the value
3.14159 was substituted by the preprocessor (you can still use this method in C
and C++).
When you use the preprocessor to create
constants, you place control of those constants outside the scope of the
compiler. No type checking is performed on the name
PI and you can’t take the address of PI (so you can’t
pass a pointer or a reference to
PI). PI cannot be a variable of a user-defined type. The meaning
of PI lasts from the point it is defined to the end of the file; the
preprocessor doesn’t recognize scoping.
C++ introduces the concept of a named
constant that is just like a
variable, except that its value cannot be changed. The modifier
const tells the compiler
that a name represents a constant. Any data type, built-in or user-defined, may
be defined as const. If you define something as const and then
attempt to modify it, the compiler will generate an error.
You must specify the type of a
const, like this:
const int x = 10;
In Standard C and C++, you can use a
named constant in an argument list, even if the argument it fills is a pointer
or a reference (i.e., you can take the address of a const). A
const has a scope, just like a regular variable, so you can
“hide” a const inside a function and be sure that the name
will not affect the rest of the program.
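The contrast between a preprocessor constant and a named constant can be sketched as follows. The names pi, bufsize, read_through, and check_named_constants are hypothetical; only PI comes from the text.

```cpp
#include <cassert>

#define PI 3.14159          // Preprocessor constant, as in the text
const double pi = 3.14159;  // Named constant with a type and an address
const int bufsize = 100;    // Another named constant

// Hypothetical helper: accepts the address of a const double.
double read_through(const double* p) { return *p; }

void check_named_constants() {
  assert(read_through(&pi) == PI);  // A const has an address to pass
  // read_through(&PI);             // Error: the literal has no address
  int buf[bufsize] = {0};           // A const int can size an array
  assert(sizeof(buf) == bufsize * sizeof(int));
  {
    const int bufsize = 10;         // Hides the outer bufsize in this block
    assert(bufsize == 10);
  }
  assert(bufsize == 100);           // The outer constant is unaffected
}
```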
The const was taken from C++ and
incorporated into Standard C, albeit quite differently. In C, the compiler
treats a const just like a variable that has a special tag attached that
says “Don’t change me.” When you define a const in C,
the compiler creates storage for it, so if you define more than one const
with the same name in two different files (or put the definition in a header
file), the linker will generate error messages about conflicts. The intended use
of const in C is quite different from its intended use in C++ (in short,
it’s nicer in C++).
In C++, a const must always have
an initialization value (in C, this is not true). Constant values for built-in
types are expressed as decimal,
octal, hexadecimal, or
floating-point numbers (binary constants, written with a 0b prefix, were not
added until C++14), or as characters.
In the absence of any other clues, the
compiler assumes a constant value is a decimal number. The numbers 47, 0, and
1101 are all treated as decimal numbers.
A constant value with a leading 0 is
treated as an octal number (base 8). Base 8 numbers can contain only digits 0-7;
the compiler flags other digits as an error. A legitimate octal number is 017
(15 in base 10).
A constant value with a leading 0x is
treated as a hexadecimal number (base 16). Base 16 numbers contain the digits
0-9 and a-f or A-F. A legitimate hexadecimal number is 0x1fe (510 in base
10).
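The three notations can be compared directly, since they are only different spellings of the same integer values. This is a small sketch; the function name check_integer_constants is hypothetical.

```cpp
#include <cassert>

// Hypothetical helper: each assertion compares two spellings
// of the same integer value.
void check_integer_constants() {
  assert(017 == 15);              // Leading 0: octal
  assert(0x1fe == 510);           // Leading 0x: hexadecimal
  assert(0X1FE == 0x1fe);         // Prefix and digits ignore case
  assert(1101 == 1000 + 100 + 1); // No prefix: decimal, not binary
}
```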
Floating point numbers can contain
decimal points and exponential powers (represented by e,
which means “10 to the power of”). Both the decimal point and the
e are optional. If you assign a constant to a floating-point variable,
the compiler will take the constant value and convert it to a floating-point
number (this process is one form of what’s called implicit type
conversion). However, it is a
good idea to use either a decimal point or an e to remind the reader that
you are using a floating-point number; some older compilers also need the
hint.
Legitimate floating-point constant values
are: 1e4, 1.0001, 47.0, 0.0, and -1.159e-77. You can add suffixes to force the
type of floating-point number: f or F forces a float,
L or l forces a long double; otherwise the number
will be a
double.
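The effect of the suffixes can be checked at compile time. The following sketch assumes a C++11 (or later) compiler, since it relies on decltype and static_assert; the functions converted( ) and exponent( ) are hypothetical names used only here.

```cpp
#include <type_traits>

// decltype reports the type the compiler gives each constant.
static_assert(std::is_same<decltype(1e4), double>::value,
              "no suffix: double");
static_assert(std::is_same<decltype(47.0f), float>::value,
              "f or F: float");
static_assert(std::is_same<decltype(47.0L), long double>::value,
              "L or l: long double");

// An integer constant assigned to a floating-point variable
// is implicitly converted to floating point.
double converted() {
  double d = 47;  // 47 is an int constant; d holds 47.0
  return d;
}

// The e notation means "10 to the power of", so 1e4 is 10000.
double exponent() { return 1e4; }
```

If any of the static_assert lines were wrong, the file would not compile, so the block doubles as a check of the rules stated above.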
Character
constants are characters surrounded by single quotes, as:
'A', '0', ' '. Notice there is
a big difference between the character '0' (ASCII 48) and the
value 0. Special characters are represented with the “backslash
escape”: '\n' (newline), '\t' (tab),
'\\' (backslash), '\r' (carriage return),
'\"' (double quotes), '\'' (single quote),
etc. You can also express char constants in octal: '\17' or
hexadecimal:
'\xff'.
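As a rough illustration of character constants, the following functions return the integer codes behind a few of them. The function names are invented for this sketch, and the codes for '0' and '\n' assume an ASCII-based character set (which is what almost every current platform uses).

```cpp
// The character '0' is not the value 0; in ASCII its code is 48.
int digit_char_code() { return '0'; }

// Subtracting '0' converts a digit character to its numeric value.
int digit_to_number(char c) { return c - '0'; }

// Escape sequences are single characters; newline is code 10 in ASCII.
int newline_code() { return '\n'; }

// Octal escape: '\17' is octal 17, which is 15 in base 10.
int octal_escape() { return '\17'; }

// Hexadecimal escape: '\xff' is 255 when viewed as an unsigned char
// (plain char may be signed, so the cast makes the result predictable).
int hex_escape() { return static_cast<unsigned char>('\xff'); }
```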
Whereas the qualifier const tells
the compiler “This never changes” (which allows the compiler to
perform extra optimizations), the qualifier
volatile tells the compiler “You never know
when this will change,” and prevents the compiler from performing any
optimizations based on the stability of that variable. Use this keyword when you
read some value outside the control of your code, such as a register in a piece
of communication hardware. A volatile variable is always read whenever
its value is required, even if it was just read the line
before.
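A special case of some storage being
“outside the control of your code” is in a multithreaded program. If
you’re watching a particular flag that is modified by another thread or
process, that flag should be volatile so the compiler doesn’t make
the assumption that it can optimize away multiple reads of the
flag. Keep in mind that volatile only prevents the reads from being
optimized away; it does not make access to the flag atomic or ordered
between threads. In C++11 and later, std::atomic is the standard tool
for a flag shared between threads.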
Note that volatile may have no
effect when a compiler is not optimizing, but may prevent critical bugs when you
start optimizing the code (which is when the compiler will begin looking for
redundant reads).
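A minimal sketch of the idea, using an ordinary global as a stand-in for a hardware register. The names status_register and read_twice( ) are hypothetical; in real code the variable would be mapped to a device address supplied by the platform.

```cpp
// Stand-in for a hardware status register (an assumption for this sketch).
volatile int status_register = 0;

// Because status_register is volatile, each use below is a real read
// from memory; the compiler may not reuse the first value for the second.
int read_twice() {
  int first = status_register;
  int second = status_register;  // read again, even though it was just read
  return first + second;
}
```

Without volatile, an optimizing compiler would be free to read the variable once and use that value for both first and second.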
This section covers all the operators in
C and C++.
All operators produce a value from their
operands. This value is produced without modifying the operands, except with the
assignment, increment, and decrement operators. Modifying an operand is called a
side effect. The most common use for operators
that modify their operands is to generate the side effect, but you should keep
in mind that the value produced is available for your use just as in operators
without side
effects.
Assignment is performed with the operator
=. It means “Take the right-hand side (often
called the rvalue) and copy it into the left-hand
side (often called the lvalue).” An rvalue
is any constant, variable, or expression that can produce a value, but an lvalue
must be a distinct, named variable (that is, there must be a physical space in
which to store data). For instance, you can assign a constant value to a
variable (A = 4;), but you cannot assign anything to a constant value;
it cannot be an lvalue (you can’t say 4 =
A;).
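The rule can be seen in a few lines. This is only a sketch; assign_demo( ) is a made-up name, and the illegal assignment is left in a comment so the code still compiles.

```cpp
int assign_demo() {
  int A;
  A = 4;          // a constant as the rvalue
  int B = A;      // a variable as the rvalue
  A = B * 2 + 1;  // an expression as the rvalue
  // 4 = A;       // error: the constant 4 is not an lvalue
  return A;       // A now holds 9
}
```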
The basic mathematical operators are the
same as the ones available in most programming languages: addition
(+),
subtraction (-),
division (/),
multiplication (*), and
modulus (%; this produces
the remainder from integer division). Integer division truncates the result (it
doesn’t round). The modulus operator cannot be used with floating-point
numbers.
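Truncation and the modulus operator can be illustrated as follows. The function names are invented for this sketch. The use of std::fmod( ) from the Standard C library is shown as one way to get a floating-point remainder, since % is not available there. The result for a negative operand assumes C++11 or later, where integer division is defined to truncate toward zero.

```cpp
#include <cmath>

// Integer division discards the fractional part: 7 / 2 is 3, not 3.5 or 4.
int quotient(int a, int b) { return a / b; }

// Modulus produces the remainder from integer division: 7 % 2 is 1.
int remainder_of(int a, int b) { return a % b; }

// The % operator does not accept floating-point operands;
// std::fmod( ) computes the floating-point remainder instead.
double float_remainder(double a, double b) { return std::fmod(a, b); }
```

For example, quotient(7, 2) produces 3 and quotient(-7, 2) produces -3, because the result is truncated rather than rounded.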
C and C++ also use a shorthand notation
to perform an operation and an assignment at the same time. This is denoted by
an operator followed by an equal sign, and is consistent with all the operators
in the language (whenever it makes sense). For example, to add 4 to the variable
x and assign the result to x, you say: x += 4;.
This example shows the use of the
mathematical operators:
//: C03:Mathops.cpp
// Mathematical operators
#include <iostream>
using namespace std;
// A macro to display a string and a value.
#define PRINT(STR, VAR) \
  cout << STR " = " << VAR << endl
int main() {
  int i, j, k;
  float u, v, w; // Applies to doubles, too
  cout << "enter an integer: ";
  cin >> j;
  cout << "enter another integer: ";
  cin >> k;
  PRINT("j",j); PRINT("k",k);
  i = j + k; PRINT("j + k",i);
  i = j - k; PRINT("j - k",i);
  i = k / j; PRINT("k / j",i);
  i = k * j; PRINT("k * j",i);
  i = k % j; PRINT("k % j",i);
  // The following only works with integers:
  j %= k; PRINT("j %= k", j);
  cout << "Enter a floating-point number: ";
  cin >> v;
  cout << "Enter another floating-point number:";
  cin >> w;
  PRINT("v",v); PRINT("w",w);
  u = v + w; PRINT("v + w", u);
  u = v - w; PRINT("v - w", u);
  u = v * w; PRINT("v * w", u);
  u = v / w; PRINT("v / w", u);
  // The following works for ints, chars,
  // and doubles too:
  PRINT("u", u); PRINT("v", v);
  u += v; PRINT("u += v", u);
  u -= v; PRINT("u -= v", u);
  u *= v; PRINT("u *= v", u);
  u /= v; PRINT("u /= v", u);
} ///:~
The rvalues of all the assignments can,
of course, be much more complex.
Notice the use of the macro
PRINT( ) to save typing (and typing errors!). Preprocessor macros
are traditionally named with all uppercase letters so they stand out –
you’ll learn later that macros can quickly become dangerous (and they can
also be very useful).
The arguments in the parenthesized list following the macro name are substituted in all the code following the closing parenthesis. The preprocessor removes the name PRINT and substitutes the code wherever the macro is called, so the compiler cannot generate any error messages using the macro name, and it doesnR