Whether it's named directly or indirectly, and whether it's in a variable, or an array element, or is just a temporary value, a scalar always contains a single value. This value may be a number, a string, or a reference to another piece of data. Or, there might even be no value at all, in which case the scalar is said to be undefined. Although we might speak of a scalar as "containing" a number or a string, scalars are typeless: you are not required to declare your scalars to be of type integer or floating-point or string or whatever.[9]
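As a minimal sketch of this idea, the same scalar variable can hold no value at all, then a number, then a string, then a reference. The variable names here are purely illustrative.

```perl
use strict;
use warnings;

my $thing;                                  # declared, but holds no value yet
my $was_defined = defined($thing) ? 1 : 0;  # 0: the scalar is undefined

$thing = 42;                                # now a number
$thing = "forty-two";                       # now a string; no type declaration needed
$thing = [1, 2, 3];                         # now a reference to an anonymous array

my $is_defined = defined($thing) ? 1 : 0;   # 1: the scalar holds a value
my $kind       = ref($thing);               # "ARRAY"

print "was_defined=$was_defined is_defined=$is_defined kind=$kind\n";
```

The defined function is the usual way to tell an undefined scalar apart from one that holds 0 or an empty string.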
Perl stores strings as sequences of characters, with no arbitrary constraints on length or content. In human terms, you don't have to decide in advance how long your strings are going to get, and you can include any characters, including null bytes, within your string. Perl stores numbers as signed integers if possible, or as double-precision floating-point values in the machine's native format otherwise. Floating-point values are not infinitely precise. This is important to remember because comparisons like (10/3 == 1/3*10) tend to fail mysteriously.
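The following sketch illustrates both points: a string containing a null byte, and the floating-point comparison mentioned above. The nearly_equal subroutine and its default tolerance are hypothetical, chosen only for illustration. Comparing within a relative tolerance is one common workaround, not the only one.

```perl
use strict;
use warnings;

# Strings may contain any characters, including null bytes.
my $packed = "ab\0cd";
my $len    = length($packed);      # 5: the null byte counts as a character

# Mathematically equal, but each side is rounded differently along the way.
my $lhs = 10 / 3;
my $rhs = 1 / 3 * 10;
my $exactly_equal = ($lhs == $rhs) ? 1 : 0;   # typically 0 with IEEE 754 doubles

# nearly_equal is a hypothetical helper, not a Perl built-in.
# It compares two numbers within a relative tolerance.
sub nearly_equal {
    my ($x, $y, $tolerance) = @_;
    $tolerance = 1e-9 unless defined $tolerance;   # assumed default
    my $scale = abs($x) > abs($y) ? abs($x) : abs($y);
    return abs($x - $y) <= $tolerance * $scale ? 1 : 0;
}

my $close_enough = nearly_equal($lhs, $rhs);       # 1

printf "length=%d exact=%d close=%d\n", $len, $exactly_equal, $close_enough;
```

How tight the tolerance should be depends on the calculation, so treat 1e-9 as a placeholder rather than a recommendation.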
Perl converts between the various subtypes as needed, so you can treat a number as a string or a string as a number, and Perl will do the Right Thing. To convert from string to number, Perl internally uses something like the C library's atof(3) function. To convert from number to string, it does the equivalent of an sprintf(3) with a format of "%.14g" on most machines. Improper conversions of a nonnumeric string like foo to a number count as numeric 0; these trigger warnings if you have them enabled, but are silent otherwise. See Chapter 5 for examples of detecting what sort of data a string holds.
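A short sketch of these conversions follows. To test whether a string looks numeric, it uses looks_like_number from the core Scalar::Util module rather than the pattern-matching approach of Chapter 5, so read it as one possible technique. The variable names are illustrative.

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

my $sum  = "3" + "4";        # strings used as numbers: 7
my $text = 10 . "";          # number used as a string: "10"

my ($from_word, $from_mixed);
{
    no warnings 'numeric';   # silence the "isn't numeric" warnings in this block
    my $word  = "foo";
    my $mixed = "12abc";
    $from_word  = $word + 0;     # 0: no leading numeric part
    $from_mixed = $mixed + 0;    # 12: only the leading numeric part is used
}

# Keep only the strings that would convert cleanly.
my @numeric = grep { looks_like_number($_) } ("42", "1e3", "foo", "12abc");

print "sum=$sum text=$text word=$from_word mixed=$from_mixed numeric=@numeric\n";
```

Without the no warnings 'numeric' line, both conversions inside the block would still produce the same values but would also emit a warning.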
Although strings and numbers are interchangeable for nearly all intents, references are a bit different. They're strongly typed, uncastable pointers with built-in reference-counting and destructor invocation. That is, you can use them to create complex data types, including user-defined objects. But they're still scalars, for all that, because no matter how complicated a data structure gets, you often want to treat it as a single value.
By uncastable, we mean that you can't, for instance, convert a reference to an array into a reference to a hash. References are not castable to other pointer types. However, if you use a reference as a number or a string, you will get a numeric or string value, which is guaranteed to retain the uniqueness of the reference even though the "referenceness" of the value is lost when the value is copied from the real reference. You can compare such values or extract their type. But you can't do much else with the values, since there's no way to convert numbers or strings back into references. Usually, this is not a problem, because Perl doesn't force you to do pointer arithmetic--or even allow it. See Chapter 8 for more on references.
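The sketch below illustrates what can and cannot be done with a reference used as a string or number. The variable names are illustrative, and the address shown in the comment is only an example of the general shape of the string.

```perl
use strict;
use warnings;

my $array_ref = [1, 2, 3];
my $same_ref  = $array_ref;           # a copy of the reference; same array
my $other_ref = [1, 2, 3];            # equal contents, but a different array

my $type      = ref($array_ref);      # "ARRAY"
my $as_string = "$array_ref";         # something like "ARRAY(0x55d0c8a1b2c0)"

# Numeric comparison tells us whether two references point at the same thing.
my $same      = ($array_ref == $same_ref)  ? 1 : 0;   # 1
my $different = ($array_ref == $other_ref) ? 1 : 0;   # 0

# The string form has lost its "referenceness": under strict refs,
# trying to dereference it raises an exception, which eval traps.
my $deref_ok = eval { my @copy = @{$as_string}; 1 } ? 1 : 0;   # 0

print "type=$type same=$same different=$different deref_ok=$deref_ok\n";
```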
Numeric literals are specified in any of several customary[10] floating-point or integer formats:
    $x = 12345;                # integer
    $x = 12345.67;             # floating point
    $x = 6.02e23;              # scientific notation
    $x = 4_294_967_296;        # underline for legibility
    $x = 0377;                 # octal
    $x = 0xffff;               # hexadecimal
    $x = 0b1100_0000;          # binary
Because Perl uses the comma as a list separator, you cannot use it to separate the thousands in a large number. Perl does allow you to use an underscore character instead. The underscore only works within literal numbers specified in your program, not for strings functioning as numbers or data read from somewhere else. Similarly, the leading 0x for hexadecimal, 0b for binary, and 0 for octal work only for literals. The automatic conversion of a string to a number does not recognize these prefixes--you must do an explicit conversion[11] with the oct function--which works for hex and binary numbers, too, as it happens, provided you supply the 0x or 0b on the front.
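To make the difference concrete, here is a small sketch contrasting explicit conversion with oct against the automatic string-to-number conversion. The input strings stand in for data read from somewhere else, and the variable names are illustrative.

```perl
use strict;
use warnings;

# Explicit conversion: oct understands the 0x and 0b prefixes as well as octal.
my $hex    = oct("0xffff");          # 65535
my $binary = oct("0b11000000");      # 192
my $octal  = oct("0377");            # 255

# Automatic conversion ignores prefixes and underscores.
my ($plain_hex, $plain_octal, $plain_underscore);
{
    no warnings 'numeric';           # these conversions would otherwise warn
    my @input = ("0xffff", "0377", "4_294");
    $plain_hex        = $input[0] + 0;   # 0: conversion stops after the leading 0
    $plain_octal      = $input[1] + 0;   # 377: read as a decimal number
    $plain_underscore = $input[2] + 0;   # 4: conversion stops at the underscore
}

print "oct: $hex $binary $octal\n";
print "automatic: $plain_hex $plain_octal $plain_underscore\n";
```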
String literals are usually surrounded by either single or double quotes. They work much like Unix shell quotes: double-quoted string literals are subject to backslash and variable interpolation, but single-quoted strings are not (except for \' and \\, so that you can embed single quotes and backslashes into single-quoted strings). If you want to embed any other backslash sequences such as \n (newline), you must use the double-quoted form. (Backslash sequences are also known as escape sequences, because you "escape" the normal interpretation of characters temporarily.)
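A brief sketch of the difference between the two quoting styles, using illustrative variable names:

```perl
use strict;
use warnings;

my $name = "world";

my $double  = "Hello, $name\n";        # variable and \n are both interpolated
my $single  = 'Hello, $name\n';        # taken literally: dollar sign, backslash, n
my $escaped = 'It\'s a back\\slash';   # the two escapes single quotes do honor

print $double;                         # Hello, world
print $single, "\n";                   # Hello, $name\n
print $escaped, "\n";                  # It's a back\slash
```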
A single-quoted string must be separated from a preceding word by a space because a single quote is a valid--though archaic--character in an identifier. Its use has been replaced by the more visually distinct :: sequence. That means that $main'var and $main::var are the same thing, but the second is generally considered easier to read for people and programs.
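The equivalence can be sketched as follows. This assumes a perl that still accepts the apostrophe form by default; more recent releases may warn about it or put it behind a feature, which is why the sketch disables deprecation warnings. The variable names are illustrative.

```perl
use strict;
use warnings;
no warnings 'deprecated';     # some newer perls warn about the apostrophe form

our $var = "shared";          # the package variable $main::var

my $modern  = $main::var;     # the preferred spelling
my $archaic = $main'var;      # the same variable, archaic spelling

print "modern=$modern archaic=$archaic\n";
```

In new code the :: form is the safer choice, since it works everywhere and does not confuse readers or syntax highlighters.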
Double-quoted strings are subject to various forms of character interpolation, many of which will be familiar to programmers of other languages. These are listed in Table 2.1.