This chapter begins our introduction to the Java language syntax. Because readers come to this book with different levels of programming experience, it is difficult to set the right level for all audiences. We have tried to strike a balance between giving a thorough tour with several examples of the language syntax for beginners and providing enough background information so that a more experienced reader can quickly gauge the differences between Java and other languages. Since Java’s syntax is derived from C, we make some comparisons to features of that language, but no prior knowledge of C is necessary. Chapter 5 will build on this chapter by talking about Java’s object-oriented side and complete the discussion of the core language. Chapter 7 discusses generics, a feature that enhances the way types work in the Java language, allowing you to write certain kinds of classes more flexibly and safely. After that, we dive into the Java APIs and see what we can do with the language. The rest of this book is filled with concise examples that do useful things in a variety of areas. If you are left with any questions after these introductory chapters, we hope they’ll be answered as you look at the code. There is always more to learn, of course! We’ll try to point out other resources along the way that might benefit folks looking to continue their Java journey beyond the topics we cover.
For readers just beginning their programming journey, the web will likely be a constant companion. Many, many sites, Wikipedia articles, blog posts, and, well, the entirety of Stack Overflow can help you dig into particular topics or answer small questions that might arise. For example, while this book covers the Java language and how to start writing useful programs with Java and its tools, we don’t cover lower-level, core components of programming such as algorithms. These programming fundamentals will naturally appear in our discussions and code examples, but you might enjoy a few hyperlink tangents to help cement certain details or fill in gaps we must necessarily leave.
Java is a language for the Internet. Since the citizens of the Net speak and write in many different human languages, Java must be able to handle a large number of languages as well. One of the ways in which Java supports internationalization is through the Unicode character set. Unicode is a worldwide standard that supports the scripts of most languages. The latest version of Java bases its character and string data on the Unicode 6.0 standard, which uses at least two bytes to represent each symbol internally.
Java source code can be written using Unicode and stored in any number of character encodings, ranging from a full binary form to ASCII-encoded Unicode character values. This makes Java a friendly language for non-English-speaking programmers who can use their native language for class, method, and variable names just as they can for the text displayed by the application.
The Java char type and String class natively support Unicode values. Internally, the text is stored using either char[] or byte[]; however, the Java language and APIs make this transparent to you and you will not generally have to think about it. Unicode is also very ASCII-friendly (ASCII is the most common character encoding for English). The first 256 characters are defined to be identical to the first 256 characters in the ISO 8859-1 (Latin-1) character set, so Unicode is effectively backward-compatible with the most common English character sets. Furthermore, one of the most common file encodings for Unicode, called UTF-8, preserves ASCII values in their single byte form. This encoding is used by default in compiled Java class files, so storage remains compact for English text.
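To see the compactness of UTF-8 for yourself, here is a minimal sketch that counts the bytes a string occupies in that encoding. The class name Utf8Sizes and its helper method are our own inventions for illustration; only String.getBytes() and StandardCharsets come from the standard library.

```java
import java.nio.charset.StandardCharsets;

class Utf8Sizes {
    // Returns how many bytes the text occupies when encoded as UTF-8.
    static int utf8Length(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        System.out.println(utf8Length("Java"));    // 4: one byte per ASCII letter
        System.out.println(utf8Length("\u00e9"));  // 2: e with an acute accent
        System.out.println(utf8Length("\u20ac"));  // 3: the euro sign
    }
}
```

Plain ASCII text costs one byte per character, while characters further along in the Unicode range take two or more.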
Most platforms can’t display all currently defined Unicode characters. As a result, Java programs can be written with special Unicode escape sequences. A Unicode character can be represented with this escape sequence:
\uxxxx
Here, xxxx is a sequence of exactly four hexadecimal digits. The escape sequence indicates an ASCII-encoded Unicode character. This is also the form Java uses to output (print) Unicode characters in an environment that doesn’t otherwise support them. Java also comes with classes to read and write Unicode character streams in specific encodings, including UTF-8.
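As a quick illustration, the following sketch uses escape sequences inside char and String literals. The class and constant names are hypothetical; the point is simply that the compiler translates each escape into the character it names before doing anything else.

```java
class UnicodeEscapes {
    // The escape for U+0041 is translated to the letter A at compile time.
    static final char LETTER_A = '\u0041';

    // A string written entirely with escapes; it spells "Java".
    static final String JAVA = "\u004a\u0061\u0076\u0061";

    public static void main(String[] args) {
        System.out.println(LETTER_A == 'A');       // true
        System.out.println(JAVA.equals("Java"));   // true
        System.out.println((int) '\u00e9');        // 233, the code for e with an acute accent
    }
}
```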
As with many long-lived standards in the tech world, Unicode was originally designed with so much extra space that no conceivable character encoding could ever possibly require more than 64K characters. Sigh. Naturally we have sailed past that limit and some UTF-32 encodings are in popular circulation. Most notably, emoji characters scattered throughout messaging apps are encoded beyond the standard range of Unicode characters. (For example, the canonical smiley emoji has the Unicode value 1F600.) Java supports multi-byte UTF-16 escape sequences for such characters. Not every platform that supports Java will support emoji output, but you can fire up jshell to find out if your environment can show emoji characters.
Be careful about using such characters, though. We had to use a screenshot to make sure you could see the little cuties in jshell running on a Mac. But fire up a Java desktop app on that same system with a JFrame and JLabel like we did in Chapter 3 and you get Figure 4-2.
jshell> import javax.swing.*

jshell> JFrame f = new JFrame("Emoji Test")
f ==> javax.swing.JFrame[frame0,0,23,0x0,invalid,hidden ... tPaneCheckingEnabled=true]

jshell> JLabel l = new JLabel("Hi \uD83D\uDE00")
l ==> javax.swing.JLabel[,0,0,0x0,invalid,alignmentX=0. ... rticalTextPosition=CENTER]

jshell> f.add(l)
$12 ==> javax.swing.JLabel[,0,0,0x0,invalid,alignmentX= ... rticalTextPosition=CENTER]

jshell> f.setSize(300,200)

jshell> f.setVisible(true)
It’s not that you can’t use or support emoji in your applications; you just have to be aware of differences in output features. Make sure your users have a good experience wherever they are running your code.
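Whether or not your screen can draw the emoji, you can still inspect how Java stores it. This sketch (the class name EmojiCodePoints is made up for the example) shows that the smiley occupies two char values but counts as a single Unicode code point. It relies only on standard String and Character methods.

```java
class EmojiCodePoints {
    // U+1F600 written as a pair of UTF-16 escape sequences.
    static final String SMILEY = "\uD83D\uDE00";

    public static void main(String[] args) {
        // Two 16-bit char values are needed to hold the emoji...
        System.out.println(SMILEY.length());                             // 2
        // ...but together they represent one character (code point).
        System.out.println(SMILEY.codePointCount(0, SMILEY.length()));   // 1
        System.out.println(Integer.toHexString(SMILEY.codePointAt(0)));  // 1f600

        // Build the same string from the numeric code point instead.
        String built = new String(Character.toChars(0x1F600));
        System.out.println(built.equals(SMILEY));                        // true
    }
}
```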
Java supports both C-style block comments delimited by /* and */ and C++-style line comments indicated by //:
/* This is a
multiline
comment. */
// This is a single-line comment
// and so // is this
Block comments have both a beginning and end sequence and can cover large ranges of text. However, they cannot be “nested,” meaning that you can’t have a block comment inside of a block comment without the compiler getting confused. Single-line comments have only a start sequence and are delimited by the end of a line; extra // indicators inside a single line have no effect. Line comments are useful for short comments within methods; they don’t conflict with block comments, so you can still comment out larger chunks of code in which they are nested.
A block comment beginning with /** indicates a special doc comment. A doc comment is designed to be extracted by automated documentation generators, such as the JDK’s javadoc program or the context-aware tooltips in many IDEs. A doc comment is terminated by the next */, just as with a regular block comment. Within the doc comment, lines beginning with @ are interpreted as special instructions for the documentation generator, giving it information about the source code. By convention, each line of a doc comment begins with a *, as shown in the following example, but this is optional. Any leading spacing and the * on each line are ignored:
/**
 * I think this class is possibly the most amazing thing you will
 * ever see. Let me tell you about my own personal vision and
 * motivation in creating it.
 * <p>
 * It all began when I was a small child, growing up on the
 * streets of Idaho. Potatoes were the rage, and life was good...
 *
 * @see PotatoPeeler
 * @see PotatoMasher
 * @author John 'Spuds' Smith
 * @version 1.00, 19 Nov 2019
 */
class Potato {
    // class body omitted
}
javadoc creates HTML documentation for classes by reading the source code and pulling out the embedded comments and @ tags. In this example, the tags cause author and version information to be presented in the class documentation. The @see tags produce hypertext links to the related class documentation.
The compiler also looks at the doc comments; in particular, it is interested in the @deprecated tag, which means that the method has been declared obsolete and should be avoided in new programs. The fact that a method is deprecated is noted in the compiled class file so a warning message can be generated whenever you use a deprecated feature in your code (even if the source isn’t available).
Doc comments can appear above class, method, and variable definitions, but some tags may not be applicable to all of these. For example, the @exception tag can only be applied to methods. Table 4-1 summarizes the tags used in doc comments.
Tag | Description | Applies to |
---|---|---|
@see | Associated class name | Class, method, or variable |
@code | Source code content | Class, method, or variable |
@link | Associated URL | Class, method, or variable |
@author | Author name | Class |
@version | Version string | Class |
@param | Parameter name and description | Method |
@return | Description of return value | Method |
@exception | Exception name and description | Method |
@deprecated | Declares an item to be obsolete | Class, method, or variable |
@since | Notes API version when item was added | Variable |
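To show a few of the method-level tags in context, here is a small sketch of a documented method. The PotatoScale class and its totalGrams() method are hypothetical and exist only to give the tags something to describe.

```java
class PotatoScale {
    /**
     * Computes the total weight of a batch of potatoes.
     *
     * @param count number of potatoes in the batch
     * @param gramsEach weight of a single potato in grams
     * @return the total weight of the batch in grams
     * @exception IllegalArgumentException if either argument is negative
     */
    static int totalGrams(int count, int gramsEach) {
        if (count < 0 || gramsEach < 0) {
            throw new IllegalArgumentException("arguments must not be negative");
        }
        return count * gramsEach;
    }

    public static void main(String[] args) {
        System.out.println(totalGrams(12, 150));   // 1800
    }
}
```

Running javadoc over a file like this should produce a page listing the parameters, the return value, and the exception alongside the description.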
Javadoc tags in doc comments represent metadata about the source code; that is, they add descriptive information about the structure or contents of the code that is not, strictly speaking, part of the application. Some additional tools extend the concept of Javadoc-style tags to include other kinds of metadata about Java programs that are carried with the compiled code and can more readily be used by the application to affect its compilation or runtime behavior. The Java annotations facility provides a more formal and extensible way to add metadata to Java classes, methods, and variables. This metadata is also available at runtime.
The @ prefix serves another role in Java that can look similar to tags. Java supports the notion of annotations as a means of marking certain content for special treatment. You apply annotations to code outside of comments. The annotation can provide information useful to the compiler or to your IDE. For example, the @SuppressWarnings annotation causes the compiler (and often your IDE as well) to hide warnings about things such as unreachable code. As you get into creating more interesting classes in “Advanced Class design”, you may see your IDE add @Override annotations to your code. This annotation tells the compiler to perform some extra checks; these checks are meant to help you write valid code and catch errors before you (or your users) run your program.
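Here is a brief sketch of both annotations at work. The AnnotatedGreeter class is invented for this example, and it borrows a generic List a little ahead of schedule just to give the compiler something to warn about.

```java
import java.util.ArrayList;
import java.util.List;

class AnnotatedGreeter {
    // @Override asks the compiler to verify that this method really
    // replaces toString() from Object; a misspelled name would not compile.
    @Override
    public String toString() {
        return "AnnotatedGreeter";
    }

    // Mixing raw and generic lists normally produces compiler warnings;
    // the annotation asks the compiler not to report them for this method.
    @SuppressWarnings({"unchecked", "rawtypes"})
    static List<String> legacyNames() {
        List raw = new ArrayList();
        raw.add("Spuds");
        return raw;
    }

    public static void main(String[] args) {
        System.out.println(new AnnotatedGreeter());   // AnnotatedGreeter
        System.out.println(legacyNames().get(0));     // Spuds
    }
}
```

Neither annotation changes what the program does when it runs; both only influence what the compiler checks or reports.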
You can even create custom annotations to work with other tools or frameworks. While a deeper discussion of annotations is beyond the scope of this book, we will take advantage of some very handy annotations for web programming in Chapter 12.
While commenting your code is critical to producing readable, maintainable files, at some point you have to start writing some compilable content. Programming is manipulating that content. In just about every language, such information is stored in variables and constants for easier use by the programmer. Java has both. Variables store information that you plan to change and reuse over time (or information that you don’t know ahead of time such as a user’s email address). Constants store information that is, well, constant. We’ve seen examples of both elements even in our tiny starter programs. Recall our simple graphical label from “HelloJava”:
import javax.swing.*;

public class HelloJava {
    public static void main(String[] args) {
        JFrame frame = new JFrame("Hello, Java!");
        JLabel label = new JLabel("Hello, Java!", JLabel.CENTER);
        frame.add(label);
        frame.setSize(300, 300);
        frame.setVisible(true);
    }
}
In this snippet, frame is a variable. We load it up in line 5 with a new instance of the JFrame class. Then we get to reuse that same instance in line 7 to add our label. We reuse the variable again to set the size of our frame in line 8 and to make it visible in line 9. All that reuse is exactly where variables shine.
Line 6 contains a constant: JLabel.CENTER. Constants contain some value that never changes throughout your program. Information that doesn’t change may seem like a strange thing to store. Why not just use the information itself each time? Since the programmer writing the code gets to select the name of the constant, one immediate benefit is that you can describe the information in a useful way. JLabel.CENTER may seem a little opaque still, but the word “CENTER” at least gives you a hint about what’s happening.
The use of named constants also allows for simpler changes down the road. If you code something like the maximum number of some resource you use, altering that limit is much easier if all you have to do is change the initialized value of the constant. If you use a literal number like “5”, you would have to hunt through all of your Java files to track down every occurrence of a 5 and change it as well, if that particular 5 was in fact referring to the resource limit. That type of manual search and replace is error-prone, not to mention tedious.
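As a minimal sketch of that idea, consider a made-up ConnectionPool class whose limit lives in a single named constant. The class, its acquire() method, and the limit of 5 are all assumptions chosen for illustration.

```java
class ConnectionPool {
    // Hypothetical resource limit. Changing this one line
    // updates every place in the class that relies on it.
    static final int MAX_CONNECTIONS = 5;

    private int open = 0;

    // Returns true if a connection slot was free and has now been claimed.
    boolean acquire() {
        if (open >= MAX_CONNECTIONS) {
            return false;
        }
        open++;
        return true;
    }

    public static void main(String[] args) {
        ConnectionPool pool = new ConnectionPool();
        int granted = 0;
        for (int i = 0; i < 10; i++) {
            if (pool.acquire()) {
                granted++;
            }
        }
        System.out.println(granted + " of 10 requests granted");   // 5 of 10 requests granted
    }
}
```

The name MAX_CONNECTIONS also documents what the number means, which a bare 5 scattered through the code never could.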
We’ll see more details on the types and initial values of variables and constants in the next section. As always, feel free to use jshell to explore and discover some of those details on your own! One quick warning, though: due to interpreter limitations, you cannot declare your own top-level constants in jshell. You can still use constants defined for classes, like JLabel.CENTER above, or define them in your own classes that you type into jshell. The Math class has all sorts of nifty functions and a constant for π. Try calculating and storing the area of a circle in a variable. Then prove to yourself that reassigning constants won’t work.
jshell> double radius = 42.0;
radius ==> 42.0

jshell> Math.PI
$2 ==> 3.141592653589793

jshell> Math.PI = 3;
|  Error:
|  cannot assign a value to final variable PI
|  Math.PI = 3;
|  ^-----^

jshell> double area = Math.PI * radius * radius;
area ==> 5541.769440932396

jshell> radius = 6;
radius ==> 6.0

jshell> area = Math.PI * radius * radius;
area ==> 113.09733552923255

jshell> area
area ==> 113.09733552923255
Notice the compiler error when we try to set π to 3. Also notice that both radius and area can be changed after they were declared and initialized. But variables only hold one value at a time. The latest calculation is the only thing that remains in the variable area.
The type system of a programming language describes how its data elements (the variables and constants we just touched on) are associated with storage in memory and how they are related to one another. In a statically typed language, such as C or C++, the type of a data element is a simple, unchanging attribute that often corresponds directly to some underlying hardware phenomenon, such as a register or a pointer value. In a more dynamic language such as Smalltalk or Lisp, variables can be assigned arbitrary elements and can effectively change their type throughout their lifetime. A considerable amount of overhead goes into validating what happens in these languages at runtime. Scripting languages such as Perl achieve ease of use by providing drastically simplified type systems in which only certain data elements can be stored in variables, and values are unified into a common representation, such as strings.
Java combines many of the best features of both statically and dynamically typed languages. As in a statically typed language, every variable and programming element in Java has a type that is known at compile time, so the runtime system doesn’t normally have to check the validity of assignments between types while the code is executing. Unlike traditional C or C++, Java also maintains runtime information about objects and uses this to allow truly dynamic behavior. Java code may load new types at runtime and use them in fully object-oriented ways, allowing casting and full polymorphism (extending of types). Java code may also “reflect” upon or examine its own types at runtime, allowing advanced kinds of application behavior such as interpreters that can interact with compiled programs dynamically.
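A tiny sketch can make the runtime side of this concrete. The RuntimeTypes class and its describe() method are hypothetical; getClass(), getSimpleName(), and instanceof are standard parts of the language and library.

```java
class RuntimeTypes {
    // Reports the runtime class of whatever object is passed in.
    static String describe(Object value) {
        return value.getClass().getSimpleName();
    }

    public static void main(String[] args) {
        Object text = "Hello, Java!";   // declared as Object, but really a String
        System.out.println(describe(text));           // String
        System.out.println(text instanceof String);   // true
        System.out.println(describe(42));             // Integer (the int is wrapped in an object)

        String s = (String) text;       // the cast is verified at runtime
        System.out.println(s.length());               // 12
    }
}
```

The variable's declared type is fixed at compile time, yet the object itself still knows what it really is while the program runs.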
Java data types fall into two categories. Primitive types represent simple values that have built-in functionality in the language, such as numbers, booleans, and characters. Reference types (or class types) include objects and arrays; they are called reference types because they “refer to” a large data type that is passed “by reference,” as we’ll explain shortly. Generic types and methods define and operate on objects of various types while providing compile-time type safety. For example, a List<String> is a List that can only contain Strings. These are also reference types and we’ll see much more of them in Chapter 7.
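The practical difference between the two categories shows up as soon as you copy a variable. In this rough sketch (the class and method names are ours), copying a primitive duplicates the value, while copying a reference gives you a second name for the same object.

```java
import java.util.ArrayList;
import java.util.List;

class ValuesAndReferences {
    // Copies a primitive, changes the copy, and returns the original.
    static int copyPrimitive() {
        int a = 42;
        int b = a;   // b receives its own copy of the value
        b = 7;       // changing b leaves a untouched
        return a;
    }

    // Copies a reference, modifies the list through the copy, and
    // returns the size as seen through the original variable.
    static int shareReference() {
        List<String> first = new ArrayList<>();
        List<String> second = first;   // both variables refer to the same list
        second.add("potato");
        return first.size();
    }

    public static void main(String[] args) {
        System.out.println(copyPrimitive());    // 42
        System.out.println(shareReference());   // 1
    }
}
```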
Numbers, characters, and Boolean values are fundamental elements in Java. Unlike some other (perhaps more pure) object-oriented languages, they are not objects. For those situations where it’s desirable to treat a primitive value as an object, Java provides “wrapper” classes. (More on this later.) The major advantage of treating primitive values as special is that the Java compiler and runtime can more readily optimize their implementation. Primitive values and computations can still be mapped down to hardware as they always have been in lower-level languages. Indeed, if you work with native libraries using the Java Native Interface (JNI) to interact with other languages or services, these primitive types will figure prominently in your code.
An important portability feature of Java is that primitive types are precisely defined. For example, you never have to worry about the size of an int on a particular platform; it’s always a 32-bit, signed, two’s complement number. The “size” of a numeric type determines how big (or how precise) a value you can store. For example, the byte type is for small numbers, from -128 to 127, while the int type can handle most numeric needs, storing values between (roughly) +/- two billion. Table 4-2 summarizes Java’s primitive types.
Type | Definition | Approximate Range or Precision |
---|---|---|
boolean | Logical value | true or false |
char | 16-bit, Unicode character | 64K characters |
byte | 8-bit, signed, two’s complement integer | -128 to 127 |
short | 16-bit, signed, two’s complement integer | -32,768 to 32,767 |
int | 32-bit, signed, two’s complement integer | -2.1e9 to 2.1e9 |
long | 64-bit, signed, two’s complement integer | -9.2e18 to 9.2e18 |
float | 32-bit, IEEE 754, floating-point value | 6-7 significant decimal places |
double | 64-bit, IEEE 754, floating-point value | 15 significant decimal places |
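You don’t have to memorize these limits. The wrapper classes publish them as constants, as this short sketch shows. The PrimitiveRanges class and its two helper methods are made up for the example; the MIN_VALUE, MAX_VALUE, and SIZE constants are standard.

```java
class PrimitiveRanges {
    // Formats a pair of limits for printing.
    static String range(long min, long max) {
        return min + " to " + max;
    }

    // Adds one to an int; going past the largest value silently wraps around.
    static int increment(int n) {
        return n + 1;
    }

    public static void main(String[] args) {
        System.out.println(range(Byte.MIN_VALUE, Byte.MAX_VALUE));         // -128 to 127
        System.out.println(range(Short.MIN_VALUE, Short.MAX_VALUE));       // -32768 to 32767
        System.out.println(range(Integer.MIN_VALUE, Integer.MAX_VALUE));   // -2147483648 to 2147483647
        System.out.println(Long.SIZE);                                     // 64 (bits)
        System.out.println(increment(Integer.MAX_VALUE) == Integer.MIN_VALUE);   // true
    }
}
```

The last line is worth a second look: Java does not raise an error when integer arithmetic overflows, so choosing a type with enough room is up to you.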
Those of you with a C background may notice that the primitive types look like an idealization of C scalar types on a 32-bit machine, and you’re absolutely right. That’s how they’re supposed to look. The 16-bit characters were forced by Unicode, and ad hoc pointers were deleted for other reasons. But overall, the syntax and semantics of Java primitive types derive from C.
But why have sizes at all? Again, that goes back to efficiency and optimization. The number of goals in a soccer match rarely crests the single digits, so it would fit in a byte variable. The number of fans watching that match, however, would need something bigger. The total amount of money spent by all of the fans at all of the soccer matches in all of the World Cup countries would need something bigger still. By picking the right size, you give the compiler the best chance at optimizing your code, thus making your application run faster or consume fewer system resources or both.
If you do need bigger numbers than the primitive types offer, you can check out the BigInteger and BigDecimal classes in the java.math package. These classes offer near-infinite size or precision. Some scientific or cryptographic applications require you to store and manipulate very large (or very small) numbers and value accuracy over performance. We won’t cover those classes in this book, but store their names away in the back of your brain for a rainy day’s research.
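For a small taste of that rainy-day research, here is a hedged sketch using both classes. The BigNumbers class and its factorial() method are our own; BigInteger and BigDecimal are the real classes from java.math.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

class BigNumbers {
    // Computes n! exactly, however many digits the result needs.
    static BigInteger factorial(int n) {
        BigInteger result = BigInteger.ONE;
        for (int i = 2; i <= n; i++) {
            result = result.multiply(BigInteger.valueOf(i));
        }
        return result;
    }

    public static void main(String[] args) {
        // Far too big for a long, which tops out around 9.2e18.
        System.out.println(factorial(25));   // 15511210043330985984000000

        // A double cannot represent 0.1 or 0.2 exactly; BigDecimal can.
        System.out.println(0.1 + 0.2);       // 0.30000000000000004
        System.out.println(new BigDecimal("0.1").add(new BigDecimal("0.2")));   // 0.3
    }
}
```

Notice that arithmetic goes through method calls such as multiply() and add() rather than operators, which is part of the performance and convenience price you pay for the extra range.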
Floating-point operations in Java follow the IEEE 754 international specification, which means that the result of floating-point calculations is normally the same on different Java platforms. However, Java allows for extended precision on platforms that support it. This can introduce extremely small-valued and arcane differences in the results of high-precision operations. Most applications would never notice this, but if you want to ensure that your application produces exactly the same results on different platforms, you can use the special keyword strictfp as a class modifier on the class containing the floating-point manipulation (we cover classes in the next chapter). The compiler then prohibits these platform-specific optimizations.
Variables are declared inside of methods and classes with a type name followed by one or more comma-separated variable names. For example:
int foo;
double d1, d2;
boolean isFun;
Variables can optionally be initialized with an expression of the appropriate type when they are declared:
int foo = 42;
double d1 = 3.14, d2 = 2 * 3.14;
boolean isFun = true;
Variables that are declared as members of a class are set to default values if they aren’t initialized (see Chapter 5). In this case, numeric types default to the appropriate flavor of zero, characters are set to the null character (