Chapter 1. Introducing PDF and iText

This chapter covers

  • A summary of what will be presented in this book
  • Compiling and executing your first example
  • Learning the five steps in iText’s PDF creation process

Call me Bruno. About ten years ago—never mind how long precisely—I thought I’d create a small PDF library in Java and publish it as free and open source software (F/OSS). Little did I know that this would lead to my writing a whale of a book about the extensive functionality that has been added over the years.

That library was iText, and the book was titled iText in Action: Creating and Manipulating PDF (2007). Today, iText is the world’s leading F/OSS PDF library. It’s released under the Affero General Public License (AGPL) and is available in two versions: the original Java version, and the C# port, iTextSharp. These libraries make it possible for you to enhance applications with dynamic PDF solutions. You can use iText to create invoices for your customers if you have a web shop, to produce tickets if you work for an airline or railway company, and so on. You can integrate iText into an application to generate PDF documents as an alternative to printing on paper, to add digital signatures to a document, to split or concatenate different documents, and so forth.

In the first edition of iText in Action, readers learned why things work the way they do in iText, complemented with simple examples. This second edition takes you further with more real-life examples, skipping a bit on the whys, but presenting comprehensive code samples that you can use to solve everyday problems.

In this chapter, I’ll give you a quick overview of the things you can do with PDF—you’ll compile and execute a first “Hello World” example—and you’ll learn the basics of creating PDFs with iText.

1.1. Things you can do with PDF

Let’s start with six quick facts about PDF:

  • PDF is the Portable Document Format.
  • It’s an open file format (ISO-32000-1), originally created by Adobe.
  • It’s used for documents that are independent of system software and hardware.
  • PDF documents are an essential part of the web.
  • Adobe Reader is the most widely used PDF viewer.
  • There are a lot of free and proprietary, open and closed source, desktop and web-based software products for creating, viewing, and manipulating PDF documents.

Figure 1.1 offers an overview of the things you can do with PDF. There are tools to create PDF documents, there are applications to consume PDF documents, and there are utilities to manipulate existing PDF documents.

Figure 1.1. Overview of PDF-related functionality. The functionality covered by iText is marked with the iText logo.

If you look at PDF creation, you’ll find that graphical designers use desktop applications such as Adobe Acrobat or Adobe InDesign to create a document in a manual or semimanual process. In another context, PDF documents are created programmatically, using an API to produce PDFs directly from software applications, without—or with minimal—human intervention. Sometimes the document is created in an intermediary format first, then converted to PDF. These different approaches demand different software products. The same goes for PDF manipulation. You can update a PDF manually in Adobe Acrobat, but there are also tools that allow forms to be filled out automatically based on information from a database.

This book will focus on the automation side of things: we’ll create and manipulate PDF documents in an automated process using iText. The functionality covered by iText in figure 1.1 is marked with the iText logo. A smaller logo indicates that the functionality is only partly supported.

Typically, iText is used in projects that have one of the following requirements:

  • The content isn’t available in advance: it’s calculated based on user input or real-time database information.
  • The PDF files can’t be produced manually due to the massive volume of content: a large number of pages or documents.
  • Documents need to be created in unattended mode, in a batch process.
  • The content needs to be customized or personalized; for instance, the name of the end user has to be stamped on a number of pages.

Often you’ll encounter these requirements in web applications, where content needs to be served dynamically to a browser. Normally, you’d serve this information in the form of HTML, but for some documents, PDF is preferred over HTML for better printing quality, for identical presentation on a variety of platforms, for security reasons, or to reduce the file size. In this case, you can serve PDF on the fly.

As you read this book, you’ll create and manipulate hundreds of PDF documents that demonstrate how to use a specific feature, how to solve common and less common issues, and how to build an application that involves PDF technology. We’ll use iText because it’s an API that was developed to allow developers to do the following (and much more):

  • Generate documents and reports based on data from an XML file or a database
  • Create maps and books, exploiting numerous interactive features available in PDF
  • Add bookmarks, page numbers, watermarks, and other features to existing PDF documents
  • Split or concatenate pages from existing PDF files
  • Fill out interactive forms
  • Serve dynamically generated or manipulated PDF documents to a web browser

For first-time users, this book is indispensable. Although the basic functionality of iText is easy to grasp, the first parts of this book significantly lower the learning curve and gradually offer more advanced functionality.

It’s also a must-have for the many developers who are already familiar with iText. In the final chapters, many PDF secrets hidden in ISO-32000-1, the open standard that defines the Portable Document Format, will be unveiled. Even experienced iText developers will learn new ways to master the PDF specification using their favorite PDF library.

Without further ado, let’s start with a simple example that explains how to compile and run the many examples that come with this book.

1.2. Working with the examples in this book

All the source files, as well as the resources and extra libraries necessary to run the book’s examples, were uploaded to a Subversion (SVN) repository on SourceForge. If you have an SVN client, you can check out the complete working environment at once. This way, you’ll be able to get the latest updates and new examples, even after the book has been released. Please consult appendix B for the URL of this repository.

You can find more info about this on the examples page of the itextpdf.com site. That’s also the place where you’ll find zipped archives, in case you don’t have an SVN client. You can download these archives and unzip them on your local system.

Before you start experimenting, make sure that you have a recent version of the Java Development Kit (JDK) installed. The examples won’t work for versions of iText that are older than iText 5, and iText 5 is compiled with Java 5, so the minimum requirement for your JVM is Sun’s JDK 1.5. You can use other JDKs, but only the JDK from Sun is supported.

Figure 1.2 shows how I compiled and executed the first example, HelloWorld, on Ubuntu Linux using OpenJDK 6. As you can see, you first change the directory to the examples folder (or whichever folder contains your copy of the project). Then you run this command:

javac -d bin -cp lib/iText.jar src/part1/chapter01/HelloWorld.java

Figure 1.2. Compiling and running from the command line

HelloWorld.java is the source file; we’ll take a close look at it in the next section. The option -d says that the compiled code should be written to the bin folder. With option -cp you define the classpath. For this simple example, you only need the iText.jar file. For other examples, you might need to add more JARs, such as a JAR with the database driver, encryption JARs, and so forth.

Once you’ve compiled the code, you can execute it:

java -cp "bin:lib/iText.jar" part1.chapter01.HelloWorld

If you’re working on Windows, you’ll need to replace the colon separating the different parts of the classpath with a semicolon:

java -cp "bin;lib/iText.jar" part1.chapter01.HelloWorld
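
If you’d rather not remember which separator belongs to which platform, Java exposes it as java.io.File.pathSeparator. The following is a minimal sketch, not one of the book’s examples; the class name ClasspathSketch and the helper join() are made up for illustration.

```java
import java.io.File;

final class ClasspathSketch {

    /** Joins classpath entries with the given separator (":" on Linux, ";" on Windows). */
    static String join(String separator, String... entries) {
        return String.join(separator, entries);
    }

    public static void main(String[] args) {
        // File.pathSeparator holds the separator of the platform the JVM is running on.
        String classpath = join(File.pathSeparator, "bin", "lib/iText.jar");
        System.out.println("java -cp \"" + classpath + "\" part1.chapter01.HelloWorld");
    }
}
```

Running main() prints the java command with the separator that fits the machine you’re working on.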

Congratulations! You have created your first PDF file using iText. Figure 1.3 shows how everything is organized.

Figure 1.3. Organization of the sample files

The source code of the examples can be found in the src folder; see, for instance, the file HelloWorld.java. The package names of the examples correspond to the part and chapter numbers of the book. In the lib directory, you’ll find all the JARs you need to compile the examples. There’s also a resources folder containing all the resources you might need to run the examples: database scripts, images, special fonts, and existing PDF files, such as interactive forms.

The examples are compiled to the bin folder. The HelloWorld.class file will appear as soon as you run the javac command. When you execute the java command, you’ll see the hello.pdf file appear in the results directory. Figure 1.4 shows the end result: a PDF file containing the text “Hello World!”

Figure 1.4. A “Hello World” PDF

It’s certainly possible to compile and execute all the examples from the command line, but it’s more likely that you’ll prefer using an integrated development environment (IDE). Figure 1.5 shows what the project looks like in Eclipse—you’ll recognize the same folders. Observe that Eclipse puts the src folder on top. The bin directory is hidden; you’ll find the JARs under Referenced Libraries. You can view and update the list of registered JARs by selecting Project > Properties > Java Build Path > Libraries.

Figure 1.5. The project opened in Eclipse

Figure 1.5 already gives you a peek at the source code. The hello.pdf file is created in five steps. The next section discusses every step in detail.

1.3. Creating a PDF document in five steps with iText

Let’s copy the content of the main method of figure 1.5, and remove the comments. The numbers to the side in this listing indicate the different steps in the PDF-creation process.

Listing 1.1. HelloWorld.java

We’ll devote a separate subsection to each of these five steps:

  • Step 1: Create a Document.
  • Step 2: Get a PdfWriter instance.
  • Step 3: Open the Document.
  • Step 4: Add content.
  • Step 5: Close the Document.

In each of the following subsections, we’ll focus on one specific step. You’ll apply small changes to step 1 in the first subsection, to step 2 in the second, and so on. This way, you’ll create several new documents that are slightly different from the one in figure 1.4. You can hold these variations on the original hello.pdf against a strong light (literally or not) and discover the differences and similarities caused by the small code changes.

1.3.1. Creating a new Document object

Document is the object to which you’ll add content in the form of Chunk, Phrase, Paragraph, and other high-level objects. These objects are often referred to as iText’s basic building blocks, and they’ll be discussed in chapter 2. For now, we’ll only work with Paragraph objects.

Measurements

Upon creating the Document object, you’ll define the page size and the page margins of the first page. This happens either implicitly, as in step 1 of listing 1.1, or explicitly, using a com.itextpdf.text.Rectangle object and four float values for the margins, as shown here.

Listing 1.2. HelloWorldNarrow.java
Rectangle pagesize = new Rectangle(216f, 720f);
Document document = new Document(pagesize, 36f, 72f, 108f, 180f);

In this example, a rectangle measuring 216 x 720 user units is created. This rectangle is used as the page size in the Document constructor, along with a left margin of 36 user units, a right margin of 72 user units, a top margin of 108 user units, and a bottom margin of 180 user units.

 

FAQ

What is the measurement unit in PDF documents? Most of the measurements in PDFs are expressed in user space units. ISO-32000-1 (section 8.3.2.3) tells us “the default for the size of the unit in default user space (1/72 inch) is approximately the same as a point (pt), a unit widely used in the printing industry. It is not exactly the same; there is no universal definition of a point.” In short, 1 in. = 25.4 mm = 72 user units (which roughly corresponds to 72 pt).

 

If you open the document created by listing 1.2 in Adobe Reader and look at the Description tab in the Document properties dialog box (opened via File > Properties), you’ll find that the document measures 3 in. x 10 in.

iText also created a left margin of 0.5 in. (36/72), a right margin of 1 in. (72/72), a top margin of 1.5 in. (108/72), and a bottom margin of 2.5 in. (180/72).
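
The arithmetic behind these numbers can be sketched in a few lines of plain Java. This is only an illustration under the default assumption of 72 user units per inch; the class UserUnitMath and its method names are hypothetical and aren’t part of iText.

```java
final class UserUnitMath {

    static final double UNITS_PER_INCH = 72.0;   // default user space: 1 unit = 1/72 in.
    static final double MM_PER_INCH = 25.4;

    static double unitsToInches(double units) {
        return units / UNITS_PER_INCH;
    }

    static double unitsToMillimeters(double units) {
        return units / UNITS_PER_INCH * MM_PER_INCH;
    }

    static double millimetersToUnits(double millimeters) {
        return millimeters / MM_PER_INCH * UNITS_PER_INCH;
    }

    public static void main(String[] args) {
        // Page size and margins used in listing 1.2, converted to inches.
        double[] values = {216, 720, 36, 72, 108, 180};
        for (double value : values) {
            System.out.println(value + " user units = " + unitsToInches(value) + " in.");
        }
    }
}
```

For the values in listing 1.2, the page comes out at 3 in. by 10 in., with margins of 0.5, 1, 1.5, and 2.5 in.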

If you don’t like doing all that math, there’s a Utilities class in iText with static methods that help you switch among points, inches, and millimeters: millimetersToPoints(), millimetersToInches(), pointsToMillimeters(), pointsToInches(), inchesToMillimeters(), and inchesToPoints(). All these methods expect a float as their argument.

Note that these methods refer to points, not to user units. That’s because the default value of the user unit corresponds with a point, but it’s possible to change this default.

Listing 1.3. HelloWorldMaximum.java
Document document = new Document(new Rectangle(14400, 14400));
PdfWriter writer
    = PdfWriter.getInstance(document, new FileOutputStream(RESULT));
writer.setUserunit(75000f);

Looking at the first line in this code snippet, you might expect a document with a page measuring 200 in. x 200 in., but when you look at the document properties of the resulting file, you’ll see that it measures 15,000,000 in. x 15,000,000 in. That’s because you’ve changed the user unit to 75,000 in the last line of listing 1.3. Now, one user unit corresponds with 75,000 points, and you’ve created a PDF document with the largest possible page size.

Page Size

Theoretically, you could create pages of any size, but the PDF specification imposes limits depending on the PDF version of the document.

Table 1.1. Minimum and maximum size of a page depending on the PDF version

PDF version | Minimum size | Maximum size
PDF 1.3 or earlier | 72 × 72 units (1 in. × 1 in.) | 3240 × 3240 units (45 in. × 45 in.)
PDF 1.4 and later | 3 × 3 units (approximately 0.04 in. × 0.04 in.) | 14,400 × 14,400 units (200 in. × 200 in.)

Changing the user unit has been possible since PDF 1.6. The minimum value of the user unit is 1 (this is the default; 1 unit = 1/72 in.); the maximum value is 75,000 (1 unit = 75,000/72 in., or approximately 1,042 in.).
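
To make these limits concrete, here’s a small sketch that checks a page size against table 1.1 and computes the physical size that results from a given user unit. The class PageSizeLimitsSketch and its methods are hypothetical helpers written for this illustration; iText doesn’t expose them.

```java
final class PageSizeLimitsSketch {

    /** Checks width and height (in user units) against the limits of table 1.1. */
    static boolean withinLimits(double width, double height, boolean pdf14OrLater) {
        double min = pdf14OrLater ? 3 : 72;
        double max = pdf14OrLater ? 14400 : 3240;
        return width >= min && width <= max && height >= min && height <= max;
    }

    /** Converts a length in user units to inches for a user unit between 1 and 75,000. */
    static double physicalInches(double units, double userUnit) {
        if (userUnit < 1 || userUnit > 75000) {
            throw new IllegalArgumentException("User unit must be between 1 and 75000");
        }
        return units * userUnit / 72.0;
    }

    public static void main(String[] args) {
        System.out.println(withinLimits(14400, 14400, true));   // allowed in PDF 1.4 and later
        System.out.println(withinLimits(14400, 14400, false));  // too large for PDF 1.3
        System.out.println(physicalInches(14400, 1));            // default user unit
        System.out.println(physicalInches(14400, 75000));        // values of listing 1.3
    }
}
```

With the values of listing 1.3, physicalInches() yields the 15,000,000 in. reported in the document properties.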

But enough about exotic page sizes; you’re probably interested in the standard paper sizes. The default value of a page in iText, if you create a Document object without any parameters, is A4, which is the most common paper size in Europe, Asia, and Latin America. It’s specified by the International Organization for Standardization (ISO) in ISO 216. An A4 document measures 210 mm x 297 mm, or 8.3 in. × 11.7 in., or 595 pt x 842 pt.

If you want to create a document in another standard format, take a look at the PageSize class. This class was written for your convenience, and it contains a list of static final Rectangle objects, offering a wide selection of standard paper sizes, including A0 to A10, B0 to B10, and the American standard sizes: LETTER, LEGAL, LEDGER, and TABLOID. Listing 1.4 shows how to adapt the initial HelloWorld example so that it produces a PDF document saying “Hello World!” on a page that’s the American letter paper size.

Listing 1.4. HelloWorldLetter.java
Document document = new Document(PageSize.LETTER);

The orientation of most of the paper sizes defined in PageSize is portrait. You can change this to landscape by invoking the rotate() method on the Rectangle.

Listing 1.5. HelloWorldLandscape1.java
Document document = new Document(PageSize.LETTER.rotate());

Another way to create a Document in landscape orientation is to create a Rectangle object with a width that is greater than its height.

Listing 1.6. HelloWorldLandscape2.java
Document document = new Document(new Rectangle(792, 612));

The results of both landscape examples look exactly the same in Adobe Reader. The Reader’s Description tab doesn’t show any difference in size. Both PDF documents have a page size of 11 in. x 8.5 in. (instead of 8.5 in. x 11 in.), but there are subtle differences internally:

  • In the first file, the page is defined with a size that has a width smaller than the height, but with a rotation of 90 degrees.
  • The second file has the page size you defined without any rotation (a rotation of 0 degrees).

This difference will matter when you want to manipulate the PDF. We’ll return to this issue in chapter 6.
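
The following sketch models the distinction just described: a page stores a width, a height, and a rotation, and a viewer derives the visible size from all three. It’s a simplified model based only on the description above, not a copy of iText’s Rectangle class; the name PageBoxSketch and its members are assumptions.

```java
final class PageBoxSketch {

    final double width;    // width as stored in the page definition
    final double height;   // height as stored in the page definition
    final int rotation;    // rotation in degrees: 0, 90, 180, or 270

    PageBoxSketch(double width, double height, int rotation) {
        this.width = width;
        this.height = height;
        this.rotation = rotation;
    }

    /** Returns a page with the same stored size, rotated by another 90 degrees. */
    PageBoxSketch rotate() {
        return new PageBoxSketch(width, height, (rotation + 90) % 360);
    }

    /** Width as shown by a viewer, taking the rotation into account. */
    double visibleWidth() {
        return rotation % 180 == 0 ? width : height;
    }

    /** Height as shown by a viewer, taking the rotation into account. */
    double visibleHeight() {
        return rotation % 180 == 0 ? height : width;
    }
}
```

A rotated letter page, new PageBoxSketch(612, 792, 0).rotate(), and an explicit new PageBoxSketch(792, 612, 0) both report a visible size of 792 by 612, yet their stored size and rotation differ. That stored difference is what a tool manipulating the file has to deal with.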

Page Margins

In listing 1.2, you defined margins using the constructor of the Document object, and you added a Paragraph to it. In the next two examples, you’ll define the page size and margins using the setPageSize() and setMargins() methods. You can use these methods at any time in the document’s creation process, but be aware that the change will never affect the current page, only the next page.

In these examples, you’ll add paragraphs that are aligned on both sides—justified text—so you can clearly see the left and right margins. You’ll add enough paragraphs to cause a page break, so you can make sure the bottom margin is respected.

Suppose this document consists of pages that are to be printed on both sides, and bound into a book. Depending on the way the book is bound, you might want a larger or smaller margin on the inner edges of the pages: the left margin of an odd-numbered page should correspond to the right margin of an even-numbered page. The same goes for the opposite margins. In short, you want the margins to be mirrored.

Listing 1.7. HelloWorldMirroredMargins.java
Document document = new Document();
PdfWriter.getInstance(document, new FileOutputStream(RESULT));
document.setPageSize(PageSize.A5);
document.setMargins(36, 72, 108, 180);
document.setMarginMirroring(true);

Listing 1.7 assumes that the spine of the book is to the left (for Western books) or to the right (for Japanese books). But some books are bound in a completely different way, with the spine of the book at the top or bottom of the pages. In that case, you’d need to use this method.

Listing 1.8. HelloWorldMirroredMarginsTop.java
document.setMarginMirroringTopBottom(true);

Now the top and bottom margins are mirrored instead of the left and right margins.
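
Mirroring is easy to express as a rule on the page number. The sketch below is an illustration, not iText’s implementation. It assumes that page numbers start at 1, that odd pages keep the margins as configured, and that even pages get the swapped values. MirroredMarginsSketch and marginsForPage() are hypothetical names.

```java
final class MirroredMarginsSketch {

    /**
     * Returns the effective margins {left, right, top, bottom} for a page.
     * Assumption: odd pages use the configured margins, even pages the mirrored ones.
     */
    static float[] marginsForPage(int pageNumber,
                                  float left, float right, float top, float bottom,
                                  boolean mirrorLeftRight, boolean mirrorTopBottom) {
        boolean evenPage = pageNumber % 2 == 0;
        float l = left, r = right, t = top, b = bottom;
        if (evenPage && mirrorLeftRight) {
            l = right;   // spine on the left or right: swap the horizontal margins
            r = left;
        }
        if (evenPage && mirrorTopBottom) {
            t = bottom;  // spine at the top or bottom: swap the vertical margins
            b = top;
        }
        return new float[] {l, r, t, b};
    }
}
```

With the margins of listing 1.7 (36, 72, 108, 180), page 1 keeps a left margin of 36 and a right margin of 72, whereas page 2 gets 72 on the left and 36 on the right. With top-bottom mirroring, page 2 gets a top margin of 180 and a bottom margin of 108 instead.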

But maybe we’re getting ahead of ourselves. We’re already adding content, but we haven’t yet discussed step 2 in listing 1.1 in the PDF creation process.

1.3.2. Getting a PdfWriter instance

PdfWriter is the class responsible for writing the PDF file. You can also add contents, such as annotations, to PdfWriter. As opposed to the high-level objects added to the Document object, manipulations on PdfWriter are often referred to as low-level access and writing to the direct content. You’ll find out more about these concepts in chapter 3.

Step 2 in listing 1.1 in the PDF creation process combines two actions:

  • It associates a Document with the PdfWriter. This writer will “listen” to the document. High-level objects, such as a Paragraph, will be translated into low-level operations. For example, iText will generate the PDF syntax that draws the textual content of a paragraph at a specific position on a page, taking into account the page size and margins.
  • It tells the PdfWriter to which OutputStream the file should be written. In the previous examples, you have written the content to a FileOutputStream, but you could have written to any other type of OutputStream. You could even have written the bytes of a PDF file to System.out.

In rare circumstances, creating a writer instance can cause a DocumentException.

Exceptions

DocumentException is the most general exception in iText. It can occur in step 2 or step 4 of listing 1.1. For example, if you try adding a Paragraph before you’ve done step 3, you’ll get the following error message: “The document isn’t open yet; you can only add metadata information.” DocumentExceptions also occur when manipulating existing documents. For instance, “Append mode requires a document without errors even if recovery was possible.”

If you look at listing 1.1, you see that you can also expect an IOException. Once you start using resources such as images, fonts, or existing PDFs, this exception can occur if something goes wrong while reading from an InputStream.

In the examples we’ve looked at so far, the only IOException that could be thrown is a FileNotFoundException. This happens when you’re trying to create a hello.pdf file, but you already have a file with that name opened—and locked—in Adobe Reader. (This happened to me all the time while writing the examples for this book.) Or maybe you’re trying to create the file in the results/part1/chapter01 directory, but this directory doesn’t exist on your filesystem. The empty results directories are provided with the example archives to avoid this problem.
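
If you write your results somewhere other than the provided directories, you can guard against the missing-directory case by creating the parent folder before opening the stream. Here’s a minimal sketch in plain Java; ResultDirectorySketch and writeResult() are hypothetical names, and the bytes written are placeholders rather than a real PDF.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

final class ResultDirectorySketch {

    /** Creates missing parent directories, writes the bytes, and returns the file length. */
    static long writeResult(File target, byte[] bytes) {
        File parent = target.getAbsoluteFile().getParentFile();
        if (parent != null && !parent.isDirectory() && !parent.mkdirs()) {
            throw new UncheckedIOException(new IOException("Cannot create directory " + parent));
        }
        // Without the directory check, this constructor throws a FileNotFoundException
        // when the parent directory doesn't exist.
        try (OutputStream out = new FileOutputStream(target)) {
            out.write(bytes);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return target.length();
    }
}
```

This doesn’t help with the other cause mentioned above: a file that is locked by a viewer still can’t be overwritten.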

Other OutputStreams

While you’re adding content to the Document, the PdfWriter gradually writes a PDF file to the OutputStream. This PDF file will be written to a file on disk if you choose a FileOutputStream. In a web application, you’ll generally prefer serving the PDF to a web browser without saving it on the server, so you could write directly to the ServletOutputStream, using response.getOutputStream() in your servlets. This will work with some browsers, but unfortunately not with all. Chapter 9 will explain why it’s better to write the complete file to memory before transferring the bytes to the OutputStream of an HttpServletResponse object.

Here’s how to write a file to memory using a ByteArrayOutputStream.

Listing 1.9. HelloWorldMemory.java

Observe that the PDF is created in memory in the first part of this snippet; nothing is written to disk. The bytes are written to a file in the last three lines of the snippet to prove that what was generated in memory represents a valid PDF file.
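
The pattern itself, producing bytes in memory first and copying them to their destination afterward, doesn’t depend on iText. The sketch below shows it with placeholder bytes standing in for the output of a PdfWriter; InMemoryOutputSketch and its methods are hypothetical, and the bytes are not a valid PDF.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class InMemoryOutputSketch {

    /** Collects output in memory; nothing touches the disk here. */
    static byte[] produceInMemory() {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        byte[] placeholder = "%PDF-1.4\n(placeholder, not a real PDF)\n"
                .getBytes(StandardCharsets.US_ASCII);
        baos.write(placeholder, 0, placeholder.length);
        return baos.toByteArray();
    }

    /** Copies the finished bytes to a temporary file and returns the size on disk. */
    static long copyToTempFile(byte[] bytes) {
        try {
            Path target = Files.createTempFile("hello_memory", ".pdf");
            try {
                Files.write(target, bytes);
                return Files.size(target);
            } finally {
                Files.deleteIfExists(target);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because the complete byte array is available before anything is sent, you also know its exact length up front.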

Now that you have all the infrastructure in place, it’s time to open the Document.

1.3.3. Opening the Document

Java programmers may not be used to having to open streams before being able to add content. When you create a new stream in Java, you can start writing bytes, chars, and Strings to it right away. With iText, it’s mandatory to open the document first.

When a Document object is opened, a lot of initializations take place, and the file header is written to the OutputStream.

The File Header and the PDF Version

Figure 1.6 shows your first PDF file, hello.pdf, opened in the Notepad++ text editor.

Figure 1.6. hello.pdf opened in Notepad++

As you can see, the first lines look like this:

%PDF-1.4
%âãïÓ

This is the header of a PDF file. The structure of a PDF file, with its header, body, cross-reference table, and footer, will be discussed in great detail in chapter 13. For now, it’s sufficient to know that the first line gives you an indication of the PDF version that is used.
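
Reading the version from the header takes only a few lines of Java. This sketch looks at the first line only and ignores any version stored elsewhere in the file; PdfHeaderSketch and headerVersion() are names invented for this illustration.

```java
import java.nio.charset.StandardCharsets;

final class PdfHeaderSketch {

    /** Returns the version found in a "%PDF-x.y" header line, or null if there is none. */
    static String headerVersion(byte[] fileStart) {
        int end = 0;
        while (end < fileStart.length && fileStart[end] != '\n' && fileStart[end] != '\r') {
            end++;
        }
        // ISO-8859-1 maps every byte to one char, so binary bytes can't break the decoding.
        String firstLine = new String(fileStart, 0, end, StandardCharsets.ISO_8859_1);
        String prefix = "%PDF-";
        if (!firstLine.startsWith(prefix)) {
            return null;
        }
        return firstLine.substring(prefix.length()).trim();
    }

    public static void main(String[] args) {
        byte[] start = "%PDF-1.4\n".getBytes(StandardCharsets.US_ASCII);
        System.out.println(headerVersion(start));
    }
}
```

Passing the first bytes of hello.pdf to headerVersion() would return the string 1.4.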

By default, iText uses version 1.4, which was introduced in 2001. If you introduce functionality newer than what’s available in PDF 1.4 after step 3 in listing 1.1, it’s your responsibility to set the correct PDF version before step 3. Otherwise, the default version, PDF-1.4, will be written to the OutputStream, and there’s no going back.

 

Note

Beginning with PDF 1.4, the PDF version can also be stored elsewhere in the PDF (in the root object of the document, aka the catalog; see chapter 13). This implies that a file with header PDF-1.4 can be seen as a PDF 1.6 file if it’s defined that way in the document root.

 

In some cases, iText changes the PDF version automatically. In listing 1.3, you changed the user unit, and this capability was introduced in version 1.6 of the PDF specification. Because you changed the user unit before step 3, iText was able to update the PDF version in the header to %PDF-1.6.

It’s a better practice to set the version number with PdfWriter.setPdfVersion() if you use PDF features that are newer than what was available in PDF 1.4. Here’s how to change the PDF version to 1.7.

Listing 1.10. HelloWorldVersion_1_7.java
PdfWriter writer
    = PdfWriter.getInstance(document, new FileOutputStream(RESULT));
writer.setPdfVersion(PdfWriter.VERSION_1_7);

It’s not forbidden for the PDF version in the header to be different from the PDF version in the catalog, but it’s good practice to make setting the PDF version a part of your initializations to avoid ambiguity.

Initializations

Document.open() also performs many initializations. For instance, you can’t access the outline of the bookmarks before the document has been opened (see chapter 7). If you want to create an encrypted PDF file, you must set the encryption type, strength, and permissions before step 3 in listing 1.1 (see chapter 12).

 

FAQ

I have set feature X, and it doesn’t work, or it doesn’t work for page 1, only for the pages that follow. Why is that? Many settings, such as the page size and margins, only go into effect on the next page. This may seem trivial, but it’s a common question for new iText users. If you want the feature to work on page 1, define it before opening the document.

 

After step 3, the first page of the document is available for you to add content (step 4).

1.3.4. Adding content

In this section, we’re creating simple Hello World PDF documents, learning the elementary mechanics of iText’s PDF creation process. Once these are understood, you can start generating real-world documents containing real-world data.

To learn how to implement step 4, you’ll copy steps 1, 2, 3, and 5 from listing 1.1 into an application, then focus on step 4: adding content to the PDF document.

There are different ways to add content. Up until now, you’ve been adding one or more high-level objects of type Paragraph to the Document. In the next chapter, you’ll learn about other objects, such as Chunk, Phrase, Anchor, and List. You can also add content to a page using low-level methods.

Direct Content

Listing 1.11 shows a variation on this chapter’s initial “Hello World” example. Although this is a rather complex example for a first chapter about using iText, it will give you an idea of iText’s internal PDF-creation process.

Listing 1.11. HelloWorldDirect.java

Steps 1, 3, and 5 are the same as they were in listing 1.1, but you need to make a small change to step 2. Instead of using an unnamed instance of PdfWriter, you now give it a name: writer. You need this instance because you want to grab a canvas on which you can draw lines and shapes, and, in this case, text. In listing 1.11, comment sections were added, reflecting the PDF syntax that is written by each method.

By using the setCompressionLevel() method with a parameter of 0, you avoid compressing the stream. This allows you to read the PDF syntax when opening the file in a text editor. Figure 1.7 shows the resulting PDF when opened in WordPad.

Figure 1.7. PDF document opened in WordPad

This screenshot contains less gibberish than figure 1.6, though it’s showing the syntax of a similar “Hello World” PDF. You’ll recognize the PDF header, followed by a PDF object with number 2: 2 0 obj. After reading part 4 of this book, you’ll understand that this object is a stream object, the content stream of the first page. In figure 1.6, the content stream was compressed, but in figure 1.7, the compression is zero. You can see the syntax in clear text, although you’ll need to read chapter 14 to decipher what it means.

 

Note

Setting the compression level to 0 can be interesting if you need to debug your PDF file, but you shouldn’t change the compression level in a production environment, because the file size of the resulting PDFs will be bigger than files generated using the default compression level.

 

As you move on in this book, you’ll find out that you’ll need to add content directly to the page on different occasions, such as when adding page numbers, or when drawing custom borders for tables. As you might imagine, you’ll need a sound understanding of the PDF reference to achieve all this.

 

FAQ

I’ve added text using low-level methods and it doesn’t respect the margins, nor does the text wrap at the end of the line. What is wrong? That is expected behavior. When adding content like this, you need to do all the math necessary to split a String in different lines, and add it at the appropriate coordinates. Also, make sure that you don’t add the text outside the visible area of the page; this is a common mistake when adding text to an existing PDF document.
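
To give you an idea of “all the math,” here’s a deliberately simplified sketch of greedy line breaking. It assumes every character has the same width, which is only true for monospaced fonts; real code would ask the font for the width of each string. LineWrapSketch, wrap(), and baselines() are hypothetical helpers, not iText methods.

```java
import java.util.ArrayList;
import java.util.List;

final class LineWrapSketch {

    /** Splits text into lines that fit maxWidth, assuming a fixed width per character. */
    static List<String> wrap(String text, double maxWidth, double charWidth) {
        List<String> lines = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (String word : text.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue;
            }
            int candidateLength = current.length() == 0
                    ? word.length()
                    : current.length() + 1 + word.length();
            // Start a new line when adding the word would exceed the available width.
            if (current.length() > 0 && candidateLength * charWidth > maxWidth) {
                lines.add(current.toString());
                current.setLength(0);
            }
            if (current.length() > 0) {
                current.append(' ');
            }
            current.append(word);
        }
        if (current.length() > 0) {
            lines.add(current.toString());
        }
        return lines;
    }

    /** Computes the y coordinate of each line, moving down by the leading per line. */
    static double[] baselines(int lineCount, double firstBaselineY, double leading) {
        double[] y = new double[lineCount];
        for (int i = 0; i < lineCount; i++) {
            y[i] = firstBaselineY - i * leading;
        }
        return y;
    }
}
```

You’d then draw each line at its own coordinates, keeping an eye on the bottom of the page yourself, because nothing triggers a page break for you.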

 

Listing 1.11 gets increasingly complex as soon as you need to add more text. Fortunately, iText comes to the rescue: you can use convenience classes and methods that significantly reduce the complexity and the lines of code needed to work with direct content.

Convenience Classes and Methods

Listing 1.12 is identical to listing 1.11 as far as steps 1, 2, 3, and 5 are concerned, but in step 4 you create a Phrase object and add this to the direct content, named canvas, using the static method ColumnText.showTextAligned(). The phrase hello will be added left aligned at coordinates (36, 788) with rotation 0.

Listing 1.12. HelloWorldColumn.java

If you open the resulting PDFs from listings 1.11 and 1.12 in Adobe Reader, you’ll see that both documents look identical. If you open them in a text editor, you’ll notice that the syntax is slightly different. There’s usually more than one way to create PDF documents that look like identical twins when opened in a PDF viewer. And even if you create two identical PDF documents using the exact same code, there will be small differences between the two resulting files. That’s inherent to the PDF format.

We’re almost finished discussing the five steps in the PDF creation process. It’s time for step 5.

1.3.5. Closing the Document

One of the typical uses of iText is to create documents containing many pages. For example, a financial institution uses iText to create PDFs of bank statements, consisting of 100,000 or more pages. You don’t want to keep the content of that many pages in memory, and that’s why iText will write content to the OutputStream as soon as possible. If a page is full, the content stream of that page will be written to the OutputStream; if you’re writing to a file, that content will be flushed from the memory.

Content Flushed to the OutputStream Versus Content Kept in Memory

If you return to figure 1.6 or 1.7, you’ll see that object 2, the page content stream of page 1, appears as the first object in the file. Other objects will be added at a higher byte position, regardless of their object number. iText has to keep certain objects in memory because there’s a chance you’ll reuse them and change them during the creation process. You’ll use this mechanism in section 5.4.2 to add the total number of pages—a number that is only known when the final page is reached—to all the previous pages.

Specific objects, such as the catalog and the info dictionary, will be added last by iText. They’re written to the OutputStream upon closing the Document. There’s also the cross-reference table, an important structure that is written immediately after the catalog and info dictionary. It contains the byte positions of the PDF objects that define the document. It’s followed by the trailer, containing information that enables an application to quickly find the start of the cross-reference table, and objects such as the info dictionary. Finally, the following byte sequence will be added, indicating that the file has been completely written:

%%EOF

You don’t need to close the OutputStream you created in step 2. iText will close this stream right after the end-of-file sequence.

Keeping the OutputStream Open

There may be occasions when you don’t want the stream to be closed automatically.

Listing 1.13. HelloZip.java

You begin by creating a ZipOutputStream. It will generate a zip archive named hello.zip containing different PDF files. You use this OutputStream to create an instance of PdfWriter, but you immediately use the setCloseStream() method to tell the writer that it shouldn’t close the stream. If you don’t do this, the ZipOutputStream will be closed when the Document is closed, and a java.io.IOException will be thrown, saying “Stream closed.” You have to wait until you’ve closed the final entry added to the zip file before you can close the ZipOutputStream.
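
The underlying idea can be tried out without iText. In the sketch below, a small wrapper plays the role of a writer that has been told not to close its stream: its close() method flushes but leaves the ZipOutputStream open, so the next entry can still be added. KeepOpenZipSketch, NonClosingStream, and the placeholder content are assumptions made for this illustration; with iText you’d call setCloseStream(false) on the writer instead.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

final class KeepOpenZipSketch {

    /** Passes writes through, but ignores close() so the target stream stays open. */
    static final class NonClosingStream extends FilterOutputStream {
        NonClosingStream(OutputStream target) {
            super(target);
        }

        @Override
        public void write(byte[] b, int off, int len) throws IOException {
            out.write(b, off, len);
        }

        @Override
        public void close() throws IOException {
            flush(); // deliberately doesn't close the wrapped stream
        }
    }

    /** Builds a zip archive in memory containing three placeholder entries. */
    static byte[] zipThreeEntries() {
        try {
            ByteArrayOutputStream archive = new ByteArrayOutputStream();
            ZipOutputStream zip = new ZipOutputStream(archive);
            for (int i = 1; i <= 3; i++) {
                zip.putNextEntry(new ZipEntry("hello_" + i + ".pdf"));
                // Stand-in for a producer that closes its stream when it's done.
                try (OutputStream producer = new NonClosingStream(zip)) {
                    producer.write(("placeholder " + i).getBytes(StandardCharsets.US_ASCII));
                }
                zip.closeEntry();
            }
            zip.close(); // only now is the archive itself closed
            return archive.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Counts the entries in a zip archive held in memory. */
    static int countEntries(byte[] zipBytes) {
        int count = 0;
        try (ZipInputStream in = new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            while (in.getNextEntry() != null) {
                count++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }
}
```

If the wrapper’s close() closed the ZipOutputStream, the call to closeEntry() that follows would fail with the “Stream closed” exception described above.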

This example concludes our series of simple “Hello World” examples. You now have a solid first impression of how to use iText to create new PDF documents.

1.4. Summary

In this first introductory chapter, you’ve had a brief introduction to PDF, learning what is possible in PDF and what is possible with iText.

You’ve compiled and executed a first example, generating a simple “Hello World” PDF document. Using listings 1.1 through 1.13, you’ve created 15 similar files, of which three were archived in a zip file. In doing so, you’ve gone through the five elementary steps in iText’s PDF-creation process: create a Document, get a PdfWriter instance, open the Document, add content, close the Document.

This chapter contained many forward references, and some of the examples introduced functionality that was probably too complex for a first chapter, but don’t worry: every line of code will be explained further on in the book.

In the next chapter, you’ll create PDFs with content that is more meaningful. I’ll introduce a simple movie database and you’ll use iText’s high-level objects to publish the content of this database in different PDF documents.
