About this Book

This book will teach you about PDF, Adobe’s Portable Document Format, from a Java developer’s point of view. You’ll learn how to use iText in a Java/J2EE application to produce and manipulate PDF documents. Along the way, you’ll become acquainted with interesting PDF features and discover e-document functionality you may not have known about before.

Who should read this book?

This book is intended for Java developers who want to enhance their projects with dynamic PDF generation or manipulation. It assumes you have some background in Java programming.

This book includes lots of ready-made solutions that can easily be adapted and integrated into larger projects. For reasons of convenience, most of the examples are constructed as standalone command-line applications. If you want to run these examples in a web application, you should know how to set up an application server, where to put the necessary Java archive files (JARs) and resources, and how to deploy a servlet.

.NET developers using iTextSharp, the C# port of iText, can also benefit from this book, but they’ll have to adapt the examples.

Knowledge of the Portable Document Format isn’t necessary, because this book will explain a good deal of the PDF functionality and syntax where needed. ISO-32000-1 is a good companion to this book, for those who want to know every detail about PDF internals.

How to use this book

You can read this book chronologically, starting with the part about creating PDFs, moving on to the part about manipulating documents, and then learning some essential skills in part 3. Part 4 looks under the hood and digs deeper into the PDF specification.

You can also read the book in random order or thematically, selecting specific chapters that explain how to meet your own requirements. Once you’re well acquainted with iText, you’ll probably use the book as a reference manual. In particular, the tables in chapter 14 are the result of my own frustration with tables that were too scattered throughout different chapters in the first edition.

What you’ll be able to achieve after reading this book

The book consists of four parts:

  • Part 1 Creating PDF documents from scratch
  • Part 2 Manipulating existing PDF documents
  • Part 3 Essential iText skills
  • Part 4 Under the hood

Throughout this book, the examples use a movie database created for a (fictional) film festival. You’ll access this database from a series of simple applications, creating and manipulating different PDF files that could be useful for the visitors of the imaginary film festival.

Creating PDF documents

In chapters 1 and 2 you’ll create a series of PDF documents from scratch. You’ll use SQL statements to query a movie database, loop over the ResultSet, and add the data from each record to a PDF document using high-level objects such as Chunks, Phrases, Paragraphs, and so on. You’ll create PDF documents without having to know anything about the PDF specification.

In chapter 3, you’ll learn how to draw lines, shapes, and text to create a timetable visualizing the screenings, using a different color for each festival category. To achieve this, you’ll need low-level operations that demand a sound understanding of how PDF works.

In chapter 4, one of the most important chapters of the first part, you’ll use the database information to create documents containing tabular data. You’ll learn almost everything there is to know about the PdfPTable and PdfPCell objects.

Your knowledge about tables and cells will be completed in chapter 5, where you’ll learn how to add custom behavior to a table and its cells using events. Finally, you’ll also learn about page events. You’ll add the finishing touch to your documents in the form of headers, footers, page numbers, and a watermark.

After reading the first part of the book, you’ll be able to write a proof of concept for any project that requires you to generate PDF reports from scratch. If your project also involves existing PDF documents, you’ll need to move on to part 2.

Manipulating PDF documents

Consider what you can do with paper documentation: you can bundle different articles into a book, you can cut out the pages of a large catalog to create a brochure containing only those pages that are interesting for your customers, you can fill out blanks in an exercise book, and so on.

All of this is also possible with PDF and iText. You’ll use PdfReader to access an existing PDF file, and you’ll use one or more of these document manipulation classes:

  • PdfWriter in combination with PdfImportedPage objects, if you want to take “photocopies” of specific pages
  • PdfStamper, if you want to add content to an existing PDF document
  • PdfCopy, PdfSmartCopy, or PdfCopyFields to combine a selection of pages from different, existing documents into a new PDF document

All these classes will be explained in chapter 6.

You’ll have a closer look at the PdfStamper class in chapter 7, where you’ll use it to annotate a document.

You can interpret the word “annotate” in different ways. One special type of annotation in PDF is the interactive form field. These are used in forms using AcroForm technology. Another type of PDF form is based on the XML Forms Architecture (XFA). You’ll learn about both types of interactive forms in chapter 8.

Having read parts 1 and 2, you’ll have a good idea of the possibilities offered by iText, but there’s more.

Essential iText skills

For the sake of simplicity, most of the examples in this book are standalone applications, but a majority of projects use iText as a PDF engine in server-side web applications. You’ll certainly benefit from chapter 9 if you want to avoid the pitfalls you might encounter while integrating your iText application into a Java servlet.

Once your proof of concept is online, you’ll probably be confronted with many extra user requirements:

  • Can you change this or that color?
  • Can you print the text in a different font?
  • Can you protect the document against abuse?

Part 3 will complete your knowledge about iText.

After mastering the content of the first three parts of the book, you’ll be able to meet over 90 percent of the standard requirements that have come up on the iText mailing list over the past 10 years. But please read on if you’re hungry for more.

Under the hood

While the first three parts give you the high-level view of PDF, part 4 will focus on the lowest level of PDF creation and manipulation. You should read this part

  • if you want to know what a PDF looks like under the hood
  • if you need a short introduction to and a quick reference for ISO-32000-1
  • if you want to learn how to tweak PDF files using iText’s low-level objects and methods

In chapter 13, you’ll learn that PDF has undergone many changes over the years. One of Adobe’s important goals was that every new version of the specification had to be backward-compatible. This was possible thanks to the well-designed architecture of a PDF file (the Carousel Object System). By studying the different objects that make up a PDF document, you’ll learn how iText creates a PDF file.
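
As a rough illustration of the file architecture mentioned above, the following sketch writes a tiny one-page PDF by hand, using only the Java standard library. It doesn't use iText and isn't meant to show how iText is implemented; the class name MinimalPdf, the A4 page size, and the Helvetica font are assumptions chosen for the example. It shows the four building blocks of a PDF file: a header, a body of numbered objects, a cross-reference table holding the byte offset of each object, and a trailer.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

/** Hypothetical helper (not part of iText): builds a one-page PDF by hand. */
public class MinimalPdf {

    /** Returns the raw bytes of a one-page PDF showing the given ASCII text. */
    public static byte[] build(String text) {
        // Backslashes and parentheses must be escaped inside a PDF string.
        String escaped = text.replace("\\", "\\\\")
                             .replace("(", "\\(")
                             .replace(")", "\\)");
        // Content stream: begin text, select font, move to position, show text.
        String content = "BT /F1 24 Tf 72 720 Td (" + escaped + ") Tj ET";
        int length = content.getBytes(StandardCharsets.US_ASCII).length;

        // The body: five indirect objects, numbered 1 to 5.
        String[] objects = {
            "<< /Type /Catalog /Pages 2 0 R >>",
            "<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
            "<< /Type /Page /Parent 2 0 R /MediaBox [0 0 595 842] "
                + "/Resources << /Font << /F1 5 0 R >> >> /Contents 4 0 R >>",
            "<< /Length " + length + " >>\nstream\n" + content + "\nendstream",
            "<< /Type /Font /Subtype /Type1 /BaseFont /Helvetica >>"
        };

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        write(out, "%PDF-1.4\n");                       // header

        int[] offsets = new int[objects.length];
        for (int i = 0; i < objects.length; i++) {
            offsets[i] = out.size();                    // byte offset of object i + 1
            write(out, (i + 1) + " 0 obj\n" + objects[i] + "\nendobj\n");
        }

        // Cross-reference table: one 20-byte entry per object, plus entry 0.
        int xrefOffset = out.size();
        StringBuilder tail = new StringBuilder();
        tail.append("xref\n0 ").append(objects.length + 1).append('\n');
        tail.append("0000000000 65535 f \n");
        for (int offset : offsets) {
            tail.append(String.format(Locale.ROOT, "%010d 00000 n \n", offset));
        }

        // Trailer: points to the catalog and to the start of the xref table.
        tail.append("trailer\n<< /Size ").append(objects.length + 1)
            .append(" /Root 1 0 R >>\n");
        tail.append("startxref\n").append(xrefOffset).append("\n%%EOF\n");
        write(out, tail.toString());
        return out.toByteArray();
    }

    private static void write(ByteArrayOutputStream out, String s) {
        byte[] bytes = s.getBytes(StandardCharsets.US_ASCII);
        out.write(bytes, 0, bytes.length);
    }

    public static void main(String[] args) {
        // Print the file as text so its structure can be inspected.
        System.out.print(new String(build("Hello World"), StandardCharsets.US_ASCII));
    }
}
```

Saving the bytes returned by MinimalPdf.build("Hello World") to a file should give a document that a PDF viewer can open. The cross-reference table is what allows a viewer to jump straight to any object without reading the whole file.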

Chapter 14 focuses on the streams holding the content of a page in a PDF document. You’ll learn all the methods for drawing lines and shapes (graphics state), and for writing letters and words (text state).

In chapter 15, you’ll discover how to make content optional, and you’ll also learn about structure in the content stream of a page. You’ll learn how to parse content streams of existing PDF pages.

Finally, you’ll get a closer look at the other streams that can be found in a PDF document: images, fonts, file attachments, and rich media.

The goal of the book

My goal for this book is for it to become a must-have reference for the many developers who are already familiar with iText. With this book, they’ll have a complete overview of iText’s powerful PDF capabilities. But let’s not forget the first-time users of iText. This book will lower their learning curve and inspire them to use PDF in ways they hadn’t previously considered.

Code conventions

First use of technical terms is in italic. The same goes for emphasized terms.

Source code in listings or in text is in fixed width font. Some code lines are in bold fixed width font for emphasis. Java methods and parameters, XML elements and attributes, PDF operators and operands, are also presented using fixed width font. PDF names are preceded by a forward slash; this is a /Name. Methods can be recognized by the parentheses that are added: this is a method(). In most cases, the parameters are omitted but are explained in the text.

Occasionally, code lines that are too long for the page but that shouldn’t be split on screen are broken with a code-continuation character (➥).

Code annotations accompany many of the source code listings, highlighting important concepts. Numbered annotations correspond to explanations that follow the listing.

Software requirements and downloads

iText is a free and open source library distributed by 1T3XT BVBA. You can download it from itextpdf.com or from the SourceForge site. The software is protected by the Affero General Public License (AGPL). iText requires Java 5; iTextSharp requires .NET 2.0.

All examples have been tested in a Sun Java runtime environment on Windows XP and Fedora Linux. You can download the source code, resources, and all the tools that are required to compile and run the examples from the SVN repository on SourceForge or from the publisher’s website at www.manning.com/iTextinActionSecondEdition.

See appendix B.1.2 to find out how to get access to these examples.
