Chapter 1. Meet Lucene
Figure 1.1. Searching the internet with Google
Figure 1.2. Mac OS X Finder with its embedded search capability
Figure 1.3. Apple’s iTunes intuitively embeds search functionality.
Figure 1.5. Classes used when indexing documents with Lucene
Chapter 2. Building a search index
Figure 2.2. Segmented structure of a Lucene inverted index
Figure 2.3. A single IndexWriter can be shared by multiple threads.
Figure 2.4. An in-memory document buffer helps improve Lucene’s indexing performance.
Chapter 3. Adding search to your application
Figure 3.2. The relationship between the common classes used for searching
Figure 3.3. Lucene uses this formula to determine a document score based on a query.
Figure 3.5. Sloppy phrase scoring formula
Chapter 4. Lucene’s analysis process
Figure 4.2. A token stream with positional and offset information
Figure 4.5. TokenFilter and Tokenizer class hierarchy
Figure 4.6. SynonymAnalyzer visualized as factory automation
Figure 4.7. ChineseDemo illustrating analysis of the title Tao Te Ching
Figure 4.8. Analysis chain that includes character normalization
Chapter 5. Advanced search techniques
Figure 5.1. SpanTermQuery for brown
Figure 5.2. SpanFirstQuery requires that the positional match occur near the start of the field.
Figure 5.3. SpanNearQuery requires positional matches to be close to one another.
Figure 5.4. One clause of the SpanOrQuery
Figure 5.5. Term vectors for two documents containing the terms cat and dog
Figure 5.6. Formula for computing the angle between two term vectors
Chapter 6. Extending search
Figure 6.1. Which Mexican restaurant is closest to home (at 0,0) or work (at 10,10)?
Chapter 7. Extracting text with Tika
Chapter 8. Essential Lucene extensions
Figure 8.1. This Luke dialog box provides interesting options for opening the index.
Figure 8.2. Luke’s Overview tab allows you to browse fields and terms.
Figure 8.3. Luke’s Documents tab shows all fields for the document you select.
Figure 8.4. Searching: an easy way to experiment with QueryParser
Figure 8.6. Luke includes several useful built-in plug-ins.
Figure 8.7. Highlighting matching query terms within text
Figure 8.8. Java classes and interfaces used by Highlighter
Figure 8.9. FastVectorHighlighter supports multicolored hit highlighting out of the box.
Chapter 9. Further Lucene extensions
Figure 9.1. WordNet shows word interconnections, such as this entry for the word search.
Figure 9.2. Viewing the synonyms for search using Luke’s Documents tab
Figure 9.3. Three common options for building a Lucene query from a search UI
Figure 9.4. Advanced search user interface for a job search site, implemented with XmlQueryParser
Figure 9.6. Tiers and grid boxes recursively divide two dimensions into smaller and smaller areas.
Figure 9.7. Remote searching through RMI, with the server searching multiple indexes
Chapter 11. Lucene administration and performance tuning
Figure 11.1. Steps to test indexing throughput on Wikipedia articles
Figure 11.2. ThreadedIndexWriter manages multiple threads for you.
Figure 11.4. File descriptor consumption while building an index of Wikipedia articles
Chapter 12. Case study 1: Krugle
Chapter 13. Case study 2: SIREn
Chapter 14. Case study 3: LinkedIn
Figure 14.2. Zoie’s three-index architecture: two in-memory indexes and one disk-based index
Figure 14.3. The read-only JMX view of Zoie’s attributes, as rendered by JConsole
Figure 14.4. Zoie exposes controls via JMX, allowing an operator to change its behavior at runtime.
Appendix B. Lucene index format
Figure B.1. The logical, black-box view of a Lucene index
Figure B.2. Unoptimized index with three segments, holding 24 documents