Abstract Window Toolkit.
See AWT.
access control list.
See ACL.
ACL
Activity Monitor process monitor
administration interface
Adobe Flash, extracting text from, 2nd
Adobe PDF, extracting text from, 2nd.
See also PDF.
AFP
used for indexing
AgoFilterBuilder
AllDocCollector, 2nd
AlreadyClosedException
analysis, Snowball, supported languages.
See also indexing, analysis.
analytics interface
Google Analytics
Lucene-specific metrics
analyzer, provided during indexing
AnalyzerDemo, 2nd, 3rd
AnalyzerUtils, 5th
displayTokens, 2nd
displayTokensWithFullDetails
displayTokensWithPositions
tokensFromAnalysis
AnalyzingQueryParser
Another Tool for Language Recognition.
See ANTLR.
Ant, building Lucene.
See also Apache Ant.
ANTLR, 2nd
Apache Ant
preparing to use
to build contrib modules
Apache Commons
columnar formatting
Digester, indexing using
Apache Jakarta
Apache JMeter
Apache POI project
Apache Software Foundation
Apache Software License
Apache Subversion instance
Apache Tika.
See Tika.
Aperture
open source project
Apple Mac OS X, search feature
AR Archives, extracting text from
ArabicAnalyzer
Aroush, George
ASCIIFoldingFilter, 2nd
Asian language analysis
Attribute
AttributeSource
addAttribute
analysis
captureState
restoreState
audio formats, extracting text from metadata
AutoDetectParser, 3rd
Tika class
Autonomy
AWT
backward compatibility.
See Version.
BalancedMergePolicy
BalancedSegmentMergePolicy
Balmain, David
Beagle, 2nd
benchmark, OpenIndex
Berkeley DB, storing index
BerkeleyDBJESearcher
Bialecki, Andrzej
Bobo Browse
beyond simple faceting
integration with Zoie
Runtime FacetHandlers
sorting
BoboBrowser
BoboIndexReader, 2nd
BodyContentHandler, 2nd
BookLinkCollector
BooksLikeThis
using MoreLikeThis
BooleanClause
Occur
MUST
MUST_NOT
SHOULD
BooleanFilter
BooleanQueryBuilder
BooleanScorer
BoostingQuery
negativeQuery
positiveQuery
boosts
BrazilianAnalyzer
Browsable
Browse Engine, denormalization
BrowseFacet
BrowseHit
BrowseRequest
setFilter
BrowseResult
getFacets
getHits
BrowseSelection
Builder
BulletinPayloadsAnalyzer
BulletinPayloadsFilter
BZIP2 files, extracting text from, 2nd
C#, tokenizing
C++, tokenizing
CachedFilter
caching
field values
filter
CachingSpanFilter
CachingTokenFilter
used during highlighting
CachingWrapperFilter, 2nd, 3rd, 4th
CartesianTierPlotter
Cascading Style Sheets.
See CSS.
Catasta, Michele
catchall field, for searching multiple fields
categorizing documents, using term vectors
CellConjunctionScorer
CellDisjunctionScorer
CellQuery
CellReqExclScorer
CellScorer
ChainedFilter
combining with AND
combining with ANDNOT
combining with OR
combining with XOR
security filter example
chaining filters
charades
CharFilter
CharReader
CharStream
CharTokenizer
Chinese analysis
Chinese, Japanese, and Korean.
See CJK.
ChineseAnalyzer
ChineseDemo
ChineseTest
CIFS.
See Samba file system.
CJK, analysis of
CJKAnalyzer, 2nd
client-server port definition
CloseIndex benchmark task
CLucene
API compatibility
supported platforms
Unicode support
Collator, used for sorting String fields
Collector, 3rd, 4th
acceptsDocsOutOfOrder
collect
custom
setNextReader
setScorer
using field cache
common errors
Comparable
Compass
denormalization
ComplexPhraseQueryParser
compound index, creating
CompressionTools
ConcurrentMergeScheduler, 2nd, 3rd, 4th, 5th
setMaxThreadCount
ConcurrentModificationException
ConstantScoreQuery
content
acquiring
dividing into shards
raw, extracting for documents
ContentHandler, 2nd
ContentSource, 2nd
contrib modules
introduction
spatial search
coordination, query term
CorePlusExtensionsParser
CorruptIndexException
CPIO Archives, extracting text from
createLineFile.alg
CreateSpellCheckerIndex
CreateThreadedIndexTask
CSS
in highlighting
CustomQueryParser
CustomScoreQuery
getCustomScoreProvider
Cutting, Doug, 3rd
relevant work
CzechAnalyzer
database
primary key
storing index inside Berkeley DB
DatabaseConfig
DateField, dateToString
DateFilter, within ChainedFilter
DateFormat, SHORT
DateRecognizerSinkTokenizer
DateTools
debugging queries
DefaultEncoder
DefaultSimilarity
Delbru, Renaud
DeletionPolicy
denormalization
DERI
Sindice.com search engine
Dictionary
Digester.
See also Apache Commons Digester.
addCallMethod
addObjectCreate
addSetNext
addSetProperties
DigesterXMLDocument, 2nd
Digg
Digital Enterprise Research Institute.
See DERI.
DirContentSource
Directory implementations
FileSwitchDirectory
MMapDirectory
NIOFSDirectory
RAMDirectory
SimpleFSDirectory
directory in Berkeley DB
DirectSolrConnection
DisjunctionMaxQuery
tie-breaker
DistanceComparatorSource
DistanceQueryBuilder
DistanceSortSource
DocIdBitSet
DocIdSet
Document, 3rd, 4th, 7th, 8th
editing with Luke
reuse
setBoost
document type definition.
See DTD.
documentation
documents and fields
DOMUtils
Donovan, Aaron
downloading Lucene
Droids
DSight, denormalization
DTD
DuplicateFilter
DutchAnalyzer
dynamic fragmenting vs. highlighting
EdgeNGramFilter
EdgeNGramTokenizer
Edit distance.
See Levenshtein distance.
Elastic search, sharding and replication
Elschot, Paul
encoding UTF-8
entitlements, definition
EnvironmentConfig
EnwikiContentSource
Eventful
Excel.
See Microsoft Excel.
Explanation
Extensible Hypertext Markup Language.
See XHTML.
Extensible Stylesheet Language.
See XSL.
FacetAccessible
faceted search
Bobo Browse
definition
FacetHandler, 2nd
FacetSpec
setMaxCount
setMinHitCount
setOrderBy
FastVectorHighlighter
compared to Highlighter
Ferret
field cache
DEFAULT
memory usage, 2nd
per segment readers
setInfoStream
used by sorting
used for sorting
field options
combinations
compressing fields
indexing
ANALYZED
ANALYZED_NO_NORMS
NO
NOT_ANALYZED
NOT_ANALYZED_NO_NORMS
sorting
storing
NO
YES
term vectors
NO
WITH_OFFSETS
WITH_POSITIONS
WITH_POSITIONS_OFFSETS
YES
FieldCacheRangeFilter, 2nd
FieldCacheSource
FieldCacheTermsFilter, 2nd
FieldComparator
FieldDocs
FieldMaskingSpanQuery
FieldNormModifier
FieldQuery
FieldScoreQuery
FieldSelector, 2nd, 8th
accept
loading only specified fields
specify fields by set
stopping after first field
time savings
FieldSelectorResult
LAZY_LOAD
LOAD
LOAD_AND_BREAK
LOAD_FOR_MERGE
NO_LOAD
SIZE
SIZE_AND_BREAK
FieldSortedTermVectorMapper
file descriptors, finding the limit
FileNotFoundException over remote file systems
FileSwitchDirectory
FilteredDocIdSet, 2nd
match
FilteredQuery, 2nd, 3rd
filtering token.
See TokenFilter.
finding similar documents using termvectors
FlagsAttribute
Flash.
See Adobe Flash.
Formatter
FragmentsBuilder
FrenchAnalyzer
frequency factor formula
fsync
Fuller, Robert
function queries
boosting by recency
using field cache
FuzzyLikeThisQuery
FuzzyQuery, 4th, 5th, 6th, 7th
formula
minimumSimilarity
prohibiting
GermanAnalyzer
Glouser, Grant
Google Analytics
Google Enterprise Connector Manager
GradientFormatter
Grails search plugin
GreekAnalyzer
Grub
GZIP compression, extracting text from
Hadoop, creation of
Harwood, Mark, 2nd, 3rd
hasDeletions
Hatcher, Erik
Heritrix
Hibernate Search, denormalization
hierarchical organizational schemes
HighFreqTerms
highlighting
query terms, 2nd
using CSS
vs. dynamic fragmenting
HighlightIt
Hoschek, Wolfgang
HTML
cookie
extracting text from, 2nd
meta tag
parsing
HtmlParser
HTTP headers, indexing Last-Modified header
HTTP request, content type
HttpServletRequest
Humphrey, Marvin, 2nd
I18N.
See internationalization.
IDF, 2nd
images, extracting text from metadata
index structure, converting
index, inverted
IndexCommit
getUserData
IndexDeletionPolicy, 2nd, 3rd
example usage
Indexer program
IndexFiles
IndexHTML
indexing classes
IndexMergeTool
IndexReaderDecorator, 2nd
IndexReaderFactory
IndexReaderWarmer
IndexSplitter
IndexWriter
information
explosion, dealing with
overload
specific, locating quickly
information retrieval.
See IR.
InputStream
INSO, filters
Installing Lucene
InstantiatedIndex
InstantiatedIndexWriter
intelligent agent, creating
internationalization
InvalidTokenOffsetException
inverse document frequency.
See IDF.
inverted index.
See index, inverted.
InvIndexer
IR
definition
library vs. search engine
ISYS file readers
iTunes search feature
J2ME
JaroWinkler, distance metric for spell correction
Java 2 Micro Edition.
See J2ME.
Java C Compiler.
See JCC.
Java class files, extracting text from
Java JAR files, extracting text from
Java Management Extensions.
See JMX.
Java Native Interface.
See JNI.
Java Runtime Environment.
See JRE.
javac, compile with UTF-8 encoding
JavaCC, building Lucene
JavaServer Page.
See JSP.
JCC
JConsole used by Zoie
JEDirectory
Jetty
JFlex, 2nd
building Lucene
usage in SIREn
JMX
used by Zoie
JNI
Jones, Tim
JRE, 2nd
JSP
Jython
Katta, sharding and replication
KeepOnlyLastCommitDeletionPolicy, 2nd
KeyView filters
keyword analyzer
KeywordAnalyzer, 2nd, 3rd, 4th
KeywordTokenizer
KinoSearch
differences vs Lucene
Krugle
enterprise appliance
Krugle.org
Krugler, Ken
KStem
language detection
Last.fm
lemmatization
LengthFilter
letter ngrams used by spellchecker
LetterTokenizer, 2nd, 3rd
Levenshtein
distance
distance metric for spell correction
LineDocSource, 2nd
LinkedIn, 2nd, 3rd
LoadFirstFieldSelector
local wrapper port, definition
LockFactory
locking
during indexing
write.lock file
LockObtainFailedException
LockStressTest
LockVerifyServer
LogByteSizeMergePolicy
setMaxMergeDocs
setMaxMergeMB
setMergeFactor
setMinMergeMB
LogDocMergePolicy
LowerCaseFilter, 2nd, 3rd, 4th
LowerCaseTokenizer, 2nd, 3rd, 4th
lowercasing, order may matter
lsof
Lucene ports
Lucene.Net
API compatibility
index compatibility
performance
LUCENE_24
LUCENE_29
Lucy
Luke, 2nd, 3rd, 19th
Analyzer Tool
browsing by term
browsing term vectors
Custom Similarity
document browsing
editing documents
Hadoop Plugin
indexing file view
Overview tab
scripting with JavaScript
search explanation
searching
searching with QueryParser
viewing synonyms
viewing term statistics
LuSQL, denormalization
Mannix, Jake
MAP
MapFieldSelector
MappingCharFilter
MatchAllDocsQuery, 2nd
used for browsing facets
Maven 2, used by Tika
maxDoc vs. numDocs
MaxFieldLength
UNLIMITED
UNLIMITED or LIMITED
MD5, reducing field cache memory usage
mean average precision.
See MAP.
mean reciprocal rank.
See MRR.
MemoryIndex
mergeFactor, 3rd, 4th, 5th
performance impact
MergePolicy, 2nd, 3rd, 5th
avoiding large segments
MergeScheduler, 2nd
merging
LogByteSizeMergePolicy
LogDocMergePolicy
waiting for merges to finish
Metadata, Tika class
Metaphone
Microsoft Excel, extracting text from, 2nd
Microsoft Office 2007, extracting text from
Microsoft Outlook, extracting text from, 2nd
Microsoft PowerPoint, extracting text from, 2nd
Microsoft Visio, extracting text from, 2nd
MIDI files, extracting text from
Miller, George and WordNet
MMapDirectory, 2nd
Montezuma
MoreLikeThis
MoreLikeThisQuery
MP3 audio, extracting text from tags
MRR
MultiFieldQueryParser
default operator
interactions with Analyzer
multifile index, creating
MultiPassIndexSplitter
MultiPhraseQuery, 2nd, 3rd
QueryParser
slop
MultiSearcher, 2nd, 3rd
multithreaded searching.
See ParallelMultiSearcher.
native port, definition
native2ascii, Java tool
NativeFSLockFactory, 2nd
near-real-time reader
near-real-time search, 4th, 6th
avoiding commit
introduction
reducing turnaround time
Networked File System.
See NFS.
newBooleanQuery
newFuzzyQuery
newMatchAllDocsQuery
newMultiPhraseQuery
newPhraseQuery
newPrefixQuery
newRangeQuery
newTermQuery
newWildcardQuery
NFS
sharing index over
used for indexing
NGramTokenizer
NIOFSDirectory, 2nd
NoLockFactory
non-English language analysis
normalization
field length
query
NullFragmenter
numDocs vs. maxDoc
numeric range queries
NumericField, 6th, 9th, 10th
filtering during searching
precisionStep
setDoubleValue
setIntValue
setLongValue
sorting
NumericPayloadTokenFilter
NumericRangeFilter, 2nd, 3rd
NumericRangeQuery, 2nd, 3rd
created by QueryParser
creation from QueryParser
precisionStep
NutchDocumentAnalyzer
O’Leary, Patrick
OfficeParser
OffsetAttribute, 2nd
endOffset
OLE
Open Office, extracting text from
open source software, judging success
OpenBitSet, used by Filter
OpenDocument files, extracting text from
OpenSolaris, open file limit
optimize
Oracle/Lucene integration, denormalization
OS, I/O cache
Outlook.
See Microsoft Outlook.
OutOfMemoryError
OutOfMemoryException
OutputStream
paging through results
ParallelMultiSearcher, 2nd, 3rd
Parr, Terence
ParseContext
ParseException
parsing, 3rd
query expressions
QueryParser method
versus analysis.
See QueryParser.
ParsingReader
partitioning indexes
PayloadAttribute, 2nd
PayloadHelper
PayloadNearQuery
payloads, 9th
access via TermPositions
and SpanQuery
constructors
during analysis
during searching
example uses
usage in SIREn
used by SIREn
PayloadTermQuery, 2nd
PDF.
See also Adobe PDF.
PDFBox
PDFParser
PerFieldAnalyzerWrapper, 2nd
Perl, tokenizing
per-segment searching, field cache
PersianAnalyzer
PHP Bridge
PhraseQuery, 7th, 11th, 12th, 14th
contrasted with SpanNearQuery
converting to SpanNearQuery
forcing term order
from QueryParser
multiple terms
scoring
slop
slop factor
with synonyms
PipedReader
PipedWriter
plain text, detecting character set
PLucene
Porter stemmer.
See Porter stemming algorithm.
Porter stemming algorithm
Porter, Dr. Martin, 2nd
PorterStemFilter, 2nd, 3rd, 4th
PositionalPorterStopAnalyzer
PositionBasedTermVectorMapper
PositionIncrementAttribute
setPositionIncrement
positionIncrementGap
PowerPoint.
See Microsoft PowerPoint.
PrecedenceQueryParser
precision, definition
PrefixFilter, 2nd
PrefixQuery, 2nd
PrintStream, 2nd
probabilistic model
Process Monitor
properties file, encoding
ps Unix process monitor
pure Boolean model
PyLucene, 2nd
API compatibility
Python, tokenizing
queries, built-in
query expression.
See QueryParser.
QueryAutoStopWordAnalyzer
QueryBuilder
QueryBuilderFactory
querying
QueryNodeProcessor
QueryTemplateManager
QueryTermScorer
QueryWrapperFilter, 2nd, 3rd, 4th
RAID array
RAMDirectory, 2nd, 3rd, 4th, 5th, 6th
RangeFilter
RDF, 2nd
creating the Web of Data
definition
triplestores
ReadTokens task
RecencyBoostingQuery
RegexFragmenter
RegexQuery
regular expressions.
See WildcardQuery.
relational database
relevance
remote file systems
Remote Method Invocation.
See RMI.
remote procedure call.
See RPC.
remote searching
RemoteSearchable
RemoteSearcher
removing common terms.
See stop words.
Representational State Transfer.
See REST.
Resource Description Framework.
See RDF.
REST
ReutersContentSource
reverse native port, definition
ReverseStringFilter
Rich Text Format.
See RTF.
robocopy, for hot backups of an index
RPC
RSolr
rsync, for hot backups of an index
RTF
extracting text from, 2nd
Ruby, tokenizing
RussianAnalyzer
Samba file system, used for indexing
SAX
parsing using
scaling
index replication
index sharding
schema, flexible
SCM
SCMI
score
ScoreCachingWrapperSource
ScoreDoc, 2nd, 3rd
ScoreOrderFragmentsBuilder
scoring
formula
raw score
scrolling.
See paging.
search model
probabilistic
pure Boolean
vector space
search within search, using Filters
Searchable, 2nd, 3rd
SearchClient
Searcher program
SearcherManager
get
maybeReopen
release
warm
SearchFiles
searching classes
SearchServer
security filtering
mixed static and dynamic
SegmentReader, 2nd
SegmentTermEnum, next
Sekiguchi, Koji
Semantic Information Retrieval Engine.
See SIREn.
semantic web
semistructured data
SerialMergeScheduler, 2nd
SetBasedFieldSelector
setMaxBufferedDeleteTerms
setRAMBufferSizeMB
shard, definition
Similarity, 3rd, 4th
improving default relevance
lengthNorm
similarity between documents.
See term vectors.
similarity scoring formula
Simple API for XML.
See SAX.
SimpleDateFormat
SimpleFragmenter
SimpleFSDirectory, 2nd
SimpleFSLockFactory
SimpleHTMLEncoder
SimpleHTMLFormatter
SimpleSpanFragmenter
SingleInstanceLockFactory
sinks
SinkTokenizer
SinusoidalProjector
SIREn
benchmarks
BooleanQuery performance compared to Lucene
data model
data preparation
postings format compared to Lucene
searching entities
semistructured search, 2nd
SirenPayloadFilter, 2nd
slop
factor defined
with MultiPhraseQuery
with SpanNearQuery
SmartChineseAnalyzer, 2nd, 3rd
Snowball stemmer
SnowballAnalyzer, 2nd
solid-state disk.
See SSD.
SolPerl
SolPHP
SolPython
Solr, 5th
creating analysis chain
Ruby response format
sharding and replication
SIREn integration, 2nd
Solr.pm
Solr.QParser
Solr.QParserPlugin
SortedTermVectorMapper
SortField, 3rd
types
SortingExample
Soundex.
See Metaphone.
source code management interface.
See SCMI.
source code management.
See SCM.
span queries
access to payloads
combining
dumpSpans method
excluding matches
matching near one another
matching near the field start
matching single term
phrase within phrase matching
QueryParser
turning into a filter
SpanFirstQuery, 2nd
SpanGradientFormatter
SpanNearQuery, 2nd, 3rd, 8th, 9th, 10th
contrasted with PhraseQuery
deriving from PhraseQuery
inOrder flag
slop
SpanNotQuery, 2nd
SpanOrQuery, 2nd, 3rd
SpanQuery, 5th, 7th, 8th
aggregating
and QueryParser
getSpans
visualization utility
SpanQueryFilter, 2nd
bitSpans
SpanRegexQuery
SpanScorer, 2nd
SpanTermQuery, 2nd, 3rd, 4th
SPARQL query language
SPARQLParser
SPARQLParserPlugin
SPARQLQueryAnalyzer, 2nd
SpecialsAccessor
SpecialsFilter, 2nd
SpellChecker
setAccuracy
suggestSimilar
Spencer, David, 2nd
spider.
See web crawler.
Spolsky, Joel
Spotlight search
Spring
Spring-RPC
SSD
Stale NFS file handle exception
StandardFilter, 2nd
StandardQueryParser
StandardTokenizer, 2nd
Stellent document filters.
See INSO filters.
stemmers, SnowballAnalyzer family
stemming analyzer
stop words, 4th
default
removing
StopAnalyzer, 2nd, 3rd, 4th, 5th
StopFilter, 2nd, 3rd, 4th
setEnablePositionIncrements
StopWordFilter
Store, YES
stored fields, custom loading
String.compareTo, compares by UTF-16 code unit
StringDistance, getDistance
StringUtils
swappiness, controlling swapping on Linux
SweetSpotSimilarity
SynLookup
SynonymAnalyzer, 2nd, 3rd
SynonymAnalyzerViewer
SynonymEngine
Syns2Index
System, nanoTime
Tan, Kelvin
TAR Archives, extracting text from, 2nd
tar, for hot backups of an index
TeeSinkTokenFilter
TeeTokenFilter
Term
TermAttribute
TermFreqVector
TermPositions
TermPositionVector
TermRangeFilter, 6th
includeLower
includeUpper
open-ended ranges
with caching
TermRangeQuery, 2nd, 4th, 5th
created by QueryParser
terms, vs. tokens
TermsFilter, 2nd
addTerm
TermVectorAccessor
TermVectorMapper, 2nd, 8th, 9th
isIgnoringOffsets
isIgnoringPositions
map
setDocumentNumber
setExpectations
ThaiAnalyzer
The Grinder load testing tool
ThreadedIndexWriter, 2nd
Tika
alternatives
built-in text extraction tool
customizing parser selection
getFileMetadata
installing
introduction
limitations
logical design
metadata extraction
modular design
parse
parser implementations
using UNIX pipes
utility class
TikaConfig
getParsers
TikaException
TikaIndexer
TimeExceededException
TimeLimitingCollector
limitations
Token
TokenFilter, 6th
additional
importance of order
shingles
splitting source code terms
TokenFilters, for creating payloads
tokenization, definition
Tokenizer
additional
TokenOffsetPayloadTokenFilter
TokenRangeSinkTokenizer
TokenSources
getAnyTokenStream
TokenStream, 2nd
architecture
buffering
incrementToken
used for highlighting
TokenTypeSinkTokenizer
Tomcat, demo application
tool, Luke
top Unix process monitor
top, measuring page faults
TopDocs, 2nd, 3rd
TopFieldCollector
TopFieldDocs
TopScoreDocCollector, 2nd
Toupikov, Nickolai
triplestore, searching the Web of Data
troubleshooting
truncation.
See field truncation.
Tummarello, Giovanni
TupleAnalyzer, 2nd
TupleQuery
addClause
TupleScorer
TupleTokenizer, 2nd
two-phased commit
TypeAsPayloadTokenFilter, 2nd
TypeAttribute
UI novel, creating
unanalyzed fields, searching
Unicode
Unix, deletion of open files
URINormalisationFilter, 2nd
user interface.
See UI.
UTF-8
Vajda, Andi, 2nd
value, 2nd
ValueSource
ValueSourceQuery
van Klinken, Ben
van Rossum, Guido
Vector Space Model, 2nd
VerifyingLockFactory
Version
Visio.
See Microsoft Visio.
vmstat, measuring page faults
W3C
Wall, Larry
Wang, John
WAVE Audio, extracting text from sampling metadata
Web 3.0
web application
CSS highlighting
demo
web application server, thread pool
Web of Data
Wettin, Karl
WhitespaceAnalyzer, 2nd, 3rd
WhitespaceTokenizer
Wikipedia
document source
indexing
WikipediaTokenizer
WildcardQuery, 3rd, 4th
inefficiency
prohibiting
Windows Explorer
Windows Server 2003, open file limit
Windows, deletion of open files
with payloads
Word.
See Microsoft Word.
WordNet
adding synonyms during analysis
building synonym index
example synonyms
WordNetSynonymEngine
write.lock
WriteLineDoc
Writer