Links
Issues
Alternatives
Also see Competitors (below)…
Articles
Delve inside the Lucene indexing mechanism, including how to improve indexing performance.
Competitors
Also see Alternatives (above)…
Did you mean
Faceted
History
Index Accessor
lucene-index-accessor
Monitor
LucidGaze for Lucene: monitor and improve your Lucene search performance.
StopWords
Snowball
Text Extractor
Aperture: extracts full text and metadata from many common file formats. Getting started (appears to use Java 1.5).
OpenXML4j is a complete Java framework supporting the Open Packaging Conventions and Office Open XML (WordProcessingML, SpreadsheetML, PresentationML and shared specs like DrawingML).
html
HTML Cleaner. For Maven instructions, see Maven repository notes.
Microsoft
OpenOffice
pdf
Projects
The Compass Framework is a first class open source Java framework that brings the power of search engine semantics to your application stack declaratively.
Hibernate Search brings the power of full text search engines to the persistence domain model and Hibernate experience, through transparent configuration (Hibernate Annotations) and a common API. The project page might now be at http://www.hibernate.org/410.html
Hibernate Annotations includes a package of annotations that allows you to mark any domain model object as indexable and have Hibernate maintain a Lucene index of any instances persisted via Hibernate.
Kowari is an Open Source, massively scalable, transaction-safe, purpose-built database for the storage, retrieval and analysis of metadata.
DBSight is a highly customisable full-text search platform for any relational database.
Solr is an open source enterprise search server based on the Lucene Java search library, with XML/HTTP APIs, caching, replication, and a web administration interface.
LIUS is a Java indexing framework based on the Jakarta Lucene project. It indexes MS Word, MS Excel, MS PowerPoint, RTF, PDF, XML, HTML, TXT, OpenOffice suite, ZIP files, MP3, vCard, LaTeX and JavaBeans.