Lucene
Lucene is a free, open-source information retrieval API originally implemented in Java by Doug Cutting. It is supported by the Apache Software Foundation and released under the Apache Software License. Lucene has been ported to other programming languages, including Perl, C#, C++, and PHP.
While suitable for any application that requires full-text indexing and searching, Lucene is widely recognized for its utility in implementing internet search engines and local, single-site search. This has occasionally led to the misperception that Lucene is itself a search engine with built-in crawling and HTML parsing. It is not: any application built on Lucene has to provide that functionality itself.
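As an illustration of the kind of work an application has to do before handing text to an indexer, here is a minimal sketch of HTML text extraction using only Python's standard library. It is not part of Lucene or any of its ports; the names `TextExtractor` and `extract_text` are hypothetical, and a real application would likely need more careful handling of encodings and malformed markup.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside skipped elements.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def extract_text(html):
    """Return the visible text of an HTML string, joined by single spaces."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)
```

The resulting plain string is what an application would then pass to the indexing library as the text of a field.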
At the core of Lucene's logical architecture is the notion of a document containing fields of text. Because Lucene deals only with these text fields, its API is agnostic of file format. Text from PDFs, HTML, Microsoft Word documents, and many other formats can be indexed so long as the textual information can be extracted.
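The idea of a document as a set of named text fields can be sketched with a toy inverted index. This is an illustration of the concept only, written with Python's standard library; `ToyIndex`, `add`, and `search` are hypothetical names and do not correspond to Lucene's actual API, and the tokenizer (lowercased word characters) is an assumption made for brevity.

```python
import re
from collections import defaultdict


class ToyIndex:
    """Toy inverted index over documents made of named text fields."""

    def __init__(self):
        self.docs = []
        # Maps (field name, term) to the set of ids of documents containing it.
        self.postings = defaultdict(set)

    def add(self, fields):
        """Index a document given as a dict of field name -> text; return its id."""
        doc_id = len(self.docs)
        self.docs.append(dict(fields))
        for name, text in fields.items():
            for term in re.findall(r"\w+", text.lower()):
                self.postings[(name, term)].add(doc_id)
        return doc_id

    def search(self, field, term):
        """Return sorted ids of documents whose given field contains the term."""
        return sorted(self.postings.get((field, term.lower()), set()))


index = ToyIndex()
index.add({"title": "Lucene in Action", "body": "Indexing and searching text"})
index.add({"title": "Search servers", "body": "Solr is based on Lucene"})

print(index.search("title", "lucene"))  # [0]
print(index.search("body", "lucene"))   # [1]
```

The index never sees where the text came from. Whether a field's text was extracted from a PDF, an HTML page, or a Word document makes no difference once it is a plain string, which is the sense in which such a design is format-agnostic.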
Software using Lucene
- Beagle uses a port of Lucene to C# called [Lucene.Net] as its indexer.
- Docco ([homepage]) uses Lucene for desktop search.
- CNET uses Lucene to search its product category listings.
- LjFind uses Lucene to search over 110,000,000 LiveJournal posts.
- Nutch is a complete search engine implementation that utilises Lucene.
- [Red-Piranha] is another Lucene-based search engine. It is ready to use, deployable as a GUI, command-line, or Tomcat web application, and has the ability to "learn" what the user wants.
- Wikipedia uses Lucene for full-text search.
- The Flock web browser uses CLucene, a C++ port, for full-text search of the browser history.
- Zimbra groupware incorporates Lucene.
- Ants P2P, an anonymous file-sharing program, uses Lucene for its search feature.
- [Solr] is an open-source search server based on Lucene with XML/HTTP APIs, caching, replication, and a web admin interface.
- [LIRE - Lucene Image Retrieval] is a content-based image retrieval (CBIR) library that uses Lucene.
- MMBase has an extension that uses Lucene for indexing its data.
- Alfresco, an open-source enterprise content management system.
Ports
Lucene has been ported or is in the process of being ported to various programming languages other than Java:
- [Lucene4c] - C
- [CLucene] - C++
- [MUTIS] - Delphi
- [NLucene] - .NET
- [Lucene.Net] - .NET
- [Plucene] - Perl
- [PyLucene] - Python
- [Ferret] and [RubyLucene] - Ruby
- [Zend Framework (Search)] - PHP
- [Montezuma] - Common Lisp
External links
- [Lucene homepage]
- [Lucene in Action]
- Article "[Behind the Scenes of the SourceForge.net Search System]" by Chris Conrad
- [Lucene Wikipedia indexer] — introductory article with Java code for search on [Wikipedia data]
License: Apache License | Website: [apache.org]
From Wikipedia, the Free Encyclopedia.
All text is available under the terms of the GNU Free Documentation License.
