Apache Lucene

Apache Lucene Reviews

Reviews
(1-3 of 3)

Sirish Vadala | TrustRadius Reviewer
Score 10 out of 10
Vetted Review
Verified User
Apache Lucene is being used across multiple applications where data is updated continuously. Because it is open source and the search engine operates efficiently, it's the perfect solution for implementing dynamic text search. The accuracy of the search results is impressive, and it wouldn't make business sense to implement external products like Google search for a small-scale data set.
  • Fast indexing: with proper optimization, I can index 1 GB of data in 2 minutes.
  • Easy integration with web crawlers
  • Quick and Accurate Results
  • Flexible sorting option for results based on the search field and relevance
  • Scalability issues, especially when the index grows to millions of documents.
  • The Boolean scoring model could be better.
  • Difficulty setting up in a cluster-based environment.
Apache Lucene is a perfect text search implementation where heap space usage needs to be kept to a minimum. It also enables search across various fields and, most importantly, searching and indexing can happen simultaneously. The only scenario where it might be less appropriate is when the index grows too big. We have seen a few scalability issues where searches take a while when the index is very large.
Craig J. Stadler | TrustRadius Reviewer
Score 9 out of 10
Vetted Review
Verified User
We currently use Apache Lucene to provide search for extremely large datasets of deep image metadata. It allows quick and easy access to the metadata and search facets/filters.
  • Quick search of very large amounts of data on a single machine instance.
  • Extremely efficient in memory and disk usage, with strong performance.
  • Easy to set up and integrate into external systems.
  • User interface for setup and maintenance would be helpful.
  • Easier cloud/cluster setup.
  • Better, centralized documentation.
Apache Lucene is very good for medium to large datasets that cannot be searched as effectively in MySQL or other conventional databases. It's extremely fast and robust. Lucene is not as well suited to use as a strict NoSQL platform.
Score 8 out of 10
Vetted Review
Verified User
In my previous position at a university, we implemented several applications that relied on embedded Lucene search and indexing capabilities. Our Java-based web applications were used internally by staff as well as externally by students and other state organizations. One of the modules allowed clients to search for students and their uploaded documents matching certain query parameters.
  • We found Apache Lucene to be extremely performant in querying large amounts of data and retrieving the correct files based on the metadata provided.
  • The online community offers great support for the product. Even though it is an open source tool, it is not difficult to find help online for it.
  • When we were creating a proof-of-concept application, we found that the software worked just as well when run locally on a resource-limited PC.
  • We had difficulty porting the project to a cluster-based environment in the cloud.
  • For our particular use case of retrieving documents based on text pattern matching, the program worked efficiently; however, we did not find many resources for image pattern recognition based on metadata.
Apache Lucene offers a great full-text search library that makes it easy to add search functionality to a website or other application. Lucene is ideal if you want low-level access to the indexes and its APIs. For general purposes, Apache Solr, the web application built atop Lucene, can be used instead. Apache Solr comes with caching, HTTP/JSON APIs, and a simple web administration console.

What is Apache Lucene?

Apache Lucene is an open source search engine library written in Java, available free under the Apache License 2.0. It is closely associated with Apache Solr, and it has spawned a number of former sub-projects, such as Lucene.NET, Apache Tika, and Apache Nutch, which are now all top-level Apache projects.


Lucene supports multiple query types, including phrase queries, wildcard queries, proximity queries, and range queries, and results are ranked so that the best results appear first. It is supported by an online community, is usable where resources are limited, and is described by reviewers as performant. It enables search of metadata or of data by any field, can be integrated with web crawlers, and is suitable for a variety of use cases. With the PyLucene Python extension for accessing Java Lucene, users can also take advantage of Lucene's text indexing and searching capabilities from Python.
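
As a rough illustration of how a phrase query can be answered, the sketch below builds a toy positional inverted index in plain Python. It does not use Lucene or PyLucene and is not how Lucene is implemented internally; the helper names `build_index` and `phrase_search` and the sample documents are hypothetical, chosen only for this example.

```python
from collections import defaultdict


def build_index(docs):
    """Map each term to {doc_id: [positions]} (a toy positional inverted index)."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, term in enumerate(text.lower().split()):
            index[term].setdefault(doc_id, []).append(pos)
    return index


def phrase_search(index, phrase):
    """Return sorted ids of documents that contain the phrase's terms consecutively."""
    terms = phrase.lower().split()
    if not terms or any(t not in index for t in terms):
        return []
    # Candidate documents must contain every term of the phrase.
    candidates = set(index[terms[0]])
    for t in terms[1:]:
        candidates &= set(index[t])
    hits = []
    for doc_id in candidates:
        # Check whether the terms appear at consecutive positions.
        for start in index[terms[0]][doc_id]:
            if all(start + i in index[t][doc_id] for i, t in enumerate(terms)):
                hits.append(doc_id)
                break
    return sorted(hits)


# Hypothetical sample documents, keyed by document id.
docs = {
    1: "open source search engine library",
    2: "search the open web with an engine",
    3: "a library for full text search",
}
index = build_index(docs)
print(phrase_search(index, "search engine"))
```

Storing positions alongside document ids is what makes phrase and proximity matching possible; an index that recorded only document ids could answer single-term queries but could not tell whether two terms are adjacent.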


Apache Lucene was first made available in 1999, and became an Apache Software Foundation project in 2001. It can be used to implement Internet search engines, local single-site search, or search of private resources, as well as other kinds of tools, such as personalization or recommendation engines.


Lucene itself does not include crawling or HTML parsing; that functionality is supplied by optional, ancillary projects, some of them former Lucene sub-projects, such as Nutch. Various databases and search engines, such as CrateDB and Elasticsearch, are in turn built on Lucene.


Based on research by Schwarzer et al. (2016), Apache Lucene’s MLT function (“MoreLikeThis”) excels at providing search results for locating items or articles that are closely related, yet may create linkages to items that are obscure or not as well known. While its results may be narrow, the authors state that a text-based approach can complement alternate (e.g. citation) search methods.


As an open source project, Apache Lucene is available free of charge, up to version 8.8.2 at the time of writing. Older versions are also available free from the Apache Archives.

Apache Lucene Features

  • Supported: ranked searching -- best results returned first
  • Supported: multiple query types: phrase queries, wildcard queries, proximity queries, range queries and more
  • Supported: fielded searching (e.g. title, author, contents)
  • Supported: sorting by any field
  • Supported: multiple-index searching with merged results
  • Supported: allows simultaneous update and searching
  • Supported: flexible faceting, highlighting, joins and result grouping
  • Supported: fast, memory-efficient and typo-tolerant suggesters
  • Supported: pluggable ranking models, including the Vector Space Model and Okapi BM25
  • Supported: configurable storage engine (codecs)
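
The list above names Okapi BM25 as one of the pluggable ranking models. The following is a minimal sketch of the textbook BM25 formula in plain Python, assuming the commonly used defaults k1 = 1.2 and b = 0.75 and the IDF variant log(1 + (N - n + 0.5) / (n + 0.5)). It is not Lucene's implementation or API; the function name `bm25_scores` and the sample documents are hypothetical.

```python
import math
from collections import Counter


def bm25_scores(docs, query, k1=1.2, b=0.75):
    """Score each document in a non-empty list against the query with BM25."""
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs

    # Document frequency: in how many documents each term occurs.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if tf[term] == 0:
                continue
            # Rare terms get a higher inverse document frequency.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturates via k1; b controls length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Hypothetical sample documents.
docs = [
    "lucene search library",
    "search engine for the web and more text",
    "cooking recipes",
]
scores = bm25_scores(docs, "lucene search")
# Rank document positions from best to worst match.
ranking = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
print(ranking)
```

Compared with a plain term-frequency score, BM25 dampens the effect of repeated terms and penalizes long documents, which is why the short document matching both query terms ranks first here.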

Apache Lucene Video

Lucene is an extremely rich and powerful full-text search library written in Java. You can use Lucene to provide full-text indexing across both database objects and documents in various formats (Microsoft Office documents, PDF, HTML, text, and so on).

The topics covered in the video:
  1. What is Lucene
  2. Why Indexing
  3. Indexing: Flow
  4. Lucene: Writing to Index
  5. Lucene: Searching in Index
  6. Lucene: Inverted Indexing Technique
  7. Lucene: Storage Schema

Source: Edureka, Apache Solr Certification Training (https://www.edureka.co/apache-solr-self-paced). The topics related to ‘Introduction to Lucene’ are covered in the ‘Apache Solr’ course.
Sample class recording: http://www.edureka.co/apache-solr?utm_source=youtube&utm_medium=referral&utm_campaign=intro-to-lucene
Related post: http://www.edureka.co/blog/apache-solr-shedding-some-light/?utm_source=youtube&utm_medium=referral&utm_campaign=intro-to-lucene

Apache Lucene Pricing

Starting Price: $0

Apache Lucene Technical Details

Deployment Types: On-premise
Operating Systems: Windows, Linux, Mac (requires Java 1.5 or greater and Ant 1.7 or greater)
Mobile Application: No

Frequently Asked Questions

What is Apache Lucene?

Apache Lucene is a free, open source text search engine library written in Java. It is suitable for applications that require full-text search, and is available cross-platform.

How much does Apache Lucene cost?

Apache Lucene starts at $0.

Who uses Apache Lucene?

The most common users of Apache Lucene are from Mid-size Companies and the Computer Software industry.