Item: Apache Lucene
Rating: 10
Author: Sirish Vadala

Overall Satisfaction with Apache Lucene

Use Cases and Deployment Scope

Apache Lucene is being used across multiple applications where data keeps updating continuously. Because it is open source and the search engine operates so efficiently, it's the perfect solution for implementing dynamic text search. The accuracy of the search results is impressive, and it wouldn't make business sense to implement external products like Google search for a small-scale data set.

Pros and Cons

Pros

Fast indexing: with proper optimization I can index a gigabyte of data in 2 minutes.
Easy integration with web crawlers
Quick and accurate results
Flexible sorting options for results, based on the search field and relevance

Cons

Scalability issues, especially when the index grows to millions of documents.
The Boolean scoring model could be better.
Difficult to set up in a cluster-based environment.

Return on Investment

Cost-effective
Open source and easily customizable
Active community and feedback forums

Alternatives Considered

Apache Solr, Amazon Elasticsearch Service and Google Search Appliance

The search and indexing performance of Apache Lucene is excellent, and the quality of results is as good as the alternatives, if not better. For small-scale applications it is a no-brainer: Lucene is the best and most cost-effective solution. The learning curve is not too steep either.

Key Insights

Do you think Apache Lucene delivers good value for the price?

Yes

Are you happy with Apache Lucene's feature set?

Yes

Did Apache Lucene live up to sales and marketing promises?

Yes

Did implementation of Apache Lucene go as expected?

Yes

Would you buy Apache Lucene again?

Yes

Other Software Used

React, Oracle WebLogic Suite, Apache Derby

Likelihood to Recommend

Apache Lucene is a perfect text search implementation where heap space usage needs to be kept to a minimum. It also enables search across various fields and, most importantly, searching and indexing can happen simultaneously. The only scenario where it might be less appropriate is when the index grows too large. We have witnessed a few scalability issues where searches took a while on a very large index.

Efficient Open Source Search Engine