Apache Lucene

Overview

What is Apache Lucene?

Apache Lucene is a free, open source text search engine library written in Java. It is suited to applications that require full-text search, and is available cross-platform.

Pricing

Entry-level set up fee?

  • No setup fee
For the latest information on pricing, visit https://lucene.apache.org/core/download…

Offerings

  • Free Trial
  • Free/Freemium Version
  • Premium Consulting/Integration Services

Alternatives Pricing

What is Elasticsearch?

Elasticsearch is an enterprise search tool from Elastic in Mountain View, California.

What is Guru?

Enterprise AI Search, Intranet, and Wiki in one platform. Guru lives in tools organizations already use, so no need to context switch. Users can find info across any app, have an expert help if the info can't be found, and let Guru proactively identify knowledge gaps, duplicate knowledge, and…

Product Details

What is Apache Lucene?

Apache Lucene is an open source search engine library written in Java, available free under the Apache License 2.0. It is closely associated with Apache Solr, and formerly included a number of sub-projects, such as Lucene.NET, Apache Tika, and Apache Nutch, which are now all top-level Apache projects.


Lucene supports multiple query types, including phrase queries, wildcard queries, proximity queries, and range queries, and ranks results so that the best matches appear first. It is supported by an online community, is usable where resources are limited, and is described by some users as fairly performant. It enables search of metadata or of data by any field, can be integrated with web crawlers, and suits a variety of use cases. With PyLucene, a Python extension for accessing Java Lucene, users can also take advantage of Lucene's text indexing and searching capabilities from Python.
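To make the phrase and proximity query types more concrete, here is a minimal Python sketch of a positional inverted index. It is an illustration only, not Lucene's implementation: the function names `build_positional_index` and `within` are hypothetical, tokenization is a plain whitespace split, and `slop` is simplified to mean the number of words allowed between two terms that appear in order.

```python
from collections import defaultdict


def build_positional_index(docs):
    """Map each term to {doc_id: [positions]} for a dict of doc_id -> text."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for position, term in enumerate(text.lower().split()):
            index[term].setdefault(doc_id, []).append(position)
    return index


def within(index, first, second, slop=0):
    """Return ids of docs where `second` follows `first` with at most
    `slop` words in between (slop=0 behaves like an exact two-word phrase)."""
    first_postings = index.get(first, {})
    second_postings = index.get(second, {})
    hits = set()
    for doc_id in first_postings.keys() & second_postings.keys():
        second_positions = second_postings[doc_id]
        if any(0 < q - p <= slop + 1
               for p in first_postings[doc_id]
               for q in second_positions):
            hits.add(doc_id)
    return hits


# Hypothetical sample documents, keyed by document id.
docs = {
    1: "apache lucene search library",
    2: "apache software foundation hosts lucene",
    3: "search library",
}
index = build_positional_index(docs)
print(within(index, "apache", "lucene"))          # phrase-style match
print(within(index, "apache", "lucene", slop=3))  # proximity-style match
```

With these sample documents, the phrase-style call matches only document 1, while the proximity call with `slop=3` also matches document 2, where three words separate the two terms.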


Apache Lucene was first made available in 1999, and became one of the Apache Software Foundation's projects in 2001. It can be used to implement Internet search engines, local single-site search, or search of private resources, as well as other kinds of tools, such as personalization or recommendation engines.


Lucene itself does not crawl or parse HTML. That functionality is supplied by optional, ancillary projects, some of them former Lucene sub-projects, such as Nutch. Various databases and search engines, such as CrateDB and Elasticsearch, are in turn built on Lucene.


Based on research by Schwarzer et al. (2016), Apache Lucene's MLT function ("MoreLikeThis") excels at returning items or articles that are closely related, yet may create linkages to items that are obscure or not as well known. While its results may be narrow, the authors state that a text-based approach can complement alternate (e.g. citation) search methods.
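The general idea behind a MoreLikeThis-style lookup can be sketched in a few lines of Python: pick the most distinctive terms of a source text by TF-IDF weight, then rank other documents by how many of those terms they share. This is a simplified illustration, not Lucene's MoreLikeThis code and not the method evaluated by Schwarzer et al.; the function names, the idf formula, the `min_doc_freq` cutoff, and the sample corpus are all assumptions made for the example.

```python
import math
from collections import Counter


def interesting_terms(source_text, corpus, max_terms=3, min_doc_freq=1):
    """Pick the source terms with the highest TF-IDF weight.

    Terms found in fewer than `min_doc_freq` corpus documents are skipped,
    since they cannot match anything useful.
    """
    doc_freq = Counter()
    for text in corpus:
        doc_freq.update(set(text.lower().split()))
    term_freq = Counter(source_text.lower().split())
    weights = {}
    for term, tf in term_freq.items():
        df = doc_freq[term]
        if df < min_doc_freq:
            continue
        idf = math.log((len(corpus) + 1) / (df + 1)) + 1
        weights[term] = tf * idf
    ranked = sorted(weights.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:max_terms]


def more_like_this(source_text, corpus, max_terms=3):
    """Rank corpus documents by the summed weight of shared interesting terms."""
    terms = interesting_terms(source_text, corpus, max_terms)
    results = []
    for doc_index, text in enumerate(corpus):
        words = set(text.lower().split())
        score = sum(weight for term, weight in terms if term in words)
        if score > 0:
            results.append((doc_index, score))
    return sorted(results, key=lambda item: (-item[1], item[0]))


# Hypothetical four-document corpus.
corpus = [
    "lucene is a search library",
    "solr is a search server built on lucene",
    "tika extracts text from documents",
    "nutch is a web crawler",
]
for doc_index, score in more_like_this("lucene search library for java", corpus):
    print(doc_index, round(score, 3))
```

Because rarer terms get higher weights, a document that shares an uncommon word with the source outranks one that shares only common words, which is consistent with the narrow but closely related results described above.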


As an open source project, Apache Lucene is available free up to version 8.8.2. Older versions are also available free from the Apache Archives.

Apache Lucene Features

  • Supported: ranked searching -- best results returned first
  • Supported: multiple query types: phrase queries, wildcard queries, proximity queries, range queries and more
  • Supported: fielded searching (e.g. title, author, contents)
  • Supported: sorting by any field
  • Supported: multiple-index searching with merged results
  • Supported: allows simultaneous update and searching
  • Supported: flexible faceting, highlighting, joins and result grouping
  • Supported: fast, memory-efficient and typo-tolerant suggesters
  • Supported: pluggable ranking models, including the Vector Space Model and Okapi BM25
  • Supported: configurable storage engine (codecs)
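
As a rough illustration of the Okapi BM25 ranking model named in the feature list, the following Python sketch scores tokenized documents against a query using the textbook BM25 formula. It is not Lucene's code, and Lucene's own similarity implementation may differ in details; the function name `bm25_score`, the parameter values `k1=1.2` and `b=0.75`, and the sample corpus are assumptions chosen for the example.

```python
import math


def bm25_score(query_terms, doc_terms, corpus, k1=1.2, b=0.75):
    """Score one tokenized document against a query with textbook Okapi BM25."""
    n_docs = len(corpus)
    avg_len = sum(len(doc) for doc in corpus) / n_docs
    score = 0.0
    for term in query_terms:
        doc_freq = sum(1 for doc in corpus if term in doc)
        tf = doc_terms.count(term)
        if doc_freq == 0 or tf == 0:
            continue
        # Rarer terms contribute more; longer documents are penalized via b.
        idf = math.log(1 + (n_docs - doc_freq + 0.5) / (doc_freq + 0.5))
        length_norm = 1 - b + b * len(doc_terms) / avg_len
        score += idf * tf * (k1 + 1) / (tf + k1 * length_norm)
    return score


# Hypothetical corpus of already-tokenized documents.
corpus = [
    ["lucene", "search"],
    ["lucene", "search", "library", "for", "java", "applications"],
    ["web", "crawler"],
]
ranked = sorted(corpus, key=lambda doc: bm25_score(["lucene"], doc, corpus), reverse=True)
for doc in ranked:
    print(round(bm25_score(["lucene"], doc, corpus), 3), doc)
```

In this sample, both of the first two documents contain the query term once, but the shorter one ranks higher because of length normalization, and the document without the term scores zero.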

Apache Lucene Technical Details

Deployment Types: On-premise
Operating Systems: Windows, Linux, Mac; requires Java 1.5 or greater, ANT 1.7 or greater
Mobile Application: No

Frequently Asked Questions

Apache Lucene starts at $0.

Apache Solr, Elasticsearch, and Algolia are common alternatives for Apache Lucene.

The most common users of Apache Lucene are from Mid-sized Companies (51-1,000 employees).

Reviews and Ratings

(9)

Community Insights

TrustRadius Insights are summaries of user sentiment data from TrustRadius reviews and, when necessary, 3rd-party data sources.

Apache Lucene and Solr have proven to be valuable tools in various fields, offering a range of use cases that have benefited users. Researchers in Natural Language Processing have utilized Apache Lucene for tasks like Named Entity Recognition, leveraging its powerful indexing and retrieval capabilities. Additionally, users have found Solr to be effective in adding search functionality with indexing to their applications, enabling them to search for desired keywords within documents.

The ability to cache document searches in a JSON format has been highly beneficial for users, as it allows them to retrieve search results quickly without directly accessing the main data store. This feature has enabled organizations to provide full-featured search tools to their patrons, including options like faceting and full-text search.

Apache Lucene's indexing and retrieval capabilities have played a crucial role in improving response times when dealing with large databases. Users have found that implementing Lucene has provided them with a faster and more cost-effective alternative to slow and expensive database clusters, meeting their requirements for traffic and speed.

Lucene's effectiveness extends beyond Natural Language Processing and research tasks. Users working in information retrieval have successfully employed Lucene for solving problems related to query modeling. Furthermore, Lucene has excelled in providing search capabilities for large datasets of deep image metadata, allowing quick access and search facets/filters.

The open-source nature and efficiency of Apache Lucene make it a suitable solution for implementing dynamic text search across multiple applications that deal with continuously updating data. Users appreciate the accuracy of the search results provided by Lucene, making it their preferred choice even when dealing with small-scale datasets compared to external products. Whether used internally by staff or externally by students and other organizations, Apache Lucene proves to be a reliable tool for enhancing search capabilities within Java-based web applications.

Easy and fast implementation: Users have found the implementation of Apache Lucene to be easy and fast, allowing for effective wrappers to be built around it. This sentiment was expressed by several reviewers, highlighting the simplicity and efficiency of getting started with Lucene.

Great indexing and searching capabilities: The indexing and searching capabilities of Lucene were highly praised by users, especially in projects dealing with large amounts of financial records that require quick search capabilities. Several reviewers mentioned that Lucene's ability to handle millions of records while providing fast search results greatly benefited their projects.

Simple installation and setup process: Many users appreciated the ease of installing and setting up Lucene. They found the process straightforward and user-friendly. Additionally, the administration of Lucene was seen as fairly easy, further enhancing its usability for users.

These three pros were commonly mentioned by reviewers, indicating that the ease of implementation, powerful indexing/searching capabilities, and simple installation process are significant strengths of Apache Lucene.

Limitations on logical operators: Some users have found limitations on certain logical operators in Lucene, such as exclusive OR, which has made it challenging to write complex queries.

Lack of flexibility and consistency: Several reviewers have mentioned that different versions of Lucene have different functions for the same job, making it less flexible and consistent in terms of functionality.

Difficulty in setup and maintenance: Many users have expressed difficulties in setting up and maintaining Lucene indexes, finding it non-trivial and challenging to diagnose production issues. They recommend combining Lucene with Solr for better functionality, but mention that even Solr has room for improvement and is difficult to set up in large-scale production environments.
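
On the exclusive OR limitation mentioned above, one common workaround is to rewrite "A XOR B" using only AND, OR, and NOT. The Python sketch below builds such a query string and applies the same rewrite to sets of matching document ids. It assumes a query parser that accepts `AND`, `OR`, `NOT`, and parentheses, as Lucene's classic query parser syntax does; the helper names and the `title:` field are hypothetical.

```python
def xor_query(a, b):
    """Build a query string for "a or b, but not both" using only AND, OR, NOT."""
    return f"({a} AND NOT {b}) OR ({b} AND NOT {a})"


def xor_hits(hits_a, hits_b):
    """Apply the same rewrite to two sets of matching document ids."""
    return (hits_a - hits_b) | (hits_b - hits_a)


print(xor_query("title:lucene", "title:solr"))
print(xor_hits({1, 2, 3}, {3, 4}))
```

The rewrite doubles the number of clauses for each XOR, so deeply nested expressions become long quickly, which may be part of why reviewers found such queries hard to write.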

Reviews

(1-3 of 3)
Sirish Vadala | TrustRadius Reviewer
Score 10 out of 10
Vetted Review
Verified User
Incentivized
Apache Lucene is being used across multiple applications where data keeps updating continuously. Because it is open source and the search engine operates efficiently, it's the perfect solution for implementing dynamic text search. The accuracy of the search results is impressive, and it wouldn't make business sense to implement external products like Google search for a small-scale data set.
  • Fast indexing, with proper optimization I can index a Gig of data in 2 mins.
  • Easy integration with web crawlers
  • Quick and Accurate Results
  • Flexible sorting option for results based on the search field and relevance
  • Scalability issues, especially when the index grows to millions of documents.
  • The Boolean scoring model could be better.
  • Difficulty setting up in a cluster-based environment.
Apache Lucene is a perfect text search implementation where heap space usage needs to be kept to a minimum. It also enables search based on various search fields and, most importantly, searching and indexing can happen simultaneously. The only scenario where it might be less appropriate is when the index grows too big. We have witnessed a few scalability issues where a search would take a while when the index is too large.
  • Cost effective
  • Open source and easily customizable
  • Active community and feedback forums
The search and index performance of [Apache] Lucene is excellent and the quality of results is good, if not better. For small-scale applications it is a no-brainer: Lucene is the best and most cost-effective solution. The learning curve is not too steep either.
Score 8 out of 10
Vetted Review
Verified User
Incentivized
In my previous position at a university, we implemented several applications that relied on embedded Lucene search and indexing capabilities. Our Java-based web applications were used internally by staff as well as externally by students and other state organizations. One of the modules allowed clients to search for students and their uploaded documents matching certain query parameters.
  • We found Apache Lucene to be extremely performant in querying large amounts of data and retrieving the correct files based on the metadata provided.
  • The online community offers great support for the product. Even though it is an open source tool, it is not difficult to find help online for it.
  • When we were creating a proof of concept application, we found that the software worked just as well, while being run locally on a resource-limited PC.
  • We had difficulty porting the project to a cluster based environment on the cloud.
  • For our particular use case of retrieving documents based on text pattern matching, the program worked efficiently; however, we did not find many resources for image pattern recognition based on metadata.
Apache Lucene offers a great full-text search library that makes it easy to add search functionality to a website or other applications. Lucene is ideal if you want low-level access to the indexes and its APIs. For general purposes, Apache Solr, the web application built atop Lucene, can be used instead. Apache Solr comes with caching, HTTP/JSON APIs, and a simple web administration console.
  • Being an open source project we did not have to pay any licensing fees for using Apache Lucene. It has greatly improved our search functionality in our web apps.
Craig J. Stadler | TrustRadius Reviewer
Score 9 out of 10
Vetted Review
Verified User
Incentivized
We currently use Apache Lucene to provide search for extremely large datasets of deep image metadata. It allows quick and easy access to the metadata and search facets/filters.
  • Quick search of very large amounts of data on a single machine instance.
  • Extremely memory- and disk-efficient, with high performance.
  • Easy to set up and integrate into external systems.
  • User interface for setup and maintenance would be helpful.
  • Easier cloud/cluster setup.
  • Better, centralized documentation.
Apache Lucene is very good for medium to large datasets that are not searchable as well in MySQL or normal databases. It's extremely fast and robust. Lucene is not as well suited to be used as a strict NoSQL platform.
  • Very good at using minimal hardware sets saving money on hosting.
  • Very good at housing multiple cores or instances.
I have tried Elastic and Sphinx. Each has its benefits, but I feel that Apache Lucene is overall the best performing and the easiest to set up and maintain.