Likelihood to Recommend
Solr spins up nicely and works effectively for small enterprise environments providing helpful mechanisms for fuzzy searches and facetted searching. For larger enterprises with complex business solutions you'll find the need to hire an expert Solr engineer to optimize the powerful platform to your needs. Internationalization is tricky with Solr and many hosting solutions may limit you to a latin character set.
Read full review
The software appears to run more efficiently than other big data tools, such as Hadoop. Given that, Apache Spark is well-suited for querying and trying to make sense of very, very large data sets. The software offers many advanced machine learning and econometrics tools, although these tools are used only partially because very large data sets require too much time when the data sets get too large. The software is not well-suited for projects that are not big data in size. The graphics and analytical output are subpar compared to other tools.
Read full review Pros Easy to get started with Apache Solr. Whether it is tackling a setup issue or trying to learn some of the more advanced features, there are plenty of resources to help you out and get you going. Performance. Apache Solr allows for a lot of custom tuning (if needed) and provides great out of the box performance for searching on large data sets. Maintenance. After setting up Solr in a production environment there are plenty of tools provided to help you maintain and update your application. Apache Solr comes with great fault tolerance built in and has proven to be very reliable. Read full review Rich APIs for data transformation making for very each to transform and prepare data in a distributed environment without worrying about memory issues Faster in execution times compare to Hadoop and PIG Latin Easy SQL interface to the same data set for people who are comfortable to explore data in a declarative manner Interoperability between SQL and Scala / Python style of munging data Read full review Cons These examples are due to the way we use Apache Solr. I think we have had the same problems with other NoSQL databases (but perhaps not the same solution). High data volumes of data and a lot of users were the causes. We have lot of classifications and lot of data for each classification. This gave us several problems: First: We couldn't keep all our data in Solr. Then we have all data in our MySQL DB and searching data in Solr. So we need to be sure to update and match the 2 databases in the same time. Second: We needed several load balanced Solr databases. Third: We needed to update all the databases and keep old data status. If I don't speak about problems due to our lack of experience, the main Solr problem came from frequency of updates vs validation of several database. We encountered several locks due to this (our ops team didn't want to use real clustering, so all DB weren't updated). Problem messages were not always clear and we several days to understand the problems. Read full review Memory management. Very weak on that. PySpark not as robust as scala with spark. spark master HA is needed. Not as HA as it should be. Locality should not be a necessity, but does help improvement. But would prefer no locality Read full review Likelihood to Renew
Capacity of computing data in cluster and fast speed.
Senior Software Developer (Consultant)
Read full review Usability
The only thing I dislike about spark's usability is the learning curve, there are many actions and transformations, however, its wide-range of uses for ETL processing, facility to integrate and it's multi-language support make this library a powerhouse for your data science solutions. It has especially aided us with its lightning-fast processing times.
Read full review Support Rating
1. It integrates very well with scala or python. 2. It's very easy to understand SQL interoperability. 3. Apache is way faster than the other competitive technologies. 4. The support from the Apache community is very huge for Spark. 5. Execution times are faster as compared to others. 6. There are a large number of forums available for Apache Spark. 7. The code availability for Apache Spark is simpler and easy to gain access to. 8. Many organizations use Apache Spark, so many solutions are available for existing applications.
Read full review Alternatives Considered
Apache Solr is a ready-to-use product addressing specific use cases such as keyword searches from a huge set of data documents.
Read full review
Spark in comparison to similar technologies ends up being a one stop shop. You can achieve so much with this one framework instead of having to stitch and weave multiple technologies from the
stack, all while getting incredibility performance, minimal boilerplate, and getting the ability to write your application in the language of your choosing.
Read full review Return on Investment Improved response time in e-commerce websites. Developer's job is easier with Apache Solr in use. Customization in filtering and sorting is possible. Read full review Business leaders are able to take data driven decisions Business users are able access to data in near real time now . Before using spark, they had to wait for at least 24 hours for data to be available Business is able come up with new product ideas Read full review ScreenShots