Likelihood to Recommend Well suited: To most of the local run of datasets and non-prod systems - scalability is not a problem at all. Including data from multiple types of data sources is an added advantage. MLlib is a decently nice built-in library that can be used for most of the ML tasks. Less appropriate: We had to work on a RecSys where the music dataset that we used was around 300+Gb in size. We faced memory-based issues. Few times we also got memory errors. Also the MLlib library does not have support for advanced analytics and deep-learning frameworks support. Understanding the internals of the working of Apache Spark for beginners is highly not possible.
Read full review Pros: Splunk is very well suited if you have multiple log sources of related data. All of them can be correlated and tasks can be automated based on the requirement. Other than alerts, Splunk can also run a specific script of your choice, based on some defined conditions. Cons: If you have a few logs but a large number of log sources, Splunk can be very expensive.
Read full review Pros Apache Spark makes processing very large data sets possible. It handles these data sets in a fairly quick manner. Apache Spark does a fairly good job implementing machine learning models for larger data sets. Apache Spark seems to be a rapidly advancing software, with the new features making the software ever more straight-forward to use. Read full review Real-time + Scheduled alerts - i-e you can set up alerts which are actively monitoring your logs Pretty good response time for search results. With our key/value logging, Splunk makes it blazing fast to query the data. Dashboards provide insights into historical data Love how Splunk indexes all of the data and provides keys to search on Read full review Cons Memory management. Very weak on that. PySpark not as robust as scala with spark. spark master HA is needed. Not as HA as it should be. Locality should not be a necessity, but does help improvement. But would prefer no locality Read full review At times some queries can run slowly if indices are not on a portion of the query you use. Setup time initially can be difficult if your logs aren't stored in common locations or in a common way to write the log. Ability to ingest logs from different locations without having to change code to put logs in a certain place (pro and con). Searches can be a bit more difficult to look through if your log isn't pulled in a manner that is easy to read through splunk. Read full review Likelihood to Renew Capacity of computing data in cluster and fast speed.
Steven Li Senior Software Developer (Consultant)
Read full review We are using Splunk extensively in our projects and we have recently upgraded to Splunk version 6.0 which is quite efficient and giving expected results. We keep track of updates and new features Splunk introduces periodically and try to introduce those features in our day to day activities for improvement in our reporting system and other tasks.
Read full review Usability The only thing I dislike about spark's usability is the learning curve, there are many actions and transformations, however, its wide-range of uses for ETL processing, facility to integrate and it's multi-language support make this library a powerhouse for your data science solutions. It has especially aided us with its lightning-fast processing times.
Read full review You can literally throw in a single word into Splunk and it will pull back all instances of that word across all of your logs for the time span you select (provided you have permission to see that data). We have several users who have taken a few of the free courses from Splunk that are able to pull data out of it everyday with little help at all.
Read full review Reliability and Availability When properly setup and configured, Splunk is extremely reliable.
Read full review Support Rating 1. It integrates very well with scala or python. 2. It's very easy to understand SQL interoperability. 3. Apache is way faster than the other competitive technologies. 4. The support from the Apache community is very huge for Spark. 5. Execution times are faster as compared to others. 6. There are a large number of forums available for Apache Spark. 7. The code availability for Apache Spark is simpler and easy to gain access to. 8. Many organizations use Apache Spark, so many solutions are available for existing applications.
Read full review Splunk maintains a well resourced support system that has been consistent since we purchased the product. They help out in a timely manner and provide expert level information as needed. We typically open cases online and communicate when possible via e-mail and are able to resolve most issues with that method.
Read full review Online Training The online course was simple clear and described the main capabilities of the solution. There is also an initial module that can be done for free so anyone can familiarize themselves with the functionality of this solution. On the other hand, however, there could be more free online courses. Maybe even with a certificate, this would broaden the group of people who are familiar with the platform while increasing familiarity with the solution itself.
Read full review Implementation Rating Smooth without too many major issues.
Read full review Alternatives Considered All the above systems work quite well on big data transformations whereas Spark really shines with its bigger API support and its ability to read from and write to multiple data sources. Using Spark one can easily switch between declarative versus imperative versus functional type programming easily based on the situation. Also it doesn't need special data ingestion or indexing pre-processing like
Presto . Combining it with Jupyter Notebooks (
https://github.com/jupyter-incubator/sparkmagic ), one can develop the Spark code in an interactive manner in Scala or Python
Read full review I wanted to learn a new language that I can quickly master and implement. Splunk is easy, fun to use and best of all, it can be developed in hours not days or weeks. Splunk is fundamentally a programming language that is minimal but yet powerful enough to collect, analyze and visualize data.
Read full review Scalability Splunk can scale in to the petabyte per day range which of course is awesome
Read full review Return on Investment Faster turn around on feature development, we have seen a noticeable improvement in our agile development since using Spark. Easy adoption, having multiple departments use the same underlying technology even if the use cases are very different allows for more commonality amongst applications which definitely makes the operations team happy. Performance, we have been able to make some applications run over 20x faster since switching to Spark. This has saved us time, headaches, and operating costs. Read full review Overall very positive. It has provided visibility to what is going on within our network. One drawback is the time it takes to get up to speed with the application, but this is up to the user, and Splunk education is excellent. In my field, IT Security, there are few other friends to have in your back pocket better than Splunk. They are just that good. Read full review ScreenShots