Likelihood to Recommend
The software appears to run more efficiently than other big data tools, such as Hadoop. Given that, Apache Spark is well-suited for querying and trying to make sense of very, very large data sets. The software offers many advanced machine learning and econometrics tools, although these tools are used only partially because very large data sets require too much time when the data sets get too large. The software is not well-suited for projects that are not big data in size. The graphics and analytical output are subpar compared to other tools.
Read full review
[NGINX] is very well suited for high performance. I have seen it used on servers with 1k current connections with no issues. Despite seeing it used in many environments I've never seen software developers use it over apache, express, IIS in local dev environments so it may be more difficult to setup. I've also seen it used to load balance again without issues.
Read full review Pros Rich APIs for data transformation making for very each to transform and prepare data in a distributed environment without worrying about memory issues Faster in execution times compare to Hadoop and PIG Latin Easy SQL interface to the same data set for people who are comfortable to explore data in a declarative manner Interoperability between SQL and Scala / Python style of munging data Read full review Very low memory usage. Can handle many more connections than alternatives (like Apache HTTPD) due to low overhead. (event-based architecture). Great at serving static content. Scales very well. Easy to host multiple Nginx servers to promote high availability. Open-Source (no cost)! Read full review Cons Memory management. Very weak on that. PySpark not as robust as scala with spark. spark master HA is needed. Not as HA as it should be. Locality should not be a necessity, but does help improvement. But would prefer no locality Read full review Customer support can be strangely condescending, perhaps it's a language issue? I find it a little weird how the release versions used for Nginx+ aren't the same as for open source version. It can be very confusing to determine the cross-compatibility of modules, etc., because of this. It seems like some (most?) modules on their own site are ancient and no longer supported, so their documentation in this area needs work. It's difficult to navigate between nginx.com commercial site and customer support. They need to be integrated together. I'd love to see more work done on nginx+ monitoring without requiring logging every request. I understand that many statistics can only be derived from logs, but plenty should work without that. Logging is not an option in many environments. Read full review Likelihood to Renew
Capacity of computing data in cluster and fast speed.
Senior Software Developer (Consultant)
Read full review
Great value for the product
Read full review Usability
The only thing I dislike about spark's usability is the learning curve, there are many actions and transformations, however, its wide-range of uses for ETL processing, facility to integrate and it's multi-language support make this library a powerhouse for your data science solutions. It has especially aided us with its lightning-fast processing times.
Read full review Front end proxy and reverse proxy of Nginx is always useful. I always prefer to Nginx in overall usability when you have application server and database or multiple application servers and single database i.e. clustered application. Nginx provides really good features and flexibility which helps the system administrator in case of troubleshooting and also . Also, Nginx doesn't delay any request because of internal performance issues. from the administration perspective Read full review Support Rating
1. It integrates very well with scala or python. 2. It's very easy to understand SQL interoperability. 3. Apache is way faster than the other competitive technologies. 4. The support from the Apache community is very huge for Spark. 5. Execution times are faster as compared to others. 6. There are a large number of forums available for Apache Spark. 7. The code availability for Apache Spark is simpler and easy to gain access to. 8. Many organizations use Apache Spark, so many solutions are available for existing applications.
Read full review John Reeve
Principal, Lead developer, Lead designer
Read full review Alternatives Considered
Spark in comparison to similar technologies ends up being a one stop shop. You can achieve so much with this one framework instead of having to stitch and weave multiple technologies from the
stack, all while getting incredibility performance, minimal boilerplate, and getting the ability to write your application in the language of your choosing.
Read full review
We have used Traffic, Apache, Google Cloud Load Balancing and other managed cloud-based load balancers. When it comes to scale and customization nothing beats Nginx. We selected Nginx over the others because
we have a large number of services and we can manage a single Nginx instance for all of them we have high impact services and Nginx never breaks a sweat under load individual services have special considerations and Nginx lets us configure each one uniquely Read full review Return on Investment Business leaders are able to take data driven decisions Business users are able access to data in near real time now . Before using spark, they had to wait for at least 24 hours for data to be available Business is able come up with new product ideas Read full review Nginx has decreased the burden of web server administration and maintenance, and we are spending less time on server issues than when we were using Apache. Nginx has allowed more people in our company to get involved with configuring things on the web server, so there's no longer a single point of failure ("the Apache guy"). Nginx has given us the ability to handle a larger number of requests without scaling up in hardware quite so quickly. Read full review ScreenShots