We need to calculate risk-weighted assets (RWA) daily and monthly for different positions the bank holds on a T+1 basis. The volume of …
Apache Spark is being widely used within the company. In the Advanced Analytics department, data engineers and data scientists work closely in …
Apache Spark is used by certain departments to produce summary statistics. The software is used for data sets that are very, very large in …
We are building a model and due to the size of the data, we chose to use Apache Spark for the feature generation. The usage of the tool is …
- We are using Apache Spark in Digital - Data teams to build data products and help business teams to take data-driven decisions.
- We use …
Apache Spark is being used by our organization for writing ETL applications. It enables us to ingest thousands of records of data to …
We use Apache Spark for cluster computing in large-scale data processing, ETL functions, machine learning, as well as for analytics. Its …
We were working on one of our products, which required developing an enterprise-level product to manage a vast …
We do use Apache Spark for cluster computing for our ETL environment, data and analytics as well as machine learning. It is mainly used by …
We sold a data science product to one of the leading US-based e-commerce firms. Suddenly, their data started growing at a very fast rate. …
We used Apache Spark within our department as a Solution Architecture team. It helped make big data processing more efficient since the …
Used as the in-memory data engine for big data analytics, streaming data, and SQL workloads. Also, in the process of trying it out for …
Apache Spark is being used by the whole organization. It helps us a lot in the transmission of data, as it is 100 times faster than Hadoop …
We use Apache Spark across all analytics departments in the company. We primarily use it for distributed data processing and data …
My company uses Apache Spark in various ways including machine learning, analytics and batch processing. [We] Grab the data from other …
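Several of the reviews above describe using Spark for ETL: ingesting raw records, transforming them, and loading the results. As a rough sketch of what such a pipeline can look like in PySpark (the `id`/`name` schema, paths, and cleaning rule below are hypothetical illustrations, not any reviewer's actual setup):

```python
def normalize_record(rec):
    """Pure-Python cleaning step: trim the name and drop records without an id.

    The 'id' and 'name' fields are illustrative assumptions about the schema.
    """
    rec = dict(rec)
    rec["name"] = (rec.get("name") or "").strip()
    return rec if rec.get("id") else None


def run_etl(input_path, output_path):
    # Deferred import so the helper above stays testable without a cluster;
    # running this function requires pyspark (pip install pyspark).
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("etl-sketch").getOrCreate()
    raw = spark.read.json(input_path)                             # extract
    cleaned = (raw.rdd
               .map(lambda row: normalize_record(row.asDict()))   # transform
               .filter(lambda rec: rec is not None)
               .toDF())
    cleaned.write.mode("overwrite").parquet(output_path)          # load
    spark.stop()
```

The pure-Python `normalize_record` step can be unit-tested without a cluster; only `run_etl` needs a Spark runtime.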
- With the daily risk reports being calculated via Apache Spark, the bank is able to comply with the FHC rule in the US and other regions and control capital with counterparties much better.
- Time to market
- Churn reduction
- Customer satisfaction
- In one sense, Apache Spark has been a positive ROI because it helps us figure out details of the vast amounts of data. Sometimes the software leads to answers to questions that are surprising. Small data software tools probably would have failed in discovering some of the insights Spark makes possible.
- Spark has been a negative ROI in the sense that it takes lots and lots of time to produce simple answers to simple questions, and often the answers are what was expected. Because of the confirmatory rather than insightful nature of the software, it seems like a lot of effort for the results garnered.
- Apache Spark represents a positive ROI on the instances when it gives a well-producing machine learning model, a model that produces predictions that actually get used.
- Reduces processing time
- Needs tuning
- Hard to debug
- Business leaders are able to make data-driven decisions.
- Business users can now access data in near real time. Before using Spark, they had to wait at least 24 hours for data to be available.
- The business is able to come up with new product ideas.
- Saves a lot of time.
- Very powerful
- Automates lots of manual work
- A deeper level of knowledge is required to understand and perform analysis
- Saved time and resources for the company because of its agility
- High performance data processing.
- The ROI increased by a considerable percentage after using Apache Spark.
- Apache Spark provided agility in supporting multiple applications.
- Simplified our landscape.
- Drove great performance for data processing.
- We saved a lot of time and resources, thereby saving a lot of dollars for our company as well as the client.
- Positive impact on analyzing big data.
- Fast customer service saved our time.
- Easy to use which means less time spent on training the team.
- Overall positive impact on the business for analysis of big data using the Hadoop file system.
- Very well received by data scientists in the business despite its shortcomings in analytical dashboarding.
- It has had a very positive impact, as it helps reduce the data processing time and thus helps us achieve our goals much faster.
- Being easy to use, it allows us to adapt to the tool much faster than with others, which in turn allows us to access various data sources such as Hadoop, Apache Mesos, Kubernetes, independently or in the cloud. This makes it very useful.
- It was very easy for me to use Apache Spark and learn it since I come from a background of Java and SQL, and it shares those basic principles and uses a very similar logic.
- Switching from Pig Latin to Apache Spark sped up overall development time, and resource utilization has also gone up.
- Our offline jobs also run faster than traditional map-reduce like systems.
- Integrating with Jupyter like notebook environments, the development experience becomes more pleasant and we can iterate much faster.
- Apache Spark has faster performance compared to MapReduce.
- The combination of Python and Spark is the best: shorter code and faster, more efficient performance.
- Can replace RDBMS
- A workflow process using Spark went from 1 day to 2 hours.
- Spark Streaming allowed for quick determination of data validity.
- Spark on YARN was good for management, but Spark with Kubernetes was easier to use.
- We were able to make batch jobs 20 times faster compared to MapReduce.
- With language support for Scala, Java, and Python, it is easily manageable.
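One reviewer above credits Spark Streaming with quick determination of data validity. A minimal Structured Streaming sketch of that idea might look like the following; the validity rule, field names, and socket source are assumptions for illustration only:

```python
import json


def is_valid(rec):
    """Hypothetical validity rule: non-empty 'id' and non-negative 'amount'.

    The field names are illustrative assumptions, not any reviewer's schema.
    """
    return bool(rec.get("id")) and rec.get("amount") is not None and rec["amount"] >= 0


def safe_check(line):
    # Guard against malformed JSON or wrong types arriving on the stream.
    try:
        return is_valid(json.loads(line))
    except (ValueError, TypeError, AttributeError):
        return False


def run_stream(host="localhost", port=9999):
    # Deferred import: only this function needs a Spark runtime, plus a
    # process writing JSON lines to the socket source configured below.
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import udf
    from pyspark.sql.types import BooleanType

    spark = SparkSession.builder.appName("validity-stream").getOrCreate()
    lines = (spark.readStream.format("socket")
             .option("host", host).option("port", port).load())
    check = udf(safe_check, BooleanType())
    valid = lines.filter(check(lines["value"]))
    (valid.writeStream.format("console")
          .outputMode("append").start()
          .awaitTermination())
```

Keeping the validity predicate as plain Python makes it easy to test the rule itself before wiring it into the stream.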
- Faster turnaround on feature development; we have seen a noticeable improvement in our agile development since using Spark.
- Easy adoption: having multiple departments use the same underlying technology, even if the use cases are very different, allows for more commonality amongst applications, which definitely makes the operations team happy.
- Performance: we have been able to make some applications run over 20x faster since switching to Spark. This has saved us time, headaches, and operating costs.
- Optimization at its best (Super Fast).
- Handles huge data with simple syntax, whereas other programming languages take a whole lot more coding.
- Best for parallel computing applications.
- Positive: we don't worry about scale.
- Positive: large support community.
- Negative: takes time to set up; overkill for many simpler workflows.
- Less time spent developing applications.
- ROI on time, resources, money.
- Can replace the traditional database systems.
- By learning Spark, we can become certified and/or provide proper recommendations or implementations on Spark solutions.
- With a background in Hadoop distributed processes, it has been easy to understand and diagnose how Spark handles the transfer of data within a cluster. Especially when using YARN as the resource manager and HDFS as the data source.
- Staying up to date with the latest changes to Spark has become a repetitive task. While most Hadoop distributions only support Spark 1.6 at the moment, Spark 2.0 has introduced some useful features, but those require a re-write of existing applications.
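For the Spark-on-YARN-with-HDFS setup the reviewer describes, a small sketch of wiring that up from PySpark follows. The executor counts, memory, and queue name are placeholder values, and the `SparkSession` entry point is the Spark 2.x API whose introduction drove the application re-writes mentioned above:

```python
def yarn_submit_conf(executors=4, executor_mem="4g", queue="default"):
    """Spark-on-YARN configuration as a plain dict.

    Executor count, memory, and queue are placeholder values; tune per cluster.
    """
    return {
        "spark.master": "yarn",
        "spark.submit.deployMode": "cluster",
        "spark.executor.instances": str(executors),
        "spark.executor.memory": executor_mem,
        "spark.yarn.queue": queue,
    }


def build_session(conf, app_name="yarn-sketch"):
    # Spark 2.x entry point: SparkSession consolidates the SparkContext and
    # SQLContext of Spark 1.6, which is the API change behind the re-writes
    # mentioned in the review. Deferred import; requires pyspark >= 2.0.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName(app_name)
    for key, value in conf.items():
        builder = builder.config(key, value)
    # Reading from HDFS then works as usual, e.g. (hypothetical path):
    #   spark.read.parquet("hdfs:///data/events")
    return builder.getOrCreate()
```

The configuration dict is testable on its own; only `build_session` needs a YARN cluster and a pyspark installation.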