Overall Satisfaction with Revolution R Enterprise
I have used Revolution Analytics Rev-R Enterprise 7.0 for a data analytics project, and I was also engaged in beta-testing release D. Rev-R fills a real Big Data gap: it lets data scientists load large data sets into Hadoop HDFS and run complex algorithms such as Random Forest or decision trees in a distributed way on the cluster. That helps to draw insights from big data sets without having to write complex programs in, say, Java or Python.
- It allows algorithms to run in a distributed way on a Hadoop HDFS cluster
- It supports different file formats, such as SAS7BDAT files and tab- or comma-delimited files, which makes data munging easier
- It provides scalable solutions by allowing users to re-use R scripts and distribute the computing over nodes through RHadoop
- When I reviewed release D of the product, the "decision forest" algorithm was not yet available.
- The tool needs to be better integrated with other data infrastructure tools such as Teradata and Informatica, and perhaps with newer Hadoop distribution platforms such as Cloudera or Hortonworks, so that users don't have to install the tool from scratch
- I would also like to see improved GUI capability and better integration with the wider ecosystem. As the Big Data ecosystem evolves over the next 2-3 years, I would like to see Rev-R become more compatible with start-ups as well.
- Faster time-to-market on analytics and insights
- Reduced level of effort for running complex algorithms, and thus increased job satisfaction
- Improved job empowerment and skills/competency re-use.
My understanding is that the Revolution Analytics Enterprise version is not cheap. Alternatives to the software could therefore be Hadoop/HDFS-level programming using Python and Mahout to achieve the same distributed computing. Additionally, Cloudera is coming up with a new data science tool called Oryx, which could be a competitor to Rev-R. However, the tool selection at each organization will depend on its strategy and the cost it has budgeted.