RapidMiner - Serves Full Knowledge Data Discovery Process
March 30, 2017

Kamesh Emani | TrustRadius Reviewer
Score 10 out of 10
Vetted Review
Verified User

Overall Satisfaction with RapidMiner Studio

RapidMiner is used by our information technology department. We used a logistic regression algorithm to calculate a health risk factor from different attributes in health care data. We accessed HL7 hospital data through SQL Server Management Studio and wrote queries to retrieve the required fields. We then used the RapidMiner data mining tool to perform predictive analysis on the data. We also created a web form using HTML5, CSS3, Bootstrap and JavaScript that takes the factors as input and predicts the risk factor.
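As a rough illustration of how a web form could turn a fitted logistic regression into a risk prediction, here is a minimal Python sketch. It is not the model from this project: the feature names (`age`, `num_visits`), the coefficient values and the 0.5 threshold are all hypothetical, chosen only to show the scoring step.

```python
import math

# Hypothetical coefficients, invented for illustration only. A real model
# would learn these from historical data (for example in RapidMiner).
INTERCEPT = -4.0
WEIGHTS = {"age": 0.03, "num_visits": 0.4}


def risk_probability(features):
    """Apply the logistic function to a weighted sum of the input features."""
    z = INTERCEPT + sum(WEIGHTS[name] * features[name] for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))


def risk_label(features, threshold=0.5):
    """Turn the probability into a 'high' or 'low' label (threshold is an assumption)."""
    return "high" if risk_probability(features) >= threshold else "low"


if __name__ == "__main__":
    member = {"age": 70, "num_visits": 6}  # made-up example input
    print(round(risk_probability(member), 3), risk_label(member))
```

The point of the sketch is that, once the coefficients are known, scoring a single member is just a weighted sum passed through the logistic function, which is cheap enough to run behind a simple form.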

Our business problem is to predict whether a member has a high chance of coming to the hospital, based on factors such as age, sex, zip code, marital status, number of visits and diagnosis code.
  • Data Cleaning & Transformation
  • Data Modeling (Algorithm Implementation)
  • Data Visualization
  • Data Integration
  • Data visualization can be improved. I have used Tableau, which has more colorful schemes for graphs. It would be great if RapidMiner improved the look of its graphs.
  • It would be great if connectivity to Hadoop HDFS were provided.
  • It would be good to have more examples for each block, perhaps not in the IDE but as videos on the website. I could find videos for some blocks but not for every one.
  • Positive: Data cleansing, discretization & transformation, modelling
  • Negative: Data visualization sometimes gets stuck
  • Neutral: ETL capability
  • RStudio, Gretl, Informatica and Pentaho
The best part about RapidMiner is that it focuses mainly on machine learning algorithms, whereas other tools focus mainly on the extract, transform, load (ETL) process. It can serve all stages of the KDD (knowledge data discovery) process, e.g. data cleaning, transformation, modeling and visualization, whereas other tools serve only one or two of these purposes.
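To make the stages named above concrete, here is a small Python sketch that chains cleaning, transformation (discretization), a very simple modeling stand-in and a text-only visualization. It is only an illustration of the stage order, not how RapidMiner implements its blocks: the record fields (`age`, `visits`), the age bands and the sample records are all made up.

```python
def clean(records):
    """Cleaning: drop records with a missing age or visit count."""
    return [r for r in records
            if r.get("age") is not None and r.get("visits") is not None]


def discretize_age(age):
    """Transformation: map a numeric age to a coarse band (bands are assumptions)."""
    if age < 40:
        return "under_40"
    if age < 65:
        return "40_to_64"
    return "65_plus"


def transform(records):
    """Replace the raw age with its band, keeping the visit count."""
    return [{"age_band": discretize_age(r["age"]), "visits": r["visits"]}
            for r in records]


def mean_visits_by_band(rows):
    """Modeling stand-in: average number of visits per age band."""
    totals = {}
    for row in rows:
        count, total = totals.get(row["age_band"], (0, 0))
        totals[row["age_band"]] = (count + 1, total + row["visits"])
    return {band: total / count for band, (count, total) in totals.items()}


def text_bars(summary):
    """Visualization: one line per band, with one '#' per (rounded) visit."""
    return [f"{band:<9} {'#' * round(value)}"
            for band, value in sorted(summary.items())]


if __name__ == "__main__":
    # Made-up records, for illustration only.
    records = [
        {"age": 30, "visits": 1},
        {"age": 70, "visits": 5},
        {"age": None, "visits": 2},
        {"age": 72, "visits": 3},
    ]
    for line in text_bars(mean_visits_by_band(transform(clean(records)))):
        print(line)
```

In a tool that covers the whole process, each of these functions would correspond to one or more blocks wired together in a single workflow, rather than to separate products.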

Suited For:

  • Small to average data sets: Best suited for a moderate-sized data set
  • Good Data Modelling Algorithm Implementation: Most machine learning algorithms can be implemented in RapidMiner
  • Data Cleaning & Data Transformation: RapidMiner is also among the best ETL tools and gives good competition to Pentaho and Informatica

Less Appropriate For

  • Big Data: When I tried working with big data, the IDE got stuck and it took a lot of time to fix.
  • Data Visualization