- I am very impressed by how easily you can work within RapidMiner without much data analytics training. Plus, with the help of the crowd, you can see what steps others have taken in their data analytics projects.
- Text mining was simple and clean. We used this for our call transcription problem where we didn't have the resources to listen to each call. We needed to qualify each call based on some key phrases.
- Our direct mail program was large and not very targeted. Using RapidMiner, we were able to isolate a predictive level we felt comfortable with and decided not to send to anyone below that level. We saved quite a bit of money.
- Basic data cleaning is always a problem. RapidMiner might be able to solve it, but I am not aware of how.
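The direct mail cutoff described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the customer IDs, the scores, the 0.4 cutoff and the `select_recipients` helper are all hypothetical, and the review does not say how the scores were produced or which level was chosen.

```python
def select_recipients(scores, cutoff):
    """Return the customer IDs whose predicted response score meets the cutoff.

    `scores` maps a customer ID to a predicted probability of responding.
    Both the mapping and the cutoff are illustrative assumptions.
    """
    return [customer_id for customer_id, score in scores.items() if score >= cutoff]


# Hypothetical scores, as a predictive model might output them.
scores = {"c1": 0.82, "c2": 0.15, "c3": 0.47, "c4": 0.05}

# Mail only to customers at or above the chosen level; skip everyone below it.
mail_list = select_recipients(scores, cutoff=0.4)
skipped = len(scores) - len(mail_list)
```

Raising the cutoff shrinks the mailing list and the mailing cost, at the risk of missing some responders, which is the trade-off the reviewer describes settling on.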
- RapidMiner provides a rich set of machine learning algorithms for data mining tasks, along with a comprehensive set of operators (functions) for data pre-processing. RapidMiner has a repository containing hundreds of machine learning algorithms and functions.
- RapidMiner is easy to use because it is a user-friendly visual workflow designer. Visualization of the process really helps users with data preparation and modelling. It makes my job easier in teaching machine learning and predictive analytics because I can show students the role of each operator and which one is vital in getting the right model. Students can directly see and understand the effect of using specific algorithms and functions after a few clicks, drags and drops. RapidMiner is quick and easy to master.
- It is FREE! RapidMiner is available for free for educational use. I have been using RapidMiner for about three years, and I have never encountered any problem in renewing my license. The Educational Program License lasts for a year. My students have never complained about RapidMiner as the customer support is very efficient.
- RapidMiner Marketplace: If an algorithm is missing from the RapidMiner library, we can always install extensions from the RapidMiner Marketplace. For example, I can access about 100 additional modelling schemes after installing the WEKA extension. Whatever the task, if the required algorithm or function is not available in the RapidMiner repository of ML algorithms, I can always find it in the marketplace.
- I hope RapidMiner will be the first data science platform that allows data scientists to change the behaviour of a machine learning algorithm that already exists in the repository. For example, I want to be able to change the way a genetic algorithm mutates.
- Automatic programming: One day, I hope RapidMiner can automatically generate code in any 4th generation programming language based on the developed model.
- More tutorials/samples needed: Why doesn't RapidMiner become the next 'UC Irvine Machine Learning Repository'? Provide real examples and real cases for users to study and understand the best practices in modelling. RapidMiner already has some datasets for a tutorial. Besides the existing samples, I hope RapidMiner can provide more sample data and examples.
Since we work in advanced analytics, most of our solutions are delivered using RapidMiner. As a result, most of the employees in the organization use RapidMiner. We also have dedicated developers building extensions for RapidMiner. Some of the business problems addressed using RapidMiner are:
- Fraud analysis for Banking and Financial industries
- Claim and travel analytics for a manufacturing firm
- Text mining and text analytics for a pharmaceutical firm and many other organizations
- Optimization for e-commerce and manufacturing firms
- Supply chain management for manufacturing
- Supply chain planning and scheduling for oil and gas companies
- A great tool to start exploring data science and machine learning. Its intuitive GUI, tutorials, help window, sample processes, and recommendations make it the best place to learn and expand your knowledge horizon.
- RapidMiner excels at building end-to-end solutions. Creating a process in Studio and then running it in production using Server is easy and fast. Also, using web services, we can integrate the solution into an organization's in-house application or create a new web application in RapidMiner Server. This makes solution delivery faster compared to R and Python.
- Text mining and analytics capability in RapidMiner. I think text processing is very easy here. Using the Rosette and deep learning extensions, I have delivered some great solutions.
- Smart automations such as automatically identifying parameter values, Auto Model, and Turbo Prep save a lot of time and provide better results.
- RapidMiner Server: It is very basic in terms of appearance. Web Apps could be improved by providing default themes, and it needs many more features.
- Multi-process window in RapidMiner Studio. Multiple design views could be added for switching between processes, which would make model building easier.
- Git integration for version control. We have something called MyExperiment in RapidMiner, but it is far from Git. If we had Git integration, multiple users could work on the same process, and version control would also help in referring back to previous solutions.
- Graphs in RapidMiner Studio are a bit old-fashioned.
RapidMiner is not so good with image, audio or video data. These data cannot be used directly in their raw form. They must be transformed into some intermediate form before analytics can be performed on them. Moreover, there are no connectors to pull such data directly from their varied sources. For example, we don't have a connector to read audio data directly from a switch and then convert it to text (although the Google speech API is available for audio-to-text conversion).
DisperSurance is the radical disruptive substitute for insurance. We don’t sell insurance, we sell “risk coverage”. We have been using RapidMiner on traditional insurance company data to:
- Identify optimization and automation opportunities in all the insurance processes. Moreover, we have created special extensions for the most important processes.
- Determine the most profitable e-commerce strategies for selling policies.
We have been able to design new Risk Coverage products that are up to 70% cheaper than traditional insurance.
- RapidMiner has a very large ML algorithms library and excellent tools for automated optimization of those algorithms.
- It is one of the best tools I know for text mining and analytics. It’s not only very powerful but also very intuitive and easy to use.
- Since it is very easy to move from design to production, it’s an excellent tool for building and testing complete models.
- It should become friendlier to multimedia (video, pictures, audio). For instance, it is not easy to connect raw audio with its related text data for analytics.
- It should also improve its interface design and intuitiveness. Its design isn’t very motivational and sometimes it’s hard to find some key operators.
- It should improve its capabilities for integrating RapidMiner with third-party applications.
- For creating predictive models.
- Excellent for cleaning and preparing data for a better modeling process.
- Most of the common ML algorithms can be
It is “The Tool” when you need rapid results and the data is not extremely large or complex.
When you need cooperation between multiple developers in separate geographical places.
There are much better tools for data visualization.
When a project uses lots of memory.
- Build a model
- Validate a model
- See how accurate our predictions are
- We prefer to use our own coding to clean the data since we use huge databases using MySQL, Oracle or MS-SQL
- We use the visualization tools of RapidMiner to analyze the data
- The graphics of the charts need some work. Sometimes they are hard to read on high-resolution screens.
- Sometimes it is hard to find the operators
- The interface is far from the standards of Office or Microsoft in general.
- Being able to do modeling without scripting anything is amazing
- Easy to use by just dragging and dropping operators
- Can use the tool for data cleaning, data analysis, data modeling
- Wish the tool was more efficient in terms of processing power. The tool takes a lot of CPU processing power, even for a small process on a small data set
- Wish there were more options on charts and graphs to visualize the data
So, in general, I'd say the business problem is research and education. I have used it to find patterns in protein interaction network analysis, and I've taught several machine learning workshops using RapidMiner as a tool.
- Easy to use. The Graphic User Interface allows users to build their models very fast and very intuitively
- Fast to learn. There are plenty of online resources (official and unofficial) for learning how to use RapidMiner
- Multiple Tools. RapidMiner has several tools to help with the machine learning activity that a person is doing, different models, importing, etc.
- Export. It would be great to be able to export the resulting data, graphs and models in an easier way. Currently I find that not intuitive enough.
- It would be wonderful (not sure if it fits the company's business model) to have API access, so people could integrate some of RapidMiner's functionality into their applications
Not well suited: if the user is beginning to learn machine learning, it would be advisable to get some general understanding before using the tool.
- Ease of use in the user interface
- Multiple predictive analytic models to address any needs
- Compatibility with other existing software platforms to allow for better data integration
- As the company has grown, support has been a challenge at times. A better response time would be nice.
- Work-flow visualization - the interface allows you to clearly see what the steps are and where any failures occur
- Keeping up to date with the latest algorithms and improving the performance of those algorithms
- Extensions that allow linking up to many of the other top tools
- Some of the error messages are vague enough to confuse end users
- Certain terminology used by the tool can confuse a new user
- There are a lot of available options, many of which have only minimal documentation available. Better documentation of not just what the option is but how it might impact an analysis would help.
- Desktop analysis
- Exploratory data analysis
- Predictive modeling
- Modeling that requires more than one tool
- Data preparation
- Workflow design and development
- Quick turn-around development
- Multiple concurrent developers
- Bleeding edge algorithms (they try, but the release cycle means that there is a lag)
- Obscure or less common analyses
My introduction to RapidMiner Studio began in 2014 when I decided to write a second edition of my data mining textbook. Although I was not familiar with RapidMiner Studio, I knew it to be a popular tool for data mining and analytics. My initial intention was to learn enough about RapidMiner Studio to provide the reader with an example or two of how it can be used. It soon became clear that I would use RapidMiner Studio for the majority of the tutorials and demonstrations in my book.
My textbook “Data Mining A Tutorial-Based Primer 2nd edition” contains 14 chapters, two of which are devoted to RapidMiner Studio. Five additional chapters contain one or several data mining tutorials that use RapidMiner Studio 7. Here is a link to preview the text. https://www.crcpress.com/978149876397
- RapidMiner Studio offers a superb user interface with an intuitive workflow paradigm that is very easy to learn.
- RapidMiner Studio’s operators make it a complete and powerful tool for data preprocessing, data visualization, and data mining/analytics.
- RapidMiner Studio provides excellent documentation, countless worked examples, training and support via a large user community.
- Every problem is solved using a sequence of operators.
- Statistical analysis capabilities offered with the T-Test, ANOVA, Grouped ANOVA, and ANOVA Matrix operators.
- Textual data mining operators.
- Web-based and cloud computing capabilities.
- Visualization capabilities.
- Marketplace Extensions – especially Finance And Economics.
- Process portability.
- With large-sized data sets, there are processing speed issues with a few of the operators. However, RapidMiner Studio 7.4 contains several new performance enhancing features.
- Rapid prototyping of machine learning models.
- Provides predefined parameters that are crowd sourced and provide helpful parameter ranges.
- The interface is very easy to use, even for someone with no coding experience.
- Provide model exportability.
- We have had trouble exporting the models to languages like php.
- The ability to build custom models would be useful, using scripting languages.
- Ability to automate running the machine learning for multiple tasks would be useful.
Our business problem is to predict whether a member has a high chance of coming to the hospital, based on factors such as age, sex, zip code, marital status, number of visits, diagnosis code, etc.
- Data Cleaning & Transformation
- Data Modeling (Algorithm Implementation)
- Data Visualization
- Data Integration
- Data visualization can be improved. I have used Tableau, which has more colorful schemes for graphs. It would be great if RapidMiner improved the look of its graphs.
- It would be great if connectivity to Hadoop HDFS were provided.
- It would be good if more examples were added for each block. Maybe not in the IDE, but as videos on the website. I could find videos for some blocks but not for every one.
- Small to average data sets: Best suited for a moderate-sized data set
- Good Data Modelling Algorithm Implementation: Most machine learning algorithms can be implemented in RapidMiner
- Data Cleaning & Data Transformation: RapidMiner is also the best ETL tool which gives good competition to Pentaho and Informatica
Less Appropriate For
- Big Data: When I tried using big data, the IDE got stuck and it took a lot of time to fix.
- Data Visualization
- The hands-on tutorials provide an understandable step-by-step path to building simple solutions to complex problems.
- RapidMiner provides near instant statistics and quick graphs giving you an overview of your data. This saves a lot of time and complexity.
- Provides a simple drag and drop means to solve complex data science problems.
- There are a few places where the documentation for the tutorials differs from the way that RapidMiner actually works.
- Several times the webinar presenter didn't show up.
- If you have people on your team who are new to analytics and do not have programming experience in SAS, R or Python, RapidMiner is the way to go.
- The module-based approach of RapidMiner is very useful, and it has strong community support. One of my favorite features is the suggestions it gives, telling you which step was most commonly used after a transformation, an import, etc.
- RapidMiner is great for people with no programming experience but I have found that certain tasks that would normally take very little code to do can often become very convoluted with RapidMiner. I am talking about tasks like filtering and subsetting and other transformation type tasks.
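To illustrate the kind of filtering and subsetting task the reviewer says takes very little code, here is a minimal sketch in plain Python. The records, field names and filter condition are hypothetical and are not taken from the review.

```python
# Hypothetical sample records; the field names are illustrative only.
records = [
    {"name": "Ana", "age": 34, "region": "north", "spend": 120.0},
    {"name": "Ben", "age": 27, "region": "north", "spend": 80.5},
    {"name": "Cho", "age": 45, "region": "south", "spend": 210.0},
    {"name": "Dev", "age": 52, "region": "north", "spend": 95.0},
]

# Filter: keep only the rows that match a condition.
north_over_30 = [r for r in records if r["region"] == "north" and r["age"] >= 30]

# Subset: keep only the selected columns of the filtered rows.
columns = ("name", "spend")
subset = [{c: r[c] for c in columns} for r in north_over_30]
```

Each step here is a single expression, whereas a visual workflow would typically need one operator per step.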
- Great GUI. Very easy to use.
- Lots of built-in machine learning algorithms.
- The free version also has most of the things that may be required on a day to day basis.
- The integration between RapidMiner and R isn't perfect yet.
- Few statistical methods.
- A lot of operators present.