RapidMiner & R Integration

RapidMiner describes itself as the #1 open source predictive platform, one that lets enterprises easily mash up data, create predictive models, and operationalize predictive analytics within any business process. In fact, RapidMiner was created to allow non-experts to reach the same findings as data scientists.

A comprehensive and valuable online resource for learning RapidMiner is available here.

Let’s look at how RapidMiner and R can be integrated. Combining RapidMiner’s data mining capabilities with R’s powerful analytical packages gives us an even more versatile platform. Integration between tools matters because it lets analysts use the best available tool for each discrete step in a workflow and work more efficiently.

I am using RapidMiner version 6.5.002, and the steps involved are pretty straightforward. Go to Help->Marketplace, type R Scripting, and press the Search button. R Scripting 6.5.0 will appear; check ‘Select for installation’ and press the ‘Install 1 packages’ button. See here. Verify that the R Scripting extension is properly installed by going to Help->Manage Extensions and checking that R Scripting is listed.

Next, under Operators, go to Utility->Execution and double-click Execute R to add it to the Process. In the Help for Execute R, click the first Tutorial Process, ‘Training and applying a linear model in R’. In Step 3, click the Run button. The Results should show a column named label and another column named prediction, both highlighted in light green.
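
To give a feel for what such a process does on the R side, here is a minimal sketch of a script that fits a linear model and appends a prediction column. It is not the tutorial's actual script. It assumes, as I understand the R Scripting extension's convention, that Execute R calls a function named rm_main and passes the input data as a data.frame; the toy data and the column name x are made up purely for illustration.

```r
# Assumed entry point for the Execute R operator: receives the input
# example set as a data.frame and returns a data.frame.
rm_main <- function(data) {
  # Fit an ordinary least-squares model of `label` on all other columns.
  model <- lm(label ~ ., data = data)
  # Append the model's predictions as a new column.
  data$prediction <- predict(model, newdata = data)
  data
}

# Standalone check outside RapidMiner, using hypothetical toy data.
toy <- data.frame(x = 1:10)
toy$label <- 2 * toy$x + 1
result <- rm_main(toy)
print(head(result))
```

Run in a plain R session, this returns the toy data with a prediction column next to label, which mirrors the two columns shown in the RapidMiner Results view. Inside RapidMiner, only the rm_main function would go into the operator's script; the lines after it are there just to try the function on its own.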

Another useful thing to know is that RapidMiner is moving its predictive analytics to the cloud.

