Tableau & R Integration

The first impression I have with the Tableau software is “Wah, very powerful man”. It can be easily linked up to different data sources (databases, be it MySQL, MS SQL Server, Oracle, Amazon Redshift, or files like Excel, Text) and extract information from them within seconds. Even tables from different databases can be easily joined (Inner, Left, Right, Full Outer). Visualization of different sets of data is just simple drag and drop. Even dashboards can be easily created with some practices. Tableau would be a good platform to consolidate data from different sources of different type into a single environment.

One complete and valuable online source of learning Tableau is available here.

The next thing I look into is how to integrate Tableau with R, which the combined strengths of Tableau’s visualization and R’s powerful analytical packages, would make an even more powerful platform.

Here is my first simple example of Tableau and R integration.

Below shows the Clustering using R in Tableau.

Useful Resources
1) Tableau Your Data (Recommended), with good examples.
2) Tableau Dashboard Cookbook
3) Tableau Data Visualization Cookbook
3) Communicating Data with Tableau
4) Tableau Desktop Help
5) The Tableau Expert

Useful Tips
1) If you want to make the filter available for people that are viewing the report via Tableau Reader or Server you need to expose the filter control on the desktop. To create a quick filter, point at and right-click on any pill used on any shelf in your worksheet, then select the Show Quick Filter option. For example, to create filter in Dashboard, highlight the relevant worksheet, go to Analysis menu, Quick Filters, then click on the filter that you desire. See this.
2) To extract the total seconds from a dimension (say Testtime) of Data Type in Date & Time, use DATEPART. Remember to check the Attribute and Discrete.
3) When using DATEPART to extract the total seconds, the DATEPART appears on the header which is unsightly. It is impossible to rename that header. To overcome this, create a separate Calculation Field using above formula and give a name to that Calculation Field as “Total Testtime(sec)” Then Drag that column into Rows. You will see bar chart for this field. Convert that into Discrete. You will see normal text.
4) Measure Names and Measure Values are automatically generated when you connect to a data source. When they are being used, make sure Analysis-> Aggregate Measures is ticked.
5) When we want to connect to a specific query rather than the entire data source, we use Custom SQL. After connecting to your data, double-click the New Custom SQL option on the Data Source page. Enter the SQL commands as shown and click OK button. The results would then be displayed. Read here to understand the pros and cons of Custom SQL connections.
6) Another useful feature is to use parameters in a custom SQL query statement to replace a constant value with a dynamic value. See an example here, and its output.
7) Say if we want different plots (having the same x-axis values) to appear on a single plot, we need to move the variable say ‘Customer’ from the Rows shelf to Color on the Marks card. Note that the y-axis will change when selecting more than one Customer. From Analysis -> Legends-> Color Legend, so that different colored plots would be indicated accordingly. See here. 


Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s