This module will introduce you to Microsoft Power BI (BI), a business intelligence platform designed to provide visualization dashboards, cloud-based collaboration, and multi-platform distribution across workstations and phones. We will work through examples on loading data from Excel worksheets, visualizing the data using different methods, and integrating the visualizations into an interactive BI dashboard.
Microsoft Power BI (Business Intelligence) is a collection of tools designed to manage, query, and visualize data. The following six components are the ones you are most likely to use, although there are others.
Our main focus in these lectures will be on Power BI Desktop (BI Desktop) and Power BI Website. Additional information can be found on Microsoft's Power BI site if you're interested in exploring additional BI tools.
To use BI, you can download a copy of BI Desktop, either from the university's Microsoft subscription license or directly from Microsoft. We recommend you use the university's download, since this will also give you the ability to publish desktops to BI's public cloud. There are a number of steps you will need to complete to get BI Desktop up and running.
However you download BI Desktop, the first time you run it a dialog should appear with an option in the center to "Sign In." Do this, and enter your Unity ID. Since the university has a license to Power BI, this should present a standard Unity login screen where you can use your Unity ID and password to sign in to Power BI. Once you do this, you will have the ability to publish to and access the Power BI cloud.
BI Desktop loads with a blank report worksheet opened by default. On the left of the worksheet are three workspace icons allowing access to reports, data, and relationships, respectively. On the right are vertical tabs for filters, visualizations, and data fields. These can be shown or hidden with the > or < icons.
In order to visualize, we need to start with a dataset. We'll be using a modified version of a Excel activity dataset originally provided by Peter Aldhous as part of his Tableau demo. Once you've downloaded the activity dataset to your computer, open it in Excel and spend a few minutes familiarizing yourself with its different fields.
To start, we need to load the activity dataset into BI Desktop.
Once the Excel worksheet has loaded, you should see a blank BI Desktop worksheet report, with the fields from the USDA Activity Dataset in the "Fields" column on the far right side of the window.
Unlike Tableau or SAS VA, BI Desktop does not try to subdivide data attributes into categorical (qualitative) or continuous (qualitative) values. Instead, the "Fields" column lists each data attribute with a checkbox, an optional symbol (e.g., Σ), and its name from the spreadsheet header row. The checkbox identifies whether the given attribute is in-use for some part of the currently selected visualization. The symbol defines data type: Σ for numeric attributes, 🌐 for a geographic attribute, a calculator-like symbol for a calculated field, and blank for text for other data that BI Desktop has not automatically classified.
Similar to Tableau and SAS VA, however, visualizations are created by selecting a visualization type and using drag-and-drop to define the visualization's properties. For example, to create a "% Smokers" by "State" bar graph, perform the following steps.
Next, we'll generate a dot map of the adult obesity rate for each county.
Next, we'll create some scatterplots to compare adult obesity rate to other, possibly related measures.
Notice that, by default, BI Desktop links data points in the visualizations. For example, click on any bubble in a scatterplot, and the dot map will zoom in on that bubble's state and county. Similarly, if you click on a state bubble to drill down to a state's counties, only that state's county dots will appear in the scatterplots.
To compare "Adult Obesity" and "% Smokers," we want to build a dual-axis line graph. Unfortunately, BI Desktop does not currently have the ability to support two Y-axes for two line graphs, although it does support a combination bar+line graph. Microsoft's current suggestion is to embed an R-based ggplot2 dual-axis graph. Although this seems somewhat high maintenance on BI Desktop's part, it does give us an opportunity to see how to embed R-based plots into a BI dashboard. It is also possible to use Python to construct plots (e.g., with Pyplot) in a similar fashion.
To start, we need to ensure that BI Desktop has access to an R interpreter, and optionally to an R integrated development environment (IDE) so that R can render the plots properly. To do this, choose "File", then "Options and Settings → Options." Select "R scripting" from the "GLOBAL" list, and enter or confirm a path to an R interpreter in the "Detected R home directories:" drop-down list, and an R IDE in the "Detected R IDEs" drop-down list. Because I already have R 3.3.1 and RStudio installed, BI Desktop has located them and automatically populated my fields.
Once the location of R and its corresponding IDE are set, we can
insert an "R Visual" onto our report worksheet. To do this, click "R
script visual" (5th row, 4th icon) to insert an R script-based
visualization. You should see a ".R" placeholder and an "R script
editor" at the bottom of the screen. Data you want to use in the R
script must be added to the R visual. Since we want to compare "Adult
obesity rate" and "% smokers" by "State", we drag all three attributes
from the "Fields" list to the "Values" field in the "Visualizations"
column. Notice that comment lines appear in the R editor denoting that
a dataframe called dataset
will automatically be created
whenever the R visual executes or updates.
Next, we add the following R code to produce a
dual-axis ggplot2
line graph of "Adult obesity rate" and
"% smokers" by "State."
Once you have entered this code beneath the line # Paste or type
your script code here:
, you can run it by clicking on the run
icon in the R editor's title bar (). Assuming you pointed to a valid R interpreter
and entered the code properly, you should see a dual-axis line graph
in your R visual. You can hide the R editor by clicking on the down
arrow in its titlebar, or by deselecting the R visual.
One problem with this solution is that, unless you're using Microsoft Power BI Pro, R and Python-based visuals will not render if your dashboard is published to the web. Rather than dealing with R, we can switch to a different visualization type that provides the same "visual search for correlation" capability. An obvious choice is a side-by-side bar chart. This has the added advantage that, in this case, both "Adult obesity rate" and "% smokers" are reported by percent, so we only need a single axis to properly annotate both variables. To create the side-by-side bar chart, we'll first create two new columns in our dataset that report true percentages on the range 0…1 rather than the 0…100 range that both "Adult obesity rate" and "% smokers" use. This will ensure we can format the fields as percentages, to properly show a percent symbol on the bar graph's vertical axis.
Measure =
with Obesity_Frac = [Adult obesity rate] / 100.0
Now that we have our new measures created, we can construct the side-by-side bar chart.
The result is a side-by-side bar chart, organized by state, showing each state's obesity rate and smoker rate as percentages. You can click the … in the top-right of the visualization to choose to sort either ascending or descending by state, obesity, or smoker percentage. As noted above, this has the additional advantage of being fully functional when it's published to the web.
In some sense, visualizations are dashboards, or "reports" in BI Desktop terminology, since they provide built-in interactivity by default. For example, we have already seen that selecting an item in any visualization automatically "links" to all other visualizations, filtering their contents to match the selected item or items.
BI Desktop does provide some additional elements to add items to a report. In the "Home" tab's "Insert" region there are options for "Text box", "Image", or "Shape" that can be placed on the dashboard. Under "Custom visuals", you can import specialized visualizations or data manipulation widgets (e.g., sliders, word clouds, etc.) from Microsoft's visual repository. Some of these are built by Microsoft, and others are contributed by independent programmers. Power BI provides an API to allow construction of custom visuals.
Finally, slicers, which act like filters, can be inserted directly into a report, to make them more prominent and accessible versus using the Filter column. As an example, we will create a drop-down menu slicer to allow users to filter our visualizations to one or more individual states.
To insert a state slicer into our report, complete the following steps.
Notice in this image I've also cleaned up the visualization. I've added a title as a text box, and tweaked the text for axes and titles in the visualizations to convert them from potentially uninformative variable names to more descriptive, English-language explanations. Any one of these changes by itself is not significantly impactful, but the whole is definitely more than the sum of the parts, since the dashboard now appears professional and ready for use by domain experts who are not necessarily visualization or Power BI savvy.
The final step is to upload your work to the web. Similar to Tableau Public, this provides a method to easily share your dashboard through a web browser.
To publish to the BI website, make sure you've logged in with your Unity ID and password as describe in the Setup section. Next, Click on "Publish," which is in the "Share" region of the "Home" tab. BI Desktop will ask you if you want to save any changes you've made to the Dashboard, then ask you to "Select a destination." Choose "My workspace" and click "Select." Your report will be uploaded to the BI website, and you will be given options to "Open 'xxx.pbix' in Power BI" or "Get Quick Insights." If you click the "Open..." link, a web browser sheet will open containing your BI report.
Unfortunately, the ability to "Share" your dashboard is again restricted to Power BI Pro users. Providing the URL to the dashboard won't work, since you need to be logged into Power BI with your account credentials in order to access the dashboard. However, this does give you the ability to demonstrate dashboards from the web, which is useful in situations like external presentations, where the BI Desktop is unlikely to be installed on the machine you are using.
This tutorial only begins to explain the power and flexibility of Power BI. You can use Power BI's online documentation to explore BI in more detail. You can also come to talk to me if you're having problems, and I can try to help you find instructions to construct the visualizations you want to build.
Also, remember that, just like Tableau, BI Desktop is a tool to allow you to build visualizations. Like any tool, BI doesn't necessarily guarantee that the visualizations you produce are good visualizations. Try to apply the design principles we discussed in the visualization lecture, to make sure your visualizations are clear, effective, and well structured. The default visualizations that BI provides are unlikely to offer this, since BI doesn't have your domain expertise and understanding of the data's context to decide how best to present it.