R + Shiny: Institute for Advanced Analytics

This module will introduce you to Shiny, a framework that integrates with RStudio to construct web-based dashboards. We will work through a number of simple examples of loading data, visualizing it with R's built-in graphics operations, then integrating those visualizations into an interactive Shiny web dashboard, which can be viewed online by anyone with a web browser.

Setup

In order to use the Shiny examples in this tutorial, you will need to add the following R packages:
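Going by the code used throughout this tutorial, the packages involved appear to be ggplot2 and shiny, which are loaded explicitly, plus maps and mapproj, which map_data and coord_map rely on in the map examples. The sketch below installs whichever of these are missing. The package set is inferred from the examples and the CRAN mirror URL is simply one common choice, so treat both as assumptions.

```r
# Packages inferred from the examples in this tutorial (an assumption):
# ggplot2 and shiny are loaded explicitly, while maps and mapproj
# support map_data() and coord_map() in the map examples.
pkgs <- c( "ggplot2", "shiny", "maps", "mapproj" )

# Find the packages that are not yet installed
missing_pkgs <- setdiff( pkgs, rownames( installed.packages() ) )

# Install only what is missing; the mirror URL is one common choice
if ( length( missing_pkgs ) > 0 ) {
  install.packages( missing_pkgs, repos="https://cloud.r-project.org" )
}
```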

R Graphics

R provides numerous ways to generate plots or visualizations of data stored in vectors, matrices, tables, and data frames. The two most common methods for visualizing data are R's base graphics, which are included as part of the standard R installation, and ggplot2, a package by Hadley Wickham specifically designed to support the flexible design of plots ranging from simple to complex.

As an example, the following code produces a bar graph of the heights of the trees in the built-in trees data frame.

% barplot( trees$Height, names.arg=row.names( trees ), main="Tree Height", col="lightblue", xlab="ID", ylab="Height" )

% library( "ggplot2" )
% x_lbl <- row.names( trees )
% x_lbl <- factor( x_lbl, levels=unique( x_lbl ) )
% ggplot( data=trees, aes( x=x_lbl, y=Height ) ) +
%   geom_bar( fill="lightblue", stat="identity" ) +
%   xlab( "ID" ) +
%   ggtitle( "Tree Height" )

ggplot2 is based on the grammar of graphics, a foundation proposed by Leland Wilkinson to generate visualizations. In this context, a chart is divided into individual components — data, annotations, scales, and so on — that are connected together using addition. In the example above, each command has the following meaning.
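One way to read the ggplot example is to annotate each component in place. The comments below are our own reading of the standard ggplot2 functions, offered as a sketch; the variable p is introduced only for illustration.

```r
library( ggplot2 )

# Row names become the bar labels; fixing the factor levels keeps the
# bars in row order rather than alphabetical order ("1", "10", "11", ...)
x_lbl <- row.names( trees )
x_lbl <- factor( x_lbl, levels=unique( x_lbl ) )

p <- ggplot( data=trees, aes( x=x_lbl, y=Height ) ) +  # data and aesthetic mapping
  geom_bar( fill="lightblue", stat="identity" ) +      # bars; stat="identity" plots Height as-is
  xlab( "ID" ) +                                       # x-axis label
  ggtitle( "Tree Height" )                             # chart title

print( p )
```

Each component is a separate function call, and the + operator attaches it to the chart built so far.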

Charts

We provide examples of the standard charts you're likely to use when you're building an R+Shiny web application. Basic bar charts have been covered above. Below we examine variations on bar charts, line charts, pie charts, scatterplots, and histograms.

Bar Charts

In addition to basic bar charts, we often want to construct stacked or side-by-side bar charts to compare and contrast subcategories of our data. Consider the beaver1 dataset included in the standard R install.

% library( "ggplot2" )
% str( beaver1 )
'data.frame': 114 obs. of 4 variables:
 $ day  : num 346 346 346 346 346 346 346 346 346 346 ...
 $ time : num 840 850 900 910 920 930 940 950 1000 1010 ...
 $ temp : num 36.3 36.3 36.4 36.4 36.5 ...
 $ activ: num 0 0 0 0 0 0 0 0 0 0 ...
% df <- beaver1
% df$day <- as.factor( df$day )
% df$activ <- factor( df$activ, labels=c( "inactive", "active" ) )
% df <- df[ order( df$activ ), ]
% ggplot( df, aes( x=day, y="", fill=activ ) ) +
%   geom_bar( stat="identity" ) +
%   ylab( "Activity" ) +
%   ggtitle( "Beaver Activity vs Inactivity" )

The df[ order( df$activ ), ] command is critical, because it guarantees activity values for individual days are grouped together. Without this, a day's active and inactive values will be spread throughout the day, and the stacked chart will look like a single bar with lines through it where the activ factor changes its value.

Side-by-side bar charts are generated similarly. Here, however, we need to aggregate a total (e.g., the number of occurrences of each activity) for each subcategory (e.g., for each day). Once we have done that, we can plot the subcategories as side-by-side bars.

Here, we select the day and activ columns from the beaver1 dataset, use table to compute the frequency of activity for each day, convert the result to a data frame, then plot it as a side-by-side bar chart. We use position="dodge" in the geom_bar command to get a side-by-side bar graph. The default, position="stack", would give us a stacked bar chart.
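The steps just described can be sketched as follows. This is a minimal reconstruction under our own assumptions, not the original listing: the names df, freq, and side are ours, as are the axis label and title, and Freq is the count column that as.data.frame creates from a table.

```r
library( ggplot2 )

# Keep only the day and activity columns, with readable activity labels
df <- beaver1[ , c( "day", "activ" ) ]
df$activ <- factor( df$activ, labels=c( "inactive", "active" ) )

# Count observations for each day/activity combination, then convert
# the table to a data frame with columns day, activ, and Freq
freq <- as.data.frame( table( df ) )

# position="dodge" places the activity bars for each day side by side
side <- ggplot( freq, aes( x=day, y=Freq, fill=activ ) ) +
  geom_bar( stat="identity", position="dodge" ) +
  ylab( "Observations" ) +
  ggtitle( "Beaver Activity by Day" )
print( side )
```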

A ggplot stacked bar chart on the left, and a corresponding side-by-side bar chart on the right

Line Charts

Building a line chart in ggplot is very similar to building a bar chart, except that we substitute geom_bar with geom_line. This shows one of the strengths of ggplot. Since the data, representation of the data, and decorations on the representation are all built separately and combined, switching the representation from bar to line involves changing only one part of the overall ggplot command.

% x_lbl <- as.numeric( row.names( trees ) )
% ggplot( data=trees, aes( x=x_lbl, y=Height ) ) +
%   geom_line( colour="red", linetype="dashed", size=1.0 ) +
%   xlab( "ID" ) +
%   ggtitle( "Tree Height" )

Notice here that we have treated the data frame row indices as a sequence of numeric values, x_lbl <- as.numeric( row.names( trees ) ), and not as a factor variable. In a line chart, by default ggplot uses the combination of all factor variables to group the points. This is not what we want, so we cannot use a factor variable for the x-axis. An alternative is to specify the grouping manually. The aesthetic specification aes( group=1 ) tells ggplot we want a single line connecting all the points.

% x_lbl <- row.names( trees )
% x_lbl <- factor( x_lbl, levels=unique( x_lbl ) )
% ggplot( data=trees, aes( x=x_lbl, y=Height, group=1 ) ) +
%   geom_line( colour="red", linetype="dashed", size=1.0 ) +
%   geom_point( color="red", size=3.0, shape=1 ) +
%   xlab( "ID" ) +
%   ggtitle( "Tree Height" )

Notice that we also used geom_point to add an open circle at each height value. The shape argument defines how points are displayed. Shapes are defined numerically to provide open and filled circles, squares, triangles, and other glyphs like plus and X symbols.

We can also build multi-line charts, where each line represents a separate factor level. Consider the chickwts dataset, which lists chicken weight by the type of feed each chicken was given. The following code generates a multi-line chart, one line per feed type, showing the weight of each chicken that received the given feed.

% str( chickwts )
'data.frame': 71 obs. of 2 variables:
 $ weight: num 179 160 136 227 217 168 108 124 143 140 ...
 $ feed  : Factor w/ 6 levels "casein","horsebean",..: 2 2 2 2 2 2 2 2 2 2 ...
% df <- chickwts
% cat <- table( df$feed )
% idx <- numeric()
% for( i in 1:length( cat ) ) {
%   idx <- c( idx, 1:cat[ i ] )
% }
% df$x_lbl <- as.factor( idx )
% ggplot( data=df, aes( x=x_lbl, y=weight, group=feed, color=feed ) ) +
%   geom_line( size=1.0 ) +
%   geom_point( size=4.0, shape=20 ) +
%   xlab( "ID" ) +
%   ggtitle( "Chicken Weight by Feed Type" )

The group=feed argument in the initial ggplot command defines which variable to use to split the dataset into individual lines.

Pie Charts

Pie charts are closely related to stacked bar graphs. In ggplot terms, you can think of a pie chart as a stacked bar chart that's been "wrapped" to form a circle. The code below uses the built in chickwts dataset to build a stacked bar chart of average chicken weight by feed type.

% df <- aggregate( chickwts$weight, by=list( chickwts$feed ), FUN=mean )
% names( df ) <- c( "feed", "weight" )
% df$feed <- reorder( df$feed, order( -df$weight ) )
% ggplot( df, aes( x="", y=weight, fill=feed ) ) +
%   geom_bar( stat="identity", width=0.25 ) +
%   ggtitle( "Mean Chicken Weight by Feed Type" )

Notice the R commands aggregate( chickwts$weight, by=list( chickwts$feed ), FUN=mean ) and df[ order( -df$weight ), ]. The first command aggregates chicken weight by feed type, producing a data frame with a single average weight entry for each feed type. The second command sorts the data frame in descending order of average weight. We do this because in a pie chart we want to display slices in descending order from largest to smallest.

To convert the stacked bar chart into a pie chart, we simply add an additional ggplot command coord_polar to plot the data in polar coordinates. This produces the pie chart shown above and to the right.

% df <- aggregate( chickwts$weight, by=list( chickwts$feed ), FUN=mean )
% names( df ) <- c( "feed", "weight" )
% df <- df[ order( -df$weight ), ]
% ggplot( df, aes( x="", y=weight, fill=feed ) ) +
%   geom_bar( stat="identity" ) +
%   coord_polar( "y", start=0 ) +
%   ggtitle( "Mean Chicken Weight by Feed Type" )

Here's a slightly more informative and aesthetic version of the pie chart. You can check your R knowledge and consult the ggplot documentation to explore the commands used to create this chart.

% df <- aggregate( chickwts$weight, by=list( chickwts$feed ), FUN=mean )
% names( df ) <- c( "feed", "weight" )
% df <- df[ order( -df$weight ), ]
% lbl <- paste( df$feed, "\n", round( df$weight / 16.0, 1 ), "lb", sep = "" )
% cb_palette <- c( "#999999", "#E69F00", "#56B4E9", "#009E73", "#F0E442", "#0072B2" )
% df$pos <- cumsum( df$weight ) - ( df$weight / 2 )
% df$feed <- factor( df$feed, levels = df$feed[ order( df$weight ) ] )
% pie <- ggplot( df, aes( x="", y=weight, fill=feed ) )
% pie <- pie + geom_bar( colour="black", stat="identity" ) + coord_polar( "y", start=0 )
% pie <- pie + theme(
%   axis.ticks=element_blank(),
%   axis.title=element_blank(),
%   axis.text.y=element_blank(),
%   axis.text.x=element_text( colour="black" ),
%   legend.position="none" )
% pie <- pie + scale_y_continuous( breaks=df$pos, labels=lbl )
% pie <- pie + scale_fill_manual( values=cb_palette )
% pie <- pie + ggtitle( "Mean Chicken Weight by Feed Type" )
% print( pie )

Scatterplots

A scatterplot is normally used to look for relationships between two variables. For example, suppose we wanted to visually explore whether a relationship exists between a tree's height and its volume. This can be done using the geom_point command.

% df <- trees
% ggplot( data=df, aes( x=Height, y=Volume ) ) +
%   geom_point( shape=20, size=3.0 ) +
%   ggtitle( "Tree Height vs Volume" )

The figure suggests there appears to be a relationship between tree height and volume, but it would be useful to plot a regression line through the points to see how well it fits the data, and what its slope is. This can be done in ggplot using the geom_smooth command.

% df <- trees
% ggplot( data=df, aes( x=Height, y=Volume ) ) +
%   geom_point( shape=20, size=3.0 ) +
%   geom_smooth( method=lm ) +
%   ggtitle( "Tree Height vs Volume" )

Adding the regression line and confidence interval seems to further confirm a relationship between tree height and volume. Accessing the lm function directly confirms a p-value of less than 0.05, the normal cutoff for rejecting the null hypothesis.

% reg <- lm( trees$Height ~ trees$Volume )
% summary( reg )

Call:
lm(formula = trees$Height ~ trees$Volume)

Residuals:
     Min       1Q   Median       3Q      Max
-10.7777  -2.9722  -0.1515   2.0804  10.6426

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)  69.00336    1.97443  34.949  < 2e-16 ***
trees$Volume  0.23190    0.05768   4.021 0.000378 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 5.193 on 29 degrees of freedom
Multiple R-squared: 0.3579, Adjusted R-squared: 0.3358
F-statistic: 16.16 on 1 and 29 DF, p-value: 0.0003784

Histograms

Histograms allow you to: (1) count the number of occurrences in a categorical variable, or (2) discretize a continuous variable, then count the number of occurrences of values within a predefined set of ranges or bins. Both approaches are demonstrated below. The first uses the built-in airquality dataset and treats temperature as a factor (i.e., as a categorical variable) to count the number of temperature occurrences within the dataset. The second uses the chickwts dataset to count the number of chicken weights that fall within equal-width bins of eight ounces.

% df <- airquality
% df$Temp <- as.factor( df$Temp )
% ggplot( data=df, aes( x=Temp ) ) +
%   geom_bar( color="black", fill="palegreen2" ) +
%   xlab( "Temperature F" ) +
%   ylab( "" ) +
%   scale_y_continuous( breaks=c( 1, 3, 5, 7, 9, 11 ) ) +
%   ggtitle( "Temperature Counts" )

If you create a histogram from a discrete variable (e.g., a factor variable), you use geom_bar. This makes sense intuitively, since counting occurrences in a categorical variable is, in essence, equivalent to generating a bar chart of counts of the variable's values. If you create a histogram from a continuous variable, you use geom_histogram.

In the discrete histogram example above, we use scale_y_continuous to explicitly define the tick positions on the y-axis. In the geom_histogram example above, we use the alpha argument to make each bar semi-transparent.
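A continuous histogram of chicken weights along the lines described above could be sketched as follows. This is a minimal illustration under our own assumptions: the bin width of eight follows the text, while the colours, labels, title, and the variable name hist_plot are our choices, borrowed loosely from the Shiny histogram later in this tutorial.

```r
library( ggplot2 )

# Continuous variable: geom_histogram bins the weights. binwidth=8
# follows the eight-unit bins described in the text, and alpha=0.7
# makes each bar semi-transparent.
hist_plot <- ggplot( data=chickwts, aes( x=weight ) ) +
  geom_histogram( binwidth=8, colour="white", fill="lightblue", alpha=0.7 ) +
  xlab( "Weight" ) +
  ylab( "" ) +
  ggtitle( "Chicken Weight Counts" )
print( hist_plot )
```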

Boxplots

A final chart that is often useful in statistics is the boxplot, a visualization that identifies the median, the first and third quartile boundaries Q₁ and Q₃, and the lower and upper "fences", normally 1.5 × IQR (inter-quartile range Q₃ - Q₁) below Q₁ and above Q₃. Any points outside the fences are plotted as outliers.
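To make the fence arithmetic concrete, here is a small base R sketch that computes the quartiles, IQR, and fences for one feed type in chickwts. The variable names are ours, and quantile's default method may place the quartiles slightly differently from the hinges a boxplot function uses, so treat the result as an approximation of what a plotted boxplot shows.

```r
# Weights for a single feed type
w <- chickwts$weight[ chickwts$feed == "sunflower" ]

# First and third quartiles, and the inter-quartile range
q <- quantile( w, probs=c( 0.25, 0.75 ) )
iqr <- q[[ 2 ]] - q[[ 1 ]]

# Fences sit 1.5 x IQR below Q1 and above Q3
lower_fence <- q[[ 1 ]] - 1.5 * iqr
upper_fence <- q[[ 2 ]] + 1.5 * iqr

# Points outside the fences would be drawn as outliers
outliers <- w[ w < lower_fence | w > upper_fence ]
print( c( lower=lower_fence, upper=upper_fence ) )
print( outliers )
```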

An example of boxplots of chickwts weight by feed type can be constructed as follows.

% library( ggplot2 )
% ggplot( data=chickwts, aes( x=feed, y=weight ) ) +
%   geom_boxplot( colour="blue", fill="white", outlier.colour="red", outlier.shape=1 )

This boxplot shows only a few outliers in the "sunflower" feed type category. Another example uses the iris dataset to plot Sepal Width by Species. This shows a few additional outliers, both above and below the IQR fences. We have also used geom_dotplot to display all of the data points at their corresponding Sepal Width positions, overlaid on top of the boxplot.

% library( ggplot2 )
% box <- ggplot( data=iris, aes( x=Species, y=Sepal.Width ) )
% box <- box + geom_boxplot( lwd=1, color="black", fill="white" )
% box <- box + geom_dotplot( aes( fill=Species ), binaxis="y", stackdir="center", method="histodot", binwidth=0.1, dotsize=0.75 )
% print( box )

Maps

ggplot also has the ability to visualize data on maps using commands like geom_map and coord_map.

% library( ggplot2 )
% states <- map_data( "state" )
% ggplot() +
%   geom_map( data=states, map=states, aes( map_id=region ), fill="white", colour="black" ) +
%   expand_limits( x=states$long, y=states$lat ) +
%   coord_map( "albers", lat0=29.5, lat1=49.5 )

The R code above produces a basic map of the U.S., then warps it using an Albers map projection. An Albers projection requires two standard parallels to project about, defined here as lat0=29.5 and lat1=49.5. The USGS commonly displays maps using an Albers projection; for the conterminous United States, its usual standard parallels are 29.5°N and 45.5°N, so the second value used here sits slightly north of that convention.

Choropleth Maps

To produce a map with data overlaid, you normally start by drawing a base map, then adding a second map layer using geom_map containing the data you want to visualize.

For example, suppose we wanted to visualize a choropleth map of state population. R's built-in state.x77 matrix contains various information about US states, including estimated population in thousands as its first column. We can use this to colour individual states darker for lower populations and lighter for higher populations.

% library( ggplot2 )
% states <- map_data( "state" )
% choropleth <- data.frame( ID = tolower( rownames( state.x77 ) ), pop = state.x77[ , 1 ] )
% map <- ggplot() +
%   geom_map( data=states, map=states, aes( map_id=region ), fill="white", colour="black" ) +
%   expand_limits( x=states$long, y=states$lat )
% map <- map + geom_map( data=choropleth, map=states, aes( fill=pop, map_id=ID ) )
% map <- map + coord_map( "albers", lat0=29.5, lat1=49.5 )
% print( map )

The key concept to understand here is how ggplot maps regions on the map to data values that drive each region's colour. This is done with the map_id aesthetic field. If you look at the states data frame, you'll see that the individual polygons that make up each state are identified by the state's name, in lowercase.

% head( states )
       long      lat group order  region subregion
1 -87.46201 30.38968     1     1 alabama      <NA>
2 -87.48493 30.37249     1     2 alabama      <NA>
3 -87.52503 30.37249     1     3 alabama      <NA>
4 -87.53076 30.33239     1     4 alabama      <NA>
5 -87.57087 30.32665     1     5 alabama      <NA>
6 -87.58806 30.32665     1     6 alabama      <NA>

When we built the choropleth variable, we included an ID column that also uses lowercase state names. We then matched the columns between the states and choropleth data frames using the map_id aesthetic field, with map_id=region in the base map and map_id=ID in the choropleth layer that fills in the individual state polygons.
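Because the match is made purely on these name strings, it can be worth checking which data rows have no polygon in the base map. The sketch below is our own addition, assuming the maps package is installed so that map_data works; the name unmatched is hypothetical.

```r
library( ggplot2 )

# Base map polygons (map_data needs the maps package) and the
# population values keyed by lowercase state name
states <- map_data( "state" )
choropleth <- data.frame( ID=tolower( rownames( state.x77 ) ), pop=state.x77[ , 1 ] )

# IDs with no matching region in the base map will not be drawn
unmatched <- setdiff( choropleth$ID, unique( states$region ) )
print( unmatched )
```

Any name listed in unmatched has data but no polygon in this base map, so it will not appear in the choropleth.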

Dot Maps

Other types of maps, like dot maps, can also be generated using ggplot. As with the choropleth map, we begin with a base map, then add points to it, in this case using the geom_point command. The example uses two CSV files, cities-coords.csv and cities-data.csv. You'll need to download these files, and use setwd() to change RStudio's working directory to the directory containing the files.

% library( ggplot2 )
% states <- map_data( "state" )
% coords <- read.csv( url( "http://www.csc2.ncsu.edu/faculty/healey/msa/shiny/cities-coords.csv" ), header=TRUE, sep="," )
% data <- read.csv( url( "http://www.csc2.ncsu.edu/faculty/healey/msa/shiny/cities-data.csv" ), header=TRUE, sep="," )
% data <- data[ data$City != "Houston" & data$Year == 2012, ]
% points <- merge( coords, data, by.x = c( "City", "State" ), by.y = c( "City", "State" ) )
% points$Size <- pmax( points$Population / 500000.0, rep( 5.0, 6 ) )
% map <- ggplot() +
%   geom_map( data=states, map=states, aes( map_id=region ), fill="white", colour="black" ) +
%   expand_limits( x=states$long, y=states$lat )
% map <- map + geom_point( data=points, aes( x=Longitude, y=Latitude ), colour="blue", size=points$Size, alpha=0.7 )
% map <- map + coord_map( "albers", lat0=29.5, lat1=49.5 )
% print( map )

In this example, we've actually created a proportional dot map, where the size of each dot represents the population of its corresponding city. The pmax command is used to ensure a minimum dot size of 5.0.

R + Shiny

Shiny is a package developed by RStudio that allows the creation of web-based, interactive dashboards based on R graphics and jQuery user interface (UI) widgets. This provides a way to create web-based dashboards that allow users to interactively explore an underlying dataset.

A Shiny application is typically made up of two separate R files: ui.R, which defines the layout of the dashboard and the UI widgets it contains, and server.R, which responds when a user interacts with the UI, reading new interface values and generating new visualizations based on those values.

As an example, here is a simple application that allows a user to choose a bin width, then plots the number of chickens from the chickwts dataset that have a weight within the range of each bin.

% library( shiny )
% shinyUI( fluidPage(  # Define UI for histogram application
%   titlePanel( "Distribution Histogram" ),  # App title
%   sidebarLayout(  # Sidebar w/slider input for bin width
%     sidebarPanel(
%       sliderInput(
%         "bins",
%         "Bin Width:",
%         min = 5,
%         max = 50,
%         value = 20 )
%     ),
%     mainPanel(  # Plot generated distribution
%       plotOutput( "distPlot" )
%     )
%   )
% ) )

% library( ggplot2 )
% library( shiny )
% shinyServer( function( input, output ) {  # Server logic for histogram
%   output$distPlot <- renderPlot( {
%     ggplot( data=chickwts, aes( x=weight ) ) +
%       geom_histogram( binwidth=input$bins, colour="white", fill="lightblue", alpha=0.7 ) +
%       scale_y_continuous( breaks=seq( 0, length( chickwts$weight ), by=2 ) ) +
%       ggtitle( "Chicken Weight Distribution" )
%   } )
% } )

To run a Shiny application, place ui.R and server.R in a common directory. We'll assume the directory is called shiny. Next, ensure RStudio's current working directory is the parent directory that holds the shiny subdirectory. You can check the working directory with the command getwd() and set the working directory with the command setwd(). Once RStudio is in the proper directory, run the Shiny app with the runApp command.
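Assuming the directory layout just described, launching the app might look like the sketch below. The interactive() and dir.exists() guards and the app_dir variable are our additions, included so the snippet does nothing harmful when run from a script or from the wrong directory.

```r
library( shiny )

# "shiny" is the directory holding ui.R and server.R, relative to
# the current working directory
app_dir <- "shiny"

# runApp() blocks until the app is stopped, so only launch it from an
# interactive session and only if the directory actually exists
if ( interactive() && dir.exists( app_dir ) ) {
  runApp( app_dir )
}
```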

This will create a new web browser window within RStudio and run the Shiny app in that window.

UI and Server Code

Let's look at the ui.R and server.R in more detail. As discussed previously, the UI code builds the user interface, which includes interactive widgets and output (visualizations) displayed based on the current value of the widgets. The server code is responsible for reacting to changes in the UI widget values, generating updated visualizations based on those changed values, and pushing the results back to the UI side to be displayed.

The server code is shorter and simpler, since its only job is to receive the current value of the sliderInput widget variable bins, create a histogram of chicken weights based on the value of bins, then assign that histogram to the output variable distPlot.

Reactivity

The UI and server code communicate with one another through reactivity, Shiny's terminology for an approach similar to callbacks in other languages. To start, we'll discuss reactive sources and reactive endpoints.

Reactive Sources and Endpoints

In simple terms, a reactive source is normally a variable attached to a user interface widget in the UI source code. For example, in our ui.R example, the variable bins is a reactive source, since it is attached to a slider and needs the dashboard to "react" whenever its value changes.

On the other hand, the variable distPlot is a reactive endpoint, since code in server.R assigns a histogram to distPlot based on the reactive source bins. The histogram is then displayed in the dashboard's mainPanel.

From this example, we can see that reactive source and endpoint variables are normally defined in the UI code, and responses to changes in reactive sources are normally managed in the server code, with a typical response being to update the value of a reactive endpoint. You can see this exact type of processing happening in our example program.

It is now clear what input and output represent in the function defined within shinyServer. input contains (among other things) values for all of the reactive sources in ui.R. In particular, it contains the value of bins, which is accessed in typical R fashion as input$bins.

Similarly, output contains values for the reactive endpoints in ui.R. In our example, this includes distPlot, which needs to be updated whenever bins changes. Shiny recognizes that the renderPlot expression assigned to output$distPlot in the shinyServer function accesses input$bins (to define the histogram's binwidth), so Shiny automatically re-runs that expression whenever the value of the reactive source bins changes. The updated plot is then sent to the browser, where it replaces the contents of the plotOutput named distPlot.

A bins reactive source acting as a dependent to a distPlot reactive endpoint

Shiny represents the relationships between reactive sources and endpoints as shown in the diagram above. We would describe this as reactive sources having one or more dependents (in our example, bins has one dependent distPlot), and reactive endpoints being dependent on one or more reactive sources (in our example, distPlot depends on bins).

Reactive Conductors

A final type of component used in Shiny is a reactive conductor. The purpose of a conductor is to encapsulate a computation that depends on reactive source(s). The result returned by a reactive conductor is normally used by a reactive endpoint. Since the reactive conductor caches its return value, if the value is slow to compute, or if it is going to be used by numerous reactive endpoints, the conductor improves the overall efficiency of the Shiny program. Reactive conductors are also useful for performing longer computations that you might not want to embed directly in code used to assign a value to a reactive endpoint.
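The caching behaviour can be observed outside a running app. The sketch below is our own illustration, not part of the dashboard: reactiveVal stands in for input$bins, isolate is used to read reactive values from ordinary R code, and the names bins, n_runs, and bin_count are hypothetical.

```r
library( shiny )

# Stand-in for the reactive source input$bins
bins <- reactiveVal( 20 )

# Counts how many times the conductor's body actually executes
n_runs <- 0

# Reactive conductor: the number of bins needed to span the weight
# range at the current bin width
bin_count <- reactive( {
  n_runs <<- n_runs + 1
  rng <- range( chickwts$weight )
  ceiling( diff( rng ) / bins() )
} )

first <- isolate( bin_count() )   # body runs and the result is cached
second <- isolate( bin_count() )  # cached value is reused, body does not run

bins( 10 )                        # changing the source invalidates the cache
third <- isolate( bin_count() )   # body runs again with the new bin width

print( c( first=first, second=second, third=third, runs=n_runs ) )
```

Two endpoints reading bin_count() would therefore share one computation per change in bins, which is the efficiency gain a conductor is meant to provide.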

Consider the following modifications to ui.R and server.R, to allow us to print some text information about the histogram in our dashboard.

% library( ggplot2 )
% library( shiny )
% shinyServer( function( input, output ) {  # Server logic for histogram
%   g <- reactive( {
%     ggplot( data=chickwts, aes( x=weight ) ) +
%       geom_histogram( binwidth=input$bins, color="white", fill="lightblue", alpha=0.7 ) +
%       scale_y_continuous( breaks=seq( 0, length( chickwts$weight ), by=2 ) ) +
%       ggtitle( "Chicken Weight Distribution" )
%   } )
%   output$distPlot <- renderPlot( {
%     g()
%   } )
%   output$distInfo <- renderUI( {
%     hist_data <- ggplot_build( g() )$data[[ 1 ]]
%     min <- hist_data$xmin
%     max <- hist_data$xmax
%     rng <- range( chickwts$weight )
%     bin_s <- paste( "Weight Range: [", rng[ 1 ], ",", rng[ 2 ], "] " )
%     bin_s <- paste( bin_s, "Number of Bins: ", length( min ), " " )
%     bin_s <- paste( bin_s, "Left Boundary: ", min[ 1 ], " " )
%     bin_s <- paste( bin_s, "Right Boundary: ", max[ length( max ) ], " " )
%     HTML( bin_s )
%   } )
% } )

This code makes the following changes to the original chicken weights histogram dashboard.
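Because the server code assigns to output$distInfo with renderUI, the UI needs a matching output slot. A minimal sketch of the modified UI follows. It assumes the slot is a uiOutput placed below the plot in mainPanel, and it assigns the page to a variable named ui for illustration; in ui.R the same fluidPage call would be wrapped in shinyUI() as before.

```r
library( shiny )

# Sketch of the modified UI: the same slider and plot as before, plus a
# uiOutput slot that receives the HTML built by output$distInfo
ui <- fluidPage(
  titlePanel( "Distribution Histogram" ),
  sidebarLayout(
    sidebarPanel(
      sliderInput( "bins", "Bin Width:", min=5, max=50, value=20 )
    ),
    mainPanel(
      plotOutput( "distPlot" ),
      uiOutput( "distInfo" )  # placement below the plot is an assumption
    )
  )
)
```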

The Shiny chicken weights dashboard with a uiOutput object to display text describing specific properties of the dashboard

Finally, here is a Shiny dashboard that displays information from the built in iris dataset. It allows a user to choose which samples from the three iris species to visualize as boxplots using checkboxes. It allows the user to display outliers only, or all points using radio buttons. Finally, it allows the user to set the IQR range to define outliers using a slider.

As in the example above, a reactive function is used to build a reactive conductor df() that contains the subset of samples to plot (based on which species the user chooses to visualize). This allows the Points and Outlier IQR inputs to change, without needing to re-subset the original dataset.

% library( shiny )
% shinyUI( fluidPage(
%   titlePanel( "Sepal Width Boxplot" ),
%   sidebarLayout(
%     sidebarPanel(
%       checkboxGroupInput(
%         "species",
%         "Species:",
%         c( "Setosa" = "setosa",
%            "Versicolor" = "versicolor",
%            "Virginica" = "virginica"
%         ),
%         inline = TRUE,
%         selected = "setosa"
%       ),
%       radioButtons(
%         "points",
%         "Points:",
%         c( "Outlier" = "outlier", "All" = "all" ),
%         inline = TRUE
%       ),
%       sliderInput(
%         "iqr",
%         "Outlier IQR:",
%         min = 0.5,
%         max = 3.0,
%         step = 0.25,
%         value = 1.5
%       )
%     ),
%     mainPanel(
%       plotOutput( "distPlot" )
%     )
%   )
% ) )

% library( ggplot2 )
% library( shiny )
% shinyServer( function( input, output ) {
%   df <- reactive( {
%     # %in% keeps rows for every checked species; == would recycle the
%     # selection vector and drop rows when several boxes are checked
%     return( subset( iris, Species %in% input$species ) )
%   } )
%   output$distPlot <- renderPlot( {
%     if ( input$points == "outlier" ) {
%       ggplot( data=df(), aes( x=Species, y=Sepal.Width ) ) +
%         geom_boxplot( color="black", fill="palegreen", outlier.color="red", outlier.shape=1, coef=input$iqr ) +
%         ggtitle( "Sepal Width Boxplot" )
%     } else {
%       ggplot( data=df(), aes( x=Species, y=Sepal.Width ) ) +
%         geom_boxplot( lwd = 1, color="black", fill="palegreen", coef=input$iqr ) +
%         geom_dotplot( aes( fill=Species ), binaxis="y", stackdir="center", method="histodot", binwidth=0.1, dotsize=0.75 ) +
%         ggtitle( "Sepal Width Boxplot" ) +
%         guides( fill="none" )
%     }
%   } )
% } )

Boxplot visualizations of the iris dataset, with controls to define which species to visualize, whether to visualize outliers only or all points, and how to define an outlier based on the IQR

Publishing

There are numerous ways to publish your applications so others can use them. One simple, built in method is to deploy your application on Shiny's application cloud, located at https://www.shinyapps.io. Another is to embed your Shiny UI and server together in a single R file, then send that file to other users.

shinyapps.io

Shiny applications can be published on RStudio's cloud application server. You may have noticed a "Publish" button in the upper-right corner of your Shiny application window. If you've authorized a shinyapps.io account in RStudio, clicking this button generates a dialog that allows you to choose a directory with Shiny code to deploy, and an account to use to host the application.

To set up your shinyapps.io account, visit shinyapps.io and choose "Sign Up" to create an account to host your Shiny applications. You will be asked to enter an email address and a password for your new account, then asked to choose a name for your account. Once this is done, instructions will be provided to set up RStudio to publish applications. This involves three steps.
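As a rough sketch, and assuming the rsconnect package (which RStudio uses to deploy to shinyapps.io), the setup usually comes down to installing rsconnect, authorizing the account with a token and secret copied from the shinyapps.io dashboard, and deploying the application directory. The helper publish_app and every argument value shown are hypothetical placeholders, and the instructions shown on shinyapps.io should take precedence over this outline.

```r
# Hypothetical helper wrapping the three steps; nothing is installed,
# authorized, or uploaded until it is called with real credentials
publish_app <- function( app_dir, account, token, secret ) {
  # Step 1: make sure the rsconnect package is installed
  if ( !requireNamespace( "rsconnect", quietly=TRUE ) ) {
    install.packages( "rsconnect", repos="https://cloud.r-project.org" )
  }

  # Step 2: authorize this machine to publish to the account
  rsconnect::setAccountInfo( name=account, token=token, secret=secret )

  # Step 3: upload the directory containing ui.R and server.R
  rsconnect::deployApp( appDir=app_dir, account=account )
}

# Example call (placeholders, not real credentials):
# publish_app( "shiny", "msa", "<TOKEN>", "<SECRET>" )
```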

Once your Shiny app is deployed, it will be available at a specific URL, so anyone with a web browser will be able to run it. The URL uses your shinyapps.io account name account-name and the name of the application directory app-directory you used to upload your application, specifically: https://account-name.shinyapps.io/app-directory

So, for example, if my account name was msa and I uploaded an application in a directory called shiny, users could access that application at the URL https://msa.shinyapps.io/shiny.

Self-Contained Shiny Apps

Another option is to embed Shiny UI and server code directly in a single R file, then share the file with other R users. The code below shows an example of embedding our original Shiny histogram application as a single R file.

% load <- function() {
%   if ( require( "ggplot2" ) != TRUE ) {
%     print( "Required library 'ggplot2' could not be loaded" )
%     return( FALSE )
%   } else if ( require( "shiny" ) != TRUE ) {
%     print( "Required library 'shiny' could not be loaded" )
%     return( FALSE )
%   } else {
%     return( TRUE )
%   }
% }
%
% if ( load() == TRUE ) {
%   hist_ui <- fluidPage(
%     titlePanel( "Distribution Histogram" ),
%     sidebarLayout(
%       sidebarPanel(
%         sliderInput(
%           "bins",
%           "Bin Width:",
%           min = 5,
%           max = 50,
%           value = 20 )
%       ),
%       mainPanel(
%         plotOutput( "distPlot" )
%       )
%     )
%   )
%
%   hist_server <- function( input, output ) {
%     output$distPlot <- renderPlot( {
%       ggplot( data=chickwts, aes( x=weight ) ) +
%         geom_histogram( binwidth=input$bins, colour="white", fill="lightblue", alpha=0.7 ) +
%         scale_y_continuous( breaks=seq( 0, length( chickwts$weight ), by=2 ) ) +
%         ggtitle( "Chicken Weight Distribution" )
%     } )
%   }
%
%   shinyApp( ui = hist_ui, server = hist_server )
% }

Here, we create two variables hist_ui and hist_server, each containing the contents of the original ui.R and server.R. We then run the application using the shinyApp() command, passing ui = hist_ui and server = hist_server as arguments to define the UI and server components of the application.

In a self-contained program like this, we also need to ensure the required libraries ggplot2 and shiny are available when a given user tries to run the application. The load() function is built to do this. It uses require() to attempt to load each library in turn, printing a message and returning FALSE if either library is not available, and returning TRUE only if both are loaded. The mainline of the program begins by calling load(), and only runs the body of the program to create and execute the Shiny app if load() returns TRUE (that is, if both the ggplot2 and shiny libraries are available).
