Database using Dplyr, DBI, and POOL

In this section, we will learn to use the dplyr package to access data from database sources. We will also see how to hook up to an external database using the DBI package. The Pool package is also an important topic to manage connections and prevent leaks to manage performance.

  • dplyr: A popular data-manipulation package for internal and external databases. It internally works as SQL. It provides a variety of functions for data manipulation:
    • filter()
    • select()
    • arrange()
    • rename()
    • distinct()
    • mutate()
    • transmute()
    • summarise()
    • sample_n()
    • sample_frac()

Let's see an example using some of these functions with the iris dataset:

library(dplyr) 
iris %>% filter(Sepal.Length>4 &Sepal.Length<5) 

In the preceding code, the filter function has been used to filter the rows of the iris dataset, which has values between 4 and 5. We can also select only certain columns, using select():

iris %>% select(Sepal.Length) 

For selecting distinct values, we can use distinct():

iris %>% distinct() 

We can also use these functions in combination. The code shown here is first filtering the data and then applying select():

iris %>% filter(Sepal.Length>4 &Sepal.Length<5) %>% select(Sepal.Length) 

In this way, we can get ride of using R and SQL separately using dplyr.

  • DBI: A common interface between R and DBMS. It also provides a variety of functions for supporting the following functions:
    • Connecting and disconnecting to the DBMS
    • Creating statements and executing in the DBMS
    • Extracting results from statements
    • Exception-handling
    • Transaction management

dbGetQuery() is a convenient way of executing queries with a connection object. It takes two arguments importantly connection to database and query:

dbGetQuery(conn, "SELECT * FROM City LIMIT 5;") 
  • Pool: With the DBI package, there are some problems with connection-management and performance. We have to create and destroy connections as and when necessary. Otherwise, there will be an accumulation of leaked connections. This leaked connection holds resources that must have been freed. Because of this app, the performance goes down. For dealing with such problems, pool package is available. It provides an extra layer of abstraction while establishing a connection. Using this package, we can create an object with a reference to the database. This object is called pool. This makes us avoid directly fetching the database. Also, this pool object can hold a number of connections to the database. Whenever we are querying the database, we are actually querying to the pool that holds the running and waiting connections.

The pool can provide us with idle connections that were previously fetched or with a new one. Once the connection is ended, the garbage-collection process makes free all resources held by the connection. So, in the context of the Shiny app, we don't have to worry about ending the connection.

The pool package is available on GitHub and can be downloaded as follows:

devtools::install_github("rstudio/pool") 

Let's see the skeleton of a Shiny app with pool:

library(shiny) 
library(DBI) 
library(pool) 
pool<- dbPool( 
drv = RMySQL::MySQL(), 
dbname = "Database_Name", 
host = "Host_link", 
username = "username", 
password = "password" 
) 
ui<- fluidPage( 
 #-------- Ui stuff---------------------- 
) 
server<- function(input, output, session) { 
#----Server Stuff------------------- 
} 
shinyApp(ui, server) 

With sqlInterpolate(), we can create a query that can be the input to dbGetQuery(). In sqlInterpolate, we can see three parameters in the function: pool, sql, and id. pool is the object and sql is the SQL query. ID can be any input. dbGetQuery() will have two parameters. The pool object and query are created using interpolate:

query<- sqlInterpolate(pool, sql, id = input$ID) 
dbGetQuery(pool, query) 

This app skeleton is for a single-file app. For a multi-file app, it can be put on the top of the server and ui file or in a global file.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset