Google's BigQuery (https://cloud.google.com/bigquery/) is a service provided by Google as part of Google Cloud Platform as a serverless enterprise data warehouse. This can be deployed in minutes and tested using a free tier. They have a data sample that we can use. Perform the following steps to set up an account:
- Set up the data lake access using https://cloud.google.com/bigquery/ and click on TRY BIGQUERY FREE. From here, you will be redirected to your Google account login form:
Once you are redirected to Google Cloud Platform (https://console.cloud.google.com), which is the Google Software as a Service (SaaS) ecosystem home, we can see where the BigQuery console resides:
- To the left side of our screen, find the BIG DATA section, under which we can see the BigQuery section. Click on BigQuery:
- Open https://console.cloud.google.com/bigquery?p=bigquery-public-data&d=new_york_citibike, which redirects you to a screen that should look like this:
- On the left-hand side, click in new_york_citibike, expanding the view so you can see citibike_trips. Click on citibike_trips:
- Press the QUERY TABLE option:
- A template of SQL query will be inserted into editor and is similar to the following query:
SELECT FROM 'bigquery-public-data.new_york_citibike.citibike_trips' LIMIT 1000
Exchange the template script with the following script (you can find it at https://github.com/PacktPublishing/Hands-On-Business-Intelligence-with-Qlik-Sense/blob/master/Chapter08/01%20-%20First.sql). This script below will retrieve data aggregated for every month with start and end stations, number of trips and total duration of the trips. The resulting data will be the base for our Qlik Sense summarized app:
SELECT
TIMESTAMP_TRUNC(starttime, month) AS trip_month
,start_station_name
,end_station_name
,sum(tripduration) AS tripduration
,count(starttime) AS trips
FROM 'bigquery-public-data.new_york.citibike_trips'
GROUP BY
TIMESTAMP_TRUNC(starttime, month)
,start_station_name
,end_station_name
limit 100