Index
Symbols
- ! (exclamation mark), lines in Jupyter Notebook preceded by, Notebooks on Google Cloud Platform
- ! (logical negation) operator, Comparisons
- != (not-equals) comparison operator, Comparisons
- # (pound sign), comments beginning with, Retrieving Rows by Using SELECT
- $ (dollar sign), end of string matching in regular expressions, Regular Expressions
- % (percent sign)
- %%bigquery Magics (see Jupyter)
- & (bitwise AND) operator, Numeric Types and Functions, Comparisons
- () (parentheses)
- , (comma)
- - (hyphen), escaping in dataset name, Retrieving Rows by Using SELECT
- -- (double dash), comments beginning with, Retrieving Rows by Using SELECT
- ; (semicolon) separating statements in a script, A sequence of statements
- <, <=, >, >=, and != (or <>) comparison operators, Comparisons
- << (bitwise left-shift) operator, Numeric Types and Functions
- <> (not-equals) comparison operator, Comparisons
- >> (bitwise right-shift) operator, Numeric Types and Functions
- ? (question mark), in positional parameters, Positional parameters
- ?: (non-capturing group) in regular expressions, Regular Expressions
- @ (at symbol), marking named parameters, Named parameters
- @run_date parameter, Named timestamp parameters
- @run_time parameter, Named timestamp parameters
- [] (square brackets), array operator, A Brief Primer on Arrays and Structs
- \d matching digits in regular expressions, Regular Expressions
- \s matching spaces in regular expressions, Regular Expressions
- `` (backticks), escape character in dataset name, Retrieving Rows by Using SELECT
- | (bitwise OR) operator, Numeric Types and Functions, Comparisons
- ^ (caret), beginning of string matching in regular expressions, Regular Expressions
A
- access control
- access tokens, Table manipulation
- Access Transparency program, Access transparency
- ACID operations with BigQuery, Managed Storage
- admin role, Predefined roles
- administering BigQuery, Administering and Securing BigQuery, Administering BigQuery-Stackdriver monitoring and audit logging
- authorizing users, Authorizing Users
- availability, disaster recovery, and encryption, Availability, Disaster Recovery, and Encryption-Customer-Managed Encryption Keys
- continuous integration/continuous deployment, Continuous Integration/Continuous Deployment-Cost/Billing Exports
- cost/billing exports, Cost/Billing Exports-Dashboards, Monitoring, and Audit Logging
- dashboards, monitoring, and audit logging, Dashboards, Monitoring, and Audit Logging-Stackdriver monitoring and audit logging
- job management, Job Management
- regulatory compliance, Regulatory Compliance-Data Exfiltration Protection
- restoring deleted records and tables, Restoring Deleted Records and Tables
- Advanced Encryption Standard (AES-256), Infrastructure Security
- advanced queries (see queries)
- aggregates, Aggregates-A Brief Primer on Arrays and Structs
- aggregation functions, Numeric Types and Functions
- aggregations
- AI (artificial intelligence)
- aliasing
- allAuthenticatedUsers, access for, Identity
- Alpega Group, use of BigQuery, Data Processing Architectures
- ALTER TABLE SET OPTIONS statement, Data Management (DDL and DML), Labels and tags, Changing options
- analytic functions, Numeric Types and Functions, Window Functions
- (see also window functions)
- analytic window functions, Window Functions
- (see also window functions)
- analytics
- AND condition, combining categorical features into, Human insights and auxiliary data
- AND keyword, Filtering with WHERE
- ANY_VALUE, GIS Measures
- Apache Beam, Writing a Dataflow job, Using the Streaming API directly, Cloud Dataflow
- Apache Hive, loading and querying Hive partitions, Loading and querying Hive partitions
- Apache Spark, MapReduce Framework
- API gateway infrastructure, secured global, Infrastructure Security
- APIs
- application-default credentials, Table manipulation
- APPROX_* functions
- Apps Scripts client library, Incorporating BigQuery Data into Google Slides (in G Suite)
- architecture of BigQuery, Architecture of BigQuery-Summary
- arithmetic operations
- ARRAY type, Creating Arrays by Using ARRAY_AGG, Data Types, Functions, and Operators, Summary
- arrays, A Brief Primer on Arrays and Structs-Joining Tables
- adding entry using DML UPDATE, Updating row values
- ambiguities in Standard SQL, Advanced SQL
- ARRAY of STRUCT, Array of STRUCT
- ARRAY of tuples or anonymous struct, TUPLE
- array parameters, Array and struct parameters
- BigQuery support for, Powerful Analytics
- converting to structs for hybrid recommendation model, Training hybrid recommendation model
- creating using ARRAY type and ARRAY_AGG function, Creating Arrays by Using ARRAY_AGG
- experimenting with, A Brief Primer on Arrays and Structs
- finding length of and retrieving individual items, Working with Arrays
- in a script, Anatomy of a simple script
- NULL elements in, Creating Arrays by Using ARRAY_AGG
- storing data as arrays of structs, Storing data as arrays of structs-Storing data as arrays of structs
- string representations, Internationalization
- unnesting, UNNEST an Array
- working with, in advanced SQL, Working with Arrays-Window Functions
- ARRAY_AGG function, Creating Arrays by Using ARRAY_AGG, Numbering functions
- ARRAY_CONCAT function, Array functions
- ARRAY_LENGTH function, Working with Arrays, Using arrays for generating data
- ARRAY_TO_STRING function, Array functions
- artificial intelligence (see AI; machine learning)
- AS statement, aliasing column names with, Aliasing Column Names with AS-Filtering with WHERE
- audit logging, Stackdriver monitoring and audit logging
- authorization tokens, Step 1: HTTP POST
- authorized views, Authorized views
- authorizing users, Authorizing Users
- AUTO partitioning mode, Loading and querying Hive partitions
- AutoML, Bulk reads using BigQuery Storage API
- auxiliary data for regression model, Human insights and auxiliary data
- availability, Availability, Disaster Recovery, and Encryption-Regional failures
- availability zones, Storage Data
- averages, computing, Named timestamp parameters, Aggregate analytic functions
- AVG function, Computing Aggregates by Using GROUP BY, Creating Arrays by Using ARRAY_AGG, Aggregate analytic functions
- Avro files, ETL, EL, and ELT
B
- backups, Durability, Backups, and Disaster Recovery
- bag of words, Unstructured data
- balancing classes in machine learning, Balancing Classes
- bandwidth, dynamic provisioning with BigQuery networking infrastructure, Storage and Networking Infrastructure
- bash
- batch data, ingest of, support by BigQuery, Powerful Analytics
- BATCH job priority, Batch Queries
- batch queries, Batch Queries
- Beam (see Apache Beam)
- BI Engine, accelerating queries with, Accelerating queries with BI Engine
- BigQuery
- bigrquery library from CRAN, Working with BigQuery from R
- BigQuery Mate, Estimating per-query cost
- .bigqueryrc file, Executing Queries
- BigQueryReader, TensorFlow’s BigQueryReader
- binary classification problems, Classification, Summary of model types
- bitwise shift operations (<< and >>), Numeric Types and Functions
- BOOL type, Data Types, Functions, and Operators, Summary
- Boolean expressions
- Booleans, Working with BOOL-String Functions
- boosted decision trees, Gradient-boosted trees
- boosted_tree_classifier model type, Training
- boosted_tree_regressor model type, Gradient-boosted trees
- Borg container management system, Worker Shard
- bq command-line tool, Loading from a Local Source, Bash Scripting with BigQuery
- adding a label to a dataset, Labels and tags
- --batch flag, Batch Queries
- bq extract command, Extracting data
- bq load command, Loading and inserting data
- bq query command, Executing Queries
- bq wait, Copying datasets
- checking if a dataset exists with bq ls, Checking whether a dataset exists
- copying datasets using bq cp, Copying datasets
- copying tables using bq cp, Data Management (DDL and DML)
- creating a dataset in a different project with bq mk, Creating a dataset in a different project
- creating a table with bq mk --table, Creating a table
- creating a transfer job, Create a transfer job
- creating datasets using bq mk and specifying the location, Creating Datasets and Tables
- creating table definition using bq mkdef, How to Use Federated Queries
- deleting a table or view as a whole, Data Management (DDL and DML)
- --dry_run option, Estimating per-query cost
- examining information from query statistics, Scan-filter-count query
- initiating cross-region dataset copy via bq mk, Cross-region dataset copy
- listing BigQuery objects with bq ls, BigQuery Objects
- making external table definition with bq mk, How to Use Federated Queries
- previewing a table using bq head, Previewing data
- showing details of BigQuery objects with bq show, Showing details
- specifying Hive partition mode to bq load, Loading and querying Hive partitions
- SQL dialect used by, Executing Queries
- updating details of tables, datasets, and other objects with bq update, Updating
- using wildcards in file path for bq mkdef and bq load, Wildcards
- BREAK statement, Looping
- broadcast JOIN query, Broadcast JOIN query-Broadcast JOIN query
- broadcast joins, Broadcast JOIN query
- bucketizing variables, Human insights and auxiliary data
- bulk reads, using BigQuery Storage API, Bulk reads using BigQuery Storage API
- business intelligence (BI) tools, using on data held in BigQuery, Powerful Analytics
- BY HASH directive, Stage 0
- BYTES type, Data Types, Functions, and Operators, Summary
- BYTE_LENGTH function, Internationalization
C
- caching
- calendar, extracting parts from timestamps, Extracting Calendar Parts
- cannibalization, What’s Being Clustered?
- Capacitor, Storage Data, Storage format: Capacitor-Storage format: Capacitor
- cardinality, Storage format: Capacitor
- casting
- cast as bytes, Internationalization
- DATETIME to TIMESTAMP, Date, Time, and DateTime
- of Booleans, using COUNTIF to avoid, Using COUNTIF to Avoid Casting Booleans
- of strings to FLOAT64, Loading from a Local Source
- requiring explicit use of CAST function, Casting and Coercion
- string as INT64 or FLOAT64 to parse it, using CAST function, Printing and Parsing
- categorical_weights, Examining Model Weights
- centroid of an aggregate of geometries, Geometry transformations and aggregations
- charts, Saving query results to pandas
- CHAR_LENGTH function, Internationalization
- classification, Classification
- client API functions and SQL alternatives, Table manipulation
- client libraries, Summary
- Cloud AI Platform (CAIP), Deep Neural Networks
- Cloud Bigtable, SQL Queries on Data in Cloud Bigtable-Improving performance
- Cloud Catalog, Integration with Google Cloud Platform
- Cloud Client API (Python), Parameterized Queries
- Cloud Composer, Integration with Google Cloud Platform, Loading from a Local Source
- Cloud Console
- Cloud Data Labeling Service, Clustering
- Cloud Dataflow, Bulk reads using BigQuery Storage API
- Cloud Dataproc, Integration with Google Cloud Platform, Bulk reads using BigQuery Storage API
- Cloud Functions, Integration with Google Cloud Platform, Loading from a Local Source
- Cloud Natural Language, Unstructured data
- Cloud Pub/Sub, Minimizing Network Overhead
- using for streaming inserts into BigQuery, File Loads
- Cloud Scheduler, Integration with Google Cloud Platform
- Cloud Shell
- Cloud Vision API, Unstructured data
- clustering, Clustering
- clustering (in machine learning), Clustering
- clustering ratio, Reclustering
- COALESCE function, using to evaluate expressions until non-NULL value is obtained, Cleaner NULL-Handling with COALESCE
- coalesce stage, Broadcast JOIN query
- coercion, Casting and Coercion
- Coldline Storage, Setting up life cycle management on staging buckets
- Colossus File System, Storage and Networking Infrastructure, Step 5: Returning the query results, Storage Data
- column stores, How BigQuery Came About
- column-oriented stores, Storage format: Capacitor, Clustering
- columnar files, Loading Data Efficiently
- columnar storage formats
- comma cross joins, CROSS JOIN
- comments, lines beginning with -- or #, Retrieving Rows by Using SELECT
- committed state (storage sets), Storage sets
- community-developed, open source UDFs, Public UDFs
- comparisons
- compliance, Security and Compliance
- (see also regulatory compliance)
- compression of files, Loading from a Local Source
- computation, moving to the data, How BigQuery Came About
- compute
- compute_fit method, Cloud Dataflow
- CONCAT function, String Functions, String Manipulation Functions, Building queries dynamically
- concatenation of arrays, Array functions
- conda environment for Jupyter, Working with BigQuery from R
- conditional expressions, Conditional Expressions
- constants, defining, Defining constants
- container management system (Borg), Worker Shard
- CONTINUE statement, Looping
- continuous integration/continuous deployment (CI/CD), Continuous Integration/Continuous Deployment-Cost/Billing Exports
- correlated CROSS JOINs, Using arrays for generating data
- correlated subqueries, Correlated subquery
- correlation coefficients, Correlation
- correlation, functions for, Correlation
- costs
- COUNT function, counting records with, Counting Records by Using COUNT
- COUNTIF function, using to avoid casting Booleans, Using COUNTIF to Avoid Casting Booleans
- COUNT_STAR operator, Stage 0
- CRAN, bigrquery library from, Working with BigQuery from R
- CREATE FUNCTION IF NOT EXISTS, Persistent UDFs
- CREATE FUNCTION statements, Persistent UDFs
- CREATE IF NOT EXISTS statement, Setting up destination table
- CREATE MODEL statement, Training and Evaluating the Model
- CREATE OR REPLACE FUNCTION, Persistent UDFs
- CREATE OR REPLACE PROCEDURE statement, Stored procedures
- CREATE OR REPLACE TABLE statement, Setting up destination table
- CREATE TABLE AS SELECT statement, Data Management (DDL and DML), Step 5: Returning the query results
- CREATE TABLE statement, Copying into a New Table, Setting up destination table
- CreateDisposition and WriteDisposition, controlling load of pandas DataFrame, Loading a pandas DataFrame
- CROSS JOIN statement, CROSS JOIN, Using arrays for generating data
- cross-entropy loss measure in classification, Evaluation
- cross-region dataset copies, Cross-region dataset copy
- cross-selling of product groups, improving, What’s Being Clustered?
- CRUD operations
- crypto-shredding, Crypto-shredding
- cryptography
- CSV files, ETL, EL, and ELT, Loading from a Local Source
- curl utility
- CURRENT_TIMESTAMP function, Query History and Caching, Parsing and Formatting Timestamps
- custom roles, Custom roles
- customer information, security of, Infrastructure Security
- customer segmentation, What’s Being Clustered?
- customer targeting, Summary of model types, What’s Being Clustered?, Customer targeting
- Customer-Managed Encryption Keys (CMEK), Customer-Managed Encryption Keys, CMEK
D
- dashboards, tables accessed from, using BI Engine with, Accelerating queries with BI Engine
- data
- Data Catalog, searching for tables with specific label, Creating a table
- Data Definition Language (DDL), DDL-DML
- data exfiltration protection, Data Exfiltration Protection
- data locality, Data Locality
- data loss prevention, Data Loss Prevention-Data Loss Prevention
- data management (DDL and DML), Data Management (DDL and DML)-Data Management (DDL and DML)
- Data Manipulation Language (DML), DML, Caching the Results of Previous Queries, DML-MERGE statement
- BigQuery and very-high-frequency DML updates, DML
- deleting rows with DELETE WHERE, Deleting rows
- INSERT SELECT, Insert SELECT
- INSERT VALUES, Insert VALUES
- INSERT VALUES with subquery SELECT, Insert VALUES with subquery SELECT
- MERGE statement, MERGE statement
- removing all transactions related to a single individual, DML
- statements, Data Management (DDL and DML)
- statements forcing a recluster, Reclustering
- support by BigQuery, How BigQuery Came About
- updating row values, Updating row values
- data marketplace, How BigQuery Came About
- data processing architectures, Data Processing Architectures-BigQuery: A Serverless, Distributed SQL Engine
- data science tools, accessing BigQuery from, Accessing BigQuery from Data Science Tools-Incorporating BigQuery Data into Google Slides (in G Suite)
- Cloud Dataflow, Cloud Dataflow-Cloud Dataflow
- incorporating BigQuery data into Google Slides, Incorporating BigQuery Data into Google Slides (in G Suite)-Incorporating BigQuery Data into Google Slides (in G Suite)
- JDBC/ODBC drivers, JDBC/ODBC drivers
- notebooks on Google Cloud Platform, Notebooks on Google Cloud Platform-Working with BigQuery, pandas, and Jupyter
- working with BigQuery from R, Working with BigQuery from R-Cloud Dataflow
- working with BigQuery, pandas, and Jupyter, Working with BigQuery, pandas, and Jupyter-Working with BigQuery, pandas, and Jupyter
- Data Sheets (BigQuery), Exploring BigQuery tables as a data sheet in Google Sheets
- data skew, Data skew
- data split, controlling in BigQuery ML, Controlling Data Split
- Data Studio, Integration with Google Cloud Platform
- Data Transfer Service (BigQuery), Data Transfer Service-Cross-region dataset copy, Data Migration Methods
- data types, Data Types, Functions, and Operators-Summary
- Booleans, working with, Working with BOOL-String Functions
- geographic, Geographic types
- Geography functions, Working with GIS Functions
- numeric types and functions, Numeric Types and Functions-Precise Decimal Calculations with NUMERIC
- strings and string functions, String Functions-Working with TIMESTAMP
- strongly typed managed storage with BigQuery, Managed Storage
- supported by BigQuery, Data Types, Functions, and Operators
- TIMESTAMP, working with, Working with TIMESTAMP-Date, Time, and DateTime
- data warehouses
- data-driven decisions, making with k-means clustering, Data-Driven Decisions
- dataEditor role, Predefined roles
- DataFrames (see pandas)
- dataOwner role, Predefined roles
- datasets, Metadata
- access to, on BigQuery, Loading from a Local Source
- checking if a dataset exists with bq ls, Checking whether a dataset exists
- copying using bq cp, Copying datasets
- creating in a different project, Creating a dataset in a different project
- creating to load into BigQuery, Loading from a Local Source
- creating using bq mk, Creating Datasets and Tables
- creating using Google Cloud Client Library, Creating a dataset
- cross-region dataset copy via Data Transfer Service, Cross-region dataset copy
- deleting a dataset using Google Cloud Client Library, Deleting a dataset
- deriving insights across, Deriving Insights Across Datasets
- determining those involved in query requests, Step 2: Routing
- information on, using Google Cloud Client Library, Dataset information
- joining Google Sheets data with dataset in BigQuery, Joining Sheets data with a large dataset in BigQuery
- manipulation through HTTP request to BigQuery REST API URL, Dataset manipulation
- manipulation via Google Cloud Client library for BigQuery, Dataset manipulation
- modifying attributes using Google Cloud Client Library, Modifying attributes of a dataset
- names of, Retrieving Rows by Using SELECT
- names, key components of, Retrieving Rows by Using SELECT
- permissions to access, Predefined roles
- primitive roles providing access to, Primitive roles
- providing for Identity and Access Management (IAM), Retrieving Rows by Using SELECT
- training dataset for regression model, creating, Creating a Training Dataset
- dataViewer role, Predefined roles, Resource
- DATE type, Date, Time, and DateTime, Summary
- dates and time, working with timestamps, Working with TIMESTAMP-Date, Time, and DateTime
- DATETIME type, Data Types, Functions, and Operators, Date, Time, and DateTime, Summary
- Davies-Bouldin index, Hyperparameter tuning using scripting
- decision trees, Gradient-boosted trees
- Deep Learning Virtual Machine, Notebooks on Google Cloud Platform
- deep neural networks, Deep Neural Networks-Deep Neural Networks
- DELETE statement, Data Management (DDL and DML), DML, DML
- deletions
- denormalization, Denormalization, Joining with precomputed values
- DENSE_RANK function, Numbering functions
- descriptive analytics, powerful, performing with BigQuery, Powerful Analytics
- developing with BigQuery, Developing with BigQuery-Summary
- dictionary encoding, Storage format: Capacitor
- disaster recovery, Durability, Backups, and Disaster Recovery
- disks
- DISTINCT, finding unique values with, Finding Unique Values by Using DISTINCT
- division
- DNN (see deep neural networks)
- dnn_classifier model type, Training
- dnn_regressor model, Deep Neural Networks, Training hybrid recommendation model
- draining the zone, Zonal failures
- drains or failovers of compute clusters, Step 3: Job Server
- Dremel (SQL engine), How BigQuery Came About, Step 4: Query engine
- Dremel query engine, Query Engine (Dremel)-Hash join query
- DROP FUNCTION statement, Persistent UDFs
- DROP TABLE statement, Data Management (DDL and DML)
- --dry_run option, running parameterized queries with, Array and struct parameters
- dry runs for queries, Dry run
- dsinfo object, Dataset information
- durability, Durability, Backups, and Disaster Recovery
- dynamic SQL queries, Building queries dynamically
E
- EL (extract and load), ETL, EL, and ELT
- ELT (extract, load, and transform), ETL, EL, and ELT
- empty tables, Empty table
- encoding (storage), Physical storage: Colossus
- encryption, Simplicity of Management, Privacy and Encryption-Customer-Managed Encryption Keys
- ENDS_WITH function, String Manipulation Functions
- entity extraction, Summary of model types
- envelope encryption, CMEK
- equality, not-equals, using != or <> operator, Comparisons
- erasure encoding, Storage Data, Physical storage: Colossus
- errors, inserting rows into a table, Inserting rows into a table
- etags, Updating a table’s schema
- ETL (extract, transform, and load)
- evaluating machine learning models
- EXCEPT, using with SELECT, SELECT *, EXCEPT, REPLACE
- execution plans (see query plans)
- execution stages (queries), Scan-filter-count query
- EXISTS operator, Using arrays to store repeated fields
- expensive computations, reducing number of, Reducing the number of expensive computations
- experimenting with BigQuery, using sandbox, Estimating per-query cost
- expiration
- cached tables expiring, Caching the Results of Previous Queries
- changing for a table after creating it, Changing options
- for partitions, Partitioned tables
- specifying for partitions, Partitioning, Partitioned tables
- specifying for tables, Loading from a Local Source, Data Management (DDL and DML), Creating Datasets and Tables, Creating a table, Options list
- system event logged when table or partition expires, Stackdriver monitoring and audit logging
- temporary tables holding query results, Query History and Caching
- explicit conversion, Casting and Coercion
- exports of data from BigQuery
- extensions to SQL in BigQuery supporting data analytics, Powerful Analytics
- extensions, invoking in Jupyter Notebook, Notebooks on Google Cloud Platform
- external data sources
F
- failover processes, Step 3: Job Server
- failure handling, BigQuery and Failure Handling-Regional failures
- FARM fingerprint algorithm, Fingerprint function
- feature engineering, Exploring the Dataset to Find Features
- features (in machine learning), Formulating a Machine Learning Problem
- federated queries, ETL, EL, and ELT, Integration with Google Cloud Platform
- file compression, Loading from a Local Source
- file loads, File Loads
- fingerprint function, Fingerprint function
- FIRST_VALUE function, Navigation functions
- FLOAT64 type, Data Types, Functions, and Operators, Summary
- floating-point numbers, standard-compliant floating-point division, Standard-Compliant Floating-Point Division
- folium package, plotting a map with, Working with BigQuery, pandas, and Jupyter
- FORMAT function, Printing and Parsing
- FORMAT_DATE function, Printing and Parsing
- FORMAT_TIMESTAMP function, Printing and Parsing, Parsing and Formatting Timestamps
- FROM clause
- from_items, The JOIN Explained
- functions, Numeric Types and Functions
G
- G Suite, Interactive Exploration and Querying of Data in Google Sheets, Incorporating BigQuery Data into Google Slides (in G Suite)
- Gamma distribution fit, computing parameters of, Cloud Dataflow
- GARBAGE, marking old storage sets as, Storage sets, DML
- gcloud command-line tool, Notebooks on Google Cloud Platform
- GCP (see Google Cloud Platform)
- GCP Cloud Console (see Cloud Console)
- GCS (see Google Cloud Storage)
- generational (storage system), Storage optimization
- Geo Viz (BigQuery), Geometry transformations and aggregations
- Geographic Information Systems (GIS), BigQuery Geographic Information Systems-Geometry transformations and aggregations
- geographic types, Geographic types
- GEOGRAPHY type, Data Types, Functions, and Operators, Working with GIS Functions, Summary
- geohash, Creating Polygons
- GeoJSON geospatial data, Geographic types
- GET requests (HTTP), Table manipulation
- GitHub repository for this book, Table manipulation
- Global Positioning System (GPS), Working with GIS Functions
- Google Apps Script, Incorporating BigQuery Data into Google Slides (in G Suite)
- Google BigQuery (see BigQuery)
- Google Cloud Client Library, Developing Programmatically, Google Cloud Client Library-Parameterized queries, Notebooks on Google Cloud Platform
- browsing rows of a table, Browsing the rows of a table
- copying a table, Copying a table
- creating a dataset, Creating a dataset
- creating an empty table, Creating an empty table
- creating an empty table with schema, Creating an empty table with schema
- dataset information from dsinfo object, Dataset information
- dataset manipulation, Dataset manipulation
- deleting a dataset, Deleting a dataset
- deleting a table, Deleting a table
- extracting data from a table, Extracting data from a table
- inserting rows into a table, Inserting rows into a table
- installing BigQuery client library, Google Cloud Client Library
- instantiating a Client, Google Cloud Client Library
- loading a BigQuery table directly from Google Cloud URI, Loading from a URI
- loading a BigQuery table from a local file, Loading from a local file
- loading a pandas DataFrame, Loading a pandas DataFrame
- modifying attributes of a dataset, Modifying attributes of a dataset
- querying with, Querying-Parameterized queries
- table management with, Table management
- updating a table's schema, Updating a table’s schema
- Google Cloud Data Loss Prevention API, Integration with Google Cloud Platform
- Google Cloud Identity and Access Management (see Identity and Access Management)
- Google Cloud Platform (GCP)
- BigQuery interacting with, using bq tool, Loading from a Local Source
- custom machine learning models in, Custom Machine Learning Models on GCP-Predicting with TensorFlow models
- Google Cloud Storage or Cloud Pub/Sub, Minimizing Network Overhead
- integration of BigQuery with, Integration with Google Cloud Platform
- notebooks on, Notebooks on Google Cloud Platform-Working with BigQuery, pandas, and Jupyter
- Pricing Calculator, Estimating per-query cost
- security features provided by, Administering and Securing BigQuery
- Google Cloud Software Development Kit (SDK), Table manipulation, Bash Scripting with BigQuery
- Google Cloud Storage (GCS), MapReduce Framework, Minimizing Network Overhead
- Google File System (GFS), Physical storage: Colossus
- Google Front-End (GFE) servers, Step 2: Routing
- Google Sheets, When to Use Federated Queries and External Data Sources, Interactive Exploration and Querying of Data in Google Sheets
- Google Slides, incorporating BigQuery data into, Incorporating BigQuery Data into Google Slides (in G Suite)-Incorporating BigQuery Data into Google Slides (in G Suite)
- gradient-boosted trees, Gradient-boosted trees
- Gradle build tool, installing, Measuring Query Speed Using BigQuery Workload Tester
- Gray, Jim, How BigQuery Came About
- GROUP BY
- gsutil cp command, Impact of compression and staging via Google Cloud Storage, Data Migration Methods
- gzip file compression, Loading from a Local Source
H
- Hadoop, MapReduce Framework
- hash algorithms, Hash Algorithms-Summary
- hash join query, Hash join query-Hash join query
- hash joins, Hash join query
- hashes
- about, Stage 0
- BY HASH directive in scan-filter-aggregate query, Stage 0
- HAVING clause, Anatomy of a simple script
- Heartbleed vulnerability, Infrastructure Security
- heredoc syntax in Bash, Querying
- hidden_units, Deep Neural Networks
- history of queries, Query History and Caching
- Hive partitions, loading and querying, Loading and querying Hive partitions
- HLL functions, HLL functions
- HTTP requests
- batching requests to BigQuery REST API, Batching multiple requests
- BigQuery REST API documentation specifying details of, Dataset manipulation
- DELETE request to BigQuery REST API URL, Dataset manipulation, Table manipulation
- GET request to BigQuery REST API URL, Table manipulation
- GET, POST, PUT, PATCH, and DELETE methods, Dataset manipulation
- getting status of jobId using REST API with GET request, Limitations
- POST request for a query, Step 1: HTTP POST
- POST request to BigQuery REST API URL with JSON request embedded, Querying
- to BigQuery REST API, Accessing BigQuery via the REST API
- HTTPS, Accessing BigQuery via the REST API
- human insights in regression model, Human insights and auxiliary data
- HyperLogLog++ (HLL++) algorithm, HLL functions
- hyperparameter tuning, Hyperparameter Tuning-Hyperparameter tuning using AI Platform
I
- I/O, minimizing for queries, Minimizing I/O-Reducing the number of expensive computations
- Identity and Access Management (IAM), Simplicity of Management, Administering and Securing BigQuery, Identity and Access Management-Resource
- IEEE_Divide function, Standard-Compliant Floating-Point Division
- IF conditions, Looping
- IF function, Conditional Expressions
- IF statement, using on Booleans, Using COUNTIF to Avoid Casting Booleans
- IFNULL function, Cleaner NULL-Handling with COALESCE
- image captioning, Summary of model types
- image classification, Summary of model types
- implicit conversion, Casting and Coercion
- in-memory filesystem, Worker Shard
- (see also Colossus File System)
- increasing query speed, Increasing Query Speed-Optimizing How Data Is Stored and Accessed
- indexes (array), Using arrays for generating data
- indexing, not needed in BigQuery, Simplicity of Management
- infinite loops, avoiding with SQL, How BigQuery Came About
- INFORMATION_SCHEMA view, Table manipulation, Obtaining table properties, Building queries dynamically
- infrastructure provisioning, not needed with BigQuery, Simplicity of Management
- INNER JOIN statement, INNER JOIN, CROSS JOIN
- INSERT SELECT statement, Insert SELECT
- INSERT statement, Data Management (DDL and DML), Step 5: Returning the query results, DML
- INSERT VALUES statement, Data Management (DDL and DML), Insert VALUES
- Institute of Electrical and Electronics Engineers (IEEE), Standard-Compliant Floating-Point Division
- INT64 type, Data Types, Functions, and Operators, Summary
- INTEGER type, detection by AUTO partitioning mode, Loading and querying Hive partitions
- internationalization of strings, Internationalization
- intersection of geography types, Geometry transformations and aggregations
- IS NOT NULL operator, Finding Unique Values by Using DISTINCT
- IS NULL operator, Finding Unique Values by Using DISTINCT
- IS operator
- isolation between jobs, Simplicity of Management
J
- Java Database Connectivity (JDBC), JDBC/ODBC drivers
- JavaScript
- JDBC/ODBC drivers, JDBC/ODBC drivers, Step 5: Returning the query results
- job management, Job Management
- job priority, BATCH, Batch Queries
- job servers, Step 3: Job Server
- JobConfig flags, Loading from a URI
- jobIds, Limitations, Step 5: Returning the query results
- jobUser role, Predefined roles, Resource
- job_config, Parameterized queries
- join+ stage
- joins, Joining Tables-Saving and Sharing
- broadcast and hash, Broadcast JOIN query
- broadcast JOIN query, Broadcast JOIN query-Broadcast JOIN query
- complex, support by BigQuery, Powerful Analytics
- CROSS JOIN, CROSS JOIN
- for cases seeming to require a script, A sequence of statements
- hash join query, Hash join query-Hash join query
- INNER JOIN, INNER JOIN
- JOIN statement, The JOIN Explained
- joining user table and machine learning weights, Creating input features
- OUTER JOIN, OUTER JOIN
- performing efficient joins, Performing Efficient Joins-JOIN versus denormalization
- queries doing JOIN operations, Query Engine (Dremel)
- summary of types of joins and their output, OUTER JOIN
- JSON, ETL, EL, and ELT
- arrays, Creating Arrays by Using ARRAY_AGG
- compressed files, loading into BigQuery, Impact of compression and staging via Google Cloud Storage
- converting arrays to JSON strings, Array functions
- creating JSON strings for dataset schema, Specifying a Schema
- creating table definition of data stored in newline-delimited JSON for Hive partition, Loading and querying Hive partitions
- GeoJSON, Geographic types
- JSON request in body of HTTP POST sent to BigQuery REST API URL, Querying
- JSON/REST interface, Accessing BigQuery via the REST API
- loading files into BigQuery, Loading from a Local Source
- newline-delimited files, extract format using Google Cloud Client Library, Extracting data from a table
- response from HTTP POST request to BigQuery REST API URL, Querying
- transformation of JSON HTTP request to Protobufs, Step 2: Routing
- writing rows to insert into tables as newline-delimited JSON, Loading and inserting data
- Jupiter Networking, Storage and Networking Infrastructure
- Jupyter
- Jupyter Notebooks, Geometry transformations and aggregations
L
- L1 and L2 regularization, Regularization
- labels, Labels and tags
- LAG function, Navigation functions
- LAST_VALUE function, Navigation functions
- layers in deep neural networks, Deep Neural Networks
- LEAD function, Navigation functions
- LEFT JOIN statement, Using arrays for generating data
- LENGTH function, String Functions
- life cycle management on staging buckets, Setting up life cycle management on staging buckets
- LIKE operator, SELECT *, EXCEPT, REPLACE
- LIME (model explainability package), Examining Model Weights
- LIMIT clause, Approximate top
- linear regression models
- lines, Geographic types
- literate programming, Accessing BigQuery from Data Science Tools
- loading data into BigQuery, Loading Data into BigQuery-Summary
- localities for data, Data locality
- localities for datasets, Creating a dataset, Creating Datasets and Tables
- locations
- LOG function, prefixing with SAFE, SAFE Functions
- logical operations, Boolean AND, OR, and NOT, Logical Operations
- logistic regression, Examining Model Weights
- logs, ELT in SQL for experimentation
- longitude and latitude, Geographic types
- LOOP statement, Looping
- looping, Looping
- LOWER function, String Functions
- LPAD function, Transformation Functions
- LTRIM function, Transformation Functions
M
- machine failures, Machine failures
- machine learning, Machine Learning in BigQuery
- AutoML Tables and AutoML Text, creating models from data in BigQuery tables, Integration with Google Cloud Platform
- building a classification model, Building a Classification Model-Choosing the Threshold
- building a regression model, Building a Regression Model-Human insights and auxiliary data
- creating learning models and carrying out batch predictions with BigQuery, Powerful Analytics
- custom models in GCP, Custom Machine Learning Models on GCP-Predicting with TensorFlow models
- customizing BigQuery ML, Customizing BigQuery ML-Regularization
- formulating a problem, Formulating a Machine Learning Problem-Types of Machine Learning Problems
- geographic locations in, Creating Polygons
- Google Cloud Platform APIs integrated with BigQuery, Integration with Google Cloud Platform
- in Google Sheets, automatic chart creation, Exploring BigQuery tables using Sheets
- k-means clustering, k-Means Clustering-Data-Driven Decisions
- recommender systems, Recommender Systems-Training hybrid recommendation model
- supervised, Machine Learning in BigQuery
- types of problems, Types of Machine Learning Problems-Building a Regression Model
- using BigQuery, AutoML
- magic numbers, Defining constants
- Magics, invoking in Jupyter Notebook, Notebooks on Google Cloud Platform
- magnitude or sign of model weights, Examining Model Weights
- managed storage, Managed Storage
- management, simplicity of, using BigQuery, Simplicity of Management
- MapReduce framework, MapReduce Framework
- maps, interactive, creating with folium, Working with BigQuery, pandas, and Jupyter
- MATCHED, NOT MATCHED BY TARGET, NOT MATCHED BY SOURCE, MERGE statement
- materialized views
- mathematical functions, Mathematical Functions
- matrix factorization, Matrix Factorization-Matrix Factorization
- matrix_factorization model, What’s Being Clustered?, Matrix Factorization
- MAX function, Navigation functions
- --maximum_bytes_billed option, Estimating per-query cost
- MD5 hashing algorithm, MD5 and SHA
- measuring and troubleshooting queries, Measuring and Troubleshooting-Visualizing the query plan information
- MEDIAN function, user-defined, Public UDFs
- memory
- MERGE statement, Data Management (DDL and DML), DML, Reclustering, Deleting rows, MERGE statement
- metadata, Metadata-Meta-File
- clustering, Clustering
- DML (Data Manipulation Language), DML
- meta-file, Query Master, Meta-File
- metadataViewer role, Predefined roles
- partitioning, Partitioning
- performance optimizations with clustered tables, Performance optimizations with clustered tables
- storage optimization, Storage optimization
- storage sets, Storage sets
- table, Table Metadata-Time travel
- time travel, Time travel
- migration of data, moving on-premises data to Google Cloud Storage, Data Migration Methods
- ML.BUCKETIZE function, Bucketizing the hour of day
- ML.EVALUATE function, Evaluating the model
- ML.FEATURE_CROSS function, Human insights and auxiliary data
- ML.FEATURE_INFO function, Gradient-boosted trees
- ML.PREDICT function, Predicting with the Model, Prediction
- ML.RECOMMEND function, Batch predictions for all users and movies
- ML.WEIGHTS function, Examining Model Weights, Obtaining user and product factors
- models (machine learning)
- monitoring resources using Stackdriver, Stackdriver monitoring and audit logging
- multiclass classification problems, Classification, Summary of model types
- multipart/mixed content type, Batching multiple requests
- multiregions, Zones, Regions, and Multiregions, Regional failures
- multitenant queries, Simplicity of Management
- MySQL, Relational Database Management System
N
- named parameters, Named parameters
- NaN (Not-a-Number), Standard-Compliant Floating-Point Division
- Natural Language API, Unstructured data
- navigation functions, Navigation functions
- Nearline Storage, Setting up life cycle management on staging buckets
- nested fields, Storing data as arrays of structs
- networking
- nodes in deep neural networks, Deep Neural Networks
- nondeterministic behavior, queries exhibiting, Caching the Results of Previous Queries
- NoSQL
- NOT keyword, Filtering with WHERE
- NOT MATCHED BY TARGET or NOT MATCHED BY SOURCE, MERGE statement
- Not-a-Number (see NaN)
- notebooks, Accessing BigQuery from Data Science Tools
- NP-hard problems, Storage format: Capacitor
- NTH_VALUE function, Navigation functions
- NULL values
- cleaner handling with COALESCE, Cleaner NULL-Handling with COALESCE
- CROSS JOIN excluding rows with empty or NULL arrays, Using arrays for generating data
- filtering for in WHERE clause, Finding Unique Values by Using DISTINCT
- in comparisons, Comparisons, Logical Operations
- in dataset CSV file loaded into BigQuery, Loading from a Local Source
- making scalar functions return, SAFE Functions
- NULL elements in arrays, Creating Arrays by Using ARRAY_AGG
- replacing privacy-suppressed values with, Specifying a Schema
- returning NULL from casting, not an error, Casting and Coercion
- numbering functions, Numbering functions
- NUMERIC type, Data Types, Functions, and Operators, Summary
- numeric types
- numeric_weights, Examining Model Weights
- num_clusters option, Carrying Out Clustering
- num_factors option, Matrix Factorization
O
- OAuth2 tokens, Step 1: HTTP POST
- objects (BigQuery)
- OFFSET function, Using arrays for generating data
- ogr2ogr tool, converting Shapefiles to GeoJSON, Geographic types
- on-demand pricing, Controlling Cost
- online transaction processing (OLTP) databases, relational, Relational Database Management System
- Open Database Connectivity (ODBC), JDBC/ODBC drivers
- operators
- <, <=, >, >=, and != (or <>) comparison operators, Comparisons
- optimization, Optimizing Performance and Cost
- Optimized Row Columnar (ORC) files, Loading Data Efficiently, Storage format: Capacitor
- OPTIONS list
- OR keyword, Filtering with WHERE
- ORDER BY
- ordering, preserving using arrays, Using arrays to preserve ordering
- ORDINAL indexing of arrays, Using arrays for generating data
- OUTER JOIN statement, summary of, OUTER JOIN
- OVER clause, Aggregate analytic functions, Navigation functions
- overfitting, Training
P
- pandas
- parallelization of query execution in BigQuery, Simplicity of Management
- parameterized queries, Parameterized queries, Parameterized Queries-Array and struct parameters
- Parquet files, Storage format: Capacitor
- PARSE_TIMESTAMP function, Parsing and Formatting Timestamps
- parsing strings, Printing and Parsing
- PARTITION BY, Aggregate analytic functions
- partitioning, Partitioning
- partitioning mode, specifying for bq load, Loading and querying Hive partitions
- partitions, Partitioning
- PATTERN variable, Anatomy of a simple script
- Pearson correlation coefficient, Number of bicycles
- Pending state, Storage sets
- per-query costs, Controlling Cost
- performance and cost, optimizing, Optimizing Performance and Cost-Checklist
- permissions, Security and Compliance
- (see also Identity and Access Management)
- for access to user-defined functions, Persistent UDFs
- persistent user-defined functions, Persistent UDFs
- personas, What’s Being Clustered?
- points, Geographic types
- polygons, Geographic types
- positional parameters, Positional parameters
- POST requests (HTTP), Querying, Step 1: HTTP POST
- PostgreSQL, Relational Database Management System
- precision, Choosing the Threshold
- predicate functions (GIS), GIS predicate functions
- predictions, Powerful Analytics
- preprocessing functions
- Pricing Calculator (GCP), Estimating per-query cost
- pricing plans, Controlling Cost
- primitive roles, Primitive roles
- primitives, geographic data in, Geographic types
- printing strings, Printing and Parsing
- privacy and encryption, Privacy and Encryption-Customer-Managed Encryption Keys
- probability threshold, choosing for classification model, Choosing the Threshold
- product features, getting for movies data, Creating input features
- product groups, What’s Being Clustered?
- product recommendations, What’s Being Clustered?
- programmatic development
- accessing BigQuery via Google Cloud Client Library, Google Cloud Client Library-Parameterized queries
- browsing rows of a table, Browsing the rows of a table
- copying a table, Copying a table
- creating a dataset, Creating a dataset
- creating an empty table with schema, Creating an empty table with schema
- creating empty table, Creating an empty table
- dataset information, Dataset information
- dataset manipulation, Dataset manipulation
- deleting a dataset, Deleting a dataset
- deleting a table, Deleting a table
- extracting data from a table, Extracting data from a table
- inserting rows into a table, Inserting rows into a table
- loading a pandas DataFrame, Loading a pandas DataFrame
- loading from a Google Cloud URI, Loading from a URI
- loading from a local file, Loading from a local file
- modifying attributes of a dataset, Modifying attributes of a dataset
- obtaining table properties, Obtaining table properties
- querying, Querying-Parameterized queries
- table management, Table management
- updating a table's schema, Updating a table’s schema
- accessing BigQuery via REST API, Developing Programmatically-Limitations
- programming languages
- project ID, Retrieving Rows by Using SELECT
- projects
- protocol buffers (protobufs), How BigQuery Came About, Step 2: Routing
- public user-defined functions, Public UDFs
- Python
Q
- quantiles, Quantiles
- queries, Query Essentials-Summary, Query Engine (Dremel)
- (see also Dremel query engine)
- advanced, Advanced Queries-Summary
- aggregates, Aggregates-A Brief Primer on Arrays and Structs
- batch, Batch Queries
- executing using bq query and specifying the query, Executing Queries
- execution by Dremel, Query Execution-Hash join query
- joining tables, Joining Tables-Saving and Sharing
- life of a query request, Life of a Query Request-Step 5: Returning the query results
- performance, key drivers of, Key Drivers of Performance
- primer on arrays and structs, A Brief Primer on Arrays and Structs-Joining Tables
- querying BigQuery using Jupyter Magics and saving results to pandas DataFrame, Working with BigQuery, pandas, and Jupyter
- querying with Google Cloud Client Library, Querying-Parameterized queries
- running from Jupyter notebook on GCP
- running within notebooks, Jupyter Magics
- saving and sharing, Saving and Sharing-Summary
- scheduling in BigQuery, Scheduled queries
- simple, Simple Queries-Sorting with ORDER BY
- aliasing column names with AS, Aliasing Column Names with AS-Filtering with WHERE
- filtering SELECT results with WHERE, Filtering with WHERE
- retrieving rows using SELECT, Retrieving Rows by Using SELECT-Retrieving Rows by Using SELECT
- SELECT *, EXCEPT, REPLACE, SELECT *, EXCEPT, REPLACE
- sorting with ORDER BY, Sorting with ORDER BY
- subqueries using WITH, Subqueries with WITH
- query engine, distributed (Dremel), Query Engine (Dremel)-Hash join query
- Query Masters, Step 4: Query engine, Query Master
- query plans, Query Master
- QUERY_TEXT variable, Querying, Executing Queries
- question answering, Summary of model types
R
- r (raw) prefix for string literals, Regular Expressions
- R language, working with BigQuery from, Working with BigQuery from R-Cloud Dataflow
- race conditions, preventing in table schema updates, Updating a table’s schema
- RAND function, Query History and Caching, Random number generator
- random number generator, Random number generator
- RANGE, Aggregate analytic functions
- RANK function, Numbering functions
- readSessionUser role, Predefined roles
- recall, Choosing the Threshold
- reclustering, Reclustering
- recommender systems, Recommender, Summary of model types, Recommender Systems-Training hybrid recommendation model
- record-oriented stores, How BigQuery Came About, Storage format: Capacitor
- Reed-Solomon encoding, Physical storage: Colossus
- (see also erasure encoding)
- REGEXP_CONTAINS function, Regular Expressions
- REGEXP_EXTRACT function, Regular Expressions
- REGEXP_EXTRACT_ALL function, Regular Expressions
- REGEXP_REPLACE function, Regular Expressions
- regions, Zones, Regions, and Multiregions
- regression, Regression, Summary of model types
- regular expressions
- regularization in BigQuery ML, Regularization
- regulatory compliance, Regulatory Compliance-Data Exfiltration Protection
- relational database management systems, Relational Database Management System
- remote procedure call (RPC) interface exposed by worker shards, Worker Shard
- repeated fields, Storing data as arrays of structs
- REPLACE, using with SELECT, SELECT *, EXCEPT, REPLACE
- replicated encoding, Physical storage: Colossus
- reservations, Step 2: Routing
- resources
- REST APIs
- restoring deleted records and tables, Restoring Deleted Records and Tables
- restoring deleted tables, Deleting a table
- restricting access to subsets of data, Restricting Access to Subsets of Data-Dynamic filtering based on user
- reusable queries, Reusable Queries-Defining constants
- REVERSE function, Transformation Functions
- roles, Role-Custom roles
- ROUND function, Mathematical Functions
- ROW_NUMBER function, Limiting large sorts, Numbering functions
- RPAD function, Transformation Functions
- RTRIM function, Transformation Functions
- run-length encoding, Storage format: Capacitor
S
- SAFE functions, SAFE Functions
- sandbox, using to experiment with BigQuery, Estimating per-query cost
- saving queries, Saved Queries
- scalar functions, Numeric Types and Functions
- scalar query parameters, Array and struct parameters
- scan-filter-aggregate query example, Scan-filter-aggregate query-Stage 2
- scan-filter-aggregate query with high cardinality, Scan-filter-aggregate query with high cardinality-Broadcast JOIN query
- scan-filter-count query example, Scan-filter-count query-Stage 1
- scatter plots, drawing in pandas from saved query results, Saving query results to pandas, Working with BigQuery, pandas, and Jupyter
- scheduler, Query Master
- scheduling of queries, Scheduled queries
- schemas
- authoritative schema for managed storage, Managed Storage
- changing to use arrays, Using arrays to store repeated fields
- complex, using JSON file for, Complex schema
- creating empty table with schema, Creating an empty table with schema
- examining details of insert job to ascertain the schema, Troubleshooting Workloads Using Stackdriver
- for dataset tables loaded into BigQuery, Loading from a Local Source
- in external table definitions for CSV and JSON files, Temporary table
- information, Building queries dynamically
- not specifying for Parquet and ORC files, Loading and querying Parquet and ORC
- schema of imported TensorFlow model, Predicting with TensorFlow models
- specifying for dataset loaded into BigQuery, Specifying a Schema-Specifying a Schema
- star schemas applied to clustered tables, Side benefits of clustering
- updating table schema using Google Cloud Client Library, Updating a table’s schema
- scipy package (Python), Cloud Dataflow
- scripting, Scripting-Advanced Functions
- security
- BigQuery features supporting, Simplicity of Management
- Cloud Security Command Center, Cloud Security Command Center
- GCP features providing security for BigQuery, Security and Compliance
- infrastructure provided by public cloud services, Administering and Securing BigQuery
- infrastructure security for BigQuery, Infrastructure Security-Infrastructure Security
- managing access control for BigQuery using IAM, Administering and Securing BigQuery
- managing access control for BigQuery with IAM, Identity and Access Management-Resource
- privacy and encryption, Privacy and Encryption-Customer-Managed Encryption Keys
- verifying effectiveness of, Dashboards, Monitoring, and Audit Logging
- SELECT * ... LIMIT 10, Side benefits of clustering
- SELECT * EXCEPT statement, Be purposeful in SELECT
- SELECT * LIMIT statement, Be purposeful in SELECT
- SELECT * REPLACE statement, Storing data as geography types
- SELECT * statement, selecting all columns in a table, SELECT *, EXCEPT, REPLACE
- SELECT statement, Query Essentials
- being purposeful in, Be purposeful in SELECT
- combining with UNION ALL, A Brief Primer on Arrays and Structs
- conditional expressions using Booleans, Conditional Expressions
- filtering with WHERE clause, Filtering with WHERE
- from UNNEST, UNNEST an Array
- in CREATE OR REPLACE MODEL, data split in, Controlling Data Split
- in WITH clause, Numbering functions
- INSERT VALUES with SELECT subquery, Insert VALUES with subquery SELECT
- leading commas in SELECT clause, Creating Arrays by Using ARRAY_AGG
- limits on results for SELECT queries, Step 5: Returning the query results
- preparing training dataset, Training and Evaluating the Model
- reducing data being read, Reducing data being read
- retrieving rows with, Retrieving Rows by Using SELECT-Retrieving Rows by Using SELECT
- SELECT DISTINCT, Finding Unique Values by Using DISTINCT
- within a loop, Looping
- self-joins
- sentiment analysis, Summary of model types
- serverless (BigQuery), BigQuery: A Serverless, Distributed SQL Engine
- SESSION_USER function, Dynamic filtering based on user
- SHA hashing algorithms, MD5 and SHA
- Shapefiles, geospatial data in, Geographic types
- shards
- sharing queries
- shuffle sinks, Scheduler, Shuffle
- shuffles, Storage and Networking Infrastructure
- slots in BigQuery, Separation of Compute and Storage, Step 4: Query engine, Worker Shard
- slowly-changing dimensions, The Basics
- Software as a Service (SaaS) applications, loading data into BigQuery, Data Transfer Service
- sorting
- Spanner, Step 5: Returning the query results
- database index (IDX), helping find storage sets within a range, Partitioning
- Spark, MapReduce Framework
- SPLIT function, A Brief Primer on Arrays and Structs
- split points for distributed sort, Distributed sort
- splittable files, Loading from a Local Source
- Spotify, use of BigQuery, Data Processing Architectures
- SQL (Structured Query Language), Relational Database Management System
- advanced, Advanced SQL
- ambiguities of Standard SQL, Advanced SQL
- BigQuery's full-featured support for SQL:2011, Powerful Analytics
- BigQuery, serverless distributed SQL engine, BigQuery: A Serverless, Distributed SQL Engine
- creating string containing SQL to be executed by BigQuery, Querying
- creating tables in, Setting up destination table
- deleting a table or view from BigQuery, Data Management (DDL and DML)
- dialect used in bq command-line tool, Executing Queries
- DML (Data Manipulation Language), DML
- execution by worker shard, Worker Shard
- for computation of data in the cloud, reasons for choosing, How BigQuery Came About
- legacy SQL used by Dremel, Simple Queries
- queries on data in Cloud Bigtable, SQL Queries on Data in Cloud Bigtable-Improving performance
- queries on distributed datasets, Hadoop running Spark, MapReduce Framework
- SQL/MM 3 specification for spatial functions, Working with GIS Functions
- SQL:2011, BigQuery: A Serverless, Distributed SQL Engine
- standard SQL used by BigQuery, Simple Queries
- support for standard SQL in BigQuery, launch of, How BigQuery Came About
- user-defined functions, SQL User-Defined Functions-Public UDFs
- using instead of client API to access BigQuery programmatically, Table manipulation
- using to automate schema creation, Specifying a Schema
- SQL injection attacks, Parameterized queries
- SSL 3.0 exploit, Infrastructure Security
- SSL/TLS channels, access to API gateway infrastructure, Infrastructure Security
- Stackdriver, Integration with Google Cloud Platform
- standardize_features option, Carrying Out Clustering
- star schemas, Side benefits of clustering
- STARTS_WITH function, String Manipulation Functions
- statistical functions, Useful Statistical Functions-Correlation
- storage, Storage-Meta-File
- BigQuery storage system providing table and file abstractions, How BigQuery Came About
- choosing efficient storage format, Choosing an Efficient Storage Format-Storing data as geography types
- managed, in BigQuery, Managed Storage
- metadata, Metadata-Meta-File
- of intermediate query results, Scheduler
- physical storage in Colossus, Physical storage: Colossus-Physical storage: Colossus
- separation from compute in BigQuery, ETL, EL, and ELT, Separation of Compute and Storage
- storage format, Capacitor, Storage format: Capacitor-Storage format: Capacitor
- storing data as arrays, Working with Arrays
- Storage API (BigQuery), bulk reads using, Bulk reads using BigQuery Storage API
- storage encoding (see encoding)
- storage sets, Storage sets
- stored procedures, Insert VALUES with subquery SELECT
- streaming data
- string functions, String Functions-Working with TIMESTAMP
- STRING type, Data Types, Functions, and Operators, Summary
- strings
- arrays of, Array functions
- casting to FLOAT64, Loading from a Local Source
- creating query doing string formatting, security risks of, Parameterized queries
- explicitly converting to INT64, Casting and Coercion
- geographic data in, Geographic types
- in schema autodetection by BigQuery, Specifying a Schema
- NUMERIC types ingested into BigQuery as strings, Precise Decimal Calculations with NUMERIC
- query provided in, Executing Queries
- representing as array of Unicode characters, array of bytes, or array of Unicode code points, Internationalization
- SPLIT function, A Brief Primer on Arrays and Structs
- STRPOS function, String Functions, String Manipulation Functions
- STRUCT keyword
- STRUCT type, Data Types, Functions, and Operators, Summary
- structures
- ST_AsGeoJSON function, Geographic types
- ST_AsText function, Geographic types
- ST_CENTROID_AGG function, Geometry transformations and aggregations
- ST_Contains function, Working with GIS Functions, GIS predicate functions
- ST_CoveredBy function, GIS predicate functions
- ST_Distance function, GIS Measures
- ST_DWithin function, GIS predicate functions
- ST_GeogFromGeoJSON function, Geographic types
- ST_GeogFromText function, Geographic types
- ST_GeogPoint function, Geographic types
- ST_GeoHash function, Creating Polygons, Human insights and auxiliary data
- ST_Intersects function, GIS predicate functions
- ST_MakeLine function, Creating Polygons
- ST_MakePolygon function, Creating Polygons
- ST_SnapToGrid function, GIS Measures
- ST_UNION function, Geometry transformations and aggregations
- subqueries, Query Engine (Dremel)
- SUBSTR function, String Functions, String Manipulation Functions
- suffixes (table), Antipattern: Table suffixes and wildcards
- SUM function, using NUMERIC type, Precise Decimal Calculations with NUMERIC
- superQuery, Estimating per-query cost
- supervised machine learning, Machine Learning in BigQuery
- SYSTEM_TIME AS OF, Restoring Deleted Records and Tables
T
- table-valued functions, Numeric Types and Functions
- tables, Metadata
- avoiding creation of tables with same name, Deleting a table
- browsing rows using Google Cloud Client Library, Browsing the rows of a table
- clustered, performance optimizations with, Performance optimizations with clustered tables
- copying between datasets using bq cp, Copying datasets
- copying between datasets using Google Cloud Client Library, Copying a table
- creating empty table using Google Cloud Client Library, Creating an empty table
- creating empty table with schema, using Google Cloud Client Library, Creating an empty table with schema
- creating in SQL, Setting up destination table
- creating staging table for updates to apply, DML
- creating with bq mk --table, Creating a table
- creating with complex schema, Complex schema
- deleting a table using Google Cloud Client Library, Deleting a table
- extracting data from using bq extract, Extracting data
- extracting data from, using Google Cloud Client Library, Extracting data from a table
- inserting rows into with bq insert, Loading and inserting data
- inserting rows using Google Cloud Client Library, Inserting rows into a table
- joining, Joining Tables-Saving and Sharing
- management using Google Cloud Client Library, Table management
- manipulating through HTTP requests to BigQuery REST API, Table manipulation
- metadata, Table Metadata-Time travel
- obtaining properties using Google Cloud Client Library, Obtaining table properties
- query results functionally equivalent to, Step 5: Returning the query results
- recovering deleted tables, Restoring Deleted Records and Tables
- structured storage at table level, Managed Storage
- table/view in dataset names, Retrieving Rows by Using SELECT
- updating schema using Google Cloud Client Library, Updating a table’s schema
- tagging
- temporary tables
- TensorFlow, Bulk reads using BigQuery Storage API, Machine Learning in BigQuery, Support for TensorFlow-Predicting with TensorFlow models
- text classification, Summary of model types
- text editors, Specifying a Schema
- text summarization, Summary of model types
- text, Well Known Text (WKT) format for geographic strings, Geographic types
- threshold (probability), choosing for classification model, Choosing the Threshold
- time functions prefixed with SAFE, SAFE Functions
- time travel
- TIME type, Date, Time, and DateTime, Summary
- time utility, Measuring Query Speed Using REST API
- time zones, Parsing and Formatting Timestamps, Date, Time, and DateTime
- time-insensitive use cases, Time-Insensitive Use Cases-File Loads
- TIMESTAMP type, Data Types, Functions, and Operators, Working with TIMESTAMP-Date, Time, and DateTime, Summary
- timestamps
- TIMESTAMP_MILLIS function, Extracting Calendar Parts
- Titan chip, Infrastructure Security
- tools for direct reads from BigQuery Storage API, Bulk reads using BigQuery Storage API
- TO_JSON_STRING function, Specifying a Schema, Array functions
- training datasets, creating for regression model, Creating a Training Dataset
- training models
- Transfer Appliance, Data Migration Methods
- transfers of data into BigQuery, Transfers and Exports-Cross-region dataset copy
- transformations
- TRIM function, Transformation Functions
- tuples, TUPLE
- Twitter, use of BigQuery, Data Processing Architectures
U
- UDFs (see user-defined functions)
- undoing deletions of records and tables, Restoring Deleted Records and Tables
- Unicode strings in BigQuery, Internationalization
- UNION ALL, using with SELECT, A Brief Primer on Arrays and Structs
- union of geography types, Geometry transformations and aggregations
- Unix epoch, number of seconds from, Extracting Calendar Parts
- Unix shell, using bash to get access tokens, Table manipulation
- UNIX_MILLIS function, Extracting Calendar Parts
- UNIX_SECONDS function, Aggregate analytic functions
- UNNEST function, A Brief Primer on Arrays and Structs, UNNEST an Array, Storing data as arrays of structs
- unstructured data, Unstructured data, Summary of model types
- UPDATE statement, DML
- updates, BigQuery not designed for very-high-frequency DML updates, DML
- upgrades to BigQuery, BigQuery Upgrades
- URIs
- URLs
- user role, Predefined roles
- user-defined functions, Numeric Types and Functions
- users
- UTF-8 encoding, Internationalization
- UUIDs (universally unique identifiers), UUID
V
- variables
- versions (BigQuery), Accessing BigQuery via the REST API
- views
- Virtual Private Cloud Service Controls (VPC-SC), Security and Compliance, Virtual Private Cloud Service Controls
- visualizations
- drawing scatter plot in pandas from saved query results, Saving query results to pandas, Working with BigQuery, pandas, and Jupyter
- of geospatial data, Geometry transformations and aggregations
- plotting interactive map using Python folium package, Working with BigQuery, pandas, and Jupyter
- visualizing query plan information, Visualizing the query plan information-Visualizing the query plan information
- visualizing the billing report, Visualizing the billing report
W
- web UI (BigQuery)
- weights
- Well Known Text (WKT), Geographic types
- WGS84 ellipsoid, Working with GIS Functions, Geographic types
- What-If tool, Examining Model Weights
- WHERE clause
- Boolean expressions in, Logical Operations
- casting in, Loading from a Local Source
- comparisons and NULL values, Comparisons
- correlated subqueries in, Correlated subquery
- filtering for NULL values in, Finding Unique Values by Using DISTINCT
- filtering results returned by SELECT, Filtering with WHERE
- GIS predicate functions in, GIS predicate functions
- LIKE operator, SELECT *, EXCEPT, REPLACE
- partitioning and clustering tables in, Insert SELECT
- using GROUP BY instead of, Computing Aggregates by Using GROUP BY
- WHILE loop, Looping
- wildcards
- window functions, Window Functions-Table Metadata
- WITH clause
- worker shards
- Workload Tester, using to measure query speed, Measuring Query Speed Using BigQuery Workload Tester-Measuring Query Speed Using BigQuery Workload Tester
- workloads, troubleshooting using Stackdriver, Troubleshooting Workloads Using Stackdriver-Troubleshooting Workloads Using Stackdriver