DataFrames
categorical variables
correlation
covariance
CSV file
describe() function
frequent items
horizontal stacking
inner join
new creation
PostgreSQL
joining
arguments
Cassandra table
full outer joins
inner joins
left outer joins
right outer joins
students table
subjects table
types of joins
JSON file
MongoDB
MySQL
ORC file
Parquet file
PostgreSQL
removing duplicate records
sample records (see Sampling data)
simple SQL creation
create aliases
filtering on case-sensitive
use alias columns
using column names
where clause filtering
summary() function
swimming competition (see Swimming competition, dataframes)
temp view creation
vertical stacking
new creation
PostgreSQL
SQL commands
Descriptive statistics
agg() function
corrData.json
counting, number of elements
population variance
pyspark.sql.functions submodule
sample mean
sample variance
spark.read.json function
summation, mean, and standard deviation
variance, mean, and standard deviation