How to do it...

In this example, we extend our original data with the form factor of each Apple computer model by joining it to a small lookup table:

# Lookup table: one row per model with its form factor
models_df = sc.parallelize([
      ('MacBook Pro', 'Laptop')
    , ('MacBook', 'Laptop')
    , ('MacBook Air', 'Laptop')
    , ('iMac', 'Desktop')
]).toDF(['Model', 'FormFactor'])

# Register both DataFrames as temporary views so they can be queried with SQL
models_df.createOrReplaceTempView('models')
sample_data_schema.createOrReplaceTempView('sample_data_view')

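# The left join keeps every row of sample_data_view; a model missing from models gets a NULL FormFactor
spark.sql('''
    SELECT a.*
        , b.FormFactor
    FROM sample_data_view AS a
    LEFT JOIN models AS b
        ON a.Model = b.Model
    ORDER BY Weight DESC
''').show()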
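
To see what the LEFT JOIN does to rows that have no match, the same query shape can be tried without a Spark session. The following is only a sketch under stated assumptions: it uses Python's built-in sqlite3 module instead of Spark SQL, and the rows in sample_rows (including the 'Mac mini' entry and all weights) are hypothetical stand-ins, not the recipe's actual sample data.

```python
import sqlite3

# Hypothetical rows standing in for sample_data_view (the real view has more columns)
sample_rows = [
    (1, 'MacBook Pro', 4.02),
    (2, 'MacBook', 2.03),
    (3, 'iMac', 20.8),
    (4, 'Mac mini', 2.6),  # deliberately absent from the models table
]

# Same lookup data as models_df in the recipe
model_rows = [
    ('MacBook Pro', 'Laptop'),
    ('MacBook', 'Laptop'),
    ('MacBook Air', 'Laptop'),
    ('iMac', 'Desktop'),
]

# In-memory SQLite database used here in place of Spark temporary views
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE sample_data_view (Id INTEGER, Model TEXT, Weight REAL)')
conn.execute('CREATE TABLE models (Model TEXT, FormFactor TEXT)')
conn.executemany('INSERT INTO sample_data_view VALUES (?, ?, ?)', sample_rows)
conn.executemany('INSERT INTO models VALUES (?, ?)', model_rows)

# Same join as in the recipe: every left-side row is kept,
# and FormFactor is NULL (None in Python) where no model matches
joined = conn.execute('''
    SELECT a.*
        , b.FormFactor
    FROM sample_data_view AS a
    LEFT JOIN models AS b
        ON a.Model = b.Model
    ORDER BY a.Weight DESC
''').fetchall()

for row in joined:
    print(row)
```

All four left-side rows come back, sorted by weight, and the unmatched 'Mac mini' row carries None in the FormFactor position. An inner join would have dropped that row instead.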