How it works...

The two key concepts used for this code snippet are:

  • spark.read.csv: The read attribute of the SparkSession returns a DataFrameReader object, and its csv method reads CSV files from a filesystem and returns the contents as a DataFrame
  • spark.sql: This SparkSession method executes a Spark SQL statement and returns the result as a DataFrame

For more information, refer to the preceding chapters on Spark DataFrames or to the PySpark documentation for the pyspark.sql module at http://spark.apache.org/docs/2.3.0/api/python/pyspark.sql.html.
