Week 5 Process/Supplement Contents
Introduction to Batch Processing
Introduction to Spark
(Optional) Installing Spark on Linux
First Look at Spark/PySpark
Spark DataFrames
(Optional) Preparing Yellow and Green Taxi Data
SQL with Spark
Anatomy of a Spark Cluster
GroupBy in Spark
Joins in Spark
(Optional) Operations on Spark RDDs
(Optional) Spark RDD mapPartitions
Connecting to Google Cloud Storage
Creating a Local Spark Cluster
Setting up a Dataproc Cluster
Connecting Spark to BigQuery