Main sources we may use in this session:

https://spark.apache.org/docs/latest/spark-standalone.html

https://spark.apache.org/docs/latest/submitting-applications.html

In this section, we dig a little deeper into some under-the-hood details, which are interesting to know about or at least to have seen once :D

Step Snap 1: [Essential Jupyter Notebook Commands]

Converting Notebooks

# Convert notebook to Python script
jupyter nbconvert --to=script notebook.ipynb

# Convert to HTML
jupyter nbconvert --to=html notebook.ipynb

# Convert to PDF (requires a LaTeX installation that provides xelatex)
jupyter nbconvert --to=pdf notebook.ipynb

# Execute notebook and convert (useful for reports)
jupyter nbconvert --execute --to=html notebook.ipynb
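
To make the script conversion less magical: a notebook file is just JSON, with a list of cells that each carry a `cell_type` and a `source`. The following is a minimal sketch of the idea behind `--to=script`, not the real nbconvert implementation (the real exporter does more). The function name `notebook_to_script` and the two-cell example notebook are made up for illustration.

```python
import json


def notebook_to_script(notebook_json: str) -> str:
    """Join the code cells of an nbformat-4 notebook into one script (hypothetical helper)."""
    nb = json.loads(notebook_json)
    chunks = []
    for cell in nb.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # skip markdown and raw cells in this simplified sketch
        source = cell.get("source", "")
        # nbformat stores source either as one string or as a list of lines
        if isinstance(source, list):
            source = "".join(source)
        chunks.append(source.rstrip("\n"))
    return "\n\n".join(chunks) + "\n"


# Made-up notebook with one markdown cell and one code cell
example = json.dumps({
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": None,
            "outputs": [],
            "source": ["x = 1\n", "print(x)"],
        },
    ],
})

script = notebook_to_script(example)
print(script)
```

Only the code cell survives here, which is the core of why a converted script can be run with plain `python`.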

Managing Jupyter Server

# Start Jupyter Notebook server
jupyter notebook

# Start Jupyter Lab (modern interface)
jupyter lab

# Start with specific port
jupyter notebook --port=8889

# Start without opening browser
jupyter notebook --no-browser

# Start in a specific directory
jupyter notebook --notebook-dir=/path/to/directory

Notebook Extensions

# Install notebook extensions (for the classic Notebook interface; not compatible with Notebook 7 and later)
pip install jupyter_contrib_nbextensions
jupyter contrib nbextension install --user

# Enable Table of Contents extension
jupyter nbextension enable toc2/main

Kernel Management

# List available kernels
jupyter kernelspec list

# Install a new kernel
python -m ipykernel install --name env_name --display-name "Python (env_name)"

# Remove a kernel
jupyter kernelspec uninstall env_name
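
Under the hood, every kernel shown by `jupyter kernelspec list` is a directory holding a `kernel.json` file that tells Jupyter how to launch the kernel. The sketch below builds the kind of spec that `ipykernel install` typically writes for a Python environment. The helper name `build_kernel_spec` is hypothetical, and the exact contents on your machine may differ.

```python
import json
import sys


def build_kernel_spec(display_name: str) -> dict:
    """Return a kernel.json-style dict for the current interpreter (illustrative only)."""
    return {
        # Jupyter replaces {connection_file} with a real path when it starts the kernel
        "argv": [sys.executable, "-m", "ipykernel_launcher", "-f", "{connection_file}"],
        "display_name": display_name,
        "language": "python",
    }


spec = build_kernel_spec("Python (env_name)")
print(json.dumps(spec, indent=2))
```

The `argv` entry points at one specific interpreter, which is why each virtual environment usually gets its own kernel.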

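Magic Commands

These are used inside notebook cells. Line magics start with a single % and apply to one line; cell magics start with %% and apply to the whole cell: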

# Show all magic commands
%lsmagic

# Time execution of a cell (must be the first line of the cell)
%%time

# Run shell commands
!ls -la

# Run another notebook in the current namespace (this executes it; it is not a module import)
%run another_notebook.ipynb

# Load the contents of an external file into the current cell
%load script.py

# Display matplotlib plots inline
%matplotlib inline

# Auto-reload modules
%load_ext autoreload
%autoreload 2
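
Magics such as `%%time` only exist inside IPython, so they stop working once a notebook is converted to a plain script. As a rough sketch of what `%%time` reports (wall time and CPU time), the same measurement can be done with the standard library. The helper name `timed` is an assumption for illustration, not part of IPython.

```python
import time


def timed(func, *args, **kwargs):
    """Call func and return (result, wall_seconds, cpu_seconds)."""
    wall_start = time.perf_counter()  # elapsed real time
    cpu_start = time.process_time()   # CPU time used by this process
    result = func(*args, **kwargs)
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return result, wall, cpu


result, wall, cpu = timed(sum, range(1_000_000))
print(f"result={result} wall={wall:.4f}s cpu={cpu:.4f}s")
```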

Data Analysis Helpers

# Display all variables
%whos

# SQL magic for database queries (requires a SQL magic extension such as ipython-sql and an open connection)
%load_ext sql
%sql SELECT * FROM table_name

# Debug with pdb (opens a post-mortem debugger on the last exception)
%debug

Export Environment