Main sources we may use in this session:
https://spark.apache.org/docs/latest/spark-standalone.html
https://spark.apache.org/docs/latest/submitting-applications.html
In this section, we dive a bit deeper into the under-the-hood details, which are worth getting familiar with :D
# Convert notebook to Python script
jupyter nbconvert --to=script notebook.ipynb
# Convert to HTML
jupyter nbconvert --to=html notebook.ipynb
# Convert to PDF (requires a LaTeX installation)
jupyter nbconvert --to=pdf notebook.ipynb
# Execute notebook and convert (useful for reports)
jupyter nbconvert --execute --to=html notebook.ipynb
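Once a notebook has been converted to a .py script, it can be submitted to Spark with spark-submit, as described in the submitting-applications docs linked above. A minimal sketch, assuming a standalone master running at spark://localhost:7077 and a converted script named notebook.py:
# Submit the converted script to a standalone Spark cluster
spark-submit --master spark://localhost:7077 notebook.py
# Or run it locally using all available cores
spark-submit --master "local[*]" notebook.py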
# Start Jupyter Notebook server
jupyter notebook
# Start Jupyter Lab (modern interface)
jupyter lab
# Start with specific port
jupyter notebook --port=8889
# Start without opening browser
jupyter notebook --no-browser
# Start in a specific directory
jupyter notebook --notebook-dir=/path/to/directory
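A common pattern is to run the server on a remote machine with --no-browser and forward the port over SSH; a sketch, assuming a remote host named remote-host and port 8889:
# On the remote machine: start the server without opening a browser
jupyter notebook --no-browser --port=8889
# On the local machine: forward the port, then open http://localhost:8889
ssh -N -L 8889:localhost:8889 user@remote-host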
# Install notebook extensions
pip install jupyter_contrib_nbextensions
jupyter contrib nbextension install --user
# Enable Table of Contents extension
jupyter nbextension enable toc2/main
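To check which extensions are installed and whether they are enabled:
# List installed notebook extensions and their status
jupyter nbextension list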
# List available kernels
jupyter kernelspec list
# Install a new kernel
python -m ipykernel install --name env_name --display-name "Python (env_name)"
# Remove a kernel
jupyter kernelspec uninstall env_name
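For example, a fresh virtual environment can be registered as its own kernel so notebooks can select it; a sketch, where env_name is just a placeholder:
# Create and activate a virtual environment, then register it as a kernel
python -m venv env_name
source env_name/bin/activate
pip install ipykernel
python -m ipykernel install --user --name env_name --display-name "Python (env_name)"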
The following magic commands are used inside notebook cells:
# Show all magic commands
%lsmagic
# Time execution of a whole cell (place at the top of the cell)
%%time
# Run shell commands
!ls -la
# Run another notebook or script in the current namespace
%run another_notebook.ipynb
# Load external files
%load script.py
# Display matplotlib plots inline
%matplotlib inline
# Auto-reload modules
%load_ext autoreload
%autoreload 2
# Display all variables
%whos
# SQL magic for database queries (connect with a database URI first)
%load_ext sql
%sql sqlite:///example.db
%sql SELECT * FROM table
# Debug with pdb
%debug