date_trunc is a date-time truncation function used to truncate datetime type data to a specified precision (such as year, month, day, hour, etc.).
date_trunc originally comes from PostgreSQL database system, not SparkSELECT
    date_trunc('hour', lpep_pickup_datetime) AS hour,
    PULocationID AS zone,
    SUM(total_amount) AS amount,
    COUNT(1) AS number_records
FROM
    green
WHERE
    lpep_pickup_datetime >= '2020-01-01 00:00:00'
GROUP BY
    1, 2
The example above truncates timestamps to the hour level, facilitating grouping and statistics by hour.
Different SQL systems have similar but not identical functions:
date_trunc('hour', timestamp)DATE_FORMAT(timestamp, '%Y-%m-%d %H:00:00')DATETRUNC(hour, timestamp) or using date part extraction functionsTRUNC(timestamp, 'HH')TRUNC(timestamp, 'HH')