Step Snap 1: [Mastering Spark Date Functions]

Basic Introduction

date_trunc is a Spark SQL function that truncates a timestamp to a specified precision (such as year, month, day, or hour). Every field finer than the chosen unit is reset to its lowest value, so 13:45:30 truncated to the hour becomes 13:00:00.
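To make the truncation rule concrete, here is a minimal plain-Python sketch of the same idea. The helper name `truncate` and the four units it handles are assumptions made for illustration; this is not Spark's implementation, and raising on an unknown unit is a choice of this sketch only.

```python
from datetime import datetime


def truncate(ts: datetime, unit: str) -> datetime:
    """Hypothetical helper: reset every field finer than `unit`."""
    if unit == "year":
        return ts.replace(month=1, day=1, hour=0, minute=0, second=0, microsecond=0)
    if unit == "month":
        return ts.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
    if unit == "day":
        return ts.replace(hour=0, minute=0, second=0, microsecond=0)
    if unit == "hour":
        return ts.replace(minute=0, second=0, microsecond=0)
    # Illustrative choice: reject units this sketch does not cover.
    raise ValueError(f"unsupported unit: {unit}")


pickup = datetime(2020, 1, 15, 13, 45, 30)  # made-up sample timestamp
print(truncate(pickup, "hour"))   # 2020-01-15 13:00:00
print(truncate(pickup, "month"))  # 2020-01-01 00:00:00
```

The coarser the unit, the more fields are reset, which is why truncation is a convenient way to build time buckets.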

Origin

Example Usage

SELECT
    date_trunc('hour', lpep_pickup_datetime) AS hour,
    PULocationID AS zone,
    SUM(total_amount) AS amount,
    COUNT(1) AS number_records
FROM
    green
WHERE
    lpep_pickup_datetime >= '2020-01-01 00:00:00'
GROUP BY
    1, 2

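The example above truncates each pickup timestamp to the hour, then groups by that hour and the pickup zone (GROUP BY 1, 2 refers to the first two columns of the SELECT list). The result is the total amount and the number of records per zone per hour.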
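The grouping logic of the query can also be mimicked in plain Python, which may help when reasoning about what each output row means. This is a sketch under stated assumptions: the `trips` rows are made-up sample data, not values from the `green` table, and the dictionary-based aggregation only illustrates the idea rather than how Spark executes the query.

```python
from collections import defaultdict
from datetime import datetime

# Made-up sample rows: (pickup time, pickup zone id, total amount)
trips = [
    (datetime(2020, 1, 1, 0, 12, 5), 7, 10.0),
    (datetime(2020, 1, 1, 0, 48, 30), 7, 5.5),
    (datetime(2020, 1, 1, 1, 3, 0), 7, 8.0),
    (datetime(2020, 1, 1, 0, 30, 0), 43, 12.0),
]

# Key: (hour, zone); value: [sum of amounts, number of records]
revenue = defaultdict(lambda: [0.0, 0])

for pickup, zone, amount in trips:
    # Same effect as truncating the timestamp to the hour.
    hour = pickup.replace(minute=0, second=0, microsecond=0)
    bucket = revenue[(hour, zone)]
    bucket[0] += amount  # SUM(total_amount)
    bucket[1] += 1       # COUNT(1)

for (hour, zone), (amount, number_records) in sorted(revenue.items()):
    print(hour, zone, amount, number_records)
```

Both trips in zone 7 between 00:00 and 01:00 land in the same bucket because their truncated timestamps are identical, which is exactly what makes the truncated value usable as a grouping key.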

Equivalent Functions in Other SQL Systems

Different SQL systems have similar but not identical functions:

Common Truncation Units