7-Day Rolling Purchase Amount by Customer

SQL coding challenge · Difficulty: medium · +150 XP

Problem

For each order show a 7-day (7-row) rolling sum of amount per customer.

Tables

Table: orders

| customer_id | order_date | amount |
| --- | --- | --- |
| 1 | 2024-01-01 | 100 |
| 1 | 2024-01-02 | 200 |
| 1 | 2024-01-03 | 150 |
| 1 | 2024-01-04 | 300 |
| 1 | 2024-01-05 | 250 |
| 1 | 2024-01-06 | 180 |
| 1 | 2024-01-07 | 220 |
| 1 | 2024-01-08 | 90 |
| 2 | 2024-01-02 | 300 |
| 2 | 2024-01-04 | 250 |
| 2 | 2024-01-06 | 100 |

Expected Output

| customer_id | order_date | amount | rolling_7day_amount |
| --- | --- | --- | --- |
| 1 | 2024-01-01 | 100 | 100 |
| 1 | 2024-01-02 | 200 | 300 |
| 1 | 2024-01-03 | 150 | 450 |
| 1 | 2024-01-04 | 300 | 750 |
| 1 | 2024-01-05 | 250 | 1000 |
| 1 | 2024-01-06 | 180 | 1180 |
| 1 | 2024-01-07 | 220 | 1400 |
| 1 | 2024-01-08 | 90 | 1390 |
| 2 | 2024-01-02 | 300 | 300 |
| 2 | 2024-01-04 | 250 | 550 |
| 2 | 2024-01-06 | 100 | 650 |
  • Return: customer_id, order_date, amount, rolling_7day_amount
  • Sort by customer_id, order_date
  • Function to use: SUM() OVER (PARTITION BY customer_id ORDER BY order_date ROWS BETWEEN 6 PRECEDING AND CURRENT ROW)

Solve this challenge on PySpark.in