7-Day Rolling Sales by Product

SQL coding challenge · Difficulty: medium · +150 XP

Problem

For each row show a 7-day (7-row) rolling sum of sales per product.

Tables

Table: product_sales

| product_id | sale_date | sales |
| --- | --- | --- |
| 1 | 2024-01-01 | 10 |
| 1 | 2024-01-02 | 20 |
| 1 | 2024-01-03 | 30 |
| 1 | 2024-01-04 | 40 |
| 1 | 2024-01-05 | 50 |
| 1 | 2024-01-06 | 60 |
| 1 | 2024-01-07 | 70 |
| 1 | 2024-01-08 | 80 |
| 2 | 2024-01-01 | 100 |
| 2 | 2024-01-02 | 50 |
| 2 | 2024-01-03 | 70 |

Expected Output

| product_id | sale_date | sales | rolling_7day_sum |
| --- | --- | --- | --- |
| 1 | 2024-01-01 | 10 | 10 |
| 1 | 2024-01-02 | 20 | 30 |
| 1 | 2024-01-03 | 30 | 60 |
| 1 | 2024-01-04 | 40 | 100 |
| 1 | 2024-01-05 | 50 | 150 |
| 1 | 2024-01-06 | 60 | 210 |
| 1 | 2024-01-07 | 70 | 280 |
| 1 | 2024-01-08 | 80 | 350 |
| 2 | 2024-01-01 | 100 | 100 |
| 2 | 2024-01-02 | 50 | 150 |
| 2 | 2024-01-03 | 70 | 220 |
  • Return: product_id, sale_date, sales, rolling_7day_sum
  • Sort by product_id, sale_date
  • Function to use: SUM() OVER (PARTITION BY product_id ORDER BY sale_date ROWS BETWEEN 6 PRECEDING AND CURRENT ROW)

Solve this challenge on PySpark.in