对于窗口函数的一些案例 Demo。
窗口函数
基于 窗口函数示例,对窗口函数的使用的深化。
查询顾客的购买明细及月购买总额
1 | hive (default)> select *, sum(cost) over(partition by month(orderdate)) from business; |
按照 name 分组,组内数据累加
方式 1
1 | hive (default)> select *, sum(cost) over(partition by name order by orderdate) user_cost_sum from business; |
方式 2
1 | hive (default)> select *, sum(cost) over(partition by name order by orderdate rows between UNBOUNDED PRECEDING and current row) user_cost_sum from business; |
按照 name 分组,当前行和前一行数据聚合
1 | hive (default)> select *, sum(cost) over(partition by name order by orderdate rows between 1 PRECEDING and current row) user_cost_sum from business; |
按照 name 分组,当前行和前一行、后一行数据整合
1 | hive (default)> select *, sum(cost) over(partition by name order by orderdate rows between 1 PRECEDING and 1 FOLLOWING ) user_cost_sum from business; |
按照 name 分组,当前行和后面所有行聚合
1 | hive (default)> select *, sum(cost) over(partition by name order by orderdate rows between CURRENT ROW and UNBOUNDED FOLLOWING ) user_cost_sum from business; |
统计每个用户的累计访问次数
使用 测试数据,完成需求。
1 | create table action( |
思路:简化查询,一步步深入。
第一步:时间格式化
1 | select userId, date_format(regexp_replace(visitDate, '/', '-'), 'yyyy-MM') mn, visitCount from action |
执行结果:
1 | OK |
第二步:查询每个用户每个月的总和
1 | select userId, mn, sum(visitCount) xj from ( |
执行结果:
1 | OK |
第三步:根据用户累加
1 | select userId, mn, xj, sum(xj) over(partition by userId order by mn) user_sum from ( |
执行结果
1 | OK |