oracle sql 分組後找出每組之最(續)

  • 230
  • 0

記錄使用keep dense_rank語法

這次在 https://www.hackerrank.com/challenges/15-days-of-learning-sql/problem 遇到的題目

要找出每天submission count (scnt) 最多的人 (hacker_id),有超過一人的scnt= max(scnt) 時則取 hacker_id 最小的
可以看到partition by submission_date order by scnt desc, hacker_id就是在以此條件排序

select submission_date, hacker_id, scnt from (
    select submission_date, 
           hacker_id, 
           scnt, 
           row_number() over (partition by submission_date order by scnt desc, hacker_id) as rn
    from (
        select submission_date, hacker_id, count(1) as scnt
        from Submissions
        group by submission_date, hacker_id
    )
) where rn = 1

還找到一個方法是使用keep dense_rank語法,在scnt條件固定時再以min(hacker_id) 取 hacker_id 最小的,達到相同效果

select submission_date, 
       min(hacker_id) keep (dense_rank first order by scnt desc) hacker_id,
       max(scnt) submission_scnt
from (
    select submission_date, hacker_id, count(1) as scnt
    from Submissions
    group by submission_date, hacker_id
)
group by submission_date

順便附上好心人提供本題的test case語法,可以去sqltest測試

create table submissions ( submission_date date, submission_id number, hacker_id number, score number );
insert into submissions values (date'2016-03-01', 8494, 20703, 0);
insert into submissions values (date'2016-03-01', 23965, 53473, 60);
insert into submissions values (date'2016-03-01', 23965, 79722, 60); 
insert into submissions values (date'2016-03-01', 30173, 36396, 70);
insert into submissions values (date'2016-03-02', 34928, 20703, 0);
insert into submissions values (date'2016-03-02', 38740, 15758, 60); 
insert into submissions values (date'2016-03-02', 42769, 79722, 60); 
insert into submissions values (date'2016-03-02', 44364, 79722, 60);
insert into submissions values (date'2016-03-03', 45440, 20703, 0);
insert into submissions values (date'2016-03-03', 49050, 36396, 70);
insert into submissions values (date'2016-03-03', 50273, 79722, 5);
insert into submissions values (date'2016-03-04', 50344, 20703, 0); 
insert into submissions values (date'2016-03-04', 51360, 44065, 90);
insert into submissions values (date'2016-03-04', 54404, 53473, 65); 
insert into submissions values (date'2016-03-04', 61533, 79722, 45);
insert into submissions values (date'2016-03-05', 72852, 20703,0);
insert into submissions values (date'2016-03-05', 74546, 38289, 0); 
insert into submissions values (date'2016-03-05', 76487, 62529, 0); 
insert into submissions values (date'2016-03-05', 82439, 36396, 10); 
insert into submissions values (date'2016-03-05', 9006,  36396, 40);
insert into submissions values (date'2016-03-06', 90404, 20703, 0); 

create table hackers ( hacker_id number, name varchar(255) ); 
insert into hackers values (15758, 'Rose');
insert into hackers values (20703, 'Angela');
insert into hackers values (36396, 'Frank'); 
insert into hackers values (38289, 'Patrick'); 
insert into hackers values (44065, 'Lisa'); 
insert into hackers values (53473, 'Kimberly');
insert into hackers values (62529, 'Bonnie'); 
insert into hackers values (79722, 'Michael');

題目還有一個欄位要找至該日期為止每天都有submit的人數,這邊使用遞迴處理
附上最後提交的sql


select s.submission_date, se.cnt, h.hacker_id, h.name
from (
    select submission_date, 
           max(scnt) scnt,
           min(hacker_id) keep (dense_rank first order by scnt desc) hacker_id
    from (
        select submission_date, hacker_id, count(1) as scnt
        from Submissions
        group by submission_date, hacker_id
    )
    group by submission_date
) s
join Hackers h
on s.hacker_id = h.hacker_id
join (
    select submission_date, count(distinct hacker_id) cnt
    from Submissions
    START WITH submission_date = to_date('2016-03-01', 'yyyy-MM-dd')
    CONNECT BY PRIOR submission_date+1 = submission_date and PRIOR hacker_id = hacker_id
    group by submission_date
) se
on s.submission_date = se.submission_date;

最後把上次的情境用keep dense_rank語法改寫 (之前使用row_number + partition by 的方法 https://dotblogs.com.tw/acc2note/2020/07/03/230937)

select to_char(max(record_date) keep (dense_rank first order by qty desc), 'yyyy/mm/dd') as record_date,
       max(qty) as qty
  from record_test t
 group by to_char(record_date, 'yyyymm')

需要注意的是dense_rank後還有一個聚總函數max,如果有超過一筆qty = max(qty)就會透過聚總函數篩掉

google: oracle分析函數 keep 可以找到更詳細的說明