接上一篇文章【kafka KSQL】遊戲日誌統計分析(1),展現一下KSQL WINDOW 功能的使用。segmentfault
測試用日誌數據:測試
{"cost":7, "epoch":1512342568296,"gameId":"2017-12-04_07:09:28_高手2區_500_015_185175","gameType":"situan","gamers": [{"balance":4405682,"delta":-60,"username":"lza"}, {"balance":69532,"delta":-60,"username":"lzb"}, {"balance":972120,"delta":-60,"username":"lzc"}, {"balance":23129,"delta":180,"username":"lze"}],"reason":"game"}
table
:CREATE TABLE users_per_minute AS \ SELECT username, COUNT(*) AS game_count, SUM(delta) AS delta_sum, SUM(tax) AS tax_sum , WINDOWSTART() AS win_start, WINDOWEND() AS win_end \ FROM USER_SCORE_EVENT \ WINDOW TUMBLING (SIZE 2 MINUTE) \ WHERE reason = 'game' \ GROUP BY username;
game_count
大於3局的玩家:SELECT username, game_count, win_start, win_end FROM users_per_minute WHERE game_count >= 3;
輸出:spa
lze | 6 | 1546744320000 | 1546744440000 lzc | 6 | 1546744320000 | 1546744440000 lza | 6 | 1546744320000 | 1546744440000 lzb | 6 | 1546744320000 | 1546744440000 lzb | 3 | 1546744440000 | 1546744560000 lzc | 3 | 1546744440000 | 1546744560000 lza | 3 | 1546744440000 | 1546744560000 lze | 3 | 1546744440000 | 1546744560000
不限定某個特定的10分鐘,只要在某個10分鐘以內完成了便可。日誌
CREATE TABLE users_hopping_10_minute AS \ SELECT username, COUNT(*) AS game_count, SUM(delta) AS delta_sum, SUM(tax) AS tax_sum , TIMESTAMPTOSTRING(WindowStart(), 'yyyy-MM-dd HH:mm:ss') AS win_start, TIMESTAMPTOSTRING(WindowEnd(), 'yyyy-MM-dd HH:mm:ss') AS win_end \ FROM USER_SCORE_EVENT \ WINDOW HOPPING (SIZE 10 MINUTE, ADVANCE BY 30 SECONDS) \ WHERE reason = 'game' \ GROUP BY username;
SELECT username \ FROM users_hopping_10_minute \ WHERE game_count >= 3 \ GROUP BY username;