1200字范文,内容丰富有趣,写作的好帮手!
1200字范文 > 11月每天新用户的次日留存率

11月每天新用户的次日留存率

时间:2022-04-20 00:11:04

相关推荐

11月每天新用户的次日留存率

描述

用户行为日志表tb_user_log

iduidartical_idin_timeout_timesign_cin11010-11-01 10:00:00-11-01 10:00:42121029001-11-01 10:00:00-11-01 10:00:09031039001-11-01 10:00:01-11-01 10:01:50041019002-11-02 10:00:09-11-02 10:00:28051039002-11-02 10:00:51-11-02 10:00:59061049001-11-02 10:00:28-11-02 10:00:50071019003-11-03 11:00:55-11-03 11:01:24081049003-11-03 11:00:45-11-03 11:00:55091059003-11-03 11:00:53-11-03 11:00:590101019002-11-04 11:00:55-11-04 11:00:590

问题:统计11月每天新用户的次日留存率(保留2位小数)

注:

次日留存率为当天新增的用户数中第二天又活跃了的用户数占比。

如果in_time-进入时间和out_time-离开时间跨天了,在两天里都记为该用户活跃过,结果按日期升序。

建表语句:

DROP TABLE IF EXISTS tb_user_log;CREATE TABLE tb_user_log (id INT PRIMARY KEY AUTO_INCREMENT COMMENT '自增ID',uid INT NOT NULL COMMENT '用户ID',artical_id INT NOT NULL COMMENT '视频ID',in_time datetime COMMENT '进入时间',out_time datetime COMMENT '离开时间',sign_in TINYINT DEFAULT 0 COMMENT '是否签到') CHARACTER SET utf8 COLLATE utf8_bin;INSERT INTO tb_user_log(uid, artical_id, in_time, out_time, sign_in) VALUES(101, 0, '-11-01 10:00:00', '-11-01 10:00:42', 1),(102, 9001, '-11-01 10:00:00', '-11-01 10:00:09', 0),(103, 9001, '-11-01 10:00:01', '-11-01 10:01:50', 0),(101, 9002, '-11-02 10:00:09', '-11-02 10:00:28', 0),(103, 9002, '-11-02 10:00:51', '-11-02 10:00:59', 0),(104, 9001, '-11-02 10:00:28', '-11-02 10:00:50', 0),(101, 9003, '-11-03 11:00:55', '-11-03 11:01:24', 0),(104, 9003, '-11-03 11:00:45', '-11-03 11:00:55', 0),(105, 9003, '-11-03 11:00:53', '-11-03 11:00:59', 0),(101, 9002, '-11-04 11:00:55', '-11-04 11:00:59', 0);

解法:

基本的思路是:用union把in_time和out_time并联起来,对uid和date去重活获得一张用户活跃表。之后找出次活用户和活跃用户,计算留存率即可

细节:没有新用户不用输出,留存率保留2位小数,结果按日期升序,时间范围为11月。

关联in_time和out_time字段,建立用户活跃表

SELECT DISTINCT uid,DATE(in_time) AS dt FROM tb_user_logUNION--UNION去重,UNION ALL不去重SELECT DISTINCT uid,DATE(out_time) AS dt FROM tb_user_log

找出新用户和次活新用户,并用‘1’来表示。

WITH t1 AS(SELECT DISTINCT uid,DATE(in_time) AS dt FROM tb_user_logUNIONSELECT DISTINCT uid,DATE(out_time) AS dt FROM tb_user_log)SELECT dt,CASE WHEN (uid,dt)IN (SELECT uid,date_add(dt,INTERVAL -1 DAY) FROM t1) AND (uid,dt)IN (SELECT uid,MIN(dt)FROM t1 GROUP BY uid)THEN 1 ELSE 0 END AS next_user ,#次活新用户CASE WHEN (uid,dt) IN (SELECT uid,MIN(dt)FROM t1 GROUP BY uid)#新用户用户THEN 1 ELSE 0 END new_userFROM t1WHERE DATE_FORMAT(dt,'%Y%m')='11';#取11月的数据

计算每日的次活用户数和新用户数,并求新用户次日留存率。

每日次活用户数:SUM(next_user )每日新用户数:SUM(new_user)次日留存率,保留2位小数ROUND(SUM(next_user )/SUM(new_user),2)

计算留存率

ROUND(SUM(CASE WHEN (uid,dt)IN (SELECT uid,date_add(dt,INTERVAL -1 DAY) FROM t1) AND (uid,dt)IN (SELECT uid,MIN(dt) FROM t1 GROUP BY uid) THEN 1 ELSE 0 END) /SUM(CASE WHEN (uid,dt) IN (SELECT uid,MIN(dt)FROM t1 GROUP BY uid) THEN 1 ELSE 0 END),2) AS uv_left_rate

完整代码:

WITH t1 AS(SELECT DISTINCT uid,DATE(in_time) AS dt FROM tb_user_logUNIONSELECT DISTINCT uid,DATE(out_time) AS dt FROM tb_user_log)#用户活跃表SELECT dt,ROUND(SUM(CASE WHEN (uid,dt)IN (SELECT uid,date_add(dt,INTERVAL -1 DAY) FROM t1) AND (uid,dt)IN (SELECT uid,MIN(dt) FROM t1 GROUP BY uid) THEN 1 ELSE 0 END) /SUM(CASE WHEN (uid,dt) IN (SELECT uid,MIN(dt)FROM t1 GROUP BY uid) THEN 1 ELSE 0 END),2) AS uv_left_rateFROM t1WHERE DATE_FORMAT(dt,'%Y%m')='11'GROUP BY dtHAVING uv_left_rate IS NOT NULLORDER BY dt;

解法二:窗口函数,个人感觉这个好理解

照例找出用户活跃表

通过LEAD窗口函数对每个用户的活跃日向上移动一行

SELECT uid,dt,MIN(dt) OVER (PARTITION BY uid) AS new_dt,LEAD(dt,1) OVER (PARTITION BY uid ORDER BY dt) AS next_dt--取下一日向上移动FROM(SELECT DISTINCT uid,DATE(in_time) AS dt FROM tb_user_logUNIONSELECT DISTINCT uid,DATE(out_time) AS dt FROM tb_user_log)t1WHERE DATE_FORMAT(dt,'%Y%m')='11'

如果dt=new_dt那这个用户为新用户,如果dt=new_dt且next_dt和new_dt的日期差为1则这个用户为次留新用户

WITH t2 AS(SELECT uid,dt,MIN(dt) OVER (PARTITION BY uid) AS new_dt,LEAD(dt,1) OVER (PARTITION BY uid ORDER BY dt) AS next_dtFROM(SELECT DISTINCT uid,DATE(in_time) AS dt FROM tb_user_logUNIONSELECT DISTINCT uid,DATE(out_time) AS dt FROM tb_user_log)t1WHERE DATE_FORMAT(dt,'%Y%m')='11')select dt,ROUND(sum(CASE WHEN dt = new_dt and DATEDIFF(next_dt,dt) = 1 THEN 1 ELSE 0 END)/sum(CASE WHEN dt = new_dt THEN 1 ELSE 0 END),2) uv_left_rateFROM t2GROUP BY dthaving uv_left_rate IS NOT NULLORDER BY dt;)

本内容不代表本网观点和政治立场,如有侵犯你的权益请联系我们处理。
网友评论
网友评论仅供其表达个人看法,并不表明网站立场。