Calculate the 30-day retention rate for users who watched their first video. Retention means the user watched another video exactly 30 days after their first watch. Return user_id and whether they were retained (1 or 0).
This cohort retention problem is critical for ByteDance's TikTok and other video platforms. Understanding how many users return after their first session is a key metric for product-market fit and growth. This question tests your ability to work with dates, perform self-joins on temporal conditions, and calculate retention metrics—essential skills for any consumer app company.
Advanced concepts: self-join on the same table with date conditions, DATE() function for date arithmetic, MIN() to find first watch date, CASE WHEN for binary retention flag, and LEFT JOIN to preserve all users (even those not retained). The challenge is correctly identifying 'exactly 30 days later' and handling users who never returned.
ByteDance's growth teams use similar queries to: calculate D1, D7, D30 retention cohorts for product health monitoring, identify which features drive retention, segment users by retention behavior for targeted campaigns, measure the impact of onboarding changes, power executive dashboards showing user lifecycle metrics, and feed churn prediction models.
When tackling this ByteDance problem, the key is to understand the grain of the result. Are you returning one row per user, or one row per category? Always start by identifying your unique join keys and consider if filtered aggregations (CASE WHEN) are more efficient than multiple subqueries.
Be careful with NULL values in your JOIN conditions or aggregate functions. In interview scenarios, datasets often include edge cases like zero-count categories or duplicate entries that can throw off a simple COUNT(*) if not handled with DISTINCT.
Share your approach, optimized queries, or ask questions. Learning from others is the fastest way to master SQL.