Duplicate Job Listings

Assume you're given a table containing job postings from various companies. Write a query to retrieve the count of companies that have posted duplicate job listings (same title and description).

Problem Context & Learning

💡Why This Question Matters

This data quality problem is typical of LinkedIn's focus on maintaining a clean, trustworthy platform. Duplicate detection is crucial for user experience—job seekers shouldn't see the same position multiple times. This question tests your ability to use GROUP BY with multiple columns and HAVING clauses to identify duplicates, then count distinct entities—a pattern used extensively in data cleaning and quality assurance workflows.

🔑Key SQL Concepts

Concepts tested: GROUP BY with multiple columns (company_id, title, description), HAVING clause for post-aggregation filtering, COUNT() with conditions, nested queries (subquery in FROM clause), and COUNT(DISTINCT) for final aggregation. Understanding the difference between counting rows vs counting distinct values is critical for accurate duplicate detection.
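The concepts above can be combined into one query. The following is a minimal sketch, not the reference solution: the table name `job_listings`, the `job_id` column, and the sample rows are assumptions made for illustration (only `company_id`, `title`, and `description` are named in the text), and SQLite is used simply because it runs in memory.

```python
import sqlite3

# Hypothetical table and sample rows; only company_id, title, and
# description are named in the problem text.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE job_listings "
    "(job_id INTEGER PRIMARY KEY, company_id INTEGER, title TEXT, description TEXT)"
)
conn.executemany(
    "INSERT INTO job_listings (job_id, company_id, title, description) "
    "VALUES (?, ?, ?, ?)",
    [
        (1, 10, "Data Analyst", "Builds dashboards."),
        (2, 10, "Data Analyst", "Builds dashboards."),  # duplicate of job 1
        (3, 10, "Data Engineer", "Builds pipelines."),
        (4, 20, "Data Analyst", "Builds dashboards."),  # same text, other company
        (5, 30, "Recruiter", "Sources candidates."),
        (6, 30, "Recruiter", "Sources candidates."),    # duplicate of job 5
    ],
)

# Inner query: one row per (company_id, title, description) group that
# appears more than once. Outer query: count each company only once.
DUPLICATE_COMPANIES_SQL = """
SELECT COUNT(DISTINCT company_id) AS duplicate_companies
FROM (
    SELECT company_id, title, description
    FROM job_listings
    GROUP BY company_id, title, description
    HAVING COUNT(*) > 1
) AS duplicate_groups
"""

duplicate_companies = conn.execute(DUPLICATE_COMPANIES_SQL).fetchone()[0]
print(duplicate_companies)  # 2 (companies 10 and 30)
```

Company 20 posts the same title and description as company 10, but it is not counted, because `company_id` is part of the grouping key.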

🌍Real-World Applications

LinkedIn's data quality teams use similar queries to: detect and merge duplicate job postings to improve search results, identify companies that may be spamming the platform, generate data quality reports for internal monitoring, flag potential policy violations for review, clean up legacy data during migrations, and maintain the integrity of their professional network graph.

Interview Insights & Approach

Strategic Approach

When tackling this LinkedIn problem, the key is to understand the grain of the result. The final answer is a single number, but the intermediate step works at one row per (company_id, title, description) group, and the final count is taken over companies rather than over groups. Start by identifying which columns define a duplicate, then consider whether a filtered aggregation (CASE WHEN) is more efficient than multiple subqueries.
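To make the grain concrete, the sketch below first lists the intermediate rows, one per (company_id, title, description) group, and then computes the final count with a filtered aggregation (CASE WHEN) in place of a HAVING clause. The table name `job_listings`, the `job_id` column, and the sample rows are hypothetical. Whether the CASE WHEN form is actually faster depends on the database engine; the example only shows that it gives the expected answer on this sample.

```python
import sqlite3

# Hypothetical table and rows, used only to illustrate the grain.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE job_listings "
    "(job_id INTEGER, company_id INTEGER, title TEXT, description TEXT)"
)
conn.executemany(
    "INSERT INTO job_listings VALUES (?, ?, ?, ?)",
    [
        (1, 10, "Data Analyst", "Builds dashboards."),
        (2, 10, "Data Analyst", "Builds dashboards."),
        (3, 10, "Data Engineer", "Builds pipelines."),
        (4, 20, "Data Analyst", "Builds dashboards."),
    ],
)

# Intermediate grain: one row per (company_id, title, description) group.
group_grain = conn.execute("""
    SELECT company_id, title, description, COUNT(*) AS postings
    FROM job_listings
    GROUP BY company_id, title, description
    ORDER BY company_id, title
""").fetchall()
for row in group_grain:
    print(row)

# Final grain: a single row. The CASE WHEN yields NULL for groups with a
# single posting, and COUNT(DISTINCT ...) ignores NULLs.
filtered_count = conn.execute("""
    SELECT COUNT(DISTINCT CASE WHEN postings > 1 THEN company_id END)
    FROM (
        SELECT company_id, COUNT(*) AS postings
        FROM job_listings
        GROUP BY company_id, title, description
    ) AS per_group
""").fetchone()[0]
print(filtered_count)  # 1 (only company 10 has a repeated listing)
```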

Common Pitfalls

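Be careful with NULL values in your JOIN conditions or aggregate functions: GROUP BY places NULLs in the same group, whereas an equality comparison in a JOIN condition never matches NULL to NULL. In interview scenarios, datasets often include edge cases, such as companies with no duplicates at all or a company with several different sets of duplicated listings, that can throw off a simple COUNT(*) unless you count DISTINCT company_id values.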
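Both pitfalls can be reproduced on a small sample. In this sketch the table name `job_listings`, the `job_id` column, and the rows are assumptions for illustration: company 10 has two different duplicated listings, and company 30 has a duplicated listing whose description is NULL. Whether NULL descriptions should count as duplicates is a question to raise with the interviewer; the code only shows how the two query styles behave.

```python
import sqlite3

# Hypothetical table and rows chosen to trigger both pitfalls.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE job_listings "
    "(job_id INTEGER, company_id INTEGER, title TEXT, description TEXT)"
)
conn.executemany(
    "INSERT INTO job_listings VALUES (?, ?, ?, ?)",
    [
        (1, 10, "Data Analyst", "Builds dashboards."),
        (2, 10, "Data Analyst", "Builds dashboards."),
        (3, 10, "Data Engineer", "Builds pipelines."),
        (4, 10, "Data Engineer", "Builds pipelines."),
        (5, 30, "Intern", None),  # NULL description
        (6, 30, "Intern", None),
    ],
)

DUPLICATE_GROUPS = """
    SELECT company_id
    FROM job_listings
    GROUP BY company_id, title, description
    HAVING COUNT(*) > 1
"""

# Pitfall 1: COUNT(*) counts duplicate groups, not companies.
groups_counted = conn.execute(
    f"SELECT COUNT(*) FROM ({DUPLICATE_GROUPS}) AS g"
).fetchone()[0]
companies_counted = conn.execute(
    f"SELECT COUNT(DISTINCT company_id) FROM ({DUPLICATE_GROUPS}) AS g"
).fetchone()[0]
print(groups_counted, companies_counted)  # 3 2

# Pitfall 2: a self-join on equality never matches NULL = NULL, so the
# NULL-description duplicates of company 30 are missed.
self_join_count = conn.execute("""
    SELECT COUNT(DISTINCT a.company_id)
    FROM job_listings AS a
    JOIN job_listings AS b
      ON a.company_id = b.company_id
     AND a.title = b.title
     AND a.description = b.description
     AND a.job_id < b.job_id
""").fetchone()[0]
print(self_join_count)  # 1
```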
