DAD-220 · Structured Database Environments

SQL Return Rate Analysis

SQLMySQLINNER JOINData analysis Stakeholder reportingReturn rate

This project uses SQL against the QuantigrationUpdates database to analyze product return patterns for a product manager. Using multi-table JOIN queries, it surfaces return frequency by state and return percentage by SKU — then interprets those findings, identifies data limitations, and communicates conclusions clearly to a non-technical stakeholder.

Database schema

The analysis draws on three related tables: Orders, RMA (return merchandise authorizations), and Collaborators. The queries use INNER JOIN to connect return records to geographic data and product identifiers.

Query 1: returns by state

SELECT Collaborators.State AS STATE,
       COUNT(*) AS RETURN_FREQUENCY
FROM Orders
INNER JOIN RMA
  ON Orders.OrderID = RMA.OrderID
INNER JOIN Collaborators
  ON Collaborators.CollaboratorID = Orders.CollaboratorID
GROUP BY STATE
ORDER BY RETURN_FREQUENCY DESC;

Findings

Massachusetts shows the highest return frequency in the dataset, followed by Arkansas. This geographic concentration warrants further investigation — it may reflect regional distribution patterns, a high-volume partner concentrated in those states, or a logistics or quality issue specific to that region.

Data limitation The query counts return events but does not normalize against total orders per state. A state with high sales volume will naturally appear to have more returns. Without a sales-volume denominator, the return rate cannot be interpreted in isolation as a sign of a problem.

Query 2: returns by product SKU

SELECT Orders.SKU AS SKU,
       COUNT(*) * 100 / (SELECT COUNT(*) FROM RMA) AS PERCENTAGE
FROM RMA
INNER JOIN Orders
  ON Orders.OrderID = RMA.OrderID
GROUP BY SKU
ORDER BY PERCENTAGE DESC;

Findings

1
BAS-48-1C — 22.05%This SKU accounts for the largest share of all returns in the dataset by a significant margin, suggesting a potential product defect, packaging issue, or description mismatch worth investigating.
2
ENT-48-40F — 16.29%The second-highest return rate, possibly related to a related product line or similar issue. The proximity to BAS-48-1C in the return ranking may indicate a shared component or supplier.
3
ENT-48-10F — 11.41%A third ENT-series product with an elevated return rate. The concentration of returns in the ENT product line suggests category-level investigation is warranted.

Stakeholder summary

The data gives the product manager two actionable signals. First, Massachusetts is a geographic concentration point for returns and deserves a closer look at whether a single large partner or regional logistics issue is driving volume. Second, BAS-48-1C and the ENT product line are disproportionately represented in returns — this warrants a quality review, customer feedback analysis, and potentially a conversation with the relevant supplier.

Limitations and next steps

LimitationRecommended action
Return counts not normalized against order volume Add a total orders subquery to calculate a true return rate percentage per state and SKU
No breakdown by customer or partner within states Add a GROUP BY CollaboratorID query to determine whether a few large accounts drive the Massachusetts concentration
No return reason codes in the RMA table Request that RMA data be enriched with reason codes (defect, shipping damage, wrong item) to enable root cause analysis
Time period of data is unspecified Add a date filter to understand whether return spikes are seasonal or trend-driven

Skills demonstrated

Multi-table SQL JOIN queries
Aggregate functions (COUNT, %)
GROUP BY and ORDER BY
Subquery construction
Data interpretation
Stakeholder communication