17 sales channels once tallied by hand across 26 spreadsheets — moving to unattended collection and a real-time P&L dashboard
We are building a data infrastructure that automatically collects sales, ad, and settlement data across 17 channels, with an LLM agent mapping each channel's inconsistent product names to a master SKU.
- Client
- Company D (skincare D2C brand)
- Period
- Apr 2026 – ongoing
- Role
- Design and build of the data pipeline and dashboards
17
Integrated sales channels
21 (sales + ads)
Channels collected
26
Spreadsheets replaced
31→16
Dashboard screen consolidation
50× (target)
SKU-mapping throughput
The Problem
- Data from 17 channels was copy-pasted by hand into 26 spreadsheets — the entire four-step P&L calculation was manual
- No master SKU existed: each channel's product-name + option-name combination effectively served as the key
Approach
- Channel collection runs unattended on official API integrations first, with custom scrapers covering channels without APIs
- A LangGraph-based SKU-mapping agent: vector search (RAG) + feedback loop, fully traced (LangSmith)
- Consolidating 31 operational and management dashboard screens into 16 in a redesign (ops home, data-quality KPIs, revenue Sankey, and more)
Systems Built
| Module | Description |
|---|---|
| 21-channel collection pipeline | Official API integrations · custom scrapers |
| Normalization & SKU-decomposition backend | FastAPI |
| SKU-mapping agent | LangGraph |
| 4-depth P&L engine | Revenue, COGS, ad spend, fees |
| Operations & management dashboard | React |
Screens


Tech Stack
Python (FastAPI)ReactPostgreSQL + pgvectorLangGraphLangSmithAWSPlaywright
This is an ongoing project; metrics reflect the design and build scope.



