If I could start over, this is how I'd build our analytics platform
Here are some hard lessons I learned from the battlegrounds of building and migrating data platform tooling.
Greetings, Data Engineer,
Nobody tells you this upfront. You build something useful, solve one team’s problem, and suddenly… everyone wants in. And they all expect it to just work for them.
That was my path. I started small. Solved a local problem. Watched it scale. Then everything cracked.
This is what I wish I’d known from the beginning. It’s not a checklist. It’s the hard lessons, earned through fire drills, late pivots, and quiet rewrites.
I thought I was building something simple for the ML team
I didn’t set out to build a company-wide analytics platform
This started small. The machine learning team needed features from product usage data. They needed it clean. Fresh. Easy to experiment with.
That was the brief. And I agreed. It sounded straightforward.
The org was technical. AWS was the default. I imagined a lightweight stack where each engineering team could plug in their own needs. Events would flow into storage.
The ML folks would model their features. Everyone else already had their own tools. No need to centralise. No need to unify anything.
So I treated this as an internal developer platform for ML. And because I thought the scope was narrow, I didn’t build for scale. I didn’t build for change.
And that’s where it started to go off course.
I didn’t think about cross-team alignment because nobody asked for it
Marketing was off in their world. Sales had their CRM. Product already had Mixpanel. None of them had asked for my help. So I didn’t factor them into the plan.
They weren’t in the room. So I assumed they wouldn’t care.
That assumption made sense… until it didn’t.
Because here’s what happens when your platform starts to deliver real value: suddenly everyone does care. Suddenly, everyone wants in.
But by then, you’ve already made architectural decisions that are hard to reverse.
That’s the trap.
You build the wrong thing well. Then you get stuck improving it for use cases it wasn’t designed for.
One team turned into ten and it broke the system
When one team sees value, the floodgates open
The ML team got their data. Fast. Accurate. Flexible. Their experiments picked up speed. That success didn’t stay isolated.
A product manager asked if we could reuse the pipeline to analyse usage of a specific feature.
An analyst wanted to create a churn report from the same tables.
Soon after, sales asked for lead scoring inputs based on user activity.
Nobody said, “Hey, we need a centralised analytics platform.” But that’s exactly what we were becoming. Quietly. Informally. And very quickly.
Each new team brought new requirements. Different refresh intervals. Different definitions of core metrics. Different ideas of what “accurate” meant.
And I was still the only data engineer.
Redshift cracked first but the real issue was deeper
Technically, our first big wall was Redshift. It couldn’t handle the growing concurrency, data volume, and refresh frequency. Queries slowed. Pipelines started failing silently.
We spent more time debugging than building.
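Silent failures are the worst kind: the pipeline keeps "succeeding" while the data quietly goes stale. One cheap guard is a freshness check on each table's last successful load. A minimal sketch, with hypothetical table names and an assumed one-hour freshness budget:

```python
from datetime import datetime, timedelta
from typing import Optional

def is_fresh(latest_load: datetime, max_lag: timedelta,
             now: Optional[datetime] = None) -> bool:
    """Return True if the most recent load is within the allowed lag."""
    now = now or datetime.utcnow()
    return (now - latest_load) <= max_lag

# Simulated pipeline metadata: last successful load per table.
last_loads = {
    "events": datetime.utcnow() - timedelta(minutes=30),
    "features": datetime.utcnow() - timedelta(hours=6),
}

# Tables whose pipelines have quietly stopped loading.
stale = sorted(t for t, ts in last_loads.items()
               if not is_fresh(ts, timedelta(hours=1)))
print(stale)  # → ['features']
```

In practice the timestamps would come from pipeline metadata or a `MAX(loaded_at)` query, and the stale list would page someone instead of printing.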
But the problem wasn’t Redshift. That was just the symptom.
The real issue was that we never designed for breadth.
The platform was built to answer one team’s questions. Not to be shared. Not to be standardised. Not to be productised.
So everything depended on tribal knowledge. Every improvement had a cost.
And every time someone asked for “just one more field,” it meant another join or untracked dependency.
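Those untracked dependencies were the real debt: nobody could say what would break when a table changed. Even a crude, explicitly maintained dependency map makes that question answerable. A minimal sketch, with hypothetical table and report names:

```python
# Minimal dependency registry: which downstream outputs read which
# upstream tables. All names here are illustrative.
deps = {
    "churn_report": {"events", "subscriptions"},
    "lead_scores": {"events", "crm_contacts"},
    "ml_features": {"events"},
}

def impacted_by(table: str) -> set:
    """Return every downstream output that reads `table`."""
    return {out for out, sources in deps.items() if table in sources}

print(sorted(impacted_by("events")))
# → ['churn_report', 'lead_scores', 'ml_features']
```

A real platform would derive this from query logs or a lineage tool rather than a hand-edited dict, but even the dict beats tribal knowledge.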
The operating model collapsed under the pressure
Requests came in faster than we could respond. Fire drills became normal. Stakeholders started bypassing us entirely, building one-off workarounds in Google Sheets. That eroded trust.
I was stuck in a loop: part janitor, part architect, part SQL monkey.
The backlog kept growing. Quality dropped.
And the most painful part? I could feel the resentment from teams who thought we were moving too slow, when in fact we were drowning in invisible work.
That was the moment I knew: this wasn’t working.
We weren’t scaling. We were patching. And patching doesn’t scale.
Lesson 1: You’re not building for the user in front of you. You’re building for everyone who comes next
The people who ask for data first are rarely the only ones who need it
The ML team was just the starting point. But I built as if they were the end state.
That was the mistake.
Because in data, success multiplies demand. Every time you answer a question well, five new questions show up. Not from the same team. From others who never knew it was possible.
And those teams don’t wait. They expect the same speed and flexibility.
But you can’t deliver at scale what you hacked together for one team. At least, not without structure.
So your job isn’t to build for the current request.
Your job is to anticipate the pattern behind requests and design for the ten other people who’ll ask for something similar in the next six months.