
Building My Understanding of AI From The Ground Up

I’ve been circling the AI bonfire for about a year and a half now. Built inference infrastructure, wrangled CUDA kernels, sweet-talked MPS into cooperating on my Mac, deployed LLM serving pipelines for both personal projects and professional work. I can make models go. What I can’t always explain is why they go, or why they sometimes don’t, or why they hallucinate about medieval popes when you ask them for a cookie recipe.

It’s time to fix that.

The itch

Here’s the thing about working with AI at the infrastructure layer: you develop a very specific kind of competence. You know how to shard a model across GPUs. You know the difference between FP16 and BF16 and when each one will ruin your afternoon. You know that VRAM is the most precious substance in the known universe, edging out both saffron and printer ink.
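(For the curious: the FP16/BF16 afternoon-ruining comes down to where each format spends its 16 bits. FP16 buys precision, BF16 buys range. Here’s a quick stdlib-only sketch - note the bfloat16 here is simulated by truncating a float32, which is my shortcut, not what real hardware does:)

```python
import struct

def to_fp16(x: float) -> float:
    """Round-trip a value through IEEE 754 half precision (FP16):
    10 mantissa bits, 5 exponent bits, largest finite value 65504."""
    try:
        return struct.unpack('<e', struct.pack('<e', x))[0]
    except (OverflowError, struct.error):
        # Out of FP16's range: saturate to infinity for this demo.
        return float('inf') if x > 0 else float('-inf')

def to_bf16(x: float) -> float:
    """Simulate bfloat16 by keeping only the top 16 bits of a float32:
    7 mantissa bits, but the full 8-bit exponent range of float32.
    (Truncation, not round-to-nearest - a simplification.)"""
    bits = struct.unpack('<I', struct.pack('<f', x))[0]
    return struct.unpack('<f', struct.pack('<I', bits & 0xFFFF0000))[0]

# FP16 has precision but a narrow range: 70000 overflows to infinity.
print(to_fp16(70000.0))     # inf
print(to_bf16(70000.0))     # 69632.0 - in range, but coarsely quantized

# BF16 has range but little precision: 1 + 2**-8 is exact in FP16,
# but gets rounded away entirely in BF16.
print(to_fp16(1.00390625))  # 1.00390625
print(to_bf16(1.00390625))  # 1.0
```

Which is why BF16 tends to ruin your afternoon with silently mushy numbers, and FP16 ruins it with sudden infinities.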

But ask me to derive the backpropagation algorithm from scratch? To explain why attention mechanisms work the way they do? To articulate what makes a transformer different from, say, an RNN in mathematical terms beyond “it’s better”?

Narrator: He could not.

I’ve been driving a very fast car without fully understanding the engine. And while that’s fine for getting from A to B, it’s useless if the engine catches fire and you need to rebuild it from spare parts.

The plan

I’m going to go deeper down the matrix. Yes, that’s a pun. No, I’m not sorry.

The goal is to build a ground-up understanding of the mathematics and theory behind machine learning, deep learning, and large language models. Not just text - I want to understand how these systems process images, audio, video, and whatever modality someone decides to throw at them next. Probably smell. Someone’s definitely working on smell.

Now, a confession. I sucked at maths. Not just in school - the trauma runs deep and spans decades. Childhood maths classes where I’d stare at the blackboard like it was broadcasting in a language I hadn’t unlocked yet. Then college happened, and it got worse. Mathematics-1, Mathematics-2, Numerical Methods, Operations Research - a parade of subjects designed to remind me, semester after semester, that numbers and I were not on speaking terms. I barely passed most of them. Whether the professors took pity on me or were simply too embarrassed to have me as a repeat student, I’ll never know. I choose not to ask.

The one exception was Discrete Mathematics, which I aced. Apparently my brain is fine with maths as long as it’s about logic, sets, and graphs rather than whatever fresh hell Numerical Methods was cooking up.

And now here I am, an engineer who works with computers, doing mathematical calculations for a living, voluntarily signing up to study more maths. The irony isn’t lost on me. It’s the only constant in my life - and yes, I see the maths joke in that.

Here’s what I’m starting with:

  • Mathematics for Machine Learning (the MML book) - because apparently the best way to deal with childhood trauma is to walk straight back into it. Therapists call this exposure therapy. I call it poor judgment.
  • Deep Learning with PyTorch (2nd Edition, by Howard Huang, Eli Stevens, Luca Antiga, et al.) - for training and applying deep learning and generative AI models, the hands-on way.
  • Build a Large Language Model from Scratch by Sebastian Raschka - because nothing says “I understand this” quite like building one from the ground up.

The trajectory is: math foundations first, then deep learning and its algorithms, then LLMs specifically, with a lot of implementation along the way. I learn by building. Always have. Probably always will.

The real reason

Sure, intellectual curiosity is part of it. But let’s be honest about the other part.

AI, at the rate it’s going, is going to eat the planet. Not in a Terminator way - in a “we need another three nuclear power plants just to answer questions about whether a hot dog is a sandwich” way. The compute demands are scaling faster than our ability to power them sustainably. Every major AI company is in an arms race to build bigger models, train on more data, and burn through more electricity, while publishing blog posts about their commitment to sustainability. It’s like watching someone set fire to a forest while tweeting about Earth Day.

I want to understand what’s under the hood well enough to think about what can be improved. Not just “how do I make this 10% faster” but “does this architecture even make sense, or are we brute-forcing our way through problems that have elegant solutions we haven’t found yet?” The current trajectory isn’t sustainable. Someone needs to think about making it better. Might as well be one more person trying.

The ethical bit

There’s another motivation I won’t sugarcoat: I don’t want to keep giving AI companies my money in the long run.

The ethics of the current AI landscape are… let’s go with “complicated.” Training data sourced without consent. Models that perpetuate biases at scale. Companies that preach openness while locking everything behind API paywalls. The whole thing has a vaguely extractive energy that I’m not thrilled about funding indefinitely.

The long-term play is simple: understand the technology deeply enough to build my own. Run my own models. On my own hardware. Answering to nobody’s terms of service.

Is that ambitious? Yes. Is it naive? Maybe. Is it more productive than complaining about it on Twitter? Absolutely.

What this means for the blog

Expect posts about the learning journey. The breakthroughs, the confusions, the moments where a mathematical concept finally clicks and the moments where I stare at a gradient descent derivation for three hours and question every life choice that led me here.

I’ll write about what I learn, what I build, and what I think can be done differently. If I’m lucky, some of it will be useful to others walking the same path. If I’m unlucky, it’ll at least be entertaining in the way that watching someone learn to juggle chainsaws is entertaining.

Either way, we’re going deeper.

Time to see how far the rabbit hole goes.

This post is licensed under CC BY 4.0 by the author.