
From Code to Clarity: When Machine Learning Starts Making Sense



Ground Zero

When I first encountered machine learning concepts, it felt like opening a textbook written in hieroglyphics—or maybe just staring at a wall of advanced math formulas. As a software engineer, I’m used to reading code and immediately understanding the flow—functions, loops, conditionals. But diving into ML? That was a different story. The starting point for many, myself included, is often something like linear regression. Take this formula for example:

$$
L(\beta, \sigma^2 \mid X, y) = (2\pi\sigma^2)^{-\frac{n}{2}} \exp\left(-\frac{1}{2\sigma^2} \sum_{i=1}^{n} (y_i - x_i^T \beta)^2\right)
$$

That first formula you meet is a likelihood function, and honestly, it still feels foreign. Strip away the probabilistic machinery, though, and the underlying model is just f(w, b)(x) = wx + b, a simple linear function. It takes an input, applies a transformation, and gives an output. In many ways, it’s no different from the data transformations we write in APIs or algorithms every day.
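To see that connection in code, here’s a minimal sketch (toy numbers of my own, not a real dataset) of the linear model and the negative log of the likelihood above. Taking the log turns the formula into a constant plus the familiar sum of squared errors.

```python
import numpy as np

def predict(x, w, b):
    """The stripped-down linear model: f(x) = w*x + b."""
    return w * x + b

def negative_log_likelihood(w, b, sigma2, x, y):
    """-log of the Gaussian likelihood above: a constant plus the sum of squared errors."""
    n = len(y)
    residuals = y - predict(x, w, b)
    return 0.5 * n * np.log(2 * np.pi * sigma2) + np.sum(residuals**2) / (2 * sigma2)

# Toy numbers only: y is roughly 2x + 1 with a bit of noise
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([3.1, 4.9, 7.2, 8.8])
print(negative_log_likelihood(w=2.0, b=1.0, sigma2=0.25, x=x, y=y))
```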

As I’ve tried to wrap my head around supervised learning, I found myself drawn to text classification. It’s definitely different from linear regression, but at the end of the day, they’re both just examples of supervised learning. With linear regression, you’re predicting a number—like figuring out house prices based on square footage (something we’ve all done while aimlessly browsing Zillow). With text classification, you’re still making predictions, but now it’s about picking categories—like figuring out what plant type matches a description.
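Here’s a tiny sketch of that difference using scikit-learn, with made-up numbers and descriptions; the only point is that regression predicts a number while classification predicts a label.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LinearRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Regression: predict a number (price) from square footage -- toy values
reg = LinearRegression().fit([[800], [1200], [2000]], [150_000, 220_000, 360_000])
print(reg.predict([[1500]]))  # -> a dollar amount

# Classification: predict a category (plant type) from a description -- toy labels
clf = make_pipeline(CountVectorizer(), MultinomialNB())
clf.fit(
    ["thick fleshy leaves that store water", "tall stalk with a large yellow flower"],
    ["Aloe Vera", "Sunflower"],
)
print(clf.predict(["spiky succulent leaves full of gel"]))  # -> a label
```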

Visual Learning as a Bridge

What really helped everything start to click for me was exploring probability visually. Mermaid flowcharts, Brilliant, and Seeing Theory by Daniel Kunin were great companions on my learning path. These tools let you tweak parameters and see how they affect outcomes in real time. It wasn’t just about reading concepts—it was about playing with them and seeing the impact on graphs.

This hands-on approach made abstract ideas, like variance, feel practical. It reminded me of debugging with charts or metrics in engineering—adjusting variables, analyzing the results, and figuring out what’s going on.

The ML workflow even felt familiar—it reminded me of how we handle data in software engineering. On the backend, we often preprocess or normalize the shape of data before passing it to the frontend. Machine learning pipelines follow a similar flow:

```mermaid
flowchart LR
    D[Training Data] --> P[Preprocessing]
    P --> F[Feature Extraction]
    F --> T[Training]
    T --> M[Model]
    M --> Pred[Predictions]
```
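As a rough sketch, here’s how those boxes might map onto a scikit-learn text pipeline. The descriptions and labels below are placeholders I made up, not my actual data.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

# Placeholder training data -- three descriptions, three labels
texts = [
    "thick fleshy leaves that store water",
    "large yellow flower head that follows the sun",
    "small white petals around a yellow center",
]
labels = ["Aloe Vera", "Sunflower", "Daisy"]

# Preprocessing + feature extraction -> TfidfVectorizer (tokenizing, lowercasing, weighting)
# Training -> LogisticRegression.fit
pipeline = Pipeline([
    ("features", TfidfVectorizer(lowercase=True, stop_words="english")),
    ("model", LogisticRegression(max_iter=1000)),
])
pipeline.fit(texts, labels)                                   # Training Data -> Model
print(pipeline.predict(["spiky succulent with gel inside"]))  # Model -> Predictions
```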

Starting with Plants: An Unexpected Teacher

When I needed a project to practice with, plants turned out to be the perfect choice. Not because I’m an expert gardener (I do garden as a hobby!), but because plants are basically walking datasets. They come with:

Let’s break it down with a real example: the Aloe Vera plant, one I have in my own home.

Features as Data Points:

These features gave me a starting point to think about how to structure my problem.
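Purely for illustration, here’s how those features might be jotted down as a single record; the exact fields are my own sketch, not a fixed schema.

```python
# Illustrative only -- the fields below are my own sketch, not a fixed schema
aloe_vera = {
    "description": "succulent with thick, fleshy leaves that store water",
    "sunlight": "bright, indirect light",
    "watering": "every two to three weeks",
    "label": "Aloe Vera",  # the category a classifier would learn to predict
}
```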

Starting with Data and Resources

I started with Hugging Face to grab some data and test whether I could even get a model running locally. What I discovered was more than just a repository—it’s a hub for datasets, tools, and a solid community. Think GitHub, but for machine learning.
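As a rough example of what “grab some data and get a model running locally” can look like with the Hugging Face libraries (the CSV file and column names are placeholders, and zero-shot classification is just one easy way to confirm a model runs at all):

```python
from datasets import load_dataset
from transformers import pipeline

# Placeholder file and column names -- swap in whatever dataset you actually pull
dataset = load_dataset("csv", data_files="plant_descriptions.csv")

# Zero-shot classification is one quick way to confirm a model runs locally
classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")
result = classifier(
    dataset["train"][0]["description"],
    candidate_labels=["Aloe Vera", "Sunflower", "Daisy"],
)
print(result["labels"][0], result["scores"][0])
```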

Specific resources that helped me:

Defining Requirements First

Before diving into the code, I stepped back to define my requirements. I started mapping out scenarios like:

What questions and answers are we looking for?

This approach felt natural because it mirrors software engineering: start with the problem, design the interface, and define the expected output. Instead of writing explicit logic for each case, the idea is to design examples that guide the model in learning patterns—almost like TDD, where the focus is on expected outcomes and functionality before implementation.
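In that TDD spirit, here’s a sketch of what “examples first” might look like; the cases below are invented for illustration, not my real requirements.

```python
# Expected inputs and outputs written down before any model code -- like TDD.
# These cases are invented for illustration.
expected_cases = [
    ("thick fleshy leaves with gel inside", "Aloe Vera"),
    ("tall stem topped by a large yellow bloom", "Sunflower"),
    ("white petals radiating from a yellow disc", "Daisy"),
]

def check_model(predict):
    """Run each case through a predict(text) -> label function and collect failures."""
    return [
        (text, expected, predict(text))
        for text, expected in expected_cases
        if predict(text) != expected
    ]
```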

Learning From Failure and Moving Forward

My first attempt at plant classification? 22% accuracy—barely better than guessing! But that failure taught me more than any tutorial. It showed me that getting the program running and debugging it was a step forward in itself.

To get started, I needed:

My process? Play with the APIs, print everything, check the classification report, break things, and repeat. When I hit something I didn’t understand, I bounced between YouTube, Coursera courses, and LLMs to break down concepts.
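The classification report I kept checking is the kind scikit-learn produces; here’s a minimal sketch with made-up labels and predictions, just to show the per-class breakdown it gives you.

```python
from sklearn.metrics import accuracy_score, classification_report

# Made-up labels and predictions, only to show the shape of the output
y_true = ["Aloe Vera", "Sunflower", "Daisy", "Aloe Vera", "Daisy"]
y_pred = ["Aloe Vera", "Daisy",     "Daisy", "Sunflower", "Daisy"]

print(accuracy_score(y_true, y_pred))                          # one headline number
print(classification_report(y_true, y_pred, zero_division=0))  # per-class precision/recall/F1
```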

One particularly helpful strategy was using LLMs as a reference point. I’d take my test cases and run them through advanced language models to see how they classified plants. This gave me a “north star” to aim for: if my model could get close to matching their outputs, I’d consider it progress. For example, if the LLM predicted “Sunflower” for a description but my model guessed “Daisy,” I could dive in to explore where and why my model diverged.
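A sketch of that comparison is below; the labels are invented, and in practice the reference labels would come from prompting a hosted language model with the same descriptions.

```python
# Invented comparison data -- in practice "llm_labels" would come from
# prompting a hosted language model with the same plant descriptions.
descriptions = [
    "tall stem with a large yellow flower",
    "white petals around a yellow center",
]
llm_labels = ["Sunflower", "Daisy"]   # reference ("north star") labels
my_labels  = ["Daisy", "Daisy"]       # my model's guesses

for text, llm, mine in zip(descriptions, llm_labels, my_labels):
    marker = "OK  " if llm == mine else "DIFF"
    print(f"{marker} {text!r}: llm={llm}, mine={mine}")

agreement = sum(l == m for l, m in zip(llm_labels, my_labels)) / len(llm_labels)
print(f"agreement with the LLM reference: {agreement:.0%}")
```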

The Pattern Recognition Click

The real breakthrough came when I started seeing how supervised learning mirrors the way we naturally learn patterns. Just like recognizing a song from years ago within the first few beats because of its rhythm and pattern, supervised learning is about finding patterns in labeled examples drawn from past data.

The difference? Scale and precision. Machines can process thousands of examples and find subtle patterns we might overlook.

What’s Next?

I’ll continue experimenting and building on what I’ve learned so far, but with a more practical focus on my text classification model.

My next steps:

My end goal hasn’t changed—exploring AI and machine learning, picking up a new language, and possibly building something on top of an LLM, even if it isn’t my own model.

“The most important attitude that can be formed is that of desire to go on learning.”
― John Dewey, Experience and Education

