My Journey Using AI: Building Models and Leveraging Pre-Trained Solutions

May 07, 2024

Over the past few years, I've been fortunate enough to explore and utilize AI in various ways, from building models from scratch to harnessing pre-trained solutions for solving new problems. Here's a quick highlight of my journey.

Building a CNN for Audio Classification

My first foray into AI involved building a convolutional neural network (CNN) to classify sounds. I used the librosa library to manipulate the waveforms and generate spectrograms of short sound fragments, which served as image-like inputs to the network. This hands-on project, which began in 2019, led to a series of YouTube videos in which I documented my progress. One of the most popular videos from that series has garnered over 14,000 views. You can check it out here.

Coming from a software development background, this project was a way for me to learn more about machine learning. I dove deep into the math behind machine learning and consumed content by Andrew Ng and other AI thought leaders. Through these resources, I learned about fundamental concepts like gradient descent and different architectures such as RNNs, CNNs, and Autoencoders.
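Gradient descent is easier to appreciate with a tiny example. The sketch below is purely illustrative and is not taken from any course material: it minimizes the one-variable function f(x) = (x - 3)^2 by repeatedly stepping against its gradient. The function name gradient_descent, the learning rate, and the step count are my own assumptions for the example.

```python
def gradient_descent(grad, start, learning_rate=0.1, steps=100):
    """Minimize a one-variable function given its gradient function."""
    x = start
    for _ in range(steps):
        # Move a small step in the direction that decreases the function.
        x -= learning_rate * grad(x)
    return x


# f(x) = (x - 3)^2 has gradient 2 * (x - 3) and its minimum at x = 3.
minimum = gradient_descent(lambda x: 2 * (x - 3), start=0.0)
print(minimum)
```

Training a neural network follows the same idea, except that x becomes millions of weights and the gradient is computed by backpropagation.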

Leveraging Pre-Trained Models and Transfer Learning

After some time focused on other development work, I returned to AI in 2022, this time with a new set of challenges. Working with a team that communicated mostly via videos on a proprietary platform, I needed a way to keep track of decisions and discussions. I wrote programs that captured system audio and then used pre-trained Hugging Face models for tasks such as converting speech to text and performing sentiment analysis.

During this period, I realized just how much the AI landscape had progressed since 2019, particularly in terms of accessibility. Transfer learning had become a popular approach, enabling developers like me to take a model pre-trained on a similar problem and fine-tune it for our specific needs. The availability of pre-trained models made it easier to build on existing solutions rather than starting from scratch. Transfer learning works well because pre-trained models have already learned a wealth of general features, so less data and training time are required to adapt them to new but related tasks.

Using AI Services for Code Generation

More recently, models and services such as GPT-4 (via ChatGPT), Gemini 1.5, and Llama 3 have made it even easier to apply AI to real-world problems, particularly for generating code. These tools are built on state-of-the-art large language models and give developers a quick way to draft, explain, and refine code.

I'll cover these approaches in more detail in separate blog posts, so stay tuned for those. However, I wanted to provide this overview of how my journey with AI has evolved and how embracing new tools and pre-trained models has unlocked new possibilities.

Devin Venable