CNNs 101: How do they work? (Part 2 of 3)

A continuation of the series on convolutional basics. In this post I provide some useful resources on learning about the inner workings of convolutional neural networks. In the next post I will be discussing hyperparameter selection and optimization.

CNNs 101: How do they work? (Part 1 of 3)

How the hell do convolutional neural nets work? What even is a convolution? If these are some of the questions you've found yourself asking, you've come to the right place. In this series of blog posts, I will be attempting to unpack the mechanisms involved in convolutional neural nets (CNNs) and make them accessible to a wide audience.

This is the first in a three-part series. In this post I will be covering the basics of CNNs, focusing on the structure and mechanisms involved. After reading this post, you should be able to at least hold a simple discussion about CNNs with your computer science friend who just finished their third year of undergrad and wants to create an AI start-up focusing on using deep learning to create sentient candles. They're your friend, not mine...

So what even is machine learning?

Welcome back!

In my previous post, I told you a little bit about myself and what I hope this site will be. After I published last week's post, I realized that not everyone who visits the site knows what machine learning and neural networks are. So for this week's post, I thought I should first briefly go over the general topic of machine learning and then give a very brief introduction to deep learning / deep neural nets.

Machine learning has become quite popular recently. In particular, more and more start-ups are adding machine learning to their list of expertise, and larger companies are looking to use it in their businesses. I still haven't addressed the question though: What even is machine learning? Thankfully, Jason Brownlee at machinelearningmastery has a very thorough post that answers exactly this question, so I won't reinvent the wheel. Essentially, machine learning algorithms are used to automate the process of extracting patterns from large, or not-so-large, sets of data.
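
To make "extracting patterns from data" a little more concrete, here is a minimal sketch in plain Python. It fits a straight line to a handful of points using ordinary least squares, which is about the simplest pattern-finding procedure there is. The `fit_line` helper and the hours/scores numbers are made up purely for illustration; they don't come from any real dataset or library.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How x and y vary together, and how much x varies on its own.
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up data: the scores were generated as 50 + 2 * hours.
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
scores = [52.0, 54.0, 56.0, 58.0, 60.0]

slope, intercept = fit_line(hours, scores)
print(f"score = {slope:.1f} * hours + {intercept:.1f}")
```

We never told the code the rule behind the numbers; it recovered the relationship from the data alone. Real machine learning algorithms do the same kind of thing, just with far messier data and far more flexible models.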

In basic practice, you have some data that have been collected and a question that needs to be answered. You believe there is a potentially hidden relationship between the data and the question that a machine learning algorithm can reveal. Unfortunately for you, a large number of algorithms fall under the broad term of machine learning. Fortunately for you, the Internet is a thing, and there are sources that can help you determine the kind of machine learning algorithm you might want to apply to your data. One of the most popular modules for machine learning in Python is scikit-learn and yes, they have a neat little map that helps you decide which algorithm to use given the data you have. Additionally, Jason Brownlee has a post on his website that divides machine learning algorithms into several distinct classes and subclasses, which gives you several different ways to decide which algorithm to use.
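
If you want a feel for what using scikit-learn looks like once you've picked an algorithm, here is a rough sketch. The choices in it are my own assumptions for illustration, not recommendations from the scikit-learn map: it uses the small iris flower dataset that ships with the library and a k-nearest-neighbors classifier, and it needs scikit-learn installed.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# The data (flower measurements) and the question (which species is it?).
X, y = load_iris(return_X_y=True)

# Hold back a quarter of the data to check how well the model generalizes.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Assumed choice of algorithm for this sketch: k-nearest neighbors.
model = KNeighborsClassifier(n_neighbors=5)
model.fit(X_train, y_train)

# Fraction of held-out flowers that were classified correctly.
accuracy = model.score(X_test, y_test)
print(f"Test accuracy: {accuracy:.2f}")
```

The nice part is that most scikit-learn models share the same `fit` and `score` pattern, so swapping in a different algorithm usually means changing only the line that creates `model`.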

So you may be saying to yourself, "Nick, you still haven't told me anything about deep learning." Well, yes and no. Deep learning is actually a subset of algorithms within the umbrella term of machine learning. I suppose deep learning is a sub-umbrella term. Many of the algorithms that fall under deep learning can be grouped into two broad architectures: recurrent neural networks (RNNs) and convolutional neural networks (CNNs). I plan on writing a post specifically for each of these architectures, so I won't go into detail here other than to say what their (very) general strengths are. In general, RNNs are very adept at learning temporal relationships, while CNNs are very good at learning spatial relationships. I'll leave you to ponder these strengths for a bit. Next week, I plan on discussing CNNs a bit, sharing sources that I find helpful, as well as some cool everyday applications of CNNs.

This is Nick, signing off...