Machine learning 101: What ML is and how it works
February 16, 2024
Once upon a time (and not that long ago), we thought the best way to solve a problem with software was to make an expert system. We’d find an expert and ask them a lot of questions, write up some rules, and then write a program to apply those rules in just the right order. This approach captures human knowledge in a program, but only in a very narrow way.
Sometimes it worked, but other times we would leave out an important rule or use the wrong expert. These systems were static, meaning that they could not get any smarter or adjust if things changed.
Enter machine learning.
What is machine learning (ML)?
Machine learning (ML) is the idea that if you have a large collection of data, instead of finding some expert to make up rules, you let the data teach the system what is meaningful. Machine learning allows a computer to find patterns. You feed in examples you already have, and the machine learns to make predictions about new ones. If you keep training the resulting model on new data, it can keep getting smarter. More data generally helps, and the more diverse and representative the data, the better the model tends to perform.
How does machine learning work?
One popular family of machine learning models is the neural network: a set of nodes in a program that act a bit like brain cells. These nodes get excited by specific things. The magic is that no one tells each of these nodes what to get excited about! The model itself works out what is important by looking repeatedly at a large number of examples, detecting patterns, and determining which ones matter.
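To make the idea of a node that learns what to get excited about a little more concrete, here is a minimal sketch of a single artificial node in plain Python. The toy task (fire when at least one of two inputs is on), the function names, the learning rate, and the number of passes are all assumptions chosen for illustration; real networks combine many such nodes and are normally built with dedicated libraries.

```python
import math

def sigmoid(z):
    """Squash any number into the range 0..1 (how 'excited' the node is)."""
    return 1.0 / (1.0 + math.exp(-z))

def train_node(examples, passes=2000, learning_rate=0.5):
    """Learn two input weights and a bias by repeatedly nudging them."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(passes):
        for inputs, target in examples:
            output = sigmoid(weights[0] * inputs[0] + weights[1] * inputs[1] + bias)
            error = output - target
            # Nudge each weight against the error it contributed to.
            weights[0] -= learning_rate * error * inputs[0]
            weights[1] -= learning_rate * error * inputs[1]
            bias -= learning_rate * error
    return weights, bias

# Toy data (an assumption): the node should fire when at least one input is on.
examples = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
weights, bias = train_node(examples)

def node_output(inputs):
    return sigmoid(weights[0] * inputs[0] + weights[1] * inputs[1] + bias)

for inputs, target in examples:
    print(inputs, "target:", target, "node says:", round(node_output(inputs), 2))
```

Nobody typed in the final weights. They start at zero and drift toward useful values purely because of the examples the node was shown.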
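A neural network is often treated as a black box: we give it inputs and it generates an output that is usually right, but it is hard to say exactly what happens inside the box. (Simpler models, such as the decision trees described below, are much easier to inspect.)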
There is a subset of machine learning called deep learning. The deep part refers to using deeper neural networks, with many more layers of nodes stacked in complicated arrangements. As you might guess, deep learning is used to tackle more complicated problems.
Let’s start by exploring non-deep machine learning and see what it can do!
Machine learning use cases
What is machine learning good for in the real world? Here are some examples, each using a different flavor of machine learning, that help marketers, healthcare professionals, and finance professionals do their jobs faster and more effectively:
1. Real estate value
You can predict house prices based on factors like size, location, and number of bedrooms. Linear regression models can analyze these variables to predict prices.
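As a rough sketch of what fitting a linear regression involves, the snippet below fits a straight line to a handful of made-up sales using ordinary least squares. It is simplified to a single factor (size), the numbers are invented for illustration, and `fit_line` is a hypothetical helper name; a real model would use more factors and far more data.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one input: returns (slope, intercept)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    return slope, mean_y - slope * mean_x

# Made-up training data: size in square feet, sale price in dollars.
sizes = [1000, 1500, 2000, 2500]
prices = [155_000, 195_000, 255_000, 295_000]

slope, intercept = fit_line(sizes, prices)
predicted = slope * 1800 + intercept

print(f"about ${slope:.0f} per extra square foot")
print(f"predicted price for 1,800 sq ft: ${predicted:,.0f}")
```

The model never sees a rule like "bigger houses cost more". It recovers that relationship, and how strong it is, from the examples.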
2. Medical diagnosis based on symptoms
A decision tree can help doctors diagnose diseases by following a tree-like model of decisions and their possible consequences.
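Here is a minimal sketch of how such a tree could be learned from examples rather than written by hand. The symptoms, diagnoses, and patient records are invented toy data (not medical guidance), and the helper names are assumptions. The sketch picks each question by Gini impurity, one common splitting measure among several.

```python
from collections import Counter

def gini(labels):
    """Impurity of a group: 0 means everyone has the same diagnosis."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def majority(labels):
    return Counter(labels).most_common(1)[0][0]

def build_tree(rows, labels, features):
    # Stop when the group is pure or there is nothing left to ask about.
    if len(set(labels)) == 1 or not features:
        return majority(labels)

    def split_cost(feature):
        cost = 0.0
        for answer in (True, False):
            group = [l for r, l in zip(rows, labels) if r[feature] == answer]
            if group:
                cost += len(group) / len(labels) * gini(group)
        return cost

    best = min(features, key=split_cost)  # the question that separates best
    remaining = [f for f in features if f != best]
    branches = {}
    for answer in (True, False):
        keep = [i for i, r in enumerate(rows) if r[best] == answer]
        if keep:
            branches[answer] = build_tree([rows[i] for i in keep],
                                          [labels[i] for i in keep], remaining)
        else:
            branches[answer] = majority(labels)
    return (best, branches)

def predict(tree, row):
    while isinstance(tree, tuple):  # keep answering questions until a leaf
        feature, branches = tree
        tree = branches[row[feature]]
    return tree

# Invented toy records: symptoms observed, and the diagnosis that was made.
rows = [
    {"fever": True,  "cough": True,  "sneezing": False},
    {"fever": True,  "cough": False, "sneezing": False},
    {"fever": True,  "cough": True,  "sneezing": True},
    {"fever": False, "cough": True,  "sneezing": False},
    {"fever": False, "cough": True,  "sneezing": True},
    {"fever": False, "cough": False, "sneezing": True},
]
labels = ["flu", "flu", "flu", "cold", "cold", "allergy"]

tree = build_tree(rows, labels, ["fever", "cough", "sneezing"])
print(tree)
print(predict(tree, {"fever": False, "cough": False, "sneezing": True}))
```

On this toy data the learned tree asks about fever first and cough second, and the result can be read and checked by a person, which is a large part of the appeal of decision trees.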
3. Email spam detection
Support vector machines (SVMs) can classify emails as spam or legitimate by learning from the characteristics of known spam and non-spam emails. An SVM looks for the boundary that separates the classes of data with the widest possible margin.
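The sketch below shows the core mechanic in a deliberately simplified form: a straight-line classifier trained with the hinge loss, the loss function used by linear SVMs. To stay short it leaves out the regularization term that a full SVM uses to search for the widest margin, so it stops at the first line that keeps every example at a margin of at least 1. The two features, the data, and the function names are assumptions made for illustration.

```python
def train_hinge_classifier(points, labels, max_passes=1000, learning_rate=0.1):
    """Find weights w and bias b so every example has margin y*(w.x + b) >= 1."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(max_passes):
        updated = False
        for x, y in zip(points, labels):
            margin = y * (w[0] * x[0] + w[1] * x[1] + b)
            if margin < 1:  # too close to the line, or on the wrong side of it
                w = [w[0] + learning_rate * y * x[0],
                     w[1] + learning_rate * y * x[1]]
                b += learning_rate * y
                updated = True
        if not updated:  # every example is safely on its own side
            break
    return w, b

# Made-up features per email: (number of links, number of ALL-CAPS words).
emails = [(8, 6), (7, 9), (9, 7), (1, 0), (0, 2), (2, 1)]
labels = [1, 1, 1, -1, -1, -1]  # +1 = spam, -1 = legitimate

w, b = train_hinge_classifier(emails, labels)

def is_spam(email):
    """Which side of the learned line does this email fall on?"""
    return w[0] * email[0] + w[1] * email[1] + b > 0

print([is_spam(email) for email in emails])
```

A production spam filter would use many more features and a library implementation that includes the regularization step.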
4. Shopping recommendations
Customers’ purchase histories can be used to recommend products on an e-commerce site. K-nearest neighbors can suggest products that similar customers have bought. This technique looks at the ‘K’ closest points (customers with similar histories) and makes predictions based on their behaviors.
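A small sketch of that idea, using invented customers and products: it measures how similar two purchase histories are with Jaccard distance (one reasonable choice for sets of items, and an assumption here), finds the K closest customers, and lets them vote on what to recommend.

```python
from collections import Counter

# Invented purchase histories.
histories = {
    "ana":  {"tent", "sleeping bag", "stove"},
    "ben":  {"tent", "sleeping bag", "lantern"},
    "cara": {"tent", "stove", "map"},
    "dev":  {"novel", "bookmark"},
    "eli":  {"novel", "reading lamp"},
}

def jaccard_distance(a, b):
    """0.0 for identical baskets, 1.0 for baskets with nothing in common."""
    return 1.0 - len(a & b) / len(a | b)

def recommend(basket, histories, k=3):
    # The K customers whose histories look most like this basket.
    neighbors = sorted(histories,
                       key=lambda name: jaccard_distance(basket, histories[name]))[:k]
    # Each neighbor votes for the items the shopper does not own yet.
    votes = Counter(item
                    for name in neighbors
                    for item in histories[name] - basket)
    return neighbors, [item for item, _ in votes.most_common()]

neighbors, suggestions = recommend({"tent", "sleeping bag"}, histories)
print(neighbors)
print(suggestions)
```

With this data the three nearest neighbors are the campers, and the stove comes out on top because two of them bought one.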
5. Credit scoring
Logistic regression can be used to predict the probability of a customer defaulting on a loan. Logistic regression is used for binary (yes or no) classification problems (like default or no default).
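To illustrate how a logistic regression model turns a customer’s details into a probability, here is a sketch that applies an already-trained model. The two input factors and all three coefficients are hypothetical values invented for this example, not numbers fitted to real lending data.

```python
import math

# Hypothetical coefficients, invented for illustration (not fitted to real data).
BIAS = -3.0
WEIGHT_DEBT_TO_INCOME = 4.0
WEIGHT_MISSED_PAYMENTS = 0.8

def default_probability(debt_to_income, missed_payments):
    """Weighted sum of the inputs, squashed into a probability between 0 and 1."""
    score = (BIAS
             + WEIGHT_DEBT_TO_INCOME * debt_to_income
             + WEIGHT_MISSED_PAYMENTS * missed_payments)
    return 1.0 / (1.0 + math.exp(-score))

low_risk = default_probability(debt_to_income=0.2, missed_payments=0)
high_risk = default_probability(debt_to_income=0.6, missed_payments=3)

print(f"{low_risk:.2f}")   # 0.10
print(f"{high_risk:.2f}")  # 0.86

# A yes/no answer comes from comparing the probability with a threshold.
print(high_risk > 0.5)     # True
```

Training the model means finding the coefficients that make these probabilities match past outcomes as closely as possible.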
6. Sentiment analysis in product reviews
Naïve Bayes classifiers can classify reviews as positive, negative, or neutral by estimating how likely each word is to show up in each kind of review.
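The following sketch shows one way this can work: a tiny multinomial Naïve Bayes classifier with add-one smoothing, trained on six invented reviews. The reviews, labels, and function names are assumptions for illustration, and a real system would train on thousands of reviews.

```python
import math
from collections import Counter, defaultdict

def train(examples):
    """Count how often each word appears in each class of review."""
    word_counts = defaultdict(Counter)
    class_counts = Counter()
    for text, label in examples:
        word_counts[label].update(text.split())
        class_counts[label] += 1
    vocabulary = {word for counts in word_counts.values() for word in counts}
    return word_counts, class_counts, vocabulary

def classify(text, model):
    word_counts, class_counts, vocabulary = model
    total_reviews = sum(class_counts.values())
    scores = {}
    for label, review_count in class_counts.items():
        # Start from how common the class is, then add evidence word by word.
        score = math.log(review_count / total_reviews)
        words_in_class = sum(word_counts[label].values())
        for word in text.split():
            if word in vocabulary:  # ignore words never seen in training
                # Add-one smoothing keeps an unseen word/class pair from ruling a class out.
                score += math.log((word_counts[label][word] + 1)
                                  / (words_in_class + len(vocabulary)))
        scores[label] = score
    return max(scores, key=scores.get)

# Invented training reviews.
reviews = [
    ("great product works well", "positive"),
    ("love it great value", "positive"),
    ("terrible quality broke fast", "negative"),
    ("awful waste of money", "negative"),
    ("it is okay nothing special", "neutral"),
    ("average product does the job", "neutral"),
]
model = train(reviews)

print(classify("great value", model))
print(classify("terrible waste of money", model))
print(classify("okay average", model))
```

The "naïve" part is the assumption that each word counts as independent evidence, which is rarely true of real language but often works well enough in practice.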
In general, the more data you can feed a model, the better its predictions, so living in the age of big data plays right into the hands of machine learning.
Machine learning and GPUs
Another thing that enables machine learning is graphics processing unit (GPU) chips, originally developed to help video games run quickly! Why, you might ask? It turns out that the math behind machine learning is all about very rapidly handling vectors.
Think of vectors as arrows pointing in a specific direction and of a specific length. Video games generate their amazing visuals by doing math on huge numbers of these vectors.
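In code, an arrow like this is just a short list of numbers. As a quick illustration (the specific arrow is an arbitrary choice), here is how its length and direction can be recovered from those numbers:

```python
import math

arrow = (3.0, 4.0)  # 3 units to the right, 4 units up

length = math.hypot(*arrow)  # Pythagoras: sqrt(3^2 + 4^2)
direction = tuple(component / length for component in arrow)  # same arrow, scaled to length 1

print(length)     # 5.0
print(direction)  # (0.6, 0.8)
```

The data inside a machine learning model is handled the same way, except that the vectors can have thousands of components instead of two.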
As video gamers have insisted on increasingly life-like visuals, the science community has benefitted from faster GPUs, because a GPU can perform many vector calculations in parallel.
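To see why this kind of math suits parallel hardware, consider a small sketch of a typical step inside a model: multiplying a table of weights by a vector of inputs. The numbers are arbitrary and the code runs on an ordinary CPU. The point is only that no row depends on any other row’s result.

```python
def dot(a, b):
    """Multiply matching components and add them up."""
    return sum(x * y for x, y in zip(a, b))

weights = [
    [1, 0, 2],
    [0, 3, 1],
    [4, 1, 0],
]
inputs = [2, 1, 3]

# Each row's result is independent of the others, so in principle
# all of them could be computed at the same moment.
outputs = [dot(row, inputs) for row in weights]
print(outputs)  # [8, 6, 9]
```

Python works through the rows one after another here. A GPU can hand each row (and each multiplication within it) to a different core, which matters when there are millions of rows instead of three.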
It has been estimated that it would take about 355 years to train GPT-3, the model that ChatGPT grew out of, on a single GPU. But because the work was reportedly spread across something like 25,000 GPUs running in parallel, it took only a matter of days. Machine learning models that would take months (or longer) to train on traditional computer CPUs can finish in hours on fast GPUs.
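The arithmetic behind "a matter of days" is easy to check using the two figures quoted above. This back-of-the-envelope sketch assumes the work divides perfectly across the GPUs, which real training runs never quite achieve:

```python
single_gpu_years = 355  # estimated training time on one GPU
gpu_count = 25_000      # approximate number of GPUs used in parallel

# Idealized: assumes perfect scaling with no communication overhead.
days = single_gpu_years * 365 / gpu_count
print(f"{days:.1f} days")  # 5.2 days
```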
Without gamers, we might still be years away from having ChatGPT.
Can machine learning solve complex problems?
The use cases above were previously impossible or very expensive to solve, but they all return simple answers: a single number (what is this house worth?), a yes or no (is it spam or not? will Bob default or not?), or one of a handful of labels.
There are far more complicated problems we need to solve, like getting computers to understand English, finding the right moment in a movie from a verbal description, or building and training a chatbot like the recently famous ChatGPT.
This is where deep learning, the subset of machine learning mentioned earlier, comes in: it is better suited to tackling harder problems with far more complicated answers.
Deeper problems deserve deeper machine learning models!