What is Machine Learning (ML)?
Machine learning is about learning patterns within datasets and modeling them in order to make useful predictions and answer difficult problems.
Academic ML vs applied ML
Academic ML obsesses over algorithms and the math behind each one, while applied ML focuses on practical results.
To succeed in data science, it's more important to understand the end-to-end framework—plus the practical tools used in each step—than to obsess over the math and theory behind each algorithm.
High-level framework for machine learning
To create real-world business value with ML, the most important thing is to have a comprehensive framework.
At a very high level, it consists of 5 core steps.
1) Exploratory Analysis
Exploratory Analysis is the process of "getting to know" the dataset before you begin your modeling or other analyses. It consists of plotting key charts, displaying key statistics, and digging into the dataset—often into individual observations—to make sure you have everything you need to complete your project.
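As a rough illustration of "getting to know" a dataset, here is a minimal sketch using pandas. The housing-style table and its column names (`sqft`, `bedrooms`, `city`, `price`) are made up for this example, and charts are left out to keep it short.

```python
import pandas as pd

# Hypothetical toy dataset; the columns and values are invented for illustration.
df = pd.DataFrame({
    "sqft": [850, 1200, 1500, None, 2100, 950],
    "bedrooms": [2, 3, 3, 4, 4, 2],
    "city": ["Austin", "Austin", "Dallas", "Dallas", "Austin", None],
    "price": [180_000, 250_000, 310_000, 400_000, 450_000, 195_000],
})

# Key statistics (count, mean, min, max, quartiles) for the numeric columns.
summary = df.describe()

# Number of missing values in each column.
missing = df.isna().sum()

# Distribution of a categorical column, counting missing values too.
city_counts = df["city"].value_counts(dropna=False)

# Dig into individual observations, e.g. the three most expensive rows.
top_rows = df.sort_values("price", ascending=False).head(3)

print(summary)
print(missing)
print(city_counts)
print(top_rows)
```

Even on a table this small, the missing-value counts and the individual rows already hint at what the cleaning step will need to handle.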
2) Data Cleaning
In real-world problems, better data beats fancier algorithms every single time. Garbage in gets you garbage out. On the flip side, if you have a clean dataset, even simple algorithms can learn useful insights from it. While it's not the "sexiest" part of machine learning, proper data cleaning will make or break your project.
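What "cleaning" means depends on the dataset, but a few steps come up often. The sketch below assumes a hypothetical table with inconsistent labels, a duplicate row, and missing values. Filling with the median and adding a "missing" flag is one common choice, not the only reasonable one.

```python
import pandas as pd

# Hypothetical messy dataset; the columns and values are invented for illustration.
df = pd.DataFrame({
    "sqft": [850, 1200, 1200, None, 2100, 950],
    "city": ["Austin", "austin ", "austin ", "Dallas", "Austin", None],
    "price": [180_000, 250_000, 250_000, 400_000, 450_000, 195_000],
})

# 1. Fix structural errors: stray whitespace and inconsistent capitalization.
df["city"] = df["city"].str.strip().str.title()

# 2. Remove exact duplicate observations.
df = df.drop_duplicates().reset_index(drop=True)

# 3. Flag missing numeric values, then fill them with the column median.
df["sqft_missing"] = df["sqft"].isna().astype(int)
df["sqft"] = df["sqft"].fillna(df["sqft"].median())

# 4. Label missing categories explicitly instead of dropping the rows.
df["city"] = df["city"].fillna("Missing")

print(df)
```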
3) Feature Engineering
Feature engineering is the process of creating new input features using your dataset. This is one of the best ways data scientists add value to the ML process and improve model results, as you're able to incorporate domain knowledge with feature engineering.
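Here is a small sketch of what creating new features can look like in pandas. The columns and the derived features (`age_at_sale`, `sqft_per_bedroom`) are hypothetical examples of the kind of domain knowledge you might encode. Which features actually help depends on the problem.

```python
import pandas as pd

# Hypothetical dataset; the columns and values are invented for illustration.
df = pd.DataFrame({
    "sqft": [850, 1200, 1500, 2100],
    "bedrooms": [2, 3, 3, 4],
    "year_built": [1995, 2008, 2015, 1978],
    "sale_year": [2019, 2019, 2018, 2019],
    "city": ["Austin", "Dallas", "Austin", "Dallas"],
})

# Combine two columns into one that may be more informative: age when sold.
df["age_at_sale"] = df["sale_year"] - df["year_built"]

# Ratio feature: how roomy the home is per bedroom.
df["sqft_per_bedroom"] = df["sqft"] / df["bedrooms"]

# Turn the categorical column into 0/1 indicator columns a model can use.
features = pd.get_dummies(df, columns=["city"], dtype=int)

print(features)
```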
4) Algorithm Selection
For business use cases of data science and ML, it's important to choose modern algorithms that are relevant and applicable to the problem. Generally speaking, the best place for beginners to start is tree ensembles (e.g. random forests), as they are very effective general-purpose algorithms. Don't jump into neural nets and deep learning right away, as those tend to have more niche use cases.
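One way to ground algorithm selection is to score a few candidates on the same data. The sketch below uses scikit-learn and compares a random forest with a logistic regression baseline on a synthetic dataset. The baseline model, the synthetic data, and the 5-fold setting are assumptions made for this example, and which model wins will vary from problem to problem.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic classification data standing in for a real business dataset.
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=5, random_state=0
)

# Candidate algorithms: a simple linear baseline and a tree ensemble.
candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Mean 5-fold cross-validated accuracy for each candidate.
results = {
    name: cross_val_score(model, X, y, cv=5).mean()
    for name, model in candidates.items()
}
best_name = max(results, key=results.get)

for name, score in results.items():
    print(f"{name}: {score:.3f}")
print("best:", best_name)
```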
5) Model Training
Once you have the previous steps down, training a professional-level model is actually pretty straightforward and formulaic. There are a few best practices, such as cross-validation and train/test splitting, that you'll want to incorporate to avoid overfitting your models.
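The two best practices mentioned above can be sketched with scikit-learn as follows. The synthetic data, the random forest regressor, and the small hyperparameter grid are assumptions chosen to keep the example short. The idea is to hold out a test set, tune with cross-validation on the training set only, and look at the test set once at the end.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic regression data standing in for a cleaned, feature-engineered dataset.
X, y = make_regression(n_samples=200, n_features=8, noise=10.0, random_state=0)

# Hold out 20% of the data; it is not used for any tuning decisions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Tune hyperparameters with 5-fold cross-validation on the training set only.
search = GridSearchCV(
    RandomForestRegressor(n_estimators=50, random_state=0),
    param_grid={"max_depth": [3, None], "min_samples_leaf": [1, 5]},
    cv=5,
)
search.fit(X_train, y_train)

# Final check on the untouched test set using the best model found.
test_r2 = r2_score(y_test, search.predict(X_test))

print("best params:", search.best_params_)
print("cross-validated R^2:", round(search.best_score_, 3))
print("test R^2:", round(test_r2, 3))
```

A large gap between the cross-validated score and the test score is a hint that the model is overfitting.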
Now that you know the basic framework for machine learning, go out there and kick some dataset's ass!
Tuesday, September 10, 2019