- Data Preparation for Machine Learning
- Recode column names containing special characters
- Treat missing values
- Transform column data types
- Feature engineering
  - Aggregate categories to reduce the number of categories
  - Transform numeric variables to improve their distribution properties
  - Compute new features from two or more existing features
- Remove duplicates
These are my notes taken from Microsoft Learning’s Principles of Machine Learning in Python - Module 3.
What I learnt:
Data Preparation
I reviewed and worked through the Data Preparation steps for the automobile prices and German bank credit datasets in this link.
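To make the steps concrete for myself, here is a minimal sketch in pandas of everything in the list except missing values. The toy data, the column names and the category grouping are my own illustrative assumptions, not the course's code or the real dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: names and values are illustrative only.
df = pd.DataFrame({
    "body-style": ["sedan", "sedan", "wagon", "convertible", "sedan"],
    "num-of-cylinders": ["four", "four", "six", "eight", "four"],
    "curb-weight": ["2500", "2500", "3000", "3500", "2800"],
    "engine-size": [100, 100, 150, 200, 120],
    "price": ["10000", "10000", "20000", "40000", "15000"],
})

# Recode column names: '-' is awkward in attribute access and formulas.
df.columns = [name.replace("-", "_") for name in df.columns]

# Transform data types: numeric columns that were read as strings.
for col in ["curb_weight", "price"]:
    df[col] = pd.to_numeric(df[col], errors="coerce")

# Feature engineering 1: aggregate sparse categories into fewer groups.
# The grouping below is an assumed example, chosen only for illustration.
cylinder_groups = {
    "four": "four_or_less",
    "six": "six_or_more",
    "eight": "six_or_more",
}
df["num_of_cylinders"] = df["num_of_cylinders"].map(cylinder_groups)

# Feature engineering 2: log-transform a right-skewed numeric variable.
df["log_price"] = np.log(df["price"])

# Feature engineering 3: compute a new feature from two existing ones.
df["weight_per_engine_size"] = df["curb_weight"] / df["engine_size"]

# Remove duplicate rows, keeping the first occurrence.
df = df.drop_duplicates().reset_index(drop=True)
```

In this toy frame the first two rows are identical, so `drop_duplicates` leaves four rows. Whether duplicates should be matched on all columns or only on an identifier column depends on the dataset.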
More ways to detect and deal with missing values are discussed in this link.
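As a reminder to myself, a minimal sketch of detecting and treating missing values with pandas. It assumes missing entries are coded with a `?` placeholder, and the data and column names are made up for illustration. Dropping rows and median/mode imputation are just two common options, not necessarily the ones the linked material recommends.

```python
import numpy as np
import pandas as pd

# Hypothetical data: '?' is assumed to be the missing-value placeholder.
df = pd.DataFrame({
    "price": ["13495", "?", "16500", "?", "17450"],
    "body_style": ["sedan", "wagon", None, "sedan", "sedan"],
})

# Turn the placeholder into a real missing value so pandas can detect it,
# then convert the column to a numeric type.
df["price"] = pd.to_numeric(df["price"].replace("?", np.nan))

# Detect: count missing values per column.
missing_counts = df.isna().sum()

# Option 1: drop every row that has at least one missing value.
dropped = df.dropna()

# Option 2: impute, here with the median for a numeric column
# and the most frequent value for a categorical column.
imputed = df.copy()
imputed["price"] = imputed["price"].fillna(imputed["price"].median())
imputed["body_style"] = imputed["body_style"].fillna(
    imputed["body_style"].mode()[0]
)
```

Dropping rows is simplest but throws data away, while imputation keeps every row at the cost of inventing values, so the choice depends on how much is missing and why.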