- Data Type
- Information
- Rows and Columns
- Numerical or Categorical
- Find Missing Data
- Find Relation between Features and Labels
- Visualize Data
- Fill or Drop Missing Data
- Encode Categorical Data
- Use K-Fold Cross Validation to get a More Reliable Accuracy Estimate and Observe the Cross Validation Score
- Apply Grid Search Cross Validation to Find the Hyperparameters of a Model that Give the Most Accurate Predictions
- Find Best Parameters
- Evaluate the Results on Validation Set using the Best Performing Parameters
- Create more than one Model to Find the Best Performing Model on the Test Set
- Select the Final Best Performing Model on the Test Set for Evaluation
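
The checklist above can be sketched end to end with pandas and scikit-learn. This is a minimal illustration under stated assumptions, not a definitive recipe: the toy dataset, its column names (`age`, `income`, `city`, `label`), the choice of Logistic Regression, and the `C` grid are all hypothetical. Plotting is left out to keep the sketch short.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, cross_val_score, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy data: two numerical features, one categorical feature, a binary label.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "age": rng.normal(40, 10, n),
    "income": rng.normal(50_000, 15_000, n),
    "city": rng.choice(["A", "B", "C"], n),
})
df["label"] = (df["age"] + rng.normal(0, 5, n) > 40).astype(int)
df.loc[rng.choice(n, 20, replace=False), "income"] = np.nan  # inject missing values

# Data type, rows and columns, missing data.
print(df.dtypes)
print(df.shape)
missing_per_column = df.isna().sum()
print(missing_per_column)

# Relation between numerical features and the label.
print(df.select_dtypes(include="number").corr()["label"])

# Numerical or categorical.
features = df.drop(columns="label")
numeric_cols = features.select_dtypes(include="number").columns.tolist()
categorical_cols = features.select_dtypes(exclude="number").columns.tolist()

# Fill missing values and encode categorical data inside one pipeline.
numeric_steps = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])
preprocess = ColumnTransformer([
    ("num", numeric_steps, numeric_cols),
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
])
model = Pipeline([("prep", preprocess), ("clf", LogisticRegression(max_iter=1000))])

# Hold out a validation set before any tuning.
X, y = features, df["label"]
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# K-fold cross validation on the training split.
cv_scores = cross_val_score(model, X_train, y_train, cv=5)
print("mean CV accuracy:", cv_scores.mean())

# Grid search cross validation over the (assumed) regularization grid.
search = GridSearchCV(model, {"clf__C": [0.01, 0.1, 1, 10]}, cv=5)
search.fit(X_train, y_train)
best_params = search.best_params_
print("best parameters:", best_params)

# Evaluate the best parameters on the validation set.
val_accuracy = search.score(X_val, y_val)
print("validation accuracy:", val_accuracy)
```

Keeping imputation and encoding inside the `Pipeline` means each cross validation fold fits them on its own training portion only, so the cross validation score is not inflated by information from the held-out fold.
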

| Model | Type | Training Speed | Prediction Speed | Performance |
|---|---|---|---|---|
| Logistic Regression | Classification | Fast | Fast | Low |
| Support Vector Machine | Both | Slow | Moderate | Medium |
| Multi Layer Perceptron | Both | Slow | Moderate | High |
| Random Forest | Both | Moderate | Moderate | Medium |
| Boosted Tree | Both | Slow | Fast | High |
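
The ratings in the table are rough rules of thumb, so it can help to measure them on your own data. The sketch below trains the five model families on a synthetic classification problem and records fit time, prediction time, and test accuracy. The dataset from `make_classification`, the default hyperparameters, and the use of `GradientBoostingClassifier` as the boosted tree are assumptions made for illustration.

```python
import time

from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for a real dataset (assumption).
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Scale-sensitive models get a StandardScaler; tree ensembles do not need one.
models = {
    "Logistic Regression": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "Support Vector Machine": make_pipeline(StandardScaler(), SVC()),
    "Multi Layer Perceptron": make_pipeline(
        StandardScaler(), MLPClassifier(max_iter=500, random_state=0)
    ),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "Boosted Tree": GradientBoostingClassifier(random_state=0),
}

results = {}
for name, model in models.items():
    start = time.perf_counter()
    model.fit(X_train, y_train)
    fit_seconds = time.perf_counter() - start

    start = time.perf_counter()
    accuracy = model.score(X_test, y_test)  # predicts on the test set, then scores
    predict_seconds = time.perf_counter() - start

    results[name] = (fit_seconds, predict_seconds, accuracy)
    print(f"{name:<24} fit={fit_seconds:.3f}s predict={predict_seconds:.4f}s accuracy={accuracy:.3f}")

# Pick the model with the highest test accuracy.
best_model = max(results, key=lambda name: results[name][2])
print("best on test set:", best_model)
```

Timings and the accuracy ranking depend heavily on dataset size, feature count, and hardware, so the measured order may differ from the table.
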