Visualizing Cross-validation Code


  • Let's say you are writing nice, clean machine learning code (e.g., linear regression).
  • As the name suggests, cross-validation is the natural next step after learning linear regression, because it helps you estimate and improve your predictions using the K-Fold strategy.
  • First, we divide the dataset into K equal parts (the K folds, often passed as cv).
  • Then we train the model on K−1 folds (the larger portion) and test it on the remaining fold (the smaller portion), rotating so that each fold serves as the test set exactly once.
  • This graph represents K-fold cross-validation for the Boston dataset with a linear regression model.
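The procedure above can be sketched in a few lines of scikit-learn. This is a minimal illustration, not the post's original code: the Boston dataset has since been removed from scikit-learn, so synthetic regression data stands in for it here.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score

# Synthetic stand-in for the Boston dataset
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

# Split the data into K = 5 equal folds; each fold is the test set once
cv = KFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(LinearRegression(), X, y, cv=cv, scoring="r2")

print(scores)        # one R^2 score per fold
print(scores.mean()) # average performance across folds
```

Averaging the per-fold scores gives a more reliable estimate of generalization than a single train/test split.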


Cross-validation helps improve your predictions using the K-Fold strategy. What is K-Fold, you ask? Check out this post for a visualized explanation.


Classification in the presence of missing data

Learn how to improve prediction accuracy in the presence of missing data!
#machinelearning

  • The misclassification error is much lower when surrogate splits are used with decision trees.
  • The example shows how decision trees with surrogate splits can be used to improve prediction accuracy in the presence of missing data.
  • Example (MATLAB):

        y_pred1 = predict(RF1,Xtest);
        confmat1 = confusionmat(Ytest,y_pred1);
        y_pred2 = predict(RF2,Xtest);
        confmat2 = confusionmat(Ytest,y_pred2);
        disp('Confusion Matrix - without surrogates')
        disp(confmat1)
        disp('Confusion Matrix - with surrogates')
        disp(confmat2)

    Output:

        Confusion Matrix - without surrogates
            67     1
            24    13
        Confusion Matrix - with surrogates
            65     3
             4    33
  • An error that decreases as the number of trees grows indicates good performance.
  • There are several ways to improve prediction accuracy when data are missing in some predictors, without discarding the entire observation.






