
What is Cross-Validation?

Cross-validation is a family of statistical techniques for checking how reliably a model predicts data it was not trained on. It is typically used with small datasets, where a single split into a training part and a test part leaves too little data to fit the model well or to estimate its prediction error reliably.


Let n be the number of data points in the dataset, and let k be an integer, the number of subsets (folds), that is typically much smaller than n.


In k-fold cross-validation, we divide the entire dataset into k subsets of equal size, use k-1 of them for training, and use the remaining one for testing and for calculating the prediction error.
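The split described above can be sketched in a few lines of Python. This is only an illustration: the function name k_fold_splits is hypothetical, the folds are taken as contiguous blocks of indices, and when k does not divide n evenly the fold sizes are allowed to differ by one. In practice the data is usually shuffled before splitting, which this sketch leaves out.

```python
def k_fold_splits(n, k):
    """Yield (train_indices, test_indices) for each of the k folds.

    Hypothetical helper for illustration; folds are contiguous index blocks.
    """
    if not 2 <= k <= n:
        raise ValueError("k must satisfy 2 <= k <= n")
    indices = list(range(n))
    # Fold sizes differ by at most one when k does not divide n evenly.
    base, extra = divmod(n, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test = indices[start:start + size]                    # the held-out part
        train = indices[:start] + indices[start + size:]      # the other k-1 parts
        yield train, test
        start += size


# Example: 10 data points, 5 folds of 2 points each.
for train, test in k_fold_splits(n=10, k=5):
    print("train:", train, "test:", test)
```

Each data point lands in the test part exactly once and in the training part k-1 times.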


We repeat the procedure k times, each time holding out a different subset for testing, and report the average over the k runs. Results in the literature are frequently reported using 10-fold cross-validation, where the dataset is divided into 10 subsets and the final prediction error is calculated as 1/10 times the sum of the ten errors.
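As a rough sketch of the whole procedure, the Python code below runs 10-fold cross-validation on a small synthetic dataset. Everything specific in it is an assumption made for illustration: the model is a straight line fitted by least squares, the error measure is the mean squared error, the data is made up, and the names fit_line and k_fold_cv_error are hypothetical.

```python
from statistics import mean


def fit_line(xs, ys):
    """Least-squares fit of y = a + b * x; returns (a, b)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    return y_bar - b * x_bar, b


def k_fold_cv_error(xs, ys, k):
    """Return (average error over k folds, list of the k fold errors)."""
    n = len(xs)
    fold_errors = []
    for fold in range(k):
        # Boundaries of the held-out fold (contiguous block of points).
        lo, hi = fold * n // k, (fold + 1) * n // k
        train_x, train_y = xs[:lo] + xs[hi:], ys[:lo] + ys[hi:]
        test_x, test_y = xs[lo:hi], ys[lo:hi]
        # Train on k-1 parts, then measure the error on the remaining part.
        a, b = fit_line(train_x, train_y)
        mse = mean((y - (a + b * x)) ** 2 for x, y in zip(test_x, test_y))
        fold_errors.append(mse)
    # Final estimate: 1/k times the sum of the k fold errors.
    return mean(fold_errors), fold_errors


# Synthetic data (an assumption): a line plus small alternating noise.
xs = [float(i) for i in range(20)]
ys = [1.0 + 2.0 * x + (0.5 if i % 2 == 0 else -0.5) for i, x in enumerate(xs)]

average_error, errors = k_fold_cv_error(xs, ys, k=10)
print("errors per fold:", errors)
print("10-fold cross-validation error:", average_error)
```

The reported figure is the single averaged number; the per-fold errors are still worth a look, since a large spread between folds suggests the estimate is unstable.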
