Train Only Predictions: a new method to enhance the reliability of your machine learning/artificial intelligence solutions


For various reasons, the performance of a machine learning/artificial intelligence model once deployed can lag far behind its training performance. The causes range from biases and variations in the data, to unexamined assumptions, to improper hyperparameter choices.

Since such failures are a direct loss on the investment made, and often lead to the solution being shelved, it is important that they be addressed.

This paper outlines one method that can be used as a last resort, allowing organizations to continue using the deployed machine learning/artificial intelligence solution until the root cause is addressed.

Proposed solution

While the long-term solution is to identify the root cause of the deviation in performance, the following method can be used in the interim.

For the sake of discussion, let us say our solution uses k-means clustering for unsupervised learning.

In an ideal world, the k-means model would be trained, and the fully trained, validated model would be deployed to production for future use. During production runs, the model would be loaded and used straight away to predict on the new incoming data.

The proposed solution, which we call Train Only Prediction, takes a slightly different approach:

1) Retain the original training dataset (call it d1).
2) When a new batch of data to be predicted arrives (call it d2), append d2 to d1.
3) Instead of calling the model's predict function, run the full training phase on the combined dataset.
4) Read the cluster assignments produced for the d2 records during training as the predictions.
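The steps above can be sketched in plain Python. This is a minimal, self-contained illustration using a toy one-dimensional k-means (Lloyd's algorithm); the function names `kmeans_fit` and `train_only_predict` are our own for this sketch, not a library API:

```python
import random

def kmeans_fit(points, k, iters=20, seed=0):
    """Minimal Lloyd's-algorithm k-means on 1-D data.
    Returns (centroids, labels) for every training point."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda c: (p - centroids[c]) ** 2)
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centroids[c] = sum(members) / len(members)
    return centroids, labels

def train_only_predict(d1, d2, k):
    """Train Only Prediction: retrain on d1 + d2 and read off
    the labels the training run assigned to the d2 records."""
    combined = d1 + d2
    _, labels = kmeans_fit(combined, k)
    return labels[len(d1):]  # labels for the new records only

d1 = [1.0, 1.2, 0.9, 5.0, 5.1, 4.9]  # historical training data (two clusters)
d2 = [1.1, 5.2]                       # new points we want predictions for
labels_d2 = train_only_predict(d1, d2, k=2)
```

Note that no separate predict call ever happens: the "prediction" for d2 is simply the cluster membership it ends up with after the full iterative fit.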

Why it works

Let us first understand why prediction fails. Most models make predictions in a single pass, i.e., they take one look at the incoming data and, based on whatever they learnt during training, guess what the result should be. If this guess is wrong, there is no second take.
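In k-means terms, that single pass is just a nearest-centroid lookup. A minimal sketch (the centroid values here are made up for illustration):

```python
def predict_single_pass(point, centroids):
    """Conventional predict: one nearest-centroid lookup, no iteration,
    and no chance to revise the answer if the centroids are off."""
    return min(range(len(centroids)),
               key=lambda c: (point - centroids[c]) ** 2)

centroids = [1.0, 5.0]  # frozen at training time (illustrative values)
label = predict_single_pass(1.3, centroids)  # -> 0, the cluster near 1.0
```

If the frozen centroids no longer reflect the live data, every such lookup silently inherits that error.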

The training phase works differently. During training, models are designed to iterate several times over the entire dataset and incrementally self-correct their results in each iteration until the final result becomes acceptable. In classical machine learning models this is achieved through cross-validation techniques, and in neural networks the combination of epochs and mini-batches achieves the same result.

Train Only Prediction leverages this behaviour. Combining the small d2 with the vast d1 ensures that the range of items similar to those in d2 increases manifold. When this enhanced dataset is run iteratively through the training phase, the probability of finding the right clusters for the d2 records increases accordingly.


Limitations

This approach suffers from the following limitations:

1) It increases prediction time by a large factor, since every prediction requires a full training run.
2) It is not suitable for supervised learning or regression problems.
3) It is therefore only appropriate for cases where time can be sacrificed in return for a reliable prediction.

Using stratified sampling can speed the process up slightly.
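One way to apply that speed-up (our interpretation, since the mechanics are not spelled out above): shrink d1 by sampling proportionally from each known cluster, so the retraining set stays representative but smaller. The helper name `stratified_sample` is our own for this sketch:

```python
import random

def stratified_sample(points, labels, fraction, seed=0):
    """Keep roughly `fraction` of the points from each cluster,
    preserving the original cluster proportions in the shrunken d1."""
    rng = random.Random(seed)
    sample = []
    for cluster in set(labels):
        members = [p for p, l in zip(points, labels) if l == cluster]
        n = max(1, round(len(members) * fraction))  # at least one per cluster
        sample.extend(rng.sample(members, n))
    return sample

d1 = list(range(10))
d1_labels = [0] * 6 + [1] * 4          # cluster assignments from a prior fit
small_d1 = stratified_sample(d1, d1_labels, fraction=0.5)
```

The reduced d1 is then combined with d2 and retrained as before; each retraining run touches fewer points, at the cost of a slightly coarser picture of the historical data.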

If you have any thoughts on this, please share your views and feedback.
