Overview of Machine Learning Metrics
Introduction

One of the core tasks in building a machine learning model is to evaluate its performance. The usual data science pipeline consists of prototyping a model on historical data, reaching a satisfactory model, and deploying it into production, where it goes through further testing on live data. These stages are usually called offline and online evaluation: the former analyzes the prototyped model on historical data, while the latter analyzes the deployed model on live data. Surprisingly to some, evaluation is genuinely hard, because good measurements are often vague or infeasible. Statistical models also generally assume that the distribution of the data stays the same over time. In practice, however, the distribution of data changes constantly, sometimes drastically. This is called distribution drift. One way to detect distribution drift is to keep tracking the model's performance on the validation metric on live data: if live performance drops well below the offline validation results, the data has likely drifted. That's why a data science project cannot simply end once the model is deployed.
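
The drift check described above can be sketched as a small rolling-window monitor that compares the live metric against the offline validation baseline. This is a minimal sketch under illustrative assumptions: the class name, the use of accuracy as the metric, and the specific tolerance and window size are not prescribed by the text.

```python
# A minimal sketch of online drift monitoring, assuming a binary classifier
# whose offline validation accuracy is known. The names, the accuracy metric,
# the tolerance, and the window size are illustrative choices.

from collections import deque


class DriftMonitor:
    def __init__(self, offline_baseline, tolerance=0.05, window_size=1000):
        self.offline_baseline = offline_baseline   # metric measured during offline evaluation
        self.tolerance = tolerance                 # acceptable drop before flagging drift
        self.window = deque(maxlen=window_size)    # rolling window of (label, prediction) pairs

    def update(self, y_true, y_pred):
        """Record one labeled live prediction and re-check the rolling metric."""
        self.window.append((y_true, y_pred))
        if len(self.window) < self.window.maxlen:
            return None  # not enough live data yet for a fair comparison
        live_metric = sum(t == p for t, p in self.window) / len(self.window)
        return {
            "live_metric": live_metric,
            "drift_suspected": live_metric < self.offline_baseline - self.tolerance,
        }


# Usage: 0.85 stands in for whatever the validation metric was offline.
monitor = DriftMonitor(offline_baseline=0.85, tolerance=0.05, window_size=500)
# In production, feed labels as they arrive:
#   status = monitor.update(true_label, predicted_label)
#   if status and status["drift_suspected"]: trigger an alert or retraining.
```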