ML Data Versioning With DVC: How to manage machine learning data 🗃
I recently wrote a post about DVC on the blog of my company, Appsilon. DVC is like Git, but for data, models, and experiments. It also lets you create automated experiment pipelines.
As a teaser, I’ll just say this: once you have prepared scripts for training and evaluating a model, the whole training is run automatically whenever new data is added to the repo. Metrics are saved to the appropriate files along with the parameters, and the same goes for plots.
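To make the "metrics are saved to files" part concrete, here is a minimal sketch of what an evaluation script could look like. It is not taken from the Appsilon post: the function names, the `metrics.json` file name, and the toy labels are all assumptions chosen for illustration.

```python
import json
from pathlib import Path


def accuracy(y_true, y_pred):
    """Return the fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on empty input")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def write_metrics(metrics, path="metrics.json"):
    """Write a flat dict of metric names and values to a JSON file."""
    out = Path(path)
    out.write_text(json.dumps(metrics, indent=2))
    return out


if __name__ == "__main__":
    # Toy labels standing in for real model predictions (hypothetical data).
    y_true = [1, 0, 1, 1]
    y_pred = [1, 0, 0, 1]
    write_metrics({"accuracy": accuracy(y_true, y_pred)})
```

In a DVC project, a JSON file like this can be declared as a metrics output of a pipeline stage in `dvc.yaml`. `dvc repro` then reruns the stages whose dependencies have changed, and `dvc metrics show` prints the tracked values.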