Cloud Run as Cold Storage for ML Models
Where should you store ML models that you only need to access rarely? Say you want to be able to serve a model a few months from now, but you don't want to pay for a running server the whole time. This may not be a common problem, but we faced it while working on a PoC that had to be demoed to clients every few months. We didn't want to pay for a server continuously or maintain one, yet we wanted the model ready to serve within a few minutes. The way I solved this was by wrapping the ML model in a FastAPI app with BentoML and deploying it on Google Cloud Run as a Docker container. ...