Training a model is one thing; serving it is another. In between those two phases, model serialization and deserialization occur. In simpler words, it's just model saving and loading. It matters because different methods result in different:
- Inference speed
- Model size
- Python environment size
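As a quick illustration, here is a minimal sketch of two common PyTorch approaches: saving just the weights (`state_dict`) versus exporting a TorchScript program that can be loaded without the original Python class. The tiny `nn.Linear` model and file names are hypothetical placeholders, not from the post itself.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for a trained network
model = nn.Linear(4, 2)
model.eval()

# Option 1: save only the learned weights (state_dict).
# Loading requires re-creating the model class in Python.
torch.save(model.state_dict(), "model_weights.pt")
restored = nn.Linear(4, 2)
restored.load_state_dict(torch.load("model_weights.pt"))
restored.eval()

# Option 2: export to TorchScript.
# The saved file is self-describing and loads without the class definition.
scripted = torch.jit.script(model)
scripted.save("model_scripted.pt")
loaded = torch.jit.load("model_scripted.pt")
```

Both restored models produce the same outputs as the original, but they differ in file size, load-time dependencies, and inference speed, which is exactly the trade-off space the list above refers to.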
If you are curious about the ways to serialize a model in PyTorch and how they compare, check out my new post on the Appsilon blog.