Model Deployment & REST APIs
Difficulty: M.Tech
Read Time: ~15 min
The MLOps Lifecycle
As a rule of thumb, building a model is only about 10% of the work. The remaining 90% goes into deploying, serving, monitoring, and maintaining that model in production.
Serving Models with FastAPI
FastAPI is a modern, high-performance web framework for building APIs in Python, based on standard Python type hints. Those type hints drive both request validation and the automatically generated API documentation.
Building a Prediction API
First, you must serialize (save) your trained model using a library like joblib or pickle.
python
import joblib

# Save the trained model (clf is assumed to be an already-fitted estimator)
joblib.dump(clf, 'model.joblib')
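Since pickle is the other option mentioned above, a minimal sketch of a full save-and-reload round trip with the standard library may help. `ThresholdModel`, `save_model`, `load_model`, and `roundtrip_matches` are hypothetical names used only for illustration. The toy class stands in for a real trained model so the example runs without any third-party packages.

```python
import os
import pickle
import tempfile


class ThresholdModel:
    """Hypothetical stand-in for a trained model (not a real library class).

    Predicts 1 when the sum of a row's features exceeds the threshold, else 0.
    """

    def __init__(self, threshold):
        self.threshold = threshold

    def predict(self, rows):
        return [1 if sum(row) > self.threshold else 0 for row in rows]


def save_model(model, path):
    # Serialize the model object to a binary file
    with open(path, "wb") as f:
        pickle.dump(model, f)


def load_model(path):
    # Only unpickle files you trust: loading a pickle can execute arbitrary code
    with open(path, "rb") as f:
        return pickle.load(f)


def roundtrip_matches(model, rows):
    """Save the model to a temporary file, reload it, and compare predictions."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "model.pkl")
        save_model(model, path)
        restored = load_model(path)
    return restored.predict(rows) == model.predict(rows)
```

Checking that the reloaded model reproduces the original predictions is a cheap sanity test to run before shipping the serialized file to a server.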
Then, load the model inside a FastAPI application to serve predictions over HTTP.
python
from fastapi import FastAPI
from pydantic import BaseModel
import joblib
import numpy as np

app = FastAPI()

# Load the model at import time so it is loaded once at startup, not on every request
model = joblib.load('model.joblib')


class PredictionRequest(BaseModel):
    # list[float] as a built-in generic needs Python 3.9+; use typing.List[float] on older versions
    features: list[float]


@app.post("/predict")
def predict(request: PredictionRequest):
    # Convert the flat feature list to a 2D array of shape (1, n_features)
    input_data = np.array(request.features).reshape(1, -1)
    # Generate the prediction for this single sample
    prediction = model.predict(input_data)
    # int() assumes integer class labels; use float() for regression outputs
    return {"prediction": int(prediction[0])}
To test this API, send a POST request to /predict with a JSON body of the form {"features": [...]}, where the list holds one value for each feature the model was trained on. This is how a frontend communicates with your machine learning model.
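As one possible way to send that request, here is a small client sketch using only the Python standard library. It assumes the application code is saved as main.py and started with `uvicorn main:app`, which by default listens on http://127.0.0.1:8000. The file name, the URL, and the helper names `build_predict_request` and `send_predict_request` are assumptions for illustration, not part of FastAPI.

```python
import json
import urllib.request

# Assumed address of a locally running server (uvicorn's default host and port)
DEFAULT_URL = "http://127.0.0.1:8000/predict"


def build_predict_request(features, url=DEFAULT_URL):
    """Build a POST request whose JSON body matches the PredictionRequest schema."""
    payload = json.dumps({"features": features}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_predict_request(features, url=DEFAULT_URL, timeout=5):
    """Send the request and return the decoded JSON response as a dict."""
    request = build_predict_request(features, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call, with the server running and four illustrative feature values:
# result = send_predict_request([5.1, 3.5, 1.4, 0.2])
# print(result["prediction"])
```

Separating request construction from sending keeps the payload format easy to inspect without a running server. If the body does not match the schema (for example, a non-numeric feature), FastAPI rejects the request with a validation error instead of calling the model.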