Batch Prediction Health Metrics
Data Pipelines, Monitoring & Reliability DS practice problem on Onlearn.
Difficulty: easy.
Topics: Batch Prediction Health Metrics, Data Drift Detection, Inference Latency Percentiles, Feature Store Consistency, Job Failure Rate, Throughput Bottleneck Analysis, MLOps Infrastructure, Data Engineering, Software Reliability Engineering, Statistical Process Control, Distributed Systems, Batch Inference Orchestration, Data Quality Validation, Model Performance Monitoring, Pipeline Observability, Resource Utilization Tracking.
In production ML systems, monitoring the health of batch prediction jobs is essential for maintaining service reliability. Given a list of prediction results from a batch job, compute key health metrics that are commonly tracked in MLOps dashboards.

Each prediction result is a dictionary with:
- 'status': Either 'success' or 'error'
- 'confidence': A float between 0 and 1 (only present when status is 'success')

Write a function calculate_batch_health(predictions, confidence_threshold) that computes:
1. Success Rate: Percentage of predictions that completed successfully
2. Average Confidence: Mean confidence score of successful predictions (as a percentage)
3. Low Confidence Rate: Percentage of successful predictions with confidence below the threshold

The function should return a dictionary with these three metrics. If the input list is empty, return an empty dictionary. If there are no successful predictions, return the success rate as calculated and both confidence metrics as 0.0. All returned values should be rounded to 2 decimal places.
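One possible approach to the problem above is sketched here in Python. The problem statement does not name the keys of the returned dictionary, so 'success_rate', 'avg_confidence', and 'low_confidence_rate' are assumptions made for illustration and may differ from what the grader expects.

```python
def calculate_batch_health(predictions, confidence_threshold):
    """Compute health metrics for a batch of prediction results.

    The output key names are assumed; the problem statement does not specify them.
    """
    # Empty input: nothing to measure.
    if not predictions:
        return {}

    # Confidence scores exist only on successful predictions.
    confidences = [
        p["confidence"] for p in predictions if p.get("status") == "success"
    ]

    # Success rate is measured against all predictions in the batch.
    success_rate = 100.0 * len(confidences) / len(predictions)

    if confidences:
        # Mean confidence, expressed as a percentage.
        avg_confidence = 100.0 * sum(confidences) / len(confidences)
        # Share of successful predictions strictly below the threshold.
        low_count = sum(1 for c in confidences if c < confidence_threshold)
        low_confidence_rate = 100.0 * low_count / len(confidences)
    else:
        # No successes: both confidence metrics default to 0.0.
        avg_confidence = 0.0
        low_confidence_rate = 0.0

    return {
        "success_rate": round(success_rate, 2),
        "avg_confidence": round(avg_confidence, 2),
        "low_confidence_rate": round(low_confidence_rate, 2),
    }


sample = [
    {"status": "success", "confidence": 0.9},
    {"status": "success", "confidence": 0.4},
    {"status": "error"},
    {"status": "success", "confidence": 0.7},
]
print(calculate_batch_health(sample, 0.5))
# {'success_rate': 75.0, 'avg_confidence': 66.67, 'low_confidence_rate': 33.33}
```

Note that the low confidence rate uses the number of successful predictions as its denominator, not the full batch size, which matches the wording "percentage of successful predictions". A confidence exactly equal to the threshold is not counted as low, since the statement says "below".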