— Project 04
Forecasting

Forecasting and Failure Prediction

In-house ML platform

A 16-day solar power forecasting platform that compares deep-learning models against gradient-boosted baselines. I also wrote a custom accuracy metric for dawn and dusk, where normal percentage-error metrics break down.

Role
ML Engineer
Period
2024 to 2025
Status
Production
PyTorch · TiDE · TFT · XGBoost · Optuna · BSRN
— Chapter 01
System shape

How the system fits together.

Two ML systems on one data-engineering backbone: a 16-day solar energy forecaster and an inverter-failure predictor.
Fig. 01 — Forecasting and Failure Prediction architecture
— Chapter 02
Decisions and outcomes

The calls that shaped it.

  1. 01

    Authored a custom accuracy metric, MSCA (Mean Squared Capacity Accuracy), that avoids the sunrise problem: percentage-error metrics divide by near-zero output at dawn and dusk, so small absolute misses look enormous. A per-site seasonal bias-correction layer improved accuracy by roughly 17% on internal benchmarks at v0.13a.

  2. 02

    Wrote a 12-step preparation pipeline with BSRN physics-based radiation cleaning, adaptive LOESS smoothing, operational features derived from work-order history, and a strict data-leakage gate that runs before every training experiment.

  3. 03

    Built a companion XGBoost early-warning system that predicts inverter failures 7, 14, and 30 days ahead across 189 sites and roughly 2,000 hardware units, and documented the work in a technical report.
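To make the sunrise problem from decision 01 concrete, here is a minimal sketch. The exact MSCA formula is not given on this page, so `capacity_accuracy` is only one plausible reading (squared error normalized by installed capacity, subtracted from 1), and every name and number below is a hypothetical chosen for illustration.

```python
from typing import Sequence


def mape(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Mean absolute percentage error. Divides by the actual value,
    so it explodes when actual output is close to zero."""
    return sum(abs(a - f) / abs(a) for a, f in zip(actual, forecast)) / len(actual)


def capacity_accuracy(
    actual: Sequence[float], forecast: Sequence[float], capacity_kw: float
) -> float:
    """ASSUMED form, not the real MSCA definition:
    1 - mean(((actual - forecast) / capacity) ** 2).
    The denominator is the fixed site capacity, so it never approaches zero."""
    if capacity_kw <= 0:
        raise ValueError("capacity_kw must be positive")
    if not actual or len(actual) != len(forecast):
        raise ValueError("actual and forecast must be non-empty and equal length")
    mean_sq = sum(
        ((a - f) / capacity_kw) ** 2 for a, f in zip(actual, forecast)
    ) / len(actual)
    return 1.0 - mean_sq


if __name__ == "__main__":
    # Hypothetical 1 MW site: two dawn readings near zero, one mid-morning reading.
    actual_kw = [0.5, 2.0, 400.0]
    forecast_kw = [5.0, 1.0, 380.0]
    print("MAPE:", mape(actual_kw, forecast_kw))
    print("Capacity-normalized accuracy:",
          capacity_accuracy(actual_kw, forecast_kw, capacity_kw=1000.0))
```

In this toy case the 4.5 kW miss at dawn dominates the percentage error even though it is negligible relative to a 1 MW site, while the capacity-normalized score barely moves.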
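The 7-, 14-, and 30-day horizons in decision 03 can be framed as one binary target per horizon. The page does not say how the labels are actually built, so the sketch below is an assumed framing: for each observation day, a label is 1 if the unit fails within the following N days. `horizon_labels` and its arguments are hypothetical names.

```python
from datetime import date, timedelta
from typing import Dict, Iterable, Sequence

HORIZONS_DAYS = (7, 14, 30)  # horizons named in the text


def horizon_labels(
    observation_day: date,
    failure_days: Iterable[date],
    horizons: Sequence[int] = HORIZONS_DAYS,
) -> Dict[int, int]:
    """Return {horizon: 0 or 1} for one unit on one observation day.

    A label is 1 when any recorded failure falls in the window
    (observation_day, observation_day + horizon]. Failures on or before
    the observation day are ignored, so the label only looks forward.
    """
    failures = list(failure_days)
    labels = {}
    for horizon in horizons:
        window_end = observation_day + timedelta(days=horizon)
        labels[horizon] = int(
            any(observation_day < day <= window_end for day in failures)
        )
    return labels


if __name__ == "__main__":
    # Hypothetical inverter that fails 10 days after the observation day.
    print(horizon_labels(date(2024, 6, 1), [date(2024, 6, 11)]))
```

Under this framing, each horizon could get its own classifier trained on the same feature rows, with only the target column changing.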

— Aside
The interesting work isn't the stack. It's the boundaries.
— Chapter 03
How it runs

What it runs on.

  • 01
    PyTorch + Lightning + PyTorch Forecasting for TFT
  • 02
    NeuralForecast for TiDE
  • 03
    Darts for the multi-model tournament
  • 04
    Optuna for hardware-aware hyperparameter search
  • 05
    XGBoost for the failure-prediction track