Rough Architecture Considerations
- Ingestion
- Azure Data Factory
- Triggers hourly (solar) / daily (heat pump + weather)
- Calls Azure Functions or Databricks jobs to hit APIs
- Writes raw JSON into ADLS Gen2 (Bronze)
- Processing
- Databricks
- PySpark notebooks
- Delta Lake Bronze → Silver → Gold
- Auto Loader if you want incremental or streaming ingestion
- MLflow for model tracking
- Storage
- ADLS Gen2 with Delta Lake
- Bronze: raw JSON
- Silver: cleaned tables
- Gold: analytical models, aggregations
- Transformation
- dbt inside Databricks (optional but very impressive)
- Serving
- Power BI / Databricks SQL dashboards
- REST API endpoint via Databricks Model Serving
- Orchestration
- ADF pipeline
- Extract → Load → Transform
- Alerts + retries
- Logging + monitoring
- Infrastructure-as-Code
- Databricks workspace
- Key Vault
- Storage
- ADF
- Functions
- Networking
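
The ingestion step (call an API, retry on failure, write the raw JSON to Bronze) could be sketched roughly as below. This is a minimal sketch, not the actual pipeline: the local filesystem stands in for ADLS Gen2, `fetch` is a hypothetical callable that returns the API response as text, and filling the `*` in the file name with a UTC time is an assumption. In a real deployment this logic would live in an Azure Function or Databricks job triggered by ADF.

```python
import json
import time
from datetime import datetime, timezone
from pathlib import Path


def bronze_path(root, source, device_id, ts):
    """Build <root>/<source>/raw/yyyy/mm/dd/<device_id>_<HHMMSS>.json.

    The HHMMSS suffix is an assumed way to fill the wildcard in the layout.
    """
    return (Path(root) / source / "raw" / f"{ts:%Y}" / f"{ts:%m}" / f"{ts:%d}"
            / f"{device_id}_{ts:%H%M%S}.json")


def fetch_with_retries(fetch, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call fetch() until it succeeds, backing off exponentially between tries."""
    for attempt in range(attempts):
        try:
            return fetch()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the error to the orchestrator
            sleep(base_delay * 2 ** attempt)


def ingest(fetch, root, source, device_id, now=None, sleep=time.sleep):
    """Fetch one payload and store it unmodified as raw JSON in Bronze."""
    ts = now or datetime.now(timezone.utc)
    raw = fetch_with_retries(fetch, sleep=sleep)
    json.loads(raw)  # fail fast on non-JSON responses; the stored text stays unmodified
    path = bronze_path(root, source, device_id, ts)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(raw)
    return path
```

Injecting `fetch` and `sleep` keeps the sketch testable without network access or real delays. ADF's own retry settings could replace the in-code backoff.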
Data Model (Bronze → Silver → Gold)
- Bronze (Raw). Unmodified, just stored.
- /solar/raw/yyyy/mm/dd/deviceid_*.json
- /heatpump/raw/yyyy/mm/dd/*
- /weather/raw/yyyy/mm/dd/api=owm/*
- Silver (Cleaned)
- Solar (solar_readings):
- timestamp, pv_watts, grid_import, export_kwh, load_watts, battery_soc…
- Heat pump (heatpump_readings):
- timestamp, flow_temp, return_temp, outside_temp, power_kwh…
- Weather (weather_combined):
- timestamp, temp_api1, temp_api2, temp_api3, temp_mean, solar_radiation…
- Gold (Modelled)
- Daily Energy Balance
- Solar produced
- Heat pump consumed
- Home load vs solar contribution
- Net export
- Solar self-consumption %
- Heat Pump Efficiency Model
- COP = thermal output / electrical input
- Correlate COP vs outside temperature
- COP vs flow temperature
- Efficiency heatmap
- Home Energy Forecast Model
- Predict solar generation next day
- Predict heat pump usage based on weather
- Forecast grid import/export
- Anomaly Detection
- Identify days where the heat pump consumes more than expected
- Detect inverter or PV issues
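
A few of the Gold calculations above can be sketched in plain Python to pin down the definitions before writing them in PySpark. This is a sketch under stated assumptions: the `DailyEnergy` fields are hypothetical daily aggregates (not the Silver column names), `thermal_kwh` assumes thermal output is measured or derived somewhere upstream, self-consumed solar is taken as production minus export, and "more than expected" is read as a residual more than `k` standard deviations above a straight-line fit of consumption against outside temperature.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass
class DailyEnergy:
    """One day of aggregated readings; all field names are illustrative."""
    day: str
    solar_kwh: float      # PV energy produced
    export_kwh: float     # energy sent to the grid
    import_kwh: float     # energy drawn from the grid
    heatpump_kwh: float   # heat pump electrical input
    thermal_kwh: float    # heat pump thermal output (assumed to be available)
    outside_temp: float   # daily mean outside temperature, deg C


def self_consumption_pct(d):
    """Share of solar production used on site (assumed: produced minus exported)."""
    if d.solar_kwh <= 0:
        return 0.0
    return 100.0 * (d.solar_kwh - d.export_kwh) / d.solar_kwh


def net_export_kwh(d):
    """Grid export minus grid import for the day."""
    return d.export_kwh - d.import_kwh


def cop(d):
    """COP = thermal output / electrical input; None when the pump did not run."""
    if d.heatpump_kwh <= 0:
        return None
    return d.thermal_kwh / d.heatpump_kwh


def fit_line(xs, ys):
    """Ordinary least squares fit y = a + b*x (needs at least two distinct xs)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b


def flag_high_consumption(days, k=2.0):
    """Return the days whose heat pump use exceeds the fit by more than k std devs."""
    xs = [d.outside_temp for d in days]
    ys = [d.heatpump_kwh for d in days]
    a, b = fit_line(xs, ys)
    residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
    sigma = pstdev(residuals)
    return [d.day for d, r in zip(days, residuals) if sigma > 0 and r > k * sigma]
```

A linear fit is only a placeholder for the "expected" consumption. The forecast model listed above could supply the expected value instead once it exists.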