Solar energy output is highly variable. It depends on weather conditions that change from hour to hour, which makes it hard for grid operators to plan around. If you can forecast how much energy solar panels will produce on a given day, you can manage supply more effectively and avoid shortfalls.
I built machine learning models to predict daily solar energy production from weather and irradiance data. I started with a Decision Tree as a baseline and then used a Random Forest as the main model. The raw weather variables were not enough on their own, so I engineered new features from them to capture seasonality and the interactions between weather variables. For tuning, I ran a grid search over model configurations and scored each one with 5-fold cross-validation stratified by season.
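As a rough illustration of the feature engineering and season-stratified folds described above, here is a minimal sketch in plain Python. It is not the project's actual code: the feature names, the cyclical day-of-year encoding, the irradiance and cloud-cover interaction term, the assumption that cloud cover is a fraction between 0 and 1, and the northern-hemisphere season mapping are all assumptions made for the example.

```python
import math
from collections import defaultdict
from datetime import date, timedelta

# Assumed mapping: meteorological seasons, northern hemisphere.
SEASON_BY_MONTH = {12: "winter", 1: "winter", 2: "winter",
                   3: "spring", 4: "spring", 5: "spring",
                   6: "summer", 7: "summer", 8: "summer",
                   9: "autumn", 10: "autumn", 11: "autumn"}


def engineer_features(day, irradiance, temperature, cloud_cover):
    """Build one feature row; all feature names here are hypothetical."""
    # Encode day of year on a circle so late December sits next to early January.
    angle = 2 * math.pi * day.timetuple().tm_yday / 365.25
    return {
        "doy_sin": math.sin(angle),
        "doy_cos": math.cos(angle),
        "irradiance": irradiance,
        "temperature": temperature,
        "cloud_cover": cloud_cover,  # assumed to be a fraction in [0, 1]
        # Interaction term: irradiance scaled by the cloud-free fraction of sky.
        "irradiance_x_clear_fraction": irradiance * (1.0 - cloud_cover),
    }


def season_stratified_folds(days, n_folds=5):
    """Give each row a fold id so every fold holds a near-equal share of each season."""
    by_season = defaultdict(list)
    for i, day in enumerate(days):
        by_season[SEASON_BY_MONTH[day.month]].append(i)
    fold_of = [0] * len(days)
    for indices in by_season.values():
        # Deal rows of one season out to the folds in round-robin order.
        for position, i in enumerate(indices):
            fold_of[i] = position % n_folds
    return fold_of


def iter_splits(fold_of, n_folds=5):
    """Yield (train_indices, test_indices) pairs, one per fold."""
    for fold in range(n_folds):
        train = [i for i, f in enumerate(fold_of) if f != fold]
        test = [i for i, f in enumerate(fold_of) if f == fold]
        yield train, test


# Example: one year of daily rows split into 5 season-balanced folds.
days = [date(2023, 1, 1) + timedelta(days=i) for i in range(365)]
fold_of = season_stratified_folds(days)
splits = list(iter_splits(fold_of))
```

A grid search would then fit each candidate configuration once per split and average the validation scores. If scikit-learn is the library in use (the text does not say), index pairs like these can be passed to GridSearchCV through its cv argument, which accepts an iterable of (train, test) splits.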
The Random Forest achieved an R² of 0.87 on the test set, a solid improvement over the Decision Tree baseline at 0.72. Feature importance scores showed that peak solar irradiance, ambient temperature, and cloud cover were the three strongest predictors, together accounting for over 75% of the total feature importance.
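For reference, the two quantities reported here can be computed as in the sketch below. R² is one minus the ratio of the residual sum of squares to the total sum of squares, and the combined share of the top predictors is their summed importance divided by the sum over all features. The function names and the toy numbers are purely illustrative and are not the project's data or results.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def top_k_share(importances, k=3):
    """Return the k most important feature names and their combined share."""
    total = sum(importances.values())
    ranked = sorted(importances.items(), key=lambda item: item[1], reverse=True)
    top = ranked[:k]
    return [name for name, _ in top], sum(value for _, value in top) / total


# Toy values, for illustration only.
perfect = r2_score([1, 2, 3, 4], [1, 2, 3, 4])           # 1.0: predictions match exactly
mean_only = r2_score([1, 2, 3, 4], [2.5, 2.5, 2.5, 2.5])  # 0.0: no better than the mean
names, share = top_k_share({"f1": 4, "f2": 3, "f3": 2, "f4": 1}, k=3)
```

An R² of 0 corresponds to always predicting the mean of the target, so the gap between 0.72 and 0.87 reflects how much more of the day-to-day variance the Random Forest explains than the baseline.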