My thoughts here (not claiming it is all correct): the question is whether one model can do it all. Sometimes you need one model to forecast baseline demand and a second one to model the crazy peaks. If you want a single model that gets the crazy peaks right, you might want to stick with MSE as the evaluation metric, since it penalizes large errors quadratically (maybe even use a weighted loss function, which you can easily integrate into XGBoost using Nixtla), and maybe even use MSE as the objective function. I would try both MSE and Tweedie as the objective during tuning and then evaluate on MSE. But in the end you have to test and see what works best, of course. It is always hard to judge without looking at the data.

On top of that: do you have other exogenous features, and how do you encode categorical features? (Sometimes target encoding or CatBoost encoding works better than OHE.) Maybe this gives you some inspiration on what to try next 🙂

P.S.: You could maybe do the following: build a baseline model and a second model (a residual model) to forecast the peaks. The second model is trained on the residuals between your baseline forecast and the true values, and is then used to correct the initial forecast so it matches the peaks better.
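To make the baseline-plus-residual idea and the weighted MSE a bit more concrete, here is a rough sketch in plain Python. Everything in it is made up for illustration: the toy data, the `promo` flag as the peak driver, the helper names `weighted_mse` and `fit_group_mean`, and the peak weight of 5. The two "models" are just group means standing in for whatever regressors you actually use (e.g. two XGBoost models), so treat it as a picture of the mechanics, not a recommended implementation.

```python
from collections import defaultdict
from statistics import mean


def weighted_mse(y_true, y_pred, weights=None):
    """Mean squared error; optional per-observation weights (e.g. upweight peaks)."""
    if weights is None:
        weights = [1.0] * len(y_true)
    num = sum(w * (t - p) ** 2 for w, t, p in zip(weights, y_true, y_pred))
    return num / sum(weights)


def fit_group_mean(keys, targets):
    """Toy stand-in for a real model: predicts the mean target seen per key."""
    groups = defaultdict(list)
    for k, t in zip(keys, targets):
        groups[k].append(t)
    table = {k: mean(v) for k, v in groups.items()}
    fallback = mean(targets)  # used for keys never seen during fitting

    def predict(new_keys):
        return [table.get(k, fallback) for k in new_keys]

    return predict


# Hypothetical data: 28 days, demand rises over the week, promo days add a spike.
days = range(28)
weekday = [d % 7 for d in days]
promo = [1 if d in {3, 9, 16, 22} else 0 for d in days]
demand = [100 + 10 * wd + 200 * p for wd, p in zip(weekday, promo)]

train, test = slice(0, 21), slice(21, 28)

# Stage 1: baseline model that only knows the weekday.
baseline = fit_group_mean(weekday[train], demand[train])
base_train = baseline(weekday[train])

# Stage 2: residual model, fitted on what the baseline got wrong,
# using a feature that explains the peaks (here: the promo flag).
residuals = [y - b for y, b in zip(demand[train], base_train)]
residual_model = fit_group_mean(promo[train], residuals)

# Final forecast = baseline forecast + predicted correction.
base_test = baseline(weekday[test])
correction = residual_model(promo[test])
final_pred = [b + c for b, c in zip(base_test, correction)]

# Evaluate with plain MSE and with a peak-weighted MSE (weight 5 is arbitrary).
y_test = demand[test]
peak_weights = [5.0 if p else 1.0 for p in promo[test]]
mse_base = weighted_mse(y_test, base_test)
mse_corr = weighted_mse(y_test, final_pred)
wmse_base = weighted_mse(y_test, base_test, peak_weights)
wmse_corr = weighted_mse(y_test, final_pred, peak_weights)

print(f"MSE          baseline={mse_base:.1f}  corrected={mse_corr:.1f}")
print(f"weighted MSE baseline={wmse_base:.1f}  corrected={wmse_corr:.1f}")
```

If you port this to XGBoost, the corresponding knobs should be the `objective` parameter (`"reg:squarederror"` vs. `"reg:tweedie"`) and the `sample_weight` argument of `fit`. One caveat: with a flexible model, in-sample residuals tend to be too small, so it is probably better to train the residual model on out-of-fold residuals.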