# general
j
Hey. You shouldn't do that, because it would result in oversubscription: e.g. if you have 10 CPUs and try to assign 20 concurrent tasks to them, each CPU will keep switching back and forth between tasks, which slows down the overall program. You should choose between running the trials sequentially with each random forest using all cores (optuna=1, rf=-1), or running several trials in parallel with each random forest trained in a single thread (optuna=-1, rf=1). I think the first one would be the fastest, but you can run a quick benchmark to verify.
You could also mix them, such that their product equals the number of CPUs, e.g. if you have 10 CPUs you could use (optuna=2, rf=5).
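For reference, a minimal sketch of the first option (optuna=1, rf=-1) with Optuna and scikit-learn; the dataset, search space, and scoring below are placeholder assumptions, not from this thread:

```python
import optuna
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

X, y = make_regression(n_samples=1_000, n_features=10, random_state=0)

def objective(trial):
    model = RandomForestRegressor(
        n_estimators=trial.suggest_int('n_estimators', 50, 300),
        max_depth=trial.suggest_int('max_depth', 3, 12),
        n_jobs=-1,  # each forest trains on all cores
        random_state=0,
    )
    return cross_val_score(model, X, y, cv=3).mean()

study = optuna.create_study(direction='maximize')
# n_jobs=1 (Optuna's default) keeps the trials sequential, so the forest's
# threads are the only ones competing for the CPUs
study.optimize(objective, n_trials=20, n_jobs=1)
```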
Also, you can use LightGBM with `boosting='rf'`, which should yield similar results but is way faster.
💡 1
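For reference, a minimal sketch of that mode with placeholder data; note that, as far as I know, LightGBM's rf mode requires bagging to be enabled (bagging_freq > 0 and 0 < bagging_fraction < 1), otherwise it raises an error:

```python
import lightgbm as lgb
from sklearn.datasets import make_regression

X, y = make_regression(n_samples=1_000, n_features=10, random_state=0)

model = lgb.LGBMRegressor(
    boosting_type='rf',    # random forest mode instead of gradient boosting
    n_estimators=100,      # number of trees in the forest
    bagging_freq=1,        # resample rows at every iteration
    bagging_fraction=0.8,  # each tree sees 80% of the rows
    random_state=0,
)
model.fit(X, y)
```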
m
perfect, thanks José!
Hello @José Morales, sorry to bother you again, but it's a quick thing. It worked and I managed to train the model; I used LightGBM with the `boosting='rf'` argument, it was a great tip. Now I will train a second model on the residuals of the first. What is the best way to access the in-sample predictions? `mlf` is the MLForecast object with the already-tuned LightGBM model. I believe I can somehow access the in-sample predictions from `mlf`, but I couldn't find how:
```python
mlf.fit(
    train,
    id_col='unique_id',
    time_col='ds',
    target_col='y',
    static_features=['ADI', 'CV2', 'accumulated_sold_qty', 'price',
                     'freight', 'median_ticket', 'origin', 'revenue'],
)
```
j
Hey, yeah, sorry, I just realized this isn't very well documented. You have to run fit with `fitted=True` and then use the `forecast_fitted_values` method.
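For reference, a minimal sketch putting both answers together, assuming the `train` frame and tuned `mlf` object from above; the `LGBMRegressor` column name is an assumption (mlforecast names prediction columns after the model), so adjust it to your setup:

```python
# fitted=True stores the in-sample predictions so that
# forecast_fitted_values() can return them alongside the actual target
mlf.fit(
    train,
    id_col='unique_id',
    time_col='ds',
    target_col='y',
    static_features=['ADI', 'CV2', 'accumulated_sold_qty', 'price',
                     'freight', 'median_ticket', 'origin', 'revenue'],
    fitted=True,
)

fitted = mlf.forecast_fitted_values()
# residuals for the second model; the prediction column name is assumed
fitted['residual'] = fitted['y'] - fitted['LGBMRegressor']
```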
m
It worked perfectly. Thank you very much, José!
👍 1