# mlforecast
m
Thanks for developing such a great project. I have a question about using LightGBM. I'm having issues with the distributed module (completely unrelated), so I'm thinking about just using a large single machine to do the training. My dataset is roughly 800M records, and reading it into a pandas dataframe may be an issue (or maybe not, just speculating). Does MLForecast accept inputs other than pandas dataframes? Would I be able to use the native lgb.Dataset(...)?
j
Hey. We currently support only pandas dataframes, so you'd need to load it into memory. You could build a LightGBM dataset after the preprocessing and train using that, though I'm not sure that'd help a lot.
m
Makes sense. Thanks for the quick response!
j
We try to keep the types where possible, so if you define the id as categorical and the target as float32 you could reduce the memory usage.
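For example, something like the following pandas-only sketch shows the kind of saving to expect. The column names (`unique_id`, `ds`, `y`) and the synthetic sizes are assumptions for illustration, and the actual reduction depends on how long the ids are and how many series there are:

```python
import numpy as np
import pandas as pd

# Synthetic panel: string ids and a float64 target (illustrative sizes).
n_series, n_obs = 100, 1_000
df = pd.DataFrame({
    "unique_id": np.repeat([f"series_{i}" for i in range(n_series)], n_obs),
    "ds": np.tile(np.arange(n_obs), n_series),
    "y": np.random.default_rng(0).random(n_series * n_obs),
})
before = df.memory_usage(deep=True).sum()

# Categorical ids store each distinct label once plus small integer codes;
# float32 halves the size of the target column compared to float64.
df["unique_id"] = df["unique_id"].astype("category")
df["y"] = df["y"].astype("float32")
after = df.memory_usage(deep=True).sum()

print(f"before: {before / 1e6:.1f} MB, after: {after / 1e6:.1f} MB")
```

When reading from disk, passing the same types through the `dtype` argument of `pd.read_csv` avoids ever materializing the larger representation.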