Hello everyone, I finished fully annotating and updating my PyData NYC tutorial on Large Scale Timeseries Forecasting.
It covers the following topics:
1. Using the statsforecast library to run statistical models on top of Spark, Dask, and Ray
2. Preprocessing large-scale data using Fugue on top of Spark, Dask, and Ray
3. A section on hierarchical forecasting with hierarchicalforecast, contributed by @fede (nixtla) (they/them) and @Max (Nixtla) from Nixtla.
4. Running the training pipeline on top of a Dask cluster managed by Coiled, though the same setup will also work on Spark and Ray clusters.
Happy to present it at any Meetup/event if you know of any!