# general
k
Hello everyone, I learned about Nixtla from the Databricks support team while our company was looking for distributed forecasting solutions. Before we move forward with implementing Nixtla, we want to verify that it has the following features: 1. Distributed computing across cluster and workers in a Spark environment 2. Compatibility with GPUs and multi-node GPUs 3. Bring your own models 4. Control over dataset creation. Could this community provide blogs, links, or answers to these questions? Cc: <!here> @Nixtla Team
m
Hi @Kaustav Chaudhury! Thanks for the interest. We are happy to jump on a brief call with you if that works. Either way, @fede (nixtla) (they/them) will provide some examples.
k
Sure, Max, we can find a time 🙂
f
hey @Kaustav Chaudhury!
1. That’s perfectly possible with our mlforecast and statsforecast libraries. For statsforecast, here’s an example using Databricks (https://notebooks.databricks.com/notebooks/RCG/intermittent-forecasting/index.html?&_ga=2[…]08250371.1680567893-82c188cb-e021-40d7-bdec-55707a9664cd). The API has changed a little since then: it is no longer necessary to declare a backend, you just pass a Spark dataframe to the forecast method. MLForecast works similarly: you import the DistributedMLForecast class and pass a Spark dataframe to the fit method. Here’s an example: https://nixtla.github.io/mlforecast/distributed.forecast.html.
2. Our neuralforecast library is compatible with GPUs and multi-node GPUs, but we are still working on making it compatible with Spark. We haven’t tested it yet, but mlforecast could also work in such environments (through LightGBM). Unfortunately, StatsForecast currently does not support GPUs.
3. Bringing your own models is perfectly possible with statsforecast (univariate models). We can help you with that if you are interested. MLForecast can also support custom models, but in a distributed environment such as Spark it might not be easy, since the model needs to be distributed as well (for example, we use SynapseML to train LightGBM).
4. You can use your own (Spark, Dask, or Ray) dataframe without a problem. The only requirement is to have at least three columns: unique_id, identifying the time series; ds, identifying the temporal column; and y, the target column. Here’s a description of the input dataset: https://nixtla.github.io/neuralforecast/examples/data_format.html.
Let us know how we can help you. :)
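To make the three-column layout from point 4 concrete, here is a minimal sketch using only the Python standard library. The helper names (`to_long_format`, `missing_columns`) and the toy data are hypothetical and not part of any Nixtla library; the only thing taken from the message above is the column names unique_id, ds, and y.

```python
from datetime import date, timedelta

# Column names required by the long format described above.
REQUIRED_COLUMNS = ("unique_id", "ds", "y")


def to_long_format(series_by_id, start, step_days=1):
    """Flatten {series_id: [values, ...]} into long-format rows.

    Each row holds unique_id (series identifier), ds (timestamp)
    and y (target value). Hypothetical helper for illustration only.
    """
    rows = []
    for unique_id, values in series_by_id.items():
        for i, y in enumerate(values):
            rows.append({
                "unique_id": unique_id,
                "ds": start + timedelta(days=i * step_days),
                "y": float(y),
            })
    return rows


def missing_columns(rows):
    """Return the required columns that are absent from at least one row."""
    return [c for c in REQUIRED_COLUMNS if any(c not in r for r in rows)]


# Hypothetical toy data: two daily series of different lengths.
sales = {"store_1": [10, 12, 11], "store_2": [5, 7]}
rows = to_long_format(sales, start=date(2023, 1, 1))

print(rows[0])
print(missing_columns(rows))
```

Rows shaped like this could then be loaded into whichever dataframe type you use (pandas, Spark, Dask, or Ray) before being passed to the libraries; that loading step is not shown here because it depends on your environment.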
k
We want to bring DeepVAR in as a model in neuralforecast.