This message was deleted Nixtla Community #general


# general

Slackbot

03/11/2023, 1:57 PM

This message was deleted.

👍 1

🙌 1

fede (nixtla) (they/them)

03/13/2023, 5:16 PM

Hey @Akmal Soliev! That would be an amazing enhancement. The function that transforms the dataframe into numpy is this: https://github.com/Nixtla/statsforecast/blob/main/statsforecast/core.py#L407.

Akmal Soliev

03/14/2023, 10:08 AM

Thx, will take a look!

Akmal Soliev

03/14/2023, 8:39 PM

Hey, I had a follow up question, I'm currently trying to use your synthetic data generator function (https://github.com/Nixtla/statsforecast/blob/main/statsforecast/utils.py). In order to use it as polars DataFrame I would need to recreate the steps from scratch due to the categorical dtypes (I have no clue why that's such a pain in the @$$ to deal with). Prior to doing that I wanted to ask you what's the point of having those values as categorical?

fede (nixtla) (they/them)

03/14/2023, 8:54 PM

hey @Akmal Soliev! It is just for efficiency purposes when dealing with experiments that have thousands of time series, it’s more memory efficient to deal with categorical instead of object with such a large amount of data
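The memory argument above can be illustrated with a minimal pandas sketch. The series count, observation count, and id naming below are made-up values for illustration and are not taken from `generate_series`.

```python
import pandas as pd

# Hypothetical sizes, chosen only to make the effect visible.
n_series, n_obs = 1_000, 100

# One id string repeated for every observation of each series.
ids = pd.Series(
    [f"series_{i}" for i in range(n_series) for _ in range(n_obs)],
    name="unique_id",
)

as_object = ids.astype(object)        # one Python string object per row
as_category = ids.astype("category")  # small integer codes + one copy of each label

object_bytes = as_object.memory_usage(deep=True)
category_bytes = as_category.memory_usage(deep=True)

print(f"object:   {object_bytes:,} bytes")
print(f"category: {category_bytes:,} bytes")
```

The categorical column stores each distinct id once plus an integer code per row, which is why it tends to be much smaller when a few thousand ids repeat across many rows.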

fede (nixtla) (they/them)

03/14/2023, 8:56 PM

maybe we could add a new argument to `generate_series` controlling if the conversion is required, something like `uid_to_categorical=True`

Akmal Soliev

03/15/2023, 11:17 AM

I'll see if I can just rebuild it

Akmal Soliev

03/15/2023, 4:53 PM

okay rebuilt it, also got some time shaved off the generation:


before:
________________________________________________________
Executed in    3.93 secs    fish           external
   usr time    4.68 secs    0.06 millis    4.68 secs
   sys time    1.44 secs    1.03 millis    1.43 secs

after:
________________________________________________________
Executed in  833.23 millis    fish           external
   usr time    2.23 secs      0.07 millis    2.23 secs
   sys time    0.50 secs      1.20 millis    0.50 secs

Akmal Soliev

03/15/2023, 5:53 PM

This is the most useless improvement 🤣

Max (Nixtla)

03/17/2023, 6:41 PM

@Akmal Soliev we think it's great. Thanks.

🙏 1

Akmal Soliev

05/09/2023, 9:21 PM

[PR Update] After two months of persistent effort, I have successfully developed a versatile DataFrame conversion solution for both Polars and Pandas. This solution is compatible with any structured two-dimensional data, provided that the output can be consolidated into named Numpy arrays. ✅ All local tests have passed; GitHub Actions required `util.py` to be modified, as I have added a Polars test. NOTE:
• Have not yet implemented engine into the `_StatsForecast` and/or `StatsForecast` class for I/O to match.
• `core.ipynb` has `util.py`'s `generate_series` function so that the code can actually run. Will be removed in future.
More info: https://github.com/Nixtla/statsforecast/pull/448#issuecomment-1537431035

🙌 3
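As a rough illustration of what "consolidated into named Numpy arrays" could look like, here is a minimal pandas-only sketch. The helper name `frame_to_arrays`, its parameters, and the returned keys are hypothetical and do not reproduce the code in the PR or in `statsforecast/core.py`.

```python
import numpy as np
import pandas as pd


def frame_to_arrays(df, id_col="unique_id", time_col="ds", target_col="y"):
    """Hypothetical helper: flatten a long-format frame into named arrays."""
    # Sort so that each series is contiguous and ordered in time.
    df = df.sort_values([id_col, time_col])
    # Unique ids (sorted) and the number of rows belonging to each one.
    uids, counts = np.unique(df[id_col].to_numpy(), return_counts=True)
    # indptr[i]:indptr[i + 1] is the slice of `data` for series i.
    indptr = np.concatenate([[0], np.cumsum(counts)])
    return {
        "uids": uids,
        "data": df[target_col].to_numpy(dtype=float),
        "indptr": indptr,
    }


df = pd.DataFrame(
    {"unique_id": ["b", "a", "a"], "ds": [1, 2, 1], "y": [3.0, 2.0, 1.0]}
)
arrays = frame_to_arrays(df)
```

Any dataframe library that can hand back its columns as NumPy arrays could feed the same structure, which is the property the message above relies on.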

fede (nixtla) (they/them)

05/10/2023, 7:58 PM

Awesome @Akmal Soliev!

Max (Nixtla)

05/10/2023, 8:09 PM

You are the best @Akmal Soliev, thanks.

Max (Nixtla)

05/10/2023, 8:10 PM

We are going to write a brief post to communicate the new feature. Which accounts should we mention to thank you?

Akmal Soliev

05/11/2023, 3:58 PM

@Max (Nixtla) Thank you, I'm glad to help. I just implemented a change so that if the input dataframe is polars or pandas, `StatsForecast` should run without any issues. TODO:
• Modify the `plot` staticmethod to work with both polars and pandas
• Implement I/O matching; at the moment there is a variable for that, `self.engine`
◦ At the moment it is:
◦ polars in and pandas out
◦ pandas in and pandas out

Max (Nixtla)

05/11/2023, 7:26 PM

Great :)

Akmal Soliev

05/12/2023, 11:01 PM

@Max (Nixtla) All the local tests have passed on Polars and Pandas using the modified `generate_series`. More information: https://github.com/Nixtla/statsforecast/pull/448#issuecomment-1546411593 @fede (nixtla) (they/them) could I ask you to please check the PR and let me know if I've missed anything? From my end everything worked smoothly.

Akmal Soliev

05/14/2023, 8:32 PM

@fede (nixtla) (they/them) here's the file with all the tests done with Polars. NOTE: `groupby` doesn't have a sort param in polars, hence I have to chain `.sort('unique_id')`. P.S. Latest PR is up with bug fixes

core_polars_test.ipynb

Akmal Soliev

05/23/2023, 3:20 PM

Hey, there is a `_parse_ds_type` bug on `main` where `int` datestamps are converted into datetime in certain cases, due to the check failure; specifically `issubclass(df["ds"].dtype.type, np.integer)`, which can be checked much better with kind, `np.array().dtype.kind in ["i", "f"]`, where `"i"` stands for int and `"f"` stands for float. Implemented this change in my PR: https://github.com/Nixtla/statsforecast/pull/448

🙌 1
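The difference between the two checks can be sketched with NumPy alone. The helper name `is_numeric_ds` is hypothetical and is not the function used in the PR; it only shows how a `dtype.kind` test behaves on integer, float, and datetime arrays.

```python
import numpy as np


def is_numeric_ds(values):
    """Hypothetical helper: True when the values are int ("i") or float ("f")."""
    return np.asarray(values).dtype.kind in ("i", "f")


int_ds = np.array([1, 2, 3])
float_ds = np.array([1.0, 2.0, 3.0])
date_ds = np.array(["2023-01-01", "2023-01-02"], dtype="datetime64[D]")

print(is_numeric_ds(int_ds))    # integer datestamps: kind "i"
print(is_numeric_ds(float_ds))  # float datestamps: kind "f"
print(is_numeric_ds(date_ds))   # datetime64 has kind "M", so not numeric

# The subclass test only recognises integer scalar types, so floats fall through.
print(issubclass(float_ds.dtype.type, np.integer))
```

Note that the kind `"u"` (unsigned integers) is not covered by the `["i", "f"]` list quoted in the message; whether it should be is a separate design decision.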
