# general
a
I'm trying to run some tests on a data frame called panel, which holds data on a 1-minute time frame and has the characteristics below. Unfortunately, I'm getting NaN for every value. What could be wrong? How can I fix it?
panel.dtypes:
    unique_id             int64
    ds           datetime64[ns]
    y                   float64
Data example:
    unique_id  ds                y
    499995     2024-02-27 05500  1.08497
    499996     2024-02-27 05600  1.08495
    499997     2024-02-27 05700  1.085
    499998     2024-02-27 05800  1.08494
    499999     2024-02-27 05900  1.08497
Call and output:
    tsfeatures(panel, dict_freqs={'M': 1}, features=[acf_features])

            unique_id  x_acf1  x_acf10  diff1_acf1  diff1_acf10  diff2_acf1  diff2_acf10
    0               0     NaN      NaN         NaN          NaN         NaN          NaN
    1               1     NaN      NaN         NaN          NaN         NaN          NaN
    2               2     NaN      NaN         NaN          NaN         NaN          NaN
    3               3     NaN      NaN         NaN          NaN         NaN          NaN
    4               4     NaN      NaN         NaN          NaN         NaN          NaN
    ...           ...     ...      ...         ...          ...         ...          ...
    499995     499995     NaN      NaN         NaN          NaN         NaN          NaN
    499996     499996     NaN      NaN         NaN          NaN         NaN          NaN
    499997     499997     NaN      NaN         NaN          NaN         NaN          NaN
    499998     499998     NaN      NaN         NaN          NaN         NaN          NaN
    499999     499999     NaN      NaN         NaN          NaN         NaN          NaN
j
If it's a single series, you need to set unique_id to a single value, like 0 (not consecutive integers). Also, the minute frequency in pandas is min (M is month end).
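To make the point about unique_id concrete, here is a minimal sketch using only pandas and NumPy. The data is synthetic (the values and the 300-row length are made up for illustration), and it only shows how many rows each "series" contains before and after the fix; it does not call tsfeatures itself.

```python
import numpy as np
import pandas as pd

# Synthetic 1-minute data standing in for the real panel (values are made up).
n = 300
rng = np.random.default_rng(0)
panel = pd.DataFrame({
    "unique_id": np.arange(n),  # problematic setup: a different id on every row
    "ds": pd.date_range("2024-02-27", periods=n, freq="min"),  # "min" = minute
    "y": 1.085 + rng.normal(0, 1e-4, n).cumsum(),
})

# With one id per row, every "series" has exactly one observation,
# so there is nothing to compute an autocorrelation from.
rows_per_series_before = panel.groupby("unique_id").size()
print(rows_per_series_before.max())

# Fix: the whole frame is a single series, so give every row the same id.
panel["unique_id"] = 0
rows_per_series_after = panel.groupby("unique_id").size()
print(rows_per_series_after)
```

With the id fixed, the call from the question would presumably become something like tsfeatures(panel, freq=1, features=[acf_features]), returning a single row of features for the single series. Passing freq=1 here is an assumption made to sidestep the frequency-alias question.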
a
José, thanks for your quick answer, but I'm still not getting it. I'm trying to use your indicators as features for a regression and/or binary classification model, which is why I need the indicator in as many rows as possible. I understand that I may get some NaN values at the beginning of the data frame, since the indicator uses some rows as its window length, but I'm only getting 1 row from a data frame that has 500,000 rows. It's also very important to mention that I can't group any rows of the 1-minute data frame: each row represents one minute of a time series that varies every minute, so I need the indicator in as many rows as possible without grouping. How can I fix it? Can you help me with some examples or links to documents or GitHub repos?
Below are the tail(5) of the data frame and some code suggested by ChatGPT that takes your code as input. I'm still not getting the indicator in as many rows as I can, and all the results are still NaN:
    unique_id  ds                y
    499992     2024-02-27 05500  1.08497
    499993     2024-02-27 05600  1.08495
    499994     2024-02-27 05700  1.085
    499995     2024-02-27 05800  1.08494
    499996     2024-02-27 05900  1.08497

    from tsfeatures import hurst
    tsfeatures(panel, dict_freqs={'min': 1}, features=[hurst])
j
Have you read the README here? As I said before, you're not setting the unique_id correctly.
a
I'm truly not sure what I'm missing. I keep getting NaN in all rows for all indicators. Could you point me to a link with the description and examples of how to set the unique_id field?
As I mentioned, I have a very simple data frame with 3 columns:
    unique_id             int64
    ds           datetime64[ns]
    y                   float64
I have tried changing the unique_id column to an integer series, to the string 'min1' (since the data frame changes from one row to the next at 1-minute intervals), and to 0. I also asked ChatGPT and Perplexity, without any results. Could you help me with an example?
y
@Alejandro Holguin Mora I think what's happening with your code is that you're treating unique_id as if it were an index. It should be like the name of a series. If you're only working with a single series (along with exogenous regressors), the unique_id should be constant on every row. It could be something like "EURUSD" or "H1", or even a number, as long as it's the same for every row. Instead of 0, 1, 2, 3, 4, ... it should be like H1, H1, H1, H1, H1, ... I think that's what José is trying to tell you.
j
Also note that the tsfeatures library outputs one value per series, and it sounds like you want a rolling aggregation.
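For the rolling case, one possible approach is plain pandas rather than tsfeatures. The sketch below is not a Nixtla API: it computes a lag-1 autocorrelation over a sliding window with Series.rolling(...).apply(...), on synthetic data, with a 60-row window chosen arbitrarily for illustration. The result has one value per row, with NaN only for the first window - 1 rows.

```python
import numpy as np
import pandas as pd

# Synthetic 1-minute series standing in for the real data (values are made up).
n = 300
rng = np.random.default_rng(0)
panel = pd.DataFrame({
    "unique_id": 0,  # single series, constant id
    "ds": pd.date_range("2024-02-27", periods=n, freq="min"),
    "y": 1.085 + rng.normal(0, 1e-4, n).cumsum(),
})

window = 60  # hypothetical window length: the last 60 minutes


def lag1_autocorr(values: pd.Series) -> float:
    """Lag-1 autocorrelation of the values inside one window."""
    return values.autocorr(lag=1)


# raw=False passes each window to the function as a pandas Series.
panel["rolling_acf1"] = (
    panel["y"].rolling(window).apply(lag1_autocorr, raw=False)
)

print(panel.tail())
```

Applying a Python function per window is slow on 500,000 rows, so for the full data frame a vectorized alternative may be preferable. The sketch is only meant to show the shape of the result: one feature value per minute instead of one per series.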
a
@José Morales Hi José, thanks for your answer. Yes, I need to compute rolling indicators over a data frame whose rows are individual, unique data entries. Is it possible to get these rolling values for each row using the Nixtla indicators?