Hello everyone, good day.
I'm new to ML and have been tasked with time series forecasting. I have sensor data from electric vehicles sampled at 1 Hz.
I want to forecast battery ageing. Since battery ageing is mostly associated with the charge and discharge cycles,
I built my features at the cycle level. That leaves me with very little data, around 1000 rows. So I decided to build a second dataset by aggregating the raw signals at the minute or hour level
and pass that to the model as well. Is there a better solution to this problem?
From what I have researched, one option is to train a separate model on each dataset and combine them using hierarchical reconciliation. But is this an optimal solution?
I thought I'd ask whether anyone has experienced a similar situation.
Any leads would be a great help.
Thanks,
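For reference, the minute/hour aggregation described in the question could be sketched with pandas as below. This is only a minimal sketch: the column names (current_a, voltage_v, temp_c) and the synthetic values are assumptions for illustration, not the actual sensor schema.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for two hours of 1 Hz sensor data (hypothetical columns).
rng = np.random.default_rng(0)
n_seconds = 2 * 3600
index = pd.date_range("2024-01-01", periods=n_seconds, freq="s")
raw = pd.DataFrame(
    {
        "current_a": rng.normal(20.0, 5.0, n_seconds),
        "voltage_v": rng.normal(380.0, 2.0, n_seconds),
        "temp_c": rng.normal(25.0, 1.0, n_seconds),
    },
    index=index,
)


def aggregate(df: pd.DataFrame, rule: str) -> pd.DataFrame:
    """Downsample the raw signals to the given frequency with summary statistics."""
    out = df.resample(rule).agg(
        {
            "current_a": ["mean", "max"],
            "voltage_v": ["mean", "min"],
            "temp_c": ["mean"],
        }
    )
    # Flatten the (column, statistic) MultiIndex into names like "current_a_mean".
    out.columns = ["_".join(col) for col in out.columns]
    return out


per_minute = aggregate(raw, "1min")
per_hour = aggregate(raw, "1h")
print(per_minute.head())
print(per_hour)
```

Which statistics to keep per signal is a modelling choice; mean, min and max are just common starting points.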
Matthew Lesko
08/23/2024, 5:51 PM
Not an expert, but I don't think you need hierarchical reconciliation. I'd start with a simpler model of daily maximum. Graph it. Presumably you'll see a downward trend over time as the battery holds less and less charge. With that I'd try the AutoCES/ETS/ARIMA/Theta algorithms with cross validation: https://nixtlaverse.nixtla.io/statsforecast/docs/tutorials/crossvalidation.html
Next, analyze the graph to see if there's weekly seasonality (for example, charge troughs during the week because of heavier usage than on the weekend). If so, consider using MSTL with a weekly seasonality (season_length = [7]) in combination with the above: https://nixtlaverse.nixtla.io/statsforecast/docs/models/multipleseasonaltrend.html
Depending on how well that performs you might revisit with hourly granularity and apply both daily and weekly seasonality.
Lastly, you can also include exogenous variables with AutoARIMA. For example, is there a regular point in time when vehicles are charged, or do you have good weather forecast data? I can imagine both might influence battery lifespan. See https://nixtlaverse.nixtla.io/statsforecast/docs/how-to-guides/exogenous.html
adhi v
08/23/2024, 7:02 PM
Hello Matthew, thanks a lot for your input. I will go through these and check whether they work. My problem is that the ageing values from the sensors are not great, so I use ageing calculated from an empirical model, which can only be computed at the cycle level. That means I don't have an ageing value at the second, minute, or hour level. Anyway, the suggestions you gave make sense. I will try them. Thanks again.
Matthew Lesko
08/23/2024, 7:37 PM
If the sensor data is unreliable, I might look at doing some outlier detection and replacement. I've found ensemble methods useful for this. Basically, train a model and use it to predict its own training data. Iteratively replace values that fall outside some diminishing threshold, then retrain and repredict. Once you are no longer making replacements, take that data set through a time series model.
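The loop described above might look roughly like the sketch below. To stay dependency-light it swaps in a simple polynomial trend fit (NumPy only) for the ensemble model, so it illustrates the replace-and-refit loop and not the ensemble itself. The threshold schedule (start at 5 robust standard deviations, shrink by 20% per pass, stop shrinking at 3) is an assumption; without a floor, a threshold that keeps diminishing would eventually replace everything.

```python
import numpy as np


def iterative_outlier_replacement(y, degree=1, start_k=5.0, decay=0.8,
                                  min_k=3.0, max_iter=50):
    """Fit, flag points far from the fit, replace them, tighten, repeat."""
    y = np.asarray(y, dtype=float).copy()
    t = np.arange(len(y), dtype=float)
    replaced = np.zeros(len(y), dtype=bool)
    k = start_k
    for _ in range(max_iter):
        if replaced.all():
            break
        # "Train a model and use it to predict itself": in-sample fit.
        fitted = np.polyval(np.polyfit(t, y, degree), t)
        resid = y - fitted
        # Robust spread (scaled MAD) of points that were never replaced.
        kept = resid[~replaced]
        scale = 1.4826 * np.median(np.abs(kept - np.median(kept)))
        if scale == 0:
            break
        mask = (~replaced) & (np.abs(resid) > k * scale)
        if mask.any():
            y[mask] = fitted[mask]
            replaced |= mask
        elif k <= min_k:
            break  # no replacements at the tightest threshold: done
        k = max(min_k, k * decay)
    return y, replaced


# Synthetic example: linear decline, small noise, three injected spikes.
rng = np.random.default_rng(3)
t = np.arange(200)
trend = 100 - 0.05 * t
series = trend + rng.normal(0, 0.2, len(t))
spikes = np.array([20, 75, 140])
series[spikes] += np.array([-15.0, 12.0, -10.0])

cleaned, replaced = iterative_outlier_replacement(series)
print("replaced points:", int(replaced.sum()))
```

The cleaned array can then be passed to the time series model in place of the raw values.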