Hello everyone, I'm currently working on a project...
# general
r
Hello everyone, I'm currently working on a project involving refrigerator sensors labeled a, b, c, and y. Sensor y measures a specific gas value inside the refrigerator, which is usually accessible only to technicians in real-world scenarios. Currently, I have access to data from all these sensors, but in the future, we will no longer have access to sensor y. The sensors a, b, and c are directly related to the gas value that y measures. I am looking to develop a model that uses sensors a, b, and c to accurately predict the readings of sensor y. Given this setup, I would ideally like to explore advanced models such as Graph Attention Neural Networks, but I am also open to more practical solutions that might be available within frameworks like Nixtla. Most of the models I've found require sensor y's data as an input rather than using it solely as a label for predictions. Could anyone advise on the best type of model to use for this prediction task? Any recommendations on models that use y as a label without requiring it as an input would be greatly appreciated. Thank you!
m
Hi @reinier, I think the main challenge here isn't time series forecasting but rather predictive modeling using the sensor readings. From your problem description, I’m assuming your data looks something like this:

Date | sensor a | sensor b | sensor c | y

Given this setup, I think you're correct in using sensors a, b, and c to predict the values of sensor y. If the date feature doesn’t play a critical role in the changes of sensor readings, it might be best not to add it to the predictive model, but rather to use it to keep the data organized:

Date 1: (a1, b1, c1) -> y1
Date 2: (a2, b2, c2) -> y2
Date 3: (a3, b3, c3) -> y3

TimeGPT and the rest of our models are for solving time series forecasting problems, and in your case, I think what is needed is some kind of ML algorithm for prediction. The simplest model would be something like this:

y = alpha*a + beta*b + gamma*c + error

Here the coefficients alpha, beta, and gamma can be estimated from the historical values. (I don’t think a linear model will be enough; this is just to give you an example.)
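As a rough illustration of the linear example above, here is a minimal sketch that estimates alpha, beta, and gamma by ordinary least squares with NumPy. The data is synthetic, the "true" coefficients are made up for the demo, and the intercept term and the helper `predict_y` are assumptions that are not part of the formula in the message. Sensor y appears only as the label.

```python
import numpy as np

# Synthetic stand-in data: the real sensor readings are not shown in the thread.
rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 3))              # columns: sensor a, sensor b, sensor c
true_coef = np.array([2.0, -1.0, 0.5])   # made-up coefficients for the demo
y = X @ true_coef + 3.0 + rng.normal(scale=0.01, size=n)

# Append a column of ones (assumed intercept) and solve the least-squares problem.
X_design = np.column_stack([X, np.ones(n)])
coef, *_ = np.linalg.lstsq(X_design, y, rcond=None)
alpha, beta, gamma, intercept = coef


def predict_y(a, b, c):
    """Predict sensor y from a, b, c alone; y itself is never an input."""
    return alpha * a + beta * b + gamma * c + intercept
```

Once the coefficients are fitted on historical data, `predict_y` only needs the readings that will still be available after sensor y is removed.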
r
Hi @Mariana Menchero, Thank you for your detailed response. I appreciate your suggestion of using sensors a, b, and c to predict sensor y through a predictive modeling approach. However, I have a concern regarding the potential loss of valuable information inherent in the time series data. For example, consider the temperature sensor inside the refrigerator. When the refrigerator door is opened, we immediately see an increase in temperature. The machine then works to cool down, leading to a later increase in temperature in other parts of the machine and a decrease in temperature in the refrigerator itself. This temporal relationship suggests that there might be significant value in using a time series model to capture these dynamics. Would it not be beneficial to utilize a time series-based model, such as an RNN or LSTM, to account for these temporal dependencies? Or am I misunderstanding something about the problem setup? So, to circle back, I do believe that the time series column (in my case, date and timestamp) plays an important role. However, I am having difficulty finding a model within your library that can effectively address my problem. Perhaps you can provide further assistance, and if not, I will continue searching for a solution elsewhere. Thank you again for your insights. Best regards, Reinier
m
Just seeing this thread now, but I'd probably model this with a regression, random forest, or gradient boosting model instead of a time series model. To account for the effect of time on the readings, I would add lag variables to the regression that indicate how long the door has been open or closed relative to each reading. The algorithms that would serve you best are probably in sklearn.
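One way to sketch the lag-variable idea with sklearn is shown below. The thread does not say whether a door open/closed signal is recorded, so this example swaps in lagged copies of sensors a, b, and c as the time-aware features instead. The column names, the lag choices, the helper `add_lags`, and the synthetic data are all assumptions for illustration only.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor


def add_lags(df, cols=("a", "b", "c"), lags=(1, 2, 3)):
    """Append lagged copies of the input sensors; y is never used as a feature."""
    out = df.copy()
    for col in cols:
        for lag in lags:
            out[f"{col}_lag{lag}"] = out[col].shift(lag)
    # The first max(lags) rows have no history, so they are dropped.
    return out.dropna()


# Synthetic stand-in data, ordered in time (hypothetical, not real sensor data).
rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "a": rng.normal(size=n),
    "b": rng.normal(size=n),
    "c": rng.normal(size=n),
})
# Made-up target: depends on the current a and on b two steps earlier.
df["y"] = (2.0 * df["a"] + 1.5 * df["b"].shift(2).fillna(0.0)
           + rng.normal(scale=0.1, size=n))

data = add_lags(df)
feature_cols = [c for c in data.columns if c != "y"]

# Chronological split: train on the past, evaluate on the most recent 20%.
split = int(len(data) * 0.8)
train_df, test_df = data.iloc[:split], data.iloc[split:]

model = GradientBoostingRegressor(random_state=0)
model.fit(train_df[feature_cols], train_df["y"])
r2 = model.score(test_df[feature_cols], test_df["y"])
print(f"R^2 on held-out period: {r2:.3f}")
```

The split is chronological rather than shuffled so that the evaluation resembles the future situation in which the model runs without sensor y. If a door signal is available, a "time since last door event" column could be added to `feature_cols` in the same way.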