m

12/07/2022, 10:40 AM
Greetings! I hope everyone is doing well. I am working on a regression problem and would like to use Transformers for it, but before jumping into the implementation, I am curious whether any of you have used Transformers for regression. I have around 90 features (floating point) and one target. I couldn't find any papers on Transformers for regression problems, so please let me know if any of you have used them for this purpose.
a

Andrei Tulbure

12/07/2022, 10:48 AM
TFT comes to mind if you want to do forecasting. But first try a simple linear regression and then move on to random forest/LightGBM. Get those as baselines. Then move on to more deep learning approaches, imo.
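
A minimal sketch of the baseline step suggested here, assuming scikit-learn is available. The data is a synthetic stand-in (90 float features, one target) and the model settings are illustrative assumptions, not anything taken from the thread:

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

# Synthetic placeholder for the real dataset (assumption): 90 features, one target.
X, y = make_regression(
    n_samples=300, n_features=90, n_informative=20, noise=10.0, random_state=0
)

# Two simple baselines; the hyperparameters are arbitrary illustrative choices.
baselines = {
    "linear_regression": LinearRegression(),
    "random_forest": RandomForestRegressor(n_estimators=50, random_state=0),
}

baseline_mse = {}
for name, model in baselines.items():
    # scikit-learn reports negated MSE so that higher is better; flip the sign back.
    scores = cross_val_score(model, X, y, cv=5, scoring="neg_mean_squared_error")
    baseline_mse[name] = -scores.mean()
    print(f"{name}: cross-validated MSE = {baseline_mse[name]:.2f}")
```

Any deep learning model tried afterwards would then need to beat these cross-validated MSE values to justify the extra complexity.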
m

12/07/2022, 10:50 AM
I have already trained LR and tree-based models and am treating those as baselines. I also read the TFT paper, but TFT is meant for time series problems, where the timestamp information is handled separately. I don't have timestamp info. I could use the index as a timestamp; that wouldn't really make sense, but it may be worth a try.
a

Andrei Tulbure

12/07/2022, 10:50 AM
Have you tried a simple NN? Like a 2-3 layered one?
m

12/07/2022, 10:51 AM
Yes, I did, but the MSE loss is quite high.
a

Andrei Tulbure

12/07/2022, 11:11 AM
Did you fine-tune it? Did you do neural architecture search?
m

12/07/2022, 12:09 PM
Yes, I fine-tuned it, but no luck with these approaches. I haven't tried NAS yet, but I will have a look for sure.
a

Andrei Tulbure

12/07/2022, 12:50 PM
Try fine-tuning the number of layers and the number of neurons.
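
One way to search over depth and width could look like the sketch below, assuming scikit-learn's `MLPRegressor` and `GridSearchCV`. The synthetic data and the candidate layer sizes are assumptions chosen only for illustration:

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic placeholder for the real dataset (assumption): 90 features, one target.
X, y = make_regression(
    n_samples=200, n_features=90, n_informative=20, noise=10.0, random_state=0
)

# Scale inputs before the network; MLPs are sensitive to feature scale.
pipeline = make_pipeline(StandardScaler(), MLPRegressor(max_iter=300, random_state=0))

# Each tuple is one architecture: its length is the number of hidden layers,
# its entries are the number of neurons per layer (illustrative values).
param_grid = {
    "mlpregressor__hidden_layer_sizes": [(32,), (64,), (32, 32), (64, 64), (64, 32, 16)]
}

search = GridSearchCV(pipeline, param_grid, cv=3, scoring="neg_mean_squared_error")
search.fit(X, y)

best_layers = search.best_params_["mlpregressor__hidden_layer_sizes"]
print("best hidden_layer_sizes:", best_layers)
print("cross-validated MSE:", -search.best_score_)
```

The same grid could be extended with learning rate or regularization strength, at the cost of more fits.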
k

Kin Gtz. Olivares

12/07/2022, 2:19 PM
Hey @Muhammad Hasnain Khan, if you have a classic regression problem, you might want to treat it as such:
P(Y_i | X_i)
The methods that we focus on at Nixtla are forecasting methods, solving:
P(Y_[t+1:t+h] | Y_[:t])
Key regression/forecasting differences:
• Forecasting aims to predict several steps ahead, making it a multivariate regression problem.
• Forecasting usually exploits time dependencies in the features, for example through lags (past series values). TFT, along with most of Nixtla's methods, actively exploits these time dependency structures.
• Regression often does not consider whether the prediction is for the future, present, or past.
As @Andrei Tulbure suggested, you might want to try RF/LGBM and linear regression first. If your problem does not have time dependencies, forecasting methods might not be suited to it. If you are interested in trying TFT, here is a usage example: https://nixtla.github.io/neuralforecast/models.tft.html
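
To make the difference concrete, here is a small sketch of how a series is typically cut into (lags, horizon) pairs for the forecasting setup P(Y_[t+1:t+h] | Y_[:t]). The helper `make_windows` and the toy series are hypothetical and not part of any Nixtla library:

```python
def make_windows(series, n_lags, horizon):
    """Split a series into (past lags, future values) training pairs.

    Hypothetical helper for illustration: each input is the last `n_lags`
    observations and each target is the next `horizon` observations.
    """
    pairs = []
    for t in range(n_lags, len(series) - horizon + 1):
        past = series[t - n_lags:t]      # Y_[t-n_lags:t], the lag features
        future = series[t:t + horizon]   # the next `horizon` values to predict
        pairs.append((past, future))
    return pairs


# Toy series (made-up values) with 3 lags and a 2-step horizon.
series = [10, 12, 13, 15, 18, 21, 25]
windows = make_windows(series, n_lags=3, horizon=2)
for past, future in windows:
    print(past, "->", future)
```

In a plain regression problem with 90 unordered features, there is no such ordering to window over, which is why the rows can be treated independently as P(Y_i | X_i).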