I've been learning some of the core concepts of ML lately and writing code using the Sklearn library. After some basic practice, I tried my hand at the AirBnb NYC dataset from kaggle (which has around 40000 samples) - https://www.kaggle.com/dgomonov/new-york-city-airbnb-open-data#New_York_City_.png
I used the sklearn.linear_model.Ridge as my baseline and after doing some basic data cleaning, I got an abysmal R^2 score of 0.12 on my test set. Then I thought, maybe the linear model is too simplistic so I tried the 'kernel trick' method adapted for regression (sklearn.kernel_ridge.Kernel_Ridge) but they would take too much time to fit (>1hr)! To counter that, I used the sklearn.kernel_approximation.Nystroem function to approximate the kernel map, applied the transformation to the features prior to training and then used a simple linear regression model. However, even that took a lot of time to transform and fit if I increased the n_components parameter which I had to to get any meaningful increase in the accuracy.