Random Forest Regression using scikit-learn
Background
The price of a house can be affected by many factors, and people can hold contradictory opinions about which ones matter, depending on their knowledge, experience and understanding.
To see which opinions hold true, we can analyse a housing dataset. In this post I try to identify the most important factor influencing house prices, using a housing dataset from Taiwan. I also look at the relationships and trends between the available factors, as they give a sense of what can be expected.
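One simple way to look at relationships between factors is a correlation matrix. The sketch below is only an illustration: it uses synthetic data, and the column names (house_age, distance_to_station, price) and the coefficients used to generate them are assumptions made for the example, not values taken from the Taiwan dataset.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in data; the column names and coefficients are
# hypothetical and are NOT taken from the real dataset.
rng = np.random.default_rng(0)
n = 200
house_age = rng.uniform(0, 40, n)               # years
distance_to_station = rng.uniform(50, 5000, n)  # metres
price = (
    50
    - 0.2 * house_age
    - 0.005 * distance_to_station
    + rng.normal(0, 2, n)                       # random noise
)

df = pd.DataFrame(
    {
        "house_age": house_age,
        "distance_to_station": distance_to_station,
        "price": price,
    }
)

# Pairwise Pearson correlations between all columns.
corr = df.corr()

# Correlation of each factor with the price, most negative first.
price_corr = corr["price"].drop("price").sort_values()
print(price_corr)
```

Correlation only captures linear association between two columns at a time, so it is a starting point for exploring the data rather than a measure of importance.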
Business challenge:
To understand which factors are the most important in influencing the house price.
The historical market dataset of real estate valuations was collected from Sindian Dist., New Taipei City, Taiwan, and is available here
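As a rough sketch of how this question can be approached with scikit-learn, the example below fits a RandomForestRegressor and ranks the factors by the model's feature_importances_. It is not the code behind this post: the data is synthetic, and the feature names, coefficients and hyperparameters (n_estimators=200, a 20% test split) are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Synthetic stand-in data; feature names and coefficients are
# hypothetical and are NOT taken from the real dataset.
rng = np.random.default_rng(42)
n = 500
feature_names = ["house_age", "distance_to_station", "n_convenience_stores"]
X = np.column_stack(
    [
        rng.uniform(0, 40, n),     # house age in years
        rng.uniform(50, 5000, n),  # distance to nearest station in metres
        rng.integers(0, 11, n),    # number of nearby convenience stores
    ]
)
y = 50 - 0.2 * X[:, 0] - 0.005 * X[:, 1] + 0.5 * X[:, 2] + rng.normal(0, 2, n)

# Hold out part of the data to check how well the model generalises.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = RandomForestRegressor(n_estimators=200, random_state=42)
model.fit(X_train, y_train)

# R^2 on the held-out data: the share of price variance the model explains.
r2 = model.score(X_test, y_test)
print(f"R^2 on test data: {r2:.3f}")

# Impurity-based importances; they sum to 1 across all features.
importances = dict(zip(feature_names, model.feature_importances_))
ranked = sorted(importances.items(), key=lambda kv: kv[1], reverse=True)
for name, value in ranked:
    print(f"{name}: {value:.3f}")
```

Impurity-based importances can favour features with many distinct values, so it may be worth cross-checking the ranking with sklearn.inspection.permutation_importance on the held-out data.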
Results:
Conclusion
Hopefully this is a helpful introduction to a straightforward implementation of the Random Forest regression algorithm. The result helps us understand which factors matter most in explaining the variability of the dependent variable (the house price) in the dataset. The analysis also gives insight into the relationships between the factors, which helps build an understanding of the domain.
For the code, refer to this link
For the dataset, refer to this link