Posts

Showing posts from January, 2017

Python for Data Scientists - scikit-learn

Introduction

In the previous posts we've covered the basics of data analysis. Now the gloves are off and here come the big guns: a machine learning library called scikit-learn. scikit-learn has become one of the most popular open source machine learning libraries for Python. It provides algorithms for machine learning tasks including classification, regression, dimensionality reduction and clustering, among others. It also provides modules for extracting features, processing data and evaluating models.

Installation

scikit-learn depends on both NumPy and SciPy, both of which we've already talked about. Make sure to upgrade both to the latest version before installing the package, which is done, of course, using the Python package manager:

pip install scikit-learn

Conclusion

scikit-learn covers a very broad spectrum of data science fields, each deserving a dedicated discussion. And this is exactly what we're going to do for the next couple of sessions, diving deeper into each
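
After running the install step described in this post, it can be handy to confirm that scikit-learn and its two dependencies are actually present. The snippet below is a minimal sketch that uses only the Python standard library (Python 3.8 or newer is assumed). The helper name `installed_versions` is made up for this illustration and is not part of scikit-learn.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(distributions):
    """Map each distribution name to its installed version, or None if absent.

    `installed_versions` is a hypothetical helper written for this post.
    """
    found = {}
    for name in distributions:
        try:
            # version() looks up the installed distribution's metadata.
            found[name] = version(name)
        except PackageNotFoundError:
            # The distribution is not installed in this environment.
            found[name] = None
    return found


if __name__ == "__main__":
    # Distribution names as they are published on the package index.
    for name, ver in installed_versions(["numpy", "scipy", "scikit-learn"]).items():
        print(f"{name}: {ver or 'not installed'}")
```

If any of the three lines reports "not installed", rerunning the pip command for that package is the likely fix.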