Posts

Showing posts from June, 2020

NLP for Data Scientists - spaCy

Introduction

It's been a while since we introduced a new library, so today we'll talk a bit about spaCy, which is a Natural Language Processing library for Python. Now, you'll say: wait a minute, what about NLTK? Yes, in both Natural Language Processing with Python and Tweets analysis with Python and NLP we used NLTK, but from now on - no more. The reason couldn't be described better than in the article by spaCy's author about why he chose to write the library in the first place:

"What NLTK has is a decent tokenizer, some passable stemmers, a good implementation of the Punkt sentence boundary detector, some visualization tools, and some wrappers for other libraries. Nothing else is of any use."

Installation

Getting started with spaCy is easy: first install it, then download the model data.

pip install spacy
python -m spacy.en.download

The rest is pretty straightforward: import the library and start using it according to the documentation. Let's see ho