NLP with a fashion twist: Zalando’s natural language processing framework
Zalando is serving up some natural language processing models with a fashion twist. Flair is a powerful state-of-the-art NLP framework based on PyTorch and Python. NLP never looked so good.
Who says natural language processing and fashion don’t overlap?
Zalando Research brings the latest flair to the scene. (Yes, that Zalando. The Germany-based online fashion and beauty shop operates in fifteen European countries.)
Add some Flair
Flair is a simple framework for state-of-the-art natural language processing. It builds on top of PyTorch, a popular deep learning platform, which makes it easy to use.
A powerful NLP library. Flair allows you to apply our state-of-the-art natural language processing (NLP) models to your text, such as named entity recognition (NER), part-of-speech tagging (PoS), sense disambiguation and classification.
Multilingual. Thanks to the Flair community, we support a rapidly growing number of languages. We also now include ‘one model, many languages’ taggers, i.e. single models that predict PoS or NER tags for input text in various languages.
A text embedding library. Flair has simple interfaces that allow you to use and combine different word and document embeddings, including our proposed Flair embeddings, BERT embeddings and ELMo embeddings.
A PyTorch NLP framework. Our framework builds directly on PyTorch, making it easy to train your own models and experiment with new approaches using Flair embeddings and classes.
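To make the feature list above more concrete, here is a minimal sketch of tagging named entities and stacking two embedding types with Flair. The function names `tag_entities` and `embed_tokens` are hypothetical helpers written for this article, not part of Flair. The model names `'ner'`, `'glove'` and `'news-forward'` are assumed to be available as pre-trained downloads in your installed version.

```python
def tag_entities(text):
    """Run a pre-trained NER tagger over `text` and return the entity spans as strings.

    Hypothetical helper for illustration. Flair is imported inside the
    function because loading it (and the model) is slow and needs a download.
    """
    from flair.data import Sentence
    from flair.models import SequenceTagger

    tagger = SequenceTagger.load("ner")   # assumed pre-trained English NER model
    sentence = Sentence(text)             # tokenizes the raw text
    tagger.predict(sentence)              # attaches the predicted tags in place
    return [str(span) for span in sentence.get_spans("ner")]


def embed_tokens(text):
    """Combine classic word embeddings with Flair embeddings for each token.

    Hypothetical helper for illustration. Returns (token text, embedding shape)
    pairs so the effect of stacking is visible without printing whole vectors.
    """
    from flair.data import Sentence
    from flair.embeddings import FlairEmbeddings, StackedEmbeddings, WordEmbeddings

    stacked = StackedEmbeddings([
        WordEmbeddings("glove"),          # assumed pre-trained GloVe vectors
        FlairEmbeddings("news-forward"),  # assumed pre-trained contextual string embeddings
    ])
    sentence = Sentence(text)
    stacked.embed(sentence)               # each token now carries one concatenated vector
    return [(token.text, tuple(token.embedding.shape)) for token in sentence]
```

Calling `tag_entities("George Washington went to Washington .")` should download the tagger on first use and return the recognized spans. The exact string format of each span depends on the Flair version you have installed.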
Currently, it is at version 0.4, which adds new experimental multilingual models, new languages, and lots more. With this update, Flair now includes pre-trained FastText embeddings for 30 languages, models for named entity recognition and part-of-speech tagging, and two pre-trained classification models.
It claims some impressive benchmarks. Try reproducing them and see how you fare.
NLP is so in vogue
The research project behind Flair was led by Alan Akbik. His Zalando research blog post about it links to a paper titled “Contextual String Embeddings for Sequence Labeling” by Alan Akbik, Duncan Blythe, and Roland Vollgraff.
It includes an example of how named entity recognition and part-of-speech tagging work. There’s a fashion-forward twist: the named entity recognition focuses on identifying seasons, colors, clothing parts, publishers, and designers such as fashion house Alexander McQueen. (Will we one day have the world’s most fashionable robot?)
While the pairing of open source and a fashion shop may initially come as a surprise, Zalando Research has plenty of work to show for itself.
Check out their list of repos on GitHub. Repos include an adversarial framework for non-parametric image stylization mosaics. A look at their site shows off their projects, which include other deep learning dives created to help with online shopping.
We often see big name commercial companies contributing to open source. Is it still so unusual to see computer science research come from a retailer? Of course, all of these discoveries help their website predict user activities, which in turn leads to more sales and more growth. Machine learning is no longer a realm of theoretical data. That data is being put into use in all sorts of sectors.
Requirements for using Flair include:
- PyTorch 0.4+
- Python 3.6+
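If you are unsure whether your setup meets these minimums, a rough check could look like the sketch below. The helper names `parse_version`, `meets_minimum` and `check_environment` are made up for this example. The only external detail it relies on is that PyTorch exposes its version string as `torch.__version__`.

```python
import sys

MIN_PYTHON = (3, 6)  # from the requirements list above
MIN_TORCH = (0, 4)   # from the requirements list above


def parse_version(version):
    """Turn a string like '1.13.1+cu117' or '0.4.0a0' into a tuple of ints."""
    core = version.split("+")[0]          # drop local build suffixes such as '+cu117'
    parts = []
    for piece in core.split("."):
        digits = ""
        for ch in piece:                  # keep only the leading digits of each piece
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(version, minimum):
    """Return True if the version string is at least `minimum` (a tuple of ints)."""
    return parse_version(version) >= minimum


def check_environment():
    """Report whether Python and PyTorch satisfy the stated minimums."""
    result = {"python": sys.version_info[:2] >= MIN_PYTHON, "torch": False}
    try:
        import torch                      # may not be installed at all
        result["torch"] = meets_minimum(torch.__version__, MIN_TORCH)
    except ImportError:
        pass
    return result
```

Running `check_environment()` returns a small dictionary with one flag per requirement. A missing PyTorch installation is reported as `False` rather than raising an error.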
Explore the tutorial to guide your way through its usage.