Baidu Research

Baidu Team Wins the Suggestion Mining Challenge at SemEval 2019

2019-03-11

The 13th International Workshop on Semantic Evaluation (SemEval 2019) concluded in February 2019. In Suggestion Mining from Online Reviews and Forums (Task 9A), Baidu triumphed over 210 teams from around the world and became the champion with an F score of 78.12%, demonstrating the company’s pioneering achievements in this field.

Baidu ranks No.1 in SemEval-2019 Task 9A.

As the top international competition in natural language processing (NLP), SemEval is organized by a subsidiary of the Association for Computational Linguistics (ACL). It has been successfully held 12 times since 2001 and has attracted many universities and research institutions from around the world. One of the tasks of SemEval this year was Suggestion Mining from Online Reviews and Forums, which aims to automatically identify constructive suggestions in online forums and reviews.

Nowadays, consumer opinions about commercial entities such as brands, services, and products are usually expressed through online reviews, blogs, discussion forums, or social media platforms. Such opinions not only express consumers’ emotions but also offer suggestions for improving the entity or advice to other consumers. For example, "I like the food of this restaurant, but it’d be better if the place is cozier." Traditional research on sentiment analysis placed more emphasis on whether users express negative emotions and tended to ignore reviews like the one above, even though they can be extremely valuable for improving services. The industry is trying to advance research on sentiment analysis; however, Suggestion Mining remains a relatively young area. Baidu is now leading the pioneering progress in this field.

Suggestion Mining requires a comprehensive analysis of the semantics, voice, emotion, structure, context, and other information in a sentence to make an accurate judgment. For instance, sentences such as "consider adding more flights for holiday seasons" and "I would stay in this hotel again if they provide kettles in the room" have completely different sentence structures, sentence patterns, and entities under review, yet both offer valuable suggestions. Understanding the core and extended meanings of "suggestions" is a subjective task, which leaves room for inconsistency in defining the problem and labeling the data and thus makes extracting suggestions more difficult.

In the competition, Baidu’s team adopted multiple technical approaches to resolve these difficulties. First, the team constructed a cross-domain, multi-structure deep semantic classification model based on large-scale unsupervised data. An ensemble learning model was used to integrate shallow learning and deep learning to address the problem of an imbalanced dataset. To process the diverse styles and non-standard expressions in online content, Baidu’s team utilized fine-grained features and attention transfer mechanisms. In the end, the team won the championship with an F score of 78.12%.
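
The general idea of combining a shallow and a deep classifier and scoring the result with an F score can be illustrated with a minimal sketch. This is not Baidu’s system: the cue phrases, the function names, the weights, the threshold, and the made-up probabilities standing in for a deep model are all assumptions chosen for illustration.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical cue phrases for a toy "shallow" scorer (not from the article).
CUES = ("consider", "should", "it'd be better", "if they", "please")


def shallow_score(sentence: str) -> float:
    """Toy keyword scorer: 1.0 if any cue phrase occurs, else 0.0."""
    text = sentence.lower()
    return 1.0 if any(cue in text for cue in CUES) else 0.0


def ensemble_predict(
    sentences: Sequence[str],
    models: Sequence[Callable[[str], float]],
    weights: Sequence[float],
    threshold: float = 0.5,
) -> List[int]:
    """Soft voting: weighted average of model scores, then a threshold.

    Lowering the threshold is one common way to favor a rare positive
    class; it is an assumption here, not the team's documented method.
    """
    total = sum(weights)
    preds = []
    for s in sentences:
        score = sum(w * m(s) for m, w in zip(models, weights)) / total
        preds.append(1 if score >= threshold else 0)
    return preds


def f1_positive(gold: Sequence[int], pred: Sequence[int]) -> float:
    """F1 score of the positive ("suggestion") class."""
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up probabilities standing in for a trained deep model's output.
FAKE_DEEP: Dict[str, float] = {
    "Consider adding more flights for holiday seasons": 0.9,
    "I would stay in this hotel again if they provide kettles in the room": 0.4,
    "I like the food of this restaurant": 0.2,
    "The room was clean": 0.1,
}

sentences = list(FAKE_DEEP)
gold = [1, 1, 0, 0]  # illustrative labels, not task data
preds = ensemble_predict(sentences, [shallow_score, FAKE_DEEP.get], [0.5, 0.5])
print(preds, f1_positive(gold, preds))
```

In this toy setup the second sentence is recovered only because the keyword scorer compensates for the low stand-in probability, which is the kind of complementarity an ensemble is meant to exploit.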

While companies seek consumer reviews to improve their services, governments and media organizations hope to draw on opinions from the vast amount of information online as a reference. Using suggestion mining to assist decision-making is therefore a practical and valuable application of NLP. It can empower both the public and private sectors by extracting suggestions that are spontaneously expressed on various online platforms, enabling organizations to collect suggestions from much larger and more varied sources.

Giving machines cognitive abilities is one of the biggest challenges in artificial intelligence, and NLP is crucial to meeting it. With more than ten years of experience in NLP, Baidu has developed advanced technologies and is committed to applying them to real-world problems. Baidu's semantic technology has been widely adopted both within and outside Baidu, in products such as search and news feed.