All-in-One Hate Speech Detectors May not be what You Want

Mondher Bouazizi, Natsuho Niida, Tomoaki Ohtsuki

Research output: Conference contribution

Abstract

The detection of hate speech has been an increasingly active research topic. State-of-the-art systems for automatically detecting hateful content report almost perfect performance on common data sets. However, "hate speech" is a very subjective term, and people with different backgrounds have different levels of tolerance for what constitutes hate. In this paper, we show the limitations of having a single classifier handle the problem of hate speech detection. We then propose building classifiers customized for different people instead of a single classifier. The main obstacle to achieving this goal is the scarcity of data. Therefore, we use transfer learning to overcome this issue, using a very limited amount of annotated data to build these customized classifiers. In the first stage, we build a classifier on a large data set that classifies tweets into 3 classes (hate, offensive, and clean), which we refer to as the general classifier. In the second stage, we ask 3 annotators with different backgrounds to re-annotate a small subset of tweets (600 tweets) from the original data set. We refer to this newly created data set as "the customized data set." We then fine-tune the general classifier on the customized data set to build a customized classifier for each annotator. On the corresponding customized data sets, the classification accuracy of the customized classifiers is 0.08, 0.06, and 0.11 higher than that of the general classifier. The results show that it is possible to start with a general classifier and adjust it to each individual despite the very limited amount of training data available for that individual.
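
The two-stage idea in the abstract (train a general classifier on a large data set, then fine-tune it on a small per-annotator set) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the authors' setup: the abstract does not name a model, so a plain softmax regression stands in for it, and the tweets are replaced by three synthetic binary features (`slur`, `insult`, `swear`). The labelling rules `general_rule` and `annotator_rule` are hypothetical, as are the data sizes and hyperparameters.

```python
import math
import random

CLASSES = ["hate", "offensive", "clean"]  # label index = position in this list

def general_rule(slur, insult, swear):
    # Hypothetical stand-in for the labels of the large data set.
    if slur:
        return 0
    if insult or swear:
        return 1
    return 2

def annotator_rule(slur, insult, swear):
    # Hypothetical annotator who tolerates swearing on its own.
    if slur:
        return 0
    if insult:
        return 1
    return 2

def make_data(rule, n, rng):
    data = []
    for _ in range(n):
        s, i, w = (rng.randint(0, 1) for _ in range(3))
        data.append(([s, i, w, 1], rule(s, i, w)))  # last feature is a bias term
    return data

def predict_proba(W, x):
    logits = [sum(wj * xj for wj, xj in zip(row, x)) for row in W]
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [v / total for v in exps]

def predict(W, x):
    p = predict_proba(W, x)
    return p.index(max(p))

def train(W, data, epochs, lr):
    """SGD on cross-entropy, starting from W; returns a new weight matrix."""
    W = [row[:] for row in W]
    for _ in range(epochs):
        for x, y in data:
            p = predict_proba(W, x)
            for k in range(len(W)):
                grad = p[k] - (1.0 if k == y else 0.0)
                for j in range(len(x)):
                    W[k][j] -= lr * grad * x[j]
    return W

def accuracy(W, data):
    return sum(predict(W, x) == y for x, y in data) / len(data)

rng = random.Random(0)
general_train = make_data(general_rule, 2000, rng)  # "large" general data set
custom_train = make_data(annotator_rule, 60, rng)   # small re-annotated set
custom_test = make_data(annotator_rule, 500, rng)   # held-out annotator labels

# Stage 1: general classifier trained from scratch.
W0 = [[0.0] * 4 for _ in CLASSES]
general = train(W0, general_train, epochs=20, lr=0.1)

# Stage 2: fine-tune, i.e. continue training from the general weights.
customized = train(general, custom_train, epochs=100, lr=0.1)

acc_general = accuracy(general, custom_test)
acc_custom = accuracy(customized, custom_test)
print(f"general: {acc_general:.3f}  customized: {acc_custom:.3f}")
```

The point of the sketch is only the warm start: `train` is called a second time with the general weights as its starting point, so the small annotator set only has to shift the decisions on which the annotator disagrees with the general labels.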

Original language: English
Title of host publication: ICSIM 2021 - Proceedings of the 2021 4th International Conference on Software Engineering and Information Management
Publisher: Association for Computing Machinery
Pages: 165-170
Number of pages: 6
ISBN (Electronic): 9781450388955
DOI
Publication status: Published - Jan 16, 2021
Event: 4th International Conference on Software Engineering and Information Management, ICSIM 2021 - Virtual, Online, Japan
Duration: Jan 16, 2021 to Jan 18, 2021

Publication series

Name: ACM International Conference Proceeding Series

Conference

Conference: 4th International Conference on Software Engineering and Information Management, ICSIM 2021
Country/Territory: Japan
City: Virtual, Online
Period: 21/1/16 to 21/1/18

ASJC Scopus subject areas

  • Software
  • Human-Computer Interaction
  • Computer Vision and Pattern Recognition
  • Computer Networks and Communications
