A Pattern-Based Approach for Multi-Class Sentiment Analysis in Twitter

Mondher Bouazizi, Tomoaki Ohtsuki

Research output: Contribution to journal › Article

24 Citations (Scopus)

Abstract

Sentiment analysis and opinion mining in social networks are currently a hot topic of research. However, most state-of-the-art work on the automatic sentiment analysis and opinion mining of texts collected from social networks and microblogging websites is oriented towards binary classification (i.e., classification into “positive” and “negative”) or ternary classification (i.e., classification into “positive”, “negative” and “neutral”). In this paper, we propose a novel approach that, in addition to the aforementioned tasks of binary and ternary classification, goes deeper in the classification of texts collected from Twitter and classifies these texts into multiple sentiment classes. While we limit our scope in this work to 7 different sentiment classes, the proposed approach is scalable and can be run to classify texts into more classes. We first introduce SENTA, a tool we built to help users select, out of a wide variety of features, the ones that best fit their application, and to run the classification through an easy-to-use graphical user interface. We then use SENTA to run our own multi-class classification experiments. Our experiments show that the proposed approach can reach up to 60.2% accuracy on multi-class classification. Nevertheless, the approach proves to be very accurate in binary and ternary classification: in the former case, we reach an accuracy of 81.3% on the same dataset after removing neutral tweets, and in the latter case, we reach an accuracy of 70.1%.

Original language: English
Journal: IEEE Access
DOIs: 10.1109/ACCESS.2017.2740982
Publication status: Accepted/In press - 2017 Aug 18

Fingerprint

Graphical user interfaces
Websites
Experiments

Keywords

  • Data mining
  • Feature extraction
  • Machine Learning
  • Sentiment analysis
  • Tagging
  • Tools
  • Twitter

ASJC Scopus subject areas

  • Computer Science(all)
  • Materials Science(all)
  • Engineering(all)

Cite this

A Pattern-Based Approach for Multi-Class Sentiment Analysis in Twitter. / Bouazizi, Mondher; Ohtsuki, Tomoaki.

In: IEEE Access, 18.08.2017.

@article{c5a30c2a699b44728044f441cac55a6a,
title = "A Pattern-Based Approach for Multi-Class Sentiment Analysis in Twitter",
abstract = "Sentiment analysis and opinion mining in social networks are currently a hot topic of research. However, most state-of-the-art work on the automatic sentiment analysis and opinion mining of texts collected from social networks and microblogging websites is oriented towards binary classification (i.e., classification into “positive” and “negative”) or ternary classification (i.e., classification into “positive”, “negative” and “neutral”). In this paper, we propose a novel approach that, in addition to the aforementioned tasks of binary and ternary classification, goes deeper in the classification of texts collected from Twitter and classifies these texts into multiple sentiment classes. While we limit our scope in this work to 7 different sentiment classes, the proposed approach is scalable and can be run to classify texts into more classes. We first introduce SENTA, a tool we built to help users select, out of a wide variety of features, the ones that best fit their application, and to run the classification through an easy-to-use graphical user interface. We then use SENTA to run our own multi-class classification experiments. Our experiments show that the proposed approach can reach up to 60.2{\%} accuracy on multi-class classification. Nevertheless, the approach proves to be very accurate in binary and ternary classification: in the former case, we reach an accuracy of 81.3{\%} on the same dataset after removing neutral tweets, and in the latter case, we reach an accuracy of 70.1{\%}.",
keywords = "Data mining, Feature extraction, Machine Learning, Sentiment analysis, Tagging, Tools, Twitter",
author = "Mondher Bouazizi and Tomoaki Ohtsuki",
year = "2017",
month = "8",
day = "18",
doi = "10.1109/ACCESS.2017.2740982",
language = "English",
journal = "IEEE Access",
issn = "2169-3536",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
}

TY - JOUR

T1 - A Pattern-Based Approach for Multi-Class Sentiment Analysis in Twitter

AU - Bouazizi, Mondher

AU - Ohtsuki, Tomoaki

PY - 2017/8/18

Y1 - 2017/8/18

AB - Sentiment analysis and opinion mining in social networks are currently a hot topic of research. However, most state-of-the-art work on the automatic sentiment analysis and opinion mining of texts collected from social networks and microblogging websites is oriented towards binary classification (i.e., classification into “positive” and “negative”) or ternary classification (i.e., classification into “positive”, “negative” and “neutral”). In this paper, we propose a novel approach that, in addition to the aforementioned tasks of binary and ternary classification, goes deeper in the classification of texts collected from Twitter and classifies these texts into multiple sentiment classes. While we limit our scope in this work to 7 different sentiment classes, the proposed approach is scalable and can be run to classify texts into more classes. We first introduce SENTA, a tool we built to help users select, out of a wide variety of features, the ones that best fit their application, and to run the classification through an easy-to-use graphical user interface. We then use SENTA to run our own multi-class classification experiments. Our experiments show that the proposed approach can reach up to 60.2% accuracy on multi-class classification. Nevertheless, the approach proves to be very accurate in binary and ternary classification: in the former case, we reach an accuracy of 81.3% on the same dataset after removing neutral tweets, and in the latter case, we reach an accuracy of 70.1%.

KW - Data mining

KW - Feature extraction

KW - Machine Learning

KW - Sentiment analysis

KW - Tagging

KW - Tools

KW - Twitter

UR - http://www.scopus.com/inward/record.url?scp=85028513340&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85028513340&partnerID=8YFLogxK

U2 - 10.1109/ACCESS.2017.2740982

DO - 10.1109/ACCESS.2017.2740982

M3 - Article

AN - SCOPUS:85028513340

JO - IEEE Access

JF - IEEE Access

SN - 2169-3536

ER -