Title : |
Arabic Subjective Sentiment Analysis Using Machine Learning |
Document type : |
printed text |
Authors : |
Besma Mokrane Ghadir, Author ; Aouatif Bouchareb, Author ; Sadik Bessou, Thesis supervisor |
Publication year : |
2022 |
Extent : |
1 vol. (61 f.) |
Format : |
29 cm |
Languages : |
French (fre) |
Categories : |
Theses & Dissertations: Computer Science
|
Keywords : |
Computer Science |
Decimal index : |
004 Computer Science |
Abstract : |
Sentiment analysis has attracted considerable attention over the past decade because of the
benefits it can bring to fields such as politics, the social sciences, marketing, and
economics. Social networks are now full of texts in which Internet users express
themselves on a variety of topics, and their opinions matter for decision-making in
many of these fields.
Unfortunately, most of the resources and systems developed in this area are designed
for English and other European languages. Research and development on sentiment analysis
for Arabic has begun only recently, and progress is slow compared with work on English
and other languages.
In this work, we contribute to Arabic sentiment analysis using machine learning by
running several experiments on the effect of word-level and character-level n-grams of
different sizes (unigram, bigram, trigram, and 4-gram) and of different vectorizers
(CountVectorizer and TfidfVectorizer). All experiments use five machine learning
algorithms (SVM, NB, LR, RF, DT).
We ran the experiments on two data sets, Twitter comments and restaurant reviews,
labeled with three classes (positive, negative, and neutral) and containing 23,414
comments.
Across all experiments, Logistic Regression gave the best results, reaching an accuracy
of 90% with word n-grams and 91% with character n-grams. The best n-gram size was the
bigram for word n-grams and the trigram for character n-grams, and the best vectorizer
was TfidfVectorizer. |
Call number : |
MAI/0694 |
Online : |
https://drive.google.com/file/d/1yxnHJoZ6uOlD5yqlcNIKtVEfZd1oUeLI/view?usp=share [...] |
Electronic resource format : |
pdf |
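
The abstract compares word-level and character-level n-grams of several sizes. As a minimal sketch of what those features are, the Python below extracts both kinds from a single sentence and counts them. It assumes whitespace tokenization and extracts n-grams of exactly one size; the helper names `word_ngrams` and `char_ngrams` and the sample sentence are illustrative and are not taken from the thesis. In scikit-learn, CountVectorizer and TfidfVectorizer offer comparable extraction through their `analyzer` and `ngram_range` parameters.

```python
from collections import Counter


def word_ngrams(text, n):
    """Return the contiguous n-word sequences in `text`."""
    tokens = text.split()  # whitespace tokenization: a simplifying assumption
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def char_ngrams(text, n):
    """Return the contiguous n-character sequences in `text` (spaces included)."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


# Hypothetical example review, not taken from the thesis data sets.
review = "الطعام لذيذ جدا"

word_bigrams = word_ngrams(review, 2)    # word-level, n = 2
char_trigrams = char_ngrams(review, 3)   # character-level, n = 3

# Count-based features: how often each n-gram occurs in the text.
word_bigram_counts = Counter(word_bigrams)
char_trigram_counts = Counter(char_trigrams)

print(word_bigrams)
print(char_trigrams[:5])
```

Character n-grams yield many more features per text than word n-grams and can capture shared word fragments, which may be one reason the two settings are evaluated separately.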