University Sétif 1 FERHAT ABBAS Faculty of Sciences
Author details
Author: Hicheur, Nésrine
Available documents written by this author



Title: Sentiment analysis for Saudi dialect using machine learning
Document type: printed text
Authors: Hicheur, Nésrine, Author; Kara-mohamed, Chafia, Thesis supervisor
Publisher: Setif: UFA
Year of publication: 2019
Extent: 1 vol. (72 leaves)
Format: 29 cm
Languages: French (fre)
Categories: Theses & Dissertations: Computer Science
Keywords: Sentiment analysis; Machine learning; Arabic language
Decimal index: 004 - Computer Science
Abstract: The rapid expansion of social media produces a large volume of information, opinions, and thoughts shared by people in their native languages or dialects. Sentiment analysis was developed to classify the sentiments expressed in such large amounts of data. Many machine learning and deep learning algorithms address this task, and a number of difficulties influence the evaluation of the resulting classifier. The underlying idea of this work is to reach a target accuracy using different approaches and techniques on a collection of tweets written in the Saudi Arabic dialect.
Content note: Contents
Table of Contents
Dedication
Abstract
Acknowledgement
List of Figures
List of Tables
List of Abbreviations
CHAPTER ONE
INTRODUCTION
1.1 Introduction
1.2 Goals and Objectives
1.3 Study Scope
1.4 Study Plan and Schedule
CHAPTER TWO
LITERATURE REVIEW
2.1 Introduction
2.2 Sentiment Analysis
2.2.1 Sentiment Analysis Components
2.2.2 Sentiment Analysis Scope
2.2.3 Sentiment Analysis Types
2.2.4 Ensemble Methods of Sentiment Analysis
2.2.5 Sentiment Analysis Techniques
2.2.6 Performance Measures
2.2.6.1 Confusion Matrix
2.2.6.2 Accuracy
2.2.6.3 Precision
2.2.6.4 Recall
2.2.6.5 F1 Score
2.2.7 Sentiment Analysis Process
2.2.8 Sentiment Analysis Applications
2.2.9 Sentiment Analysis Challenges
2.3 Machine Learning for SA
2.3.1 Definition
2.3.2 Common ML Algorithms for SA
2.3.2.1 Logistic Regression
2.3.2.2 Decision Tree
2.3.2.3 SVM
2.3.2.4 Naïve Bayes
2.3.2.5 Maximum Entropy
2.3.2.6 Neural Networks
2.3.3 Deep Learning Algorithms for SA
2.3.3.1 Definition
2.3.3.2 Distributed Representations
2.3.3.2.1 Word Embeddings
2.3.3.2.2 Sentence Representation: Doc2Vec
2.3.3.2.3 Character-Level Model: Character Embeddings
2.3.3.3 Deep Learning Models
2.3.3.3.1 Word2Vec Averaging & Deep Dense Networks
2.3.3.3.2 Recursive Networks
2.3.3.3.3 Recurrent Networks
2.3.3.3.4 Convolutional Networks
2.3.3.4 Deep Learning for Sentence-Level SA
2.3.3.5 Deep Learning for Arabic Language
2.4 Arabic Language
2.4.1 Definition
2.4.2 Arabic Varieties
2.4.3 SA in Arabic: Challenges
2.4.4 Differences between MSA & Regional Dialects
2.4.5 Computational Processing of Standard Arabic
2.5 Related Works
2.5.1 Issues Related to Current Work
2.5.2 Previous Studies
2.6 Proposed Work
2.6.1 Proposed Theory & Framework
2.6.2 Proposed Model/System
2.7 Summary
CHAPTER THREE
METHODOLOGY AND IMPLEMENTATION
3.1 Introduction
3.2 Methodology
3.2.1 Type of Study
3.2.2 System Used
3.2.3 Data Description
3.3 Implementation
3.4 Summary
CHAPTER FOUR
RESULTS AND DISCUSSION
4.1 Introduction
4.2 Data Analysis Methods
4.3 First Level Comparison
4.3.1 With Tf-idf Vectorizer
4.3.2 With CountVectorizer
4.4 Second Level Comparison
4.5 Neural Networks Compared with the Previous Algorithms
4.5.1 Neural Network
4.5.2 LSTM
4.5.3 Comparison
4.6 Summary
CHAPTER FIVE
CONCLUSIONS AND FUTURE WORK
5.1 Conclusion
5.2 Future Work
Appendix
Visualization of Some Test Results
References
List of Figures
Figure 1: The bagging technique [6]
Call number: MAI/0310
Online: https://drive.google.com/file/d/1dppP-vKTfakkJWol8LTh7zJ9kBfYB5ou/view?usp=shari [...]
Copies (1)
Barcode: MAI/0310
Call number: MAI/0310
Medium: Thesis (Mémoire)
Location: Sciences Library (Bibliothéque des sciences)
Section: French
Availability: Available