University Sétif 1 FERHAT ABBAS Faculty of Sciences
Author details
Author: Barouchi, Anouar
Available documents written by this author



Title: Reducing Toxic Language Generation
Document type: printed text
Authors: Barouchi, Anouar, Author; Harrag, Fouzi, Thesis supervisor
Publisher: Setif: UFA
Year of publication: 2021
Extent: 1 vol. (42 f.)
Format: 29 cm
Languages: French (fre)
Categories: Theses & Dissertations: Computer Science
Keywords: Computer Science
Decimal index: 004 - Computer Science
Abstract:
Recent years have witnessed a rapid spread of abusive language and hate speech on the Internet, and the problem continues to worsen. At times, toxic comments online have provoked violence on the ground.
In this paper, we propose a computational model able to detect and reduce toxic language that spreads through social networks, and to track its spread.
In our project we use a dataset of tweets containing offensive language from the shared task of the Fourth Workshop on Open-Source Arabic Corpora and Processing Tools at the Language Resources and Evaluation Conference 2020. The dataset utilized is OLID 2020 (Offensive Language Identification Dataset). It was manually annotated and is ready to be used for the Arabic language. It contains 8,000 tweets in Arabic dialect: 7,000 for training and 1,000 for testing.
The dataset contains offensive and hate speech tweets. The training set contains 7,000 tweets: 5,589 were labeled not offensive and 1,410 offensive, while 361 were labeled hate speech and 6,638 not hate speech. The test set contains 1,000 tweets: 821 were labeled not offensive and 179 offensive, while 44 were labeled hate speech and the rest not hate speech.
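The label counts above imply a strong class imbalance, which matters when reading any accuracy figure. The following is a minimal sketch, not taken from the thesis, that computes the share of positive labels and the accuracy a trivial majority-class predictor would reach on each split. The names `COUNTS`, `summarize`, and `SUMMARY` are illustrative assumptions; only the numbers come from the abstract.

```python
# Label counts as quoted in the abstract (names are hypothetical).
COUNTS = {
    ("train", "offensive"): {"positive": 1410, "negative": 5589},
    ("train", "hate_speech"): {"positive": 361, "negative": 6638},
    ("test", "offensive"): {"positive": 179, "negative": 821},
    # The abstract gives "the rest" of the 1000 test tweets, i.e. 1000 - 44.
    ("test", "hate_speech"): {"positive": 44, "negative": 1000 - 44},
}


def summarize(counts):
    """Return total size, positive-label rate, and majority-class accuracy."""
    total = counts["positive"] + counts["negative"]
    return {
        "total": total,
        "positive_rate": counts["positive"] / total,
        # Accuracy of always predicting the more frequent label.
        "majority_baseline": max(counts.values()) / total,
    }


SUMMARY = {key: summarize(c) for key, c in COUNTS.items()}

if __name__ == "__main__":
    for (split, task), s in SUMMARY.items():
        print(
            f"{split:5s} {task:11s} n={s['total']:5d} "
            f"positive={s['positive_rate']:.3f} "
            f"majority_baseline={s['majority_baseline']:.3f}"
        )
```

Under these counts, always predicting "not offensive" would already be correct on 821 of the 1,000 test tweets, so accuracy alone says little about how well the minority class is detected. The quoted training counts also sum to 6,999 for each label rather than 7,000.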
The results indicate that deep learning using a CNN may extract important markers of toxic language. The performance of the studied CNN architecture is evaluated by computing classification accuracy, which reaches 90.29%.
Call number: MAI/0547
Online: https://drive.google.com/file/d/1qp737B7f_vw23QsQAKohgSENN0ejWMqV/view?usp=shari [...]
Electronic resource format:
Reducing Toxic Language Generation [printed text] / Barouchi, Anouar, Author; Harrag, Fouzi, Thesis supervisor. - [S.l.]: Setif: UFA, 2021. - 1 vol. (42 f.); 29 cm.
Copies (1)
Barcode: MAI/0547
Call number: MAI/0547
Medium: book
Location: Sciences Library
Section: English
Availability: Available