University Sétif 1 FERHAT ABBAS Faculty of Sciences
Author detail
Author: Racha Difallah
Available documents written by this author



Context-independent word embedding methods in Machine Learning for Arabic Document Classification / Mohamed Akram Belbedar
Title: Context-independent word embedding methods in Machine Learning for Arabic Document Classification
Document type: printed text
Authors: Mohamed Akram Belbedar, Author; Racha Difallah, Author; Mohamed Chafia Kara, Thesis supervisor
Publisher: Sétif: UFA1
Year of publication: 2023
Extent: 1 vol. (36 f.)
Format: 29 cm
Languages: French (fre)
Categories: Theses & Dissertations: Computer Science
Keywords: Computer Science
Dewey class: 004 Computer Science
Abstract: With the rapid development of Artificial Intelligence, research on text information processing has begun to attract researchers' attention. A huge amount of textual data is available online; therefore, automatic text classification is more than necessary. In Natural Language Processing in general, and in text classification in particular, the main issue is the curse of dimensionality: documents are represented by huge, sparse vectors. Reducing this dimensionality without affecting the amount of information in the document is an active research area.
Word embedding is the representation of text using vectors such that words with similar semantics have similar vector representations. Well-known models include FastText, GloVe, and the two word2vec approaches, CBOW and Skip-gram. In this project, we studied word embedding methods and then used word embeddings to reduce the dimensionality of documents: similar words are clustered, and each cluster's centre is used to represent the whole cluster. To highlight the effect of this method, we compared a text classification system with no dimension reduction against one with dimension reduction.
Call number: MAI/0716
Online: https://docs.google.com/document/d/1y0s_Ez7tShNR0a4-wTlQ8B5eFv6XH0l8/edit?usp=dr [...]
Electronic resource format: docx
Copies (1)
| Barcode | Call number | Type | Location | Section | Availability |
|---|---|---|---|---|---|
| MAI/0716 | MAI/0716 | Thesis | Sciences Library | English | Available |
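
The abstract above describes clustering context-independent word embeddings and representing each document over the cluster centres rather than the full vocabulary. Below is a minimal sketch of that idea, not the authors' code: it assumes gensim's word2vec and scikit-learn's k-means, and the toy corpus, tokenisation, and cluster count are illustrative assumptions only.

```python
# Sketch (assumed setup, not the thesis implementation): train word2vec,
# cluster the word vectors, and represent documents over the cluster centres.
import numpy as np
from gensim.models import Word2Vec          # context-independent embeddings (CBOW / Skip-gram)
from sklearn.cluster import KMeans

# Toy tokenised "documents" standing in for an Arabic corpus (illustrative only).
docs = [
    ["هذا", "نص", "عربي", "قصير"],
    ["تصنيف", "النصوص", "العربية", "مهم"],
]

# 1) Train Skip-gram embeddings (sg=1; sg=0 would give CBOW).
w2v = Word2Vec(docs, vector_size=50, sg=1, min_count=1, epochs=50)

# 2) Cluster the word vectors; each cluster centre stands in for all words in its cluster.
vocab = w2v.wv.index_to_key
kmeans = KMeans(n_clusters=4, n_init=10, random_state=0).fit(w2v.wv[vocab])
word2cluster = dict(zip(vocab, kmeans.labels_))

# 3) Represent each document as a histogram over the k clusters
#    (dimension k instead of the vocabulary size).
def doc_features(tokens, k=4):
    feats = np.zeros(k)
    for tok in tokens:
        if tok in word2cluster:
            feats[word2cluster[tok]] += 1
    return feats

X = np.vstack([doc_features(d) for d in docs])
print(X.shape)   # (n_documents, n_clusters): the reduced representation
```

Each document thus becomes a k-dimensional cluster histogram instead of a vocabulary-sized bag-of-words vector, which is the kind of reduced representation the abstract compares against an unreduced baseline classifier.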