Online Catalogue

University Sétif 1 FERHAT ABBAS Faculty of Sciences

Author details

Author: Nour El houda Chenni

Available documents written by this author

Development of Conversational Systems Based on RAG and LLM for Prophetic-Herbal Medicine / Nour El houda Chenni

Title: Development of Conversational Systems Based on RAG and LLM for Prophetic-Herbal Medicine
Document type: electronic document
Authors: Nour El houda Chenni; Abdelouahab Moussaoui, thesis supervisor
Publisher: Setif: UFA
Year of publication: 2025
Extent: 1 vol. (101 leaves)
Format: 29 cm
Languages: English (eng)
Categories: Theses & Dissertations: Computer Science

Keywords: Prophetic Medicine
Herbal Medicine
Conversational Systems
Retrieval-Augmented Generation
Large Language Models
NLP
Arabic NLP
Decimal index: 004 Computer Science
Abstract:
Prophetic and herbal medicine represent a rich body of traditional knowledge
that is often underrepresented in modern digital systems. With recent advances
in large language models (LLMs), it has become possible to build intelligent
systems capable of understanding and responding to user queries in this
specialized domain. Retrieval-Augmented Generation (RAG) frameworks have further
enhanced these capabilities by grounding language models in external knowledge
sources.
This work presents the development of conversational systems focused on
Prophetic and herbal medicine, built independently for Arabic and for English.
The English systems include a fine-tuned Mistral-7B model trained on a
domain-specific dataset, along with two RAG pipelines, one using Mistral and
another using DeepSeek. The Arabic systems likewise include two RAG
implementations: one using the ALLaM model and another based on DeepSeek, both
adapted to handle Arabic-language queries with high contextual accuracy. These
systems aim to facilitate knowledge access, learning, and interaction with
culturally significant medical content through modern NLP architectures.
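As an illustration of the grounding idea described in the abstract, the following is a minimal sketch of the retrieve-then-prompt step of a RAG pipeline. It is not the thesis's implementation: the three sample sentences, the function names (`tokenize`, `cosine`, `retrieve`, `build_prompt`) and the prompt wording are all hypothetical, and a simple bag-of-words cosine similarity stands in for the dense embedding models and vector store a real system would use. The call to a language model is omitted.

```python
import math
import re
from collections import Counter

# Hypothetical mini knowledge base (placeholder text, not the thesis's corpus).
DOCUMENTS = [
    "Black seed is mentioned in prophetic medicine as a general remedy.",
    "Honey is traditionally used to soothe coughs and sore throats.",
    "Ginger tea is a common herbal remedy for nausea.",
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, k=1):
    """Return the k documents most similar to the query.

    Assumption: bag-of-words similarity replaces a learned embedding model.
    """
    query_vec = Counter(tokenize(query))
    ranked = sorted(
        documents,
        key=lambda doc: cosine(query_vec, Counter(tokenize(doc))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, documents, k=1):
    """Place the retrieved passages ahead of the question in one prompt."""
    context = "\n".join(retrieve(query, documents, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"
```

Calling `build_prompt("What helps with nausea?", DOCUMENTS)` yields a prompt whose context is the ginger sentence; that string would then be passed to whichever generator model the system uses.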
Content note: Table of contents
Abstract
Acknowledgements
Introduction
Chapter 1 Foundations of Artificial Intelligence in Text Understanding
1.1 Introduction
1.2 Machine Learning
1.2.1 Supervised Learning
1.2.1.1 Types of Problems
1.2.1.2 Common Algorithms
1.2.2 Unsupervised Learning
1.2.2.1 Types of Problems
1.2.2.2 Common Algorithms
1.2.3 Reinforcement Learning (RL)
1.3 Deep Learning
1.4 Natural Language Processing (NLP)
1.4.1 Core Components of NLP
1.4.2 Approaches to NLP
1.4.2.1 Rule Based Systems
1.4.2.2 Statistical Methods
1.4.2.3 Machine Learning and Deep Learning
1.5 Conclusion
Chapter 2 Large Language Models (LLMs)
2.1 Introduction
2.2 Language Models (LM)
2.3 Evolution of Language Models
2.3.1 Traditional Language Models
2.3.2 Deep Learning Based Models
2.3.2.1 Recurrent Neural Networks (RNNs)
2.3.2.2 Long Short-Term Memory (LSTMs)
2.3.2.3 Gated Recurrent Unit (GRU)
2.3.2.4 Attention Mechanism
2.3.2.5 Transformers
2.4 Word Embeddings and Representation Learning
2.4.1 Static Word Embeddings
2.4.1.1 Word2Vec
2.4.1.2 GloVe (Global Vectors for Word Representation)
2.4.2 Contextualized Word Embeddings
2.4.2.1 ELMo (Embeddings from Language Models)
2.4.2.2 BERT (Bidirectional Encoder Representations from Transformers)
2.5 Large Language Models (LLM)
2.5.1 Key LLM Architectures
2.5.1.1 Encoder-only Architectures
2.5.1.2 Decoder-only Architectures
2.5.1.3 Encoder–Decoder Architectures
2.6 Multimodal Large Language Models (MLLMs)
2.6.1 Examples of MLLMs
2.6.2 Advantages of MLLMs
2.6.3 The Difference Between Generative AI and MLLMs
2.7 Challenges and Limitations
2.8 Customization Techniques
2.8.1 Prompt Engineering
2.8.2 Fine Tuning
2.8.3 Retrieval Augmented Generation (RAG)
2.9 Conclusion
Chapter 3 Retrieval Augmented Generation (RAG)
3.1 Introduction
3.2 Definition
3.3 RAG Architecture
3.3.1 Retriever
3.3.1.1 Sparse Retriever
3.3.1.2 Dense Retriever
3.3.2 Generator
3.3.2.1 Autoregressive Models
3.3.2.2 Encoder Decoder Models
3.3.2.3 Fusion in Decoder (FiD)
3.4 RAG Pipeline
3.4.1 Document Chunking
3.4.2 Dense Embedding of Chunks
3.4.3 Vector Storage and Retrieval
3.4.4 Query Encoding and Retrieval
3.4.5 Prompt Construction and LLM Generation
3.5 RAG Key Features and Benefits
3.6 The Difference Between RAG and Semantic Search
3.7 Conclusion
Chapter 4 Methodology and Experiments
4.1 Libraries and Implementation Framework
4.1.1 LangChain
4.1.2 Chroma DB
4.1.3 Unsloth
4.1.4 Gradio
4.1.5 Hugging Face
4.2 Models
4.2.1 nomic-embed-text
4.2.2 GATE-AraBert-v1
4.2.3 ARA-Reranker-V1
4.2.4 Mistral:Instruct
4.2.5 ALLaM-7B:Instruct
4.2.6 DeepSeek-R1
4.3 Development Tools
4.3.1 Jupyter Notebook
4.3.2 Google Colab
4.3.3 Ollama
4.4 Methodology and Experimental Results
4.4.1 Evaluation Setup
4.4.1.1 Evaluation Objective
4.4.1.2 Evaluation Matrices and Rating Scales
4.4.1.3 Human Evaluation Method
4.4.1.4 LLM as a Judge Method
4.4.2 English Versions
4.4.2.1 Fine Tuned Mistral-7B Model
4.4.2.2 English RAG Models
4.4.2.3 Comparative Results
4.4.3 Arabic Versions
4.4.3.1 Arabic RAG Models
4.4.3.2 Comparative Results
4.4.4 Automatic Evaluation Metrics
4.5 Conclusion
Call number: MAI/1032

Development of Conversational Systems Based on RAG and LLM for Prophetic-Herbal Medicine [electronic document] / Nour El houda Chenni ; Abdelouahab Moussaoui, thesis supervisor. - [S.l.] : Setif: UFA, 2025. - 1 vol. (101 leaves) ; 29 cm.

Copies (1)

Barcode    Call number   Medium         Location           Section   Availability
MAI/1032   MAI/1032      Dissertation   Sciences Library   English   Available