|
| Title: |
Deep learning, and bioinspired optimization algorithms for genetic marker selection and disease classification. |
| Document type: |
electronic document |
| Authors: |
Khaoula Chouder; Maroua Kadri, Author; Abderrahim Lakehal, Thesis supervisor |
| Publisher: |
Setif:UFA |
| Publication year: |
2025 |
| Extent: |
1 vol. (73 leaves) |
| Format: |
29 cm |
| Languages: |
English (eng) |
| Categories: |
Theses & Dissertations: Computer Science
|
| Keywords: |
Deep Learning
Cancer Classification
Feature Selection
Bio-Inspired Algorithms
Omics Data
Biomarker Discovery |
| Decimal classification: |
004 Computer science |
| Abstract: |
AI-based cancer diagnosis and classification have emerged as a critical research field over the
past decade, especially with advancements in next-generation sequencing technologies. However,
omics datasets are often characterized by high dimensionality, complexity, and scalability
challenges. Deep learning has been increasingly adopted to address these issues due to its
strong predictive performance. Nonetheless, deep learning models remain largely black boxes that
lack interpretability, a crucial factor in biological contexts where identifying biomarkers
and selected features is essential for personalizing treatment protocols and guiding
drug prescription. Feature selection methods are therefore in high demand as a way to enhance
interpretability. In this work, we propose a hybrid approach that combines deep learning with
bioinspired feature selection techniques. This report provides an overview of recent advances in
deep learning for oncology, particularly for analyzing omics data such as genomic and transcriptomic
profiles. Applications in cancer diagnosis, prognosis, and therapeutic decision-making
are explored, with a focus on the integration of multi-omics data for building clinical decision
support systems. In our experiments, classification accuracy reached up to
80% for single-omics models, while deep neural networks and convolutional models
performed better after bioinspired optimization. Enrichment analysis also confirmed the biological
relevance of the selected features, supporting their use as clinically meaningful biomarkers.
These findings demonstrate the effectiveness of our method in improving both prediction performance
and interpretability in cancer classification tasks. |
| Contents note: |
Contents
List of Tables
List of Figures
1 Background
1.1 Introduction
1.2 Introduction to Bioinformatics
1.2.1 Definition
1.2.2 Types
1.2.3 Applications of bioinformatics
1.2.4 Omics data
1.2.5 Databases and repositories
1.2.6 Biomarkers and disease classification
1.3 Feature Selection and Bioinspired Algorithms
1.3.1 Feature Selection
1.3.2 Bioinspired Algorithms
1.3.3 Feature selection using bioinspired algorithms
1.4 Machine learning and deep learning
1.4.1 Machine learning
1.4.2 Deep learning
1.5 Conclusion
2 Systematic Selective Review
2.1 Introduction
2.2 Paper selection and filtration
2.3 Paper classification
2.3.1 Cancer classification using ML models
2.3.2 Biomarker discovery using feature selection models in cancer
2.3.3 Biomarker discovery using bioinspired models in cancer
2.4 General discussion
2.5 Conclusion
3 Approach
3.1 Introduction
3.2 Problem Statement
3.3 Motivation
3.4 An overview of the general proposal
3.5 Data collection and preprocessing
3.5.1 Data Preparation
3.5.2 Data Preprocessing Steps
3.6 Overview of the Feature Selection Phase
3.7 What the Output Labels Mean
3.8 Bio-Inspired Feature Selection Initialization
3.9 Pipeline Evaluation and Feature Assessment
3.9.1 Functional Enrichment Resources
3.10 Enrichment Analysis Workflow
3.10.1 General Preparation
3.10.2 cBioPortal: Cancer Genomic Data Exploration
3.10.3 ShinyGO: Functional Enrichment with Graphical Output
3.11 Conclusion
4 Experimental results & discussion
4.1 Introduction
4.2 Data collection
4.3 Data preparation
4.4 Data preprocessing
4.5 First-stage feature selection using bio-inspired methods
4.6 Single-level omics analysis
4.7 Multi-level omics analysis
4.7.1 Deep neural network
4.7.2 Convolutional neural network
4.8 Enrichment analysis
4.9 Conclusion
Appendix
.1 Programming Language
.2 Programming Environment
.3 Libraries
|
| Call number: |
MAI/1008 |