University Sétif 1 FERHAT ABBAS Faculty of Sciences
Author details
Author: Anis Louail
Available documents written by this author
Title: Differential Privacy in Bayesian Networks for Synthetic Data
Document type: electronic document
Authors: Anis Louail; Aliouat, Zibouda, thesis supervisor
Publisher: Setif: UFA
Year of publication: 2025
Extent: 1 vol. (56 leaves)
Format: 29 cm
Languages: English (eng)
Categories: Theses & Dissertations: Computer Science
Keywords: Computer Science; Differential Privacy
Decimal index: 004 Computer Science
Abstract:
As data-driven decision-making becomes increasingly prevalent, protecting
the confidentiality of sensitive information held in database systems has
become a pressing issue. Classical privacy-preserving methods, including
Differential Privacy (DP), provide strong theoretical guarantees by injecting
calibrated noise into the answer returned for each query. Nevertheless,
such protection usually comes at the cost of utility, particularly when
the noise level is preset and independent of the user's background knowledge.
At the same time, inference attacks are dangerous because they allow
adversaries to infer hidden information by tracking a series of apparently
innocuous queries. Existing inference-detection mechanisms attempt to model
and track user knowledge but lack effective controls for handling
information leakage probabilistically.
This dissertation studies a hybrid solution that combines Differential Privacy
with user-specific knowledge modeling to adaptively regulate the level of
privacy. Using probabilistic graphical models, notably Bayesian networks, we
aim to predict what a user could infer from the responses to queries. The
DP noise level is then set according to this estimated knowledge so as to
balance utility against privacy.
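As an illustration only, the idea of tying the DP noise level to estimated user knowledge can be sketched with the standard Laplace mechanism. This sketch is not taken from the thesis: the function names, the linear rule that maps an inferred-knowledge probability to epsilon, and the bounds `eps_min` and `eps_max` are all hypothetical assumptions. In the approach described above, such a probability would come from a Bayesian network rather than being passed in by hand.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from Laplace(0, scale).

    The difference of two independent exponential variables with mean
    `scale` follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def epsilon_for_knowledge(p_inferred, eps_min=0.1, eps_max=1.0):
    """Hypothetical rule: the more the user is estimated to know, the smaller epsilon.

    `p_inferred` stands in for a probability (in [0, 1]) that the user can
    already infer the sensitive value. The linear interpolation between
    eps_max and eps_min is an assumption made for this sketch.
    """
    if not 0.0 <= p_inferred <= 1.0:
        raise ValueError("p_inferred must lie in [0, 1]")
    return eps_max - (eps_max - eps_min) * p_inferred


def noisy_count(true_count, p_inferred, rng, sensitivity=1.0):
    """Answer a counting query with Laplace noise of scale sensitivity / epsilon."""
    eps = epsilon_for_knowledge(p_inferred)
    return true_count + laplace_noise(sensitivity / eps, rng)


if __name__ == "__main__":
    rng = random.Random(42)
    # Same query, two users with different estimated knowledge.
    print("low-knowledge user :", noisy_count(100, 0.1, rng))
    print("high-knowledge user:", noisy_count(100, 0.9, rng))
```

Under this assumed rule, a higher estimated knowledge yields a smaller epsilon and therefore a wider noise distribution. A real deployment would also have to account for the cumulative privacy budget spent across queries, which this sketch ignores.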
Our approach is validated by a prototype implementation on a PostgreSQL
database equipped with inference monitoring and DP protection.
The results show improved utility together with stronger privacy protection
when the protection is varied according to user knowledge.
Contents note: Table of contents
1 Introduction and Problem Statement 6
1.1 General Introduction 6
1.2 Real-World Motivation 7
1.3 Problem Statement 7
1.4 Research Objectives 8
1.5 Methodological Approach 8
1.6 Structure of the Thesis 9
2 State of the Art 10
2.1 Introduction 10
2.2 Data Privacy in Databases 11
2.2.1 Overview of Privacy Concerns 11
2.2.2 Traditional Privacy Techniques 11
2.2.3 Limitations of Conventional Approaches 12
2.3 Inference Attacks 12
2.3.1 Definition and Motivation 12
2.3.2 Types of Inference Attacks 13
2.3.3 Inference Attacks in Non-Interactive Settings 13
2.4 Bayesian Networks in Privacy and Security 14
2.4.1 Bayesian Networks Overview 14
2.4.2 Why Use Bayesian Networks in Privacy Research 15
2.4.3 Use in Modeling Inference 16
2.4.4 Applications 17
2.5 Differential Privacy Mechanisms 18
2.5.1 Motivation and Formal Guarantee 18
2.5.2 Global vs. Local Differential Privacy 19
2.5.3 Mechanisms for Ensuring Differential Privacy 19
2.5.4 Comparison of DP Mechanisms 20
2.6 Evaluation Metrics for Differentially Private Synthetic Data 20
2.6.1 On the Notion of Fidelity 21
2.7 Selected Methods for Differentially Private Tabular Data Synthesis 21
2.7.1 PrivBayes 21
2.7.2 MWEM (Multiplicative Weights Exponential Mechanism) 23
2.7.3 DualQuery 23
2.7.4 Private-PGM 23
2.7.5 Junction Tree Method 23
2.7.6 Justification of Selected Comparisons 24
2.7.7 Overview of PrivBayes 24
2.7.8 MWEM (Multiplicative Weights Exponential Mechanism) 25
2.7.9 DualQuery 26
2.7.10 Private-PGM 26
2.7.11 Junction Tree Methods 26
2.8 In-depth Comparison of PrivBayes and Private-PGM 27
2.8.1 Theoretical Foundations 27
2.8.2 Computational Complexity and Efficiency 27
2.8.3 Handling of Different Data Types 27
2.8.4 Quality and Accuracy of Generated Data 28
2.8.5 Practical Applications and Recommendations 28
2.9 In-depth Comparison of PrivBayes and Junction Tree Methods 28
2.9.1 Fundamental Differences 28
2.9.2 Computational Efficiency and Complexity 29
2.9.3 Inference Mechanisms 29
2.9.4 Handling of Data Types and Structure 29
2.9.5 Privacy Budget Utilization 29
2.9.6 Synthetic Data Accuracy and Generalizability 30
2.9.7 Practical Recommendations 30
2.10 Related Work on DP Database Publishing 30
2.11 Limitations and Research Gaps 35
2.11.1 Summary of Limitations 35
2.11.2 Motivation for Our Work 36
2.12 Conclusion 36
3 Methodology 38
3.1 Overview of the Proposed Approach 38
3.2 Bayesian Network Structure Learning 38
3.3 Integration of Differential Privacy 39
3.4 Synthetic Data Generation 40
3.5 Evaluation Strategy 40
3.5.1 Structure Comparison 41
3.5.2 Data Utility 41
3.6 Structural Repair of DP Networks 41
3.6.1 Motivation 41
3.6.2 Edge Injection Method 42
3.6.3 Synthetic Data After Structural Repair 42
4 Results and Discussion 43
4.1 Experimental Setup 43
4.1.1 Dataset Description 43
4.1.2 Tools and Libraries 44
4.1.3 Parameter Settings 44
4.2 Utility and Structural Similarity Evaluation 45
4.2.1 KL Divergence Across Variables 45
4.2.2 Bayesian Network Structure Evaluation 46
4.2.3 Classification Utility 48
4.3 Structural Repair and Its Impact 48
4.3.1 Edge Injection Experiments 48
4.3.2 Utility After Repair 50
4.4 Comparison Between Methods 51
4.5 Discussion and Limitations 51
General Conclusion 52
Call number: MAI/1039
Copies (1)
Barcode: MAI/1039
Call number: MAI/1039
Medium: Master's thesis
Location: Sciences Library
Section: English
Availability: Available

