1 Overview of Knowledge Discovery in Traditional Chinese Medicine 1
1.1 Introduction 1
1.2 The State of the Art of TCM Data Resources 3
1.2.1 Traditional Chinese Medical Literature Analysis and Retrieval System 4
1.2.2 Figures and Photographs of Traditional Chinese Drug Database 4
1.2.3 Database of Chinese Medical Formulae 5
1.2.4 Database of Chemical Composition from Chinese Herbal Medicine 5
1.2.5 Clinical Medicine Database 5
1.2.6 TCM Electronic Medical Record Database 6
1.3 Review of KDTCM Research 6
1.3.1 Knowledge Discovery for CMF Research 6
1.3.2 Knowledge Discovery for CHM Research 11
1.3.3 Knowledge Discovery for Research of TCM Syndrome 14
1.3.4 Knowledge Discovery for TCM Clinical Diagnosis 16
1.4 Discussions and Future Directions 19
1.5 Conclusions 22
2 Integrative Mining of Traditional Chinese Medicine Literature and MEDLINE for Functional Gene Networks 27
2.1 Introduction 27
2.2 Connecting TCM Syndrome to Modern Biomedicine by Integrative Literature Mining 29
2.3 Related Work on Biomedical Literature Mining 30
2.4 Name Entity and Relation Extraction Methods 33
2.4.1 Bubble-Bootstrapping Method 33
2.4.2 Relation Weight Computing 35
2.5 MeDisco/3S System 36
2.6 Results 38
2.6.1 Functional Gene Networks 43
2.6.2 Functional Analysis of Genes from Syndrome Perspective 45
2.7 Conclusions 47
3 MapReduce-Based Network Motif Detection for Traditional Chinese Medicine 53
3.1 Introduction 53
3.2 Related Work 54
3.3 MapReduce-Based Pattern Finding 55
3.3.1 MRPF Framework 55
3.3.2 Neighbor Vertices Finding and Pattern Initialization 57
3.3.3 Pattern Extension 58
3.3.4 Frequency Computing 59
3.4 Application to Prescription Compatibility Structure Detection 61
3.4.1 Motifs Detection Results 61
3.4.2 Performance Analysis 62
3.5 Conclusions 64
4 Data Quality for Knowledge Discovery in Traditional Chinese Medicine 67
4.1 Introduction 67
4.2 Key Data Quality Dimensions in TCM 69
4.2.1 Representation Granularity 69
4.2.2 Representation Consistency 69
4.2.3 Completeness 70
4.3 Methods to Handle Data Quality Problems 70
4.3.1 Handling Representation Granularity 70
4.3.2 Handling Representation Consistency 71
4.3.3 Handling Completeness 72
4.4 Conclusions 73
5 Service-Oriented Data Mining in Traditional Chinese Medicine 75
5.1 Introduction 75
5.2 Related Work 76
5.2.1 Traditional Data Mining Software 76
5.2.2 Data Mining Systems for Specific Field 77
5.2.3 Distributed Data Mining Platform 77
5.2.4 The Spora Demo 78
5.3 System Architecture and Data Mining Service 78
5.3.1 Hierarchical Structure 78
5.3.2 Service Operator Organization 80
5.3.3 User Interaction and Visualization 81
5.4 Case Studies 82
5.4.1 Case 1: Domain-Driven KDD Support for TCM 82
5.4.2 Case 2: Data Mining Based on Distributed Resources 84
5.4.3 Case 3: Data Mining Process as a Service 84
5.5 Conclusions 85
6 Semantic E-Science for Traditional Chinese Medicine 87
6.1 Introduction 87
6.2 Results 89
6.2.1 System Architecture 89
6.2.2 TCM Domain Ontology 91
6.2.3 DartMapping 93
6.2.4 DartSearch 94
6.2.5 DartQuery 95
6.2.6 TCM Service Coordination 98
6.2.7 Knowledge Discovery Service 98
6.2.8 DartFlow 99
6.2.9 TCM Collaborative Research Scenario 100
6.2.10 Task-Driven Information Allocation 100
6.2.11 Collaborative Information Sharing 101
6.2.12 Scientific Service Coordination 102
6.3 Discussion 102
6.4 Conclusions 103
6.5 Methods 103
6.5.1 TCM Ontology Engineering 103
6.5.2 View-Based Semantic Mapping 104
6.5.3 Semantic-Based Service Matchmaking 105
7 Ontology Development for Unified Traditional Chinese Medical Language System 109
7.1 Introduction 109
7.2 The Principle and Knowledge System of TCM 110
7.3 What Is an Ontology? 111
7.4 Protege 2000: The Tool We Use 111
7.5 Ontology Design and Development for UTCMLS 112
7.5.1 Methodology of Ontology Development 113
7.5.2 Knowledge Acquisition 115
7.5.3 Integrating and Merging of TCM Ontology 117
7.6 Results 117
7.6.1 The Core Top-Level Categories 120
7.6.2 Subontologies and the Hierarchical Structure 120
7.6.3 Concept Structure 120
7.6.4 Semantic Structure 121
7.6.5 Semantic Types and Semantic Relationships 121
7.7 Conclusions 124
8 Causal Knowledge Modeling for Traditional Chinese Medicine Using OWL 2 129
8.1 Introduction 129
8.2 Causal TCM Knowledge Modeling 130
8.3 Causal Reasoning 130
8.4 Evaluation 131
8.5 Conclusions 132
9 Dynamic Subontology Evolution for Traditional Chinese Medicine Web Ontology 135
9.1 Introduction 135
9.2 TCM Domain Ontology 136
9.2.1 Ontology Framework 136
9.2.2 User Interface 139
9.3 Subontology Model 140
9.3.1 Preliminaries 142
9.3.2 Subontology Definition 143
9.3.3 Subontology Operators 144
9.4 Ontology Cache for Knowledge Reuse 146
9.4.1 Reusing Subontologies as Ontology Cache 146
9.4.2 Knowledge Search with Ontology Cache 147
9.4.3 On SubO Structural Optimality 151
9.5 Dynamic Subontology Evolution 152
9.5.1 Chromosome Representation 152
9.5.2 Fitness Evaluation 154
9.5.3 Genetic Operators 154
9.5.4 Evolution Procedure 157
9.5.5 Consistency 158
9.6 Experiment and Evaluation 158
9.6.1 Experiment Design 158
9.6.2 Compare Cache Performance 160
9.6.3 Knowledge Structure 163
9.6.4 Traversal Depth for SubO Extraction 164
9.7 Related Work 165
9.8 Conclusions 166
10 Semantic Association Mining for Traditional Chinese Medicine 171
10.1 Introduction 171
10.1.1 The Semantic Web for Collaborative Knowledge Discovery 171
10.1.2 The Motivating Story 172
10.1.3 HerbNet: The Knowledge Network for Herbal Medicine 173
10.1.4 Paper Organization 174
10.2 Related Work 174
10.2.1 Domain-Driven Relationship Mining for Biomedicine 174
10.2.2 Linked Data on the Semantic Web 175
10.2.3 Semantic Association Mining 176
10.3 Methods 177
10.3.1 Semantic Graph Model 177
10.3.2 Hypothesis and Hypothetical Graph 178
10.3.3 Evidence and Evidentiary Graph 179
10.3.4 Semantic Schema 181
10.3.5 Semantic Association Mining 182
10.3.6 Semantic Association Ranking 184
10.3.7 Summary 185
10.4 Evaluation 185
10.4.1 Synthetic Graph Generation 186
10.4.2 Engine Implementation 186
10.4.3 Miner Implementation 187
10.4.4 Collaborative Discovery Process 189
10.4.5 Result Analysis 190
10.5 Use Cases 191
10.5.1 The HerbNet 192
10.5.2 Formula System Interpretation 193
10.5.3 Herb-Drug Interaction Network Analysis 194
10.6 Conclusions 195
11 Semantic-Based Database Integration for Traditional Chinese Medicine 199
11.1 Introduction 199
11.2 System Architecture and Technical Features 201
11.2.1 System Architecture 201
11.2.2 Technical Features 201
11.3 Semantic Mediation 202
11.3.1 Semantic View and View-Based Mapping 202
11.3.2 Visualized Semantic Mapping Tool 204
11.4 TCM Semantic Portals 205
11.4.1 Dynamic Semantic Query Interface 205
11.4.2 Intuitive Search Interface with Concepts Ranking and Semantic Navigation 206
11.5 User Evaluation and Lesson Learned 208
11.5.1 Feedback from CATCM 208
11.5.2 A Survey on the Usage of RDF/OWL Predicates 209
11.6 Related Work 209
11.6.1 Semantic Web Context 209
11.6.2 Conventional Data Integration Context 211
11.7 Conclusions 211
12 Probabilistic Semantic Relationship Discovery from Traditional Chinese Medical Literature 213
12.1 Background 213
12.2 Related Work 214
12.3 Methods 215
12.3.1 Instance Extraction 215
12.3.2 Instance Pair Discovery 215
12.3.3 Semantic Relationship Evaluation 217
12.3.4 Probability-Based Semantic Relationship Extraction 218
12.4 Results and Discussions 220
12.5 Conclusions 221
13 Deriving Similarity Graphs from Traditional Chinese Medicine Linked Data on the Semantic Web 223
13.1 Introduction 223
13.2 Related Work 224
13.2.1 Taxonomy-Based Approach 224
13.2.2 Relationship-Based Approach 224
13.3 SST Approach 225
13.3.1 Similarity Transition 225
13.3.2 Similarity between Sets of Objects 226
13.4 Experiments and Results 227
13.4.1 Dataset Preparation 228
13.4.2 Results Analysis 229
13.4.3 Result Visualization 231
13.5 Conclusions 232