AsiaSim 2014: 14th International Conference on Systems Simulation, Kitakyushu, Japan, October 26-30, 2014. Proceedings
Clustering techniques are increasingly being put to use in the analysis of high-throughput biological datasets. Novel computational techniques to analyse high-throughput data in the form of sequences, gene and protein expressions, pathways, and images are becoming vital for understanding diseases and for future drug discovery.
This book details the complete pathway of cluster analysis, from the basics of molecular biology to the generation of biological knowledge. The book also presents the latest clustering methods and clustering validation, thereby offering the reader a comprehensive review of clustering analysis in bioinformatics from the fundamentals through to state-of-the-art techniques and applications.
Key Features:
Offers a contemporary review of clustering methods and applications in the field of bioinformatics, with particular emphasis on gene expression analysis
Provides an excellent introduction to molecular biology with computer scientists and information engineering researchers in mind, laying out the basic biological knowledge behind the application of clustering analysis techniques in bioinformatics
Explains the structure and properties of many types of high-throughput datasets commonly found in biological studies
Discusses how clustering methods and their possible successors would be used to enhance the pace of biological discoveries in the future
Includes a companion website hosting a selected collection of codes and links to publicly available datasets
The first five chapters of this volume investigate advances in the use of instance-level, pairwise constraints for partitional and hierarchical clustering. The book then explores other types of constraints for clustering, including cluster size balancing, minimum cluster size, and cluster-level relational constraints.
It also describes variations of the traditional problem of clustering under constraints, as well as approximation algorithms with helpful performance guarantees.
The book ends by applying clustering with constraints to relational data, privacy-preserving data publishing, and video surveillance data. It discusses an interactive visual clustering approach, a distance metric learning approach, existential constraints, and automatically generated constraints.
With contributions from industrial researchers and leading academic experts who pioneered the field, this volume delivers thorough coverage of the capabilities and limitations of constrained clustering methods and introduces new types of constraints and clustering algorithms.
Clustering is one of the most fundamental and essential data analysis techniques. Clustering can be used as an independent data mining task to discern intrinsic characteristics of data, or as a preprocessing step with the clustering results then used for classification, correlation analysis, or anomaly detection.
Kogan and his co-editors have put together recent advances in clustering large and high-dimensional data. Their volume addresses new topics and methods which are central to modern data analysis, with particular emphasis on linear algebra tools, optimization methods, and statistical techniques. The contributions, written by leading researchers from both academia and industry, cover theoretical basics as well as the application and evaluation of algorithms, and thus provide an excellent state-of-the-art overview.
The level of detail, the breadth of coverage, and the comprehensive bibliography make this book a perfect fit for researchers and graduate students in data mining and in many other important related application areas.
Covering theory, algorithms, and methodologies, as well as data mining technologies, Data Mining for Bioinformatics provides a comprehensive discussion of data-intensive computations used in data mining with applications in bioinformatics. It supplies a broad, yet in-depth, overview of the application domains of data mining for bioinformatics to help readers from both biology and computer science backgrounds gain an enhanced understanding of this cross-disciplinary field.
The book offers authoritative coverage of data mining techniques, technologies, and frameworks used for storing, analyzing, and extracting knowledge from large databases in the bioinformatics domains, including genomics and proteomics. It begins by describing the evolution of bioinformatics and highlighting the challenges that can be addressed using data mining techniques. Introducing the various data mining techniques that can be employed in biological databases, the text is organized into four sections:
The book describes the various biological databases prominently referred to in bioinformatics and includes a detailed list of the applications of advanced clustering algorithms used in bioinformatics. Highlighting the challenges encountered during the application of classification on biological databases, it considers systems of both single and ensemble classifiers and shares effort-saving tips for model selection and performance estimation strategies.