The book focuses on three primary aspects of data clustering:
· Methods, describing key techniques commonly used for clustering, such as feature selection, agglomerative clustering, partitional clustering, density-based clustering, probabilistic clustering, grid-based clustering, spectral clustering, and nonnegative matrix factorization
· Domains, covering methods used for different domains of data, such as categorical data, text data, multimedia data, graph data, biological data, stream data, uncertain data, time series clustering, high-dimensional clustering, and big data
· Variations and Insights, discussing important variations of the clustering process, such as semisupervised clustering, interactive clustering, multiview clustering, cluster ensembles, and cluster validation
In this book, top researchers from around the world explore the characteristics of clustering problems in a variety of application areas. They also explain how to glean detailed insight from the clustering process—including how to verify the quality of the underlying clusters—through supervision, human intervention, or the automated generation of alternative clusters.
· Thoroughly developed to include many more worked examples to give greater understanding of the various methods and techniques
· Many more diagrams included--now in two color--to provide greater insight through visual presentation
· Matlab code for the most common methods is given at the end of each chapter.
· More Matlab code is available, together with an accompanying manual, via this site
· Latest hot topics included to further the reference value of the text, including non-linear dimensionality reduction techniques, relevance feedback, semi-supervised learning, spectral clustering, and combining clustering algorithms.
· An accompanying book with Matlab code of the most common methods and algorithms in the book, together with a descriptive summary and solved examples, including real-life data sets in imaging and audio recognition. The companion book is available separately or at a special packaged price (Book ISBN: 9780123744869. Package ISBN: 9780123744913).
· Solutions manual, PowerPoint slides, and additional resources are available to faculty using the text for their course. Register at www.textbooks.elsevier.com and search on "Theodoridis" to access resources for instructors.
The book offers authoritative coverage of data mining techniques, technologies, and frameworks used for storing, analyzing, and extracting knowledge from large databases in the bioinformatics domains, including genomics and proteomics. It begins by describing the evolution of bioinformatics and highlighting the challenges that can be addressed using data mining techniques. Introducing the various data mining techniques that can be employed in biological databases, the text is organized into four sections:
· Supplies a complete overview of the evolution of the field and its intersection with computational learning
· Describes the role of data mining in analyzing large biological databases, explaining the breadth of the various feature selection and feature extraction techniques that data mining has to offer
· Focuses on concepts of unsupervised learning using clustering techniques and their application to large biological data
· Covers supervised learning using the classification techniques most commonly used in bioinformatics, addressing the need for validation and benchmarking of inferences derived using either clustering or classification
The book describes the various biological databases prominently referred to in bioinformatics and includes a detailed list of the applications of advanced clustering algorithms used in bioinformatics. Highlighting the challenges encountered during the application of classification on biological databases, it considers systems of both single and ensemble classifiers and shares effort-saving tips for model selection and performance estimation strategies.
The 30 chapters of this book cover the current status of SOM theory, such as connections of SOM to clustering, classification, probabilistic models, and energy functions. Many applications of the SOM are given, with data mining and exploratory data analysis the central topic, applied to large databases of financial data, medical data, free-form text documents, digital images, speech, and process measurements. Biological models related to the SOM are also discussed.