Mining Heterogeneous Information Networks: Principles and Methodologies

Morgan & Claypool Publishers

Real-world physical and abstract data objects are interconnected, forming gigantic, interconnected networks. By structuring these data objects and interactions between these objects into multiple types, such networks become semi-structured heterogeneous information networks. Most real-world applications that handle big data, including interconnected social media and social networks, scientific, engineering, or medical information systems, online e-commerce systems, and most database systems, can be structured into heterogeneous information networks. Therefore, effective analysis of large-scale heterogeneous information networks poses an interesting but critical challenge. In this book, we investigate the principles and methodologies of mining heterogeneous information networks. Departing from many existing network models that view interconnected data as homogeneous graphs or networks, our semi-structured heterogeneous information network model leverages the rich semantics of typed nodes and links in a network and uncovers surprisingly rich knowledge from the network. This semi-structured heterogeneous network modeling leads to a series of new principles and powerful methodologies for mining interconnected data, including: (1) rank-based clustering and classification; (2) meta-path-based similarity search and mining; (3) relation strength-aware mining, and many other potential developments. This book introduces this new research frontier and points out some promising research directions. Table of Contents: Introduction / Ranking-Based Clustering / Classification of Heterogeneous Information Networks / Meta-Path-Based Similarity Search / Meta-Path-Based Relationship Prediction / Relation Strength-Aware Clustering with Incomplete Attributes / User-Guided Clustering via Meta-Path Selection / Research Frontiers
Read more
Loading...

Additional Information

Publisher
Morgan & Claypool Publishers
Read more
Published on
Dec 31, 2012
Read more
Pages
147
Read more
ISBN
9781608458806
Read more
Read more
Best For
Read more
Language
English
Read more
Genres
Computers / Databases / Data Mining
Read more
Content Protection
This content is DRM protected.
Read more

Reading information

Smartphones and Tablets

Install the Google Play Books app for Android and iPad/iPhone. It syncs automatically with your account and allows you to read online or offline wherever you are.

Laptops and Computers

You can read books purchased on Google Play using your computer's web browser.

eReaders and other devices

To read on e-ink devices like the Sony eReader or Barnes & Noble Nook, you'll need to download a file and transfer it to your device. Please follow the detailed Help center instructions to transfer the files to supported eReaders.
Data Mining: Concepts and Techniques provides the concepts and techniques in processing gathered data or information, which will be used in various applications. Specifically, it explains data mining and the tools used in discovering knowledge from the collected data. This book is referred as the knowledge discovery from data (KDD). It focuses on the feasibility, usefulness, effectiveness, and scalability of techniques of large data sets. After describing data mining, this edition explains the methods of knowing, preprocessing, processing, and warehousing data. It then presents information about data warehouses, online analytical processing (OLAP), and data cube technology. Then, the methods involved in mining frequent patterns, associations, and correlations for large data sets are described. The book details the methods for data classification and introduces the concepts and methods for data clustering. The remaining chapters discuss the outlier detection and the trends, applications, and research frontiers in data mining.

This book is intended for Computer Science students, application developers, business professionals, and researchers who seek information on data mining.

Presents dozens of algorithms and implementation examples, all in pseudo-code and suitable for use in real-world, large-scale data mining projectsAddresses advanced topics such as mining object-relational databases, spatial databases, multimedia databases, time-series databases, text databases, the World Wide Web, and applications in several fieldsProvides a comprehensive, practical look at the concepts and techniques you need to get the most out of your data
Outlier (or anomaly) detection is a very broad field which has been studied in the context of a large number of research areas like statistics, data mining, sensor networks, environmental science, distributed systems, spatio-temporal mining, etc. Initial research in outlier detection focused on time series-based outliers (in statistics). Since then, outlier detection has been studied on a large variety of data types including high-dimensional data, uncertain data, stream data, network data, time series data, spatial data, and spatio-temporal data. While there have been many tutorials and surveys for general outlier detection, we focus on outlier detection for temporal data in this book. A large number of applications generate temporal datasets. For example, in our everyday life, various kinds of records like credit, personnel, financial, judicial, medical, etc., are all temporal. This stresses the need for an organized and detailed study of outliers with respect to such temporal data. In the past decade, there has been a lot of research on various forms of temporal data including consecutive data snapshots, series of data snapshots and data streams. Besides the initial work on time series, researchers have focused on rich forms of data including multiple data streams, spatio-temporal data, network data, community distribution data, etc. Compared to general outlier detection, techniques for temporal outlier detection are very different. In this book, we will present an organized picture of both recent and past research in temporal outlier detection. We start with the basics and then ramp up the reader to the main ideas in state-of-the-art outlier detection techniques. We motivate the importance of temporal outlier detection and brief the challenges beyond usual outlier detection. Then, we list down a taxonomy of proposed techniques for temporal outlier detection. Such techniques broadly include statistical techniques (like AR models, Markov models, histograms, neural networks), distance- and density-based approaches, grouping-based approaches (clustering, community detection), network-based approaches, and spatio-temporal outlier detection approaches. We summarize by presenting a wide collection of applications where temporal outlier detection techniques have been applied to discover interesting outliers.
Machine Learning and Knowledge Discovery for Engineering Systems Health Management presents state-of-the-art tools and techniques for automatically detecting, diagnosing, and predicting the effects of adverse events in an engineered system. With contributions from many top authorities on the subject, this volume is the first to bring together the two areas of machine learning and systems health management.

Divided into three parts, the book explains how the fundamental algorithms and methods of both physics-based and data-driven approaches effectively address systems health management. The first part of the text describes data-driven methods for anomaly detection, diagnosis, and prognosis of massive data streams and associated performance metrics. It also illustrates the analysis of text reports using novel machine learning approaches that help detect and discriminate between failure modes. The second part focuses on physics-based methods for diagnostics and prognostics, exploring how these methods adapt to observed data. It covers physics-based, data-driven, and hybrid approaches to studying damage propagation and prognostics in composite materials and solid rocket motors. The third part discusses the use of machine learning and physics-based approaches in distributed data centers, aircraft engines, and embedded real-time software systems.

Reflecting the interdisciplinary nature of the field, this book shows how various machine learning and knowledge discovery techniques are used in the analysis of complex engineering systems. It emphasizes the importance of these techniques in managing the intricate interactions within and between the systems to maintain a high degree of reliability.

Real world physical and abstract data objects are interconnected, forming gigantic, interconnected networks. By structuring these data objects and interactions between these objects into multiple types, such networks become semi-structured heterogeneous information networks. Most real world applications that handle big data, including interconnected social media and social networks, scientific, engineering, or medical information systems, online e-commerce systems, and most database systems, can be structured into heterogeneous information networks. Therefore, effective analysis of large-scale heterogeneous information networks poses an interesting but critical challenge. In this monograph, we investigate the principles and methodologies of mining heterogeneous information networks. Departing from many existing network models that view data as homogeneous graphs or networks, our semi-structured heterogeneous information network model leverages the rich semantics of typed nodes and links in a network and uncovers surprisingly rich knowledge from interconnected data. This semi-structured heterogeneous network modeling leads to a series of new principles and powerful methodologies for mining interconnected data, including (1) rank-based clustering and classification, (2) meta-path-based similarity search and mining, (3) relation strength-aware mining, and many other potential developments. This monograph introduces this new research frontier and points out some promising research directions.
©2018 GoogleSite Terms of ServicePrivacyDevelopersArtistsAbout Google
By purchasing this item, you are transacting with Google Payments and agreeing to the Google Payments Terms of Service and Privacy Notice.