Statistical Data Analytics: Foundations for Data Mining, Informatics, and Knowledge Discovery


A comprehensive introduction to statistical methods for data mining and knowledge discovery.

Applications of data mining and ‘big data’ increasingly take center stage in our modern, knowledge-driven society, supported by advances in computing power, automated data acquisition, social media development and interactive, linkable internet software. This book presents a coherent, technical introduction to modern statistical learning and analytics, starting from the core foundations of statistics and probability. It includes an overview of probability and statistical distributions, basics of data manipulation and visualization, and the central components of standard statistical inferences. The majority of the text extends beyond these introductory topics, however, to supervised learning in linear regression, generalized linear models, and classification analytics. Finally, unsupervised learning via dimension reduction, cluster analysis, and market basket analysis is introduced.

Extensive examples using actual data (with sample R programming code) are provided, illustrating diverse informatic sources in genomics, biomedicine, ecological remote sensing, astronomy, socioeconomics, marketing, advertising and finance, among many others.

Statistical Data Analytics:

  • Focuses on methods critically used in data mining and statistical informatics. Coherently describes the methods at an introductory level, with extensions to selected intermediate and advanced techniques.
  • Provides informative, technical details for the highlighted methods.
  • Employs the open-source R language as the computational vehicle – along with its burgeoning collection of online packages – to illustrate many of the analyses contained in the book.
  • Concludes each chapter with a range of interesting and challenging homework exercises using actual data from a variety of informatic application areas.

This book will appeal as a classroom or training text to intermediate and advanced undergraduates, and to beginning graduate students, with sufficient background in calculus and matrix algebra. It will also serve as a source-book on the foundations of statistical informatics and data analytics to practitioners who regularly apply statistical learning to their modern data.


About the author

Walter W. Piegorsch is a Professor of Mathematics at the University of Arizona and the Director of Statistical Research & Education at its BIO5 Institute for Collaborative Bioresearch. Professor Piegorsch is an experienced and highly regarded author and editor. He has co-authored one previous book for Wiley, and is a founding and current co-Editor for Wiley's StatsRef: Statistics Reference Online, a comprehensive online reference resource which covers the fundamentals and applications of statistical theory, methods, and practice. He has also been on the editorial board of many scientific journals, and served as joint-Editor of the Journal of the American Statistical Association (Theory and Methods Section).
Over the course of a long and distinguished academic career, Professor Piegorsch has taught and developed a number of courses in statistics and quantitative literacy, and he is in an ideal position to write this technical introduction to the use and application of statistical methods for informatics, statistical learning, and data mining.



Additional Information

Publisher: John Wiley & Sons
Published on: Jun 11, 2015
Pages: 488
ISBN: 9781119043577
Language: English
Genres: Mathematics / Probability & Statistics / General; Mathematics / Probability & Statistics / Stochastic Processes
Content Protection: This content is DRM protected.

Walter W. Piegorsch
Environmental statistics is a rapidly growing field, supported by advances in digital computing power, automated data collection systems, and interactive, linkable Internet software. Concerns over public and ecological health and the continuing need to support environmental policy-making and regulation have driven a concurrent explosion in environmental data analysis. This textbook is designed to address the need for trained professionals in this area. The book is based on a course which the authors have taught for many years, and prepares students for careers in environmental analysis centered on statistics and allied quantitative methods of data evaluation. The text extends beyond the introductory level, allowing students and environmental science practitioners to develop the expertise to design and perform sophisticated environmental data analyses. In particular, it:

  • Provides a coherent introduction to intermediate and advanced methods for modeling and analyzing environmental data.
  • Takes a data-oriented approach to describing the various methods.
  • Illustrates the methods with real-world examples.
  • Features extensive exercises, enabling use as a course text.
  • Includes examples of SAS computer code for implementation of the statistical methods.
  • Connects to a Web site featuring solutions to exercises, extra computer code, and additional material.
  • Serves as an overview of methods for analyzing environmental data, enabling use as a reference text for environmental science professionals.

Graduate students of statistics studying environmental data analysis will find this invaluable as will practicing data analysts and environmental scientists including specialists in atmospheric science, biology and biomedicine, chemistry, ecology, environmental health, geography, and geology.

Gareth James
An Introduction to Statistical Learning provides an accessible overview of the field of statistical learning, an essential toolset for making sense of the vast and complex data sets that have emerged in fields ranging from biology to finance to marketing to astrophysics in the past twenty years. This book presents some of the most important modeling and prediction techniques, along with relevant applications. Topics include linear regression, classification, resampling methods, shrinkage approaches, tree-based methods, support vector machines, clustering, and more. Color graphics and real-world examples are used to illustrate the methods presented. Since the goal of this textbook is to facilitate the use of these statistical learning techniques by practitioners in science, industry, and other fields, each chapter contains a tutorial on implementing the analyses and methods presented in R, an extremely popular open source statistical software platform.

Two of the authors co-wrote The Elements of Statistical Learning (Hastie, Tibshirani and Friedman, 2nd edition 2009), a popular reference book for statistics and machine learning researchers. An Introduction to Statistical Learning covers many of the same topics, but at a level accessible to a much broader audience. This book is targeted at statisticians and non-statisticians alike who wish to use cutting-edge statistical learning techniques to analyze their data. The text assumes only a previous course in linear regression and no knowledge of matrix algebra.

