James E. Gentle is a Professor of Computational Statistics at George Mason University. His research interests include Monte Carlo methods and computational finance. He is an elected member of the ISI and a Fellow of the American Statistical Association.
Wolfgang Karl Härdle is a Professor of Statistics at the Humboldt-Universität zu Berlin and the Director of CASE – the Centre for Applied Statistics and Economics. He teaches quantitative finance and semi-parametric statistical methods. His research focuses on dynamic factor models, multivariate statistics in finance, and computational statistics. He is an elected member of the ISI and an advisor to the Guanghua School of Management at Peking University and to National Central University, Taiwan.
Yuichi Mori is a Professor of Statistics and Informatics at Okayama University of Science. His research interests include efficient computing in multivariate methods, dimension reduction and variable selection, and statistics education. He is an elected member of the ISI and served as a council member of the IASC from 2003 to 2007.
New to the Third Edition
This third edition is updated with the latest version of MATLAB and the corresponding version of the Statistics and Machine Learning Toolbox. It also incorporates new sections on the nearest neighbor classifier, support vector machines, model checking and regularization, partial least squares regression, and multivariate adaptive regression splines.
The authors include algorithmic descriptions of the procedures as well as examples that illustrate the use of algorithms in data analysis. The MATLAB code, examples, and data sets are available online.
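As an illustration of the kind of procedure the new sections cover, the nearest neighbor classifier can be sketched in a few lines. This is a generic k-NN implementation in Python rather than the book's MATLAB code (which relies on the Statistics and Machine Learning Toolbox); the function name and data are illustrative only.

```python
from collections import Counter

def knn_classify(train, query, k=3):
    """Classify `query` by majority vote among the k nearest
    training points (Euclidean distance). `train` is a list of
    (point, label) pairs; points are tuples of floats."""
    # Sort training points by squared distance to the query point.
    by_dist = sorted(
        train,
        key=lambda pl: sum((a - b) ** 2 for a, b in zip(pl[0], query)),
    )
    # Majority vote among the k closest labels.
    votes = Counter(label for _, label in by_dist[:k])
    return votes.most_common(1)[0][0]

# Two well-separated clusters labeled "a" and "b" (toy data).
train = [((0.0, 0.0), "a"), ((0.1, 0.2), "a"), ((0.2, 0.1), "a"),
         ((5.0, 5.0), "b"), ((5.1, 4.9), "b"), ((4.9, 5.2), "b")]
print(knn_classify(train, (0.3, 0.3)))  # query near the first cluster -> "a"
```

A query point is assigned the label held by the majority of its k closest training points; choosing k odd avoids ties in two-class problems.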
The second part of the book begins with a consideration of various types of matrices encountered in statistics, such as projection matrices and positive definite matrices, and describes the special properties of those matrices. The second part also describes some of the many applications of matrix theory in statistics, including linear models, multivariate analysis, and stochastic processes. The brief coverage in this part illustrates the matrix theory developed in the first part of the book. The first two parts of the book can be used as the text for a course in matrix algebra for statistics students, or as a supplementary text for various courses in linear models or multivariate statistics.
The third part of this book covers numerical linear algebra. It begins with a discussion of the basics of numerical computations, and then describes accurate and efficient algorithms for factoring matrices, solving linear systems of equations, and extracting eigenvalues and eigenvectors. Although the book is not tied to any particular software system, it describes and gives examples of the use of modern computer software for numerical linear algebra. This part is essentially self-contained, although it assumes some ability to program in Fortran or C and/or the ability to use R/S-Plus or Matlab. This part of the book can be used as the text for a course in statistical computing, or as a supplementary text for various courses that emphasize computations.
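To give a flavor of the linear-system solvers this part discusses, here is a minimal sketch of Gaussian elimination with partial pivoting. It is written in Python for illustration (the book itself works with Fortran, C, R/S-Plus, or Matlab), and it is a textbook-style example, not code from the book.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    # Work on an augmented copy [A | b] so the inputs are not modified.
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining pivot into place
        # for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        # Eliminate entries below the pivot.
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution on the resulting upper-triangular system.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

# Solve the 2x2 system: 2x + y = 3, x + 3y = 5.
print(solve([[2.0, 1.0], [1.0, 3.0]], [3.0, 5.0]))  # -> [0.8, 1.4]
```

In practice one would call an optimized library routine (LAPACK, or its wrappers in R and MATLAB) rather than hand-rolling elimination, which is exactly the kind of software usage this part of the book demonstrates.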
The book includes a large number of exercises with some solutions provided in an appendix.
Authors Hadley Wickham and Garrett Grolemund guide you through the steps of importing, wrangling, exploring, and modeling your data and communicating the results. You’ll get a complete, big-picture understanding of the data science cycle, along with basic tools you need to manage the details. Each section of the book is paired with exercises to help you practice what you’ve learned along the way.
You’ll learn how to:

- Wrangle: transform your datasets into a form convenient for analysis
- Program: learn powerful R tools for solving data problems with greater clarity and ease
- Explore: examine your data, generate hypotheses, and quickly test them
- Model: provide a low-dimensional summary that captures true "signals" in your dataset
- Communicate: learn R Markdown for integrating prose, code, and results