"The Freakonomics of big data." —Stein Kretsinger, founding executive of Advertising.com
Award-winning | Used by over 30 universities | Translated into 9 languages
An introduction for everyone. In this rich, fascinating — surprisingly accessible — introduction, leading expert Eric Siegel reveals how predictive analytics (aka machine learning) works, and how it affects everyone every day. Rather than a “how to” for hands-on techies, the book serves lay readers and experts alike by covering new case studies and the latest state-of-the-art techniques.
Prediction is booming. It reinvents industries and runs the world. Companies, governments, law enforcement, hospitals, and universities are seizing upon the power. These institutions predict whether you're going to click, buy, lie, or die.
Why? For good reason: predicting human behavior combats risk, boosts sales, fortifies healthcare, streamlines manufacturing, conquers spam, optimizes social networks, toughens crime fighting, and wins elections.
How? Prediction is powered by the world's most potent, flourishing unnatural resource: data. Accumulated in large part as the by-product of routine tasks, data is the unsalted, flavorless residue deposited en masse as organizations churn away. Surprise! This heap of refuse is a gold mine. Big data embodies an extraordinary wealth of experience from which to learn.
Predictive analytics (aka machine learning) unleashes the power of data. With this technology, the computer literally learns from data how to predict the future behavior of individuals. Perfect prediction is not possible, but putting odds on the future drives millions of decisions more effectively, determining whom to call, mail, investigate, incarcerate, set up on a date, or medicate.
In this lucid, captivating introduction — now in its Revised and Updated edition — former Columbia University professor and Predictive Analytics World founder Eric Siegel reveals the power and perils of prediction:
How does predictive analytics work? This jam-packed book satisfies by demystifying the intriguing science under the hood. For future hands-on practitioners pursuing a career in the field, it sets a strong foundation, delivers the prerequisite knowledge, and whets your appetite for more.
A truly omnipresent science, predictive analytics constantly affects our daily lives. Whether you are a consumer of it — or consumed by it — get a handle on the power of Predictive Analytics.
ERIC SIEGEL, PhD, is the founder of Predictive Analytics World and executive editor of The Predictive Analytics Times. A former Columbia University professor, he is a renowned speaker, educator, and leader in the field.
The five most valuable econometric methods, or what the authors call the Furious Five--random assignment, regression, instrumental variables, regression discontinuity designs, and differences in differences--are illustrated through well-crafted real-world examples (vetted for awesomeness by Kung Fu Panda's Jade Palace). Does health insurance make you healthier? Randomized experiments provide answers. Are expensive private colleges and selective public high schools better than more pedestrian institutions? Regression analysis and a regression discontinuity design reveal the surprising truth. When private banks teeter, and depositors take their money and run, should central banks step in to save them? Differences-in-differences analysis of a Depression-era banking crisis offers a response. Could arresting O. J. Simpson have saved his ex-wife's life? Instrumental variables methods instruct law enforcement authorities in how best to respond to domestic abuse.
Wielding econometric tools with skill and confidence, Mastering 'Metrics uses data and statistics to illuminate the path from cause to effect.
Shows why econometrics is important
Explains econometric research through humorous and accessible discussion
Outlines empirical methods central to modern econometric practice
Works through interesting and relevant real-world examples
But how exactly does one do data science? Do you have to hire one of these priests of the dark arts, the "data scientist," to extract this gold from your data? Nope.
Data science is little more than using straightforward steps to process raw data into actionable insight. And in Data Smart, author and data scientist John Foreman will show you how that's done within the familiar environment of a spreadsheet.
Why a spreadsheet? It's comfortable! You get to look at the data every step of the way, building confidence as you learn the tricks of the trade. Plus, spreadsheets are a vendor-neutral place to learn data science without the hype.
But don't let the Excel sheets fool you. This is a book for those serious about learning the analytic techniques, the math and the magic, behind big data.
Each chapter will cover a different technique in a spreadsheet so you can follow along:
Mathematical optimization, including non-linear programming and genetic algorithms
Clustering via k-means, spherical k-means, and graph modularity
Data mining in graphs, such as outlier detection
Supervised AI through logistic regression, ensemble models, and bag-of-words models
Forecasting, seasonal adjustments, and prediction intervals through Monte Carlo simulation
Moving from spreadsheets into the R programming language
You get your hands dirty as you work alongside John through each technique. But never fear, the topics are readily applicable and the author laces humor throughout. You'll even learn what a dead squirrel has to do with optimization modeling, which you no doubt are dying to know.
In addition to econometric essentials, Mostly Harmless Econometrics covers important new extensions--regression-discontinuity designs and quantile regression--as well as how to get standard errors right. Joshua Angrist and Jörn-Steffen Pischke explain why fancier econometric techniques are typically unnecessary and even dangerous. The applied econometric methods emphasized in this book are easy to use and relevant for many areas of contemporary social science.
An irreverent review of econometric essentials
A focus on tools that applied researchers use most
Chapters on regression-discontinuity designs, quantile regression, and standard errors
Many empirical examples
A clear and concise resource with wide applications
The second edition of this acclaimed graduate text provides a unified treatment of two methods used in contemporary econometric research, cross section and panel data methods. By focusing on assumptions that can be given behavioral content, the book maintains an appropriate level of rigor while emphasizing intuitive thinking. The analysis covers both linear and nonlinear models, including models with dynamics and/or individual heterogeneity. In addition to general estimation frameworks (particularly methods of moments and maximum likelihood), specific linear and nonlinear methods are covered in detail, including probit and logit models and their multivariate extensions, Tobit models, models for count data, censored and missing data schemes, causal (or treatment) effects, and duration analysis.
Econometric Analysis of Cross Section and Panel Data was the first graduate econometrics text to focus on microeconomic data structures, allowing assumptions to be separated into population and sampling assumptions. This second edition has been substantially updated and revised. Improvements include a broader class of models for missing data problems; more detailed treatment of cluster problems, an important topic for empirical researchers; expanded discussion of "generalized instrumental variables" (GIV) estimation; new coverage (based on the author's own recent research) of inverse probability weighting; a more complete framework for estimating treatment effects with panel data; and a firmly established link between econometric approaches to nonlinear panel data and the "generalized estimating equation" literature popular in statistics and other fields. New attention is given to explaining when particular econometric methods can be applied; the goal is not only to tell readers what does work, but also why certain "obvious" procedures do not. The numerous included exercises, both theoretical and computer-based, allow the reader to extend methods covered in the text and discover new insights.