Big Data

McGraw-Hill Education

This book is meant for students, as well as executives, who wish to take advantage of emerging opportunities in Big Data. It gives an intuitive overview of the whole field in simple language, free from jargon and code. All the essential Big Data technology tools and platforms, such as Hadoop, MapReduce, Spark, and NoSQL, are discussed. The short chapters make it easy to grasp the key concepts quickly. A complete case study of developing a Big Data application is included.

Salient Features:

- Provides fun and insightful caselets from real-world stories at the beginning of every chapter, for example the IBM Watson case study and Google Flu

- Provides a running case study across the chapters as exercises, e.g., Google Query Architecture and How Google Search Works

- Includes dedicated chapters on Data Mining and Big Data Programming, and appendices on installation of Hadoop, Spark, and Amazon Web Services


About the author

Professor of CSE and MIS at MUM, in Fairfield, Iowa 


Additional Information

Publisher: McGraw-Hill Education
Published on: May 1, 2017
Pages: 279
ISBN: 9789352604548
Language: English
Content Protection: This content is DRM protected.
Read Aloud: Available on Android devices

Learn the basics of analytics on big data using Java, machine learning, and other big data tools

About This Book

- Acquire a real-world set of tools for building enterprise-level data science applications

- Surpass the barrier of other languages in data science and learn to create useful object-oriented code

- Make extensive use of Java-compliant big data tools such as Apache Spark and Hadoop

Who This Book Is For

This book is for Java developers who are looking to perform data analysis in a production environment. Those who wish to implement data analysis in their big data applications will find this book helpful.

What You Will Learn

- Start from simple analytic tasks on big data

- Get into more complex tasks with predictive analytics on big data using machine learning

- Learn real-time analytic tasks

- Understand the concepts with examples and case studies

- Prepare and refine data for analysis

- Create charts in order to understand the data

- See various real-world datasets

In Detail

This book covers case studies such as sentiment analysis on a tweet dataset, recommendations on the MovieLens dataset, customer segmentation on an e-commerce dataset, and graph analysis on a real flights dataset.

This book is an end-to-end guide to implementing analytics on big data with Java. Java is the de facto language for major big data environments, including Hadoop. This book will teach you how to perform analytics on big data with production-friendly Java. The book is divided into two parts. The first part is an introduction that helps readers get acquainted with big data environments, whereas the second part contains an in-depth discussion of the concepts in analytics on big data. It will take you from data analysis and data visualization to the core concepts and advantages of machine learning, real-life usage of regression and classification using Naive Bayes, a deep discussion of the concepts of clustering, and a review of simple neural networks on big data using Deeplearning4j or plain Java Spark code. This is a must-have book for Java developers who want to start learning big data analytics and use it in the real world.

Style and approach

The book delivers practical learning modules in manageable portions. Each chapter is a self-contained unit on one concept in big data analytics, and the chapters build competency in the field step by step. Examples based on real-world case studies illustrate practical applications and how to use the techniques discussed. The examples and case studies are presented using both theory and code.

Perspectives on the varied challenges posed by big data for health, science, law, commerce, and politics.

Big data is ubiquitous but heterogeneous. Big data can be used to tally clicks and traffic on web pages, find patterns in stock trades, track consumer preferences, and identify linguistic correlations in large corpuses of texts. This book examines big data not as an undifferentiated whole but contextually, investigating the varied challenges posed by big data for health, science, law, commerce, and politics. Taken together, the chapters reveal a complex set of problems, practices, and policies.

The advent of big data methodologies has challenged the theory-driven approach to scientific knowledge in favor of a data-driven one. Social media platforms and self-tracking tools change the way we see ourselves and others. The collection of data by corporations and government threatens privacy while promoting transparency. Meanwhile, politicians, policy makers, and ethicists are ill-prepared to deal with big data's ramifications. The contributors look at big data's effect on individuals as it exerts social control through monitoring, mining, and manipulation; big data and society, examining both its empowering and its constraining effects; big data and science, considering issues of data governance, provenance, reuse, and trust; and big data and organizations, discussing data responsibility, “data harm,” and decision making.

Contributors
Ryan Abbott, Cristina Alaimo, Kent R. Anderson, Mark Andrejevic, Diane E. Bailey, Mike Bailey, Mark Burdon, Fred H. Cate, Jorge L. Contreras, Simon DeDeo, Hamid R. Ekbia, Allison Goodwell, Jannis Kallinikos, Inna Kouper, M. Lynne Markus, Michael Mattioli, Paul Ohm, Scott Peppet, Beth Plale, Jason Portenoy, Julie Rennecker, Katie Shilton, Dan Sholler, Cassidy R. Sugimoto, Isuru Suriarachchi, Jevin D. West

Frank Kane's hands-on Spark training course, based on his bestselling Taming Big Data with Apache Spark and Python video, now available in a book. Understand and analyze large data sets using Spark on a single system or on a cluster.

About This Book

- Understand how Spark can be distributed across computing clusters

- Develop and run Spark jobs efficiently using Python

- A hands-on tutorial by Frank Kane with over 15 real-world examples teaching you Big Data processing with Spark

Who This Book Is For

If you are a data scientist or data analyst who wants to learn Big Data processing using Apache Spark and Python, this book is for you. If you have some programming experience in Python, and want to learn how to process large amounts of data using Apache Spark, Frank Kane's Taming Big Data with Apache Spark and Python will also help you.

What You Will Learn

- Find out how you can identify Big Data problems as Spark problems

- Install and run Apache Spark on your computer or on a cluster

- Analyze large data sets across many CPUs using Spark's Resilient Distributed Datasets

- Implement machine learning on Spark using the MLlib library

- Process continuous streams of data in real time using the Spark streaming module

- Perform complex network analysis using Spark's GraphX library

- Use Amazon's Elastic MapReduce service to run your Spark jobs on a cluster

In Detail

Frank Kane's Taming Big Data with Apache Spark and Python is your companion to learning Apache Spark in a hands-on manner. Frank will start you off by teaching you how to set up Spark on a single system or on a cluster, and you'll soon move on to analyzing large data sets using Spark RDD, and developing and running effective Spark jobs quickly using Python.

Apache Spark has emerged as the next big thing in the Big Data domain – quickly rising from an ascending technology to an established superstar in just a matter of years. Spark allows you to quickly extract actionable insights from large amounts of data, on a real-time basis, making it an essential tool in many modern businesses.

Frank has packed this book with over 15 interactive, fun-filled examples relevant to the real world, and he will empower you to understand the Spark ecosystem and implement production-grade real-time Spark projects with ease.

Style and approach

Frank Kane's Taming Big Data with Apache Spark and Python is a hands-on tutorial with over 15 real-world examples carefully explained by Frank in a step-by-step manner. The examples vary in complexity, and you can move through them at your own pace.
