Based on an MBA course Provost has taught at New York University over the past ten years, Data Science for Business provides examples of real-world business problems to illustrate these principles. You’ll not only learn how to improve communication between business stakeholders and data scientists, but also how to participate intelligently in your company’s data science projects. You’ll also discover how to think data-analytically, and fully appreciate how data science methods can support business decision-making.
• Understand how data science fits in your organization, and how you can use it for competitive advantage
• Treat data as a business asset that requires careful investment if you’re to gain real value
• Approach business problems data-analytically, using the data-mining process to gather good data in the most appropriate way
• Learn general concepts for actually extracting knowledge from data
• Apply data science principles when interviewing data science job candidates
NoSQL Distilled is a concise but thorough introduction to this rapidly emerging technology. Pramod J. Sadalage and Martin Fowler explain how NoSQL databases work and the ways that they may be a superior alternative to a traditional RDBMS. The authors provide a fast-paced guide to the concepts you need to know in order to evaluate whether NoSQL databases are right for your needs and, if so, which technologies you should explore further.
The first part of the book concentrates on core concepts, including schemaless data models, aggregates, new distribution models, the CAP theorem, and map-reduce. In the second part, the authors explore architectural and design issues associated with implementing NoSQL. They also present realistic use cases that demonstrate NoSQL databases at work and feature representative examples using Riak, MongoDB, Cassandra, and Neo4j.
In addition, by drawing on Pramod Sadalage’s pioneering work, NoSQL Distilled shows how to implement evolutionary design with schema migration: an essential technique for applying NoSQL databases. The book concludes by describing how NoSQL is ushering in a new age of Polyglot Persistence, where multiple data-storage worlds coexist, and architects can choose the technology best optimized for each type of data access.
This textbook provides a comprehensive introduction to forecasting methods and presents enough information about each method for readers to use them sensibly.
Let's face it, SQL is a deceptively simple language to learn, and many database developers never go far beyond the simple statement: SELECT columns FROM table WHERE conditions. But there is so much more you can do with the language. In the SQL Cookbook, experienced SQL developer Anthony Molinaro shares his favorite SQL techniques and features. You'll learn about:
Window functions, arguably the most significant enhancement to SQL in the past decade. If you're not using these, you're missing out.
Powerful, database-specific features such as SQL Server's PIVOT and UNPIVOT operators, Oracle's MODEL clause, and PostgreSQL's very useful GENERATE_SERIES function
Pivoting rows into columns, reverse-pivoting columns into rows, using pivoting to facilitate inter-row calculations, and double-pivoting a result set
Bucketization, and why you should never use that term in Brooklyn.
How to create histograms, summarize data into buckets, perform aggregations over a moving range of values, generate running totals and subtotals, and apply other advanced data warehousing techniques
The technique of walking a string, which allows you to use SQL to parse through the characters, words, or delimited elements of a string
Written in O'Reilly's popular Problem/Solution/Discussion style, the SQL Cookbook is sure to please. Anthony's credo is: "When it comes down to it, we all go to work, we all have bills to pay, and we all want to go home at a reasonable time and enjoy what's still available of our days." The SQL Cookbook moves quickly from problem to solution, saving you time each step of the way.
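As an illustration of the window functions and running totals mentioned in the list above, here is a minimal sketch using Python's built-in sqlite3 module. It is not taken from the SQL Cookbook. The `sales` table, its columns, and the `running_totals` helper are hypothetical names chosen for this example, and it assumes the bundled SQLite is version 3.25 or later, where window functions are available.

```python
import sqlite3


def running_totals(rows):
    """Return (day, amount, running_total) tuples computed by a SQL window function.

    `rows` is a list of (day, amount) pairs; the table and column names are
    hypothetical and exist only for this illustration.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE sales (day INTEGER PRIMARY KEY, amount INTEGER)")
        conn.executemany("INSERT INTO sales (day, amount) VALUES (?, ?)", rows)
        # SUM(...) OVER (ORDER BY day) accumulates the amounts from the first
        # row up to and including the current row, without a self-join.
        query = """
            SELECT day,
                   amount,
                   SUM(amount) OVER (ORDER BY day) AS running_total
            FROM sales
            ORDER BY day
        """
        return conn.execute(query).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in running_totals([(1, 10), (2, 20), (3, 5)]):
        print(row)
```

The same query shape should carry over to other databases that support window functions, although frame defaults and syntax details can differ between products.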
If you have an aptitude for mathematics and some programming skills, author Joel Grus will help you get comfortable with the math and statistics at the core of data science, and with the hacking skills you need to get started as a data scientist. Today’s messy glut of data holds answers to questions no one’s even thought to ask. This book provides you with the know-how to dig those answers out.
• Get a crash course in Python
• Learn the basics of linear algebra, statistics, and probability, and understand how and when they're used in data science
• Collect, explore, clean, munge, and manipulate data
• Dive into the fundamentals of machine learning
• Implement models such as k-nearest neighbors, Naive Bayes, linear and logistic regression, decision trees, neural networks, and clustering
• Explore recommender systems, natural language processing, network analysis, MapReduce, and databases
Updated to reflect recent advances in MySQL and InnoDB performance, features, and tools, this third edition not only offers specific examples of how MySQL works, it also teaches you why this system works as it does, with illustrative stories and case studies that demonstrate MySQL’s principles in action. With this book, you’ll learn how to think in MySQL.
• Learn the effects of new features in MySQL 5.5, including stored procedures, partitioned databases, triggers, and views
• Implement improvements in replication, high availability, and clustering
• Achieve high performance when running MySQL in the cloud
• Optimize advanced querying features, such as full-text searches
• Take advantage of modern multi-core CPUs and solid-state disks
• Explore backup and recovery strategies, including new tools for hot online backups
Complete with case studies that illustrate how Hadoop solves specific problems, this book helps you:
• Use the Hadoop Distributed File System (HDFS) for storing large datasets, and run distributed computations over those datasets using MapReduce
• Become familiar with Hadoop's data and I/O building blocks for compression, data integrity, serialization, and persistence
• Discover common pitfalls and advanced features for writing real-world MapReduce programs
• Design, build, and administer a dedicated Hadoop cluster, or run Hadoop in the cloud
• Use Pig, a high-level query language for large-scale data processing
• Take advantage of HBase, Hadoop's database for structured and semi-structured data
• Learn ZooKeeper, a toolkit of coordination primitives for building distributed systems
If you have lots of data -- whether it's gigabytes or petabytes -- Hadoop is the perfect solution. Hadoop: The Definitive Guide is the most thorough book available on the subject.
"Now you have the opportunity to learn about Hadoop from a master-not only of the technology, but also of common sense and plain talk."-- Doug Cutting, Hadoop Founder, Yahoo!
Each chapter presents a self-contained lesson on a key SQL concept or technique, with numerous illustrations and annotated examples. Exercises at the end of each chapter let you practice the skills you learn. With this book, you will:
• Move quickly through SQL basics and learn several advanced features
• Use SQL data statements to generate, manipulate, and retrieve data
• Create database objects, such as tables, indexes, and constraints, using SQL schema statements
• Learn how data sets interact with queries, and understand the importance of subqueries
• Convert and manipulate data with SQL's built-in functions, and use conditional logic in data statements
Knowledge of SQL is a must for interacting with data. With Learning SQL, you'll quickly learn how to put the power and flexibility of this language to work.
If you want to find out how to use Python to start answering critical questions of your data, pick up Python Machine Learning – whether you want to get started from scratch or want to extend your data science knowledge, this is an essential and unmissable resource.
What You Will Learn
• Explore how to use different machine learning models to ask different questions of your data
• Learn how to build neural networks using Keras and Theano
• Find out how to write clean and elegant Python code that will optimize the strength of your algorithms
• Discover how to embed your machine learning model in a web application for increased accessibility
• Predict continuous target outcomes using regression analysis
• Uncover hidden patterns and structures in data with clustering
• Organize data using effective pre-processing techniques
• Get to grips with sentiment analysis to delve deeper into textual and social media data
In Detail
Machine learning and predictive analytics are transforming the way businesses and other organizations operate. Being able to understand trends and patterns in complex data is critical to success, becoming one of the key strategies for unlocking growth in a challenging contemporary marketplace. Python can help you deliver key insights into your data – its unique capabilities as a language let you build sophisticated algorithms and statistical models that can reveal new perspectives and answer key questions that are vital for success.
Python Machine Learning gives you access to the world of predictive analytics and demonstrates why Python is one of the world's leading data science languages. If you want to ask better questions of data, or need to improve and extend the capabilities of your machine learning systems, this practical data science book is invaluable. Covering a wide range of powerful Python libraries, including scikit-learn, Theano, and Keras, and featuring guidance and tips on everything from sentiment analysis to neural networks, you'll soon be able to answer some of the most important questions facing you and your organization.
Style and approach
Python Machine Learning connects the fundamental theoretical principles behind machine learning to their practical application in a way that focuses you on asking and answering the right questions. It walks you through the key elements of Python and its powerful machine learning libraries, while demonstrating how to get to grips with a range of statistical models.
But how does one exactly do data science? Do you have to hire one of these priests of the dark arts, the "data scientist," to extract this gold from your data? Nope.
Data science is little more than using straightforward steps to process raw data into actionable insight. And in Data Smart, author and data scientist John Foreman will show you how that's done within the familiar environment of a spreadsheet.
Why a spreadsheet? It's comfortable! You get to look at the data every step of the way, building confidence as you learn the tricks of the trade. Plus, spreadsheets are a vendor-neutral place to learn data science without the hype.
But don't let the Excel sheets fool you. This is a book for those serious about learning the analytic techniques, the math and the magic, behind big data.
Each chapter will cover a different technique in a spreadsheet so you can follow along:
• Mathematical optimization, including non-linear programming and genetic algorithms
• Clustering via k-means, spherical k-means, and graph modularity
• Data mining in graphs, such as outlier detection
• Supervised AI through logistic regression, ensemble models, and bag-of-words models
• Forecasting, seasonal adjustments, and prediction intervals through Monte Carlo simulation
• Moving from spreadsheets into the R programming language
You get your hands dirty as you work alongside John through each technique. But never fear, the topics are readily applicable and the author laces humor throughout. You'll even learn what a dead squirrel has to do with optimization modeling, which you no doubt are dying to know.
This updated second edition provides guidance for database developers, advanced configuration for system administrators, and an overview of the concepts and use cases for other people on your project. Ideal for NoSQL newcomers and experienced MongoDB users alike, this guide provides numerous real-world schema design examples.
• Get started with MongoDB core concepts and vocabulary
• Perform basic write operations at different levels of safety and speed
• Create complex queries, with options for limiting, skipping, and sorting results
• Design an application that works well with MongoDB
• Aggregate data, including counting, finding distinct values, grouping documents, and using MapReduce
• Gather and interpret statistics about your collections and databases
• Set up replica sets and automatic failover in MongoDB
• Use sharding to scale horizontally, and learn how it impacts applications
• Delve into monitoring, security and authentication, backup/restore, and other administrative tasks
Detailing the hows and the whys of successful Essbase implementation, the book arms you with simple yet powerful tools to meet your immediate needs, as well as the theoretical knowledge to proceed to the next level with Essbase. Infrastructure, data sourcing and transformation, database design, calculations, automation, APIs, reporting, and project implementation are covered by subject matter experts who work with the tools and techniques on a daily basis. In addition to practical cases that illustrate valuable lessons learned, the book offers:
• Undocumented Secrets: Dan Pressman describes the previously unpublished and undocumented inner workings of the ASO Essbase engine.
• Authoritative Experts: If you have questions that no one else can solve, these 12 Essbase professionals are the ones who can answer them.
• Unpublished: Includes the only third-party guide to infrastructure. Infrastructure is easy to get wrong and can doom any Essbase project.
• Comprehensive: Let there never again be a question on how to create blocks or design BSO databases for performance. Dave Farnsworth provides the answers within.
• Innovative: Cameron Lackpour and Joe Aultman bring new and exciting solutions to persistent Essbase problems.
With a list of contributors as impressive as the program of presenters at a leading Essbase conference, this book offers unprecedented access to the insights and experiences of those at the forefront of the field. The previously unpublished material presented in these pages will give you the practical knowledge needed to use this powerful and intuitive tool to build highly useful analytical models, reporting systems, and forecasting applications.
Rather than run through all possible scenarios, this pragmatic operations guide calls out what works, as demonstrated in critical deployments.
• Get a high-level overview of HDFS and MapReduce: why they exist and how they work
• Plan a Hadoop deployment, from hardware and OS selection to network requirements
• Learn setup and configuration details with a list of critical properties
• Manage resources by sharing a cluster across multiple groups
• Get a runbook of the most common cluster maintenance tasks
• Monitor Hadoop clusters, and learn troubleshooting with the help of real-world war stories
• Use basic tools and techniques to handle backup and catastrophic failure
This book offers practical answers to some of the hardest questions faced by PL/SQL developers, including:
• What is the best way to write the SQL logic in my application code?
• How should I write my packages so they can be leveraged by my entire team of developers?
• How can I make sure that all my team's programs handle and record errors consistently?
Oracle PL/SQL Best Practices summarizes PL/SQL best practices in nine major categories: overall PL/SQL application development; programming standards; program testing, tracing, and debugging; variables and data structures; control logic; error handling; the use of SQL in PL/SQL; building procedures, functions, packages, and triggers; and overall program performance.
This book is a concise and entertaining guide that PL/SQL developers will turn to again and again as they seek out ways to write higher quality code and more successful applications.
"This book presents ideas that make the difference between a successful project and one that never gets off the ground. It goes beyond just listing a set of rules, and provides realistic scenarios that help the reader understand where the rules come from. This book should be required reading for any team of Oracle database professionals."
--Dwayne King, President, KRIDAN Consulting
This major new edition features many topics not covered in the original, including graphical models, random forests, ensemble methods, least angle regression & path algorithms for the lasso, non-negative matrix factorization, and spectral clustering. There is also a chapter on methods for "wide" data (p bigger than n), including multiple testing and false discovery rates.
Trevor Hastie, Robert Tibshirani, and Jerome Friedman are professors of statistics at Stanford University. They are prominent researchers in this area: Hastie and Tibshirani developed generalized additive models and wrote a popular book of that title. Hastie co-developed much of the statistical modeling software and environment in R/S-PLUS and invented principal curves and surfaces. Tibshirani proposed the lasso and is co-author of the very successful An Introduction to the Bootstrap. Friedman is the co-inventor of many data-mining tools including CART, MARS, projection pursuit and gradient boosting.
Your freemium product generates vast volumes of data, but using that data to maximize conversion, boost retention, and deliver revenue can be challenging if you don't fully understand the impact that small changes can have on revenue. In this book, author Eric Seufert provides clear guidelines for using data and analytics through all stages of development to optimize your implementation of the freemium model. Freemium Economics de-mystifies the freemium model through an exploration of its core, data-oriented tenets, so that you can apply it methodically rather than hoping that conversion and revenue will naturally follow product launch.
By reading Freemium Economics, you will:
• Learn how to apply data science and big data principles in freemium product design and development to maximize conversion, boost retention, and deliver revenue
• Gain a broad introduction to the conceptual economic pillars of freemium and a complete understanding of the unique approaches needed to acquire users and convert them from free to paying customers
• Get practical tips and analytical guidance to successfully implement the freemium model
• Understand the metrics and infrastructure required to measure the success of a freemium product and improve it post-launch
The book also includes a detailed explanation of the lifetime customer value (LCV) calculation and step-by-step instructions for implementing key performance indicators in a simple, universally accessible tool like Excel.
This book will help you:
• Become a contributor on a data science team
• Deploy a structured lifecycle approach to data analytics problems
• Apply appropriate analytic techniques and tools to analyzing big data
• Learn how to tell a compelling story with data to drive business action
• Prepare for EMC Proven Professional Data Science Certification
Corresponding data sets are available at www.wiley.com/go/9781118876138.
Get started discovering, analyzing, visualizing, and presenting data in a meaningful way today!
The example code for this unique data science book is maintained in a public GitHub repository. It’s designed to be easily accessible through a turnkey virtual machine that facilitates interactive learning with an easy-to-use collection of IPython Notebooks.
Bayesian methods of inference are deeply natural and extremely powerful. However, most discussions of Bayesian inference rely on intensely complex mathematical analyses and artificial examples, making it inaccessible to anyone without a strong mathematical background. Now, though, Cameron Davidson-Pilon introduces Bayesian inference from a computational perspective, bridging theory to practice–freeing you to get results using computing power.
Bayesian Methods for Hackers illuminates Bayesian inference through probabilistic programming with the powerful PyMC language and the closely related Python tools NumPy, SciPy, and Matplotlib. Using this approach, you can reach effective solutions in small increments, without extensive mathematical intervention.
Davidson-Pilon begins by introducing the concepts underlying Bayesian inference, comparing it with other techniques and guiding you through building and training your first Bayesian model. Next, he introduces PyMC through a series of detailed examples and intuitive explanations that have been refined after extensive user feedback. You’ll learn how to use the Markov Chain Monte Carlo algorithm, choose appropriate sample sizes and priors, work with loss functions, and apply Bayesian inference in domains ranging from finance to marketing. Once you’ve mastered these techniques, you’ll constantly turn to this guide for the working PyMC code you need to jumpstart future projects.
• Learning the Bayesian “state of mind” and its practical implications
• Understanding how computers perform Bayesian inference
• Using the PyMC Python library to program Bayesian analyses
• Building and debugging models with PyMC
• Testing your model’s “goodness of fit”
• Opening the “black box” of the Markov Chain Monte Carlo algorithm to see how and why it works
• Leveraging the power of the “Law of Large Numbers”
• Mastering key concepts, such as clustering, convergence, autocorrelation, and thinning
• Using loss functions to measure an estimate’s weaknesses based on your goals and desired outcomes
• Selecting appropriate priors and understanding how their influence changes with dataset size
• Overcoming the “exploration versus exploitation” dilemma: deciding when “pretty good” is good enough
• Using Bayesian inference to improve A/B testing
• Solving data science problems when only small amounts of data are available
Cameron Davidson-Pilon has worked in many areas of applied mathematics, from the evolutionary dynamics of genes and diseases to stochastic modeling of financial prices. His contributions to the open source community include lifelines, an implementation of survival analysis in Python. Educated at the University of Waterloo and at the Independent University of Moscow, he currently works with the online commerce leader Shopify.
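To make the Bayesian A/B testing item in the list above concrete, here is a minimal sketch that uses only Python's standard library. It is not the book's PyMC code: instead of Markov Chain Monte Carlo, it swaps in direct sampling from the conjugate Beta posterior, which works for this simple conversion-rate case. The function name `prob_b_beats_a`, the uniform Beta(1, 1) prior, and the visitor counts are all assumptions made for illustration.

```python
import random


def prob_b_beats_a(conv_a, n_a, conv_b, n_b, draws=20000, seed=0):
    """Estimate P(rate_B > rate_A) given conversions and visitors for two variants.

    Assumes a uniform Beta(1, 1) prior on each conversion rate, so each
    posterior is Beta(1 + conversions, 1 + non-conversions).
    """
    rng = random.Random(seed)  # seeded so repeated runs give the same estimate
    wins = 0
    for _ in range(draws):
        # Draw one plausible conversion rate for each variant from its posterior.
        rate_a = rng.betavariate(1 + conv_a, 1 + n_a - conv_a)
        rate_b = rng.betavariate(1 + conv_b, 1 + n_b - conv_b)
        if rate_b > rate_a:
            wins += 1
    # The fraction of draws where B wins approximates the posterior probability.
    return wins / draws


if __name__ == "__main__":
    # Hypothetical data: 40 of 1000 visitors converted on A, 60 of 1000 on B.
    print(prob_b_beats_a(40, 1000, 60, 1000))
```

The result is a probability that one variant is better, which can be read directly when deciding whether "pretty good" evidence is good enough to act on.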
Maybe you've written some simple SQL queries to interact with databases. But now you want more: you want to really dig into those databases and work with your data. Head First SQL will show you the fundamentals of SQL and how to really take advantage of it. We'll take you on a journey through the language, from basic INSERT statements and SELECT queries to hardcore database manipulation with indices, joins, and transactions. We all know "Data is Power" - but we'll show you how to have "Power over your Data". Expect to have fun, expect to learn, and expect to be querying, normalizing, and joining your data like a pro by the time you're finished reading!
This book is an in-depth guide to the use of pandas for data analysis, for either the seasoned data analysis practitioner or the novice user. It provides a basic introduction to the pandas framework, and takes users through the installation of the library and the IPython interactive environment. Thereafter, you will learn basic as well as advanced features, such as MultiIndexing, modifying data structures, and sampling data, which provide powerful capabilities for data analysis.
Create and distribute dynamic, feature-rich data visualizations and highly interactive BI dashboards, quickly and easily! Tableau 8: The Official Guide provides the hands-on instruction and best practices you need to meet your business intelligence objectives and drive better decision making. Discover how to work from the Tableau GUI, load BI from disparate sources, drag and drop to analyze data, set up custom visualizations, and build robust dashboards. This practical guide shows you, step by step, how to design and publish meaningful business communications to end users across your enterprise.
• Navigate the Tableau user interface and data window
• Connect to spreadsheets, databases, and other sources
• Select data fields and drag them to desired screen locations
• Work with pre-defined visualizations and sample workbooks
• Display background maps and perform geographic analysis
• Add calculated fields, graphs, charts, tables, and statistics
• Combine multiple data sources into real-time dashboards
• Export your visualizations to the Web or in various file formats
Electronic content includes:
• Videos that demonstrate the techniques presented in the book
• Sample Tableau workbooks
The authors use task-oriented descriptions and concrete end-to-end examples to ensure that the reader can immediately begin using this new service. The book describes all aspects of the service from data ingress to applying machine learning, evaluating the models, and deploying them as web services.
Learn how you can quickly build and deploy sophisticated predictive models with the new Azure Machine Learning from Microsoft.
What’s New in the Second Edition?
Five new chapters have been added with practical detailed coverage of:
• Python Integration – a new feature announced February 2015
• Data preparation and feature selection
• Data visualization with Power BI
• Recommendation engines
• Selling your models on Azure Marketplace
Design, implement, manage, and maintain a highly flexible service-oriented computing infrastructure across your enterprise using the detailed information in this Oracle Press guide. Written by an Oracle ACE director, Oracle SOA Suite 12c Handbook uses a start-to-finish case study to illustrate each concept and technique. Learn expert techniques for designing and implementing components, assembling composite applications, integrating Java, handling complex business logic, and maximizing code reuse. Runtime administration, governance, and security are covered in this practical resource.
• Get started with the Oracle SOA Suite 12c development and run time environment
• Deploy and manage SOA composite applications
• Expose SOAP/XML REST/JSON through Oracle Service Bus
• Establish interactions through adapters for Database, JMS, File/FTP, UMS, LDAP, and Coherence
• Embed custom logic using Java and the Spring component
• Perform fast data analysis in real time with Oracle Event Processor
• Implement Event-Driven Architecture based on the Event Delivery Network (EDN)
• Use Oracle Business Rules to encapsulate logic and automate decisions
• Model complex processes using BPEL, BPMN, and human task components
• Establish KPIs and evaluate performance using Oracle Business Activity Monitoring
• Control traffic, audit system activity, and encrypt sensitive data
Now, in just 24 lessons of one hour or less, you can learn how to leverage MongoDB's immense power. Each short, easy lesson builds on all that's come before, teaching NoSQL concepts and MongoDB techniques from the ground up.
Sams Teach Yourself NoSQL with MongoDB in 24 Hours covers all this, and much more:
Predictive analytics and Data Mining techniques covered: Exploratory Data Analysis, Visualization, Decision trees, Rule induction, k-Nearest Neighbors, Naïve Bayesian, Artificial Neural Networks, Support Vector machines, Ensemble models, Bagging, Boosting, Random Forests, Linear regression, Logistic regression, Association analysis using Apriori and FP Growth, K-Means clustering, Density based clustering, Self Organizing Maps, Text Mining, Time series forecasting, Anomaly detection and Feature selection. Implementation files can be downloaded from the book companion site at www.LearnPredictiveAnalytics.com
• Demystifies data mining concepts with easy-to-understand language
• Shows how to get up and running fast with 20 commonly used powerful techniques for predictive analysis
• Explains the process of using open source RapidMiner tools
• Discusses a simple 5-step process for implementing algorithms that can be used for performing predictive analytics
• Includes practical use cases and examples
“Cindi has created, with her typical attention to details that matter, a contemporary forward-looking guide that organizations could use to evaluate existing or create a foundation for evolving business intelligence / analytics programs. The book touches on strategy, value, people, process, and technology, all of which must be considered for program success. Among other topics, the data, data warehousing, and ROI comments were spot on. The ‘technobabble’ chapter was brilliant!” —Bill Frank, Business Intelligence and Data Warehousing Program Manager, Johnson & Johnson
“If you want to be an analytical competitor, you’ve got to go well beyond business intelligence technology. Cindi Howson has wrapped up the needed advice on technology, organization, strategy, and even culture in a neat package. It’s required reading for quantitatively oriented strategists and the technologists who support them.” —Thomas H. Davenport, President’s Distinguished Professor, Babson College and co-author, Competing on Analytics
“Cindi has created an exceptional, authoritative description of the end-to-end business intelligence ecosystem. This is a great read for those who are just trying to better understand the business intelligence space, as well as for the seasoned BI practitioner.” —Sully McConnell, Vice President, Business Intelligence and Information Management, Time Warner Cable
“Cindi’s book succinctly yet completely lays out what it takes to deliver BI successfully. IT and business leaders will benefit from Cindi’s deep BI experience, which she shares through helpful, real-world definitions, frameworks, examples, and stories. This is a must-read for companies engaged in – or considering – BI.” —Barbara Wixom, PhD, Principal Research Scientist, MIT Sloan Center for Information Systems Research
Expanded to cover the latest advances in business intelligence such as big data, cloud, mobile, visual data discovery, and in-memory computing, this fully updated bestseller by BI guru Cindi Howson provides cutting-edge techniques to exploit BI for maximum value. Successful Business Intelligence: Unlock the Value of BI & Big Data, Second Edition describes best practices for an effective BI strategy. Find out how to:
• Garner executive support to foster an analytic culture
• Align the BI strategy with business goals
• Develop an analytic ecosystem to exploit data warehousing, analytic appliances, and Hadoop for the right BI workload
• Continuously improve the quality, breadth, and timeliness of data
• Find the relevance of BI for everyone in the company
• Use agile development processes to deliver BI capabilities and improvements at the pace of business change
• Select the right BI tools to meet user and business needs
• Measure success in multiple ways
• Embrace innovation, promote successes and applications, and invest in training
• Monitor your evolution and maturity across various factors for impact
Exclusive industry survey data and real-world case studies from Medtronic, Macy’s, 1-800 CONTACTS, The Dow Chemical Company, Netflix, Constant Contact, and other companies show successful BI initiatives in action.
From Moneyball to Nate Silver, BI and big data have permeated our cultural, political, and economic landscape. This timely, up-to-date guide reveals how to plan and deploy an agile, state-of-the-art BI solution that links insight to action and delivers a sustained competitive advantage.
Using Hadoop 2 exclusively, author Tom White presents new chapters on YARN and several Hadoop-related projects such as Parquet, Flume, Crunch, and Spark. You’ll learn about recent changes to Hadoop, and explore new case studies on Hadoop’s role in healthcare systems and genomics data processing.
• Learn fundamental components such as MapReduce, HDFS, and YARN
• Explore MapReduce in depth, including steps for developing applications with it
• Set up and maintain a Hadoop cluster running HDFS and MapReduce on YARN
• Learn two data formats: Avro for data serialization and Parquet for nested data
• Use data ingestion tools such as Flume (for streaming data) and Sqoop (for bulk data transfer)
• Understand how high-level data processing tools like Pig, Hive, Crunch, and Spark work with Hadoop
• Learn the HBase distributed database and the ZooKeeper distributed configuration service
Did You Know?
- Knowledge of SQL is an important skill to display on your resume.
- With the growth of digital information, Database Administrator is one of the fastest growing careers.
- SQL can be learned in hours and used for decades.
Learn to script Transact-SQL (T-SQL) using Microsoft SQL Server.
- Create tables and databases
- Create views, stored procedures, and more.
Over 100 examples of SQL queries and statements, along with images of results, will help you learn T-SQL.
A special section included in this illustrated guide will help you test your skills and get ahead in the workplace.
Now is the time to learn SQL.
Click the 'buy button' and start scripting SQL TODAY!
Written by Oracle ACE Director and MySQL expert Ronald Bradford, Effective MySQL: Optimizing SQL Statements is filled with detailed explanations and practical examples that can be applied immediately to improve database and application performance. Featuring a step-by-step approach to SQL optimization, this Oracle Press book helps you analyze and tune problematic SQL statements.
- Identify the essential analysis commands for gathering and diagnosing issues
- Learn how different index theories are applied and represented in MySQL
- Plan and execute informed SQL optimizations
- Create MySQL indexes to improve query performance
- Master the MySQL query execution plan
- Identify key configuration variables that impact SQL execution and performance
- Apply the SQL optimization lifecycle to capture, identify, confirm, analyze, and optimize SQL statements and verify the results
- Improve index utilization with covering indexes and partial indexes
- Learn hidden performance tips for improving index efficiency and simplifying SQL statements
—From the Foreword by Raymie Stata, CEO of Altiscale
The Insider’s Guide to Building Distributed, Big Data Applications with Apache Hadoop™ YARN
Apache Hadoop is helping drive the Big Data revolution. Now, its data processing has been completely overhauled: Apache Hadoop YARN provides resource management at data center scale and easier ways to create distributed applications that process petabytes of data. And now in Apache Hadoop™ YARN, two Hadoop technical leaders show you how to develop new applications and adapt existing code to fully leverage these revolutionary advances.
YARN project founder Arun Murthy and project lead Vinod Kumar Vavilapalli demonstrate how YARN increases scalability and cluster utilization, enables new programming models and services, and opens new options beyond Java and batch processing. They walk you through the entire YARN project lifecycle, from installation through deployment.
You’ll find many examples drawn from the authors’ cutting-edge experience—first as Hadoop’s earliest developers and implementers at Yahoo! and now as Hortonworks developers moving the platform forward and helping customers succeed with it.
Coverage includes:
- YARN’s goals, design, architecture, and components, and how it expands the Apache Hadoop ecosystem
- Exploring YARN on a single node
- Administering YARN clusters and Capacity Scheduler
- Running existing MapReduce applications
- Developing a large-scale clustered YARN application
- Discovering new open source frameworks that run under YARN
Complete with illustrations and helpful hints, this fifth edition provides a valuable one-stop overview of Oracle Database 12c, including an introduction to Oracle and cloud computing. Oracle Essentials provides the conceptual background you need to understand how Oracle truly works.
Topics include:
- A complete overview of Oracle databases and data stores, and Fusion Middleware products and features
- Core concepts and structures in Oracle’s architecture, including pluggable databases
- Oracle objects and the various datatypes Oracle supports
- System and database management, including Oracle Enterprise Manager 12c
- Security options, basic auditing capabilities, and options for meeting compliance needs
- Performance characteristics of disk, memory, and CPU tuning
- Basic principles of multiuser concurrency
- Oracle’s online transaction processing (OLTP)
- Data warehouses, Big Data, and Oracle’s business intelligence tools
- Backup and recovery, and high availability and failover solutions
Hacking Web Intelligence shows you how to dig into the Web and uncover the information many don't even know exists. The book takes a holistic approach that covers not only how to use tools to find information online but also how to link all the information and transform it into presentable and actionable intelligence. You will also learn how to secure your information online to prevent it from being discovered by these reconnaissance methods.
Hacking Web Intelligence is an in-depth technical reference covering the methods and techniques you need to unearth open source information from the Internet and utilize it for the purpose of targeted attack during a security assessment. This book will introduce you to many new and leading-edge reconnaissance, information gathering, and open source intelligence methods and techniques, including metadata extraction tools, advanced search engines, advanced browsers, power searching methods, online anonymity tools such as TOR and i2p, OSINT tools such as Maltego, Shodan, Creepy, SearchDiggity, Recon-ng, Social Network Analysis (SNA), Darkweb/Deepweb, data visualization, and much more.
- Provides a holistic approach to OSINT and Web recon, showing you how to fit all the data together into actionable intelligence
- Focuses on hands-on tools such as TOR, i2p, Maltego, Shodan, Creepy, SearchDiggity, Recon-ng, FOCA, EXIF, Metagoofil, MAT, and many more
- Covers key technical topics such as metadata searching, advanced browsers and power searching, online anonymity, Darkweb / Deepweb, Social Network Analysis (SNA), and how to manage, analyze, and visualize the data you gather
- Includes hands-on technical examples and case studies, as well as a Python chapter that shows you how to create your own information-gathering tools and modify existing APIs
This comprehensive new volume shows you how to compile PostgreSQL from source, create a database, and configure PostgreSQL to accept client-server connections. It also covers the many advanced features, such as transactions, versioning, replication, and referential integrity, that enable developers and DBAs to use PostgreSQL for serious business applications. The thorough introduction to PostgreSQL's PL/pgSQL programming language explains how you can use this very useful but under-documented feature to develop stored procedures and triggers. The book includes a complete command reference, and database administrators will appreciate the chapters on user management, database maintenance, and backup & recovery. With Practical PostgreSQL, you will quickly discover why this open source database is such a strong alternative to proprietary products from Oracle, IBM, and Microsoft.
Peter Christen’s book is divided into three parts: Part I, “Overview”, introduces the subject by presenting several sample applications and their special challenges, as well as a general overview of a generic data matching process. Part II, “Steps of the Data Matching Process”, then details its main steps like pre-processing, indexing, field and record comparison, classification, and quality evaluation. Lastly, Part III, “Further Topics”, deals with specific aspects like privacy, real-time matching, or matching unstructured data. Finally, it briefly describes the main features of many research and open source systems available today.
By providing the reader with a broad range of data matching concepts and techniques and touching on all aspects of the data matching process, this book helps researchers as well as students specializing in data quality or data matching to familiarize themselves with recent research advances and to identify open research challenges in the area of data matching. To this end, each chapter of the book includes a final section that provides pointers to further background and research material. Practitioners will better understand the current state of the art in data matching as well as the internal workings and limitations of current systems. In particular, they will learn that it is often not feasible to simply implement an existing off-the-shelf data matching system without substantial adaptation and customization. Such practical considerations are discussed for each of the major steps in the data matching process.
The book offers a rich blend of theory and practice. It is suitable for students, researchers and practitioners interested in Web mining and data mining both as a learning text and as a reference book. Professors can readily use it for classes on data mining, Web mining, and text mining. Additional teaching materials such as lecture slides, datasets, and implemented algorithms are available online.
The first part provides an introduction to basic procedures for handling and operating with text strings. Then, it reviews major mathematical modeling approaches. Statistical and geometrical models are also described along with main dimensionality reduction methods. Finally, it presents some specific applications such as document clustering, classification, search and terminology extraction.
All descriptions presented are supported with practical examples that are fully reproducible. Further reading, as well as additional exercises and projects, are proposed at the end of each chapter for those readers interested in conducting further experimentation.
The glossary defines over 50 R terms using SAS/SPSS jargon and again using R jargon. The table of contents and the index allow you to find equivalent R functions by looking up both SAS statements and SPSS commands. When finished, you will be able to import data, manage and transform it, create publication quality graphics, and perform basic statistical analyses.
This new edition has updated programming, an expanded index, and even more statistical methods covered in over 25 new sections.
Social Network Data Analytics covers an important niche in the social network analytics field. This edited volume, contributed by prominent researchers in this field, presents a wide selection of topics on social network data mining such as Structural Properties of Social Networks, Algorithms for Structural Discovery of Social Networks and Content Analysis in Social Networks. This book is also unique in focussing on the data analytical aspects of social networks in the internet scenario, rather than the traditional sociology-driven emphasis prevalent in the existing books, which do not focus on the unique data-intensive characteristics of online social networks. Emphasis is placed on simplifying the content so that students and practitioners benefit from this book.
This book targets advanced level students and researchers concentrating on computer science as a secondary text or reference book. Data mining, database, information security, electronic commerce and machine learning professionals will find this book a valuable asset, as well as primary associations such as ACM, IEEE and Management Science.
Levy profiles the imaginative brainiacs who found clever and unorthodox solutions to computer engineering problems. They had a shared sense of values, known as "the hacker ethic," that still thrives today. Hackers captures a seminal period in recent history when underground activities blazed a trail for today's digital world, from MIT students finagling access to clunky computer-card machines to the DIY culture that spawned the Altair and the Apple II.
FileMaker Pro: The Missing Manual approaches FileMaker the way FileMaker approaches you: it's user-friendly and seemingly straightforward enough, but it offers plenty of substance worthy of deeper exploration. Packed with practical information as well as countless expert tips and invaluable guidance, it's an in-depth guide to designing and building useful databases with the powerful and pliable FileMaker Pro.
Covering FileMaker for both Windows and Macintosh, FileMaker Pro: The Missing Manual is ideal for small business users, home users, school teachers, developers--anyone who wants to organize information efficiently and effectively. Whether you want to run a business, publish a shopping cart on the Web, plan a wedding, manage a student information system at your school, or program databases for clients, this book delivers.
Author Geoff Coffey has many years of experience using FileMaker Pro (he was, in fact, an early beta tester for the product). Author Susan Prosser is a FileMaker Certified Developer who trains other developers. Together, Coffey and Prosser show you how to:
- Get FileMaker up and running quickly and smoothly
- Import and organize information with ease
- Design relational databases that are simple to use, yet powerful
- Take advantage of FileMaker Pro calculation capabilities
- Automate processes with scripting
- Customize FileMaker Pro to your needs and preferences
- Share information with other people (coworkers, clients, and customers) and other programs
- Understand and select the best security options
Topics that could easily come across as dry and intimidating, such as relational theory, calculations, and scripting, are presented in a way that is interesting and intuitive to mainstream users. In no time, you'll be working more productively and efficiently using FileMaker Pro.
Based on his nine years of experience as a program manager for Internet Explorer, and lead program manager for Windows and MSN, Berkun explains to technical and non-technical readers alike what it takes to get through a large software or web development project. Making Things Happen doesn't cite specific methods, but focuses on philosophy and strategy. Unlike other project management books, Berkun offers personal essays in a comfortable style and easy tone that emulate the relationship of a wise project manager who gives good, entertaining and passionate advice to those who ask.
Topics in this new edition include:
- How to make things happen
- Making good decisions
- Specifications and requirements
- Ideas and what to do with them
- How not to annoy people
- Leadership and trust
- The truth about making dates
- What to do when things go wrong
Complete with a new foreword from the author and a discussion guide for forming reading groups/teams, Making Things Happen offers in-depth exercises to help you apply lessons from the book to your job. It is inspiring, funny, honest, and compelling, and definitely the one book that you and your team need to have within arm's reach throughout the life of your project.
Coming from the rare perspective of someone who fought difficult battles on Microsoft's biggest projects and taught project design and management for MSTE, Microsoft's internal best practices group, this is valuable advice indeed. It will serve you well with your current work, and on future projects to come.
Where did they get the ideas that made them rich? How did they convince investors to back them? What went wrong, and how did they recover?
Nearly all technical people have thought of one day starting or working for a startup. For them, this book is the closest you can come to being a fly on the wall at a successful startup, to learn how it's done.
But ultimately these interviews are required reading for anyone who wants to understand business, because startups are business reduced to its essence. The reason their founders become rich is that startups do what businesses do—create value—more intensively than almost any other part of the economy. How? What are the secrets that make successful startups so insanely productive? Read this book, and let the founders themselves tell you.
Whether you're an aspiring manager, a current manager, or just wondering what the heck a manager does all day, there is a story in this book that will speak to you.
ASP.NET MVC 5 contains a number of advances over previous versions, including the ability to define routes using C# attributes and the ability to override filters. The user experience of building MVC applications has also been substantially improved. The new, more tightly integrated, Visual Studio 2013 IDE has been created specifically with MVC application development in mind and provides a full suite of tools to improve development times and assist in reporting, debugging and deploying your code.
Best-selling author Adam Freeman explains how to get the most from AngularJS. He begins by describing the MVC pattern and the many benefits that can be gained from separating your logic and presentation code. He then shows how you can use AngularJS's features within your projects to produce professional-quality results. Starting from the nuts and bolts and building up to the most advanced and sophisticated features, the book carefully unwraps AngularJS, going in depth to give you the knowledge you need.
Each topic is covered clearly and concisely and is packed with the details you need to learn to be truly effective. The most important features are given a no-nonsense in-depth treatment and chapters include common problems and details of how to avoid them.
Google BigQuery Analytics is the perfect guide for business and data analysts who want the latest tips on running complex queries and writing code to communicate with the BigQuery API. The book uses real-world examples to demonstrate current best practices and techniques, and also explains and demonstrates streaming ingestion, transformation via Hadoop in Google Compute Engine, AppEngine datastore integration, and using GViz with Tableau to generate charts of query results. In addition to the mechanics of BigQuery, the book also covers the architecture of the underlying Dremel query engine, providing a thorough understanding that leads to better query results.
- Features a companion website that includes all code and data sets from the book
- Uses real-world examples to explain everything analysts need to know to effectively use BigQuery
- Includes web application examples coded in Python
Written by world-renowned forensic practitioners, this book uses the most current examination and analysis techniques in the field. It consists of 9 chapters that cover a range of topics such as the open source examination platform; disk and file system analysis; Windows systems and artifacts; Linux systems and artifacts; Mac OS X systems and artifacts; Internet artifacts; and automating analysis and extending capabilities. The book lends itself to use by students and those entering the field who do not have means to purchase new tools for different investigations.
This book will appeal to forensic practitioners from areas including incident response teams and computer forensic investigators; forensic technicians from legal, audit, and consulting firms; and law enforcement agencies.
- Written by world-renowned forensic practitioners
- Details core concepts and techniques of forensic file system analysis
- Covers analysis of artifacts from the Windows, Mac, and Linux operating systems
The top software developers are ten times more productive than average developers. Ten times. You can’t afford not to hire them. But if you haven’t been reading Joel Spolsky’s books or blog, you probably don’t know how to find them and make them want to work for you.
In this brief book, Joel reveals all his secrets for recruiting the best developers in the world, drawn from his years at Microsoft and as the co-founder of Fog Creek Software.
If you’ve ever wondered what you should be looking for in a resume, if you’ve ever struggled to decide whether to hire someone at the end of an interview, or if you’re wondering why you can’t find great programmers, stop everything and read this book.