Understand the language and vocabulary of Data Architecture.
The Data Architecture field is rife with terms that have become “fashionable”. Some of the terms began with very specific, specialized meanings, but as their use spread, they lost the precision of their technical definitions and became, well, “buzzwords”.
A buzzword is “a word or expression from a particular subject area that has become fashionable because it has been used a lot”. Compliance is “the obeying of an accepted principle or instruction that states the way things are or should be done.”
The assignment is to take buzzwords and follow rules to use them correctly. We cut through the hype to arrive at buzzword compliance – the state where you fully understand the words that in fact have real meaning in the data architecture industry. This book will rationalize the various ways all these terms are defined.
Of necessity, the book must address all aspects of describing an enterprise and its data management technologies. This includes a wide range of subjects, from entity/relationship modeling, through the semantic web, to database issues like relational and “beyond relational” (“NoSQL”) approaches. In each case, the definitions for the subject are meant to be detailed enough to make it possible to understand basic principles—while recognizing that a full understanding will require consulting the sources where they are more completely described.
The book’s Glossary contains a catalogue of definitions and its Bibliography contains a comprehensive set of references.
Since the early 1980s, David Hay has been a pioneer in the use of process and data models to support strategic planning, requirements analysis and system design. He has developed enterprise models for many industries, including, among others, pharmaceutical research, oil refining and production, film and television, and nuclear energy. In each case, he found the relatively simple structures hidden in formidably complex situations. In addition to being a frequent speaker at international conferences, Mr. Hay has published several books and numerous articles.
In 1995, David Hay published Data Model Patterns: Conventions of Thought - the groundbreaking book on how to use standard data models to describe the standard business situations. Enterprise Model Patterns: Describing the World builds on the concepts presented there, adds 15 years of practical experience, and presents a more comprehensive view. You will learn how to apply both the abstract and concrete elements of your enterprise’s architectural data model through four levels of abstraction:
Level 0: An abstract template that underlies the Level 1 model that follows, plus two meta models:
• Information Resources. In addition to books, articles, and e-mail notes, it also includes photographs, videos, and sound recordings.
• Accounting. Accounting is remarkable because it is itself a modeling language. It takes a very different approach than data modelers in that instead of using entities and entity classes that represent things in the world, it is concerned with accounts that represent bits of value to the organization.
Level 1: An enterprise model that is generic enough to apply to any company or government agency, but concrete enough to be readily understood by all. It describes:
• People and Organization. Who is involved with the business? The people involved are not only the employees within the organization, but customers, agents, and others with whom the organization comes in contact. Organizations of interest include the enterprise itself and its own internal departments, as well as customers, competitors, government agencies, and the like.
• Geographic Locations. Where is business conducted? A geographic location may be either a geographic area (defined as any bounded area on the Earth), a geographic point (used to identify a particular location), or, if you are an oil company for example, a geographic solid (such as an oil reserve).
• Assets. What tangible items are used to carry out the business? These are any physical things that are manipulated, sometimes as products, but also as the means to producing products and services.
• Activities. How is the business carried out? This model not only covers services offered, but also projects and any other kinds of activities. In addition, the model describes the events that cause activities to happen.
• Time. All data is positioned in time, but some more than others.
Level 2: A more detailed model describing specific functional areas:
• Human Resources
• Communications and Marketing
• The Laboratory
Level 3: Examples of the details a model can have to address what is truly unique in a particular industry. Here you see how to address the unique bits in areas as diverse as:
• Criminal Justice. The model presented here is based on the “Global Justice XML Data Model” (GJXDM).
• Banking. The model presented here is the result of working for four different banks and then adding some thought to come up with something different from what is currently in any of them.
• Highways. The model here is derived from a project in a Canadian Provincial Highway Department, and addresses the question “what is a road?”
In this practical and comprehensive guide, author Martin Kleppmann helps you navigate this diverse landscape by examining the pros and cons of various technologies for processing and storing data. Software keeps changing, but the fundamental principles remain the same. With this book, software engineers and architects will learn how to apply those ideas in practice, and how to make full use of data in modern applications.
• Peer under the hood of the systems you already use, and learn how to use and operate them more effectively
• Make informed decisions by identifying the strengths and weaknesses of different tools
• Navigate the trade-offs around consistency, scalability, fault tolerance, and complexity
• Understand the distributed systems research upon which modern databases are built
• Peek behind the scenes of major online services, and learn from their architectures
It offers a view of the world being addressed by all the techniques, methods, and tools of the information processing industry (for example, object-oriented design, CASE, business process re-engineering, etc.) and presents several concepts that need to be addressed by such tools.
This book is pertinent because companies and government agencies, realizing that the data they use represent a significant corporate resource, recognize the need to integrate data that has traditionally only been available from disparate sources. An important component of this integration is management of the "metadata" that describe, catalogue, and provide access to the various forms of underlying business data. The "metadata repository" is essential to keep track of the various physical components of these systems and their semantics.
The book is ideal for data management professionals, data modeling and design professionals, and data warehouse and database repository designers.
• A comprehensive work based on the Zachman Framework for information architecture, encompassing the Business Owner's, Architect's, and Designer's views, for all columns (data, activities, locations, people, timing, and motivation)
• Provides a step-by-step description of the model and is organized so that different readers can benefit from different parts
• Provides a view of the world being addressed by all the techniques, methods and tools of the information processing industry (for example, object-oriented design, CASE, business process re-engineering, etc.)
• Presents many concepts that are not currently being addressed by such tools, and should be
Written by Wes McKinney, the creator of the Python pandas project, this book is a practical, modern introduction to data science tools in Python. It’s ideal for analysts new to Python and for Python programmers new to data science and scientific computing. Data files and related material are available on GitHub.
• Use the IPython shell and Jupyter notebook for exploratory computing
• Learn basic and advanced features in NumPy (Numerical Python)
• Get started with data analysis tools in the pandas library
• Use flexible tools to load, clean, transform, merge, and reshape data
• Create informative visualizations with matplotlib
• Apply the pandas groupby facility to slice, dice, and summarize datasets
• Analyze and manipulate regular and irregular time series data
• Learn how to solve real-world data analysis problems with thorough, detailed examples