The Taxobook: Principles and Practices of Building Taxonomies, Part 2 of a 3-Part Series

Morgan & Claypool Publishers

This book outlines the basic principles of creation and maintenance of taxonomies and thesauri. It also provides step-by-step instructions for building a taxonomy or thesaurus and discusses the various ways to get started on a taxonomy construction project. Often, the first step is to get management and budgetary approval, so I start this book with a discussion of reasons to embark on the taxonomy journey. From there I move on to a discussion of metadata and how taxonomies and metadata are related, and then consider how, where, and why taxonomies are used. Information architecture has its cornerstone in taxonomies and metadata. While a good discussion of information architecture is beyond the scope of this work, I do provide a brief discussion of the interrelationships among taxonomies, metadata, and information architecture. Moving on to the central focus of this book, I introduce the basics of taxonomies, including a definition of vocabulary control and why it is so important, how indexing and tagging relate to taxonomies, a few of the types of tagging, and a definition and discussion of post- and pre-coordinate indexing. After that I present the concept of a hierarchical structure for vocabularies and discuss the differences among various kinds of controlled vocabularies, such as taxonomies, thesauri, authority files, and ontologies. Once you have a green light for your project, what is the next step? Here I present a few options for the first phase of taxonomy construction and then a more detailed discussion of metadata and markup languages. I believe that it is important to understand the markup languages (SGML and XML specifically, and HTML to a lesser extent) in relation to information structure, and how taxonomies and metadata feed into that structure.
After that, I present the steps required to build a taxonomy, from defining the focus, collecting and organizing terms, analyzing your vocabulary for even coverage over subject areas, filling in gaps, creating relationships between terms, and applying those terms to your content. Here I offer a cautionary note: don’t believe that your taxonomy is “done!” Regular, scheduled maintenance is an important—critical, really—component of taxonomy construction projects. After you’ve worked through the steps in this book, you will be ready to move on to integrating your taxonomy into the workflow of your organization. This is covered in Book 3 of this series. Table of Contents: List of Figures / Preface / Acknowledgments / Building a Case for Building a Taxonomy / Taxonomy Basics / Getting Started / Terms: The Building Blocks of a Taxonomy / Building the Structure of Your Taxonomy / Evaluation and Maintenance / Standards and Taxonomies / Glossary / End Notes / Author Biography

About the author

Marjorie M.K. Hlava and her team have worked with or built over 600 controlled vocabularies. Their experience is shared with you in this book. Margie is well known internationally for her work in the implementation of information science principles and the ever-evolving technology that supports them. She and the team at Access Innovations have provided the "back room" operations for many information-related organizations over the last 40 years. Margie is very active in the main organizations concerned with those areas. She has served as president of NFAIS (the National Federation of Advanced Information Services); that organization awarded her the Anne Marie Cunningham Memorial Award for Exemplary Volunteer Service to the Federation in 2012, as well as the Miles Conrad lectureship in 2014. She has also served as president of the American Society for Information Science and Technology (ASIS&T), which has awarded her the prestigious Watson Davis Award and their top honor, the ASIS&T Award of Merit. She has served two terms on the Board of Directors of the Special Libraries Association (SLA); SLA has honored her with their President's Award for her work in standards and has made her a Fellow of the SLA for her many other services within the organization. More recently, she served as the founding chair of SLA's Taxonomy Division.

For five years, Margie was on the Board of the National Information Standards Organization (NISO), and she continues to serve on the Content and Collaboration Standards Topic Committee for NISO. She has also held numerous committee positions in these and other organizations. She convened the workshop leading to the ANSI/NISO thesaurus standard NISO Z39.19-2005, and was a member of the standards committee for its writing. She also acted as liaison to the British Standards Institute controlled vocabulary standards group to ensure that the U.S. and British standards would be compatible.

Margie is the founder and president of Access Innovations, Inc., which has been honored with many awards, including recognition several times by KMWorld Magazine as one of 100 Companies That Matter in Knowledge Management and as a Trend-Setting Product Company, as well as by EContent Magazine as one of 100 Companies That Matter Most in the Digital Content Industry. The company's information management services include thesaurus and taxonomy creation. Under Margie's guidance, Access Innovations has developed the Data Harmony® line of software for content creation, taxonomy management, and automated categorization for portals and data collections. The Data Harmony Suite is protected by two patents, numbers 6898586 and 8046212, and 21 patent claims. Her recognition of the value of automatic code suggestion for the medical industry led to the founding of Access Integrity and its Medical Claims Compliance system.

Margie's primary areas of research include automated indexing, thesaurus development, taxonomy creation, natural language processing, machine translations, and computer aided indexing. She has authored more than 200 published articles on these subjects. At industry and association meetings, she has given numerous workshops and presentations on thesaurus and taxonomy creation and maintenance.


Additional Information

Morgan & Claypool Publishers
Published on
Nov 1, 2014
Computers / Social Aspects / General
Computers / System Administration / Storage & Retrieval
Content Protection
This content is DRM protected.

This book is the third of a three-part series on taxonomies, and covers putting your taxonomy into use in as many ways as possible to maximize retrieval for your users. Chapter 1 suggests several items to research and consider before you start your implementation and integration process. It explores the different pieces of software that you will need for your system and what features to look for in each. Chapter 2 launches with a discussion of how taxonomy terms can be used within a workflow, connecting two or more taxonomies, and intelligent coordination of platforms and taxonomies. Microsoft SharePoint is a widely used and popular program, and I consider its use of taxonomies in this chapter. Following that is a discussion of taxonomies and semantic integration and then the relationship between indexing and the hierarchy of a taxonomy. Chapter 3 (“How is a Taxonomy Connected to Search?”) provides discussions and examples of putting taxonomies into use in practical applications. It discusses displaying content based on search; how taxonomy is connected to search; using a taxonomy to guide a searcher; tools for search, including search engines, crawlers and spiders, and search software; the parts of a search-capable system; and then how to assemble that search-capable system. This chapter also examines how to measure quality in search, the different kinds of search, and theories on search from several famous theoreticians: two from the 18th and 19th centuries, and two contemporary. Following that is a section on inverted files, parsing, discovery, and clustering. While you probably don’t need a comprehensive understanding of these concepts to build a solid, workable system, enough information is provided for the reader to see how they fit into the overall scheme. This chapter concludes with a look at faceted search and some possibilities for search interfaces.
Chapter 4, “Implementing a Taxonomy in a Database or on a Website,” starts where many content systems really should—with the authors, or at least the people who create the content. This chapter discusses matching up various groups of related data to form connections, data visualization and text analytics, and mobile and e-commerce applications for taxonomies. Finally, Chapter 5 presents some educated guesses about the future of knowledge organization. Table of Contents: List of Figures / Preface / Acknowledgments / On Your Mark, Get Ready .... WAIT! Things to Know Before You Start the Implementation Step / Taxonomy and Thesaurus Implementation / How is a Taxonomy Connected to Search? / Implementing a Taxonomy in a Database or on a Website / What Lies Ahead for Knowledge Organization? / Glossary / End Notes / Author Biography
This is the first volume in a series about creating and maintaining taxonomies and their practical applications, especially in search functions. In Book 1 (The Taxobook: History, Theories, and Concepts of Knowledge Organization), the author introduces the very foundations of classification, starting with the ancient Greek philosophers Plato and Aristotle, as well as Theophrastus and the Roman Pliny the Elder. They were first in a line of distinguished thinkers and philosophers to ponder the organization of the world around them and attempt to apply a structure or framework to that world. The author continues by discussing the works and theories of several other philosophers from Medieval and Renaissance times, including Saints Aquinas and Augustine, William of Occam, Andrea Cesalpino, Carl Linnaeus, and René Descartes. In the 17th, 18th, and 19th centuries, John Locke, Immanuel Kant, James Frederick Ferrier, Charles Ammi Cutter, and Melvil Dewey contributed greatly to the theories of classification systems and knowledge organization. Cutter and Dewey, especially, created systems that are still in use today. Chapter 8 covers the contributions of Shiyali Ramamrita Ranganathan, who is considered by many to be the “father of modern library science.” He created the concept of faceted vocabularies, which are widely used—even if they are not well understood—on many e-commerce websites. Following the discussions and historical review, the author has included a glossary that covers all three books of this series so that it can be referenced as you work your way through the second and third volumes. The author believes that it is important to understand the history of knowledge organization and the differing viewpoints of various philosophers—even if that understanding is only that the differing viewpoints simply exist. Knowing the differing viewpoints will help answer the fundamental questions: Why do we want to build taxonomies? 
How do we build them to serve multiple points of view? Table of Contents: List of Figures / Preface / Acknowledgments / Origins of Knowledge Organization Theory: Early Philosophy of Knowledge / Saints and Traits: Realism and Nominalism / Arranging the Flowers... and the Birds, and the Insects, and Everything Else: Early Naturalists and Taxonomies / The Age of Enlightenment Impacts Knowledge Theory / 18th-Century Developments: Knowledge Theory Coming to the Foreground / High Resolution: Classification Sharpens in the 19th and 20th Centuries / Outlining the World and Its Parts / Facets: An Indian Mathematician and Children’s Toys at Selfridge’s / Points of Knowledge / Glossary / End Notes / Author Biography
The study of people, information and communication technologies (ICTs), and the contexts in which these technologies are designed, implemented, and used has long interested scholars in a wide range of disciplines, including the social study of computing, science and technology studies, the sociology of technology, and management information systems. As ICT use has spread from organizations into the larger world and these devices have become routine information appliances in our social lives, researchers have begun to ask deeper and more profound questions about how our lives have become bound up with technologies. A common theme running through this research is that the relationships among people, technology and context are dynamic, complex and critically important to understand. This synthesis lecture explores social informatics (SI), one important and dynamic approach that researchers have used to study these complex relationships. SI is "the interdisciplinary study of the design, uses and consequences of information technology that takes into account their interaction with institutional and cultural contexts" (Kling 1998, p.52; 1999). SI provides flexible frameworks to explore complex and dynamic sociotechnical interactions. As a domain of study related largely by common vocabulary and conclusions, SI critically examines common conceptions of and expectations for technology by providing contextual evidence. This synthesis describes the evolution of SI research and identifies challenges and opportunities for future research. In what might be seen as an example of sociotechnical "natural selection", SI emerged in six different locations during the 1980s and 1990s: Norway, Slovenia, Japan, the former Soviet Union, the UK and, last, the US. As SI evolved, the version popularized in the US became globally dominant. The evolution of SI is presented in five stages: emergence, foundational, expansion, coherence, and transformation.
Thus, we divide SI research into five major periods: an emergence stage, when various forms of SI emerged around the globe; an early period of foundational work which grounds SI (Pre-1990s); a period of expansion (1990s); a robust period of coherence and influence by Rob Kling (2000-2005); and a period of transformation (2006-Present). Following the description of the five periods, we discuss the evolution throughout the periods under five sections: principles, concepts, approaches, topics, and findings. Principles refer to the overarching motivations and labels employed to describe scholarly work. Approaches describe the theories, frameworks, and models employed in analysis, emphasizing the multi-disciplinary and interdisciplinary nature of SI. Concepts include specific processes, entities, themes, and elements of discourse within a given context, revealing a shared SI language surrounding change, complexity, consequences, and social elements of technology. Topics label the issues and general domains studied within social informatics, ranging from scholarly communication to online communities to information systems. Findings from seminal SI works illustrate growing insights over time and demonstrate how repeatable explanations unify SI. In the concluding remarks, we raise questions about the possible futures of SI research.
Visual information retrieval (VIR) is an active and vibrant research area, which attempts to provide means for organizing, indexing, annotating, and retrieving visual information (images and videos) from large, unstructured repositories. The goal of VIR is to retrieve matches ranked by their relevance to a given query, which is often expressed as an example image and/or a series of keywords. During its early years (1995-2000), the research efforts were dominated by content-based approaches contributed primarily by the image and video processing community. During the past decade, it was widely recognized that the challenges imposed by the lack of coincidence between an image's visual contents and its semantic interpretation, also known as the semantic gap, required a clever use of textual metadata (in addition to information extracted from the image's pixel contents) to make image and video retrieval solutions efficient and effective. The need to bridge (or at least narrow) the semantic gap has been one of the driving forces behind current VIR research. Additionally, other related research problems and market opportunities have started to emerge, offering a broad range of exciting problems for computer scientists and engineers to work on. In this introductory book, we focus on a subset of VIR problems where the media consists of images, and the indexing and retrieval methods are based on the pixel contents of those images -- an approach known as content-based image retrieval (CBIR). We present an implementation-oriented overview of CBIR concepts, techniques, algorithms, and figures of merit. Most chapters are supported by examples written in Java, using Lucene (an open-source Java-based indexing and search implementation) and LIRE (Lucene Image REtrieval), an open-source Java-based library for CBIR.