Information Retrieval Models: Foundations and Relationships

Morgan & Claypool Publishers

Information Retrieval (IR) models are a core component of IR research and IR systems. The past decade brought a consolidation of the family of IR models, which by 2000 consisted of relatively isolated views on TF-IDF (Term-Frequency times Inverse-Document-Frequency) as the weighting scheme in the vector-space model (VSM), the probabilistic relevance framework (PRF), the binary independence retrieval (BIR) model, BM25 (Best-Match Version 25, the main instantiation of the PRF/BIR), and language modelling (LM). The early 2000s also saw the arrival of divergence from randomness (DFR). Regarding intuition and simplicity, though LM is clear from a probabilistic point of view, several people stated: "It is easy to understand TF-IDF and BM25. For LM, however, we understand the math, but we do not fully understand why it works." This book takes a horizontal approach, gathering the foundations of TF-IDF, PRF, BIR, Poisson, BM25, LM, probabilistic inference networks (PINs), and divergence-based models. The aim is to create a consolidated and balanced view of the main models. A particular focus of this book is on the "relationships between models." This includes an overview of the main frameworks (PRF, logical IR, VSM, generalized VSM) and a pairing of TF-IDF with other models. It becomes evident that TF-IDF and LM measure the same thing, namely the dependence (overlap) between document and query. The Poisson probability helps to establish probabilistic, non-heuristic roots for TF-IDF, and the Poisson parameter, average term frequency, is a binding link between several retrieval models and model parameters.

Table of Contents: List of Figures / Preface / Acknowledgments / Introduction / Foundations of IR Models / Relationships Between IR Models / Summary & Research Outlook / Bibliography / Author's Biography / Index
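
As a rough illustration of the TF-IDF weighting named in the description (term frequency times inverse document frequency), the sketch below scores a few toy documents against a query. It is not taken from the book: the function name `tf_idf_scores`, the whitespace tokenization, the plain `log(N / df)` form of IDF, and the sample documents are all assumptions made for this example.

```python
import math
from collections import Counter


def tf_idf_scores(query, docs):
    """Score each document by summing tf * idf over the query terms.

    Assumptions (hypothetical, for illustration only): lowercase
    whitespace tokenization, raw term counts as tf, and
    idf = log(N / df) with the natural logarithm.
    """
    n_docs = len(docs)
    tokenized = [doc.lower().split() for doc in docs]
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))

    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if df[term] == 0:
                continue  # term occurs in no document, contributes nothing
            idf = math.log(n_docs / df[term])
            score += tf[term] * idf
        scores.append(score)
    return scores


# Toy collection, invented for this sketch.
docs = ["sailing boats sailing", "boats and cars", "cars on roads"]
scores = tf_idf_scores("sailing boats", docs)
for doc, score in zip(docs, scores):
    print(f"{score:.4f}  {doc}")
```

The first document ranks highest because it contains the rarer query term twice; the last one shares no term with the query and scores zero.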

About the author

Queen Mary University of London


Additional Information

Publisher
Morgan & Claypool Publishers
Published on
Jul 1, 2013
Pages
163
ISBN
9781627050791
Language
English
Genres
Computers / Information Theory
Computers / System Administration / Storage & Retrieval
Language Arts & Disciplines / Library & Information Science / General
Content Protection
This content is DRM protected.

These proceedings contain the papers presented at ECIR 2010, the 32nd European Conference on Information Retrieval. The conference was organized by the Knowledge Media Institute (KMi), the Open University, in co-operation with Dublin City University and the University of Essex, and was supported by the Information Retrieval Specialist Group of the British Computer Society (BCS-IRSG) and the Special Interest Group on Information Retrieval (ACM SIGIR). It was held during March 28-31, 2010 in Milton Keynes, UK. ECIR 2010 received a total of 202 full-paper submissions from Continental Europe (40%), UK (14%), North and South America (15%), Asia and Australia (28%), Middle East and Africa (3%). All submitted papers were reviewed by at least three members of the international Program Committee. Out of the 202 papers, 44 were selected as full research papers. ECIR has always been a conference with a strong student focus. To allow as much interaction between delegates as possible and to keep in the spirit of the conference we decided to run ECIR 2010 as a single-track event. As a result we decided to have two presentation formats for full papers. Some of them were presented orally, the others in poster format. The presentation format does not represent any difference in quality. Instead, the presentation format was decided after the full papers had been accepted at the Program Committee meeting held at the University of Essex. The views of the reviewers were then taken into consideration to select the most appropriate presentation format for each paper.
As information becomes more ubiquitous and the demands that searchers have on search systems grow, there is a need to support search behaviors beyond simple lookup. Information seeking is the process or activity of attempting to obtain information in both human and technological contexts. Exploratory search describes an information-seeking problem context that is open-ended, persistent, and multifaceted, and information-seeking processes that are opportunistic, iterative, and multitactical. Exploratory searchers aim to solve complex problems and develop enhanced mental capacities. Exploratory search systems support this through symbiotic human-machine relationships that provide guidance in exploring unfamiliar information landscapes. Exploratory search has gained prominence in recent years. There is an increased interest from the information retrieval, information science, and human-computer interaction communities in moving beyond the traditional turn-taking interaction model supported by major Web search engines, and toward support for human intelligence amplification and information use. In this lecture, we introduce exploratory search, relate it to relevant extant research, outline the features of exploratory search systems, discuss the evaluation of these systems, and suggest some future directions for supporting exploratory search. Exploratory search is a new frontier in the search domain and is becoming increasingly important in shaping our future world.

Table of Contents: Introduction / Defining Exploratory Search / Related Work / Features of Exploratory Search Systems / Evaluation of Exploratory Search Systems / Future Directions and Concluding Remarks

Visual information retrieval (VIR) is an active and vibrant research area that aims to provide means for organizing, indexing, annotating, and retrieving visual information (images and videos) from large, unstructured repositories. The goal of VIR is to retrieve matches ranked by their relevance to a given query, which is often expressed as an example image and/or a series of keywords. During its early years (1995-2000), the research efforts were dominated by content-based approaches contributed primarily by the image and video processing community. During the past decade, it was widely recognized that the challenges imposed by the lack of coincidence between an image's visual contents and its semantic interpretation, also known as the semantic gap, required a clever use of textual metadata (in addition to information extracted from the image's pixel contents) to make image and video retrieval solutions efficient and effective. The need to bridge (or at least narrow) the semantic gap has been one of the driving forces behind current VIR research. Additionally, other related research problems and market opportunities have started to emerge, offering a broad range of exciting problems for computer scientists and engineers to work on. In this introductory book, we focus on a subset of VIR problems where the media consists of images, and the indexing and retrieval methods are based on the pixel contents of those images -- an approach known as content-based image retrieval (CBIR). We present an implementation-oriented overview of CBIR concepts, techniques, algorithms, and figures of merit. Most chapters are supported by examples written in Java, using Lucene (an open-source Java-based indexing and search implementation) and LIRE (Lucene Image REtrieval), an open-source Java-based library for CBIR.
The study of people, information and communication technologies and the contexts in which these technologies are designed, implemented and used has long interested scholars in a wide range of disciplines, including the social study of computing, science and technology studies, the sociology of technology, and management information systems. As ICT use has spread from organizations into the larger world and these devices have become routine information appliances in our social lives, researchers have begun to ask deeper and more profound questions about how our lives have become bound up with technologies. A common theme running through this research is that the relationships among people, technology and context are dynamic, complex and critically important to understand. This synthesis lecture explores social informatics (SI), one important and dynamic approach that researchers have used to study these complex relationships. SI is "the interdisciplinary study of the design, uses and consequences of information technology that takes into account their interaction with institutional and cultural contexts" (Kling 1998, p.52; 1999). SI provides flexible frameworks to explore complex and dynamic sociotechnical interactions. As a domain of study related largely by common vocabulary and conclusions, SI critically examines common conceptions of and expectations for technology, by providing contextual evidence. This synthesis describes the evolution of SI research and identifies challenges and opportunities for future research. In what might be seen as an example of sociotechnical "natural selection", SI emerged in six different locations during the 1980s and 1990s: Norway, Slovenia, Japan, the former Soviet Union, the UK and, last, the US. As SI evolved, the version popularized in the US became globally dominant. The evolution of SI is presented in five stages: emergence, foundational, expansion, coherence, and transformation.
Thus, we divide SI research into five major periods: an emergence stage, when various forms of SI emerged around the globe, an early period of foundational work which grounds SI (pre-1990s), a period of expansion (1990s), a robust period of coherence and influence by Rob Kling (2000-2005), and a period of transformation (2006-Present). Following the description of the five periods we discuss the evolution throughout the periods under five sections: principles, concepts, approaches, topics, and findings. Principles refer to the overarching motivations and labels employed to describe scholarly work. Approaches describe the theories, frameworks, and models employed in analysis, emphasizing the multi-disciplinary and interdisciplinary nature of SI. Concepts include specific processes, entities, themes, and elements of discourse within a given context, revealing a shared SI language surrounding change, complexity, consequences, and social elements of technology. Topics label the issues and general domains studied within social informatics, ranging from scholarly communication to online communities to information systems. Findings from seminal SI works illustrate growing insights over time and demonstrate how repeatable explanations unify SI. In the concluding remarks, we raise questions about the possible futures of SI research.