Arjun Sabharwal joined the University of Toledo Library faculty in January 2009 as Assistant Professor and Digital Initiatives Librarian. He holds a Master of Library and Information Science and a Graduate Certificate in Archival Administration in addition to previously earned graduate degrees. He oversees the digital preservation of archival collections, manages the Toledo's Attic virtual museum website, designs virtual exhibitions, leads the planning and implementation of UTOPIA (The University of Toledo OPen Institutional Archive) and the University of Toledo Digital Repository, and manages digitization projects. Current professional interests include archiving, digital humanities, digital history, and developing thematic research collections. He has authored several research articles and reviews and has presented at conferences on work related to archives and digital libraries. Since 2010, he has engaged in digital scholarship via his international blog on ResearchGate titled Digital Humanities and Archives.
"Building a Scalable Data Warehouse" covers everything one needs to know to create a scalable data warehouse end to end, including a presentation of the Data Vault modeling technique, which provides the foundations to create a technical data warehouse layer. The book discusses how to build the data warehouse incrementally using the agile Data Vault 2.0 methodology. In addition, readers will learn how to create the input layer (the stage layer) and the presentation layer (data mart) of the Data Vault 2.0 architecture including implementation best practices. Drawing upon years of practical experience and using numerous examples and an easy to understand framework, Dan Linstedt and Michael Olschimke discuss:
How to load each layer using SQL Server Integration Services (SSIS), including automation of the Data Vault loading processes.
Important data warehouse technologies and practices.
Data Quality Services (DQS) and Master Data Services (MDS) in the context of the Data Vault architecture.
Provides a complete introduction to data warehousing, applications, and the business context so readers can get up and running fast.
Explains theoretical concepts and provides hands-on instruction on how to build and implement a data warehouse.
Demystifies data vault modeling with beginning, intermediate, and advanced techniques.
Discusses the advantages of the data vault approach over other techniques, including the latest updates to Data Vault 2.0 and multiple improvements to Data Vault 1.0.
The book presents comprehensive information in a logical, easy-to-follow format, covering topics such as research strategies for library and information science doctoral students; planning for research; defining the problem, forming a theory, and testing the theory; the scientific method of inquiry and data collection techniques; survey research methods and questionnaires; analyzing quantitative data; interview-based research; writing research proposals; and even time management skills. LIS students and professionals can consult the text for instruction on conducting research using this array of tools as well as for guidance in critically reading and evaluating research publications, proposals, and reports.
The explanations and current research examples supplied by discipline experts offer advice and strategies for completing research projects, dissertations, and theses as well as for writing grants, overcoming writer's block, collaborating with colleagues, and working with outside consultants. The answer to nearly any question posed by novice researchers is provided in this book.
To help realize Big Data’s full potential, the book addresses numerous challenges, offering the conceptual and technological solutions for tackling them. These challenges include life-cycle data management, large-scale storage, flexible processing infrastructure, data modeling, scalable machine learning, data analysis algorithms, sampling techniques, and privacy and ethical issues.
Covers computational platforms supporting Big Data applications.
Addresses key principles underlying Big Data computing.
Examines key developments supporting next generation Big Data platforms.
Explores the challenges in Big Data computing and ways to overcome them.
Contains expert contributors from both academia and industry.