Abstract: The vast availability of unstructured data presents a significant opportunity for the social sciences, yet there is a pressing need for better tools and infrastructure to access and use this data effectively. This talk will highlight how the Business and Economic Research Data Infrastructure program, BERD@NFDI, is addressing these needs, showcasing achievements and inviting further collaboration within the European social science community. At the same time, the fields of Natural Language Processing (NLP) and Large Language Models (LLMs) require high-quality training data. Social scientists have been collecting valuable data for decades, which can serve as essential benchmarks for advancing NLP and LLM research. By embracing open science, we can bridge the gap between social science and computational research, making this data more accessible and fostering collaboration across disciplines.
talk-data.com
Speaker
Frauke Kreuter
3 talks
Frauke Kreuter is Professor of Statistics and Data Science in Social Sciences and the Humanities at the Ludwig-Maximilians-University of Munich, Germany; co-director of the Social Data Science Center (SoDa) and faculty member in the Joint Program in Survey Methodology (JPSM) at the University of Maryland, USA; and, until recently, head of the statistical methods group at the Institute for Employment Research (IAB) in Nuremberg, Germany. She is an elected fellow of the American Statistical Association and the 2020 recipient of the Warren Mitofsky Innovators Award of the American Association for Public Opinion Research. In addition to her academic work, Dr. Kreuter is the founder of the International Program for Survey and Data Science, developed in response to the increasing demand from researchers and practitioners for appropriate methods and the right tools to face a changing data environment. She is also co-founder of the Coleridge Initiative, whose goal is to accelerate data-driven research and policy around human beings and their interactions for program management, policy development, and scholarly purposes by enabling efficient, effective, and secure access to sensitive data about society and the economy, and co-founder of the German-language podcast Dig Deep.
Bio from: Unstructured Data: Bridging Social Sciences & NLP/LLM Research Thru Open Science
Talks & appearances
3 activities · Newest first
Featuring a timely presentation of total survey error (TSE), this edited volume introduces valuable tools for understanding and improving survey data quality in the context of evolving large-scale data sets. This book provides an overview of the TSE framework and current TSE research as related to survey design, data collection, estimation, and analysis. It recognizes that survey data affects many public policy and business decisions and thus focuses on the framework for understanding and improving survey data quality. The book also addresses issues with data quality in official statistics and in social, opinion, and market research as these fields continue to evolve, leading to larger and messier data sets. This perspective challenges survey organizations to find ways to collect and process data more efficiently without sacrificing quality. The volume consists of the most up-to-date research and reporting from over 70 contributors representing the best academics and researchers from a range of fields. The chapters are broken out into five main sections: The Concept of TSE and the TSE Paradigm, Implications for Survey Design, Data Collection and Data Processing Applications, Evaluation and Improvement, and Estimation and Analysis. Each chapter introduces and examines multiple error sources, such as sampling error, measurement error, and nonresponse error, which often offer the greatest risks to data quality, while also encouraging readers not to lose sight of the less commonly studied error sources, such as coverage error, processing error, and specification error. The book also notes the relationships between errors and the ways in which efforts to reduce one type can increase another, resulting in an estimate with larger total error.
This book:
• Features various error sources, and the complex relationships between them, in 25 high-quality chapters on the most up-to-date research in the field of TSE
• Provides comprehensive reviews of the literature on error sources as well as data collection approaches and estimation methods to reduce their effects
• Presents examples of recent international events that demonstrate the effects of data error, the importance of survey data quality, and the real-world issues that arise from these errors
• Spans the four pillars of the total survey error paradigm (design, data collection, evaluation and analysis) to address key data quality issues in official statistics and survey research
Total Survey Error in Practice is a reference for survey researchers and data scientists in research areas that include social science, public opinion, public policy, and business. It can also be used as a textbook or supplementary material for a graduate-level course in survey research methods.
Paul P. Biemer, PhD, is distinguished fellow at RTI International and associate director of Survey Research and Development at the Odum Institute, University of North Carolina, USA. Edith de Leeuw, PhD, is professor of survey methodology in the Department of Methodology and Statistics at Utrecht University, the Netherlands. Stephanie Eckman, PhD, is fellow at RTI International, USA. Brad Edwards is vice president, director of Field Services, and deputy area director at Westat, USA. Frauke Kreuter, PhD, is professor and director of the Joint Program in Survey Methodology, University of Maryland, USA; professor of statistics and methodology at the University of Mannheim, Germany; and head of the Statistical Methods Research Department at the Institute for Employment Research, Germany. Lars E. Lyberg, PhD, is senior advisor at Inizio, Sweden. N. Clyde Tucker, PhD, is principal survey methodologist at the American Institutes for Research, USA. Brady T. West, PhD, is research associate professor in the Survey Resea
Explore the practices and cutting-edge research on the new and exciting topic of paradata. Paradata are measurements related to the process of collecting survey data. Improving Surveys with Paradata: Analytic Uses of Process Information is the most accessible and comprehensive contribution to this up-and-coming area in survey methodology. Featuring contributions from leading experts in the field, Improving Surveys with Paradata: Analytic Uses of Process Information introduces and reviews issues involved in the collection and analysis of paradata. The book presents readers with an overview of the indispensable techniques and new, innovative research on improving survey quality and total survey error. Along with several case studies, topics include:
• Using paradata to monitor fieldwork activity in face-to-face, telephone, and web surveys
• Guiding intervention decisions during data collection
• Analysis of measurement, nonresponse, and coverage error via paradata
Providing a practical, encompassing guide to the subject of paradata, the book is aimed at both producers and users of survey data. The book also serves as an excellent resource for courses on data collection, survey methodology, and nonresponse and measurement error.