Many good project ideas fail before they even start because they require sensitive personal data. The good news: a synthetic version of this data does not need protection. Synthetic data copies the structure and statistical properties of the actual data without recreating personally identifiable information. The bad news: it is difficult to create synthetic data for open-access use without reproducing the actual data exactly. This talk gives hands-on insights into synthetic data creation and the challenges along its lifecycle. We will learn how to create and evaluate synthetic data for any use case using the open-source package Synthetic Data Vault. We will also find out why it takes so long to synthesize the huge amount of data lying dormant in public administration. The talk addresses data owners who want to provide access to their private data as well as analysts looking to use synthetic data. After this session, listeners will know which steps to take to generate synthetic data for multi-purpose use, and what its limitations are for real-world analyses.
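To make the create-and-evaluate idea concrete, here is a minimal sketch in plain Python. It does not use Synthetic Data Vault; it assumes a far simpler technique (fitting each column independently and sampling from it), and the toy records, column names, and function names are all hypothetical. It shows the two steps the abstract describes: generating rows that follow the original columns' distributions, and checking how many synthetic rows are exact copies of real ones.

```python
import random
import statistics
from collections import Counter


def fit_marginals(rows):
    """Learn one simple distribution per column, treating columns as independent."""
    model = {}
    for col in rows[0]:
        values = [r[col] for r in rows]
        if all(isinstance(v, (int, float)) for v in values):
            # Numeric column: summarize by mean and standard deviation.
            model[col] = ("numeric", statistics.mean(values), statistics.stdev(values))
        else:
            # Categorical column: keep the categories and their frequencies.
            counts = Counter(values)
            model[col] = ("categorical", list(counts), list(counts.values()))
    return model


def sample(model, n, seed=0):
    """Draw n synthetic rows from the fitted per-column distributions."""
    rng = random.Random(seed)
    out = []
    for _ in range(n):
        row = {}
        for col, spec in model.items():
            if spec[0] == "numeric":
                row[col] = round(rng.gauss(spec[1], spec[2]), 1)
            else:
                row[col] = rng.choices(spec[1], weights=spec[2], k=1)[0]
        out.append(row)
    return out


def exact_copies(real, synthetic):
    """Count synthetic rows that are identical to some real row."""
    real_set = {tuple(sorted(r.items())) for r in real}
    return sum(tuple(sorted(s.items())) in real_set for s in synthetic)


# Hypothetical toy records standing in for sensitive data.
real = [
    {"age": 34, "income": 41000.0, "region": "north"},
    {"age": 51, "income": 58000.0, "region": "south"},
    {"age": 29, "income": 36500.0, "region": "north"},
    {"age": 45, "income": 62000.0, "region": "south"},
    {"age": 38, "income": 47000.0, "region": "north"},
]

model = fit_marginals(real)
synthetic = sample(model, 100)

print("real mean age:     ", statistics.mean(r["age"] for r in real))
print("synthetic mean age:", statistics.mean(s["age"] for s in synthetic))
print("exact copies:      ", exact_copies(real, synthetic))
```

Sampling columns independently preserves each column's distribution but discards the relationships between columns, which is one reason such data has limits for real-world analyses. Dedicated tools model those relationships, and that in turn raises the risk of reproducing real records.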
PyConDE & PyData Berlin 2023