talk-data.com

Topic: Data Governance

Tags: data_management, compliance, data_quality

417 activities tagged

Activity Trend

[Chart: activities per quarter, 2020-Q1 through 2026-Q1, peaking at 90 per quarter]

Activities

417 activities · Newest first

Snowflake Data Engineering

A practical introduction to data engineering on the powerful Snowflake cloud data platform. Data engineers create the pipelines that ingest raw data, transform it, and funnel it to the analysts and professionals who need it. The Snowflake cloud data platform provides a suite of productivity-focused tools and features that simplify building and maintaining data pipelines. In Snowflake Data Engineering, Snowflake Data Superhero Maja Ferle shows you how to get started.

In Snowflake Data Engineering you will learn how to:

• Ingest data into Snowflake from both cloud and local file systems
• Transform data using functions, stored procedures, and SQL
• Orchestrate data pipelines with streams and tasks, and monitor their execution
• Use Snowpark to run Python code in your pipelines
• Deploy Snowflake objects and code using continuous integration principles
• Optimize performance and costs when ingesting data into Snowflake

Snowflake Data Engineering reveals how Snowflake makes it easy to work with unstructured data, set up continuous ingestion with Snowpipe, and keep your data safe and secure with best-in-class data governance features. Along the way, you'll practice the most important data engineering tasks as you work through relevant hands-on examples. Throughout, author Maja Ferle shares design tips drawn from her years of experience to ensure your pipeline follows the best practices of software engineering, security, and data governance.

About the Technology: Pipelines that ingest and transform raw data are the lifeblood of business analytics, and data engineers rely on Snowflake to help them deliver those pipelines efficiently. Snowflake is a full-service cloud-based platform that handles everything from near-infinite storage and fast elastic compute to built-in AI/ML capabilities such as vector search, text-to-SQL, and code generation. This book gives you what you need to create effective data pipelines on the Snowflake platform.

About the Book: Snowflake Data Engineering guides you skill-by-skill through accomplishing on-the-job data engineering tasks using Snowflake. You'll start by building your first simple pipeline and then expand it with increasingly powerful features, including data governance and security, CI/CD, and even augmenting data with generative AI. You'll be amazed how far you can go in just a few short chapters!

What's Inside:

• Ingest data from the cloud, APIs, or Snowflake Marketplace
• Orchestrate data pipelines with streams and tasks
• Optimize performance and cost

About the Reader: For software developers and data analysts. Readers should know the basics of SQL and the cloud.

About the Author: Maja Ferle is a Snowflake Subject Matter Expert and a Snowflake Data Superhero who holds the SnowPro Advanced Data Engineer and SnowPro Advanced Data Analyst certifications.

Quotes:

"An incredible guide for going from zero to production with Snowflake." - Doyle Turner, Microsoft
"A must-have if you're looking to excel in the field of data engineering." - Isabella Renzetti, Data Analytics Consultant & Trainer
"Masterful! Unlocks the true potential of Snowflake for modern data engineers." - Shankar Narayanan, Microsoft
"Valuable insights will enhance your data engineering skills and lead to cost-effective solutions. A must read!" - Frédéric L'Anglais, Maxa
"Comprehensive, up-to-date, and packed with real-life code examples." - Albert Nogués, Danone
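
To give a flavor of the streams-and-tasks orchestration the book teaches, here is a minimal, hedged sketch using Snowpark Python (my illustration, not the book's code); the RAW_ORDERS and ORDERS_CLEAN tables and the TRANSFORM_WH warehouse are hypothetical:

```python
from snowflake.snowpark import Session

# Connection details are placeholders.
session = Session.builder.configs({
    "account": "<account>", "user": "<user>", "password": "<password>",
    "warehouse": "TRANSFORM_WH", "database": "DEMO", "schema": "PUBLIC",
}).create()

# A stream records row-level changes on the landing table.
session.sql(
    "CREATE OR REPLACE STREAM raw_orders_stream ON TABLE RAW_ORDERS"
).collect()

# A task polls every 5 minutes but only runs when the stream has new rows,
# moving cleaned records downstream.
session.sql("""
    CREATE OR REPLACE TASK load_orders_clean
      WAREHOUSE = TRANSFORM_WH
      SCHEDULE = '5 MINUTE'
      WHEN SYSTEM$STREAM_HAS_DATA('RAW_ORDERS_STREAM')
    AS
      INSERT INTO orders_clean
      SELECT order_id, TRY_TO_DECIMAL(amount) AS amount
      FROM raw_orders_stream
      WHERE METADATA$ACTION = 'INSERT'
""").collect()

# Tasks are created suspended; resume to start the schedule.
session.sql("ALTER TASK load_orders_clean RESUME").collect()
```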

AWS re:Invent 2024 - Data foundation in the age of generative AI (ANT302)

An unparalleled level of interest in generative AI is driving organizations of all sizes to rethink their data strategy. While data foundation constructs such as data pipelines, data architectures, data stores, and data governance need to evolve, some business imperatives stay constant: organizations want to remain cost-efficient while collaborating effectively across their data estate. In this session, learn how laying your data foundation on AWS provides the guidance and the building blocks to balance both needs and empowers organizations to grow their data strategy for building generative AI applications.


Coalesce 2024: Generative AI driven near-real-time operational analytics with zero-ETL and dbt Cloud

AWS offers the most scalable, highest-performing data services to keep up with the growing volume and velocity of data, helping organizations be data-driven in real time. AWS helps customers unify diverse data sources by investing in a zero-ETL future and enabling end-to-end data governance, so your teams are free to move faster with data. Data teams running dbt Cloud can deploy analytics code following software engineering best practices such as modularity, continuous integration and continuous deployment (CI/CD), and embedded documentation. In this session, we will dive deeper into how to get near-real-time insight on petabytes of transaction data using Amazon Aurora zero-ETL integration with Amazon Redshift and dbt Cloud for your generative AI workloads.
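
As a hedged illustration of the zero-ETL pattern described above (my sketch, not the session's code): once the Aurora integration materializes tables in Redshift, dashboards or applications can query them with no batch ETL job in between. This uses the open-source redshift_connector driver; the cluster endpoint, credentials, and the zeroetl_db.public.orders table are placeholders.

```python
import redshift_connector

conn = redshift_connector.connect(
    host="examplecluster.abc123.us-east-1.redshift.amazonaws.com",
    database="dev",
    user="awsuser",
    password="<password>",
)
cur = conn.cursor()
# The zero-ETL integration keeps this table in sync with Aurora, so the
# read reflects transactions from moments ago.
cur.execute("""
    SELECT order_status, COUNT(*) AS orders_last_hour
    FROM zeroetl_db.public.orders
    WHERE created_at > DATEADD(hour, -1, GETDATE())
    GROUP BY order_status
""")
for order_status, n in cur.fetchall():
    print(order_status, n)
```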

Speakers: Neela Kulkarni, Solutions Architect, AWS

Neeraja Rentachintala, Director of Product Management, Amazon

Read the blog to learn about the latest dbt Cloud features announced at Coalesce, designed to help organizations embrace analytics best practices at scale: https://www.getdbt.com/blog/coalesce-2024-product-announcements

Coalesce 2024: The journey to well-governed data products: A conversation with Dropbox and Atlan

With over 700 million users interacting with over 550 billion pieces of content and counting, technology leader Dropbox is no stranger to the importance of great data.

Join Cortney Worthy, Data Governance Lead at Dropbox, and Austin Kronz, Director of Data Strategy at Atlan, as they explore Dropbox's journey toward creating well-governed, trustworthy data products. The discussion will highlight Dropbox’s domain-focused approach to data governance and how a robust framework and federated ownership model ensure the right data reaches the right stakeholders.

Additionally, this session will discuss how tools like Atlan and dbt can be integrated into a data governance strategy to enhance and refine it.

Speakers: Austin Kronz, Director of Data Strategy, Atlan

Cortney Worthy, Data Governance Lead, Dropbox


Coalesce 2024: How we went from matching entities to chatting with our data

Entities, dimensions, and metrics all play crucial roles in allowing companies to create meaningful pictures of their data.

Max has spent the last 2.5 years using dbt at two different cleantech startups, experience he draws on to inform his approach to the challenge of matching and maintaining entities for more robust semantics. This talk will delve into the practical aspects of using the dbt Semantic Layer and Dot, an LLM Slack plugin, to provide matched insights directly to team members at Topanga.io.

Expect to learn about best practices in self-service, LLMs, and data governance, and how to leverage the dbt Semantic Layer effectively. It's a session geared toward beginners, intermediates, and pros alike.

Speaker: Max Richman, Head of Data and Financial Analysis, Topanga.io


Coalesce 2024: Financial verticals on one data platform: Storebrand's journey with Snowflake and dbt

Storebrand is a financial institution in Norway with several independent verticals (banking, pensions, insurance, asset management, properties) and horizontals across corporate and personal markets. To meet this matrix of data needs, Storebrand has had to develop several data warehouses. They're now two years into the journey of building a new data platform, based on Snowflake and dbt, with four people on the platform team, and they're now scaling that central platform to the whole enterprise with dbt Mesh. This requires automated infrastructure and permissions for regulatory compliance, data governance, a clear separation of duties between the platform team and the data teams (for example, who is responsible for data masking and GDPR deletions?), a great developer experience, and a lot of goodwill. This presentation will explain how they've done it so far and their plans for scaling from here.
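
On the masking question above, here is a hedged sketch of one common division of duties (my illustration, not Storebrand's actual setup): the platform team owns a Snowflake masking policy, and data teams attach it to their columns. The role, table, and policy names are hypothetical.

```python
from snowflake.snowpark import Session

# Connection details are placeholders.
session = Session.builder.configs({
    "account": "<account>", "user": "<user>", "password": "<password>",
}).create()

# Platform team defines the policy once: only roles with PII access see
# the raw value; everyone else gets a mask.
session.sql("""
    CREATE MASKING POLICY IF NOT EXISTS pii_email_mask AS (val STRING)
    RETURNS STRING ->
      CASE WHEN CURRENT_ROLE() IN ('PII_READER') THEN val
           ELSE '*** MASKED ***' END
""").collect()

# Data teams attach it to the sensitive columns they own.
session.sql("""
    ALTER TABLE customers MODIFY COLUMN email
    SET MASKING POLICY pii_email_mask
""").collect()
```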

Speaker: Eivind Berg, Tech Lead Data Platform, Storebrand


Databricks Data Intelligence Platform: Unlocking the GenAI Revolution

This book is your comprehensive guide to building robust generative AI solutions using the Databricks Data Intelligence Platform. Databricks is the fastest-growing data platform offering unified analytics and AI capabilities within a single governance framework, enabling organizations to streamline their data processing workflows, from ingestion to visualization. Additionally, Databricks provides features to train a high-quality large language model (LLM), whether you are looking at Retrieval-Augmented Generation (RAG) or fine-tuning. Databricks offers a scalable and efficient solution for processing large volumes of both structured and unstructured data, facilitating advanced analytics, machine learning, and real-time processing. In today's GenAI world, Databricks plays a crucial role in empowering organizations to extract value from their data effectively, driving innovation and gaining a competitive edge in the digital age. This book will not only help you master the Data Intelligence Platform but also help power your enterprise to the next level with a bespoke LLM unique to your organization.

Beginning with foundational principles, the book starts with a platform overview and explores features and best practices for ingestion, transformation, and storage with Delta Lake. Advanced topics include leveraging Databricks SQL for querying and visualizing large datasets, ensuring data governance and security with Unity Catalog, and deploying machine learning models and LLMs using Databricks MLflow for GenAI. Through practical examples, insights, and best practices, this book equips solution architects and data engineers with the knowledge to design and implement scalable data solutions, making it an indispensable resource for modern enterprises. Whether you are new to Databricks and trying to learn a new platform, a seasoned practitioner building data pipelines, data science models, or GenAI applications, or an executive who wants to communicate the value of Databricks to customers, this book is for you. With its extensive feature and best-practice deep dives, it also serves as an excellent reference guide if you are preparing for Databricks certification exams.

What You Will Learn:

• Foundational principles of Lakehouse architecture
• Key features including Unity Catalog, Databricks SQL (DBSQL), and Delta Live Tables
• The Databricks Intelligence Platform and its key functionalities
• Building and deploying GenAI applications, from data ingestion to model serving
• Databricks pricing, platform security, DBRX, and many more topics

Who This Book Is For: Solution architects, data engineers, data scientists, Databricks practitioners, and anyone who wants to deploy their GenAI solutions with the Data Intelligence Platform. This is also a handbook for senior execs who need to communicate the value of Databricks to customers. People who are new to the Databricks platform and want comprehensive insights will find the book accessible.
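
As a minimal, hedged sketch of the ingest-to-query flow the book covers (mine, not the book's code): land raw files, store them as a Delta table governed under Unity Catalog, and query with Spark SQL. The paths and the demo.analytics.events table name are hypothetical.

```python
from pyspark.sql import SparkSession

# On Databricks this returns the cluster's preconfigured session; locally
# you would also need the delta-spark package configured.
spark = SparkSession.builder.getOrCreate()

# Land raw CSV files (path is hypothetical).
raw = (spark.read
       .option("header", "true")
       .csv("/Volumes/demo/raw/events/"))

# Store as a Delta table under Unity Catalog's three-level namespace
# (catalog.schema.table), so it sits inside one governance framework.
(raw.write
    .format("delta")
    .mode("overwrite")
    .saveAsTable("demo.analytics.events"))

# Downstream consumers query it with Databricks SQL or Spark SQL.
spark.sql("""
    SELECT event_type, COUNT(*) AS n
    FROM demo.analytics.events
    GROUP BY event_type
""").show()
```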

Financial Data Engineering

Today, investment in financial technology and digital transformation is reshaping the financial landscape and generating many opportunities. Too often, however, engineers and professionals in financial institutions lack a practical and comprehensive understanding of the concepts, problems, techniques, and technologies necessary to build a modern, reliable, and scalable financial data infrastructure. This is where financial data engineering is needed. A data engineer developing a data infrastructure for a financial product needs not only technical data engineering skills but also a solid understanding of financial domain-specific challenges, methodologies, data ecosystems, providers, formats, technological constraints, identifiers, entities, standards, regulatory requirements, and governance. This book offers a comprehensive, practical, domain-driven approach to financial data engineering, featuring real-world use cases, industry practices, and hands-on projects.

You'll learn:

• The data engineering landscape in the financial sector
• Specific problems encountered in financial data engineering
• The structure, players, and particularities of the financial data domain
• Approaches to designing financial data identification and entity systems
• Financial data governance frameworks, concepts, and best practices
• The financial data engineering lifecycle from ingestion to production
• The varieties and main characteristics of financial data workflows
• How to build financial data pipelines using open source tools and APIs

Tamer Khraisha, PhD, is a senior data engineer and scientific author with more than a decade of experience in the financial sector.
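
One of the domain-specific topics listed, identifier systems, lends itself to a tiny worked example. The sketch below (my own illustration, not from the book) validates an ISIN's Luhn check digit, the kind of correctness gate a financial data pipeline applies at ingestion.

```python
def is_valid_isin(isin: str) -> bool:
    """Return True if `isin` is 12 characters with a correct check digit."""
    if len(isin) != 12 or not isin[:2].isalpha() or not isin.isalnum():
        return False
    # Expand letters to two digits each (A=10 ... Z=35); digits stay as-is.
    digits = "".join(str(int(c, 36)) for c in isin.upper())
    # Luhn: from the right, double every second digit and sum the digit sums.
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
        total += d // 10 + d % 10
    return total % 10 == 0

assert is_valid_isin("US0378331005")      # Apple Inc., a known-valid ISIN
assert not is_valid_isin("US0378331006")  # corrupted check digit
```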

Every organization today is exploring generative AI to drive value and push their business forward. But a common pitfall is that AI strategies often don't align with business objectives, leading companies to chase flashy tools rather than focusing on what truly matters. How can you avoid these traps and ensure your AI efforts are not only innovative but also aligned with real business value?

Leon Gordon is a leader in data analytics and AI, a current Microsoft Data Platform MVP based in the UK, and the founder of Onyx Data. During the last decade, he has helped organizations improve their business performance, use data more intelligently, and understand the implications of new technologies such as artificial intelligence and big data. Leon is an Executive Contributor to Brainz Magazine, a Thought Leader in Data Science for the Global AI Hub, chair of the Microsoft Power BI UK community group and the DataDNA data visualization community, as well as an international speaker and advisor.

In the episode, Adel and Leon explore aligning AI with business strategy, building AI use cases, enterprise AI agents, AI and data governance, data-driven decision making, key skills for cross-functional teams, AI for automation and augmentation, privacy and AI, and much more.

Links Mentioned in the Show:

• Onyx Data
• Connect with Leon: Leon's LinkedIn
• Course: How to Build and Execute a Successful Data Strategy
• Skill Track: AI Business Fundamentals
• Related Episode: Generative AI in the Enterprise with Steve Holden, Senior Vice President and Head of Single-Family Analytics at Fannie Mae
• Rewatch sessions from RADAR: AI Edition

New to DataCamp? Learn on the go using the DataCamp mobile app. Empower your business with world-class data and AI skills with DataCamp for business.

Data governance can contribute local optimizations to a company's value chain, such as better data discovery via a data catalog or quality-monitored and cleansed data sets. From a 30,000 ft data strategy view, it is even more desirable to connect the dots for business objects frequently reused among business processes and make them available as governed, quality-controlled, easily accessible data products. The speaker successfully launched a Data Governance program in a company that traditionally ranked metal higher than data, and will share experiences from the ongoing data product journey.

Elsevier is a leading provider of quality scientific data to the global research sector. We are all too aware that high-quality, well-structured data is the cornerstone of any data-driven product, which is particularly relevant as we are caught in the disruptive excitement of the GenAI wave. We mustn't lose sight of the role good data plays: garbage in, garbage out is as applicable now as ever.

The generation and availability of high-quality data rely on good data governance and the adoption of FAIR (Findable, Accessible, Interoperable, Reusable) data principles, including ontologies. Our semantic technology stack and domain expertise help drive this adoption. Structured data, such as ontology-tagged text and knowledge graphs, can be the bedrock of explainable GenAI solutions, as we are seeing in the arena of scientific search.
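
As a toy illustration of ontology-tagged structured data (mine, not Elsevier's stack), the sketch below asserts facts as RDF triples with rdflib so they stay findable and interoperable for any SPARQL-capable consumer; the ontology namespace and identifiers are hypothetical.

```python
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import RDF, RDFS

# Hypothetical ontology namespace for illustration only.
SCI = Namespace("http://example.org/science-ontology#")

g = Graph()
g.bind("sci", SCI)

# Tag an article with an ontology concept instead of free text.
paper = URIRef("http://example.org/papers/12345")
g.add((paper, RDF.type, SCI.JournalArticle))
g.add((paper, SCI.mentionsCompound, SCI.Aspirin))
g.add((paper, RDFS.label, Literal("Aspirin and cardiovascular outcomes")))

# Because the tag is a shared concept, retrieval is exact, not fuzzy.
results = g.query("""
    PREFIX sci: <http://example.org/science-ontology#>
    SELECT ?article WHERE { ?article sci:mentionsCompound sci:Aspirin }
""")
for (article,) in results:
    print(article)
```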

Journey Building Data Mesh: The Good, the Bad, and the Ugly

Amy is a Senior Data Solutions and Integration Manager at BayWa r.e., where her responsibility was enabling Data Governance, Data Products, and Data Mesh. The challenge was building a unified data decentralization framework for dozens of organizations that historically used different stacks, metrics, and processes. Data Mesh is a complex concept, and every organisation views it differently. Amy will share the framework she implemented, for which her team gained leadership buy-in. She will discuss what her team managed to execute, what they've achieved, and what's on their roadmap. She will also share her learnings from this exciting journey, including securing buy-in from different business units. At 'Journey Building Data Mesh: The Good, The Bad, and The Ugly,' Amy will focus on:

• Why Data Mesh, and when is it the right time to start prioritizing it?
• How did they implement data contracts at scale, and what is the current progress?
• What Amy's team would do differently today on their journey to Data Mesh.
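
The data-contract point above invites a concrete illustration. Below is a generic sketch of the pattern, not BayWa r.e.'s implementation, assuming pydantic v2; the event name and fields are hypothetical.

```python
from pydantic import BaseModel, Field

class OrderEvent(BaseModel):
    """A data contract: the producing team publishes this schema, and
    consumers validate incoming records against it."""
    order_id: str = Field(min_length=1)
    amount_eur: float = Field(ge=0)
    occurred_at: str  # ISO-8601 timestamp; omitting it breaks the contract

# Consumer-side check: a breaking upstream change fails loudly here
# instead of silently corrupting downstream models.
OrderEvent.model_validate({
    "order_id": "A-1",
    "amount_eur": 12.5,
    "occurred_at": "2024-10-07T12:00:00Z",
})
```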

Join Experian, Sainsbury's, The Nottingham, UST, and British Business Bank as they discuss how better data quality and better data governance lead to improved AI. Hear real business examples of how AI is being implemented and the lessons our panellists wished they'd known sooner. Also learn key takeaways on how to build a better Data Governance strategy and why having trust in your data is more important than any new emerging technology.

Step into the dynamic world of data governance, business operations, and artificial intelligence (AI), where the unsung hero, metadata, takes center stage. Just like the perfect sandwich relies on clear definitions of its ingredients, this talk unveils the indispensable role of metadata in defining and organizing data. George will share captivating real-life stories and examples on how clarity in definitions and metadata not only streamlines operations but also empowers decision-makers with invaluable insights. Explore the backbone of AI advancement through essential data management tools: the Business Glossary, Data Dictionary, Data Catalog, and Machine Learning Metadata Store. Let's embark on a journey where unified interpretations pave the way for accuracy, efficiency, and success in the data-driven era.
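
To make the metadata idea concrete, here is a small illustrative sketch, not from the talk, of a data-dictionary entry that ties each physical field to a glossary-backed definition; all names and contacts are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class FieldDefinition:
    name: str             # physical column name
    business_term: str    # links to the Business Glossary
    definition: str       # the agreed, unambiguous meaning
    unit: Optional[str]   # units prevent silent misinterpretation
    steward: str          # who owns the definition

data_dictionary = [
    FieldDefinition("cust_ltv", "Customer Lifetime Value",
                    "Projected net revenue over the full relationship",
                    "USD", "finance-data@example.com"),
    FieldDefinition("churn_flag", "Customer Churn",
                    "1 if no purchase in the trailing 12 months, else 0",
                    None, "analytics@example.com"),
]

# A model trained on churn_flag now inherits a precise definition instead
# of a guess -- the clarity the talk argues metadata provides.
for f in data_dictionary:
    print(f"{f.name}: {f.definition} ({f.business_term})")
```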

Join this session to discover how NatWest approaches governance, security, and privacy controls, focusing on how they've designed and implemented a framework that balances control with agility. We will dive into NatWest's journey to establish a robust foundation for federated governance using a hub-and-spoke model. Attendees will gain insights into how NatWest enables a scalable governance structure that empowers individual teams while maintaining centralised oversight. Additionally, NatWest will share their roadmap for building a modern data architecture, guided by Data Mesh principles, that ensures flexibility, scalability, and alignment with the evolving needs of the organisation. This session is a must-attend for those looking to modernise their data strategies with a focus on governance and architectural innovation.

In today's rapidly evolving digital landscape, companies must adapt their approach to Data Governance to remain competitive. With the proliferation of data and the increasing reliance on advanced technologies like AI and machine learning, Data Governance needs to evolve and adapt to remain effective.

Join Nicola as she shares key learnings from her Data Governance journey and how we have to adapt our approach to Data Governance to the evolving environment we operate in.

AI is changing our work and personal lives, offering unprecedented opportunities in almost every arena. However, many organizations risk undermining their AI-driven projects by neglecting the need to unify, protect, and improve their data from the outset. Join this session to see first-hand examples of how feeding different data sets into a custom Large Language Model (LLM) can impact outcomes and learn how to build your foundation of high-quality, fully governed data today.

In the journey "From Data Mess to Data Mesh," an internal data marketplace is essential for transforming disorganized data into a cohesive, discoverable, and accessible resource. By centralizing data assets, it ensures seamless data discoverability and findability. Moreover, it upholds robust data governance and orchestration, maintaining compliance and quality. Join me to explore how an internal data marketplace can streamline data management, foster a data-driven culture, and drive organizational efficiency.

Main points covered (a minimal registry sketch follows the list):

• What is an Internal Data Marketplace? 

• Why is it Different from Existing Vendor-Based Marketplaces? 

• Real example of a Data Marketplace 

• Steps to Build a Data Marketplace 

• The main Architecture behind building your own Data Marketplace
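
As referenced above, here is a minimal sketch of the registry idea at the heart of an internal marketplace, my illustration rather than the speaker's implementation; the product names and fields are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class DataProduct:
    name: str
    domain: str           # owning business domain, data mesh style
    owner: str            # accountable team, not an individual
    quality_score: float  # surfaced so consumers can judge fitness for use
    tags: list = field(default_factory=list)

class Marketplace:
    """A single place to publish and discover governed data products."""

    def __init__(self):
        self._catalog = {}

    def publish(self, product: DataProduct) -> None:
        # Centralizing registration is what makes products discoverable.
        self._catalog[product.name] = product

    def search(self, keyword: str) -> list:
        kw = keyword.lower()
        return [p for p in self._catalog.values()
                if kw in p.name.lower() or kw in (t.lower() for t in p.tags)]

mp = Marketplace()
mp.publish(DataProduct("orders_daily", "sales", "sales-data-team", 0.97,
                       tags=["orders", "finance"]))
print([p.name for p in mp.search("orders")])  # ['orders_daily']
```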

Enterprises that deploy data observability report fewer and shorter incidents due to data quality issues. However, deploying data observability widely within an enterprise can be daunting, especially for teams who have experienced a heavy lift when rolling out other data governance technologies. This talk will review the top challenges enterprises face when pursuing a data observability initiative, along with a mix of process and technology solutions that can mitigate them to speed time to value, so data governance teams can show business-facing results quickly.
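
To ground the term, here is a toy sketch of the two checks most observability deployments start with, freshness and volume; this is a generic illustration, not any vendor's product, and the table history and thresholds are made up.

```python
import datetime as dt
import statistics

def check_freshness(last_loaded: dt.datetime, max_lag_minutes: int = 60) -> bool:
    """Pass if the table has received data within the allowed lag."""
    lag = dt.datetime.utcnow() - last_loaded
    return lag <= dt.timedelta(minutes=max_lag_minutes)

def check_volume(todays_rows: int, history: list, z_threshold: float = 3.0) -> bool:
    """Pass if today's row count is within z_threshold sigma of history."""
    mean = statistics.mean(history)
    stdev = statistics.stdev(history) or 1.0  # guard against zero variance
    return abs(todays_rows - mean) / stdev <= z_threshold

history = [10_120, 9_980, 10_340, 10_055, 10_210]
print(check_freshness(dt.datetime.utcnow() - dt.timedelta(minutes=12)))  # True
print(check_volume(2_000, history))  # False -> open a volume-anomaly incident
```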