talk-data.com

Topic

AWS

Amazon Web Services (AWS)

cloud · cloud provider · infrastructure · services

837 tagged

Activity Trend

190 peak/qtr
2020-Q1 to 2026-Q1

Activities

837 activities · Newest first

Retail & Consumer Goods Industry Forum: How AI is Transforming How Brands Connect With Consumers | Sponsored by: Accenture & AWS

Consumer industries are being transformed by AI as physical and digital experiences converge. In this flagship session for retail, travel, restaurants and consumer goods attendees at Data + AI Summit, Databricks and a panel of industry leaders will explore how real-time data and machine learning are enabling brands to gain deeper consumer insights, personalize interactions and move closer to true 1:1 marketing. From AI agents shopping on behalf of consumers to consumer-centric supply chains, discover how the most innovative companies will use AI to reshape customer relationships and drive growth in an increasingly connected world.

FinOps: Automated Unity Catalog Cost Observability, Data Isolation and Governance Framework

Westat, a leader in data-driven research for more than 60 years, has implemented a centralized Databricks platform to support hundreds of research projects for government, foundations, and private clients. This initiative modernizes Westat’s technical infrastructure while maintaining rigorous statistical standards and streamlining data science. The platform enables isolated project environments with strict data boundaries, centralized oversight, and regulatory compliance. It allows project-specific customization of compute and analytics, and delivers scalable computing for complex analyses. Key features include config-driven Infrastructure as Code (IaC) with Terragrunt, custom tagging and AWS cost integration for ROI tracking, budget policies with alerts for proactive cost management, and a centralized dashboard with row-level security for self-service cost analytics. This unified approach provides full financial visibility and governance while empowering data teams to deliver value. Audio for this session is delivered in the conference mobile app; bring your own headphones to listen.
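The abstract mentions custom tagging and AWS cost integration for ROI tracking. As a rough illustration of that pattern (not Westat's actual implementation), the sketch below uses boto3's Cost Explorer API to group monthly spend by a hypothetical project_id cost-allocation tag:

```python
# Hypothetical sketch of tag-based cost attribution: pull AWS costs grouped
# by a custom cost-allocation tag so each project's spend can be tracked.
# The tag key "project_id" is an assumption, not Westat's actual scheme.
import boto3

ce = boto3.client("ce")  # AWS Cost Explorer

response = ce.get_cost_and_usage(
    TimePeriod={"Start": "2025-05-01", "End": "2025-06-01"},
    Granularity="MONTHLY",
    Metrics=["UnblendedCost"],
    GroupBy=[{"Type": "TAG", "Key": "project_id"}],
)

for group in response["ResultsByTime"][0]["Groups"]:
    tag_value = group["Keys"][0]  # e.g. "project_id$study-042"
    cost = group["Metrics"]["UnblendedCost"]["Amount"]
    print(f"{tag_value}: ${float(cost):,.2f}")
```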

Sponsored by: AWS | Deploying a GenAI Agent using Databricks Mosaic AI, Anthropic, LangGraph, and Amazon Bedrock

In this session, you’ll see how to build and deploy a GenAI agent with the Model Context Protocol (MCP) using Databricks, Anthropic, the Mosaic AI Gateway for external models, and Amazon Bedrock. You will learn the architecture and best practices for using Databricks Mosaic AI, Anthropic’s Claude Sonnet 3.7 first-party frontier model, and LangGraph for custom workflow orchestration on the Databricks Data Intelligence Platform. You’ll also see how to use Databricks Mosaic AI for agent evaluation and monitoring, and how an Amazon Bedrock inline agent uses MCP to provide tools and other resources with Amazon Nova models for deep research. This approach gives you the flexibility of LangGraph, the powerful managed agents offered by Amazon Bedrock, and Databricks Mosaic AI’s operational support for evaluation and monitoring.
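As a flavor of the architecture described, here is a minimal, hypothetical sketch of a LangGraph workflow whose single node calls a model through the Amazon Bedrock Converse API; the model ID, region, and state shape are assumptions, and the session's actual agent, MCP integration, and Mosaic AI evaluation hooks are not shown:

```python
# A minimal sketch (not the session's code) of a LangGraph node calling a
# frontier model via Amazon Bedrock's Converse API. Check which models or
# inference profiles are enabled in your own account.
from typing import TypedDict

import boto3
from langgraph.graph import StateGraph, START, END

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")


class AgentState(TypedDict):
    question: str
    answer: str


def call_model(state: AgentState) -> dict:
    # Single-turn call; a real agent would carry multi-turn messages and tools.
    response = bedrock.converse(
        modelId="anthropic.claude-3-7-sonnet-20250219-v1:0",  # assumed model ID
        messages=[{"role": "user", "content": [{"text": state["question"]}]}],
    )
    return {"answer": response["output"]["message"]["content"][0]["text"]}


graph = StateGraph(AgentState)
graph.add_node("model", call_model)
graph.add_edge(START, "model")
graph.add_edge("model", END)
app = graph.compile()

print(app.invoke({"question": "Summarize Delta Lake in one sentence."})["answer"])
```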

Energy and Utilities Industry Forum | Sponsored by: Deloitte and AWS

Join us for a compelling forum exploring how energy leaders are harnessing data and AI to build a more sustainable future. As the industry navigates the complex balance between rising global energy demands and ambitious decarbonization goals, innovative companies are discovering that intelligence-driven operations are the key to success. From optimizing renewable energy integration to revolutionizing grid management, learn how energy pioneers are using AI to transform traditional operations while accelerating the path to net zero. This session reveals how Databricks is empowering energy companies to turn their sustainability aspirations into reality, proving that the future of energy is both clean and intelligent.

Tech Industry Forum: Tip of the Spear With Data and AI | Sponsored by: Aimpoint Digital and AWS

Join us for the Tech Industry Forum, formerly known as the Tech Innovators Summit, now part of Databricks Industry Experience. This session will feature keynotes, panels and expert talks led by top customer speakers and Databricks experts. Tech companies are pushing the boundaries of data and AI to accelerate innovation, optimize operations and build collaborative ecosystems. In this session, we’ll explore how unified data platforms empower organizations to scale their impact, democratize analytics across teams and foster openness for building tomorrow’s products. Key topics include:

- Scaling data platforms to support real-time analytics and AI-driven decision-making
- Democratizing access to data while maintaining robust governance and security
- Harnessing openness and portability to enable seamless collaboration with partners and customers

After the session, connect with your peers during the exclusive Industry Forum Happy Hour. Reserve your seat today!

Enabling Sleep Science Research With Databricks and Delta Sharing

Leveraging Databricks as a platform, we facilitate the sharing of anonymized datasets across various Databricks workspaces and accounts, spanning multiple cloud environments such as AWS, Azure, and Google Cloud. This capability, powered by Delta Sharing, extends both within and outside Sleep Number, enabling accelerated insights while ensuring compliance with data security and privacy standards. In this session, we will showcase our architecture and implementation strategy for data sharing, highlighting the use of Databricks’ Unity Catalog and Delta Sharing, along with integration with platforms like Jira, Jenkins, and Terraform to streamline project management and system orchestration.
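For readers unfamiliar with the recipient side of Delta Sharing, a minimal sketch looks like the following; the profile file name and share/schema/table names are placeholders, not Sleep Number's actual datasets:

```python
# Hedged sketch of a Delta Sharing recipient: given a profile file issued by
# the data provider, list the shared tables and load one as a DataFrame.
import delta_sharing

profile = "config.share"  # credentials file from the data provider

client = delta_sharing.SharingClient(profile)
for table in client.list_all_tables():
    print(table.share, table.schema, table.name)

# Table URL format: "<profile>#<share>.<schema>.<table>" (placeholder names)
df = delta_sharing.load_as_pandas(f"{profile}#research_share.sleep.sessions")
print(df.head())
```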

Democratizing Data in a Regulated Industry: Best Practices and Outcomes With J.P. Morgan Payments

Join our 2024 Databricks Disruptor award winners for a session on how they leveraged the Databricks and AWS platforms to build an internal technology marketplace in the highly regulated banking industry, empowering end users to innovate and own their datasets while maintaining strict compliance. In this talk, leaders from the J.P. Morgan Payments Data team share how they’ve done it — from keeping customer needs at the center of all decision-making to promoting a culture of experimentation. They’ll also expand on how the J.P. Morgan Payments product team now leverages the data platform they’ve built to create customer products, including Cash Flow Intelligence.

Let's Save Tons of Money With Cloud-Native Data Ingestion!

Delta Lake is a fantastic technology for quickly querying massive data sets, but first you need those massive data sets! In this session, we will dive into the cloud-native architecture Scribd has adopted to ingest data from AWS Aurora, SQS, Kinesis Data Firehose and more. By using off-the-shelf open source tools like kafka-delta-ingest, oxbow and Airbyte, Scribd has redefined its ingestion architecture to be more event-driven, reliable, and most importantly: cheaper. No jobs needed! Attendees will learn how to use third-party tools in concert with a Databricks and Unity Catalog environment to provide a highly efficient and available data platform. This architecture will be presented in the context of AWS but can be adapted for Azure, Google Cloud Platform or even on-premises environments.
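To make the event-driven idea concrete, here is a hedged sketch (not Scribd's production code) that polls SQS for S3 event notifications and appends the referenced files to a Delta table with the deltalake Python library, so no Spark job is needed; the queue URL, table URI, and JSONL payload format are assumptions:

```python
# Rough event-driven ingestion sketch: S3 event notifications arrive on an
# SQS queue; each referenced file is appended to a Delta table. Credentials
# come from the environment; names below are placeholders.
import json

import boto3
import pandas as pd
from deltalake import write_deltalake

QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/ingest-events"
TABLE_URI = "s3://my-bucket/delta/events"

sqs = boto3.client("sqs")

resp = sqs.receive_message(QueueUrl=QUEUE_URL, MaxNumberOfMessages=10,
                           WaitTimeSeconds=10)
for msg in resp.get("Messages", []):
    for record in json.loads(msg["Body"]).get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        # Assumes JSONL payloads and s3fs installed for s3:// reads.
        df = pd.read_json(f"s3://{bucket}/{key}", lines=True)
        write_deltalake(TABLE_URI, df, mode="append")
    sqs.delete_message(QueueUrl=QUEUE_URL, ReceiptHandle=msg["ReceiptHandle"])
```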

Empowering Healthcare Insights: A Unified Lakehouse Approach With Databricks

NHS England is revolutionizing healthcare research by enabling secure, seamless access to de-identified patient data through the Federated Data Platform (FDP). Despite vast data resources spread across regional and national systems, analysts struggle with fragmented, inconsistent datasets. Enter Databricks: powering a unified, virtual data lake with Unity Catalog at its core — integrating diverse NHS systems while ensuring compliance and security. By bridging AWS and Azure environments with a private exchange and leveraging the Iceberg connector to interface with Palantir, analysts gain scalable, reliable and governed access to vital healthcare data. This talk explores how this innovative architecture is driving actionable insights, accelerating research and ultimately improving patient outcomes.

Leveraging Databricks Unity Catalog for Enhanced Data Governance in Unipol

In the contemporary landscape of data management, organizations are increasingly faced with the challenges of data segregation, governance and permission management, particularly when operating within complex structures such as holding companies with multiple subsidiaries. Unipol comprises seven subsidiary companies, each with a diverse array of workgroups, adding up to a large number of operational groups. This intricate organizational structure necessitates a meticulous approach to data management, particularly regarding the segregation of data and the assignment of precise read and write permissions tailored to each workgroup. The challenge lies in ensuring that sensitive data remains protected while enabling seamless access for authorized users. This talk demonstrates how Unity Catalog has emerged as a pivotal tool in the daily use of the data platform, offering a unified governance solution that supports data management across diverse AWS environments.
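As an illustration of the kind of segregation described (hypothetical catalog, schema, and group names, not Unipol's actual setup), Unity Catalog permissions can be scripted per workgroup; the snippet assumes a Databricks notebook where spark is predefined:

```python
# Illustrative Unity Catalog grants for per-workgroup segregation.
# Runs in a Databricks notebook, where `spark` is predefined.
subsidiary_catalog = "subsidiary_a"      # hypothetical catalog
workgroup = "wg_claims_analysts"          # hypothetical group

# Read-only access for the workgroup, scoped to one schema.
spark.sql(f"GRANT USE CATALOG ON CATALOG {subsidiary_catalog} TO `{workgroup}`")
spark.sql(f"GRANT USE SCHEMA ON SCHEMA {subsidiary_catalog}.claims TO `{workgroup}`")
spark.sql(f"GRANT SELECT ON SCHEMA {subsidiary_catalog}.claims TO `{workgroup}`")

# Write access only for the owning engineering group.
spark.sql(f"GRANT MODIFY ON SCHEMA {subsidiary_catalog}.claims TO `wg_claims_engineers`")
```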

Hands-on with Apache Iceberg

You've probably heard the name Apache Iceberg by now. If it wasn't when Databricks reportedly spent $2 billion to buy Tabular, it might have been when AWS announced S3 Tables, built on Iceberg. But do you know what Apache Iceberg actually is? Or how you could start using it today?

In this tutorial, we will walk through an end-to-end example of writing and reading Iceberg data, while taking a few pit stops to demonstrate Iceberg's selling points.
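As a taste of the tutorial's end-to-end flow, here is a minimal sketch using PyIceberg with a local SQLite catalog and filesystem warehouse (both assumptions; a REST or Glue catalog would work the same way):

```python
# Minimal Iceberg write/read round trip with PyIceberg.
# pip install "pyiceberg[sql-sqlite]" pyarrow
import pyarrow as pa
from pyiceberg.catalog import load_catalog

catalog = load_catalog(
    "local",
    type="sql",
    uri="sqlite:///iceberg_catalog.db",        # local catalog database
    warehouse="file:///tmp/iceberg-warehouse",  # local data files
)

catalog.create_namespace("demo")
data = pa.table({"id": [1, 2, 3], "event": ["view", "click", "purchase"]})
table = catalog.create_table("demo.events", schema=data.schema)

table.append(data)  # writes Parquet files and commits a new snapshot
print(table.scan(row_filter="id > 1").to_arrow().to_pydict())
```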

Driven by strict requirements around data centralization, security, and access auditability, we designed a custom solution that lets Data Science teams collaborate effectively from a centralized access point. Discover how we reconciled customer needs with technical architecture in this session focused on hands-on practice and field experience. A must for anyone interested in putting Data Science into production at scale!

panel
by Marat Valiullin (Ancestry), Tanping Wang (Visa), Animesh Singh (LinkedIn), Shardul Desai (Bank of America), Bruno Aziza (Google Cloud), Alisson Sol (Capital One), Morgan Brown (Dropbox), Jacqueline Karlin (PayPal), Tirthankar Lahiri (Oracle), Aishwarya Srinivasan (Fireworks AI), Naresh Dulam (JPMorgan Chase), Taimur Rashid (AWS), Rooshana Purnyn (Hyatt Hotels Corporation), Maya Ackerman (WaveAI), Venkatesh Shivanna (Electronic Arts (EA)), Jaishankar Sundararaman (Google), Eleonore Fournier-Tombs (United Nations)

Keynotes & panels featuring industry leaders from Google, AWS, IBM, PayPal, Bank of America, Capital One, Visa, JPMorgan Chase, Hyatt Hotels Corporation, United Nations, Fireworks AI, WaveAI, EA, Dropbox, Ancestry, Oracle, LinkedIn, and more.

CDAOs and AI leaders are grappling with two crucial questions: (1) which public cloud provider should we choose for AI and GenAI initiatives, and (2) how do we assemble the right cloud architecture to scale and deploy AI more effectively? This session compares public cloud AI and generative AI architectures from AWS, Azure, and GCP and provides insights on their points of differentiation.

Summary In this episode of the Data Engineering Podcast, Mai-Lan Tomsen Bukovec, Vice President of Technology at AWS, talks about the evolution of Amazon S3 and its profound impact on data architecture. From her work on compute systems to leading the development and operations of S3, Mai-Lan shares insights on how S3 has become a foundational element in modern data systems, enabling scalable and cost-effective data lakes since its launch alongside Hadoop in 2006. She discusses the architectural patterns enabled by S3, the importance of metadata in data management, and how S3's evolution has been driven by customer needs, leading to innovations like strong consistency and S3 Tables.

Announcements

Hello and welcome to the Data Engineering Podcast, the show about modern data management.

Data migrations are brutal. They drag on for months—sometimes years—burning through resources and crushing team morale. Datafold's AI-powered Migration Agent changes all that. Their unique combination of AI code translation and automated data validation has helped companies complete migrations up to 10 times faster than manual approaches. And they're so confident in their solution, they'll actually guarantee your timeline in writing. Ready to turn your year-long migration into weeks? Visit dataengineeringpodcast.com/datafold today for the details.

This is a pharmaceutical ad for Soda Data Quality. Do you suffer from chronic dashboard distrust? Are broken pipelines and silent schema changes wreaking havoc on your analytics? You may be experiencing symptoms of Undiagnosed Data Quality Syndrome, also known as UDQS. Ask your data team about Soda. With Soda Metrics Observability, you can track the health of your KPIs and metrics across the business, automatically detecting anomalies before your CEO does. It's 70% more accurate than industry benchmarks, and the fastest in the category, analyzing 1.1 billion rows in just 64 seconds. And with Collaborative Data Contracts, engineers and business can finally agree on what “done” looks like, so you can stop fighting over column names and start trusting your data again. Whether you're a data engineer, analytics lead, or just someone who cries when a dashboard flatlines, Soda may be right for you. Side effects of implementing Soda may include: increased trust in your metrics, reduced late-night Slack emergencies, spontaneous high-fives across departments, fewer meetings and less back-and-forth with business stakeholders, and in rare cases, a newfound love of data. Sign up today for a chance to win a $1000+ custom mechanical keyboard. Visit dataengineeringpodcast.com/soda to sign up and follow Soda's launch week, which starts June 9th.

Your host is Tobias Macey, and today I'm interviewing Mai-Lan Tomsen Bukovec about the evolution of S3 and how it has transformed data architecture.

Interview

- Introduction
- How did you get involved in the area of data management?
- Most everyone listening knows what S3 is, but can you start by giving a quick summary of what roles it plays in the data ecosystem?
- What are the major generational epochs in S3, with a particular focus on analytical/ML data systems?
- The first major driver of analytical usage for S3 was the Hadoop ecosystem. What are the other elements of the data ecosystem that helped shape the product direction of S3?
- Data storage and retrieval have been core primitives in computing since its inception. What are the characteristics of S3 and all of its copycats that led to such a difference in architectural patterns vs. other shared data technologies (e.g. NFS, Gluster, Ceph, Samba)?
- How does the unified pool of storage that is exemplified by S3 help to blur the boundaries between application data, analytical data, and ML/AI data?
- What are some of the default patterns for storage and retrieval across those three buckets that can lead to anti-patterns which add friction when trying to unify those use cases?
- The age of AI is leading to a massive potential for unlocking unstructured data, for which S3 has been a massive dumping ground over the years. How is that changing the ways that your customers think about the value of the assets that they have been hoarding for so long? What new architectural patterns is that generating?
- What are the most interesting, innovative, or unexpected ways that you have seen S3 used for analytical/ML/AI applications?
- What are the most interesting, unexpected, or challenging lessons that you have learned while working on S3?
- When is S3 the wrong choice?
- What do you have planned for the future of S3?

Contact Info: LinkedIn

Parting Question: From your perspective, what is the biggest gap in the tooling or technology for data management today?

Closing Announcements: Thank you for listening! Don't forget to check out our other shows. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used. The AI Engineering Podcast is your guide to the fast-moving world of building AI systems. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you've learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.

Links: AWS S3, Kinesis, Kafka, SQS, EMR, Drupal, WordPress, Netflix blog on S3 as a source of truth, Hadoop, MapReduce, NASA JPL, FINRA (Financial Industry Regulatory Authority), S3 Object Versioning, S3 Cross-Region Replication, S3 Tables, Iceberg, Parquet, AWS KMS, Iceberg REST, DuckDB, NFS (Network File System), Samba, GlusterFS, Ceph, MinIO, S3 Metadata, Photoshop Generative Fill, Adobe Firefly, TurboTax AI Assistant, AWS Access Analyzer, Data Products, S3 Access Points, AWS Nova Models, LexisNexis Protege, S3 Intelligent-Tiering, S3 Principal Engineering Tenets

The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA.
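As a reminder of the primitive under discussion (not code from the episode), S3's object API is essentially put/get/list, and since December 2020 those operations are strongly consistent; the bucket and key below are placeholders:

```python
# S3's simple object API: PUT, GET, and prefix LIST are the building blocks
# of the data lake patterns discussed above. Bucket name is a placeholder.
import boto3

s3 = boto3.client("s3")
bucket, key = "my-data-lake-bucket", "raw/events/2025/06/01/part-0000.json"

s3.put_object(Bucket=bucket, Key=key, Body=b'{"event": "click"}')

# Strong read-after-write consistency (since December 2020): this GET
# immediately reflects the PUT above.
body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()

# Listing by prefix is the access pattern data lakes are built on.
for obj in s3.list_objects_v2(Bucket=bucket, Prefix="raw/events/")["Contents"]:
    print(obj["Key"], obj["Size"])
```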

Summary In this episode of the Data Engineering Podcast Chakravarthy Kotaru talks about scaling data operations through standardized platform offerings. From his roots as an Oracle developer to leading the data platform at a major online travel company, Chakravarthy shares insights on managing diverse database technologies and providing databases as a service to streamline operations. He explains how his team has transitioned from DevOps to a platform engineering approach, centralizing expertise and automating repetitive tasks with AWS Service Catalog. Join them as they discuss the challenges of migrating legacy systems, integrating AI and ML for automation, and the importance of organizational buy-in in driving data platform success.

Announcements

Hello and welcome to the Data Engineering Podcast, the show about modern data management.

Your host is Tobias Macey, and today I'm interviewing Chakri Kotaru about scaling successful data operations through standardized platform offerings.

Interview

- Introduction
- How did you get involved in the area of data management?
- Can you start by outlining the different ways that you have seen teams you work with fail due to lack of structure and opinionated design?
- Why NoSQL?
- Pairing different styles of NoSQL for different problems
- Useful patterns for each NoSQL style (document, column family, graph, etc.)
- Challenges in platform automation and scaling edge cases
- What challenges do you anticipate as a result of the new pressures from AI applications?
- What are the most interesting, innovative, or unexpected ways that you have seen platform engineering practices applied to data systems?
- What are the most interesting, unexpected, or challenging lessons that you have learned while working on data platform engineering?
- When is NoSQL the wrong choice?
- What do you have planned for the future of platform principles for enabling data teams/data applications?

Contact Info: LinkedIn

Parting Question: From your perspective, what is the biggest gap in the tooling or technology for data management today?

Closing Announcements: Thank you for listening! Don't forget to check out our other shows. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used. The AI Engineering Podcast is your guide to the fast-moving world of building AI systems. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you've learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.

Links: Riak, DynamoDB, SQL Server, Cassandra, ScyllaDB, CAP Theorem, Terraform, AWS Service Catalog, Blog Post

The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA.
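To ground the databases-as-a-service idea (hypothetical product and parameter names, not the guest's actual catalog), self-service provisioning through AWS Service Catalog looks roughly like this with boto3:

```python
# Hedged sketch of self-service database provisioning: a team launches a
# vetted, centrally maintained database product from AWS Service Catalog.
# Product, version, and parameter names are hypothetical.
import boto3

sc = boto3.client("servicecatalog")

response = sc.provision_product(
    ProductName="standard-cassandra-cluster",      # assumed catalog product
    ProvisioningArtifactName="v3",                  # published template version
    ProvisionedProductName="payments-team-cassandra",
    ProvisioningParameters=[
        {"Key": "NodeCount", "Value": "6"},
        {"Key": "InstanceType", "Value": "r6g.xlarge"},
    ],
)
print(response["RecordDetail"]["Status"])  # e.g. "CREATED" while in progress
```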

Join us to explore how AWS is shaping the future of data and AI, unlocking new opportunities for business innovation and data-driven decision-making with the next generation of Amazon SageMaker, your center for data, analytics, and AI. Collaborate faster with proven services coming together in a unified experience, with open access to all your data and with built-in governance.

Amazon Redshift Cookbook - Second Edition

Amazon Redshift Cookbook provides practical techniques for using AWS's managed data warehousing service effectively. With this book, you'll learn to create scalable and secure data analytics solutions, tackle data integration challenges, and leverage Redshift's advanced features like data sharing and generative AI capabilities.

What this book will help you do

- Create end-to-end data analytics solutions, from ingestion to reporting, using Amazon Redshift.
- Optimize the performance and security of Redshift implementations to meet enterprise standards.
- Leverage Amazon Redshift for zero-ETL ingestion and advanced concurrency scaling.
- Integrate Redshift with data lakes for enhanced data processing versatility.
- Implement generative AI and machine learning solutions directly within Redshift environments.

Author(s)

Shruti Worlikar, Harshida Patel, and Anusha Challa are seasoned data experts who bring together years of experience with Amazon Web Services and data analytics. Their combined expertise enables them to offer actionable insights, hands-on recipes, and proven strategies for implementing and optimizing Amazon Redshift-based solutions.

Who is it for?

This book is best suited for data analysts, data engineers, and architects who want to master modern data warehouse solutions using Redshift. Readers should have some knowledge of data warehousing and familiarity with cloud concepts. It is ideal for professionals looking to migrate on-premises systems or build cloud-native analytics pipelines leveraging Redshift.
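As a flavor of the kind of recipe the book covers, here is a hedged sketch that runs a query through the Redshift Data API with boto3, so no JDBC driver or connection management is needed; the serverless workgroup, database, and table names are placeholders:

```python
# Hedged sketch: run a SQL statement against Redshift Serverless via the
# Redshift Data API and fetch the results. All names are placeholders.
import time

import boto3

rsd = boto3.client("redshift-data")

stmt = rsd.execute_statement(
    WorkgroupName="analytics-serverless",  # Redshift Serverless workgroup
    Database="dev",
    Sql="SELECT venuecity, COUNT(*) FROM sales GROUP BY venuecity LIMIT 5;",
)

# The Data API is asynchronous: poll until the statement completes.
status = rsd.describe_statement(Id=stmt["Id"])["Status"]
while status in ("SUBMITTED", "PICKED", "STARTED"):
    time.sleep(1)
    status = rsd.describe_statement(Id=stmt["Id"])["Status"]

if status == "FINISHED":
    for row in rsd.get_statement_result(Id=stmt["Id"])["Records"]:
        print(row)
```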