Search – talk-data.com

Title & Speakers	Event
Event DATA MINER Big Data Europe Conference 2020 2024-12-06
Hartmut Armbruster: Tips on Schema, Queries, Parallel Access, and Reactive Programming 2024-12-06 · 23:12 Hartmut Armbruster 🌟 Session Overview 🌟 Session Name: Maximising Cassandra's Potential: Tips on Schema, Queries, Parallel Access, and Reactive Programming Speaker: Hartmut Armbruster Session Description: In this talk, we will design the backend and data layer for a data-rich social platform feed tailored to authenticated users. We will begin with UI wireframes and then develop logical and physical Cassandra data models and query patterns. Using reactive programming paradigms, we will optimize the process flow to efficiently execute queries in parallel. Finally, we will implement a Proof of Concept (POC) using Kotlin, Quarkus, and Mutiny. While reactive programming can initially seem intimidating, it becomes productive, elegant, and even enjoyable once you become familiar with it. This talk aims to inspire new ideas and showcase what’s possible with a modern, tailored, and efficiently utilized stack. Prior knowledge of Cassandra Query Language (CQL), data partitioning/sharding concepts, and reactive programming is beneficial but optional. 🚀 About Big Data and RPA 2024 🚀 Unlock the future of innovation and automation at Big Data & RPA Conference Europe 2024! 🌟 This unique event brings together the brightest minds in big data, machine learning, AI, and robotic process automation to explore cutting-edge solutions and trends shaping the tech landscape. Perfect for data engineers, analysts, RPA developers, and business leaders, the conference offers dual insights into the power of data-driven strategies and intelligent automation. 🚀 Gain practical knowledge on topics like hyperautomation, AI integration, advanced analytics, and workflow optimization while networking with global experts. Don’t miss this exclusive opportunity to expand your expertise and revolutionize your processes—all from the comfort of your home! 
📊🤖✨ 📅 Yearly Conferences: Curious about the evolution of QA? Check out our archive of past Big Data & RPA sessions. Watch the strategies and technologies evolve in our videos! 🚀 🔗 Find Other Years' Videos: 2023 Big Data Conference Europe https://www.youtube.com/playlist?list=PLqYhGsQ9iSEpb_oyAsg67PhpbrkCC59_g 2022 Big Data Conference Europe Online https://www.youtube.com/playlist?list=PLqYhGsQ9iSEryAOjmvdiaXTfjCg5j3HhT 2021 Big Data Conference Europe Online https://www.youtube.com/playlist?list=PLqYhGsQ9iSEqHwbQoWEXEJALFLKVDRXiP 💡 Stay Connected & Updated 💡 Don’t miss out on any updates or upcoming event information from Big Data & RPA Conference Europe. Follow us on our social media channels and visit our website to stay in the loop! 🌐 Website: https://bigdataconference.eu/, https://rpaconference.eu/ 👤 Facebook: https://www.facebook.com/bigdataconf, https://www.facebook.com/rpaeurope/ 🐦 Twitter: @BigDataConfEU, @europe_rpa 🔗 LinkedIn: https://www.linkedin.com/company/73234449/admin/dashboard/, https://www.linkedin.com/company/75464753/admin/dashboard/ 🎥 YouTube: http://www.youtube.com/@DATAMINERLT AI/ML Analytics Big Data Cassandra Dashboard	YouTube
Sonal Goyal: Open Source Entity Resolution - Needs and Challenges 2024-12-06 · 19:45 Sonal Goyal 🌟 Session Overview 🌟 Session Name: Open Source Entity Resolution - Needs and Challenges Speaker: Sonal Goyal Session Description: Real world data contains multiple records belonging to the same customer. These records can be in single or multiple systems and they have variations across fields, which makes it hard to combine them together, especially with growing data volumes. This hurts customer analytics - establishing lifetime value, loyalty programs, or marketing channels is impossible when the base data is not linked. No AI algorithm for segmentation can produce the right results when there are multiple copies of the same customer lurking in the data. No warehouse can live up to its promise if the dimension tables have duplicates. With a modern data stack and DataOps, we have established patterns for E and L in ELT for building data warehouses, datalakes and deltalakes. However, the T - getting data ready for analytics still needs a lot of effort. Modern tools like dbt are actively and successfully addressing this. What is also needed is a quick and scalable way to resolve entities to build the single source of truth of core business entities post Extraction and pre or post Loading. This session would cover the problem of Entity Resolution, its practical applications and challenges in building an entity resolution system. It will also cover Zingg - an Open Source Framework for building Entity Resolution systems. (https://github.com/zinggAI/zingg/) AI/ML Analytics Big Data Dashboard DataOps dbt ETL/ELT GitHub Marketing Modern Data Stack	YouTube

Reducing The Barrier To Entry For Building Stream Processing Applications With Decodable 2023-10-15 · 23:00 Eric Sammer – Founder @ Decodable , Tobias Macey – host Summary Building streaming applications has gotten substantially easier over the past several years. Despite this, it is still operationally challenging to deploy and maintain your own stream processing infrastructure. Decodable was built with a mission of eliminating all of the painful aspects of developing and deploying stream processing systems for engineering teams. In this episode Eric Sammer discusses why more companies are including real-time capabilities in their products and the ways that Decodable makes it faster and easier. Announcements Hello and welcome to the Data Engineering Podcast, the show about modern data management Introducing RudderStack Profiles. RudderStack Profiles takes the SaaS guesswork and SQL grunt work out of building complete customer profiles so you can quickly ship actionable, enriched data to every downstream team. You specify the customer traits, then Profiles runs the joins and computations for you to create complete customer profiles. Get all of the details and try the new product today at dataengineeringpodcast.com/rudderstack This episode is brought to you by Datafold – a testing automation platform for data engineers that finds data quality issues before the code and data are deployed to production. Datafold leverages data-diffing to compare production and development environments and column-level lineage to show you the exact impact of every code change on data, metrics, and BI tools, keeping your team productive and stakeholders happy. Datafold integrates with dbt, the modern data stack, and seamlessly plugs in your data CI for team-wide and automated testing. If you are migrating to a modern data stack, Datafold can also help you automate data and code validation to speed up the migration. 
Learn more about Datafold by visiting dataengineeringpodcast.com/datafold You shouldn't have to throw away the database to build with fast-changing data. You should be able to keep the familiarity of SQL and the proven architecture of cloud warehouses, but swap the decades-old batch computation model for an efficient incremental engine to get complex queries that are always up-to-date. With Materialize, you can! It’s the only true SQL streaming database built from the ground up to meet the needs of modern data products. Whether it’s real-time dashboarding and analytics, personalization and segmentation or automation and alerting, Materialize gives you the ability to work with fresh, correct, and scalable results — all in a familiar SQL interface. Go to dataengineeringpodcast.com/materialize today to get 2 weeks free! As more people start using AI for projects, two things are clear: It’s a rapidly advancing field, but it’s tough to navigate. How can you get the best results for your use case? Instead of being subjected to a bunch of buzzword bingo, hear directly from pioneers in the developer and data science space on how they use graph tech to build AI-powered apps. Attend the dev and ML talks at NODES 2023, a free online conference on October 26 featuring some of the brightest minds in tech. Check out the agenda and register today at Neo4j.com/NODES. Your host is Tobias Macey and today I'm interviewing Eric Sammer about starting your stream processing journey with Decodable Interview Introduction How did you get involved in the area of data management? Can you describe what Decodable is and the story behind it? What are the notable changes to the Decodable platform since we last spoke? (October 2021) What are the industry shifts that have influenced the product direction? What are the problems that customers are trying to solve when they come to Decodable? When you launched your focus was on SQL transformations of streaming data.
What was the process for adding full Java support in addition to SQL? What are the developer experience challenges that are particular to working with streaming data? How have you worked to address that in the Decodable platform and interfaces? As you evolve the technical and product direction, what is your heuristic for balancing the unification of interfaces and system integration against the ability to swap different components or interfaces as new technologies are introduced? What are the most interesting, innovative, or unexpected ways that you have seen Decodable used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on Decodable? When is Decodable the wrong choice? What do you have planned for the future of Decodable? Contact Info esammer on GitHub LinkedIn Parting Question From your perspective, what is the biggest gap in the tooling or technology for data management today? Closing Announcements Thank you for listening! Don't forget to check out our other shows. Podcast.init covers the Python language, its community, and the innovative ways it is being used. The Machine Learning Podcast helps you go from idea to production with machine learning. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you've learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.
To help other people find the show please leave a review on Apple Podcasts and tell your friends and co-workers Links Decodable Podcast Episode Understanding the Apache Flink Journey Flink Podcast Episode Debezium Podcast Episode Kafka Redpanda Podcast Episode Kinesis PostgreSQL Podcast Episode Snowflake Podcast Episode Databricks Startree Pinot Podcast Episode Rockset Podcast Episode Druid InfluxDB Samza Storm Pulsar Podcast Episode ksqlDB Podcast Episode dbt GitHub Actions Airbyte Singer Splunk Outbox Pattern The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA Sponsored By: Neo4J: NODES 2023 is a free online conference focused on graph-driven innovations with content for all skill levels. Its 24 hours are packed with 90 interactive technical sessions from top developers and data scientists across the world covering a broad range of topics and use cases. The event tracks: - Intelligent Applications: APIs, Libraries, and Frameworks – Tools and best practices for creating graph-powered applications and APIs with any software stack and programming language, including Java, Python, and JavaScript - Machine Learning and AI – How graph technology provides context for your data and enhances the accuracy of your AI and ML projects (e.g.: graph neural networks, responsible AI) - Visualization: Tools, Techniques, and Best Practices – Techniques and tools for exploring hidden and unknown patterns in your data and presenting complex relationships (knowledge graphs, ethical data practices, and data representation) Don’t miss your chance to hear about the latest graph-powered implementations and best practices for free on October 26 at NODES 2023. Go to Neo4j.com/NODES today to see the full agenda and register! Rudderstack: Introducing RudderStack Profiles. RudderStack Profiles takes the SaaS guesswork and SQL grunt work out of building complete customer profiles so you can quickly ship actionable, enriched data to every downstream team.
You specify the customer traits, then Profiles runs the joins and computations for you to create complete customer profiles. Get all of the details and try the new product today at dataengineeringpodcast.com/rudderstack Materialize: You shouldn't have to throw away the database to build with fast-changing data. Keep the familiar SQL, keep the proven architecture of cloud warehouses, but swap the decades-old batch computation model for an efficient incremental engine to get complex queries that are always up-to-date. That is Materialize, the only true SQL streaming database built from the ground up to meet the needs of modern data products: Fresh, Correct, Scalable — all in a familiar SQL UI. Built on Timely Dataflow and Differential Dataflow, open source frameworks created by cofounder Frank McSherry at Microsoft Research, Materialize is trusted by data and engineering teams at Ramp, Pluralsight, Onward and more to build real-time data products without the cost, complexity, and development time of stream processing. Go to materialize.com today and get 2 weeks free! Datafold: This episode is brought to you by Datafold – a testing automation platform for data engineers that finds data quality issues before the code and data are deployed to production. Datafold leverages data-diffing to compare… AI/ML Airbyte Analytics Flink API Kinesis BI CI/CD Cloud Computing Data Engineering Data Management Data Quality Data Science Databricks Dataflow Datafold dbt Druid GitHub Java JavaScript Kafka Modern Data Stack Microsoft Neo4j postgresql Python Redpanda SaaS Singer Snowflake Splunk SQL Data Streaming	Data Engineering Podcast Listen
Event Data Council 2023 2023-05-11
Generative AI & the Natural Language Interface for Data | Seek AI 2023-05-11 · 18:59 Sarah Nagy – CEO @ Seek AI ABOUT THE TALK: With the advancement of AI, the natural language interface for data is more valuable than ever before. This talk explores three key questions. First, what would a natural language interface for data actually look like? Second, what kind of value would it add to organizations using the Modern Data Stack? Third, what will the challenges look like when it comes to working with a natural language interface for data? Sarah Nagy will share real-world learnings from Seek's customers for each of these questions. ABOUT THE SPEAKER: A former quant, Sarah Nagy founded Seek AI in 2021. Prior to starting Seek, Sarah most recently led the consumer data team at Citadel's Ashler Capital. Prior to joining Citadel, Sarah led the quant arms at two startups, Edison and Predata, which both successfully exited. Sarah started her career as a quant at ITG developing algorithmic trading strategies. Sarah has a Master in Finance degree from Princeton and dual Bachelor's degrees in Astrophysics and Business Economics from UCLA. ABOUT DATA COUNCIL: Data Council (https://www.datacouncil.ai/) is a community and conference series that provides data professionals with the learning and networking opportunities they need to grow their careers. Make sure to subscribe to our channel for the most up-to-date talks from technical professionals on data related topics including data infrastructure, data engineering, ML systems, analytics and AI from top startups and tech companies. FOLLOW DATA COUNCIL: Twitter: https://twitter.com/DataCouncilAI LinkedIn: https://www.linkedin.com/company/datacouncil-ai/ AI/ML Analytics Data Engineering GenAI Modern Data Stack	YouTube
Extreme Self-Service: Turning Data Consumers into Data Constructors | Whatnot 2023-05-11 · 15:11 Alice Leach – Data Engineer @ Whatnot ABOUT THE TALK: Small data teams face supply and demand problems. Triaging and prioritizing data work can be overwhelming. But what if data consumers could create their own products with minimal training? Learn how to empower data consumers without disrupting others. Discover lessons from an 'extreme' self-service analytics approach: best practices, fostering a data community, promoting SQL literacy, and establishing solid guard rails. ABOUT THE SPEAKER: Alice Leach is a Data Engineer at Whatnot Inc., a live stream platform and marketplace that enables collectors and enthusiasts to connect, buy, and sell verified products. She transitioned from academia to data in 2021, working first as a data scientist then data engineer. Her current work at Whatnot focuses on designing and building robust, self-service data workflows using a modern data stack. AI/ML Analytics Data Engineering Modern Data Stack SQL	YouTube

Event Modern Data Stack Conference 2021 2021-09-23
Keynote: A Future Look at Fivetran & An Introduction to HVR Software 2021-09-23 · 14:30 Fraser Harris – VP of Product, Fivetran , Mark Van De Wiel – Field CTO, Fivetran @ Fivetran Fivetran
How to Build a Single Customer View You Can Trust 2021-09-23 · 13:50 Scott O’Leary – Partnerships @ Monte Carlo , Minty Banfield – Senior Analyst @ Red Ventures UK , Siddharth Dawara – Head of Data Engineering @ Red Ventures UK , Olle Hammarstrom – Senior Data Engineer @ Red Ventures UK
Leveling Up Your Financial Analytics 2021-09-23 · 13:00 Justin Patri – Senior Consultant Data & Analytics @ Slalom , Jason Raede – Co-Founder @ Whetstone.ai , Ryan McNaught – Director Data & Analytics @ Slalom , Joey Berkowitz – Senior Data Analyst @ HealthJoy Analytics
Reverse ETL and Why Your Warehouse Should Be Your CDP 2021-09-23 · 11:40 William Tsu – Customer Success Operations @ Blend , Darren Haken – Head of Engineering @ Platform and Data, Auto Trader UK , Tejas Manohar – Founder @ Hightouch CDP ETL/ELT
Self-Service Analytics: The Promised Land or our Next Nightmare? 2021-09-23 · 11:05 Benn Stancil – Chief Analytics Officer, Mode , Maura Church – Data Science Manager @ Patreon , Jess Tillis – Senior Manager Global Operations @ Fivetran , Sarah Catanzaro – General Partner, Amplify Partners Analytics
Modernizing Your Hiring to Build the Data Teams of the Future 2021-09-23 · 10:05 Devina Nembhard – Co-Founder @ Black in Data , Sadiqah Musa – Senior Analyst and Co-Founder @ Black in Data
Keynote: Moral Psychology: How to Persuade People Who Aren’t Moved by Just Your Data 2021-09-23 · 09:00 George Fraser – CEO & Co-Founder, Fivetran , Jonathan Haidt – Thomas Cooley Professor of Ethical Leadership @ NYU-Stern
Quickfire Best Practice Stacks 2021-09-23 Ethan Lyon – Associate Director of Engineering @ Seer Interactive , Callum McCann – Solutions Architect @ Sisu Data , Daniel Burgos – Head of Data Platform @ Rappi
How to Drive Organizational Change with Analytics in the Cloud 2021-09-23 Carly Capitula – Global Enablement Practice Lead @ InterWorks , Mat Hughes – Analytics Practice Lead @ InterWorks , Brian Bickell – Global Data Practice Director @ InterWorks , Madison Gomez – SI Field Alliance Manager @ Fivetran Analytics Cloud Computing
Breaking Your Data Team Out of the Service Trap 2021-09-23 Emilie Schario – Director of Data @ Netlify
Achieving Competitive Advantages With Modern BI 2021-09-23 Lucas Thelosen – Head of Professional Services @ Looker, Google Cloud @ Gravity BI
Activating Current User Journeys to Build New Product Features 2021-09-23 Jeremy Levy – CEO & Co-Founder @ Indicative , Alex Nazarevich – VP of Growth @ INDOCHINO
Simplifying Data Lake Management Using AWS Lake Formation Governed Tables 2021-09-23 Meera Viswanathan – Lead Product Manager, Fivetran , Roy Hasson – Senior Product Manager - AWS Lake Formation @ AWS AWS Data Lake
Fivetran + Open Source: Open options for running your Modern Data Stack 2021-09-23 Nick Acosta – Developer Advocate @ Fivetran Fivetran Modern Data Stack
Mastering Your Data-Driven Approach to Marketing Efficiency 2021-09-23 Bertrand Cariou – Senior Director Product Marketing @ Trifacta , Zack Pike – CIO @ Callahan Marketing

People (2 results)

Activities & events