Speaker
Holly Smith
12 talks
Holly Smith is a Staff Developer Advocate at Databricks, where she specializes in Data and AI, developer advocacy, and tech education. With over a decade of experience and her status as one of the company’s earliest employees, she is a renowned public speaker and educator, known for her YouTube series on AI and data engineering. She is an award-winning expert in Data and AI, recognized for translating complex technologies into accessible learning experiences.
Bio from: Databricks DATA + AI Summit 2023
Talks & appearances
12 activities · Newest first
We’re excited to be back at Big Data LDN this year—huge thanks to the organisers for hosting Databricks London once more!
Join us for an evening of insights, networking, and community with the Databricks Team and Advancing Analytics!
🎤 Agenda:
6:00 PM – 6:10 PM | Kickoff & Warm Welcome
Grab a drink, say hi, and get the lowdown on what’s coming up. We’ll set the scene for an evening of learning and laughs.
6:10 PM – 6:50 PM | The Metadata Marathon: How three projects are racing forward – Holly Smith (Staff Developer Advocate, Databricks)
With the enormous amount of discussion about open storage formats between nerds and even not-nerds, it can be hard to keep track of who’s doing what and how this actually makes any impact on day-to-day data projects.
Holly will take a closer look at the three big projects in this space: Delta, Hudi and Iceberg. They’re all trying to solve similar data problems and have tackled the various challenges in different ways. Her talk will start with the very basics of how we got here and the history behind it, before diving deep into the underlying tech, the projects’ roadmaps, and their impact on the data landscape as a whole.
6:50 PM – 7:10 PM | What’s New in Databricks & Databricks AI – Simon Whiteley & Gavi Regunath
Hot off the press! Simon and Gavi will walk you through the latest and greatest from Databricks, including shiny new AI features and platform updates you’ll want to try ASAP.
7:10 PM onwards | Q&A Panel + Networking
Your chance to ask the experts anything—then stick around for drinks, snacks, and some good old-fashioned data geekery.
Get fresh AI updates, expert insights on open formats, and network with the brightest minds in data - all in one evening.
So you’ve heard of Databricks, but you’re still not sure what the fuss is all about. Yes, you’ve heard it’s Spark, but then there’s this Delta thing that’s both a data lake and a data warehouse (isn’t that what Iceberg is?). And then there's Unity Catalog, which is not just a catalog: it also does access management, and even surprising things like optimising your data and giving programmatic access to lineage and billing. But then serverless came out and now you don’t even have to learn Spark? And of course there’s a bunch of AI stuff to use or create yourself. So why not spend 30 mins learning the details of what Databricks does, and how it can turn you into a rockstar Data Engineer?
Each year at Summit, Women in Data and AI have a half day for in-person discussions on empowerment, the Women in Data and AI Breakfast, and networking with like-minded professionals and trailblazers. For this virtual discussion, hear from Kate Ostbye (Pfizer), Lisa Cohen (Anthropic), and Pallavi Koppol and Holly Smith (Databricks) about navigating challenges, celebrating successes, and inspiring one another as we champion diversity and innovation in data together, and about how to get involved year-round.
Be first to witness the latest breakthroughs from Databricks and share the success of innovative data and AI companies.
Join a live recording of the Over Architected Databricks podcast with Nick and Holly as they take the hottest features for the coming week and try to shoehorn them into one architecture. Audio for this session is delivered in the conference mobile app, so you must bring your own headphones to listen.
Databricks is the bestest platform ever where everything is perfect and nothing else could ever make it any better, right? …right? You and I both know this is not true. Don’t get me wrong, there are features that I absolutely love, but there are also some that require powering through the papercuts. And then there are those that I pretend don’t exist. I’ll be opening up to give my honest take on three from each category, why I do (or don’t) like them, and then telling you which talks to attend to find out more.
We have a hypothesis: 90% of people doing Gen AI today weren’t doing it two years ago. The landscape is full of people stumbling their way through it, from the AI academics learning that code written for papers is not ready for software development, all the way to data experts suddenly needing to learn a new skill.
In this talk, we'll go through what data engineers need to know to help get those AI projects off the ground. We'll start with picking the right projects and execution plans, then move through to the toolsets and skills that will make you shine.
90% of people in Gen AI today weren't doing it 2 years ago. This talk covers what data engineers need to launch AI projects.
Did you finish the Photon whitepaper and think, wait, what? I know I did; it’s my job to understand it, explain it, and then use it. If your role involves using Apache Spark™ on Databricks, then you need to know about Photon and where to use it. Join me, chief dummy, nay "supreme" dummy, as I break down this whitepaper into easy to understand explanations that don’t require a computer science degree. Together we will unravel mysteries such as:
- Why is a Java Virtual Machine the current bottleneck for Spark enhancements?
- What does vectorized even mean? And how was it done before?
- Why is the relationship status between Spark and Photon "complicated"?
In this session, we’ll start with the basics of Apache Spark, the details we pretend to know, and where those performance cracks are starting to show through. Only then will we start to look at Photon, how it’s different, where the clever design choices are and how you can make the most of this in your own workloads. I’ve spent over 50 hours going over the paper in excruciating detail; every reference, and in some instances, the references of the references so that you don’t have to.
Talk by: Holly Smith
Connect with us: Website: https://databricks.com Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/databricks Instagram: https://www.instagram.com/databricksinc Facebook: https://www.facebook.com/databricksinc
With 75k attendees (and 12k in person at the sold-out show), Day 2 of the conference is kicked off by co-hosts Holly Smith (Sr Resident Solutions Architect, Databricks) and Jimmy Obeyeni (Strategic Account Executive, Databricks). Hear their take on Day 1 of the conference, the state of data and AI, Databricks, and what to expect from the excitement and buzz of Day 2.