talk-data.com

Activities & events

Tech is not the real blocker. Bureaucracy is. Enterprise AI can get held up because of politics, silos, and resistance to change. At this session, Niall Maher (Marsh McLennan) explains how he shipped 40 production-ready systems inside a 150-year-old business, while James Malone (Nearform) shows how AI boosted design productivity without derailing workflows. If you want practical strategies for getting AI moving, this is your blueprint.

Join us for pizza, drinks, and AI that actually works. Limited spots, free to attend.

Speakers:

Niall Maher, Engineering Leaders for Innovation/AI, Marsh McLennan

From a team of four to over 100 developers, Niall has led the rollout of 40 production systems in the past year, enabling 90,000 staff to use LLM-based applications daily. In this session, he’ll share his journey from full-stack developer to AI leader, and the key lessons learned at Marsh McLennan.

James Malone, AI Strategy, Product & Design, Nearform

James specialises in bridging traditional design excellence with cutting-edge AI systems and automation. Over the past 18 months, he has integrated AI into Nearform’s workflows, helping teams rapidly prototype, iterate, and deliver impactful results while reshaping design processes.

The Future Form meet-up series aims to build a community in London which facilitates the exchange of ideas and knowledge of the state of the art in the areas of Artificial Intelligence, Data Engineering, and Design, and the interplay of each. We'll bring together a community of builders from both industry and government to talk about their approaches to the emerging areas of Data, AI & Design.

From Idea to Impact: Shipping AI in the Real World

The Future Form meet-up series aims to build a community in London which facilitates the exchange of ideas and knowledge of the state of the art in the areas of Artificial Intelligence, Data Engineering, and Design, and the interplay of each. We'll bring together a community of builders from both industry and government to talk about their approaches to the emerging areas of Data, AI & Design. This will be more than a technical conversation: it will also touch on the human and commercial factors that influence the work.

Agenda

  • 18:00 - Welcome drinks & pizza
  • 18:30 - Guest speaker - Nico Albanese, AI SDK, Vercel
  • 19:15 - Guest speaker - Tey Bannerman, AI Furnace, ex-McKinsey
  • 19:45 - Audience Q&A
  • 20:30 - Event close

Talk 1 - Nico Albanese

Nico maintains the AI SDK for Vercel. In this talk, he’ll dive into how to build with the SDK, sharing insights and exploring real-world use cases they’re encountering at Vercel.

Talk 2 - Tey Bannerman

Tey is a startup founder and the first designer to become a partner at McKinsey & Company. In his talk, he’ll share his experiences conducting user research and creating consumer-facing GenAI apps in both traditional food retail e-commerce and pure e-commerce environments, highlighting key learnings and considerations from his work.

Future Form; Data, AI & Design

We are excited to finally have the first ClickHouse Meetup in the vibrant city of Delhi! Join the ClickHouse crew, from Singapore and from different cities in India, for an engaging day of talks, food, and discussion with your fellow database enthusiasts.

But here's the deal: to secure your spot, make sure you register ASAP!

🗓️ Agenda:

  • 10:30 AM: Registration & Networking
  • 11:05 AM: Welcome & Opening
  • 11:10 AM: Introduction to ClickHouse by Rakesh Puttaswamy, Solution Architect @ ClickHouse
  • 11:25 AM: ClickPipes Overview and demo by Kunal Gupta, Sr. Software Engineer @ ClickHouse
  • 11:40 AM: Optimizing Log Management with ClickHouse: Cost-Effective & Scalable Solutions by Pushpender Kumar, DevOps Architect @ OLX India
  • 12:10 PM: ClickHouse at Physics Wallah: Empowering Real-Time Analytics at Scale by Utkarsh G. Srivastava, Software Development Engineer III @ Physics Wallah
  • 12:40 PM: FabFunnel & ClickHouse: Delivering Real-Time Marketing Analytics by Anmol Jain, SDE-2 (Full stack Developer) and Siddhant Gaba, SDE-2 (Python), @ Idea Clan
  • 1:10 PM: From SQL to AI: Building Intelligent Applications with ClickHouse and LangDB by Matteo Pelati, Co-founder, LangDB.ai
  • 1:40 PM: Lunch & Networking

If anyone from the community is interested in sharing a talk at future meetups, complete this CFP form and we’ll be in touch.

🎤 Session Details: Introduction to ClickHouse

Discover the secrets behind ClickHouse's unparalleled efficiency and performance. Rakesh will give an overview of different use cases for which global companies are adopting this groundbreaking database to transform data storage and analytics.

Speaker: Rakesh Puttaswamy, Solution Architect @ ClickHouse

Rakesh Puttaswamy is a Solution Architect with ClickHouse, working with users across India. With over 12 years of experience in data architecture, big data, data science, and software engineering, Rakesh helps organizations design and implement cutting-edge data-driven solutions. With deep expertise in a broad range of databases and data warehousing technologies, he specializes in building scalable, innovative solutions to enable data transformation and drive business success.

🎤 Session Details: ClickPipes Overview and demo

ClickPipes is a powerful integration engine that simplifies data ingestion at scale, making it as easy as a few clicks. With an intuitive onboarding process, setting up new ingestion pipelines takes just a few steps: select your data source, define the schema, and let ClickPipes handle the rest. Designed for continuous ingest, it automates pipeline management, ensuring seamless data flow without manual intervention. In this talk, Kunal will demo the Postgres CDC connector for ClickPipes, enabling seamless, native replication of Postgres data to ClickHouse Cloud in just a few clicks, with no external tools needed for fast, cost-effective analytics.

Speaker: Kunal Gupta, Sr. Software Engineer @ ClickHouse

Kunal Gupta is a Senior Software Engineer at ClickHouse, joining through the acquisition of PeerDB in 2024, where he played a pivotal role as a founding engineer. With several years of experience in architecting scalable systems and real-time applications, Kunal has consistently driven innovation and technical excellence. Previously, he was a founding engineer for new solutions at ICICIdirect and at AsknBid Tech, leading high-impact teams and advancing code analysis, storage solutions, and enterprise software development.

🎤 Session Details: Optimizing Log Management with ClickHouse: Cost-Effective & Scalable Solutions

Efficient log management is essential in today's cloud-native environments, yet traditional solutions like Elasticsearch often face scalability issues, high costs, and performance limitations. This talk will begin with an overview of common logging tools and their challenges, followed by an in-depth look at ClickHouse's architecture. We will compare ClickHouse with Elasticsearch, focusing on improvements in query performance, storage efficiency, and overall cost-effectiveness.

A key highlight will be OLX India's migration to ClickHouse, detailing the motivations behind the shift, the migration strategy, key optimizations, and the resulting 50% reduction in log storage costs. By the end of this talk, attendees will gain a clear understanding of when and how to leverage ClickHouse for log management, along with best practices for optimizing performance and reducing operational costs.

Speaker: Pushpender Kumar, DevOps Architect @ OLX India

Born and raised in Bijnor, Pushpender moved to Delhi to stay ahead in the race of life. He currently works as a DevOps Architect at OLX India, specializing in cloud infrastructure, Kubernetes, and automation, with over 10 years of experience. He reduced log storage costs by 50% using ClickHouse, bringing scalability and efficiency to large-scale logging systems. He is passionate about cloud optimization, DevOps hiring, and performance engineering.

🎤 Session Details: ClickHouse at Physics Wallah: Empowering Real-Time Analytics at Scale

This session explores how Physics Wallah revolutionized its real-time analytics capabilities by leveraging ClickHouse. We'll delve into the journey of implementing ClickHouse to efficiently handle large-scale data processing, optimize query performance, and power diverse use cases such as user activity tracking and engagement analysis. By enabling actionable insights and seamless decision-making, this transformation has significantly enhanced the learning experience for millions of users.

Today, more than five customer-facing products at Physics Wallah are powered by ClickHouse, serving over 10 million students and parents, including 1.5 million Daily Active Users. Our in-house ClickHouse cluster, hosted and managed within our EKS infrastructure on AWS Cloud, ingests more than 10 million rows of data daily from various sources. Join us to learn about the architecture, challenges, and key strategies behind this scalable, high-performance analytics solution.

Speaker: Utkarsh G. Srivastava, Software Development Engineer III @ Physics Wallah

As a versatile Software Engineer with over 7 years of experience in the IT industry, I have had the privilege of taking on diverse roles, with a primary focus on backend development, data engineering, infrastructure, DevOps, and security. Throughout my career, I have played a pivotal role in transformative projects, consistently striving to craft innovative and effective solutions for customers in the SaaS space.

🎤 Session Details: FabFunnel & ClickHouse: Delivering Real-Time Marketing Analytics

We are a performance marketing company that relies on real-time reporting to drive data-driven decisions and maximize campaign effectiveness. As our client base expanded, we encountered significant challenges with our reporting system: frequent data updates meant handling large datasets inefficiently, leading to slow query execution and delays in delivering insights. This bottleneck hindered our ability to provide timely optimizations for ad campaigns. To address these issues, we needed a solution that could handle rapid data ingestion and querying at scale without the overhead of traditional refresh processes. In this talk, we’ll share how we transformed our reporting infrastructure to achieve real-time insights, enhancing speed, scalability, and efficiency in managing large-scale ad performance data.

Speakers: Anmol Jain, SDE-2 (Full stack Developer), & Siddhant Gaba, SDE-2 (Python) @ Idea Clan

From competing as a national table tennis player to building high-performance software, Anmol Jain brings a unique mix of strategy and problem-solving to tech. With 3+ years of experience at Idea Clan, they play a key role in scaling Lookfinity and FabFunnel, managing multi-million-dollar ad spends every month. Specializing in ClickHouse, React.js, and Node.js, Anmol focuses on real-time data processing and scalable backend solutions. At this meet-up, they’ll share insights on solving reporting challenges and driving real-time decision-making in performance marketing.

Siddhant Gaba is an SDE II at Idea Clan, with expertise in Python, Java, and C#, specializing in scalable backend systems. With four years of experience working with FastAPI, PostgreSQL, MongoDB, and ClickHouse, he focuses on real-time analytics, database optimization, and distributed systems. Passionate about high-performance computing, asynchronous APIs, and system design, he aims to advance real-time data processing. Outside of work, he enjoys playing volleyball. At this meetup, he will share insights on how ClickHouse transformed real-time reporting and scalability.

🎤 Session Details: From SQL to AI: Building Intelligent Applications with ClickHouse and LangDB

As AI becomes a driving force behind innovation, building applications that seamlessly integrate AI capabilities with existing data infrastructures is critical.

In this session, we explore the creation of agentic applications using ClickHouse and LangDB. We will introduce the concept of an AI gateway, explaining its role in connecting powerful AI models with the high-performance analytics engine of ClickHouse. By leveraging LangDB, we demonstrate how to directly interact with AI functions as User-Defined Functions (UDFs) in ClickHouse, enabling developers to design and execute complex AI workflows within SQL.

Additionally, we will showcase how LangDB facilitates deep visibility into AI function behaviors and agent interactions, providing tools to analyze and optimize the performance of AI-driven logic. Finally, we will highlight how ClickHouse, powered by LangDB APIs, can be used to evaluate and refine the quality of LLM responses, ensuring reliable and efficient AI integrations.

Speaker: Matteo Pelati, Co-founder, LangDB.ai

Matteo Pelati is a seasoned software engineer with over two decades of experience, specializing in data engineering for the past ten years. He is the co-founder of LangDB, a company based in Singapore building the fastest Open Source AI Gateway. Before founding LangDB, he was part of the early team at DataRobot, where he contributed to scaling their product for enterprise clients. Subsequently, he joined DBS Bank where he built their data platform and team from the ground up. Prior to starting LangDB, Matteo led the data group for Asia Pacific and data engineering at Goldman Sachs.

ClickHouse Delhi/Gurgaon Meetup - March 2025

Details

🎉 Come along to the London Scala Talks in collaboration with Imperial DoCSoc! 🎉 In this event, you'll hear from Rory Graves and Noel Welsh.

Agenda

  • 6:00pm - 🥤 Doors open. Come along and grab a drink!
  • 6:40pm - 🗣️ Rory Graves: A Life in Scala
  • 7:20pm - 🍕 Intermission: Join us for some free food and drinks! Vegan, vegetarian and gluten-free options are provided. Let us know if you'd like something special - we'd be happy to accommodate.
  • 7:50pm - 🗣️ Noel Welsh: Designing with Duality
  • 8:30pm - 🥤 Socialising: Grab a drink and let's discuss the talks.
  • 9:00pm - 🍻 Join us in a pub to discuss the talks!

🌐 This event has a live stream. Join it here at 6:40PM

🗣️ Rory Graves: A Life in Scala

Software development is full of competing demands: speed, reliability, scalability, flexibility, and affordability. Everyone wants all of them, and they want them now. But building software that delivers on all these fronts is much harder than it seems. Having used Scala as his primary language since 2010 and being deeply involved in its community, Rory has seen how different programming paradigms, tools, and mental models shape software development. This talk will explore key lessons from his career: how to think about software and software careers, the trade-offs of different paradigms and abstraction levels, and why for him Scala hits the sweet spot for building robust, scalable systems easily. Along the way, we will discuss real-world insights on tools, best practices, and how to navigate the ever-changing landscape of software engineering. Whether you’re considering Scala for your future projects or simply thinking about the next steps in your career, this talk will give you perspectives that go beyond the code.

⭐ Speaker ⭐

Rory has been passionate about coding for as long as he can remember, starting with writing games in BASIC far too many years ago. Over a 30+ year career, he has worked across a vast range of systems, from embedded systems to distributed internet-scale bidding engines. His experience spans industries, company sizes (from two-person startups to multinational corporations), and programming paradigms. A long-time advocate of Scala, Rory has contributed to numerous open-source projects, including performance optimizations in the Scala compiler. Today, he works in a hybrid role supporting applied AI and ML research while continuing to mentor developers and share knowledge through public speaking. Outside of software, Rory is a martial artist, a windmill tour guide, and a passionate mentor of programming and public speaking.

🗣️ Noel Welsh: Designing with Duality

How can we systematically design software? One way is to use dualities, which allow us to connect implementation techniques, such as FP and OO (otherwise known as data and codata), as different instantiations of the same underlying model. In this talk, we'll explore duality as a design strategy for interpreters, and show that four different approaches to writing an interpreter inner loop fall out as applications of duality. This allows us to pick the implementation that best suits our needs, and makes creating software less a matter of inspiration and more the application of a consistent and repeatable process.

⭐ Speaker ⭐

Noel is a developer, mentor, and trainer who works with leading companies in the UK and USA. He fell in love with functional programming when he discovered PLT Scheme (now Racket) shortly after graduation, and the majority of his work since has involved functional programming and, more recently, Scala.

🗣️ Would you like to present, but are not sure how to start? Give a talk with us and you'll receive mentorship from a trained toastmaster! Get in touch through this form and we'll get you started.

🏡 Interested in hosting or supporting us? Please get in touch through this form and we can discuss how you can get involved.

📜 All London Scala User Group events operate under the Scala Community Code of Conduct: https://www.scala-lang.org/conduct/ We encourage you to report any breach of the code of conduct, either anonymously through this form or by contacting one of our team members. We guarantee privacy and confidentiality, and that we will take your report seriously and react quickly.
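For readers unfamiliar with the data/codata duality named in Noel Welsh's abstract, here is a minimal sketch of the idea applied to a tiny arithmetic interpreter. It is not taken from the talk, and it is written in Python purely for illustration. All names (`Lit`, `Add`, `eval_data`, `LitObj`, `AddObj`) are hypothetical.

```python
from dataclasses import dataclass
from typing import Protocol, Union


# Data style (FP): an expression is a plain value, and the interpreter is a
# single function that inspects which case it was given.
@dataclass(frozen=True)
class Lit:
    value: int


@dataclass(frozen=True)
class Add:
    left: "Expr"
    right: "Expr"


Expr = Union[Lit, Add]


def eval_data(expr: Expr) -> int:
    if isinstance(expr, Lit):
        return expr.value
    if isinstance(expr, Add):
        return eval_data(expr.left) + eval_data(expr.right)
    raise TypeError(f"unknown expression: {expr!r}")


# Codata style (OO): an expression is anything that can be evaluated, and
# each case carries its own behaviour instead of being inspected.
class Evaluable(Protocol):
    def eval(self) -> int: ...


@dataclass(frozen=True)
class LitObj:
    value: int

    def eval(self) -> int:
        return self.value


@dataclass(frozen=True)
class AddObj:
    left: Evaluable
    right: Evaluable

    def eval(self) -> int:
        return self.left.eval() + self.right.eval()


# The same expression, 1 + (2 + 3), built and evaluated both ways.
data_result = eval_data(Add(Lit(1), Add(Lit(2), Lit(3))))
codata_result = AddObj(LitObj(1), AddObj(LitObj(2), LitObj(3))).eval()
```

Both versions compute the same result. The data version makes it easy to add new operations over a fixed set of cases, while the codata version makes it easy to add new cases with a fixed set of operations.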

Scala Talks x Imperial: A Life in Scala & Designing with Duality

To access this webinar, please register here: https://hubs.li/Q01_8X6V0

Topic: “Preparing for your First Enterprise Large Language Model (LLM) Application”

Speaker#1: Nicolas Decavel-Bueff, Data Science Consultant at Pandata

He is an SF-based Data Scientist who has delivered valuable solutions across a broad spectrum of industries. His accomplishments range from employing natural language processing models in logistics to leading teams in the development of vital models in the utility sector. His work with diverse tools has consistently created quantifiable business impact. Equipped with a Master's in Data Science from the University of San Francisco, Nicolas blends academic rigor and practical experience to address complex business challenges.

Speaker#2: Parham Parvizi, Founder of Data Stack Academy and Tura.io

Parham is a founding member of Tura.io and DataStack.Academy. Tura is a group of professional Cloud Data Engineers and Architects, while Data Stack Academy is the most comprehensive Data Engineering bootcamp, training the future of Cloud Data Engineers. In his 20 years as a Data Engineer and Cloud/Big Data Solution Architect, he has been an Apache Software Foundation contributor and an early adopter of and contributor to open source Big Data projects such as MapReduce and Hive. Prior to Tura Labs, he was a product manager at Pivotal and one of the initial members of Talend. As a Data Advisor and consultant, Parham has had the opportunity and pleasure to work with nearly every Fortune 100 company over the years, from managing clusters with thousands of nodes to optimizing the behind-the-scenes data tasks you are familiar with.

Speaker#3: Cal Al-Dhubaib, Founder & AI Strategist at Pandata

Cal is a globally recognized data scientist, entrepreneur, and innovator in responsible artificial intelligence, specializing in high-risk sectors such as healthcare, energy, and defense. He is the founder and CEO of Pandata, a consulting company that helps organizations to design and develop AI-driven solutions for complex business challenges, with an emphasis on responsible AI. Their clients include globally recognized organizations like the Cleveland Clinic, Progressive Insurance, University Hospitals, and Parker Hannifin. Cal frequently speaks on topics including AI ethics, change management, data literacy, and the unique challenges of implementing AI solutions in high-risk industries. His insights have been featured in noteworthy publications such as Forbes, Ohiox, the Marketing AI Institute, Open Data Science, and AI Business News. Cal has also received recognition among Crain’s Cleveland Notable Immigrant Leaders, Notable Entrepreneurs, and most recently, Notable Technology Executives.

Abstract:

Amid the growing accessibility of performant Large Language Models (LLMs) like GPT-4 and Llama-2, and a burgeoning range of commercial licenses, enterprises are now forging the first wave of LLM-driven applications. This session will begin by addressing the challenges with LLMs such as defining clear success criteria for an LLM project and understanding different approaches to LLM work.

We’ll further address technical challenges such as memory handling, input quality control ("garbage in, garbage out"), and the complexities of embeddings. We’ll also compare different approaches, from using Retrieval-Augmented Generation (RAG) to training or fine-tuning on top of a variety of different LLMs.

Through real-world examples we will discuss practical approaches to risk management, ethical considerations, and the ideal team composition for an LLM project. In addition, we’ll discuss a variety of tools that form the modern LLM application development stack. Join us to demystify LLM applications and equip your organization with the knowledge to succeed.
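The abstract contrasts Retrieval-Augmented Generation with fine-tuning. As a rough illustration of the retrieval half only, here is a minimal Python sketch. It is not from the webinar: the bag-of-words `embed` function is a stand-in for a real embedding model, and `retrieve`, `build_prompt`, and the sample documents are hypothetical names chosen for this example.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': word counts. An assumption standing in for a real model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, documents: list[str], k: int = 1) -> list[str]:
    """Return the k documents most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(query_vec, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, documents: list[str], k: int = 1) -> str:
    """Put the retrieved context in front of the question for an LLM call."""
    context = "\n".join(retrieve(query, documents, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


docs = [
    "Refunds are processed within five business days.",
    "The office is closed on public holidays.",
]
prompt = build_prompt("How long do refunds take?", docs)
```

The model itself is untouched here: only the prompt changes, which is the main practical difference from training or fine-tuning.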

ODSC Links:

• Get free access to more talks/trainings like this at Ai+ Training platform:

https://hubs.li/H0Zycsf0

• ODSC blog: https://opendatascience.com/

• Facebook: https://www.facebook.com/OPENDATASCI

• Twitter: https://twitter.com/_ODSC & @odsc

• LinkedIn: https://www.linkedin.com/company/open-data-science

• Slack Channel: https://hubs.li/Q01_Yrgb0

• Code of conduct: https://odsc.com/code-of-conduct/

Preparing for your First Enterprise Large Language Model (LLM) Application

To access this webinar, please register here: https://hubs.li/Q01_8X6V0

Topic: “Preparing for your First Enterprise Large Language Model (LLM) Application”

Speaker#1: Nicolas Decavel-Bueff, Data Science Consultant at Pandata

He is an SF-based Data Scientist who has delivered valuable solutions across a broad spectrum of industries. His accomplishments range from employing natural language processing models in logistics to leading teams in the development of vital models in the utility sector. His work with diverse tools has consistently created quantifiable business impact. Equipped with a Master's in Data Science from the University of San Francisco, Nicolas blends academic rigor and practical experience to address complex business challenges.

Speaker#2: Parham Parvizi, Founder of Data Stack Academy and Tura.io

Parham is a founding member of Tura.io and DataStack.Academy. Tura is a group of professional Cloud Data Engineers and Architects, while Data Stack Academy is the most comprehensive Data Engineering bootcamp, training the future of Cloud Data Engineers. In his 20 years as a Data Engineer and Cloud/Big Data Solution Architect, he has been an Apache Software Foundation contributor and an early adopter of and contributor to open source Big Data projects such as MapReduce and Hive. Prior to Tura Labs, he was a product manager at Pivotal and one of the initial members of Talend. As a Data Advisor and consultant, Parham has had the opportunity and pleasure to work with nearly every Fortune 100 company over the years, from managing clusters of thousands of nodes to optimizing data tasks that you are familiar with behind the scenes.

Speaker#3: Cal Al-Dhubaib, Founder & AI Strategist at Pandata

Cal is a globally recognized data scientist, entrepreneur, and innovator in responsible artificial intelligence, specializing in high-risk sectors such as healthcare, energy, and defense. He is the founder and CEO of Pandata, a consulting company that helps organizations design and develop AI-driven solutions for complex business challenges, with an emphasis on responsible AI. Its clients include globally recognized organizations like the Cleveland Clinic, Progressive Insurance, University Hospitals, and Parker Hannifin. Cal frequently speaks on topics including AI ethics, change management, data literacy, and the unique challenges of implementing AI solutions in high-risk industries. His insights have been featured in noteworthy publications such as Forbes, OhioX, the Marketing AI Institute, Open Data Science, and AI Business News. Cal has also received recognition among Crain’s Cleveland Notable Immigrant Leaders, Notable Entrepreneurs, and most recently, Notable Technology Executives.

Abstract:

Amid the growing accessibility of performant Large Language Models (LLMs) like GPT-4 and Llama-2, and a burgeoning range of commercial licenses, enterprises are now forging the first wave of LLM-driven applications. This session will begin by addressing the challenges with LLMs such as defining clear success criteria for an LLM project and understanding different approaches to LLM work.

We’ll further address technical challenges such as memory handling, input quality control ("garbage in, garbage out"), and the complexities of embeddings. We’ll also compare different approaches, from using Retrieval-Augmented Generation (RAG) to training or fine-tuning on top of a variety of LLMs.

Through real-world examples we will discuss practical approaches to risk management, ethical considerations, and the ideal team composition for an LLM project. In addition, we’ll discuss a variety of tools that form the modern LLM application development stack. Join us to demystify LLM applications and equip your organization with the knowledge to succeed.

ODSC Links:

• Get free access to more talks/trainings like this at Ai+ Training platform: https://hubs.li/H0Zycsf0

• ODSC blog: https://opendatascience.com/

• Facebook: https://www.facebook.com/OPENDATASCI

• Twitter: https://twitter.com/_ODSC & @odsc

• LinkedIn: https://www.linkedin.com/company/open-data-science

• Slack Channel: https://hubs.li/Q01_Yrgb0

• Code of conduct: https://odsc.com/code-of-conduct/

Preparing for your First Enterprise Large Language Model (LLM) Application

Prineha Narang – co-founder and CTO @ Aliro , Tobias Macey – host

Summary

The next paradigm shift in computing is coming in the form of quantum technologies. Quantum processors have gained significant attention for their speed and computational power. The next frontier is in quantum networking for highly secure communications and the ability to distribute across quantum processing units without costly translation between quantum and classical systems. In this episode Prineha Narang, co-founder and CTO of Aliro, explains how these systems work, the capabilities that they can offer, and how you can start preparing for a post-quantum future for your data systems.

Announcements

Hello and welcome to the Data Engineering Podcast, the show about modern data management.

When you’re ready to build your next pipeline, or want to test out the projects you hear about on the show, you’ll need somewhere to deploy it, so check out our friends at Linode. With their managed Kubernetes platform it’s now even easier to deploy and scale your workflows, or try out the latest Helm charts from tools like Pulsar and Pachyderm. With simple pricing, fast networking, object storage, and worldwide data centers, you’ve got everything you need to run a bulletproof data platform. Go to dataengineeringpodcast.com/linode today and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!

Atlan is a collaborative workspace for data-driven teams, like Github for engineering or Figma for design teams. By acting as a virtual hub for data assets ranging from tables and dashboards to SQL snippets & code, Atlan enables teams to create a single source of truth for all their data assets, and collaborate across the modern data stack through deep integrations with tools like Snowflake, Slack, Looker and more. Go to dataengineeringpodcast.com/atlan today and sign up for a free trial. If you’re a data engineering podcast listener, you get credits worth $3000 on an annual subscription.

Modern data teams are dealing with a lot of complexity in their data pipelines and analytical code. Monitoring data quality, tracing incidents, and testing changes can be daunting and often takes hours to days or even weeks. By the time errors have made their way into production, it’s often too late and damage is done. Datafold built automated regression testing to help data and analytics engineers deal with data quality in their pull requests. Datafold shows how a change in SQL code affects your data, both on a statistical level and down to individual rows and values before it gets merged to production. No more shipping and praying, you can now know exactly what will change in your database! Datafold integrates with all major data warehouses as well as frameworks such as Airflow & dbt and seamlessly plugs into CI workflows. Visit dataengineeringpodcast.com/datafold today to book a demo with Datafold.

Your host is Tobias Macey and today I’m interviewing Dr. Prineha Narang about her work at Aliro building quantum networking technologies and how it impacts the capabilities of data systems.

Interview

Introduction

How did you get involved in the area of data management?

Can you describe what Aliro is and the story behind it?

What are the use cases that you are focused on?

What is the impact of quantum networks on distributed systems design? (what limitations does it remove?)

What are the failure modes of quantum networks?

How do they differ from classical networks?

How can network technologies bridge between classical and quantum connections and where do those transitions happen?

What are the latency/bandwidth capacities of quantum networks? How does it influence the network protocols used during those communications?

How much error correction is necessary during the quantum communication stages of network transfers?

How does quantum computing technology change the landscape for AI technologies?

How does that impact the work of data engineers who are buildin

AI/ML Airflow Analytics CI/CD Data Engineering Data Management Data Quality Datafold dbt GitHub Kubernetes Looker Modern Data Stack Snowflake SQL
Pete Soderling – founder @ Data Council , Tobias Macey – host

Summary

Data professionals are working in a domain that is rapidly evolving. In order to stay current we need access to deeply technical presentations that aren’t burdened by extraneous marketing. To fulfill that need Pete Soderling and his team have been running the Data Council series of conferences and meetups around the world. In this episode Pete discusses his motivation for starting these events, how they serve to bring the data community together, and the observations that he has made about the direction that we are moving. He also shares his experiences as an investor in developer oriented startups and his views on the importance of empowering engineers to launch their own companies.

Announcements

Hello and welcome to the Data Engineering Podcast, the show about modern data management.

When you’re ready to build your next pipeline, or want to test out the projects you hear about on the show, you’ll need somewhere to deploy it, so check out our friends at Linode. With 200Gbit private networking, scalable shared block storage, and a 40Gbit public network, you’ve got everything you need to run a fast, reliable, and bullet-proof data platform. If you need global distribution, they’ve got that covered too with world-wide datacenters including new ones in Toronto and Mumbai. And for your machine learning workloads, they just announced dedicated CPU instances. Go to dataengineeringpodcast.com/linode today to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show!

Listen, I’m sure you work for a ‘data driven’ company – who doesn’t these days? Does your company use Amazon Redshift? Have you ever groaned over slow queries or are just afraid that Amazon Redshift is gonna fall over at some point? Well, you’ve got to talk to the folks over at intermix.io. They have built the “missing” Amazon Redshift console – it’s an amazing analytics product for data engineers to find and re-write slow queries and gives actionable recommendations to optimize data pipelines. WeWork, Postmates, and Medium are just a few of their customers. Go to dataengineeringpodcast.com/intermix today and use promo code DEP at sign up to get a $50 discount!

You listen to this show to learn and stay up to date with what’s happening in databases, streaming platforms, big data, and everything else you need to know about modern data management. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Dataversity, Corinium Global Intelligence, and Data Council. Upcoming events include the O’Reilly AI conference, the Strata Data conference, the combined events of the Data Architecture Summit and Graphorum, and Data Council in Barcelona. Go to dataengineeringpodcast.com/conferences to learn more about these and other events, and take advantage of our partner discounts to save money when you register today.

Your host is Tobias Macey and today I’m interviewing Pete Soderling about his work to build and grow a community for data professionals with the Data Council conferences and meetups, as well as his experiences as an investor in data oriented companies.

Interview

Introduction

How did you get involved in the area of data management?

What was your original reason for focusing your efforts on fostering a community of data engineers?

What was the state of recognition in the industry for that role at the time that you began your efforts?

The current manifestation of your community efforts is in the form of the Data Council conferences and meetups. Previously they were known as Data Eng Conf, and before that as Hakka Labs. Can you discuss the evolution of your efforts to grow this community?

How has the community itself changed and grown over the past few years?

Communities form around a huge variety of focal points. What are some of the complexities or challenges in building one based on something as nebulous as data?

Where do you draw inspiration and direction for how to manage such a large and distributed community?

What are some of the most interesting/challenging/unexpected aspects of community management that you have encountered?

What are some ways that you have been surprised or delighted in your interactions with the data community?

How do you approach sustainability of the Data Council community and the organization itself?

The tagline that you have focused on for Data Council events is that they are no fluff, juxtaposing them against larger business oriented events. What are your guidelines for fulfilling that promise and why do you think that is an important distinction?

In addition to your community building you are also an investor. How did you get involved in that side of your business and how does it fit into your overall mission?

You also have a stated mission to help engineers build their own companies. In your opinion, how does an engineer led business differ from one that may be founded or run by a business oriented individual and why do you think that we need more of them?

What are the ways that you typically work to empower engineering founders or encourage them to create their own businesses?

What are some of the challenges that engineering founders face and what are some common difficulties or misunderstandings related to business?

What are your opinions on venture-backed vs. "lifestyle" or bootstrapped businesses?

What are the characteristics of a data business that you look at when evaluating a potential investment?

What are some of the current industry trends that you are most excited by?

What are some that you find concerning?

What are your goals and plans for the future of Data Council?

Contact Info

@petesoder on Twitter

LinkedIn

@petesoder on Medium

Parting Question

From your perspective, what is the biggest gap in the tooling or technology for data management today?

Closing Announcements

Thank you for listening! Don’t forget to check out our other show, Podcast.init, to learn about the Python language, its community, and the innovative ways it is being used.

Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.

If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.

To help other people find the show please leave a review on iTunes and tell your friends and co-workers.

Join the community in the new Zulip chat workspace at dataengineeringpodcast.com/chat

Links

Data Council

Database Design For Mere Mortals

Bloomberg

Garmin

500 Startups

Geeks On A Plane

Data Council NYC 2019 Track Summary

Pete’s Angel List Syndicate

DataOps

Data Kitchen Episode DataOps Vs DevOps Episode

Great Expectations

Podcast.init Interview

Elementl Dagster

Data Council Presentation

Data Council Call For Proposals

The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA

Support Data Engineering Podcast

AI/ML Analytics Big Data Dagster Data Engineering Data Management DataOps DevOps Marketing Python Redshift Data Streaming