talk-data.com

Topic

BigQuery

Google BigQuery

data_warehouse analytics google_cloud olap

315

tagged

Activity Trend: peak of 17 activities per quarter (2020-Q1 to 2026-Q1)

Activities

315 activities · Newest first

Snowflake Recipes: A Problem-Solution Approach to Implementing Modern Data Pipelines

Explore Snowflake’s core concepts and the unique features that differentiate it from industry competitors such as Azure Synapse and Google BigQuery. This book provides recipes for architecting and developing modern data pipelines on the Snowflake data platform by employing progressive techniques, agile practices, and repeatable strategies. You’ll walk through step-by-step instructions on ready-to-use recipes covering a wide range of the latest development topics. Then build scalable development pipelines and solve specific scenarios common to all modern data platforms, such as data masking, object tagging, data monetization, and security best practices. Throughout the book you’ll work with code samples for Amazon Web Services, Microsoft Azure, and Google Cloud Platform. There’s also a chapter devoted to solving machine learning problems with Snowflake.

Authors Dillon Dayton and John Eipe are both Snowflake SnowPro Core certified, specializing in data and digital services, and understand the challenges of finding the right solution to complex problems. The recipes in this book are based on real-world use cases and examples designed to help you provide quality, performant, and secured data to solve business initiatives.

What You’ll Learn
Handle structured and unstructured data in Snowflake.
Apply best practices and different options for data transformation.
Understand data application development.
Implement data sharing, data governance and security.

Who This Book Is For
Data engineers, scientists and analysts moving into Snowflake, looking to build data apps. This book expects basic knowledge of cloud platforms (AWS, Azure, or GCP), SQL, and Python.

Chloe Caron: How Well do LLMs Detect Anomalies in Your Data?

🌟 Session Overview 🌟

Session Name: How Well do LLMs Detect Anomalies in Your Data?
Speaker: Chloe Caron
Session Description: Data quality challenges can severely impact businesses, causing a reported average 12% revenue loss for US companies (according to an Experian report). In this talk, we will follow a journey into constructing an anomaly detector, exploring LLMs, prompt engineering, and data type impacts. Along this path, we will analyze the use of multiple tools, including OpenAI, BigQuery, and Mistral. By the end, you will have gained insights on how to boost the accuracy of your anomaly detector and strengthen your data quality strategy.

🚀 About Big Data and RPA 2024 🚀

Unlock the future of innovation and automation at Big Data & RPA Conference Europe 2024! 🌟 This unique event brings together the brightest minds in big data, machine learning, AI, and robotic process automation to explore cutting-edge solutions and trends shaping the tech landscape. Perfect for data engineers, analysts, RPA developers, and business leaders, the conference offers dual insights into the power of data-driven strategies and intelligent automation. 🚀 Gain practical knowledge on topics like hyperautomation, AI integration, advanced analytics, and workflow optimization while networking with global experts. Don’t miss this exclusive opportunity to expand your expertise and revolutionize your processes—all from the comfort of your home! 📊🤖✨

📅 Yearly Conferences: Curious about the evolution of QA? Check out our archive of past Big Data & RPA sessions. Watch the strategies and technologies evolve in our videos! 🚀 🔗 Find Other Years' Videos: 2023 Big Data Conference Europe https://www.youtube.com/playlist?list=PLqYhGsQ9iSEpb_oyAsg67PhpbrkCC59_g 2022 Big Data Conference Europe Online https://www.youtube.com/playlist?list=PLqYhGsQ9iSEryAOjmvdiaXTfjCg5j3HhT 2021 Big Data Conference Europe Online https://www.youtube.com/playlist?list=PLqYhGsQ9iSEqHwbQoWEXEJALFLKVDRXiP

💡 Stay Connected & Updated 💡

Don’t miss out on any updates or upcoming event information from Big Data & RPA Conference Europe. Follow us on our social media channels and visit our website to stay in the loop!

🌐 Website: https://bigdataconference.eu/, https://rpaconference.eu/ 👤 Facebook: https://www.facebook.com/bigdataconf, https://www.facebook.com/rpaeurope/ 🐦 Twitter: @BigDataConfEU, @europe_rpa 🔗 LinkedIn: https://www.linkedin.com/company/73234449/admin/dashboard/, https://www.linkedin.com/company/75464753/admin/dashboard/ 🎥 YouTube: http://www.youtube.com/@DATAMINERLT

Karim Wadie: Fail-Safe BigQuery: Disaster Recovery Automation Techniques

🌟 Session Overview 🌟

Session Name: Fail-Safe BigQuery: Disaster Recovery Automation Techniques
Speaker: Karim Wadie
Session Description: Disaster recovery planning is critical for business continuity against unforeseen events, the most frequent of which are human errors. To guard against this, organizations need to define a backup strategy for their BigQuery tables and execute it at scale. For that, we introduce BQ Backup Manager[1], an open-source solution developed by Google Consulting Services that acts as a framework for defining varying backup policies across the organization and automating their execution on thousands of tables.

Join our session to learn more about the framework's features, architecture, and how it can immediately benefit your organization or customer.

[1] https://github.com/GoogleCloudPlatform/bq-backup-manager
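The idea of "varying backup policies across the organization" can be illustrated with a small sketch of hierarchical policy resolution, where the most specific scope wins. This is not BQ Backup Manager's actual API or configuration format: the function `resolve_policy`, the policy fields, and the scope naming are all hypothetical and exist only to show the concept.

```python
from typing import Optional

def resolve_policy(table: str, policies: dict, default: Optional[dict] = None) -> Optional[dict]:
    """Pick the backup policy for a table given as 'project.dataset.table'.

    Hypothetical lookup order: table, then dataset, then project, then default.
    """
    project, dataset, _ = table.split(".")
    for scope in (table, f"{project}.{dataset}", project):
        if scope in policies:
            return policies[scope]
    return default

# Hypothetical policies keyed by scope; field names are illustrative only.
policies = {
    "prod": {"frequency": "daily", "retention_days": 30},
    "prod.finance": {"frequency": "hourly", "retention_days": 365},
}
default_policy = {"frequency": "weekly", "retention_days": 7}

print(resolve_policy("prod.finance.invoices", policies, default_policy))
print(resolve_policy("prod.marketing.clicks", policies, default_policy))
print(resolve_policy("dev.sandbox.tmp", policies, default_policy))
```

A lookup like this lets a single organization-wide default coexist with stricter rules for sensitive datasets, which is one plausible way to keep thousands of tables covered without configuring each one.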

Big Data is Dead: Long Live Hot Data 🔥

Over the last decade, Big Data was everywhere. Let's set the record straight on what is and isn't Big Data. We have been consumed by a conversation about data volumes when we should focus more on the immediate task at hand: Simplifying our work.

Some of us may have Big Data, but our quest to derive insights from it is measured in small slices of work that fit on your laptop or in your hand. Easy data is here— let's make the most of it.

📓 Resources
Big Data is Dead: https://motherduck.com/blog/big-data-is-dead/
Small Data Manifesto: https://motherduck.com/blog/small-data-manifesto/
Small Data SF: https://www.smalldatasf.com/

➡️ Follow Us
LinkedIn: https://linkedin.com/company/motherduck
X/Twitter: https://twitter.com/motherduck
Blog: https://motherduck.com/blog/


Explore the "Small Data" movement, a counter-narrative to the prevailing big data conference hype. This talk challenges the assumption that data scale is the most important feature of every workload, defining big data as any dataset too large for a single machine. We'll unpack why this distinction is crucial for modern data engineering and analytics, setting the stage for a new perspective on data architecture.

Delve into the history of big data systems, starting with the non-linear hardware costs that plagued early data practitioners. Discover how Google's foundational papers on GFS, MapReduce, and Bigtable led to the creation of Hadoop, fundamentally changing how we scale data processing. We'll break down the "big data tax"—the inherent latency and system complexity overhead required for distributed systems to function, a critical concept for anyone evaluating data platforms.

Learn about the architectural cornerstone of the modern cloud data warehouse: the separation of storage and compute. This design, popularized by systems like Snowflake and Google BigQuery, allows storage to scale almost infinitely while compute resources are provisioned on-demand. Understand how this model paved the way for massive data lakes but also introduced new complexities and cost considerations that are often overlooked.

We examine the cracks appearing in the big data paradigm, especially for OLAP workloads. While systems like Snowflake are still dominant, the rise of powerful alternatives like DuckDB signals a shift. We reveal the hidden costs of big data analytics, exemplified by a petabyte-scale query costing nearly $6,000, and argue that for most use cases, it's too expensive to run computations over massive datasets.
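The cost figure above can be reproduced with simple scan-based arithmetic. The sketch below assumes a price of $6.25 per TiB scanned, which is an assumption on our part rather than a number stated in the talk description; the function name `on_demand_query_cost` is likewise illustrative.

```python
TIB = 2 ** 40  # bytes in one tebibyte

def on_demand_query_cost(bytes_scanned: int, usd_per_tib: float = 6.25) -> float:
    """Estimate the cost of one scan-priced query.

    The default rate is an assumed list price, not a figure from the talk.
    """
    return bytes_scanned / TIB * usd_per_tib

petabyte = 10 ** 15  # one decimal petabyte in bytes
print(f"${on_demand_query_cost(petabyte):,.2f}")  # prints $5,684.34
```

Under that assumed rate, a full scan of one decimal petabyte comes to roughly $5,700, consistent with the "nearly $6,000" figure quoted above.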

The key to efficient data processing isn't your total data size, but the size of your "hot data" or working set. This talk argues that the revenge of the single node is here, as modern hardware can often handle the actual data queried without the overhead of the big data tax. This is a crucial optimization technique for reducing cost and improving performance in any data warehouse.

Discover the core principles for designing systems in a post-big data world. We'll show that since only 1 in 500 users run true big data queries, prioritizing simplicity over premature scaling is key. For low latency, process data close to the user with tools like DuckDB and SQLite. This local-first approach offers a compelling alternative to cloud-centric models, enabling faster, more cost-effective, and innovative data architectures.
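As a rough illustration of the local-first idea, the sketch below aggregates a small "hot" slice of data entirely in-process using SQLite from the Python standard library. The table, columns, and sample rows are invented for the example; DuckDB would follow a similar pattern but is not used here so that the snippet has no external dependencies.

```python
import sqlite3

# An in-memory database stands in for a small local working set.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (day TEXT, user_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [
        ("2024-01-01", 1, 10.0),  # cold row, outside the working set
        ("2024-06-01", 2, 5.0),
        ("2024-06-01", 3, 7.5),
        ("2024-06-02", 1, 2.5),
    ],
)

def daily_totals(conn: sqlite3.Connection, since: str) -> list:
    """Sum amounts per day, touching only rows on or after `since`."""
    return conn.execute(
        "SELECT day, SUM(amount) FROM events WHERE day >= ? GROUP BY day ORDER BY day",
        (since,),
    ).fetchall()

print(daily_totals(conn, "2024-06-01"))
```

The query only ever reads the recent rows, which is the point of the argument: the working set, not the total dataset, determines how much machine is needed.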

Coalesce 2024: Needle in the (data) stack: How Spotify powers Salesforce

Spotify has absurd quantities of data. This is a huge asset, but it makes it difficult to power their frontline partnership team in Salesforce with the relevant cuts of that data they need. After struggling with both ad-hoc solutions and Salesforce consultant-led solutions, they've landed on a flexible, secure, and automated data strategy: they use dbt and Hightouch to refine critical data in Google BigQuery, sync updated records to Salesforce, and then close the loop for intelligence and analytics.

They'll share their optimal solution, with no caveats, for the real, everyday data issues that many teams encounter at scale with Salesforce.

Speaker: Tim Leonard, Sr Insights Manager, Spotify

Read the blog to learn about the latest dbt Cloud features announced at Coalesce, designed to help organizations embrace analytics best practices at scale https://www.getdbt.com/blog/coalesce-2024-product-announcements

Coalesce 2024: implementing real-time slowly changing dimensions (SCD) type 2 with dbt

Learn how Hostinger, a leading European provider of web hosting solutions, leverages dbt to implement slowly changing dimensions (SCD)...and not go bankrupt while doing it at scale. They'll cover how they used the Debezium connector to tackle the challenges of change data capture (CDC), and how leveraging dynamic incremental predicates allowed them to scale their solution at a fraction of the cost using BigQuery.
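For readers unfamiliar with SCD type 2, the sketch below shows the core row-versioning rule in plain Python: when a tracked attribute changes, the current row is closed and a new current row is opened. It is not Hostinger's implementation and does not cover Debezium, dbt, or incremental predicates; the `DimRow` fields and the `apply_change` helper are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, replace
from typing import List, Optional

@dataclass(frozen=True)
class DimRow:
    key: str
    plan: str                 # the tracked attribute (illustrative)
    valid_from: str           # ISO timestamp, inclusive
    valid_to: Optional[str]   # None marks the current version

def apply_change(rows: List[DimRow], key: str, plan: str, changed_at: str) -> List[DimRow]:
    """Return a new history with one change event applied as SCD type 2."""
    out = []
    for row in rows:
        if row.key == key and row.valid_to is None:
            if row.plan == plan:
                return list(rows)  # attribute unchanged, history stays as-is
            out.append(replace(row, valid_to=changed_at))  # close the old version
        else:
            out.append(row)
    out.append(DimRow(key, plan, changed_at, None))  # open the new current version
    return out

history = [DimRow("c1", "basic", "2024-01-01T00:00:00", None)]
history = apply_change(history, "c1", "pro", "2024-03-01T00:00:00")
for row in history:
    print(row)
```

In a warehouse this logic typically runs as a merge over a change stream, and limiting the merge to recently changed partitions is what keeps the scanned volume, and therefore the cost, small.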

Speakers:
Augustinas Karvelis, Analytics Engineer, Hostinger
Valentinas Mitalauskas, Analytics Engineering Lead, Hostinger

In today's data-driven landscape, the ability to efficiently harness the power of AI is crucial for businesses seeking to unlock valuable insights and drive innovation. This session will explore how BigQuery, Google Cloud's leading data warehouse solution, can accelerate your AI initiatives. Discover how BigQuery's serverless architecture, built-in machine learning capabilities, and seamless integration with Google Cloud's AI ecosystem empower you to build, train, and deploy ML models at scale. Whether you're a data scientist, engineer, or business leader, this session will provide you with actionable insights and strategies to supercharge your AI efforts with BigQuery.

Join us as we dive deep into the world of spatial analytics and discover how to leverage location data to its fullest potential. Whether you're working with massive datasets in a Spark environment, harnessing the power of cloud data warehouses like Snowflake, Redshift, or BigQuery, or analysing live data feeds in real time, this session will equip you with the tools and knowledge to:

• Uncover hidden patterns and trends in your data that traditional analytics might miss.

• Gain a deeper understanding of your customers and their behaviours.

• Optimise operations and improve efficiency.

• Make data-driven decisions with confidence.

Don't miss this opportunity to learn how spatial analytics can revolutionise your data platform and drive your business forward.

Sayle Matthews leads the North American GCP Data Practice at DoiT International. Over the past year and a half, he has focused almost exclusively on BigQuery, helping hundreds of GCP customers optimize their usage and solve some of their biggest 'Big Data' challenges. Given his extensive experience with Google BigQuery billing, we sat down with him to discuss the billing changes and, most importantly, the impact they have had on the market, as he has observed while working with hundreds of DoiT clients of various sizes. Sayle's LinkedIn page - https://www.linkedin.com/in/sayle-matthews-522a795/

Cost management is a continuous challenge for our data teams at Astronomer. Understanding the expenses associated with running our workflows is not always straightforward, and identifying which process ran a query causing unexpected usage on a given day can be time-consuming. In this talk, we will showcase an Airflow Plugin and specific DAGs developed and used internally at Astronomer to track and optimize the costs of running DAGs. Our internal tool monitors Snowflake query costs, provides insights, and sends alerts for abnormal usage. With it, Astronomer identified and refactored its most costly DAGs, resulting in an almost 25% reduction in Snowflake spending. We will demonstrate how to track Snowflake-related DAG costs and discuss how the tool can be adapted to any database supporting query tagging like BigQuery, Oracle, and more. This talk will cover the implementation details and show how Airflow users can effectively adopt this tool to monitor and manage their DAG costs.
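The general technique described here, tagging each query with its origin and then rolling costs up per DAG, can be sketched in a few lines. This is not Astronomer's plugin: the helper names, the log record shape, and the alert rule (a day costing more than twice the mean of the preceding days) are assumptions made for illustration. The tag is carried as a leading SQL comment only to keep the example generic; real warehouses expose their own tagging or labeling mechanisms.

```python
from collections import defaultdict
from statistics import mean

def tag_query(sql: str, dag_id: str, task_id: str) -> str:
    """Prefix a query with a comment identifying the DAG and task that ran it."""
    return f"/* dag_id={dag_id}, task_id={task_id} */\n{sql}"

def cost_per_dag(query_log: list) -> dict:
    """Sum query costs by DAG from records like {'dag_id': ..., 'cost_usd': ...}."""
    totals = defaultdict(float)
    for entry in query_log:
        totals[entry["dag_id"]] += entry["cost_usd"]
    return dict(totals)

def abnormal_days(daily_costs: list, factor: float = 2.0) -> list:
    """Flag days whose cost exceeds `factor` times the mean of all earlier days."""
    flagged = []
    for i in range(1, len(daily_costs)):
        day, cost = daily_costs[i]
        baseline = mean([c for _, c in daily_costs[:i]])
        if cost > factor * baseline:
            flagged.append(day)
    return flagged

log = [
    {"dag_id": "load_orders", "cost_usd": 1.5},
    {"dag_id": "build_marts", "cost_usd": 2.0},
    {"dag_id": "load_orders", "cost_usd": 0.5},
]
print(tag_query("SELECT 1", "load_orders", "extract"))
print(cost_per_dag(log))
print(abnormal_days([("mon", 10.0), ("tue", 12.0), ("wed", 30.0)]))
```

Once every query carries its DAG identity, ranking DAGs by spend and alerting on outliers becomes a plain aggregation over the warehouse's query history.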

Google Machine Learning and Generative AI for Solutions Architects

This book teaches solutions architects how to effectively design and implement AI/ML solutions utilizing Google Cloud services. Through detailed explanations, examples, and hands-on exercises, you will understand essential AI/ML concepts, tools, and best practices while building advanced applications.

What this Book will help me do
Build robust AI/ML solutions using Google Cloud tools such as TensorFlow, BigQuery, and Vertex AI.
Prepare and process data efficiently for machine learning workloads.
Establish and apply an MLOps framework for automating ML model lifecycle management.
Implement cutting-edge generative AI solutions using best practices.
Address common challenges in AI/ML projects with insights from expert solutions.

Author(s)
Kieran Kavanagh is a seasoned principal architect with nearly twenty years of experience in the tech industry. He has successfully led teams in designing, planning, and governing enterprise cloud strategies, and his wealth of experience is distilled into the practical approaches and insights in this book.

Who is it for?
This book is ideal for IT professionals aspiring to design AI/ML solutions, particularly in the role of solutions architects. It assumes a basic knowledge of Python and foundational AI/ML concepts but is suitable for both beginners and seasoned practitioners. If you're looking to deepen your understanding of state-of-the-art AI/ML applications on Google Cloud, this resource will guide you.

Data Engineering with Google Cloud Platform - Second Edition

Data Engineering with Google Cloud Platform is your ultimate guide to building scalable data platforms using Google Cloud technologies. In this book, you will learn how to leverage products such as BigQuery, Cloud Composer, and Dataplex for efficient data engineering. Expand your expertise and gain practical knowledge to excel in managing data pipelines within the Google Cloud ecosystem.

What this Book will help me do
Understand foundational data engineering concepts using Google Cloud Platform.
Learn to build and manage scalable data pipelines with tools such as Dataform and Dataflow.
Explore advanced topics like data governance and secure data handling in Google Cloud.
Boost readiness for Google Cloud data engineering certification with real-world exam guidance.
Master cost-effective strategies and CI/CD practices for data engineering on Google Cloud.

Author(s)
Adi Wijaya, the author of this book, is a Data Strategic Cloud Engineer at Google with extensive experience in data engineering and the Google Cloud ecosystem. With his hands-on expertise, he emphasizes practical solutions and in-depth knowledge sharing, guiding readers through the intricacies of Google Cloud for data engineering success.

Who is it for?
This book is ideal for data analysts, IT practitioners, software engineers, and data enthusiasts aiming to excel in data engineering. Whether you're a beginner tackling fundamental concepts or an experienced professional exploring Google Cloud's advanced capabilities, this book is designed for you. It bridges your current skills with modern data engineering practices on Google Cloud, making it a valuable resource at any stage of your career.

Discover the transformative synergy between SAP Datasphere and Google BigQuery, driving data insights. We'll explore Datasphere's transformation, integration, and data governance capabilities alongside BigQuery’s scalability and real-time analytics. You will also learn how SAP GenAI Hub and Google Cloud accelerate AI initiatives and innovation, and hear real-world success stories on how businesses leverage this integration for tangible outcomes.

By attending this session, your contact information may be shared with the sponsor for relevant follow-up for this event only. Please note: seating is limited and on a first-come, first-served basis; standing areas are available.

Click the blue “Learn more” button above to tap into special offers designed to help you implement what you are learning at Google Cloud Next 25.

We are bringing Google’s research and innovations in artificial intelligence (AI) directly to your data in BigQuery. Join this session to learn about BigQuery’s built-in ML capabilities, such as model inference, and how to use Gemini, Google's most capable and flexible AI model yet, directly within BigQuery to simplify advanced use cases such as sentiment analysis, entity extraction, and many more.

BigQuery Studio and BigFrames are a powerful combination for scalable data science and analytics. Unify data management, analysis, and collaboration with BigQuery Studio’s intuitive interface. Scale data science and machine learning with BigFrames’ powerful Python API. Get deeper insights, faster.

Learn about real-time AI-powered insights with BigQuery continuous queries, and how this new feature is poised to revolutionize data engineering by empowering event-driven and AI-driven data pipelines with Vertex AI, Pub/Sub, and Bigtable, all through the familiar language of SQL. Learn how UPS used big data on millions of shipped packages to reduce package theft, about their work on more efficient claims processing, and why they are looking to BigQuery to accelerate time to insights and smarter business outcomes.

IHG Hotels & Resorts, managing 6,500+ hotels across 19 brands, faced data challenges amidst rapid growth. Cognizant and Google Cloud helped modernize their infrastructure. Using Data Fusion and migration tools, they moved 320 TB of data from Teradata to BigQuery, enabling scalability, security, and advanced analytics. This laid the foundation for IHG's future growth and empowers data-driven decision-making.

Businesses need to predict what customers want and create personalized experiences to gain a competitive advantage and drive revenue. They need to deliver customized, tailored interactions that increase customer acquisition, improve loyalty and increase satisfaction. Join Fullstory’s Head of Data Products to learn how data and engineering teams can supercharge tools like Dialogflow and BigQuery with behavioral data to accurately forecast and create experiences that outpace the competition and keep customers coming back for more.

In the era of multimodal generative AI, a unified governance-focused data platform powered by Gemini becomes paramount. Join this session to learn how BigQuery fuels your data and AI lifecycle from training to inference, by unifying structured and unstructured data such as text, images and audio, while encompassing security and governance. Learn how Priceline is using BigQuery and Vertex AI to reinvent customer experiences and lead the industry in data and AI innovation.

Join us to learn how to activate the full potential of your data with AI in BigQuery. Take an in-depth look at how BigQuery's core integration with generative AI models like Gemini, coupled with its petabyte-scale analytics capabilities, enables new possibilities for gaining insights from your data. Learn how to derive insights from your untapped and unstructured data such as images, documents, and audio files, and explore BigQuery vector search and multimodal embeddings, all powered by Google's industry-leading AI capabilities in BigQuery using simple SQL queries. You will also learn how Unilever is creating a data strategy that allows data teams to scale efficiently and rapidly experiment with AI models and gen AI use cases.
