talk-data.com

Topic

Big Data

data_processing analytics large_datasets

1217 tagged

Activity Trend: 28 peak/qtr, 2020-Q1 to 2026-Q1

Activities

1217 activities · Newest first

Angelika Postaremczak: Best Practices for Storing Data in BigQuery

Join Angelika Postaremczak in an enlightening session on 'Best Practices for Storing Data in BigQuery' and discover the keys to optimizing data storage for lightning-fast queries without breaking the bank! 🚀💾 Explore table design strategies, data partitioning, clustering, and resource management through Infrastructure as Code for maximizing the potential of cloud data storage. ☁️📊 #BigQuery #datastorage

✨ H I G H L I G H T S ✨

🙌 A huge shoutout to all the incredible participants who made Big Data Conference Europe 2023 in Vilnius, Lithuania, from November 21-24, an absolute triumph! 🎉 Your attendance and active participation were instrumental in making this event so special. 🌍

Don't forget to check out the session recordings from the conference to relive the valuable insights and knowledge shared! 📽️

Once again, THANK YOU for playing a pivotal role in the success of Big Data Conference Europe 2023. 🚀 See you next year for another unforgettable conference! 📅 #BigDataConference #SeeYouNextYear

Zain Hasan: Getting Started With Vector Databases

Join Zain Hasan to dive into Vector Databases and revolutionize your search capabilities with machine learning. 🚀🔍 Learn how to harness the power of cloud-based data storage in this insightful session. 💡💻 #VectorDatabases #MachineLearning #datastorage

Timothy Spann: Building Real-time Travel Alerts

Join Timothy Spann as he takes you on a journey of 'Building Real-time Travel Alerts' 🌍🚀. Learn how to construct a dynamic streaming application using Apache NiFi, Apache Kafka, and Apache Flink, ensuring optimal performance, productivity, and development simplicity in delivering timely travel advisories. 🌐🛫 #RealTimeAlerts #Streaming #ApacheStack

Tim Frey: Active-Active Machine Learning Infrastructures For Training And Prediction

Discover how to ensure uninterrupted machine learning operations during challenging times with Tim Frey. 🤖🔄 Learn about active-active infrastructures for simultaneous training and prediction, and gain valuable insights from real-world examples. 💡🌐 #MachineLearning #ActiveInfrastructure

Sohini Pattanayak: Why Do You Need an Explainable AI?

Discover the importance of Explainable AI (XAI) with Sohini Pattanayak as she sheds light on why it's a crucial addition to the AI landscape. 🤖🔍 Learn how XAI demystifies complex AI algorithms, promotes transparency, and fosters trust. Explore real-world use cases and unlock the potential of accountable, ethical AI systems. 📊🧠 #ExplainableAI #AI #transparency

Atiba de Souza: AI Prompt Engineering to Customer Engagement: the Evolution of Effective Content

Unlock the secrets of AI Prompt Engineering for Customer Engagement with Atiba de Souza. 🤖✍️ Explore how AI is revolutionizing content ideation and creation, enhancing your ability to connect effectively with your audience throughout the Customer Value Journey. 🚀📝 #AI #customerengagement

Roberto Freato: Green Must Be Convenient

Join Roberto Freato in his session 'Green Must Be Convenient' as he unveils the evolution of database storage practices, demonstrating how Azure SQL Database efficiently manages vast amounts of JSON objects. Discover how this bridges the gap between raw data in a data lake and the relational view used by analytical applications. 🌱💾 #DatabaseStorage #azuresql

Ricardo Sueiras: Getting Started with Apache Airflow - Building Your First Workflow

Embark on your data engineering journey with Ricardo Sueiras and learn the essentials of Apache Airflow in 'Getting Started with Apache Airflow - Building Your First Workflow.' 🚀🐍 Discover the architecture and create your first workflow, perfect for beginners eager to explore this open-source powerhouse! 💻📊 #ApacheAirflow #WorkflowCreation

Pasha Finkelshteyn: Sparking Success: Unveiling the Journey of Apache Spark Application Development

Embark on a journey of Apache Spark application development with Pasha Finkelshteyn! 🚀 Explore the stages from concept to execution, delving into data exploration, transformation, and analysis powered by Spark's high-level APIs. 📊 Learn testing and validation approaches for accuracy and reliability, and empower yourself to create robust Spark applications for unlocking insights from massive datasets. 💡🔥 #ApacheSpark #BigData #DevelopmentJourney

Kacper Łukawski: The Challenges of Making Vector Search Billion-scale

Join Kacper Łukawski as he delves into 'The Challenges of Making Vector Search Billion-scale.' 🔍🌐 Explore the intricacies of semantic search with large-scale embeddings and discover the lessons learned from scaling a vector database at Qdrant. Dive deep into design choices and the robust infrastructure behind them in this enlightening session.💡🚀 #VectorSearch #Scaling #semantics

Jay Alexander Clifford: A 101 in Time Series Analytics with Apache Arrow, Pandas and Parquet

Join Jay Alexander Clifford in a deep dive into Time Series Analytics with Apache Arrow, Pandas, and Parquet. 📈🐍 Explore the power of columnar databases, and learn how to build efficient and scalable analytics applications for time series data using open-source tools. 🚀 #TimeSeries #analytics

Antonio Zarauz Moreno, Mateo Alvarez Calvo: Towards Large-scale Speech Analytics Systems

Join Antonio Zarauz Moreno and Mateo Alvarez Calvo as they delve into the world of 'Towards Large-scale Speech Analytics Systems.' 🎙️🔍 Explore the complexities of building robust speech recognition systems, combining cutting-edge ASR algorithms, diarization, and more to optimize accuracy and cost-effectiveness in various domains. 🗣️📊 #SpeechAnalytics #asroma

Effective data management has become a cornerstone of success in our digital era. It involves not just collecting and storing information but also organizing, securing, and leveraging data to drive progress and innovation. Many organizations turn to tools like Snowflake for advanced data warehousing capabilities. However, while Snowflake enhances data storage and access, it's not a complete solution for all data management challenges. To address this, tools like Capital One’s Slingshot can be used alongside Snowflake, helping to optimize costs and refine data management strategies.

Salim Syed is VP, Head of Engineering for the Capital One Slingshot product. He led Capital One’s data warehouse migration to AWS and is a specialist in deploying Snowflake to a large enterprise. Salim’s expertise lies in developing Big Data (Lake) and Data Warehouse strategy on the public cloud. He leads an organization of more than 100 data engineers, support engineers, DBAs and full stack developers in driving enterprise data lake, data warehouse, data management and visualization platform services. Salim has more than 25 years of experience in the data ecosystem. His career started in data engineering, where he built data pipelines, and then moved into maintenance and administration of large database servers using multi-tier replication architecture in various remote locations. He then worked at CodeRye as a database architect and at 3M Health Information Systems as an enterprise data architect. Salim has been at Capital One for the past six years.

In this episode, Adel and Salim explore cloud data management and the evolution of Slingshot into a major multi-tenant SaaS platform; the shift from on-premise to cloud-based data governance; the role of centralized tooling; strategies for effective cloud data management, including data governance, cost optimization, and waste reduction; and insights into navigating the complexities of data infrastructure, security, and scalability in the modern digital era.

Links Mentioned in the Show: Capital One Slingshot; Snowflake; Course: Introduction to Data Warehousing; Course: Introduction to Snowflake

What exactly is #DataManagement? How can it help us #monetize the data we have? And does this help with things like protecting #datasources? We discuss all these #questions and more on this latest #podcast episode of Data Unchained, as #CEO and #CoFounder of Masthead Data, Yuliia Tkachova, joins us to discuss these topics and how her business is #disrupting the #industry.

#datapipeline #datatables #bigdata #technology #hightech

Cyberpunk by jiglr | https://soundcloud.com/jiglrmusic Music promoted by https://www.free-stock-music.com Creative Commons Attribution 3.0 Unported License https://creativecommons.org/licenses/by/3.0/deed.en_US Hosted on Acast. See acast.com/privacy for more information.

Data Exploration and Preparation with BigQuery

In "Data Exploration and Preparation with BigQuery," Michael Kahn provides a hands-on guide to understanding and utilizing Google's powerful data warehouse solution, BigQuery. This comprehensive book equips you with the skills needed to clean, transform, and analyze large datasets for actionable business insights.

What this Book will help me do
Master the process of exploring and assessing the quality of datasets.
Learn SQL for performing efficient and advanced data transformations in BigQuery.
Optimize the performance of BigQuery queries for speed and cost-effectiveness.
Discover best practices for setting up and managing BigQuery resources.
Apply real-world case studies to analyze data and derive meaningful insights.

Author(s)
Michael Kahn is an experienced data engineer and author specializing in big data solutions and technologies. With years of hands-on experience working with Google Cloud Platform and BigQuery, he has assisted organizations in optimizing their data pipelines for effective decision-making. His accessible writing style ensures complex topics become approachable, enabling readers of various skill levels to succeed.

Who is it for?
This book is tailored for data analysts, data engineers, and data scientists who want to learn how to effectively use BigQuery for data exploration and preparation. Whether you're new to BigQuery or looking to deepen your expertise in working with large datasets, this book provides clear guidance and practical examples to achieve your goals.

Distributed Machine Learning with PySpark: Migrating Effortlessly from Pandas and Scikit-Learn

Migrate from pandas and scikit-learn to PySpark to handle vast amounts of data and achieve faster data processing times. This book will show you how to make this transition by adapting your skills and leveraging the similarities in syntax, functionality, and interoperability between these tools.

Distributed Machine Learning with PySpark offers a roadmap to data scientists considering transitioning from small data libraries (pandas/scikit-learn) to big data processing and machine learning with PySpark. You will learn to translate Python code from pandas/scikit-learn to PySpark to preprocess large volumes of data and build, train, test, and evaluate popular machine learning algorithms such as linear and logistic regression, decision trees, random forests, support vector machines, Naïve Bayes, and neural networks.

After completing this book, you will understand the foundational concepts of data preparation and machine learning and will have the skills necessary to apply these methods using PySpark, the industry standard for building scalable ML data pipelines.

What You Will Learn
Master the fundamentals of supervised learning, unsupervised learning, NLP, and recommender systems
Understand the differences between PySpark, scikit-learn, and pandas
Perform linear regression, logistic regression, and decision tree regression with pandas, scikit-learn, and PySpark
Distinguish between the pipelines of PySpark and scikit-learn

Who This Book Is For
Data scientists, data engineers, and machine learning practitioners who have some familiarity with Python, but who are new to distributed machine learning and the PySpark framework.

Fundamentals of Data Science

Fundamentals of Data Science: Theory and Practice presents basic and advanced concepts in data science along with real-life applications. The book provides students, researchers and professionals at different levels a good understanding of the concepts of data science, machine learning, data mining and analytics. Users will find the authors’ research experiences and achievements in data science applications, along with in-depth discussions on topics that are essential for data science projects, including pre-processing, which is carried out before applying predictive and descriptive data analysis tasks, and proximity measures for numeric, categorical and mixed-type data. The authors include a systematic presentation of many predictive and descriptive learning algorithms, including recent developments that have successfully handled large datasets with high accuracy. In addition, a number of descriptive learning tasks are included.

Presents the foundational concepts of data science along with advanced concepts and real-life applications for applied learning
Includes coverage of a number of key topics such as data quality and pre-processing, proximity and validation, predictive data science, descriptive data science, ensemble learning, association rule mining, Big Data analytics, as well as incremental and distributed learning
Provides updates on key applications of data science techniques in areas such as Computational Biology, Network Intrusion Detection, Natural Language Processing, Software Clone Detection, Financial Data Analysis, and Scientific Time Series Data Analysis
Covers computer program code for implementing descriptive and predictive algorithms

Google Cloud Platform for Data Science: A Crash Course on Big Data, Machine Learning, and Data Analytics Services

This book is your practical and comprehensive guide to learning Google Cloud Platform (GCP) for data science, using only the free tier services offered by the platform. Data science and machine learning are increasingly becoming critical to businesses of all sizes, and the cloud provides a powerful platform for these applications. GCP offers a range of data science services that can be used to store, process, and analyze large datasets, and train and deploy machine learning models.

The book is organized into seven chapters covering various topics such as GCP account setup, Google Colaboratory, Big Data and Machine Learning, Data Visualization and Business Intelligence, Data Processing and Transformation, Data Analytics and Storage, and Advanced Topics. Each chapter provides step-by-step instructions and examples illustrating how to use GCP services for data science and big data projects. Readers will learn how to set up a Google Colaboratory account and run Jupyter notebooks, access GCP services and data from Colaboratory, use BigQuery for data analytics, and deploy machine learning models using Vertex AI. The book also covers how to visualize data using Looker Data Studio, run data processing pipelines using Google Cloud Dataflow and Dataprep, and store data using Google Cloud Storage and SQL.

What You Will Learn
Set up a GCP account and project
Explore BigQuery and its use cases, including machine learning
Understand Google Cloud AI Platform and its capabilities
Use Vertex AI for training and deploying machine learning models
Explore Google Cloud Dataproc and its use cases for big data processing
Create and share data visualizations and reports with Looker Data Studio
Explore Google Cloud Dataflow and its use cases for batch and stream data processing
Run data processing pipelines on Cloud Dataflow
Explore Google Cloud Storage and its use cases for data storage
Get an introduction to Google Cloud SQL and its use cases for relational databases
Get an introduction to Google Cloud Pub/Sub and its use cases for real-time data streaming

Who This Book Is For
Data scientists, machine learning engineers, and analysts who want to learn how to use Google Cloud Platform (GCP) for their data science and big data projects

Many organizations abandoned data modeling as they embraced big data and NoSQL. Now they find that data modeling continues to be important, perhaps more important today than ever before. With a fresh look you’ll see that today’s data modeling is different from past practices – much more than physical design for relational data. Published at: https://www.eckerson.com/articles/a-fresh-look-at-data-modeling-part-1-the-what-and-why-of-data-modeling