talk-data.com

Topic

Big Data

data_processing analytics large_datasets

1217 tagged

Activity Trend

Peak of 28 activities per quarter, 2020-Q1 to 2026-Q1

Activities

1217 activities · Newest first

Navigate the complexities of today's digital and data landscape in our panel discussion, which underscores the essential role of data governance in the era of accelerating mis- and dis-information. As Big Data ceases to be a buzzword and becomes the lifeblood of decision-making, governance is elevated from a regulatory compliance requirement to a differentiator and a beacon of trust. This session goes beyond treating governance as a data hygiene factor and delves into the relationship between governance, value creation, and the elicitation of trust, particularly for decisions steered by AI as we travel into the world of automated risk management and decision-making.

In this flagship Big Data LDN keynote debate, conference chair and leading industry analyst Mike Ferguson welcomes executives from leading software vendors to discuss key topics in data management and analytics. Panellists will debate the impact of Generative AI, the implications of key industry trends, how best to deal with real-world customer challenges, how to build a modern data and analytics (D&A) architecture, how to manage, produce, share and govern data and AI, and the on-the-horizon issues that companies should be planning for today.

Attendees will learn best practices for data and analytics implementation in a modern data-driven enterprise from seasoned executives and an experienced industry analyst in a packed, unscripted, candid discussion.

In today’s fast-evolving big data landscape, having a solid data strategy is crucial. Our session will explore how leaders develop and execute data strategies, and whether these strategies are still essential in 2024. We will cover the key elements of a successful data strategy, the necessity of adapting to technological advancements like AI and evolving data privacy regulations, and provide insights from industry leaders who have effectively navigated these changes. We'll discuss the future trajectory of data strategies, questioning if traditional approaches still hold value or if new paradigms are emerging. This session is perfect for technology leaders, data professionals, and anyone keen on harnessing data for business success.

Join us as we unlock the secrets of data-driven strategies that drive profit, loyalty, and hyper-personalised experiences, with Capgemini and a Women in Data leadership panel.

At this year’s Big Data London, Women in Data & Capgemini are back with another must-see panel, featuring a diverse and engaging group of female data leaders and their allies from across the Retail & CPG worlds. Last year’s session was one of the most oversubscribed events of the day, with standing room only, thanks to its thought-provoking and honest discussions. This year’s panel promises the same dynamic as they tackle the conundrum of balancing margin focus with rewarding customer loyalty and how data plays a key role. 

The panellists, as well as sharing their own career journeys and experience, will explore how they've approached bold strategies that move beyond immediate profits to emphasise the long-term value of customer data and loyalty. They'll explore how data, analytics & AI can uncover deep insights into customer behaviours and preferences, enabling brands to create personalised experiences and loyalty programs that boost engagement and build lasting trust.

The discussion will highlight the importance of seeing customer data as a strategic asset. By investing in data collection and analysis, companies can identify trends, predict future behaviours, and tailor their offerings to meet evolving customer needs. This approach can drive repeat business and increase customer lifetime value, ultimately leading to higher margins over time. 

This year's panel will explore how data and boldness are key to a balanced strategy that blends margin management with a robust focus on customer loyalty. Using data smartly is essential to achieving sustainable profit growth and strengthening brand loyalty. Don't miss out on what promises to be an inspiring and insightful discussion!

Our panel discussion explores the transformative potential of big data and AI across diverse industries. We will address key technical challenges such as data management and model scalability, along with ethical and privacy concerns, including data utilization and algorithmic biases. Looking ahead, we will discuss future trends and emerging AI breakthroughs. Emphasizing interdisciplinary collaboration, we advocate for diverse teams to ensure fairness and innovation in AI solutions. This dialogue aims to illuminate the complexities and opportunities in big data and AI.

In this short presentation, Big Data LDN Conference Chairman and Europe's leading IT industry analyst in data management and analytics, Mike Ferguson, will welcome everyone to Big Data LDN 2024. He will also summarise where companies are with data, analytics and AI in 2024, what the key challenges and trends are, how these trends are shaping the way companies build a data-driven enterprise, and where you can find out more about these topics at the show.

Statistics for Data Science and Analytics

Introductory statistics textbook with a focus on data science topics such as prediction, correlation, and data exploration.

Statistics for Data Science and Analytics is a comprehensive guide to statistical analysis using Python, presenting important topics useful for data science such as prediction, correlation, and data exploration. The authors provide an introduction to statistical science and big data, as well as an overview of Python data structures and operations. A range of statistical techniques are presented with their implementation in Python, including hypothesis testing, probability, exploratory data analysis, categorical variables, surveys and sampling, A/B testing, and correlation. The text introduces binary classification, a foundational element of machine learning, validation of statistical models by applying them to holdout data, and probability and inference via the easy-to-understand method of resampling and the bootstrap instead of a myriad of "kitchen sink" formulas. Regression is taught both as a tool for explanation and for prediction. This book is informed by the authors' experience designing and teaching both introductory statistics and machine learning at Statistics.com. Each chapter includes practical examples, explanations of the underlying concepts, and Python code snippets to help readers apply the techniques themselves.

Statistics for Data Science and Analytics includes information on sample topics such as:
Int, float, and string data types, numerical operations, manipulating strings, converting data types, and advanced data structures like lists, dictionaries, and sets
Experiment design via randomizing, blinding, and before-after pairing, as well as proportions and percents when handling binary data
Specialized Python packages like numpy, scipy, pandas, scikit-learn, and statsmodels, the workhorses of data science, and how to get the most value from them
Statistical versus practical significance, random number generators, functions for code reuse, and binomial and normal probability distributions

Written by and for data science instructors, Statistics for Data Science and Analytics is an excellent learning resource for data science instructors prescribing a required intro stats course for their programs, as well as other students and professionals seeking to transition to the data science field.
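The bootstrap approach the blurb mentions is easy to see in a few lines of Python. The following is a minimal illustrative sketch, not code from the book; the sample data and numbers are invented:

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical sample: 30 observed session times in minutes
sample = rng.normal(loc=12.0, scale=4.0, size=30)

# Bootstrap: resample with replacement many times and
# recompute the statistic of interest on each resample
n_boot = 10_000
boot_means = np.array([
    rng.choice(sample, size=sample.size, replace=True).mean()
    for _ in range(n_boot)
])

# A 95% confidence interval is read off the 2.5th and
# 97.5th percentiles of the bootstrap distribution
lo, hi = np.percentile(boot_means, [2.5, 97.5])
print(f"mean = {sample.mean():.2f}, 95% CI = ({lo:.2f}, {hi:.2f})")
```

The appeal of this method, as the book argues, is that the same resampling loop works for almost any statistic, with no formula lookup required.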

Sayle Matthews leads the North American GCP Data Practice at DoiT International. Over the past year and a half, he has focused almost exclusively on BigQuery, helping hundreds of GCP customers optimize their usage and solve some of their biggest 'Big Data' challenges. Drawing on his extensive experience with Google BigQuery billing, he sat down with us to discuss the changes and, most importantly, the impact these changes have had on the market, as he has observed while working with hundreds of clients of various sizes at DoiT. Sayle's LinkedIn page - https://www.linkedin.com/in/sayle-matthews-522a795/

Polars Cookbook

Dive into the world of data analysis with the Polars Cookbook. This book, ideal for data professionals, covers practical recipes to manipulate, transform, and analyze data using the Python Polars library. You'll learn both the fundamentals and advanced techniques to build efficient and scalable data workflows.

What this book will help me do:
Master the basics of Python Polars, including installation and setup.
Perform complex data manipulation like pivoting, grouping, and joining.
Handle large-scale time series data for accurate analysis.
Understand data integration with libraries like pandas and numpy.
Optimize workflows for both on-premise and cloud environments.

Author(s): Yuki Kakegawa is an experienced data analytics consultant who has collaborated with companies such as Microsoft and Stanford Health Care. His passion for data led him to create this detailed guide on Polars. His expertise ensures you gain real-world, actionable insights from every chapter.

Who is it for? This book is perfect for data analysts, engineers, and scientists eager to enhance their efficiency with Python Polars. If you are familiar with Python and tools like pandas but are new to Polars, this book will upskill you. Whether handling big data or optimizing code for performance, the Polars Cookbook has the guidance you need to succeed.
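As a flavor of what such recipes look like, here is a minimal sketch (not from the book) using Polars' lazy API on a hypothetical sales.csv with region, product, and amount columns:

```python
import polars as pl

# Lazy scan: Polars can optimize the whole query plan before reading the file.
# sales.csv and its columns are hypothetical.
lazy = (
    pl.scan_csv("sales.csv")
    .filter(pl.col("amount") > 0)
    .group_by("region", "product")
    .agg(
        pl.col("amount").sum().alias("total"),
        pl.len().alias("orders"),
    )
    .sort("total", descending=True)
)

df = lazy.collect()  # execution happens here, not above
print(df.head())
```

Deferring execution with scan_csv/collect is one of the habits the book encourages, since it lets the query optimizer push filters down and read only what is needed.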

Guardrails are not something we actively use in our day-to-day lives; they're in place to keep us safe when we lack the control needed to keep us on course, and for that, they are essential. Navigating the complexities of decision-making in AI and data can be challenging, especially on a global scale when many are searching for any sort of competitive advantage. Every choice you make can have significant impacts, and having the right frameworks, ethics, and guardrails in place is crucial. But how do you create systems that guide decisions without stifling creativity or flexibility? What practices can you employ to ensure your team consistently makes better choices and flourishes in the age of AI?

Viktor Mayer-Schönberger is a distinguished Professor of Internet Governance and Regulation at the Oxford Internet Institute, University of Oxford. With a career spanning decades, his research focuses on the role of information in a networked economy. He previously served on the faculty of Harvard's Kennedy School of Government for ten years and has authored several influential books, including the award-winning "Delete: The Virtue of Forgetting in the Digital Age" and the international bestseller "Big Data." Viktor founded Ikarus Software in 1986, where he developed Virus Utilities, Austria's best-selling software product. He has been recognized as a Top-5 Software Entrepreneur in Austria and has served as a personal adviser to the Austrian Finance Minister on innovation policy. His work has garnered global attention, featuring in major outlets like the New York Times, BBC, and The Economist. Viktor is also a frequent public speaker and an advisor to governments, corporations, and NGOs on issues related to the information economy.

In the episode, Richie and Viktor explore the definition of guardrails, characteristics of good guardrails, guardrails in business contexts, life-or-death decision-making, principles of effective guardrails, decision-making and cognitive bias, uncertainty in decision-making, designing guardrails, AI and the implementation of guardrails, and much more.

Links Mentioned in the Show:
Guardrails: Guiding Human Decisions in the Age of AI by Urs Gasser and Viktor Mayer-Schönberger
Book - The Checklist Manifesto by Atul Gawande
Connect with Viktor
Course - AI Ethics
Related Episode: Making Better Decisions using Data & AI with Cassie Kozyrkov, Google's First Chief Decision Scientist
Rewatch sessions from RADAR: AI Edition

New to DataCamp? Learn on the go using the DataCamp mobile app. Empower your business with world-class data and AI skills with DataCamp for Business.

DuckDB in Action

Dive into DuckDB and start processing gigabytes of data with ease, all with no data warehouse. DuckDB is a cutting-edge SQL database that makes it incredibly easy to analyze big data sets right from your laptop. In DuckDB in Action you'll learn everything you need to know to get the most out of this awesome tool, keep your data secure on prem, and save hundreds on your cloud bill. From data ingestion to advanced data pipelines, you'll learn everything you need to get the most out of DuckDB, all through hands-on examples.

Open up DuckDB in Action and learn how to:
Read and process data from CSV, JSON, and Parquet sources, both local and remote
Write analytical SQL queries, including aggregations, common table expressions, window functions, special types of joins, and pivot tables
Use DuckDB from Python, both with SQL and its "Relational" API, interacting with databases but also data frames
Prepare, ingest, and query large datasets
Build cloud data pipelines
Extend DuckDB with custom functionality

Pragmatic and comprehensive, DuckDB in Action introduces the DuckDB database and shows you how to use it to solve common data workflow problems. You won't need to read through pages of documentation; you'll learn as you work. Get to grips with DuckDB's unique SQL dialect, learning to seamlessly load, prepare, and analyze data using SQL queries. Extend DuckDB with both Python and built-in tools such as MotherDuck, and gain practical insights into building robust and automated data pipelines.

About the Technology: DuckDB makes data analytics fast and fun! You don't need to set up Spark or run a cloud data warehouse just to process a few hundred gigabytes of data. DuckDB is easily embeddable in any data analytics application, runs on a laptop, and processes data from almost any source, including JSON, CSV, Parquet, SQLite, and Postgres.

About the Book: DuckDB in Action guides you example by example from setup, through your first SQL query, to advanced topics like building data pipelines and embedding DuckDB as a local data store for a Streamlit web app. You'll explore DuckDB's handy SQL extensions, get to grips with aggregation, analysis, and data without persistence, and use Python to customize DuckDB. A hands-on project accompanies each new topic, so you can see DuckDB in action.

What's Inside:
Prepare, ingest, and query large datasets
Build cloud data pipelines
Extend DuckDB with custom functionality
Fast-paced SQL recap: from simple queries to advanced analytics

About the Reader: For data pros comfortable with Python and CLI tools.

About the Authors: Mark Needham is a blogger and video creator at @LearnDataWithMark. Michael Hunger leads product innovation for the Neo4j graph database. Michael Simons is a Java Champion, author, and Engineer at Neo4j.

Quotes:
"I use DuckDB every day, and I still learned a lot about how DuckDB makes things that are hard in most databases easy!" - Jordan Tigani, Founder, MotherDuck
"An excellent resource! Unlocks possibilities for storing, processing, analyzing, and summarizing data at the edge using DuckDB." - Pramod Sadalage, Director, Thoughtworks
"Clear and accessible. A comprehensive resource for harnessing the power of DuckDB for both novices and experienced professionals." - Qiusheng Wu, Associate Professor, University of Tennessee
"Excellent! The book all we ducklings have been waiting for!" - Gunnar Morling, Decodable
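As a taste of how lightweight DuckDB is from Python, here is a minimal sketch (not taken from the book); the events.parquet file and its columns are hypothetical:

```python
import duckdb

# Query a Parquet file in place: no import step, no server.
# events.parquet is a hypothetical file with user_id, event_type, ts columns.
result = duckdb.sql("""
    SELECT event_type,
           count(*)                AS n,
           count(DISTINCT user_id) AS users
    FROM 'events.parquet'
    GROUP BY event_type
    ORDER BY n DESC
""")

result.show()     # pretty-print in the terminal
df = result.df()  # or hand the result to pandas (requires pandas installed)
```

The same pattern works for CSV and JSON sources, local or remote, which is what makes DuckDB attractive for laptop-scale analytics.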

podcast_episode
by Val Kroll, Julie Hoyer, Tim Wilson (Analytics Power Hour - Columbus (OH)), Katie Bauer (GlossGenius), Moe Kiss (Canva), Michael Helbling (Search Discovery)

Broadly writ, we're all in the business of data work in some form, right? It's almost like we're all swimming around in a big data lake, and our peers are swimming around it, too, and so are our business partners. There might be some HiPPOs and some SLOTHs splashing around in the shallow end, and the contours of the lake keep changing. Is lifeguarding…or writing SQL…or prompt engineering to get AI to write SQL…or identifying business problems a job or a skill? Does it matter? Aren't we all just trying to get to the Insights Water Slide? Katie Bauer, Head of Data at Gloss Genius and thought-provoker at Wrong But Useful, joined Michael, Julie, and Val for a much less metaphorically tortured exploration of the ever-shifting landscape in which the modern data professional operates. Or swims. Or sinks? For complete show notes, including links to items mentioned in this episode and a transcript of the show, visit the show page.

[0:00] Host: Hi everyone, welcome to our event. This event is brought to you by DataTalks.Club, a community of people who love data, and we have weekly events; today's is one of them. I guess we are also a community of people who like to wake up early, if you're from the States, right, Christopher? Or maybe not so much, because this is when we usually have our events. For guests and presenters from the States we usually do it in the evening, Berlin time, but unfortunately that slipped my mind. Anyway, we have a lot of events; you can check them at the link in the description. I don't think there are many listed there right now, but we will be adding more, and I think we have five or six interviews scheduled, so keep an eye on that. Don't forget to subscribe to our YouTube channel; that way you will get notified about all our future streams, which will be as awesome as today's. And, very important, don't forget to join our community, where you can hang out with other data enthusiasts. During today's interview you can ask any question: there's a pinned link in the live chat, so click on it, ask your question, and we will cover it during the interview. Now I will stop sharing my screen. And there's a message from Christopher; we actually have this on YouTube, so people haven't seen what you wrote, but there is a message to everyone watching right now from Christopher saying hello everyone.

Chris: Okay, I should look at YouTube then.

Host: You don't need to; you'll need to focus on answering questions, and I'll be keeping an eye on all the questions. So, if you're ready, we can start.

Chris: I'm ready.

Host: And you prefer Christopher, not Chris, right?

Chris: Chris is fine. It's a bit shorter.

[2:18] Host: Okay. So this week we'll talk about DataOps again. Maybe it's a tradition that we talk about DataOps once per year, though we actually skipped one year, because we haven't had Chris for some time. Today we have a very special guest. Christopher is the co-founder, CEO, and head chef at DataKitchen, with 25 years of experience in analytics and software engineering; maybe that number is outdated, because by now you probably have more, and maybe you've stopped counting. Christopher is known as the co-author of the DataOps Cookbook and the DataOps Manifesto, and it's not the first time we've had him on the podcast: we interviewed him two years ago, also about DataOps. This one will be about DataOps too, so we'll catch up and see what actually changed in these two years. Welcome to the interview.

Chris: Well, thank you for having me. I'm happy to be here, talking all things related to DataOps, why bother with DataOps, and happy to talk about the company and what's changed. Excited.

Host: Yeah, so let's dive in. The questions for today's interview were prepared by Johanna Berer, as always; thanks, Johanna, for your help. Before we start with our main topic for today, DataOps, let's start with your background. Can you tell us about your career journey so far? For those who have not listened to the previous podcast, maybe you can talk about yourself, and for those who did, perhaps give a summary of what has changed in the last two years.

[4:03] Chris: Will do. So, my name is Chris, and I guess I'm sort of an engineer. I spent about the first 15 years of my career in software, working on and building some AI systems and some non-AI systems, at the US's NASA and MIT Lincoln Lab, then some startups, and then Microsoft. And then, around 2005, I got the data bug. My kids were small, and I thought: oh, this data thing will be easy, and I'll be able to go home for dinner at 5, and life will be fine.

Host: Because you started your own company, right?

Chris: And it didn't work out that way. What was interesting is that, for me, the problem wasn't doing the data. We had smart people who did data science and data engineering, the act of creating things. It was the systems around the data that were hard. It was really hard not to have errors in production. I had a BlackBerry at the time and a long drive to work, and I would not look at it all morning. I'd sit in the parking lot, take a deep breath, look at my BlackBerry, and go: uh oh, are there going to be any problems today? If there weren't, I'd walk in very happy, and if there were, I'd have to brace myself. And then the second problem was that the team I worked for just couldn't go fast enough. The customers were super demanding; they didn't care; they always thought things should be faster, and we were always behind. So how do you live in that world, where things are breaking left and right, you're terrified of making errors, and on top of that you just can't go fast enough?

Host: And this is the pre-Hadoop era, right? Before all this big data tech.

Chris: Yeah, before all that. We were using SQL Server, and we had smart people, so we built an engine inside SQL Server that turned it into a columnar database, in order to make certain things fast. And it wasn't bad; the principles are the same. Before Hadoop it's still a database: there are still indexes, still queries, things like that. At the time you would use OLAP engines; we didn't use those, but the reports and models are not that different. We had a rack of servers instead of the cloud.

What I took from that was that it's just hard to run a team of people doing data and analytics. I took it from a manager's perspective: I started to read Deming and to think about the work that we do as a factory, a factory that produces insight rather than automobiles. How do you run that factory so it produces things of good quality? And then, second, since I had come from software, I've been very influenced by the DevOps movement: how you automate deployment, how you run in an agile way, how you change things quickly and innovate. Those two things, running a really solid production line with very low errors, and changing that production line very often, are kind of opposites, right? So how do you, as a manager, and technically, approach that? Then, 10 years ago, we started DataKitchen. We've always been a profitable company, so we started off with some customers and started building some software, and we realized that we couldn't work any other way, and that the way we work wasn't understood by a lot of people, so we had to write a book and a manifesto to share our methods. So we've now been in business a little over 10 years.

[8:33] Host: That's cool. So let's talk about DataOps. You mentioned DevOps and how you were inspired by it. By the way, do you remember roughly when DevOps started to appear, when people started calling these principles, and the tools around them, DevOps?

Chris: Well, first of all, I had a boss in 1990 at NASA who had this idea: build a little, test a little, learn a lot. That was his mantra, and it made a lot of sense. Then the Agile Software Manifesto came out in 2001, which is very similar. And the first real DevOps was a guy at Twitter who started to do automated, push-a-button deployment, and that was 2009-ish; the first DevOps meetup, I think, was around then. So it's been about 15 years.

[9:39] Host: I was trying to place it. I started my career in 2010, and my first job was as a Java developer. I remember that for some things we would just SFTP to the machine, put the jar archive there, and keep our fingers crossed that it didn't break. I wouldn't really call it deployment.

Chris: You were deploying; you had a deploy process, I'd put it that way. Was it documented, too? Like: put the jar on production, cross your fingers?

Host: I think there was a page on some internal wiki, with passwords, that described what you should do.

Chris: That was it. And I think what's interesting is why that changed. We laugh at it now, but why didn't you invest in automating deployment, or in a whole bunch of automated regression tests that would run? Because I think in software now it would be rare for people not to use CI/CD, not to have some automated tests, you know, functional regression tests. That would be the
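As a minimal sketch of the kind of automated check Chris describes (invented for illustration, not taken from the interview), here are two pytest-style data regression tests that a CI/CD pipeline could run before promoting a data pipeline's output; the file path and column names are hypothetical:

```python
# Run with: pytest test_orders.py
import pandas as pd

def load_orders() -> pd.DataFrame:
    # Hypothetical pipeline output; in CI this would read a staging table or file
    return pd.read_parquet("staging/orders.parquet")

def test_orders_have_no_null_keys():
    # Every row must carry a primary key before promotion to production
    orders = load_orders()
    assert orders["order_id"].notna().all()

def test_revenue_is_non_negative():
    # Guard against sign errors introduced by upstream transformations
    orders = load_orders()
    assert (orders["amount"] >= 0).all()
```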

Big Data on Kubernetes

Big Data on Kubernetes is your comprehensive guide to leveraging Kubernetes for scalable and efficient big data solutions. You will learn key concepts of Kubernetes architecture and explore tools like Apache Spark, Airflow, and Kafka. Gain hands-on experience building complete data pipelines to tackle real-world data challenges.

What this book will help me do:
Understand Kubernetes architecture and learn to deploy and manage clusters.
Build and orchestrate big data pipelines using Spark, Airflow, and Kafka.
Develop scalable and resilient data solutions with Docker and Kubernetes.
Integrate and optimize data tools for real-time ingestion and processing.
Apply concepts to hands-on projects addressing actual big data scenarios.

Author(s): Neylson Crepalde is an experienced data specialist with extensive knowledge of Kubernetes and big data solutions. With deep practical experience, Neylson brings real-world insights to his writing. His approach emphasizes actionable guidance and relatable problem-solving with a strong foundation in scalable architecture.

Who is it for? This book is ideal for data engineers, BI analysts, data team leaders, and tech managers familiar with Python, SQL, and YAML. Targeted at professionals seeking to develop or expand their expertise in scalable big data solutions, it provides practical insights into Docker, Kubernetes, and prominent big data tools.
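To make the orchestration side concrete, here is a minimal, hypothetical Airflow DAG that launches a containerized Spark job as a pod on a Kubernetes cluster. This is an illustrative sketch, not an example from the book; the image name, namespace, and script path are placeholders:

```python
from datetime import datetime

from airflow import DAG
# Requires the apache-airflow-providers-cncf-kubernetes provider package
from airflow.providers.cncf.kubernetes.operators.pod import KubernetesPodOperator

# A minimal DAG (Airflow 2.x) that runs one task as a pod on the cluster.
with DAG(
    dag_id="spark_batch_on_k8s",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    run_spark_job = KubernetesPodOperator(
        task_id="run_spark_job",
        name="spark-batch",
        namespace="data-pipelines",                 # placeholder namespace
        image="my-registry/spark-job:latest",       # placeholder image
        cmds=["spark-submit"],
        arguments=["--master", "k8s://https://kubernetes.default.svc",
                   "/app/job.py"],                  # placeholder script path
        get_logs=True,
    )
```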

Data quality is the foundation of everything we do as Data Analysts and Data Scientists. So why do so many organizations suffer from dirty data? And what can you do to clean it up? In this session, we'll share some of the best data cleaning strategies and real, actionable advice from The Classification Guru, Susan Walsh. You'll leave with a solid plan to start identifying problems with your data and, most importantly, to start fixing them on the path to clean data.

What You'll Learn:
Why dirty data is such a big problem, and the benefits of cleaning it up
The most common types of dirty data you should be on the lookout for
Where you should focus your data cleaning efforts to make the biggest impact

Register for free to be part of the next live session: https://bit.ly/3XB3A8b

About our guest: Susan Walsh is a specialist in data classification, taxonomy customisation, and data cleansing. She also created the COAT philosophy, which is at the core of The Classification Guru's work. By bringing clarity and accuracy to data and procurement, Susan helps teams work more effectively and efficiently. More than a numbers gal, Susan is also an industry thought leader, TEDx speaker, and author of 'Between the Spreadsheets: Classifying and Fixing Dirty Data'. She has spoken globally at events such as ProcureCon, Big Data LDN, and Big Data & AI World, and she cuts through the jargon to address the issues of dirty data and its consequences in an entertaining and engaging way. Fix your dirty data now: www.theclassificationguru.co
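As a small illustration of the kind of fix such a session covers, here is a hypothetical pandas sketch that standardizes one classic type of dirty data, inconsistent supplier names; the values and mapping are invented:

```python
import pandas as pd

# Hypothetical dirty supplier column: the same vendors, many spellings
df = pd.DataFrame({"supplier": [
    "IBM", "I.B.M.", "ibm corp", " Microsoft ", "MICROSOFT", "Microsft",
]})

# Step 1: normalize whitespace, case, and punctuation
cleaned = (
    df["supplier"]
    .str.strip()
    .str.lower()
    .str.replace(r"[.\s]+", " ", regex=True)
    .str.strip()
)

# Step 2: map known variants (including typos) to one canonical name;
# anything unrecognized falls back to the raw value for manual review
canonical = {
    "ibm": "IBM", "i b m": "IBM", "ibm corp": "IBM",
    "microsoft": "Microsoft", "microsft": "Microsoft",
}
df["supplier_clean"] = cleaned.map(canonical).fillna(df["supplier"])

print(df)
```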

The talk will cover how we use Airflow at the heart of our Workflow Management Platform (WFM) at Booking.com, enabling our internal users to orchestrate big data workflows on Booking Data Exchange (BDX). High-level overview of the talk:
Adapting the open source Airflow Helm chart to spin up an Airflow installation in Booking Kubernetes Service (BKS)
Coming up with a workflow definition format (yaml)
Conversion of workflow.yaml to workflow.py DAGs
Usage of deferrable operators to provide standard step templates to users
Workspaces (collections of workflows), used to ensure role-based access to DAG permissions for users
Using Okta for authentication
Alerting, monitoring, logging
Plans to shift to Astronomer
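Booking.com's internal schema is not public, so the following is a purely illustrative sketch of the general workflow.yaml-to-DAG pattern the talk describes: a small yaml spec is parsed and turned into Airflow tasks at DAG-parse time. All field names, the schema, and the operator choice are assumptions:

```python
# Illustrative only: a toy version of the workflow.yaml -> DAG pattern.
# The real Booking.com format and step operators are internal.
from datetime import datetime

import yaml  # PyYAML
from airflow import DAG
from airflow.operators.bash import BashOperator

WORKFLOW_YAML = """
name: daily_exports
schedule: "@daily"
steps:
  - id: extract
    command: "python extract.py"
  - id: load
    command: "python load.py"
    depends_on: [extract]
"""

spec = yaml.safe_load(WORKFLOW_YAML)

with DAG(
    dag_id=spec["name"],
    schedule=spec["schedule"],
    start_date=datetime(2024, 1, 1),
    catchup=False,
) as dag:
    # One task per declared step
    tasks = {
        step["id"]: BashOperator(task_id=step["id"], bash_command=step["command"])
        for step in spec["steps"]
    }
    # Wire up the dependencies declared in the yaml
    for step in spec["steps"]:
        for upstream in step.get("depends_on", []):
            tasks[upstream] >> tasks[step["id"]]
```

In a production version of this pattern, the plain BashOperator would typically be replaced by the standard, deferrable step templates the talk mentions, so users declare intent in yaml while the platform controls execution.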