talk-data.com

Topic

Big Data

data_processing analytics large_datasets

1217 tagged

Activity Trend: 28 peak/qtr, 2020-Q1 to 2026-Q1

Activities

1217 activities · Newest first

Sponsored by: Deloitte | Advancing AI in Cybersecurity with Databricks & Deloitte: Data Management & Analytics

Deloitte is observing a growing trend among cybersecurity organizations to develop big data management and analytics solutions beyond traditional Security Information and Event Management (SIEM) systems. Leveraging Databricks to extend these SIEM capabilities, Deloitte can help clients lower the cost of cyber data management while enabling scalable, cloud-native architectures. Deloitte helps clients design and implement cybersecurity data meshes, using Databricks as a foundational data lake platform to unify and govern security data at scale. Additionally, Deloitte extends clients’ cybersecurity capabilities by integrating advanced AI and machine learning solutions on Databricks, driving more proactive and automated cybersecurity solutions. Attendees will gain insight into how Deloitte is utilizing Databricks to manage enterprise cyber risks and deliver performant and innovative analytics and AI insights that traditional security tools and data platforms aren’t able to deliver.

Handbook of Decision Analysis, 2nd Edition

Qualitative and quantitative techniques to apply decision analysis to real-world decision problems, supported by sound mathematics, best practices, soft skills, and more.

With substantive illustrations based on the authors’ personal experiences throughout, Handbook of Decision Analysis describes the philosophy, knowledge, science, and art of decision analysis. Key insights from decision analysis applications and behavioral decision analysis research are presented, and numerous decision analysis textbooks, technical books, and research papers are referenced for comprehensive coverage. This book does not introduce new decision analysis mathematical theory, but rather ensures the reader can understand and use the most common mathematics and best practices, allowing them to apply rigorous decision analysis with confidence.

The material is supported by examples and solution steps using Microsoft Excel and includes many challenging real-world problems. Given the increase in the availability of data due to the development of products that deliver huge amounts of data, and the development of data science techniques and academic programs, a new theme of this Second Edition is the use of decision analysis techniques with big data and data analytics.

Written by a team of highly qualified professionals and academics, Handbook of Decision Analysis includes information on:
- Behavioral decision-making insights, decision framing opportunities, collaboration with stakeholders, information assessment, and decision analysis modeling techniques
- Principles of value creation through designing alternatives, clear value/risk tradeoffs, and decision implementation
- Qualitative and quantitative techniques for each key decision analysis task, as opposed to presenting one technique for all decisions
- Stakeholder analysis, decision hierarchies, and influence diagrams to frame descriptive, predictive, and prescriptive analytics decision problems to ensure implementation success

Handbook of Decision Analysis is a highly valuable textbook, reference, and/or refresher for students and decision professionals in business, management science, engineering, engineering management, operations management, mathematics, and statistics who want to increase the breadth and depth of their technical and soft skills for success when faced with a professional or personal decision.

Security is often referred to as a big data problem, and the growing volume, variety, and nuance of security telemetry requires more sophistication and control than ever before. This session dives into the new data pipeline management capabilities of Google Security Operations. We’ll show how you can improve the way you route, reduce, redact, enrich, and transform security data to manage scale, reduce costs, satisfy compliance mandates, and, most importantly, drive better security outcomes.

Shifting Left in Banking: Enhancing Machine Learning Models through Proactive Data Quality | Abhi Ghosh | Shift Left Data Conference 2025

Good data, not big data, is becoming more important in today's ecosystem. Machine learning models rely on good-quality data to make their training more efficient and effective. We have traditionally applied data quality checks and balances in a manual, centralized way, putting a lot of onus on our customers. Shifting data quality left brings the checks closer to where data is created and prevents bad data from flowing downstream. Auto-detecting, recommending, and auto-enforcing data quality rules will also make our customers' job easier while creating a more mature and robust data ecosystem.

Time Series Analysis with Spark

Time Series Analysis with Spark provides a practical introduction to leveraging Apache Spark and Databricks for time series analysis. You'll learn to prepare, model, and deploy robust and scalable time series solutions for real-world applications. From data preparation to advanced generative AI techniques, this guide prepares you to excel in big data analytics.

What this Book will help me do
- Understand the core concepts and architectures of Apache Spark for time series analysis.
- Learn to clean, organize, and prepare time series data for big data environments.
- Gain expertise in choosing, building, and training various time series models tailored to specific projects.
- Master techniques to scale your models in production using Spark and Databricks.
- Explore the integration of advanced technologies such as generative AI to enhance predictions and derive insights.

Author(s)
Yoni Ramaswami, a Senior Solutions Architect at Databricks, has extensive experience in data engineering and AI solutions. With a focus on creating innovative big data and AI strategies across industries, Yoni authored this book to empower professionals to efficiently handle time series data. Yoni's approachable style ensures that both foundational concepts and advanced techniques are accessible to readers.

Who is it for?
This book is ideal for data engineers, machine learning engineers, data scientists, and analysts interested in enhancing their expertise in time series analysis using Apache Spark and Databricks. Whether you're new to time series or looking to refine your skills, you'll find both foundational insights and advanced practices explained clearly. A basic understanding of Spark is helpful but not required.

How do you make sense of massive, interconnected datasets across time? In this episode of Data Unchained, we sit down with Ben Steer, Founder and CTO of Pometry, to explore the power of temporal graph analytics, a revolutionary approach called "Big Data, Small Box," and how data can help prevent fraud and black market trading.

#DataUnchained #EnterpriseData #CIO #CTO #CISO #DataStrategy #DigitalTransformation #BigData #CloudComputing #GraphAnalytics #AI #MachineLearning #DataEngineering #DataSecurity #BusinessIntelligence #TechLeadership #TechInnovation #AIinBusiness #ITStrategy #CyberSecurity #HPC #CloudCostOptimization #DataScience #Podcast #TechPodcast #BusinessPodcast #DataPodcast #Innovation

Cyberpunk by jiglr | https://soundcloud.com/jiglrmusic Music promoted by https://www.free-stock-music.com Creative Commons Attribution 3.0 Unported License https://creativecommons.org/licenses/by/3.0/deed.en_US Hosted on Acast. See acast.com/privacy for more information.

If you want to build a strong career in data, this show is for you. We welcomed the new face of Mavens of Data, Kristen Kehrer, who shared her best advice for data professionals and those aspiring toward a data career. You'll leave the show with some actionable tips and some of the best career advice directly from one of our favorite data pros of all time.

What You'll Learn:
- What you should focus on if you're trying to land your first job
- How to succeed once you are in that initial role
- How to think about building a successful career long-term

Register for free to be part of the next live session: https://bit.ly/3XB3A8b

About our guest: Kristen Kehrer has been providing innovative & practical statistical modeling solutions in the utilities, healthcare, and eCommerce sectors since 2010. Alongside her professional accomplishments, she achieved recognition as a LinkedIn Top Voice in Data Science & Analytics in 2018. Kristen is also the founder of Data Moves Me, LLC, and has previously served as a faculty member and subject matter expert at the Emeritus Institute of Management and UC Berkeley Ext.

Kristen lights up on stage and has spoken at conferences such as ODSC, DataScienceGO, BI+Analytics Conference, Boye Conference, and Big Data LDN.

She holds a Master of Science degree in Applied Statistics from Worcester Polytechnic Institute and a Bachelor of Science degree in Mathematics.

datamovesme.com

In this special replay episode of Making Data Simple, Al Martin sits down with Matt Cowell, CEO of QuantHub, to dive deep into data literacy, upskilling, and solving learning challenges. Matt shares his expertise on defining data fluency, the best ways to learn, and how organizations can close the data skill gap. From client use cases to leadership insights, this episode is packed with valuable takeaways for businesses and individuals navigating the data-driven world.

Show Notes & Chapter Markers:
⏳ 2:25 – From SVP of Products to Data Learning Business
📊 3:48 – Defining Data Literacy
🎓 5:50 – Teaching the Products
🚧 7:36 – What’s Out of Scope?
🏢 12:50 – Client Use Case
💡 18:07 – Solving Learning Problems
📖 21:14 – What Does a Learning Plan Look Like?
🔍 25:08 – Defining Micro
🧠 30:20 – Best Ways to Learn
📈 33:14 – Measuring Success
💰 34:47 – Venture Capital Funding
🌟 36:10 – Fundamental Leadership Belief
🔑 38:24 – The Most Valuable Leadership Skill

🔗 Connect & Resources:
QuantHub
Matt Cowell on LinkedIn
Books Mentioned: Monetizing Innovation, Ultra Learning

Connect with the Team:
🎤 Host: Al Martin
🎬 Producers: Kate Mayne
📩 Want to be a guest? Reach out to [email protected] and tell us why you should be next!

📢 Hashtags: #MakingDataSimple #DataLiteracy #Upskilling #AI #BigData #TechPodcast #Leadership #LearnData #QuantHub

Want to be featured as a guest on Making Data Simple? Reach out to us at [email protected] and tell us why you should be next. The Making Data Simple Podcast is hosted by Al Martin, WW VP Technical Sales, IBM, where we explore trending technologies, business innovation, and leadership ... while keeping it simple & fun.

Welcome back to another podcast episode of Data Unchained. Jon Toor, CMO of Cloudian, joins us at Supercomputing 2024 to discuss the future of decentralized data management, the evolving landscape of AI-driven storage, and what the next steps look like for metadata and object storage.

#DataUnchained #Supercomputing2024 #AI #GPUComputing #ObjectStorage #GPUDirect #Cloudian #Hammerspace #DataScience #MachineLearning #AIInfrastructure #DataStorage #TechPodcast #ArtificialIntelligence #SC24 #BigData #DataManagement


As AI continues to dominate industry conversations, the notion of AI readiness becomes a focal point for organizations. It's a multifaceted challenge that goes beyond technology, encompassing business processes and cultural shifts. For professionals, this means grappling with questions like: How do you choose the right AI projects that align with business goals? What skills and team structures are necessary to support AI initiatives? And how do you manage the change that comes with integrating AI into your operations?

Venky Veeraraghavan is the Chief Product Officer at DataRobot. As CPO, Venky drives the definition and delivery of the DataRobot Enterprise AI Suite. Venky has twenty-five years of experience focusing on big data and AI as a product leader and technical consultant at top technology companies (Microsoft) and early-stage startups (Trilogy).

In the episode, Richie and Venky Veeraraghavan explore AI readiness in organizations, the importance of aligning AI with business processes, the roles and skills needed for AI integration, the balance between building and buying AI solutions, the challenges of implementing AI-driven changes, and much more.

Links Mentioned in the Show:
- DataRobot
- Connect with Venky
- Skill Track: Artificial Intelligence (AI) Leadership
- Related Episode: Aligning AI with Enterprise Strategy with Leon Gordon, CEO at Onyx Data
- Attend RADAR Skills Edition

New to DataCamp? Learn on the go using the DataCamp mobile app. Empower your business with world-class data and AI skills with DataCamp for business.

On this podcast episode of Data Unchained, David Cerf, Chief Data Evangelist for GRAU DATA GmbH, joins us to talk about how metadata is playing a big part in accessing large data sets and helping researchers, scientists, and engineers make sense of billions of files at scale.

#DataUnchained #TechPodcast #Supercomputing #AIInfrastructure #CloudComputing #DataManagement #Metadata #DataScience #BigData #HPC #HighPerformanceComputing #AIRevolution #DigitalTransformation #DataAnalytics #EnterpriseTech #FutureOfData #TechTalk #Innovation #CloudStorage #PodcastLife #TechInsights #ITInfrastructure #TechLeaders #MachineLearning #DataDriven


Where Data Science Meets Shrek: How BuzzFeed uses AI

By introducing a range of AI-enhanced products that amplify creativity and interactivity across our platforms, BuzzFeed has been able to connect with the largest global audience of young people online to cement its role as the defining digital media company of the AI era. Notably, some of BuzzFeed's most successful tools and content experiences thrive on the power of small, focused datasets. Still wondering how Shrek fits into the picture? You'll have to watch!

Video from: https://smalldatasf.com/

📓 Resources
- Big Data is Dead: https://motherduck.com/blog/big-data-...
- Small Data Manifesto: https://motherduck.com/blog/small-dat...
- Why Small Data?: https://benn.substack.com/p/is-excel-...
- Small Data SF: https://www.smalldatasf.com/

Blog: https://motherduck.com/blog/


Discover how BuzzFeed's Data team, led by Gilad Cohen, harnesses AI for creative purposes, leveraging large language models (LLMs) and generative image capabilities to enhance content creation. This video explores how machine learning teams build tools to create new interactive media experiences, focusing on augmenting creative workflows rather than replacing jobs, allowing readers to participate more deeply in the content they consume.

We dive into the core data science problem of understanding what a piece of content is about, a crucial step for improving content recommendation systems. Learn why traditional methods fall short and how the team is constantly seeking smaller, faster, and more performant models. This exploration covers the evolution from earlier architectures like DistilBERT to modern, more efficient approaches for better content representation, clustering, and user personalization.

A key technique explored is the use of text embeddings, which are dense, low-dimensional vector representations of data. This video provides an accessible explanation of embeddings as a form of compressed knowledge, showing how BuzzFeed creates a unique vector for each article. This allows for simple vector math to find semantically similar content, forming a foundational infrastructure for powerful ranking and recommender systems.
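The "simple vector math" described above is commonly done with cosine similarity between article vectors. The following is a minimal sketch in Python, not BuzzFeed's actual system: the article IDs and three-dimensional vectors are made up for illustration, whereas real embeddings come from a trained model and typically have hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(query_id, embeddings, top_k=2):
    """Rank all other articles by similarity to the query article."""
    query = embeddings[query_id]
    scored = [
        (other_id, cosine_similarity(query, vec))
        for other_id, vec in embeddings.items()
        if other_id != query_id
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]


# Hypothetical toy embeddings, one vector per article (assumption).
ARTICLE_EMBEDDINGS = {
    "shrek-quiz": [0.9, 0.1, 0.0],
    "ogre-recipes": [0.8, 0.2, 0.1],
    "tax-tips": [0.0, 0.1, 0.9],
}

if __name__ == "__main__":
    for article_id, score in most_similar("shrek-quiz", ARTICLE_EMBEDDINGS):
        print(f"{article_id}: {score:.3f}")
```

Because each article is a fixed-length vector, the same ranking function can serve as a first-pass candidate generator for a recommender; at scale, the linear scan would usually be replaced by an approximate nearest-neighbor index.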

Explore how BuzzFeed leverages generative image capabilities to create new interactive formats. The journey began with Midjourney experiments and evolved to building custom tools by fine-tuning a Stable Diffusion XL model using LoRA (Low-Rank Adaptation). This advanced technique provides greater control over image output, enabling the rapid creation of viral AI generators that respond to trending topics and allow for massive user engagement.
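For readers unfamiliar with LoRA, its core idea can be written as a single update rule. The symbols below follow the general LoRA literature and are not taken from the video: $W_0$ is a frozen pretrained weight matrix of shape $d \times k$, $B$ and $A$ are the small trainable factors, $r$ is the chosen rank, and $\alpha$ is a scaling hyperparameter.

```latex
W = W_0 + \Delta W = W_0 + \frac{\alpha}{r}\, B A,
\qquad B \in \mathbb{R}^{d \times r},\quad
A \in \mathbb{R}^{r \times k},\quad
r \ll \min(d, k)
```

Only $B$ and $A$ are trained, so the number of trainable parameters per adapted matrix drops from $dk$ to $r(d + k)$. As an illustrative case, with $d = k = 1024$ and $r = 8$ that is 16,384 parameters instead of 1,048,576.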

Finally, see a practical application of machine learning for content optimization. BuzzFeed uses its vast historical dataset from Bayesian A/B testing to train a model that predicts headline performance. By generating multiple headline candidates with an LLM like Claude and running them through this predictive model, they can identify the winning headline. This showcases how to use unique, in-house data to build powerful tools that improve click-through rates and drive engagement, pointing to a significant transformation in how media is created and consumed.
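The description does not say how BuzzFeed's Bayesian A/B tests are modeled. As a rough illustration of the general technique, the sketch below uses the standard Beta-Binomial model for click-through rate with a uniform Beta(1, 1) prior, and estimates by sampling how likely one headline is to beat another. The click and impression counts are invented, and the function names are hypothetical.

```python
import random


def posterior_params(clicks, impressions, prior_alpha=1.0, prior_beta=1.0):
    """Beta posterior parameters for a click-through rate under a Beta prior."""
    return prior_alpha + clicks, prior_beta + (impressions - clicks)


def posterior_mean(clicks, impressions):
    """Posterior mean click-through rate under the default uniform prior."""
    alpha, beta = posterior_params(clicks, impressions)
    return alpha / (alpha + beta)


def prob_b_beats_a(a, b, draws=20000, seed=0):
    """Monte Carlo estimate of P(CTR_b > CTR_a) from the two posteriors.

    `a` and `b` are (clicks, impressions) tuples.
    """
    rng = random.Random(seed)  # fixed seed keeps the estimate reproducible
    alpha_a, beta_a = posterior_params(*a)
    alpha_b, beta_b = posterior_params(*b)
    wins = sum(
        rng.betavariate(alpha_b, beta_b) > rng.betavariate(alpha_a, beta_a)
        for _ in range(draws)
    )
    return wins / draws


if __name__ == "__main__":
    headline_a = (120, 2000)  # made-up counts: 120 clicks in 2000 impressions
    headline_b = (165, 2000)  # made-up counts: 165 clicks in 2000 impressions
    print(f"posterior mean A: {posterior_mean(*headline_a):.4f}")
    print(f"posterior mean B: {posterior_mean(*headline_b):.4f}")
    print(f"P(B beats A): {prob_b_beats_a(headline_a, headline_b):.3f}")
```

A history of such test outcomes is the kind of labeled data a headline-performance model could be trained on; the predictive model itself and the LLM candidate generation are outside the scope of this sketch.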

Welcome to Data Unchained, where we explore the decentralization of data and the cutting-edge technologies shaping the future of AI and HPC. Recorded live from Supercomputing 24 in Atlanta, Georgia, this episode features an in-depth conversation with Gary Grider, a leading technologist at Los Alamos National Laboratory, and host Molly Presley.

Episode Highlights:
- The evolution of storage systems: From early file systems to groundbreaking innovations like Lustre, HPSS, and NFS.
- Overcoming storage challenges in massive-scale HPC and AI environments.
- Insights into Los Alamos’ role in virtual nuclear testing and managing petabyte-scale simulations.
- How Hammerspace Tier 0 technology is transforming local storage in compute nodes.
- The convergence of AI and HPC: A look into standardizing infrastructure to support modern workloads.

Gary shares his decades-long journey in storage innovation, the importance of standardized protocols like NFS, and the revolutionary impact of integrating compute and storage technologies to streamline workflows for industries beyond HPC.

#DataUnchained #Supercomputing24 #HPC #AIWorkflows #DataStorage #DecentralizedData #NFS #LosAlamos #GaryGrider #BigData #ParallelComputing #Tier0Storage #AIInfrastructure #TechPodcast #Innovation #CloudComputing #MachineLearning #HybridCloud #MultiCloud #Supercomputing #TechInnovation #ArtificialIntelligence #HighPerformanceComputing #DataScience #ComputePower


Big Data, Data Mining and Data Science

Through the application of cutting-edge techniques from Big Data, Data Mining, and Data Science, it is possible to extract insights from massive datasets. These methodologies are crucial in enabling informed decision-making and driving transformative advancements across many fields, industries, and domains. This book offers an overview of the latest tools, methods, and approaches while also highlighting their practical use through various applications and case studies.

Welcome to Data Unchained, the podcast where we delve into the evolving world of decentralized data and workflows. Hosted by Molly Presley, this episode features a thought-provoking discussion with Matthew Shaxted, Co-Founder and CEO of Parallel Works, about the challenges and opportunities in hybrid and multi-cloud environments.

Key Highlights:
- The journey of Parallel Works: From HPC simulations to democratizing large-scale computing resources.
- The convergence of HPC and AI infrastructure: how organizations are adapting to GPU-heavy workflows.
- Overcoming decentralized data challenges: Solutions for application portability and cost-efficient workload management.
- The evolution of AI-driven task placement for seamless resource optimization.
- Real-world insights into managing hybrid and multi-cloud workloads with cost controls and global namespaces.

Matthew also introduces ACTIVATE, Parallel Works' next-gen hybrid multi-cloud platform, and shares exciting announcements for the future, including advancements in Kubernetes integration and benchmarking AI task placement.

Learn more about Parallel Works: https://parallel.works

#dataunchained #DecentralizedData #HybridCloud #MultiCloud #HPC #AIWorkflows #ParallelWorks #DataManagement #CloudComputing #ArtificialIntelligence #DataInnovation #TechPodcast #BigData #MachineLearning #futureofai


Artificial Intelligence-Enabled Businesses

This book has a multidimensional perspective on AI solutions for business innovation and real-life case studies to achieve competitive advantage and drive growth in the evolving digital landscape.

Artificial Intelligence-Enabled Businesses demonstrates how AI is a catalyst for change in business functional areas. Though still in the experimental phase, AI is instrumental in redefining the workforce, predicting consumer behavior, solving real-life marketing dynamics and modifications, recommending products and content, foreseeing demand, analyzing costs, strategizing, managing big data, enabling collaboration of cross-entities, and sparking new ethical, social and regulatory implications for business. Thus, AI can effectively guide the future of financial services, trading, mobile banking, last-mile delivery, logistics, and supply chain with a solution-oriented focus on discrete business problems. Furthermore, it is expected to educate leaders to act in an ever more accurate, complex, and sophisticated business environment with the combination of human and machine intelligence.

The book offers effective, efficient, and strategically competent suggestions for handling new challenges and responsibilities and is aimed at leaders who wish to be more innovative. It covers the early stages of AI adoption by organizations across their functional areas and provides insightful guidance for practitioners in the suitable and timely adoption of AI. This book will greatly help to scale up AI by leveraging interdisciplinary collaboration with cross-functional, skill-diverse teams and result in a competitive advantage.

Audience
This book is for marketing professionals, organizational leaders, and researchers to leverage AI and new technologies across various business functions. It also fits the needs of academics, students, and trainers, providing insights, case studies, and practical strategies for driving growth in the rapidly evolving digital landscape.