talk-data.com

Topic

Cyber Security

cybersecurity information_security data_security privacy

2078 tagged

Activity Trend: peak of 297 activities per quarter, 2020-Q1 to 2026-Q1

Activities

2078 activities · Newest first

Empowering Healthcare Insights: A Unified Lakehouse Approach With Databricks

NHS England is revolutionizing healthcare research by enabling secure, seamless access to de-identified patient data through the Federated Data Platform (FDP). Despite vast data resources spread across regional and national systems, analysts struggle with fragmented, inconsistent datasets. Enter Databricks: powering a unified, virtual data lake with Unity Catalog at its core — integrating diverse NHS systems while ensuring compliance and security. By bridging AWS and Azure environments with a private exchange and leveraging the Iceberg connector to interface with Palantir, analysts gain scalable, reliable and governed access to vital healthcare data. This talk explores how this innovative architecture is driving actionable insights, accelerating research and ultimately improving patient outcomes.

GPU Accelerated Spark Connect

Spark Connect, first included for SQL/DataFrame API in Apache Spark 3.4 and recently extended to MLlib in 4.0, introduced a new way to run Spark applications over a gRPC protocol. This has many benefits, including easier adoption for non-JVM clients, version independence from applications and increased stability and security of the associated Spark clusters. The recent Spark Connect extension for ML also included a plugin interface to configure enhanced server-side implementations of the MLlib algorithms when launching the server. In this talk, we shall demonstrate how this new interface, together with Spark SQL’s existing plugin interface, can be used with NVIDIA GPU-accelerated plugins for ML and SQL to enable no-code change, end-to-end GPU acceleration of Spark ETL and ML applications over Spark Connect, with optimal performance up to 9x at 80% cost reduction compared to CPU baselines.

In today's rapidly evolving digital landscape, organizations must prioritize robust data architectures and AI strategies to remain competitive. In this session, we will explore how Procter & Gamble (P&G) has embarked on a transformative journey to digitize its operations via scalable data, analytics and AI platforms, establishing a strong foundation for data-driven decision-making and the emergence of agentic AI. Join us as we delve into the comprehensive architecture and platform initiatives undertaken at P&G to create scalable and agile data platforms unleashing BI/AI value. We will discuss our approach to implementing data governance and semantics, ensuring data integrity and accessibility across the organization. By leveraging advanced analytics and Business Intelligence (BI) tools, we will illustrate how P&G harnesses data to generate actionable insights at scale, all while maintaining security and speed.

Redesigning Kaizen's Cloud Data Lake for the Future

At Kaizen Gaming, data drives our decision-making, but rapid growth exposed inefficiencies in our legacy cloud setup — escalating costs, delayed insights and scalability limits. Operating in 18 countries with 350M daily transactions (1PB+), shared quotas and limited cost transparency hindered efficiency. To address this, we redesigned our cloud architecture with Data Landing Zones, a modular framework that decouples resources, enabling independent scaling and cost accountability. Automation streamlined infrastructure, reduced overhead and enhanced FinOps visibility, while Unity Catalog ensured governance and security. Migration challenges included maintaining stability, managing costs and minimizing latency. A phased approach, Delta Sharing, and DBx Asset Bundles simplified transitions. The result: faster insights, improved cost control and reduced onboarding time, fostering innovation and efficiency. We share our transformation, offering insights for modern cloud optimization.

Responsible AI at Scale: Balancing Democratization and Regulation in the Financial Sector

We partnered with Databricks to pioneer a new standard in the financial sector's enterprise AI, balancing rapid AI democratization with strict regulatory and security requirements. At the core is our Responsible AI Gateway, enforcing jailbreak prevention and compliance on every LLM query. Real-time observability, powered by Databricks, calculates risk and accuracy metrics, detecting issues before escalation. Leveraging Databricks' model hosting ensures scalable LLM access, fortifying security and efficiency. We built frameworks to democratize AI without compromising guardrails. Operating in a regulated environment, we showcase how Databricks enables democratization and responsible AI at scale, offering best practices for financial organizations to harness AI safely and efficiently.

Using Identity Security With Unity Catalog for Faster, Safer Data Access

Managing authentication effectively is key to securing your data platform. In this session, we’ll explore best practices from Databricks for overcoming authentication challenges, including token visibility, MFA/SSO, CI/CD token federation and risk containment. Discover how to map your authentication maturity journey while maximizing security ROI. We'll showcase new capabilities like access token reports for improved visibility, streamlined MFA implementation and secure SSO with token federation. Learn strategies to minimize token risk through TTL limits, scoped tokens and network policies. You'll walk away with actionable insights to enhance your authentication practices and strengthen platform security on Databricks.
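The token-risk ideas in this abstract (TTL limits and scoped tokens) can be illustrated with a small audit routine. This is a minimal sketch under stated assumptions, not a Databricks API: the `Token` record, the 90-day `MAX_TTL` limit, and the `"*"` wildcard scope are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

MAX_TTL = timedelta(days=90)  # hypothetical policy limit, not a platform default


@dataclass
class Token:
    """Hypothetical token record; real platforms expose different fields."""
    name: str
    created: datetime
    expires: Optional[datetime]  # None means the token never expires
    scopes: tuple


def audit(token: Token, max_ttl: timedelta = MAX_TTL) -> list:
    """Return policy findings for one token; an empty list means compliant."""
    findings = []
    if token.expires is None:
        findings.append("no expiry")
    elif token.expires - token.created > max_ttl:
        findings.append("ttl exceeds limit")
    # An empty scope list or a wildcard scope is treated as unscoped.
    if not token.scopes or "*" in token.scopes:
        findings.append("unscoped")
    return findings
```

A report like the one the abstract mentions could then be built by running `audit` over every token in an inventory and listing only those with non-empty findings.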

This course introduces learners to evaluating and governing GenAI (generative artificial intelligence) systems. First, learners will explore the meaning behind and motivation for building evaluation and governance/security systems. Next, the course will connect evaluation and governance systems to the Databricks Data Intelligence Platform. Third, learners will be introduced to a variety of evaluation techniques for specific components and types of applications. Finally, the course will conclude with an analysis of evaluating entire AI systems with respect to performance and cost.

Pre-requisites: Familiarity with prompt engineering and experience with the Databricks Data Intelligence Platform, plus knowledge of retrieval-augmented generation (RAG) techniques, including data preparation, embeddings, vectors, and vector databases.

Labs: Yes

Certification Path: Databricks Certified Generative AI Engineer Associate

State Street Uses Databricks as a Cybersecurity Lakehouse for Threat Intelligence & Real-Time Alerts

Organizations face the challenge of managing vast amounts of data to combat emerging threats. The Databricks Data Intelligence platform represents a paradigm shift in cybersecurity at State Street, providing a comprehensive solution for managing and analyzing diverse security data. Through its partnership with Databricks, State Street has created a capability to: efficiently manage structured and unstructured data; scale up to analyze 50 petabytes of data in real time; ingest and parse data for critical security data streams; and build advanced cybersecurity data products and use automation and orchestration to streamline cybersecurity operations. By leveraging these capabilities, State Street has positioned itself as a leader in the financial services industry when it comes to cybersecurity.

Transforming Government With Data and AI: Singapore GovTech's Journey With Databricks

GovTech is an agency in the Singapore Government focused on tech for good. The GovTech Chief Data Office (CDO) has built the GovTech Data Platform with Databricks at the core. As the government tech agency, we safeguard national-level government and citizen data. A comprehensive data strategy is essential to uplifting data maturity. GovTech has adopted the service model approach where data services are offered to stakeholders based on their data maturity. Their maturity is uplifted through partnership, readying them for more advanced data analytics. CDO offers a plethora of data assets in a “data restaurant” ranging from raw data to data products, all delivered via Databricks and enabled through fine-grained access control, underpinned by data management best practices such as data quality, security and governance. Within our first year on Databricks, CDO was able to save 8,000 man-hours, democratize data across 50% of the agency and achieve six-figure savings through BI consolidation.

Information Security and Privacy Quick Reference

A fast, accurate, and up-to-date desk reference for information security and privacy practitioners everywhere.

Information security and privacy roles demand up-to-date knowledge coming from a seemingly countless number of sources, including several certifications (like the CISM, CIPP, and CISSP), legislation and regulations issued by state and national governments, guidance from local and industry organizations, and even international bodies, like the European Union. The Information Security and Privacy Quick Reference: The Essential Handbook for Every CISO, CSO, and Chief Privacy Officer is an updated, convenient, and accurate desk reference for information privacy practitioners who need fast and easy access to the latest guidance, laws, and standards that apply in their field. This book is the most effective resource for information security professionals who need immediate and correct solutions to common and rarely encountered problems.

An expert team of writers (Joe Shelley, James Michael Stewart, and the bestselling technical author Mike Chapple) draws on decades of combined technology and education experience to deliver organized and accessible coverage of: Security and Privacy Foundations; Governance, Risk Management, and Compliance; Security Architecture and Design; Identity and Access Management; Data Protection and Privacy Engineering; Security and Privacy Incident Management; Network Security and Privacy Protections; Security Assessment and Testing; Endpoint and Device Security; Application Security; Cryptography Essentials; Physical and Environmental Security; Legal and Ethical Considerations; Threat Intelligence and Cyber Defense; Business Continuity and Disaster Recovery.

Information Security and Privacy Quick Reference is a must-have resource for CISOs, CSOs, Chief Privacy Officers, and other information security and privacy professionals seeking a reliable, accurate, and fast way to answer the questions they encounter at work every single day.

Agentic Cyber Defense with External Threat Intelligence

This talk will detail how to integrate external threat intelligence data into an autonomous agentic AI system for proactive cybersecurity. Using real world datasets—including open-source threat feeds, security logs, or OSINT—you will learn how to build a data ingestion pipeline, train models with Python, and deploy agents that autonomously detect and mitigate cyber threats. This case study will provide practical insights into data preprocessing, feature engineering, and the challenges of adversarial conditions.
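As a rough illustration of the ingestion step described above, the sketch below parses a small indicator feed and flags log lines whose source address matches. It is not the speaker's pipeline: the CSV layout, the `src=` log format, and every function name are assumptions, and the addresses come from the reserved documentation ranges.

```python
import csv
import io
import ipaddress

# Hypothetical feed and logs, using documentation-only address ranges.
FEED_CSV = """indicator,type,source
203.0.113.7,ip,example-feed
198.51.100.0/24,cidr,example-feed
"""

LOG_LINES = [
    "2025-01-01T00:00:01Z src=203.0.113.7 dst=10.0.0.5 action=allow",
    "2025-01-01T00:00:02Z src=192.0.2.10 dst=10.0.0.5 action=allow",
    "2025-01-01T00:00:03Z src=198.51.100.42 dst=10.0.0.9 action=deny",
]


def load_indicators(feed_text):
    """Parse feed rows into ip_network objects (a single IP becomes a /32)."""
    rows = csv.DictReader(io.StringIO(feed_text))
    return [ipaddress.ip_network(row["indicator"]) for row in rows]


def extract_src(line):
    """Pull the src= field out of a key=value log line, or None if absent."""
    for field in line.split():
        if field.startswith("src="):
            return ipaddress.ip_address(field[len("src="):])
    return None


def match_logs(lines, networks):
    """Return log lines whose source address falls inside any indicator."""
    hits = []
    for line in lines:
        src = extract_src(line)
        if src is not None and any(src in net for net in networks):
            hits.append(line)
    return hits


hits = match_logs(LOG_LINES, load_indicators(FEED_CSV))
```

In a fuller system the matched lines would feed feature engineering or an agent's decision step; here they are simply collected in `hits`.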

Not Another LLM Talk… Practical Lessons from Building a Real-World Adverse Media Pipeline

LLMs are magical—until they aren’t. Extracting adverse media entities might sound straightforward, but throw in hallucinations, inconsistent outputs, and skyrocketing API costs, and suddenly, that sleek prototype turns into a production nightmare.

Our adverse media pipeline monitors over 1 million articles a day, sifting through vast amounts of news to identify reports of crimes linked to financial bad actors, money laundering, and other risks. Thanks to GenAI and LLMs, we can tackle this problem in new ways—but deploying these models at scale comes with its own set of challenges: ensuring accuracy, controlling costs, and staying compliant in highly regulated industries.

In this talk, we’ll take you inside our journey to production, exploring the real-world challenges we faced through the lens of key personas: Cautious Claire, the compliance officer who doesn’t trust black-box AI; Magic Mike, the sales lead who thinks LLMs can do anything; Just-Fine-Tune Jenny, the PM convinced fine-tuning will solve everything; Reinventing Ryan, the engineer reinventing the wheel; and Paranoid Pete, the security lead fearing data leaks.

Expect practical insights, cautionary tales, and real-world lessons on making LLMs reliable, scalable, and production-ready. If you've ever wondered why your pipeline works perfectly in a Jupyter notebook but falls apart in production, this talk is for you.

Legacy systems slow organisations due to scale limits, risk, and cost. In this session, we will walk through how enterprises use Archon Data Store, our AI-powered archival platform, to create enterprise-wide data archival strategies for their legacy and modern data. Through use cases from finance, pharma and manufacturing, we will learn intelligent archival techniques that enable organisations to discover their data and align their past, present and future data footprint through decommissioning and migration, while enhancing system performance, security and stewardship for their enterprise data.

Data architects are increasingly tasked with provisioning quality unstructured data to support AI models. However, little has been done to manage unstructured data beyond data security and privacy requirements. This session will look at what it takes to improve the quality of unstructured data and the emerging best practices in this space.

This session will share proven enterprise architecture best practices for augmenting Snowflake with data virtualization to deliver real-time insights. We'll explore how to address latency-sensitive use cases—such as month-end financial reconciliations—while ensuring data security and supporting cloud migration using Denodo. Attendees will learn how the combination of Snowflake and Denodo enables scalable, low-latency analytics across highly customized and distributed data environments.

As enterprises embrace GenAI and intelligent agents, securing sensitive data—like PII, financial records, and IP—while maintaining compliance is crucial. This session explores how Skyflow helps meet modern privacy demands, including India’s DPDP Act, using polymorphic encryption, tokenization, consent management, and fine-grained access controls. See real-world architectures that show how to embed privacy into both legacy and AI-first systems, enabling innovation without compromising security or regulatory compliance.
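Tokenization, one of the techniques named above, can be sketched as a vault that swaps sensitive values for random tokens and reveals them only to permitted roles. This is a generic illustration, not Skyflow's API or its polymorphic encryption: the `TokenVault` class, the `"compliance"` role, and the sample record are all made up for the example.

```python
import secrets


class TokenVault:
    """In-memory stand-in for a privacy vault (illustrative only)."""

    def __init__(self):
        self._token_to_value = {}
        self._value_to_token = {}

    def tokenize(self, value: str) -> str:
        """Return a random token for value, reusing it on repeat calls."""
        if value in self._value_to_token:
            return self._value_to_token[value]
        token = "tok_" + secrets.token_hex(8)
        self._token_to_value[token] = value
        self._value_to_token[value] = token
        return token

    def detokenize(self, token: str, role: str) -> str:
        """Reveal the original value only to roles allowed to see it."""
        if role != "compliance":  # hypothetical role name
            raise PermissionError("role may not detokenize")
        return self._token_to_value[token]


vault = TokenVault()
record = {"name": "Asha Rao", "pan": "ABCDE1234F", "city": "Pune"}  # made-up data
SENSITIVE = {"name", "pan"}  # fields to replace before data leaves the vault

# Downstream systems (including LLM prompts) would only ever see safe_record.
safe_record = {
    key: vault.tokenize(value) if key in SENSITIVE else value
    for key, value in record.items()
}
```

Because the token is random rather than derived from the value, a leaked `safe_record` reveals nothing about the original fields without access to the vault.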

D&A leaders must develop DataOps as an essential practice to redefine their data management operations. This involves establishing business value before pursuing significant data engineering initiatives, and preventing duplicated efforts undertaken by different teams in managing the common metadata, security and observability of information assets within the data platforms.

How Maverick Data Built a One-Stop Business Management App in Sigma | The Data Apps Conference

As a data consulting firm helping clients solve their data challenges, Maverick Data faced its own operational inefficiencies with fragmented systems for time tracking, client management, project staffing, and financial reporting. Instead of continuing to juggle multiple disconnected applications, they decided to practice what they preach and build a unified solution.

In this session, Spencer Baucke (Co-founder) will demonstrate how Maverick Data built a comprehensive business operations app in Sigma to:

Centralize employee, client, and project management with appropriate role-based security controls
Streamline time entry and automated invoice generation to eliminate manual processes
Integrate financial data to create real-time projections and business insights
Automate reporting with scheduled emails to ensure timely updates for team members

By consolidating operations into a single data app, Maverick Data reduced software spend, gained unprecedented visibility into their business performance, and dramatically improved decision-making processes between leadership. The solution has inspired new client solutions based on their internal success.

➡️ Learn more about Data Apps: https://www.sigmacomputing.com/product/data-applications?utm_source=youtube&utm_medium=organic&utm_campaign=data_apps_conference&utm_content=pp_data_apps


➡️ Sign up for your free trial: https://www.sigmacomputing.com/go/free-trial?utm_source=youtube&utm_medium=video&utm_campaign=free_trial&utm_content=free_trial

#sigma #sigmacomputing #dataanalytics #dataanalysis #businessintelligence #cloudcomputing #clouddata #datacloud #datastructures #datadriven #datadrivendecisionmaking #datadriveninsights #businessdecisions #datadrivendecisions #embeddedanalytics #cloudcomputing #SigmaAI #AI #AIdataanalytics #AIdataanalysis #GPT #dataprivacy #python #dataintelligence #moderndataarchitecture

Beyond the Bill: Gaining Granular Databricks Cost Insights with Data Apps | The Data Apps Conference

Managing cloud costs requires accurate resource tagging, but maintaining completeness and accuracy is a challenge. In this session, Mitchell Ertle (Senior Partner Solutions Architect) and Josue Bogran (Data & AI Architect) demonstrate how Sigma and Databricks combine to streamline FinOps and resource management with AI-driven cost attribution and workflow automation.

In a practical demonstration, you'll see how to:

Identify and classify untagged Databricks pipelines with a cost attribution app
Use GenAI from Databricks to suggest tags with human-in-the-loop approval
Enable bidirectional data flow between Sigma and Databricks for real-time updates
Automate workflows with Sigma’s actions framework
Ensure security and governance by inheriting Unity Catalog permissions

Discover why this combination is powerful: Sigma provides intuitive application building while Databricks delivers computation, AI/ML capabilities, and data storage. These platforms create solutions business users can interact with directly, without technical expertise.

Whether in data engineering, finance, or operations, learn how Sigma + Databricks can automate workflows, optimize costs, and drive business impact.
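The first step of the demonstration, finding untagged pipelines and the spend attached to them, might look roughly like the sketch below. It is an assumption-laden illustration rather than the presenters' app: the usage records, the `REQUIRED_TAGS` policy, and the function names are hypothetical, and a real version would read from billing tables instead of an in-memory list.

```python
from collections import defaultdict

REQUIRED_TAGS = ("team", "project")  # hypothetical tagging policy

# Made-up usage records standing in for rows from a billing table.
usage = [
    {"pipeline": "ingest_orders", "cost_usd": 120.0,
     "tags": {"team": "sales", "project": "orders"}},
    {"pipeline": "ml_features", "cost_usd": 300.0, "tags": {"team": "ds"}},
    {"pipeline": "adhoc_export", "cost_usd": 45.5, "tags": {}},
]


def missing_tags(record, required=REQUIRED_TAGS):
    """List the required tag names that are absent or empty on a record."""
    return [tag for tag in required if not record["tags"].get(tag)]


def untagged_report(records):
    """Map each pipeline with missing tags to (missing tag names, total cost)."""
    cost = defaultdict(float)
    missing = {}
    for record in records:
        gaps = missing_tags(record)
        if gaps:
            cost[record["pipeline"]] += record["cost_usd"]
            missing[record["pipeline"]] = gaps
    return {name: (missing[name], cost[name]) for name in missing}


report = untagged_report(usage)
```

Each entry in `report` is a candidate for a suggested tag, which a reviewer would then approve or reject before it is written back.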
