talk-data.com

Topic: Big Data

Tags: data_processing, analytics, large_datasets

Activity Trend: peak of 28 activities per quarter (2020-Q1 to 2026-Q1)

Activities

1217 activities · Newest first

Want to be featured as a guest on Making Data Simple? Reach out to us at [email protected] and tell us why you should be next.

Abstract Hosted by Al Martin, VP, IBM Expert Services Delivery, Making Data Simple provides the latest thinking on big data, A.I., and the implications for the enterprise from a range of experts. This week on Making Data Simple, we have Jean-Claude Mamou. Jean-Claude is the Chief Architect of the Information Integration and Governance portfolio, which includes products such as Watson Knowledge Catalog and DataStage.

Show Notes 1:45 – Jean-Claude’s experience 5:15 – What are the industry challenges? 6:52 – Is there integration without governance? 9:49 – What is the new solution? 13:12 – Understanding your critical data 16:06 – Explain what IBM Satellite means 19:53 – Where does Cloud Pak for Data come into play? 24:57 – What technology can we use to avoid repetitive mistakes? 30:36 – Understanding critical data 33:52 – What is the number 1 data quality issue? 37:08 – How are you inspired and how do you figure your next innovation? 38:52 – Do you have a process you follow? Jean-Claude Mamou – LinkedIn

Connect with the Team Producer Kate Brown - LinkedIn. Producer Steve Templeton - LinkedIn. Host Al Martin - LinkedIn and Twitter. The Making Data Simple Podcast is hosted by Al Martin, WW VP Technical Sales, IBM, where we explore trending technologies, business innovation, and leadership ... while keeping it simple & fun.

Summary Organizations of all sizes are striving to become data driven, starting in earnest with the rise of big data a decade ago. With the never-ending growth in data sources and methods for aggregating and analyzing them, the use of data to direct the business has become a requirement. Randy Bean has been helping enterprise organizations define and execute their data strategies since before the age of big data. In this episode he discusses his experiences and how he approached the work of distilling them for his book "Fail Fast, Learn Faster". This is an entertaining and enlightening exploration of the business side of data with an industry veteran.

Announcements

Hello and welcome to the Data Engineering Podcast, the show about modern data management.

When you’re ready to build your next pipeline, or want to test out the projects you hear about on the show, you’ll need somewhere to deploy it, so check out our friends at Linode. With their managed Kubernetes platform it’s now even easier to deploy and scale your workflows, or try out the latest Helm charts from tools like Pulsar and Pachyderm. With simple pricing, fast networking, object storage, and worldwide data centers, you’ve got everything you need to run a bulletproof data platform. Go to dataengineeringpodcast.com/linode today and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!

Struggling with broken pipelines? Stale dashboards? Missing data? If this resonates with you, you’re not alone. Data engineers struggling with unreliable data need look no further than Monte Carlo, the world’s first end-to-end, fully automated Data Observability Platform! In the same way that application performance monitoring ensures reliable software and keeps application downtime at bay, Monte Carlo solves the costly problem of broken data pipelines. Monte Carlo monitors and alerts for data issues across your data warehouses, data lakes, ETL, and business intelligence, reducing time to detection and resolution from weeks or days to just minutes. Start trusting your data with Monte Carlo today! Visit dataengineeringpodcast.com/impact today to save your spot at IMPACT: The Data Observability Summit, a half-day virtual event featuring the first U.S. Chief Data Scientist, the founder of the Data Mesh, the creator of Apache Airflow, and more data pioneers spearheading some of the biggest movements in data. The first 50 to RSVP with this link will be entered to win an Oculus Quest 2 (Advanced All-In-One Virtual Reality Headset). RSVP today – you don’t want to miss it!

Atlan is a collaborative workspace for data-driven teams, like GitHub for engineering or Figma for design teams. By acting as a virtual hub for data assets ranging from tables and dashboards to SQL snippets & code, Atlan enables teams to create a single source of truth for all their data assets, and collaborate across the modern data stack through deep integrations with tools like Snowflake, Slack, Looker and more. Go to dataengineeringpodcast.com/atlan today and sign up for a free trial. If you’re a Data Engineering Podcast listener, you get credits worth $3000 on an annual subscription.

Your host is Tobias Macey and today I’m interviewing Randy Bean about his recent book focusing on the use of big data and AI for informing data driven business leadership.

Interview

Introduction How did you get involved in the area of data management? Can you start by discussing the focus of the book and what motivated you to write it?

Who is the intended audience, and how did that inform the tone and content?

Businesses and their officers have been aiming to be "data driven" for years. In your experience, what are the concrete goals that are implied by that term?

What are the barriers that organizations encounter in the pursuit of those goals? How have the success rates (real and imagined) shifted in recent years as the level of sophisticatio

Abstract Hosted by Al Martin, VP, IBM Expert Services Delivery, Making Data Simple provides the latest thinking on big data, A.I., and the implications for the enterprise from a range of experts. This week on Making Data Simple, we have Russ Krajec. Russ is a “recovering patent attorney” and believes IP can be used as a financial instrument. He’s the author of Investing in Patents and one of IAM’s Top 300 Patent Strategists. Russ is CEO of BlueIron, where he finances the cost of patent portfolios, insures IP portfolios for enforcement and defense, and provides loans using IP as collateral.

Show Notes 1:28 – Russ’s intro 4:51 – How is your business monetized? 6:17 – Shark Tank chat 11:30 – What does BlueIron provide? 14:31 – What data do you use to say ‘this is the next big thing’? 23:24 – How many clients do you work with? 25:50 – Common myths 33:09 – Number one mistake? 39:41 – How do you choose a patent attorney? BlueIron [email protected] Russ Krajec - LinkedIn

Connect with the Team Producer Kate Brown - LinkedIn. Producer Steve Templeton - LinkedIn. Host Al Martin - LinkedIn and Twitter.

Storage Systems

Storage Systems: Organization, Performance, Coding, Reliability and Their Data Processing was motivated by the 1988 Redundant Array of Inexpensive/Independent Disks (RAID) proposal to replace large form factor mainframe disks with an array of commodity disks. Disk loads are balanced by striping data into strips, with one strip per disk, and storage reliability is enhanced via replication or erasure coding, which at best dedicates k strips per stripe to tolerate k disk failures. Flash memories have resulted in a paradigm shift with Solid State Drives (SSDs) replacing Hard Disk Drives (HDDs) for high performance applications. RAID and Flash have resulted in the emergence of new storage companies, namely EMC, NetApp, SanDisk, and Pure Storage, and a multibillion-dollar storage market. Key new conferences and publications are reviewed in this book.

The goal of the book is to expose students, researchers, and IT professionals to the more important developments in storage systems, while covering the evolution of storage technologies, traditional and novel databases, and novel sources of data. We describe several prototypes: FAWN at CMU, RAMCloud at Stanford, and Lightstore at MIT; Oracle's Exadata, AWS' Aurora, Alibaba's PolarDB, Fungible Data Center; and author's paper designs for cloud storage, namely heterogeneous disk arrays and hierarchical RAID.

Surveys storage technologies and lists sources of data: measurements, text, audio, images, and video
Familiarizes readers with paradigms to improve performance: caching, prefetching, log-structured file systems, and merge-trees (LSMs)
Describes RAID organizations and analyzes their performance and reliability
Conserves storage via data compression, deduplication, compaction, and secures data via encryption
Specifies implications of storage technologies on performance and power consumption
Exemplifies database parallelism for big data, analytics, deep learning via multicore CPUs, GPUs, FPGAs, and ASICs, e.g., Google's Tensor Processing Units
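The striping and erasure-coding idea in the Storage Systems description above can be illustrated with a small sketch. It is not taken from the book: it assumes the simplest case, k = 1, where a single XOR parity strip per stripe lets the array survive one disk failure. The function names (`make_stripe`, `rebuild`) and the zero-byte padding are hypothetical choices made for illustration.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length strips."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_stripe(data: bytes, n_data_disks: int) -> List[bytes]:
    """Split data into one strip per data disk and append one XOR parity strip.

    Assumption: data is zero-padded so that every strip has the same length.
    """
    strip_len = -(-len(data) // n_data_disks)  # ceiling division
    padded = data.ljust(strip_len * n_data_disks, b"\0")
    strips = [
        padded[i * strip_len:(i + 1) * strip_len] for i in range(n_data_disks)
    ]
    parity = reduce(xor_bytes, strips)
    return strips + [parity]


def rebuild(stripe: List[Optional[bytes]]) -> List[bytes]:
    """Reconstruct at most one missing strip (marked None) from the survivors."""
    missing = [i for i, strip in enumerate(stripe) if strip is None]
    if len(missing) > 1:
        raise ValueError("single parity tolerates only one failed disk")
    repaired = list(stripe)
    if missing:
        survivors = [strip for strip in stripe if strip is not None]
        # XOR of all surviving strips (data and parity) equals the lost strip.
        repaired[missing[0]] = reduce(xor_bytes, survivors)
    return repaired


if __name__ == "__main__":
    stripe = make_stripe(b"big data storage", n_data_disks=4)
    damaged = list(stripe)
    damaged[2] = None  # simulate the failure of one disk
    print(rebuild(damaged) == stripe)
```

Tolerating k > 1 failures with k redundant strips needs a stronger code than plain XOR (for example a Reed-Solomon style code), which this sketch does not attempt.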

Abstract Hosted by Al Martin, VP, IBM Expert Services Delivery, Making Data Simple provides the latest thinking on big data, A.I., and the implications for the enterprise from a range of experts. This week on Making Data Simple, we have Davit Buniatyan. Davit is the founding CEO of Activeloop. He started his PhD at Princeton University, where his research involved reconstructing the connectome of the mouse brain. In this research he dealt with large-scale unstructured data which was extremely expensive (amounting to millions of dollars) to manage. Later on, he realized that this problem is a real pain point, not only in the lab setting but also for many companies across industries. This made him think of a radically more efficient, machine-learning-native way to work with data. The idea of changing how an ML team can create and manage datasets got him into Y Combinator, where he started Activeloop, a startup that has attracted the investment of prominent Silicon Valley VC firms and angel investors, and the attention of the open-source community, with the framework trending number 1 in Python on GitHub worldwide earlier this year.

Show Notes 2:05 – Davit’s experience 6:44 – What is your success criteria in the mouse connectome? 8:44 – What did you learn from this? 10:00 – Could this solve ALS? 13:19 – What is the problem you’re solving? 17:17 – How do you prepare the data? 24:00 – Why are the naysayers wrong? 25:21 – What is the name of the technology? 31:19 – What problem have you not solved? 37:44 – What keeps you up at night? 38:42 – How are you finding talent? 41:06 – What do you do for fun? Activeloop Activeloop - Twitter Davit Buniatyan - LinkedIn

Connect with the Team Producer Kate Brown - LinkedIn. Producer Steve Templeton - LinkedIn. Host Al Martin - LinkedIn and Twitter.

Abstract Hosted by Al Martin, VP, IBM Expert Services Delivery, Making Data Simple provides the latest thinking on big data, A.I., and the implications for the enterprise from a range of experts. This week on Making Data Simple, we have Mark Gabrielson. Mark leads RegTech in Expert Labs, a division of Services that is associated with Development. Mark joined IBM via the Informix acquisition. For the last 10 years Mark has been leading IBM's Commercial Payments Practice. Mark is currently working on projects involving Safer Payments and OpenPages.

Show Notes 4:50 – What’s your favorite job? 7:58 – Describe RegTech 13:19 – Breaking down silos 20:56 – How do you do it with AI? 23:05 – What is OpenPages' sweet spot? 26:50 – If I am a client how do I get started? 30:28 – Who are the competitors? Connect with the Team Producer Kate Brown - LinkedIn. Producer Steve Templeton - LinkedIn. Host Al Martin - LinkedIn and Twitter.

Hosted by Al Martin, VP, Data and AI Expert Services and Learning at IBM, Making Data Simple provides the latest thinking on big data, A.I., and the implications for the enterprise from a range of experts. This week on Making Data Simple, we have Lynne Snead. Lynne is the founder of Talent Evolution Systems and a behavioral analyst, consultant, training specialist, speaker, and coach. Lynne has a background in Educational Psychology and has specialized in organizational performance for over 20 years. She is one of the original Franklin Covey co-authors, has a best seller, created Franklin Covey’s signature Project Development process and programs, and worked directly with Stephen Covey.

Show Notes 1:30 – Lynne talks about her background 5:40 – Lynne’s coaching specialty and mission statement 10:30 – Why don’t all leaders have coaches? 12:08 – Why do you differentiate corporate coaching from life coaching? 16:27 – Do you believe in the element of natural state? 18:32 – How many individuals have you coached? 19:49 – What constitutes a great leader? 20:55 – What are the common mistakes in a leader? 24:44 – Steer them back on track Lynne Snead - LinkedIn Talent Evolution Systems Lynne’s email: [email protected] Leadership and Self-Deception How Will You Measure Your Life?

Connect with the Team Producer Kate Brown - LinkedIn. Producer Steve Templeton - LinkedIn. Host Al Martin - LinkedIn and Twitter.

Azure Databricks Cookbook

Azure Databricks is an analytics platform that builds on Apache Spark and integrates with Azure services. In the Azure Databricks Cookbook, you'll find hands-on recipes to ingest data, build modern data pipelines, and perform real-time analytics while learning to optimize and secure your solutions.

What this Book will help me do
Design advanced data workflows integrating Azure Synapse, Cosmos DB, and streaming sources with Databricks.
Gain proficiency in using Delta Tables and Spark for efficient data storage and analysis.
Learn to create, deploy, and manage real-time dashboards with Databricks SQL.
Master CI/CD pipelines for automating deployments of Databricks solutions.
Understand security best practices for restricting access and monitoring Azure Databricks.

Author(s) Raj and Jaiswal are experienced professionals in the field of big data and analytics. They are well-versed in implementing Azure Databricks solutions for real-world problems. Their collaborative writing approach ensures clarity and practical focus.

Who is it for? This book is tailored for data engineers, scientists, and big data professionals who want to apply Azure Databricks and Apache Spark to their analytics workflows. A basic familiarity with Spark and Azure is recommended to make the best use of the recipes provided. If you're looking to scale and optimize your analytics pipelines, this book is for you.

Abstract Hosted by Al Martin, VP, IBM Expert Services Delivery, Making Data Simple provides the latest thinking on big data, A.I., and the implications for the enterprise from a range of experts. This week on Making Data Simple, we have Jean-Georges Perrin, Director of Engineering at weexperience. Together, they discuss and compare Apache Spark and Hadoop, and explain what it means to hold the title of IBM Champion.

Show Notes 02:07 – Connect with Jean-Georges Perrin on LinkedIn and Twitter, and check out his website. 13:14 – Check out Jean-Georges' book on Apache Spark. 24:38 – What does it mean to be an IBM Champion?

Connect with the Team Producer Kate Brown - LinkedIn. Producer Steve Templeton - LinkedIn. Host Al Martin - LinkedIn and Twitter.

Abstract Hosted by Al Martin, VP, IBM Expert Services Delivery, Making Data Simple provides the latest thinking on big data, A.I., and the implications for the enterprise from a range of experts.

This week on Making Data Simple, we have Elo Umeh, from Terragon, Africa’s fastest-growing enterprise marketing technology company. Terragon uses its on-demand marketing cloud platform, attribution software, and deep analytics capability to enable thoughtful, targeted omni-channel access to 100m+ mobile-first African consumers. Elo is the Founder and CEO at Terragon Group. Elo's career has spanned over 15 years, during which he has worked in mobile and digital media across East and West Africa. He was part of the founding team at Mtech Communications. Elo holds a global executive MBA from IESE Business School, where he graduated at the top of his class. Elo also has a Bachelor’s degree in Business Administration from Lagos State University.

Show Notes 4:02 – What keeps you going? 6:15 – Let's dive into Terragon 8:40 – Who are your customers? 11:06 – Define pre-paid 14:40 – What kind of insights and security are you providing? 20:37 – What kind of technology is Terragon using? 23:16 – What was it about the smartphone that made you want to go out on your own? 26:10 – Who’s your biggest competitor? 28:20 – What’s next for Terragon? 31:01 – What are the biggest mistakes entrepreneurs make? Terragon Elo Umeh - LinkedIn

Connect with the Team Producer Kate Brown - LinkedIn. Producer Steve Templeton - LinkedIn. Host Al Martin - LinkedIn and Twitter.

Foundations of Data Intensive Applications

PEEK “UNDER THE HOOD” OF BIG DATA ANALYTICS

The world of big data analytics grows ever more complex. And while many people can work superficially with specific frameworks, far fewer understand the fundamental principles of large-scale, distributed data processing systems and how they operate. In Foundations of Data Intensive Applications: Large Scale Data Analytics under the Hood, renowned big-data experts and computer scientists Drs. Supun Kamburugamuve and Saliya Ekanayake deliver a practical guide to applying the principles of big data to software development for optimal performance. The authors discuss foundational components of large-scale data systems and walk readers through the major software design decisions that define performance, application type, and usability. You'll learn how to recognize problems in your applications resulting in performance and distributed operation issues, diagnose them, and effectively eliminate them by relying on the bedrock big data principles explained within. Moving beyond individual frameworks and APIs for data processing, this book unlocks the theoretical ideas that operate under the hood of every big data processing system.

Ideal for data scientists, data architects, dev-ops engineers, and developers, Foundations of Data Intensive Applications: Large Scale Data Analytics under the Hood shows readers how to:
Identify the foundations of large-scale, distributed data processing systems
Make major software design decisions that optimize performance
Diagnose performance problems and distributed operation issues
Understand state-of-the-art research in big data
Explain and use the major big data frameworks and understand what underpins them
Use big data analytics in the real world to solve practical problems

Abstract Hosted by Al Martin, VP, IBM Expert Services Delivery, Making Data Simple provides the latest thinking on big data, A.I., and the implications for the enterprise from a range of experts.

This week on Making Data Simple, we have Anastasia Leng, CEO and Founder of Creative X. Anastasia previously worked at Google. In 2012 she started an e-commerce business, which then led to Creative X.

Show Notes 4:13 – How much time do you spend on funding? 7:08 – Why do it again? 13:28 – Is this the ending days of Hatch or the early days of Creative X? 18:00 – How would you label your business? 23:38 – What technology are you using? 27:21 – Who is your target customer? 34:14 – Are there other competitors doing this today? 36:34 – Customer stories 38:38 – Are you using AI? Email - [email protected] Anastasia - LinkedIn Connect with the Team Producer Kate Brown - LinkedIn. Producer Steve Templeton - LinkedIn. Host Al Martin - LinkedIn and Twitter.

Abstract Hosted by Al Martin, VP, IBM Expert Services Delivery, Making Data Simple provides the latest thinking on big data, A.I., and the implications for the enterprise from a range of experts.

This week on Making Data Simple, we have Trent Gray-Donald, Distinguished Engineer, IBM Data and AI, and Dakshi Agrawal, IBM Fellow and CTO, IBM AI. Trent Gray-Donald spent his first 16 years on managed language runtimes, then moved over to Data and AI, and then Cloud Pak for Data. Dakshi Agrawal joined IBM Research right after his PhD, then moved into software development, and has spent the past 6 years in AI.

Show Notes 0:15 – 5:23 – Repeat of introductions from Part 1 5:50 – What is AI Anywhere? 9:09 – Does it make our development more difficult? 11:22 – Does data virtualization work? 15:31 – How do we get started with AI? 17:41 – Customer success stories

Connect with the Team Producer Kate Brown - LinkedIn. Producer Steve Templeton - LinkedIn. Host Al Martin - LinkedIn and Twitter.

Abstract Hosted by Al Martin, VP, IBM Expert Services Delivery, Making Data Simple provides the latest thinking on big data, A.I., and the implications for the enterprise from a range of experts.

This week on Making Data Simple, we have Trent Gray-Donald, Distinguished Engineer, IBM Data and AI, and Dakshi Agrawal, IBM Fellow and CTO, IBM AI. Trent Gray-Donald spent his first 16 years on managed language runtimes, then moved over to Data and AI, and then Cloud Pak for Data. Dakshi Agrawal joined IBM Research right after his PhD, then moved into software development, and has spent the past 6 years in AI.

Show Notes 5:24 – Why is IBM Watson important? 10:28 – How does data fabric fit in? 15:25 – How would you describe the customer journey around data fabric? 17:10 – Is the ultimate destination AI?

Connect with the Team Producer Kate Brown - LinkedIn. Producer Steve Templeton - LinkedIn. Host Al Martin - LinkedIn and Twitter.

Data Engineering on Azure

Build a data platform to the industry-leading standards set by Microsoft’s own infrastructure.

In Data Engineering on Azure you will learn how to:
Pick the right Azure services for different data scenarios
Manage data inventory
Implement production quality data modeling, analytics, and machine learning workloads
Handle data governance
Use DevOps to increase reliability
Ingest, store, and distribute data
Apply best practices for compliance and access control

Data Engineering on Azure reveals the data management patterns and techniques that support Microsoft’s own massive data infrastructure. Author Vlad Riscutia, a data engineer at Microsoft, teaches you to bring engineering rigor to your data platform and ensure that your data prototypes function just as well under the pressures of production. You'll implement common data modeling patterns, stand up cloud-native data platforms on Azure, and get to grips with DevOps for both analytics and machine learning.

About the Technology Build secure, stable data platforms that can scale to loads of any size. When a project moves from the lab into production, you need confidence that it can stand up to real-world challenges. This book teaches you to design and implement cloud-based data infrastructure that you can easily monitor, scale, and modify.

About the Book In Data Engineering on Azure you’ll learn the skills you need to build and maintain big data platforms in massive enterprises. This guide includes clear, practical guidance for setting up infrastructure, orchestration, workloads, and governance. As you go, you’ll set up efficient machine learning pipelines, and then master time-saving automation and DevOps solutions. The Azure-based examples are easy to reproduce on other cloud platforms.

What's Inside
Data inventory and data governance
Assure data quality, compliance, and distribution
Build automated pipelines to increase reliability
Ingest, store, and distribute data
Production-quality data modeling, analytics, and machine learning

About the Reader For data engineers familiar with cloud computing and DevOps.

About the Author Vlad Riscutia is a software architect at Microsoft.

Quotes
A definitive and complete guide on data engineering, with clear and easy-to-reproduce examples. - Kelum Prabath Senanayake, Echoworx
An all-in-one Azure book, covering all a solutions architect or engineer needs to think about. - Albert Nogués, Danone
A meaningful journey through the Azure ecosystem. You’ll be building pipelines and joining components quickly! - Todd Cook, Appen
A gateway into the world of Azure for machine learning and DevOps engineers. - Krzysztof Kamyczek, Luxoft

Abstract Hosted by Al Martin, VP, IBM Expert Services Delivery, Making Data Simple provides the latest thinking on big data, A.I., and the implications for the enterprise from a range of experts. This week on Making Data Simple, we have Alex Watson. Alex was previously a GM at AWS and is currently a Co-Founder at Gretel.ai. Gretel is a privacy startup that enables developers, researchers, and scientists to quickly create safe versions of data for use in pre-production environments and machine learning workloads, which are shareable across teams and organizations. These tools address head-on the massive data privacy bottleneck, which has stifled innovation across multiple industries for years, by equipping builders everywhere with the ability to create quality datasets that scale. In short, synthetic data levels the playing field for everyone. This democratization of data will foster competition, scientific discoveries, and the inventions that will drive the next revolution of our data economy. The company recently closed its Series A funding, led by Greylock, for another $12 million and brought Jason Warner, the current CTO for GitHub, on as an investor. Gretel also launched its latest public beta, Beta2, which offers privacy engineering as a service for everyone, not just developers.

Show Notes 2:03 – Alex’s background 4:36 – What time frame was Harvest AI? 7:14 – How does NLP play into Harvest AI? 10:50 – How can we not have enough knowledge? 14:08 – Does the tech exist today for security? 18:14 – Privacy issues 20:42 – What does Gretel stand for? 27:42 – Do you increase the opportunity for bias? 31:18 – Where is the sweet spot for Gretel? 33:30 – When does synthetic data not work? 37:42 – What is practical privacy? Gretel

Connect with the Team Producer Kate Brown - LinkedIn. Producer Steve Templeton - LinkedIn. Host Al Martin - LinkedIn and Twitter.

SQL Server on Kubernetes: Designing and Building a Modern Data Platform

Build a modern data platform by deploying SQL Server in Kubernetes. Modern application deployment needs to be fast and consistent to keep up with business objectives and Kubernetes is quickly becoming the standard for deploying container-based applications, fast. This book introduces Kubernetes and its core concepts. Then it shows you how to build and interact with a Kubernetes cluster. Next, it goes deep into deploying and operationalizing SQL Server in Kubernetes, both on premises and in cloud environments such as the Azure Cloud. You will begin with container-based application fundamentals and then go into an architectural overview of a Kubernetes container and how it manages application state. Then you will learn the hands-on skill of building a production-ready cluster. With your cluster up and running, you will learn how to interact with your cluster and perform common administrative tasks. Once you can admin the cluster, you will learn how to deploy applications and SQL Server in Kubernetes. You will learn about high-availability options, and about using Azure Arc-enabled Data Services. By the end of this book, you will know how to set up a Kubernetes cluster, manage a cluster, deploy applications and databases, and keep everything up and running. What You Will Learn Understand Kubernetes architecture and cluster components Deploy your applications into Kubernetes clusters Manage your containers programmatically through API objects and controllers Deploy and operationalize SQL Server in Kubernetes Implement high-availability SQL Server scenarios on Kubernetes using Azure Arc-enabled Data Services Make use of Kubernetes deployments for Big Data Clusters Who This Book Is For DBAs and IT architects who are ready to begin planning their next-generation data platform and want to understand what it takes to run SQL Server in a container in Kubernetes. 
SQL Server on Kubernetes is an excellent choice for those who want to understand the big picture of why Kubernetes is the next-generation deployment method for SQL Server but also want to understand the internals, or the how, of deploying SQL Server in Kubernetes. When finished with this book, you will have the vision and skills to successfully architect, build and maintain a modern data platform deploying SQL Server on Kubernetes.


Abstract Hosted by Al Martin, VP, IBM Expert Services Delivery, Making Data Simple provides the latest thinking on big data, A.I., and the implications for the enterprise from a range of experts.

This week on Making Data Simple, we have Neil Gilbert Siegel. Neil has led the creation of a large number of successful military intelligence and commercial systems, including the US Blue Force Tracker. He has also made a number of advances in consumer electronics and health care and holds a number of patents. Neil has a number of awards and honors, including election to the US National Academy of Engineering and Fellow of the National Academy of Inventors. Finally, he is the author of Engineering Project Management.

Show Notes
2:49 – Why is all data wrong?
9:00 – How do you fix wrong data?
23:17 – Where did you get your love for engineering?

Connect with the Team Producer Kate Brown - LinkedIn. Producer Steve Templeton - LinkedIn. Host Al Martin - LinkedIn and Twitter.
Nassim Nicholas Taleb – Fooled by Randomness - Black Swan
Engineering Project Management
NeilSiegel.usc.edu

Designing Big Data Platforms

Designing Big Data Platforms provides expert guidance and valuable insights on getting the most out of Big Data systems.
An array of tools is currently available for managing and processing data: some are ready-to-go solutions that can be immediately deployed, while others require complex and time-intensive setups. With such a vast range of options, choosing the right tool to build a solution can be complicated, as can determining which tools work well with each other. Designing Big Data Platforms provides clear and authoritative guidance on the critical decisions necessary for successfully deploying, operating, and maintaining Big Data systems. This highly practical guide helps readers understand how to process large amounts of data with well-known Linux tools and database solutions, use effective techniques to collect and manage data from multiple sources, transform data into meaningful business insights, and much more.
Author Yusuf Aytas, a software engineer with a vast amount of big data experience, discusses the design of the ideal Big Data platform: one that meets the needs of data analysts, data engineers, data scientists, software engineers, and a spectrum of other stakeholders across an organization. Detailed yet accessible chapters cover key topics such as stream data processing, data analytics, data science, data discovery, and data security.
This real-world manual for Big Data technologies:
Provides up-to-date coverage of the tools currently used in Big Data processing and management
Offers step-by-step guidance on building a data pipeline, from basic scripting to distributed systems
Highlights and explains how data is processed at scale
Includes an introduction to the foundation of a modern data platform
Designing Big Data Platforms: How to Use, Deploy, and Maintain Big Data Systems is a must-have for all professionals working with Big Data, as well as researchers and students in computer science and related fields.


Abstract Hosted by Al Martin, VP, IBM Expert Services Delivery, Making Data Simple provides the latest thinking on big data, A.I., and the implications for the enterprise from a range of experts.

This week on Making Data Simple, we have Neil Gilbert Siegel. Neil has led the creation of a large number of successful military intelligence and commercial systems, including the US Blue Force Tracker. He has also made a number of advances in consumer electronics and health care and holds a number of patents. Neil has a number of awards and honors, including election to the US National Academy of Engineering and Fellow of the National Academy of Inventors. Finally, he is the author of Engineering Project Management.

Show Notes
2:20 – Tell us about IBM
3:03 – How do you describe yourself?
8:37 – Can you talk about the first US Army unmanned aerial vehicle?
11:13 – Can you give us some examples of the smartphone and tablet components?
18:30 – What’s your favorite invention?
20:20 – Why a book about Engineering and Management?

Connect with the Team Producer Kate Brown - LinkedIn. Producer Steve Templeton - LinkedIn. Host Al Martin - LinkedIn and Twitter.