React

Observability for Distributed Computing with Dask

2023-04-18 · PyConDE & PyData Berlin 2023

talk

by Hendrik Makait

AI/ML Cloud Computing Data Engineering Data Science NumPy Pandas Python

Debugging is hard. Distributed debugging is hell.

Dask is a popular library for parallel and distributed computing in Python. Dask is commonly used in data science, actual science, data engineering, and machine learning to distribute workloads onto clusters of many hundreds of workers with ease.

However, when things go wrong, life can become difficult because of all the moving parts. These parts include your code, other PyData libraries like NumPy/pandas, the machines you’re running on, the network between them, storage, the cloud, and of course issues with Dask itself. It can be difficult to understand what is going on, especially when things seem slower than they should be or fail unexpectedly. Observability is the key to sanity and success.

In this talk, we describe the tools Dask offers to help you observe your distributed cluster, analyze performance, and monitor your cluster to react to unexpected changes quickly. We will dive into distributed logging, automated metrics, event-based monitoring, and root-causing problems with diagnostic tooling. Throughout the talk, we will leverage real-world use cases to show how these tools help to identify and solve problems for large-scale users in the wild.

This talk should be particularly insightful for Dask users, but the approaches to observing distributed systems should be relevant to anyone operating at scale in production.

Unlocking The Potential Of Streaming Data Applications Without The Operational Headache At Grainite

2023-03-25 · Data Engineering Podcast Listen

podcast_episode

by Ashish Kumar (Grainite) , Abhishek Chauhan (Grainite) , Tobias Macey

AI/ML Data Engineering Data Management Data Science JavaScript Modern Data Stack Python Data Streaming

Summary

The promise of streaming data is that it allows you to react to new information as it happens, rather than introducing latency by batching records together. The peril is that building a robust and scalable streaming architecture is always more complicated and error-prone than you think it's going to be. After experiencing this unfortunate reality for themselves, Abhishek Chauhan and Ashish Kumar founded Grainite so that you don't have to suffer the same pain. In this episode they explain why streaming architectures are so challenging, how they have designed Grainite to be robust and scalable, and how you can start using it today to build your streaming data applications without all of the operational headache.
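To make the latency trade-off in the summary concrete, here is a minimal sketch in Python. It is a toy model, not a description of Grainite: the function names, the arrival times, and the 10-second batch interval are all assumptions chosen for illustration. It compares how long each record waits when results are only produced at batch boundaries versus when each record is handled on arrival.

```python
from statistics import mean


def batch_latencies(arrival_times, batch_interval):
    """Wait time per record when records are only processed at batch boundaries.

    Hypothetical model: a batch is flushed every `batch_interval` seconds,
    and a record is processed at the first flush at or after its arrival.
    """
    latencies = []
    for t in arrival_times:
        windows_elapsed = -(-t // batch_interval)  # ceiling division on integers
        flush_time = windows_elapsed * batch_interval
        latencies.append(flush_time - t)
    return latencies


def streaming_latencies(arrival_times, processing_delay=0):
    """Wait time per record when each record is processed as soon as it arrives."""
    return [processing_delay for _ in arrival_times]


# Assumed arrival times in seconds, purely illustrative.
arrivals = [1, 4, 7, 12, 18]

batch = batch_latencies(arrivals, batch_interval=10)
stream = streaming_latencies(arrivals)

print("batch latencies:", batch, "mean:", mean(batch))
print("streaming latencies:", stream, "mean:", mean(stream))
```

The model deliberately ignores the operational costs of streaming (ordering, retries, state management), which is the part the episode describes as harder than expected.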

Announcements

Hello and welcome to the Data Engineering Podcast, the show about modern data management.

Businesses that adapt well to change grow 3 times faster than the industry average. As your business adapts, so should your data. RudderStack Transformations lets you customize your event data in real-time with your own JavaScript or Python code. Join The RudderStack Transformation Challenge today for a chance to win a $1,000 cash prize just by submitting a Transformation to the open-source RudderStack Transformation library. Visit dataengineeringpodcast.com/rudderstack today to learn more.

Hey there podcast listener, are you tired of dealing with the headache that is the 'Modern Data Stack'? We feel your pain. It's supposed to make building smarter, faster, and more flexible data infrastructures a breeze. It ends up being anything but that. Setting it up, integrating it, maintaining it—it’s all kind of a nightmare. And let's not even get started on all the extra tools you have to buy to get it to do its thing. But don't worry, there is a better way. TimeXtender takes a holistic approach to data integration that focuses on agility rather than fragmentation. By bringing all the layers of the data stack together, TimeXtender helps you build data solutions up to 10 times faster and saves you 70-80% on costs. If you're fed up with the 'Modern Data Stack', give TimeXtender a try. Head over to dataengineeringpodcast.com/timextender where you can do two things: watch us build a data estate in 15 minutes and start for free today.

Join in with the event for the global data community, Data Council Austin. From March 28-30th 2023, they'll play host to hundreds of attendees, 100 top speakers, and dozens of startups that are advancing data science, engineering and AI. Data Council attendees are amazing founders, data scientists, lead engineers, CTOs, heads of data, investors and community organizers who are all working together to build the future of data. As a listener to the Data Engineering Podcast you can get a special discount of 20% off your ticket by using the promo code dataengpod20. Don't miss out on their only event this year! Visit dataengineeringpodcast.com/data-council today.

Your host is Tobias Macey and today I'm interviewing Ashish Kumar and Abhishek Chauhan about Grainite, a platform designed to give you a single place to build streaming data applications.

Interview

Introduction

How did you get involved in the area of data management?

Can you describe what Grainite is and the story behind it?

What are the personas that you are focused on addressing with Grainite?

What are some of the most complex aspects of building streaming data applications in the absence of something like Grainite?

How does Grainite work to reduce that complexity?

What are some of the commonalities that you see in the teams/organizations that find their way to Grainite?

What are some of the higher-order projects that teams are able to build when they are using Grainite as a starting point vs. where they would be spending effort on a fully managed streaming architecture?

Can you describe how Grainite is architected?

How have the design and goals of the platform changed/evolved since you first started working on it?

Wh

102 - CDO Spotlight: The Non-Technical Roles Data Science and Analytics Teams Need to Drive Adoption of Data Products w/ Iván Herrero Bartolomé

2022-10-18 · Experiencing Data w/ Brian T. O’Neill (AI & data product management leadership—powered by UX design) Listen

podcast_episode

by Brian O’Neill (Designing for Analytics) , Iván Herrero Bartolomé (Grupo Intercorp)

Analytics Data Analytics Data Science HTML

Today I’m chatting with Iván Herrero Bartolomé, Chief Data Officer at Grupo Intercorp. Iván describes how he was prompted to write his new article in CDO Magazine, “CDOs, Let’s Get Out of Our Comfort Zone” as he recognized the importance of driving cultural change within organizations in order to optimize the use of data. Listen in to find out how Iván is leveraging the role of the analytics translator to drive this cultural shift, as well as the challenges and benefits he sees data leaders encounter as they move from tactical to strategic objectives. Iván also reveals the number one piece of advice he’d give CDOs who are struggling with adoption.

Highlights / Skip to:

Iván explains what prompted him to write his new article, “CDOs, Let’s Get Out of Our Comfort Zone” (01:08)
What Iván feels is necessary for data leaders to close the gap between data and the rest of the business and why (03:44)
Iván dives into who he feels really owns delivery of value when taking on new data science and analytics projects (09:50)
How Iván’s team went from managing technical projects that often didn’t make it to production to working on strategic projects that almost always make it to production (13:06)
The framework Iván has developed to upskill technical and business roles to be effective data / analytics translators (16:32)
The challenge Iván sees data leaders face as they move from setting and measuring tactical goals to moving towards strategic goals and initiatives (24:12)
Iván explains how the C-Suite’s attitude impacts the cross-functional role of data & analytics leadership (28:55)
The number one piece of advice Iván would give new CDOs struggling with low adoption of their data products and solutions (31:45)

Quotes from Today’s Episode

“We’re going to do all our best to ensure that [...] everything that is expected from us is done in the best possible way. But that’s not going to be enough. We need a sponsorship and we need someone accountable for the project and someone who will be pushing and enabling the use of the solution once we are gone. Because we cannot stay forever in every company.” – Iván Herrero Bartolomé (10:52)

“We are trying to upskill people from the business to become data translators, but that’s going to take time. Especially what we try to do is to take product owners and give them a high-level immersion on the state-of-the-art and the possibilities that data analytics bring to the table. But as we can’t rely on our companies having this kind of talent and these data translators, they are one of the profiles that we bring in for every project that we work on.” – Iván Herrero Bartolomé (13:51)

“There’s a lot to do, not just between data and analytics and the other areas of the company, but aligning the incentives of all the organization towards the same goals in a way that there’s no friction between the goals of the different areas, the people, [...] and the final goals of the organization.” – Iván Herrero Bartolomé (23:13)

“Deciding which goals are you going to be co-responsible for, I think that is a sophisticated process that it’s not mastered by many companies nowadays. That probably is one of the main blockers keeping data analytics areas working far from their business counterparts” – Iván Herrero Bartolomé (26:05)

“When the C-suite looks at data and analytics, if they think these are just technical skills, then the data analytics team are just going to behave as technical people. And many, many data analytics teams are set up as part of the IT organization. So, I think it all begins somehow with how the C-suite of our companies look at us.” – Iván Herrero Bartolomé (28:55)

“For me, [digital] means much more than the technical development of solutions; it should also be part of the transformation of the company, both in how companies develop relationships with their customers, but also inside how every process in the companies becomes more nimble and can react faster to the changes in the market.” – Iván Herrero Bartolomé (30:49)

“When you feel that everyone else not doing what you think they should be doing, think twice about whether it is they who are not doing what they should be doing or if it’s something that you are not doing properly.” – Iván Herrero Bartolomé (31:45)

Links

“CDOs, Let’s Get Out of Our Comfort Zone”: https://www.cdomagazine.tech/cdo_magazine/topics/opinion/cdos-lets-get-out-of-our-comfort-zone/article_dce87fce-2479-11ed-a0f4-03b95765b4dc.html
LinkedIn: https://www.linkedin.com/in/ivan-herrero-bartolome/

Speeding Up The Time To Insight For Supply Chains And Logistics With The Pathway Database That Thinks

2022-10-16 · Data Engineering Podcast Listen

podcast_episode

by Adrian Kosowski (Pathway) , Tobias Macey

AI/ML Analytics BI Data Engineering Data Management Dataflow ETL/ELT Google Analytics Hevo Data Kubernetes Modern Data Stack MongoDB +5 more

Summary

Logistics and supply chains have come under increased stress and scrutiny in recent years. In order to stay ahead of customer demands, businesses need to be able to react quickly and intelligently to changes, which requires fast and accurate insights into their operations. Pathway is a streaming database engine that embeds artificial intelligence into the storage, with functionality designed to support the spatiotemporal data that is crucial for shipping and logistics. In this episode Adrian Kosowski explains how the Pathway product got started, how its design simplifies the creation of data products that support supply chain operations, and how developers can help to build an ecosystem of applications that allow businesses to accelerate their time to insight.

Announcements

Hello and welcome to the Data Engineering Podcast, the show about modern data management.

When you’re ready to build your next pipeline, or want to test out the projects you hear about on the show, you’ll need somewhere to deploy it, so check out our friends at Linode. With their new managed database service you can launch a production ready MySQL, Postgres, or MongoDB cluster in minutes, with automated backups, 40 Gbps connections from your application hosts, and high throughput SSDs. Go to dataengineeringpodcast.com/linode today and get a $100 credit to launch a database, create a Kubernetes cluster, or take advantage of all of their other services. And don’t forget to thank them for their continued support of this show!

Atlan is the metadata hub for your data ecosystem. Instead of locking your metadata into a new silo, unleash its transformative potential with Atlan’s active metadata capabilities. Push information about data freshness and quality to your business intelligence, automatically scale up and down your warehouse based on usage patterns, and let the bots answer those questions in Slack so that the humans can focus on delivering real value. Go to dataengineeringpodcast.com/atlan today to learn more about how Atlan’s active metadata platform is helping pioneering data teams like Postman, Plaid, WeWork & Unilever achieve extraordinary things with metadata and escape the chaos.

Prefect is the modern Dataflow Automation platform for the modern data stack, empowering data practitioners to build, run and monitor robust pipelines at scale. Guided by the principle that the orchestrator shouldn’t get in your way, Prefect is the only tool of its kind to offer the flexibility to write code as workflows. Prefect specializes in glueing together the disparate pieces of a pipeline, and integrating with modern distributed compute libraries to bring power where you need it, when you need it. Trusted by thousands of organizations and supported by over 20,000 community members, Prefect powers over 100MM business critical tasks a month. For more information on Prefect, visit dataengineeringpodcast.com/prefect.

Data engineers don’t enjoy writing, maintaining, and modifying ETL pipelines all day, every day. Especially once they realize 90% of all major data sources like Google Analytics, Salesforce, Adwords, Facebook, Spreadsheets, etc., are already available as plug-and-play connectors with reliable, intuitive SaaS solutions. Hevo Data is a highly reliable and intuitive data pipeline platform used by data engineers from 40+ countries to set up and run low-latency ELT pipelines with zero maintenance. Boasting more than 150 out-of-the-box connectors that can be set up in minutes, Hevo also allows you to monitor and control your pipelines. You get: real-time data flow visibility, fail-safe mechanisms, and alerts if anything breaks; preload transformations and auto-schema mapping precisely control how data lands in your destination; models and workflows to transform data for analytics; and reverse-ETL capability to move the transformed data back to your business software to inspire timely action. All of this, plus its transparent pricing and 24*7 live s

Full Stack FastAPI, React, and MongoDB

2022-09-23 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Marko Aleksendrić

JavaScript JSON MongoDB Python Redis data data-engineering nosql-databases

Master web development with the FARM stack in this comprehensive guide. You'll learn to harness FastAPI for a secure and efficient backend, React for a dynamic frontend, and MongoDB for flexible data storage. Gain practical experience by building fully functional projects that you can deploy and fine-tune, opening doors to enhanced proficiency in modern web technologies.

What this Book will help me do

Build secure and performant backends using FastAPI and understand its integration with MongoDB.
Develop responsive and dynamic user interfaces with React and incorporate server-side rendering for improved SEO.
Explore the intricacies of deploying full-stack applications on platforms like Heroku and Netlify.
Implement robust user authentication systems with JSON Web Tokens for securing your applications.
Apply caching strategies with Redis to enhance the performance and scalability of applications.

Author(s)

Marko Aleksendrić, the author of this book, combines years of experience in software development with a passion for teaching. Specializing in full-stack web technologies, Marko has a track record of guiding developers in mastering modern tools like FastAPI and React. His practical approach focuses on equipping readers with real-world skills through projects and best practices.

Who is it for?

This book is ideal for developers with foundational knowledge in Python, JavaScript, and web basics who want to expand their expertise into full-stack development. Whether you're a professional seeking to enhance your project toolkit or a beginner aiming to tackle modern web applications, this guide provides a step-by-step approach tailored to your growth.

098 - Why Emilie Schario Wants You to Run Your Data Team Like a Product Team

2022-08-23 · Experiencing Data w/ Brian T. O’Neill (AI & data product management leadership—powered by UX design) Listen

podcast_episode

by Emilie Schario (Netlify) , Brian O’Neill (Designing for Analytics)

AI/ML Analytics BI Dashboard Power BI

Today I’m chatting with Emilie Schario, a Data Strategist in Residence at Amplify Partners. Emilie thinks data teams should operate like product teams. But what led her to that conclusion, and how has she put the idea into practice? Emilie answers those questions and more, delving into what kind of pushback and hiccups someone can expect when switching from being data-driven to product-driven and sharing advice for data scientists and analytics leaders.

Highlights / Skip to:

Answering the question “whose job is it” (5:18)
Understanding and solving problems instead of just building features people ask for (9:05)
Emilie explains what Amplify Partners is and talks about her work experience and how it fuels her perspectives on data teams (11:04)
Emilie and I talk about the definition of data product (13:00)
Emilie talks about her approach to building and training a data team (14:40)
We talk about UX designers and how they fit into Emilie’s data teams (18:40)
Emilie talks about the book and blog “Storytelling with Data” (21:00)
We discuss the pushback you can expect when trying to switch a team from being data-driven to being product-driven (23:18)
What hiccups people can expect when switching to a product-driven model (30:36)
Emilie’s advice for data scientists and analyst leaders (35:50)
Emilie explains what Locally Optimistic is (37:34)

Quotes from Today’s Episode

“Our thesis is…we need to understand the problems we’re solving before we start building solutions, instead of just building the things people are asking for.” — Emilie (2:23)

“I’ve seen this approach of flipping the ask on its head—understanding the problem you’re trying to solve—work and be more successful at helping drive impact instead of just letting your data team fall into this widget builder service trap.” — Emilie (4:43)

“If your answer to any problem to me is, ‘That’s not my job,’ then I don’t want you working for me because that’s not what we’re here for. Your job is whatever the problem in front of you that needs to be solved.” — Emilie (7:14)

“I don’t care if you have all of the data in the world and the most talented machine learning engineers and you’ve got the ability to do the coolest new algorithm fancy thing. If it doesn’t drive business impact, it doesn’t matter.” — Emilie (7:52)

“Data is not just a thing that anyone can do. It’s not just about throwing numbers in a spreadsheet anymore. It’s about driving business impact. But part of how we drive business impact with data is making it accessible. And accessible isn’t just giving people the numbers, it’s also communicating with it effectively, and UX is a huge piece of how we do that.” — Emilie (19:57)

“There are no null choices in design. Someone is deciding what some other human—a customer, a client, an internal stakeholder—is going to use, whether it’s a React app, or a Power BI dashboard, or a spreadsheet dump, or whatever it is, right? There will be an experience that is created, whether it is intentionally created or not.” — Brian (20:28)

“People will think design is just putting in colors that match together, like, or spinning the color wheel and seeing what lands. You know, there’s so much more to it. And it is an expertise; it is a domain that you have to develop.” — Emilie (34:58)

Links Referenced:

Blog post by Rifat Majumder
storytellingwithdata.com
Experiencing Data Episode 28 with Cole Nussbaumer Knaflic
locallyoptimistic.com
Twitter: @emilieschario

Realize the Promise of Streaming with the Databricks Lakehouse Platform

2022-07-19 · Databricks DATA + AI Summit 2023 Watch

video

by Erica Lee (Upwork)

AI/ML Data Lakehouse Databricks ETL/ELT Data Streaming

Streaming is the future of all data pipelines and applications. It enables businesses to make data-driven decisions sooner and react faster, develop data-driven applications considered previously impossible, and deliver new and differentiated experiences to customers. However, many organizations have not realized the promise of streaming to its full potential because it requires them to completely redevelop their data pipelines and applications on new, complex, proprietary, and disjointed technology stacks.

The Databricks Lakehouse Platform is a simple, unified, and open platform that supports all streaming workloads, ranging from ingestion and ETL to event processing, event-driven applications, and ML inference. In this session, we will discuss the streaming capabilities of the Lakehouse Platform and demonstrate how easy it is to build end-to-end, scalable streaming pipelines and applications that fulfill the promise of streaming for your business. You will also hear Erica Lee, VP of ML at Upwork, the world's largest Work Marketplace, share how the Upwork team uses Databricks to enable real-time predictions by computing ML features in a continuous streaming manner.

Connect with us: Website: https://databricks.com Facebook: https://www.facebook.com/databricksinc Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/data... Instagram: https://www.instagram.com/databricksinc/

Towards a Modular Future: Reimagining and Rebuilding Kedro-viz for Visualizing Modular Pipelines

2022-07-19 · Databricks DATA + AI Summit 2023 Watch

video

AI/ML Data Science DataViz Databricks

Kedro is an open-source framework for creating portable pipelines through modular data science code. It also provides an interactive visualisation tool called ‘Kedro-Viz’, a web app that generates an informative visualisation of the pipeline.

In 2020, the Kedro project introduced an important set of features to support Modular Pipelines, which allows users to set up a series of pipelines that are logically isolated and re-usable to form higher level pipelines.

With this paradigm shift comes the need to reimagine how Kedro-viz visualizes the pipeline: it requires a series of redesigns and new features to support this new representation of pipeline structure.

As a core contributor to and team member of the Kedro-viz project throughout the past year, I have witnessed the journey of this transition through shipping the core features for modular pipelines on Kedro-viz.

This talk will focus on my experience as a front end developer as I walk through the unique architecture and data ingestion setup for this project. I will deep-dive into the unique set of problems we faced and the assumptions we had to make in accommodating this new modular pipeline setup, and our approach to solving them within a front end (React + Redux) context.

I will also share the mistakes and learnings along the way, and how they paved the path towards the app architecture choices for our next set of features in ML experiment tracking.

This talk is for the curious data practitioner who is up for exposure to a fresh set of problems beyond the typical data science domain, and for those who are up for a ride through the mind-boggling details of the unique set up of front end development and data visualisation for data science.


Future of the Airflow UI

2022-07-01 · Airflow Summit 2022

session

by Brent Bovenzi (Astronomer)

Airflow

Sneak peek at the future of the Airflow UI. In Airflow 2.3, with the Tree -> Grid view changes, we began to swap out parts of the Flask app with React. This was one step towards AIP-38, to build a fully modern UI for Airflow. Come check out what is in store after Grid view in the current UI. We will discuss possibilities for rethinking Airflow with a brand new UI down the line, such as:

Integrating all DAG visualizations into each other and removing constant page reloads
More live data
Greater cross-DAG visualizations (e.g., the DAG Dependencies view from 2.1)
Improved user settings (dark mode, color blind support, language, date format)
And more!

Ten Things to Know About ModelOps

2022-06-25 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Mark Palmer , Larry Derany , Thomas Hill

AI/ML Analytics Data Science data data-engineering data-models

The past few years have seen significant developments in data science, AI, machine learning, and advanced analytics. But the wider adoption of these technologies has also brought greater cost, risk, regulation, and demands on organizational processes, tasks, and teams. This report explains how ModelOps can provide both technical and operational solutions to these problems. Thomas Hill, Mark Palmer, and Larry Derany summarize important considerations, caveats, choices, and best practices to help you be successful with operationalizing AI/ML and analytics in general. Whether your organization is already working with teams on AI and ML, or just getting started, this report presents ten important dimensions of analytic practice and ModelOps that are not widely discussed, or perhaps even known.

In part, this report examines:

Why ModelOps is the enterprise "operating system" for AI/ML algorithms
How to build your organization's IP secret sauce through repeatable processing steps
How to anticipate risks rather than react to damage done
How ModelOps can help you deliver the many algorithms and model formats available
How to plan for success and monitor for value, not just accuracy
Why AI will soon be regulated and how ModelOps helps ensure compliance

AI-Enabled Analytics for Business

2022-01-19 · O'Reilly Data Science Books O'Reilly Amazon

book

by Lawrence S. Maisel , Robert J. Zwerling , Jesper H. Sorensen

AI/ML Analytics business-intelligence data data-science

We are entering the era of digital transformation where human and artificial intelligence (AI) work hand in hand to achieve data-driven performance. Today, more than ever, businesses are expected to possess the talent, tools, processes, and capabilities to enable their organizations to implement and utilize continuous analysis of past business performance and events to gain forward-looking insight to drive business decisions and actions. AI-Enabled Analytics in Business is your roadmap to meet this essential business capability. To ensure we can plan for the future rather than react to it when it arrives, we need to develop and deploy a toolbox of tools, techniques, and effective processes to reveal forward-looking, unbiased insights that help us understand significant patterns, relationships, and trends. This book promotes clarity to enable you to make better decisions from insights about the future.

Learn how advanced analytics ensures that your people have the right information at the right time to gain critical insights and performance opportunities
Empower better, smarter decision making by implementing AI-enabled analytics decision support tools
Uncover patterns and insights in data, and discover facts about your business that will unlock greater performance
Gain inspiration from practical examples and use cases showing how to move your business toward AI-Enabled decision making

AI-Enabled Analytics in Business is a must-have practical resource for directors, officers, and executives across various functional disciplines who seek increased business performance and valuation.

Learning PHP, MySQL & JavaScript, 6th Edition

2021-07-22 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Robin Nixon

HTML JavaScript MySQL Cyber Security data data-engineering relational-databases

Build interactive, data-driven websites with the potent combination of open source technologies and web standards, even if you have only basic HTML knowledge. With the latest edition of this popular hands-on guide, you'll tackle dynamic web programming using the most recent versions of today's core technologies: PHP, MySQL, JavaScript, CSS, HTML5, jQuery, and the powerful React library. Web designers will learn how to use these technologies together while picking up valuable web programming practices along the way, including how to optimize websites for mobile devices. You'll put everything together to build a fully functional social networking site suitable for both desktop and mobile browsers.

Explore MySQL from database structure to complex queries
Use the MySQL PDO extension, PHP's improved MySQL interface
Create dynamic PHP web pages that tailor themselves to the user
Manage cookies and sessions and maintain a high level of security
Enhance JavaScript with the React library
Use Ajax calls for background browser-server communication
Style your web pages by acquiring CSS skills
Implement HTML5 features, including geolocation, audio, video, and the canvas element
Reformat your websites into mobile web apps

Integrating D3.js with React: Learn to Bring Data Visualization to Life

2021-06-03 · O'Reilly Data Visualization Books O'Reilly Amazon

book

by Elad Elrom

DataViz HTML JavaScript TypeScript javascript-frameworks web-development web-mobile

Integrate D3.js into a React TypeScript project and create a chart component working in harmony with React. This book will show you how to utilize D3 with React to bring life to your charts. Seasoned author Elad Elrom will show you how to create simple charts such as line, bar, donut, scatter, histogram and others, and advanced charts such as a world map and force charts. You'll also learn to share the data across your components and charts using React Recoil state management. Then integrate third-party chart libraries that are built on D3, such as Recharts, Visx, Nivo, React-vis, and Victory, and in the end deploy your chart as a server or serverless app on popular platforms. React and D3 are two of the most popular frameworks in their respective areas – learn to bring them together and take your storytelling to the next level.

What You'll Learn

Set up your project with React, TypeScript and D3.js
Create simple and advanced D3.js charts
Work with complex charts such as world and force charts
Integrate D3 data with React state management
Improve the performance of your D3 components
Deploy as a server or serverless app and debug test

Who This Book Is For

Readers that already have basic knowledge of React, HTML, CSS and JavaScript.

[Replay] - Covid-19 interview with Dr. Kyu Rhee

2020-11-25 · Making Data Simple Listen

podcast_episode

by Dr. Kyu Rhee (IBM) , Al Martin (IBM)

AI/ML Big Data IBM

Hosted by Al Martin, VP, Data and AI Expert Services and Learning at IBM, Making Data Simple provides the latest thinking on big data, A.I., and the implications for the enterprise from a range of experts. Want to be featured as a guest on Making Data Simple? Reach out to us at [[email protected]] and tell us why you should be next.

Abstract

This week on Making Data Simple, we have a returning guest, Dr. Kyu Rhee, VP & Chief Health Officer at IBM and IBM Watson Health, discussing the Covid-19 pandemic and how we prepare and react individually and as a country. What can we do for ourselves, and how does this pandemic affect the economy? And when do we see a light at the end of the tunnel?

Show Notes

https://www.ibm.com/blogs/watson-health/author/kyurhee/
https://www.ibm.com/impact/covid-19/

Connect with the Team

Producer Kate Brown - LinkedIn.

Producer Michael Sestak - LinkedIn.

Producer Meighann Helene - LinkedIn.

Host Al Martin - LinkedIn and Twitter.

Additional resources: IBM Watson Health COVID-19 Resources: https://www.ibm.com/watson-health/covid-19

IBM Watson Health: Micromedex with Watson: https://www.ibm.com/products/dynamed-and-micromedex-with-watson

How governments are rising to the challenge of COVID-19: https://www.ibm.com/blogs/watson-health/governments-agencies-rising-challenge-of-covid-19/

Want to be featured as a guest on Making Data Simple? Reach out to us at [email protected] and tell us why you should be next. The Making Data Simple Podcast is hosted by Al Martin, WW VP Technical Sales, IBM, where we explore trending technologies, business innovation, and leadership ... while keeping it simple & fun.

[Replay] - Covid-19 interview with Dr. Kyu Rhee

2020-09-16 · Making Data Simple Listen

podcast_episode

by Dr. Kyu Rhee (IBM) , Al Martin (IBM)

AI/ML Big Data IBM

Hosted by Al Martin, VP, Data and AI Expert Services and Learning at IBM, Making Data Simple provides the latest thinking on big data, A.I., and the implications for the enterprise from a range of experts. Want to be featured as a guest on Making Data Simple? Reach out to us at [[email protected]] and tell us why you should be next.

Abstract

This week on Making Data Simple, we have a returning guest, Dr. Kyu Rhee, VP & Chief Health Officer at IBM and IBM Watson Health, discussing the Covid-19 pandemic and how we prepare and react individually and as a country. What can we do for ourselves, and how does this pandemic affect the economy? And when do we see a light at the end of the tunnel?

Show Notes

1. https://www.ibm.com/blogs/watson-health/author/kyurhee/
2. https://www.ibm.com/impact/covid-19/

Connect with the Team

Producer Kate Brown - LinkedIn.

Producer Michael Sestak - LinkedIn.

Producer Meighann Helene - LinkedIn.

Host Al Martin - LinkedIn and Twitter.

Additional resources: IBM Watson Health COVID-19 Resources: https://www.ibm.com/watson-health/covid-19

IBM Watson Health: Micromedex with Watson: https://www.ibm.com/products/dynamed-and-micromedex-with-watson

How governments are rising to the challenge of COVID-19: https://www.ibm.com/blogs/watson-health/governments-agencies-rising-challenge-of-covid-19/

Want to be featured as a guest on Making Data Simple? Reach out to us at [email protected] and tell us why you should be next. The Making Data Simple Podcast is hosted by Al Martin, WW VP Technical Sales, IBM, where we explore trending technologies, business innovation, and leadership ... while keeping it simple & fun.

Covid-19 interview with Dr. Kyu Rhee

2020-05-13 · Making Data Simple Listen

podcast_episode

by Dr. Kyu Rhee (IBM) , Al Martin (IBM)

AI/ML Big Data IBM

Hosted by Al Martin, VP, Data and AI Expert Services and Learning at IBM, Making Data Simple provides the latest thinking on big data, A.I., and the implications for the enterprise from a range of experts.

Want to be featured as a guest on Making Data Simple? Reach out to us at [[email protected]] and tell us why you should be next.

Abstract

This week on Making Data Simple, we have a returning guest, Dr. Kyu Rhee, VP & Chief Health Officer at IBM and IBM Watson Health, discussing the Covid-19 pandemic and how we prepare and react individually and as a country. What can we do for ourselves, and how does this pandemic affect the economy? And when do we see a light at the end of the tunnel?

Show Notes

1. https://www.ibm.com/blogs/watson-health/author/kyurhee/
2. https://www.ibm.com/impact/covid-19/

Connect with the Team

Producer Kate Brown - LinkedIn.

Producer Michael Sestak - LinkedIn.

Producer Meighann Helene - LinkedIn.

Host Al Martin - LinkedIn and Twitter.

Additional resources: IBM Watson Health COVID-19 Resources: https://www.ibm.com/watson-health/covid-19

IBM Watson Health: Micromedex with Watson: https://www.ibm.com/products/dynamed-and-micromedex-with-watson

How governments are rising to the challenge of COVID-19: https://www.ibm.com/blogs/watson-health/governments-agencies-rising-challenge-of-covid-19/

Want to be featured as a guest on Making Data Simple? Reach out to us at [email protected] and tell us why you should be next. The Making Data Simple Podcast is hosted by Al Martin, WW VP Technical Sales, IBM, where we explore trending technologies, business innovation, and leadership ... while keeping it simple & fun.

Pro D3.js: Use D3.js to Create Maintainable, Modular, and Testable Charts

2019-10-31 · O'Reilly Data Science Books O'Reilly Amazon

book

by Marcos Iglesias

API CI/CD DataViz JavaScript d3 data data-science data-science-tasks data-visualization

Go beyond the basics of D3.js to create maintainable, modular, and testable charts and to package them into a library that can be distributed as open source software or kept for private use. This book will show you how to transform regular D3.js chart code into reusable and extendable modules.

You know the basics of working with D3.js, but it's time to become a professional D3.js practitioner. This book is your launching pad to refactoring code, composing complex visualizations from small components, working as a team with other developers, and integrating charts with a Continuous Integration system. You'll begin by creating a production-ready chart using D3.js v5, ES2015, and a test-driven approach and then move on to using and extending Britecharts, the reusable charting library based on Reusable API patterns. Finally, you'll see how to use D3.js along with React to document and build your charts to compose a charting library you can release into the NPM repository. With Pro D3.js, you'll become an accomplished D3.js developer in no time.

What You Will Learn

Create v5 D3.js charts with ES2016 and unit tests
Develop modular, testable and extensible code with the Reusable API pattern
Work with and extend Britecharts, a reusable charting library created at Eventbrite
Use Webpack and npm to create and publish a charting library from your own chart collections
Write reference documentation and build a documentation homepage for your library

Who This Book Is For

Data scientists, data visualization engineers, and frontend developers with a fundamental knowledge of D3.js and some experience with JavaScript, as well as data journalists and consultants.

Enterprise Insight with Dinesh Nirmal - Making Data Simple [Season 3 - Episode 19]

2019-05-15 · Making Data Simple Listen

podcast_episode

by Dinesh Nirmal (IBM Software) , Al Martin (IBM)

AI/ML Analytics Big Data Blockchain Data Analytics IBM

This week on Making Data Simple, Dinesh Nirmal comes on the show to discuss current industry trends. Host Al Martin poses questions that are both technical and leadership oriented. Together, they discuss the new, emerging technologies that drive them while providing their own definitions of team building and success. Listen, engage, react. Give us your feedback and get in on the conversation.

Show Notes

Check us out on: YouTube, Apple Podcasts, Google Play Music, Spotify, TuneIn, Stitcher

00:10 - Connect with Producer Steve Moore on LinkedIn and Twitter.
00:15 - Connect with Producer Liam Seston on LinkedIn and Twitter.
00:20 - Connect with Producer Rachit Sharma on LinkedIn.
00:25 - Connect with Host Al Martin on LinkedIn and Twitter.
01:37 - Connect with Dinesh Nirmal on LinkedIn and Twitter.
06:06 - An interesting read on the state of illegal dumping in rural California.
11:14 - Some examples of successful AI use cases.
14:31 - Learn about blockchain here.
29:06 - Find out how open source is helping remove data silos in the enterprise.
32:40 - Check out IBM's content on big data analytics.

Want to be featured as a guest on Making Data Simple? Reach out to us at [email protected] and tell us why you should be next. The Making Data Simple Podcast is hosted by Al Martin, WW VP Technical Sales, IBM, where we explore trending technologies, business innovation, and leadership ... while keeping it simple & fun.

Managing Database Access Control For Teams With strongDM

2019-01-29 · Data Engineering Podcast Listen

podcast_episode

by Justin McCarthy (StrongDM) , Tobias Macey

Ansible Chef Data Engineering Data Management

Summary

Controlling access to a database is a solved problem… right? It can be straightforward for small teams and a small number of storage engines, but once either or both of those start to scale, things quickly become complex and difficult to manage. After years of running across the same issues in numerous companies and even more projects, Justin McCarthy built strongDM to solve database access management for everyone. In this episode he explains how the strongDM proxy works to grant and audit access to storage systems and the benefits that it provides to engineers and team leads.

Introduction

Hello and welcome to the Data Engineering Podcast, the show about modern data management.
When you’re ready to build your next pipeline, or want to test out the projects you hear about on the show, you’ll need somewhere to deploy it, so check out Linode. With 200Gbit private networking, scalable shared block storage, and a 40Gbit public network, you’ve got everything you need to run a fast, reliable, and bullet-proof data platform. If you need global distribution, they’ve got that covered too with world-wide datacenters including new ones in Toronto and Mumbai. Go to dataengineeringpodcast.com/linode today to get a $20 credit and launch a new server in under a minute.
Go to dataengineeringpodcast.com to subscribe to the show, sign up for the mailing list, read the show notes, and get in touch.
To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media.
Join the community in the new Zulip chat workspace at dataengineeringpodcast.com/chat.
Your host is Tobias Macey and today I’m interviewing Justin McCarthy about StrongDM, a hosted service that simplifies access controls for your data.

Interview

Introduction
How did you get involved in the area of data management?
Can you start by explaining the problem that StrongDM is solving and how the company got started?
What are some of the most common challenges around managing access and authentication for data storage systems?
What are some of the most interesting workarounds that you have seen?
Which areas of authentication, authorization, and auditing are most commonly overlooked or misunderstood?
Can you describe the architecture of your system?
What strategies have you used to enable interfacing with such a wide variety of storage systems?
What additional capabilities do you provide beyond what is natively available in the underlying systems?
What are some of the most difficult aspects of managing varying levels of permission for different roles across the diversity of platforms that you support, given that they each have different capabilities natively?
For a customer who is onboarding, what is involved in setting up your platform to integrate with their systems?
What are some of the assumptions that you made about your problem domain and market when you first started which have been disproven?
How do organizations in different industries react to your product and how do their policies around granting access to data differ?
What are some of the most interesting/unexpected/challenging lessons that you have learned in the process of building and growing StrongDM?

Contact Info

LinkedIn
@justinm on Twitter

Parting Question

From your perspective, what is the biggest gap in the tooling or technology for data management today?

Links

StrongDM
Authentication Vs. Authorization
Hashicorp Vault
Configuration Management
Chef
Puppet
SaltStack
Ansible
Okta
SSO (Single Sign On)
SOC 2
Two Factor Authentication
SSH (Secure SHell)
RDP

The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA

Support Data Engineering Podcast

Metabase Self Service Business Intelligence with Sameer Al-Sakran - Episode 29

2018-04-30 · Data Engineering Podcast Listen

podcast_episode

by Sameer Al-Sakran (Metabase) , Tobias Macey

API BI Data Engineering Data Management Datadog GitHub Hadoop Metabase Python Redash Scala SQL

Summary

Business Intelligence software is often cumbersome and requires specialized knowledge of the tools and data to be able to ask and answer questions about the state of the organization. Metabase is a tool built with the goal of making the act of discovering information and asking questions of an organization's data easy and self-service for non-technical users. In this episode the CEO of Metabase, Sameer Al-Sakran, discusses how and why the project got started, the ways that it can be used to build and share useful reports, some of the useful features planned for future releases, and how to get it set up to start using it in your environment.

Preamble

Hello and welcome to the Data Engineering Podcast, the show about modern data management.
When you’re ready to build your next pipeline you’ll need somewhere to deploy it, so check out Linode. With private networking, shared block storage, node balancers, and a 40Gbit network, all controlled by a brand new API you’ve got everything you need to run a bullet-proof data platform. Go to dataengineeringpodcast.com/linode to get a $20 credit and launch a new server in under a minute.
For complete visibility into the health of your pipeline, including deployment tracking, and powerful alerting driven by machine-learning, DataDog has got you covered. With their monitoring, metrics, and log collection agent, including extensive integrations and distributed tracing, you’ll have everything you need to find and fix performance bottlenecks in no time. Go to dataengineeringpodcast.com/datadog today to start your free 14 day trial and get a sweet new T-Shirt.
Go to dataengineeringpodcast.com to subscribe to the show, sign up for the newsletter, read the show notes, and get in touch.
Your host is Tobias Macey and today I’m interviewing Sameer Al-Sakran about Metabase, a free and open source tool for self service business intelligence.

Interview

Introduction
How did you get involved in the area of data management?
The current goal for most companies is to be “data driven”. How would you define that concept?
How does Metabase assist in that endeavor?
What is the ratio of users that take advantage of the GUI query builder as opposed to writing raw SQL?
What level of complexity is possible with the query builder?
What have you found to be the typical use cases for Metabase in the context of an organization?
How do you manage scaling for large or complex queries?
What was the motivation for using Clojure as the language for implementing Metabase?
What is involved in adding support for a new data source?
What are the differentiating features of Metabase that would lead someone to choose it for their organization?
What have been the most challenging aspects of building and growing Metabase, both from a technical and business perspective?
What do you have planned for the future of Metabase?

Contact Info

Sameer

salsakran on GitHub
@sameer_alsakran on Twitter
LinkedIn

Metabase

Website
@metabase on Twitter
metabase on GitHub

Parting Question

From your perspective, what is the biggest gap in the tooling or technology for data management today?

Links

Expa
Metabase
Blackjet
Hadoop
Imeem
Maslow’s Hierarchy of Data Needs
2 Sided Marketplace
Honeycomb Interview
Excel
Tableau
Go-JEK
Clojure
React
Python
Scala
JVM
Redash
How To Lie With Data
Stripe
Braintree Payments

The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA

Support Data Engineering Podcast
