talk-data.com

Topic: Cloud Computing
Tags: infrastructure, saas, iaas
4055 activities tagged

Activity Trend: peak of 471 activities per quarter (2020-Q1 through 2026-Q1)

Activities: 4055 · Newest first

Graph-Powered Analytics and Machine Learning with TigerGraph

With the rapid rise of graph databases, organizations are now implementing advanced analytics and machine learning solutions to help drive business outcomes. This practical guide shows data scientists, data engineers, architects, and business analysts how to get started with TigerGraph, one of the leading graph databases available. You'll explore a three-stage approach to deriving value from connected data: connect, analyze, and learn. Victor Lee, Phuc Kien Nguyen, and Alexander Thomas present real use cases covering several contemporary business needs. By diving into hands-on exercises using TigerGraph Cloud, you'll quickly become proficient at designing and managing advanced analytics and machine learning solutions for your organization.

- Use graph thinking to connect, analyze, and learn from data for advanced analytics and machine learning
- Learn how graph analytics and machine learning can deliver key business insights and outcomes
- Use five core categories of graph algorithms to drive advanced analytics and machine learning
- Deliver a real-time 360-degree view of core business entities, including customer, product, service, supplier, and citizen
- Discover insights from connected data through machine learning and advanced analytics
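
To make the "connect, analyze" stages concrete, here is a minimal sketch (not from the book) using pyTigerGraph, the community Python driver for TigerGraph; the host, graph name, secret, and query name are placeholders.

```python
# Hedged sketch: connect to a TigerGraph Cloud instance and run an installed
# GSQL query. Host, graph, secret, and query/parameter names are hypothetical.
import pyTigerGraph as tg

conn = tg.TigerGraphConnection(
    host="https://your-instance.i.tgcloud.io",  # placeholder TigerGraph Cloud host
    graphname="CustomerGraph",                  # placeholder graph name
)
conn.apiToken = conn.getToken("your-secret")[0]  # authenticate with a cloud secret

# Analyze: run a previously installed query (e.g. a PageRank-style centrality
# algorithm) and pull the results back into Python for further analysis.
results = conn.runInstalledQuery(
    "tg_pagerank", params={"v_type": "Customer", "e_type": "PURCHASED"}
)
print(results[0])
```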

Pro Power BI Architecture: Development, Deployment, Sharing, and Security for Microsoft Power BI Solutions

This book provides detailed guidance around architecting and deploying Power BI reporting solutions, including help and best practices for sharing and security. You’ll find chapters on dataflows, shared datasets, composite model and DirectQuery connections to Power BI datasets, deployment pipelines, XMLA endpoints, and many other important features related to the overall Power BI architecture that are new since the first edition. You will gain an understanding of the functionality each of the Power BI components provides (such as Dataflow, Shared Dataset, Datamart, thin reports, and paginated reports), so that you can make an informed decision about which components to use in your solution. You will get to know the pros and cons of each component, and how they all work together within the larger Power BI architecture. Commonly encountered problems you will learn to handle include content unexpectedly changing while users are in the process of creating reports and building analyses, methods of sharing analyses that don’t cover all the requirements of your business or organization, and inconsistent security models. Detailed examples help you to understand and choose from among the different methods available for sharing and securing Power BI content so that only intended recipients can see it. The knowledge provided in this book will allow you to choose an architecture and deployment model that suits the needs of your organization. It will also help ensure that you spend your time not on maintaining your solution, but on using it for its intended purpose: gaining business value from mining and analyzing your organization’s data.

What You Will Learn

- Architect Power BI solutions that are reliable and easy to maintain
- Create development templates and structures in support of reusability
- Set up and configure the Power BI gateway as a bridge between on-premises data sources and the Power BI cloud service
- Select a suitable connection type—Live Connection, DirectQuery, Scheduled Refresh, or Composite Model—for your use case
- Choose the right sharing method for how you are using Power BI in your organization
- Create and manage environments for development, testing, and production
- Secure your data using row-level and object-level security
- Save money by choosing the right licensing plan

Who This Book Is For

Data analysts and developers who are building reporting solutions around Power BI, as well as architects and managers who are responsible for the big picture of how Power BI meshes with an organization’s other systems, including database and data warehouse systems.
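
As one concrete taste of the automation side of such an architecture, here is a hedged sketch of triggering a dataset refresh through the Power BI REST API; the workspace and dataset IDs are placeholders, and the access token would come from an Azure AD app registration (for example via msal).

```python
# Hedged sketch: queue a scheduled refresh for a Power BI dataset via the
# REST API. IDs and token acquisition are placeholders.
import requests

ACCESS_TOKEN = "<AAD access token>"  # placeholder; obtain via MSAL / Azure AD
WORKSPACE_ID = "<workspace guid>"    # placeholder workspace (group) id
DATASET_ID = "<dataset guid>"        # placeholder dataset id

url = (
    f"https://api.powerbi.com/v1.0/myorg/groups/{WORKSPACE_ID}"
    f"/datasets/{DATASET_ID}/refreshes"
)
resp = requests.post(url, headers={"Authorization": f"Bearer {ACCESS_TOKEN}"})
resp.raise_for_status()  # 202 Accepted means the refresh was queued
```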

Anna Nerezova is a Digital Marketing and Cloud Transformation Consultant with 15 years of experience in data, analytics, and optimization. She has built innovative solutions using Google Cloud to solve problems in the media and entertainment, tech, health, and non-profit industries. Anna is a Google Cloud Engineer Scholar and is on the 100 Women in Analytics list by Google Analytics. She is also a Google Developer Group NYC lead, a mentor for the Northeast, and a proud Women Techmakers Ambassador.

Anna is passionate about using the latest AI/ML technologies to provide solutions that drive growth and a better experience for all users. She is a strong advocate for diversity and inclusion, and is committed to helping underrepresented groups become entrepreneurs and financially independent.

Live from the Lakehouse: Data sharing, Databricks marketplace, and Fivetran & cloud data platforms

Hear from two guests. First, Zaheera Valani (Sr. Director of Engineering at Databricks) discusses data sharing and the Databricks marketplace. Second, Taylor Brown (COO and co-founder of Fivetran) discusses cloud data platforms and automating data ingestion from thousands of disparate data sources, including how Fivetran and Databricks partner. Hosted by Holly Smith (Sr. Resident Solutions Architect, Databricks) and Jimmy Obeyeni (Strategic Account Executive, Databricks).


Today I’m chatting with Peter Everill, Head of Data Products for Analytics and ML Designs at the UK grocery brand Sainsbury’s. Peter is also a founding member of the Data Product Leadership Community. He shares why his team spends so much time conducting discovery work with users, and how that leads to higher adoption and, in turn, business value. Peter also gives us his in-depth definition of a data product, including the three components of a data product and the four types of data products he’s encountered, along with the 8-step product management methodology his team uses to develop data products that truly deliver value to end users. Finally, Peter names the #1 resource he would invest in right now to make things better for his team and their work.

Highlights / Skip to:

- I introduce Peter, who I met through the Data Product Leadership Community (00:37)
- What the data team structure at Sainsbury’s looks like and how Peter wound up working there (01:54)
- Peter shares the 8-step product management methodology that has been developed by his team and where in that process he spends most of his time (04:54)
- How involved the users are in Peter’s process when it comes to developing data products (06:13)
- How Peter was able to ensure that enough time is taken on discovery throughout the design process (10:03)
- Who on Peter’s team is doing the core user research for product development (14:52)
- Peter shares the three things that he feels make data product teams successful (17:09)
- How Peter defines a data product, including the three components of a data product and the four types of data products (18:34)
- Peter and I discuss the importance of spending time in discovery (24:25)
- Peter explains why he measures reach and impact as metrics of success when looking at implementation (26:18)
- How Peter solves for the gap when handing off a product to the end users to implement and adopt (29:20)
- How Peter hires for data product management roles and what he looks for in a candidate (33:31)
- Peter talks about what roles or skills he’d be looking for if he was to add a new person to his team (37:26)

Quotes from Today’s Episode

“I’m a big believer that the majority of analytics in its simplest form is improving business processes and decisions. A big part of our discovery work is that we align to business areas, business divisions, or business processes, and we spend time in that discovery space actually mapping the business process. What is the goal of this process? Ultimately, how does it support the P&L?” — Peter Everill (12:29)

“There’s three things that are successful for any organization that will make this work and make it stick. The first is defining what you mean by a data product. The second is the role of a data product manager in the organization and really being clear what it is that they do and what they don’t do. … And the third thing is their methodology, from discovery through to delivery. The more work you put upfront defining those and getting everyone trained and clear on that, I think the quicker you’ll get to an organization that’s really clear about what it’s delivering, how it delivers, and who does what.” – Peter Everill (17:31)

“The important way that data and analytics can help an organization firstly is, understanding how that organization is performing. And essentially, performance is how well processes and decisions within the organization are being executed, and the impact that has on the P&L.” – Peter Everill (20:24)

“The great majority of organizations don’t allocate that percentage [20-25%] of time to discovery; they are jumping straight into solution. And also, this is where organizations typically then actually just migrate what already exists from, maybe, legacy service into a shiny new cloud platform, which might be good from a defensive data strategy point of view, but doesn’t offer new net value—apart from speed, security and et cetera of the cloud. Ultimately, this is why analytics organizations aren’t generally delivering value to organizations.” – Peter Everill (25:37)

“The only time that value is delivered, is from a user taking action. So, the two metrics that we really focus on with all four data products [are] reach [and impact].” – Peter Everill (27:44)

“In terms of benefits realization, that is owned by the business unit. Because ultimately, you’re asking them to take the action. And if they do, it’s their part of the P&L that’s improving because they own the business, they own the performance. So, you really need to get them engaged on the release, and for them to have the superusers, the champions of the product, and be driving voice of the release just as much as the product team.” – Peter Everill (30:30)

On hiring DPMs: “Are [candidates] showing the aptitude, do they understand what the role is, rather than the experience? I think data and analytics and machine learning product management is a relatively new role. You can’t go on LinkedIn necessarily, and be exhausted with a number of candidates that have got years and years of data and analytics product management.” – Peter Everill (36:40)

Links: LinkedIn: https://www.linkedin.com/in/petereverill/

On today’s episode, we’re joined by Ben Sebree, Senior Vice President of R&D, CivicPlus, a technology company committed to empowering governments to efficiently operate, serve, and govern through the use of innovative and integrated technology solutions.

We talk about:

- Helping local governments to be more efficient in the modern cloud age
- Measuring the potential impact & user adoption of new tech offerings
- Digital initiatives, including AI, that cities can undertake to most benefit residents
- How governments can build trust by providing more ways for citizens to interact with government

Demand Forecasting Best Practices

Lead your demand planning process to excellence and deliver real value to your supply chain. In Demand Forecasting Best Practices you’ll learn how to:

- Lead your team to improve quality while reducing workload
- Properly define the objectives and granularity of your demand planning
- Use intelligent KPIs to track accuracy and bias
- Identify areas for process improvement
- Help planners and stakeholders add value
- Determine relevant data to collect and how best to collect it
- Utilize different statistical and machine learning models

An expert demand forecaster can help an organization avoid overproduction, reduce waste, and optimize inventory levels for a real competitive advantage. Demand Forecasting Best Practices teaches you how to become that virtuoso demand forecaster. This one-of-a-kind guide reveals forecasting tools, metrics, models, and stakeholder management techniques for delivering more effective supply chains. Everything you learn has been proven and tested in a live business environment. Discover author Nicolas Vandeput’s original five-step framework for demand planning excellence and learn how to tailor it to your own company’s needs. Illustrations and real-world examples make each concept easy to understand and easy to follow. You’ll soon be delivering accurate predictions that drive major business value.

What's Inside

- Enhance forecasting quality while reducing team workload
- Utilize intelligent KPIs to track accuracy and bias
- Identify process areas for improvement
- Assist stakeholders in sales, marketing, and finance
- Optimize statistical and machine learning models

About the Reader: For demand planners, sales and operations managers, supply chain leaders, and data scientists.

About the Author: Nicolas Vandeput is a supply chain data scientist, founder of the consultancy SupChains in 2016, and a teacher at CentraleSupélec, France.

Quotes

- “This new book continues to push the FVA mindset, illustrating practices that drive the efficiency and effectiveness of the business forecasting process.” - Michael Gilliland, Editor-in-Chief, Foresight: Journal of Applied Forecasting
- “A must-read for any SCM professional, data scientist, or business owner. It's practical, accessible, and packed with valuable insights.” - Edouard Thieuleux, Founder of AbcSupplyChain
- “An exceptional resource that covers everything from basic forecasting principles to advanced forecasting techniques using artificial intelligence and machine learning. The writing style is engaging, making complex concepts accessible to both beginners and experts.” - Daniel Stanton, Mr. Supply Chain®
- “Nicolas did it again! Demand Forecasting Best Practices provides practical and actionable advice for improving the demand planning process.” - Professor Spyros Makridakis, The Makridakis Open Forecasting Center, Institute For the Future (IFF), University of Nicosia
- “This book is now my companion on all of our planning and forecasting projects. A perfect foundation for implementation and also to recommend process improvements.” - Werner Nindl, Chief Architect – CPM Practice Director, Pivotal Drive
- “This author understands the nuances of forecasting, and is able to explain them well.” - Burhan Ul Haq, Director of Products, Enablers
- “Both broader and deeper than I expected.” - Maxim Volgin, Quantitative Marketing Manager, KLM
- “Great book with actionable insights.” - Simon Tschöke, Head of Research, German Edge Cloud
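
As a small illustration of the "intelligent KPIs to track accuracy and bias" the blurb mentions, here is a hedged Python sketch of two common demand planning metrics, WMAPE and relative bias; the numbers are made up.

```python
# Illustration only (not from the book): two demand planning KPIs.
import numpy as np

def wmape(actual: np.ndarray, forecast: np.ndarray) -> float:
    """Weighted MAPE: total absolute error divided by total demand."""
    return np.abs(actual - forecast).sum() / actual.sum()

def bias(actual: np.ndarray, forecast: np.ndarray) -> float:
    """Relative bias: positive means over-forecasting on average."""
    return (forecast - actual).sum() / actual.sum()

actual = np.array([100, 120, 80, 150])    # made-up demand history
forecast = np.array([110, 115, 95, 140])  # made-up forecasts
print(f"WMAPE: {wmape(actual, forecast):.1%}, bias: {bias(actual, forecast):+.1%}")
```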

Apache Airflow is scalable, dynamic, extensible, and elegant, but can it be a lot more? We have taken Airflow to the next level, using it as a hybrid cloud data service that accelerates our transformation. In this talk we will present our implementation of Airflow as an orchestration solution spanning legacy, private, and public cloud (AWS/Azure):

- A comparison of public and private cloud offers
- Harnessing the power of a hybrid cloud orchestrator to meet regulatory requirements (European financial institutions)
- Real production use cases
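
As a hedged sketch of the pattern the talk describes, the toy DAG below fans one extract out to both public clouds using standard Airflow provider hooks; the connection IDs, bucket, and container names are placeholders, not the presenters' actual setup.

```python
# Hedged sketch: one Airflow DAG pushing a dataset to AWS and Azure in parallel.
from datetime import datetime
from airflow import DAG
from airflow.operators.python import PythonOperator
from airflow.providers.amazon.aws.hooks.s3 import S3Hook
from airflow.providers.microsoft.azure.hooks.wasb import WasbHook

def push_to_aws():
    # Placeholder connection id, bucket, and payload.
    S3Hook(aws_conn_id="aws_default").load_string(
        "payload", key="exports/report.csv", bucket_name="my-bucket")

def push_to_azure():
    # Placeholder connection id, container, and payload.
    WasbHook(wasb_conn_id="wasb_default").load_string(
        "payload", container_name="exports", blob_name="report.csv")

with DAG("hybrid_cloud_export", start_date=datetime(2023, 1, 1),
         schedule="@daily", catchup=False) as dag:
    extract = PythonOperator(task_id="extract_from_legacy",
                             python_callable=lambda: print("pull from legacy"))
    extract >> [
        PythonOperator(task_id="to_aws", python_callable=push_to_aws),
        PythonOperator(task_id="to_azure", python_callable=push_to_azure),
    ]
```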

Discover the transformation of Airflow at GoDaddy: from its initial deployment on-premises to its migration to the cloud, and finally to a single-pane orchestration model. This evolution has streamlined our data platform and improved governance. Our experience will be beneficial for anyone seeking to optimize their Airflow implementation and simplify their orchestration processes. Topics include:

- History and use cases
- Design, organization decisions, and governance: examining the decision-making process and governance structure
- Migration to cloud: the process of transitioning Airflow from on-premises to the cloud
- Data processing engines used with Airflow
- Challenges: obstacles faced during and after migration and how they were overcome
- Demonstrating how Airflow can be integrated with a central Glue Catalog and a Data Lake Mesh model
- Single-pane orchestration (PaaS) and custom reusable GitHub Actions: examining the benefits of a single-pane orchestration model
- Monitoring

In this presentation, we discuss how we built a fully managed workflow orchestration system at Salesforce using Apache Airflow to support dependable data lake infrastructure on the public cloud. We touch on how we utilized Kubernetes for increased scalability and resilience, as well as the most effective approaches for managing and scaling data pipelines. We also talk about how we addressed data security and privacy, multitenancy, and interoperability with other internal systems. We discuss how we use this system to empower users to effortlessly build reliable pipelines that incorporate failure detection, alerting, and monitoring for deep insights, removing the undifferentiated heavy lifting associated with running and managing their own orchestration engines. Lastly, we elaborate on how we integrated our in-house CI/CD pipelines to enable effective DAG and dependency management, further enhancing the system’s capabilities.
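
The failure detection and alerting baked into such a managed platform can be approximated with stock Airflow primitives; the sketch below (an assumption, not Salesforce's implementation) wires retries, a failure callback, and an SLA into every task through default_args.

```python
# Hedged sketch: retries, alerting, and SLA monitoring applied to every task.
from datetime import datetime, timedelta
from airflow import DAG
from airflow.operators.bash import BashOperator

def notify_on_failure(context):
    # In a managed platform this would page/Slack the owning team.
    print(f"Task {context['task_instance'].task_id} failed: {context.get('exception')}")

default_args = {
    "retries": 3,
    "retry_delay": timedelta(minutes=5),
    "on_failure_callback": notify_on_failure,
    "sla": timedelta(hours=1),  # surface late tasks in SLA-miss reports
}

with DAG("managed_pipeline", start_date=datetime(2023, 1, 1),
         schedule="@hourly", catchup=False, default_args=default_args) as dag:
    BashOperator(task_id="load_to_lake", bash_command="echo load")
```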

This session will cover the data lineage capabilities in Apache Airflow, how to use them, and the motivations behind them. It will present the technical know-how of integrating data lineage solutions with Apache Airflow and provisioning DAG metadata to fuel lineage functionality in a way that is transparent to the user, limiting setup friction. It will include Google’s Cloud Composer lineage integration, implemented through Airflow’s current data lineage architecture, and our approach to the lineage evolution strategy.
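
For orientation, here is a hedged sketch of Airflow's built-in lineage primitives that integrations of this kind build on: declaring inlets and outlets on a task so a lineage backend can assemble the dataset graph. The paths are placeholders.

```python
# Hedged sketch: declaring task-level lineage with Airflow's lineage entities.
from datetime import datetime
from airflow import DAG
from airflow.lineage.entities import File
from airflow.operators.bash import BashOperator

raw = File(url="gs://my-bucket/raw/orders.csv")       # placeholder input dataset
curated = File(url="gs://my-bucket/curated/orders/")  # placeholder output dataset

with DAG("lineage_demo", start_date=datetime(2023, 1, 1),
         schedule=None, catchup=False) as dag:
    BashOperator(
        task_id="transform_orders",
        bash_command="echo transform",
        inlets=[raw],        # upstream dataset(s) this task reads
        outlets=[curated],   # downstream dataset(s) this task produces
    )
```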

In 2022, cloud data centres accounted for up to 3.7% of global greenhouse gas emissions, exceeding those of aviation and shipping. Yet in the same year, Britain wasted 4 terawatt-hours of renewable energy because it couldn’t be transported from where it was generated to where it was needed. So why not move the cloud to the clean energy? VertFlow is an Airflow operator that deploys workloads to the greenest Google Cloud data centre, based on the real-time carbon intensity of electricity grids worldwide. At OVO Energy, many of our batch workloads, like generation forecasts, don’t have latency or data residency requirements, so they can run anywhere. We use VertFlow to let them chase the sun to wherever energy is greenest, helping us save carbon on our mission to save carbon. VertFlow is available on PyPI: https://pypi.org/project/VertFlow/ Find out more at https://cloud.google.com/blog/topics/sustainability/ovo-energy-builds-greener-software-with-google-cloud
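
To illustrate the core idea only (this is not VertFlow's actual API), the toy function below picks the Google Cloud region whose grid is currently greenest from a snapshot of real-time carbon intensity readings; the numbers are invented.

```python
# Concept sketch only, not VertFlow's API: choose the lowest-carbon region
# from a mapping of region -> carbon intensity (gCO2e/kWh).
def greenest_region(carbon_intensity: dict[str, float]) -> str:
    """Return the candidate region with the lowest carbon intensity."""
    return min(carbon_intensity, key=carbon_intensity.get)

snapshot = {
    "europe-west1": 120.0,   # invented readings
    "europe-north1": 35.0,
    "us-central1": 410.0,
}
print(greenest_region(snapshot))  # -> "europe-north1"
```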

DAG Authoring: learn how to go beyond the basics and apply best practices when implementing Airflow DAGs. This session is a survival guide for Airflow DAG developers who need to cope with hundreds of Airflow operators. It goes beyond a 101 or “for dummies” session and will be of interest both to those who are just starting to develop Airflow DAGs and to Airflow experts, as it will help them improve their productivity.
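
One example of "beyond the basics" authoring is the TaskFlow API, which replaces manual PythonOperator wiring and explicit XCom pushes with plain function calls; the minimal sketch below is illustrative and not tied to the session's material.

```python
# Illustrative sketch: the TaskFlow API passes data between tasks via
# implicit XCom, removing operator boilerplate.
from datetime import datetime
from airflow.decorators import dag, task

@dag(start_date=datetime(2023, 1, 1), schedule="@daily", catchup=False)
def taskflow_example():
    @task
    def extract() -> list[int]:
        return [1, 2, 3]

    @task
    def load(rows: list[int]) -> None:
        print(f"loaded {len(rows)} rows")

    load(extract())  # dependency and data flow in one call

taskflow_example()
```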

Discover PepsiCo’s dynamic data quality strategy in a multi-cloud landscape. Join me, the Director of Data Engineering, as I unveil our Airflow utilization, custom operator integration, and the power of Great Expectations. Learn how we’ve incorporated Data Mesh principles into our decentralized development for seamless data integration. Explore our journey to maintain quality and elevate data as a strategic asset at PepsiCo.
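
A hedged sketch of what pairing Airflow with Great Expectations can look like via the community provider package; the checkpoint name and context path are placeholders, and this is not PepsiCo's actual operator.

```python
# Hedged sketch: run a Great Expectations checkpoint as an Airflow task using
# the community Great Expectations provider. Paths/names are placeholders.
from datetime import datetime
from airflow import DAG
from great_expectations_provider.operators.great_expectations import (
    GreatExpectationsOperator,
)

with DAG("dq_checks", start_date=datetime(2023, 1, 1),
         schedule="@daily", catchup=False) as dag:
    validate_orders = GreatExpectationsOperator(
        task_id="validate_orders",
        data_context_root_dir="/opt/airflow/great_expectations",  # placeholder
        checkpoint_name="orders_checkpoint",                      # placeholder
    )
```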

Productive cross-team collaboration between data engineers and analysts is the goal of all data teams; however, delivering on that mission can be challenging given the diverse set of skills each group brings. In this talk we present an example of how one team tackled this challenge by creating a flexible, dynamic, and extensible framework using Airflow and cloud services that allowed engineers and analysts to jointly create data-centric micro-services serving up projections and other robust analyses for use across the organization. The framework, which used dynamic DAG generation configured through YAML files, Kubernetes jobs, and dbt transformations, abstracted away many of the details of workflow orchestration, allowing analysts to focus on their Python or R code and data processing logic while enabling data engineers to monitor the pipelines and ensure their scalability.
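
A simplified sketch of that YAML-driven pattern: each config file declares a pipeline and a small factory turns it into a DAG. The file layout and config keys below are illustrative assumptions, not the team's actual schema.

```python
# Hedged sketch: generate one DAG per YAML config file.
import glob
from datetime import datetime
import yaml
from airflow import DAG
from airflow.operators.bash import BashOperator

def build_dag(cfg: dict) -> DAG:
    dag = DAG(cfg["dag_id"], start_date=datetime(2023, 1, 1),
              schedule=cfg.get("schedule", "@daily"), catchup=False)
    prev = None
    for step in cfg["steps"]:  # e.g. a dbt run or an analyst's Python/R job
        op = BashOperator(task_id=step["name"], bash_command=step["command"],
                          dag=dag)
        if prev:
            prev >> op  # chain steps in declared order
        prev = op
    return dag

for path in glob.glob("/opt/airflow/configs/*.yaml"):  # hypothetical location
    with open(path) as f:
        cfg = yaml.safe_load(f)
    globals()[cfg["dag_id"]] = build_dag(cfg)  # register each DAG with Airflow
```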

The ability to create DAGs programmatically opens up new possibilities for collaboration between data science and data engineering. Engineering and DevOps teams are typically incentivized by stability, whereas data science is typically incentivized by fast iteration and experimentation. With Airflow, it becomes possible for engineers to create tools that allow data scientists and analysts to create robust no-code/low-code data pipelines for feature stores. We will discuss Airflow as a means of bridging the gap between data infrastructure and modeling iteration, and examine how a Qbiz customer did just this by creating a tool that allows data scientists to build features, train models, and measure performance, using cloud services, in parallel.
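
One way to realize this parallel, low-code experimentation in plain Airflow is dynamic task mapping, which fans a single training task out over several feature sets; the sketch below is illustrative, not the Qbiz tool itself.

```python
# Illustrative sketch: dynamic task mapping runs one training task per
# feature set in parallel. Names and the training stub are made up.
from datetime import datetime
from airflow.decorators import dag, task

@dag(start_date=datetime(2023, 1, 1), schedule=None, catchup=False)
def parallel_training():
    @task
    def train(feature_set: str) -> str:
        # stand-in for: build features, train a model, log metrics
        return f"model trained on {feature_set}"

    train.expand(feature_set=["recency", "frequency", "monetary"])

parallel_training()
```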

Today, all major cloud service providers and third-party providers include Apache Airflow as a managed service offering in their portfolios. While these cloud-based solutions help with the undifferentiated heavy lifting of environment management, some data teams also want to operate self-managed Airflow instances to satisfy specific differentiated capabilities. In this session, we will talk about:

- Why you might need to run self-managed Airflow
- The available deployment options (with emphasis on Airflow on Kubernetes)
- How to deploy Airflow on Kubernetes using automation (Helm charts and Terraform)
- Developer experience (syncing DAGs using automation)
- Operator experience (observability)
- Owned responsibilities and tradeoffs

A thorough understanding of these topics will give you an end-to-end perspective on operating a highly available and scalable self-managed Airflow environment that meets your ever-growing workflow needs.

Reliability is a complex and important topic, and I will focus on both its definition and best practices. I will begin by reviewing the Apache Airflow components that impact reliability, then examine those aspects in turn, showing the single points of failure, mitigations, and tradeoffs. The journey starts with the scheduling process: I will focus on the aspects of Scheduler infrastructure and configuration that address reliability improvements. The Scheduler doesn’t run in a vacuum, so I’ll share my observations on the reliability of its underlying infrastructure. We recommend that tasks be idempotent, but that is not always possible. I will share the challenges of running user code in the distributed architecture of Cloud Composer. I will refer to the volatility of some cloud resources and mitigation methods in various scenarios. Deferrability plays an important part in reliability, but there are also other elements we shouldn’t ignore.
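
The idempotency recommendation can be made concrete with a small sketch (my own illustration, not from the session): key every write to the run's logical date and overwrite the partition, so a retried or re-run task produces the same result instead of duplicating data.

```python
# Illustrative sketch: an idempotent daily load keyed to the logical date.
from datetime import datetime
from airflow.decorators import dag, task

@dag(start_date=datetime(2023, 1, 1), schedule="@daily", catchup=False)
def idempotent_load():
    @task
    def load_partition(ds: str = None):
        # Airflow injects ds (the run's logical date); the same input always
        # maps to the same output partition, which is replaced wholesale.
        target = f"s3://my-bucket/events/dt={ds}/"  # hypothetical path
        print(f"overwrite partition {target}")

    load_partition()

idempotent_load()
```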

A Practical Guide to SAP Integration Suite: SAP’s Cloud Middleware and Integration Solution

This book covers the basics of SAP’s Integration Suite, including a broad overview of its capabilities, installation, and real-life examples that illustrate how it can be used to integrate, develop, administer, and monitor applications in the cloud. As you progress through the book, you will see how SAP Integration Suite works as an open, enterprise-grade, fully vendor-managed, multi-cloud platform that will help you expedite your SAP and third-party integration scenarios. The entire value chain is explored in detail, including usage of APIs and runtime control. Author Jaspreet Bagga demonstrates how SAP’s prebuilt integration packages facilitate quicker, more comprehensive integrations, and how they support a variety of integration patterns. You’ll learn how to leverage the platform to enable seamless connectivity between cloud and on-premises applications, develop custom scenarios, integrate master data, and blend business-to-business (B2B) and electronic data interchange (EDI) processes, including trading partner management. Also covered are business-to-government (B2G) scenarios, orchestrating data and pipelines, and event-driven integration. Upon completing this book, you will have a thorough understanding of why SAP Integration Suite is the middleware at the heart of SAP’s integration strategy, and you will be able to use it effectively in your own integration scenarios.

What You Will Learn

- Understand SAP Integration Suite and its core capabilities
- Know how integration technologies, such as architecture and supplementary intelligent technologies, work within the SAP Integration Suite
- Discover services for pre-packaged accelerators: SAP API Management, the Integration Advisor, and the SAP API Business Hub
- Utilize integration features to link your on-premises or cloud-based systems
- Understand the capabilities of the newly released Migration Assessment

Who This Book Is For

Web developers and application leads who want to learn SAP Integration Suite.