talk-data.com

Topic: Software as a Service (SaaS)

Tags: cloud_computing · software_delivery · subscription

310 tagged activities

Activity Trend: peak of 23 activities per quarter (2020-Q1 to 2026-Q1)

Activities

310 activities · Newest first

Summary

Regardless of how data is being used, it is critical that the information can be trusted. The practice of data reliability engineering has gained momentum recently to address that need. To help support the efforts of data teams, the folks at Soda Data created the Soda Checks Language (SodaCL) and the corresponding Soda Core utility that acts on this new DSL. In this episode Tom Baeyens explains their reasons for creating a new syntax for expressing and validating checks for data assets and processes, as well as how to incorporate it into your own projects.
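For readers unfamiliar with the DSL discussed in the episode: SodaCL expresses data checks as YAML. A representative snippet might look like the following; the dataset and column names are hypothetical, and the thresholds are illustrative rather than taken from the episode:

```yaml
# Hypothetical dataset and columns; thresholds are illustrative.
checks for dim_customer:
  - row_count > 0                      # table must not be empty
  - missing_count(email) = 0           # every customer needs an email
  - duplicate_count(customer_id) = 0   # primary key must be unique
  - freshness(updated_at) < 1d         # data loaded within the last day
```

Soda Core evaluates checks like these against the warehouse and reports each one as pass, warn, or fail.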

Announcements

Hello and welcome to the Data Engineering Podcast, the show about modern data management.

When you’re ready to build your next pipeline, or want to test out the projects you hear about on the show, you’ll need somewhere to deploy it, so check out our friends at Linode. With their new managed database service you can launch a production-ready MySQL, Postgres, or MongoDB cluster in minutes, with automated backups, 40 Gbps connections from your application hosts, and high-throughput SSDs. Go to dataengineeringpodcast.com/linode today and get a $100 credit to launch a database, create a Kubernetes cluster, or take advantage of all of their other services. And don’t forget to thank them for their continued support of this show!

Atlan is the metadata hub for your data ecosystem. Instead of locking your metadata into a new silo, unleash its transformative potential with Atlan’s active metadata capabilities. Push information about data freshness and quality to your business intelligence, automatically scale up and down your warehouse based on usage patterns, and let the bots answer those questions in Slack so that the humans can focus on delivering real value. Go to dataengineeringpodcast.com/atlan today to learn more about how Atlan’s active metadata platform is helping pioneering data teams like Postman, Plaid, WeWork & Unilever achieve extraordinary things with metadata and escape the chaos.

Prefect is the modern Dataflow Automation platform for the modern data stack, empowering data practitioners to build, run, and monitor robust pipelines at scale. Guided by the principle that the orchestrator shouldn’t get in your way, Prefect is the only tool of its kind to offer the flexibility to write code as workflows. Prefect specializes in gluing together the disparate pieces of a pipeline, and integrating with modern distributed compute libraries to bring power where you need it, when you need it. Trusted by thousands of organizations and supported by over 20,000 community members, Prefect powers over 100MM business-critical tasks a month. For more information on Prefect, visit dataengineeringpodcast.com/prefect.

Data engineers don’t enjoy writing, maintaining, and modifying ETL pipelines all day, every day, especially once they realize 90% of all major data sources like Google Analytics, Salesforce, Adwords, Facebook, Spreadsheets, etc., are already available as plug-and-play connectors with reliable, intuitive SaaS solutions. Hevo Data is a highly reliable and intuitive data pipeline platform used by data engineers from 40+ countries to set up and run low-latency ELT pipelines with zero maintenance. Boasting more than 150 out-of-the-box connectors that can be set up in minutes, Hevo also allows you to monitor and control your pipelines. You get: real-time data flow visibility, fail-safe mechanisms, and alerts if anything breaks; preload transformations and auto-schema mapping to precisely control how data lands in your destination; models and workflows to transform data for analytics; and reverse-ETL capability to move the transformed data back to your business software to inspire timely action. All of this, plus its transparent pricing and 24/7 live support, makes it consistently voted by users as the Leader in the Data Pipeline category on review platforms like G2. Go to dataengineeringpodcast.com/hevodata and sign up for a free 14-day trial.

In today’s episode we’re talking to Dvir Shapira. Dvir is Chief Product Officer at Venn LocalZone, a company that’s creating a secure workspace for remote work.

We talk about:

…and much more.

This episode is brought to you by Qrvey

The tools you need to take action with your data, on a platform built for maximum scalability, security, and cost efficiencies. If you’re ready to reduce complexity and dramatically lower costs, contact us today at qrvey.com.

Qrvey, the modern no-code analytics solution for SaaS companies on AWS.

#saas #analytics #AWS #BI

Summary

There is a constant tension in business data between growing silos and breaking them down. Even when a tool is designed to integrate information as a guard against data isolation, it can easily become a silo of its own, where you have to make a point of using it to seek out information. In order to help distribute critical context about data assets and their status into the locations where work is being done, Nicholas Freund co-founded Workstream. In this episode he discusses the challenge of maintaining shared visibility and understanding of data work across the various stakeholders, and his efforts to make it a seamless experience.


In today’s episode, we’re joined by Joyce Durst. Joyce is the CEO and Co-Founder of Growth Acceleration Partners (GAP), a strategic software delivery partner based in Austin, Texas.

We talk about:


In today’s episode, we’re talking to Rick Spencer. Rick is VP of Product at InfluxData, a platform to help developers build time series-based applications quickly and at scale.

We talk about Rick’s background, how InfluxData got started, and the kinds of problems it solves today. Rick describes the differences between building a product for developers and one for non-developers.

We go on to discuss the difference between a time series database and a regular database, the benefits of a time series database, the idea of data gravity, and the interaction between engineering and product teams.

In today’s episode, we’re talking to Mariano Jurich, Project and Product Manager at Making Sense — a platform for developing game-changing software solutions.

We talk about the history of Making Sense and what the company is working on today, how to recognize when a new customer is a good fit, understanding the many types of users that use your product over time, and the importance of focusing on user experience.

We go on to discuss the reasons why a company with a strong product-market fit might still struggle to achieve success, how remote work could shape the future of software development, and how the software industry in Latin America specifically looks set to change.

Finally, we talk about how the growth of Web3 will impact software development, lead to greater democratization, and drive a more globalized world.

In today’s episode, we’re talking to Baskar Agneeswaran, CEO and Co-Founder at Vajro, a cloud-based mobile commerce platform for building high-converting mobile apps for online stores.

We talk about Baskar’s and Vajro’s background, how it’s possible to build an app within 60 minutes, and how smartphones might evolve over the next 15 years. Baskar also shares the marketing and sales models used by his company.

We discuss the difference between companies that have sales as a growth engine and those that have marketing as a growth engine, and how to strike a balance between these and the product itself. Baskar shares some of the lessons he’s learned scaling a company like Vajro. He also explains the different phases of growth.

Finally, we talk about the importance of diversity and how to balance that with alignment around a core mission.

In this episode, we’re talking to Kaj van de Loo, Chief Technology Officer at UserTesting.

We talk about the company’s history and the problems it solves, the way Agile development methodologies have evolved, the different types of Agile development, and the differences between B2B and B2C software.

Kaj talks about some of the best ways for companies to understand users and how to analyze data like web traffic, the growing importance of personalization in user experience, the best time to add product management to a team, and more.

Finally, we talk about the ideal ratio of QAs to developers and whether being a CTO makes someone a better CEO.


In this episode, we’re talking to Brook Lovatt, Chief Executive Officer at Cloudentity. Cloudentity is a company that provides application and security teams with a better way to automate and control how information is shared over APIs.

We talk about the problems Cloudentity solves and how it came to be, along with the options available to today’s SaaS companies when it comes to building a security authorization layer. Brook shares some of the positive impacts of facilitating data sharing.

We discuss the differences between data and API, how SaaS has changed over time, the shift towards more product-oriented CEOs (and the advantages of this as a company scales), and the trend of selling software directly to developers.

Finally, we look at the growing importance of being a product specialist, and what the future holds for SaaS and developers.

In this episode, we’re talking to Ken Babcock, Co-Founder of Tango. Tango is a platform for building beautiful step-by-step how-to guides with screenshots, in seconds.

Ken talks about meeting his co-founders at Harvard Business School and how the project got started, and we go on to discuss how well-defined processes and documentation can make a company much more scalable. How has the pandemic and the rise of remote work affected the need for clear instructions and documentation?

We talk about how SaaS companies can help other businesses transition to the digital world and the role well-documented processes play here. Is there a difference between B2B and B2C SaaS companies when it comes to digital transformation? We also discuss how companies might sometimes grow too fast and hinder progress this way.

Finally, we talk about the pros and cons of VC funding and what the near future holds for Tango.

Welcome to the latest episode of SaaS Scaled. Today we’re joined by Chris Wacker, CEO at Laserfiche, the leading SaaS provider of intelligent content management and business process automation.

We chat about how Laserfiche came into being, how SaaS has changed and impacted business over the years, the impact of Covid, and the impact of widespread digital transformation on the world. Chris shares some of the key principles that make a SaaS team and product successful.

We go on to discuss the difference between short- and long-term thinking with SaaS and how to strike the right balance here, both in SaaS and in business generally. Finally, Chris shares a book that has had a big impact on him.

In today’s episode of SaaS Scaled, we’re talking to Maria Thomas. Maria is Chief Product Officer at Buffer, a SaaS company building a social media and organic marketing platform for small businesses. Maria focuses on the design elements of marketing and engineering.

We chat about the main problems Buffer solves and how it came into being, and Maria talks about the importance of transparency within SaaS companies and the benefits of being a value-driven company.

We go on to discuss the future — how are Web3, decentralization, and other emerging technologies changing the way the internet works and how people monetize their work? Maria talks about vision and how Buffer defines its vision in a more narrow sense.

Serverless Kafka and Apache Spark in a Multi-Cloud Data Lakehouse Architecture

Apache Kafka in conjunction with Apache Spark has become the de facto standard for processing and analyzing data. Both frameworks are open, flexible, and scalable. Unfortunately, that scale makes operations a challenge for many teams. Ideally, teams can use serverless SaaS offerings to focus on business logic. However, hybrid and multi-cloud scenarios require a cloud-native platform that provides automated and elastic tooling to reduce the operations burden.

This post explores different architectures for building serverless Kafka and Spark multi-cloud deployments across regions and continents. We start from the analytics perspective of a data lake and explore its relation to a fully integrated data streaming layer with Kafka to build a modern data lakehouse. Real-world use cases show the joint value and explore the benefits of the Delta Lake integration.
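The core of the streaming-to-lakehouse pattern the post describes is idempotent ingestion: each micro-batch from the stream is committed to the table exactly once, even if a failure causes a batch to be replayed. The following is a minimal, library-free Python sketch of that idea (all names are illustrative; a real deployment would use Spark Structured Streaming with a Delta Lake sink, not this code):

```python
# Minimal sketch: exactly-once-style ingestion from a Kafka-like stream
# into a lakehouse table, using a checkpoint of committed offsets.
# Names and structure are illustrative, not taken from the post.

class LakehouseTable:
    """An append-only table plus a checkpoint of committed offsets."""

    def __init__(self):
        self.rows = []        # the "table": appended payloads
        self.committed = {}   # partition -> highest offset ingested

    def ingest(self, partition, records):
        """Append records newer than the checkpoint, then advance it.

        `records` is a list of (offset, payload) pairs. Replaying a
        batch after a failure is safe: offsets at or below the
        checkpoint are skipped, so no row is appended twice.
        """
        last = self.committed.get(partition, -1)
        fresh = [(off, p) for off, p in records if off > last]
        for _off, payload in fresh:
            self.rows.append(payload)
        if fresh:
            self.committed[partition] = max(off for off, _ in fresh)
        return len(fresh)
```

Replaying the same batch a second time appends nothing and returns 0, which is the property that lets a serverless runtime restart consumers freely without corrupting the table.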

Connect with us: Website: https://databricks.com Facebook: https://www.facebook.com/databricksinc Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/data... Instagram: https://www.instagram.com/databricksinc/

Supercharge your SaaS applications with a modern, cloud-native database

Today’s world demands modern applications that process data at faster speeds and deliver real-time insights. Yet the challenge for most businesses is that their data infrastructure isn’t designed for data intensity — the idea that high volumes of data should be quickly ingested and processed, no matter how complex or diverse the data sets. How do you meet the demands of a data-intensive application? It starts with the right database. This session gives you a roadmap with key criteria for powering modern, data-intensive applications with a cloud-native database — and shows how three customers drove up to 100x better performance for their applications.


Welcome to the latest episode of SaaS Scaled, where we’re talking to Patrick Parker, CEO of SaaS Partners — a company aimed at helping people build and scale their own SaaS businesses.

Patrick talks about his experience in consulting and software security, and what ultimately led him to start building SaaS Partners. We address why new startups so often fail and discuss some of the key challenges they need to overcome. What are some simple things new startup founders can do to boost their chances of success?

We discuss the importance of focusing on one key problem at a time, and Patrick talks about the value of reusing and copying certain successful models of building a startup. We also dive into the impact of emerging technologies on the SaaS space such as AI, machine learning, and the metaverse.

Finally, we talk about Web3, cryptocurrency, and blockchain, what this growing trend means for SaaS and the world as a whole, and how long it will take to become truly mainstream.

For most ML-based SaaS companies, the need to meet each customer’s KPIs is usually addressed by training a dedicated model per customer. Along with the benefits of optimizing each model’s performance, a model-per-customer solution carries heavy production complexity with it. Incorporating up-to-date data, as well as new features and capabilities, into a model’s retraining process can become a major production bottleneck. In this talk, we will see how Riskified scaled up modeling operations based on MLOps ideas, and focus on how we used Airflow as our ML pipeline orchestrator. We will dive into how we wrap Airflow as an internal service, the goals we started with, the obstacles along the way, and finally how we solved them. You will receive tools for setting up your own Airflow-based continuous-training ML pipeline, and see how we adjusted it so that ML engineers and data scientists can collaborate and work in parallel using the same pipeline.
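The model-per-customer retraining flow described above can be sketched without Airflow itself. This hypothetical pure-Python version mirrors the shape of such a pipeline — extract, train, evaluate, deploy, repeated per customer — where each function stands in for a task an orchestrator like Airflow would schedule (the "model" here is a deliberately trivial mean predictor, not Riskified's actual approach):

```python
# Hypothetical sketch of a per-customer continuous-training pipeline.
# In production these steps would be orchestrator tasks in a shared
# pipeline, parameterized by customer; here they are plain functions
# so the control flow is easy to see.

def extract(warehouse, customer):
    """Pull the latest labeled samples for one customer."""
    return warehouse[customer]

def train(samples):
    """Toy 'model': predict the mean label seen in training data."""
    labels = [label for _, label in samples]
    return sum(labels) / len(labels)

def evaluate(model, samples):
    """Mean absolute error of the toy model on the given samples.

    A real pipeline would evaluate on held-out data; reusing the
    training samples keeps this sketch short.
    """
    return sum(abs(model - label) for _, label in samples) / len(samples)

def retrain_all(warehouse, registry, max_error=0.5):
    """One scheduled run: retrain every customer's model, deploy if good."""
    for customer in warehouse:
        samples = extract(warehouse, customer)
        model = train(samples)
        if evaluate(model, samples) <= max_error:
            registry[customer] = model  # "deploy" the new model
    return registry
```

Because every customer shares the same step functions, adding a feature or fixing a bug in one place propagates to all customers on the next scheduled run — the property that makes a single shared pipeline preferable to one hand-maintained pipeline per model.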

In this episode of SaaS Scaled, we’re talking to Joe Keehnast. He’s the VP of Product at RevenueWell, a company that gives group dental practices and DSOs the tools they need to align marketing and operations across multiple locations.

We chat about Joe’s experience and history and dive into how RevenueWell works and some of the main problems it solves. We also cover some of the challenges involved in building a product like RevenueWell and the pros and cons of building, buying, and renting software.

Joe shares his thoughts on the most effective way to keep the people in your business consistently aligned and on the same page. How do you deal with the feedback you get from different areas like sales, marketing, and existing customers when it comes to product management?

Finally, we talk about the importance of customer conversations when it comes to making product decisions, and how to define a clear and useful product vision.

Summary

Unstructured data takes many forms in an organization. From a data engineering perspective that often means things like JSON files, audio or video recordings, images, etc. Another category of unstructured data that every business deals with is PDFs, Word documents, workstation backups, and countless other types of information. Aparavi was created to tame the sprawl of information across machines, datacenters, and clouds so that you can reduce the amount of duplicate data and save time and money on managing your data assets. In this episode Rod Christensen shares the story behind Aparavi and how you can use it to cut costs and gain value for the long tail of your unstructured data.

Announcements

Hello and welcome to the Data Engineering Podcast, the show about modern data management.

This episode is brought to you by Acryl Data, the company behind DataHub, the leading developer-friendly data catalog for the modern data stack. Open Source DataHub is running in production at several companies like Peloton, Optum, Udemy, Zynga and others. Acryl Data provides DataHub as an easy-to-consume SaaS product which has been adopted by several companies. Sign up for the SaaS product at dataengineeringpodcast.com/acryl

RudderStack helps you build a customer data platform on your warehouse or data lake. Instead of trapping data in a black box, they enable you to easily collect customer data from the entire stack and build an identity graph on your warehouse, giving you full visibility and control. Their SDKs make event streaming from any app or website easy, and their state-of-the-art reverse-ETL pipelines enable you to send enriched data to any cloud tool. Sign up free — or just get the free t-shirt for being a listener of the Data Engineering Podcast — at dataengineeringpodcast.com/rudder.

Data teams are increasingly under pressure to deliver. According to a recent survey by Ascend.io, 95% in fact reported being at or over capacity. With 72% of data experts reporting demands on their team going up faster than they can hire, it’s no surprise they are increasingly turning to automation. In fact, while only 3.5% report having current investments in automation, 85% of data teams plan on investing in automation in the next 12 months. That’s where our friends at Ascend.io come in. The Ascend Data Automation Cloud provides a unified platform for data ingestion, transformation, orchestration, and observability. Ascend users love its declarative pipelines, powerful SDK, elegant UI, and extensible plug-in architecture, as well as its support for Python, SQL, Scala, and Java. Ascend automates workloads on Snowflake, Databricks, BigQuery, and open source Spark, and can be deployed in AWS, Azure, or GCP. Go to dataengineeringpodcast.com/ascend and sign up for a free trial. If you’re a Data Engineering Podcast listener, you get credits worth $5,000 when you become a customer.

Your host is Tobias Macey and today I’m interviewing Rod Christensen about Aparavi, a platform designed to find and unlock the value of data, no matter where it lives.

Interview

Introduction

How did you get involved in the area of data management?

Can you describe what Aparavi is and the story behind it?

Who are the target customers for Aparavi and how does that inform your product roadmap and messaging?

What are some of th

In today’s episode of SaaS Scaled, we’re talking to Jim Walker, Principal Product Evangelist at Cockroach Labs.

We talk about what Jim’s role involves and why it’s so important. He explains the difference between being a product evangelist and simply selling the product. He also gives us some insight into Cockroach Labs and the problems they solve.

We discuss transactional data and how to strike a balance between centralization and decentralization when it comes to scalability. Jim shares his thoughts on how SaaS companies should approach product marketing and asks, do the time-tested methods of the last couple of decades still work?

Jim dives into serverless technology and how it can help businesses scale in a more optimized and sustainable way. He also looks at how more distributed systems can help build a better world.

Summary

Cloud services have made highly scalable and performant data platforms economical and manageable for data teams. However, they are still challenging to work with and manage for anyone who isn’t in a technical role. Hung Dang understood the need to make data more accessible to the entire organization and created Y42 as a better user experience on top of the "modern data stack". In this episode he shares how he designed the platform to support the full spectrum of technical expertise in an organization and the interesting engineering challenges involved.

Announcements

Hello and welcome to the Data Engineering Podcast, the show about modern data management.

The most important piece of any data project is the data itself, which is why it is critical that your data source is high quality. PostHog is your all-in-one product analytics suite, including product analysis, user funnels, feature flags, and experimentation — and it’s open source, so you can host it yourself or let them do it for you! You have full control over your data, and their plugin system lets you integrate with all of your other data tools, including data warehouses and SaaS platforms. Give it a try today with their generous free tier at dataengineeringpodcast.com/posthog

Your host is Tobias Macey and today I’m interviewing Hung Dang about Y42, the full-stack data platform that anyone can run.

Interview

Introduction

How did you get involved in the area of data management?

Can you describe what Y42 is and the story behind it?

How would you characterize your positioning in the data ecosystem?

What are the problems that you are trying to solve?

Who are the personas that you optimize for and how does that manifest in your product design and feature priorities?

How is the Y42 platform implemented?

What are the core engineering problems that you have had to address in order to tie together the various underlying services that you integrate?

How have the design and goals of the product changed or evolved since you started working on it?

What are the sharp edges and failure conditions that you have had to automate around in order to support non-technical users?

What is the process for integrating Y42 with an organization’s data systems?

What is the story for onboarding from existing systems and importing workflows (e.g. Airflow d