This week on Making Data Simple, join Ajay Kulkarni, CEO and co-founder of TigerData, as we dive into the rapidly evolving world of data. Ajay shares his front-row perspective on the challenges and opportunities of building and scaling time-series databases in an era of AI-driven automation. From the mechanics of managing massive data streams to the bold bets shaping the future of IoT, this conversation goes deep into what’s breaking, what’s working, and what’s next. Whether you’re a data engineer, tech leader, or simply fascinated by the speed of AI innovation, this episode is packed with insights you won’t want to miss.
01:15 Meet AJ Kulkarni 04:29 TigerData 07:16 Timeseries 09:25 Use Cases 11:03 Why Progress? 11:58 Why TigerData 16:05 AI is Everything 21:06 The Fastest Postgres 25:45 Advanced Features 28:53 Future of IOT 36:48 The Future of TigerData 38:03 San Francisco 38:26 A Big Bet 41:06 Good Books
LinkedIn: https://www.linkedin.com/in/ajaykulkarni/
Website: https://tigerdata.com
Want to be featured as a guest on Making Data Simple? Reach out to us at [email protected] and tell us why you should be next. The Making Data Simple Podcast is hosted by Al Martin, WW VP Technical Sales, IBM, where we explore trending technologies, business innovation, and leadership ... while keeping it simple & fun.
In this episode, I talk with Ilya Preston, co-founder and CEO of PAXAFE, a logistics orchestration and decision intelligence platform for temperature-controlled supply chains (aka “cold chain”). Ilya explains how PAXAFE helps companies shipping sensitive products, like pharmaceuticals, vaccines, food, and produce, by delivering end-to-end visibility and actionable insights powered by analytics and AI that reduce product loss, improve efficiency, and support smarter real-time decisions.
Ilya shares the challenges of building a configurable system that works for transportation, planning, and quality teams across industries. We also discuss their product development philosophy, team structure, and use of AI for document processing, diagnostics, and workflow automation.
Highlights/ Skip to:
- Intro to PAXAFE (2:13)
- How PAXAFE brings tons of cold chain data together in one user experience (2:33)
- Innovation in cold chain analytics is up, but so is cold chain product loss. (4:42)
- The product challenge of getting sufficient telemetry data at the right level of specificity to derive useful analytical insights (7:14)
- Why and how PAXAFE pivoted away from providing IoT hardware to collect telemetry (10:23)
- How PAXAFE supports complex customer workflows, cold chain logistics, and complex supply chains (13:57)
- Who the end users of PAXAFE are, and how the product team designs for these users (20:00)
- Pharma loses around $40 billion a year relying on ‘Bob’s intuition’ in the warehouse. How PAXAFE balances institutional user knowledge with the cold hard facts of analytics (42:43)
- Lessons learned when Ilya’s team fell in love with its own product and didn’t listen to the market (23:57)
Quotes from Today’s Episode
"Our initial vision for what PAXAFE would become was 99.9% spot on. The only thing we misjudged was market readiness: we built a product that was a few years ahead of its time." –Ilya
"As an industry, pharma is losing $40 billion worth of product every year because decisions are still based on warehouse intuition about what works and what doesn’t. In production, the problem is even more extreme, with roughly $800 billion lost annually due to temperature issues and excursions." –Ilya
"With our own design, our initial hypothesis and vision for what PAXAFE could be really shaped where we are today. Early on, we had a strong perspective on what our customers needed, and along the way, we fell in love with our own product and design." –Ilya
"We spent months perfecting risk scores… only to hear from customers, ‘I don’t care about a 71 versus a 62, just tell me what to do.’ That single insight changed everything." –Ilya
"If you’re not talking to customers or building a product that supports those conversations, you’re literally wasting time. In the zero-to-product-market-fit phase, nothing else matters, you need to focus entirely on understanding your customers and iterating your product around their needs." –Ilya
"Don’t build anything on day one, probably not on day two, three, or four either. Go out and talk to customers. Focus not on what they think they need, but on their real pain points. Understand their existing workflows and the constraints they face while trying to solve those problems." –Ilya
Links
PAXAFE: https://www.paxafe.com/ LinkedIn for Ilya Preston: https://www.linkedin.com/in/ilyapreston/ LinkedIn for company: https://www.linkedin.com/company/paxafe/
We’re improving DataFramed, and we need your help! We want to hear what you have to say about the show, and how we can make it more enjoyable for you. Find out more here.
Edge computing is poised to transform industries by bringing computation and data storage closer to the source of data generation. This shift unlocks new types of value creation with data & AI and allows for a privacy-first and deeply personalized use of AI on our devices. What will the edge computing transition look like? How do you ensure applications are edge-ready, and what is the role of AI in the transition?
Derek Collison is the founder and CEO at Synadia. He is an industry veteran, entrepreneur and pioneer in large-scale distributed systems and cloud computing. Derek founded Synadia Communications and Apcera, and has held executive positions at Google, VMware, and TIBCO Software. He is also an active angel investor and a technology futurist around Artificial Intelligence, Machine Learning, IOT and Cloud Computing.
Justyna Bak is VP of Marketing at Synadia. Justyna is a versatile executive bridging Marketing, Sales and Product, a spark-plug for innovation at startups and Fortune 100 and a tech expert in Data Analytics and AI, AppDev and Networking. She is an astute influencer, panelist and presenter (Google, HBR) and a respected leader in Silicon Valley and Europe.
In the episode, Richie, Derek, and Justyna explore the transition from cloud to edge computing, the benefits of reduced latency, real-time decision-making in industries like manufacturing and retail, the role of AI at the edge, and the future of edge-native applications, and much more.
Links Mentioned in the Show:
Synadia
Connect with Derek and Justyna
Course: Understanding Cloud Computing
Related Episode: The Database is the Operating System with Mike Stonebraker, CTO & Co-Founder At DBOS
Rewatch sessions from RADAR: Forward Edition
New to DataCamp?
Learn on the go using the DataCamp mobile app
Empower your business with world-class data and AI skills with DataCamp for business
Staying ahead means knowing what’s happening right now, not minutes or hours later. Real-time analytics promises to help teams react faster, make informed choices, and even predict issues before they arise. But implementing these systems is no small feat, and it requires careful alignment between technical capabilities and business needs. How do you ensure that real-time data actually drives impact? And what should organizations consider to make sure their real-time analytics investments lead to tangible benefits?
Zuzanna Stamirowska is the CEO of Pathway.com - the fastest data processing engine on the market which makes real-time intelligence possible. Zuzanna is also the author of the state-of-the-art forecasting model for maritime trade published by the National Academy of Sciences of the USA. While working on this project she saw that the digitization of traditional industries was slowed down by the lack of a software infrastructure capable of doing automated reasoning on top of data streams, in real time. This was the spark to launch Pathway. She holds a Master’s degree in Economics and Public Policy from Sciences Po, Ecole Polytechnique, and ENSAE, as well as a PhD in Complexity Science.
Hélène Stanway is Independent Advisor & Consultant at HMLS Consulting Ltd. Hélène is an award-winning and highly effective insurance leader with a proven track record in emerging technologies, innovation, operations, data, change, and digital transformation. Her passion for actively combining the human element, design, and innovation alongside technology has enabled companies in the global insurance market to embrace change by achieving their desired strategic goals, improving processes, increasing efficiency, and deploying relevant tools. With a special passion for IoT and Sensor Technology, Hélène is a perpetual learner, driven to help delegates succeed.
In the episode, Richie, Zuzanna and Hélène explore real-time analytics, their operational impact, use-cases of real-time analytics across industries, the benefits of adopting real-time analytics, the key roles and stakeholders you need to make that happen, operational challenges, strategies for effective adoption, the future of real-time analytics, common pitfalls, and much more.
Links Mentioned in the Show:
Pathway
Connect with Zuzanna and Hélène
Article: What are digital twins and why do we need them?
Course: Time Series Analysis in Power BI
Related Episode: How Real Time Data Accelerates Business Outcomes with George Trujillo
Sign up to RADAR: Forward Edition
Show Notes
The Data Product Management In Action podcast, brought to you by Soda and executive producer Scott Hirleman, is a platform for data product management practitioners to share insights and experiences. In Season 01, Episode 18, our host Frannie Helforoush is back again interviewing Katy Pusch about her extensive experience in data product management, particularly with decision-support data products. Katy shares her insights on incorporating machine learning and analytics to empower stakeholders in making informed decisions. They both explore team structure, the challenges encountered in product development, and the critical importance of validating products with users to ensure their effectiveness.
About our host Frannie Helforoush: Frannie's journey began as a software engineer and evolved into a strategic product manager. Now, as a data product manager, she leverages her expertise in both fields to create impactful solutions. Frannie thrives on making data accessible and actionable, driving product innovation, and ensuring product thinking is integral to data management. Connect with Frannie on LinkedIn.
About our guest Katy Pusch: Katy brings more than a decade of experience in product management and market strategy, driving market change and adoption of innovative technology solutions. She has successfully built and launched data products, IoT solutions, and SaaS platforms in multiple industries such as healthcare, education, and real estate. She is currently serving as a Sr. Product Line Director at Trintech. With a background in research, she brings data science and market intelligence to every aspect of her work. Katy is passionate about data privacy and tech-ethics, and is pursuing an MS in History and Sociology of Technology and Science at Georgia Tech. When she’s not working with her team to deliver top solutions, Katy enjoys spending time with her husband, building Lego models, and pursuing a private pilot license. Connect with Katy on LinkedIn.
All views and opinions expressed are those of the individuals and do not necessarily reflect their employers or anyone else. Join the conversation on LinkedIn. Apply to be a guest or nominate someone that you know. Do you love what you're listening to? Please rate and review the podcast, and share it with fellow practitioners you know. Your support helps us reach more listeners and continue providing valuable insights!
One of the most annoying conversations about data that happens far too often is: “Can you do an analysis and answer this business problem for me?” “Sure, where’s the data?” “I don’t know. Probably in one of our databases.” At this point more time is spent hunting for data than actually analyzing it. Rather than grumbling about it, it would obviously be more productive to learn how to solve data discoverability issues. What’s the best way to properly document data sets? How can you avoid spending all your time maintaining dashboards that no one actually uses? Shinji Kim is the Founder & CEO of Select Star, an automated data discovery platform that helps you understand your data. Previously, she was the CEO of Concord Systems (concord.io), a NYC-based data infrastructure startup acquired by Akamai Technologies in 2016. She led building Akamai’s new IoT data platform for real-time messaging, log processing, and edge computing. Prior to Concord, Shinji was the first Product Manager hired at Yieldmo, where she led the Ad Format Lab, A/B testing, and yield optimization. Before Yieldmo, she was analyzing data and building enterprise applications at Deloitte Consulting, Facebook, Sun Microsystems, and Barclays Capital. Shinji studied Software Engineering at University of Waterloo and General Management at Stanford GSB. She advises early stage startups on product strategy, customer development, and company building. In the episode, Richie and Shinji explore the importance of data governance, the utilization of data, data quality, challenges in data usage, why documentation matters, metadata and data lineage, improving collaboration between data and business teams, data governance trends to look forward to, and much more. 
Links Mentioned in the Show:
Select Star
Connect with Shinji
[Course] Data Governance Concepts
Related Episode: Making Data Governance Fun with Tiankai Feng, Data Strategy & Data Governance Lead at ThoughtWorks
Rewatch sessions from RADAR: AI Edition
Datatopics is a podcast presented by Kevin Missoorten to talk about the fuzzy and misunderstood concepts in the world of data, analytics, and AI and get to the bottom of things.
In today's episode, a second one on collaborative data ecosystems, we're diving into the world of collaborative intelligence, covering topics like federated learning, swarm learning, Edge AI, and more groundbreaking approaches that are transforming the landscape of machine learning.
Join our expert guests Thomas Huybrechts and Virginie Marelli as we explore the inner workings of this innovative approach. We'll delve into the core concepts of federated learning, including how it enables organizations to leverage the collective knowledge of distributed data while maintaining data privacy and security. We'll also discuss the practical applications of federated learning in various domains, such as healthcare, finance, and IoT, and how it is being used to address real-world challenges.
Datatopics is brought to you by Dataroots
Music: The Gentlemen - DivKid
The thumbnail is generated by Midjourney
On today’s episode, we’re talking to Dylan Barrell, Chief Technology Officer at Deque Systems, Inc, a web accessibility software and services company aimed at giving everyone, regardless of ability, equal access to information, services and applications on the web.
We talk about:
- Dylan’s background and what Deque does.
- The importance of accessibility in software.
- Dylan’s book, “Agile Accessibility Handbook,” and why he wrote it.
- Are there any particular tools to identify accessibility issues in software?
- Countries that are leading the way around SaaS accessibility.
- Advice for smaller, newer SaaS companies to prioritize accessibility.
- How tech trends like AI, the IoT and algorithms have impacted accessibility.
Dylan Barrell - https://www.linkedin.com/in/dylanbarrell/ Deque Systems - https://www.linkedin.com/company/deque-systems-inc/
This episode is brought to you by Qrvey
The tools you need to take action with your data, on a platform built for maximum scalability, security, and cost efficiencies. If you’re ready to reduce complexity and dramatically lower costs, contact us today at qrvey.com.
Qrvey, the modern no-code analytics solution for SaaS companies on AWS.
#saas #analytics #AWS #BI
In today’s episode, we’re talking to Andy Serwatuk, Director of Solutions Architecture at Onix Networking Corp., a Google Cloud Premier Partner enabling companies to effectively leverage the Google Cloud Platform across industries and use cases.
We discuss:
- Andy’s background and how he started at Onix.
- The differences between SaaS and non-SaaS companies.
- Is Google Cloud a no-brainer for SaaS companies today?
- The value of outsourcing tasks to citizens.
- How can SaaS companies learn more about IoT and other emerging trends?
…and much more.
Today I’m chatting with Katy Pusch, Senior Director of Product and Integration for Cox2M. Katy describes the lessons she’s learned around making sure that the “juice is always worth the squeeze” for new users to adopt data solutions into their workflow. She also explains the methodologies she’d recommend to data & analytics professionals to ensure their IOT and data products are widely adopted. Listen in to find out why this former analyst turned data product leader feels it’s crucial to focus on more than just delivering data or AI solutions, and how spending more time upfront performing qualitative research on users can wind up being more efficient in the long run than jumping straight into development.
Highlights/ Skip to:
- What Katy does at Cox2M, and why the data product manager role is so hard to define (01:07)
- Defining the value of the data in workflows and how that’s approached at Cox2M (03:13)
- Who buys from Cox2M and the customer problems that Katy’s product solves (05:57)
- How Katy approaches the zero-to-one process of taking IOT sensor data and turning it into a customer experience that provides a valuable solution (08:00)
- What Katy feels best motivates the adoption of a new solution for users (13:21)
- Katy describes how she spends more time upfront before development to ensure she’s solving the right problems for users (16:13)
- Katy’s views on the importance of data science & analytics pros being able to communicate in the language of their audience (20:47)
- The differences Katy sees between designing data products for sophisticated data users vs a broader audience (24:13)
- The methods Katy uses to effectively perform qualitative research and her triangulation method to surface the real needs of end users (27:29)
- Katy’s views on the most valuable skills for future data product managers (35:24)
Quotes from Today’s Episode
“I’ve had the opportunity to get a little bit closer to our customers than I was in the beginning parts of my tenure here at Cox2M. And it’s just like a SaaS product in the sense that the quality of your data is still dependent on your customers’ workflows and their ability to engage in workflows that supply accurate data. And it’s been a little bit enlightening to realize that the same is true for IoT.” – Katy Pusch (02:11)
“Providing insights to executives that are [simply] interesting is not really very impactful. You want to provide things that are actionable and that drive the business forward.” – Katy Pusch (4:43)
“So, there’s one side of it, which is [the] happy path: figure out a way to embed your product in the customer’s existing workflow. That’s where the most success happens. But in the situation we find ourselves in right now with [this IoT solution], we do have to ask them to change their workflow.” – Katy Pusch (12:46)
“And the way to communicate [the insight to other stakeholders] is not with being more precise with your numbers [or adding] statistics. It’s just to communicate the output of your analysis more clearly to the person who needs to be able to make a decision.” – Katy Pusch (23:15)
“You have to define ‘What decision is my user making on a repeated basis that is worth building something that it does automatically?’ And so, you say, ‘What are the questions that my user needs answers to on a repeated basis?’ … At its essence, you’re answering three or four questions for that user [that] have to be the most important [...] questions for your user to add value. And that can be a difficult thing to derive with confidence.” – Katy Pusch (25:55)
“The piece of workflow [on the IOT side] that’s really impactful there is we’re asking for an even higher degree of change management in that case because we’re asking them to attach this device to their vehicle, and then detach it at a different point in time and there’s a procedure in the solution to allow for that, but someone at the dealership has to engage in that process. So, there’s a change management in the workflow that the juice has to be worth the squeeze to encourage a customer to embark in that journey with you.” – Katy Pusch (12:08)
“Finding people in your organization who have the appetite to be cross-functionally educated, particularly in a data arena, is very important [to] help close some of those communication gaps.” – Katy Pusch (37:03)
Summary
Industrial applications are one of the primary adopters of Internet of Things (IoT) technologies, with business critical operations being informed by data collected across a fleet of sensors. Vopak is a business that manages storage and distribution of a variety of liquids that are critical to the modern world, and they have recently launched a new platform to gain more utility from their industrial sensors. In this episode Mário Pereira shares the system design that he and his team have developed for collecting, managing, and analyzing sensor data, and how they have split the data processing and business logic responsibilities between physical terminals and edge locations on one side and centralized storage and compute on the other.
Announcements
Hello and welcome to the Data Engineering Podcast, the show about modern data management.
When you’re ready to build your next pipeline, or want to test out the projects you hear about on the show, you’ll need somewhere to deploy it, so check out our friends at Linode. With their managed Kubernetes platform it’s now even easier to deploy and scale your workflows, or try out the latest Helm charts from tools like Pulsar and Pachyderm. With simple pricing, fast networking, object storage, and worldwide data centers, you’ve got everything you need to run a bulletproof data platform. Go to dataengineeringpodcast.com/linode today and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!
Atlan is a collaborative workspace for data-driven teams, like Github for engineering or Figma for design teams. By acting as a virtual hub for data assets ranging from tables and dashboards to SQL snippets & code, Atlan enables teams to create a single source of truth for all their data assets, and collaborate across the modern data stack through deep integrations with tools like Snowflake, Slack, Looker and more. Go to dataengineeringpodcast.com/atlan today and sign up for a free trial. If you’re a data engineering podcast listener, you get credits worth $3000 on an annual subscription.
So now your modern data stack is set up. How is everyone going to find the data they need, and understand it? Select Star is a data discovery platform that automatically analyzes & documents your data. For every table in Select Star, you can find out where the data originated, which dashboards are built on top of it, who’s using it in the company, and how they’re using it, all the way down to the SQL queries. Best of all, it’s simple to set up, and easy for both engineering and operations teams to use. With Select Star’s data catalog, a single source of truth for your data is built in minutes, even across thousands of datasets. Try it out for free and double the length of your free trial today at dataengineeringpodcast.com/selectstar. You’ll also get a swag package when you continue on a paid plan.
Your host is Tobias Macey and today I’m interviewing Mário Pereira about building a data management system for globally distributed IoT sensors at Vopak.
Interview
Introduction
How did you get involved in the area of data management?
Can you describe what Vopak is and what kinds of information you rely on to power the business?
What kinds of sensors and edge devices are you using?
What kinds of consistency or variance do you have between sensors across your locations?
How much computing power and storage space do you place at the edge?
What level of pre-processing/filtering is being done at the edge and how do you decide what information needs to be centralized? What are some examples of decision-making that happens at the edge?
Can you describe the platform architecture that you have built for collecting and processing sensor data?
What was your process for selecting and evaluating the various components?
How much tolerance do you have for missed messages/dropped data? How long are your data retention period
Let's talk supply chain efficiency of beer! Federico Crespo, CEO of Valiot, takes us there with supply chain AI that increases Heineken production by 5%. Related within, time to get over Tom Brady. Seriously.
Show Notes 06:19 What's up with supply chain 11:14 Ok, what about Inflation? 12:34 Valiot, the gods of IOT 15:08 Optimizing beer production 21:53 Shifting from IOT to AI 27:32 Valiot's differentiation 31:06 Are we ready for another Covid?
Find Federico: https://www.linkedin.com/in/crespofederico/?originalSubdomain=mx
Find Valiot: https://valiot.io/
Want to be featured as a guest on Making Data Simple? Reach out to us at [[email protected]] and tell us why you should be next.
Abstract Making Data Simple Podcast is hosted by Al Martin, WW VP Account Technical Leader IBM Technology Sales, where we explore trending technologies, business innovation, and leadership ... while keeping it simple & fun.
Abstract On this week's episode of Making Data Simple, we are joined by Rob Thomas, who is General Manager for IBM's Data and AI team. Rob discusses his recent media features on CNN and Fox News, giving his remarks on those experiences. He also gives his predictions on the IT industries for 2020, noting upcoming challenges. We are also treated to a more personal side of Rob, as he reveals his New Year's resolutions and goals. Tune in to find out.
Connect with Rob Twitter LinkedIn Personal Website IBM THINK Blogs
Show Notes 05:41 - Check out this article on effective methods of time management. 12:53 - Forbes agrees, saying AI needs to be implemented the right way. 27:28 - Find out more on IBM Cloud Pak for Data here. 31:14 - Learn more about Watson Internet of Things (IoT) here. 34:58 - Read here on how Google's AI built another AI. 42:22 - Are you also looking to run the New York Marathon? Here are some tips on how to enter.
Connect with the Team Producer Liam Seston - LinkedIn. Producer Lana Cosic - LinkedIn. Producer Meighann Helene - LinkedIn. Producer Mark Simmonds - LinkedIn. Host Al Martin - LinkedIn and Twitter.
Want to be featured as a guest on Making Data Simple? Reach out to us at [email protected] and tell us why you should be next.
IoT has created a tidal wave of data that data-savvy organizations can turn into profitable business solutions. Most IoT data comes from sensors, which are now attached to almost every device imaginable, from factory floor machines and agricultural fields to your cell phone and toothbrush. But IoT is forcing companies to rethink their data architectures to ingest, process, and analyze streaming data in real time.
To help us understand the impact of IoT on data architectures, we invited Dan Graham to our show for a second time. Dan is a former product marketing manager at both IBM and Teradata, renowned for combining deep technical knowledge with industry marketing savvy. During his tenure at those companies, he was responsible for MPP data management systems, data warehouses, and data lakes, and most recently, the Internet of Things.
Summary
The past year has been an active one for the timeseries market. New products have been launched, more businesses have moved to streaming analytics, and the team at Timescale has been keeping busy. In this episode the TimescaleDB CEO Ajay Kulkarni and CTO Michael Freedman stop by to talk about their 1.0 release, how the use cases for timeseries data have proliferated, and how they are continuing to simplify the task of processing your time oriented events.
Introduction
Hello and welcome to the Data Engineering Podcast, the show about modern data management.
When you’re ready to build your next pipeline, or want to test out the projects you hear about on the show, you’ll need somewhere to deploy it, so check out Linode. With 200Gbit private networking, scalable shared block storage, and a 40Gbit public network, you’ve got everything you need to run a fast, reliable, and bullet-proof data platform. If you need global distribution, they’ve got that covered too with world-wide datacenters including new ones in Toronto and Mumbai. Go to dataengineeringpodcast.com/linode today to get a $20 credit and launch a new server in under a minute.
Go to dataengineeringpodcast.com to subscribe to the show, sign up for the mailing list, read the show notes, and get in touch. To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media. Join the community in the new Zulip chat workspace at dataengineeringpodcast.com/chat
Your host is Tobias Macey and today I’m welcoming Ajay Kulkarni and Mike Freedman back to talk about how TimescaleDB has grown and changed over the past year.
Interview
Introduction
How did you get involved in the area of data management?
Can you refresh our memory about what TimescaleDB is?
How has the market for timeseries databases changed since we last spoke?
What has changed in the focus and features of the TimescaleDB project and company?
Toward the end of 2018 you launched the 1.0 release of Timescale. What were your criteria for establishing that milestone?
What were the most challenging aspects of reaching that goal?
In terms of timeseries workloads, what are some of the factors that differ across varying use cases?
How do those differences impact the ways in which Timescale is used by the end user, and built by your team?
What are some of the initial assumptions that you made while first launching Timescale that have held true, and which have been disproven? How have the improvements and new features in the recent releases of PostgreSQL impacted the Timescale product?
Have you been able to leverage some of the native improvements to simplify your implementation? Are there any use cases for Timescale that would have been previously impractical in vanilla Postgres that would now be reasonable without the help of Timescale?
What is in store for the future of the Timescale product and organization?
Contact Info
Ajay
@acoustik on Twitter LinkedIn
Mike
LinkedIn Website @michaelfreedman on Twitter
Timescale
Website Documentation Careers timescaledb on GitHub @timescaledb on Twitter
Parting Question
From your perspective, what is the biggest gap in the tooling or technology for data management today?
Links
TimescaleDB Original Appearance on the Data Engineering Podcast 1.0 Release Blog Post PostgreSQL
Podcast Interview
RDS DB-Engines MongoDB IOT (Internet Of Things) AWS Timestream Kafka Pulsar
Podcast Episode
Spark
Podcast Episode
Flink
Podcast Episode
Hadoop DevOps PipelineDB
Podcast Interview
Grafana Tableau Prometheus OLTP (Online Transaction Processing) Oracle DB Data Lake
The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA Support Data Engineering Podcast
In this Episode, Wayne Eckerson asks Charles Reeves about his organization’s Internet of Things and Big Data strategy. Reeves is senior manager of BI and analytics at Graphics Packaging International, a leader in the packaging industry with hundreds of worldwide customers. He has 25 years of professional experience in IT management including nine years in reporting, analytics, and data governance.
In this podcast, @RobertoMaranca shares his thoughts on running a large data-driven organization and on the future of data organizations through compliance and privacy. He explains how businesses can survive policies like GDPR and prepare themselves for better data transparency and visibility. This podcast is great for leaders of transnational corporations.
TIMELINE: 0:28 Roberto's journey. 8:18 Best practices as a data steward. 16:58 Data leadership and GDPR. 22:18 Impact of GDPR. 25:34 GDPR creating better knowledge archive. 29:27 GDPR and IOT infrastructure. 35:08 Shadow IT phenomenon and consumer privacy. 44:54 Suggestions for enterprises to deal with privacy disruption. 50:52 Data debt. 53:10 Opportunities in new privacy frameworks. 57:52 Roberto's success mantra. 1:02:38 Roberto's favorite reads.
Roberto's Recommended Read: Team of Teams: New Rules of Engagement for a Complex World by General Stanley McChrystal and Tantum Collins https://amzn.to/2kUxW1K Do Androids Dream of Electric Sheep?: The inspiration for the films Blade Runner and Blade Runner 2049 by Philip K. Dick https://amzn.to/2xOOpxZ A Scanner Darkly by Philip K. Dick https://amzn.to/2sAsUMs Other Philip K. Dick Books @ https://amzn.to/2JBwwY0
Podcast Link: https://futureofdata.org/data-leadership-through-privacy-gdpr-by-robertomaranca/
Roberto's BIO: With almost 25 years of experience in the world of IT and Data, Roberto has spent most of his working life with General Electric in their Capital Division, where since 2014, as Chief Data Officer for their International Unit, he has been overseeing the implementation of the Data Governance and Quality frameworks, ranging from supporting risk model validation to enabling divestitures and leading their more recent Basel III data initiatives. For the last year, he has held the role of Chief Data Officer at Lloyds Banking Group, shaping and implementing a new Data Strategy and dividing his time between BCBS 239 and GDPR programs.
Roberto holds a Master’s Degree in Aeronautical Engineering from “Federico II” Naples University.
About #Podcast:
FutureOfData podcast is a conversation starter to bring leaders, influencers, and lead practitioners to discuss their journey to create the data-driven future.
Want to sponsor? Email us @ [email protected]
Keywords:
#FutureOfData #DataAnalytics #Leadership #Podcast #BigData #Strategy
Summary
Data integration and routing is a constantly evolving problem and one that is fraught with edge cases and complicated requirements. The Apache NiFi project models this problem as a collection of data flows that are created through a self-service graphical interface. This framework provides a flexible platform for building a wide variety of integrations that can be managed and scaled easily to fit your particular needs. In this episode project members Kevin Doran and Andy LoPresto discuss the ways that NiFi can be used, how to start using it in your environment, and plans for future development. They also explain how it fits in the broad landscape of data tools, the interesting and challenging aspects of the project, and how to build new extensions.
Preamble
Hello and welcome to the Data Engineering Podcast, the show about modern data management. When you’re ready to build your next pipeline, you’ll need somewhere to deploy it, so check out Linode. With private networking, shared block storage, node balancers, and a 40Gbit network, all controlled by a brand new API, you’ve got everything you need to run a bullet-proof data platform. Go to dataengineeringpodcast.com/linode to get a $20 credit and launch a new server in under a minute. Are you struggling to keep up with customer requests and letting errors slip into production? Want to try some of the innovative ideas in this podcast but don’t have time? DataKitchen’s DataOps software allows your team to quickly iterate and deploy pipelines of code, models, and data sets while improving quality. Unlike a patchwork of manual operations, DataKitchen makes your team shine by providing an end to end DataOps solution with minimal programming that uses the tools you love. Join the DataOps movement and sign up for the newsletter at datakitchen.io/de today. After that, learn more about why you should be doing DataOps by listening to the Head Chef in the Data Kitchen at dataengineeringpodcast.com/datakitchen. Go to dataengineeringpodcast.com to subscribe to the show, sign up for the mailing list, read the show notes, and get in touch. Your host is Tobias Macey and today I’m interviewing Kevin Doran and Andy LoPresto about Apache NiFi.
Interview
Introduction How did you get involved in the area of data management? Can you start by explaining what NiFi is? What is the motivation for building a GUI as the primary interface for the tool when the current trend is to represent everything as code? How did you get involved with the project?
Where does it sit in the broader landscape of data tools?
Does the data that is processed by NiFi flow through the servers that it is running on (à la Spark/Flink/Kafka), or does it orchestrate actions on other systems (à la Airflow/Oozie)?
How do you manage versioning and backup of data flows, as well as promoting them between environments?
One of the advertised features is tracking provenance for data flows that are managed by NiFi. How is that data collected and managed?
What types of reporting are available across this information?
What are some of the use cases or requirements that lend themselves well to being solved by NiFi?
When is NiFi the wrong choice?
What is involved in deploying and scaling a NiFi installation?
What are some of the system/network parameters that should be considered? What are the scaling limitations?
What have you found to be some of the most interesting, unexpected, and/or challenging aspects of building and maintaining the NiFi project and community? What do you have planned for the future of NiFi?
Contact Info
Kevin Doran
@kevdoran on Twitter Email
Andy LoPresto
@yolopey on Twitter Email
Parting Question
From your perspective, what is the biggest gap in the tooling or technology for data management today?
Links
NiFi HortonWorks DataFlow HortonWorks Apache Software Foundation Apple CSV XML JSON Perl Python Internet Scale Asset Management Documentum DataFlow NSA (National Security Agency) 24 (TV Show) Technology Transfer Program Agile Software Development Waterfall Spark Flink Kafka Oozie Luigi Airflow FluentD ETL (Extract, Transform, and Load) ESB (Enterprise Service Bus) MiNiFi Java C++ Provenance Kubernetes Apache Atlas Data Governance Kibana K-Nearest Neighbors DevOps DSL (Domain Specific Language) NiFi Registry Artifact Repository Nexus NiFi CLI Maven Archetype IoT Docker Backpressure NiFi Wiki TLS (Transport Layer Security) Mozilla TLS Observatory NiFi Flow Design System Data Lineage GDPR (General Data Protection Regulation)
The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA Support Data Engineering Podcast
In this podcast, Drew Conway (@DrewConway) from Alluvium talks about his journey to start an IoT startup. He sheds light on the opportunities in the industrial IoT space and shares some insights into the mechanics of running a data science startup in the IoT space. He also shares some tactical suggestions for any future leader. This podcast is great for data science startup entrepreneurs and senior executives in IoT.
Timeline: 0:28 Drew's journey from counter-terrorism to IoT startup. 9:29 Data science in the industrial space. 12:01 Entrepreneurship in the IoT start-up. 18:36 Selling data analysis to executives in the industrial space. 24:14 Automation in the industrial setting. 29:27 What is an IoT ready company? 32:40 Challenges in integrating data tools in the industrial sector. 37:27 Data science talent pool in industrial and manufacturing companies. 41:52 Challenges in IoT adoption for industrial companies. 46:31 Alluvium's interaction with industries. 50:57 Picking the right use case as an IoT start-up. 52:49 Right customers for an IoT start-up. 59:26 Words of wisdom for anyone building an IoT start-up.
Drew's Recommended Listen: Gödel, Escher, Bach: An Eternal Golden Braid by Douglas R. Hofstadter https://amzn.to/2x0uo7d
Podcast Link: https://futureofdata.org/drewconway-on-fabric-of-an-iot-startup-futureofdata-podcast/
Drew's BIO: Drew Conway, CEO and founder of Alluvium, is a leading expert in the application of computational methods to social and behavioral problems at large scale. Drew has been writing and speaking about the role of data, and the discipline of data science, in industry, government, and academia for several years.
Drew has advised and consulted companies across many industries, ranging from fledgling start-ups to Fortune 100 companies, as well as academic institutions and government agencies at all levels. Drew started his career in counter-terrorism as a computational social scientist in the U.S. intelligence community.
In this podcast, Drew Conway (@DrewConway) from Alluvium talks about his journey toward creating a socially connected and responsible data science practice. He shares tactical steps and suggestions to help recruit the right talent, build the right culture, and nurture relationships to create a sustained and impactful data science practice. The session is great for anyone who wants to build a self-sustaining, growth-oriented data science practice.
Timeline: 0:28 Drew's journey from counter-terrorism to IoT startup. 9:29 Data science in the industrial space. 12:01 Entrepreneurship in the IoT start-up. 18:36 Selling data analysis to executives in the industrial space. 24:14 Automation in the industrial setting. 29:27 What is an IoT ready company? 32:40 Challenges in integrating data tools in the industrial sector. 37:27 Data science talent pool in industrial and manufacturing companies. 41:52 Challenges in IoT adoption for industrial companies. 46:31 Alluvium's interaction with industries. 50:57 Picking the right use case as an IoT start-up. 52:49 Right customers for an IoT start-up. 59:26 Words of wisdom for anyone building an IoT start-up.
Drew's Recommended Listen: Gödel, Escher, Bach: An Eternal Golden Braid by Douglas R. Hofstadter https://amzn.to/2x0uo7d
Podcast Link: https://futureofdata.org/drewconway-on-creating-socially-responsible-data-science-practice-futureofdata-podcast/
Drew's BIO: Drew Conway, CEO and founder of Alluvium, is a leading expert in applying computational methods to social and behavioral problems at large scale. Drew has been writing and speaking about the role of data, and the discipline of data science, in industry, government, and academia for several years.
Drew has advised and consulted companies across many industries, ranging from fledgling start-ups to Fortune 100 companies, as well as academic institutions and government agencies at all levels. Drew started his career in counter-terrorism as a computational social scientist in the U.S. intelligence community.
About #Podcast:
FutureOfData podcast is a conversation starter to bring leaders, influencers, and lead practitioners to discuss their journey to create the data-driven future.
Want to join? If you or anyone you know wants to join in, register your interest @ http://play.analyticsweek.com/guest/
Want to sponsor? Email us @ [email protected]
Keywords: