talk-data.com

Topic

IoT

Internet of Things (IoT)

connected_devices sensors data_collection

112 tagged

Activity Trend

Peak of 11 activities per quarter, 2020-Q1 to 2026-Q1.

Activities

112 activities · Newest first

Stream Processing with Apache Flink

Get started with Apache Flink, the open source framework that powers some of the world’s largest stream processing applications. With this practical book, you’ll explore the fundamental concepts of parallel stream processing and discover how this technology differs from traditional batch data processing. Longtime Apache Flink committers Fabian Hueske and Vasia Kalavri show you how to implement scalable streaming applications with Flink’s DataStream API and continuously run and maintain these applications in operational environments. Stream processing is ideal for many use cases, including low-latency ETL, streaming analytics, and real-time dashboards as well as fraud detection, anomaly detection, and alerting. You can process continuous data of any kind, including user interactions, financial transactions, and IoT data, as soon as it is generated. Learn concepts and challenges of distributed stateful stream processing Explore Flink’s system architecture, including its event-time processing mode and fault-tolerance model Understand the fundamentals and building blocks of the DataStream API, including its time-based and stateful operators Read data from and write data to external systems with exactly-once consistency Deploy and configure Flink clusters Operate continuously running streaming applications
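
As a taste of the DataStream API style the book teaches, here is a minimal hedged sketch using PyFlink, Flink's Python API (the book's own examples are in Java and Scala; the in-memory source, tuple layout, and job name below are illustrative assumptions). It keeps a keyed running sum, which exercises exactly the kind of stateful operator the blurb mentions.

```python
# Minimal PyFlink sketch: a keyed running sum over (sensor_id, reading)
# pairs. The in-memory collection stands in for a real connector (Kafka,
# files, sockets) that a production job would use.
from pyflink.datastream import StreamExecutionEnvironment

env = StreamExecutionEnvironment.get_execution_environment()

readings = env.from_collection([("s1", 3), ("s2", 7), ("s1", 5)])

# key_by partitions the stream per sensor; reduce keeps per-key state,
# emitting an updated running total for each arriving element.
totals = readings.key_by(lambda r: r[0]).reduce(
    lambda a, b: (a[0], a[1] + b[1])
)

totals.print()
env.execute("sensor-running-totals")
```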

Mastering MongoDB 4.x - Second Edition

This book, Mastering MongoDB 4.x, provides an in-depth exploration of MongoDB's features and capabilities, empowering readers to create high-performance and fault-tolerant database solutions. Through practical examples and clear explanations, you will learn how to implement complex queries, optimize database performance, manage large-scale clusters, and ensure robust failover and backup strategies. What this Book will help me do Understand advanced querying techniques and best practices in data indexing and management. Effectively configure and monitor MongoDB instances for scalability and optimized performance. Master techniques for replication and sharding to support high-availability systems. Deploy MongoDB-based applications seamlessly across on-premise and cloud environments. Learn to integrate MongoDB with modern technologies like big data platforms, containers, and IoT applications. Author(s) Alex Giamas is a seasoned database administrator and developer with significant experience in working with both relational and non-relational databases. Having authored numerous articles and given lectures on MongoDB and other data management technologies, Alex brings practical insights to his writing. He emphasizes real-world applications with examples drawn from his extensive career. Who is it for? This book is designed for developers and database administrators already familiar with MongoDB and basic database concepts, who are looking to enhance their expertise for implementing advanced MongoDB solutions. It is also suitable for professionals aspiring to earn MongoDB certifications and expand their skills to manage large, high-performance database systems efficiently.
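
For a flavor of the indexing and querying techniques the book covers, here is a small hedged pymongo sketch (the connection string, database, and field names are illustrative, not taken from the book):

```python
# Hypothetical IoT-style collection: index it for per-device, time-ordered
# reads, insert a document, then fetch the latest readings for one device.
from pymongo import MongoClient, ASCENDING, DESCENDING

client = MongoClient("mongodb://localhost:27017")
readings = client.iot_demo.readings

# Compound index supporting find-by-device plus sort-by-time queries.
readings.create_index([("device_id", ASCENDING), ("ts", DESCENDING)])

readings.insert_one({"device_id": "d-42", "ts": 1700000000, "temp_c": 21.5})

# The index turns this into an efficient range scan rather than a
# collection scan.
latest = list(
    readings.find({"device_id": "d-42"}).sort("ts", DESCENDING).limit(10)
)
print(latest)
```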

Summary

The past year has been an active one for the timeseries market. New products have been launched, more businesses have moved to streaming analytics, and the team at Timescale has been keeping busy. In this episode the TimescaleDB CEO Ajay Kulkarni and CTO Michael Freedman stop by to talk about their 1.0 release, how the use cases for timeseries data have proliferated, and how they are continuing to simplify the task of processing your time-oriented events.
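
For context, TimescaleDB's signature pieces are the hypertable and SQL functions such as time_bucket(); here is a hedged sketch of both, driven from Python with psycopg2 (the connection parameters and table layout are assumptions for illustration):

```python
# Create a plain Postgres table, promote it to a TimescaleDB hypertable,
# then run a typical timeseries rollup with time_bucket().
import psycopg2

conn = psycopg2.connect("dbname=tsdemo user=postgres")
cur = conn.cursor()

cur.execute("""
    CREATE TABLE IF NOT EXISTS conditions (
        time        TIMESTAMPTZ NOT NULL,
        device_id   TEXT NOT NULL,
        temperature DOUBLE PRECISION
    );
""")
# create_hypertable() partitions the table by time under the hood.
cur.execute(
    "SELECT create_hypertable('conditions', 'time', if_not_exists => TRUE);"
)

# Five-minute average temperature per device.
cur.execute("""
    SELECT time_bucket('5 minutes', time) AS bucket,
           device_id,
           avg(temperature)
    FROM conditions
    GROUP BY bucket, device_id
    ORDER BY bucket;
""")
print(cur.fetchall())
conn.commit()
```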

Introduction

Hello and welcome to the Data Engineering Podcast, the show about modern data management. When you’re ready to build your next pipeline, or want to test out the projects you hear about on the show, you’ll need somewhere to deploy it, so check out Linode. With 200Gbit private networking, scalable shared block storage, and a 40Gbit public network, you’ve got everything you need to run a fast, reliable, and bullet-proof data platform. If you need global distribution, they’ve got that covered too with world-wide datacenters including new ones in Toronto and Mumbai. Go to dataengineeringpodcast.com/linode today to get a $20 credit and launch a new server in under a minute. Go to dataengineeringpodcast.com to subscribe to the show, sign up for the mailing list, read the show notes, and get in touch. To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media. Join the community in the new Zulip chat workspace at dataengineeringpodcast.com/chat. Your host is Tobias Macey and today I’m welcoming Ajay Kulkarni and Mike Freedman back to talk about how TimescaleDB has grown and changed over the past year.

Interview

Introduction

How did you get involved in the area of data management?

Can you refresh our memory about what TimescaleDB is?

How has the market for timeseries databases changed since we last spoke?

What has changed in the focus and features of the TimescaleDB project and company?

Toward the end of 2018 you launched the 1.0 release of Timescale. What were your criteria for establishing that milestone?

What were the most challenging aspects of reaching that goal?

In terms of timeseries workloads, what are some of the factors that differ across varying use cases?

How do those differences impact the ways in which Timescale is used by the end user, and built by your team?

What are some of the initial assumptions that you made while first launching Timescale that have held true, and which have been disproven?

How have the improvements and new features in the recent releases of PostgreSQL impacted the Timescale product?

Have you been able to leverage some of the native improvements to simplify your implementation?

Are there any use cases for Timescale that would have been previously impractical in vanilla Postgres that would now be reasonable without the help of Timescale?

What is in store for the future of the Timescale product and organization?

Contact Info

Ajay

@acoustik on Twitter LinkedIn

Mike

LinkedIn Website @michaelfreedman on Twitter

Timescale

Website Documentation Careers timescaledb on GitHub @timescaledb on Twitter

Parting Question

From your perspective, what is the biggest gap in the tooling or technology for data management today?

Links

TimescaleDB Original Appearance on the Data Engineering Podcast 1.0 Release Blog Post PostgreSQL

Podcast Interview

RDS DB-Engines MongoDB IOT (Internet Of Things) AWS Timestream Kafka Pulsar

Podcast Episode

Spark

Podcast Episode

Flink

Podcast Episode

Hadoop DevOps PipelineDB

Podcast Interview

Grafana Tableau Prometheus OLTP (Online Transaction Processing) Oracle DB Data Lake

The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA

Support Data Engineering Podcast

Fast Data Architectures for Streaming Applications, 2nd Edition

Why have stream-oriented data systems become so popular, when batch-oriented systems have served big data needs for many years? In the updated edition of this report, Dean Wampler examines the rise of streaming systems for handling time-sensitive problems—such as detecting fraudulent financial activity as it happens. You’ll explore the characteristics of fast data architectures, along with several open source tools for implementing them. Batch processing isn’t going away, but exclusive use of these systems is now a competitive disadvantage. You’ll learn that, while fast data architectures using tools such as Kafka, Akka, Spark, and Flink are much harder to build, they represent the state of the art for dealing with mountains of data that require immediate attention. Learn how a basic fast data architecture works, step-by-step Examine how Kafka’s data backplane combines the best abstractions of log-oriented and message queue systems for integrating components Evaluate four streaming engines, including Kafka Streams, Akka Streams, Spark, and Flink Learn which streaming engines work best for different use cases Get recommendations for making real-world streaming systems responsive, resilient, elastic, and message driven Explore an example IoT streaming application that includes telemetry ingestion and anomaly detection
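
To make the report's telemetry-plus-anomaly-detection example concrete, here is a hedged kafka-python sketch (broker address, topic name, and the naive threshold rule are illustrative; the engines the report evaluates would express this as proper streaming operators):

```python
# Produce IoT telemetry to a Kafka topic, then consume it and flag
# out-of-range readings with a simple threshold check.
import json
from kafka import KafkaProducer, KafkaConsumer

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)
producer.send("telemetry", {"device": "d-1", "temp_c": 98.0})
producer.flush()

consumer = KafkaConsumer(
    "telemetry",
    bootstrap_servers="localhost:9092",
    auto_offset_reset="earliest",
    value_deserializer=lambda b: json.loads(b.decode("utf-8")),
)
for record in consumer:  # runs until interrupted
    reading = record.value
    if reading["temp_c"] > 90.0:  # stand-in anomaly rule
        print("anomaly:", reading)
```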

In this episode, Wayne Eckerson asks Charles Reeves about his organization’s Internet of Things and Big Data strategy. Reeves is senior manager of BI and analytics at Graphic Packaging International, a leader in the packaging industry with hundreds of worldwide customers. He has 25 years of professional experience in IT management, including nine years in reporting, analytics, and data governance.

In this podcast, @RobertoMaranca shared his thoughts on running a large data-driven organization, on the future of data organizations shaped by compliance and privacy, and on how businesses can survive policies like GDPR while preparing themselves for better data transparency and visibility. This podcast is great for leaders of transnational corporations.

TIMELINE: 0:28 Roberto's journey. 8:18 Best practices as a data steward. 16:58 Data leadership and GDPR. 22:18 Impact of GDPR. 25:34 GDPR creating better knowledge archive. 29:27 GDPR and IoT infrastructure. 35:08 Shadow IT phenomenon and consumer privacy. 44:54 Suggestions for enterprises to deal with privacy disruption. 50:52 Data debt. 53:10 Opportunities in new privacy frameworks. 57:52 Roberto's success mantra. 1:02:38 Roberto's favorite reads.

Roberto's Recommended Read: Team of Teams: New Rules of Engagement for a Complex World by General Stanley McChrystal and Tantum Collins https://amzn.to/2kUxW1K Do Androids Dream of Electric Sheep?: The inspiration for the films Blade Runner and Blade Runner 2049 by Philip K. Dick https://amzn.to/2xOOpxZ A Scanner Darkly by Philip K. Dick https://amzn.to/2sAsUMs Other Philip K. Dick Books @ https://amzn.to/2JBwwY0

Podcast Link: https://futureofdata.org/data-leadership-through-privacy-gdpr-by-robertomaranca/

Roberto's BIO: With almost 25 years of experience in the world of IT and Data, Roberto has spent most of his working life with General Electric in their Capital Division, where, since 2014, as Chief Data Officer for their International Unit, he has been overseeing the implementation of the Data Governance and Quality frameworks, spanning from supporting risk model validation to enabling divestitures and leading their more recent Basel III data initiatives. For the last year, he has held the role of Chief Data Officer at Lloyds Banking Group, shaping and implementing a new Data Strategy and dividing his time between BCBS 239 and GDPR programs.

Roberto holds a Master’s degree in Aeronautical Engineering from the “Federico II” University of Naples.

About #Podcast:

FutureOfData podcast is a conversation starter to bring leaders, influencers, and lead practitioners to discuss their journey to create the data-driven future.

Want to sponsor? Email us @ [email protected]

Keywords:

#FutureOfData #DataAnalytics #Leadership #Podcast #BigData #Strategy

Getting Started with Kudu

Fast data ingestion, serving, and analytics in the Hadoop ecosystem have forced developers and architects to choose solutions using the least common denominator—either fast analytics at the cost of slow data ingestion or fast data ingestion at the cost of slow analytics. There is an answer to this problem. With the Apache Kudu column-oriented data store, you can easily perform fast analytics on fast data. This practical guide shows you how. Begun as an internal project at Cloudera, Kudu is an open source solution compatible with many data processing frameworks in the Hadoop environment. In this book, current and former solutions professionals from Cloudera provide use cases, examples, best practices, and sample code to help you get up to speed with Kudu. Explore Kudu’s high-level design, including how it spreads data across servers Fully administer a Kudu cluster, enable security, and add or remove nodes Learn Kudu’s client-side APIs, including how to integrate Apache Impala, Spark, and other frameworks for data manipulation Examine Kudu’s schema design, including basic concepts and primitives necessary to make your project successful Explore case studies for using Kudu for real-time IoT analytics, predictive modeling, and in combination with another storage engine
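
As a client-side taste of the pattern the book describes, here is a hedged sketch following the Apache Kudu Python client's documented create/insert/scan flow (the master address, table name, and schema are assumptions):

```python
# Define a schema, create a hash-partitioned Kudu table, insert one row,
# and scan it back. Real analytics would go through Impala or Spark.
import kudu
from kudu.client import Partitioning

client = kudu.connect(host="kudu-master.example.com", port=7051)

builder = kudu.schema_builder()
builder.add_column("key").type(kudu.int64).nullable(False).primary_key()
builder.add_column("metric").type(kudu.double)
schema = builder.build()

# Spread rows across tablet servers by hashing the primary key.
partitioning = Partitioning().add_hash_partitions(
    column_names=["key"], num_buckets=3
)
client.create_table("iot_metrics", schema, partitioning)

table = client.table("iot_metrics")
session = client.new_session()
session.apply(table.new_insert({"key": 1, "metric": 0.42}))
session.flush()

scanner = table.scanner()
scanner.open()
print(scanner.read_all_tuples())
```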

Summary

Data integration and routing is a constantly evolving problem and one that is fraught with edge cases and complicated requirements. The Apache NiFi project models this problem as a collection of data flows that are created through a self-service graphical interface. This framework provides a flexible platform for building a wide variety of integrations that can be managed and scaled easily to fit your particular needs. In this episode project members Kevin Doran and Andy LoPresto discuss the ways that NiFi can be used, how to start using it in your environment, and plans for future development. They also explained how it fits in the broad landscape of data tools, the interesting and challenging aspects of the project, and how to build new extensions.
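
Although flows are authored in NiFi's graphical canvas, everything the canvas does is backed by a REST API that scripts can call, for example for monitoring. A hedged sketch follows (host, port, and the exact response fields are assumptions based on NiFi's documented /nifi-api surface, here against an unsecured development instance):

```python
# Read the root process group's aggregate status from a local NiFi
# instance; field names assumed per the ProcessGroupStatusEntity schema.
import requests

BASE = "http://localhost:8080/nifi-api"  # unsecured dev instance assumed

resp = requests.get(f"{BASE}/flow/process-groups/root/status")
resp.raise_for_status()

snapshot = resp.json()["processGroupStatus"]["aggregateSnapshot"]
print("flowfiles in:", snapshot["flowFilesIn"])
print("bytes in:", snapshot["bytesIn"])
```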

Preamble

Hello and welcome to the Data Engineering Podcast, the show about modern data management. When you’re ready to build your next pipeline you’ll need somewhere to deploy it, so check out Linode. With private networking, shared block storage, node balancers, and a 40Gbit network, all controlled by a brand new API, you’ve got everything you need to run a bullet-proof data platform. Go to dataengineeringpodcast.com/linode to get a $20 credit and launch a new server in under a minute. Are you struggling to keep up with customer requests and letting errors slip into production? Want to try some of the innovative ideas in this podcast but don’t have time? DataKitchen’s DataOps software allows your team to quickly iterate and deploy pipelines of code, models, and data sets while improving quality. Unlike a patchwork of manual operations, DataKitchen makes your team shine by providing an end-to-end DataOps solution with minimal programming that uses the tools you love. Join the DataOps movement and sign up for the newsletter at datakitchen.io/de today. After that, learn more about why you should be doing DataOps by listening to the Head Chef in the Data Kitchen at dataengineeringpodcast.com/datakitchen. Go to dataengineeringpodcast.com to subscribe to the show, sign up for the mailing list, read the show notes, and get in touch. Your host is Tobias Macey and today I’m interviewing Kevin Doran and Andy LoPresto about Apache NiFi.

Interview

Introduction

How did you get involved in the area of data management?

Can you start by explaining what NiFi is?

What is the motivation for building a GUI as the primary interface for the tool when the current trend is to represent everything as code?

How did you get involved with the project?

Where does it sit in the broader landscape of data tools?

Does the data that is processed by NiFi flow through the servers that it is running on (à la Spark/Flink/Kafka), or does it orchestrate actions on other systems (à la Airflow/Oozie)?

How do you manage versioning and backup of data flows, as well as promoting them between environments?

One of the advertised features is tracking provenance for data flows that are managed by NiFi. How is that data collected and managed?

What types of reporting are available across this information?

What are some of the use cases or requirements that lend themselves well to being solved by NiFi?

When is NiFi the wrong choice?

What is involved in deploying and scaling a NiFi installation?

What are some of the system/network parameters that should be considered?

What are the scaling limitations?

What have you found to be some of the most interesting, unexpected, and/or challenging aspects of building and maintaining the NiFi project and community?

What do you have planned for the future of NiFi?

Contact Info

Kevin Doran

@kevdoran on Twitter Email

Andy LoPresto

@yolopey on Twitter Email

Parting Question

From your perspective, what is the biggest gap in the tooling or technology for data management today?

Links

NiFi HortonWorks DataFlow HortonWorks Apache Software Foundation Apple CSV XML JSON Perl Python Internet Scale Asset Management Documentum DataFlow NSA (National Security Agency) 24 (TV Show) Technology Transfer Program Agile Software Development Waterfall Spark Flink Kafka Oozie Luigi Airflow FluentD ETL (Extract, Transform, and Load) ESB (Enterprise Service Bus) MiNiFi Java C++ Provenance Kubernetes Apache Atlas Data Governance Kibana K-Nearest Neighbors DevOps DSL (Domain Specific Language) NiFi Registry Artifact Repository Nexus NiFi CLI Maven Archetype IoT Docker Backpressure NiFi Wiki TLS (Transport Layer Security) Mozilla TLS Observatory NiFi Flow Design System Data Lineage GDPR (General Data Protection Regulation)

The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA

Support Data Engineering Podcast

Streaming Change Data Capture

There are many benefits to becoming a data-driven organization, including the ability to accelerate and improve business decision accuracy through the real-time processing of transactions, social media streams, and IoT data. But those benefits require significant changes to your infrastructure. You need flexible architectures that can copy data to analytics platforms at near-zero latency while maintaining 100% production uptime. Fortunately, a solution already exists. This ebook demonstrates how change data capture (CDC) can meet the scalability, efficiency, real-time, and zero-impact requirements of modern data architectures. Kevin Petrie, Itamar Ankorion, and Dan Potter—technology marketing leaders at Attunity—explain how CDC enables faster and more accurate decisions based on current data and reduces or eliminates full reloads that disrupt production and efficiency. The book examines: How CDC evolved from a niche feature of database replication software to a critical data architecture building block Architectures where data workflow and analysis take place, and their integration points with CDC How CDC identifies and captures source data updates to assist high-speed replication to one or more targets Case studies on cloud-based streaming and streaming to a data lake and related architectures Guiding principles for effectively implementing CDC in cloud, data lake, and streaming environments The Attunity Replicate platform for efficiently loading data across all major database, data warehouse, cloud, streaming, and Hadoop platforms
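
To make the capture-and-forward loop concrete, here is a deliberately simplified, generic sketch of query-based CDC (the table, schema, and polling interval are invented for illustration; log-based tools like those the book discusses read the database transaction log instead, which avoids polling overhead and also captures deletes):

```python
# Poll a source table for rows changed since a high-water mark and
# forward them downstream. This is the query-based flavor of CDC.
import time
import sqlite3

source = sqlite3.connect("source.db")  # assumes an 'orders' table exists
last_seen = 0.0  # high-water mark: the max updated_at already shipped

def ship(rows):
    """Stand-in for publishing to Kafka, a data lake, or a warehouse."""
    for row in rows:
        print("change:", row)

while True:
    rows = source.execute(
        "SELECT id, payload, updated_at FROM orders "
        "WHERE updated_at > ? ORDER BY updated_at",
        (last_seen,),
    ).fetchall()
    if rows:
        ship(rows)
        last_seen = rows[-1][2]  # advance the mark past shipped changes
    time.sleep(1.0)  # log-based CDC needs no such polling delay
```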

In this podcast, Drew Conway (@DrewConway) from Alluvium talks about his journey to start an IoT startup. He sheds light on the opportunities in the industrial IoT space and shares some insights into the mechanics of running a data science startup in the IoT space. He shared some tactical suggestions for any future leader. This podcast is great for data science startup entrepreneurs and/or Sr. executives in IoT.

Timeline: 0:28 Drew's journey from counter-terrorism to IoT startup. 9:29 Data science in the industrial space. 12:01 Entrepreneurship in the IoT start-up. 18:36 Selling data analysis to executives in the industrial space. 24:14 Automation in the industrial setting. 29:27 What is an IoT ready company? 32:40 Challenges in integrating data tools in the industrial sector. 37:27 Data science talent pool in industrial and manufacturing companies. 41:52 Challenges in IoT adoption for industrial companies. 46:31 Alluvium's interaction with industries. 50:57 Picking the right use case as an IoT start-up. 52:49 Right customers for an IoT start-up. 59:26 Words of wisdom for anyone building an IoT start-up.

Drew's Recommended Listen: Gödel, Escher, Bach: An Eternal Golden Braid by Douglas R. Hofstadter https://amzn.to/2x0uo7d

Podcast Link: https://futureofdata.org/drewconway-on-fabric-of-an-iot-startup-futureofdata-podcast/

Drew's BIO: Drew Conway, CEO and founder of Alluvium, is a leading expert in the application of computational methods to social and behavioral problems at large scale. Drew has been writing and speaking about the role of data — and the discipline of data science — in industry, government, and academia for several years.

Drew has advised and consulted companies across many industries, ranging from fledgling start-ups to Fortune 100 companies, as well as academic institutions and government agencies at all levels. Drew started his career in counter-terrorism as a computational social scientist in the U.S. intelligence community.

About #Podcast:

FutureOfData podcast is a conversation starter to bring leaders, influencers, and lead practitioners to discuss their journey to create the data-driven future.

Want to sponsor? Email us @ [email protected]

Keywords:

#FutureOfData #DataAnalytics #Leadership #Podcast #BigData #Strategy

Visualizing Streaming Data

While tools for analyzing streaming and real-time data are gaining adoption, the ability to visualize these data types has yet to catch up. Dashboards are good at conveying daily or weekly data trends at a glance, though capturing snapshots when data is transforming from moment to moment is more difficult—but not impossible. With this practical guide, application designers, data scientists, and system administrators will explore ways to create visualizations that bring context and a sense of time to streaming text data. Author Anthony Aragues guides you through the concepts and tools you need to build visualizations for analyzing data as it arrives. Determine your company’s goals for visualizing streaming data Identify key data sources and learn how to stream them Learn practical methods for processing streaming data Build a client application for interacting with events, logs, and records Explore common components for visualizing streaming data Consider analysis concepts for developing your visualization Define the dashboard’s layout, flow direction, and component movement Improve visualization quality and productivity through collaboration Explore use cases including security, IoT devices, and application data
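
As a small illustration of the moment-to-moment visualization the book is about, here is a minimal live-updating matplotlib chart (the data is simulated with random values; a real client would consume events from a socket or message queue):

```python
# Animate a rolling window of the most recent readings.
import random
from collections import deque

import matplotlib.pyplot as plt
from matplotlib.animation import FuncAnimation

window = deque(maxlen=100)  # keep only the latest 100 points

fig, ax = plt.subplots()
(line,) = ax.plot([], [])
ax.set_xlim(0, 100)
ax.set_ylim(0, 1)
ax.set_xlabel("event index")
ax.set_ylabel("value")

def update(_frame):
    window.append(random.random())  # one new "event" per tick
    line.set_data(range(len(window)), list(window))
    return (line,)

anim = FuncAnimation(fig, update, interval=200, cache_frame_data=False)
plt.show()
```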

In this podcast, Drew Conway (@DrewConway) from Alluvium talks about his journey of creating a socially connected and responsible data science practice. He shared tactical steps and suggestions to help recruit the right talent, build the right culture, and nurture the relationships needed to create a sustained and impactful data science practice. The session is great for folks who care about creating a self-sustaining, growth-compliant data science practice.

Timeline: 0:28 Drew's journey from counter-terrorism to IoT startup. 9:29 Data science in the industrial space. 12:01 Entrepreneurship in the IoT start-up. 18:36 Selling data analysis to executives in the industrial space. 24:14 Automation in the industrial setting. 29:27 What is an IoT ready company? 32:40 Challenges in integrating data tools in the industrial sector. 37:27 Data science talent pool in industrial and manufacturing companies. 41:52 Challenges in IoT adoption for industrial companies. 46:31 Alluvium's interaction with industries. 50:57 Picking the right use case as an IoT start-up. 52:49 Right customers for an IoT start-up. 59:26 Words of wisdom for anyone building an IoT start-up.

Drew's Recommended Listen: Gödel, Escher, Bach: An Eternal Golden Braid by Douglas R. Hofstadter https://amzn.to/2x0uo7d

Podcast Link: https://futureofdata.org/drewconway-on-creating-socially-responsible-data-science-practice-futureofdata-podcast/

Drew's BIO: Drew Conway, CEO and founder of Alluvium, is a leading expert in applying computational methods to social and behavioral problems at large scale. Drew has been writing and speaking about the role of data — and the discipline of data science — in industry, government, and academia for several years.

Drew has advised and consulted companies across many industries, ranging from fledgling start-ups to Fortune 100 companies, as well as academic institutions and government agencies at all levels. Drew started his career in counter-terrorism as a computational social scientist in the U.S. intelligence community.

About #Podcast:

FutureOfData podcast is a conversation starter to bring leaders, influencers, and lead practitioners to discuss their journey to create the data-driven future.

Wanna Join? If you or anyone you know wants to join in, Register your interest @ http://play.analyticsweek.com/guest/

Want to sponsor? Email us @ [email protected]

Keywords:

#FutureOfData #DataAnalytics #Leadership #Podcast #BigData #Strategy

In this last part of the two-part podcast, @TimothyChou discussed the future of the Internet of Things landscape. He laid out how the internet has always been about the internet of things and not the internet of people. He sheds light on the internet of things as it spreads across the themes of things, connect, collect, learn, and do workflows. He builds an interesting case for achieving precision before introducing optimality.

Timeline: 0:29 Timothy's journey. 8:56 Selling cloud to Oracle. 15:57 Communicating economics and technology disruption. 23:54 Internet of people to the internet of things.

Timothy's Recommended Read: Life 3.0: Being Human in the Age of Artificial Intelligence by Max Tegmark http://amzn.to/2Cidyhy Zone to Win: Organizing to Compete in an Age of Disruption Paperback by Geoffrey A. Moore http://amzn.to/2Hd5zpv

Podcast Link: https://futureofdata.org/timothychou-on-world-of-iot-its-future-part-2/

Timothy's BIO: Timothy Chou's career spans academia, successful (and not so successful) startups, and large corporations. He was one of only a few people to hold the President's title at Oracle. As President of Oracle On Demand, he grew the cloud business from its very beginning. Today that business is over $2B. He wrote about the move of applications to the cloud in 2004 in his first book, “The End of Software”. Today he serves on the board of Blackbaud, a nearly $700M vertical application cloud service company.

After earning his Ph.D. in EE at the University of Illinois, he went to work for Tandem Computers, one of the original Silicon Valley startups. Had he understood stock options, he would have joined earlier. He’s invested in and been a contributor to a number of other startups, some you’ve heard of like Webex, and others you’ve never heard of but were sold to companies like Cisco and Oracle. Today he is focused on several new ventures in cloud computing, machine learning, and the Internet of Things.

About #Podcast:

FutureOfData podcast is a conversation starter to bring leaders, influencers, and lead practitioners to discuss their journey to create the data-driven future.

Wanna Join? If you or anyone you know wants to join in, Register your interest @ http://play.analyticsweek.com/guest/

Want to sponsor? Email us @ [email protected]

Keywords:

#FutureOfData #DataAnalytics #Leadership #Podcast #BigData #Strategy

In this first part of a two-part podcast, @TimothyChou discussed the Internet of Things landscape. He laid out how the internet has always been about the internet of things and not the internet of people. He sheds light on the internet of things as it spreads across the themes of things, connect, collect, learn, and do workflows. He builds an interesting case for achieving precision before introducing optimality.

Timeline: 0:29 Reason behind the failure of IoT projects. 19:10 Which businesses will be impacted by IoT expansion? 30:22 How is IoT being impacted by the world of AI? 40:35 Innovative startups in the IoT industry. 49:17 What's slowing down IoT? 52:20 How closely are IoT and cloud married together? 54:32 Timothy's success mantra. 56:16 Parting thoughts.

Timothy's Recommended Read: Life 3.0: Being Human in the Age of Artificial Intelligence by Max Tegmark http://amzn.to/2Cidyhy Zone to Win: Organizing to Compete in an Age of Disruption Paperback by Geoffrey A. Moore http://amzn.to/2Hd5zpv

Podcast Link: https://futureofdata.org/timothychou-on-world-of-iot-its-future-part-1-futureofdata-podcast/

Timothy's BIO: Timothy Chou's career spans academia, successful (and not so successful) startups, and large corporations. He was one of only a few people to hold the President's title at Oracle. As President of Oracle On Demand, he grew the cloud business from its very beginning. Today that business is over $2B. He wrote about the move of applications to the cloud in 2004 in his first book, “The End of Software”. Today he serves on the board of Blackbaud, a nearly $700M vertical application cloud service company.

After earning his Ph.D. in EE at the University of Illinois, he went to work for Tandem Computers, one of the original Silicon Valley startups. Had he understood stock options, he would have joined earlier. He’s invested in and been a contributor to a number of other startups, some you’ve heard of like Webex, and others you’ve never heard of but were sold to companies like Cisco and Oracle. Today he is focused on several new ventures in cloud computing, machine learning, and the Internet of Things.

About #Podcast:

FutureOfData podcast is a conversation starter to bring leaders, influencers, and lead practitioners to discuss their journey to create the data-driven future.

Wanna Join? If you or anyone you know wants to join in, Register your interest @ http://play.analyticsweek.com/guest/

Want to sponsor? Email us @ [email protected]

Keywords:

#FutureOfData #DataAnalytics #Leadership #Podcast #BigData #Strategy

In this podcast, Venu Vasudevan (@ProcterGamble) talks about the best practices of creating a research-led, data-driven data science team. He walked through his journey of creating a robust and sustained data science team, spoke about bias in data science, and covered some practices leaders and data science practitioners could adopt to create an impactful data science team. This podcast is great for future data science leaders and practitioners leading organizations to put together a data science practice.

Timeline: 0:29 Venu's journey. 11:18 Venu's current role at P&G. 13:11 Standardization of technology and IoT. 17:18 The state of AI. 19:46 Running an AI and data practice for a company. 22:30 Building a data science practice in a startup in comparison to a transnational company. 24:05 Dealing with bias. 27:32 Culture: a block or an opportunity. 30:05 Dealing with data we've never dealt with before. 32:32 Sustainable vs. disruption. 36:17 Starting a data science team. 38:34 Data science as an art of doing and science of doing business. 41:37 Tips to improve storytelling for a data practitioner. 43:30 Challenges in Venu's journey. 44:55 Tenets of a good data scientist. 47:27 Diversity in hiring. 50:50 KPIs to look out for if you are running an AI practice. 51:37 Venu's favorite read.

Venu's Recommended Read: Isaac Newton: The Last Sorcerer - Michael White http://amzn.to/2FzGV0N Against the Gods: The Remarkable Story of Risk - Peter L. Bernstein http://amzn.to/2DRPveU

Podcast Link: https://futureofdata.org/venu-vasudevan-venuv62-proctergamble-on-creating-a-rockstar-data-science-team-futureofdata/

Venu's BIO: Venu Vasudevan is Research Director, Data Science & AI at Procter & Gamble, where he directs the Data Science & AI organization in P&G research. He is a technology leader with a track record of successful consumer and enterprise innovation at the intersection of AI, Machine Learning, Big Data, and IoT. Previously he was VP of Data Science at an IoT startup, a founding member of the Motorola team that created the Zigbee IoT standard, worked to create an industry-first zero-click interface for mobile with Dag Kittlaus (co-creator of Apple Siri), created an industry-first Google Glass experience for TV, an ARRIS video analytics and big data platform recently acquired by Comcast, and a social analytics platform leveraging Twitter that was featured in Wired Magazine and BBC. Venu holds a Ph.D. (Databases & AI) from Ohio State University and was a member of Motorola's Science Advisory Board (top 2% of Motorola technologists). He is an Adjunct Professor at Rice University's Electrical and Computer Engineering department and was a mentor at Chicago's 1871 startup incubator.

About #Podcast:

FutureOfData podcast is a conversation starter to bring leaders, influencers, and lead practitioners to discuss their journey to create the data-driven future.

Wanna Join? If you or anyone you know wants to join in, Register your interest @ http://play.analyticsweek.com/guest/

Want to sponsor? Email us @ [email protected]

Keywords:

#FutureOfData #DataAnalytics #Leadership #Podcast #BigData #Strategy

In this podcast, Pascal Marmier sat with Vishal Kumar to discuss the challenges, opportunities, and nuances of running an innovation hub within a regulated corporate culture, and shared some best practices for promoting data-driven innovation.

Timeline: 0:29 Pascal's journey. 6:12 Quitting law and getting into digital analytics. 10:44 Defining Head Analytics Catalyst. 13:46 Putting up an Analytics Catalyst team. 18:43 Steps to create a data lab. 22:02 Securing executive sponsorship. 25:45 Differences in creating a lab in Europe in comparison to the USA. 29:43 Challenges in setting up a digital analytics catalyst. 32:27 Ideal team members to have in a digital analytics catalyst team. 35:14 Company culture interfering with lab innovation. 38:00 Lab innovation determining the company's future. 42:19 Important KPIs for setting up a lab. 46:55 Prophecy on the insurance industry. 51:15 What can insurers do to secure themselves? 54:48 Insurance dealing with changing risk profiles. 59:26 Pascal's favorite read. 1:00:56 Closing remarks.

Podcast link: https://futureofdata.org/pascal-marmier-pmarmier-swissre-discuss-running-data-driven-innovation-catalyst/

About Pascal Marmier: After many years helping to build the swissnex network in Boston and in China, I recently joined Swiss Re in Boston to help the Digital Analytics Catalyst team identify and develop novel ideas/tech into a sustainable business. As part of the digital transformation of the insurance industry, our team in Boston and London is engaging with startups and academia in various fields of technology such as digital health, IoT, AI. We are working with the teams at Swiss Re to provide business solutions based on the transformative power of data innovation.

Pascal's Favorite Read: Sapiens: A Brief History of Humankind http://amzn.to/2yHvYGV

About #Podcast:

FutureOfData podcast is a conversation starter to bring leaders, influencers, and lead practitioners to discuss their journey to create the data-driven future.

Wanna Join? If you or anyone you know wants to join in, Register your interest @ http://play.analyticsweek.com/guest/

Want to sponsor? Email us @ [email protected]

Keywords: #FutureOfData #DataAnalytics #Leadership #Podcast #BigData #Strategy

Decision Support, Analytics, and Business Intelligence, Third Edition

Rapid technology change is impacting organizations large and small. Mobile and Cloud computing, the Internet of Things (IoT), and “Big Data” are driving forces in organizational digital transformation. Decision support and analytics are available to many people in a business or organization. Business professionals need to learn about and understand computerized decision support for organizations to succeed. This text is targeted to busy managers and students who need to grasp the basics of computerized decision support, including: What is analytics? What is a decision support system? What is “Big Data”? What are “Big Data” business use cases? Overall, it addresses 61 fundamental questions. In a short period of time, readers can “get up to speed” on decision support, analytics, and business intelligence. The book then provides a quick reference to important recurring questions.

Business in Real-Time Using Azure IoT and Cortana Intelligence Suite: Driving Your Digital Transformation

Learn how today’s businesses can transform themselves by leveraging real-time data and advanced machine learning analytics. This book provides prescriptive guidance for architects and developers on the design and development of modern Internet of Things (IoT) and Advanced Analytics solutions. In addition, Business in Real-Time Using Azure IoT and Cortana Intelligence Suite offers patterns and practices for those looking to engage their customers and partners through Software-as-a-Service solutions that work on any device. Whether you're working in Health & Life Sciences, Manufacturing, Retail, Smart Cities and Buildings or Process Control, there exists a common platform from which you can create your targeted vertical solutions. Business in Real-Time Using Azure IoT and Cortana Intelligence Suite uses a reference architecture as a road map. Building on Azure’s PaaS services, you'll see how a solution architecture unfolds that demonstrates a complete end-to-end IoT and Advanced Analytics scenario. What You'll Learn: Automate your software product life cycle using PowerShell, Azure Resource Manager Templates, and Visual Studio Team Services Implement smart devices using Node.JS and C# Use Azure Streaming Analytics to ingest millions of events Provide both "Hot" and "Cold" path outputs for real-time alerts, data transformations, and aggregation analytics Implement batch processing using Azure Data Factory Create a new form of Actionable Intelligence (AI) to drive mission critical business processes Provide rich Data Visualizations across a wide variety of mobile and web devices Who This Book is For: Solution Architects, Software Developers, Data Architects, Data Scientists, and CIO/CTA Technical Leadership Professionals
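
For a device-side taste of the telemetry ingestion this book architects, here is a hedged sketch using the azure-iot-device Python SDK (the book's own device examples use Node.js and C# and target an earlier SDK generation; the connection string is a placeholder):

```python
# Send one JSON telemetry message from a device to Azure IoT Hub.
import json
from azure.iot.device import IoTHubDeviceClient, Message

conn_str = "HostName=<hub>.azure-devices.net;DeviceId=<device>;SharedAccessKey=<key>"
client = IoTHubDeviceClient.create_from_connection_string(conn_str)

reading = {"deviceId": "<device>", "temperature": 22.4, "humidity": 41.0}
msg = Message(json.dumps(reading))
msg.content_type = "application/json"
msg.content_encoding = "utf-8"

client.send_message(msg)  # feeds the ingestion "hot path" downstream
client.shutdown()
```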

MQTT Essentials - A Lightweight IoT Protocol

Dive into the world of MQTT, the preferred protocol for IoT and M2M communication. This book provides a comprehensive guide to understanding, implementing, and securing MQTT-based systems, enabling readers to create efficient and lightweight communication networks for their connected devices. What this Book will help me do Understand the underlying principles and protocol structure of MQTT. Securely configure and deploy an MQTT broker for communication. Develop Python, Java, and JavaScript-based MQTT client applications. Utilize MQTT for real-world IoT use cases such as sensor data interchange. Optimize MQTT usage for low-latency and lightweight communication scenarios. Author(s) Gastón C. Hillar is an experienced IoT developer and author with a deep understanding of IoT protocols and technologies. With years of practical experience in designing and deploying secure IoT systems, Gastón specializes in breaking down complex topics into digestible and actionable insights. Through his books, he aims to empower developers to effectively integrate IoT technologies into their work. Who is it for? The book is tailored for software developers and engineers who are looking to integrate MQTT into their IoT solutions. It's ideal for individuals with pre-existing knowledge in IoT concepts who want to deepen their understanding of MQTT. Readers seeking to secure, optimize, and utilize MQTT for communication and automation tasks will find it especially useful. It's a perfect fit for those working with Python, Java, and web technologies in IoT contexts.
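
In the spirit of the book's client examples, here is a minimal publish/subscribe sketch with the paho-mqtt package (broker host, topic names, and QoS level are illustrative, and the callback signatures assume the paho-mqtt 1.x API):

```python
# Subscribe to a wildcard sensor topic and publish one reading to it.
import paho.mqtt.client as mqtt

def on_connect(client, userdata, flags, rc):
    print("connected, rc =", rc)
    # '+' matches one topic level, e.g. sensors/room1/temperature
    client.subscribe("sensors/+/temperature", qos=1)

def on_message(client, userdata, msg):
    print(msg.topic, msg.payload.decode())

client = mqtt.Client()
client.on_connect = on_connect
client.on_message = on_message
client.connect("test.mosquitto.org", 1883, keepalive=60)

client.publish("sensors/room1/temperature", "21.5", qos=1)
client.loop_forever()  # runs until interrupted
```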

Python: Data Analytics and Visualization

Understand, evaluate, and visualize data About This Book Learn basic steps of data analysis and how to use Python and its packages A step-by-step guide to predictive modeling including tips, tricks, and best practices Effectively visualize a broad set of analyzed data and generate effective results Who This Book Is For This book is for Python Developers who are keen to get into data analysis and wish to visualize their analyzed data in a more efficient and insightful manner. What You Will Learn Get acquainted with NumPy and use arrays and array-oriented computing in data analysis Process and analyze data using the time-series capabilities of Pandas Understand the statistical and mathematical concepts behind predictive analytics algorithms Data visualization with Matplotlib Interactive plotting with NumPy, Scipy, and MKL functions Build financial models using Monte-Carlo simulations Create directed graphs and multi-graphs Advanced visualization with D3 In Detail You will start the course with an introduction to the principles of data analysis and supported libraries, along with NumPy basics for statistics and data processing. Next, you will overview the Pandas package and use its powerful features to solve data-processing problems. Moving on, you will get a brief overview of the Matplotlib API .Next, you will learn to manipulate time and data structures, and load and store data in a file or database using Python packages. You will learn how to apply powerful packages in Python to process raw data into pure and helpful data using examples. You will also get a brief overview of machine learning algorithms, that is, applying data analysis results to make decisions or building helpful products such as recommendations and predictions using Scikit-learn. After this, you will move on to a data analytics specialization - predictive analytics. Social media and IOT have resulted in an avalanche of data. You will get started with predictive analytics using Python. You will see how to create predictive models from data. You will get balanced information on statistical and mathematical concepts, and implement them in Python using libraries such as Pandas, scikit-learn, and NumPy. You'll learn more about the best predictive modeling algorithms such as Linear Regression, Decision Tree, and Logistic Regression. Finally, you will master best practices in predictive modeling. After this, you will get all the practical guidance you need to help you on the journey to effective data visualization. Starting with a chapter on data frameworks, which explains the transformation of data into information and eventually knowledge, this path subsequently cover the complete visualization process using the most popular Python libraries with working examples This Learning Path combines some of the best that Packt has to offer in one complete, curated package. It includes content from the following Packt products: Getting Started with Python Data Analysis, Phuong Vo.T.H &Martin Czygan Learning Predictive Analytics with Python, Ashish Kumar Mastering Python Data Visualization, Kirthi Raman Style and approach The course acts as a step-by-step guide to get you familiar with data analysis and the libraries supported by Python with the help of real-world examples and datasets. It also helps you gain practical insights into predictive modeling by implementing predictive-analytics algorithms on public datasets with Python. The course offers a wealth of practical guidance to help you on this journey to data visualization