talk-data.com

Topic: Big Data

Tags: data_processing, analytics, large_datasets

1217 tagged

Activity Trend: 28 peak/qtr (2020-Q1 to 2026-Q1)

Activities: 1217 activities · Newest first

In this podcast, Alex Wissner-Gross from Gemedy sat down with Vishal Kumar from AnalyticsWeek to discuss the convoluted world of intelligence in artificial intelligence. He shares deep insights into what machines perceive as intelligence and how to evaluate the current unfolding of AI capabilities. This podcast is a must-listen for anyone who wishes to understand what AI is all about.

Timeline: 0:28 Alex's journey. 7:20 Alex's role in Gemedy. 9:19 Physics of AI. 12:05 General use cases for distribution of AI capabilities. 15:00 State of AI. 20:03 Defining intelligence. 23:42 Maximum freedom of action. 28:12 Intelligence and maximizing future freedom of action. 30:10 Maximum freedom of action and maximizing impact. 31:45 Thoughts on deep learning. 36:55 Data sets or data models? 39:27 AI in the context of business? 44:08 AI and the protection of human interests. 48:40 AI that ensures the employability of humans. 51:11 Advice for businesses to get started with AI. 59:01 Alex's ingredients to success. 1:01:16 Alex's favorite reads. 1:04:26 Key takeaways.

Alex's Recommended Reads: Accelerando (Singularity) by Charles Stross https://amzn.to/2GDkBUl Diaspora: A Novel by Greg Egan https://amzn.to/2s1GF5L Rainbows End: A Novel with One Foot in the Future by Vernor Vinge https://amzn.to/2J3oarQ

Podcast Link: https://futureofdata.org/alexwg-on-unwrapping-intelligence-in-artificialintelligence-futureofdata/

Alex's BIO: Dr. Alexander D. Wissner-Gross is an award-winning scientist, engineer, entrepreneur, investor, and author. He serves as President and Chief Scientist of Gemedy and holds academic appointments at Harvard and MIT. He has received 125 major distinctions, authored 18 publications, been granted 24 issued, pending, and provisional patents, and founded, managed, and advised 4 technology companies that were acquired for a combined value of over $600 million. In 1998 and 1999, he won the USA Computer Olympiad and the Intel Science Talent Search. In 2003, he became the last person in MIT history to receive a triple major, with bachelor's degrees in Physics, Electrical Science and Engineering, and Mathematics, while graduating first in his class from the MIT School of Engineering. In 2007, he completed his Ph.D. in Physics at Harvard, where his research on neuromorphic computing, machine learning, and programmable matter was awarded the Hertz Doctoral Thesis Prize. A thought leader in artificial intelligence, he is a contributing author of the New York Times Science Bestseller This Idea Must Die and the Amazon #1 New Release What to Think About Machines That Think. A popular TED speaker, his talks have been viewed more than 2 million times and translated into 27 languages. His work has been featured in more than 200 press outlets worldwide, including The Wall Street Journal, BusinessWeek, CNN, USA Today, and Wired.

About #Podcast:

The FutureOfData podcast is a conversation starter, bringing leaders, influencers, and leading practitioners together to discuss their journeys toward creating the data-driven future.

Wanna Join? If you or anyone you know wants to join in, register your interest @ http://play.analyticsweek.com/guest/

Want to sponsor? Email us @ [email protected]

Keywords:

#FutureOfData #DataAnalytics #Leadership #Podcast #BigData #Strategy

In this podcast, Justin Borgman talks about his journey of starting a data science startup, making an exit, and jumping into another one. The session is filled with insights for leaders looking for entrepreneurial wisdom to get started on a data-driven journey.

Timeline: 0:28 Justin's journey. 3:22 Taking the plunge to start a new company. 5:49 Perception vs. reality of starting a data warehouse company. 8:15 Bringing in something new to the IT legacy. 13:20 Getting your first few customers. 16:16 The right moment for a data warehouse company to look for a new venture. 18:20 The right person to have as a co-founder. 20:29 Advantages of going seed vs. series A. 22:13 When is a company ready for seed or series A? 24:40 Who's a good adviser? 26:35 Exiting Teradata. 28:54 Teradata to starting a new company. 31:24 The excitement of starting something from scratch. 32:24 What is Starburst? 37:15 Presto, a great engine for cloud platforms. 40:30 How can a company get started with Presto? 41:50 Health of enterprise data. 44:15 Where does Presto not fit in? 45:19 Future of enterprise data. 46:36 Drawing parallels between the proprietary space and the open source space. 49:02 Does aligning with open source give a company a better chance at seed funding? 51:44 Justin's ingredients for success. 54:05 Justin's favorite reads. 55:01 Key takeaways.

Justin's Recommended Read: The Outsiders by S. E. Hinton amzn.to/2Ai84Gl

Podcast Link: https://futureofdata.org/running-a-data-science-startup-one-decision-at-a-time-futureofdata-podcast/

Justin's BIO: Justin has spent the better part of a decade in senior executive roles building new businesses in the data warehousing and analytics space. Before co-founding Starburst, Justin was Vice President and General Manager at Teradata (NYSE: TDC), where he was responsible for the company’s portfolio of Hadoop products. Prior to joining Teradata, Justin was co-founder and CEO of Hadapt, the pioneering "SQL-on-Hadoop" company that transformed Hadoop from a file system into an analytic database accessible to anyone with a BI tool. Teradata acquired Hadapt in 2014.

Justin earned a BS in Computer Science from the University of Massachusetts at Amherst and an MBA from the Yale School of Management.


In this episode, Wayne Eckerson and Jeff Magnusson discuss the data architecture Stitch Fix created to support its data science workloads, as well as the need to balance man and machine and art and science.

Magnusson is the vice president of data platform at Stitch Fix. He leads a team responsible for building the data platform that supports the company's team of 80+ data scientists, as well as other business users. That platform is designed to facilitate self-service among data scientists and promote the velocity and innovation that differentiate Stitch Fix in the marketplace. Before Stitch Fix, Magnusson managed the data platform architecture team at Netflix, where he helped design and open-source many of the components of the Hadoop-based infrastructure and big data platform.

In this podcast, Marc Rind from ADP talked about big data in HR. He shared some of the best practices and opportunities that reside in HR data. Marc also shared tactical steps that help build better data-driven teams to execute data-driven strategies. This podcast is great for folks looking to explore the depth of HR data and the opportunities found in it.

Timeline: 0:28 Marc's journey. 4:50 Marc's typical day. 7:23 Data use cases in ADP. 11:20 Driving innovation and thought leadership. 15:15 Creating awareness of the necessity for innovation. 18:54 Listening skills as key to innovation. 20:25 HR's role in the time of automation. 27:45 Product development and data science. 30:36 Working on a client analytics platform. 34:41 Team building. 37:52 Tips for established businesses to get started with data. 41:20 Data opportunities for entrepreneurs in the HR space. 43:23 Marc's ingredients for success. 46:35 Marc's reading list. 48:35 Key takeaways.

Podcast Link: https://futureofdata.org/understanding-bigdata-bigopportunity-in-hr-marcrind-futureofdata/

Marc's BIO: Marc is responsible for leading the research and development of Automatic Data Processing’s (ADP’s) Analytics and Big Data initiative. In this capacity, Marc drives the innovation and thought leadership behind ADP’s Client Analytics platform. ADP Analytics gives clients not only the ability to read the pulse of their own human capital, but also information on how they stack up within their industry, along with the best courses of action to achieve their goals through quantifiable insights.

Marc was also an instrumental leader behind the small-business payroll platform RUN Powered by ADP®. Marc leads a number of the technology teams responsible for delivering this critically acclaimed product, focused on an innovative user experience for small business owners.

Prior to joining ADP, Marc’s innovative spirit and fascination with data were forged at Bolt Media, a dot-com start-up based in NY’s “Silicon Alley”. The company was an early predecessor to today’s social media outlets. As an early ‘Data Scientist,’ Marc focused on the patterns and predictions of site usage by harnessing the data in its 10+ million user profiles.


Analytics and Big Data for Accountants

Analytics is the new force driving business. Tools have been created to measure program impacts and ROI, visualize data and business processes, and uncover the relationship between key performance indicators, many using the unprecedented amount of data now flowing into organizations. Featuring updated examples and surveys, this dynamic book covers leading-edge topics in analytics and finance. It is packed with useful tips and practical guidance you can apply immediately. This book prepares accountants to: Deal with major trends in predictive analytics, optimization, correlation of metrics, and big data. Interpret and manage new trends in analytics techniques affecting your organization. Use new tools for data analytics. Critically interpret analytics reports and advise decision makers.

In this podcast, @JohnNives discusses ways to demystify AI for the enterprise. He shares his perspective on how businesses should engage with AI and some of the best practices and considerations for adopting AI into their strategic roadmaps. This podcast is great for anyone seeking to learn how to adopt AI in the enterprise landscape.

Timeline: 0:28 John's journey. 6:50 John's current role. 9:40 The role of a chief digital officer. 11:16 The current trend of AI. 13:52 AI hype or real? 16:42 Why AI now? 19:03 Demystifying deep learning. 23:35 Enterprise use cases of AI. 28:25 Attributes of a successful AI project. 32:20 Best AI investments in an enterprise. 36:56 Convincing leadership to adopt AI. 39:20 Organizational implications of adopting AI. 43:45 What do executives get wrong about AI? 48:36 Tips for executives to understand the AI landscape. 53:11 John's favorite reads. 57:35 Closing remarks.

John's Recommended Listens: FutureOfData Podcast math.im/itunes; War and Peace by Leo Tolstoy, narrated by Frederick Davidson (Blackstone Audio) amzn.to/2w7ObkI

Podcast Link: https://futureofdata.org/johnnives-on-ways-to-demystify-ai-for-enterprise/

John's BIO: Jean-Louis (John) Nives serves as Chief Digital Officer and the Global Chair of the Digital Transformation practice at N2Growth. Prior to joining N2Growth, Mr. Nives was at IBM Global Business Services, within the Watson and Analytics Center of Competence. There he worked on Cognitive Digital Transformation projects related to Watson, Big Data, Analytics, Social Business, and Marketing/Advertising Technology. Examples include CognitiveTV and the application of external unstructured data (social, weather, etc.) for business transformation. Prior relevant experience includes executive leadership positions at Nielsen, IRI, and Kraft, and two successful advertising technology acquisitions (AppNexus and SintecMedia). In this capacity, Jean-Louis combined information, analytics, and technology to create significant business value in transformative ways. Jean-Louis earned a Bachelor’s Degree in Industrial Engineering from University at Buffalo and an MBA in Finance and Computer Science from Pace University. He is married with four children and lives in the New York City area.


Infographics Powered by SAS

Create compelling business infographics with SAS and familiar office productivity tools. A picture is worth a thousand words, but what if there are a billion words? When analyzing big data, you need a picture that cuts through the noise. This is where infographics come in. Infographics are a representation of information in a graphic format designed to make the data easily understandable. With infographics, you don’t need deep knowledge of the data. The infographic combines storytelling with data and provides the user with an approachable entry point into business data. Infographics Powered by SAS: Data Visualization Techniques for Business Reporting shows you how to create graphics to communicate information and insight from big data in the boardroom and on social media. Learn how to create business infographics for all occasions with SAS, and how to build a workflow that lets you get the most from your SAS system without having to code anything, unless you want to! This book combines the perfect blend of creative freedom and data governance that comes from leveraging the power of SAS and the familiarity of Microsoft Office. Topics covered in this book include: SAS Visual Analytics, SAS Office Analytics, SAS/GRAPH software (SAS code examples), data visualization with SAS, creating reports with SAS, using reports and graphs from SAS to create business presentations, and using SAS within Microsoft Office.

A Deep Dive into NoSQL Databases: The Use Cases and Applications

A Deep Dive into NoSQL Databases: The Use Cases and Applications, Volume 109, the latest release in the Advances in Computers series (first published in 1960), presents detailed coverage of innovations in computer hardware, software, theory, design, and applications. In addition, it provides contributors with a medium in which they can explore their subjects in greater depth and breadth. This update includes sections on NoSQL and NewSQL databases for big data analytics and distributed computing, NewSQL databases and scalable in-memory analytics, a NoSQL web crawler application, NoSQL security, a comparative study of different in-memory (No/New)SQL databases, a hands-on treatment of four NoSQL databases, the Hadoop ecosystem, and more. The book provides a comprehensive yet compact treatment of the popular domain of NoSQL databases for IT professionals, practitioners, and professors; articulates how big data analytics is simplified and streamlined by NoSQL database systems; and sets a stimulating foundation, with all the relevant details, for NoSQL database researchers, developers, and administrators.

In this podcast, Wayne Eckerson and Joe Caserta discuss data migration, compare cloud offerings from Amazon, Google, and Microsoft, and define and explain artificial intelligence.

You can contact Caserta by visiting caserta.com or by sending him an email at [email protected]. Follow him on Twitter @joe_caserta.

Caserta is President of a New York City-based consulting firm he founded in 2001 and a longtime data guy. In 2004, Joe teamed up with data warehousing legend Ralph Kimball to write the book The Data Warehouse ETL Toolkit. Today he is one of the leading authorities on big data implementations. This makes Joe one of the few individuals with in-the-trenches experience on both sides of the data divide: traditional data warehousing on relational databases and big data implementations on Hadoop and the cloud.

Summary

The rate of change in the data engineering industry is alternately exciting and exhausting. Joe Crobak found his way into the work of data management by accident, as so many of us do. After becoming engrossed in researching the details of distributed systems and big data management for his work, he began sharing his findings with friends. This led to his creation of the Hadoop Weekly newsletter, which he recently rebranded as the Data Engineering Weekly newsletter. In this episode he discusses his experiences working as a data engineer in industry and at the USDS, his motivations and methods for creating a newsletter, and the insights that he has gleaned from it.

Preamble

Hello and welcome to the Data Engineering Podcast, the show about modern data management. When you’re ready to build your next pipeline you’ll need somewhere to deploy it, so check out Linode. With private networking, shared block storage, node balancers, and a 40Gbit network, all controlled by a brand new API, you’ve got everything you need to run a bullet-proof data platform. Go to dataengineeringpodcast.com/linode to get a $20 credit and launch a new server in under a minute. Go to dataengineeringpodcast.com to subscribe to the show, sign up for the newsletter, read the show notes, and get in touch. Your host is Tobias Macey, and today I’m interviewing Joe Crobak about his work maintaining the Data Engineering Weekly newsletter and the challenges of keeping up with the data engineering industry.

Interview

Introduction How did you get involved in the area of data management? What are some of the projects that you have been involved in that were most personally fulfilling?

As an engineer at the USDS working on the healthcare.gov and Medicare systems, what were some of the approaches that you used to manage sensitive data? Healthcare.gov has a storied history; how did the systems for processing and managing the data get architected to handle the amount of load they were subjected to?

What was your motivation for starting a newsletter about the Hadoop space?

Can you speak to your reasoning for the recent rebranding of the newsletter?

How much of the content that you surface in your newsletter is found during your day-to-day work, versus explicitly searching for it? After over 5 years of following the trends in data analytics and data infrastructure what are some of the most interesting or surprising developments?

What have you found to be the fundamental skills or areas of experience that have maintained relevance as new technologies in data engineering have emerged?

What is your workflow for finding and curating the content that goes into your newsletter? What is your personal algorithm for filtering which articles, tools, or commentary gets added to the final newsletter? How has your experience managing the newsletter influenced your areas of focus in your work and vice-versa? What are your plans going forward?

Contact Info

Data Eng Weekly Email Twitter – @joecrobak Twitter – @dataengweekly

Parting Question

From your perspective, what is the biggest gap in the tooling or technology for data management today?

Links

USDS National Labs Cray Amazon EMR (Elastic Map-Reduce) Recommendation Engine Netflix Prize Hadoop Cloudera Puppet healthcare.gov Medicare Quality Payment Program HIPAA NIST National Institute of Standards and Technology PII (Personally Identifiable Information) Threat Modeling Apache JBoss Apache Web Server MarkLogic JMS (Java Message Service) Load Balancer COBOL Hadoop Weekly Data Engineering Weekly Foursquare NiFi Kubernetes Spark Flink Stream Processing DataStax RSS The Flavors of Data Science and Engineering CQRS Change Data Capture Jay Kreps

The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA Support Data Engineering Podcast

Creating a Data-Driven Enterprise in Media

The data-driven revolution is finally hitting the media and entertainment industry. For decades, broadcast television and print media relied on traditional delivery channels for solvency and growth, but those channels fragmented as cable, streaming, and digital devices stole the show. In this ebook, you’ll learn about the trends, challenges, and opportunities facing players in this industry as they tackle big data, advanced analytics, and DataOps. You’ll explore best practices and lessons learned from three real-world media companies—Sling TV, Turner Broadcasting, and Comcast—as they proceed on their data-driven journeys. Along the way, authors Ashish Thusoo and Joydeep Sen Sarma explain how DataOps breaks down silos and connects everyone who handles data, including engineers, data scientists, analysts, and business users. Big-data-as-a-service provider Qubole offers a five-step maturity model that outlines the phases a company typically goes through when it first encounters big data. Case studies include: Sling TV: this live streaming content platform delivers live TV and on-demand entertainment instantly to a variety of smart televisions, tablets, game consoles, computers, smartphones, and streaming devices. Turner Broadcasting System: this Time Warner division recently created the Turner Data Cloud to support direct-to-consumer services, including FilmStruck, Boom (for kids), and NBA League Pass. Comcast: the largest broadcasting and cable TV company is building a single integrated big data platform to deliver internet, TV, and voice to more than 28 million customers.

In this podcast, Wayne Eckerson and James Serra discuss myths of modern data management. Some of the myths discussed include 'all you need is a data lake', 'the data warehouse is dead', 'we don’t need OLAP cubes anymore', 'cloud is too expensive and latency is too slow', 'you should always use a NoSQL product over a RDBMS.'

Serra is a big data and data warehousing solutions architect at Microsoft with over thirty years of IT experience. He is a popular blogger and speaker and has presented at dozens of Microsoft PASS and other events. Prior to Microsoft, Serra was an independent data warehousing and business intelligence architect and developer.

Modern Big Data Processing with Hadoop

Delve into the world of big data with 'Modern Big Data Processing with Hadoop.' This comprehensive guide introduces you to the powerful capabilities of Apache Hadoop and its ecosystem for solving data processing and analytics challenges. By the end, you will have mastered the techniques necessary to architect innovative, scalable, and efficient big data solutions. What this book will help you do: Master the principles of building an enterprise-level big data strategy with Apache Hadoop. Learn to integrate Hadoop with tools such as Apache Spark, Elasticsearch, and more for comprehensive solutions. Set up and manage your big data architecture, including deployment on cloud platforms with Apache Ambari. Develop real-time data pipelines and enterprise search solutions. Leverage advanced visualization tools like Apache Superset to make sense of data insights. About the authors: Patil, Kumar, and Shindgikar are experienced big data professionals and accomplished authors. With years of hands-on experience implementing and managing Apache Hadoop systems, they bring a depth of expertise to their writing. Their dedication lies in making complex technical concepts accessible while demonstrating real-world best practices. Who is it for? This book is designed for data professionals aiming to advance their expertise in big data solutions using Apache Hadoop. Ideal readers include engineers and project managers involved in data architecture and those aspiring to become big data architects. Some prior exposure to big data systems is helpful to fully benefit from this book's insights and tutorials.

In this podcast, Amy Gershkoff (@amygershkoff) talks about the ingredients of a successful data science team. Amy sheds light on the challenges of building a successful team and how businesses can align themselves to get the maximum out of their data science practice. Amy discusses tricks, tips, and easy-to-execute strategies to keep the data science team and practice at the top of its efficiency. This is a great session for anyone who wants to be part of a winning and thriving data science practice within their organization.

Timeline:

0:29 Amy's journey. 8:39 Working on Obama's campaign. 15:35 Getting started with a data project. 20:39 First steps for creating a data science team. 27:53 Hiring a data scientist recruiter. 33:00 Building an internal data science workforce. 40:00 Hiring the right data scientist. 42:36 Tips for a data scientist to become a good hire. 44:42 Leadership getting educated in data science. 48:05 How to build diversity in the data science field. 52:52 Being bias-free. 54:20 Amy's reading list. 56:06 Key takeaways.

YouTube: https://youtu.be/0PBK5dfQaUk iTunes: http://apple.co/2zMLByT

Podcast Link: https://futureofdata.org/amygershkoff-on-building-winning-datascience-team/

Amy's BIO: Dr. Amy Gershkoff consults and advises technology companies across the globe. She is the former Chief Data Officer for Ancestry, the world's leading genealogy and consumer genomics company. Prior to joining Ancestry, she was Chief Data Officer at Zynga. Previously, Amy built and led the Customer Analytics & Insights team and led the Global Data Science team at eBay. She has also served as the Chief Data Scientist for WPP's Data Alliance, where she worked across WPP’s more than 350 operating companies worldwide to create integrated data and technology solutions. She was also the Head of Media Planning at Obama for America 2012, where she was the architect of Obama’s advertising strategy and designed the campaign's analytics systems.


In this podcast, Stephen Gatchell (@stephengatchell) from @Dell talks about the ingredients of a successful data scientist. He sheds light on the importance of data governance and compliance in defining a robust data science strategy. He suggests tactical steps that executives can take in starting their journey toward a robust governance framework, and talks about how to take the fear out of governance. He gives insights into things leaders can do today to build robust data science teams and frameworks. This podcast is great for leaders seeking tactical insights into building a robust data science framework.

Timeline:

0:29 Stephen's journey. 4:45 Dell's customer experience journey. 7:39 Suggestions for a startup in regard to customer experience. 12:02 Building a center of excellence around data. 15:29 Data ownership. 19:18 Fixing data governance. 24:02 Fixing the data culture. 29:40 Distributed data ownership and data lakes. 32:50 Understanding data lakes. 35:50 Common pitfalls and opportunities in data governance. 38:50 Pleasant surprises in data governance. 41:30 Ideal data team. 44:04 Hiring the right candidates for data excellence. 46:13 How do I know the "why"? 49:05 Stephen's success mantra. 50:56 Stephen's best read.

Stephen's Recommended Read: Big Data MBA: Driving Business Strategies with Data Science by Bill Schmarzo http://amzn.to/2HWjOyT

Podcast Link: https://futureofdata.org/want-to-fix-datascience-fix-governance-by-stephengatchell-futureofdata/

Stephen's BIO: Stephen is currently Chief Data Officer, Engineering & Data Lake, at Dell and serves on the Dell Information Quality Governance Office and the Dell IT Technology Advisory Board, developing Dell’s corporate strategies for the Business Data Lake, Advanced Analytics, and Information Asset Management. Stephen also serves as a Customer Insight Analyst for the Chief Technology Office, analyzing customer technology challenges and requirements. Stephen has been awarded the People’s Choice Award by the Dell Total Customer Experience Team for the Data Governance and Business Data Lake project, as well as being a Chief Technology Officer Innovation finalist for utilizing advanced analytics for customer configurations, improving product development and product test coverage. Prior to his current role, Stephen managed Dell’s Global Product Development Lab Operations team, developing internal cloud orchestration and automation environments; was an Information Systems Executive at IBM, leading acquisition conversion efforts; and was VP of Enterprise Systems and Operations, managing mission-critical information systems for Telelogic (a Swedish public software firm). Stephen has an MBA from Southern New Hampshire University and a BSBA and an AS in Finance from Northeastern University.


Cleaning Up the Data Lake with an Operational Data Hub

The data lake was once heralded as the answer to the flood of big data that arrived in a variety of structured and unstructured formats. But, due to the ease of integration and the lack of governance, data lakes in many companies have devolved into unusable data swamps. This short ebook shows you how to solve this problem using an Operational Data Hub (ODH) to collect, store, index, cleanse, harmonize, and master data of all shapes and formats. Gerhard Ungerer—CTO and co-founder of Random Bit LLC—explains how the ODH supports transactional integrity so that the hub can serve as an integration point for enterprise applications. You’ll also learn how the ODH helps you leverage the investment in your data lake (or swamp), so that the data trapped there can finally be ingested, processed, and provisioned. With this ebook, you’ll learn how an ODH: Allows you to focus on categorizing data for easy and fast retrieval Delivers flexible storage models; support for indexing, scripting, and automation; query capabilities; transactional integrity; and security Includes a governance model to help you access, ingest, harmonize, materialize, provision, and consume data
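The collect-harmonize-master flow the ebook describes can be sketched generically. The snippet below is a minimal illustration in plain Python, not MarkLogic or any specific ODH product, and every source name and field name in it is hypothetical:

```python
# Minimal sketch of harmonization in an operational data hub:
# differently shaped source records are mapped onto one canonical
# schema and indexed by a shared key. All names are illustrative only.

raw_records = [
    {"source": "crm", "cust_id": "C-1", "full_name": "Ada Lovelace"},
    {"source": "erp", "customer": {"id": "C-1", "name": "A. Lovelace"}},
    {"source": "web", "uid": "C-2", "display_name": "Alan Turing"},
]

def harmonize(rec):
    """Map a source-specific record onto the canonical {id, name} shape."""
    if rec["source"] == "crm":
        return {"id": rec["cust_id"], "name": rec["full_name"]}
    if rec["source"] == "erp":
        return {"id": rec["customer"]["id"], "name": rec["customer"]["name"]}
    if rec["source"] == "web":
        return {"id": rec["uid"], "name": rec["display_name"]}
    raise ValueError(f"unknown source: {rec['source']}")

# Index harmonized records by id; a real hub would master/merge the
# multiple views of the same entity collected here.
hub = {}
for rec in raw_records:
    canonical = harmonize(rec)
    hub.setdefault(canonical["id"], []).append(canonical)
```

The point of the sketch is the governance benefit claimed above: once every source maps through one canonical shape, downstream consumers query the hub index rather than each raw feed.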

Gaining Data Agility with Multi-Model Databases

Most organizations realize that their future depends on the ability to quickly adapt to constant changes brought on by variable and complex environments. It's become increasingly clear that the core source behind these innovative solutions is data. Polyglot persistence refers to systems that provide many different types of data storage technologies to deal with this vast variability of data. Applications that need to access data from more than one store have to navigate an array of databases in a complex—and ultimately unsustainable—maze. One solution to this problem is readily available. In this ebook, consultant Joel Ruisi explains how a multi-model database enables you to take advantage of many different types of data models (and multiple schemas) in a single backend. With a multi-model database, companies can easily centralize, manage, and search all the data the IT system collects. The result is data agility: the ability to adapt to changing environments and serve users what they need when they need it. Through several detailed use cases, this ebook explains how multi-model databases enable you to: Store and manage multiple heterogeneous data sources Consolidate your data by bringing everything in "as is" Invisibly extend model features from one model to another Take a hybrid approach to analytical and operational data Enhance user search experience, including big data search Conduct queries across data models Offer SQL without relational constraints
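As a rough sketch of the multi-model idea above, the snippet below serves the same records through both a document-style lookup and a relational projection. It uses SQLite's JSON functions purely as a stand-in for a multi-model engine; the schema and all names are invented for illustration:

```python
import json
import sqlite3

# One store, two models: documents kept as JSON plus a flat relational
# projection over the same rows. Illustrative schema and data only.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE docs (id TEXT PRIMARY KEY, body TEXT)")
docs = {
    "o-1": {"customer": "Ada", "total": 120, "items": ["disk", "ram"]},
    "o-2": {"customer": "Alan", "total": 80, "items": ["cpu"]},
}
con.executemany("INSERT INTO docs VALUES (?, ?)",
                [(k, json.dumps(v)) for k, v in docs.items()])

# Document model: fetch the whole nested record by key.
row = con.execute("SELECT body FROM docs WHERE id = 'o-1'").fetchone()
order = json.loads(row[0])

# Relational model: a tabular projection queried across all documents.
totals = con.execute(
    "SELECT json_extract(body, '$.customer'), json_extract(body, '$.total') "
    "FROM docs ORDER BY 2 DESC").fetchall()
```

This is the data-agility claim in miniature: one backend, two access models, and no separate copy of the data per model.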

In this podcast, Ashok Srivastava (@aerotrekker) talks about how the key to creating a great data science practice runs through #PeopleDataTech, and he suggests how to handle unreasonable expectations of reasonable technologies. He shares his journey through culturally diverse organizations and how he successfully built a data science practice. He also discusses his role at Intuit and some of the AI/machine learning focus areas in that role. This podcast is a must for all data-driven leaders, strategists, and aspiring technologists tasked with growing their organization and building a robust data science practice.

Timeline:

0:29 Ashok's journey. 9:58 The role of a CDO at Intuit. 12:45 Ashok's secret to success working with diverse workforces. 15:42 Building a culture of data science. 19:03 Tactical strategies to convince the leadership about data. 22:03 Comparing a data officer and analytics officer. 24:09 Ownership of data. 27:33 Best practices for putting together a data team. 30:16 Best practices for a company to build a good data science practice. 32:40 Who's the ideal data science candidate? 35:17 Data citizens as data leaders. 37:47 Use cases of AI at Intuit. 39:55 Deciding which product deserves AI. 42:35 Disruptive nature of AI. 45:05 Ashok's success mantra. 46:56 Ashok's favorite reads. 49:15 Key takeaways.

Ashok's Recommended Read: Guns, Germs, and Steel: The Fates of Human Societies - Jared Diamond Ph.D. http://amzn.to/2C4bLMT Collapse: How Societies Choose to Fail or Succeed: Revised Edition - by Jared Diamond http://amzn.to/2C3Bu8f

Podcast Link: https://futureofdata.org/ashok-srivastavaaerotrekker-on-winning-the-art-of-datascience/

Ashok's BIO: Ashok N. Srivastava, Ph.D., is the Senior Vice President and Chief Data Officer at Intuit. He is responsible for setting the vision and direction for large-scale machine learning and AI across the enterprise to help power prosperity across the world. He is hiring hundreds of people in machine learning, AI, and related areas at all levels.

Previously, he was Vice President of Big Data and Artificial Intelligence Systems and the Chief Data Scientist at Verizon. He is an Adjunct Professor at Stanford in the Electrical Engineering Department and is the Editor-in-Chief of the AIAA Journal of Aerospace Information Systems. Ashok is a Fellow of the IEEE, the American Association for the Advancement of Science (AAAS), and the American Institute of Aeronautics and Astronautics (AIAA).

Ashok has a range of business experience, including serving as Senior Director at Blue Martini Software and Senior Consultant at IBM.

He has won numerous awards, including the Distinguished Engineering Alumni Award, the NASA Exceptional Achievement Medal, the IBM Golden Circle Award, the Department of Education Merit Fellowship, and several fellowships from the University of Colorado. Ashok holds a Ph.D. in Electrical Engineering from the University of Colorado at Boulder.

About #Podcast:

The FutureOfData podcast is a conversation starter that brings together leaders, influencers, and lead practitioners to discuss their journeys toward creating the data-driven future.

Wanna Join? If you or anyone you know wants to join in, register your interest @ http://play.analyticsweek.com/guest/

Want to sponsor? Email us @ [email protected]

Keywords:

#FutureOfData #DataAnalytics #Leadership #Podcast #BigData #Strategy

In this podcast, Bill Schmarzo talks about the ingredients of a successful data science practice, its team, and its executives. Bill shares his insights on what some industry leaders are doing and the challenges seen in successful deployments, along with his take on the key ingredients of successful hires. This podcast is great for growth-minded executives willing to learn about creating a successful data science practice.

Timeline: 0:29 Bill's journey. 5:05 Bill's current role. 7:04 Data science adoption challenges for businesses. 9:33 The good side of data science adoption. 11:22 How data science is changing business. 14:34 Strategies behind distributed IT. 18:35 Analyzing the current amount of data. 21:50 Who should own the idea of data science? 24:34 The right background for a CDO. 25:52 Bias in IT. 29:35 Hacks to keep yourself bias-free. 31:58 Team vs. tool for putting together a good data-driven practice. 34:54 Value cycle in data science. 37:10 Maturity model. 39:17 Convincing culture-heavy businesses to adopt data. 42:47 Keeping oneself sane during the technological disruption. 46:20 Hiring the right talent. 51:46 Ingredients of a good data science hire. 56:00 Bill's success mantra. 59:07 Bill's favorite reads. 1:00:36 Closing remarks.

Bill's Recommended Read: Moneyball: The Art of Winning an Unfair Game by Michael Lewis http://amzn.to/2FqBFg8 Big Data MBA: Driving Business Strategies with Data Science by Bill Schmarzo http://amzn.to/2tlZAvP

Podcast Link: https://futureofdata.org/schmarzo-dellemc-on-ingredients-of-healthy-datascience-practice-futureofdata-podcast/

Bill's BIO: Bill Schmarzo is the CTO of Dell EMC's Big Data Practice, where he is responsible for working with organizations to help them identify where and how to start their big data journeys. He has written several white papers, is an avid blogger, and is a frequent speaker on using big data and data science to power an organization's key business initiatives. He is a University of San Francisco School of Management Fellow, where he teaches the "Big Data MBA" course.

Bill has over three decades of experience in data warehousing, BI, and analytics. Bill authored EMC's Vision Workshop methodology that links an organization's strategic business initiatives with their supporting data and analytic requirements and co-authored with Ralph Kimball a series of articles on analytic applications. Bill has served on The Data Warehouse Institute's faculty as the head of the analytic applications curriculum.

Bill holds a master's degree in Business Administration from the University of Iowa and a Bachelor of Science degree in Mathematics, Computer Science, and Business Administration from Coe College.


In this episode, Wayne Eckerson and Jeff Magnusson discuss a self-service model for data science work and the role of a data platform in that environment. Magnusson also talks about Flotilla, a new open source API that makes it easy for data scientists to execute tasks on the data platform.

Magnusson is the vice president of data platform at Stitch Fix. He leads a team responsible for building the data platform that supports the company's team of 80+ data scientists, as well as other business users. That platform is designed to facilitate self-service among data scientists and promote velocity and innovation that differentiate Stitch Fix in the marketplace. Before Stitch Fix, Magnusson managed the data platform architecture team at Netflix where he helped design and open source many of the components of the Hadoop-based infrastructure and big data platform.