talk-data.com

Topic

Big Data

data_processing analytics large_datasets

1217 tagged

Activity Trend: 28 peak/qtr, 2020-Q1 to 2026-Q1

Activities

1217 activities · Newest first

Elasticsearch 5.x Cookbook - Third Edition

Elasticsearch 5.x Cookbook is a comprehensive guide that teaches you how to leverage the full power of Elasticsearch for high-performance search and analytics. Through step-by-step recipes, you'll explore deployment, query building, plugin integration, and advanced analytics, ensuring you can manage and scale Elasticsearch like a pro.

What this Book will help me do

Understand and deploy complex Elasticsearch cluster topologies for optimal performance.
Create tailored mappings to gain finer control over data indexing and retrieval.
Design and execute advanced queries and analytics using Elasticsearch capabilities.
Integrate Elasticsearch with popular programming languages and big data platforms.
Monitor and improve Elasticsearch cluster health using the best practices and tools.

Author(s)

Alberto Paro is a seasoned software engineer and data scientist with extensive experience in distributed systems and search technologies. Having worked on numerous search-related projects, he brings practical, real-world insights to his writing. Alberto is passionate about teaching and simplifying complex concepts, making this book both approachable and expertly detailed.

Who is it for?

This book is ideal for developers or data engineers seeking to utilize Elasticsearch for advanced search and analytics tasks. If you have some prior knowledge of JSON and programming concepts, particularly Java, you will benefit most from this material. Whether you're looking to integrate Elasticsearch into your systems or to optimize its usage, this book caters to your needs.

Why the increased merging of AdTech and MarTech is creating new types of (potential) harms; how the definitions of PII & pseudonymous data are evolving and why legislation is not lagging behind: it’s the technology people who can’t stop complaining while building rockets and self-driving cars!

HBase High Performance Cookbook

"HBase High Performance Cookbook" is your guide to mastering the optimization, scaling, and tuning of HBase systems. Covering everything from configuring HBase clusters to designing scalable table structures and performance tuning, this comprehensive book provides practical advice and strategies for leveraging HBase's full potential. By following this book's recipes, you'll supercharge your HBase expertise.

What this Book will help me do

Understand how to configure HBase for optimal performance, improving your data system's efficiency.
Learn to design table structures to maximize scalability and functionality in HBase.
Gain skills in performing CRUD operations and using advanced features like MapReduce within HBase.
Discover practices for integrating HBase with other technologies such as Elasticsearch.
Master the steps involved in setting up and optimizing HBase in cloud environments for enhanced performance.

Author(s)

Ruchir Choudhry is a seasoned data management professional with extensive experience in distributed database systems. He possesses deep expertise in HBase, Hadoop, and other big data technologies. His practical and engaging writing style aims to demystify complex technical topics, making them accessible to developers and architects alike.

Who is it for?

This book is tailored for developers and system architects looking to deepen their understanding of HBase. Whether you are experienced with other NoSQL databases or are new to HBase, this book provides extensive practical knowledge. It is ideal for professionals working in big data applications or those eager to optimize and scale their database systems effectively.

Summary

There is a vast constellation of tools and platforms for processing and analyzing your data. In this episode Matthew Rocklin talks about how Dask fills the gap between a task oriented workflow tool and an in memory processing framework, and how it brings the power of Python to bear on the problem of big data.

Preamble

Hello and welcome to the Data Engineering Podcast, the show about modern data infrastructure. Go to dataengineeringpodcast.com to subscribe to the show, sign up for the newsletter, read the show notes, and get in touch. You can help support the show by checking out the Patreon page, which is linked from the site. To help other people find the show, you can leave a review on iTunes or Google Play Music, and tell your friends and co-workers. Your host is Tobias Macey, and today I’m interviewing Matthew Rocklin about Dask and the Blaze ecosystem.

Interview with Matthew Rocklin

Introduction
How did you get involved in the area of data engineering?
Dask began its life as part of the Blaze project. Can you start by describing what Dask is and how it originated?
There are a vast number of tools in the field of data analytics. What are some of the specific use cases that Dask was built for that weren’t able to be solved by the existing options?
One of the compelling features of Dask is the fact that it is a Python library that allows for distributed computation at a scale that has largely been the exclusive domain of tools in the Hadoop ecosystem. Why do you think that the JVM has been the reigning platform in the data analytics space for so long?
Do you consider Dask, along with the larger Blaze ecosystem, to be a competitor to the Hadoop ecosystem, either now or in the future?
Are you seeing many Hadoop or Spark solutions being migrated to Dask? If so, what are the common reasons?
There is a strong focus for using Dask as a tool for interactive exploration of data. How does it compare to something like Apache Drill?
For anyone looking to integrate Dask into an existing code base that is already using NumPy or Pandas, what does that process look like?
How do the task graph capabilities compare to something like Airflow or Luigi?
Looking through the documentation for the graph specification in Dask, it appears that there is the potential to introduce cycles or other bugs into a large or complex task chain. Is there any built-in tooling to check for that before submitting the graph for execution?
What are some of the most interesting or unexpected projects that you have seen Dask used for?
What do you perceive as being the most relevant aspects of Dask for data engineering/data infrastructure practitioners, as compared to the end users of the systems that they support?
What are some of the most significant problems that you have been faced with, and which still need to be overcome in the Dask project?
I know that the work on Dask is largely performed under the umbrella of PyData and sponsored by Continuum Analytics. What are your thoughts on the financial landscape for open source data analytics and distributed computation frameworks as compared to the broader world of open source projects?

Keep in touch

@mrocklin on Twitter mrocklin on GitHub

Links

http://matthewrocklin.com/blog/work/2016/09/22/cluster-deployments?utm_source=rss&utm_medium=rss https://opendatascience.com/blog/dask-for-institutions/?utm_source=rss&utm_medium=rss Continuum Analytics 2sigma X-Array Tornado

Website Podcast Interview

Airflow Luigi Mesos Kubernetes Spark Dryad Yarn Read The Docs XData

The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA Support Data Engineering Podcast

As we step into the fourth industrial age, we must prepare our HR4.0 workforce to use new-age tools to fix employee engagement. We will build a case for the need for AI in HR and discuss some considerations for building a powerful and scalable system. We will spend some time discussing one of the ways TAO.ai is solving employee engagement and preparing powerful AI to work with new-age tools, talent, technologies, and techniques to empower workers with the best decision-making support system.

About #Podcast:

FutureOfData podcast is a conversation starter that brings leaders, influencers, and lead practitioners on the show to discuss their journey in creating the data-driven future.

Want to join? If you or anyone you know wants to join in, register your interest @ http://play.analyticsweek.com/guest/

Want to sponsor? Email us @ [email protected]

Keywords: FutureOfData Data Analytics Leadership Podcast Big Data Strategy

In this session, Juan F. Gorricho, Chief Data & Analytics Officer, PFCU / Walt Disney, sat with Vishal Kumar, CEO AnalyticsWeek and shared his journey as an analytics executive, best practices, hacks for upcoming executives, and some challenges/opportunities he's observing as a Chief Data & Analytics Officer. Juan discussed creating a data-driven culture and what leaders could do to get buy-ins for building strong data science capabilities.

Timeline: 0:29 Juan's journey. 4:57 Defining International society of chief data officers. 7:08 Joining International society of CDO. 7:45 Being in a credit union and being a CDO. 10:33 Hacks to creating a data-driven culture. 16:25 Being a partner of Walt Disney. 19:20 Data sharing with Disney. 21:50 Data officers vs. analytics officer. 25:59 Getting the leadership onboard on data. 30:44 The business decision making of a CDO at PFCU. 33:33 Collaboration with IT. 37:48 Challenges Juan faces in his current role. 45:03 Building data solutions or buying data solutions? 49:05 Advice for data leaders.

Podcast link: https://futureofdata.org/analyticsweek-leadership-podcast-with-juan-f-gorricho-disney/

Here's Juan F. Gorricho's bio: Juan F. Gorricho is currently the Chief Data & Analytics Officer for Partners Federal Credit Union. In this role, Juan leads the data and analytics strategy development and execution for Partners, one of the top credit unions in the country, exclusively serving the more than 100,000 cast members of The Walt Disney Company. Juan has more than 15 years of experience in the data and analytics space, including multiple speaking engagements. In his prior roles with Disney, Juan led multiple multimillion-dollar projects to implement business intelligence and analytical solutions for key lines of business such as Labor Operations and Merchandise. Juan has an Industrial Engineering degree from Universidad de los Andes in Bogotá, Colombia, and an MBA from the Darden Graduate School of Business Administration at the University of Virginia. Juan is married and lives with his wife and two children in Orlando, Florida, United States of America.

Follow @jgorricho

The podcast is sponsored by: TAO.ai(https://tao.ai), Artificial Intelligence Driven Career Coach

Strategies in Biomedical Data Science

An essential guide to healthcare data problems, sources, and solutions

Strategies in Biomedical Data Science provides medical professionals with much-needed guidance toward managing the increasing deluge of healthcare data. Beginning with a look at our current top-down methodologies, this book demonstrates the ways in which both technological development and more effective use of current resources can better serve both patient and payer. The discussion explores the aggregation of disparate data sources, current analytics and toolsets, the growing necessity of smart bioinformatics, and more as data science and biomedical science grow increasingly intertwined. You'll dig into the unknown challenges that come along with every advance, and explore the ways in which healthcare data management and technology will inform medicine, politics, and research in the not-so-distant future. Real-world use cases and clear examples are featured throughout, and coverage of data sources, problems, and potential mitigations provides necessary insight for forward-looking healthcare professionals.

Big Data has been a topic of discussion for some time, with much attention focused on problems and management issues surrounding truly staggering amounts of data. This book offers a lifeline through the tsunami of healthcare data, to help the medical community turn their data management problem into a solution.

Consider the data challenges personalized medicine entails
Explore the available advanced analytic resources and tools
Learn how bioinformatics as a service is quickly becoming reality
Examine the future of IoT and the deluge of personal device data

The sheer amount of healthcare data being generated will only increase as both biomedical research and clinical practice trend toward individualized, patient-specific care. Strategies in Biomedical Data Science provides expert insight into the kind of robust data management that is becoming increasingly critical as healthcare evolves.

Data Visualization, Volume I

Data visualization involves graphical and visual tools used in data analysis and decision making. The emphasis in this book is on recent trends and applications of visualization tools using conventional and big data. These tools are widely used in data visualization and quality improvement to analyze, enhance, and improve the quality of products and services. Data visualization is an easy way to obtain a first look at the data visually. The book provides a collection of visual and graphical tools widely used to gain an insight into the data before applying more complex analysis. The focus is on the key application areas of these tools including business process improvement, business data analysis, health care, finance, manufacturing, engineering, process improvement, and Lean Six Sigma. The key areas of application include data and data analysis concepts, recent trends in data visualization and “Big Data,” widely used charts and graphs and their applications, analysis of the relationships between two or more variables graphically using scatterplots, bubble graphs, matrix plots, etc., data visualization with big data, computer applications and implementation of widely used graphical and visual tools, and computer instructions to create the graphics presented along with the data files.

Pro Apache Phoenix: An SQL Driver for HBase, First Edition

Leverage Phoenix as an ANSI SQL engine built on top of the highly distributed and scalable NoSQL framework HBase. Learn the basics and best practices that are being adopted in Phoenix to enable a high write and read throughput in a big data space. This book includes real-world cases such as Internet of Things devices that send continuous streams to Phoenix, and the book explains how key features such as joins, indexes, transactions, and functions help you understand the simple, flexible, and powerful API that Phoenix provides. Examples are provided using real-time data and data-driven businesses that show you how to collect, analyze, and act in seconds.

Pro Apache Phoenix covers the nuances of setting up a distributed HBase cluster with Phoenix libraries, running performance benchmarks, configuring parameters for production scenarios, and viewing the results. The book also shows how Phoenix plays well with other key frameworks in the Hadoop ecosystem such as Apache Spark, Pig, Flume, and Sqoop.

You will learn how to:

Handle a petabyte data store by applying familiar SQL techniques
Store, analyze, and manipulate data in a NoSQL Hadoop ecosystem with HBase
Apply best practices while working with a scalable data store on Hadoop and HBase
Integrate popular frameworks (Apache Spark, Pig, Flume) to simplify big data analysis
Demonstrate real-time use cases and big data modeling techniques

Who This Book Is For

Data engineers, Big Data administrators, and architects

MATLAB Machine Learning

This book is a comprehensive guide to machine learning with worked examples in MATLAB. It starts with an overview of the history of Artificial Intelligence and automatic control and how the field of machine learning grew from these. It provides descriptions of all major areas in machine learning. The book reviews commercially available packages for machine learning and shows how they fit into the field. The book then shows how MATLAB can be used to solve machine learning problems and how MATLAB graphics can enhance the programmer’s understanding of the results and help users of their software grasp the results.

Machine Learning can be very mathematical. The mathematics for each area is introduced in a clear and concise form so that even casual readers can understand the math. Readers from all areas of engineering will see connections to what they know and will learn new technology. The book then provides complete solutions in MATLAB for several important problems in machine learning including face identification, autonomous driving, and data classification. Full source code is provided for all of the examples and applications in the book.

What you'll learn:

An overview of the field of machine learning
Commercial and open source packages in MATLAB
How to use MATLAB for programming and building machine learning applications
MATLAB graphics for machine learning
Practical real world examples in MATLAB for major applications of machine learning in big data

Who is this book for:

The primary audiences are engineers and engineering students wanting a comprehensive and practical introduction to machine learning.

Apache Spark for Data Science Cookbook

In "Apache Spark for Data Science Cookbook," you'll delve into solving real-world analytical challenges using the robust Apache Spark framework. This book features hands-on recipes that cover data analysis, distributed machine learning, and real-time data processing. You'll gain practical skills to process, visualize, and extract insights from large datasets efficiently.

What this Book will help me do

Master using Apache Spark for processing and analyzing large-scale datasets effectively.
Harness Spark's MLlib for implementing machine learning algorithms like classification and clustering.
Utilize libraries such as NumPy, SciPy, and Pandas in conjunction with Spark for numerical computations.
Apply techniques like Natural Language Processing and text mining using Spark-integrated tools.
Perform end-to-end data science workflows, including data exploration, modeling, and visualization.

Author(s)

Nagamallikarjuna Inelu and Chitturi bring their extensive experience working with data science and distributed computing frameworks like Apache Spark. Nagamallikarjuna specializes in applying machine learning algorithms to big data problems, while Chitturi has contributed to various big data system implementations. Together, they focus on providing practitioners with practical and efficient solutions.

Who is it for?

This book is primarily intended for novice and intermediate data scientists and analysts who are curious about using Apache Spark to tackle data science problems. Readers are expected to have some familiarity with basic data science tasks. If you want to learn practical applications of Spark in data analysis and enhance your big data analytics skills, this resource is for you.

Fast Data Processing Systems with SMACK Stack

Fast Data Processing Systems with SMACK Stack introduces you to the SMACK stack, a combination of Spark, Mesos, Akka, Cassandra, and Kafka. You will learn to integrate these technologies to build scalable, efficient, and real-time data processing platforms tailored for solving critical business challenges.

What this Book will help me do

Understand the concepts of fast data pipelines and design scalable architectures using the SMACK stack
Gain expertise in functional programming with Scala and leverage its power in data processing tasks
Build and optimize distributed databases using Apache Cassandra for scaling extensively
Deploy and manage real-time data streams using Apache Kafka to handle massive messaging workloads
Implement cost-effective cluster infrastructures with Apache Mesos for efficient resource utilization

Author(s)

Estrada is an expert in distributed systems and big data technologies. With years of experience implementing SMACK-based solutions across industries, Estrada offers a practical viewpoint to designing scalable systems. Their blend of theoretical knowledge and applied practices ensures readers receive actionable guidance.

Who is it for?

This book is perfect for software developers, data engineers, or data scientists looking to deepen their understanding of real-time data processing systems. If you have a foundational knowledge of the technologies in the SMACK stack or wish to learn how to combine these cutting-edge tools to solve complex problems, this is for you. Readers with an interest in building efficient big data solutions will find tremendous value here.

In this session, Scott Zoldi, Chief Analytics Officer, FICO, sat with Vishal Kumar, CEO AnalyticsWeek and shared his journey as an analytics executive, best practices, hacks for upcoming executives, and some challenges/opportunities he's observing as a Chief Analytics Officer. Scott discussed creating a data-driven culture and what leaders could do to get buy-in for building strong data science capabilities. Scott discussed his passion for security analytics. He shared some best practices for setting up a Cyber Security Center of Excellence. Scott also shared what traits future leaders should have.

Timeline:

0:29 Scott's journey. 5:10 On Falcon's fraud manager. 9:12 Area in security where AI works. 11:40 FICO's dealing with new products. 15:30 Centre of excellence for cyber security. 22:00 Should a center of excellence be inside out or in partnership? 28:22 The CAO role in FICO. 31:14 Is FICO in facing or out facing? 32:12 Being analytical in a gut-based organization. 35:54 Art of doing business and science of doing business. 38:22 Challenges as CAO in FICO. 41:09 Opportunity for data science in the security space. 45:54 Qualities required for a CAO. 48:54 Tips for a data scientist to get hired at FICO.

Podcast link: https://futureofdata.org/analyticsweek-leadership-podcast-with-scott-zoldi-cao-fico/

Here's Scott Zoldi's bio: Scott Zoldi is Chief Analytics Officer at FICO, responsible for the analytic development of FICO’s product and technology solutions, including the FICO™ Falcon® Fraud Manager product, which protects about two-thirds of the world’s payment card transactions from fraud. While at FICO, Scott has been responsible for authoring 72 analytic patents, with 36 granted and 36 in process. Scott is actively involved in developing new analytic products and Big Data analytics applications, many of which leverage new streaming artificial intelligence innovations such as adaptive analytics, collaborative profiling, and self-learning models. Scott is most recently focused on the applications of streaming self-learning analytics for real-time detection of Cyber Security attacks and Money Laundering. Scott serves on two boards of directors, including Software San Diego and Cyber Center of Excellence. Scott received his Ph.D. in theoretical physics from Duke University.

Follow @scottzoldi

The podcast is sponsored by: TAO.ai(https://tao.ai), Artificial Intelligence Driven Career Coach

Practical Data Science with Hadoop® and Spark: Designing and Building Effective Analytics at Scale

The Complete Guide to Data Science with Hadoop: For Technical Professionals, Businesspeople, and Students

Demand is soaring for professionals who can solve real data science problems with Hadoop and Spark. Practical Data Science with Hadoop® and Spark is your complete guide to doing just that. Drawing on immense experience with Hadoop and big data, three leading experts bring together everything you need: high-level concepts, deep-dive techniques, real-world use cases, practical applications, and hands-on tutorials.

The authors introduce the essentials of data science and the modern Hadoop ecosystem, explaining how Hadoop and Spark have evolved into an effective platform for solving data science problems at scale. In addition to comprehensive application coverage, the authors also provide useful guidance on the important steps of data ingestion, data munging, and visualization. Once the groundwork is in place, the authors focus on specific applications, including machine learning, predictive modeling for sentiment analysis, clustering for document analysis, anomaly detection, and natural language processing (NLP). This guide provides a strong technical foundation for those who want to do practical data science, and also presents business-driven guidance on how to apply Hadoop and Spark to optimize ROI of data science initiatives.

Learn

What data science is, how it has evolved, and how to plan a data science career
How data volume, variety, and velocity shape data science use cases
Hadoop and its ecosystem, including HDFS, MapReduce, YARN, and Spark
Data importation with Hive and Spark
Data quality, preprocessing, preparation, and modeling
Visualization: surfacing insights from huge data sets
Machine learning: classification, regression, clustering, and anomaly detection
Algorithms and Hadoop tools for predictive modeling
Cluster analysis and similarity functions
Large-scale anomaly detection
NLP: applying data science to human language

In this session, Mike Flowers, Chief Analytics Officer, Enigma, sat with Vishal Kumar, CEO AnalyticsWeek and shared his journey as an analytics executive, best practices, hacks for upcoming executives, and some challenges/opportunities he's observing as a Chief Analytics Officer. Mike discussed his journey from trial prosecutor to Chief Analytics Officer, sharing some great stories on how Govt. embraces data analytics.

Timeline: 0:29 Mike's journey. 23:32 Mike's role in Enigma. 27:46 The role of CAO in Enigma. 29:50 How much Mike's role is customer-facing vs. in facing. 30:00 Getting over the roadblocks of working with the government. 34:06 Creating a data bridge. 39:17 Collaboration in the data science field. 46:02 Challenges in working with Clients at Enigma. 51:34 Benefits of having a legal background before coming to data analytics.

Podcast link: https://futureofdata.org/enigma_io/

Here's Mike Flowers' bio: Mike is Chief Analytics Officer at New York City tech start-up Enigma, an operational data management and intelligence company, where he leads data scientists assisting the development and deployment of decision-support technologies to Fortune 500 clients in compliance, manufacturing, banking, and finance, and several U.S. and foreign government agencies. In addition, he is a Senior Fellow at Bloomberg Philanthropies, working with select U.S. city governments to launch sustainable analytics programs. Mike is also an advisor to numerous organizations in a wide variety of fields, including, for example, Weill Cornell Medical College, the Inter-American Development Bank, the Office of the New York State Comptroller, the Greater London Authority, the government of New South Wales, Australia, and the French national government.

From 2014-15, Mike was an Executive-in-Residence and the first MacArthur Urban Science Fellow at NYU’s Center for Urban Science and Progress, where he advised students and faculty on projects to advance data-driven decision-making in city government.

From 2009-2013, Mike served under Mayor Michael Bloomberg as New York City’s first Chief Analytics Officer. During his tenure, he founded the Mayor’s Office of Data Analytics, which provides quantitative support to the city’s public safety, public health, infrastructure development, finance, economic development, disaster preparedness and response, legislative, sustainability, and human services efforts. In addition, Mike designed and oversaw the implementation of NYC DataBridge, a first-of-its-kind citywide analytics platform that enables the sharing and analysis of city data across agencies and with the public, and he ran the implementation of the city’s internationally-recognized Open Data initiative. For this work, Mike was twice recognized by the White House for innovation.

Follow @mpflowersnyc

The podcast is sponsored by: TAO.ai(https://tao.ai), Artificial Intelligence Driven Career Coach

Visualizing Graph Data

Visualizing Graph Data teaches you not only how to build graph data structures, but also how to create your own dynamic and interactive visualizations using a variety of tools. This book is loaded with fascinating examples and case studies to show you the real-world value of graph visualizations.

About the Technology

Assume you are doing a great job collecting data about your customers and products. Are you able to turn your rich data into important insight? Complex relationships in large data sets can be difficult to recognize. Visualizing these connections as graphs makes it possible to see the patterns, so you can find meaning in an otherwise overwhelming sea of facts.

About the Book

Visualizing Graph Data teaches you how to understand graph data, build graph data structures, and create meaningful visualizations. This engaging book gently introduces graph data visualization through fascinating examples and compelling case studies. You'll discover simple, but effective, techniques to model your data, handle big data, and depict temporal and spatial data. By the end, you'll have a conceptual foundation as well as the practical skills to explore your own data with confidence.

What's Inside

Techniques for creating effective visualizations
Examples using the Gephi and KeyLines visualization packages
Real-world case studies

About the Reader

No prior experience with graph data is required.

About the Author

Corey Lanum has decades of experience building visualization and analysis applications for companies and government agencies around the globe.

Quotes

Shows you how to solve visualization problems and explore complex data sets. A pragmatic introduction.
- John D. Lewis, DDN

Excellent! Hands-on! Shows you how to kick-start your graph data visualization.
- Rocio Chongtay, University of Southern Denmark

A clear and concise guide to both graph theory and visualization.
- Jonathan Suever, PhD, Georgia Institute of Technology

Great coverage, with real-life business use cases.
- Sumit Pal, Big Data consultant

In this session, John Young, Chief Analytics Officer, Epsilon Data Management, sat with Vishal Kumar, CEO AnalyticsWeek and shared his journey to Chief Analytics Officer, life @ Epsilon, and discussed some challenges/opportunities faced by data-driven organizations, its executives and shared some best practices.

Timeline: 2:51 What's Epsilon? 5:12 John's journey. 9:24 The role of CAO in Epsilon. 12:12 How much John's role is in facing and out facing. 13:19 Best practices in data analytics at Epsilon. 16:15 Demarcating CDO and CAO. 19:52 Depth and breadth of decision making at Epsilon. 25:00 Dealing with clients of Epsilon. 28:48 Best data practices for businesses. 34:39 Build or buy data? 37:21 Creating a center of excellence with data. 40:01 Building a data team. 43:45 Tips for aspiring data analytics executives. 46:05 Art of doing business and science of doing business. 48:31 Closing remarks.

Podcast link: https://futureofdata.org/analyticsweek-leadership-podcast-with-john-young-epsilon-data-management/

Here's John's Bio: Mr. Young has general management responsibilities for the 150+ member Analytic Consulting Group at Epsilon. His responsibilities also include design and consultation on various database marketing analytic engagements, including predictive modeling, segmentation, measurement, and profiling. John also brings thought leadership on important marketing topics. John works with companies in numerous industries, including financial services, technology, retail, healthcare, and not-for-profit.

Before joining Epsilon in 1994, Mr. Young was a Marketing Research Manager at Digitas, a Market Research Manager at Citizens Bank, Research Manager at the AICPA, and an Assistant Economist at the Federal Reserve Bank of Kansas City.

Mr. Young has presented at numerous conferences, including NCDM Winter and Summer, DMA Annual, DMA Marketing Analytics, LIMRA Big Data Analytics, and Epsilon’s Client Symposiums. He has published in DM News, CRM Magazine’s Viewpoints, Chief Marketer, Loyalty 360, Colloquy, and serves on the advisory board of the DMA’s Analytics Community.

Mr. Young holds a B.S. and M.S. in Economics from Colorado State University, Fort Collins, Colorado.

The podcast is sponsored by: TAO.ai (https://tao.ai), Artificial Intelligence Driven Career Coach

About #Podcast:

FutureOfData podcast is a conversation starter to bring leaders, influencers, and lead practitioners to discuss their journey to create the data-driven future.

Want to Join? If you or anyone you know wants to join in, register your interest at http://play.analyticsweek.com/guest/

Want to sponsor? Email us @ [email protected]

Keywords:

#FutureOfData #DataAnalytics #Leadership #Podcast #BigData #Strategy

The Big Data Transformation

Business executives today are well aware of the power of data, especially for gaining actionable insight into products and services. But how do you jump into the big data analytics game without spending millions on data warehouse solutions you don’t need? This 40-page report focuses on massively parallel processing (MPP) analytical databases that enable you to run queries and dashboards on a variety of business metrics at extreme speed and exabyte scale. Because they leverage the full computational power of a cluster, MPP analytical databases can analyze massive volumes of data, both structured and semi-structured, at unprecedented speeds. This report presents five real-world case studies from Etsy, Cerner Corporation, Criteo and other global enterprises to focus on one big data analytics platform in particular, HPE Vertica.

You’ll discover:
How one prominent data storage company convinced both business and tech stakeholders to adopt an MPP analytical database
Why performance marketing technology company Criteo used a Center of Excellence (CoE) model to ensure the success of its big data analytics endeavors
How YPSM uses Vertica to speed up its Hadoop-based data processing environment
Why Cerner adopted an analytical database to scale its highly successful health information technology platform
How Etsy drives success with the company’s big data initiative by avoiding common technical and organizational mistakes

Beginning Hibernate: For Hibernate 5

Get started with the Hibernate 5 persistence layer and gain a clear introduction to the current standard for object-relational persistence in Java. This updated edition includes the new Hibernate 5.0 framework as well as coverage of NoSQL, MongoDB, and other related technologies, ranging from applications to big data. Beginning Hibernate is ideal if you're experienced in Java with databases (the traditional, or connected, approach), but new to open-source, lightweight Hibernate. The book keeps its focus on Hibernate without wasting time on nonessential third-party tools, so you'll be able to immediately start building transaction-based engines and applications. Experienced authors Joseph Ottinger with Dave Minter and Jeff Linwood provide more in-depth examples than any other book for Hibernate beginners. They present their material in a lively, example-based manner, not a dry, theoretical, hard-to-read fashion.

What You'll Learn
Build enterprise Java-based transaction-type applications that access complex data with Hibernate
Work with Hibernate 5 using a present-day build process
Use Java 8 features with Hibernate
Integrate into the persistence life cycle
Map using Java's annotations
Search and query with the new version of Hibernate
Integrate with MongoDB using NoSQL
Keep track of versioned data with Hibernate Envers

Who This Book Is For
Experienced Java developers interested in learning how to use and apply object-relational persistence in Java and who are new to the Hibernate persistence framework.

Business Analytics for Managers, 2nd Edition

The intensified use of data based on analytical models to control digitalized operational business processes in an intelligent way is a game changer that continuously disrupts more and more markets. This book exemplifies this development and shows the latest tools and advances in this field.

Business Analytics for Managers offers real-world guidance for organizations looking to leverage their data into a competitive advantage. This new second edition covers the advances that have revolutionized the field since the first edition's release; big data and real-time digitalized decision making have become major components of any analytics strategy, and new technologies are allowing businesses to gain even more insight from the ever-increasing influx of data. New terms, theories, and technologies are explained and discussed in terms of practical benefit, and the emphasis on forward thinking over historical data describes how analytics can drive better business planning. Coverage includes data warehousing, big data, social media, security, cloud technologies, and future trends, with expert insight on the practical aspects of the current state of the field.

Analytics helps businesses move forward. Extensive use of statistical and quantitative analysis alongside explanatory and predictive modeling facilitates fact-based decision making, and evolving technologies continue to streamline every step of the process. This book provides an essential update, and describes how today's tools make business analytics more valuable than ever.

Learn how Hadoop can upgrade your data processing and storage
Discover the many uses for social media data in analysis and communication
Get up to speed on the latest in cloud technologies, data security, and more
Prepare for emerging technologies and the future of business analytics

Most businesses are caught in a massive, non-stop stream of data. It can become one of your most valuable assets, or a never-ending flood of missed opportunity. Technology moves fast, and keeping up with the cutting edge is crucial for wringing even more value from your data. Business Analytics for Managers brings you up to date, and shows you what analytics can do for you now.