talk-data.com

Topic

Data Science

machine_learning statistics analytics

1516

tagged

Activity Trend: 68 peak/qtr, 2020-Q1 to 2026-Q2

Activities

1516 activities · Newest first

Doing Data Science

Now that people are aware that data can make the difference in an election or a business model, data science as an occupation is gaining ground. But how can you get started working in a wide-ranging, interdisciplinary field that’s so clouded in hype? This insightful book, based on Columbia University’s Introduction to Data Science class, tells you what you need to know. In many of these chapter-long lectures, data scientists from companies such as Google, Microsoft, and eBay share new algorithms, methods, and models by presenting case studies and the code they use. If you’re familiar with linear algebra, probability, and statistics, and have programming experience, this book is an ideal introduction to data science. Topics include: statistical inference, exploratory data analysis, and the data science process; algorithms; spam filters, Naive Bayes, and data wrangling; logistic regression; financial modeling; recommendation engines and causality; data visualization; social networks and data journalism; data engineering, MapReduce, Pregel, and Hadoop. Doing Data Science is a collaboration between course instructor Rachel Schutt, Senior VP of Data Science at News Corp, and data science consultant Cathy O’Neil, a senior data scientist at Johnson Research Labs, who attended and blogged about the course.

Getting Started with Greenplum for Big Data Analytics

This book serves as a thorough introduction to using the Greenplum platform for big data analytics. It explores key concepts for processing, analyzing, and deriving insights from big data using Greenplum, covering aspects from data integration to advanced analytics techniques like programming with R and MADlib.

What this Book will help me do: Understand the architecture and core components of the Greenplum platform. Learn how to design and execute data science projects using Greenplum. Master loading, processing, and querying big data in Greenplum efficiently. Explore programming with R and integrating it with Greenplum for analytics. Gain skills in high-availability configurations, backups, and recovery within Greenplum.

Author(s): Sunila Gollapudi is a seasoned expert in the field of big data analytics and has multiple years of experience working with platforms like Greenplum. Her real-world problem-solving expertise shapes her practical and approachable writing style, making this book not only educational but enjoyable to read.

Who is it for? This book is ideal for data scientists or analysts aiming to explore the capabilities of big data platforms like Greenplum. It suits readers with basic knowledge of data warehousing, programming, and analytics tools who want to deepen their expertise and effectively harness Greenplum for analytics.

Agile Data Science

Mining big data requires a deep investment in people and time. How can you be sure you’re building the right models? With this hands-on book, you’ll learn a flexible toolset and methodology for building effective analytics applications with Hadoop. Using lightweight tools such as Python, Apache Pig, and the D3.js library, your team will create an agile environment for exploring data, starting with an example application to mine your own email inboxes. You’ll learn an iterative approach that enables you to quickly change the kind of analysis you’re doing, depending on what the data is telling you. All example code in this book is available as working Heroku apps. Create analytics applications by using the agile big data development methodology. Build value from your data in a series of agile sprints, using the data-value stack. Gain insight by using several data structures to extract multiple features from a single dataset. Visualize data with charts, and expose different aspects through interactive reports. Use historical data to predict the future, and translate predictions into action. Get feedback from users after each sprint to keep your project on track.

On Being a Data Skeptic

"Data is here, it's growing, and it's powerful." Author Cathy O'Neil argues that the right approach to data is skeptical, not cynical: it understands that, while powerful, data science tools often fail. Data is nuanced, and "a really excellent skeptic puts the term 'science' into 'data science.'" The big data revolution shouldn't be dismissed as hype, but current data science tools and models shouldn't be hailed as the end-all-be-all, either.

Data Science for Business

Written by renowned data science experts Foster Provost and Tom Fawcett, Data Science for Business introduces the fundamental principles of data science, and walks you through the "data-analytic thinking" necessary for extracting useful knowledge and business value from the data you collect. This guide also helps you understand the many data-mining techniques in use today. Based on an MBA course Provost has taught at New York University over the past ten years, Data Science for Business provides examples of real-world business problems to illustrate these principles. You'll not only learn how to improve communication between business stakeholders and data scientists, but also how to participate intelligently in your company's data science projects. You'll also discover how to think data-analytically, and fully appreciate how data science methods can support business decision-making. Understand how data science fits in your organization, and how you can use it for competitive advantage. Treat data as a business asset that requires careful investment if you're to gain real value. Approach business problems data-analytically, using the data-mining process to gather good data in the most appropriate way. Learn general concepts for actually extracting knowledge from data. Apply data science principles when interviewing data science job candidates.

Analyzing the Analyzers

Despite the excitement around "data science," "big data," and "analytics," the ambiguity of these terms has led to poor communication between data scientists and organizations seeking their help. In this report, authors Harlan Harris, Sean Murphy, and Marck Vaisman examine their survey of several hundred data science practitioners in mid-2012, when they asked respondents how they viewed their skills, careers, and experiences with prospective employers. The results are striking. Based on the survey data, the authors found that data scientists today can be clustered into four subgroups, each with a different mix of skillsets. Their purpose is to identify a new, more precise vocabulary for data science roles, teams, and career paths. This report describes: four data scientist clusters (Data Businesspeople, Data Creatives, Data Developers, and Data Researchers); cases in miscommunication between data scientists and organizations looking to hire; why "T-shaped" data scientists have an advantage in breadth and depth of skills; and how organizations can apply the survey results to identify, train, integrate, team up, and promote data scientists.

Data Visualization: a successful design process

Dive into the world of data visualization with 'Data Visualization: a Successful Design Process'. Learn to convert complex datasets into vivid, insightful visuals using proven design methodologies and tools. This resource equips you to craft visuals that not only engage your audience but also uncover critical trends and narratives hidden in data.

What this Book will help me do: Master the fundamentals of visualization taxonomy to choose the ideal design for your data. Develop analytical questions and identify key narratives to structure your data representation. Understand the human visual system and how it impacts effective visual communication. Apply critical thinking to select visualization techniques suited to different data types. Gain an in-depth knowledge of data visualization tools and contemporary practices.

Author(s): The author of this book is a seasoned expert in the field of data visualization, known for their innovative approach to transforming data into impactful visuals. With years of experience navigating the intersection of data science and design, their method focuses on empowering professionals to communicate insights effectively. Their writing combines a deep understanding of technical skills with actionable, inspiring guidance.

Who is it for? This book is perfect for professionals, analysts, and designers who aim to improve their data visualization skills. Whether you are a beginner or an experienced individual seeking to refine your approach, this book caters to all skill levels. If your goal includes communicating data insights clearly and effectively to varied audiences, this book is for you.

How Data Science Is Transforming Health Care

In the early days of the 20th century, department store magnate John Wanamaker famously said, "I know that half of my advertising doesn't work. The problem is that I don't know which half." That remained basically true until Google transformed advertising with AdSense based on new uses of data and analysis. The same might be said about health care, and it's poised to go through a similar transformation as new tools, techniques, and data sources come on line. Soon we'll make policy and resource decisions based on much better understanding of what leads to the best outcomes, and we'll make medical decisions based on a patient's specific biology. The result will be better health at less cost. This paper explores how data analysis will help us structure the business of health care more effectively around outcomes, and how it will transform the practice of medicine by personalizing for each specific patient.

Statistical Learning and Data Science

Driven by a vast range of applications, data analysis and learning from data are vibrant areas of research. Various methodologies, including unsupervised data analysis, supervised machine learning, and semi-supervised techniques, have continued to develop to cope with the increasing amount of data collected through modern technology. With a focus on applications, this volume presents contributions from some of the leading researchers in the different fields of data analysis. Synthesizing the methodologies into a coherent framework, the book covers a range of topics, from large-scale machine learning to synthesis objects analysis.

What Is Data Science?

We've all heard it: according to Hal Varian, statistics is the next sexy job. Five years ago, in What is Web 2.0, Tim O'Reilly said that "data is the next Intel Inside." But what does that statement mean? Why do we suddenly care about statistics and about data? This report examines the many sides of data science -- the technologies, the companies and the unique skill sets. The web is full of "data-driven apps." Almost any e-commerce application is a data-driven application. There's a database behind a web front end, and middleware that talks to a number of other databases and data services (credit card processing companies, banks, and so on). But merely using data isn't really what we mean by "data science." A data application acquires its value from the data itself, and creates more data as a result. It's not just an application with data; it's a data product. Data science enables the creation of data products.

Startups and small businesses are the backbone of the UK economy and are at the heart of the nation’s innovation and growth. We’re lucky enough to see this in action every day with our customers, working with them to create a business bank account and industry-first features that support them wherever they are on their business journey. Join us to hear how data science and experimentation have supported this product growth.

Sustained motivation doesn’t come just from excelling at your job — it comes from a sense of care, belonging, and shared purpose. When leaders bring care, connection, and flexibility into how they manage, data scientists don’t just get more done — they find meaning in the work. In this talk, Dmitry will share practical strategies for nurturing motivation and long-term engagement within data science teams.

Capture value at scale with C3 AI Agentic Process Automation

Join Nikhil Krishnan, SVP and CTO, Data Science, C3 AI, for a 6-minute live interview at Microsoft Ignite on C3 AI Agentic Process Automation. See how intelligent, adaptable workflows turn natural language into reliable, context-aware enterprise actions. Discover how end-users can orchestrate fully automated or human-in-the-loop workflows that leverage all enterprise and external data — and learn how C3 AI drives value at scale.

Traditional career ladders break when data roles span engineering, analytics, product, data science and now AI. Your data scientist ships production code, your analytics engineer shapes product strategy and your senior IC has no interest in management. This talk shares how we designed a career framework for Data at Pleo that recognises hybrid roles, using capability mapping data and talent reviews strategically to create compelling growth paths beyond people management.