talk-data.com

Topic

Data Science

machine_learning statistics analytics

1516 tagged

Activity trend: peak of 68 activities per quarter, 2020-Q1 to 2026-Q1

Activities

1516 activities · Newest first

Today I’m joined by Vera Liao, Principal Researcher at Microsoft. Vera is a part of the FATE (Fairness, Accountability, Transparency, and Ethics of AI) group, and her research centers around the ethics, explainability, and interpretability of AI products. She is particularly focused on how designers design for explainability. Throughout our conversation, we focus on the importance of taking a human-centered approach to rendering model explainability within a UI, and why incorporating users during the design process informs the data science work and leads to better outcomes. Vera also shares some research on why example-based explanations tend to outperform [model] feature-based explanations, and why traditional XAI methods such as LIME and SHAP aren’t the solution to every explainability problem a user may have.

Highlights / Skip to:

I introduce Vera, who is Principal Researcher at Microsoft and whose research mainly focuses on the ethics, explainability, and interpretability of AI (00:35)
Vera expands on her view that explainability should be at the core of ML applications (02:36)
An example of the non-human approach to explainability that Vera is advocating against (05:35)
Vera shares where practitioners can start the process of responsible AI (09:32)
Why Vera advocates for doing qualitative research in tandem with model work in order to improve outcomes (13:51)
I summarize the slides I saw in Vera’s deck on Human-Centered XAI and Vera expands on my understanding (16:06)
Vera’s success criteria for explainability (19:45)
The various applications of AI explainability that Vera has seen evolve over the years (21:52)
Why Vera is a proponent of example-based explanations over model feature ones (26:15)
Strategies Vera recommends for getting feedback from users to determine what the right explainability experience might be (32:07)
The research trends Vera would most like to see technical practitioners apply to their work (36:47)
Summary of the four-step process Vera outlines for Question-Driven XAI design (39:14)

Links
“Human-Centered XAI: From Algorithms to User Experiences” Presentation
“Human-Centered XAI: From Algorithms to User Experiences” Slide Deck
“Human-Centered AI Transparency in the Age of Large Language Models”
MSR Microsoft Research
Vera's Personal Website

One of the biggest challenges for the analyst or data scientist is figuring out just how wide and just how deep to go with stakeholders when it comes to key (but, often, complicated) concepts that underpin the work that's being delivered to them. Tell them too little, and they may overinterpret or misinterpret what's been presented. Tell them too much, and they may tune out or fall asleep… and, as a result, overinterpret or misinterpret what's been presented. On this episode, Dr. Nicholas Cifuentes-Goodbody from WorldQuant University joined Julie, Val, and Tim to discuss how to effectively thread that particular needle. For complete show notes, including links to items mentioned in this episode and a transcript of the show, visit the show page.

On today’s episode, we’re joined by Ben Johnson, Founder and CEO of Particle41, a provider of software and product development solutions crafted by world-class app development, DevOps, and data science teams. We talk about:

What components the CTO owns in a SaaS company
Optimizing the efficiency of dev teams
How much of the CTO role is internal vs. external
How to interview & identify a great CTO candidate

Building Statistical Models in Python

Building Statistical Models in Python is your go-to guide for mastering statistical modeling techniques using Python. By reading this book, you will explore how to use Python libraries like statsmodels and others to tackle tasks such as regression, classification, and time series analysis.

What this Book will help me do
Develop a deep practical knowledge of statistical concepts and their implementation in Python.
Create regression and classification models to solve real-world problems.
Gain expertise analyzing time series data and generating valuable forecasts.
Learn to perform hypothesis verification to interpret data correctly.
Understand survival analysis and apply it in various industry scenarios.

Author(s)
Huy Hoang Nguyen, Paul N Adams, and Stuart J Miller bring their extensive expertise in data science and Python programming to the table. With years of professional experience in both industry and academia, they aim to make statistical modeling approachable and applicable. Combining technical depth with hands-on coding, their goal is to ensure readers not only understand the theory but also gain confidence in its application.

Who is it for?
This book is tailored for beginners and intermediate programmers seeking to learn statistical modeling without a prerequisite in mathematics. It's ideal for data analysts, data scientists, and Python enthusiasts who want to leverage statistical models to gain insights from data. With this book, you will journey from the basics to advanced applications, making it perfect for those who aim to master statistical analysis.

Complex Spatial Data Science in the Boardroom | Katy Ashwin & Blair Freebairn | KFC UK & Geolytix

Where should we open 50 stores? Simple question, right?

To answer it, Katy Ashwin, Marketing Planning Analyst at KFC UK, and Blair Freebairn, CEO of Geolytix, talk through complex spatial modelling, interaction surfaces derived from mobility data, spatial ML ensemble models by channel and store format, big data optimization to create opportunity heat surfaces, and much more.

Learn more about site selection: https://carto.com/solutions/site-selection

Expert Panel | Sustainability
video
by Biswajit Acharya (Tata Consultancy Services), Jasmine Small (Marine Stewardship Council (MSC)), Dr Andrew Smith (Fathom), Caroline Robinson (Women+ in Geospatial)

This panel on sustainability features Caroline Robinson, Unite Lead Executive Board at Women+ in Geospatial; Biswajit Acharya, Ph.D., Consulting Partner, Sustainable Banking, Finance and Investment at Tata Consultancy Services; Dr Andrew Smith, Co-Founder at Fathom; and Jasmine Small, Data Science and Research Officer at Marine Stewardship Council (MSC). They focus on the role of geospatial data and technology in sustainability, and how it can be best used to tackle challenges and help organizations reach their goals in this area.

Learn more about CARTO's mission to deliver technology, data and scalability to help global businesses and communities make strategic decisions surrounding their sustainability programs. https://sustainability.carto.com/

Expert Panel: Spatial Data Science Careers
video
by Charlie Dacke (Office for National Statistics), Helen McKenzie (CARTO), Adam Dennett (Bartlett Centre for Advanced Spatial Analysis at UCL), Jeremy Morley (OS)

Panelists: Helen McKenzie (Geospatial Advocate at CARTO), Charlie Dacke (Head of Geospatial Technology and Standards at Office for National Statistics), Jeremy Morley (Chief Geospatial Scientist at OS), and Adam Dennett (Professor & Head of Department, Bartlett Centre for Advanced Spatial Analysis at UCL).

This panel discusses the spatial data science industry, and insights for aspiring spatial data scientists.

David Foster just published the 2nd edition of his amazing book, Generative Deep Learning (O'Reilly 2023). We chat about a lot: running a consultancy, all things writing, the impact of AI on kids, why in-person events matter more than ever, and much more.

David's LinkedIn: https://www.linkedin.com/in/davidtfoster/

Book (Amazon): https://www.amazon.com/Generative-Deep-Learning-Teaching-Machines/dp/1492041947

Applied Data Science Partners: https://adsp.ai/

Good Charts, Updated and Expanded

The ultimate guide to data visualization and information design for business.

Making good charts is a must-have skill for managers today. The vast amount of data that drives business isn't useful if you can't communicate the valuable ideas contained in that data: the threats, the opportunities, the hidden trends, the future possibilities. But many think that data visualization is too difficult, a specialist skill that's either the province of data scientists and complex software packages or the domain of professional designers and their visual creativity. Not so. Anyone can learn to produce quality "dataviz" and, more broadly, clear and effective information design. Good Charts will show you how to do it.

In this updated and expanded edition, dataviz expert Scott Berinato provides all you need for turning those ordinary charts kicked out of a spreadsheet program into extraordinary visuals that captivate and persuade your audience and for transforming presentations that seem like a mishmash of charts and bullet points into clear, effective, persuasive storytelling experiences. Good Charts shows how anyone who invests a little time getting better at visual communication can create an outsized impact, both in their career and in their organization.

You will learn:
A framework for getting to better charts in just a few minutes
Design techniques that immediately make your visuals clearer and more persuasive
The building blocks of storytelling with your data
How to build teams to bring visual communication skills into your organization and culture

This new edition of Good Charts not only provides new visuals and updated concepts but adds an entirely new chapter on building teams around the visualization part of a data science operation and creating workflows to integrate visualization into everything you do. Graphics that merely present information won't cut it anymore. Make Good Charts your go-to resource for turning plain, uninspiring charts and presentations into smart, effective visualizations and stories that powerfully convey ideas.

David is a Machine Learning Engineer and technologist focused on building embedded systems that use novel techniques and state-of-the-art technologies (Podman, Balena, TensorFlow, Flutter) in machine learning. Software developer with experience in software exploitation, information security, open-source development and DevOps practices. Community leader for the data science community in Colo…

Maddie is a Sr. ML / Research Engineer in industry, published author and seasoned open-source AI leader, with 6+ years of experience in ML R&D. Her areas of interest include generative models, NLP and Human <> AI interactions. She was also a 2x startup founder, a Blockchain educator/researcher, Founder of Women Who Code - Data Science, and technical advisor to various startups and Di…

M-statistics

A comprehensive resource providing new statistical methodologies and demonstrating how new approaches work for applications.

M-statistics introduces a new approach to statistical inference, redesigning the fundamentals of statistics, and improving on the classical methods we already use. This book targets exact optimal statistical inference for a small sample under one methodological umbrella. Two competing approaches are offered, maximum concentration (MC) and mode (MO) statistics, hence the symbolic equation M = MC + MO. M-statistics defines an estimator as the limit point of the MC or MO exact optimal confidence interval when the confidence level approaches zero, giving the MC and MO estimator, respectively. Neither mean nor variance plays a role in M-statistics theory.

Novel statistical methodologies in the form of double-sided unbiased and short confidence intervals and tests apply to major statistical parameters:
Exact statistical inference for small sample sizes is illustrated with effect size and coefficient of variation, the rate parameter of the Pareto distribution, two-sample statistical inference for normal variance, and the rate of exponential distributions.
M-statistics is illustrated with discrete, binomial, and Poisson distributions. Novel estimators eliminate paradoxes with the classic unbiased estimators when the outcome is zero.
Exact optimal statistical inference applies to correlation analysis including Pearson correlation, squared correlation coefficient, and coefficient of determination. New MC and MO estimators along with optimal statistical tests, accompanied by respective power functions, are developed.
M-statistics is extended to the multidimensional parameter and illustrated with the simultaneous statistical inference for the mean and standard deviation, shape parameters of the beta distribution, the two-sample binomial distribution, and finally, nonlinear regression.

Our new developments are accompanied by respective algorithms and R codes, available at GitHub, and as such readily available for applications. M-statistics is suitable for professionals and students alike. It is highly useful for theoretical statisticians and teachers, researchers, and data science analysts as an alternative to classical and approximate statistical inference.

Throughout history, small businesses have consistently played a pivotal role in the global economy, serving as its foundational backbone. As we navigate the digital age, the emergence of large corporations and rapid technological advancements present new challenges. Now, more than ever, it's imperative for small businesses to adapt, embracing a data-driven approach to remain competitive and sustainable. In this evolving landscape, we need champions dedicated to guiding these businesses, ensuring they harness the full potential of modern tools and insights to ensure a fair and varied marketplace of goods and services for all.

Dr Kendra Vant, Executive General Manager of Data & AI Products at Xero, is an industry leader in building data-driven products that harness AI and machine learning to solve complex problems for the small-business economy. Working across Australia, Asia and the US, Kendra has led data and technology teams at companies such as Seek, Telstra, Deloitte and now Xero, where she leads the company's global efforts using emerging practices and technologies to help small businesses and their advisors benefit from the power of data and insights. Starting with doctoral research in experimental quantum physics at MIT and a stint building quantum computers at Los Alamos National Laboratory, Kendra has made a career of solving hard problems and pushing the boundaries of what's possible.

In the episode, Kendra and Richie delve into the transformative impact of data science on small businesses, use cases of data science for small businesses, and how Xero has supported numerous small businesses with data science. They also cover the integration of AI in product development, the unexpected depth of data in seemingly low-tech sectors, the pivotal role of software platforms in data analysis, and much more.

Links Mentioned in The Show:
Xero
Analyzing Business Data in SQL
Financial Modeling in Spreadsheets
Implementing AI Solutions in Business
Generative AI Concepts

In an exciting conversation, we dive into the world of data professionals and their essential skills, with a special focus on the powerful Power BI and other tools. Discover how analytics tools such as Power BI are shaping the future of the data and analytics field.

In this episode of Data Hackers, the largest AI and Data Science community in Brazil, meet two data enthusiasts and leading authorities on the subject: Karine Lago, a specialist in Business Intelligence, Power BI, and Excel, a writer, and recognized by Microsoft more than seven times; and Letícia Smirelli, Chief Product Officer (CPO), Power BI Specialist, Microsoft Data Analyst Associate, and DataViz & Dashboard Design specialist. Both are partners at Nexos Educação.

Remember that you can find all of the Data Hackers community podcasts on Spotify, iTunes, Google Podcast, Castbox, and many other platforms. If you prefer, you can also listen to the episode right here in the post!

Medium link: https://medium.com/data-hackers/power-bi-dashboards-e-a-carreira-de-analista-de-dados-data-hackers-podcast-72-829986f5f2a1

Discussed in the episode

Meet our guests:

Karine Lago: specialist in Business Intelligence, Power BI, and Excel, writer, recognized by Microsoft more than seven times
Letícia Smirelli: Chief Product Officer (CPO), Power BI Specialist, Microsoft Data Analyst Associate, and DataViz & Dashboard Design specialist

Data Hackers hosts:

Paulo Vasconcellos, Gabriel Lages, Monique Femme

Reference links:

Tech and Cheers Meetup, Data Connect edition (São Paulo): https://www.sympla.com.br/evento/tech-and-cheers-meetup-ed-data-connect/2110360
https://towardsdatascience.com/whats-the-difference-between-analytics-and-statistics-cd35d457e17
Tech and Cheers, Mulher.ADA edition (Blumenau): https://www.sympla.com.br/evento/tech-and-cheers-meetup-ed-mulher-ada/2109236
World Economic Forum (The Future of Jobs Report 2023): https://www.weforum.org/reports/the-future-of-jobs-report-2023/
Karine Lago's channel (YouTube): https://www.youtube.com/@KarineLago
Karine Lago's page: https://keepo.io/karinedolago/?fbclid=PAAaZ32JXyRtPv7wcHcfaxtKA5TOU9VRaCt_F_nb7zhAptO4AtthorxiHWCdg_aem_Ab53sgYj0AXg1wHrOP9-c_K7pwoMqX0psYWAvNMAanqh5pafTHBFb3bnshKB534J9AA
Letícia Smirelli's channel (YouTube): https://www.youtube.com/@LeticiaSmirelli
Letícia Smirelli's page: https://keepo.io/leticia/?fbclid=PAAabu7cvnFTkkFw1UiJrDMIXiMJ45Av6XKlCXIfWAUiRH2c4kiSZzo7FX6TY_aem_Ab7BHn25MaVK22HFw9zXNfsYv5k5Y5o9WLMGZeFB9wSSSAV3d7EDA0JuGjXWSqd_SEs

Maddie Shang - OpenMined (Sr. AI Research Engineer)

Maddie is a Sr. ML / Research Engineer in industry, published author and seasoned open-source AI leader, with 6+ years of experience in ML R&D. Her areas of interest include generative models, NLP and Human <> AI interactions. She was also a 2x startup founder, a Blockchain educator/researcher, Founder of Women Who Code - Data Science, and technical advisor to various startups and Di…

As companies scale and become more successful, new horizons open, but with them come unexpected challenges. The influx of revenue and expansion of operations often reveal hidden complexities that can hinder efficiency and inflate costs. In this tricky situation, data teams can find themselves entangled in a web of obstacles that slow down their ability to innovate and respond to ever-changing business needs. Enter cloud analytics, a transformative solution that promises to break down barriers and unleash potential. By migrating analytics to the cloud, organizations can navigate the growing pains of success, cutting costs, enhancing flexibility, and empowering data teams to work with agility and precision.

John Knieriemen is the Regional Business Lead for North America at Exasol, the market-leading high-performance analytics database. Prior to joining Exasol, he served as Vice President and General Manager at Teradata during an 11-year tenure with the company. John is responsible for strategically scaling Exasol’s North America business presence across industries and expanding the organization’s partner network.

Solongo Erdenekhuyag is the former Customer Success and Data Strategy Leader at Exasol. Solongo is skilled in strategy, business development, program management, leadership, strategic partnerships, and management.

In the episode, Richie, Solongo, and John cover the motivation for moving analytics to the cloud, economic triggers for migration, success stories from organizations who have migrated to the cloud, the challenges and potential roadblocks in migration, the importance of flexibility and open-mindedness, and much more.

Links from the Show
Exasol
Amazon S3
Azure Blob Storage
Google Cloud Storage
BigQuery
Amazon Redshift
Snowflake
[Course] Understanding Cloud Computing
[Course] AWS Cloud Concepts

Ian Macomber, head of analytics engineering and data science at Ramp and formerly the VP of analytics and data engineering at Drizly, and Ryan Delgado, a staff software engineer at Ramp, have played pivotal roles in establishing Ramp's data team from the ground up and are spearheading the development of their comprehensive roadmap. In this conversation with Tristan and Julia, Ian and Ryan share insights on how Ramp's data team transformed unstructured data from contracts into valuable insights to enable faster decision-making. The $8 billion company values speed and empowers teams to build, ship, and measure products quickly. Ian and Ryan also talked about their approach to adopting new tech and elevating data as an equal player alongside product engineering and design. For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com.  The Analytics Engineering Podcast is sponsored by dbt Labs.
