talk-data.com

Topic

NLP

Natural Language Processing (NLP)

ai machine_learning text_analysis

Activities

tagged

Activity Trend

24 peak/qtr

2020-Q1 2026-Q1

Top Events

O'Reilly Data Science Books 32 O'Reilly Data Engineering Books 18 Data Skeptic 18 DataTalks.Club 16 PyConDE & PyData Berlin 2023 15 Databricks DATA + AI Summit 2023 13 O'Reilly AI & ML Books 12 DataFramed 10 Making Data Simple 6 Data Council 2023 6 Google Cloud Next '24 6 Data + AI Summit 2025 4

Top Speakers

Kyle Polich 18 Maddie Shang (OpenMined) 7 Richie (DataCamp) 7 Al Martin (IBM) 6 Casey Stella 2 Jim Sterne (Board Chair, Digital Analytics Association - USA) 2 Konrad Banachewicz 2 Lena Nahorna (Grammarly) 2 David Xue (Astronomer) 2 Ofer Mendelevitch 2 Julia Silge 2 Pramod Singh 2

Activities

Showing filtered results

All Video Podcast Book

Filtering by: Data Skeptic ×

NLP in 2019

2019-12-31 · Data Skeptic Listen

podcast_episode

by Kyle Polich

A year in recap.

The Limits of NLP

2019-12-24 · Data Skeptic Listen

podcast_episode

by Kyle Polich , Colin Raffel

We are joined by Colin Raffel to discuss the paper "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer".

Serverless NLP Model Training

2019-12-10 · Data Skeptic Listen

podcast_episode

by Kyle Polich , Alex Reeves

AI/ML

Alex Reeves joins us to discuss some of the challenges around building a serverless, scalable, generic machine learning pipeline. The is a technical deep dive on architecting solutions and a discussion of some of the design choices made.

Annotator Bias

2019-11-23 · Data Skeptic Listen

podcast_episode

by Kyle Polich , Mor Geva

GitHub

The modern deep learning approaches to natural language processing are voracious in their demands for large corpora to train on. Folk wisdom estimates used to be around 100k documents were required for effective training. The availability of broadly trained, general-purpose models like BERT has made it possible to do transfer learning to achieve novel results on much smaller corpora. Thanks to these advancements, an NLP researcher might get value out of fewer examples since they can use the transfer learning to get a head start and focus on learning the nuances of the language specifically relevant to the task at hand. Thus, small specialized corpora are both useful and practical to create. In this episode, Kyle speaks with Mor Geva, lead author on the recent paper Are We Modeling the Task or the Annotator? An Investigation of Annotator Bias in Natural Language Understanding Datasets, which explores some unintended consequences of the typical procedure followed for generating corpora. Source code for the paper available here: https://github.com/mega002/annotator_bias

NLP for Developers

2019-11-20 · Data Skeptic Listen

podcast_episode

by Kyle Polich , Lance Olson (Microsoft)

AI/ML Lance

While at MS Build 2019, Kyle sat down with Lance Olson from the Applied AI team about how tools like cognitive services and cognitive search enable non-data scientists to access relatively advanced NLP tools out of box, and how more advanced data scientists can focus more time on the bigger picture problems.

Indigenous American Language Research

2019-11-13 · Data Skeptic Listen

podcast_episode

by Manuel Mager , Kyle Polich

Manuel Mager joins us to discuss natural language processing for low and under-resourced languages. We discuss current work in this area and the Naki Project which aggregates research on NLP for native and indigenous languages of the American continent.

What BERT is Not

2019-10-14 · Data Skeptic Listen

podcast_episode

by Kyle Polich , Allyson Ettinger

Allyson Ettinger joins us to discuss her work in computational linguistics, specifically in exploring some of the ways in which the popular natural language processing approach BERT has limitations.

BERT

2019-07-29 · Data Skeptic Listen

podcast_episode

by Kyle Polich

Kyle provides a non-technical overview of why Bidirectional Encoder Representations from Transformers (BERT) is a powerful tool for natural language processing projects.

Under Resourced Languages

2019-06-15 · Data Skeptic Listen

podcast_episode

by Kyle Polich , Priyanka Biswas

Priyanka Biswas joins us in this episode to discuss natural language processing for languages that do not have as many resources as those that are more commonly studied such as English. Successful NLP projects benefit from the availability of like large corpora, well-annotated corpora, software libraries, and pre-trained models. For languages that researchers have not paid as much attention to, these tools are not always available.

ELMo

2019-03-29 · Data Skeptic Listen

podcast_episode

by Kyle Polich , Matthew Peters

ELMo (Embeddings from Language Models) introduced the idea of deep contextualized word representations. It extends previous ideas like word2vec and GloVe. The ELMo model is a neural network able to map natural language into a vector space. This vector space, out of box, proved to be incredibly useful in a wide variety of seemingly unrelated NLP tasks like sentiment analysis and name entity recognition.

seq2seq

2019-03-01 · Data Skeptic Listen

podcast_episode

by Kyle Polich

AI/ML

A sequence to sequence (or seq2seq) model is neural architecture used for translation (and other tasks) which consists of an encoder and a decoder. The encoder/decoder architecture has obvious promise for machine translation, and has been successfully applied this way. Encoding an input to a small number of hidden nodes which can effectively be decoded to a matching string requires machine learning to learn an efficient representation of the essence of the strings. In addition to translation, seq2seq models have been used in a number of other NLP tasks such as summarization and image captioning. Related Links tf-seq2seq Describing Multimedia Content using Attention-based Encoder--Decoder Networks Show and Tell: A Neural Image Caption Generator Attend to You: Personalized Image Captioning with Context Sequence Memory Networks

Text Mining in R

2019-02-22 · Data Skeptic Listen

podcast_episode

by Julia Silge , Kyle Polich

Data Science

Kyle interviews Julia Silge about her path into data science, her book Text Mining with R, and some of the ways in which she's used natural language processing in projects both personal and professional. Related Links https://stack-survey-2018.glitch.me/ https://stackoverflow.blog/2017/03/28/realistic-developer-fiction/

Recurrent Relational Networks

2019-02-15 · Data Skeptic Listen

podcast_episode

by Kyle Polich , Rasmus Berg Palm

One of the most challenging NLP tasks is natural language understanding and reasoning. How can we construct algorithms that are able to achieve human level understanding of text and be able to answer general questions about it? This is truly an open problem, and one with the bAbI dataset has been constructed to facilitate. bAbI presents a variety of different language understanding and reasoning tasks and exists as benchmark for comparing approaches. In this episode, Kyle talks to Rasmus Berg Palm about his recent paper Recurrent Relational Networks

Very Large Corpora and Zipf's Law

2019-01-18 · Data Skeptic Listen

podcast_episode

by Linh Da , Kyle Polich

AI/ML

The earliest efforts to apply machine learning to natural language tended to convert every token (every word, more or less) into a unique feature. While techniques like stemming may have cut the number of unique tokens down, researchers always had to face a problem that was highly dimensional. Naive Bayes algorithm was celebrated in NLP applications because of its ability to efficiently process highly dimensional data. Of course, other algorithms were applied to natural language tasks as well. While different algorithms had different strengths and weaknesses to different NLP problems, an early paper titled Scaling to Very Very Large Corpora for Natural Language Disambiguation popularized one somewhat surprising idea. For many NLP tasks, simply providing a large corpus of examples not only improved accuracy, but it also showed that asymptotically, some algorithms yielded more improvement from working on very, very large corpora. Although not explicitly in about NLP, the noteworthy paper The Unreasonable Effectiveness of Data emphasizes this point further while paying homage to the classic treatise The Unreasonable Effectiveness of Mathematics in the Natural Sciences. In this episode, Kyle shares a few thoughts along these lines with Linh Da. The discussion winds up with a brief introduction to Zipf's law. When applied to natural language, Zipf's law states that the frequency of any given word in a corpus (regardless of language) will be proportional to its rank in the frequency table.

Let's Talk About Natural Language Processing

2019-01-04 · Data Skeptic Listen

podcast_episode

by Kyle Polich , Lucy Park

Python

This episode reboots our podcast with the theme of Natural Language Processing for the next few months. We begin with introductions of Yoshi and Linh Da and then get into a broad discussion about natural language processing: what it is, what some of the classic problems are, and just a bit on approaches. Finishing out the show is an interview with Lucy Park about her work on the KoNLPy library for Korean NLP in Python. If you want to share your NLP project, please join our Slack channel. We're eager to see what listeners are working on! http://konlpy.org/en/latest/

Spam Filtering with Naive Bayes

2018-07-27 · Data Skeptic Listen

podcast_episode

by Kyle Polich

AI/ML

Today's spam filters are advanced data driven tools. They rely on a variety of techniques to effectively and often seamlessly filter out junk email from good email. Whitelists, blacklists, traffic analysis, network analysis, and a variety of other tools are probably employed by most major players in this area. Naturally content analysis can be an especially powerful tool for detecting spam. Given the binary nature of the problem ( or ) its clear that this is a great problem to use machine learning to solve. In order to apply machine learning, you first need a labelled training set. Thankfully, many standard corpora of labelled spam data are readily available. Further, if you're working for a company with a spam filtering problem, often asking users to self-moderate or flag things as spam can be an effective way to generate a large amount of labels for "free". With a labeled dataset in hand, a data scientist working on spam filtering must next do feature engineering. This should be done with consideration of the algorithm that will be used. The Naive Bayesian Classifer has been a popular choice for detecting spam because it tends to perform pretty well on high dimensional data, unlike a lot of other ML algorithms. It also is very efficient to compute, making it possible to train a per-user Classifier if one wished to. While we might do some basic NLP tricks, for the most part, we can turn each word in a document (or perhaps each bigram or n-gram in a document) into a feature. The Naive part of the Naive Bayesian Classifier stems from the naive assumption that all features in one's analysis are considered to be independent. If and are known to be independent, then . In other words, you just multiply the probabilities together. Shh, don't tell anyone, but this assumption is actually wrong! Certainly, if a document contains the word algorithm, it's more likely to contain the word probability than some randomly selected document. Thus, Pr(\text{algorithm}) \cdot Pr(\text{probability})" />, violating the assumption. Despite this "flaw", the Naive Bayesian Classifier works remarkably will on many problems. If one employs the common approach of converting a document into bigrams (pairs of words instead of single words), then you can capture a good deal of this correlation indirectly. In the final leg of the discussion, we explore the question of whether or not a Naive Bayesian Classifier would be a good choice for detecting fake news.

Measuring the Influence of Fashion Designers

2015-08-14 · Data Skeptic Listen

podcast_episode

by Kyle Polich , Yusan Lin

Data Science

Yusan Lin shares her research on using data science to explore the fashion industry in this episode. She has applied techniques from data mining, natural language processing, and social network analysis to explore who are the innovators in the fashion world and how their influence effects other designers. If you found this episode interesting and would like to read more, Yusan's papers Text-Generated Fashion Influence Model: An Empirical Study on Style.com and The Hidden Influence Network in the Fashion Industry are worth reading.

[MINI] Natural Language Processing

2015-04-17 · Data Skeptic Listen

podcast_episode

by Kyle Polich

This episode overviews some of the fundamental concepts of natural language processing including stemming, n-grams, part of speech tagging, and th bag of words approach.