talk-data.com

Company: Grammarly
Speakers: 47
Activities: 41

Talks & appearances

41 activities from Grammarly speakers

Join Kostia Omelianchuk and Lukas Beisteiner as they unpack the full scope of Grammatical Error Correction (GEC), from task framing, evaluation, and training to inference optimization and serving high-performance production systems at Grammarly. They will discuss:

  • The modern GEC recipe (the shift from heavily human-annotated corpora to semi-synthetic data generation)
  • LLM-as-a-judge techniques for scalable evaluation
  • Techniques that make deployment fast and affordable, including speculative decoding (sketched below)
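To make the speculative decoding idea concrete, here is a minimal, self-contained sketch of the draft-then-verify loop. The toy draft_prob/target_prob functions and the two-word vocabulary are stand-ins for a small draft model and a large target model; nothing here reflects Grammarly's actual implementation.

```python
import random

# Toy stand-ins for a cheap draft model and an expensive target model.
# Real systems would query two LLMs; these keep the sketch runnable.
VOCAB = ["the", "a"]
def draft_prob(prefix, token):  return 0.5
def target_prob(prefix, token): return 0.6 if token == "the" else 0.4
def draft_sample(prefix):  return random.choice(VOCAB)
def target_sample(prefix): return random.choice(VOCAB)

def speculative_step(prefix, k=4):
    """Draft k tokens cheaply, then verify each against the target model.

    A drafted token t is kept with probability min(1, p_target / p_draft);
    on the first rejection we fall back to the target model and stop.
    (A full implementation resamples the rejected position from the
    normalized residual max(0, p_target - p_draft) so the output matches
    the target distribution exactly; the plain fallback is a simplification.)
    """
    drafted = []
    for _ in range(k):
        drafted.append(draft_sample(prefix + drafted))
    accepted = []
    for t in drafted:
        ratio = target_prob(prefix + accepted, t) / draft_prob(prefix + accepted, t)
        if random.random() < min(1.0, ratio):
            accepted.append(t)                      # target agrees: keep it
        else:
            accepted.append(target_sample(prefix + accepted))
            break                                   # stop at first rejection
    return accepted

print(speculative_step(["grammarly", "corrects"]))
```

The speedup comes from the target model verifying several drafted tokens in one batched pass instead of generating them one at a time.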

In this talk, we will examine how LLM outputs are evaluated by potential end users versus professional linguist-annotators, as two ways of ensuring alignment with real-world user needs and expectations. We will compare the two approaches, highlight the advantages and recurring pitfalls of user-driven annotation, and share the mitigation techniques we have developed from our own experience.
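One concrete way to compare the two rater pools is an agreement statistic such as Cohen's kappa computed over user and linguist judgments of the same outputs. The abstract doesn't specify which metrics the talk uses; this is an illustrative sketch with invented example labels.

```python
from collections import Counter

def cohen_kappa(ratings_a, ratings_b):
    """Chance-corrected agreement between two raters over the same items."""
    assert len(ratings_a) == len(ratings_b)
    n = len(ratings_a)
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    freq_a, freq_b = Counter(ratings_a), Counter(ratings_b)
    labels = set(freq_a) | set(freq_b)
    expected = sum((freq_a[l] / n) * (freq_b[l] / n) for l in labels)
    return (observed - expected) / (1 - expected)

# e.g. end users vs. linguists judging the same 8 suggestions
users     = ["good", "good", "bad", "good", "bad", "good", "good", "bad"]
linguists = ["good", "bad",  "bad", "good", "bad", "good", "bad",  "bad"]
print(round(cohen_kappa(users, linguists), 3))  # ~0.529: moderate agreement
```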

How can we influence quality at the prompt-creation stage, and how can we work with already-generated text: improving it, identifying errors, and filtering out undesirable results? We'll explore linguistic approaches that help achieve better, more controlled outcomes from LLMs.
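A minimal sketch of the "filtering out undesirable results" step, assuming simple heuristic checks. The specific rules here (no-op rewrites, length blow-ups, boilerplate leakage) are invented for illustration and are not the linguistic filters the talk describes.

```python
import re

# Hypothetical post-generation checks; stand-ins for real filter rules.
def unchanged(source: str, output: str) -> bool:
    return source.strip() == output.strip()

def too_long(source: str, output: str, ratio: float = 1.5) -> bool:
    return len(output) > ratio * len(source)

def leaked_boilerplate(output: str) -> bool:
    return bool(re.search(r"(?i)as an ai (language )?model", output))

def passes_filters(source: str, output: str) -> bool:
    """Keep a rewrite only if it changed something, stayed concise,
    and shows no prompt/boilerplate leakage."""
    return not (unchanged(source, output)
                or too_long(source, output)
                or leaked_boilerplate(output))

print(passes_filters("Their going to the store.",
                     "They're going to the store."))  # True
```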

Two Grammarly data scientists discuss measuring AI ROI from two angles: one builds a novel experimentation framework, the other designs a new scoring system to quantify impact. They cover how each approach was designed, implemented, and validated, and share lessons learned.

Andrew Garkavyi and Lesha Levzhynskyi discuss the history and present state of shipping features across Grammarly's multiple platforms, recounting challenges and approaches from fully native to web to hybrid, and addressing overlays as well as assistant and chat modes in the age of LLMs.

How we do feature experimentation in the macOS Grammarly app (a minimal assignment sketch follows the list):

  • The basics: experiments, flags, holdouts, audience filters
  • Development and testing
  • Checking the metrics: analyzing some real cases
  • Benefits and drawbacks
  • Real-world examples
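To illustrate the flags/holdouts/audience-filter machinery the list above mentions, here is a minimal sketch of deterministic experiment assignment. Hash-based bucketing is a standard pattern; the variant names, percentages, and the global holdout are assumptions, not Grammarly's actual configuration.

```python
import hashlib

def bucket(key: str, buckets: int = 100) -> int:
    """Deterministic bucket in [0, buckets): same key, same bucket,
    with no per-user assignment state to store or sync."""
    return int(hashlib.sha256(key.encode()).hexdigest(), 16) % buckets

def variant(user_id: str, experiment: str,
            holdout_pct: int = 10, treatment_pct: int = 45) -> str:
    """Hypothetical assignment: a user-level global holdout that never
    sees any experiment, then a per-experiment treatment/control split.
    Names and percentages are illustrative."""
    if bucket(user_id) < holdout_pct:        # global holdout
        return "holdout"
    b = bucket(f"{experiment}:{user_id}")    # per-experiment split
    return "treatment" if b < treatment_pct else "control"

print(variant("user-42", "new-editor-toolbar"))
```

Deterministic hashing means the same user always resolves to the same variant, and the holdout population stays untouched across experiments for measuring long-term impact.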

Learn what it took for our team to replicate the visuals and interactions of the system keyboard so we could give users a smooth transition to our keyboard powered by Grammarly functionality. You may be surprised that it takes a bit more than creating 30 UIButtons. We will discuss the various animations and interactions we had to implement. We’ll also describe how we attained feature parity with the system keyboard for emoji, something some other third-party keyboards don’t offer. Finally, we’ll explain why initial architecture decisions may not always be bulletproof, even for something as seemingly simple as a keyboard.

During the session, we’ll discuss the challenges that prompt engineering has presented, both when it first gained popularity and as it has continued to evolve. We’ll share how these challenges informed the development of our prompt engineering tooling and workflows. We’ll cover (a minimal sketch follows the list):

  • Standardizing communication with LLMs
  • Using templating to customize prompts
  • Building prompt-centric production workflows
  • Working with structured LLM output
  • Ensuring the quality of LLM output
  • Creating tooling that supports our prompt engineering workflows
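The templating and structured-output items lend themselves to a short sketch. The prompt template, placeholder names, and JSON schema below are invented for illustration, not Grammarly's internal format.

```python
import json
from string import Template

# Hypothetical prompt template with $-style placeholders.
REWRITE_PROMPT = Template(
    "Rewrite the text below in a $tone tone.\n"
    'Respond with JSON: {"rewrite": "...", "changed": true|false}\n\n'
    "Text: $text"
)

def build_prompt(text: str, tone: str = "friendly") -> str:
    return REWRITE_PROMPT.substitute(text=text, tone=tone)

def parse_output(raw: str) -> dict:
    """Validate structured LLM output; reject anything malformed
    rather than letting it flow downstream."""
    data = json.loads(raw)  # raises on non-JSON output
    if not isinstance(data.get("rewrite"), str):
        raise ValueError("missing 'rewrite' field")
    return data

print(build_prompt("Send me the report ASAP."))
print(parse_output('{"rewrite": "Could you send the report soon?", "changed": true}'))
```

Validating at the parse boundary turns flaky free-text model output into a typed contract the rest of the pipeline can rely on.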

LLMs have unlocked new opportunities across NLP applications. Features that used to take months to plan and develop can now be prototyped in a day. But how can we make sure that a successful prototype turns into a high-quality feature useful for millions of customers? In this talk, we will explore real examples of the challenges that arise when ensuring the quality of LLM outputs and how we address them at Grammarly.

Overview of Grammarly's in-house solution for conducting quality evaluations of suggestions, including what human quality evaluations are, how the solution provides insights into the impact and quality of new features before deployment, and a deep dive into the scalable, distributed design of the solution.

LLMs have opened up new avenues in NLP, but evaluating their output introduces a new set of challenges. In this talk, we discuss these challenges and our approaches to measuring model output quality. We will cover existing evaluation methods with their pros and cons, then take a closer look at their application in a practical case study.
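One evaluation method commonly weighed in such comparisons is LLM-as-a-judge (also mentioned in the GEC talk above). A minimal sketch, where the judge prompt, the 1-5 scale, and the stubbed call_llm are all invented for illustration:

```python
import json

JUDGE_PROMPT = """You are grading a grammar correction.
Source: {source}
Correction: {correction}
Reply with JSON: {{"score": 1-5, "reason": "..."}}"""

def call_llm(prompt: str) -> str:
    """Placeholder for a real model call (any provider); stubbed here
    so the sketch runs without network access."""
    return '{"score": 4, "reason": "Fixes the agreement error; slightly wordy."}'

def judge(source: str, correction: str) -> dict:
    raw = call_llm(JUDGE_PROMPT.format(source=source, correction=correction))
    verdict = json.loads(raw)       # reject non-JSON judge output
    assert 1 <= verdict["score"] <= 5
    return verdict

print(judge("He go to school.", "He goes to school."))
```

The usual trade-off: judge models scale far beyond human annotation but inherit their own biases, which is why talks like this pair them with human evaluation.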

This talk covers Grammarly's approach to combining third-party LLM APIs with in-house LLMs, the role of LLMs in Grammarly's product offerings, an overview of the tools and processes in our ML infrastructure, and how we address challenges such as access, cost control, and load testing of LLMs, sharing our experience in optimizing and serving them.
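Load testing an LLM endpoint largely comes down to driving concurrent requests and watching tail latency. A minimal sketch with call_model stubbed out; the abstract doesn't describe Grammarly's actual tooling, so this is only the general shape.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor

def call_model(prompt: str) -> str:
    """Stub standing in for an HTTP call to an in-house or third-party
    LLM endpoint; swap in a real client in practice."""
    time.sleep(0.01)  # simulate inference latency
    return "ok"

def load_test(n_requests: int = 200, concurrency: int = 20) -> None:
    latencies = []
    def timed(i):
        start = time.perf_counter()
        call_model(f"request {i}")
        latencies.append(time.perf_counter() - start)
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        list(pool.map(timed, range(n_requests)))
    latencies.sort()
    p50 = statistics.median(latencies)
    p95 = latencies[int(0.95 * len(latencies)) - 1]
    print(f"p50={p50*1000:.1f}ms  p95={p95*1000:.1f}ms")

load_test()
```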

🚀 Writing efficient unit tests is essential for critical software components. But how do we ensure that our tests are thorough enough to cover a huge variety of typical inputs and most edge cases? I'll introduce you to the concept of property-based testing and share how we benefited from using it at Grammarly.
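For readers unfamiliar with the technique: a property-based test states an invariant and lets the framework generate inputs, rather than enumerating cases by hand. A minimal sketch using the Hypothesis library, with normalize_whitespace as an invented example function (the talk's actual components are not public):

```python
# Requires the Hypothesis library: pip install hypothesis
from hypothesis import given, strategies as st

def normalize_whitespace(text: str) -> str:
    """Example function under test: collapse runs of whitespace."""
    return " ".join(text.split())

@given(st.text())
def test_idempotent(text):
    # Property: normalizing twice changes nothing further.
    once = normalize_whitespace(text)
    assert normalize_whitespace(once) == once

@given(st.text())
def test_no_double_spaces(text):
    # Property: output never contains two spaces in a row.
    assert "  " not in normalize_whitespace(text)

# Hypothesis generates hundreds of inputs, including edge cases
# (empty strings, unicode, odd whitespace) a hand-written table would miss.
test_idempotent()
test_no_double_spaces()
```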