talk-data.com

Activities & events

Caio Gomes – CDO @ Único idTech, Paulo Vasconcellos – host, Gabriel Lages – host, Wallysson Nunes – Staff Frontend Engineer @ Hotmart

Lately, if you are a developer, a tech lead, or part of a development squad, it has been impossible not to be hit by the flood of AI tools aimed at software development. But what has actually changed in practice? Is AI really revolutionizing the way we program, or are we just living through more Vibe Coding hype? In this episode, we invited Caio Gomes, Chief AI Officer & Chief Data Officer @ Magalu, and Wallysson Nunes, Staff Frontend Engineer @ Hotmart, to discuss how artificial intelligence is shaping the present and future of software engineering.

Remember that you can find all of the Data Hackers community podcasts on Spotify, iTunes, Google Podcast, Castbox, and many other platforms.

Guests in this episode:

  • Caio Gomes - Chief AI Officer & Chief Data Officer @ Magalu
  • Wallysson Nunes - Staff Frontend Engineer @ Hotmart

Our panel (Data Hackers):

  • Gabriel Lages - Data Hackers
  • Paulo Vasconcellos - Data Hackers

References:

  • 🛠️ Crawler tool → Firecrawl
  • 🛠️ Crawler tool → Jina AI
  • 🤖 AI agent tool → Manus
  • 👨‍🏫 Creator of the term Vibe Coding → Andrej Karpathy
  • 🔥 Lovable → Official site
  • 🔥 Cursor → Official site
  • 🔥 Windsurf → Official site
  • 🔥 GitHub Copilot → Official site
  • 💀 Stack Overflow is practically dead (Pragmatic Engineer)
  • 🔓 A hacker shows how he broke into the 10 biggest sites built with Lovable in 47 minutes

AI/ML GitHub
Data Hackers

Building Smarter Systems for Complex Queries - Florian Hoenicke

​Outline:

  • Limitations of traditional search tools.
  • Complex queries remain unresolved.
  • Agentic search as a game-changer.
  • Redefining the search experience.

​About the Speaker:

Florian has 10 years of experience in the AI field, having worked at Axel-Springer, Deloitte, and SoundCloud. He currently works as a Principal Engineer at Jina AI, leading the technical development of prompt engineering and embedding model technology. Florian also serves as an AI policy advisor, providing explanations and insights to members of the European Parliament to enhance their understanding of artificial intelligence.

DataTalks.Club is the place to talk about data. Join our Slack community!

How to build an Agentic Search Flow

Register to reserve your spot!

Date and Time

Nov 22, 2024 from 5:30 PM to 8:30 PM

Location

The Meetup will take place at MotionLab.Berlin, Bouchéstraße 12/Halle 20 in Berlin

When the Medium is the Message: Addressing Input Biases in Multimodal/Multilingual Models

An embedding model is trained to produce outputs that ensure that semantic similarity is preserved as distance in embedding spaces — like is near like and far from unlike. But models trained with diverse kinds of inputs, i.e. different media and different languages, learn to treat those properties as semantic properties. Two pictures are more “semantically alike” than a picture and a descriptive text that matches it. Similar problems arise with multilingual models: Two English sentences are more alike than an English sentence and a Chinese translation. This undermines the general utility of embedding models. This presentation shows evidence of where this comes from and offers approaches to mitigate the problem.
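The bias described in this abstract can be illustrated with a minimal Python sketch. The three-dimensional vectors below are invented for illustration and are not outputs of any real embedding model; `cosine` is a helper defined here. The first two dimensions stand in for medium-specific signal and the last one for shared content, so an unrelated picture ends up closer to picture A than A's own matching caption does.

```python
import math
from typing import Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy vectors, made up for illustration only (assumption, not real model output).
# Dimensions 0 and 1 mimic "medium" signal; dimension 2 mimics shared content.
image_a = [1.0, 0.1, 0.9]    # picture A
image_b = [1.0, 0.1, 0.2]    # a picture with different content
caption_a = [0.1, 1.0, 0.9]  # text that describes picture A

same_medium = cosine(image_a, image_b)
cross_medium = cosine(image_a, caption_a)

print(f"picture A vs other picture: {same_medium:.3f}")
print(f"picture A vs its caption:   {cross_medium:.3f}")
```

With these toy values the same-medium pair scores higher than the matched cross-medium pair, which is the pattern the talk sets out to explain and mitigate.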

About the Speaker

Scott Martens is a long-term veteran of AI and NLP research, having started working at AI start-ups in 1994, and a KU Leuven graduate with a doctorate in linguistics. His background includes machine translation development and the intersection between linguistics, philology, and modern AI. Dr. Martens is a Senior Content Manager and Evangelist at Jina AI in Berlin.

Vector Streaming: The Memory Efficient Indexing for Vector Databases

Vector databases are everywhere, powering LLMs. Indexing vectors in bulk, especially multivector embeddings like ColPali and ColBERT, is memory intensive. Vector streaming solves this problem by parallelizing parsing, chunking, and embedding generation, and by indexing continuously, chunk by chunk, instead of in bulk. This not only increases speed but also makes the whole task more memory efficient. It supports Weaviate, Elastic, and Pinecone.
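As a rough illustration of the chunk-by-chunk idea in this abstract, here is a minimal Python sketch built on generators. It is not the Embed-Anything API: `toy_embed`, `InMemoryIndex`, and `stream_index` are hypothetical names, the "embedding" is a stand-in, and the sketch is sequential, so it shows only the streaming half of the technique and not the parallelism.

```python
from typing import Iterable, Iterator, List, Tuple

Row = Tuple[str, List[float]]


def chunk_text(text: str, size: int) -> Iterator[str]:
    """Lazily yield chunks of at most `size` words."""
    words = text.split()
    for start in range(0, len(words), size):
        yield " ".join(words[start:start + size])


def toy_embed(chunk: str) -> List[float]:
    """Hypothetical stand-in for a real model: [character count, word count]."""
    return [float(len(chunk)), float(len(chunk.split()))]


def in_batches(rows: Iterable[Row], batch_size: int) -> Iterator[List[Row]]:
    """Group a stream of rows into lists of at most `batch_size` items."""
    batch: List[Row] = []
    for row in rows:
        batch.append(row)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch


class InMemoryIndex:
    """Hypothetical stand-in for a vector database client."""

    def __init__(self) -> None:
        self.rows: List[Row] = []
        self.largest_batch = 0

    def upsert(self, batch: List[Row]) -> None:
        self.largest_batch = max(self.largest_batch, len(batch))
        self.rows.extend(batch)


def stream_index(docs: Iterable[str], index: InMemoryIndex,
                 chunk_size: int = 4, batch_size: int = 2) -> None:
    """Embed and index chunks as they are produced, never all at once."""
    rows = ((chunk, toy_embed(chunk))
            for doc in docs
            for chunk in chunk_text(doc, chunk_size))
    for batch in in_batches(rows, batch_size):
        index.upsert(batch)


index = InMemoryIndex()
stream_index(["a b c d e f g h i j"], index, chunk_size=4, batch_size=2)
print(len(index.rows), index.largest_batch)
```

Because every stage is a generator, the number of embeddings waiting to be written never exceeds `batch_size`, whereas a bulk approach would hold every embedding before the first write.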

About the Speaker

Sonam Pankaj is a Generative AI Evangelist at Articul8-ai and the co-creator and maintainer of the open-source library Embed-Anything, which helps create local dense, SPLADE, and multimodal embeddings and index them to vector databases; it is built in Rust for speed and efficiency. She previously worked at Qdrant Engine and Rasa, and also as an AI researcher at Saama, and she has worked extensively on clinical trial analytics. She is passionate about topics like metric learning and biases in language models. She has also published a paper at COLING, a highly regarded computational linguistics conference, which is available in the ACL Anthology.

How to Unlock More Value from Self-Driving Datasets

AV/ADAS is one of the most advanced fields in Visual AI. However, getting your hands on a high-quality dataset can be tough, let alone working with one to get a model to production. In this talk, I will show you the leading methods and tools to help visualize these datasets and take them to the next level. I will demonstrate how to clean and curate AV datasets, as well as how to perform state-of-the-art augmentations using diffusion models to create synthetic data that can empower the self-driving car models of the future.

About the Speaker

Daniel Gural is a seasoned Machine Learning Engineer at Voxel51 with a strong passion for empowering Data Scientists and ML Engineers to unlock the full potential of their data.

When the Medium is the Message: Addressing Input Biases in Multimodal/Multilingual Models

Thanks to deep learning, autonomous cars equipped with cameras and LiDAR can accurately recognize common objects such as cars, streets, and pedestrians, significantly enhancing their understanding of the environment. However, these models often display overconfidence, which can result in misidentifications. For example, consider an exaggerated scenario where an elephant on the street might be mistakenly identified as a trunk because the model has not been trained to recognize elephants. This issue stems from the models’ design to make decisions rather than acknowledge uncertainty by saying ‘I don’t know.’ In this talk, we will discuss how models can recognize their limitations and avoid making uncertain decisions, particularly through the lens of an autonomous car.
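The idea of letting a model say "I don't know" can be sketched with a simple abstention rule. The abstract does not describe the speaker's actual method, so this is only a common baseline: it thresholds the normalized entropy of a softmax distribution. The function names, the labels, and the 0.5 threshold are assumptions chosen for illustration. Note that softmax scores can themselves be overconfident on inputs unlike the training data, which is the very problem the talk addresses.

```python
import math
from typing import List, Optional, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def normalized_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy scaled to [0, 1]; 1 means a uniform distribution."""
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    return entropy / math.log(len(probs))


def predict_or_abstain(logits: Sequence[float], labels: Sequence[str],
                       max_entropy: float = 0.5) -> Optional[str]:
    """Return the top label, or None when the prediction is too uncertain."""
    probs = softmax(logits)
    if normalized_entropy(probs) > max_entropy:
        return None  # abstain: the model "does not know"
    return labels[probs.index(max(probs))]


labels = ["car", "street", "pedestrian"]  # illustrative class names
confident = predict_or_abstain([8.0, 0.5, 0.2], labels)
uncertain = predict_or_abstain([1.0, 1.1, 0.9], labels)
print(confident, uncertain)
```

In this sketch a peaked distribution yields a label, while a nearly flat one yields `None`, which a downstream planner could treat as a signal to slow down or defer to another sensor.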

About the Speaker

Hanieh Shojaei is a PhD researcher at the Institute of Cartography and Geoinformatics (IKG) at Leibniz University Hannover, specializing in uncertainty estimation and reliability of AI models. Her research focuses on using deep learning for LiDAR scene segmentation to enhance environmental perception and assess prediction reliability for autonomous vehicles.

Nov 22 - Berlin AI, Machine Learning and Computer Vision Meetup

This event is a collaboration with our friends from Meetup.ai; check out their Meetup page.

Dear Friends, colleagues, and fellow technologists, "Guess who's back back back - back again"

Meetup.ai is back! So tell your friends and click on that sign-up button to join the event :)

About the evening

We once again join the fray in Generative AI to discuss one of the hottest topics in the field, Retrieval Augmented Generation! So we are happy to invite you to our RAGs to Riches event on July 18th, at 17:30, once again co-hosted by our great friends at Thoughtworks Berlin. Thoughtworks is a global technology consultancy that integrates strategy, design and software engineering - and a community of passionate, purpose-led individuals. In this event, we will look at RAGs from both the practical and theoretical sides with two fantastic speakers.

Speakers:

  • Stan Girard - Co-Founder & CEO @ Quivr - "Chatbots are going to destroy infrastructures and your cloud bills"
  • Florian Hönnicke - Principal AI Engineer at Jina AI - "SAG - Stochastic Augmented Generation, improving the creativity of LLMs"

Agenda:

  • 17:30 - Walk in. Let's kick off the event with a refreshing drink!
  • 17:50 - Welcome word by Nicholas Borsotto
  • 18:00 - Welcome word by Stephanie Kunsleben (Community Manager at Thoughtworks)
  • 18:10 - Stan Girard Talk
  • 18:45 - Small break
  • 19:00 - Florian Hönnicke Talk
  • 19:30 - Networking

Looking forward to seeing you all there. Shout out to Natalia Woroniec and Neon for helping us with the speaker selection!

For the community:

  • Please contact Soraya Maan on LinkedIn to discuss future topics for the meetup, collaborations, or speaker recommendations.
  • Contact Mario Savovski to pitch in our community round.

Code of Conduct

We adhere to the Berlin Code of Conduct to ensure a welcoming and respectful environment for all participants. The event space operates under the largely compatible Thoughtworks Meetups & Events CoC.

Accessibility

The location is accessible for wheelchair users. This includes the entrance (no steps to get into the location), toilets and the stage.

Prompt me Up! #3 - RAGs to Riches
Susana Guzmán – author, Bo Wang – author, Shubham Saboo – author, Feng Wang – author, Cristian Mitroi – author, Jina AI – author

Dive into the world of modern search systems with 'Neural Search - From Prototype to Production with Jina.' This book introduces you to the fundamentals of neural search, exploring how machine learning revolutionizes information retrieval. You'll gain hands-on experience building versatile, scalable search engines using Jina, unraveling the complexities of AI-powered searches.

What this book will help me do

  • Understand the basics of neural search compared to traditional search methods.
  • Develop mastery of vector representation and its application in neural search.
  • Learn to utilize Jina for constructing AI-powered search engines.
  • Enhance your capabilities to handle multi-modal search systems like text, images, and audio.
  • Acquire the skills to deploy and optimize deep learning-powered search systems effectively.

Author(s)

Bo Wang, Cristian Mitroi, Feng Wang, Shubham Saboo, and Susana Guzmán are experienced technologists and AI researchers passionate about simplifying complex subjects like neural search. With their expertise in Jina and deep learning, their collaborative approach ensures practical, reader-friendly content that empowers learners to excel in creating cutting-edge search systems.

Who is it for?

This book is perfect for machine learning, AI, or Python developers eager to advance their understanding of neural search. Whether you're building text, image, or other modality-based search systems, it caters to beginners with foundational knowledge and extends to professionals wanting to deepen their skills. Unlock the potential of Jina for your projects.

data data-engineering search AI/ML Python
O'Reilly Data Engineering Books