talk-data.com
Activities & events
| Title & Speakers | Event |
|---|---|
|
[Notes]How to Build a Portfolio That Reflects Your Real Skills
2025-12-28 · 18:00
These are the notes of the previous "How to Build a Portfolio That Reflects Your Real Skills" event: Properties of an ideal portfolio repository:
📌 Backend & Frontend Portfolio Project Ideas
☕ Junior Java Backend Developer (Spring Boot)
1. Shop Manager Application
A monolithic Spring Boot app designed with microservice-style boundaries. Features
Engineering Focus
2. Parallel Data Processing Engine
Backend service for processing large datasets efficiently. Features
Demonstrates
3. Distributed Task Queue System
Simple async job processing system. Features
Demonstrates
4. Rate Limiting & Load Control Service
Standalone service that protects APIs from abuse. Features
Demonstrates
5. Search & Indexing Backend
Document or record search service. Features
Demonstrates
6. Distributed Configuration & Feature Flag Service
Centralized config service for other apps. Features
Demonstrates
🐹 Mid-Level Go Backend Developer (Non-Kubernetes)
1. High-Throughput Event Processing Pipeline
Multi-stage concurrent pipeline. Features
2. Distributed Job Scheduler & Worker System
Async job execution platform. Features
3. In-Memory Caching Service
Redis-like cache written from scratch. Features
4. Rate Limiting & Traffic Shaping Gateway
Reverse-proxy-style rate limiter. Features
5. Log Aggregation & Query Engine
Incrementally built system: Step-by-step
🐍 Mid-Level Python Backend Developer
1. Asynchronous Task Processing System
Async job execution platform. Features
2. Event-Driven Data Pipeline
Streaming data processing service. Features
3. Distributed Rate Limiting Service
API protection service. Steps
4. Search & Indexing Backend
Search system for logs or documents. Features
5. Configuration & Feature Flag Service
Shared configuration backend. Steps
🟦 Mid-Level TypeScript Backend Developer
1. Asynchronous Job Processing System
Queue-based task execution. Features
2. Real-Time Chat / Notification Service
WebSocket-based system. Features
3. Rate Limiting & API Gateway
API gateway with protections. Features
4. Search & Filtering Engine
Search backend for products, logs, or articles. Features
5. Feature Flag & Configuration Service
Centralized config management. Features
🟨 Mid-Level Node.js Backend Developer
1. Async Task Queue System
Background job processor. Features
2. Real-Time Chat / Notification Service
Socket-based system. Features
3. Rate Limiting & API Gateway
Traffic control service. Features
4. Search & Indexing Backend
Indexing & querying service.
5. Feature Flag / Configuration Service
Shared backend for app configs.
⚛️ Mid-Level Frontend Developer (React / Next.js)
1. Dynamic Analytics Dashboard
Interactive data visualization app. Features
2. E-Commerce Store
Full shopping experience. Features
3. Real-Time Chat / Collaboration App
Live multi-user UI. Features
4. CMS / Blogging Platform
SEO-focused content app. Features
5. Personalized Analytics / Recommendation UI
Data-heavy frontend. Features
6. AI Chatbot App: “My House Plant Advisor”
LLM-powered assistant with production-quality UX. Core Features
Advanced Features
✅ Final Advice
You do NOT need to build everything. Instead, pick 1–2 strong projects per role and focus on depth:
📌 Portfolio Quality Signals (Very Important)
🎯 Why This Helps in Interviews
Working on serious projects gives you:
🎥 Demo & Documentation Best Practices
🤝 Open Source & Personal Projects (Interview Signal)
Always mention that you have contributed to Open Source or built personal projects.
|
[Notes]How to Build a Portfolio That Reflects Your Real Skills
|
|
AI-Powered Search
2025-01-20
Trey Grainger
– author
Apply cutting-edge machine learning techniques, from crowdsourced relevance and knowledge graph learning to Large Language Models (LLMs), to enhance the accuracy and relevance of your search results.
Delivering effective search is one of the biggest challenges you can face as an engineer. AI-Powered Search is an in-depth guide to building intelligent search systems you can be proud of. It covers the critical tools you need to automate ongoing relevance improvements within your search applications.
Inside you’ll learn modern, data-science-driven search techniques like: Semantic search using dense vector embeddings from foundation models; Retrieval augmented generation (RAG); Question answering and summarization combining search and LLMs; Fine-tuning transformer-based LLMs; Personalized search based on user signals and vector embeddings; Collecting user behavioral signals and building signals boosting models; Semantic knowledge graphs for domain-specific learning; Semantic query parsing, query-sense disambiguation, and query intent classification; Implementing machine-learned ranking models (Learning to Rank); Building click models to automate machine-learned ranking; Generative search, hybrid search, multimodal search, and the search frontier.
AI-Powered Search will help you build the kind of highly intelligent search applications demanded by modern users. Whether you’re enhancing your existing search engine or building from scratch, you’ll learn how to deliver an AI-powered service that can continuously learn from every content update, user interaction, and the hidden semantic relationships in your content. You’ll learn both how to enhance your AI systems with search and how to integrate large language models (LLMs) and other foundation models to massively accelerate the capabilities of your search technology.
About the Technology: Modern search is more than keyword matching. Much, much more. Search that learns from user interactions, interprets intent, and takes advantage of AI tools like large language models (LLMs) can deliver highly targeted and relevant results. This book shows you how to up your search game using state-of-the-art AI algorithms, techniques, and tools.
About the Book: AI-Powered Search teaches you to create a search that understands natural language and improves automatically the more it is used. As you work through dozens of interesting and relevant examples, you’ll learn powerful AI-based techniques like semantic search on embeddings, question answering powered by LLMs, real-time personalization, and Retrieval Augmented Generation (RAG).
What's Inside: Sparse lexical and embedding-based semantic search; Question answering, RAG, and summarization using LLMs; Personalized search and signals boosting models; Learning to Rank, multimodal, and hybrid search.
About the Reader: For software developers and data scientists familiar with the basics of search engine technology.
About the Author: Trey Grainger is the Founder of Searchkernel and former Chief Algorithms Officer and SVP of Engineering at Lucidworks. Doug Turnbull is a Principal Engineer at Reddit and former Staff Relevance Engineer at Spotify. Max Irwin is the Founder of Max.io and former Managing Consultant at OpenSource Connections.
Quotes:
Belongs on the shelf of every search practitioner! - Khalifeh AlJadda, Google
A treasure map! Now you have decades of semantic search knowledge at your fingertips. - Mark Moyou, NVIDIA
Modern and comprehensive! Everything you need to build world-class search experiences. - Kelvin Tan, SearchStax
Kick starts your ability to implement AI search with easy to understand examples. - David Meza, NASA |
O'Reilly AI & ML Books
|
|
Text and Vector Search from Scratch
2024-05-27 · 14:00
Alexey Grigorev
– Founder
@ DataTalks.Club
Hands-on workshop on building a search engine from scratch, focusing on text search and vector search. Topics include in-memory text search, tokenization and preprocessing, inverted index construction, embeddings, converting text to vectors, cosine similarity, and strategies to combine text and vector search. The session includes practical coding in a Jupyter Notebook using Python to implement both text and vector search approaches. |
Implement a Search Engine
|
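The workshop topics listed above (tokenization, an inverted index, text-to-vector conversion, and cosine similarity) can be illustrated with a minimal sketch. This is not the workshop's own notebook code: the function names are hypothetical, and the "embedding" is a plain bag-of-words count vector standing in for a real embedding model.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(docs):
    """Map each token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def text_search(index, query):
    """Return ids of documents that contain every query token (boolean AND)."""
    postings = [index.get(token, set()) for token in tokenize(query)]
    return set.intersection(*postings) if postings else set()


def to_vector(text):
    """Toy stand-in for an embedding: a sparse bag-of-words count vector."""
    return Counter(tokenize(text))


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors; 0.0 if either is empty."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def vector_search(docs, query):
    """Return (score, doc_id) pairs sorted from most to least similar."""
    q = to_vector(query)
    scored = [(cosine_similarity(q, to_vector(d)), i) for i, d in enumerate(docs)]
    return sorted(scored, reverse=True)


# Made-up sample documents, for illustration only.
docs = [
    "Building a search engine from scratch",
    "Vector search with embeddings",
    "Cooking pasta at home",
]
index = build_inverted_index(docs)
matches = text_search(index, "search engine")
ranked = vector_search(docs, "vector search")
```

Here `matches` holds the ids of documents containing both query tokens, and `ranked` orders all documents by similarity to the query. Swapping `to_vector` for a dense embedding model would leave the rest of the vector search path unchanged.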
|
Building generative AI experiences for the enterprise on Google Cloud
2024-04-11 · 21:05
Eddie Zhou
– Founding Engineer
@ Glean
Building an assistant capable of answering complex, company-specific questions and executing workflows requires first building a powerful Retrieval Augmented Generation (RAG) system. Founding engineer Eddie Zhou explains how Glean built its RAG system on Google Cloud, combining a domain-adapted search engine with dynamic prompts to harness the full capabilities of Gemini's reasoning engine. By attending this session, your contact information may be shared with the sponsor for relevant follow up for this event only. |
|
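As a rough illustration of the retrieve-then-prompt pattern behind RAG, the sketch below ranks a few documents against a question and assembles a prompt from the best matches. It is a generic toy, not Glean's system: the word-overlap retriever, the prompt wording, and the sample documents are all assumptions, and no model is actually called.

```python
def retrieve(query, documents, top_k=2):
    """Toy retriever: rank documents by how many query words they share."""
    query_words = set(query.lower().split())
    scored = []
    for doc in documents:
        overlap = len(query_words & set(doc.lower().split()))
        if overlap:
            scored.append((overlap, doc))
    # Sort by overlap only; ties keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]


def build_prompt(query, context_docs):
    """Assemble a prompt that grounds the question in the retrieved context."""
    context = "\n".join(f"- {doc}" for doc in context_docs)
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {query}\n"
    )


# Hypothetical company documents, for illustration only.
documents = [
    "Expense reports are due on the fifth of each month",
    "The office is closed on public holidays",
    "Expense reports need manager approval",
]
query = "when are expense reports due"
prompt = build_prompt(query, retrieve(query, documents))
```

The resulting `prompt` string is what would be sent to a language model. In a production system the retriever would be a real search engine and the prompt template would vary with the question type.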
|
Accelerate analytics and semantic search in real-time with AlloyDB for PostgreSQL
2024-04-10 · 22:00
Sam Idicula
– Senior Staff Software Engineer
@ Google Cloud
,
Sridhar Ranganathan
– Product Manager
@ Google Cloud
,
Fei Meng
– Head of Data Platform
@ Nuro
Your transactional data powers many applications, from analytics to generative AI and interactive online systems. AlloyDB unifies all these workloads onto a single, high-performance platform to extend your real-time data. This session dives into two built-in features: AlloyDB AI and the Analytics Accelerator. We'll show the key technologies behind these features, including Google's fast vector search and the columnar engine that enables fast analytical queries and hybrid transactional and analytical use cases. We’ll share how customers simplified their analytical and gen AI apps with these two features. |
|
|
Building generative AI experiences for the enterprise on Google Cloud
2024-04-10 · 16:30
Eddie Zhou
– Founding Engineer
@ Glean
Building an assistant capable of answering complex, company-specific questions and executing workflows requires first building a powerful Retrieval Augmented Generation (RAG) system. Founding engineer Eddie Zhou explains how Glean built its RAG system on Google Cloud— combining a domain-adapted search engine with dynamic prompts to harness the full capabilities of Gemini's reasoning engine. By attending this session, your contact information may be shared with the sponsor for relevant follow up for this event only. |
|
|
Nathan Beach
– Group Product Manager
@ Google Cloud
,
Juho Kallio
– CTO
@ IPRally
Learn how the patent search engine company IPRally created a custom compute platform to enable higher-scale data processing and deep learning. The solution relies on Ray Core and Google Kubernetes Engine, and harvests the cheapest resources from all around the world. In addition to efficiency, the goal was to build the best environment for machine learning R&D. This has been achieved by integrating Weights & Biases as the experiment tracking system. In this session, we’ll go through the solution at a high level. Please note: seating is limited and on a first-come, first-served basis; standing areas are available. |
|
|
AI for Search Success at QAD
2024-04-09 · 19:50
Joey Jablonski
– VP of Global Solutions
@ Pythian
,
Jim Josey
– Vice President Information Technology Services
@ QAD
This presentation explores deploying retrieval augmented generation (RAG) on Vertex AI Search to enhance QAD's internal data search (Jira, Confluence, Google Sites). Discover how GenAI improves query responses, utilizing a user-friendly web app on Google App Engine to counteract the loss of institutional knowledge. Join us for insights into this innovative enterprise search solution. By attending this session, your contact information may be shared with the sponsor for relevant follow up for this event only. |
|
|
Eddie Zhou
– Founding Engineer
@ Glean
Building an assistant capable of answering complex, company-specific questions and executing workflows requires first building a powerful Retrieval Augmented Generation (RAG) system. Founding engineer Eddie Zhou explains how Glean built its RAG system on Google Cloud— combining a domain-adapted search engine with dynamic prompts to harness the full capabilities of Gemini's reasoning engine. By attending this session, your contact information may be shared with the sponsor for relevant follow up for this event only. |
|
|
Elastic Berlin Meetup @ Zalando: September Edition
2023-09-14 · 16:30
Join us for a meetup on September 14th at 18:30 at Zalando, Berlin!
18:30: Join us for a drink
Please have your full and real name in your profile description because the security team will check if you're registered when you arrive.
18:45: Rankquest: Benchmarking Search API Ranking with Elasticsearch (Jilles van Gurp)
Search ranking is something that many companies that use Elasticsearch struggle with. Something we noticed while helping various clients is that many companies never evolve to having a systematic approach for testing their search ranking quality. It's too abstract for them; they don't know where to start, and they don't really get how this should be done or even why it is important. In this presentation we unveil our new ranking tool, Rankquest Studio, which aims to address some of these issues. Rankquest emerged out of our frustration with existing tools and approaches in this space, and we'll reflect a bit on the requirements we have for this before diving into a demo. Rankquest Studio is open source, web-based, and easy to use, and it can be used to build out test benchmarks for evaluating your search solutions.
19:15: Generative Black-Box Testing for Evaluating Search Quality (Oliver Trosien @ Zalando)
We present a novel generative Black-Box Testing approach that uses semantically equivalent queries (e.g. “rote Kleider”, “Kleid rot”) for introspecting the quality of a search engine. At Zalando, we developed a tool for finding search quality problems at scale with the help of mass-generating semantically equivalent query variants. This is a novel way to find relevance problems that complements other approaches that use customer metrics or ground truth data. The tool allows easy extension with new test scenarios and languages by non-technical native language speakers. We will show how it was used to continuously monitor search relevance, to find regressions, and how you can implement such a tool yourself.
Here are some questions that we’ll discuss in the session:
20:00: Pizza
Special thanks to our hosts, Zalando! |
Elastic Berlin Meetup @ Zalando: September Edition
|
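The idea of testing a search engine with semantically equivalent queries can be sketched as a small consistency check: run each variant, then measure how much its top results overlap with those of a baseline query. This is an illustrative sketch only, not the Zalando tool. The Jaccard overlap metric, the threshold, the `fake_search` stand-in, and the third query variant are all assumptions.

```python
def jaccard(a, b):
    """Jaccard overlap of two result lists, treated as sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def check_equivalent_queries(search, variants, top_k=10, threshold=0.5):
    """Compare each variant's top-k results with those of the first variant.

    Returns a list of (variant, overlap, passed) tuples.
    """
    baseline = search(variants[0])[:top_k]
    report = []
    for variant in variants[1:]:
        overlap = jaccard(baseline, search(variant)[:top_k])
        report.append((variant, overlap, overlap >= threshold))
    return report


# Hypothetical canned results standing in for a real search API.
FAKE_RESULTS = {
    "rote kleider": ["d1", "d2", "d3", "d4"],
    "kleid rot": ["d1", "d2", "d3", "d9"],
    "kleider in rot": ["d7", "d8"],
}


def fake_search(query):
    """Return document ids for a query; unknown queries return nothing."""
    return FAKE_RESULTS.get(query, [])


report = check_equivalent_queries(
    fake_search, ["rote kleider", "kleid rot", "kleider in rot"]
)
```

Variants whose overlap falls below the threshold are candidates for a relevance problem. A real setup would generate the variants automatically and call the live search API in place of `fake_search`.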
|
Cutting the Edge in Fighting Cybercrime: Reverse-Engineering a Search Language to Cross-Compile
2022-07-22 · 18:21
Traditional Security Information and Event Management (SIEM) approaches do not scale well for data sources that produce 30 TiB per day, which led HSBC to create a Cybersecurity Lakehouse with Delta and Spark. The platform was built to overcome several conventional technical constraints: traditional platforms limit the amount of data available for long-term analytics, and their query languages are difficult to scale and time-consuming to run. In this talk, we’ll learn how to implement (or actually reverse-engineer) a language with Scala and translate it into what Apache Spark understands, the Catalyst engine. We’ll guide you through the technical journey of building equivalents of a query language into Spark. We’ll learn how HSBC's business benefited from this innovation, including less time and fewer resources for Cyber data processing migration, improved Cyber threat Incident Response, and fast onboarding of HSBC Cyber Analysts onto Spark with the Cybersecurity Lakehouse platform. |
Databricks DATA + AI Summit 2023 |
|
Elasticsearch in Action
2015-11-17
Elasticsearch in Action teaches you how to build scalable search applications using Elasticsearch. You'll ramp up fast, with an informative overview and an engaging introductory example. Within the first few chapters, you'll pick up the core concepts you need to implement basic searches and efficient indexing. With the fundamentals well in hand, you'll go on to gain an organized view of how to optimize your design. Perfect for developers and administrators building and managing search-oriented applications.
About the Technology: Modern search seems like magic: you type a few words and the search engine appears to know what you want. With the Elasticsearch real-time search and analytics engine, you can give your users this magical experience without having to do complex low-level programming or understand advanced data science algorithms. You just install it, tweak it, and get on with your work.
About the Book: Elasticsearch in Action teaches you how to write applications that deliver professional quality search. As you read, you'll learn to add basic search features to any application, enhance search results with predictive analysis and relevancy ranking, and use saved data from prior searches to give users a custom experience. This practical book focuses on Elasticsearch's REST API via HTTP. Code snippets are written mostly in bash using cURL, so they're easily translatable to other languages.
What's Inside: What is a great search application?; Building scalable search solutions; Using Elasticsearch with any language; Configuration and tuning.
About the Reader: This book is for developers and administrators building and managing search-oriented applications.
About the Authors: Radu Gheorghe is a search consultant and software engineer. Matthew Lee Hinman develops highly available, cloud-based systems. Roy Russo is a specialist in predictive analytics.
Quotes:
To understand how a modern search infrastructure works is a daunting task. Radu, Matt, and Roy make it an engaging, hands-on experience. - Sen Xu, Twitter Inc.
An indispensable guide to the challenges of search of semi-structured data. - Artur Nowak, Evidence Prime
The best resource for a complex topic. Highly recommended. - Daniel Beck, juris GmbH
Took me from confused to confident in a week. - Alan McCann, Givsum.com |
|
|
ElasticSearch Blueprints
2015-07-24
Vineeth Mohan
– author
Dive into search technology with "ElasticSearch Blueprints"! This is the perfect project-based guide to help you master Elasticsearch. You will learn how to build and design scalable, effective search solutions, improve search relevancy, manage data efficiently, perform analytics, and visualize your data in comprehensive ways.
What this Book will help me do: Build and fine-tune scalable search engine features with Elasticsearch; Design and implement accurate ecommerce search solutions using filters; Analyze and visualize data with Elasticsearch's powerful data aggregation capabilities; Increase search relevancy and enhance user query assistance using analyzers; Incorporate enhanced data organization methods, including parent-child relationships.
Author(s): Vineeth Mohan is an experienced professional specializing in search technologies. With a strong technical background, they have engaged deeply with Elasticsearch, creating solutions that address practical challenges. Their approach focuses on making technical topics accessible, guiding readers step-by-step through projects.
Who is it for? This book is tailored for data professionals, application developers, and enthusiasts eager to delve into search technologies. Whether you're beginning with Elasticsearch or aiming to refine your skills, this guide will advance your expertise. By working through practical cases, you'll gain confidence in using Elasticsearch effectively to meet diverse requirements. |
|
|
ElasticSearch Cookbook - Second Edition
2015-01-28
Alberto Paro
– author
The "ElasticSearch Cookbook - Second Edition" is a hands-on guide featuring over 130 advanced recipes to help you harness the power of ElasticSearch, a leading search and analytics engine. Through insightful examples and practical guidance, you'll learn to implement efficient search solutions, optimize queries, and manage ElasticSearch clusters effectively.
What this Book will help me do: Design and configure ElasticSearch topologies optimized for your specific deployment needs; Develop and utilize custom mappings to optimize your data indexes; Execute advanced queries and filters to refine and retrieve search results effectively; Set up and monitor ElasticSearch clusters for optimal performance; Extend ElasticSearch capabilities through plugin development and integrations using Java and Python.
Author(s): Alberto Paro is a technology expert with years of experience working with ElasticSearch, Big Data solutions, and scalable cloud architecture. He has authored multiple books and technical articles on ElasticSearch, leveraging his extensive knowledge to provide practical insights. His approachable and detail-oriented style makes complex concepts accessible to technical professionals.
Who is it for? This book is best suited for software developers and IT professionals looking to use ElasticSearch in their projects. Readers should be familiar with JSON and have basic programming skills in Java. It is ideal for those who have an understanding of search applications and want to deepen their expertise. Whether you're integrating ElasticSearch into a web application or optimizing your system's search capabilities, this book will provide the skills and knowledge you need. |
|
|
Solr in Action
2014-03-25
Trey Grainger
– author
,
Timothy Potter
– author
Solr in Action is a comprehensive guide to implementing scalable search using Apache Solr. This clearly written book walks you through well-documented examples ranging from basic keyword searching to scaling a system for billions of documents and queries. It will give you a deep understanding of how to implement core Solr capabilities.
About the Technology: Whether you're handling big (or small) data, managing documents, or building a website, it is important to be able to quickly search through your content and discover meaning in it. Apache Solr is your tool: a ready-to-deploy, Lucene-based, open source, full-text search engine. Solr can scale across many servers to enable real-time queries and data analytics across billions of documents.
About the Book: Solr in Action teaches you to implement scalable search using Apache Solr. This easy-to-read guide balances conceptual discussions with practical examples to show you how to implement all of Solr's core capabilities. You'll master topics like text analysis, faceted search, hit highlighting, result grouping, query suggestions, multilingual search, advanced geospatial and data operations, and relevancy tuning.
What's Inside: How to scale Solr for big data; Rich real-world examples; Solr as a NoSQL data store; Advanced multilingual, data, and relevancy tricks; Coverage of versions through Solr 4.7.
About the Reader: This book assumes basic knowledge of Java and standard database technology. No prior knowledge of Solr or Lucene is required.
About the Authors: Trey Grainger is a director of engineering at CareerBuilder. Timothy Potter is a senior member of the engineering team at LucidWorks. The authors work on the scalability and reliability of Solr, as well as on recommendation engine and big data analytics technologies.
Quotes:
The knowledge and techniques you need. - From the Foreword by Yonik Seeley, Creator of Solr
Readable and immediately applicable ... an excellent book. - John Viviano, InterCorp, Inc.
The go-to guide for Solr ... a definitive resource for both beginners and experts. - Scott Anthony, Business Instruments
A well-dosed combination of deep technical knowledge and real-world experience. - Alexandre Madurell, Piksel, Inc. |
|
|
ElasticSearch Server
2013-02-21
Rafal Kuc
– author
,
Marek Rogozinski
– author
ElasticSearch Server is an excellent resource for mastering the ElasticSearch open-source search engine. This book takes you through practical steps to implement, configure, and optimize search capabilities, suitable for various data sets and applications, making faster and more accurate search outcomes accessible.
What this Book will help me do: Understand the core concepts of ElasticSearch, including data indexing, dynamic mapping, and search analysis; Develop practical skills in writing queries and filters to retrieve precise and relevant results; Learn to set up and efficiently manage ElasticSearch clusters for scalability and real-time performance; Implement advanced ElasticSearch functions like autocompletion, faceting, and geo-search; Utilize optimization techniques for cluster monitoring, health-checks, and tuning for reliable performance.
Author(s): The authors of ElasticSearch Server are industry professionals with extensive experience in search technologies and system architecture. They have contributed to multiple tools and publications in the field of data search and analytics. Their writing aims to distill complex technical concepts into practical knowledge, making it valuable for readers from all backgrounds.
Who is it for? This book is perfect for developers, system architects, and IT professionals seeking a robust and scalable search solution for their projects. Whether you're new to ElasticSearch or looking to deepen your expertise, this book will serve as a practical guide to implement ElasticSearch effectively. The only prerequisites are a basic understanding of databases and general query concepts, so prior search server knowledge is not required. |
|