talk-data.com


Activities & events

Ilyas Toumlilt – Site Reliability Engineer @ Criteo

We will share Criteo's journey of integrating authentication and authorisation into our Kafka infrastructure, including how we incorporated OAuth and JWT authentication systems into Kafka to enhance the security of our data streams. The talk covers the obstacles we faced and the lessons learned while transforming an open Kafka infrastructure into a safeguarded platform.

Kafka oauth jwt authentication authorization
Jay Kreps – CEO and co-founder @ Confluent

Discussion about why Kafka, why Confluent was created, evolution of Kafka and data streaming, open source and communities.

Kafka confluent

***IMPORTANT: Registration is closed on this page. Please RSVP using the following link: https://www.meetup.com/paris-apache-kafka-meetup/events/299612020/ ***

***

We are pleased to offer an exceptional event for the March meetup!

We are honored to welcome Jay Kreps (CEO and co-founder of Confluent, co-creator of Kafka), who will talk about the history and evolution of Kafka, followed by Criteo's account of their journey to secure and authenticate their Kafka infrastructure.

The event will take place on Thursday, March 21 at the offices of Criteo, which is sponsoring the event alongside Confluent. Many thanks to them for making this meetup possible!

⚠️ Seats are limited, so please RSVP only if you are sure you can attend; if you can no longer make it, please change your RSVP so someone else can join. Thanks! ⚠️

*** 🗓 Agenda:

  • 6:30pm: Door opening
  • 6:45pm - 7:15pm: Interactive Q&A with the Kafka co-creator and Confluent CEO - Jay Kreps (in English)
  • 7:20pm - 8:15pm: Safeguarding Our Kafka Kingdom: A Journey into Authentication and Authorisation - Ilyas Toumlilt - Criteo
  • 8:15pm - 9:30pm: Food & Networking

Please note that Jay will not be able to attend the networking part of the event, but Confluent staff will be in the networking area to answer any additional questions: Gilles Philippart for technical ones and Anissa Lallemand for non-technical ones.

*** 💡 Presentation 1 : Jay Kreps - CEO and co-founder of Confluent, original co-creator of Apache Kafka®

Interactive Q&A with the Kafka co-creator and Confluent CEO (in English). A 30-minute Q&A session to ask Jay your questions. We will start with a discussion about why Kafka was built, why Confluent was created, the evolution of Kafka and data streaming, and open source and communities, and then switch to questions from the audience.

Bio: Jay Kreps is CEO and co-founder of Confluent. Prior to Confluent he was the lead architect for data and infrastructure at LinkedIn. He is the initial developer of several open source projects, including Apache Kafka.

💡 Presentation 2 : Ilyas Toumlilt - SRE @ Criteo

Safeguarding Our Kafka Kingdom: A Journey into Authentication and Authorisation

We will share Criteo's journey of integrating authentication and authorisation into our Kafka infrastructure, a significant leap we took after years of operating without them. We will delve into how we successfully incorporated Criteo's existing OAuth and JWT authentication systems into Kafka, enhancing the security of our data streams.

While the subject might seem technical and complex, we promise an engaging narrative filled with both our victories and challenges. We will recount our integration and deployment war stories, the obstacles we overcame, and the lessons we learned along the way. This talk is not just about the destination but about the journey: the transformation that took place as we turned a fully open Kafka infrastructure into a safeguarded platform.

Our aim is to keep this presentation fun and easy to follow, avoiding deep technical jargon, and to share our journey with the community.
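For readers unfamiliar with the mechanics behind this kind of integration: a JWT is a signed, base64url-encoded set of claims that a broker-side component can verify before authorising a client. The sketch below shows HS256 signature verification using only the Python standard library. It illustrates the general mechanism, not Criteo's implementation; production Kafka deployments typically use the SASL/OAUTHBEARER mechanism with asymmetrically signed tokens from an identity provider, and the secret and claim names here are made up.

```python
import base64
import hashlib
import hmac
import json

def b64url_encode(data: bytes) -> str:
    # JWTs use unpadded base64url encoding.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def b64url_decode(data: str) -> bytes:
    # Restore the padding stripped by the encoder before decoding.
    return base64.urlsafe_b64decode(data + "=" * (-len(data) % 4))

def sign_jwt_hs256(claims: dict, secret: bytes) -> str:
    """Mint a toy HS256-signed JWT: header.payload.signature."""
    header = b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = b64url_encode(json.dumps(claims).encode())
    sig = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    return f"{header}.{payload}.{b64url_encode(sig)}"

def verify_jwt_hs256(token: str, secret: bytes) -> dict:
    """Verify the signature and return the claims, or raise ValueError."""
    header_b64, payload_b64, sig_b64 = token.split(".")
    expected = hmac.new(secret, f"{header_b64}.{payload_b64}".encode(),
                        hashlib.sha256).digest()
    if not hmac.compare_digest(expected, b64url_decode(sig_b64)):
        raise ValueError("bad signature")
    return json.loads(b64url_decode(payload_b64))

# A client presents a token; the broker side verifies it before authorising.
token = sign_jwt_hs256({"sub": "producer-service"}, b"shared-secret")
print(verify_jwt_hs256(token, b"shared-secret")["sub"])  # producer-service
```

A token signed with the wrong secret fails verification, which is the property that lets a broker reject unauthenticated clients without a round trip to the identity provider.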

Bio: Ilyas works at Criteo as a Site Reliability Engineer in the Stream-Processing Platform team. He is interested in the design and efficiency of large-scale distributed systems, and previously built the Concordant and AntidoteDB open-source databases during his PhD. Ilyas is also curious about Linux kernel topics and, outside of work, enjoys reading books and running.

Important info

1. ❗ For safety reasons, the venue's staff will check everyone's identity on site. 📝 Please remember to bring an ID with you and register for the event with your real first name and family name. Thank you!

2. Attendees consent to being photographed, filmed, and sound recorded as members of the audience, which may be used for marketing or promotional purposes by Confluent, Criteo, and the Paris Kafka Meetup.

3. Please be on time. We can't guarantee a seat once the meetup has started.

Kafka meetup with Jay Kreps @ Criteo!
Yair Weinberger – CTO and co-founder @ Alooma, Tobias Macey – host

Summary

Building an ETL pipeline is a common need across businesses and industries. It’s easy to get one started but difficult to manage as new requirements are added and greater scalability becomes necessary. Rather than duplicating the efforts of other engineers it might be best to use a hosted service to handle the plumbing so that you can focus on the parts that actually matter for your business. In this episode CTO and co-founder of Alooma, Yair Weinberger, explains how the platform addresses the common needs of data collection, manipulation, and storage while allowing for flexible processing. He describes the motivation for starting the company, how their infrastructure is architected, and the challenges of supporting multi-tenancy and a wide variety of integrations.
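The collect-transform-load pattern described here can be sketched in a few lines. The function and field names below are hypothetical, and the pluggable `transform` callable stands in for the kind of user-supplied processing code the episode discusses; a real pipeline would read from streams and write to a warehouse rather than Python lists.

```python
from typing import Callable, Iterable

Record = dict

def run_pipeline(source: Iterable[Record],
                 transform: Callable[[Record], Record],
                 sink: list) -> None:
    """Pull records from a source, apply a user-supplied transform, load into a sink."""
    for record in source:
        sink.append(transform(record))

events = [{"user": "a", "amount": "3.50"}, {"user": "b", "amount": "7.00"}]
warehouse: list = []
# The transform step is the customizable part of the pipeline;
# here it just normalizes the amount field from string to float.
run_pipeline(events, lambda r: {**r, "amount": float(r["amount"])}, warehouse)
print(warehouse[0]["amount"])  # 3.5
```

Keeping the transform as an injected function is what makes such a pipeline flexible: the plumbing (source, delivery, loading) stays fixed while each customer supplies their own processing logic.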

Preamble

Hello and welcome to the Data Engineering Podcast, the show about modern data management. When you're ready to build your next pipeline you'll need somewhere to deploy it, so check out Linode. With private networking, shared block storage, node balancers, and a 40Gbit network, all controlled by a brand new API, you've got everything you need to run a bullet-proof data platform. Go to dataengineeringpodcast.com/linode to get a $20 credit and launch a new server in under a minute. For complete visibility into the health of your pipeline, including deployment tracking and powerful alerting driven by machine learning, DataDog has got you covered. With their monitoring, metrics, and log collection agent, including extensive integrations and distributed tracing, you'll have everything you need to find and fix performance bottlenecks in no time. Go to dataengineeringpodcast.com/datadog today to start your free 14-day trial and get a sweet new T-shirt. Go to dataengineeringpodcast.com to subscribe to the show, sign up for the newsletter, read the show notes, and get in touch. Your host is Tobias Macey and today I'm interviewing Yair Weinberger about Alooma, a company providing data pipelines as a service.

Interview

Introduction
How did you get involved in the area of data management?
What is Alooma and what is the origin story?
How is the Alooma platform architected? (I want to go into stream vs. batch here)
What are the most challenging components to scale?
How do you manage the underlying infrastructure to support your SLA of 5 nines?
What are some of the complexities introduced by processing data from multiple customers with various compliance requirements?
How do you sandbox users' processing code to avoid security exploits?
What are some of the potential pitfalls for automatic schema management in the target database?
Given the large number of integrations, how do you maintain the
What are some challenges when creating integrations? Isn't it simply conforming with an external API?
For someone getting started with Alooma, what does the workflow look like?
What are some of the most challenging aspects of building and maintaining Alooma?
What are your plans for the future of Alooma?

Contact Info

LinkedIn, @yairwein on Twitter

Parting Question

From your perspective, what is the biggest gap in the tooling or technology for data management today?

Links

Alooma, Convert Media, Data Integration, ESB (Enterprise Service Bus), Tibco, Mulesoft, ETL (Extract, Transform, Load), Informatica, Microsoft SSIS, OLAP Cube, S3, Azure Cloud Storage, Snowflake DB, Redshift, BigQuery, Salesforce, Hubspot, Zendesk, Spark, "The Log: What every software engineer should know about real-time data's unifying abstraction" by Jay Kreps, RDBMS (Relational Database Management System), SaaS (Software as a Service), Change Data Capture, Kafka, Storm, Google Cloud PubSub, Amazon Kinesis, Alooma Code Engine, Zookeeper, Idempotence, Kafka Streams, Kubernetes, SOC2, Jython, Docker, Python, Javascript, Ruby, Scala, PII (Personally Identifiable Information), GDPR (General Data Protection Regulation), Amazon EMR (Elastic Map Reduce), Sequoia Capital, Lightspeed Investors, Redis, Aerospike, Cassandra, MongoDB

The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA Support Data Engineering Podcast

API Amazon EMR Kinesis Azure BigQuery Cassandra Cloud Computing Cloud Storage Data Collection Data Engineering Data Management Datadog Docker ELK ETL/ELT GCP GDPR/CCPA Hubspot Informatica JavaScript Kafka Kubernetes Microsoft MongoDB Pub/Sub Python RDBMS Redis Redshift S3 SaaS Scala Cyber Security Snowflake Spark SSIS TIBCO Spotfire
Joe Crobak – Data Engineer @ United States Digital Service (USDS), Tobias Macey – host

Summary

The rate of change in the data engineering industry is alternately exciting and exhausting. Joe Crobak found his way into the work of data management by accident, as so many of us do. After becoming engrossed in researching the details of distributed systems and big data management for his work, he began sharing his findings with friends. This led to his creation of the Hadoop Weekly newsletter, which he recently rebranded as the Data Engineering Weekly newsletter. In this episode he discusses his experiences working as a data engineer in industry and at the USDS, his motivations and methods for creating a newsletter, and the insights that he has gleaned from it.

Preamble

Hello and welcome to the Data Engineering Podcast, the show about modern data management. When you're ready to build your next pipeline you'll need somewhere to deploy it, so check out Linode. With private networking, shared block storage, node balancers, and a 40Gbit network, all controlled by a brand new API, you've got everything you need to run a bullet-proof data platform. Go to dataengineeringpodcast.com/linode to get a $20 credit and launch a new server in under a minute. Go to dataengineeringpodcast.com to subscribe to the show, sign up for the newsletter, read the show notes, and get in touch. Your host is Tobias Macey and today I'm interviewing Joe Crobak about his work maintaining the Data Engineering Weekly newsletter, and the challenges of keeping up with the data engineering industry.

Interview

Introduction
How did you get involved in the area of data management?
What are some of the projects that you have been involved in that were most personally fulfilling?
As an engineer at the USDS working on the healthcare.gov and Medicare systems, what were some of the approaches that you used to manage sensitive data?
Healthcare.gov has a storied history; how did the systems for processing and managing the data get architected to handle the amount of load they were subjected to?
What was your motivation for starting a newsletter about the Hadoop space?
Can you speak to your reasoning for the recent rebranding of the newsletter?
How much of the content that you surface in your newsletter is found during your day-to-day work, versus explicitly searching for it?
After over 5 years of following the trends in data analytics and data infrastructure, what are some of the most interesting or surprising developments?
What have you found to be the fundamental skills or areas of experience that have maintained relevance as new technologies in data engineering have emerged?
What is your workflow for finding and curating the content that goes into your newsletter?
What is your personal algorithm for filtering which articles, tools, or commentary get added to the final newsletter?
How has your experience managing the newsletter influenced your areas of focus in your work, and vice versa?
What are your plans going forward?

Contact Info

Data Eng Weekly, Email, Twitter – @joecrobak, Twitter – @dataengweekly

Parting Question

From your perspective, what is the biggest gap in the tooling or technology for data management today?

Links

USDS, National Labs, Cray, Amazon EMR (Elastic Map-Reduce), Recommendation Engine, Netflix Prize, Hadoop, Cloudera, Puppet, healthcare.gov, Medicare, Quality Payment Program, HIPAA, NIST (National Institute of Standards and Technology), PII (Personally Identifiable Information), Threat Modeling, Apache JBoss, Apache Web Server, MarkLogic, JMS (Java Message Service), Load Balancer, COBOL, Hadoop Weekly, Data Engineering Weekly, Foursquare, NiFi, Kubernetes, Spark, Flink, Stream Processing, DataStax, RSS, The Flavors of Data Science and Engineering, CQRS, Change Data Capture, Jay Kreps

The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA Support Data Engineering Podcast

Analytics Flink API Amazon EMR Big Data Data Analytics Data Engineering Data Management Data Science ELK Hadoop Java Kubernetes Spark
I Heart Logs 2014-09-23
Jay Kreps – author

Why a book about logs? That's easy: the humble log is an abstraction that lies at the heart of many systems, from NoSQL databases to cryptocurrencies. Even though most engineers don't think much about them, this short book shows you why logs are worthy of your attention. Based on his popular blog posts, LinkedIn principal engineer Jay Kreps shows you how logs work in distributed systems, and then delivers practical applications of these concepts in a variety of common uses—data integration, enterprise architecture, real-time stream processing, data system design, and abstract computing models. Go ahead and take the plunge with logs; you're going to love them.

  • Learn how logs are used for programmatic access in databases and distributed systems
  • Discover solutions to the huge data integration problem when more data of more varieties meet more systems
  • Understand why logs are at the heart of real-time stream processing
  • Learn the role of a log in the internals of online data systems
  • Explore how Jay Kreps applies these ideas to his own work on data infrastructure systems at LinkedIn
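The log abstraction the book is built around can be illustrated with a toy model: an append-only sequence of records addressed by offset, which any number of readers can replay independently. This is a sketch of the concept only, not of Kafka's actual implementation.

```python
class Log:
    """A toy append-only log: each record is addressed by the offset it was written at."""

    def __init__(self):
        self._records = []

    def append(self, record) -> int:
        # Writes only ever go to the end; history is immutable.
        self._records.append(record)
        return len(self._records) - 1  # offset of the new record

    def read(self, offset: int) -> list:
        # Each consumer tracks its own offset, so many readers can
        # replay the same history independently and at their own pace.
        return self._records[offset:]

log = Log()
log.append("user_created")
log.append("user_renamed")
print(log.read(1))  # ['user_renamed']
```

Because the log is immutable and ordered, two consumers that read from offset 0 will reconstruct exactly the same state, which is the property that makes logs useful for data integration and replication.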

data data-engineering log-data NoSQL
O'Reilly Data Engineering Books