Links:
https://join.slack.com/t/integratedmlai/shared_invite/zt-r3hpj44k-gfhf1pzIt3jixrATyXCWnQ
https://www.linkedin.com/in/thomives/
Join DataTalks.Club: https://datatalks.club/slack.html
Our events: https://datatalks.club/events.html
We talked about:
Caitlin’s background
The last mile in data
The Pareto Principle
Failing to use data
Making sure data is used
Communicating with decision-makers
Working backwards from the last mile
Understanding how data drives decisions
Sketching and prototyping
Showing the benefits of power data
Measurability
Driving change in data
Asking high-leverage questions
Resistance from users
Understanding domain experts
Linear projects vs circular projects
Recommendations for data analyst students
Finding Caitlin online
Links:
Emelie's talk
https://locallyoptimistic.com/post/linear-and-circular-projects-part-1/
https://locallyoptimistic.com/post/linear-and-circular-projects-part-2/
We talked about:
Rishabh’s background
Rishabh’s experience as a sales engineer
Prescriptive analytics vs predictive analytics
The problem with the term ‘data science’
Is machine learning a part of analytics?
Day-to-day of people that work with ML
Rule-based systems to machine learning
The role of analysts in rule-based systems and in data teams
Do data analysts know data better than data scientists?
Data analysts’ documentation and recommendations
Iterative work - data scientists/ML vs data analysts
Analyzing results of experiments
Overlaps between machine learning and analytics
Using tools to bridge the gap between ML and analytics
Do companies overinvest in ML and underinvest in analytics?
Do companies hire data scientists while forgetting to hire data analysts?
The difficulty of finding senior data analysts
Is data science sexier than data analytics?
Should ML and data analytics teams work together or independently?
Building data teams
Rishabh’s newsletter – MLOpsRoundup
Links:
https://mlopsroundup.substack.com/
https://twitter.com/rish_bhargava
We talked about:
Tammy’s background
Being the chief of data
First projects as the first data person in a company
Initial resistance
Expanding the team
Role of business analyst
Platanomelon’s stack
Order for growing the data team
Demand forecasting
Should analysts know machine learning?
Qualifications for the first data person in a company
Providing accurate results
Receiving insights in a timely manner
Providing useful insights
Giving ownership to the team
Starting as the first data person in a company
Data For Future podcast
Supporting team members that are stuck
Finding Tammy online
Links:
Tammy's podcast: https://dataforfuture.org/
We talked about:
Mihail’s background
NLP and self-driving vehicles
Transitioning from academia to the industry
Machine learning researchers
Finding open-ended problems
Machine learning engineers
Is data science more engineering or research?
What can engineers and researchers learn from one another?
Bridging the disconnect between researchers and engineers
Breaking down silos
Fluid roles
Full-stack data scientists
Advice to machine learning researchers
Advice to machine learning engineers
Reading papers
Choosing between engineering or research if you’re just starting
Confetti.ai
Links:
https://twitter.com/mihail_eric
http://confetti.ai/
We talked about:
Marianna’s background
Being the only data scientist
What should already be in the company
How much experience do you need?
Identifying problems
Prioritization
What should the company already know?
First week
First month
First quarter
Managing expectations
Solving problems without ML
Project timelines
Finding the best solution
Evaluating performance
Getting stuck
Communicating with analysts
Transitioning from engineering to data science
Growing the team
Stopping projects
Questions for the company
From research to production
Wrapping up
Links:
Marianna's LinkedIn: https://www.linkedin.com/in/marianna-diachuk-53ba60116/
We talked about:
Adam’s background
Adam’s laser and data experience
Metrics and why we care about them
Examples of metrics
KPIs
KPI examples
Derived KPIs
Creating metrics: grocery store example
Metric efficiency
North Star metrics
Threshold metrics
Health metrics
Data team metrics
Experiments: treatment and control groups
Accelerate metrics and timeboxing
Links:
Domino's article about measuring value: http://blog.dominodatalab.com/measuring-data-science-business-value
Adam's article about skills useful for data scientists: https://towardsdatascience.com/how-to-apply-your-hard-earned-data-science-skillset-812585e3cc06
Adam's article about standing out: https://towardsdatascience.com/how-to-stand-out-as-a-great-data-scientist-in-2021-3b7a732114a9
We talked about:
Natalie’s background
Airbyte
What is ETL?
Why ELT instead of ETL?
Transformations
How does ELT help analysts be more independent?
Data marts and Data warehouses
Ingestion DB
ETL vs ELT
Data lakes
Data swamps
Data governance
Ingestion layer vs Data lake
Do you need both a Data warehouse and a Data lake?
Airbyte and ELT
Modern data stack
Reverse ETL
Is drag-and-drop killing data engineering jobs?
Who is responsible for managing unused data?
CDC – Change Data Capture
Slowly changing dimension
Are there cases where ETL is preferable over ELT?
Why is Airbyte open source?
The case of Elasticsearch and AWS
Links:
Natalie's LinkedIn: https://www.linkedin.com/in/nataliekwong/
https://airbyte.io/blog/why-the-future-of-etl-is-not-elt-but-el
We talked about:
Learning algorithms and data structures
Resources for learning algorithms and data structures
Most important data structures
Learning the abstractions
Learning algorithms if they aren’t needed at work
Common mistakes when using wrong data structures
Importance of data structures for data scientists
Marcello’s book - Advanced Algorithms and Data Structures
Bloom filters
Where Bloom filters are useful
Approximate nearest neighbours
Searching for most similar vectors
Knowing frameworks vs knowing internals of data structures
Serializing Bloom filters
Algorithmic problems in job interviews
Important data structures for data scientists and data engineers
Learning by doing
Importance of compiled languages for data scientists
Links:
Marcello's book: Advanced Algorithms and Data Structures http://mng.bz/eP79 (promo code for 35% discount: poddatatalks21)
MIT, Introduction to Algorithms: https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-006-introduction-to-algorithms-fall-2011/
Algorithms specialization by Tim Roughgarden: https://www.coursera.org/specializations/algorithms
In our second bonus episode, we are featuring an episode from another podcast at ICPSR, hosted by the National Archive of Computerized Data on Aging (NACDA). Dr. Margaret Gatz joins NACDA's Kathryn Lavender to discuss Dr. Gatz's work on the Study of Dementia in Swedish Twins and the National Academy of Sciences-National Research Council Twin Registry (NAS-NRC).
You can listen to all of NACDA's episodes on YouTube or find them on the ICPSR website: https://www.icpsr.umich.edu/web/pages/NACDA/researcher-interviews.html
In this episode, we're featuring an interview with Dr. Joanne Goodell about her newest book, "Preparing STEM Teachers: The UTeach Replication Model." This interview is part of one of our fellow ICPSR podcasts from the archive Partnership for Expanding Education Research in STEM (PEERS). All of the PEERS episodes are available on YouTube and from the ICPSR website!
https://www.icpsr.umich.edu/web/pages/peersdatahub/discussion-forum.html
We talked about:
Marco’s background
Role of CDO
Keeping track of many things
Becoming a CDO
Strategy vs tactics
VP of Data vs CDO
How many VPs of Data could there be?
Splitting the work between VP and CDO
Difference between CTO, CPO, and CDO
Breaking down the goals and working backwards from them
Assessing if we’re moving in the right direction
Dealing with many meetings
Being more effective
Building the data-driven culture
Challenges of working remotely
Does CDO need deep technical skills?
Importance of MBA
The key skills for becoming a CDO
Biggest challenges within OLX so far
Demonstrating the CDO skills on a job interview
Overcoming resistance
We talked about:
Mikio’s background
What Mikio helps with
Moving from a full-time job to freelancing
Finding clients and importance of a strong network
Building a network
Initial meetings with clients
Understanding what clients need
Template for the offer (Million Dollar Consulting)
Deciding on rate type: hourly, daily, per project
Taking vacations (and paying twice for them)
Avoiding overworking
Specializing: consulting as a product
Working full-time as a principal vs being a consultant
Is the overhead worth it?
Getting a new client when you already have a project
After freelancing: what’s next?
Output of Mikio’s work
Learning new things
Lessons learned after finding clients
Registering as a freelancer in Germany
Personal liability of a freelancer
Effect of globalization and remote work on consulting
Advice for people who want to start freelancing
Working full-time and freelancing at the same time
Books:
Million Dollar Consulting by Alan Weiss
Built to Sell by John Warrillow
Links:
Mikio's Twitter: https://twitter.com/mikiobraun
Mikio's LinkedIn: https://www.linkedin.com/in/mikiobraun/
This thoroughly revised guide demonstrates how the flexibility of the command line can help you become a more efficient and productive data scientist. You'll learn how to combine small yet powerful command-line tools to quickly obtain, scrub, explore, and model your data. To get you started, author Jeroen Janssens provides a Docker image packed with over 100 Unix power tools--useful whether you work with Windows, macOS, or Linux. You'll quickly discover why the command line is an agile, scalable, and extensible technology. Even if you're comfortable processing data with Python or R, you'll learn how to greatly improve your data science workflow by leveraging the command line's power. This book is ideal for data scientists, analysts, engineers, system administrators, and researchers.
Obtain data from websites, APIs, databases, and spreadsheets
Perform scrub operations on text, CSV, HTML, XML, and JSON files
Explore data, compute descriptive statistics, and create visualizations
Manage your data science workflow
Create your own tools from one-liners and existing Python or R code
Parallelize and distribute data-intensive pipelines
Model data with dimensionality reduction, regression, and classification algorithms
Leverage the command line from Python, Jupyter, R, RStudio, and Apache Spark
We talked about:
Carmine’s background
Carmine’s startup FreshFlow
Doing user research
Design thinking
Entrepreneur First
Finding co-founders: the “expertise edges” framework
The structure of the EF program
Coming up with the idea
How important is going through a startup accelerator?
Finding your first client
Finding investors
Consequences of having a bad investor
Splitting responsibilities between co-founders
Hiring
The importance of delegating
Making work attractive to hires
Plans for the future
Just-in-time supply chain
What would you have done differently?
Advice for people starting a startup
Don’t focus on skills only
Getting motivation
Am I ready for a startup?
Importance of a business school
Advice on finding a co-founder
Do I need EF if I already have an idea?
Having a prototype before the pitch
Books:
The Mom Test by Rob Fitzpatrick
Design Thinking by Robert Curedale
Links:
FreshFlow: https://freshflow.ai/
Carmine's LinkedIn: https://www.linkedin.com/in/carminepaolino
Carmine's Twitter: https://twitter.com/paolino
We don't have an episode lined up for this week, but we recorded a small chat with Vladimir some time ago. Enjoy it!
We talked about:
Vladimir's background
Learning by answering questions
Don't be afraid of being wrong
Winning books
Learning random things
Approach learning as a machine learning project
Links:
Vladimir on LinkedIn: https://www.linkedin.com/in/vladimir-finkelshtein/
We talked about:
Lina’s background
What we need to remember when starting a project (checklists)
Make sure the problem is formalized and close to the core business
Get buy-in from stakeholders
Building trust with stakeholders
Don’t just focus on upsides – ask about concerns
Turning a concern into a metric
What happens when something goes wrong?
Post mortem reporting
Apply the 5 whys
If a lot of users say it’s a bug – it’s worth investigating
Post mortem format
Action points
Debugging vs explaining the model
Are there online versions of checklists?
Make sure to log your inputs
Talking to end-users and using your own service
Your ideas vs Stakeholder ideas
Should data practitioners educate the team about data?
People skills and ‘dirty’ hacks
Where to find Lina
We talked about:
Ben’s background
Building solutions for customers
Why projects don’t make it to production
Why do people choose overcomplicated solutions?
The dangers of isolating data science from the business unit
The importance of being able to explain things
Maximizing chances of making it into production
The IKEA effect
Risks of implementing novel algorithms
If it can be done simply – do that first
Don’t become the guinea pig for someone’s white paper
The importance of stat skills and coding skills
Structuring an agile team for ML work
Timeboxing research
Mentoring
Ben’s book
‘Uncool techniques’ at AI-First companies
Should managers learn data science?
Do data scientists need to specialize to be successful?
Links:
Ben's book: https://www.manning.com/books/machine-learning-engineering-in-action (get 35% off with code "ctwsummer21")
Build interactive, data-driven websites with the potent combination of open source technologies and web standards, even if you have only basic HTML knowledge. With the latest edition of this popular hands-on guide, you'll tackle dynamic web programming using the most recent versions of today's core technologies: PHP, MySQL, JavaScript, CSS, HTML5, jQuery, and the powerful React library. Web designers will learn how to use these technologies together while picking up valuable web programming practices along the way, including how to optimize websites for mobile devices. You'll put everything together to build a fully functional social networking site suitable for both desktop and mobile browsers.
Explore MySQL from database structure to complex queries
Use the MySQL PDO extension, PHP's improved MySQL interface
Create dynamic PHP web pages that tailor themselves to the user
Manage cookies and sessions and maintain a high level of security
Enhance JavaScript with the React library
Use Ajax calls for background browser-server communication
Style your web pages by acquiring CSS skills
Implement HTML5 features, including geolocation, audio, video, and the canvas element
Reformat your websites into mobile web apps
We talked about:
Elena’s background
Why do a startup instead of being an employee?
Where to get ideas for your startup
Finding a co-founder
What should you consider before starting a startup?
Vertical startup vs infrastructure startup
‘AI First’ startups
Building tools for engineers
What skills do you need to start a startup?
Startup risks
How to be prepared to fail
Work-life balance
The part-time startup approach
Startup investment models
No resources and no technical expertise – what to do?
Productionizing your services
When to hire an expert
Talking to people with a problem before solving the problem
Starting Elena’s startup, Evidently
Elena’s role at Evidently
Why is Evidently open source?
“People will just copy my open source code. Should I be concerned?”
Bottom-up adoption
Creating value so that clients engage with your product
Is there a difference between countries when creating a startup?
Does open source mean the data is safer?
When should you hire engineers?
Following the market
Startups out of genuine interest vs Just for money and for fun
Links:
EvidentlyAI: https://evidentlyai.com/
Elena's LinkedIn: https://www.linkedin.com/in/elenasamuylova/
Elena's Twitter: https://twitter.com/elenasamuylova/