Generative and traditional AI both need data, and in a rapidly evolving environment the ability to extract it efficiently from the web has become indispensable for businesses and developers. This presentation covers the methodology and tools of web crawling and web scraping, along with an overview of the ethical and legal side of the process, including best practices for crawling politely and efficiently and for using the collected data without violating privacy or intellectual property laws.