Web Scraping and Data Mining Mastery with Python Online Course
This beginner-friendly course on Data Scraping and Data Mining will equip you with essential Python skills for extracting data from the web. Covering foundational concepts and methodologies, the course offers a comprehensive approach, including practical, live coding examples, quizzes with solutions, and real-world applications. You'll gradually dive deeper into data scraping techniques through hands-on projects that help you bridge the gap between theory and practice. With four key projects designed for experimentation and learning, this course ensures you gain the skills needed to excel in data scraping and mining.
Who is this course for?
This course is ideal for beginners new to data scraping and those looking to build smart solutions using Python. It’s also beneficial for data scientists, machine learning professionals, and dropshippers interested in learning and applying data scraping techniques to real-world projects.
What you will learn
- Learn the difference between synchronous and asynchronous requests
- Use BS4 to parse server response data
- Explore data scraping tools: Requests, BS4, Scrapy, and Selenium
- Understand BS4 parser functions for extracting data from HTML
- Write web crawlers using Scrapy to extract data
- Automate web flows and control them with Selenium
Course Table of Contents
Introduction
- Why Data Scraping
- Applications of Data Scraping
- Introduction to the Instructor
- Introduction to the Course, Scraping, and Tools
- Projects Overview
Requests
- Introduction to Python Requests
- Hands-On with Requests
- Extracting Quotes Manually
- Quiz (Extracting Authors)
- Solution (Extracting Authors)
- Pagination
- Quiz (Extracting Author and Quotes)
- Solution 01 (Extracting Author and Quotes)
- Solution 02 (Extracting Author and Quotes)
- Ajax Requests
- Ajax Requests for Cricket Information
- Ajax Requests Pagination
- Quiz (Extracting Top Stats from Cricket Information)
- Solution 01 (Extracting Top Stats from Cricket Information)
- Solution 02 (Extracting Top Stats from Cricket Information)
Beautiful Soup 4 (BS4)
- Introduction to BS4
- Quiz (Difference Between Requests and BS4)
- Solution (Difference Between Requests and BS4)
- Hands-On with BS4
- Extracting Data from Tree
- Extracting Quotes from the Website
- Quiz (Extracting Author Names)
- Solution (Extracting Author Names)
- Attributes of Tags in BS4
- Multi-Valued Attributes of Tags in BS4
- Scraping Movie Names from IMDB
- Quiz (Getting the Ratings, Year, Name of the Movie)
- Solution 01 (Getting the Ratings, Year, Name of the Movie)
- Solution 02 (Getting the Ratings, Year, Name of the Movie)
- Scraping Time, Genre, and Release Date from IMDB 01
- Scraping Time, Genre, and Release Date from IMDB 02
- Combining Two Requests Data for IMDB
- Movies Recommender System (Creating Movie URL)
- Movies Recommender System (Creating Director URL)
- Movies Recommender System using BS4 (Getting Top 4 Movies)
- Movies Recommender System using BS4 (Merge All Requests Together)
CSS Selectors
- Introduction to CSS Selectors
- CSS Selectors Hands-On (Tags)
- Quiz (Tags)
- Solution (Tags)
- CSS Selectors Hands-On (Descendants, ID, Class)
- Quiz (Descendants)
- Solution (Descendants)
- Quiz (ID)
- Solution (ID)
- Quiz (Class)
- Solution (Class)
- CSS Selectors Hands-On (Nested Tags, ID Tags, Class Tags)
- Quiz (Class with Tag)
- Solution (Class with Tag)
- CSS Selectors Hands-On (Comma Separator, Universal Selectors)
- Quiz (Combining Two Selectors)
- Solution (Combining Two Selectors)
- CSS Selectors Hands-On (Sibling Notations and Direct Child)
- Quiz (Adjacent Sibling)
- Solution (Adjacent Sibling)
- Quiz (General Sibling)
- Solution (General Sibling)
- CSS Selectors Hands-On (Child Selectors)
- Quiz (First Child)
- Solution (First Child)
- Quiz (Only Child)
- Solution (Only Child)
- Quiz (Last Child)
- Solution (Last Child)
- CSS Selectors Hands-On (Negations, Attributes)
- Quiz (Negation)
- Solution (Negation)
- CSS Selectors Hands-On (Attributes, Attribute Values)
- Quiz (Attribute Values)
- Solution (Attribute Values)
- CSS Selectors Hands-On (Attributes Wild Cards Values)
- Quiz (Attributes Wild Card)
- Solution (Attributes Wild Card)
Scrapy
- Introduction to Scrapy
- Comparison of Scrapy and Requests
- Scrapy at a Glance Documentation
- Getting Started with Scrapy
- Running Documentation Spider 1
- Running Documentation Spider 2
- Writing a Spider from Scratch
- Understanding the Response (URL, Status)
- Understanding the Response (Headers)
- Understanding the Response (Values in Headers)
- Understanding the Response (Body)
- Understanding the Response (Request)
- Understanding the Response (Meta)
- Understanding the Response (Flags, Certificate, ip_address, Copy)
- Understanding the Response (replace, urljoin, follow, follow_all)
- Response CSS and Scrapy Shell
- Extracting Quotes
- Understanding Nested Selectors
- Extracting the Author and Quotes
- Checking for Next Page
- Checking for Next Page in Spider
- Checking for Next Page URL
- Scraping Quotes from Next Pages
- Exporting Extracted Data
- Quiz (Get the Tags)
- Solution (Get the Tags)
- Next Website
- CSS Selectors for Movie Names and URLs
- Combined CSS Selectors for Movie Names and URLs
- Send Request to the Film Information Page
- Merge Data from Two Callbacks
- Extracting Movie Duration and Genres
- Exporting the Extracted Data
- Quiz (Extracting the Year)
- Solution (Extracting the Year)
- Getting Director Name and URL
- Getting Top Four Movies of Directors
- Extracting Data
- Extracting Data Anomaly (CSS Selector)
- Extracting Data Anomaly (dont_filter Flag)
Scrapy Project
- Hugo Boss Website for Scraping
- Understanding Site Structure
- Writing CSS Selectors for Listings
- Listings in Scrapy Shell
- Sending Request to Listings URLs
- Writing CSS for Getting the Products from the Listings
- Extracting Product URLs from the Listings
- Sending Requests to Products of the Listings
- Writing CSS for Getting the Product Information
- Getting the Bigger Images of the Product
- Adding Pagination to Spider and Running It
- Output of the Spider
Selenium
- Introduction to Selenium
- Getting Started with Selenium
- Configuring the Webdriver
- Extracting Quotes
- Extracting Quotes and Author Names
- Quiz (Extracting Quotes)
- Solution (Extracting Quotes)
- Clicking on Button
- Pagination and Extracting Data
- Exception Handling for Unavailable Elements
- Navigating the Website for Login
- Quiz (Log In and Extract Quote)
- Solution (Log In and Extract Quote)
Selenium Project
- Overview of Project
- Closing the Cookie Button
- Setting the Language for Translation
- Sending the Text for Translation
- Downloading the Translation
- Reading Data from File for Translation