priyapatel2006/python-web-scraping

Python Web Scraping Project

This repository contains a small Python project that scrapes IMDb's top movies list and organizes the results. Three scripts (task1.py, task2.py, task3.py) demonstrate basic scraping, caching, and grouping of the scraped data.

Contents

  • task1.py - Scrapes IMDb's compact list page for top movies, extracting each movie's title, ranking, year, rating, and link. Saves the results to movies.json as a cache so that later runs avoid repeated downloads.
  • task2.py - Imports scrape_top_list from task1 and groups the scraped movies by release year. Prints a dictionary where each key is a year and the value is the list of movies released that year.
  • task3.py - Uses the same movie data and organizes it by decade. It computes decade boundaries from the minimum and maximum years and prints movies grouped under each ten‑year span.
  • movies.json - JSON file used to store the list of movie dictionaries returned by scrape_top_list. This file acts as both input (when already present) and output of the scraping process.
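The grouping steps described for task2.py and task3.py can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the function names group_by_year and group_by_decade, the "year" key, and the placeholder sample data are all assumptions made for the example.

```python
from collections import defaultdict


def group_by_year(movies):
    """Map each release year to the list of movies released that year."""
    grouped = defaultdict(list)
    for movie in movies:
        # Assumes each movie dict stores its release year as an int under "year".
        grouped[movie["year"]].append(movie)
    return dict(grouped)


def group_by_decade(movies):
    """Map each decade's starting year to the movies released in that decade."""
    if not movies:
        return {}
    years = [movie["year"] for movie in movies]
    # Decade boundaries come from the minimum and maximum years.
    first_decade = min(years) // 10 * 10
    last_decade = max(years) // 10 * 10
    # Pre-create every decade in range so empty decades still appear.
    grouped = {start: [] for start in range(first_decade, last_decade + 10, 10)}
    for movie in movies:
        grouped[movie["year"] // 10 * 10].append(movie)
    return grouped


if __name__ == "__main__":
    # Placeholder records, not real scraped data.
    sample = [
        {"title": "Movie A", "year": 1994},
        {"title": "Movie B", "year": 1999},
        {"title": "Movie C", "year": 2015},
    ]
    print(group_by_year(sample))
    print(group_by_decade(sample))
```

Keying each decade by its starting year (1990, 2000, and so on) keeps the lookup a single integer division; the real scripts may label the spans differently.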

Getting Started

  1. Prerequisites

    • Python 3.x installed on your system
    • A virtual environment (recommended)
  2. Installation

    python -m venv venv
    .\venv\Scripts\activate          # On Windows
    source venv/bin/activate         # On macOS/Linux
    pip install -r requirements.txt  # only if a requirements.txt is present
  3. Usage

    Run the scripts individually to perform the different scraping tasks:

    python task1.py
    python task2.py
    python task3.py
  4. Data

    The movies.json file holds the movie data that the scripts generate and reuse.
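The way movies.json serves as both input and output can be sketched as a simple load-or-scrape cache. This is only an illustration under stated assumptions: load_or_scrape and fake_scrape are hypothetical names, and the actual scrape_top_list in task1.py may take different arguments or handle the file differently.

```python
import json
import os
import tempfile


def load_or_scrape(cache_path, scrape):
    """Return cached movies if cache_path exists; otherwise scrape and cache."""
    if os.path.exists(cache_path):
        # Cache hit: read the stored list instead of downloading again.
        with open(cache_path, encoding="utf-8") as f:
            return json.load(f)
    # Cache miss: run the scraper once and store its result as JSON.
    movies = scrape()
    with open(cache_path, "w", encoding="utf-8") as f:
        json.dump(movies, f, indent=2)
    return movies


if __name__ == "__main__":
    def fake_scrape():
        # Hypothetical stand-in for the real scraper; returns placeholder data.
        return [{"title": "Movie A", "year": 1994}]

    # Use a temporary directory so the demo never touches a real movies.json.
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "movies.json")
        first = load_or_scrape(path, fake_scrape)   # scrapes and writes the file
        second = load_or_scrape(path, fake_scrape)  # reads the cached file
        print(first == second)
```

With a cache like this, deleting movies.json is what forces a fresh download on the next run.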

Contributing

Feel free to fork the repository and submit pull requests for improvements or additional examples.

License

This project is provided under the MIT License. A LICENSE file is not included in the repository by default.
