Python Web Scraping Project

This repository contains a simple Python project focused on web scraping. It includes three scripts (task1.py, task2.py, task3.py) that demonstrate basic scraping techniques and data handling.

- task1.py - Scrapes IMDb's compact list page for top movies and extracts each movie's title, ranking, year, rating, and link. Results are saved to movies.json, which serves as a cache to avoid repeated downloads.
- task2.py - Imports scrape_top_list from task1 and groups the scraped movies by release year. Prints a dictionary in which each key is a year and each value is the list of movies released that year.
- task3.py - Uses the same movie data and organizes it by decade. Computes decade boundaries from the minimum and maximum years and prints the movies grouped under each ten-year span.
- movies.json - Stores the list of movie dictionaries returned by scrape_top_list. The file is both the input (when already present) and the output of the scraping process.
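
The grouping described for task2.py and task3.py can be sketched as follows. This is a minimal illustration, not the repository's actual code: the function names `group_by_year` and `group_by_decade`, the sample records, and the dictionary keys `title` and `year` are assumptions about the shape of the data returned by scrape_top_list.

```python
from collections import defaultdict


def group_by_year(movies):
    """Map each release year to the list of movies released that year."""
    by_year = defaultdict(list)
    for movie in movies:
        by_year[movie["year"]].append(movie)
    return dict(by_year)


def group_by_decade(movies):
    """Map each decade start (e.g. 1990) to the movies released in that span."""
    if not movies:
        return {}
    years = [movie["year"] for movie in movies]
    # Decade boundaries come from the minimum and maximum years.
    first_decade = min(years) // 10 * 10
    last_decade = max(years) // 10 * 10
    # Create every decade between the boundaries, even if it stays empty.
    by_decade = {start: [] for start in range(first_decade, last_decade + 10, 10)}
    for movie in movies:
        by_decade[movie["year"] // 10 * 10].append(movie)
    return by_decade


# Hypothetical sample records; the real ones would come from scrape_top_list.
sample = [
    {"title": "Movie A", "year": 1994},
    {"title": "Movie B", "year": 1999},
    {"title": "Movie C", "year": 2008},
    {"title": "Movie D", "year": 1972},
]

by_year = group_by_year(sample)
by_decade = group_by_decade(sample)
print(sorted(by_year))
print({decade: [m["title"] for m in group] for decade, group in by_decade.items()})
```

In this sketch a decade with no movies (here the 1980s) still appears as a key with an empty list. Whether task3.py keeps or drops empty decades is not stated, so treat that as a design choice of the example.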

Getting Started

Prerequisites
- Python 3.x installed on your system
- Recommended to use a virtual environment

Installation

python -m venv venv
.\venv\Scripts\activate   # On Windows
source venv/bin/activate  # On macOS/Linux
pip install -r requirements.txt  # only if a requirements.txt file is present

Usage

Run the scripts individually to perform different scraping tasks:
```
python task1.py
python task2.py
python task3.py
```

Data

The movies.json file contains sample data used or generated by the scripts.
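
The way movies.json doubles as input and output (read it when present, otherwise scrape and write it) can be sketched as below. This is an illustration under stated assumptions rather than the code in task1.py: `load_or_scrape` and `fake_scrape` are hypothetical names, and the stand-in scraper returns a made-up record instead of contacting IMDb.

```python
import json
import tempfile
from pathlib import Path


def load_or_scrape(cache_path, scrape):
    """Return cached movies if the file exists; otherwise scrape and cache them."""
    cache_path = Path(cache_path)
    if cache_path.exists():
        # Cache hit: reuse the stored list and skip the download.
        with cache_path.open(encoding="utf-8") as fh:
            return json.load(fh)
    movies = scrape()
    with cache_path.open("w", encoding="utf-8") as fh:
        json.dump(movies, fh, indent=2)
    return movies


calls = []


def fake_scrape():
    """Hypothetical stand-in for scrape_top_list; performs no network access."""
    calls.append(1)
    return [{"title": "Movie A", "year": 1994, "ranking": 1}]


# Use a temporary directory so the demo never touches a real movies.json.
with tempfile.TemporaryDirectory() as tmp:
    cache_file = Path(tmp) / "movies.json"
    first = load_or_scrape(cache_file, fake_scrape)   # scrapes and writes the file
    second = load_or_scrape(cache_file, fake_scrape)  # reads the cached file

print(first == second, len(calls))
```

With a pattern like this, deleting movies.json is enough to force a fresh scrape on the next run.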

Contributing

Feel free to fork the repository and submit pull requests for improvements or additional examples.

License

This project is provided under the MIT License. A LICENSE file is not included by default.
