AliExpress Scraper

Python scraper for AliExpress.com - the global e-commerce marketplace. Extracts product listings including titles, prices, images, and seller information using the ScrapingAnt API.

Features

  • Search products by keyword
  • Pagination support (multiple pages)
  • Extract product details:
    • Title
    • Price
    • Product ID and URL
    • Image URL
    • Sold count
    • Rating
    • Shipping info
  • Export to CSV and JSON
  • Automatic deduplication
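
Deduplication keys on the product ID; a minimal sketch of the idea (the function and field names here are illustrative, not taken from the codebase):

```python
def deduplicate(listings):
    """Keep only the first occurrence of each product_id."""
    seen = set()
    unique = []
    for item in listings:
        pid = item.get("product_id")
        if pid and pid not in seen:
            seen.add(pid)
            unique.append(item)
    return unique

# Listings collected from two overlapping result pages
pages = [
    {"product_id": "1005001", "title": "Laptop A"},
    {"product_id": "1005002", "title": "Laptop B"},
    {"product_id": "1005001", "title": "Laptop A"},  # duplicate
]
print(len(deduplicate(pages)))  # → 2
```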

Requirements

  • Python 3
  • A ScrapingAnt API key (the free plan works)

Note: the free plan includes 10,000 API credits, and each request with browser rendering uses approximately 10 credits, so the free budget covers roughly 1,000 rendered page fetches.

Installation

  1. Clone the repository:
git clone https://github.com/kami4ka/AliExpressScraper.git
cd AliExpressScraper
  2. Create and activate a virtual environment:
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
  3. Install dependencies:
pip install -r requirements.txt
  4. Set your ScrapingAnt API key:
export SCRAPINGANT_API_KEY="your_api_key_here"

Usage

Basic Usage

python main.py laptop

With Options

# Scrape 3 pages of phone listings
python main.py phone -p 3

# Export as JSON too
python main.py headphones --json

# Custom output directory
python main.py keyboard -o results/

# Specify API key directly
python main.py monitor --api-key YOUR_API_KEY

Command Line Arguments

| Argument | Short | Description | Default |
| --- | --- | --- | --- |
| keyword | | Search keyword (required) | - |
| --pages | -p | Number of pages to scrape | 2 |
| --delay | -d | Delay between requests (seconds) | 3.0 |
| --output | -o | Output directory | output/ |
| --json | | Also export as JSON | False |
| --api-key | | ScrapingAnt API key | SCRAPINGANT_API_KEY env var |
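
The arguments above map onto a standard argparse setup; a sketch of how main.py might declare them (details of the real CLI may differ):

```python
import argparse
import os

parser = argparse.ArgumentParser(description="AliExpress scraper")
parser.add_argument("keyword", help="Search keyword (required)")
parser.add_argument("--pages", "-p", type=int, default=2,
                    help="Number of pages to scrape")
parser.add_argument("--delay", "-d", type=float, default=3.0,
                    help="Delay between requests in seconds")
parser.add_argument("--output", "-o", default="output/",
                    help="Output directory")
parser.add_argument("--json", action="store_true",
                    help="Also export as JSON")
parser.add_argument("--api-key",
                    default=os.environ.get("SCRAPINGANT_API_KEY"),
                    help="ScrapingAnt API key (falls back to the env var)")

args = parser.parse_args(["phone", "-p", "3"])
print(args.keyword, args.pages, args.delay)  # → phone 3 3.0
```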

Extracted Data Fields

| Field | Description |
| --- | --- |
| title | Product title |
| price | Current price |
| product_id | Unique product ID |
| product_url | Direct link to product page |
| image_url | Product image URL |
| original_price | Original price before discount |
| sold_count | Number of items sold |
| rating | Product rating |
| store_name | Seller store name |
| shipping | Shipping information |
| scraped_at | Timestamp of scraping |
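
These fields correspond to the ProductListing model in models.py; a hedged sketch of what that dataclass might look like (the actual definition, types, and defaults may differ):

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class ProductListing:
    title: str
    price: str
    product_id: str
    product_url: str
    image_url: str = ""
    original_price: str = ""
    sold_count: str = ""
    rating: str = ""
    store_name: str = ""
    shipping: str = ""
    # Timestamp of scraping, filled in automatically
    scraped_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

item = ProductListing(
    title="Gaming Laptop 15.6\"",
    price="$499.00",
    product_id="1005006",
    product_url="https://www.aliexpress.com/item/1005006.html",
)
print(item.product_id)  # → 1005006
```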

How It Works

  1. Builds search URL with keyword and pagination
  2. Sends request to ScrapingAnt API with browser rendering
  3. Parses HTML response using BeautifulSoup
  4. Extracts product data from listing cards
  5. Deduplicates results by product ID
  6. Exports to CSV/JSON
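
Steps 1 and 6 can be sketched with the standard library alone (the slug handling for multi-word keywords is an assumption; steps 2-4 go through the ScrapingAnt API and BeautifulSoup inside scraper.py and are omitted here):

```python
import csv
from pathlib import Path
from urllib.parse import quote_plus

def build_search_url(keyword: str, page: int = 1) -> str:
    """Step 1: build the wholesale search URL, adding ?page=N past page 1."""
    slug = quote_plus(keyword.strip().lower().replace(" ", "-"))
    url = f"https://www.aliexpress.com/w/wholesale-{slug}.html"
    return url if page == 1 else f"{url}?page={page}"

def export_csv(listings: list[dict], path: str) -> None:
    """Step 6: write the deduplicated listings to a CSV file."""
    Path(path).parent.mkdir(parents=True, exist_ok=True)
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=list(listings[0]))
        writer.writeheader()
        writer.writerows(listings)

print(build_search_url("laptop", 2))
# → https://www.aliexpress.com/w/wholesale-laptop.html?page=2
```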

Example Output

AliExpress Scraper
==================================================
Keyword: laptop
Max pages: 2
==================================================
Scraping page 1: https://www.aliexpress.com/w/wholesale-laptop.html
Page 1: scraped 15 listings, 15 new
Scraping page 2: https://www.aliexpress.com/w/wholesale-laptop.html?page=2
Page 2: scraped 15 listings, 15 new

==================================================
Total listings scraped: 30
Results exported to: output/aliexpress_laptop_20260113_180000.csv

Project Structure

AliExpressScraper/
├── config.py          # Configuration and constants
├── models.py          # Data models (ProductListing)
├── scraper.py         # Main scraper class
├── utils.py           # Utility functions
├── main.py            # CLI entry point
├── requirements.txt   # Python dependencies
├── .gitignore         # Git ignore rules
├── output/            # Output directory
│   └── .gitkeep
└── README.md          # Documentation

Troubleshooting

Empty results

  • AliExpress listing pages are JavaScript-heavy; the scraper relies on ScrapingAnt's browser rendering to load them, so a rendering failure can come back as an empty page.
  • Try increasing the delay between requests with the -d flag.

Rate limiting

  • The default 3-second delay should be sufficient for most runs.
  • If you still hit rate limits, increase the delay with the -d flag.
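
If longer delays alone don't help, wrapping the request in a simple exponential backoff is a common remedy; a sketch (the retry parameters and fetch callable are illustrative, not part of the project):

```python
import time

def fetch_with_backoff(fetch, url, retries=3, base_delay=3.0):
    """Call fetch(url), retrying with exponential backoff on failure."""
    for attempt in range(retries):
        try:
            return fetch(url)
        except Exception:
            if attempt == retries - 1:
                raise
            # 3s, 6s, 12s, ... between attempts with the defaults
            time.sleep(base_delay * (2 ** attempt))

# Demo with a fetcher that fails twice before succeeding
calls = {"n": 0}
def flaky(url):
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("429 Too Many Requests")
    return "<html>ok</html>"

result = fetch_with_backoff(flaky, "https://example.com", base_delay=0.01)
print(result)  # → <html>ok</html>
```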

License

MIT License
