CodingTumbleweed/MangaScrapper

MangaScrapper

A .NET desktop application for scraping manga series and chapters from various manga websites. The application uses XPath configurations to adapt to different manga sites and provides a WPF-based user interface.

Project Structure

The solution consists of three main projects:

  • MangaScrapper.Core: Core scraping functionality and business logic
  • MangaScrapper.UI: WPF-based user interface
  • MangaScrapper.Test: Unit tests for the core functionality

Features

  • Configurable manga website scraping using XPath
  • Asynchronous downloading of manga series and chapters
  • XML-based configuration storage
  • Logging support using log4net
  • Exception handling and custom exceptions
  • WPF-based user interface (in development)

Technology Stack

  • .NET Framework
  • WPF (Windows Presentation Foundation)
  • HtmlAgilityPack for HTML parsing
  • log4net for logging
  • XML serialization for configuration storage
  • Async/await for asynchronous operations

Getting Started

Prerequisites

  • Visual Studio 2017 or later
  • .NET Framework 4.5 or later

Setup

  1. Clone the repository
  2. Open MangaScrapper.sln in Visual Studio
  3. Restore NuGet packages
  4. Build the solution
  5. Run the application

Configuration

The application uses XML-based configuration to define manga website scraping rules. Configuration includes:

  • Website URLs
  • XPath expressions for series and chapter lists
  • Site-specific settings

Example configuration structure:

<ScrapperConfig>
  <ScrapperSources>
    <Source>
      <Name>MangaSite</Name>
      <AllSeriesUrl>http://example.com/manga-list</AllSeriesUrl>
      <XPath>
        <!-- XPath expressions for different elements -->
      </XPath>
    </Source>
  </ScrapperSources>
</ScrapperConfig>
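
As a rough illustration of how a file shaped like the example above could be read, here is a minimal sketch using the framework's XmlSerializer. The classes ScrapperConfig and Source and the helper ConfigLoader are hypothetical stand-ins written to mirror the element names in the example; the real types in MangaScrapper.Core may be named and structured differently. The children of the XPath element are not listed in this README, so the sketch leaves them unmapped.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

// Hypothetical classes mirroring the example XML above; the actual
// types in MangaScrapper.Core are not shown in this README.
[XmlRoot("ScrapperConfig")]
public class ScrapperConfig
{
    [XmlArray("ScrapperSources")]
    [XmlArrayItem("Source")]
    public List<Source> ScrapperSources { get; set; } = new List<Source>();
}

public class Source
{
    public string Name { get; set; }
    public string AllSeriesUrl { get; set; }
    // The <XPath> children are not documented here, so they are not mapped.
    // XmlSerializer skips elements that have no matching member.
}

public static class ConfigLoader
{
    // Deserializes a configuration document held in a string.
    public static ScrapperConfig Parse(string xml)
    {
        var serializer = new XmlSerializer(typeof(ScrapperConfig));
        using (var reader = new StringReader(xml))
        {
            return (ScrapperConfig)serializer.Deserialize(reader);
        }
    }
}
```

Calling ConfigLoader.Parse(File.ReadAllText(path)) would load a configuration file from disk, where path is wherever the application keeps its XML file.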

Core Components

Scrapper

The main facade class that coordinates:

  • Series list retrieval
  • Chapter list retrieval
  • HTML parsing
  • Configuration management

HTML Parser

Parses manga site pages with HtmlAgilityPack, selecting elements through configurable XPath expressions.
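
The XPath-driven approach can be sketched roughly as follows. SeriesListParser and ExtractTitles are hypothetical names chosen for illustration and are not taken from the project; only the HtmlAgilityPack calls (HtmlDocument.LoadHtml, SelectNodes, HtmlEntity.DeEntitize) are real library APIs.

```csharp
using System.Collections.Generic;
using HtmlAgilityPack;

// Hypothetical helper, not the project's actual parser class.
public static class SeriesListParser
{
    // Returns the decoded, trimmed text of every node matched by the XPath.
    public static List<string> ExtractTitles(string html, string xpath)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var titles = new List<string>();

        // SelectNodes returns null, not an empty collection, when nothing matches.
        HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes(xpath);
        if (nodes == null)
        {
            return titles;
        }

        foreach (HtmlNode node in nodes)
        {
            // InnerText keeps HTML entities such as &amp; so decode them.
            titles.Add(HtmlEntity.DeEntitize(node.InnerText).Trim());
        }
        return titles;
    }
}
```

With a site whose series links sit in a list, the XPath might look like //ul[@class='series']/li/a; that expression is an invented example, and the right one depends entirely on the target site's markup.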

Download Manager

Handles asynchronous downloading of web pages and content.
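
One way such asynchronous downloading might look is sketched below, assuming HttpClient as the transport; the README does not say which HTTP API the project actually uses. PageDownloader and DownloadAllAsync are hypothetical names. The second overload takes the fetch function as a parameter so the logic can be exercised without network access.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical helper, not the project's actual download manager.
public static class PageDownloader
{
    // A single shared HttpClient lets requests reuse connections.
    private static readonly HttpClient Client = new HttpClient();

    // Downloads every distinct URL over HTTP and returns url -> page body.
    public static Task<Dictionary<string, string>> DownloadAllAsync(IEnumerable<string> urls)
    {
        return DownloadAllAsync(urls, url => Client.GetStringAsync(url));
    }

    // Same as above, but with a caller-supplied fetch function.
    public static async Task<Dictionary<string, string>> DownloadAllAsync(
        IEnumerable<string> urls, Func<string, Task<string>> fetch)
    {
        // Start every request before awaiting any, so the downloads overlap.
        Dictionary<string, Task<string>> pending =
            urls.Distinct().ToDictionary(url => url, url => fetch(url));

        // Awaiting rethrows the first failure if any download faulted.
        await Task.WhenAll(pending.Values);

        return pending.ToDictionary(pair => pair.Key, pair => pair.Value.Result);
    }
}
```

This sketch starts all requests at once; a real scraper would likely cap concurrency and retry failures so it does not overload the target site.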

Configuration Management

Manages XML-based configurations for different manga websites.

Testing

The project includes unit tests covering:

  • Configuration management
  • HTML parsing
  • Download functionality
  • Logging system

Run tests using Visual Studio's Test Explorer.

Logging

The application uses log4net for logging. Logging is configured in log4net.config, including:

  • File logging
  • Console logging
  • Custom log levels
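
For orientation, the sketch below shows one common way to load a log4net configuration file and write log entries from code. LoggingSetup and its methods are hypothetical; how MangaScrapper actually initializes log4net (for example through an assembly attribute instead) is not stated in this README.

```csharp
using System;
using System.IO;
using log4net;
using log4net.Config;

// Hypothetical bootstrap class, not taken from the project.
public static class LoggingSetup
{
    private static readonly ILog Log = LogManager.GetLogger(typeof(LoggingSetup));

    // Loads appenders and levels from the given file, e.g. "log4net.config".
    public static void Configure(string configPath)
    {
        XmlConfigurator.Configure(new FileInfo(configPath));
        Log.Info("log4net configured from " + configPath);
    }

    // Logs a failure together with the exception's stack trace.
    public static void LogDownloadFailure(string url, Exception error)
    {
        Log.Error("Download failed for " + url, error);
    }
}
```

Where the entries end up (file, console, or both) is decided by the appenders declared in the configuration file, not by this code.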

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Commit your changes
  4. Push to the branch
  5. Create a Pull Request

License

This project is licensed under the MIT License. See the LICENSE file for details.

About

Download Manga Series for offline viewing
