
edin-dal/music_db


Music_DB

Wikipedia Scraper Instructions

The Wikipedia Scraper is a tool designed to extract, clean, and process music metadata from Wikipedia pages. It is part of the music_db repository and is located in the code/wikipedia directory. Follow these instructions to set up and run the scraper.

Prerequisites

Before you start, ensure you have Python and Git installed on your system. You need Git to clone the repository and access the Wikipedia Scraper. If you are unsure whether Git is installed, or need to install it, refer to the Git documentation.

Getting Started
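Before cloning, a quick check that both prerequisites are on the PATH can save a failed run later. The following is a minimal sketch, not part of the repository: missing_tools is a hypothetical helper, and python3 is an assumed command name (on some systems the interpreter is installed as python).

```shell
# Hypothetical helper (not part of music_db): prints the names of the
# given commands that cannot be found on the PATH, separated by spaces.
missing_tools() {
    missing=""
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            missing="$missing $tool"
        fi
    done
    # Drop the leading space left by the loop above.
    echo "${missing# }"
}

# Assumption: the Python interpreter is available as "python3".
not_found=$(missing_tools git python3)
if [ -n "$not_found" ]; then
    echo "Missing prerequisites: $not_found"
else
    echo "All prerequisites found."
fi
```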

Clone the Repository

Start by cloning the music_db repository to your local machine. Open a terminal or command prompt and run the following command:

git clone https://github.com/edin-dal/music_db

Navigate to the Wikipedia Scraper Directory

Change into the directory containing the Wikipedia Scraper script:

cd music_db/code/wikipedia

Make the Script Executable

The script needs execute permission before it can be run. Grant it by running:

chmod +x ./scrape_clean.sh

Run the Scraper

Now, you're ready to run the scraper. Execute the script by running:

./scrape_clean.sh

The script will begin processing. This may take some time.
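Because the run may take some time, it can help to capture the script's output in a log file and note how long it took. The following is a minimal sketch under that assumption: run_logged is a hypothetical wrapper, not something the repository provides, and the log file name is up to you.

```shell
# Hypothetical wrapper (not part of music_db): runs a command, writes its
# stdout and stderr to a log file, and reports exit status and duration.
run_logged() {
    log="$1"
    shift
    start=$(date +%s)
    status=0
    # Run the remaining arguments as the command; remember a failing status
    # instead of aborting, so the summary line is always printed.
    "$@" >"$log" 2>&1 || status=$?
    end=$(date +%s)
    echo "exit status $status after $((end - start))s, log in $log"
    return "$status"
}
```

From music_db/code/wikipedia, this could be used as `run_logged scrape.log ./scrape_clean.sh`, where scrape.log is an arbitrary file name chosen for illustration.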

Output Files

Upon successful completion, the script will create an output folder within the music_db/code/wikipedia directory. This folder will contain the final cleaned and processed data extracted from Wikipedia.

Troubleshooting

If you encounter permission errors while running ./scrape_clean.sh, check that you set the execution permissions as described in "Make the Script Executable" above.

Ensure you are in the correct directory (music_db/code/wikipedia) before running the script. Running it from a different directory may cause path-related errors.
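Both problems can be diagnosed before starting a run. The following is a minimal sketch: check_script is a hypothetical helper, not part of the repository, and it only tests whether the script exists at the given path and has its execute bit set.

```shell
# Hypothetical helper (not part of music_db): checks that the scraper
# script is present and executable. Defaults to ./scrape_clean.sh.
check_script() {
    script="${1:-./scrape_clean.sh}"
    if [ ! -f "$script" ]; then
        # Usually means the current directory is not music_db/code/wikipedia.
        echo "not found: $script (wrong directory?)" >&2
        return 1
    fi
    if [ ! -x "$script" ]; then
        echo "not executable: $script (run: chmod +x $script)" >&2
        return 2
    fi
    echo "ok: $script"
}
```

Running `check_script && ./scrape_clean.sh` starts the scraper only when both checks pass.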

Support

For questions, issues, or support regarding the Wikipedia Scraper, please open an issue in the GitHub repository, and I'll get back to you as soon as possible.
