From 5e88e2ff10e8358d368f7fc7ab7be5a9a078edc0 Mon Sep 17 00:00:00 2001
From: thetoppython <72176927+thetoppython@users.noreply.github.com>
Date: Fri, 2 Oct 2020 09:23:15 +0530
Subject: [PATCH] Update README.md

---
 README.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/README.md b/README.md
index 90bbeae..7242285 100644
--- a/README.md
+++ b/README.md
@@ -1,6 +1,6 @@
 # Nextdoor Scraper
 
-At the time of development, Nextdoor could not be scraped easily using more traditional methods (e.g. Scrapy, Beautiful Soup, etc.) because requests to retrieve the next set of posts use a "random" number as a parameter.
+At the time of website or software development, Nextdoor could not be scraped easily using more traditional methods like Scrapy, Beautiful Soup etc. because requests to retrieve the next set of posts use a "random" number as a parameter.
 
 Thus, this is a simple python script that uses Selenium to simulate user input to scrape relevant data off nextdoor.com. It uses a chromedriver (included in this repo) as the browser.
 
@@ -15,8 +15,8 @@ Once a virtual environment is built, `pip install -r requirements.txt` must be r
 Feel free to fork this repo and make it your own! This was just a personal project of mine, but if it is useful to anyone else, I'm happy to share this project. If you'd like to use it as is:
 
 1. Clone the repository into your directory of choosing.
-2. Create your own `.env` file, and fill out the variables
-3. Open command prompt, navigate to the Nextdoor_Scraper directory, and run:
+2. Create your own enviroment `.env` file, and fill out the variables
+3. Open command prompt, navigate to the Nextdoor_Scraper directory, and run commands mentioned bellow:
    * `python nextdoor.py` if you don't want to save the html file separately (as backup in case of failure)
    * `python html_saver.py` if you want to save the html files and `python html_scraper.py` to scrape the local files separately (more stable for longer scrapes since it'll save the files)