A simple command-line tool for checking the validity of links on a small webpage.
Make sure you have Rust installed, then:
git clone https://github.com/matildasmeds/link_checker
cd link_checker
cargo build --release
./target/release/link_checker https://www.example.com

Replace the URL with the specific URL you want to check!
You can, of course, run it with the good old `cargo run` as well:
cargo run https://www.example.com
We use Tokio as the async runtime, with shared mutable state based on `Arc` and `Mutex`.
A recursive function, `visit_url()`, visits links and scrapes the HTML bodies for more links, but it only follows links that are on the same domain as the starting URL.
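A minimal sketch of how such a recursive crawl can look, assuming `reqwest`, `scraper`, and `url` as dependencies; the name `visit_url` matches the description above, but the exact signature, fields, and error handling here are illustrative rather than copied from the repository:

```rust
use std::collections::HashSet;
use std::sync::{Arc, Mutex};

use reqwest::Client;
use scraper::{Html, Selector};
use url::Url;

type Visited = Arc<Mutex<HashSet<Url>>>;

fn visit_url(
    client: Client,
    url: Url,
    root_domain: String,
    visited: Visited,
) -> std::pin::Pin<Box<dyn std::future::Future<Output = ()> + Send>> {
    // Async recursion needs a boxed future.
    Box::pin(async move {
        // Skip URLs we have already checked (shared state behind Arc<Mutex<_>>).
        if !visited.lock().unwrap().insert(url.clone()) {
            return;
        }

        let body = match client.get(url.clone()).send().await {
            Ok(resp) => match resp.text().await {
                Ok(text) => text,
                Err(_) => return,
            },
            Err(e) => {
                eprintln!("FAILED {url}: {e}");
                return;
            }
        };

        // Scrape the HTML body for more links. The parsed document is dropped
        // before the awaits below, so the future stays Send.
        let links: Vec<Url> = {
            let document = Html::parse_document(&body);
            let selector = Selector::parse("a[href]").unwrap();
            document
                .select(&selector)
                .filter_map(|a| a.value().attr("href"))
                .filter_map(|href| url.join(href).ok())
                .collect()
        };

        for link in links {
            // Only recurse into links on the same domain as the starting URL.
            if link.domain() == Some(root_domain.as_str()) {
                visit_url(client.clone(), link, root_domain.clone(), visited.clone()).await;
            }
        }
    })
}
```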
For links outside the domain, we make a HEAD request instead, because that is enough to validate that the endpoint exists. It also works well for websites such as StackOverflow, Wikipedia and LinkedIn, which tend to reject scrapers: instead of 403s (or LinkedIn's custom 999 status), we get 200s or a 405, and either way we know the link exists.
We treat 405 (Method Not Allowed) as a valid link since the server is confirming the resource exists.
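A hedged sketch of that external-link check under the same assumptions: a HEAD request whose success (or 405) is taken as proof that the endpoint exists. The helper name `external_link_ok` is made up for illustration:

```rust
use reqwest::{Client, StatusCode};

async fn external_link_ok(client: &Client, url: &str) -> bool {
    match client.head(url).send().await {
        // 405 Method Not Allowed still confirms the resource exists;
        // the server just refuses to serve HEAD for it.
        Ok(resp) => {
            resp.status().is_success() || resp.status() == StatusCode::METHOD_NOT_ALLOWED
        }
        Err(_) => false,
    }
}
```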
- The checked links are kept in memory with no bound. If the collection exceeds what the process can allocate, the program will crash.
- We don't validate that `#section` fragments actually exist on the target pages.
- While we limit the number of parallel requests with a semaphore (see the sketch after this list), there is no delay or rate limiting between requests.
- No retries, so in case of temporary glitches, you might need to run the program again.
- Error handling is on the simple side.
- There are no configuration options at the moment. The code can be easily adapted for that, if needed.
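For reference, this is roughly what a semaphore-based cap on parallel requests looks like with `tokio::sync::Semaphore`; the limit of 10, the URL list, and the task body are placeholders, not values taken from this repository:

```rust
use std::sync::Arc;
use tokio::sync::Semaphore;

#[tokio::main]
async fn main() {
    // At most 10 requests in flight at any time (placeholder value).
    let semaphore = Arc::new(Semaphore::new(10));
    let urls = vec!["https://www.example.com/a", "https://www.example.com/b"];

    let mut handles = Vec::new();
    for url in urls {
        // Wait for a free permit before spawning the next task.
        let permit = semaphore.clone().acquire_owned().await.unwrap();
        handles.push(tokio::spawn(async move {
            // ... perform the HTTP request for `url` here ...
            println!("checking {url}");
            drop(permit); // releasing the permit lets the next task start
        }));
    }
    for handle in handles {
        handle.await.unwrap();
    }
}
```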