A few things I found while looking for pathological cases:
http://do.ma.in:80/what/ever is normalized to http://do.ma.in/what/ever
and so is https://do.ma.in/what/ever
but https://do.ma.in:442/what/ever comes out as http://do.ma.in:443/what/ever
http://10.2.3.4/hello/world.html comes out as http://2.3.4/hello/world.html
Spaces and %20 in query strings are normalized to +
but %20 and + in path are left as is
a literal space in path is changed to %20
UTF-8 in path is %-quoted, but %27 is turned into '
(BUT a literal ' is left alone, so the result is uniform, even though ' is officially a delimiter in https://datatracker.ietf.org/doc/html/rfc3986#section-2.2)
The above two were seen in the wild in:
http://www.seychellesnewsagency.com/articles/19841/Over++Seychelles%27+households+received+financial+assistance+following+Dec.++disasters
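
For comparison, here is a rough sketch of what I would have expected for the port cases, plus a quick look at why + behaves differently in paths and query strings. It uses only Python's standard urllib.parse; strip_default_port is a made-up helper for illustration and has nothing to do with the normalizer's actual code. It assumes only http and https matter and ignores userinfo and IPv6 literals.

```python
from urllib.parse import urlsplit, urlunsplit, unquote, unquote_plus

# Default ports for the two schemes discussed above (assumption: no others matter here).
DEFAULT_PORTS = {"http": 80, "https": 443}


def strip_default_port(url: str) -> str:
    """Hypothetical helper: drop the port only when it is the scheme's default.

    Sketch only: userinfo and IPv6 literals are not handled.
    The scheme and the host are passed through untouched.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    if parts.port is not None and parts.port != DEFAULT_PORTS.get(parts.scheme):
        host = f"{host}:{parts.port}"
    return urlunsplit((parts.scheme, host, parts.path, parts.query, parts.fragment))


for url in (
    "http://do.ma.in:80/what/ever",
    "https://do.ma.in:442/what/ever",
    "http://10.2.3.4/hello/world.html",
):
    print(url, "->", strip_default_port(url))

# "+" stands for a space only in form-encoded query strings, not in paths.
segment = "Over++Seychelles%27+households"
print(unquote(segment))       # path-style decoding keeps "+" as is
print(unquote_plus(segment))  # query-style decoding turns "+" into a space
```

With this sketch only the :80 on the first URL is dropped; the non-default port, the https scheme, and the 10.2.3.4 host all survive unchanged. The last two lines show why rewriting + in a path (as in the URL above) is risky: a path decoder and a query decoder disagree on what it means.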