Open
Labels
bug: Something isn't working
Description
We are getting the following unexpected output when parsing HTML:
Input:
<!DOCTYPE html>
<html lang="en" xml:lang="en" xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <meta charset="utf-8" />
    <title></title>
    <meta content="width=device-width, initial-scale=1" name="viewport" />
  </head>
  <body about="" prefix="rdf: http://www.w3.org/1999/02/22-rdf-syntax-ns# schema: http://schema.org/">
    <main>
      <article>
        <div datatype="rdf:HTML" id="content" property="schema:description">
          <p>foo</p>
          <div rel="schema:hasPart" resource="#bar">
            <p property="schema:description" datatype="rdf:HTML"><span>bar</span></p>
          </div>
        </div>
      </article>
    </main>
  </body>
</html>

Output:
<https://dokie.li/tmp/test.html#bar> <http://schema.org/description> "<span xmlns=\"http://www.w3.org/1999/xhtml\" xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\" xmlns:schema=\"http://schema.org/\">bar</span>"^^<http://www.w3.org/1999/02/22-rdf-syntax-ns#HTML> .
<https://dokie.li/tmp/test.html> <http://schema.org/description> "\n <p xmlns=\"http://www.w3.org/1999/xhtml\" xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\" xmlns:schema=\"http://schema.org/\">foo</p>\n <div rel=\"schema:hasPart\" resource=\"#bar\" xmlns=\"http://www.w3.org/1999/xhtml\" xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\" xmlns:schema=\"http://schema.org/\">\n \n </div>\n"^^<http://www.w3.org/1999/02/22-rdf-syntax-ns#HTML> .
<https://dokie.li/tmp/test.html> <http://schema.org/hasPart> <https://dokie.li/tmp/test.html#bar> .

Expected (from http://rdf.greggkellogg.net/distiller):
<http://example.org/> <http://schema.org/description> "\n <p>foo</p>\n <div rel=\"schema:hasPart\" resource=\"#bar\">\n <p property=\"schema:description\" datatype=\"rdf:HTML\"><span>bar</span></p>\n </div>\n"^^<http://www.w3.org/1999/02/22-rdf-syntax-ns#HTML> .
<http://example.org/> <http://schema.org/hasPart> <http://example.org/#bar> .
<http://example.org/#bar> <http://schema.org/description> "<span>bar</span>"^^<http://www.w3.org/1999/02/22-rdf-syntax-ns#HTML> .

Note that in our output the outer schema:description literal is missing the markup and content inside the div (\n <p property=\"schema:description\" datatype=\"rdf:HTML\"><span>bar</span></p>\n).

Is this a bug in rdf-ext / rdfa-streaming-parser, or does the issue perhaps lie on our end somehow? It'd be great if you could reproduce / confirm.
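
To help narrow this down, here is a rough sketch (not a statement about the parser's internals) that compares the two outer literals quoted above. It assumes the extra xmlns declarations and whitespace differences are not the point of this report, so it strips them before comparing. The helper names strip_xmlns and normalize are made up for this illustration, and the strings are copied from the output above with the N-Triples escapes decoded.

```python
import re

# Outer schema:description literal as reported by the parser (escapes decoded).
ACTUAL = (
    '\n <p xmlns="http://www.w3.org/1999/xhtml"'
    ' xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"'
    ' xmlns:schema="http://schema.org/">foo</p>\n'
    ' <div rel="schema:hasPart" resource="#bar"'
    ' xmlns="http://www.w3.org/1999/xhtml"'
    ' xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"'
    ' xmlns:schema="http://schema.org/">\n \n </div>\n'
)

# Same literal as produced by the distiller (escapes decoded).
EXPECTED = (
    '\n <p>foo</p>\n'
    ' <div rel="schema:hasPart" resource="#bar">\n'
    ' <p property="schema:description" datatype="rdf:HTML"><span>bar</span></p>\n'
    ' </div>\n'
)

# The nested element that appears only in the expected literal.
NESTED = '<p property="schema:description" datatype="rdf:HTML"><span>bar</span></p>'


def strip_xmlns(markup: str) -> str:
    """Drop xmlns and xmlns:prefix attributes (hypothetical helper)."""
    return re.sub(r'\s+xmlns(?::\w+)?="[^"]*"', '', markup)


def normalize(markup: str) -> str:
    """Strip namespace declarations and collapse runs of whitespace."""
    return re.sub(r'\s+', ' ', strip_xmlns(markup)).strip()


if __name__ == "__main__":
    # Do the literals still differ once namespaces and whitespace are ignored?
    print(normalize(ACTUAL) != normalize(EXPECTED))
    # Is the nested property element the only remaining difference?
    print(normalize(ACTUAL) == normalize(EXPECTED.replace(NESTED, "")))
```

For the literals in this report both checks hold, which suggests that, apart from the added namespace declarations, the only difference is the dropped nested element that carries its own property and datatype attributes.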