Unexpected omission of HTML node that includes property attribute when node within ancestor node with datatype #63

@csarven

(Sorry for the convoluted issue title.)

Input:

<!DOCTYPE html>
<html lang="en" xml:lang="en" xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <meta charset="utf-8" />
    <title>Test</title>
  </head>

  <body about="" prefix="schema: http://schema.org/">
<div about="#foo" datatype="rdf:HTML" property="schema:description">
<div>
<span lang="" property="schema:encodingFormat" xml:lang="">WebM</span>
</div>
</div>
  </body>
</html>

Output in Turtle (from https://rdf-play.rubensworks.net/):

<https://dokie.li/tmp/test.html#foo> <http://schema.org/encodingFormat> "WebM\n";
    <http://schema.org/description> "\n<div xmlns=\"http://www.w3.org/1999/xhtml\" xmlns:schema=\"http://schema.org/\">\n</div>\n"^^<http://www.w3.org/1999/02/22-rdf-syntax-ns#HTML>

Note that the span node has disappeared completely from the schema:description literal. If we change the property attribute to rel and the span to an a element, the a element is preserved:

<https://dokie.li/tmp/test.html#foo> <http://schema.org/encodingFormat> <https://dokie.li/tmp/test.html#bar>;
    <http://schema.org/description> "\n<div xmlns=\"http://www.w3.org/1999/xhtml\" xmlns:schema=\"http://schema.org/\">\n<a rel=\"schema:encodingFormat\" href=\"#bar\" xmlns=\"http://www.w3.org/1999/xhtml\" xmlns:schema=\"http://schema.org/\">WebM</a>\n</div>\n"^^<http://www.w3.org/1999/02/22-rdf-syntax-ns#HTML>
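
For reference, the body of the input for this rel/a variant would look roughly like the sketch below. It is reconstructed from the output above rather than copied from the original test file: the href="#bar" value comes from the serialized literal, and leaving out lang and xml:lang is an assumption based on their absence there.

```html
<body about="" prefix="schema: http://schema.org/">
<div about="#foo" datatype="rdf:HTML" property="schema:description">
<div>
<a rel="schema:encodingFormat" href="#bar">WebM</a>
</div>
</div>
</body>
```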

Expected output for the original input (span wrapped in the inner div):

<https://dokie.li/tmp/test.html#foo> <http://schema.org/encodingFormat> "WebM\n";
    <http://schema.org/description> "\n<div xmlns=\"http://www.w3.org/1999/xhtml\" xmlns:schema=\"http://schema.org/\">\n<span lang=\"\" property=\"schema:encodingFormat\" xml:lang=\"\" xmlns=\"http://www.w3.org/1999/xhtml\" xmlns:schema=\"http://schema.org/\">WebM\n</span>\n</div>\n"^^<http://www.w3.org/1999/02/22-rdf-syntax-ns#HTML>

If we remove the div immediately around the span (in the input), the output seems to preserve the span:

<https://dokie.li/tmp/test.html#foo> <http://schema.org/encodingFormat> "WebM\n";
    <http://schema.org/description> "\n<span lang=\"\" property=\"schema:encodingFormat\" xml:lang=\"\" xmlns=\"http://www.w3.org/1999/xhtml\" xmlns:schema=\"http://schema.org/\">WebM\n</span>"^^<http://www.w3.org/1999/02/22-rdf-syntax-ns#HTML>
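
The input body for this last case, with the inner div removed, would be along these lines. This is a sketch derived from the original input, assuming nothing else in the document changes:

```html
<body about="" prefix="schema: http://schema.org/">
<div about="#foo" datatype="rdf:HTML" property="schema:description">
<span lang="" property="schema:encodingFormat" xml:lang="">WebM</span>
</div>
</body>
```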

Are these outputs conforming or did I misunderstand the processing rules?

Labels: bug
