Skip to content

Improve error message for unclosed components whose attribute value might have consumed the slash.#140

Closed
ianjosephwilson wants to merge 20 commits into
t-strings:mainfrom
ianjosephwilson:ian/improve_component_unclosed_error
Closed

Improve error message for unclosed components whose attribute value might have consumed the slash.#140
ianjosephwilson wants to merge 20 commits into
t-strings:mainfrom
ianjosephwilson:ian/improve_component_unclosed_error

Conversation

@ianjosephwilson

Copy link
Copy Markdown
Contributor

Try to give more detailed error messages when a component's unquoted attribute consumes the "/" AND an error occurs where it is not closed.

This is the first step in trying to get through #132

Examples that fail but were meant to probably self-close the component:
- t"<{Comp} title={title}/>"
- t"<div><{Comp} title={title}/></div>"
- t"<{Comp2}><{Comp1} title={title}/></{Comp2}>""

The last example requires inspecting the interpolations to resolve that Comp1 != Comp2.

This brings up a past issue we tried to skip over where this would parse but fail processing:

  • t"<{Comp1}></{Comp2}>"

Comment thread tdom/parser_test.py Outdated
with pytest.raises(
TypeError, match="Component start and end tags must contain the same callable."
):
_ = TemplateParser.parse(t"<{Component1}></{Component2}>")

@davepeck davepeck Jun 14, 2026

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Love that we're revisiting this...

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haha yeah, clearly it is one of the project's hard to scratch itches. Anytime a large involved change comes along ... so does it.

Comment thread tdom/placeholders.py
Comment thread tdom/parser.py
Comment thread tdom/parser.py Outdated
# HERE BE DRAGONS: the interpolation at end_i_index shuld be a
# component callable that matches the start tag. We do not check
# any of this in the parser, instead relying on higher layers.
if not source.expressions_match(

@davepeck davepeck Jun 14, 2026

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...but there's a catch, which is that "cache only on .strings" is problematic now that we're looking at interpolation values in the parser.

def C1():
    pass

def C2():
    pass

proxy = CachedTemplateParserProxy()
proxy.to_tnode(t"<{C1}></{C1}>")  # works as expected
proxy.to_tnode(t"<{C1}></{C2}>")  # doesn't raise because we hit the cache

I'm not sure exactly how to think about this. I know some check is necessary to handle your t"<{C2}><{C1} attr=value/></{C2}></{C2}>" case. But I'm reluctant to change our caching story.

I wonder if we could compute has_ambiguous_slash in the parser and stash it on the TComponent, leaving it up to the processor to surface a good error message.

@davepeck davepeck Jun 14, 2026

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Random aside: is the expression check before the value check a perf thing? In the common case, it seems good. But I suppose we can craft identical expressions that may not be equivalent, like t"<{(lambda: 0)}></{(lambda: 0)}>")

@ianjosephwilson ianjosephwilson Jun 14, 2026

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess, in this case, I am ok NOT involving this in the caching story. I think it is okay to say that the open tag value and the closing tag value for component tags need to have the same value if you want it parsed correctly ( or always?). After the "first" time whenever that is as there might be a startup stampede then we'll just assume you passed in the right values otherwise the behavior is "unpredictable". The only part I don't like is digging in the interpolation values "in general". I think if for example you wanted to pre-parse the templates then you could pass dummy values that follow that rule and it should still work. Otherwise we end up with essentially this situation t'<{C1}><{C2}></*><{C3}></*></*>'. I feel like confusing error messages would be worse but this also seems like something we could potentially remove later if it was a problem or even make configurable?

An alternative which might be possible is that when the inevitable failure occurs then we backtrack and try to rerun through the tags and decipher which component was incorrectly closed with a different component for whatever reason, the common case probably being this: t'<{C1}><{C2} title=/></{C1}>', where </{C1}> closes <{C2} title=/> because the / was consumed as part of title. Then the interpolation introspection only occurs during errors. We could try to make that work?

Comment thread tdom/parser.py Outdated
# Check for tags that might have meant to self-close but whose
# unquoted last attribute value consumed a "/", ie. <div id=app/>.
parent = self.stack[-1]
# @TODO: We need to determine which tags this might apply to, this only applies to components.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah this PR sequences before the work that would answer this. But it does feel like where we land is: anything self closable (components here; in a future PR, void tags; even further, foreign tags once the parser becomes smart about such things) gets one message; everyone else gets the other.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I think actually void tags wouldn't get this message because they don't go on the stack but yes we'd probably add foreign tags.

Just to clarify: This error is happening regardless of the slash but we try to make "Did you mean ..." help message for a potentially hard to debug case but the "unclosed tags remain" is always happening. They could have meant to use the slash as an attribute value but ... just forgot to close the component (or foreign tag). Ie. t'<{Comp1} path=/about/>' instead of t'<{Comp1} path="/about/" />'

@davepeck davepeck left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we're close, but my instinct is the actual value check still needs to live in the processor (alas). We'd have to pass the result of has_ambiguous_slash() through our component instances so the processor can surface a sensible error message. This lets us keep our simple .strings cache for parsing. (And later, when 141 lands, .strings + a namespace, I guess?)

I dunno, what do you think about that approach?

@ianjosephwilson

Copy link
Copy Markdown
Contributor Author

I think we're close, but my instinct is the actual value check still needs to live in the processor (alas). We'd have to pass the result of has_ambiguous_slash() through our component instances so the processor can surface a sensible error message. This lets us keep our simple .strings cache for parsing. (And later, when 141 lands, .strings + a namespace, I guess?)

I dunno, what do you think about that approach?

I think I confused things by trying to change that test but the original problem actually was trying to make a parser error more helpful for something that was going to fail anyways. I found a work-around that might work but:

  • I misunderstood the original error (although there is a "family" of these types of errors)
  • I think I misunderstood your concern about the cache.

So better wait on this and I'll try to put something up again tomorrow.

Comment thread tdom/parser.py
e.add_note(
f'Did you mean to quote the last attribute or put a space before "/>" for "<{{{starttag}}} .../>"?'
)
raise e

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@davepeck This block now performs the component start/end values check but where an exception is already going to happen (we closed the parser but tags are still open on the stack). So the parsing would have already failed. I think that should still keep our cache solely dependent on the shape because the values check does not affect what succeeds it is only used to provided better error reporting. Does that make sense?

Comment thread tdom/parser.py
placeholders: PlaceholderState
source: SourceTracker | None
tmap: dict[TComponent | TElement | TFragment, OpenTag]
" Map from completed tnodes back to their opentag for error reporting. "

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added this mapping from the completed tnodes back to the opentags BUT it doesn't include the component bodies that we discard because there are no tnodes made. This already seems like an issue when we try to report errors based on parse issues deeper inside a component's children. I'm not sure how to handle that yet. I still think not trying to preload the cache with the children is right but we might need to keep some bookkeeping about what was parsed. There are going to be some growing pains as this error code balloons up and we try to keep refactoring it back down to sensible levels. We might need an additional helper class to try to isolate some of that from the primary parsing code.

@ianjosephwilson ianjosephwilson marked this pull request as draft June 20, 2026 21:44
@ianjosephwilson ianjosephwilson marked this pull request as ready for review June 24, 2026 19:12
@ianjosephwilson

Copy link
Copy Markdown
Contributor Author

I put this on hold to try to prototype pushing the info we need for errors into the processor while juggling them during half-parsed error handling. I think using the "parser position" from .getpos() seems to work as a key and we can keep "open" tag source info during parsing like we do with "open" tags. Then when the tags are closed we can store it in a table, sinfo_table, to both lookup during parsing but also pass on to processing via probably a new TTree wrapper structure. This seemed like a better stopping place to try to get the original errors around the "/>" in.

This was referenced Jun 28, 2026
@davepeck

davepeck commented Jun 30, 2026

Copy link
Copy Markdown
Contributor

@ianjosephwilson I know #145 stacks on top but it also seems to include the key pieces of this PR? At this point, it seems to make sense to close this PR?

Alternatively, I'm happy to merge this PR as-is to main now if you'd prefer to stage it that way. It only targets the <{Component} ... > cases, but it's still an incremental improvement. That might make sense given that #145 is still a draft and key parts (position translation code, error chaining) are going to need time to bake.

@ianjosephwilson

Copy link
Copy Markdown
Contributor Author

Closing it sounds fine. I have been struggling with what is the best way to get feedback without going too far and pushing in a gigantic PR vs spamming PRs. It sounds like things are beginning to align and I am going to make even more improvements to #145 anyways.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants