Skip to content

AbstractToken.__add__ loses the link to an operand reached via attribute access #51

@pomponchik

Description

@pomponchik

When a + b is evaluated and one of the operands is reached via attribute access (e.g. extra + self.token), the resulting composite token does not propagate cancellation of that operand. Cancelling the original child later has no effect on bool(composed).

The cause lives in AbstractToken.__add__ (cantok/tokens/abstract/abstract_token.py): the is_temp() helper uses sys._getframe(1).f_locals/f_globals to decide whether to treat an operand as "temporary" (and therefore unwrap-and-discard it). Operands reached via self.attr are not present in the immediate caller's locals or globals, so they are classified as temporary, unwrapped into their (often empty) children, and the original reference is dropped.

Minimal reproduction

from cantok import DefaultToken, SimpleToken


class Holder:
    def __init__(self, token):
        self.token = token

    def combine(self, extra):
        return extra + self.token  # one operand via attribute access


holder = Holder(SimpleToken())
composed = holder.combine(DefaultToken())

assert bool(composed) is True           # ok
holder.token.cancel()
assert bool(holder.token) is False      # ok
assert bool(composed) is False          # FAILS on cantok 0.0.36

Expected: bool(composed) becomes False after any child is cancelled.
Actual: bool(composed) stays True.

The workaround we apply downstream is to bind the attribute to a local variable before composing, which makes is_temp() see it in the caller's locals:

def combine(self, extra):
    instance_token = self.token
    return extra + instance_token  # now correctly linked

Regression tests (drop-in)

Two tests — one from the real-world scenario that surfaced this for us, one minimal.

def test_add_preserves_link_to_attribute_operand():
    """An operand reached via attribute access must remain linked in the
    composite token; later cancellation of that child must propagate to
    bool(composed)."""
    from cantok import DefaultToken, SimpleToken

    class Holder:
        def __init__(self, token):
            self.token = token

        def combine(self, extra):
            return extra + self.token

    holder = Holder(SimpleToken())
    composed = holder.combine(DefaultToken())

    assert bool(composed) is True

    holder.token.cancel()

    assert bool(holder.token) is False
    assert bool(composed) is False


def test_add_inside_generator_preserves_link_to_self_token():
    """Reproduces a real-world failure from a library that wraps cantok.
    A generator method composes the call-time token with an instance-stored
    token (`token = call_time_token + self.token`), yields one item, and then
    the caller cancels the instance token. The next iteration must see the
    cancellation."""
    from cantok import DefaultToken, SimpleToken

    class Crawler:
        def __init__(self, token):
            self.token = token

        def go(self, token=DefaultToken()):
            token = token + self.token
            for index in range(5):
                yield index, bool(token)

    instance_token = SimpleToken()
    crawler = Crawler(instance_token)
    iterator = crawler.go()

    index, alive = next(iterator)
    assert (index, alive) == (0, True)

    instance_token.cancel()
    assert bool(instance_token) is False

    index, alive = next(iterator)
    assert alive is False, 'composite token must reflect cancellation of self.token'

Suggested direction (a guess, not a vetted fix)

A real fix probably needs deeper familiarity with what the is_temp() heuristic is trying to optimise. Two directions that come to mind:

  1. Widen the frame search. Instead of inspecting only sys._getframe(1), walk up the call stack and consider an operand "kept" if any live frame holds a reference to it. This catches attribute-access cases at the cost of more work per __add__.
  2. Default to "kept". Treat every operand as non-temporary by default; expose an explicit way (a method, a flag, or a sentinel) for callers who actually want the unwrap-and-discard optimisation. Loses the optimisation in cases where the current heuristic would have applied, but eliminates the silent-correctness failure shown above.

Both options are guesses — the right call depends on what is_temp() is supposed to optimise and which call patterns currently rely on the unwrap path. Worth picking by whoever owns that code.


Environment

  • cantok: 0.0.36
  • Python: 3.14.0
  • OS: macOS (Darwin 24.0.0)

Metadata

Metadata

Assignees

Labels

bugSomething isn't working

Type

No type
No fields configured for issues without a type.

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions