-
Notifications
You must be signed in to change notification settings - Fork 2
fix: consistent NA handling -- NA entries are always NULL internally #202
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: dev
Are you sure you want to change the base?
Conversation
NA entries in tf objects (tfd_reg, tfd_irreg, tfb) are now consistently represented as NULL list entries. Previously, different code paths produced inconsistent representations: concatenation and assignment created NULL entries, but arithmetic (e.g. x - NA_real_) left entries as NA-filled vectors, breaking print/format with [Inf,-Inf] ranges and causing interpolation errors in sparklines. Changes: - new_tfd: regular all-NA entries and irregular empty entries -> NULL - Accessors (tf_arg, tf_evaluations, as.tfd_irreg): handle NULL entries - fun_math/Math.tfb: skip NULL, convert all-NA results to NULL - Arithmetic ops: normalize all-NA results to NULL, propagate NULL - tfb ops (rebase): skip NA entries when rebasing to basis - print/format: safe range computation, handle all-NA objects - tfd.tf: skip NULL entries during re-evaluation Fixes #178 https://claude.ai/code/session_01HwGMNVWDFf88DXyH8fKSF7
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR aims to make NA handling consistent across the functional data types by representing missing functions as NULL list entries internally, eliminating mixed NA representations that previously broke printing/range computation and downstream interpolation (Issue #178).
Changes:
- Normalize constructors and accessors to treat “all-missing” entries as
NULL(both regular and irregular data). - Update arithmetic /
Mathpaths to propagateNULLand convert all-NA results toNULL. - Add targeted tests to enforce the internal
NULLrepresentation and prevent printing/formatting failures on NA-heavy objects.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 8 comments.
Show a summary per file
| File | Description |
|---|---|
tests/testthat/test-na-handling.R |
Adds regression tests asserting NA entries are stored as NULL across constructors and operations. |
R/tfd-class.R |
Normalizes NA/all-missing entries to NULL; updates tfd.tf() re-evaluation to preserve NA entries. |
R/print-format.R |
Adds safer range computation for printing/formatting and adjusts spark scaling behavior. |
R/ops.R |
Updates arithmetic operators to propagate NULL and normalize all-missing results. |
R/methods.R |
Makes tf_arg() / tf_evaluations() robust to NULL entries and updates is.na for irregular data. |
R/math.R |
Updates fun_math() and Math.tfb() to skip/propagate NULL and refit only non-NA entries. |
R/assertions.R |
Adjusts assert_arg() to tolerate NULL args for NA entries. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
- math.R: assign attributes before names so names aren't overwritten - ops.R: tfb_op_numeric, numeric_op_tfb, tfb_op_tfb now return tfb (not tfd) for all-NA and partial-NA results via tfb_na_result helper - tfd-class.R: subset arg for per-function arg lists when re-evaluating with NA entries to avoid arg/data length mismatch; use arg[non_null] when computing NA arg positions to avoid nas/arg length mismatch - tests: add is_tfb assertions for tfb arithmetic return types, test numeric_op_tfb and tfb_op_tfb NA paths https://claude.ai/code/session_01HwGMNVWDFf88DXyH8fKSF7
When tfd.tf() re-evaluates irregular data with NA entries, `nas` is computed only for evaluations[non_null], but line 397 was applying it to the full `arg` list. For per-function args (length(arg) > 1), this produces incompatible lengths in map2. Now prunes only the non-NULL arg entries, matching the pattern already used for na_args on line 378. https://claude.ai/code/session_01HwGMNVWDFf88DXyH8fKSF7
Return numeric(0) instead of NULL for NA entries in tf_arg.tfd_irreg. NULL breaks downstream call paths that pass tf_arg(x) into constructors (e.g. tf_fmean -> tf_interpolate -> tfd(x, arg=tf_arg(x))) because new_tfd() calls order() on every arg element and order(NULL) errors with "argument 1 is not a vector". numeric(0) is semantically correct (an NA function has no arg values) and works with order(), unlist(), etc. The P2 about nas alignment in tfd.tf:372 is stale — already fixed in the previous commit. https://claude.ai/code/session_01HwGMNVWDFf88DXyH8fKSF7
When tfd.tf() re-evaluates irregular data with NA entries on a shared arg grid, and different non-NA functions produce interpolation NAs at different positions, the shared arg gets pruned differently per function. The old code produced a list of length sum(non_null) via map2 recycling, but evaluations still had length(data) entries. This caused "Can't recycle .x (size N) to match .y (size M)" in new_tfd. Fix: when a shared arg is pruned into different per-function args, expand to a full-length list with numeric(0) for NULL entries. https://claude.ai/code/session_01HwGMNVWDFf88DXyH8fKSF7
The previous approach had accumulated conditional branches for every combination of shared/per-function arg, with/without NULLs, and same/different extrapolation NAs. Each patch fixed one case but missed another (most recently: shared arg + NULLs + different NAs crashed with a recycle failure). Redesign: normalize arg to per-function (length n) at the top of the block, work uniformly with [non_null] subsetting, then collapse back to shared arg at the end only if originally shared AND NAs were uniform. This eliminates all mid-block branching on length(arg). Replace ad-hoc tests with systematic test matrix covering all combinations of the three dimensions (shared/per-function arg, with/without NULLs, same/different/no extrapolation NAs). https://claude.ai/code/session_01HwGMNVWDFf88DXyH8fKSF7
In tfd_op_tfd, when both operands are NULL (NA), the fallback rep(NA_real_, max(length(NULL), length(NULL), lengths(arg_ret)[1])) produces numeric(0) because all lengths are 0 for NA entries (whose arg is numeric(0)). Downstream tfd() then interprets these as empty data and returns a 0-length object — silent data loss. Fix: return NULL when either operand is NULL, consistent with tfd_op_numeric and the invariant that NA = NULL internally. https://claude.ai/code/session_01HwGMNVWDFf88DXyH8fKSF7
NA entries in tf objects (tfd_reg, tfd_irreg, tfb) are now consistently represented as NULL list entries. Previously, different code paths produced inconsistent representations: concatenation and assignment created NULL entries, but arithmetic (e.g. x - NA_real_) left entries as NA-filled vectors, breaking print/format with [Inf,-Inf] ranges and causing interpolation errors in sparklines.
Changes:
Fixes #178
https://claude.ai/code/session_01HwGMNVWDFf88DXyH8fKSF7