Open
Description
Currently, if the response contains an NA, a clear error message is thrown:
data <- data.frame(x = rnorm(50), y = c(rnorm(49), NA))
m <- xrf(y ~x, data, family = 'gaussian', xgb_control = list(nrounds=1, max_depth=2))
Error in xrf_preconditions(family, xgb_control, glm_control, data, response_var, :
Response variable contains missing values which is not allowed
However, if any predictor contains an NA, the *model.matrix implementation will silently drop the row, which results in confusing errors:
data <- data.frame(y = rnorm(50), x = c(rnorm(49), NA))
m <- xrf(y ~x, data, family = 'gaussian', xgb_control = list(nrounds=1, max_depth=2))
Error in setinfo.xgb.DMatrix(dmat, names(p), p[[1]]) :
The length of labels must equal to the number of rows in the input data
Several fixes may make sense:
- Fail fast & clearly with a preconditions check
- Offer several (configurable) remediation methods, like dropping offending rows or mean/mode imputation.
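
As a rough illustration of both options above, here is a minimal base R sketch. The function names (`check_predictors`, `impute_column`, `remediate_missing`) and the `method` argument are hypothetical and not part of xrf; the sketch assumes a formula with a response on the left-hand side and predictors that are plain numeric, factor, or character columns.

```r
# Hypothetical helper (not part of xrf): fail fast when a predictor has NAs.
# model.frame() with na.pass keeps rows that contain missing values, so
# they can be counted instead of being silently dropped.
check_predictors <- function(formula, data) {
  mf <- stats::model.frame(formula, data, na.action = stats::na.pass)
  predictors <- mf[, -1, drop = FALSE]  # first column is the response
  na_counts <- colSums(is.na(predictors))
  bad <- na_counts[na_counts > 0]
  if (length(bad) > 0) {
    stop("Predictor(s) contain missing values which is not allowed: ",
         paste(names(bad), collapse = ", "), call. = FALSE)
  }
  invisible(TRUE)
}

# Hypothetical helper: mean imputation for numeric columns, mode (most
# frequent observed value) for factor or character columns.
impute_column <- function(col) {
  if (!any(is.na(col))) return(col)
  if (is.numeric(col)) {
    col[is.na(col)] <- mean(col, na.rm = TRUE)
  } else {
    counts <- table(col)  # table() ignores NA by default
    col[is.na(col)] <- names(counts)[which.max(counts)]
  }
  col
}

# Hypothetical configurable remediation: fail, drop rows, or impute.
# Assumes the response has already been checked, since "impute" would
# otherwise fill in a missing response as well.
remediate_missing <- function(data, method = c("fail", "drop", "impute")) {
  method <- match.arg(method)
  incomplete <- !stats::complete.cases(data)
  if (!any(incomplete)) return(data)
  switch(method,
    fail = stop("Data contains missing values which is not allowed",
                call. = FALSE),
    drop = data[!incomplete, , drop = FALSE],
    impute = {
      data[] <- lapply(data, impute_column)
      data
    }
  )
}

# Usage with the data from this issue.
data <- data.frame(y = rnorm(50), x = c(rnorm(49), NA))
# Base model.matrix() drops the incomplete row, which is the source of
# the label/row count mismatch reported above.
n_design_rows <- nrow(stats::model.matrix(y ~ x, data))
dropped <- remediate_missing(data, "drop")
imputed <- remediate_missing(data, "impute")
```

Calling `check_predictors(y ~ x, data)` on this data stops with a message naming `x`, which mirrors the existing response check. Whether dropping or imputing is the better default is a design decision for the package; the sketch only shows that both can sit behind one argument.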