Closed
Description
I regularly find myself searching XML documents for multiple unique entries among the children of a parent node. I have a handcrafted function that vectorises over the entry names in R, so it makes a separate call to the XML library for each name; see the simplified function below.
Would it be possible to vectorise at the level of the C++ code, rather than in R? I would expect that to perform much better than the serialised calls in my code below.
# Argument `names` is a character vector
find <- function(names) {
  # One XPath expression per requested name
  xp <- paste0("//entry[@id='", names, "']")
  # Vectorise this call to `xml_find_first()`, rather than the `lapply()` that I am using now
  nodes <- lapply(xp, function(nm) xml2::xml_find_first(my_xml_document, nm))
  # Process any nodes that were retrieved (misses come back as "xml_missing")
  found <- which(vapply(nodes, inherits, logical(1), what = "xml_node"))
  if (length(found)) {
    nodes <- nodes[found]
    cu <- vapply(nodes, function(n) xml2::xml_text(xml2::xml_find_first(n, "canonical_units")), character(1))
    desc <- vapply(nodes, function(n) xml2::xml_text(xml2::xml_find_first(n, "description")), character(1))
    data.frame(name = names[found], units = cu, description = desc)
  } else NULL
}
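Until something like this exists at the C++ level, one possible workaround in R is to join the per-name XPath expressions into a single union, so the document is queried once. The sketch below is only an illustration under assumptions: `find_batch()` and the sample document are hypothetical, and it relies on `xml_find_first()` and `xml_text()` accepting a node set. Unlike the function above, the rows come back in document order rather than in the order of `names`.

```r
library(xml2)

# Hypothetical stand-in for `my_xml_document`; the real structure may differ
doc <- read_xml('<table>
  <entry id="a"><canonical_units>K</canonical_units><description>Temp</description></entry>
  <entry id="b"><canonical_units>m</canonical_units></entry>
</table>')

# Hypothetical helper: one XPath union instead of one call per name
find_batch <- function(doc, names) {
  if (length(names) == 0L) return(NULL)
  # "//entry[@id='a'] | //entry[@id='b'] | ..."
  xp <- paste0("//entry[@id='", names, "']", collapse = " | ")
  nodes <- xml_find_all(doc, xp)
  if (length(nodes) == 0L) return(NULL)
  data.frame(
    # Recover the matched names from the nodes themselves
    name = xml_attr(nodes, "id"),
    # xml_find_first() on a node set returns one result per node;
    # xml_text() gives NA where the child element is absent
    units = xml_text(xml_find_first(nodes, "canonical_units")),
    description = xml_text(xml_find_first(nodes, "description"))
  )
}

res <- find_batch(doc, c("a", "b", "zzz"))
```

Names that are not present in the document simply produce no row. As with the original function, names containing a single quote would break the XPath string and would need escaping.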