Description
In #470, ld_merge_containers were added.
While merging the results of the harvesters (starting with the most important one and adding less important ones one by one), the context of the ld_merge_dict "A", which holds the progress of the merge, is updated with the new context "B".
This is done by a) updating the last dict in A.context with all values in "B"'s dictionary/dictionaries and b) putting all strings in "B", in reversed order of occurrence, before all other items in A.context.
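As a rough illustration of steps a) and b), the update can be modelled on plain Python lists and dicts. This is only a sketch of the behaviour as described here, not the hermes implementation: the function name `update_context_current` is hypothetical, the string context `"https://schema.org"` is just sample data, and details such as duplicate handling are left out.

```python
def update_context_current(a_context, b_context):
    """Model of the described update (hypothetical helper, not hermes code)."""
    # Copy A's context so the caller's list and dicts are left untouched.
    result = [dict(item) if isinstance(item, dict) else item for item in a_context]
    strings = []
    for item in b_context:
        if isinstance(item, dict):
            # a) merge B's dict into the last dict of A's context.
            # Assumption: if A's context does not end with a dict, one is appended.
            if not result or not isinstance(result[-1], dict):
                result.append({})
            result[-1].update(item)
        else:
            strings.append(item)
    # b) B's strings, in reversed order of occurrence, go before everything else.
    return list(reversed(strings)) + result


a_context = [{"codemeta": "https://doi.org/10.5063/schema/codemeta-2.0/"}]
b_context = ["https://schema.org", {"codemeta": "https://doi.org/10.5063/schema/codemeta-1.0/"}]
updated = update_context_current(a_context, b_context)
```

In this model the `codemeta` entry of "A" is replaced by the one from "B", because `dict.update` overwrites existing keys.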
a) leads to the problem that if A.context[-1] maps the same compacted key to a different IRI than "B", "A"'s mapping is overwritten, although "B" is the context of less important data:
from hermes.model.types import ld_dict
from hermes.model.merge.container import ld_merge_dict
obj = ld_merge_dict([{}], context=[{"codemeta": "https://doi.org/10.5063/schema/codemeta-2.0/"}])
obj["codemeta:softwareSuggestions"] = "https://github.com/softwarepub/hermes/issues"
new_obj = ld_dict([{}], context=[{"codemeta": "https://doi.org/10.5063/schema/codemeta-1.0/"}])
new_obj["codemeta:zippedCode"] = "https://github.com/softwarepub/hermes"
obj.update(new_obj) # resulting context is only [{"codemeta": "https://doi.org/10.5063/schema/codemeta-1.0/"}]
assert obj["https://doi.org/10.5063/schema/codemeta-1.0/zippedCode"] == ["https://github.com/softwarepub/hermes"] # True
assert obj["codemeta:zippedCode"] == ["https://github.com/softwarepub/hermes"] # True
assert obj["https://doi.org/10.5063/schema/codemeta-2.0/softwareSuggestions"] == ["https://github.com/softwarepub/hermes/issues"] # True
assert obj["codemeta:softwareSuggestions"] == ["https://github.com/softwarepub/hermes/issues"] # KeyError
b) just leads to weird priorities and shouldn't affect the merging.
My suggestion is to set "A"'s context to "B"'s context concatenated with "A"'s old context. This avoids deleting compaction options and makes the conversions in "A" more important than those in "B".
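A minimal sketch of this suggestion on plain lists, assuming the usual JSON-LD rule that later entries in a context array override earlier ones. The names `merge_context_proposed` and `resolve_prefix` are hypothetical helpers for illustration, not part of hermes, and `resolve_prefix` only handles simple prefix-to-IRI dicts.

```python
def merge_context_proposed(a_context, b_context):
    """Proposed update: B's context first, then A's old context (hypothetical helper)."""
    return list(b_context) + list(a_context)


def resolve_prefix(context, prefix):
    """Return the IRI a prefix maps to; later entries override earlier ones."""
    iri = None
    for entry in context:
        if isinstance(entry, dict) and prefix in entry:
            iri = entry[prefix]
    return iri


old_a = [{"codemeta": "https://doi.org/10.5063/schema/codemeta-2.0/"}]
new_b = [{"codemeta": "https://doi.org/10.5063/schema/codemeta-1.0/"}]
merged = merge_context_proposed(old_a, new_b)
winner = resolve_prefix(merged, "codemeta")
```

Here both mappings remain in `merged`, and `winner` is the codemeta-2.0 IRI from "A", since "A"'s entry comes last.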