public Identifier createDataset(String dataSetJson, String dataverseAlias) {...} returns a DB identifier but we need a doi to uploadFile #14

@AleixMT

Hello again.

I am trying to bulk-upload a project into a Dataverse instance. To do so, I need to create a dataset for the project and then upload all the files into it. The problem is that the method that creates a dataset returns an Identifier containing an integer. This integer is supposed to identify the newly created dataset, but it cannot be used to upload a file, because the upload methods accept only a DOI to identify the dataset, not the Identifier returned by createDataset.

So, I would like to do something like this:

List<Document> documents = new ArrayList<>(...);
Identifier identifier = api.getDataverseOperations().createDataset(JSONMetadata.toString(), "theDatasetName");

for (Document document: documents)
{
    try {
        api.getDatasetOperations().uploadFile(identifier.toString(), document.getInputStream(), document.getName() );  // This line fails because the identifier does not identify any dataset and it expects a DOI
    } catch (IOException e) {
        e.printStackTrace();
    }
}

Where Document is just a class that wraps file data.

But I can't, because public Identifier createDataset(String dataSetJson, String dataverseAlias) {...} does not return a DOI.

So my question is: is there any way to retrieve the DOI of the dataset I just created so that I can upload files to it immediately afterwards, even if it involves extra operations? Alternatively, is there any way to use the returned Identifier object to identify a dataset and upload files to it?

If that is not possible, I will try to open another pull request. This time I will need a little help, though, since I do not know what the last line of public Identifier createDataset(String dataSetJson, String dataverseAlias) {...} is doing:
return resp.getBody().getData();
I deduce that it parses the response and obtains the id from there.

I am proposing this change because I think it is entirely feasible and would improve the library. When you create a dataset through the native API (using curl, for example), the server returns JSON that contains both the identifier you currently return and the DOI of the new dataset. It is a matter of parsing both the DOI and the identifier and returning them from the method, or of implementing an equivalent method that parses and returns only the DOI.
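To make the proposal concrete, here is a rough sketch of what I mean by parsing both values. It is not the library's code: the class name CreatedDataset is hypothetical, and I am assuming the response body looks like {"status":"OK","data":{"id":42,"persistentId":"doi:..."}}, with the field names id and persistentId. A real implementation would probably use the JSON mapper the library already has rather than regular expressions.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 * Hypothetical helper (not part of the library): holds both identifiers
 * found in the JSON returned when a dataset is created.
 */
public final class CreatedDataset {
    // Assumed field names in the response: "id" (integer) and "persistentId" (string).
    private static final Pattern ID =
            Pattern.compile("\"id\"\\s*:\\s*(\\d+)");
    private static final Pattern PERSISTENT_ID =
            Pattern.compile("\"persistentId\"\\s*:\\s*\"([^\"]+)\"");

    public final long id;
    public final String persistentId;

    private CreatedDataset(long id, String persistentId) {
        this.id = id;
        this.persistentId = persistentId;
    }

    /** Extracts the database id and the DOI from the raw response body. */
    public static CreatedDataset parse(String json) {
        Matcher id = ID.matcher(json);
        Matcher pid = PERSISTENT_ID.matcher(json);
        if (!id.find() || !pid.find()) {
            throw new IllegalArgumentException("Response has no id/persistentId: " + json);
        }
        return new CreatedDataset(Long.parseLong(id.group(1)), pid.group(1));
    }
}
```

With something like this, the upload loop above could pass the parsed persistentId to uploadFile instead of identifier.toString().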

Please let me know your opinion on this when you can.
