diff --git a/docs/developer-guide/authorization/authorization_datasets.md b/docs/developer-guide/authorization/authorization_datasets.md new file mode 100644 index 000000000..c128a05d4 --- /dev/null +++ b/docs/developer-guide/authorization/authorization_datasets.md @@ -0,0 +1,276 @@ +# Datasets Authorization + +This is the list of the permissions available for datasets and all their endpoints + +## Endpoint authorization + +- DatasetCreate +- DatasetRead +- DatasetUpdate +- DatasetDelete + +## Instance authorization + +- DatasetCreateOwnerNoPid +- DatasetCreateOwnerWithPid +- DatasetCreateAny +- DatasetReadManyPublic +- DatasetReadManyAccess +- DatasetReadAny +- DatasetUpdateOwner +- DatasetUpdateAny +- DatasetDeleteOwner +- DatasetDeleteAny + +## Implementation + +How the different levels of authorization translate into data conditions applied by the backend. + +- Public + - isPublished = true +- Access (conditions are applied in logical _or_) + - isPublished = true + - the user belongs to the group listed in the _ownerGroup_ field + - the user belongs to one of the groups listed in the _accessGroups_ field + - _sharedWith contains the user's email_ (obsolete, it will be removed) +- Owner + - the user belongs to the group listed in the _ownerGroup_ field +- Any + - User can perform the action to any dataset + +## Operation to endpoints map + +- Create + - POST Datasets + - POST Datasets/isValid +- Read + - GET Datasets + - GET Datasets/fullquery + - GET Datasets/fullfacet + - GET Datasets/metadataKeys + - GET Datasets/count + - GET Datasets/findOne + - GET Datasets/_pid_ + - GET Datasets/_pid_/datasetlifecycle + - GET Datasets/_pid_/logbook +- Update + - PATCH Datasets/_pid_ + - PUT Datasets/_pid_ + - POST Datasets/_pid_/appendToArrayField + - PATCH Datasets/_pid_/datasetlifecycle +- Delete + - DELETE Datasets/_pid_ + +## Authorization standard users + +| Operation | Endpoint Authorization | Anonymous | Authenticated User | Notes | +| --------- | ---------------------- | --------- | ------------------ | ----- | +| Create | _DatasetCreate_ | __no__ | __no__ | | +| Read | _DatasetRead_ | Public
_DatasetReadPublic_ | Has Access
_DatasetReadAccess_ | | +| Update | _DatasetUpdate_ | __no__ | __no__ | | +| | | | | | +| DELETE | _DatasetDelete_ | __no__ | __no__ | | + +## Special permissions groups + +- __Dataset Create Basic__ (DsCB) + These groups are allowed to create datasets for any of the groups they belong to, although they are not allowed to assign the pid to the new dataset. + Default: _#nogroup_ + Special values: + - _#all_ : all groups are allowed to create datasets with pid assigned by the system. +- __Dataset Create Extended__ (DsCE) + These groups are allowed to create datasets for any of the groups they belong to, and they can assign the pid to the new dataset. + Default: _#nogroup_ + Special values: + - _#all_ : all groups are allowed to create datasets with explicit pid. +- __Dataset Create Privileged__ (DsCP) + These groups are allowed to create datasets for any group, and they can also assign the pid to the new dataset. + Default: _#nogroup_ + Special values: + - _#all_ : all groups can create datasets with for any group with explicit pid. +- __Dataset Read Privileged__ (DsRP) + These groups are allowed to read all datasets independently from the ownership. + Default: _#nogroup_ +- __Dataset Update Basic__ (DsUB) + These groups are allowed to update only datasets they own. + Default: _#nogroup_ + Special values: + - _#DsCB_ : all groups listed in _Dataset Create Basic_ are allowed to update the datasets they own. + - _#DsCE_ : all groups listed in _Dataset Create Extended_ are allowed to update the datasets they own. +- __Dataset Update Privileged__ (DsUP) + These groups are allowed to update any datasets independently from the ownership. + Default: _#nogroup_ + Special values: + - _#DsCP_ : all groups listed in _Dataset Create Privileged_ are allowed to update any datasets. +- __Dataset Delete Basic__ (DsDB) + These groups are allowed to delete only the datasets they own. + Default: _#nogroup_ + Special values: + - _#DsCB_ : all groups listed in _Dataset Create Basic_ are allowed to delete the datasets they own. + - _#DsCE_ : all groups listed in _Dataset Create Extended_ are allowed to delete the datasets they own. +- __Dataset Delete Privileged__ (DsDP) + These groups are allowed to delete any dataset independently of the ownership. + Default: _#nogroup_ + Special values: + - _#DsCP_ : all groups listed in _Dataset Create Privileged_ are allowed to delete any datasets. + +## Authorization special permissions groups + +If a user belongs to one of the groups which is listed in any special permission, the permissions listed in this table override the standard permissions. +When the cell is empty in the following table, the permissions listed in the standard users table are applied. +A user can belong to multiple groups listed in multiple special permissions. The union of all the permissions is applied. + +| Operation | Endpoint Authorization | Dataset Read Privileged | Dataset Create Basic | Dataset Create Extended | Dataset Create Privileged | Dataset Update Basic | Dataset Update Privileged | Admin | Dataset Delete Basic | Dataset Delete Privileged | Delete | Notes | +| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | +| Create | _DatasetCreate_ | | Owner, w/o PID
_DatasetCreateOwnerNoPid_ | Owner, w/ PID
_DatasetCreateOwnerWithPid_ | Any
_DatasetCreateAny_ | | | Any
_DatasetCreateAny_ | | | | | + | Read | _DatasetRead_ | Any
_DatasetReadAny_ | | | | | | Any
_DatasetReadAny_ | | | | | +| Update | _DatasetUpdate_ | | | | | Owner
_DatasetUpdateOwner_ | Any
_DatasetUpdateAny_ | Any
_DatasetUpdateAny_ | | | | | +| | | | | | | | | | | | | | +| Delete | _DatasetDelete_ | | | | | | | | Own
_DatasetDeleteOwner_ | Any
_DatasetDeleteAny_ | Any
_DatasetDeleteAny_ | | + +## Priorities + +This section lists the connected special permissions groups in order of importance. +A user will acquire the permissions from the special permissions groups up to the rightmost group in the list they belong to. + +- Read + - Anonymous -> Authenticated -> Dataset Read Privileged -> Admin +- Create + - Anonymous -> Authenticated -> Dataset Create Basic -> Dataset Create Extended -> Dataset Create Privileged -> Admin +- Update + - Anonymous -> Authenticated -> Dataset Update Basic -> Dataset Update Privileged -> Admin +- Delete + - Anonymous -> Authenticated -> Dataset Delete Basic -> Dataset Delete Privileged -> Delete + +## Environmental Variables + +The following list present the environmental variables that should be configured to setup the special groups listed in the previous sections. +Each variable is a comma separated list of the users' groups that acquired the special permissions linked to the special group. + +- __DATASET_READ_PRIVILEGED_GROUPS__: groups with __Dataset Read Privileged__ permissions +- __DATASET_CREATE_BASIC_GROUPS__: groups with __Dataset Create Basic__ permissions +- __DATASET_CREATE_EXTENDED_GROUPS__: groups with __Dataset Create Extended__ permissions +- __DATASET_CREATE_PRIVILEGED_GROUPS__: groups with __Dataset Create Privileged__ permissions +- __DATASET_UPDATE_BASIC_GROUPS__: groups with __Dataset Update Basic__ permissions +- __DATASET_UPDATE_PRIVILEGED_GROUPS__: groups with __Dataset Update Privileged__ permissions +- __DATASET_DELETE_BASIC_GROUPS__: groups with __Dataset Delete Basic__ permissions +- __DATASET_DELETE_PRIVILEGED_GROUPS__: groups with __Dataset Delete Privileged__ permissions +- __ADMIN_GROUPS__: groups with __Admin__ permissions. This variable affects all the sub-systems. +- __DELETE_GROUPS__: groups with __Delete__ permissions. This variable affects all the sub-systems. + +## Legacy + +The legacy datasets special permissions environment variables are marked obsolete and will be removed in the future. +In the meantime, they are mapped to the matching new variable. +Here is the map: + +- Create Dataset Groups ( CREATE_DATASET_GROUPS ) -> Dataset Create Basic +- Create Dataset with PID Group ( CREATE_DATASET_WITH_PID_GROUPS ) -> Dataset Create Extended +- Create Dataset Privileged ( CREATE_DATASET_PRIVILEGED_GROUPS ) -> Dataset Create Privileged + +## Use cases and configuration examples + +This section includes few of the many use cases that the community has come across, found them informative. +Each use case provides the settings for each special permissions groups in isolation. +In a production setup, each special permissions groups will contain a list of multiple group that is the union of each individual case. + +### Data ingestion with creation only + +#### Description + +We need a functional account that allows the ingestion process to create datasets for any group so it can ingest datasets independently from who owns them. + +#### Configuration + +##### Accounts + +- username: ingestor +- group: ingestor + +##### Special permissions groups + +- DATASET_READ_PRIVILEGED_GROUPS = "" +- DATASET_CREATE_BASIC_GROUPS = "" +- DATASET_CREATE_EXTENDED_GROUPS = "" +- DATASET_CREATE_PRIVILEGED_GROUPS = "ingestor" +- DATASET_UPDATE_BASIC_GROUPS = "" +- DATASET_UPDATE_PRIVILEGED_GROUPS = "" +- DATASET_DELETE_BASIC_GROUPS = "" +- DATASET_DELETE_PRIVILEGED_GROUPS = "" +- ADMIN_GROUPS = "" +- DELETE_GROUPS = "" + +### Data ingestion with creation and update + +#### Description + +We need a functional account that allows the ingestion process to create and update datasets for any group so it can ingest datasets independently from who owns them and also perform additional updates at a later time. + +#### Configuration + +##### Accounts + +- username: ingestor +- group: ingestor + +##### Special permissions groups + +- DATASET_READ_PRIVILEGED_GROUPS = "" +- DATASET_CREATE_BASIC_GROUPS = "" +- DATASET_CREATE_EXTENDED_GROUPS = "" +- DATASET_CREATE_PRIVILEGED_GROUPS = "ingestor" +- DATASET_UPDATE_BASIC_GROUPS = "" +- DATASET_UPDATE_PRIVILEGED_GROUPS = "ingestor" +- DATASET_DELETE_BASIC_GROUPS = "" +- DATASET_DELETE_PRIVILEGED_GROUPS = "" +- ADMIN_GROUPS = "" +- DELETE_GROUPS = "" + +### Post Ingestion tasks workflow + +#### Description + +We need to set up a workflow to run post ingestions task. The process needs to be able to list any dataset that has a specific value in their keywords field independently from the group who owns the dataset. Once the list is retrieved, the process will perform the set tasks (like determining end of embargo period, performing some aggregation or statistic on the data) and save the results back in the dataset as additional scientific metadata. + +##### Accounts + +- username: post_ingestion_tasks +- group: post_ingestion_tasks + +##### Special permissions groups + +- DATASET_READ_PRIVILEGED_GROUPS = "post_ingestion_tasks" +- DATASET_CREATE_BASIC_GROUPS = "" +- DATASET_CREATE_EXTENDED_GROUPS = "" +- DATASET_CREATE_PRIVILEGED_GROUPS = "" +- DATASET_UPDATE_BASIC_GROUPS = "" +- DATASET_UPDATE_PRIVILEGED_GROUPS = "post_ingestion_tasks" +- DATASET_DELETE_BASIC_GROUPS = "" +- DATASET_DELETE_PRIVILEGED_GROUPS = "" +- ADMIN_GROUPS = "" +- DELETE_GROUPS = "" + +### Automatic workflow to delete obsolete datasets + +#### Description + +We need to set up a workflow to delete datasets that are more than 10 years old and are marked with the keyword _obsolete_. +The process needs to list all datasets that contains the value _obsolete_ in the keywords field and have creation time older than 10 years from today. Once the list has been retrieved, it has to iterate through and execute a delete command on each dataset. + +##### Accounts + +- username: delete_obsolete_datasets +- group: delete_obsolete_datasets + +##### Special permissions groups + +- DATASET_READ_PRIVILEGED_GROUPS = "delete_obsolete_datasets" +- DATASET_CREATE_BASIC_GROUPS = "" +- DATASET_CREATE_EXTENDED_GROUPS = "" +- DATASET_CREATE_PRIVILEGED_GROUPS = "" +- DATASET_UPDATE_BASIC_GROUPS = "" +- DATASET_UPDATE_PRIVILEGED_GROUPS = "" +- DATASET_DELETE_BASIC_GROUPS = "" +- DATASET_DELETE_PRIVILEGED_GROUPS = "delete_obsolete_datasets" +- ADMIN_GROUPS = "" +- DELETE_GROUPS = ""