From 77aac347ad38fc30b15479604fdb33b6ca70d6fe Mon Sep 17 00:00:00 2001
From: Sven Twardziok
Date: Tue, 24 Jun 2025 13:43:56 +0200
Subject: [PATCH 01/13] add small summary

---
 docs/guides/index.md | 7 +++++++
 1 file changed, 7 insertions(+)
 create mode 100644 docs/guides/index.md

diff --git a/docs/guides/index.md b/docs/guides/index.md
new file mode 100644
index 0000000..c9ceb34
--- /dev/null
+++ b/docs/guides/index.md
@@ -0,0 +1,7 @@
+# ELIXIR-on-Cloud documentation
+
+The ELIXIR-on-Cloud project is an initiative from the ELIXIR Compute Platform.
+Our goal is to support scientists across Europe in using cloud environments for their research activities.
+We support the use of ELIXIR services as well as open-source software, and the project has close connections with various academic cloud providers.
+One of our key focuses is developing and providing software that implements the specifications defined by the Global Alliance for Genomics and Health (GA4GH) for federated processing of workloads ([GA4GH Cloud Work Stream](https://www.ga4gh.org/work_stream/cloud/)).
+This documentation here offers guidance and best practices on how to use the services, further develop our services, and deploy services within the ELIXIR-on-Cloud Framework.

From 98551ae22b16bca3a9c0958a516551333780c52f Mon Sep 17 00:00:00 2001
From: Sven Twardziok
Date: Tue, 24 Jun 2025 13:44:18 +0200
Subject: [PATCH 02/13] add sensitive data guideline

---
 docs/guides/guide-info/sensitive_data.md | 60 ++++++++++++++++++++++++
 1 file changed, 60 insertions(+)
 create mode 100644 docs/guides/guide-info/sensitive_data.md

diff --git a/docs/guides/guide-info/sensitive_data.md b/docs/guides/guide-info/sensitive_data.md
new file mode 100644
index 0000000..7bd6d7a
--- /dev/null
+++ b/docs/guides/guide-info/sensitive_data.md
@@ -0,0 +1,60 @@
+# Processing Sensitive Data
+
+Processing sensitive human data is fundamental to biomedical research, enabling breakthroughs in disease understanding, biomarker detection and treatment development.
+Rapid and secure access to such data accelerates research, but also introduces significant responsibilities for data protection and privacy.
+Cloud-based services are increasingly used in biomedical research to connect researchers, data, and tools throughout the data lifecycle.
+This page summarizes scenarios and requirements for handling sensitive data within the ELIXIR-on-Cloud framework.
+
+## Legal Frameworks
+
+Sensitive data processing in research is governed by several legal frameworks, most notably the General Data Protection Regulation (GDPR) and the European Health Data Space (EHDS):
+
+* **GDPR**: Allows the use of sensitive personal data for research when specific safeguards are in place. Processing is permitted in the public interest, provided measures such as data minimization, pseudonymization, and strict access controls are implemented. Explicit informed consent is often required, and a Data Protection Impact Assessment (DPIA) is strongly recommended.
+* **EHDS**: Builds on GDPR by establishing a unified framework for secure sharing and secondary use of electronic health data across the EU. Under EHDS, sensitive health data (e.g., genetic or clinical records) can be reused for research, innovation, and policy-making if anonymized or pseudonymized and accessed through secure processing environments (SPEs).
+
+## Environments
+
+* A Trusted Execution Environment (TEE) is a secure and isolated area within a computer system or processor that ensures the confidentiality and integrity of code and data during execution. It aims to protect sensitive computations and data from potential threats, such as malware or unauthorized access.
+* A Secure Processing Environment (SPE) is a controlled environment designed to facilitate secure data processing and analysis while maintaining confidentiality, integrity, and privacy. It focuses on secure processing techniques, often including encryption, secure computation, or secure enclaves, to protect data during computation.
+* A Trusted Research Environment (TRE) is a secure and controlled environment specifically tailored for research purposes, providing secure data access, analysis, collaboration, and compliance with legal and ethical requirements. TREs emphasize privacy preservation, data governance, collaboration, and knowledge generation while protecting sensitive data.
+
+### Similarities
+
+* **Isolation**: Operates separately from the main platform it runs on.
+* **Security**: Provides a secure environment for computations and data storage, including cryptographic key management and protection against malware.
+* **Integrity**: Ensures the integrity of data and code within the environment.
+* **Confidentiality**: Maintain confidentiality of sensitive information if the environment is compromised.
+* **Controlled Access and Authentication**: Authenticates code and data before execution to ensure only trusted and verified code runs.
+* **Collaboration and Analysis**: There is an offer of tools and infrastructure that enable researchers to perform analysis and collaborate within a secure environment. This allows for sharing and combining datasets while maintaining data privacy.
+
+### Differences
+
+* **Focus**:
+    * TEE: Secures startup, code and data during execution.
+    * SPE: Ensures secure data processing and computation.
+    * TRE: Provides a comprehensive and secure environment for research activities, including data access and compliance.
+* **Data Handling**:
+    * TEE: Focuses on securing the execution of code and data.
+    * SPE: Involves secure data processing and temporary storage for processing purposes.
+    * TRE: Covers secure data storage, access controls, and privacy-preserving methods.
+* **Application Context**:
+    * TEE: Used in secure mobile device environments and secure cloud computing.
+    * SPE: Applied in secure data analytics and cryptographic computations.
+    * TRE: Tailored for research involving sensitive datasets like healthcare research.
+
+## Use Cases
+
+Researchers may require access to sensitive data in different scenarios.
+The four use cases are derived from the two dimensions of data storage and data processing.
+Research data can be stored in a single location or in multiple locations and institutions.
+We also distinguish between whether the data should be processed in the cloud or in the knowledge worker's own environment.
+
+|                    | Local processing    | Cloud processing     |
+| ------------------ | ------------------- | -------------------- |
+| **Central data**   | Data repository     | Cloud platform       |
+| **Federated data** | Federated database  | Federated processing |
+
+* **Data repository**: Data is stored in a single database. Researchers request access, are authorized, and transfer encrypted data to their secure environment for analysis.
+* **Federated database**: Data is distributed across multiple nodes. Metadata is accessible via APIs and a central portal. Researchers request access to datasets at individual nodes, which then provide data for transfer or processing.
+* **Cloud platform**: Centralized sensitive data is hosted on a platform. Authorized users log in and analyze data directly within an SPE, using workflows or interactive tools.
+* **Federated processing**: Sensitive data remains on separate nodes with restricted transfer. Analysis is performed via APIs in an SPE, often combining results from multiple sources. A special case of this is federated learning, where models are trained in several iterations and updated on different data sets.

From 0635ffc7dc4eca4f912d2d3bef1bb19a85c4a64b Mon Sep 17 00:00:00 2001
From: Sven Twardziok
Date: Tue, 24 Jun 2025 13:44:29 +0200
Subject: [PATCH 03/13] adapt nav

---
 mkdocs.yml | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/mkdocs.yml b/mkdocs.yml
index 7a3f8cf..fed1551 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -82,6 +82,9 @@ markdown_extensions:
 nav:
   - Home: "index.md"
   - Guides:
+    - "guides/index.md"
+    - Information:
+      - "Sensitive Data": "guides/guide-info/sensitive_data.md"
     - Users:
       - "guides/guide-user/index.md"
       - "User stories": "guides/guide-user/user_stories.md"

From e92ee54246862dd00da06c12e3bc840b813ebdf5 Mon Sep 17 00:00:00 2001
From: Sven Twardziok
Date: Tue, 24 Jun 2025 13:50:46 +0200
Subject: [PATCH 04/13] Update docs/guides/guide-info/sensitive_data.md

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
---
 docs/guides/guide-info/sensitive_data.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/guides/guide-info/sensitive_data.md b/docs/guides/guide-info/sensitive_data.md
index 7bd6d7a..d251ef2 100644
--- a/docs/guides/guide-info/sensitive_data.md
+++ b/docs/guides/guide-info/sensitive_data.md
@@ -23,7 +23,7 @@ Sensitive data processing in research is governed by several legal frameworks, m
 * **Isolation**: Operates separately from the main platform it runs on.
 * **Security**: Provides a secure environment for computations and data storage, including cryptographic key management and protection against malware.
 * **Integrity**: Ensures the integrity of data and code within the environment.
-* **Confidentiality**: Maintain confidentiality of sensitive information if the environment is compromised.
+* **Confidentiality**: Maintains confidentiality of sensitive information if the environment is compromised.
 * **Controlled Access and Authentication**: Authenticates code and data before execution to ensure only trusted and verified code runs.
 * **Collaboration and Analysis**: There is an offer of tools and infrastructure that enable researchers to perform analysis and collaborate within a secure environment. This allows for sharing and combining datasets while maintaining data privacy.
 

From 2255d6c786ab4b1dadb39c305db7d91fc9f9859b Mon Sep 17 00:00:00 2001
From: Sven Twardziok
Date: Mon, 12 Jan 2026 13:14:23 +0100
Subject: [PATCH 05/13] change section to general

---
 mkdocs.yml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mkdocs.yml b/mkdocs.yml
index fed1551..16518d3 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -83,7 +83,7 @@ nav:
   - Home: "index.md"
   - Guides:
     - "guides/index.md"
-    - Information:
+    - General:
       - "Sensitive Data": "guides/guide-info/sensitive_data.md"
     - Users:
       - "guides/guide-user/index.md"

From 0b470903cfdfd344a76c74443e08511b69d7433f Mon Sep 17 00:00:00 2001
From: Sven Twardziok
Date: Mon, 12 Jan 2026 13:14:43 +0100
Subject: [PATCH 06/13] use abbreviations and referneces

---
 docs/guides/guide-info/sensitive_data.md | 30 +++++++-----------------
 docs/guides/index.md                     |  2 +-
 includes/abbreviations.md                |  6 +++++
 includes/references.md                   |  3 +++
 4 files changed, 19 insertions(+), 22 deletions(-)

diff --git a/docs/guides/guide-info/sensitive_data.md b/docs/guides/guide-info/sensitive_data.md
index d251ef2..f62a0c4 100644
--- a/docs/guides/guide-info/sensitive_data.md
+++ b/docs/guides/guide-info/sensitive_data.md
@@ -1,7 +1,7 @@
 # Processing Sensitive Data
 
-Processing sensitive human data is fundamental to biomedical research, enabling breakthroughs in disease understanding, biomarker detection and treatment development.
-Rapid and secure access to such data accelerates research, but also introduces significant responsibilities for data protection and privacy.
+Processing sensitive human data is fundamental to biomedical research, enabling breakthroughs in disease understanding, biomarker discovery, and treatment development.
+Rapid and secure access to such data accelerates research but also introduces significant responsibilities for data protection and privacy.
 Cloud-based services are increasingly used in biomedical research to connect researchers, data, and tools throughout the data lifecycle.
 This page summarizes scenarios and requirements for handling sensitive data within the ELIXIR-on-Cloud framework.
 
@@ -10,13 +10,16 @@ This page summarizes scenarios and requirements for handling sensitive data with
 Sensitive data processing in research is governed by several legal frameworks, most notably the General Data Protection Regulation (GDPR) and the European Health Data Space (EHDS):
 
 * **GDPR**: Allows the use of sensitive personal data for research when specific safeguards are in place. Processing is permitted in the public interest, provided measures such as data minimization, pseudonymization, and strict access controls are implemented. Explicit informed consent is often required, and a Data Protection Impact Assessment (DPIA) is strongly recommended.
-* **EHDS**: Builds on GDPR by establishing a unified framework for secure sharing and secondary use of electronic health data across the EU. Under EHDS, sensitive health data (e.g., genetic or clinical records) can be reused for research, innovation, and policy-making if anonymized or pseudonymized and accessed through secure processing environments (SPEs).
+* **EHDS**: Builds on GDPR by establishing a unified framework for secure sharing and secondary use of electronic health data across the EU. The EHDS is defined as "the first common EU data space dedicated to a specific sector, establishing a common framework for use and exchange of electronic health data across the EU" ([Regulation (EU) 2025/327](eur-lex-ehds)). The EHDS aims to improve individuals' access to their electronic health data and enable secondary use for research, innovation, policymaking, health threats preparedness, patient safety, and regulatory activities.
 
 ## Environments
 
 * A Trusted Execution Environment (TEE) is a secure and isolated area within a computer system or processor that ensures the confidentiality and integrity of code and data during execution. It aims to protect sensitive computations and data from potential threats, such as malware or unauthorized access.
-* A Secure Processing Environment (SPE) is a controlled environment designed to facilitate secure data processing and analysis while maintaining confidentiality, integrity, and privacy. It focuses on secure processing techniques, often including encryption, secure computation, or secure enclaves, to protect data during computation.
-* A Trusted Research Environment (TRE) is a secure and controlled environment specifically tailored for research purposes, providing secure data access, analysis, collaboration, and compliance with legal and ethical requirements. TREs emphasize privacy preservation, data governance, collaboration, and knowledge generation while protecting sensitive data.
+* A Secure Processing Environment (SPE) is a controlled environment designed to facilitate secure data processing and analysis while maintaining confidentiality, integrity, and privacy. It focuses on secure processing techniques, often including encryption, secure computation, or secure enclaves, to protect data during computation. Under the EHDS regulation, sensitive health data (e.g., genetic or clinical records) can be reused for research, innovation, and policy-making if anonymized or pseudonymized and accessed through SPE ([Regulation (EU) 2025/327](eur-lex-ehds)).
+* A Trusted Research Environment (TRE) is a secure and controlled environment specifically tailored for research purposes, providing secure data access, analysis, collaboration, and compliance with legal and ethical requirements. TREs emphasize data governance, collaboration, and knowledge generation while ensuring privacy protection. For TREs, the **Five Safes framework** is particularly relevant as a comprehensive approach to data protection while enabling research access. This framework has been adopted by Health Data Research UK (HDR-UK), NIHR, and other major UK research institutions as the gold standard for balancing data protection with research utility ([What is the Five Safes framework?](ukdataservice-5-safes)).
+
+!!! note "SPEs vs TREs"
+    Secure Processing Environments (SPEs) and Trusted Research Environments (TREs) are conceptually very similar and serve comparable purposes in providing secure environments for sensitive data processing. The key difference lies in their regulatory and geographical context: SPEs are specifically required within the framework of the EHDS, while TREs are primarily a UK-developed concept and implementation approach.
 
 ### Similarities
 
@@ -27,21 +30,6 @@ Sensitive data processing in research is governed by several legal frameworks, m
 * **Controlled Access and Authentication**: Authenticates code and data before execution to ensure only trusted and verified code runs.
 * **Collaboration and Analysis**: There is an offer of tools and infrastructure that enable researchers to perform analysis and collaborate within a secure environment. This allows for sharing and combining datasets while maintaining data privacy.
 
-### Differences
-
-* **Focus**:
-    * TEE: Secures startup, code and data during execution.
-    * SPE: Ensures secure data processing and computation.
-    * TRE: Provides a comprehensive and secure environment for research activities, including data access and compliance.
-* **Data Handling**:
-    * TEE: Focuses on securing the execution of code and data.
-    * SPE: Involves secure data processing and temporary storage for processing purposes.
-    * TRE: Covers secure data storage, access controls, and privacy-preserving methods.
-* **Application Context**:
-    * TEE: Used in secure mobile device environments and secure cloud computing.
-    * SPE: Applied in secure data analytics and cryptographic computations.
-    * TRE: Tailored for research involving sensitive datasets like healthcare research.
-
 ## Use Cases
 
 Researchers may require access to sensitive data in different scenarios.
@@ -57,4 +45,4 @@ We also distinguish between whether the data should be processed in the cloud or
 * **Data repository**: Data is stored in a single database. Researchers request access, are authorized, and transfer encrypted data to their secure environment for analysis.
 * **Federated database**: Data is distributed across multiple nodes. Metadata is accessible via APIs and a central portal. Researchers request access to datasets at individual nodes, which then provide data for transfer or processing.
 * **Cloud platform**: Centralized sensitive data is hosted on a platform. Authorized users log in and analyze data directly within an SPE, using workflows or interactive tools.
-* **Federated processing**: Sensitive data remains on separate nodes with restricted transfer. Analysis is performed via APIs in an SPE, often combining results from multiple sources. A special case of this is federated learning, where models are trained in several iterations and updated on different data sets.
+* **Federated processing**: Sensitive data remains on separate nodes with restricted transfer. Analysis is performed via APIs in an SPE, often combining results from multiple sources. A special case of this is federated learning, where models are trained through several iterations and updated with different datasets.
diff --git a/docs/guides/index.md b/docs/guides/index.md
index c9ceb34..2e810b4 100644
--- a/docs/guides/index.md
+++ b/docs/guides/index.md
@@ -3,5 +3,5 @@
 The ELIXIR-on-Cloud project is an initiative from the ELIXIR Compute Platform.
 Our goal is to support scientists across Europe in using cloud environments for their research activities.
 We support the use of ELIXIR services as well as open-source software, and the project has close connections with various academic cloud providers.
-One of our key focuses is developing and providing software that implements the specifications defined by the Global Alliance for Genomics and Health (GA4GH) for federated processing of workloads ([GA4GH Cloud Work Stream](https://www.ga4gh.org/work_stream/cloud/)).
+One of our key focuses is developing and providing software that implements the specifications defined by the Global Alliance for Genomics and Health (GA4GH) for federated processing of workloads ([GA4GH Cloud Work Stream](ga4gh-cloud-ws)).
 This documentation here offers guidance and best practices on how to use the services, further develop our services, and deploy services within the ELIXIR-on-Cloud Framework.
diff --git a/includes/abbreviations.md b/includes/abbreviations.md
index b3e35e5..cba8f5e 100644
--- a/includes/abbreviations.md
+++ b/includes/abbreviations.md
@@ -1,12 +1,18 @@
 *[DBCLS]: Database Center for Life Science
+*[DPIA]: Data Protection Impact Assessment
 *[DRS]: GA4GH Data Repository Service API
+*[EHDS]: European Health Data Space
 *[ELIXIR]: ELIXIR unites Europe’s leading life science organisations in managing and safeguarding the increasing volume of data being generated by publicly funded research. It coordinates, integrates and sustains bioinformatics resources across its member states and enables users in academia and industry to access services that are vital for their research.
 *[FOSS]: Free & Open Source Software
 *[GA4GH]: The Global Alliance for Genomics and Health is a policy-framing and technical standards-setting organization, seeking to enable responsible genomic data sharing within a human rights framework.
+*[GDPR]: General Data Protection Regulation
 *[GSoC]: Google Summer of Code
 *[IaC]: Infrastructure as Code
 *[LIMS]: Laboratory Information Management System
 *[NBDC]: National Bioscience Database Center
+*[SPE]: Secure Processing Environment
+*[TEE]: Trusted Execution Environment
 *[TES]: GA4GH Task Execution Service API
+*[TRE]: Trusted Research Environment
 *[TRS]: GA4GH Tool Registry Service API
 *[WES]: GA4GH Workflow Execution Service API
diff --git a/includes/references.md b/includes/references.md
index 8046f17..0389bf0 100644
--- a/includes/references.md
+++ b/includes/references.md
@@ -31,11 +31,13 @@
 [elixir-cloud-demo-smk]:
 [elixir-cloud-registry]:
 [elixir-cloud-services]:
+[eur-lex-ehds]:
 [fair]:
 [funnel]:
 [funnel-config-slurm]:
 [funnel-config-slurm-service]:
 [ga4gh]:
+[ga4gh-cloud-ws]:
 [ga4gh-cloud]:
 [ga4gh-dps]:
 [ga4gh-drs]:
@@ -140,3 +142,4 @@
 [tesk-helm-values]:
 [vsftpd]:
 [vsftpd-deploy]:
+[ukdataservice-5-safes]:

From d9da8841277cb51b9cedc788ce0bcd1e55c7f480 Mon Sep 17 00:00:00 2001
From: Sven Twardziok
Date: Fri, 23 Jan 2026 08:59:21 +0100
Subject: [PATCH 07/13] Update docs/guides/guide-info/sensitive_data.md

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
---
 docs/guides/guide-info/sensitive_data.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/guides/guide-info/sensitive_data.md b/docs/guides/guide-info/sensitive_data.md
index f62a0c4..77a9e42 100644
--- a/docs/guides/guide-info/sensitive_data.md
+++ b/docs/guides/guide-info/sensitive_data.md
@@ -9,7 +9,7 @@ This page summarizes scenarios and requirements for handling sensitive data with
 
 Sensitive data processing in research is governed by several legal frameworks, most notably the General Data Protection Regulation (GDPR) and the European Health Data Space (EHDS):
 
-* **GDPR**: Allows the use of sensitive personal data for research when specific safeguards are in place. Processing is permitted in the public interest, provided measures such as data minimization, pseudonymization, and strict access controls are implemented. Explicit informed consent is often required, and a Data Protection Impact Assessment (DPIA) is strongly recommended.
+* **GDPR**: Allows the use of sensitive personal data for research when specific safeguards are in place. Under GDPR, processing may rely on different legal bases, such as tasks carried out in the public interest or explicit informed consent from data subjects, depending on the research context and applicable national law. In all cases, measures such as data minimization, pseudonymization, and strict access controls should be implemented, and a Data Protection Impact Assessment (DPIA) is strongly recommended.
 * **EHDS**: Builds on GDPR by establishing a unified framework for secure sharing and secondary use of electronic health data across the EU. The EHDS is defined as "the first common EU data space dedicated to a specific sector, establishing a common framework for use and exchange of electronic health data across the EU" ([Regulation (EU) 2025/327](eur-lex-ehds)). The EHDS aims to improve individuals' access to their electronic health data and enable secondary use for research, innovation, policymaking, health threats preparedness, patient safety, and regulatory activities.
 
 ## Environments

From 1714a7819fcb229d44f0348ce3c8041aa72df3e8 Mon Sep 17 00:00:00 2001
From: Sven Twardziok
Date: Fri, 23 Jan 2026 08:59:55 +0100
Subject: [PATCH 08/13] Update docs/guides/guide-info/sensitive_data.md

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
---
 docs/guides/guide-info/sensitive_data.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/guides/guide-info/sensitive_data.md b/docs/guides/guide-info/sensitive_data.md
index 77a9e42..99d31d7 100644
--- a/docs/guides/guide-info/sensitive_data.md
+++ b/docs/guides/guide-info/sensitive_data.md
@@ -21,7 +21,7 @@ Sensitive data processing in research is governed by several legal frameworks, m
 !!! note "SPEs vs TREs"
     Secure Processing Environments (SPEs) and Trusted Research Environments (TREs) are conceptually very similar and serve comparable purposes in providing secure environments for sensitive data processing. The key difference lies in their regulatory and geographical context: SPEs are specifically required within the framework of the EHDS, while TREs are primarily a UK-developed concept and implementation approach.
 
-### Similarities
+### Similarities Between TEE, SPE, and TRE
 
 * **Isolation**: Operates separately from the main platform it runs on.
 * **Security**: Provides a secure environment for computations and data storage, including cryptographic key management and protection against malware.

From b7627c0f845d78b3c95528f52d9ba0afddbf640d Mon Sep 17 00:00:00 2001
From: Sven Twardziok
Date: Fri, 23 Jan 2026 09:00:24 +0100
Subject: [PATCH 09/13] Update docs/guides/guide-info/sensitive_data.md

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
---
 docs/guides/guide-info/sensitive_data.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/guides/guide-info/sensitive_data.md b/docs/guides/guide-info/sensitive_data.md
index 99d31d7..7430c22 100644
--- a/docs/guides/guide-info/sensitive_data.md
+++ b/docs/guides/guide-info/sensitive_data.md
@@ -35,7 +35,7 @@ Sensitive data processing in research is governed by several legal frameworks, m
 Researchers may require access to sensitive data in different scenarios.
 The four use cases are derived from the two dimensions of data storage and data processing.
 Research data can be stored in a single location or in multiple locations and institutions.
-We also distinguish between whether the data should be processed in the cloud or in the knowledge worker's own environment.
+We also distinguish between whether the data should be processed in the cloud or in the researcher's own environment.
 
 |                    | Local processing    | Cloud processing     |
 | ------------------ | ------------------- | -------------------- |

From 51f5df6441dcd55d32fcd2a72b6875563fd1bd68 Mon Sep 17 00:00:00 2001
From: Sven Twardziok
Date: Fri, 23 Jan 2026 09:01:01 +0100
Subject: [PATCH 10/13] Update docs/guides/guide-info/sensitive_data.md

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
---
 docs/guides/guide-info/sensitive_data.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/guides/guide-info/sensitive_data.md b/docs/guides/guide-info/sensitive_data.md
index 7430c22..02357bc 100644
--- a/docs/guides/guide-info/sensitive_data.md
+++ b/docs/guides/guide-info/sensitive_data.md
@@ -28,7 +28,7 @@ Sensitive data processing in research is governed by several legal frameworks, m
 * **Integrity**: Ensures the integrity of data and code within the environment.
 * **Confidentiality**: Maintains confidentiality of sensitive information if the environment is compromised.
 * **Controlled Access and Authentication**: Authenticates code and data before execution to ensure only trusted and verified code runs.
-* **Collaboration and Analysis**: There is an offer of tools and infrastructure that enable researchers to perform analysis and collaborate within a secure environment. This allows for sharing and combining datasets while maintaining data privacy.
+* **Collaboration and Analysis**: Provides tools and infrastructure that enable researchers to perform analysis and collaborate within a secure environment. This allows for sharing and combining datasets while maintaining data privacy.
 
 ## Use Cases
 

From da9de30fb0ba987c615b826ebc1affdeb0517369 Mon Sep 17 00:00:00 2001
From: Sven Twardziok
Date: Fri, 23 Jan 2026 09:01:48 +0100
Subject: [PATCH 11/13] Update docs/guides/guide-info/sensitive_data.md

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
---
 docs/guides/guide-info/sensitive_data.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/guides/guide-info/sensitive_data.md b/docs/guides/guide-info/sensitive_data.md
index 02357bc..81aa8e0 100644
--- a/docs/guides/guide-info/sensitive_data.md
+++ b/docs/guides/guide-info/sensitive_data.md
@@ -26,7 +26,7 @@ Sensitive data processing in research is governed by several legal frameworks, m
 * **Isolation**: Operates separately from the main platform it runs on.
 * **Security**: Provides a secure environment for computations and data storage, including cryptographic key management and protection against malware.
 * **Integrity**: Ensures the integrity of data and code within the environment.
-* **Confidentiality**: Maintains confidentiality of sensitive information if the environment is compromised.
+* **Confidentiality**: Aims to maintain confidentiality of sensitive information and protect against compromise.
 * **Controlled Access and Authentication**: Authenticates code and data before execution to ensure only trusted and verified code runs.
 * **Collaboration and Analysis**: Provides tools and infrastructure that enable researchers to perform analysis and collaborate within a secure environment. This allows for sharing and combining datasets while maintaining data privacy.
 

From 501d42d68ff5e47d35dd58439fe1e630fcf39cf3 Mon Sep 17 00:00:00 2001
From: Copilot <198982749+Copilot@users.noreply.github.com>
Date: Fri, 23 Jan 2026 09:45:14 +0100
Subject: [PATCH 12/13] fix: correct markdown reference syntax and grammar (#34)

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: svedziok <17719296+svedziok@users.noreply.github.com>
---
 docs/guides/guide-info/sensitive_data.md | 6 +++---
 docs/guides/index.md                     | 4 ++--
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/docs/guides/guide-info/sensitive_data.md b/docs/guides/guide-info/sensitive_data.md
index 81aa8e0..b6270b5 100644
--- a/docs/guides/guide-info/sensitive_data.md
+++ b/docs/guides/guide-info/sensitive_data.md
@@ -10,13 +10,13 @@ This page summarizes scenarios and requirements for handling sensitive data with
 Sensitive data processing in research is governed by several legal frameworks, most notably the General Data Protection Regulation (GDPR) and the European Health Data Space (EHDS):
 
 * **GDPR**: Allows the use of sensitive personal data for research when specific safeguards are in place. Under GDPR, processing may rely on different legal bases, such as tasks carried out in the public interest or explicit informed consent from data subjects, depending on the research context and applicable national law. In all cases, measures such as data minimization, pseudonymization, and strict access controls should be implemented, and a Data Protection Impact Assessment (DPIA) is strongly recommended.
-* **EHDS**: Builds on GDPR by establishing a unified framework for secure sharing and secondary use of electronic health data across the EU. The EHDS is defined as "the first common EU data space dedicated to a specific sector, establishing a common framework for use and exchange of electronic health data across the EU" ([Regulation (EU) 2025/327](eur-lex-ehds)). The EHDS aims to improve individuals' access to their electronic health data and enable secondary use for research, innovation, policymaking, health threats preparedness, patient safety, and regulatory activities.
+* **EHDS**: Builds on GDPR by establishing a unified framework for secure sharing and secondary use of electronic health data across the EU. The EHDS is defined as "the first common EU data space dedicated to a specific sector, establishing a common framework for use and exchange of electronic health data across the EU" ([Regulation (EU) 2025/327][eur-lex-ehds]). The EHDS aims to improve individuals' access to their electronic health data and enable secondary use for research, innovation, policymaking, health threats preparedness, patient safety, and regulatory activities.
 
 ## Environments
 
 * A Trusted Execution Environment (TEE) is a secure and isolated area within a computer system or processor that ensures the confidentiality and integrity of code and data during execution. It aims to protect sensitive computations and data from potential threats, such as malware or unauthorized access.
-* A Secure Processing Environment (SPE) is a controlled environment designed to facilitate secure data processing and analysis while maintaining confidentiality, integrity, and privacy. It focuses on secure processing techniques, often including encryption, secure computation, or secure enclaves, to protect data during computation. Under the EHDS regulation, sensitive health data (e.g., genetic or clinical records) can be reused for research, innovation, and policy-making if anonymized or pseudonymized and accessed through SPE ([Regulation (EU) 2025/327](eur-lex-ehds)).
-* A Trusted Research Environment (TRE) is a secure and controlled environment specifically tailored for research purposes, providing secure data access, analysis, collaboration, and compliance with legal and ethical requirements. TREs emphasize data governance, collaboration, and knowledge generation while ensuring privacy protection. For TREs, the **Five Safes framework** is particularly relevant as a comprehensive approach to data protection while enabling research access. This framework has been adopted by Health Data Research UK (HDR-UK), NIHR, and other major UK research institutions as the gold standard for balancing data protection with research utility ([What is the Five Safes framework?](ukdataservice-5-safes)).
+* A Secure Processing Environment (SPE) is a controlled environment designed to facilitate secure data processing and analysis while maintaining confidentiality, integrity, and privacy. It focuses on secure processing techniques, often including encryption, secure computation, or secure enclaves, to protect data during computation. Under the EHDS regulation, sensitive health data (e.g., genetic or clinical records) can be reused for research, innovation, and policymaking if anonymized or pseudonymized and accessed through an SPE ([Regulation (EU) 2025/327][eur-lex-ehds]).
+* A Trusted Research Environment (TRE) is a secure and controlled environment specifically tailored for research purposes, providing secure data access, analysis, collaboration, and compliance with legal and ethical requirements. TREs emphasize data governance, collaboration, and knowledge generation while ensuring privacy protection. For TREs, the **Five Safes framework** is particularly relevant as a comprehensive approach to data protection while enabling research access. This framework has been adopted by Health Data Research UK (HDR-UK), NIHR, and other major UK research institutions as the gold standard for balancing data protection with research utility ([What is the Five Safes framework?][ukdataservice-5-safes]).
 
 !!! note "SPEs vs TREs"
     Secure Processing Environments (SPEs) and Trusted Research Environments (TREs) are conceptually very similar and serve comparable purposes in providing secure environments for sensitive data processing. The key difference lies in their regulatory and geographical context: SPEs are specifically required within the framework of the EHDS, while TREs are primarily a UK-developed concept and implementation approach.
diff --git a/docs/guides/index.md b/docs/guides/index.md
index 2e810b4..b557fcd 100644
--- a/docs/guides/index.md
+++ b/docs/guides/index.md
@@ -3,5 +3,5 @@
 The ELIXIR-on-Cloud project is an initiative from the ELIXIR Compute Platform.
 Our goal is to support scientists across Europe in using cloud environments for their research activities.
 We support the use of ELIXIR services as well as open-source software, and the project has close connections with various academic cloud providers.
-One of our key focuses is developing and providing software that implements the specifications defined by the Global Alliance for Genomics and Health (GA4GH) for federated processing of workloads ([GA4GH Cloud Work Stream](ga4gh-cloud-ws)).
-This documentation here offers guidance and best practices on how to use the services, further develop our services, and deploy services within the ELIXIR-on-Cloud Framework.
+One of our key focuses is developing and providing software that implements the specifications defined by the Global Alliance for Genomics and Health (GA4GH) for federated processing of workloads ([GA4GH Cloud Work Stream][ga4gh-cloud-ws]).
+This documentation offers guidance and best practices on how to use the services, further develop our services, and deploy services within the ELIXIR-on-Cloud Framework.
From 5f8129fb8ee015eb130399eaa6ac19e3dab9c7c1 Mon Sep 17 00:00:00 2001
From: Copilot <198982749+Copilot@users.noreply.github.com>
Date: Fri, 23 Jan 2026 09:45:54 +0100
Subject: [PATCH 13/13] docs(guides): apply review feedback to sensitive data guide (#33)

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>