From 8e6e518fbdb6afd1e48c6cc45629d9dea292270e Mon Sep 17 00:00:00 2001
From: Emerson Frasure
Date: Tue, 16 Dec 2025 14:26:20 -0500
Subject: [PATCH 1/4] feat: Set up markdownlint GitHub Action for automated linting

Pull from Collab Guide [PR 42](https://github.com/Imageomics/Collaborative-distributed-science-guide/pull/42)

* Adds lint.yaml with linting instructions for changed MD files
* Updates contributing.md with automated linting information
* Instructions to check details for automated check fail
* Update CONTRIBUTING.md to include lint URLs

Co-authored-by: Elizabeth Campolongo <38985481+egrace479@users.noreply.github.com>

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Elizabeth Campolongo <38985481+egrace479@users.noreply.github.com>
---
 .github/workflows/lint.yaml | 31 +++++++++++++++++++++++++++++++
 CONTRIBUTING.md             | 10 ++++++++--
 2 files changed, 39 insertions(+), 2 deletions(-)
 create mode 100644 .github/workflows/lint.yaml

diff --git a/.github/workflows/lint.yaml b/.github/workflows/lint.yaml
new file mode 100644
index 0000000..d439c69
--- /dev/null
+++ b/.github/workflows/lint.yaml
@@ -0,0 +1,31 @@
+name: Linting
+
+on:
+  push:
+    branches:
+      - main
+  pull_request:
+    branches:
+      - main
+
+jobs:
+  lint:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v6
+        with:
+          fetch-depth: 0
+
+      # This identifies which files changed in the specific commit or PR
+      - uses: tj-actions/changed-files@v47
+        id: changed-files
+        with:
+          files: '**/*.md'
+          separator: ","
+
+      # This runs the linter ONLY on the files identified above
+      - uses: DavidAnson/markdownlint-cli2-action@v22
+        if: steps.changed-files.outputs.any_changed == 'true'
+        with:
+          globs: ${{ steps.changed-files.outputs.all_changed_files }}
+          separator: ","
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index 76864a9..3686e1d 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -179,7 +179,12 @@ chore: update mkdocs dependencies
 
 ### Linting
 
-The project uses [markdownlint](https://github.com/DavidAnson/markdownlint) with configuration in `.markdownlint.json`. Key settings:
+The project uses [markdownlint](https://github.com/DavidAnson/markdownlint) with configuration in `.markdownlint.json`.
+
+**Automated Checks:**
+We have a GitHub Action that checks for formatting errors on Pull Requests. To follow best practices, **it only checks files that you have modified.** If the check fails, click the **Details** link next to the status check to view the error logs and see exactly what needs fixing.
+
+**Key Rules:**
 
 - 4-space indentation for lists (`MD007`).
 - No hard tab restrictions disabled.
@@ -188,7 +193,8 @@ The project uses [markdownlint](https://github.com/DavidAnson/markdownlint) with
 - Allowed code blocks without language specification (`MD040`).
 - Allow fenced code blocks, as this commonly errors when indented (see [discussion](https://github.com/DavidAnson/markdownlint/issues/327)).
 
-For faster PR review, you may want to run linting locally; we do have a PR Action in place as well. First install markdownlint, then run
+**Local Testing**
+For faster PR review, you may want to run linting locally. We recommend installing [`markdownlint-cli`](https://github.com/igorshubovych/markdownlint-cli) or the [VS Code extension](https://marketplace.visualstudio.com/items?itemName=DavidAnson.vscode-markdownlint).
 
 ```console
 markdownlint -c .markdownlint.json -f docs/wiki-guide/

From 74b65eeb4526a470165412bd487b40c7abc15200 Mon Sep 17 00:00:00 2001
From: Elizabeth Campolongo <38985481+egrace479@users.noreply.github.com>
Date: Fri, 23 Jan 2026 21:25:02 -0500
Subject: [PATCH 2/4] Add links to repos with good READMEs as examples for repo types

Pull from Collab Guide [PR 50](https://github.com/Imageomics/Collaborative-distributed-science-guide/pull/50)

* Add links to repos with good readmes as examples
* standardize file+extension formatting and fix typo
* Add pro-tip about the code checklist with link
* Add guiding principles for a good readme

Co-authored-by: Matt Thompson <31709066+thompsonmj@users.noreply.github.com>

* Fix formatting
* Reformulate as subsections for easier scanning

---------

Co-authored-by: Matt Thompson <31709066+thompsonmj@users.noreply.github.com>
---
 docs/wiki-guide/GitHub-Repo-Guide.md | 92 ++++++++++++++++++----------
 1 file changed, 59 insertions(+), 33 deletions(-)

diff --git a/docs/wiki-guide/GitHub-Repo-Guide.md b/docs/wiki-guide/GitHub-Repo-Guide.md
index 450e2ca..9788407 100644
--- a/docs/wiki-guide/GitHub-Repo-Guide.md
+++ b/docs/wiki-guide/GitHub-Repo-Guide.md
@@ -11,31 +11,57 @@ Just joining or starting a new project and need a repository to store your work?
 
 For each repository, include the following files in the root directory as soon as possible; they can (and should) be instantiated when you create a new repository.
 
-* [README.md](#readme)
-* [LICENSE.md](#license)
-* [.gitignore](#gitignore)
-* [software requirements](#software-requirements-file)
-* [CITATION.cff](#citation)
+- [README.md](#readme)
+- [LICENSE.md](#license)
+- [.gitignore](#gitignore)
+- [software requirements](#software-requirements-file)
+- [CITATION.cff](#citation)
 
 More [recommendations](#recommended-files) are discussed below.
 
+!!! tip "Pro tip"
+    All these files, plus more essential and recommended elements for a comprehensive GitHub repo, are included in our [Code Checklist](Code-Checklist.md). Following the checklist ensures compliance with the FAIR Principles for research software.[^1]
+    [^1]: Barker, M., Chue Hong, N. P., Katz, D. S., Lamprecht, A. L., Martinez-Ortiz, C., Psomopoulos, F., Harrow, J., Castro, L. J., Gruenpeter, M., Martinez, P. A., & Honeyman, T. (2022). Introducing the FAIR Principles for research software. _Scientific data_, 9(1), 622. [URL](https://doi.org/10.1038/s41597-022-01710-x).
+
 ### README
 
-The README.md file is what everyone will notice first when they open your repository on GitHub. When creating your repo be sure to include a brief description, as this will populate the `About` field in the top right of your repo, as well as start your README with some text.
+The `README.md` file is what everyone will notice first when they open your repository on GitHub. When creating your repo be sure to include a brief description, as this will populate the `About` field in the top right of your repo, as well as start your README with some text.
+
+Once you've created your repo, populate your README (you can do this by clicking on the file `README.md`, then clicking the pencil at the top left to edit). Editing your README in the browser allows you to preview the formatting of the file before committing changes. The content of your README may vary based on the purpose or goal of your repo, but there are key elements that should always be included.
+
+#### Guiding Principles
+
+While crafting your repo, keep the following guiding principles in mind:
+
+- It is iterative; it does not need to be perfect from the beginning. Be honest about the scope and maturity of the project.
+- It should be _useful_ for the intended audience and optimized for scanning.
+- Give the audience "quick wins" to being productive with minimal examples or typical workflows rather than comprehensively covering every edge case.
+
+#### Key Elements
+
+Following the above principles, be sure to include
+
+- Summary of the repo:
+    - This could be a simple explanation of what the package or tool developed in your repo is intended to do,
+    - Or an abstract describing your research.
+- Detailed documentation on how to access and use the project software (User Guide).
+    - Including installation of [dependencies](Virtual-Environments.md).
+    - If your tool requires input be in a particular format, this would be included in the README. It would also help to include an example file demonstrating the format.
+- Information about the sources you've used (links and what they were used for), such as:
+    - Tools from other repos.
+    - Data used for analysis.
+
+#### Examples
 
-Once you've created your repo, populate your README (you can do this by clicking on the file "README.md", then clicking the pencil at the top left to edit). Editing your README in the browser allows you to preview the formatting of the file before committing changes. The content of your README may vary based on the purpose or goal of your repo, but there are key elements that should always be included.
+Some Imageomics repositories with nicely formulated READMEs are...
 
-* Summary of the repo:
-    * This could be a simple explanation of what the package or tool developed in your repo is intended to do,
-    * Or an abstract describing your research.
-* Detailed documentation on how to access and use the project software (User Guide).
-    * Including installation of [dependencies](Virtual-Environments.md).
-    * If your tool requires input be in a particular format, this would be included in the README. It would also help to include an example file demonstrating the format.
-* Information about the sources you've used (links and what they were used for), such as:
-    * Tools from other repos
-    * Data for analysis
+- [BioCLIP 2](https://github.com/Imageomics/bioclip-2): a large project which includes data, model, the code, and a demo.
+    - It also builds on previous work; the repo models how to request citations (including references), and addresses the case of a multi-user/group license; this complexity is handled well through clarification of type and the inclusion of a `HISTORY.md` file.
+    - It also is re-used a lot within Imageomics as a base style.
+- [cautious-robot](https://github.com/Imageomics/cautious-robot) and [pybioclip](https://github.com/Imageomics/pybioclip/tree/1.1.0) (before the addition of a MkDocs site for documentation) are good examples of code or software-based projects.
+    - We want to emphasize that a project can start with a well-documented README and later grow to incorporate a documentation site as it becomes more complex (e.g., [pybioclip](https://github.com/Imageomics/pybioclip)).
 
-For more inspiration on making an awesome README, check out [this list](https://github.com/matiassingers/awesome-readme).
+For more inspiration on making an awesome README, check out [this crowd-sourced list of awesome READMEs](https://github.com/matiassingers/awesome-readme).
 
 ### LICENSE
 
@@ -50,7 +76,7 @@ For more information on how to choose a license and why it matters, see [Choose
 
 #### 2. Add LICENSE.md to the repository
 
-Once a license has been chosen, add a LICENSE.md file to the root of the repository. An easy way to do this is using a GitHub-provided [license template](https://docs.github.com/en/communities/setting-up-your-project-for-healthy-contributions/adding-a-license-to-a-repository). Do not forget to update necessary fields in the template.
+Once a license has been chosen, add a `LICENSE.md` file to the root of the repository. An easy way to do this is using a GitHub-provided [license template](https://docs.github.com/en/communities/setting-up-your-project-for-healthy-contributions/adding-a-license-to-a-repository). Do not forget to update necessary fields in the template.
 
 ### GITIGNORE
 
@@ -81,7 +107,7 @@ As with journal publications, we expect to be cited when someone uses our code.
 
 Providing this file is as simple as copying the below example and filling in your information before uploading it to your repo. More examples and information about the Citation File Format can be found on the [citation-file-format repo](https://github.com/citation-file-format/citation-file-format), including helpful [related tools](https://github.com/citation-file-format/citation-file-format#tools-to-work-with-citationcff-files-wrench).
 
-You can check your CITATION.cff file prior to upload using this [validator tool](https://www.yamllint.com/).
+You can check your `CITATION.cff` file prior to upload using this [validator tool](https://www.yamllint.com/).
 
 !!! note "Note"
     - When adding a DOI to your citation (`doi`), be sure to use the version-agnostic DOI from Zenodo. Since the DOI is not generated until _after_ the release, this ensures there will never be an "incorrect" DOI associated to the release—correct version reference is ensured through the `version` key, which should always be updated _**before**_ generating a new release.
@@ -142,7 +168,7 @@ Though the following files are not included in every repository and do not have
 
 ### CONTRIBUTING
 
-If you are looking to open your project to more public contributions, it is a good idea to include contributing guidelines. This could take the form of a "CONTRIBUTING.md" file or a subsection of your README.
+If you are looking to open your project to more public contributions, it is a good idea to include contributing guidelines. This could take the form of a `CONTRIBUTING.md` file or a subsection of your README.
 
 Contributing guidelines are important to maintain consistency across the way people work on a project. It is important to establish conventions about the important things while avoiding excessive constraints and bureaucracy that would make contributing a pain. Important things include efficient and effective communication.
 
@@ -150,7 +176,7 @@ Contributing guidelines are important to maintain consistency across the way peo
 
 When using the Zenodo-GitHub integration for [automatic DOI generation](DOI-Generation.md#automatic-generation), tracking metadata beyond the basics (authors, keywords, title, etc.) requires manual updates to the Zenodo record. The solution for this is to include a `.zenodo.json` file to keep track of this information (e.g., grant funding and references).
 
-A `.zenodo.json` can be created by applying [cffconvert](https://github.com/citation-file-format/cffconvert) to your `CITATION.cff` (without the references, as these are not supported). Then add the references and other metadata back in to the JSON (following the [Zenodo dev guide](https://developers.zenodo.org/#representation)). Alterntatively, The example below can simply be copied into a new file and updated with the appropriate information (comments should be removed prior to upload).
+A `.zenodo.json` can be created by applying [cffconvert](https://github.com/citation-file-format/cffconvert) to your `CITATION.cff` (without the references, as these are not supported). Then add the references and other metadata back in to the JSON (following the [Zenodo dev guide](https://developers.zenodo.org/#representation)). Alternatively, The example below can simply be copied into a new file and updated with the appropriate information (comments should be removed prior to upload).
 
 !!! note
     The `publication_date` and `version` will need to be updated along with the `CITATION.cff` for each release.
@@ -196,21 +222,21 @@ A `.zenodo.json` can be created by applying [cffconvert](https://github.com/cita
 
 For interoperability and to avoid ambiguity, [dates and times should be reported](https://dataoneorg.github.io/Education/bestpractices/describe-formats-for) in [ISO 8601 format](https://en.wikipedia.org/wiki/ISO_8601).
 
-* For dates, this means `YYYY-MM-DD` (for ISO 8601 compliance, the dashes are required).
-* For times, use `THHMMSS` in 24-hour format.
-* For example, the moment when there were 60 seconds left before New Year 2000 would be `1999-12-31T235900`.
+- For dates, this means `YYYY-MM-DD` (for ISO 8601 compliance, the dashes are required).
+- For times, use `THHMMSS` in 24-hour format.
+- For example, the moment when there were 60 seconds left before New Year 2000 would be `1999-12-31T235900`.
 
 #### Branches
 
-* Primary branch: `main`
-* Other branches follow the pattern `category/reference/description`:
-    * **category**: `feature`, `bugfix`, `experiment`
-        * `feature` is for new functionality
-        * `bugfix` is for fixing errors
-        * `experiment` is for more open-ended work
-    * the associated issue (if no issue, put `no-ref`), formatted as `issue-NN`
-    * description: brief description, e.g., `solve-world-hunger`
-* Example: `git branch feature/issue-1/general-ai`
+- Primary branch: `main`
+- Other branches follow the pattern `category/reference/description`:
+    - **category**: `feature`, `bugfix`, `experiment`
+        - `feature` is for new functionality
+        - `bugfix` is for fixing errors
+        - `experiment` is for more open-ended work
+    - the associated issue (if no issue, put `no-ref`), formatted as `issue-NN`
+    - description: brief description, e.g., `solve-world-hunger`
+- Example: `git branch feature/issue-1/general-ai`
 
 #### Commits
 

From b7130332f46b7bac498fb8b0470e901563bced1d Mon Sep 17 00:00:00 2001
From: Elizabeth Campolongo <38985481+egrace479@users.noreply.github.com>
Date: Mon, 26 Jan 2026 14:05:24 -0500
Subject: [PATCH 3/4] Add statement regarding Google
 Drive

Pull from Collab Guide [PR 49](https://github.com/Imageomics/Collaborative-distributed-science-guide/pull/49)

* Add statement regarding Google Drive

clarify unreliable storage location for research products

* Include provenance tracking considerations

Co-authored-by: Hilmar Lapp

---------

Co-authored-by: Hilmar Lapp
---
 docs/wiki-guide/Digital-Product-Lifecycle.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/docs/wiki-guide/Digital-Product-Lifecycle.md b/docs/wiki-guide/Digital-Product-Lifecycle.md
index 7153ac6..5d877a3 100644
--- a/docs/wiki-guide/Digital-Product-Lifecycle.md
+++ b/docs/wiki-guide/Digital-Product-Lifecycle.md
@@ -19,6 +19,7 @@ The following adds additional context and direction to supplement the diagram, o
 * **Datasets:** Hugging Face Dataset Repository ([Data checklist](Data-Checklist.md)).
     * For already published data usage, see the [Metadata Checklist](Metadata-Checklist.md).
 * **ML Models:** Hugging Face Model Repository ([Model checklist](Model-Checklist.md)).
+* Though alternative storage options may be discussed, **Google Drive is not an acceptable storage location for research data, models, or code**. Folder activity does not include actual file additions or deletions, so content can be changed or removed without a record of when or by whom. All research, data, models, and code must be stored in **a version controlled repository, preferably in more than one location** to ensure preservation and full provenance tracking.
 
 ### Exploration Phase
 

From 69d25396f186fda19f927d9d6de3ef4f9c809312 Mon Sep 17 00:00:00 2001
From: Emerson Frasure
Date: Mon, 26 Jan 2026 15:58:58 -0500
Subject: [PATCH 4/4] bug: Markdownlint Action Ignore Templates

Pull from Collab Guide [PR 53](https://github.com/Imageomics/Collaborative-distributed-science-guide/pull/53)

* Include ignore in config
* test: template that should be ignored
* Specifically ignore files in lint.yaml
* test: Remove test file
* test: ensure config file is working with test md file
* test: fix extra errors in the test md file
* test: fix un-intended errors in test file
* test: cleanup
* Remove redundant yaml ignore from linter

Co-authored-by: Elizabeth Campolongo <38985481+egrace479@users.noreply.github.com>

---------

Co-authored-by: Elizabeth Campolongo <38985481+egrace479@users.noreply.github.com>
---
 .github/workflows/lint.yaml | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/.github/workflows/lint.yaml b/.github/workflows/lint.yaml
index d439c69..40cec2e 100644
--- a/.github/workflows/lint.yaml
+++ b/.github/workflows/lint.yaml
@@ -21,6 +21,8 @@ jobs:
         id: changed-files
         with:
           files: '**/*.md'
+          files_ignore: |
+            docs/wiki-guide/HF_*_Template*.md
           separator: ","
 
       # This runs the linter ONLY on the files identified above
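
Reviewer note: the effect of the `files` and `files_ignore` inputs in this last patch can be approximated locally with a short sketch. This is an illustration only, not the action's implementation: it assumes Python's `fnmatch` is a close enough stand-in for the glob matching used by `tj-actions/changed-files`, the helper name `files_to_lint` is hypothetical, and the template file name in the sample list is made up to match the ignore pattern.

```python
from fnmatch import fnmatchcase

# Pattern copied from the files_ignore entry added in PATCH 4/4.
IGNORE_PATTERNS = ["docs/wiki-guide/HF_*_Template*.md"]


def files_to_lint(changed_files, ignore_patterns=IGNORE_PATTERNS):
    """Return changed Markdown files that match no ignore pattern.

    Hypothetical helper: it only approximates the workflow's filtering.
    """
    selected = []
    for path in changed_files:
        if not path.endswith(".md"):
            continue  # stands in for the files: '**/*.md' input
        if any(fnmatchcase(path, pattern) for pattern in ignore_patterns):
            continue  # stands in for the files_ignore input
        selected.append(path)
    return selected


# Hypothetical changed-file list; the HF template name is invented.
changed = [
    "CONTRIBUTING.md",
    "docs/wiki-guide/GitHub-Repo-Guide.md",
    "docs/wiki-guide/HF_Dataset_Card_Template.md",
    ".github/workflows/lint.yaml",
]

# Joined with "," to mirror the separator passed on to the lint step.
print(",".join(files_to_lint(changed)))
```

With the sample list, only the two non-template Markdown files remain, which is the comma-separated value the lint step would receive as `globs` under these assumptions.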