Kernel docs followups #287

Draft
sayakpaul wants to merge 22 commits into main from kernel-docs-followups

Conversation

@sayakpaul
Member

@sayakpaul sayakpaul commented Feb 17, 2026

  • Move to a fully Jinja-based template and remove the regex-based (re) parsing.
  • Decouple commands into two: kernels init-card and kernels fill-card.
  • Make kernel card generation a part of kernels upload (and test).
  • Add an entry for generating kernel system cards to the "writing kernels" tutorial.

@sayakpaul sayakpaul requested a review from danieldk February 17, 2026 05:32
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@sayakpaul
Member Author

@danieldk I have broken the create-and-upload-card command into two simpler commands as we discussed internally. I would like to get a review on the progress made so far before making further changes (docs, etc.).

Would you like to open a PR against this PR's branch for nix build . so that the card generation process gets included in its execution? That way, once kernels upload is issued, the card also gets uploaded to the build repo.

@sayakpaul sayakpaul marked this pull request as draft February 26, 2026 12:24
@sayakpaul
Member Author

@danieldk @drbh a couple of updates here:

  • Broadly, it's now decoupled into two commands: kernels init-card and kernels fill-card.
  • Existing user-written sections are preserved, while build-based info is always updated when relevant.

Question:

Do we have to assume that a "build" directory exists within a kernel source directory? Asking because it seems like we can initialize and serialize the kernel card as "CARD.md" in the top-level kernel source dir and then fill it in based on the information available from build.toml.

@sayakpaul sayakpaul requested a review from drbh February 27, 2026 07:23
Comment on lines 58 to 62
    kernel_card = ModelCard.from_template(
        card_data=ModelCardData(license=license, library_name=LIBRARY_NAME),
        template_path=str(KERNEL_CARD_TEMPLATE_PATH),
        model_description=kernel_description,
    )
Collaborator

I may not have enough context, but it's not fully clear to me whether we need the ModelCard concept from huggingface_hub.

The way it's used seems to essentially wrap the Jinja templating functionality. Currently it's only used to copy card_template.md and inject two values; the rest of the generated data is injected via regex. It may be simpler to use it to inject all of the values, or to avoid ModelCard altogether and use jinja2 directly (especially since this code path already requires the jinja2 dependency).

Member Author

The way it's used seems to essentially wrap the Jinja templating functionality. Currently it's only used to copy card_template.md and inject two values; the rest of the generated data is injected via regex.

I don't think the rest of the generated data is injected through regex. Please take a closer look :)
https://github.com/huggingface/kernels/blob/d6f40afc0b05dbf12a6142f57600d8c4e45dc36e/kernels/src/kernels/cli/__init__.py#L399C5-L405C6

None of this stuff is regex-injected here. ModelCard is already an established module from the huggingface_hub package, and it helps by providing programmatic interfaces for interacting with the Hub card (such as saving, loading from the Hub, accessing metadata, actual content, etc.).

Then we have:

def _build_kernel_card_vars(

Only _extract_functions_from_all uses regex. Other than that, I think everything else is parsed from the directory structure and the build.toml file.

In this light, could you take another look and LMK your thoughts?
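
For the one remaining regex step, a possible alternative is to read the exported names with Python's ast module. This is only a sketch: the behaviour of _extract_functions_from_all is not shown in this thread, so the assumption here is that it collects the entries of a module-level __all__. The function name names_from_all and the sample source are hypothetical.

```python
import ast

# Hypothetical module source; real kernel packages may look different.
SOURCE = '''
from ._ops import ops

def silu_and_mul(out, x): ...

__all__ = ["silu_and_mul", "gelu_fast"]
'''


def names_from_all(source: str) -> list[str]:
    """Return the string entries of a module-level `__all__` list or tuple."""
    tree = ast.parse(source)
    for node in tree.body:
        if not isinstance(node, ast.Assign):
            continue
        targets = [t.id for t in node.targets if isinstance(t, ast.Name)]
        if "__all__" in targets:
            # literal_eval only accepts literals, so no module code is executed.
            return list(ast.literal_eval(node.value))
    return []


exported = names_from_all(SOURCE)
print(exported)
```

Parsing the syntax tree avoids depending on how the list is wrapped or quoted, at the cost of only handling __all__ values written as plain literals.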

Collaborator

Thanks for the info. In terms of saving and loading from the Hub, I was under the impression that we would rely on kernels upload to push to the Hub, and that we'd load the card from the copy we have in source at kernels/src/kernels/cli/card_template.md (or, if it already exists, it would be cloned with the rest of the repo source).

And ah, great point: I misspoke when I said "injected". I see that the variables are injected in the ModelCard.from_template call. My intention was to point out that we parse the file via regex and then use the parsed values as input when reconstructing the file.

It seems like we can avoid the need to parse the file altogether if we just leave the Jinja template placeholders like {{ kernel_description }} in place and let Jinja/ModelCard replace only those sections. This way we wouldn't have to capture the existing text and pass it to the ModelCard.from_template call. My concern is that we can run into issues when trying to parse existing text.

example

if I run

kernels init-card --repo_id drbh/test-kernels .

then add a note under one of the sections like ## Supported backends

like this

## Supported backends

will add ABD support in the future

and then run

kernels fill-card --repo_id drbh/test-kernels .

it removes the user-added note completely.

outputs:

## Supported backends

- metal

I believe this happens because this logic

    updated_card = ModelCard.from_template(
        card_data=existing_card.data,
        template_path=str(KERNEL_CARD_TEMPLATE_PATH),
        kernel_description=description,
        **preserved,
        **dynamic_vars,
    )

overwrites the preserved (parsed) data with the dynamic_vars inputs when they both refer to the same variable.
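
One detail that may be worth checking against the actual contents of preserved and dynamic_vars (a sketch, not a diagnosis of the real code): when two dicts unpacked into the same call share a key, Python raises a TypeError rather than letting the later one win. Precedence only becomes "last one wins" if the dicts are merged first. The render function below is a hypothetical stand-in for any call that accepts template variables as keyword arguments.

```python
def render(**variables):
    """Hypothetical stand-in for a template call that takes keyword variables."""
    return variables


preserved = {"supported_backends": "will add ABD support in the future"}
dynamic_vars = {"supported_backends": "- metal"}

# Unpacking both dicts into one call fails when they share a key.
try:
    render(**preserved, **dynamic_vars)
    duplicate_error = None
except TypeError as exc:
    duplicate_error = str(exc)

# Merging first makes the precedence explicit: the later dict wins.
merged = {**preserved, **dynamic_vars}
result = render(**merged)

print(duplicate_error)
print(result)
```

Either way, the user's note survives only if it is kept out of the variable that the build-derived value fills.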

it seems like it would be less error-prone to keep the {{ supported_backends }} placeholder in the card template that the user edits

like this:

## Supported backends

{{ supported_backends }}

so a user could add notes like this

## Supported backends

will add ABD support in the future

{{ supported_backends }}

and then

kernels fill-card --repo_id drbh/test-kernels .

would result in

## Supported backends

will add ABD support in the future

- metal

following a pattern like this would remove the need to parse the file via regex, and would avoid reconstructing the whole file from a combination of regex parsed text and jinja variables, and has the benefit of clearly showing which sections of the card will be replaced.
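
The placeholder pattern described above can be sketched as follows. This is a stdlib-only stand-in that uses plain string replacement; in the real code path the rendering would presumably go through Jinja (for example jinja2.Template(text).render(...)) or ModelCard.from_template. The names CARD_TEXT and fill_placeholders are hypothetical and are not part of kernels.

```python
# Hypothetical card text as a user might leave it after `kernels init-card`:
# a hand-written note plus an untouched placeholder.
CARD_TEXT = """## Supported backends

will add ABD support in the future

{{ supported_backends }}
"""


def fill_placeholders(text: str, values: dict[str, str]) -> str:
    """Replace `{{ name }}` placeholders and leave all other text untouched."""
    for name, value in values.items():
        text = text.replace("{{ " + name + " }}", value)
    return text


filled = fill_placeholders(CARD_TEXT, {"supported_backends": "- metal"})
print(filled)
```

Nothing outside the placeholder is parsed or rebuilt, so the user's note is carried through unchanged. One thing to keep in mind with this approach is that the placeholder is gone after the first fill, so refreshing build-derived values later would need either the unfilled card or some marker for the generated region.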

Member Author

@drbh do the recent changes adhere to what you had in mind?

Comment on lines +111 to +117
        caps = kernel_configs[k].get("cuda-capabilities")
        if caps:
            cuda_capabilities.update(caps)
    if cuda_capabilities:
        vars["cuda_capabilities"] = "\n".join(
            f"- {cap}" for cap in cuda_capabilities
        )
Member Author

Based on https://huggingface.slack.com/archives/C090JN2P8NB/p1772345791192429, I think we're including CUDA capabilities in the build.toml for special cases (like FP8) for now?

Once it's a part of metadata.json, we can modify this code to parse the capability info from there. WDYT?

4 participants