Skip to content

Conversation

@deblasis
Copy link

@deblasis deblasis commented Jun 23, 2023

Hi there, I found myself needing this so I thought it was a good opportunity to contribute a little.

Happy to make the necessary changes, I tried to comply with style as much as possible.

Fixes #981

cc: @colemickens, @davidpinhas, @b-dean

Basic Demo

sops_oci_kms_demo.mp4

Makefile Outdated

%.pb.go: %.proto
protoc --go_out=plugins=grpc:. $<
protoc --go_out=. --go-grpc_out=require_unimplemented_servers=false:. $<
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be the current way of generating files with protoc since 1.20 I believe.

The generated files are also split now as you can see below.

The flag require_unimplemented_servers=false is necessary to preserve the previous functionality.

Please note that I am using:
image

I have also tried

protoc-gen-go v1.23.0
protoc        v3.13.0

but I am still getting errors with the plugin settings.

I don't think this is an issue but just flagging for the reviewer

@@ -1,5 +1,7 @@
syntax = "proto3";

option go_package = "/keyservice";
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added this ad best practice and because I was getting errors because of this as well when doing make generate

@hiddeco hiddeco added this to the v3.9.0 milestone Jul 4, 2023
@hiddeco hiddeco changed the base branch from master to main July 6, 2023 20:58
@felixfontein felixfontein modified the milestones: v3.9.0, 3.10.0 Jun 26, 2024
Signed-off-by: Alessandro De Blasis <alex@deblasis.net>
@deblasis deblasis force-pushed the issue/#981-oci-kms branch from 3fa4e3c to 6407abc Compare January 8, 2025 14:32
@jpark0910
Copy link

Is there anything else blocking this PR from being merged main for the 3.10.0 milestone release? is there an ETA on 3.10.0?

@felixfontein
Copy link
Contributor

I personally won't have time to review this anytime soon (see #1097 (comment)). I'm not sure if anyone else from the maintainers has time either, but it might well be that this won't make it into 3.10 and will get moved to the 3.11 milestone (once that exists).

@felixfontein felixfontein modified the milestones: 3.10.0, 3.11.0 Mar 31, 2025
@masonhuemmer
Copy link

Any update?

// client, err := keymanagement.NewKmsCryptoClientWithConfigurationProvider(common.CustomProfileConfigProvider("/home/<user>/.oci/config", "<profile>"), endpoint)
// Sticking with the defaults for now.

client, err = keymanagement.NewKmsCryptoClientWithConfigurationProvider(common.DefaultConfigProvider(), endpoint)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather not have this just be the default config provider. Here's some reasons:

  1. It doesn't work with any sort of standard environment variables. The TF_VAR_* stuff is in there, but no OCI_* or OCI_CLI_*
  2. It doesn't work with the security_token auth that you get when you use SSO / SAML / Federation stuff.
  3. It requires you always create a ~/.oci/config which is a pain when you're running somewhere temporary, like in a container, and might not have proper users, etc.

I've written a small library that could help: https://github.com/ontariosystems/oci-cli-env-provider

It uses the semi-standard OCI_CLI_* environment variables used by the oci cli.

So what I'd do here is change it get the provider that way, and include the instance identity provider so we can run it on compute instances too (which I definitely need sops to be able to do).

import "github.com/ontariosystems/oci-cli-env-provider"

func (key *MasterKey) newKMSClient() (*kms.KmsCryptoClient, error) {
	configProvider, err := configurationProvider()
	if err != nil {
		return nil, err 
	}

	// ...
}

func configurationProvider() (common.ConfigurationProvider, error) {
	var providers []common.ConfigurationProvider

	providers = append(providers, ocep.DefaultConfigProvider())
	ip, err := auth.InstancePrincipalConfigurationProvider()
	if err != nil {
		return nil, err
	}
	providers = append(providers, ip)

	return common.ComposingConfigurationProvider(providers)
}

I've added a similar comment on #1865

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @b-dean I took a look, made a few changes and I am curious about your thoughts.

I am not sure about the policy regarding external libraries anyway I have used "ocep" but a bit differently in order to be more explicit about the behaviour within sops.

I also added SDK Env vars sourcing.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@deblasis, I like it. I'm going to have to build a copy and try it out some.

…into issue/getsops#981-oci-kms

Signed-off-by: Alessandro De Blasis <alex@deblasis.net>
Signed-off-by: Alessandro De Blasis <alex@deblasis.net>
Signed-off-by: Alessandro De Blasis <alex@deblasis.net>
- Introduced `oci-cli-env-provider` for enhanced OCI configuration management.
- Implemented `configurationProvider` to prioritize CLI environment variables.
- Added tests for configuration provider functionality, ensuring correct behavior with OCI CLI and environment variables.

Signed-off-by: Alessandro De Blasis <alex@deblasis.net>
- Clarified OCI KMS authentication order and added detailed examples for using OCI CLI and SDK environment variables.
- Improved formatting and removed unnecessary whitespace for better readability.

Signed-off-by: Alessandro De Blasis <alex@deblasis.net>
…ci-kms

Signed-off-by: Alessandro De Blasis <alex@deblasis.net>
- Bumped Go version from 1.23.0 to 1.24.0.
- Added toolchain version 1.24.4 for consistency.

Signed-off-by: Alessandro De Blasis <alex@deblasis.net>
…ci-kms

Signed-off-by: Alessandro De Blasis <alex@deblasis.net>
@deblasis deblasis requested a review from b-dean September 29, 2025 02:39
Copy link
Contributor

@b-dean b-dean left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

going to need some way to provide the correct crypto endpoint (or find it)

}
region := parts[3]
vault_ref := parts[4]
return region, vault_ref, nil
Copy link
Contributor

@b-dean b-dean Sep 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@deblasis I'm not really sure what to do with this, because I assume the ocid in your demo video is real, but mine ends up with a different ocid and a different endpoint

My ocid is of this form:

ocid1.vault.oc1.iad.example.aaaaaREDACTED

which then my endpoint ends up being:

https://example-crypto.kms.us-ashburn-1.oraclecloud.com

with example there being what you call vault_ref here. The iad in my ocid and the us-ashburn-1 in the crypto endpoint are not interchangeable. I've tried swapping both out just with oci kms management key get --key-id ${key_ocid} --endpoint ${crypto_mgmt_endpoint} and it only works w/ iad in the ocid and us-ashburn-1 in the endpoint.

So I'm not sure if we can extract the parts reliably from the key OCID. Having worked w/ OCI for a while this doesn't surprise me. They probably have several different iterations of vaults and the endpoints aren't consistent.

The best I can think of is either:

  • also require the vault ocid and call GetVault to get the crypto endpoint
  • use ListVaults to list all the vaults in the compartment, and then use the crypto management endpoint on each to call ListKeys until you find the key that was specified by the sops args and then return the crypto endpoint url of the vault that has that key 🤮

it'd be really nice to only have to say the key ocid, but that second option of loop through all of them seems horrible. any other ideas?

I suppose you could also have:

  • try both versions of the crypto endpoint url (such as both iad and us-ashburn-1) with some sort of short timeout to figure out which is the correct endpoint.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's worse than just looping through all vaults. you don't even know which compartment it's in. so you'd have to loop though every compartment, and then every vault in every compartment. that's horrible.

It'd be better to just require to specify the vault id too. vault ocid and key ocid

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need to be honest, I haven't been using OCI in a while (the PR is quite old), I just dealt with merge conflicts recently to push this over the line. Guilty as charged for just assuming that it would just work without giving it a spin after months.

Yeah, it seems that something under the hood changed which anyway means that my initial approach was brittle to begin with.

I am going to play with this as soon as I have some time in the next few days.
Your detailed suggestions are appreciated and noted. 🫡

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@deblasis, good news! there's an easy way to deal with this. behold:

package main

import (
        "fmt"
        "strings"

        "github.com/oracle/oci-go-sdk/v65/common"
)

func main() {
        keys := []string{
                "ocid1.key.oc1.uk-london-1.asdadsasdagz5aacmg.abwgiljtjasdasdasdagugpfe7wrtngukihgkybqxcoozz7sbh6lq",
                "ocid1.key.oc1.iad.asdadsasdagz5aacmg.abwgiljtjasdasdasdagugpfe7wrtngukihgkybqxcoozz7sbh6lq",
        }

        for i, ocid := range keys {
                endpoint, err := endpoint(ocid)
                if err != nil {
                        fmt.Println(err.Error())
                } else {
                        fmt.Printf("key %d endpoint: %s\n", i, endpoint)
                }
        }
}

func endpoint(ocid string) (string, error) {
        region, vaultExt, err := extractRefs(ocid)
        if err != nil {
                return "", err
        }

        cryptoEndpointTemplate := common.StringToRegion(region).EndpointForTemplate("kms", "https://{vaultExt}-crypto.kms.{region}.{secondLevelDomain}")
        cryptoEndpoint := strings.Replace(cryptoEndpointTemplate, "{vaultExt}", vaultExt, 1)
        return cryptoEndpoint, nil
}

const ocidParts = 6

func extractRefs(ocid string) (string, string, error) {
        parts := strings.Split(ocid, ".")
        if len(parts) != ocidParts {
                return "", "", fmt.Errorf("OCID length is %s, expected %d", ocid, ocidParts)
        }
        region := parts[3]
        vaultExt := parts[4]
        return region, vaultExt, nil
}

which outputs the following:

$ go run main.go
key 0 endpoint: https://asdadsasdagz5aacmg-crypto.kms.uk-london-1.oraclecloud.com
key 1 endpoint: https://asdadsasdagz5aacmg-crypto.kms.us-ashburn-1.oraclecloud.com

based on the your keysource.go and this example I found in the oci-go-sdk

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@deblasis, you can pull this into your branch if you like:
b-dean@a5e3d41

@b-dean
Copy link
Contributor

b-dean commented Oct 2, 2025

so it does work now, but it's painfully slow.

it took 361s to encrypt or decrypt {"foo": "bar"}. when I did the same thing with oci kms crypto encrypt cli, it was under a second.

…tion provider

- Implemented tests to verify that Instance Principal is skipped when valid environment variables are provided.
- Added fallback mechanism to ensure Instance Principal is called when environment variables are missing or invalid.
- Updated comments for clarity on the configuration provider's behavior regarding Instance Principals.

Signed-off-by: Alessandro De Blasis <alex@deblasis.net>
@deblasis
Copy link
Author

deblasis commented Oct 3, 2025

so it does work now, but it's painfully slow.

it took 361s to encrypt or decrypt {"foo": "bar"}. when I did the same thing with oci kms crypto encrypt cli, it was under a second.

Hi @b-dean

My fault.

I believe that OCI KMS tried Instance Principal auth even when you had valid credentials in env vars, causing the timeout hell you experienced.

Now it checks if env vars/config work first, skipping Instance Principal entirely. Same pattern Azure and GCP use.

Your operations should go from 361s to microseconds. 🤞

Since I was there, I stole other patterns from the other providers and improved testability.

@deblasis deblasis requested a review from b-dean October 3, 2025 09:14
@b-dean
Copy link
Contributor

b-dean commented Oct 3, 2025

@deblasis, your latest version works much better.

I was thinking about the change you made in 4e025d8, and the composing provider is supposed to take a slice of things and then only call the method (like p.TenancyOCID()) if none of the earlier providers returned one. It'd be nice if the instance principal one was not conditional, and the early exist wasn't there either.

So .. I made this in github.com/ontariosystems/oci-cli-env-provider@v0.2.0

p := ocep.LazyConfigProvider(auth.InstancePrincipalConfigurationProvider)

so it won't call the function to create the thing unless no other provider returns a valid TenancyOCID() (or whatever method is being called).

I also added a better composing provider that actually loops through everything for AuthType(), instead of just the first thing.

all together you could have:

// newIPProvider is a variable to allow tests to stub the Instance Principal provider factory
var newIPProvider = auth.InstancePrincipalConfigurationProvider

func configurationProvider() (common.ConfigurationProvider, error) {
	var providers []common.ConfigurationProvider

	// 1) Prefer the CLI-compatible envs used widely in CI/containers (envs only; no implicit fallbacks)
	providers = append(providers, ocep.OciCliEnvironmentConfigurationProvider())

	// 2) Native SDK envs (OCI_tenancy_ocid, OCI_user_ocid, OCI_fingerprint, OCI_private_key_path, OCI_region)
	providers = append(providers, common.ConfigurationProviderEnvironmentVariables("OCI", ""))

	// 3) File-based fallbacks
	if cfg := os.Getenv(OCICLIConfigFile); cfg != "" {
		if prof := os.Getenv(OCICLIProfile); prof != "" {
			if p, err := common.ConfigurationProviderFromFileWithProfile(cfg, prof, ""); err == nil {
				providers = append(providers, p)
			}
		} else {
			if p, err := common.ConfigurationProviderFromFile(cfg, ""); err == nil {
				providers = append(providers, p)
			}
		}
	}

	// 4) Instance principals for compute instances
	providers = append(providers, ocep.LazyConfigProvider(newIPProvider))	

	// 5) Always keep a last-resort default config provider at the end
	providers = append(providers, common.DefaultConfigProvider())

	return ocep.ComposingConfigProvider(providers), nil

I do wonder if the last to should be switched. the instance principal one only works on instances, where as the default works so long as you have things correct in ~/.oci/config or whatever other default locations.

So that'd be:

  1. OCI_CLI env
  2. OCI env
  3. config files from OCI_CLI_CONFIG_FILE and OCI_CLI_PROFILE
  4. default config from sdk
  5. instance profile config

thoughts?

- Added lazyConfigurationProvider to defer the creation of the ConfigurationProvider until the first method call, optimizing performance for expensive providers like Instance Principal.
- Implemented tests to ensure factory is only called once and correctly propagates errors.
- Updated comments for clarity on the lazy initialization behavior.

Signed-off-by: Alessandro De Blasis <alex@deblasis.net>
@deblasis
Copy link
Author

deblasis commented Oct 4, 2025

Hey @b-dean! Thanks for the detailed suggestions and for adding those improvements to github.com/ontariosystems/oci-cli-env-provider v0.2.0

About this, my personal preference would be not to have the dependency at all to be honest. I added it initially but I am not a fan of it. Nothing against your library (it's solid!). Just a matter of codebase hygiene if that makes sense.

I'd like a pointer from the maintainers on this matter as they might find that a totally self contained implementation is preferable over abstractions coming from external packages whenever possible. While in doubt currently we have a hybrid approach where for now we just use ocep.OciCliEnvironmentConfigurationProvider.

https://github.com/getsops/sops/pull/1226/files#diff-a085f93310390109561334c42f9f26e612c342ded5b8318a80c71c8357f34728R28

Let's see what they say. I'd be happy to:

  1. re-implement that here and remove the dependency entirely
    or
  2. to offload the configuration logic completely to the external package
    or
  3. leave as is

My preference is on 1.

That being said, I went ahead and implemented the lazy evaluation pattern locally in a very similar fashion to your approach. I realized that the early exit was already handled by common.ComposingConfigurationProvider internally.

I did take your advice on the provider ordering as well. Runtime magic now is last resort, allowing for config overrides.

@b-dean
Copy link
Contributor

b-dean commented Oct 4, 2025

@deblasis , that makes sense. It'd be nice if Oracle would've just put some of this in their sdk to begin with. I'm fine if you want to copy parts of the code I wrote directly in here, if the sops maintainers would rather not add more libraries.

I mainly published that because I wanted to use the OCI_CLI_* env and not have to make ~/.oci/config everywhere I'd like to use sops.

@deblasis
Copy link
Author

deblasis commented Oct 5, 2025

@deblasis , that makes sense. It'd be nice if Oracle would've just put some of this in their sdk to begin with.

I mainly published that because I wanted to use the OCI_CLI_* env and not have to make ~/.oci/config everywhere I'd like to use sops.

I think the best thing you can do is to file a PR upstream to add native support for OCI_CLI_* env straight into the SDK! :)

@deblasis
Copy link
Author

deblasis commented Oct 7, 2025

Test failed due to a race condition originating from the SDK code itself.

I filed a PR over there with repro steps and a fix

oracle/oci-go-sdk#603

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Support for Oracle Cloud Vault / KMS

6 participants