Found another bug today.
Investigated it deeply with AI.
As I understand it, the real cause is go-yaml's bug. I reported it to the library (issue).
I am creating this issue to let you know about it + perhaps it is smart to implement a workaround for go-yaml's bug to not depend on it.
NOTE: I am just a library user who tries to dig into issues with AI's help :)
Below is a summary I've prepared with AI's help.
Summary
Constructing a Validator from a libopenapi.Document mutates the shared high-level DocumentModel. When one schema embeds a $ref nested inside it (under allOf/anyOf/oneOf/not/items/properties), building the validator corrupts the lazy $ref resolution of other schema nodes read afterwards — their declared Type comes back empty.
Root cause: NewValidatorFromV3Model → warmSchemaCaches pre-compiles every schema at construction time, which renders and yaml.Marshals schema nodes. Those nodes are shared with the live model the caller still holds, and the marshal strips their resolved YAML tags in place (see yaml/go-yaml#371). Schema Type is resolved lazily from those nodes, so any schema read after construction now resolves with an empty Type.
Environment
github.com/pb33f/libopenapi-validator v0.13.11
github.com/pb33f/libopenapi v0.38.1
go.yaml.in/yaml/v4 v4.0.0-rc.5
- Go 1.26.2
- OS: Debian 12
Reproduction/evidence
// Bug: constructing a Validator mutates the shared high-level DocumentModel.
//
// When a parameter's schema embeds a $ref NESTED inside it (under allOf,
// anyOf, oneOf, not, items, properties, ...), building the validator corrupts
// the lazy resolution of OTHER schema nodes that are read afterwards: their
// Type comes back empty.
//
// Root cause: NewValidator -> warmSchemaCaches pre-compiles every schema,
// which renders + yaml.Marshal's the schema nodes. The marshal strips the
// resolved YAML tags off nodes that are SHARED with the live model (see
// yaml/go-yaml#371). Schema Type is resolved lazily from those nodes, so any
// schema read AFTER construction resolves with an empty Type. Reading the
// schemas first (caching them while tags are intact) makes them immune --
// which pins NewValidator as the mutator rather than a read-side problem.
//
// Scope (established by minimizing from a larger real-world spec):
//
// - The trigger is a NESTED $ref, not a specific keyword. allOf, anyOf,
// oneOf, not, items, and properties all reproduce it. The ONLY safe case
// is a parameter whose schema IS a bare top-level $ref -- that is passed
// through by reference and never rendered/marshalled. See the directRef
// control sub-test below.
// - External files are NOT required. An internal #/components/schemas/...
// ref reproduces it; an external ./file.yaml#/... ref does too.
// - The corrupted node is a SEPARATE, later-resolved schema -- not the
// nested member itself. The victim here is parameter "b", read after the
// validator is built. The member node under "a" resolves fine on its own.
// - No patternProperties / unevaluatedProperties / deepObject / explode are
// needed; a bare combiner-with-one-$ref is enough.
//
// How to run:
// 1. Drop this file into the root of github.com/pb33f/libopenapi-validator.
// 2. go test -run TestValidatorCorruptsModel -v
//
// Observed with libopenapi v0.38.1 and libopenapi-validator (current main).
package validator_test
import (
"os"
"path/filepath"
"strings"
"testing"
"github.com/pb33f/libopenapi"
validator "github.com/pb33f/libopenapi-validator"
"github.com/pb33f/libopenapi/datamodel"
v3 "github.com/pb33f/libopenapi/datamodel/high/v3"
"github.com/pb33f/testify/assert"
"github.com/pb33f/testify/require"
)
// Single self-contained spec: query parameter "a" has an allOf wrapping one
// internal $ref; query parameter "b" is a plain scalar declared after it.
// "b" is the victim whose declared type ("integer") goes missing.
const corruptSpec = `openapi: 3.1.0
info:
title: validator-mutates-model repro
version: "1.0"
paths:
/things:
get:
parameters:
- name: a
in: query
schema:
allOf:
- $ref: '#/components/schemas/T'
- name: b
in: query
schema:
type: integer
responses:
'200':
description: ok
components:
schemas:
T:
type: object
`
// directRefSpec is corruptSpec with the allOf removed: parameter "a" is a
// DIRECT $ref. This variant does NOT corrupt "b" -- it isolates the combiner
// as the trigger.
const directRefSpec = `openapi: 3.1.0
info:
title: validator-mutates-model repro (direct ref control)
version: "1.0"
paths:
/things:
get:
parameters:
- name: a
in: query
schema:
$ref: '#/components/schemas/T'
- name: b
in: query
schema:
type: integer
responses:
'200':
description: ok
components:
schemas:
T:
type: object
`
// paramType returns the resolved Type of query parameter `name`, or "<empty>"
// when the schema resolves with no Type.
func paramType(t *testing.T, model *v3.Document, name string) string {
t.Helper()
for pair := model.Paths.PathItems.First(); pair != nil; pair = pair.Next() {
for _, p := range pair.Value().Get.Parameters {
if p.Name != name {
continue
}
if p.Schema != nil {
if ts := p.Schema.Schema().Type; len(ts) > 0 {
return strings.Join(ts, ",")
}
}
return "<empty>"
}
}
require.Failf(t, "parameter not found", "parameter %q not found", name)
return ""
}
// loadModel writes the given spec to a temp file and builds the v3 model.
func loadModel(t *testing.T, spec string) (libopenapi.Document, *v3.Document) {
t.Helper()
dir := t.TempDir()
path := filepath.Join(dir, "openapi.yaml")
require.NoError(t, os.WriteFile(path, []byte(spec), 0o644))
data, err := os.ReadFile(path)
require.NoError(t, err)
doc, err := libopenapi.NewDocumentWithConfiguration(data, &datamodel.DocumentConfiguration{
BasePath: dir,
})
require.NoError(t, err)
model, errs := doc.BuildV3Model()
require.Empty(t, errs)
return doc, &model.Model
}
func TestValidatorCorruptsModel(t *testing.T) {
// BUG: build the validator BEFORE parameter "b" is first read.
t.Run("validator built before reading -> type lost", func(t *testing.T) {
doc, model := loadModel(t, corruptSpec)
_, errs := validator.NewValidator(doc)
require.Empty(t, errs)
got := paramType(t, model, "b")
t.Logf("type of b after building validator: %q", got)
assert.Equal(t, "integer", got, "building the validator corrupted parameter b's type")
})
// CONTROL: read the schema first (caching it), then build the validator.
// This passes, proving the validator mutates not-yet-resolved schema nodes.
t.Run("schema read before validator -> type intact", func(t *testing.T) {
doc, model := loadModel(t, corruptSpec)
_ = paramType(t, model, "b") // force lazy resolution + cache
_, errs := validator.NewValidator(doc)
require.Empty(t, errs)
got := paramType(t, model, "b")
t.Logf("type of b after building validator: %q", got)
assert.Equal(t, "integer", got, "control unexpectedly broken")
})
// CONTROL: same shape but "a" is a DIRECT $ref instead of allOf-wrapped.
// "b" survives even when built first -- proving the combiner, not the $ref,
// is what triggers the corruption.
t.Run("direct ref (no combiner) -> type intact", func(t *testing.T) {
doc, model := loadModel(t, directRefSpec)
_, errs := validator.NewValidator(doc)
require.Empty(t, errs)
got := paramType(t, model, "b")
t.Logf("type of b after building validator (direct ref): %q", got)
assert.Equal(t, "integer", got, "direct-ref control broken")
})
}
Suggested fix
Defense-in-depth, independent of the go-yaml fix: cache warming (and any render/marshal during construction) should operate on a deep copy of the schema nodes rather than the live, shared model nodes — so building a validator can never mutate the caller's DocumentModel. Worth also flagging upstream in libopenapi that RenderInline returns nodes aliasing the live tree, and that lazy Type resolution depends on tags that get stripped.
Found another bug today.
Investigated it deeply with AI.
As I understand it, the real cause is go-yaml's bug. I reported it to the library (issue).
I am creating this issue to let you know about it + perhaps it is smart to implement a workaround for go-yaml's bug to not depend on it.
NOTE: I am just a library user who tries to dig into issues with AI's help :)
Below is a summary I've prepared with AI's help.
Summary
Constructing a
Validatorfrom alibopenapi.Documentmutates the shared high-levelDocumentModel. When one schema embeds a$refnested inside it (underallOf/anyOf/oneOf/not/items/properties), building the validator corrupts the lazy$refresolution of other schema nodes read afterwards — their declaredTypecomes back empty.Root cause:
NewValidatorFromV3Model→warmSchemaCachespre-compiles every schema at construction time, which renders andyaml.Marshals schema nodes. Those nodes are shared with the live model the caller still holds, and the marshal strips their resolved YAML tags in place (see yaml/go-yaml#371). SchemaTypeis resolved lazily from those nodes, so any schema read after construction now resolves with an emptyType.Environment
github.com/pb33f/libopenapi-validatorv0.13.11github.com/pb33f/libopenapiv0.38.1go.yaml.in/yaml/v4v4.0.0-rc.5Reproduction/evidence
Suggested fix
Defense-in-depth, independent of the go-yaml fix: cache warming (and any render/marshal during construction) should operate on a deep copy of the schema nodes rather than the live, shared model nodes — so building a validator can never mutate the caller's
DocumentModel. Worth also flagging upstream in libopenapi thatRenderInlinereturns nodes aliasing the live tree, and that lazyTyperesolution depends on tags that get stripped.