Skip to content

Commit 4da391c

Browse files
committed
mmaprototype: add top level comment for doStructuralNormalization
1 parent c7813f5 commit 4da391c

File tree

1 file changed

+227
-22
lines changed

1 file changed

+227
-22
lines changed

pkg/kv/kvserver/allocator/mmaprototype/constraint.go

Lines changed: 227 additions & 22 deletions
Original file line numberDiff line numberDiff line change
@@ -550,20 +550,211 @@ func buildVoterAndAllRelationships(
550550
return rels, emptyConstraintIndex, emptyVoterConstraintIndex, nil /*err*/
551551
}
552552

553-
// Structural normalization establishes relationships between every pair of
554-
// ConstraintsConjunctions in constraints and voterConstraints, and then tries
555-
// to map conjunctions in voterConstraints to narrower conjunctions in
556-
// constraints. This is done to handle configs which under-specify
557-
// conjunctions in voterConstraints under the assumption that one does not
558-
// need to repeat information provided in constraints (see the new
559-
// "strictness" comment in roachpb.SpanConfig which now requires users to
560-
// repeat the information).
561-
//
562-
// This function mutates the conf argument. It does some structural
563-
// normalization even when returning an error. See the under-specified voter
564-
// constraint examples in the datadriven test -- we sometimes see these in
565-
// production settings, and we want to fix ones that we can, and raise an
566-
// error for users to fix their configs.
553+
// doStructuralNormalization mutates config.VoterConstraints and
554+
// config.Constraints in place by making under-specified voter and all-replica
555+
// constraints more explicit when possible.
556+
//
557+
// Along the way, it also checks whether the config meets the strictness
558+
// requirements in roachpb.SpanConfig. If the config is ambiguous or cannot be
559+
// fully normalized, we continue with best-effort normalization and return an
560+
// error. The caller should surface that error to the operator, since the
561+
// resulting normalization is not guaranteed to be valid.
562+
//
563+
// (TODO(wenyihu6): we should clarify what the operator should do with the error
564+
// we return an error even when we re not violating spanconfig contract which is
565+
// confusing.)
566+
//
567+
// Phase 1: Normalizing Voter Constraints
568+
// Goal:
569+
//
570+
// - Voters must always be a subset of all replicas, so each voter constraint
571+
// must satisfy some all-replica constraint. When a voter constraint is too
572+
// broad or under-specified, we narrow it to a more specific form. This step
573+
// makes voter constraints explicit and strictly defined.
574+
//
575+
// Example (too broad):
576+
// constraints: [+region=a,+zone=a1]: 2, [+region=a,+zone=a2]: 2
577+
// voterConstraints: [+region=a]: 3 (broader than the all-replica
578+
// constraints)
579+
// constraints and needs to be narrowed to [+region=a,+zone=a1]: 2,
580+
// [+region=a,+zone=a2]: 1.
581+
//
582+
// Example (under-specified):
583+
// constraints: [+region=a]: 2, [+region=b]: 2
584+
// voterConstraints: num_voters=3 (no explicit conjunctions)
585+
// Historically, users did not need to repeat constraint information in
586+
// voterConstraints—the system would implicitly derive that voters must be
587+
// placed in region=a or region=b based on the all-replica constraints. The
588+
// new strictness requirement in roachpb.SpanConfig now requires users to
589+
// explicitly specify voterConstraints. In this example, we normalize legacy
590+
// configs to voterConstraints: [+region=a]: 2, [+region=b]: 1 by deriving
591+
// these explicit voter constraints from the all-replica constraints for
592+
// backwards compatibility.
593+
//
594+
// - While tightening voter constraints, we also detect cases where the config
595+
// is too ambiguous for us to confidently determine coverage. In those cases,
596+
// we continue with best-effort normalization and return an error. See
597+
// example on conjPossiblyIntersecting for more details.
598+
//
599+
// To achieve this, we "satisfy" each voter constraint by incrementally building
600+
// up from 0 to its desired numReplicas. For each voter constraint, we try to
601+
// associate its replicas with all-replica constraints that can cover them.
602+
//
603+
// We first establish relationships between every voter constraint conjunction
604+
// and every all-replica constraint conjunction. Then we process those
605+
// relationships in the following order (refer to conjunctionRelationship for
606+
// more details):
607+
//
608+
// - conjPossiblyIntersecting: there exists a voter constraint and an
609+
// all-replica constraint that has some shared and non-shared conjunctions. This
610+
// results in an error because we cannot prove whether a replica satisfying the
611+
// voter constraint is included in the corresponding all-replica constraint. We
612+
// essentially treat this as an ambiguous case and do not associate any replicas
613+
// using this relationship.
614+
//
615+
// Example:
616+
// constraints: [+region=a,+dc=dc1]: 2,
617+
// voterConstraints: [+region=a,+zone=a2]: 2
618+
//
619+
// The voter and all-replica constraints share +region=a but differ on
620+
// +zone=a1 vs +dc=dc1. We cannot determine if zone=a1 is in dc=dc1, so we
621+
// skip matching using this relationship and return an error.
622+
//
623+
// TODO(wenyihu6): its odd that we are returning an error when they are not
624+
// violating spanconfig contract. What do we expect the operator / the caller to
625+
// do in this case?
626+
//
627+
// - conjEqualSet: the voter constraint and the all-replica constraint have the
628+
// same set of conjunctions. We can directly satisfy the voter constraint by
629+
// using the all-replica constraint's numReplicas count.
630+
631+
// Example 1: (all-replica has fewer)
632+
// constraints: [+region=a,+zone=a1]: 1
633+
// voterConstraints: [+region=a,+zone=a1]: 2
634+
// We can only satisfy 1 of the 2 voters here. The remaining 1 must be satisfied
635+
// later by another relationship (e.g. conjStrictSubset with an empty
636+
// constraint).
637+
//
638+
// Example 2: (all-replica has more)
639+
// constraints: [+region=a,+zone=a1]: 3
640+
// voterConstraints: [+region=a,+zone=a1]: 2
641+
// We satisfy all 2 voters, using 2 of the 3 all-replica slots. The extra slot
642+
// can accommodate a non-voter.
643+
//
644+
// - conjStrictSubset: voter constraint is more specific than the all-replica
645+
// constraint. We can also directly satisfy since any replica matching the
646+
// voter constraint will also match the broader all-replica constraint.
647+
//
648+
// Example:
649+
// constraints: [+region=a,+zone=a1]: 2
650+
// voterConstraints: [+region=a,+zone=a1,+dc=dc1]: 2
651+
//
652+
// TODO(wenyihu6): I'm not sure how to convince the correctness of this method
653+
// when all-replica constraints are overlapping - add test case for this.
654+
//
655+
// - conjStrictSuperset: The voter constraint is more general than the
656+
// all-replica constraint, meaning it is under-specified. For voter constraints
657+
// that are still unsatisfied after the above steps, we satisfy them by
658+
// narrowing: creating new, tighter voter constraints derived from the
659+
// all-replica constraints. We process conjEqualSet and conjStrictSubset before
660+
// conjStrictSuperset because direct satisfaction is preferred over narrowing.
661+
//
662+
// Example:
663+
// constraints: [+region=a,+zone=a1]: 2, [+region=a,+zone=a2]: 2
664+
// voterConstraints: [+region=a]: 3
665+
//
666+
// The voter constraint "+region=a" is broader than both all-replica
667+
// constraints. Historically, we allowed this under-specification. The new
668+
// strictness requirement in roachpb.SpanConfig requires explicit repetition,
669+
// but we normalize existing configs for backwards compatibility and to avoid
670+
// breaking production deployments. We narrow it by creating more specific
671+
// voter constraints:
672+
// voterConstraints: [+region=a,+zone=a1]: 2, [+region=a,+zone=a2]: 1.
673+
//
674+
// Note: as a special case, right before narrowing, we prioritize associating
675+
// empty voter constraints with empty all-replica constraints to avoid
676+
// unnecessary narrowing.
677+
//
678+
// Example:
679+
// voter_constraints: []: 1
680+
// constraints: [+region=a]: 1, []: 1
681+
//
682+
// Without this step, voter_constraints: []:1 would be unnecessarily narrowed to
683+
// [+zone=a]:1 but []: 1 in constraints can already satisfy it.
684+
//
685+
// - conjNonIntersecting: the voter and all-replica constraints do not have any
686+
// shared conjunctions. This means that the replica from the voter constraint is
687+
// not covered by the corresponding all-replica constraint. There is nothing we
688+
// can associate here, so we just skip matching using this relationship.
689+
//
690+
// Example:
691+
// constraints: [+zone=a1]: 2
692+
// voterConstraints: [+zone=a2]: 2
693+
//
694+
// The voter constraint [+zone=a2] has no overlap with [+zone=a1]. We skip this
695+
// relationship, leaving the voter unsatisfied. Note that in the end, we will
696+
// still return the original voter constraint despite it being unsatisfied, with
697+
// an error.
698+
//
699+
// To do this, we track two pieces of state:
700+
//
701+
// - allReplicaConstraintsInfo: for each all-replica constraint, we track
702+
// remaining capacity (starting at conf.Constraints[i].numReplicas). We
703+
// decrement it whenever we use it to satisfy a voter constraint.
704+
//
705+
// - voterConstraintsAndAdditionalInfo: for each voter constraint, we track
706+
// how many replicas have been satisfied (starting at 0). We increment it
707+
// toward conf.VoterConstraints[i].numReplicas as we satisfy it using
708+
// all-replica constraints.
709+
//
710+
// After processing all relationships, we verify that each voter constraint has
711+
// reached its required replica count. If any voter constraint is still short,
712+
// we return an error since the normalization could not establish full coverage,
713+
// but we leave the voter constraint's numReplicas unchanged in the output.
714+
//
715+
// Example:
716+
// constraints: [+zone=a1]: 1
717+
// voterConstraints: [+zone=a1]: 3
718+
// We can only satisfy 1 of the 3 voters using [+zone=a1]:1. The remaining 2
719+
// cannot be satisfied by any relationship. We return an error but output
720+
// voterConstraints: [+zone=a1]: 3 unchanged.
721+
//
722+
// Phase 2: Normalizing All-Replica Constraints
723+
// After Phase 1, voter constraints are normalized, but all-replica constraints
724+
// may be under-specified compared to them. Phase 2 borrows from the empty
725+
// all-replica constraint to top up specific constraints so they satisfy voter
726+
// requirements.
727+
//
728+
// Example (num-replicas=5, num-voters=4):
729+
// Original:
730+
//
731+
// constraints: [+region=us-west-1]: 1, [+region=us-east-1]: 1, []: 3
732+
// voterConstraints: [+region=us-west-1]: 2, [+region=us-east-1]: 2
733+
//
734+
// After Phase 1 (voter constraints already explicit, so unchanged):
735+
//
736+
// constraints: [+region=us-west-1]: 1, [+region=us-east-1]: 1, []: 3
737+
// voterConstraints: [+region=us-west-1]: 2, [+region=us-east-1]: 2
738+
//
739+
// The all-replica constraints only require 1 replica per region, but voters
740+
// need 2. Consider if constraints were left under-specified, and we had only
741+
// 3 voters: 2 in us-west-1, 1 in us-east-1, and were temporarily unable to add
742+
// a voter in us-east-1. Say we lose the non-voter too, and need to add one.
743+
// With the under-specified constraint we could add the non-voter anywhere,
744+
// since we think we are allowed 3 replicas with the empty constraint
745+
// conjunction. This is technically true, but adding the non-voter to
746+
// us-east-1 where a voter is missing is more efficient.
747+
//
748+
// After Phase 2 (borrow from [] to satisfy voter requirements):
749+
//
750+
// constraints: [+region=us-west-1]: 2, [+region=us-east-1]: 2, []: 1
751+
// voterConstraints: [+region=us-west-1]: 2, [+region=us-east-1]: 2
752+
//
753+
// Now non-voter placement is correctly guided to the unconstrained slot.
754+
//
755+
// (TODO(wenyihu6): this is the one that I am still confused about. Sumeer
756+
// mentioned that it is unclear whether we truly need this. So lets revisit this
757+
// later.)
567758
func doStructuralNormalization(conf *normalizedSpanConfig) error {
568759
if len(conf.constraints) == 0 || len(conf.voterConstraints) == 0 {
569760
return nil
@@ -602,10 +793,15 @@ func doStructuralNormalization(conf *normalizedSpanConfig) error {
602793
// Even if there was an error, we will continue normalization.
603794

604795
// For each all-replica constraint, track how many replicas are remaining
605-
// (not already taken by a voter constraint). Additionally, when we find an
606-
// all replica constraint that is a subset of one or more voter constraints,
607-
// and create a new voter constraint, we record the index of that new voter
608-
// constraint in newVoterIndex.
796+
// (not already taken by a voter constraint).
797+
//
798+
// allReplicaConstraintsInfo.remainingReplicas starts out as
799+
// conf.constraints[i].numReplicas and decreases as we satisfy voter
800+
// constraints with it. During the phase for conjStrictSubset, when we find
801+
// an all replica constraint that is stricter than voter constraints
802+
// (conjStrictSuperset), we create a new voter constraint and record the
803+
// index of that new voter constraint in newVoterIndex.
804+
// len(allReplicaConstraints) == len(conf.constraints).
609805
type allReplicaConstraintsInfo struct {
610806
remainingReplicas int32
611807
newVoterIndex int
@@ -619,14 +815,18 @@ func doStructuralNormalization(conf *normalizedSpanConfig) error {
619815
newVoterIndex: -1,
620816
})
621817
}
622-
// For each voter replica constraint, we keep the current
623-
// internedConstraintsConjunction. The numReplicas start with 0 and build up
818+
// For each voter constraint, we keep the current
819+
// internedConstraintsConjunction.
820+
// internedConstraintsConjunction.numReplicas start with 0 and build up
624821
// towards the desired number in the corresponding un-normalized voter
625822
// constraint. In addition, if the voter constraint had a superset
626823
// relationship with one or more all-replica constraint, we may take some of
627824
// its voter count and construct narrower voter constraints.
628825
// additionalReplicas tracks the total count of replicas in such narrower
629-
// voter constraints.
826+
// voter constraints. len(voterConstraints) starts out as
827+
// len(conf.voterConstraints) but may grow if we create new voter
828+
// constraints to satisfy the strict superset relationships with all-replica
829+
// constraints in phase 3.
630830
type voterConstraintsAndAdditionalInfo struct {
631831
internedConstraintsConjunction
632832
additionalReplicas int32
@@ -738,6 +938,9 @@ func doStructuralNormalization(conf *normalizedSpanConfig) error {
738938
break
739939
}
740940
}
941+
942+
// TODO(wenyihu6): should we do something here for nonintersecting or intersecting constraints?
943+
// Does it make sense to surface an error here
741944
for i := range conf.voterConstraints {
742945
neededReplicas := conf.voterConstraints[i].numReplicas
743946
actualReplicas := voterConstraints[i].numReplicas + voterConstraints[i].additionalReplicas
@@ -747,7 +950,9 @@ func doStructuralNormalization(conf *normalizedSpanConfig) error {
747950
if actualReplicas < neededReplicas {
748951
err = errors.Errorf("could not satisfy all voter constraints due to " +
749952
"non-intersecting conjunctions in voter and all replica constraints")
750-
// Just force the satisfaction.
953+
// Just force the satisfaction by leaving the original numReplicas in the
954+
// output unchanged even though we couldn't satisfy it using any
955+
// all-replica constraint.
751956
//
752957
// NB: this is not the same as
753958
// voterConstraints[i].numReplicas=neededReplicas, since we may have

0 commit comments

Comments
 (0)