docs: add configuration documentation guides #293

Draft
WentingWu666666 wants to merge 7 commits into documentdb:main from WentingWu666666:users/wentingwu/config-docs
Conversation

@WentingWu666666
Collaborator

Summary

Adds comprehensive configuration documentation for the DocumentDB Kubernetes Operator, covering TLS, storage, networking, and resource management. This replaces scattered configuration details with organized, cross-linked guides that follow Material for MkDocs best practices.

Changes

New documentation pages

  • configuration/tls.md: TLS modes (Disabled/SelfSigned/CertManager/Provided), certificate rotation, Azure Key Vault integration, and troubleshooting
  • configuration/storage.md: Storage classes by provider, PVC sizing, volume expansion, reclaim policies, benchmarking, and disk encryption (AKS/EKS/GKE)
  • configuration/networking.md: Service types (ClusterIP/LoadBalancer), cloud-specific LB annotations, DNS configuration, and Network Policies
  • configuration/resource-management.md: CPU/memory sizing, QoS classes, workload profiles (dev/prod/high-load), and monitoring
  • configuration/cluster-configuration.md: Guided CRD overview with full spec YAML, field tables, and cross-references

Documentation improvements

  • Reorganize mkdocs.yml nav with Configuration section and sub-items
  • Add Material for MkDocs features (tabs, code copy, admonitions, annotations)
  • Cross-link all config guides to auto-generated API Reference
  • Remove redundant Backup/ScheduledBackup from cluster-configuration.md (already in backup-and-restore.md and api-reference.md)
  • Slim down index.md Configuration section with links to new guides
  • Add "Further reading" section and FAQ entry pointing to API Reference
  • Add .gitignore entry for MkDocs site/ build output

Testing

  • Verified all internal cross-links between config guides and API Reference
  • Built docs locally with mkdocs serve; all pages render correctly
  • Verified Material for MkDocs tabs, admonitions, and code annotations work

Checklist

  • Code follows project conventions
  • No unintended debug code or TODOs left in
  • Documentation is cross-linked and consistent

Related Issues

Refs #248

Copilot AI review requested due to automatic review settings March 6, 2026 20:05
@WentingWu666666 WentingWu666666 marked this pull request as draft March 6, 2026 20:07
Add comprehensive configuration documentation for the DocumentDB Kubernetes
Operator covering TLS, storage, networking, and resource management.

New documentation pages:
- configuration/tls.md: TLS modes (Disabled/SelfSigned/CertManager/Provided),
  certificate rotation, Azure Key Vault integration, and troubleshooting
- configuration/storage.md: Storage classes by provider, PVC sizing, volume
  expansion, reclaim policies, benchmarking, and disk encryption (AKS/EKS/GKE)
- configuration/networking.md: Service types (ClusterIP/LoadBalancer),
  cloud-specific LB annotations, DNS configuration, and Network Policies
- configuration/resource-management.md: CPU/memory sizing, QoS classes,
  workload profiles (dev/prod/high-load), and monitoring recommendations
- configuration/cluster-configuration.md: Guided overview of all CRD fields
  with full spec YAML, field tables, and cross-references

Documentation improvements:
- Reorganize mkdocs.yml nav with Configuration section and sub-items
- Add Material for MkDocs features (tabs, code copy, admonitions, annotations)
- Cross-link all config guides to auto-generated API Reference
- Remove redundant Backup/ScheduledBackup from cluster-configuration.md
  (covered by backup-and-restore.md and api-reference.md)
- Slim down index.md Configuration section with links to new guides
- Add .gitignore entry for MkDocs site/ build output
- Update advanced-configuration/README.md with links to new config pages
- Add FAQ entry pointing to API Reference

Refs documentdb#248

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Wenting Wu <wentingwu@microsoft.com>
@WentingWu666666 WentingWu666666 force-pushed the users/wentingwu/config-docs branch from fcd0ca0 to 21ba781 Compare March 6, 2026 20:08
Contributor

Copilot AI left a comment

Pull request overview

This PR reorganizes and expands the operator’s public documentation by introducing a dedicated “Configuration” section (TLS, storage, networking, resource management, and CRD walkthrough) and wiring it into the MkDocs Material site structure.

Changes:

  • Add new configuration guides under docs/operator-public-documentation/preview/configuration/ and link them from the docs landing pages.
  • Update mkdocs.yml navigation and enable additional Material/MkDocs markdown features (tabs, annotations, etc.).
  • Add cross-references from existing docs (index/FAQ/backup/advanced-configuration) to the new guides and API reference.

Reviewed changes

Copilot reviewed 9 out of 10 changed files in this pull request and generated 10 comments.

| File | Description |
| --- | --- |
| mkdocs.yml | Adds MkDocs Material features, markdown extensions, and a new “Configuration” nav subtree. |
| docs/operator-public-documentation/preview/index.md | Replaces scattered config snippets with links to the new configuration guides. |
| docs/operator-public-documentation/preview/faq.md | Adds FAQ entry pointing to API Reference + Cluster Configuration guide. |
| docs/operator-public-documentation/preview/configuration/tls.md | New TLS guide (modes, rotation/monitoring, troubleshooting, Key Vault notes). |
| docs/operator-public-documentation/preview/configuration/storage.md | New storage guide (classes, sizing, expansion, reclaim policy, encryption). |
| docs/operator-public-documentation/preview/configuration/networking.md | New networking guide (service types, DNS, annotations, NetworkPolicy examples). |
| docs/operator-public-documentation/preview/configuration/resource-management.md | New resource sizing/QoS/monitoring guide and workload profiles. |
| docs/operator-public-documentation/preview/configuration/cluster-configuration.md | New guided CRD overview with YAML + field tables and links to API reference. |
| docs/operator-public-documentation/preview/backup-and-restore.md | Adds a tip linking to the API Reference section for Backup CRDs. |
| docs/operator-public-documentation/preview/advanced-configuration/README.md | Removes duplicated config sections and points readers to the new guides. |
| .gitignore | Ignores MkDocs site/ build output. |

Comment on lines +155 to +156
# View resource usage for DocumentDB pods
kubectl top pods -n default -l app.kubernetes.io/name=documentdb

Copilot AI Mar 6, 2026

The label selector app.kubernetes.io/name=documentdb likely won’t match DocumentDB/CNPG pods created by the operator. The operator’s Service selector uses app: <documentdb-name> (plus cnpg.io/instanceRole=primary), so these examples should use -l app=<cluster-name> (or document the correct labels) to make the commands work.

Suggested change
# View resource usage for DocumentDB pods
kubectl top pods -n default -l app.kubernetes.io/name=documentdb
# View resource usage for primary DocumentDB pods (replace <cluster-name> with your DocumentDB cluster name)
kubectl top pods -n default -l app=<cluster-name>,cnpg.io/instanceRole=primary

Copilot uses AI. Check for mistakes.
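
Before changing the selector, it may help to confirm which labels the pods actually carry. The following is a minimal sketch, assuming the labels named in the comment above (`app=<cluster-name>` and `cnpg.io/instanceRole=primary`). The helper names `documentdb_selector` and `show_pod_labels` are hypothetical, and `my-documentdb` is a placeholder cluster name.

```shell
# Build the label selector described in the review comment.
# Assumption: pods carry app=<cluster-name> and cnpg.io/instanceRole=primary.
documentdb_selector() {
  printf 'app=%s,cnpg.io/instanceRole=primary\n' "$1"
}

# Print every label on the pods in a namespace so the selector can be
# compared with what is really applied (requires cluster access).
show_pod_labels() {
  kubectl get pods -n "${1:-default}" --show-labels
}

# Example (not run here):
#   show_pod_labels default
#   kubectl top pods -n default -l "$(documentdb_selector my-documentdb)"
```

If the `--show-labels` output does not include the expected keys, the documented selector should follow whatever the pods really carry rather than this sketch.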

- **SelfSigned mode**: [cert-manager](https://cert-manager.io/) must be installed in the cluster
- **CertManager mode**: [cert-manager](https://cert-manager.io/) installed, plus a configured Issuer or ClusterIssuer
- **Provided mode**: A Kubernetes TLS Secret containing `tls.crt`, `tls.key`, and `ca.crt`

Copilot AI Mar 6, 2026

Docs state that Provided mode requires ca.crt, but the operator currently only validates presence of tls.crt and tls.key on the Secret. Either update the docs to clarify ca.crt is required for client verification (but not enforced by the operator), or update the operator behavior/validation to match the documented requirement.

Suggested change
- **Provided mode**: A Kubernetes TLS Secret containing `tls.crt`, `tls.key`, and `ca.crt`
- **Provided mode**: A Kubernetes TLS Secret containing `tls.crt` and `tls.key` (required by the operator). Include `ca.crt` if clients need a CA bundle for server certificate verification; the operator does not enforce its presence.
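
The distinction between required and optional keys can be illustrated with a small check. This is a sketch only, assuming the behavior described in the comment above (`tls.crt` and `tls.key` required, `ca.crt` optional); the function name `check_tls_secret_keys` and the Secret name `my-tls-secret` are hypothetical, and this is not the operator's own validation code.

```shell
# Classify a list of Secret data keys passed as arguments.
# Assumption (from the review comment): tls.crt and tls.key are required,
# ca.crt is optional. Prints "ok", "ok-no-ca", or "missing:<key>".
check_tls_secret_keys() {
  keys=" $* "
  for required in tls.crt tls.key; do
    case "$keys" in
      *" $required "*) ;;
      *) printf 'missing:%s\n' "$required"; return 1 ;;
    esac
  done
  case "$keys" in
    *" ca.crt "*) printf 'ok\n' ;;
    *) printf 'ok-no-ca\n' ;;
  esac
}

# Example (not run here; assumes cluster access and jq):
#   check_tls_secret_keys $(kubectl get secret my-tls-secret -n default -o json | jq -r '.data | keys[]')
```

An `ok-no-ca` result would mean the Secret passes the required-key check while clients still need the CA bundle from somewhere else.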

spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: documentdb

Copilot AI Mar 6, 2026

These NetworkPolicy examples select pods via app.kubernetes.io/name: documentdb, but the operator-created Service targets pods labeled app: <documentdb-name> (and CNPG sets cnpg.io/instanceRole). Update the podSelector.matchLabels to use the labels actually applied to the database pods so the policies affect the intended workloads.

Suggested change
app.kubernetes.io/name: documentdb
app: <documentdb-name> # Matches the app label on DocumentDB pods
cnpg.io/instanceRole: primary # Target the primary instance role
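
The reason a mismatched selector silently selects nothing is that `matchLabels` requires every listed key/value pair to be present on the pod. A minimal sketch of that rule, with labels written as comma-separated `key=value` lists; the function name `labels_match` and the example label sets are hypothetical.

```shell
# Return 0 when every key=value pair in the selector ($1) also appears in
# the pod's labels ($2). Both arguments are comma-separated key=value lists.
# An empty selector matches any pod, as an empty podSelector does.
labels_match() {
  selector=$1
  pod_labels=",$2,"
  old_ifs=$IFS
  IFS=,
  for pair in $selector; do
    case "$pod_labels" in
      *",$pair,"*) ;;
      *) IFS=$old_ifs; return 1 ;;
    esac
  done
  IFS=$old_ifs
  return 0
}

# Example: the selector from the original docs against a pod labeled as
# the review comment describes.
if labels_match "app.kubernetes.io/name=documentdb" "app=my-documentdb,cnpg.io/instanceRole=primary"; then
  echo "policy applies"
else
  echo "policy selects no pods"
fi
```

With the label set assumed here, the example takes the second branch, which is the failure mode the comment warns about.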

Comment on lines +296 to +299
  podSelector:
    matchLabels:
      app.kubernetes.io/name: documentdb
  policyTypes:

Copilot AI Mar 6, 2026

This NetworkPolicy uses app.kubernetes.io/name: documentdb in the pod selector; it likely won’t match the operator-managed database pods (which are labeled with app: <documentdb-name>). Adjust the selector (and placeholders) so the policy actually applies to the intended DocumentDB pods.

Comment on lines +322 to +325
  podSelector:
    matchLabels:
      app.kubernetes.io/name: documentdb
  policyTypes:

Copilot AI Mar 6, 2026

The inter-pod NetworkPolicy example uses app.kubernetes.io/name: documentdb for both selectors, but operator-managed CNPG pods are selected by app: <documentdb-name> (and roles via cnpg.io/instanceRole). Update the selectors so the policy actually permits traffic among the correct pods.


```bash
# Extract the CA certificate
kubectl get secret documentdb-gateway-cert-tls -n default \

Copilot AI Mar 6, 2026

The Secret name here is hard-coded as documentdb-gateway-cert-tls, but the operator’s cert-manager Secret name is derived from the DocumentDB name (for my-documentdb, it would be my-documentdb-gateway-cert-tls unless overridden). Update the example to use the correct name/pattern so the command works as-written.

Suggested change
kubectl get secret documentdb-gateway-cert-tls -n default \
kubectl get secret my-documentdb-gateway-cert-tls -n default \

## Additional Resources

- [Public Documentation](https://documentdb.io/documentdb-kubernetes-operator/preview/)
- [Configuration Guides](../configuration/tls.md) — TLS, Storage, Networking, and Resource Management

Copilot AI Mar 6, 2026

This link is labeled “Configuration Guides” and the description lists multiple guides, but it points specifically to the TLS page. Consider linking to a more general configuration landing page (e.g., Cluster Configuration) or renaming the link text to reflect that it goes to TLS.

Suggested change
- [Configuration Guides](../configuration/tls.md) — TLS, Storage, Networking, and Resource Management
- [TLS Configuration Guide](../configuration/tls.md) — TLS modes, certificate rotation, and troubleshooting

Comment on lines +5 to +6
For core configuration topics, see the [Configuration](../configuration/tls.md) guides:

Copilot AI Mar 6, 2026

The link text “Configuration” points directly to the TLS page, which is only one of the configuration guides listed below. Consider linking to a more general entry point (e.g., Cluster Configuration) or adjusting the link text to “TLS configuration” to avoid implying it’s a configuration overview.

retentionDays: 30
```

1. Three instances provide Guaranteed QoS with one primary and two replicas for automatic failover.

Copilot AI Mar 6, 2026

The note implies that running 3 instances results in Guaranteed QoS, but QoS class is determined by CPU/memory requests and limits (requests==limits => Guaranteed), not by replica count. Consider rewording this note to describe HA/failover benefits, and keep QoS guidance tied to resource requests/limits.

Suggested change
1. Three instances provide Guaranteed QoS with one primary and two replicas for automatic failover.
1. Three instances provide high availability with one primary and two replicas for automatic failover.
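
To make the QoS rule concrete, here is a small sketch of how the class follows from requests and limits alone, independent of replica count. It assumes a single-container pod and compares values as plain strings (so `500m` and `0.5` would be treated as different); the function name `qos_class` is hypothetical.

```shell
# Classify a single-container pod's QoS class from its resource settings.
# Arguments: cpu_request cpu_limit memory_request memory_limit
# Pass an empty string for an unset value. An unset request falls back to
# the limit, mirroring how Kubernetes defaults requests.
qos_class() {
  cpu_req=$1 cpu_lim=$2 mem_req=$3 mem_lim=$4
  if [ -z "$cpu_req$cpu_lim$mem_req$mem_lim" ]; then
    echo BestEffort
    return
  fi
  : "${cpu_req:=$cpu_lim}"
  : "${mem_req:=$mem_lim}"
  if [ -n "$cpu_lim" ] && [ -n "$mem_lim" ] &&
     [ "$cpu_req" = "$cpu_lim" ] && [ "$mem_req" = "$mem_lim" ]; then
    echo Guaranteed
  else
    echo Burstable
  fi
}

qos_class 2 2 4Gi 4Gi   # requests equal limits
qos_class 1 2 4Gi 4Gi   # CPU request below its limit
```

The instance count never appears as an input, which is the point of the suggested rewording: replicas give failover, while requests and limits decide the QoS class.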

Comment on lines +22 to +23
| `Disabled` (default) | No TLS encryption | Development and testing only |
| `SelfSigned` | Automatic certificates via cert-manager with a self-signed CA | Development, testing, and environments without external PKI (Public Key Infrastructure) |

Copilot AI Mar 6, 2026

The tls.gateway.mode setting documents Disabled as the default ("Disabled" (default)), which means new DocumentDB clusters are created with TLS turned off unless users explicitly opt in. This default leaves database credentials and data in transit unencrypted by default, so anyone with network access to the gateway service (e.g., via a LoadBalancer or compromised pod) can passively capture or tamper with traffic. Consider making a secure mode (such as SelfSigned or another TLS-enabled option) the default and requiring an explicit opt-out for non-production use, so production deployments are not left without transport encryption by mistake.

Suggested change
| `Disabled` (default) | No TLS encryption | Development and testing only |
| `SelfSigned` | Automatic certificates via cert-manager with a self-signed CA | Development, testing, and environments without external PKI (Public Key Infrastructure) |
| `Disabled` | No TLS encryption (not recommended; must be explicitly opted into) | Development and testing only, never production |
| `SelfSigned` (default) | Automatic certificates via cert-manager with a self-signed CA | Development, testing, and secure-by-default environments without external PKI (Public Key Infrastructure) |

wentingwu000 and others added 6 commits March 6, 2026 16:13
- Reorganize TLS docs: move supported modes and prerequisites into each mode tab
- Add description, best-for, and prerequisite notes to each TLS mode
- Simplify CertManager tab with clear workflow and external links
- Replace inline field reference tables with API Reference links
- Replace Azure Key Vault section with link to setup guide
- Add cert-manager install links pointing to index.md
- Add documentation testing instructions to development-environment.md
- Add cross-reference from CONTRIBUTING.md to dev environment guide

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Wenting Wu <wentingwu@microsoft.com>
- Remove redundant content, internal implementation details, and generic Kubernetes info
- Networking: consolidate into nested tabs (Internal/External with cloud sub-tabs)
- Resource management: remove QoS deep-dive, internal operator resources, monitoring
- Storage: simplify cloud tabs, remove PV security and block storage sections
- TLS: fix ca.crt requirement (optional per source code), trim troubleshooting
- All pages: replace inline field tables with API Reference links

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Wenting Wu <wentingwu@microsoft.com>
Signed-off-by: Wenting Wu <wentingwu@microsoft.com>
- Standardize all config docs: merge intro into Overview, add spec
  quick-look with kind: DocumentDB, link to API reference
- Storage: reorder sections (PVC Sizing > Reclaim Policy > Storage
  Classes > Disk Encryption), add PV/PVC concept links, link to
  backup-and-restore for retained PV recovery
- Storage: clarify PVC resize not yet supported (see documentdb#298), remove
  unverified resize-without-downtime claim and generic sizing table
- Storage: explain StorageClass concept with link, show kubectl command
  to find default, clarify default behavior
- TLS: add spec quick-look in Overview
- Networking: add spec quick-look in Overview

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Wenting Wu <wentingwu@microsoft.com>
Signed-off-by: Wenting Wu <wentingwu@microsoft.com>
Signed-off-by: Wenting Wu <wentingwu@microsoft.com>
3 participants