Host: MPC/SBN replica (RHEL 8.6, 251 GB RAM, HDD)
Status: Deferred. Memory fragmentation prevented full allocation on 2026-02-08. Best attempted during a maintenance reboot, before memory has had a chance to fragment.
With shared_buffers = 64GB, PostgreSQL's buffer pool spans 16.7 million
4 KB pages, and each page needs its own TLB (Translation Lookaside Buffer)
entry when it is accessed. Huge pages use 2 MB pages instead, reducing the
count to 32,768, so each TLB entry covers 512 times more memory and TLB
misses become much rarer. Expected benefit: 5-10% reduction in CPU overhead
for TLB-heavy workloads (index scans, replication apply).
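The page counts above follow from simple division. A minimal sketch of that arithmetic in bash, where `page_count` is a hypothetical helper name used only for illustration:

```shell
# page_count <size_gb> <page_size_kb>
# Prints how many pages of the given size are needed to map size_gb of memory.
# Hypothetical helper, not part of PostgreSQL or any system tool.
page_count() {
    local size_gb=$1 page_kb=$2
    echo $(( size_gb * 1024 * 1024 / page_kb ))
}

page_count 64 4      # 4 KB pages for a 64 GB buffer pool
page_count 64 2048   # 2 MB huge pages for the same pool
```

The two calls print 16777216 and 32768, the figures quoted in the paragraph.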
id -g postgres
# Note this value; it is used as <postgres_gid> in the sysctl file below.

# Pages needed = shared_buffers in MB / 2 MB per page, plus a 5% margin
# For 64 GB: 64 * 1024 / 2 * 1.05 = 34,406 → round up to 34408
# For 58 GB: 58 * 1024 / 2 * 1.05 = 31,181 → round up to 31184

cat > /etc/sysctl.d/99-postgresql-hugepages.conf << EOF
vm.nr_hugepages = 34408
vm.hugetlb_shm_group = <postgres_gid>
EOF

sysctl --system

grep HugePages_Total /proc/meminfo

If HugePages_Total matches the requested count, proceed. If it is lower,
memory is fragmented; see Troubleshooting below.
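The sizing rule (2 MB pages plus a 5% margin) can be sketched as a small bash function. `hugepages_needed` is a hypothetical name, and rounding up to a multiple of 8 is an assumption inferred from the 34408 figure rather than a kernel requirement:

```shell
# hugepages_needed <shared_buffers_gb>
# Prints a vm.nr_hugepages value: 2 MB pages covering shared_buffers,
# plus a 5% margin, rounded up to a multiple of 8 (assumed convention).
hugepages_needed() {
    local gb=$1
    local base=$(( gb * 1024 / 2 ))               # pages covering shared_buffers
    local padded=$(( (base * 105 + 99) / 100 ))   # add 5%, rounding up
    echo $(( (padded + 7) / 8 * 8 ))              # round up to a multiple of 8
}

hugepages_needed 64
```

For 64 GB this prints 34408, the value written to the sysctl file. PostgreSQL's shared memory segment is somewhat larger than shared_buffers alone, which is what the margin is meant to absorb.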
In postgresql.conf:
huge_pages = 'on'
Use 'on' (not 'try') so misconfiguration fails visibly at startup
rather than silently falling back.
If shared_buffers doesn't fit in the available huge pages, reduce it:
shared_buffers = '58GB' # if only ~30,000 pages available
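systemctl restart postgresql-15

grep HugePages /proc/meminfo

HugePages_Free should drop by approximately shared_buffers / 2 MB as the buffers are touched; until then, the pages PostgreSQL has claimed are counted under HugePages_Rsvd.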
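Reading the counters by eye is error-prone, so the check can be sketched as a bash function that summarizes them. `hugepage_summary` is a hypothetical helper; it only parses the standard HugePages_* and Hugepagesize fields of /proc/meminfo and makes no PostgreSQL-specific calls:

```shell
# hugepage_summary [meminfo_file]
# Prints total, free, reserved, and in-use huge pages, plus in-use size in MB.
# Defaults to /proc/meminfo; a file argument allows checking a saved copy.
hugepage_summary() {
    local meminfo=${1:-/proc/meminfo}
    awk '
        /^HugePages_Total:/ { total = $2 }
        /^HugePages_Free:/  { free = $2 }
        /^HugePages_Rsvd:/  { rsvd = $2 }
        /^Hugepagesize:/    { size_kb = $2 }
        END {
            used = total - free   # pages actually faulted in
            printf "total=%d free=%d rsvd=%d used=%d used_mb=%d\n",
                   total, free, rsvd, used, used * size_kb / 1024
        }
    ' "$meminfo"
}

# Run against the live system when the file is available.
if [ -r /proc/meminfo ]; then
    hugepage_summary
fi
```

Once the buffer pool has been fully touched, `used_mb` should be close to the configured shared_buffers.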
The kernel needs contiguous 2 MB blocks. After running for weeks/months, memory becomes fragmented and the kernel can't assemble enough contiguous blocks.
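Fragmentation can be gauged before requesting pages by reading /proc/buddyinfo, which lists free blocks per allocation order. A rough sketch, assuming x86_64 with 4 KB base pages and 11 order columns (orders 0 to 10); `free_2mb_blocks` is a hypothetical helper name:

```shell
# free_2mb_blocks [buddyinfo_file]
# Prints how many 2 MB blocks are free right now across all zones.
# Fields 5..15 of each line are free-block counts for orders 0..10;
# order 9 is one 2 MB block and order 10 (4 MB) holds two of them.
free_2mb_blocks() {
    local buddyinfo=${1:-/proc/buddyinfo}
    awk '{ n += $14 + 2 * $15 } END { print n + 0 }' "$buddyinfo"
}

# Run against the live system when the file is readable.
if [ -r /proc/buddyinfo ]; then
    free_2mb_blocks
fi
```

The result is only a lower bound on what is immediately available: the kernel can also reclaim cache and compact memory when vm.nr_hugepages is raised, so the allocation may yield more pages than this number suggests.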
Option A (best): enable at boot time
vm.nr_hugepages=34408 is already set in /etc/sysctl.d/99-postgresql-hugepages.conf (created earlier).
At the next reboot, the pages are reserved before memory has a chance to fragment.
Option B: Drop caches and retry
systemctl stop postgresql-15
sync
echo 3 > /proc/sys/vm/drop_caches
sysctl -w vm.nr_hugepages=34408
grep HugePages_Total /proc/meminfo
# If successful, start PostgreSQL:
systemctl start postgresql-15

Option C: Reduce to what's available
On 2026-02-08, the kernel could allocate 30,205 of the 34,408 requested pages (59 GB). Set shared_buffers = '58GB' and vm.nr_hugepages = 30205 to match.
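The 58 GB figure comes from converting the allocated page count back to whole gigabytes. A minimal bash sketch of that conversion, with `fit_shared_buffers` as a hypothetical helper name; rounding down to a whole GB is an assumption about how the value was chosen:

```shell
# fit_shared_buffers <available_huge_pages>
# Prints the largest whole-GB shared_buffers that fits in the given number
# of 2 MB pages, followed by the number of pages left over as headroom.
fit_shared_buffers() {
    local pages=$1
    local gb=$(( pages * 2 / 1024 ))        # round down to whole GB
    local spare=$(( pages - gb * 512 ))     # 512 huge pages per GB
    echo "$gb $spare"
}

fit_shared_buffers 30205
```

For 30,205 pages this prints `58 509`: shared_buffers = '58GB' with 509 pages (about 1 GB) left for the rest of PostgreSQL's shared memory segment.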
# If PostgreSQL fails to start, check the most recent log file
tail -n 20 /var/lib/pgsql/15/data/log/$(ls -t /var/lib/pgsql/15/data/log/ | head -1)
# Temporarily fall back
# In postgresql.conf: huge_pages = 'try'
systemctl start postgresql-15

Common causes:
- nr_hugepages too low for shared_buffers
- hugetlb_shm_group doesn't match the postgres GID
- Huge pages reserved but locked by another process
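The first two causes can be checked mechanically. The sketch below is a hypothetical bash helper, `diagnose_hugepages`, that takes the relevant values as arguments so it can be tried without touching the system; the sample numbers in the call are illustrative, and 26 is only assumed as the postgres GID:

```shell
# diagnose_hugepages <needed_pages> <free_pages> <shm_group> <postgres_gid>
# Reports a page shortfall or a group mismatch; returns non-zero on any problem.
diagnose_hugepages() {
    local needed=$1 free=$2 shm_group=$3 pg_gid=$4
    local status=0
    if [ "$free" -lt "$needed" ]; then
        echo "too few free huge pages: $free < $needed"
        status=1
    fi
    if [ "$shm_group" != "$pg_gid" ]; then
        echo "vm.hugetlb_shm_group ($shm_group) != postgres gid ($pg_gid)"
        status=1
    fi
    if [ "$status" -eq 0 ]; then
        echo "ok"
    fi
    return "$status"
}

# Sample values only: 64 GB needs 32768 pages, 30205 are free, group unset.
diagnose_hugepages 32768 30205 0 26 || true

# On the real host the inputs could be gathered like this:
# diagnose_hugepages 32768 \
#     "$(awk '/^HugePages_Free:/ {print $2}' /proc/meminfo)" \
#     "$(sysctl -n vm.hugetlb_shm_group)" "$(id -g postgres)"
```

The third cause, pages held by another process, is not covered here; a HugePages_Free value well below HugePages_Total while PostgreSQL is stopped is the usual hint.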
# If postmaster.pid exists but postgres isn't running
# (confirm first that no postgres process is left):
rm /var/lib/pgsql/15/data/postmaster.pid
systemctl start postgresql-15

What happened on 2026-02-08:
- Set nr_hugepages = 34408; the kernel allocated only 30,205 (fragmentation)
- Set huge_pages = 'on', shared_buffers = '64GB'; PostgreSQL failed to start
- Changed to huge_pages = 'try'; PostgreSQL started but was killed by a monitoring tool querying the missing pg_buffercache/pg_stat_statements extensions, then failed to restart because huge pages were still reserving 59 GB
- Fixed by: sysctl -w vm.nr_hugepages=0, systemctl reset-failed, then systemctl start postgresql-15
- Deferred huge pages to the next maintenance reboot