Skip to content

kpatch-build: incorrect or missing correlation of references to .rodata.*.str1_1 leads to kernel crashes #1225

@euspectre

Description

@euspectre

kpatch: 0.9.4
OS: CentOS 8 x86_64
Kernel: 4.18.0-305.19.1.el8_4.x86_64

While investigating an unrelated bug, I found that kpatch-build/create-diff-object probably does not correlate references to strings from .rodata.*.str1.1 properly and that causes difficult to debug kernel crashes.

Consider the following simple patch for sctp:

diff --git a/net/sctp/sysctl.c b/net/sctp/sysctl.c
index bfe3c26d1..9f9691063 100644
--- a/net/sctp/sysctl.c
+++ b/net/sctp/sysctl.c
@@ -370,6 +370,9 @@ static int proc_sctp_do_hmac_alg(struct ctl_table *ctl, int write,
 
 	memset(&tbl, 0, sizeof(struct ctl_table));
 
+	pr_info("[TEST] %s\n",
+		net->sctp.sctp_hmac_alg ? : none);
+
 	if (write) {
 		tbl.data = tmp;
 		tbl.maxlen = sizeof(tmp);

Note that, the locations of strings "md5" and "sha1" in .rodata.proc_sctp_do_hmac_alg.str1.1 changed in the patched code, because the printk format string was added:

[root@8176a445a8da centos8]# readelf -p .rodata.proc_sctp_do_hmac_alg.str1.1 kpatch/tmp/orig/net/sctp/sysctl.o       
String dump of section '.rodata.proc_sctp_do_hmac_alg.str1.1':
  [     0]  none
  [     5]  md5
  [     9]  sha1

[root@8176a445a8da centos8]# readelf -p .rodata.proc_sctp_do_hmac_alg.str1.1 kpatch/tmp/patched/net/sctp/sysctl.o
String dump of section '.rodata.proc_sctp_do_hmac_alg.str1.1':
  [     0]  none
  [     6]  6sctp: [TEST] %s\n
  [    18]  md5
  [    1c]  sha1

The strings are used as follows in proc_sctp_do_hmac_alg():

#ifdef CONFIG_CRYPTO_MD5
		if (!strncmp(tmp, "md5", 3)) {
			net->sctp.sctp_hmac_alg = "md5";
			changed = true;
		}
#endif
#ifdef CONFIG_CRYPTO_SHA1
		if (!strncmp(tmp, "sha1", 4)) {
			net->sctp.sctp_hmac_alg = "sha1";
			changed = true;
		}
#endif

Although this part of the source code remains unchanged, the binary code of the patched file is different because it refers to different locations in .rodata.proc_sctp_do_hmac_alg.str1.1.

The patch module has its own "md5" and "sha1" strings now, and the patched proc_sctp_do_hmac_alg() refers to them rather than to the original ones.

This is problematic because the addresses of the strings are stored in net->sctp.sctp_hmac_alg for later use. If we unload the patch module at the appropriate moment, the kernel will still keep the dangling addresses of these strings in the patch module and will crash when it tries to access them.

[root@centos8 temp]# kpatch load ./kpatch-sctp-str.ko 
loading patch module: ./kpatch-sctp-str.ko

[root@centos8 temp]# cat /sys/module/kpatch_sctp_str/sections/.rodata.proc_sctp_do_hmac_alg.str1.1
ffffffffc075f5f5

[root@centos8 temp]# modprobe sctp

[root@centos8 temp]# cat /proc/sys/net/sctp/cookie_hmac_alg
sha1

[root@centos8 temp]# echo md5 > /proc/sys/net/sctp/cookie_hmac_alg

[root@centos8 temp]# cat /proc/sys/net/sctp/cookie_hmac_alg
md5

[root@centos8 temp]# kpatch unload --all
disabling patch module: kpatch_sctp_str
waiting (up to 15 seconds) for patch transition to complete...
transition complete (2 seconds)
unloading patch module: kpatch_sctp_str

[root@centos8 temp]# cat /proc/sys/net/sctp/cookie_hmac_alg
[kernel crashed]

From vmcore-dmesg.txt:

[  143.827410] kpatch_sctp_str: loading out-of-tree module taints kernel.
[  143.827433] kpatch_sctp_str: tainting kernel with TAINT_LIVEPATCH
[  143.827474] kpatch_sctp_str: module verification failed: signature and/or required key missing - tainting kernel
[  143.827701] livepatch: enabling patch 'kpatch_sctp_str'
[  143.827752] livepatch: 'kpatch_sctp_str': starting patching transition
[  143.828698] livepatch: 'kpatch_sctp_str': patching complete
[  158.874261] livepatch: applying patch 'kpatch_sctp_str' to loading module 'sctp'
[  158.877213] sctp: Hash tables configured (bind 256/256)
[  189.311978] sctp: [TEST] sha1
[  189.312009] sctp: [TEST] sha1
[  212.399190] sctp: [TEST] sha1
[  232.559653] sctp: [TEST] md5
[  232.559677] sctp: [TEST] md5
[  250.031148] livepatch: 'kpatch_sctp_str': starting unpatching transition
[  251.565087] livepatch: 'kpatch_sctp_str': unpatching complete
[  260.230959] BUG: unable to handle kernel paging request at ffffffffc075f60d
[  260.230983] PGD 3d013067 P4D 3d013067 PUD 3d015067 PMD 16ea60067 PTE 0
[  260.230997] Oops: 0000 [#1] SMP PTI
[  260.231006] CPU: 1 PID: 5455 Comm: cat Kdump: loaded Tainted: G           OE K  --------- -  - 4.18.0-305.19.1.el8_4.x86_64 #1
[  260.231028] Hardware name: Virtuozzo KVM, BIOS 1.11.0-2.vz7.2 04/01/2014
[  260.231052] RIP: 0010:strlen+0x0/0x20
[  260.231061] Code: 48 89 f9 74 09 48 83 c1 01 80 39 00 75 f7 31 d2 44 0f b6 04 16 44 88 04 11 48 83 c2 01 45 84 c0 75 ee c3 0f 1f 80 00 00 00 00 <80> 3f 00 74 10 48 89 f8 48 83 c0 01 80 38 00 75 f7 48 29 f8 c3 31
[  260.231095] RSP: 0018:ffff9d9e0127fde8 EFLAGS: 00010282
[  260.231116] RAX: ffffffffc08cd311 RBX: ffff9d9e0127fe00 RCX: 0000000000000000
[  260.231130] RDX: 00007f67ac256000 RSI: 0000000000000000 RDI: ffffffffc075f60d
[  260.231144] RBP: ffffffffffffffea R08: ffff9d9e0127ff08 R09: 0000000000000000
[  260.231158] R10: ffff9d9e0127fe80 R11: 0000000000000000 R12: ffffffffaf9a53c0
[  260.231171] R13: ffff9d9e0127fe80 R14: 00007f67ac256000 R15: ffff8e2f2e5cc1c0
[  260.231185] FS:  00007f67b93f5540(0000) GS:ffff8e2f3bd00000(0000) knlGS:0000000000000000
[  260.231200] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  260.231214] CR2: ffffffffc075f60d CR3: 000000017976c004 CR4: 00000000001706e0
[  260.231230] Call Trace:
[  260.231254]  proc_sctp_do_hmac_alg+0x15b/0x190 [sctp]
[  260.231272]  proc_sys_call_handler+0x1a5/0x1c0
[  260.231287]  vfs_read+0x91/0x140
[  260.231295]  ksys_read+0x4f/0xb0
[  260.231304]  do_syscall_64+0x5b/0x1a0
[  260.231315]  entry_SYSCALL_64_after_hwframe+0x65/0xca
[  260.231327] RIP: 0033:0x7f67b931d0a5
[  260.231336] Code: fe ff ff 50 48 8d 3d f2 04 0a 00 e8 f5 fd 01 00 0f 1f 44 00 00 f3 0f 1e fa 48 8d 05 f5 5f 0d 00 8b 00 85 c0 75 0f 31 c0 0f 05 <48> 3d 00 f0 ff ff 77 53 c3 66 90 41 54 49 89 d4 55 48 89 f5 53 89
[  260.231370] RSP: 002b:00007ffda1b00468 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[  260.231835] RAX: ffffffffffffffda RBX: 0000000000020000 RCX: 00007f67b931d0a5
[  260.232289] RDX: 0000000000020000 RSI: 00007f67ac256000 RDI: 0000000000000003
[  260.232748] RBP: 00007f67ac256000 R08: 00000000ffffffff R09: 0000000000000000
[  260.233201] R10: 0000000000000022 R11: 0000000000000246 R12: 00007f67ac256000
[  260.233672] R13: 0000000000000003 R14: 0000000000000fff R15: 0000000000020000
[  260.234288] Modules linked in: sctp nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nf_tables_set nft_chain_nat ip6table_nat ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_mangle iptable_raw iptable_security ip_set nf_tables nfnetlink ip6table_filter ip6_tables iptable_filter intel_rapl_msr intel_rapl_common kvm_intel kvm snd_hda_codec_generic ledtrig_audio irqbypass snd_hda_intel crct10dif_pclmul snd_intel_dspcfg soundwire_intel qxl soundwire_generic_allocation crc32_pclmul drm_ttm_helper ttm snd_soc_core snd_compress drm_kms_helper soundwire_cadence soundwire_bus syscopyarea sysfillrect snd_hda_codec sysimgblt fb_sys_fops snd_hda_core snd_hwdep snd_seq drm snd_seq_device iTCO_wdt iTCO_vendor_support snd_pcm ghash_clmulni_intel pcspkr snd_timer snd i2c_i801 soundcore virtio_balloon lpc_ich ip_tables xfs libcrc32c sr_mod cdrom sd_mod t10_pi sg ahci libahci
[  260.234311]  libata crc32c_intel virtio_net net_failover serio_raw failover virtio_console dm_mirror dm_region_hash dm_log dm_mod [last unloaded: kpatch_sctp_str]
[  260.239265] CR2: ffffffffc075f60d

Note that ffffffffc075f60d mentioned in the "BUG" message is .rodata.proc_sctp_do_hmac_alg.str1.1+0x18, the address of "md5" string in the patch module.

As you can see, it is quite easy to miss such string references when preparing patches. They are relatively rare in the kernel but can still be found.

I guess, create-diff-object could detect that a string did not change in the patched object and replace the relocation with the one for the original data. Or, perhaps, you have better ideas how to fix it.

Metadata

Metadata

Assignees

Labels

No labels
No labels

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions