forked from torvalds/linux
c9f2 (5.10.x) for MCom-03 #1
Open: ghost wants to merge 9 commits into c9f2-std-def from c9f2-5.10.x-mcom03
The DTB files are taken from their U-Boot repo because kernel DTBs have unresolved problems.
ghost pushed a commit that referenced this pull request on Nov 23, 2021
commit 8253a34 upstream.

When passing 'phys' in the devicetree to describe the USB PHY phandle (which is the recommended way according to Documentation/devicetree/bindings/usb/ci-hdrc-usb2.txt), the following NULL pointer dereference is observed on i.MX7 and i.MX8MM:

[ 1.489344] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000098
[ 1.498170] Mem abort info:
[ 1.500966] ESR = 0x96000044
[ 1.504030] EC = 0x25: DABT (current EL), IL = 32 bits
[ 1.509356] SET = 0, FnV = 0
[ 1.512416] EA = 0, S1PTW = 0
[ 1.515569] FSC = 0x04: level 0 translation fault
[ 1.520458] Data abort info:
[ 1.523349] ISV = 0, ISS = 0x00000044
[ 1.527196] CM = 0, WnR = 1
[ 1.530176] [0000000000000098] user address but active_mm is swapper
[ 1.536544] Internal error: Oops: 96000044 [#1] PREEMPT SMP
[ 1.542125] Modules linked in:
[ 1.545190] CPU: 3 PID: 7 Comm: kworker/u8:0 Not tainted 5.14.0-dirty #3
[ 1.551901] Hardware name: Kontron i.MX8MM N801X S (DT)
[ 1.557133] Workqueue: events_unbound deferred_probe_work_func
[ 1.562984] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO BTYPE=--)
[ 1.568998] pc : imx7d_charger_detection+0x3f0/0x510
[ 1.573973] lr : imx7d_charger_detection+0x22c/0x510

This happens because the charger functions check for the PHY inside the imx_usbmisc_data structure (data->usb_phy), but the chipidea core stores the usb_phy passed via 'phys' inside 'struct ci_hdrc' (ci->usb_phy) instead. As a result, data->usb_phy is NULL when imx7d_charger_detection() dereferences it.

Fix it by also searching for 'phys' in case 'fsl,usbphy' is not found.

Tested on an imx7s-warp board.

Fixes: 746f316 ("usb: chipidea: introduce imx7d USB charger detection")
Cc: stable@vger.kernel.org
Reported-by: Heiko Thiery <heiko.thiery@gmail.com>
Tested-by: Frieder Schrempf <frieder.schrempf@kontron.de>
Reviewed-by: Frieder Schrempf <frieder.schrempf@kontron.de>
Acked-by: Peter Chen <peter.chen@kernel.org>
Signed-off-by: Fabio Estevam <festevam@gmail.com>
Link: https://lore.kernel.org/r/20210921113754.767631-1-festevam@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
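The fallback lookup described in this commit can be illustrated with a small userspace sketch. It is not the driver code: `struct prop`, `struct node`, `find_prop()` and `lookup_usb_phy()` are hypothetical stand-ins for a devicetree node and its phandle lookup, used only to show the "try 'fsl,usbphy', then 'phys'" order.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a devicetree node: a flat list of properties. */
struct prop { const char *name; const char *target; };
struct node { const struct prop *props; size_t nprops; };

/* Return the target of the named property, or NULL when it is absent. */
const char *find_prop(const struct node *n, const char *name)
{
    for (size_t i = 0; i < n->nprops; i++)
        if (strcmp(n->props[i].name, name) == 0)
            return n->props[i].target;
    return NULL;
}

/* Prefer the 'fsl,usbphy' property and fall back to 'phys' when it is missing. */
const char *lookup_usb_phy(const struct node *n)
{
    const char *phy = find_prop(n, "fsl,usbphy");

    if (!phy)
        phy = find_prop(n, "phys");
    return phy; /* may still be NULL, so callers must check before use */
}
```

A node that only carries 'phys' now resolves to a PHY, while a node with neither property still yields NULL and has to be handled by the caller.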
ghost pushed a commit that referenced this pull request on Nov 23, 2021
commit bb8958d upstream. On SiFive Unmatched, I recently fell onto the following BUG when booting: [ 0.000000] ftrace: allocating 36610 entries in 144 pages [ 0.000000] Oops - illegal instruction [#1] [ 0.000000] Modules linked in: [ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 5.13.1+ #5 [ 0.000000] Hardware name: SiFive HiFive Unmatched A00 (DT) [ 0.000000] epc : riscv_cpuid_to_hartid_mask+0x6/0xae [ 0.000000] ra : __sbi_rfence_v02+0xc8/0x10a [ 0.000000] epc : ffffffff80007240 ra : ffffffff80009964 sp : ffffffff81803e10 [ 0.000000] gp : ffffffff81a1ea70 tp : ffffffff8180f500 t0 : ffffffe07fe30000 [ 0.000000] t1 : 0000000000000004 t2 : 0000000000000000 s0 : ffffffff81803e60 [ 0.000000] s1 : 0000000000000000 a0 : ffffffff81a22238 a1 : ffffffff81803e10 [ 0.000000] a2 : 0000000000000000 a3 : 0000000000000000 a4 : 0000000000000000 [ 0.000000] a5 : 0000000000000000 a6 : ffffffff8000989c a7 : 0000000052464e43 [ 0.000000] s2 : ffffffff81a220c8 s3 : 0000000000000000 s4 : 0000000000000000 [ 0.000000] s5 : 0000000000000000 s6 : 0000000200000100 s7 : 0000000000000001 [ 0.000000] s8 : ffffffe07fe04040 s9 : ffffffff81a22c80 s10: 0000000000001000 [ 0.000000] s11: 0000000000000004 t3 : 0000000000000001 t4 : 0000000000000008 [ 0.000000] t5 : ffffffcf04000808 t6 : ffffffe3ffddf188 [ 0.000000] status: 0000000200000100 badaddr: 0000000000000000 cause: 0000000000000002 [ 0.000000] [<ffffffff80007240>] riscv_cpuid_to_hartid_mask+0x6/0xae [ 0.000000] [<ffffffff80009474>] sbi_remote_fence_i+0x1e/0x26 [ 0.000000] [<ffffffff8000b8f4>] flush_icache_all+0x12/0x1a [ 0.000000] [<ffffffff8000666c>] patch_text_nosync+0x26/0x32 [ 0.000000] [<ffffffff8000884e>] ftrace_init_nop+0x52/0x8c [ 0.000000] [<ffffffff800f051e>] ftrace_process_locs.isra.0+0x29c/0x360 [ 0.000000] [<ffffffff80a0e3c6>] ftrace_init+0x80/0x130 [ 0.000000] [<ffffffff80a00f8c>] start_kernel+0x5c4/0x8f6 [ 0.000000] ---[ end trace f67eb9af4d8d492b ]--- [ 0.000000] Kernel panic - not syncing: Attempted to kill the idle task! 
[ 0.000000] ---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]--- While ftrace is looping over a list of addresses to patch, it always failed when patching the same function: riscv_cpuid_to_hartid_mask. Looking at the backtrace, the illegal instruction is encountered in this same function. However, patch_text_nosync, after patching the instructions, calls flush_icache_range. But looking at what happens in this function: flush_icache_range -> flush_icache_all -> sbi_remote_fence_i -> __sbi_rfence_v02 -> riscv_cpuid_to_hartid_mask The icache and dcache of the current cpu are never synchronized between the patching of riscv_cpuid_to_hartid_mask and calling this same function. So fix this by flushing the current cpu's icache before asking for the other cpus to do the same. Signed-off-by: Alexandre Ghiti <alex@ghiti.fr> Fixes: fab957c ("RISC-V: Atomic and Locking Code") Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
ghost pushed a commit that referenced this pull request on Nov 23, 2021
[ Upstream commit 30e29a9 ] In prealloc_elems_and_freelist(), the multiplication to calculate the size passed to bpf_map_area_alloc() could lead to an integer overflow. As a result, out-of-bounds write could occur in pcpu_freelist_populate() as reported by KASAN: [...] [ 16.968613] BUG: KASAN: slab-out-of-bounds in pcpu_freelist_populate+0xd9/0x100 [ 16.969408] Write of size 8 at addr ffff888104fc6ea0 by task crash/78 [ 16.970038] [ 16.970195] CPU: 0 PID: 78 Comm: crash Not tainted 5.15.0-rc2+ #1 [ 16.970878] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014 [ 16.972026] Call Trace: [ 16.972306] dump_stack_lvl+0x34/0x44 [ 16.972687] print_address_description.constprop.0+0x21/0x140 [ 16.973297] ? pcpu_freelist_populate+0xd9/0x100 [ 16.973777] ? pcpu_freelist_populate+0xd9/0x100 [ 16.974257] kasan_report.cold+0x7f/0x11b [ 16.974681] ? pcpu_freelist_populate+0xd9/0x100 [ 16.975190] pcpu_freelist_populate+0xd9/0x100 [ 16.975669] stack_map_alloc+0x209/0x2a0 [ 16.976106] __sys_bpf+0xd83/0x2ce0 [...] The possibility of this overflow was originally discussed in [0], but was overlooked. Fix the integer overflow by changing elem_size to u64 from u32. [0] https://lore.kernel.org/bpf/728b238e-a481-eb50-98e9-b0f430ab01e7@gmail.com/ Fixes: 557c0c6 ("bpf: convert stackmap to pre-allocation") Signed-off-by: Tatsuhiko Yasumatsu <th.yasumatsu@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210930135545.173698-1-th.yasumatsu@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
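The overflow is easy to reproduce outside the kernel. The sketch below uses two hypothetical helpers, `alloc_size_u32()` and `alloc_size_u64()`, to contrast a 32-bit multiplication with one whose operand has been widened first, which is the effect of declaring elem_size as u64. It assumes a platform where `unsigned int` is 32 bits wide.

```c
#include <stdint.h>

/* The product is computed in 32 bits and wraps modulo 2^32,
 * as happens when elem_size is a u32. */
uint32_t alloc_size_u32(uint32_t elem_size, uint32_t max_entries)
{
    return elem_size * max_entries;
}

/* Widening one operand first makes the whole multiplication 64-bit,
 * so the true size is preserved. */
uint64_t alloc_size_u64(uint32_t elem_size, uint32_t max_entries)
{
    return (uint64_t)elem_size * max_entries;
}
```

With elem_size = 0x10000 and max_entries = 0x10000, the 32-bit version yields 0 while the 64-bit version yields 0x100000000, so an allocation sized by the first result would be far too small for the entries written afterwards.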
ghost pushed a commit that referenced this pull request on Nov 23, 2021
[ Upstream commit 560ee19 ] syzbot reported another NULL deref in fifo_set_limit() [1] I could repro the issue with : unshare -n tc qd add dev lo root handle 1:0 tbf limit 200000 burst 70000 rate 100Mbit tc qd replace dev lo parent 1:0 pfifo_fast tc qd change dev lo root handle 1:0 tbf limit 300000 burst 70000 rate 100Mbit pfifo_fast does not have a change() operation. Make fifo_set_limit() more robust about this. [1] BUG: kernel NULL pointer dereference, address: 0000000000000000 PGD 1cf99067 P4D 1cf99067 PUD 7ca49067 PMD 0 Oops: 0010 [#1] PREEMPT SMP KASAN CPU: 1 PID: 14443 Comm: syz-executor959 Not tainted 5.15.0-rc3-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:0x0 Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6. RSP: 0018:ffffc9000e2f7310 EFLAGS: 00010246 RAX: dffffc0000000000 RBX: ffffffff8d6ecc00 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffff888024c27910 RDI: ffff888071e34000 RBP: ffff888071e34000 R08: 0000000000000001 R09: ffffffff8fcfb947 R10: 0000000000000001 R11: 0000000000000000 R12: ffff888024c27910 R13: ffff888071e34018 R14: 0000000000000000 R15: ffff88801ef74800 FS: 00007f321d897700(0000) GS:ffff8880b9d00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffffffffffffd6 CR3: 00000000722c3000 CR4: 00000000003506e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: fifo_set_limit net/sched/sch_fifo.c:242 [inline] fifo_set_limit+0x198/0x210 net/sched/sch_fifo.c:227 tbf_change+0x6ec/0x16d0 net/sched/sch_tbf.c:418 qdisc_change net/sched/sch_api.c:1332 [inline] tc_modify_qdisc+0xd9a/0x1a60 net/sched/sch_api.c:1634 rtnetlink_rcv_msg+0x413/0xb80 net/core/rtnetlink.c:5572 netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2504 netlink_unicast_kernel net/netlink/af_netlink.c:1314 [inline] netlink_unicast+0x533/0x7d0 
net/netlink/af_netlink.c:1340 netlink_sendmsg+0x86d/0xdb0 net/netlink/af_netlink.c:1929 sock_sendmsg_nosec net/socket.c:704 [inline] sock_sendmsg+0xcf/0x120 net/socket.c:724 ____sys_sendmsg+0x6e8/0x810 net/socket.c:2409 ___sys_sendmsg+0xf3/0x170 net/socket.c:2463 __sys_sendmsg+0xe5/0x1b0 net/socket.c:2492 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae Fixes: fb0305c ("net-sched: consolidate default fifo qdisc setup") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Link: https://lore.kernel.org/r/20210930212239.3430364-1-eric.dumazet@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
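The "RIP: 0010:0x0" in the trace is what a call through a NULL function pointer looks like. A minimal sketch of the defensive check follows, using hypothetical `toy_` types rather than the real qdisc structures. Returning 0 when no change() operation exists is an assumption of this sketch, not a statement about the upstream patch.

```c
#include <stddef.h>

struct toy_qdisc;

/* Operations table: change may legitimately be NULL for some disciplines. */
struct toy_qdisc_ops {
    int (*change)(struct toy_qdisc *q, unsigned int limit);
};

struct toy_qdisc {
    const struct toy_qdisc_ops *ops;
    unsigned int limit;
};

int toy_fifo_change(struct toy_qdisc *q, unsigned int limit)
{
    q->limit = limit;
    return 0;
}

const struct toy_qdisc_ops toy_fifo_ops = { .change = toy_fifo_change };
const struct toy_qdisc_ops toy_no_change_ops = { .change = NULL };

/* Only call change() when the discipline actually provides it. */
int toy_set_limit(struct toy_qdisc *q, unsigned int limit)
{
    if (!q->ops->change)
        return 0; /* nothing to do; avoids jumping to address 0 */
    return q->ops->change(q, limit);
}
```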
ghost pushed a commit that referenced this pull request on Nov 23, 2021
[ Upstream commit 3e607dc ] Emergency stack path was jumping into a 3: label inside the __GEN_COMMON_BODY macro for the normal path after it had finished, rather than jumping over it. By a small miracle this is the correct place to build up a new interrupt frame with the existing stack pointer, so things basically worked okay with an added weird looking 700 trap frame on top (which had the wrong ->nip so it didn't decode bug messages either). Fix this by avoiding using numeric labels when jumping over non-trivial macros. Before: LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA PowerNV Modules linked in: CPU: 0 PID: 88 Comm: sh Not tainted 5.15.0-rc2-00034-ge057cdade6e5 #2637 NIP: 7265677368657265 LR: c00000000006c0c8 CTR: c0000000000097f0 REGS: c0000000fffb3a50 TRAP: 0700 Not tainted MSR: 9000000000021031 <SF,HV,ME,IR,DR,LE> CR: 00000700 XER: 20040000 CFAR: c0000000000098b0 IRQMASK: 0 GPR00: c00000000006c964 c0000000fffb3cf0 c000000001513800 0000000000000000 GPR04: 0000000048ab0778 0000000042000000 0000000000000000 0000000000001299 GPR08: 000001e447c718ec 0000000022424282 0000000000002710 c00000000006bee8 GPR12: 9000000000009033 c0000000016b0000 00000000000000b0 0000000000000001 GPR16: 0000000000000000 0000000000000002 0000000000000000 0000000000000ff8 GPR20: 0000000000001fff 0000000000000007 0000000000000080 00007fff89d90158 GPR24: 0000000002000000 0000000002000000 0000000000000255 0000000000000300 GPR28: c000000001270000 0000000042000000 0000000048ab0778 c000000080647e80 NIP [7265677368657265] 0x7265677368657265 LR [c00000000006c0c8] ___do_page_fault+0x3f8/0xb10 Call Trace: [c0000000fffb3cf0] [c00000000000bdac] soft_nmi_common+0x13c/0x1d0 (unreliable) --- interrupt: 700 at decrementer_common_virt+0xb8/0x230 NIP: c0000000000098b8 LR: c00000000006c0c8 CTR: c0000000000097f0 REGS: c0000000fffb3d60 TRAP: 0700 Not tainted MSR: 9000000000021031 <SF,HV,ME,IR,DR,LE> CR: 22424282 XER: 20040000 CFAR: c0000000000098b0 IRQMASK: 0 GPR00: c00000000006c964 0000000000002400 
c000000001513800 0000000000000000 GPR04: 0000000048ab0778 0000000042000000 0000000000000000 0000000000001299 GPR08: 000001e447c718ec 0000000022424282 0000000000002710 c00000000006bee8 GPR12: 9000000000009033 c0000000016b0000 00000000000000b0 0000000000000001 GPR16: 0000000000000000 0000000000000002 0000000000000000 0000000000000ff8 GPR20: 0000000000001fff 0000000000000007 0000000000000080 00007fff89d90158 GPR24: 0000000002000000 0000000002000000 0000000000000255 0000000000000300 GPR28: c000000001270000 0000000042000000 0000000048ab0778 c000000080647e80 NIP [c0000000000098b8] decrementer_common_virt+0xb8/0x230 LR [c00000000006c0c8] ___do_page_fault+0x3f8/0xb10 --- interrupt: 700 Instruction dump: XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX ---[ end trace 6d28218e0cc3c949 ]--- After: ------------[ cut here ]------------ kernel BUG at arch/powerpc/kernel/exceptions-64s.S:491! Oops: Exception in kernel mode, sig: 5 [#1] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA PowerNV Modules linked in: CPU: 0 PID: 88 Comm: login Not tainted 5.15.0-rc2-00034-ge057cdade6e5-dirty #2638 NIP: c0000000000098b8 LR: c00000000006bf04 CTR: c0000000000097f0 REGS: c0000000fffb3d60 TRAP: 0700 Not tainted MSR: 9000000000021031 <SF,HV,ME,IR,DR,LE> CR: 24482227 XER: 00040000 CFAR: c0000000000098b0 IRQMASK: 0 GPR00: c00000000006bf04 0000000000002400 c000000001513800 c000000001271868 GPR04: 00000000100f0d29 0000000042000000 0000000000000007 0000000000000009 GPR08: 00000000100f0d29 0000000024482227 0000000000002710 c000000000181b3c GPR12: 9000000000009033 c0000000016b0000 00000000100f0d29 c000000005b22f00 GPR16: 00000000ffff0000 0000000000000001 0000000000000009 00000000100eed90 GPR20: 00000000100eed90 0000000010000000 000000001000a49c 00000000100f1430 GPR24: c000000001271868 0000000002000000 0000000000000215 0000000000000300 GPR28: c000000001271800 0000000042000000 00000000100f0d29 
c000000080647860 NIP [c0000000000098b8] decrementer_common_virt+0xb8/0x230 LR [c00000000006bf04] ___do_page_fault+0x234/0xb10 Call Trace: Instruction dump: 4182000c 39400001 48000008 894d0932 714a0001 39400008 408225fc 718a4000 7c2a0b78 3821fcf0 41c20008 e82d0910 <0981fcf0> f92101a0 f9610170 f9810178 ---[ end trace a5dbd1f5ea4ccc51 ]--- Fixes: 0a882e2 ("powerpc/64s/exception: remove bad stack branch") Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211004145642.1331214-2-npiggin@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
ghost pushed a commit that referenced this pull request on Nov 23, 2021
[ Upstream commit eb8257a ] On pseries LPAR when an empty slot is assigned to partition OR in single LPAR mode, kdump kernel crashes during issuing PHB reset. In the kdump scenario, we traverse all PHBs and issue reset using the pe_config_addr of the first child device present under each PHB. However the code assumes that none of the PHB slots can be empty and uses list_first_entry() to get the first child device under the PHB. Since list_first_entry() expects the list to be non-empty, it returns an invalid pci_dn entry and ends up accessing NULL phb pointer under pci_dn->phb causing kdump kernel crash. This patch fixes the below kdump kernel crash by skipping empty slots: audit: initializing netlink subsys (disabled) thermal_sys: Registered thermal governor 'fair_share' thermal_sys: Registered thermal governor 'step_wise' cpuidle: using governor menu pstore: Registered nvram as persistent store backend Issue PHB reset ... audit: type=2000 audit(1631267818.000:1): state=initialized audit_enabled=0 res=1 BUG: Kernel NULL pointer dereference on read at 0x00000268 Faulting instruction address: 0xc000000008101fb0 Oops: Kernel access of bad area, sig: 7 [#1] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries Modules linked in: CPU: 7 PID: 1 Comm: swapper/7 Not tainted 5.14.0 #1 NIP: c000000008101fb0 LR: c000000009284ccc CTR: c000000008029d70 REGS: c00000001161b840 TRAP: 0300 Not tainted (5.14.0) MSR: 8000000002009033 <SF,VEC,EE,ME,IR,DR,RI,LE> CR: 28000224 XER: 20040002 CFAR: c000000008101f0c DAR: 0000000000000268 DSISR: 00080000 IRQMASK: 0 ... 
NIP pseries_eeh_get_pe_config_addr+0x100/0x1b0 LR __machine_initcall_pseries_eeh_pseries_init+0x2cc/0x350 Call Trace: 0xc00000001161bb80 (unreliable) __machine_initcall_pseries_eeh_pseries_init+0x2cc/0x350 do_one_initcall+0x60/0x2d0 kernel_init_freeable+0x350/0x3f8 kernel_init+0x3c/0x17c ret_from_kernel_thread+0x5c/0x64 Fixes: 5a090f7 ("powerpc/pseries: PCIE PHB reset") Signed-off-by: Mahesh Salgaonkar <mahesh@linux.ibm.com> [mpe: Tweak wording and trim oops] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/163215558252.413351.8600189949820258982.stgit@jupiter Signed-off-by: Sasha Levin <sashal@kernel.org>
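Why list_first_entry() misbehaves on an empty list can be shown with a toy circular list. In an empty list the head points to itself, so converting head->next into an entry produces a pointer into unrelated memory. The names below (`toy_list`, `toy_child`, `first_child_or_null()`) are hypothetical; the kernel offers list_first_entry_or_null() for the same purpose, and this sketch does not claim that the upstream patch uses it.

```c
#include <stddef.h>

/* Minimal circular doubly linked list in the style of the kernel's list_head. */
struct toy_list { struct toy_list *next, *prev; };

void toy_list_init(struct toy_list *h) { h->next = h->prev = h; }

int toy_list_empty(const struct toy_list *h) { return h->next == h; }

void toy_list_add_tail(struct toy_list *n, struct toy_list *h)
{
    n->prev = h->prev;
    n->next = h;
    h->prev->next = n;
    h->prev = n;
}

/* Hypothetical child device hanging off a PHB. */
struct toy_child {
    int pe_config_addr;
    struct toy_list link;
};

/* Recover the containing structure from a pointer to its list member. */
#define toy_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Return the first child, or NULL when the slot list is empty. */
struct toy_child *first_child_or_null(struct toy_list *h)
{
    if (toy_list_empty(h))
        return NULL; /* empty slot: the caller should skip this PHB */
    return toy_container_of(h->next, struct toy_child, link);
}
```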
ghost pushed a commit that referenced this pull request on Nov 23, 2021
…rylock()
commit 38fa320 upstream.

When rebooting the system via sysrq, the following bug occurs:

BUG: sleeping function called from invalid context at kernel/locking/semaphore.c:90
in_atomic(): 0, irqs_disabled(): 128, non_block: 0, pid: 10052, name: rc.shutdown
CPU: 3 PID: 10052 Comm: rc.shutdown Tainted: G W O 5.10.0 #1
Call trace:
dump_backtrace+0x0/0x1c8
show_stack+0x18/0x28
dump_stack+0xd0/0x110
___might_sleep+0x14c/0x160
__might_sleep+0x74/0x88
down_interruptible+0x40/0x118
virt_efi_reset_system+0x3c/0xd0
efi_reboot+0xd4/0x11c
machine_restart+0x60/0x9c
emergency_restart+0x1c/0x2c
sysrq_handle_reboot+0x1c/0x2c
__handle_sysrq+0xd0/0x194
write_sysrq_trigger+0xbc/0xe4
proc_reg_write+0xd4/0xf0
vfs_write+0xa8/0x148
ksys_write+0x6c/0xd8
__arm64_sys_write+0x18/0x28
el0_svc_common.constprop.3+0xe4/0x16c
do_el0_svc+0x1c/0x2c
el0_svc+0x20/0x30
el0_sync_handler+0x80/0x17c
el0_sync+0x158/0x180

The cause is that machine_restart() disables IRQs and virt_efi_reset_system() then calls down_interruptible(), which may sleep while IRQs are disabled. That is dangerous. Commit 99409b9 ("locking/semaphore: Add might_sleep() to down_*() family") added might_sleep() to down_interruptible(), which is why the warning is now reported. down_trylock() solves the problem because it never sleeps and has no might_sleep().

--------
Cc: <stable@vger.kernel.org>
Signed-off-by: Zhang Jianhua <chris.zjh@huawei.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
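The difference between a blocking acquire and a try-acquire can be sketched with a toy semaphore. `struct toy_sem`, `toy_down_trylock()` and `toy_reset_system()` are hypothetical, single-threaded illustrations; the only thing borrowed from the kernel is the down_trylock() return convention (0 when acquired, 1 when not).

```c
/* Toy counting semaphore. Not thread-safe: for illustration only. */
struct toy_sem { int count; };

/* Never blocks. Returns 0 when the semaphore was acquired and 1 when it
 * was not, mirroring the kernel's down_trylock() convention. */
int toy_down_trylock(struct toy_sem *s)
{
    if (s->count <= 0)
        return 1;
    s->count--;
    return 0;
}

void toy_up(struct toy_sem *s) { s->count++; }

/* Hypothetical reset path that must not sleep. */
int toy_reset_system(struct toy_sem *s)
{
    if (toy_down_trylock(s))
        return -1; /* lock is busy: give up instead of sleeping */
    /* ... the firmware call would go here ... */
    toy_up(s);
    return 0;
}
```

The trade-off is that the reset request can fail when the lock is contended, so the caller needs a fallback path rather than assuming the call always goes through.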
ghost pushed a commit that referenced this pull request on Nov 23, 2021
…tart
commit 60d950f upstream.

Commit 74fc4f8 ("net: Fix offloading indirect devices dependency on qdisc order creation") added a step that triggers the callback to set up the bo callback when a driver registers a callback. In our current implementation, we are not ready to run the callback when nfp calls flow_indr_dev_register(), which produces the following error:

kernel: Oops: 0000 [#1] SMP PTI
kernel: CPU: 0 PID: 14119 Comm: kworker/0:0 Tainted: G
kernel: Workqueue: events work_for_cpu_fn
kernel: RIP: 0010:nfp_flower_indr_setup_tc_cb+0x258/0x410
kernel: RSP: 0018:ffffbc1e02c57bf8 EFLAGS: 00010286
kernel: RAX: 0000000000000000 RBX: ffff9c761fabc000 RCX: 0000000000000001
kernel: RDX: 0000000000000001 RSI: fffffffffffffff0 RDI: ffffffffc0be9ef1
kernel: RBP: ffffbc1e02c57c58 R08: ffffffffc08f33aa R09: ffff9c6db7478800
kernel: R10: 0000009c003f6e00 R11: ffffbc1e02800000 R12: ffffbc1e000d9000
kernel: R13: ffffbc1e000db428 R14: ffff9c6db7478800 R15: ffff9c761e884e80
kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
kernel: CR2: fffffffffffffff0 CR3: 00000009e260a004 CR4: 00000000007706f0
kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
kernel: PKRU: 55555554
kernel: Call Trace:
kernel: ? flow_indr_dev_register+0xab/0x210
kernel: ? __cond_resched+0x15/0x30
kernel: ? kmem_cache_alloc_trace+0x44/0x4b0
kernel: ? nfp_flower_setup_tc+0x1d0/0x1d0 [nfp]
kernel: flow_indr_dev_register+0x158/0x210
kernel: ? tcf_block_unbind+0xe0/0xe0
kernel: nfp_flower_init+0x40b/0x650 [nfp]
kernel: nfp_net_pci_probe+0x25f/0x960 [nfp]
kernel: ? nfp_rtsym_read_le+0x76/0x130 [nfp]
kernel: nfp_pci_probe+0x6a9/0x820 [nfp]
kernel: local_pci_probe+0x45/0x80

So flow_indr_dev_register() needs to be called in the app start stage instead of the init stage.

Fixes: 74fc4f8 ("net: Fix offloading indirect devices dependency on qdisc order creation")
Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Louis Peens <louis.peens@corigine.com>
Link: https://lore.kernel.org/r/20211012124850.13025-1-louis.peens@corigine.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
ghost pushed a commit that referenced this pull request on Nov 23, 2021
commit b15fa92 upstream.

Starting with kernel 5.11 built with CONFIG_FORTIFY_SOURCE, mounting an ocfs2 filesystem with either the o2cb or pcmk cluster stack fails with the trace below. The problem seems to be that the strings for cluster stack and cluster name are not guaranteed to be null terminated in the disk representation, while strlcpy assumes that the source string is always null terminated. This causes a read outside of the source string, which triggers the buffer overflow detection.

detected buffer overflow in strlen
------------[ cut here ]------------
kernel BUG at lib/string.c:1149!
invalid opcode: 0000 [#1] SMP PTI
CPU: 1 PID: 910 Comm: mount.ocfs2 Not tainted 5.14.0-1-amd64 #1 Debian 5.14.6-2
RIP: 0010:fortify_panic+0xf/0x11
...
Call Trace:
ocfs2_initialize_super.isra.0.cold+0xc/0x18 [ocfs2]
ocfs2_fill_super+0x359/0x19b0 [ocfs2]
mount_bdev+0x185/0x1b0
legacy_get_tree+0x27/0x40
vfs_get_tree+0x25/0xb0
path_mount+0x454/0xa20
__x64_sys_mount+0x103/0x140
do_syscall_64+0x3b/0xc0
entry_SYSCALL_64_after_hwframe+0x44/0xae

Link: https://lkml.kernel.org/r/20210929180654.32460-1-vvidic@valentin-vidic.from.hr
Signed-off-by: Valentin Vidic <vvidic@valentin-vidic.from.hr>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
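One way to copy from a fixed-width field that may lack a terminator is to copy exactly the field width and terminate the destination explicitly, so nothing ever scans the source for a NUL byte. The sketch below assumes a hypothetical 4-byte label field (`LABEL_LEN`) and a hypothetical helper `copy_label()`; it illustrates the idea and is not taken from the upstream patch.

```c
#include <string.h>

#define LABEL_LEN 4 /* hypothetical width of the on-disk field */

/* Copy a fixed-width, possibly unterminated field into dst.
 * dst must have room for LABEL_LEN + 1 bytes. Exactly LABEL_LEN bytes
 * are read from src, so no read goes past the end of the field. */
void copy_label(char *dst, const char *src)
{
    memcpy(dst, src, LABEL_LEN);
    dst[LABEL_LEN] = '\0';
}
```

A field holding the four bytes 'o', '2', 'c', 'b' with no terminator is copied as the string "o2cb" without reading a fifth byte.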
ghost pushed a commit that referenced this pull request on Nov 23, 2021
commit 74c42e1 upstream. Currently collapse_file does not explicitly check PG_writeback, instead, page_has_private and try_to_release_page are used to filter writeback pages. This does not work for xfs with blocksize equal to or larger than pagesize, because in such case xfs has no page->private. This makes collapse_file bail out early for writeback page. Otherwise, xfs end_page_writeback will panic as follows. page:fffffe00201bcc80 refcount:0 mapcount:0 mapping:ffff0003f88c86a8 index:0x0 pfn:0x84ef32 aops:xfs_address_space_operations [xfs] ino:30000b7 dentry name:"libtest.so" flags: 0x57fffe0000008027(locked|referenced|uptodate|active|writeback) raw: 57fffe0000008027 ffff80001b48bc28 ffff80001b48bc28 ffff0003f88c86a8 raw: 0000000000000000 0000000000000000 00000000ffffffff ffff0000c3e9a000 page dumped because: VM_BUG_ON_PAGE(((unsigned int) page_ref_count(page) + 127u <= 127u)) page->mem_cgroup:ffff0000c3e9a000 ------------[ cut here ]------------ kernel BUG at include/linux/mm.h:1212! Internal error: Oops - BUG: 0 [#1] SMP Modules linked in: BUG: Bad page state in process khugepaged pfn:84ef32 xfs(E) page:fffffe00201bcc80 refcount:0 mapcount:0 mapping:0 index:0x0 pfn:0x84ef32 libcrc32c(E) rfkill(E) aes_ce_blk(E) crypto_simd(E) ... CPU: 25 PID: 0 Comm: swapper/25 Kdump: loaded Tainted: ... 
pstate: 60400005 (nZCv daif +PAN -UAO -TCO BTYPE=--) Call trace: end_page_writeback+0x1c0/0x214 iomap_finish_page_writeback+0x13c/0x204 iomap_finish_ioend+0xe8/0x19c iomap_writepage_end_bio+0x38/0x50 bio_endio+0x168/0x1ec blk_update_request+0x278/0x3f0 blk_mq_end_request+0x34/0x15c virtblk_request_done+0x38/0x74 [virtio_blk] blk_done_softirq+0xc4/0x110 __do_softirq+0x128/0x38c __irq_exit_rcu+0x118/0x150 irq_exit+0x1c/0x30 __handle_domain_irq+0x8c/0xf0 gic_handle_irq+0x84/0x108 el1_irq+0xcc/0x180 arch_cpu_idle+0x18/0x40 default_idle_call+0x4c/0x1a0 cpuidle_idle_call+0x168/0x1e0 do_idle+0xb4/0x104 cpu_startup_entry+0x30/0x9c secondary_start_kernel+0x104/0x180 Code: d4210000 b0006161 910c8021 94013f4d (d4210000) ---[ end trace 4a88c6a074082f8c ]--- Kernel panic - not syncing: Oops - BUG: Fatal exception in interrupt Link: https://lkml.kernel.org/r/20211022023052.33114-1-rongwei.wang@linux.alibaba.com Fixes: 99cb0db ("mm,thp: add read-only THP support for (non-shmem) FS") Signed-off-by: Rongwei Wang <rongwei.wang@linux.alibaba.com> Signed-off-by: Xu Yu <xuyu@linux.alibaba.com> Suggested-by: Yang Shi <shy828301@gmail.com> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Yang Shi <shy828301@gmail.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Song Liu <song@kernel.org> Cc: William Kucharski <william.kucharski@oracle.com> Cc: Hugh Dickins <hughd@google.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
ghost pushed a commit that referenced this pull request on Nov 23, 2021
commit 6473395 upstream.

When copying the device name, the number of bytes copied by memcpy exceeds the length of the source buffer, which causes the KASAN report below. Use strscpy_pad() instead.

BUG: KASAN: slab-out-of-bounds in ib_nl_set_path_rec_attrs+0x136/0x320 [ib_core]
Read of size 64 at addr ffff88811a10f5e0 by task rping/140263
CPU: 3 PID: 140263 Comm: rping Not tainted 5.15.0-rc1+ #1
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
Call Trace:
dump_stack_lvl+0x57/0x7d
print_address_description.constprop.0+0x1d/0xa0
kasan_report+0xcb/0x110
kasan_check_range+0x13d/0x180
memcpy+0x20/0x60
ib_nl_set_path_rec_attrs+0x136/0x320 [ib_core]
ib_nl_make_request+0x1c6/0x380 [ib_core]
send_mad+0x20a/0x220 [ib_core]
ib_sa_path_rec_get+0x3e3/0x800 [ib_core]
cma_query_ib_route+0x29b/0x390 [rdma_cm]
rdma_resolve_route+0x308/0x3e0 [rdma_cm]
ucma_resolve_route+0xe1/0x150 [rdma_ucm]
ucma_write+0x17b/0x1f0 [rdma_ucm]
vfs_write+0x142/0x4d0
ksys_write+0x133/0x160
do_syscall_64+0x43/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f26499aa90f
Code: 89 54 24 18 48 89 74 24 10 89 7c 24 08 e8 29 fd ff ff 48 8b 54 24 18 48 8b 74 24 10 41 89 c0 8b 7c 24 08 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 31 44 89 c7 48 89 44 24 08 e8 5c fd ff ff 48
RSP: 002b:00007f26495f2dc0 EFLAGS: 00000293 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 00000000000007d0 RCX: 00007f26499aa90f
RDX: 0000000000000010 RSI: 00007f26495f2e00 RDI: 0000000000000003
RBP: 00005632a8315440 R08: 0000000000000000 R09: 0000000000000001
R10: 0000000000000000 R11: 0000000000000293 R12: 00007f26495f2e00
R13: 00005632a83154e0 R14: 00005632a8315440 R15: 00005632a830a810

Allocated by task 131419:
kasan_save_stack+0x1b/0x40
__kasan_kmalloc+0x7c/0x90
proc_self_get_link+0x8b/0x100
pick_link+0x4f1/0x5c0
step_into+0x2eb/0x3d0
walk_component+0xc8/0x2c0
link_path_walk+0x3b8/0x580
path_openat+0x101/0x230
do_filp_open+0x12e/0x240
do_sys_openat2+0x115/0x280
__x64_sys_openat+0xce/0x140
do_syscall_64+0x43/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xae

Fixes: 2ca546b ("IB/sa: Route SA pathrecord query through netlink")
Link: https://lore.kernel.org/r/72ede0f6dab61f7f23df9ac7a70666e07ef314b0.1635055496.git.leonro@nvidia.com
Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
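The contrast with a fixed-length memcpy can be sketched in userspace. `toy_strscpy_pad()` below is a hypothetical approximation of the behaviour the commit relies on: it stops reading at the source's terminator, always terminates the destination, and zero-fills the remainder. The -1 return on truncation is a simplification chosen for this sketch (the kernel helper returns an error code instead).

```c
#include <stddef.h>
#include <string.h>

/* Copy the NUL-terminated string src into dst, which is size bytes long.
 * Reads at most size bytes of src and never reads past src's terminator.
 * The unused tail of dst is zero-filled. Returns the number of characters
 * copied, or -1 when the string had to be truncated. */
long toy_strscpy_pad(char *dst, const char *src, size_t size)
{
    size_t i = 0;
    int truncated;

    if (size == 0)
        return -1;

    while (i < size - 1 && src[i] != '\0') {
        dst[i] = src[i];
        i++;
    }

    /* If the loop stopped because dst is full, src[i] tells us whether
     * more characters remained. */
    truncated = (src[i] != '\0');
    memset(dst + i, 0, size - i);
    return truncated ? -1 : (long)i;
}
```

Copying a short device name into a 64-byte attribute this way reads only the name and its terminator, whereas memcpy with a length of 64 reads 64 bytes from the source regardless of how large the source buffer actually is.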
ghost pushed a commit that referenced this pull request on Nov 23, 2021
This reverts commit 88dbd08. Causes the following Syzkaller reported issue: BUG: kernel NULL pointer dereference, address: 0000000000000010 PGD 0 P4D 0 Oops: 0002 [#1] PREEMPT SMP KASAN CPU: 1 PID: 546 Comm: syz-executor631 Tainted: G B 5.10.76-syzkaller-01178-g4944ec82ebb9 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:arch_atomic_try_cmpxchg syzkaller/managers/android-5-10/kernel/./arch/x86/include/asm/atomic.h:202 [inline] RIP: 0010:atomic_try_cmpxchg_acquire syzkaller/managers/android-5-10/kernel/./include/asm-generic/atomic-instrumented.h:707 [inline] RIP: 0010:queued_spin_lock syzkaller/managers/android-5-10/kernel/./include/asm-generic/qspinlock.h:82 [inline] RIP: 0010:do_raw_spin_lock_flags syzkaller/managers/android-5-10/kernel/./include/linux/spinlock.h:195 [inline] RIP: 0010:__raw_spin_lock_irqsave syzkaller/managers/android-5-10/kernel/./include/linux/spinlock_api_smp.h:119 [inline] RIP: 0010:_raw_spin_lock_irqsave+0x10d/0x210 syzkaller/managers/android-5-10/kernel/kernel/locking/spinlock.c:159 Code: 00 00 00 e8 d5 29 09 fd 4c 89 e7 be 04 00 00 00 e8 c8 29 09 fd 42 8a 04 3b 84 c0 0f 85 be 00 00 00 8b 44 24 40 b9 01 00 00 00 <f0> 41 0f b1 4d 00 75 45 48 c7 44 24 20 0e 36 e0 45 4b c7 04 37 00 RSP: 0018:ffffc90000f174e0 EFLAGS: 00010097 RAX: 0000000000000000 RBX: 1ffff920001e2ea4 RCX: 0000000000000001 RDX: 0000000000000001 RSI: 0000000000000004 RDI: ffffc90000f17520 RBP: ffffc90000f175b0 R08: dffffc0000000000 R09: 0000000000000003 R10: fffff520001e2ea5 R11: 0000000000000004 R12: ffffc90000f17520 R13: 0000000000000010 R14: 1ffff920001e2ea0 R15: dffffc0000000000 FS: 0000000000000000(0000) GS:ffff8881f7100000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000010 CR3: 000000000640f000 CR4: 00000000003506a0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: 
prepare_to_wait+0x9c/0x290 syzkaller/managers/android-5-10/kernel/kernel/sched/wait.c:248 io_uring_cancel_files syzkaller/managers/android-5-10/kernel/fs/io_uring.c:8690 [inline] io_uring_cancel_task_requests+0x16a9/0x1ed0 syzkaller/managers/android-5-10/kernel/fs/io_uring.c:8760 io_uring_flush+0x170/0x6d0 syzkaller/managers/android-5-10/kernel/fs/io_uring.c:8923 filp_close+0xb0/0x150 syzkaller/managers/android-5-10/kernel/fs/open.c:1319 close_files syzkaller/managers/android-5-10/kernel/fs/file.c:401 [inline] put_files_struct+0x1d4/0x350 syzkaller/managers/android-5-10/kernel/fs/file.c:429 exit_files+0x80/0xa0 syzkaller/managers/android-5-10/kernel/fs/file.c:458 do_exit+0x6d9/0x23a0 syzkaller/managers/android-5-10/kernel/kernel/exit.c:808 do_group_exit+0x16a/0x2d0 syzkaller/managers/android-5-10/kernel/kernel/exit.c:910 get_signal+0x133e/0x1f80 syzkaller/managers/android-5-10/kernel/kernel/signal.c:2790 arch_do_signal+0x8d/0x620 syzkaller/managers/android-5-10/kernel/arch/x86/kernel/signal.c:805 exit_to_user_mode_loop syzkaller/managers/android-5-10/kernel/kernel/entry/common.c:161 [inline] exit_to_user_mode_prepare+0xaa/0xe0 syzkaller/managers/android-5-10/kernel/kernel/entry/common.c:191 syscall_exit_to_user_mode+0x24/0x40 syzkaller/managers/android-5-10/kernel/kernel/entry/common.c:266 do_syscall_64+0x3d/0x70 syzkaller/managers/android-5-10/kernel/arch/x86/entry/common.c:56 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x7fc6d1589a89 Code: Unable to access opcode bytes at RIP 0x7fc6d1589a5f. 
RSP: 002b:00007ffd2b5da728 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca RAX: fffffffffffffdfc RBX: 0000000000005193 RCX: 00007fc6d1589a89 RDX: 0000000000000000 RSI: 0000000000000080 RDI: 00007fc6d161142c RBP: 0000000000000032 R08: 00007ffd2b5eb0b8 R09: 0000000000000000 R10: 00007ffd2b5da750 R11: 0000000000000246 R12: 00007fc6d161142c R13: 00007ffd2b5da750 R14: 00007ffd2b5da770 R15: 0000000000000000 Modules linked in: CR2: 0000000000000010 ---[ end trace fe8044f7dc4d8d65 ]--- RIP: 0010:arch_atomic_try_cmpxchg syzkaller/managers/android-5-10/kernel/./arch/x86/include/asm/atomic.h:202 [inline] RIP: 0010:atomic_try_cmpxchg_acquire syzkaller/managers/android-5-10/kernel/./include/asm-generic/atomic-instrumented.h:707 [inline] RIP: 0010:queued_spin_lock syzkaller/managers/android-5-10/kernel/./include/asm-generic/qspinlock.h:82 [inline] RIP: 0010:do_raw_spin_lock_flags syzkaller/managers/android-5-10/kernel/./include/linux/spinlock.h:195 [inline] RIP: 0010:__raw_spin_lock_irqsave syzkaller/managers/android-5-10/kernel/./include/linux/spinlock_api_smp.h:119 [inline] RIP: 0010:_raw_spin_lock_irqsave+0x10d/0x210 syzkaller/managers/android-5-10/kernel/kernel/locking/spinlock.c:159 Code: 00 00 00 e8 d5 29 09 fd 4c 89 e7 be 04 00 00 00 e8 c8 29 09 fd 42 8a 04 3b 84 c0 0f 85 be 00 00 00 8b 44 24 40 b9 01 00 00 00 <f0> 41 0f b1 4d 00 75 45 48 c7 44 24 20 0e 36 e0 45 4b c7 04 37 00 RSP: 0018:ffffc90000f174e0 EFLAGS: 00010097 RAX: 0000000000000000 RBX: 1ffff920001e2ea4 RCX: 0000000000000001 RDX: 0000000000000001 RSI: 0000000000000004 RDI: ffffc90000f17520 RBP: ffffc90000f175b0 R08: dffffc0000000000 R09: 0000000000000003 R10: fffff520001e2ea5 R11: 0000000000000004 R12: ffffc90000f17520 R13: 0000000000000010 R14: 1ffff920001e2ea0 R15: dffffc0000000000 FS: 0000000000000000(0000) GS:ffff8881f7100000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000010 CR3: 000000000640f000 CR4: 00000000003506a0 DR0: 0000000000000000 DR1: 0000000000000000 
DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 ---------------- Code disassembly (best guess), 1 bytes skipped: 0: 00 00 add %al,(%rax) 2: e8 d5 29 09 fd callq 0xfd0929dc 7: 4c 89 e7 mov %r12,%rdi a: be 04 00 00 00 mov $0x4,%esi f: e8 c8 29 09 fd callq 0xfd0929dc 14: 42 8a 04 3b mov (%rbx,%r15,1),%al 18: 84 c0 test %al,%al 1a: 0f 85 be 00 00 00 jne 0xde 20: 8b 44 24 40 mov 0x40(%rsp),%eax 24: b9 01 00 00 00 mov $0x1,%ecx * 29: f0 41 0f b1 4d 00 lock cmpxchg %ecx,0x0(%r13) <-- trapping instruction 2f: 75 45 jne 0x76 31: 48 c7 44 24 20 0e 36 movq $0x45e0360e,0x20(%rsp) 38: e0 45 3a: 4b rex.WXB 3b: c7 .byte 0xc7 3c: 04 37 add $0x37,%al Link: https://syzkaller.appspot.com/bug?extid=b0003676644cf0d6acc4 Reported-by: syzbot+b0003676644cf0d6acc4@syzkaller.appspotmail.com Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
ghost
pushed a commit
that referenced
this pull request
Nov 23, 2021
commit 703535e upstream.

No need to deduce command size in scsi_setup_scsi_cmnd() anymore as appropriate checks have been added to scsi_fill_sghdr_rq() function and the cmd_len should never be zero here. The code to do that wasn't correct anyway, as it used uninitialized cmd->cmnd, which caused a null-ptr-deref if the command size was zero as in the trace below. Fix this by removing the unneeded code.

KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
CPU: 0 PID: 1822 Comm: repro Not tainted 5.15.0 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-4.fc34 04/01/2014
Call Trace:
 blk_mq_dispatch_rq_list+0x7c7/0x12d0
 __blk_mq_sched_dispatch_requests+0x244/0x380
 blk_mq_sched_dispatch_requests+0xf0/0x160
 __blk_mq_run_hw_queue+0xe8/0x160
 __blk_mq_delay_run_hw_queue+0x252/0x5d0
 blk_mq_run_hw_queue+0x1dd/0x3b0
 blk_mq_sched_insert_request+0x1ff/0x3e0
 blk_execute_rq_nowait+0x173/0x1e0
 blk_execute_rq+0x15c/0x540
 sg_io+0x97c/0x1370
 scsi_ioctl+0xe16/0x28e0
 sd_ioctl+0x134/0x170
 blkdev_ioctl+0x362/0x6e0
 block_ioctl+0xb0/0xf0
 vfs_ioctl+0xa7/0xf0
 do_syscall_64+0x3d/0xb0
 entry_SYSCALL_64_after_hwframe+0x44/0xae
---[ end trace 8b086e334adef6d2 ]---
Kernel panic - not syncing: Fatal exception

Link: https://lore.kernel.org/r/20211103170659.22151-2-tadeusz.struk@linaro.org
Fixes: 2ceda20 ("scsi: core: Move command size detection out of the fast path")
Cc: Bart Van Assche <bvanassche@acm.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: James E.J. Bottomley <jejb@linux.ibm.com>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: <linux-scsi@vger.kernel.org>
Cc: <linux-kernel@vger.kernel.org>
Cc: <stable@vger.kernel.org> # 5.15, 5.14, 5.10
Reported-by: syzbot+5516b30f5401d4dcbcae@syzkaller.appspotmail.com
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Tadeusz Struk <tadeusz.struk@linaro.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
ghost
pushed a commit
that referenced
this pull request
Nov 23, 2021
commit 3ef68d4 upstream.

Kernel crashes when accessing port_speed sysfs file. The issue happens on a CNA when the local array was accessed beyond bounds. Fix this by changing the lookup.

BUG: unable to handle kernel paging request at 0000000000004000
PGD 0 P4D 0
Oops: 0000 [#1] SMP PTI
CPU: 15 PID: 455213 Comm: sosreport Kdump: loaded Not tainted 4.18.0-305.7.1.el8_4.x86_64 #1
RIP: 0010:string_nocheck+0x12/0x70
Code: 00 00 4c 89 e2 be 20 00 00 00 48 89 ef e8 86 9a 00 00 4c 01 e3 eb 81 90 49 89 f2 48 89 ce 48 89 f8 48 c1 fe 30 66 85 f6 74 4f <44> 0f b6 0a 45 84 c9 74 46 83 ee 01 41 b8 01 00 00 00 48 8d 7c 37
RSP: 0018:ffffb5141c1afcf0 EFLAGS: 00010286
RAX: ffff8bf4009f8000 RBX: ffff8bf4009f9000 RCX: ffff0a00ffffff04
RDX: 0000000000004000 RSI: ffffffffffffffff RDI: ffff8bf4009f8000
RBP: 0000000000004000 R08: 0000000000000001 R09: ffffb5141c1afb84
R10: ffff8bf4009f9000 R11: ffffb5141c1afce6 R12: ffff0a00ffffff04
R13: ffffffffc08e21aa R14: 0000000000001000 R15: ffffffffc08e21aa
FS: 00007fc4ebfff700(0000) GS:ffff8c717f7c0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000004000 CR3: 000000edfdee6006 CR4: 00000000001706e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 string+0x40/0x50
 vsnprintf+0x33c/0x520
 scnprintf+0x4d/0x90
 qla2x00_port_speed_show+0xb5/0x100 [qla2xxx]
 dev_attr_show+0x1c/0x40
 sysfs_kf_seq_show+0x9b/0x100
 seq_read+0x153/0x410
 vfs_read+0x91/0x140
 ksys_read+0x4f/0xb0
 do_syscall_64+0x5b/0x1a0
 entry_SYSCALL_64_after_hwframe+0x65/0xca

Link: https://lore.kernel.org/r/20210908164622.19240-7-njavali@marvell.com
Fixes: 4910b52 ("scsi: qla2xxx: Add support for setting port speed")
Cc: stable@vger.kernel.org
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Arun Easi <aeasi@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
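The commit message above only says the out-of-bounds array access was fixed "by changing the lookup". As a rough userspace illustration of that general idea (not the driver's actual code), the sketch below replaces direct array indexing with a search over a small table, so an unexpected rate code falls through to a default string instead of reading past the array. The names speed_entry, speed_table and speed_label, and the code values in the table, are hypothetical.

```c
#include <stddef.h>

/* Hypothetical rate codes and labels; these are NOT the qla2xxx values. */
struct speed_entry {
	unsigned int code;
	const char *label;
};

static const struct speed_entry speed_table[] = {
	{ 0x00, "1" },
	{ 0x01, "2" },
	{ 0x03, "4" },
	{ 0x04, "8" },
	{ 0x13, "10" },
};

/*
 * Look the code up by value instead of using it as an array index.
 * Any code that is not in the table yields "unknown" rather than an
 * out-of-bounds read.
 */
const char *speed_label(unsigned int code)
{
	size_t i;

	for (i = 0; i < sizeof(speed_table) / sizeof(speed_table[0]); i++) {
		if (speed_table[i].code == code)
			return speed_table[i].label;
	}
	return "unknown";
}
```

With this shape, a value such as 0x4000 (the faulting address pattern in the trace) simply maps to the fallback label.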
ghost
pushed a commit
that referenced
this pull request
Nov 23, 2021
commit e775eb9 upstream.

When debug kernel configs are enabled, the following call trace appears:

BUG: using smp_processor_id() in preemptible [00000000] code: swapper/0/1
caller is debug_smp_processor_id+0x20/0x30
CPU: 6 PID: 1 Comm: swapper/0 Not tainted 5.10.63-yocto-standard #1
Hardware name: NXP Layerscape LX2160ARDB (DT)
Call trace:
 dump_backtrace+0x0/0x1a0
 show_stack+0x24/0x30
 dump_stack+0xf0/0x13c
 check_preemption_disabled+0x100/0x110
 debug_smp_processor_id+0x20/0x30
 dpaa2_io_query_fq_count+0xdc/0x154
 dpaa2_eth_stop+0x144/0x314
 __dev_close_many+0xdc/0x160
 __dev_change_flags+0xe8/0x220
 dev_change_flags+0x30/0x70
 ic_close_devs+0x50/0x78
 ip_auto_config+0xed0/0xf10
 do_one_initcall+0xac/0x460
 kernel_init_freeable+0x30c/0x378
 kernel_init+0x20/0x128
 ret_from_fork+0x10/0x38

Based on the comment in the surrounding code, it does not matter whether preemption is disabled or not. So, replace smp_processor_id() with raw_smp_processor_id() to avoid the above call trace.

Fixes: c89105c ("staging: fsl-mc: Move DPIO from staging to drivers/soc/fsl")
Cc: stable@vger.kernel.org
Signed-off-by: Meng Li <Meng.Li@windriver.com>
Signed-off-by: Li Yang <leoyang.li@nxp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
ghost
pushed a commit
that referenced
this pull request
Nov 23, 2021
commit dc7e594 upstream.

The original code uses two functions, spin_lock() and local_irq_save(), to protect the critical section. But with the kernel debug config enabled, the following inconsistent lock state is detected:

================================
WARNING: inconsistent lock state
5.10.63-yocto-standard #1 Not tainted
--------------------------------
inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
lock_torture_wr/226 [HC0[0]:SC1[5]:HE1:SE0] takes:
ffff002005b2dd80 (&p->access_spinlock){+.?.}-{3:3}, at: qbman_swp_enqueue_multiple_mem_back+0x44/0x270
{SOFTIRQ-ON-W} state was registered at:
  lock_acquire.part.0+0xf8/0x250
  lock_acquire+0x68/0x84
  _raw_spin_lock+0x68/0x90
  qbman_swp_enqueue_multiple_mem_back+0x44/0x270
  ......
  cryptomgr_test+0x38/0x60
  kthread+0x158/0x164
  ret_from_fork+0x10/0x38
irq event stamp: 4498
hardirqs last enabled at (4498): [<ffff800010fcf980>] _raw_spin_unlock_irqrestore+0x90/0xb0
hardirqs last disabled at (4497): [<ffff800010fcffc4>] _raw_spin_lock_irqsave+0xd4/0xe0
softirqs last enabled at (4458): [<ffff8000100108c4>] __do_softirq+0x674/0x724
softirqs last disabled at (4465): [<ffff80001005b2a4>] __irq_exit_rcu+0x190/0x19c

other info that might help us debug this:
 Possible unsafe locking scenario:
       CPU0
       ----
  lock(&p->access_spinlock);
  <Interrupt>
    lock(&p->access_spinlock);
 *** DEADLOCK ***

So, in order to avoid the deadlock, use the combined functions spin_lock_irqsave()/spin_unlock_irqrestore() to protect the critical section.

Fixes: 3b2abda ("soc: fsl: dpio: Replace QMAN array mode with ring mode enqueue")
Cc: stable@vger.kernel.org
Signed-off-by: Meng Li <Meng.Li@windriver.com>
Signed-off-by: Li Yang <leoyang.li@nxp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
ghost
pushed a commit
that referenced
this pull request
Nov 23, 2021
[ Upstream commit dbb4cfe ] The interrupt handling should be related to the firmware version. If the driver matches an old firmware, then the driver should not handle interrupt such as i2c or dma, otherwise it will cause some errors. This log reveals it: [ 27.708641] INFO: trying to register non-static key. [ 27.710851] The code is fine but needs lockdep annotation, or maybe [ 27.712010] you didn't initialize this object before use? [ 27.712396] turning off the locking correctness validator. [ 27.712787] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 5.12.4-g70e7f0549188-dirty torvalds#169 [ 27.713349] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014 [ 27.714149] Call Trace: [ 27.714329] <IRQ> [ 27.714480] dump_stack+0xba/0xf5 [ 27.714737] register_lock_class+0x873/0x8f0 [ 27.715052] ? __lock_acquire+0x323/0x1930 [ 27.715353] __lock_acquire+0x75/0x1930 [ 27.715636] lock_acquire+0x1dd/0x3e0 [ 27.715905] ? netup_i2c_interrupt+0x19/0x310 [ 27.716226] _raw_spin_lock_irqsave+0x4b/0x60 [ 27.716544] ? 
netup_i2c_interrupt+0x19/0x310 [ 27.716863] netup_i2c_interrupt+0x19/0x310 [ 27.717178] netup_unidvb_isr+0xd3/0x160 [ 27.717467] __handle_irq_event_percpu+0x53/0x3e0 [ 27.717808] handle_irq_event_percpu+0x35/0x90 [ 27.718129] handle_irq_event+0x39/0x60 [ 27.718409] handle_fasteoi_irq+0xc2/0x1d0 [ 27.718707] __common_interrupt+0x7f/0x150 [ 27.719008] common_interrupt+0xb4/0xd0 [ 27.719289] </IRQ> [ 27.719446] asm_common_interrupt+0x1e/0x40 [ 27.719747] RIP: 0010:native_safe_halt+0x17/0x20 [ 27.720084] Code: 07 0f 00 2d 8b ee 4c 00 f4 5d c3 0f 1f 84 00 00 00 00 00 8b 05 72 95 17 02 55 48 89 e5 85 c0 7e 07 0f 00 2d 6b ee 4c 00 fb f4 <5d> c3 cc cc cc cc cc cc cc 55 48 89 e5 e8 67 53 ff ff 8b 0d 29 f6 [ 27.721386] RSP: 0018:ffffc9000008fe90 EFLAGS: 00000246 [ 27.721758] RAX: 0000000000000000 RBX: 0000000000000002 RCX: 0000000000000000 [ 27.722262] RDX: 0000000000000000 RSI: ffffffff85f7c054 RDI: ffffffff85ded4e6 [ 27.722770] RBP: ffffc9000008fe90 R08: 0000000000000001 R09: 0000000000000001 [ 27.723277] R10: 0000000000000000 R11: 0000000000000001 R12: ffffffff86a75408 [ 27.723781] R13: 0000000000000000 R14: 0000000000000000 R15: ffff888100260000 [ 27.724289] default_idle+0x9/0x10 [ 27.724537] arch_cpu_idle+0xa/0x10 [ 27.724791] default_idle_call+0x6e/0x250 [ 27.725082] do_idle+0x1f0/0x2d0 [ 27.725326] cpu_startup_entry+0x18/0x20 [ 27.725613] start_secondary+0x11f/0x160 [ 27.725902] secondary_startup_64_no_verify+0xb0/0xbb [ 27.726272] BUG: kernel NULL pointer dereference, address: 0000000000000002 [ 27.726768] #PF: supervisor read access in kernel mode [ 27.727138] #PF: error_code(0x0000) - not-present page [ 27.727507] PGD 8000000118688067 P4D 8000000118688067 PUD 10feab067 PMD 0 [ 27.727999] Oops: 0000 [#1] PREEMPT SMP PTI [ 27.728302] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 5.12.4-g70e7f0549188-dirty torvalds#169 [ 27.728861] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014 [ 27.729660] RIP: 
0010:netup_i2c_interrupt+0x23/0x310 [ 27.730019] Code: 0f 1f 80 00 00 00 00 55 48 89 e5 41 55 41 54 53 48 89 fb e8 af 6e 95 fd 48 89 df e8 e7 9f 1c 01 49 89 c5 48 8b 83 48 08 00 00 <66> 44 8b 60 02 44 89 e0 48 8b 93 48 08 00 00 83 e0 f8 66 89 42 02 [ 27.731339] RSP: 0018:ffffc90000118e90 EFLAGS: 00010046 [ 27.731716] RAX: 0000000000000000 RBX: ffff88810803c4d8 RCX: 0000000000000000 [ 27.732223] RDX: 0000000000000001 RSI: ffffffff85d37b94 RDI: ffff88810803c4d8 [ 27.732727] RBP: ffffc90000118ea8 R08: 0000000000000000 R09: 0000000000000001 [ 27.733239] R10: ffff88810803c4f0 R11: 61646e6f63657320 R12: 0000000000000000 [ 27.733745] R13: 0000000000000046 R14: ffff888101041000 R15: ffff8881081b2400 [ 27.734251] FS: 0000000000000000(0000) GS:ffff88817bc80000(0000) knlGS:0000000000000000 [ 27.734821] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 27.735228] CR2: 0000000000000002 CR3: 0000000108194000 CR4: 00000000000006e0 [ 27.735735] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 27.736241] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 27.736744] Call Trace: [ 27.736924] <IRQ> [ 27.737074] netup_unidvb_isr+0xd3/0x160 [ 27.737363] __handle_irq_event_percpu+0x53/0x3e0 [ 27.737706] handle_irq_event_percpu+0x35/0x90 [ 27.738028] handle_irq_event+0x39/0x60 [ 27.738306] handle_fasteoi_irq+0xc2/0x1d0 [ 27.738602] __common_interrupt+0x7f/0x150 [ 27.738899] common_interrupt+0xb4/0xd0 [ 27.739176] </IRQ> [ 27.739331] asm_common_interrupt+0x1e/0x40 [ 27.739633] RIP: 0010:native_safe_halt+0x17/0x20 [ 27.739967] Code: 07 0f 00 2d 8b ee 4c 00 f4 5d c3 0f 1f 84 00 00 00 00 00 8b 05 72 95 17 02 55 48 89 e5 85 c0 7e 07 0f 00 2d 6b ee 4c 00 fb f4 <5d> c3 cc cc cc cc cc cc cc 55 48 89 e5 e8 67 53 ff ff 8b 0d 29 f6 [ 27.741275] RSP: 0018:ffffc9000008fe90 EFLAGS: 00000246 [ 27.741647] RAX: 0000000000000000 RBX: 0000000000000002 RCX: 0000000000000000 [ 27.742148] RDX: 0000000000000000 RSI: ffffffff85f7c054 RDI: ffffffff85ded4e6 [ 27.742652] RBP: 
ffffc9000008fe90 R08: 0000000000000001 R09: 0000000000000001 [ 27.743154] R10: 0000000000000000 R11: 0000000000000001 R12: ffffffff86a75408 [ 27.743652] R13: 0000000000000000 R14: 0000000000000000 R15: ffff888100260000 [ 27.744157] default_idle+0x9/0x10 [ 27.744405] arch_cpu_idle+0xa/0x10 [ 27.744658] default_idle_call+0x6e/0x250 [ 27.744948] do_idle+0x1f0/0x2d0 [ 27.745190] cpu_startup_entry+0x18/0x20 [ 27.745475] start_secondary+0x11f/0x160 [ 27.745761] secondary_startup_64_no_verify+0xb0/0xbb [ 27.746123] Modules linked in: [ 27.746348] Dumping ftrace buffer: [ 27.746596] (ftrace buffer empty) [ 27.746852] CR2: 0000000000000002 [ 27.747094] ---[ end trace ebafd46f83ab946d ]--- [ 27.747424] RIP: 0010:netup_i2c_interrupt+0x23/0x310 [ 27.747778] Code: 0f 1f 80 00 00 00 00 55 48 89 e5 41 55 41 54 53 48 89 fb e8 af 6e 95 fd 48 89 df e8 e7 9f 1c 01 49 89 c5 48 8b 83 48 08 00 00 <66> 44 8b 60 02 44 89 e0 48 8b 93 48 08 00 00 83 e0 f8 66 89 42 02 [ 27.749082] RSP: 0018:ffffc90000118e90 EFLAGS: 00010046 [ 27.749461] RAX: 0000000000000000 RBX: ffff88810803c4d8 RCX: 0000000000000000 [ 27.749966] RDX: 0000000000000001 RSI: ffffffff85d37b94 RDI: ffff88810803c4d8 [ 27.750471] RBP: ffffc90000118ea8 R08: 0000000000000000 R09: 0000000000000001 [ 27.750976] R10: ffff88810803c4f0 R11: 61646e6f63657320 R12: 0000000000000000 [ 27.751480] R13: 0000000000000046 R14: ffff888101041000 R15: ffff8881081b2400 [ 27.751986] FS: 0000000000000000(0000) GS:ffff88817bc80000(0000) knlGS:0000000000000000 [ 27.752560] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 27.752970] CR2: 0000000000000002 CR3: 0000000108194000 CR4: 00000000000006e0 [ 27.753481] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 27.753984] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 27.754487] Kernel panic - not syncing: Fatal exception in interrupt [ 27.755033] Dumping ftrace buffer: [ 27.755279] (ftrace buffer empty) [ 27.755534] Kernel Offset: disabled [ 27.755785] Rebooting in 
1 seconds.. Signed-off-by: Zheyu Ma <zheyuma97@gmail.com> Signed-off-by: Sean Young <sean@mess.org> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
ghost
pushed a commit
that referenced
this pull request
Nov 23, 2021
[ Upstream commit b220c15 ]

Coverity complains of a possible NULL dereference:

CID 120718 (#1 of 1): Dereference null return value (NULL_RETURNS)
23. dereference: Dereferencing a pointer that might be NULL state->bos when calling msm_gpu_crashstate_get_bo. [show details]
301                        msm_gpu_crashstate_get_bo(state, submit->bos[i].obj,
302                                submit->bos[i].iova, submit->bos[i].flags);

Fix this by employing the same state->bos NULL check as is used in the next for loop.

Cc: Rob Clark <robdclark@gmail.com>
Cc: Sean Paul <sean@poorly.run>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: linux-arm-msm@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Cc: freedreno@lists.freedesktop.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Link: https://lore.kernel.org/r/20210929162554.14295-1-tim.gardner@canonical.com
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
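The pattern described in this commit message, guarding a loop with a NULL check on a buffer whose allocation may have failed, can be sketched in plain userspace C roughly as follows. This is only an illustration under assumptions: struct snapshot, snapshot_capture() and the simulate_alloc_failure parameter are hypothetical and do not come from the msm driver.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a crash-state object with an optional array. */
struct snapshot {
	int *bos;
	size_t nr_bos;
};

/*
 * Copy n entries from src into a freshly allocated array.
 * The loop condition checks state->bos first, so a failed allocation
 * results in zero recorded entries instead of a NULL dereference.
 * simulate_alloc_failure exists only so the failure path can be exercised.
 */
size_t snapshot_capture(struct snapshot *state, const int *src, size_t n,
			int simulate_alloc_failure)
{
	size_t i;

	state->nr_bos = 0;
	state->bos = simulate_alloc_failure ? NULL
					    : calloc(n, sizeof(*state->bos));

	for (i = 0; state->bos && i < n; i++) {
		state->bos[i] = src[i];
		state->nr_bos++;
	}
	return state->nr_bos;
}
```

Putting the check in the loop condition keeps both loops over the same array symmetric, which is the property the commit message asks for.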
ghost
pushed a commit
that referenced
this pull request
Nov 23, 2021
[ Upstream commit 39fbef4 ]

The following kernel crash can be triggered:

[ 89.266592] ------------[ cut here ]------------
[ 89.267427] kernel BUG at fs/buffer.c:3020!
[ 89.268264] invalid opcode: 0000 [#1] SMP KASAN PTI
[ 89.269116] CPU: 7 PID: 1750 Comm: kmmpd-loop0 Not tainted 5.10.0-862.14.0.6.x86_64-08610-gc932cda3cef4-dirty #20
[ 89.273169] RIP: 0010:submit_bh_wbc.isra.0+0x538/0x6d0
[ 89.277157] RSP: 0018:ffff888105ddfd08 EFLAGS: 00010246
[ 89.278093] RAX: 0000000000000005 RBX: ffff888124231498 RCX: ffffffffb2772612
[ 89.279332] RDX: 1ffff11024846293 RSI: 0000000000000008 RDI: ffff888124231498
[ 89.280591] RBP: ffff8881248cc000 R08: 0000000000000001 R09: ffffed1024846294
[ 89.281851] R10: ffff88812423149f R11: ffffed1024846293 R12: 0000000000003800
[ 89.283095] R13: 0000000000000001 R14: 0000000000000000 R15: ffff8881161f7000
[ 89.284342] FS: 0000000000000000(0000) GS:ffff88839b5c0000(0000) knlGS:0000000000000000
[ 89.285711] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 89.286701] CR2: 00007f166ebc01a0 CR3: 0000000435c0e000 CR4: 00000000000006e0
[ 89.287919] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 89.289138] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 89.290368] Call Trace:
[ 89.290842] write_mmp_block+0x2ca/0x510
[ 89.292218] kmmpd+0x433/0x9a0
[ 89.294902] kthread+0x2dd/0x3e0
[ 89.296268] ret_from_fork+0x22/0x30
[ 89.296906] Modules linked in:

by running the following commands:

1. mkfs.ext4 -O mmp /dev/sda -b 1024
2. mount /dev/sda /home/test
3. echo "/dev/sda" > /sys/power/resume

That happens because swsusp_check() calls set_blocksize() on the target partition which confuses the file system:

Thread1                         Thread2
mount /dev/sda /home/test
get s_mmp_bh --> has mapped flag
start kmmpd thread
                                echo "/dev/sda" > /sys/power/resume
                                  resume_store
                                    software_resume
                                      swsusp_check
                                        set_blocksize
                                          truncate_inode_pages_range
                                            truncate_cleanup_page
                                              block_invalidatepage
                                                discard_buffer --> clean mapped flag
write_mmp_block
  submit_bh
    submit_bh_wbc
      BUG_ON(!buffer_mapped(bh))

To address this issue, modify swsusp_check() to open the target block device with exclusive access.

Signed-off-by: Ye Bin <yebin10@huawei.com>
[ rjw: Subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
ghost
pushed a commit
that referenced
this pull request
Nov 23, 2021
[ Upstream commit 8ef9dc0 ] We got the following lockdep splat while running fstests (specifically btrfs/003 and btrfs/020 in a row) with the new rc. This was uncovered by 87579e9 ("loop: use worker per cgroup instead of kworker") which converted loop to using workqueues, which comes with lockdep annotations that don't exist with kworkers. The lockdep splat is as follows: WARNING: possible circular locking dependency detected 5.14.0-rc2-custom+ torvalds#34 Not tainted ------------------------------------------------------ losetup/156417 is trying to acquire lock: ffff9c7645b02d38 ((wq_completion)loop0){+.+.}-{0:0}, at: flush_workqueue+0x84/0x600 but task is already holding lock: ffff9c7647395468 (&lo->lo_mutex){+.+.}-{3:3}, at: __loop_clr_fd+0x41/0x650 [loop] which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #5 (&lo->lo_mutex){+.+.}-{3:3}: __mutex_lock+0xba/0x7c0 lo_open+0x28/0x60 [loop] blkdev_get_whole+0x28/0xf0 blkdev_get_by_dev.part.0+0x168/0x3c0 blkdev_open+0xd2/0xe0 do_dentry_open+0x163/0x3a0 path_openat+0x74d/0xa40 do_filp_open+0x9c/0x140 do_sys_openat2+0xb1/0x170 __x64_sys_openat+0x54/0x90 do_syscall_64+0x3b/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae -> #4 (&disk->open_mutex){+.+.}-{3:3}: __mutex_lock+0xba/0x7c0 blkdev_get_by_dev.part.0+0xd1/0x3c0 blkdev_get_by_path+0xc0/0xd0 btrfs_scan_one_device+0x52/0x1f0 [btrfs] btrfs_control_ioctl+0xac/0x170 [btrfs] __x64_sys_ioctl+0x83/0xb0 do_syscall_64+0x3b/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae -> #3 (uuid_mutex){+.+.}-{3:3}: __mutex_lock+0xba/0x7c0 btrfs_rm_device+0x48/0x6a0 [btrfs] btrfs_ioctl+0x2d1c/0x3110 [btrfs] __x64_sys_ioctl+0x83/0xb0 do_syscall_64+0x3b/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae -> #2 (sb_writers#11){.+.+}-{0:0}: lo_write_bvec+0x112/0x290 [loop] loop_process_work+0x25f/0xcb0 [loop] process_one_work+0x28f/0x5d0 worker_thread+0x55/0x3c0 kthread+0x140/0x170 ret_from_fork+0x22/0x30 -> #1 
((work_completion)(&lo->rootcg_work)){+.+.}-{0:0}: process_one_work+0x266/0x5d0 worker_thread+0x55/0x3c0 kthread+0x140/0x170 ret_from_fork+0x22/0x30 -> #0 ((wq_completion)loop0){+.+.}-{0:0}: __lock_acquire+0x1130/0x1dc0 lock_acquire+0xf5/0x320 flush_workqueue+0xae/0x600 drain_workqueue+0xa0/0x110 destroy_workqueue+0x36/0x250 __loop_clr_fd+0x9a/0x650 [loop] lo_ioctl+0x29d/0x780 [loop] block_ioctl+0x3f/0x50 __x64_sys_ioctl+0x83/0xb0 do_syscall_64+0x3b/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae other info that might help us debug this: Chain exists of: (wq_completion)loop0 --> &disk->open_mutex --> &lo->lo_mutex Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&lo->lo_mutex); lock(&disk->open_mutex); lock(&lo->lo_mutex); lock((wq_completion)loop0); *** DEADLOCK *** 1 lock held by losetup/156417: #0: ffff9c7647395468 (&lo->lo_mutex){+.+.}-{3:3}, at: __loop_clr_fd+0x41/0x650 [loop] stack backtrace: CPU: 8 PID: 156417 Comm: losetup Not tainted 5.14.0-rc2-custom+ torvalds#34 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015 Call Trace: dump_stack_lvl+0x57/0x72 check_noncircular+0x10a/0x120 __lock_acquire+0x1130/0x1dc0 lock_acquire+0xf5/0x320 ? flush_workqueue+0x84/0x600 flush_workqueue+0xae/0x600 ? flush_workqueue+0x84/0x600 drain_workqueue+0xa0/0x110 destroy_workqueue+0x36/0x250 __loop_clr_fd+0x9a/0x650 [loop] lo_ioctl+0x29d/0x780 [loop] ? __lock_acquire+0x3a0/0x1dc0 ? update_dl_rq_load_avg+0x152/0x360 ? lock_is_held_type+0xa5/0x120 ? find_held_lock.constprop.0+0x2b/0x80 block_ioctl+0x3f/0x50 __x64_sys_ioctl+0x83/0xb0 do_syscall_64+0x3b/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7f645884de6b Usually the uuid_mutex exists to protect the fs_devices that map together all of the devices that match a specific uuid. In rm_device we're messing with the uuid of a device, so it makes sense to protect that here. 
However in doing that it pulls in a whole host of lockdep dependencies, as we call mnt_may_write() on the sb before we grab the uuid_mutex, thus we end up with the dependency chain under the uuid_mutex being added under the normal sb write dependency chain, which causes problems with loop devices. We don't need the uuid mutex here however. If we call btrfs_scan_one_device() before we scratch the super block we will find the fs_devices and not find the device itself and return EBUSY because the fs_devices is open. If we call it after the scratch happens it will not appear to be a valid btrfs file system. We do not need to worry about other fs_devices modifying operations here because we're protected by the exclusive operations locking. So drop the uuid_mutex here in order to fix the lockdep splat. A more detailed explanation from the discussion: We are worried about rm and scan racing with each other, before this change we'll zero the device out under the UUID mutex so when scan does run it'll make sure that it can go through the whole device scan thing without rm messing with us. We aren't worried if the scratch happens first, because the result is we don't think this is a btrfs device and we bail out. The only case we are concerned with is we scratch _after_ scan is able to read the superblock and gets a seemingly valid super block, so lets consider this case. Scan will call device_list_add() with the device we're removing. We'll call find_fsid_with_metadata_uuid() and get our fs_devices for this UUID. At this point we lock the fs_devices->device_list_mutex. This is what protects us in this case, but we have two cases here. 1. We aren't to the device removal part of the RM. We found our device, and device name matches our path, we go down and we set total_devices to our super number of devices, which doesn't affect anything because we haven't done the remove yet. 2. We are past the device removal part, which is protected by the device_list_mutex. 
Scan doesn't find the device, it goes down and does the if (fs_devices->opened) return -EBUSY; check and we bail out. Nothing about this situation is ideal, but the lockdep splat is real, and the fix is safe, tho admittedly a bit scary looking. Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> [ copy more from the discussion ] Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
ghost
pushed a commit
that referenced
this pull request
Nov 23, 2021
…eam state

[ Upstream commit b7b1d02 ]

The internal stream state sets the timeout to 120 seconds 2 seconds after the creation of the flow; attach this internal stream state to the IPS_ASSURED flag for consistent event reporting.

Before this patch:

[NEW] udp 17 30 src=10.246.11.13 dst=216.239.35.0 sport=37282 dport=123 [UNREPLIED] src=216.239.35.0 dst=10.246.11.13 sport=123 dport=37282
[UPDATE] udp 17 30 src=10.246.11.13 dst=216.239.35.0 sport=37282 dport=123 src=216.239.35.0 dst=10.246.11.13 sport=123 dport=37282
[UPDATE] udp 17 30 src=10.246.11.13 dst=216.239.35.0 sport=37282 dport=123 src=216.239.35.0 dst=10.246.11.13 sport=123 dport=37282 [ASSURED]
[DESTROY] udp 17 src=10.246.11.13 dst=216.239.35.0 sport=37282 dport=123 src=216.239.35.0 dst=10.246.11.13 sport=123 dport=37282 [ASSURED]

Note that IPS_ASSURED is set for a flow that is not yet in the internal stream state.

After this update:

[NEW] udp 17 30 src=10.246.11.13 dst=216.239.35.0 sport=37282 dport=123 [UNREPLIED] src=216.239.35.0 dst=10.246.11.13 sport=123 dport=37282
[UPDATE] udp 17 30 src=10.246.11.13 dst=216.239.35.0 sport=37282 dport=123 src=216.239.35.0 dst=10.246.11.13 sport=123 dport=37282
[UPDATE] udp 17 120 src=10.246.11.13 dst=216.239.35.0 sport=37282 dport=123 src=216.239.35.0 dst=10.246.11.13 sport=123 dport=37282 [ASSURED]
[DESTROY] udp 17 src=10.246.11.13 dst=216.239.35.0 sport=37282 dport=123 src=216.239.35.0 dst=10.246.11.13 sport=123 dport=37282 [ASSURED]

Before this patch, short-lived UDP flows never entered IPS_ASSURED, so they were already candidates for deletion by early_drop under stress.

Before this patch, IPS_ASSURED was set regardless of the internal stream state; attach this internal stream state to IPS_ASSURED.

packet #1 (original direction) enters NEW state
packet #2 (reply direction) enters ESTABLISHED state, sets IPS_SEEN_REPLY
packet #3 (any direction) sets IPS_ASSURED (if 2 seconds have passed since the creation).
Reported-by: Maciej Żenczykowski <zenczykowski@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
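The three-packet sequence described in this commit message can be modelled with a small userspace state machine. The sketch below is an assumption-laden illustration, not the netfilter implementation: struct flow, flow_init() and flow_packet() are hypothetical names, time is a plain seconds counter, and the "2 seconds have passed" test is written as >= for simplicity. The 30 and 120 second timeouts are taken from the event output quoted above.

```c
#include <stdbool.h>

#define UNREPLIED_TIMEOUT 30   /* seconds, as seen in the [NEW] event */
#define STREAM_TIMEOUT    120  /* seconds, as seen in the final [UPDATE] */
#define STREAM_DELAY      2    /* seconds after creation before "stream" */

/* Hypothetical, simplified model of a UDP conntrack entry. */
struct flow {
	unsigned long created;  /* creation time in seconds */
	bool seen_reply;        /* models IPS_SEEN_REPLY */
	bool assured;           /* models IPS_ASSURED */
	unsigned int timeout;   /* current timeout in seconds */
};

void flow_init(struct flow *f, unsigned long now)
{
	f->created = now;
	f->seen_reply = false;
	f->assured = false;
	f->timeout = UNREPLIED_TIMEOUT;
}

/*
 * Process one packet. The assured flag is tied to the stream state:
 * it is only set once a reply has been seen AND the flow is at least
 * STREAM_DELAY seconds old, at which point the timeout is also raised.
 */
void flow_packet(struct flow *f, unsigned long now, bool is_reply)
{
	if (f->seen_reply) {
		if (now - f->created >= STREAM_DELAY) {
			f->timeout = STREAM_TIMEOUT;
			f->assured = true;
		} else {
			f->timeout = UNREPLIED_TIMEOUT;
		}
	} else {
		f->timeout = UNREPLIED_TIMEOUT;
		if (is_reply)
			f->seen_reply = true;
	}
}
```

In this model a short-lived flow that exchanges a few packets within the first two seconds never becomes assured, which matches the "after this update" event listing where [ASSURED] appears together with the 120 second timeout.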
ghost
pushed a commit
that referenced
this pull request
Nov 23, 2021
[ Upstream commit 4ef0c5c ]

There is a small race between copy_process() and sched_fork() where child->sched_task_group points to an already freed pointer.

        parent doing fork()      | someone moving the parent
                                 | to another cgroup
  -------------------------------+-------------------------------
  copy_process()
      + dup_task_struct()<1>
                                  parent move to another cgroup,
                                  and free the old cgroup. <2>
      + sched_fork()
        + __set_task_cpu()<3>
        + task_fork_fair()
          + sched_slice()<4>

In the worst case, this bug can lead to a use-after-free and cause a panic as shown below:

  (1) parent copies its sched_task_group to child at <1>;

  (2) someone moves the parent to another cgroup and frees the old cgroup at <2>;

  (3) the sched_task_group and cfs_rq that belong to the old cgroup will be accessed at <3> and <4>, which causes a panic:

  [] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
  [] PGD 8000001fa0a86067 P4D 8000001fa0a86067 PUD 2029955067 PMD 0
  [] Oops: 0000 [#1] SMP PTI
  [] CPU: 7 PID: 648398 Comm: ebizzy Kdump: loaded Tainted: G OE --------- - - 4.18.0.x86_64+ #1
  [] RIP: 0010:sched_slice+0x84/0xc0
  [] Call Trace:
  [] task_fork_fair+0x81/0x120
  [] sched_fork+0x132/0x240
  [] copy_process.part.5+0x675/0x20e0
  [] ? __handle_mm_fault+0x63f/0x690
  [] _do_fork+0xcd/0x3b0
  [] do_syscall_64+0x5d/0x1d0
  [] entry_SYSCALL_64_after_hwframe+0x65/0xca
  [] RIP: 0033:0x7f04418cd7e1

Between cgroup_can_fork() and cgroup_post_fork(), the cgroup membership and thus sched_task_group can't change. So update the child's sched_task_group at sched_post_fork() and move task_fork() and __set_task_cpu() (which access the sched_task_group) from sched_fork() to sched_post_fork().

Fixes: 8323f26 ("sched: Fix race in task_group")
Signed-off-by: Zhang Qiao <zhangqiao22@huawei.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lkml.kernel.org/r/20210915064030.2231-1-zhangqiao22@huawei.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
ghost
pushed a commit
that referenced
this pull request
Nov 23, 2021
[ Upstream commit 6d0d1b5 ]

If the device used as a serial console gets detached/attached at runtime, register_console() will try to call imx_uart_console_setup(), but this is not possible since it is marked as __init.

For instance

  # cat /sys/devices/virtual/tty/console/active
  tty1 ttymxc0
  # echo -n N > /sys/devices/virtual/tty/console/subsystem/ttymxc0/console
  # echo -n Y > /sys/devices/virtual/tty/console/subsystem/ttymxc0/console

[ 73.166649] 8<--- cut here ---
[ 73.167005] Unable to handle kernel paging request at virtual address c154d928
[ 73.167601] pgd = 55433e84
[ 73.167875] [c154d928] *pgd=8141941e(bad)
[ 73.168304] Internal error: Oops: 8000000d [#1] SMP ARM
[ 73.168429] Modules linked in:
[ 73.168522] CPU: 0 PID: 536 Comm: sh Not tainted 5.15.0-rc6-00056-g3968ddcf05fb #3
[ 73.168675] Hardware name: Freescale i.MX6 Ultralite (Device Tree)
[ 73.168791] PC is at imx_uart_console_setup+0x0/0x238
[ 73.168927] LR is at try_enable_new_console+0x98/0x124
[ 73.169056] pc : [<c154d928>] lr : [<c0196f44>] psr: a0000013
[ 73.169178] sp : c2ef5e70 ip : 00000000 fp : 00000000
[ 73.169281] r10: 00000000 r9 : c02cf970 r8 : 00000000
[ 73.169389] r7 : 00000001 r6 : 00000001 r5 : c1760164 r4 : c1e0fb08
[ 73.169512] r3 : c154d928 r2 : 00000000 r1 : efffcbd r0 : c1760164
[ 73.169641] Flags: NzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
[ 73.169782] Control: 10c5387d Table: 8345406a DAC: 00000051
[ 73.169895] Register r0 information: non-slab/vmalloc memory
[ 73.170032] Register r1 information: non-slab/vmalloc memory
[ 73.170158] Register r2 information: NULL pointer
[ 73.170273] Register r3 information: non-slab/vmalloc memory
[ 73.170397] Register r4 information: non-slab/vmalloc memory
[ 73.170521] Register r5 information: non-slab/vmalloc memory
[ 73.170647] Register r6 information: non-paged memory
[ 73.170771] Register r7 information: non-paged memory
[ 73.170892] Register r8 information: NULL pointer
[ 73.171009] Register r9 information: non-slab/vmalloc memory
[ 73.171142] Register r10 information: NULL pointer
[ 73.171259] Register r11 information: NULL pointer
[ 73.171375] Register r12 information: NULL pointer
[ 73.171494] Process sh (pid: 536, stack limit = 0xcd1ba82f)
[ 73.171621] Stack: (0xc2ef5e70 to 0xc2ef6000)
[ 73.171731] 5e60: ???????? ???????? ???????? ????????
[ 73.171899] 5e80: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[ 73.172059] 5ea0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[ 73.172217] 5ec0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[ 73.172377] 5ee0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[ 73.172537] 5f00: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[ 73.172698] 5f20: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[ 73.172856] 5f40: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[ 73.173016] 5f60: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[ 73.173177] 5f80: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[ 73.173336] 5fa0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[ 73.173496] 5fc0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[ 73.173654] 5fe0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[ 73.173826] [<c0196f44>] (try_enable_new_console) from [<c01984a8>] (register_console+0x10c/0x2ec)
[ 73.174053] [<c01984a8>] (register_console) from [<c06e2c90>] (console_store+0x14c/0x168)
[ 73.174262] [<c06e2c90>] (console_store) from [<c0383718>] (kernfs_fop_write_iter+0x110/0x1cc)
[ 73.174470] [<c0383718>] (kernfs_fop_write_iter) from [<c02cf5f4>] (vfs_write+0x31c/0x548)
[ 73.174679] [<c02cf5f4>] (vfs_write) from [<c02cf970>] (ksys_write+0x60/0xec)
[ 73.174863] [<c02cf970>] (ksys_write) from [<c0100080>] (ret_fast_syscall+0x0/0x1c)
[ 73.175052] Exception stack(0xc2ef5fa8 to 0xc2ef5ff0)
[ 73.175167] 5fa0: ???????? ???????? ???????? ???????? ???????? ????????
[ 73.175327] 5fc0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[ 73.175486] 5fe0: ???????? ???????? ???????? ????????
[ 73.175608] Code: 00000000 00000000 00000000 00000000 (00000000)
[ 73.175744] ---[ end trace 9b75121265109bf1 ]---

A similar issue could be triggered by unbinding/binding the serial console device [*].

Drop __init so that imx_uart_console_setup() can be safely called at runtime.

[*] https://lore.kernel.org/all/20181114174940.7865-3-stefan@agner.ch/

Fixes: a3cb39d ("serial: core: Allow detach and attach serial device for console")
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Stefan Agner <stefan@agner.ch>
Signed-off-by: Francesco Dolcini <francesco.dolcini@toradex.com>
Link: https://lore.kernel.org/r/20211020192643.476895-2-francesco.dolcini@toradex.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
ghost
pushed a commit
that referenced
this pull request
Nov 23, 2021
[ Upstream commit f0caea8 ]

Olga reports seeing the following Oops when doing O_DIRECT writes to a pNFS flexfiles server:

Oops: 0000 [#1] SMP PTI
CPU: 1 PID: 234186 Comm: kworker/u8:1 Not tainted 5.15.0-rc4+ #4
Hardware name: Red Hat KVM/RHEL-AV, BIOS 1.13.0-2.module+el8.3.0+7353+9de0a3cc 04/01/2014
Workqueue: nfsiod rpc_async_release [sunrpc]
RIP: 0010:nfs_mark_request_commit+0x12/0x30 [nfs]
Code: ff ff be 03 00 00 00 e8 ac 34 83 eb e9 29 ff ff ff e8 22 bc d7 eb 66 90 0f 1f 44 00 00 48 85 f6 74 16 48 8b 42 10 48 8b 40 18 <48> 8b 40 18 48 85 c0 74 05 e9 70 fc 15 ec 48 89 d6 e9 68 ed ff ff
RSP: 0018:ffffa82f0159fe00 EFLAGS: 00010286
RAX: 0000000000000000 RBX: ffff8f3393141880 RCX: 0000000000000000
RDX: ffffa82f0159fe08 RSI: ffff8f3381252500 RDI: ffff8f3393141880
RBP: ffff8f33ac317c00 R08: 0000000000000000 R09: ffff8f3487724cb0
R10: 0000000000000008 R11: 0000000000000001 R12: 0000000000000001
R13: ffff8f3485bccee0 R14: ffff8f33ac317c10 R15: ffff8f33ac317cd8
FS: 0000000000000000(0000) GS:ffff8f34fbc80000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000018 CR3: 0000000122120006 CR4: 0000000000770ee0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
nfs_direct_write_completion+0x13b/0x250 [nfs]
rpc_free_task+0x39/0x60 [sunrpc]
rpc_async_release+0x29/0x40 [sunrpc]
process_one_work+0x1ce/0x370
worker_thread+0x30/0x380
? process_one_work+0x370/0x370
kthread+0x11a/0x140
? set_kthread_struct+0x40/0x40
ret_from_fork+0x22/0x30

Reported-by: Olga Kornievskaia <aglo@umich.edu>
Fixes: 9c455a8 ("NFS/pNFS: Clean up pNFS commit operations")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
ghost
pushed a commit
that referenced
this pull request
Nov 23, 2021
[ Upstream commit c98c5da ]

The current code performs list element deletion and addition inconsistently, both inside and outside lock protection. This patch moves the deletion under the lock.

list_add double add: new=ffff9130b5eb89f8, prev=ffff9130b5eb89f8, next=ffff9130c6a715f0.
------------[ cut here ]------------
kernel BUG at lib/list_debug.c:31!
invalid opcode: 0000 [#1] SMP PTI
CPU: 1 PID: 182395 Comm: kworker/1:37 Kdump: loaded Tainted: G W OE --------- - - 4.18.0-193.el8.x86_64 #1
Hardware name: HP ProLiant DL160 Gen8, BIOS J03 02/10/2014
Workqueue: qla2xxx_wq qla2x00_iocb_work_fn [qla2xxx]
RIP: 0010:__list_add_valid+0x41/0x50
Code: 85 94 00 00 00 48 39 c7 74 0b 48 39 d7 74 06 b8 01 00 00 00 c3 48 89 f2 4c 89 c1 48 89 fe 48 c7 c7 60 83 ad 97 e8 4d bd ce ff <0f> 0b 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 48 8b 07 48 8b 57 08
RSP: 0018:ffffaba306f47d68 EFLAGS: 00010046
RAX: 0000000000000058 RBX: ffff9130b5eb8800 RCX: 0000000000000006
RDX: 0000000000000000 RSI: 0000000000000096 RDI: ffff9130b7456a00
RBP: ffff9130c6a70a58 R08: 000000000008d7be R09: 0000000000000001
R10: 0000000000000000 R11: 0000000000000001 R12: ffff9130c6a715f0
R13: ffff9130b5eb8824 R14: ffff9130b5eb89f8 R15: ffff9130b5eb89f8
FS: 0000000000000000(0000) GS:ffff9130b7440000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007efcaaef11a0 CR3: 000000005200a002 CR4: 00000000000606e0
Call Trace:
qla24xx_async_gnl+0x113/0x3c0 [qla2xxx]
? qla2x00_iocb_work_fn+0x53/0x80 [qla2xxx]
? process_one_work+0x1a7/0x3b0
? worker_thread+0x30/0x390
? create_worker+0x1a0/0x1a0
? kthread+0x112/0x130

Link: https://lore.kernel.org/r/20211026115412.27691-3-njavali@marvell.com
Fixes: 726b854 ("qla2xxx: Add framework for async fabric discovery")
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
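The locking rule in this commit message can be illustrated with a small userspace sketch. It is not the qla2xxx code: `struct work_lists`, `work_queue_pending()` and `work_activate()` are hypothetical names, and a pthread mutex stands in for whatever lock the driver actually uses. The point is only that unlinking an entry from one list and linking it onto another happen inside the same critical section, so no other thread can observe or re-add a half-moved entry.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the qla2xxx structures. */
struct node {
    struct node *prev, *next;
};

struct work_lists {
    pthread_mutex_t lock;   /* protects both lists below */
    struct node pending;    /* list head */
    struct node active;     /* list head */
};

/* Make a node (or list head) point at itself, i.e. "not on any list". */
static void list_init(struct node *n)
{
    n->prev = n;
    n->next = n;
}

/* Unlink a node and leave it in the self-linked state. */
static void list_unlink(struct node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    list_init(n);
}

/* Append a node; adding a node that is still linked is a bug. */
static void list_add_tail(struct node *n, struct node *head)
{
    assert(n->next == n && n->prev == n);
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

static size_t list_len(const struct node *head)
{
    size_t len = 0;
    for (const struct node *p = head->next; p != head; p = p->next)
        len++;
    return len;
}

static void work_lists_init(struct work_lists *w)
{
    pthread_mutex_init(&w->lock, NULL);
    list_init(&w->pending);
    list_init(&w->active);
}

/* Queue an initialized, unlinked node on the pending list. */
static void work_queue_pending(struct work_lists *w, struct node *n)
{
    pthread_mutex_lock(&w->lock);
    list_add_tail(n, &w->pending);
    pthread_mutex_unlock(&w->lock);
}

/* Move a node to the active list: deletion AND addition under the lock. */
static void work_activate(struct work_lists *w, struct node *n)
{
    pthread_mutex_lock(&w->lock);
    list_unlink(n);
    list_add_tail(n, &w->active);
    pthread_mutex_unlock(&w->lock);
}
```

If `list_unlink()` were called before taking the lock, two workers could race on the same entry, which is the kind of sequence that a list debug check such as the "list_add double add" report above is designed to catch.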
ghost
pushed a commit
that referenced
this pull request
Nov 23, 2021
[ Upstream commit e140c79 ]

Fully configuring VLANs for a VF and then unloading the VF while triggering a reset of the PF causes a kernel crash, because the IRQ is already uninitialized.

[ 293.177579] ------------[ cut here ]------------
[ 293.183502] kernel BUG at drivers/pci/msi.c:352!
[ 293.189547] Internal error: Oops - BUG: 0 [#1] SMP
......
[ 293.390124] Workqueue: hclgevf hclgevf_service_task [hclgevf]
[ 293.402627] pstate: 80c00009 (Nzcv daif +PAN +UAO)
[ 293.414324] pc : free_msi_irqs+0x19c/0x1b8
[ 293.425429] lr : free_msi_irqs+0x18c/0x1b8
[ 293.436545] sp : ffff00002716fbb0
[ 293.446950] x29: ffff00002716fbb0 x28: 0000000000000000
[ 293.459519] x27: 0000000000000000 x26: ffff45b91ea16b00
[ 293.472183] x25: 0000000000000000 x24: ffffa587b08f4700
[ 293.484717] x23: ffffc591ac30e000 x22: ffffa587b08f8428
[ 293.497190] x21: ffffc591ac30e300 x20: 0000000000000000
[ 293.509594] x19: ffffa58a062a8300 x18: 0000000000000000
[ 293.521949] x17: 0000000000000000 x16: ffff45b91dcc3f48
[ 293.534013] x15: 0000000000000000 x14: 0000000000000000
[ 293.545883] x13: 0000000000000040 x12: 0000000000000228
[ 293.557508] x11: 0000000000000020 x10: 0000000000000040
[ 293.568889] x9 : ffff45b91ea1e190 x8 : ffffc591802d0000
[ 293.580123] x7 : ffffc591802d0148 x6 : 0000000000000120
[ 293.591190] x5 : ffffc591802d0000 x4 : 0000000000000000
[ 293.602015] x3 : 0000000000000000 x2 : 0000000000000000
[ 293.612624] x1 : 00000000000004a4 x0 : ffffa58a1e0c6b80
[ 293.623028] Call trace:
[ 293.630340] free_msi_irqs+0x19c/0x1b8
[ 293.638849] pci_disable_msix+0x118/0x140
[ 293.647452] pci_free_irq_vectors+0x20/0x38
[ 293.656081] hclgevf_uninit_msi+0x44/0x58 [hclgevf]
[ 293.665309] hclgevf_reset_rebuild+0x1ac/0x2e0 [hclgevf]
[ 293.674866] hclgevf_reset+0x358/0x400 [hclgevf]
[ 293.683545] hclgevf_reset_service_task+0xd0/0x1b0 [hclgevf]
[ 293.693325] hclgevf_service_task+0x4c/0x2e8 [hclgevf]
[ 293.702307] process_one_work+0x1b0/0x448
[ 293.710034] worker_thread+0x54/0x468
[ 293.717331] kthread+0x134/0x138
[ 293.724114] ret_from_fork+0x10/0x18
[ 293.731324] Code: f940b000 b4ffff00 a903e7b8 f90017b6 (d4210000)

This patch fixes the problem by waiting for the VF reset to complete while unloading the VF.

Fixes: e2cb1de ("net: hns3: Add HNS3 VF HCL(Hardware Compatibility Layer) Support")
Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
ghost
pushed a commit
that referenced
this pull request
Nov 23, 2021
commit 92d602b upstream.

We use inline_dentry, which requires allocating a dentry page when adding a link. If we allow reclaiming memory from the filesystem, we do down_read(&sbi->cp_rwsem) twice by f2fs_lock_op(). I think this should be okay, but how about stopping the lockdep complaint [1]?

f2fs_create()
 - f2fs_lock_op()
 - f2fs_do_add_link()
  - __f2fs_find_entry
   - f2fs_get_read_data_page()
   -> kswapd
    - shrink_node
     - f2fs_evict_inode
      - f2fs_lock_op()

[1]

fs_reclaim ){+.+.}-{0:0} :
kswapd0: lock_acquire+0x114/0x394
kswapd0: __fs_reclaim_acquire+0x40/0x50
kswapd0: prepare_alloc_pages+0x94/0x1ec
kswapd0: __alloc_pages_nodemask+0x78/0x1b0
kswapd0: pagecache_get_page+0x2e0/0x57c
kswapd0: f2fs_get_read_data_page+0xc0/0x394
kswapd0: f2fs_find_data_page+0xa4/0x23c
kswapd0: find_in_level+0x1a8/0x36c
kswapd0: __f2fs_find_entry+0x70/0x100
kswapd0: f2fs_do_add_link+0x84/0x1ec
kswapd0: f2fs_mkdir+0xe4/0x1e4
kswapd0: vfs_mkdir+0x110/0x1c0
kswapd0: do_mkdirat+0xa4/0x160
kswapd0: __arm64_sys_mkdirat+0x24/0x34
kswapd0: el0_svc_common.llvm.17258447499513131576+0xc4/0x1e8
kswapd0: do_el0_svc+0x28/0xa0
kswapd0: el0_svc+0x24/0x38
kswapd0: el0_sync_handler+0x88/0xec
kswapd0: el0_sync+0x1c0/0x200
kswapd0:
-> #1 ( &sbi->cp_rwsem ){++++}-{3:3} :
kswapd0: lock_acquire+0x114/0x394
kswapd0: down_read+0x7c/0x98
kswapd0: f2fs_do_truncate_blocks+0x78/0x3dc
kswapd0: f2fs_truncate+0xc8/0x128
kswapd0: f2fs_evict_inode+0x2b8/0x8b8
kswapd0: evict+0xd4/0x2f8
kswapd0: iput+0x1c0/0x258
kswapd0: do_unlinkat+0x170/0x2a0
kswapd0: __arm64_sys_unlinkat+0x4c/0x68
kswapd0: el0_svc_common.llvm.17258447499513131576+0xc4/0x1e8
kswapd0: do_el0_svc+0x28/0xa0
kswapd0: el0_svc+0x24/0x38
kswapd0: el0_sync_handler+0x88/0xec
kswapd0: el0_sync+0x1c0/0x200

Cc: stable@vger.kernel.org
Fixes: bdbc90f ("f2fs: don't put dentry page in pagecache into highmem")
Reviewed-by: Chao Yu <chao@kernel.org>
Reviewed-by: Stanley Chu <stanley.chu@mediatek.com>
Reviewed-by: Light Hsieh <light.hsieh@mediatek.com>
Tested-by: Light Hsieh <light.hsieh@mediatek.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
ghost
pushed a commit
that referenced
this pull request
Nov 23, 2021
…unload

commit 52862ab upstream.

Commit 587164c introduced a new opal message type (OPAL_MSG_PRD2) and added an opal notifier, but I missed unregistering the notifier in the module unload path. This results in the call trace below if you unload and then load the opal_prd module.

Also add a new notifier_block for the OPAL_MSG_PRD2 message.

Sample calltrace (modprobe -r opal_prd; modprobe opal_prd)

BUG: Unable to handle kernel data access on read at 0xc0080000192200e0
Faulting instruction address: 0xc00000000018d1cc
Oops: Kernel access of bad area, sig: 11 [#1]
LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA PowerNV
CPU: 66 PID: 7446 Comm: modprobe Kdump: loaded Tainted: G E 5.14.0prd #759
NIP: c00000000018d1cc LR: c00000000018d2a8 CTR: c0000000000cde10
REGS: c0000003c4c0f0a0 TRAP: 0300 Tainted: G E (5.14.0prd)
MSR: 9000000002009033 <SF,HV,VEC,EE,ME,IR,DR,RI,LE> CR: 24224824 XER: 20040000
CFAR: c00000000018d2a4 DAR: c0080000192200e0 DSISR: 40000000 IRQMASK: 1
...
NIP notifier_chain_register+0x2c/0xc0
LR atomic_notifier_chain_register+0x48/0x80
Call Trace:
0xc000000002090610 (unreliable)
atomic_notifier_chain_register+0x58/0x80
opal_message_notifier_register+0x7c/0x1e0
opal_prd_probe+0x84/0x150 [opal_prd]
platform_probe+0x78/0x130
really_probe+0x110/0x5d0
__driver_probe_device+0x17c/0x230
driver_probe_device+0x60/0x130
__driver_attach+0xfc/0x220
bus_for_each_dev+0xa8/0x130
driver_attach+0x34/0x50
bus_add_driver+0x1b0/0x300
driver_register+0x98/0x1a0
__platform_driver_register+0x38/0x50
opal_prd_driver_init+0x34/0x50 [opal_prd]
do_one_initcall+0x60/0x2d0
do_init_module+0x7c/0x320
load_module+0x3394/0x3650
__do_sys_finit_module+0xd4/0x160
system_call_exception+0x140/0x290
system_call_common+0xf4/0x258

Fixes: 587164c ("powerpc/powernv: Add new opal message type")
Cc: stable@vger.kernel.org # v5.4+
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20211028165716.41300-1-hegdevasant@linux.vnet.ibm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
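The register/unregister pairing described in this commit message can be modelled in plain userspace C. This is a sketch under assumptions, not the opal_prd driver: `struct msg_bus`, `bus_register()`, `bus_unregister()`, `model_probe()` and `model_remove()` are hypothetical names, and the chains are simple singly linked lists. It also shows one plausible reason for a second notifier block: each block carries a single `next` pointer, so in this model it can sit on only one chain at a time.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical message types; they only mirror the idea of PRD vs PRD2. */
enum msg_type { MSG_PRD, MSG_PRD2, MSG_TYPE_MAX };

struct notifier {
    int (*call)(enum msg_type type);
    struct notifier *next;      /* one link, so one chain per block */
};

/* One notifier chain per message type. */
struct msg_bus {
    struct notifier *chains[MSG_TYPE_MAX];
};

static void bus_register(struct msg_bus *bus, enum msg_type type,
                         struct notifier *nb)
{
    assert(type < MSG_TYPE_MAX);
    nb->next = bus->chains[type];
    bus->chains[type] = nb;
}

/* Returns 0 if the block was found and unlinked, -1 otherwise. */
static int bus_unregister(struct msg_bus *bus, enum msg_type type,
                          struct notifier *nb)
{
    assert(type < MSG_TYPE_MAX);
    for (struct notifier **p = &bus->chains[type]; *p != NULL; p = &(*p)->next) {
        if (*p == nb) {
            *p = nb->next;
            nb->next = NULL;
            return 0;
        }
    }
    return -1;
}

static size_t chain_len(const struct msg_bus *bus, enum msg_type type)
{
    size_t len = 0;
    for (const struct notifier *p = bus->chains[type]; p != NULL; p = p->next)
        len++;
    return len;
}

static int handle_msg(enum msg_type type)
{
    (void)type;                 /* a real handler would queue the message */
    return 0;
}

/* Separate blocks: one per chain. */
static struct notifier prd_nb  = { .call = handle_msg, .next = NULL };
static struct notifier prd2_nb = { .call = handle_msg, .next = NULL };

/* "Module load": register both notifiers. */
static void model_probe(struct msg_bus *bus)
{
    bus_register(bus, MSG_PRD, &prd_nb);
    bus_register(bus, MSG_PRD2, &prd2_nb);
}

/* "Module unload": everything registered in probe is unregistered here. */
static void model_remove(struct msg_bus *bus)
{
    bus_unregister(bus, MSG_PRD, &prd_nb);
    bus_unregister(bus, MSG_PRD2, &prd2_nb);
}
```

In a real module the notifier blocks live in module memory, so a block left on a chain after unload becomes a dangling pointer. That is consistent with the fault inside notifier_chain_register() in the trace above, where the next registration walks the chain.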
ghost
pushed a commit
that referenced
this pull request
Nov 23, 2021
commit 5ec5582 upstream.

This patch adds clocks management for the stmmac driver:

If CONFIG_PM is enabled:
1. Keep clocks disabled after the driver is probed.
2. Enable clocks when the net device is brought up, and disable clocks when the net device is brought down.

If CONFIG_PM is disabled:
Keep clocks always enabled after the driver is probed.

Note:
1. This is fine for ethtool, since stmmac implements ethtool_ops::begin so that the ops can only be accessed when the interface is enabled, so the clocks are ticking.
2. The MDIO bus has a different life cycle from the MAC, so we need to ensure clocks are enabled when _mdio_read/write() need them, because these functions can be called while the interface is not opened.

Stable backport notes:

When the command below is run to remove the ethernet driver on the stratix10 platform, the following warning trace appears:

$ cd /sys/class/net/eth0/device/driver/
$ echo ff800000.ethernet > unbind

WARNING: CPU: 3 PID: 386 at drivers/clk/clk.c:810 clk_core_unprepare+0x114/0x274
Modules linked in: sch_fq_codel
CPU: 3 PID: 386 Comm: sh Tainted: G W 5.10.74-yocto-standard #1
Hardware name: SoCFPGA Stratix 10 SoCDK (DT)
pstate: 00000005 (nzcv daif -PAN -UAO -TCO BTYPE=--)
pc : clk_core_unprepare+0x114/0x274
lr : clk_core_unprepare+0x114/0x274
sp : ffff800011bdbb10
clk_core_unprepare+0x114/0x274
clk_unprepare+0x38/0x50
stmmac_remove_config_dt+0x40/0x80
stmmac_pltfr_remove+0x64/0x80
platform_drv_remove+0x38/0x60
... ..
el0_sync_handler+0x1a4/0x1b0
el0_sync+0x180/0x1c0

This issue was introduced by upstream commit 8f26910 ("net: stmmac: disable clocks in stmmac_remove_config_dt()"). The latest mainline kernel does not have this issue, because this patch improved clocks management for the stmmac driver. Therefore, backport it and its fixing patches to stable kernel v5.10.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: stable@vger.kernel.org
Signed-off-by: Meng Li <Meng.Li@windriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
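The clock policy listed in this commit message can be sketched as a small state model. This is an assumption-laden illustration, not the stmmac implementation: `struct mac_model` and every function below are hypothetical, a plain counter stands in for the kernel's clock and runtime-PM reference counting, and the `pm_enabled` flag stands in for CONFIG_PM.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model; none of these names exist in the stmmac driver. */
struct mac_model {
    bool pm_enabled;    /* stands in for CONFIG_PM */
    int clk_users;      /* > 0 means the clocks are running */
};

static void model_clk_get(struct mac_model *m)
{
    m->clk_users++;
}

static void model_clk_put(struct mac_model *m)
{
    assert(m->clk_users > 0);   /* unbalanced put would be a bug */
    m->clk_users--;
}

static bool clocks_on(const struct mac_model *m)
{
    return m->clk_users > 0;
}

/* Probe needs clocks for setup; with PM they are dropped again afterwards. */
static void model_probe(struct mac_model *m, bool pm_enabled)
{
    m->pm_enabled = pm_enabled;
    m->clk_users = 0;
    model_clk_get(m);
    if (m->pm_enabled)
        model_clk_put(m);       /* PM: clocks stay off until the device is up */
}

/* Bringing the net device up enables clocks when PM manages them. */
static void model_open(struct mac_model *m)
{
    if (m->pm_enabled)
        model_clk_get(m);
}

/* Bringing the net device down disables them again. */
static void model_close(struct mac_model *m)
{
    if (m->pm_enabled)
        model_clk_put(m);
}

/*
 * MDIO access may happen while the interface is closed, so it takes its
 * own reference around the access. Returns whether clocks were running
 * during the (simulated) register read.
 */
static bool model_mdio_read(struct mac_model *m)
{
    bool on;

    model_clk_get(m);
    on = clocks_on(m);
    model_clk_put(m);
    return on;
}
```

Bracketing the MDIO access with its own get/put means the bus works regardless of whether the MAC interface is up, which matches note 2 in the commit message.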