Harden module auto-loading#19
Closed
madaidan wants to merge 104 commits into
Closed
Conversation
Signed-off-by: Daniel Micay <danielmicay@gmail.com>
Signed-off-by: Daniel Micay <danielmicay@gmail.com>
Signed-off-by: Daniel Micay <danielmicay@gmail.com>
Signed-off-by: Daniel Micay <danielmicay@gmail.com>
Signed-off-by: Daniel Micay <danielmicay@gmail.com>
Signed-off-by: Daniel Micay <danielmicay@gmail.com>
Signed-off-by: Daniel Micay <danielmicay@gmail.com>
Signed-off-by: Daniel Micay <danielmicay@gmail.com>
Signed-off-by: Daniel Micay <danielmicay@gmail.com>
It can make sense to disable this to reduce attack surface / complexity.
Signed-off-by: Daniel Micay <danielmicay@gmail.com>
Signed-off-by: Daniel Micay <danielmicay@gmail.com>
Signed-off-by: Daniel Micay <danielmicay@gmail.com>
anthraxx
pushed a commit
that referenced
this pull request
Nov 18, 2021
[ Upstream commit 515e718 ] When fail to init coex module, free 'common' and 'adapter' directly, but common->tx_thread which will access 'common' and 'adapter' is running at the same time. That will trigger the UAF bug. ================================================================== BUG: KASAN: use-after-free in rsi_tx_scheduler_thread+0x50f/0x520 [rsi_91x] Read of size 8 at addr ffff8880076dc000 by task Tx-Thread/124777 CPU: 0 PID: 124777 Comm: Tx-Thread Not tainted 5.15.0-rc5+ #19 Call Trace: dump_stack_lvl+0xe2/0x152 print_address_description.constprop.0+0x21/0x140 ? rsi_tx_scheduler_thread+0x50f/0x520 kasan_report.cold+0x7f/0x11b ? rsi_tx_scheduler_thread+0x50f/0x520 rsi_tx_scheduler_thread+0x50f/0x520 ... Freed by task 111873: kasan_save_stack+0x1b/0x40 kasan_set_track+0x1c/0x30 kasan_set_free_info+0x20/0x30 __kasan_slab_free+0x109/0x140 kfree+0x117/0x4c0 rsi_91x_init+0x741/0x8a0 [rsi_91x] rsi_probe+0x9f/0x1750 [rsi_usb] Stop thread before free 'common' and 'adapter' to fix it. Fixes: 2108df3 ("rsi: add coex support") Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/20211015040335.1021546-1-william.xuanziyang@huawei.com Signed-off-by: Sasha Levin <sashal@kernel.org>
anthraxx
pushed a commit
that referenced
this pull request
Nov 18, 2021
[ Upstream commit 515e718 ] When fail to init coex module, free 'common' and 'adapter' directly, but common->tx_thread which will access 'common' and 'adapter' is running at the same time. That will trigger the UAF bug. ================================================================== BUG: KASAN: use-after-free in rsi_tx_scheduler_thread+0x50f/0x520 [rsi_91x] Read of size 8 at addr ffff8880076dc000 by task Tx-Thread/124777 CPU: 0 PID: 124777 Comm: Tx-Thread Not tainted 5.15.0-rc5+ #19 Call Trace: dump_stack_lvl+0xe2/0x152 print_address_description.constprop.0+0x21/0x140 ? rsi_tx_scheduler_thread+0x50f/0x520 kasan_report.cold+0x7f/0x11b ? rsi_tx_scheduler_thread+0x50f/0x520 rsi_tx_scheduler_thread+0x50f/0x520 ... Freed by task 111873: kasan_save_stack+0x1b/0x40 kasan_set_track+0x1c/0x30 kasan_set_free_info+0x20/0x30 __kasan_slab_free+0x109/0x140 kfree+0x117/0x4c0 rsi_91x_init+0x741/0x8a0 [rsi_91x] rsi_probe+0x9f/0x1750 [rsi_usb] Stop thread before free 'common' and 'adapter' to fix it. Fixes: 2108df3 ("rsi: add coex support") Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/20211015040335.1021546-1-william.xuanziyang@huawei.com Signed-off-by: Sasha Levin <sashal@kernel.org>
anthraxx
pushed a commit
that referenced
this pull request
Nov 18, 2021
commit 15c9a35 upstream. When endpoint_alloc() return failed in xillyusb_setup_base_eps(), 'xdev->msg_ep' will be freed but not set to NULL. That lets program enter fail handling to cleanup_dev() in xillyusb_probe(). Check for 'xdev->msg_ep' is invalid in cleanup_dev() because 'xdev->msg_ep' did not set to NULL when was freed. So the UAF problem for 'xdev->msg_ep' is triggered. ================================================================== BUG: KASAN: use-after-free in fifo_mem_release+0x1f4/0x210 CPU: 0 PID: 166 Comm: kworker/0:2 Not tainted 5.15.0-rc5+ #19 Call Trace: dump_stack_lvl+0xe2/0x152 print_address_description.constprop.0+0x21/0x140 ? fifo_mem_release+0x1f4/0x210 kasan_report.cold+0x7f/0x11b ? xillyusb_probe+0x530/0x700 ? fifo_mem_release+0x1f4/0x210 fifo_mem_release+0x1f4/0x210 ? __sanitizer_cov_trace_pc+0x1d/0x50 endpoint_dealloc+0x35/0x2b0 cleanup_dev+0x90/0x120 xillyusb_probe+0x59a/0x700 ... Freed by task 166: kasan_save_stack+0x1b/0x40 kasan_set_track+0x1c/0x30 kasan_set_free_info+0x20/0x30 __kasan_slab_free+0x109/0x140 kfree+0x117/0x4c0 xillyusb_probe+0x606/0x700 Set 'xdev->msg_ep' to NULL after being freed in xillyusb_setup_base_eps() to fix the UAF problem. Fixes: a53d120 ("char: xillybus: Add driver for XillyUSB (Xillybus variant for USB)") Cc: stable <stable@vger.kernel.org> Acked-by: Eli Billauer <eli.billauer@gmail.com> Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com> Link: https://lore.kernel.org/r/20211016052047.1611983-1-william.xuanziyang@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
anthraxx
pushed a commit
that referenced
this pull request
Nov 18, 2021
[ Upstream commit 515e718 ] When fail to init coex module, free 'common' and 'adapter' directly, but common->tx_thread which will access 'common' and 'adapter' is running at the same time. That will trigger the UAF bug. ================================================================== BUG: KASAN: use-after-free in rsi_tx_scheduler_thread+0x50f/0x520 [rsi_91x] Read of size 8 at addr ffff8880076dc000 by task Tx-Thread/124777 CPU: 0 PID: 124777 Comm: Tx-Thread Not tainted 5.15.0-rc5+ #19 Call Trace: dump_stack_lvl+0xe2/0x152 print_address_description.constprop.0+0x21/0x140 ? rsi_tx_scheduler_thread+0x50f/0x520 kasan_report.cold+0x7f/0x11b ? rsi_tx_scheduler_thread+0x50f/0x520 rsi_tx_scheduler_thread+0x50f/0x520 ... Freed by task 111873: kasan_save_stack+0x1b/0x40 kasan_set_track+0x1c/0x30 kasan_set_free_info+0x20/0x30 __kasan_slab_free+0x109/0x140 kfree+0x117/0x4c0 rsi_91x_init+0x741/0x8a0 [rsi_91x] rsi_probe+0x9f/0x1750 [rsi_usb] Stop thread before free 'common' and 'adapter' to fix it. Fixes: 2108df3 ("rsi: add coex support") Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/20211015040335.1021546-1-william.xuanziyang@huawei.com Signed-off-by: Sasha Levin <sashal@kernel.org>
anthraxx
pushed a commit
that referenced
this pull request
Nov 18, 2021
commit 15c9a35 upstream. When endpoint_alloc() return failed in xillyusb_setup_base_eps(), 'xdev->msg_ep' will be freed but not set to NULL. That lets program enter fail handling to cleanup_dev() in xillyusb_probe(). Check for 'xdev->msg_ep' is invalid in cleanup_dev() because 'xdev->msg_ep' did not set to NULL when was freed. So the UAF problem for 'xdev->msg_ep' is triggered. ================================================================== BUG: KASAN: use-after-free in fifo_mem_release+0x1f4/0x210 CPU: 0 PID: 166 Comm: kworker/0:2 Not tainted 5.15.0-rc5+ #19 Call Trace: dump_stack_lvl+0xe2/0x152 print_address_description.constprop.0+0x21/0x140 ? fifo_mem_release+0x1f4/0x210 kasan_report.cold+0x7f/0x11b ? xillyusb_probe+0x530/0x700 ? fifo_mem_release+0x1f4/0x210 fifo_mem_release+0x1f4/0x210 ? __sanitizer_cov_trace_pc+0x1d/0x50 endpoint_dealloc+0x35/0x2b0 cleanup_dev+0x90/0x120 xillyusb_probe+0x59a/0x700 ... Freed by task 166: kasan_save_stack+0x1b/0x40 kasan_set_track+0x1c/0x30 kasan_set_free_info+0x20/0x30 __kasan_slab_free+0x109/0x140 kfree+0x117/0x4c0 xillyusb_probe+0x606/0x700 Set 'xdev->msg_ep' to NULL after being freed in xillyusb_setup_base_eps() to fix the UAF problem. Fixes: a53d120 ("char: xillybus: Add driver for XillyUSB (Xillybus variant for USB)") Cc: stable <stable@vger.kernel.org> Acked-by: Eli Billauer <eli.billauer@gmail.com> Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com> Link: https://lore.kernel.org/r/20211016052047.1611983-1-william.xuanziyang@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
anthraxx
pushed a commit
that referenced
this pull request
Nov 18, 2021
[ Upstream commit 515e718 ] When fail to init coex module, free 'common' and 'adapter' directly, but common->tx_thread which will access 'common' and 'adapter' is running at the same time. That will trigger the UAF bug. ================================================================== BUG: KASAN: use-after-free in rsi_tx_scheduler_thread+0x50f/0x520 [rsi_91x] Read of size 8 at addr ffff8880076dc000 by task Tx-Thread/124777 CPU: 0 PID: 124777 Comm: Tx-Thread Not tainted 5.15.0-rc5+ #19 Call Trace: dump_stack_lvl+0xe2/0x152 print_address_description.constprop.0+0x21/0x140 ? rsi_tx_scheduler_thread+0x50f/0x520 kasan_report.cold+0x7f/0x11b ? rsi_tx_scheduler_thread+0x50f/0x520 rsi_tx_scheduler_thread+0x50f/0x520 ... Freed by task 111873: kasan_save_stack+0x1b/0x40 kasan_set_track+0x1c/0x30 kasan_set_free_info+0x20/0x30 __kasan_slab_free+0x109/0x140 kfree+0x117/0x4c0 rsi_91x_init+0x741/0x8a0 [rsi_91x] rsi_probe+0x9f/0x1750 [rsi_usb] Stop thread before free 'common' and 'adapter' to fix it. Fixes: 2108df3 ("rsi: add coex support") Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/20211015040335.1021546-1-william.xuanziyang@huawei.com Signed-off-by: Sasha Levin <sashal@kernel.org>
anthraxx
pushed a commit
that referenced
this pull request
Nov 26, 2021
[ Upstream commit 515e718 ] When fail to init coex module, free 'common' and 'adapter' directly, but common->tx_thread which will access 'common' and 'adapter' is running at the same time. That will trigger the UAF bug. ================================================================== BUG: KASAN: use-after-free in rsi_tx_scheduler_thread+0x50f/0x520 [rsi_91x] Read of size 8 at addr ffff8880076dc000 by task Tx-Thread/124777 CPU: 0 PID: 124777 Comm: Tx-Thread Not tainted 5.15.0-rc5+ #19 Call Trace: dump_stack_lvl+0xe2/0x152 print_address_description.constprop.0+0x21/0x140 ? rsi_tx_scheduler_thread+0x50f/0x520 kasan_report.cold+0x7f/0x11b ? rsi_tx_scheduler_thread+0x50f/0x520 rsi_tx_scheduler_thread+0x50f/0x520 ... Freed by task 111873: kasan_save_stack+0x1b/0x40 kasan_set_track+0x1c/0x30 kasan_set_free_info+0x20/0x30 __kasan_slab_free+0x109/0x140 kfree+0x117/0x4c0 rsi_91x_init+0x741/0x8a0 [rsi_91x] rsi_probe+0x9f/0x1750 [rsi_usb] Stop thread before free 'common' and 'adapter' to fix it. Fixes: 2108df3 ("rsi: add coex support") Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/20211015040335.1021546-1-william.xuanziyang@huawei.com Signed-off-by: Sasha Levin <sashal@kernel.org>
anthraxx
pushed a commit
that referenced
this pull request
Mar 21, 2022
[ Upstream commit 4224cfd ] When bringing down the netdevice or system shutdown, a panic can be triggered while accessing the sysfs path because the device is already removed. [ 755.549084] mlx5_core 0000:12:00.1: Shutdown was called [ 756.404455] mlx5_core 0000:12:00.0: Shutdown was called ... [ 757.937260] BUG: unable to handle kernel NULL pointer dereference at (null) [ 758.031397] IP: [<ffffffff8ee11acb>] dma_pool_alloc+0x1ab/0x280 crash> bt ... PID: 12649 TASK: ffff8924108f2100 CPU: 1 COMMAND: "amsd" ... #9 [ffff89240e1a38b0] page_fault at ffffffff8f38c778 [exception RIP: dma_pool_alloc+0x1ab] RIP: ffffffff8ee11acb RSP: ffff89240e1a3968 RFLAGS: 00010046 RAX: 0000000000000246 RBX: ffff89243d874100 RCX: 0000000000001000 RDX: 0000000000000000 RSI: 0000000000000246 RDI: ffff89243d874090 RBP: ffff89240e1a39c0 R8: 000000000001f080 R9: ffff8905ffc03c00 R10: ffffffffc04680d4 R11: ffffffff8edde9fd R12: 00000000000080d0 R13: ffff89243d874090 R14: ffff89243d874080 R15: 0000000000000000 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 #10 [ffff89240e1a39c8] mlx5_alloc_cmd_msg at ffffffffc04680f3 [mlx5_core] #11 [ffff89240e1a3a18] cmd_exec at ffffffffc046ad62 [mlx5_core] #12 [ffff89240e1a3ab8] mlx5_cmd_exec at ffffffffc046b4fb [mlx5_core] #13 [ffff89240e1a3ae8] mlx5_core_access_reg at ffffffffc0475434 [mlx5_core] #14 [ffff89240e1a3b40] mlx5e_get_fec_caps at ffffffffc04a7348 [mlx5_core] #15 [ffff89240e1a3bb0] get_fec_supported_advertised at ffffffffc04992bf [mlx5_core] #16 [ffff89240e1a3c08] mlx5e_get_link_ksettings at ffffffffc049ab36 [mlx5_core] #17 [ffff89240e1a3ce8] __ethtool_get_link_ksettings at ffffffff8f25db46 #18 [ffff89240e1a3d48] speed_show at ffffffff8f277208 #19 [ffff89240e1a3dd8] dev_attr_show at ffffffff8f0b70e3 #20 [ffff89240e1a3df8] sysfs_kf_seq_show at ffffffff8eedbedf #21 [ffff89240e1a3e18] kernfs_seq_show at ffffffff8eeda596 #22 [ffff89240e1a3e28] seq_read at ffffffff8ee76d10 #23 [ffff89240e1a3e98] kernfs_fop_read at ffffffff8eedaef5 #24 [ffff89240e1a3ed8] vfs_read at ffffffff8ee4e3ff #25 [ffff89240e1a3f08] sys_read at ffffffff8ee4f27f #26 [ffff89240e1a3f50] system_call_fastpath at ffffffff8f395f92 crash> net_device.state ffff89443b0c0000 state = 0x5 (__LINK_STATE_START| __LINK_STATE_NOCARRIER) To prevent this scenario, we also make sure that the netdevice is present. Signed-off-by: suresh kumar <suresh2514@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
anthraxx
pushed a commit
that referenced
this pull request
Mar 21, 2022
[ Upstream commit 4224cfd ] When bringing down the netdevice or system shutdown, a panic can be triggered while accessing the sysfs path because the device is already removed. [ 755.549084] mlx5_core 0000:12:00.1: Shutdown was called [ 756.404455] mlx5_core 0000:12:00.0: Shutdown was called ... [ 757.937260] BUG: unable to handle kernel NULL pointer dereference at (null) [ 758.031397] IP: [<ffffffff8ee11acb>] dma_pool_alloc+0x1ab/0x280 crash> bt ... PID: 12649 TASK: ffff8924108f2100 CPU: 1 COMMAND: "amsd" ... #9 [ffff89240e1a38b0] page_fault at ffffffff8f38c778 [exception RIP: dma_pool_alloc+0x1ab] RIP: ffffffff8ee11acb RSP: ffff89240e1a3968 RFLAGS: 00010046 RAX: 0000000000000246 RBX: ffff89243d874100 RCX: 0000000000001000 RDX: 0000000000000000 RSI: 0000000000000246 RDI: ffff89243d874090 RBP: ffff89240e1a39c0 R8: 000000000001f080 R9: ffff8905ffc03c00 R10: ffffffffc04680d4 R11: ffffffff8edde9fd R12: 00000000000080d0 R13: ffff89243d874090 R14: ffff89243d874080 R15: 0000000000000000 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 #10 [ffff89240e1a39c8] mlx5_alloc_cmd_msg at ffffffffc04680f3 [mlx5_core] #11 [ffff89240e1a3a18] cmd_exec at ffffffffc046ad62 [mlx5_core] #12 [ffff89240e1a3ab8] mlx5_cmd_exec at ffffffffc046b4fb [mlx5_core] #13 [ffff89240e1a3ae8] mlx5_core_access_reg at ffffffffc0475434 [mlx5_core] #14 [ffff89240e1a3b40] mlx5e_get_fec_caps at ffffffffc04a7348 [mlx5_core] #15 [ffff89240e1a3bb0] get_fec_supported_advertised at ffffffffc04992bf [mlx5_core] #16 [ffff89240e1a3c08] mlx5e_get_link_ksettings at ffffffffc049ab36 [mlx5_core] #17 [ffff89240e1a3ce8] __ethtool_get_link_ksettings at ffffffff8f25db46 #18 [ffff89240e1a3d48] speed_show at ffffffff8f277208 #19 [ffff89240e1a3dd8] dev_attr_show at ffffffff8f0b70e3 #20 [ffff89240e1a3df8] sysfs_kf_seq_show at ffffffff8eedbedf #21 [ffff89240e1a3e18] kernfs_seq_show at ffffffff8eeda596 #22 [ffff89240e1a3e28] seq_read at ffffffff8ee76d10 #23 [ffff89240e1a3e98] kernfs_fop_read at ffffffff8eedaef5 #24 [ffff89240e1a3ed8] vfs_read at ffffffff8ee4e3ff #25 [ffff89240e1a3f08] sys_read at ffffffff8ee4f27f #26 [ffff89240e1a3f50] system_call_fastpath at ffffffff8f395f92 crash> net_device.state ffff89443b0c0000 state = 0x5 (__LINK_STATE_START| __LINK_STATE_NOCARRIER) To prevent this scenario, we also make sure that the netdevice is present. Signed-off-by: suresh kumar <suresh2514@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
anthraxx
pushed a commit
that referenced
this pull request
Mar 21, 2022
[ Upstream commit 4224cfd ] When bringing down the netdevice or system shutdown, a panic can be triggered while accessing the sysfs path because the device is already removed. [ 755.549084] mlx5_core 0000:12:00.1: Shutdown was called [ 756.404455] mlx5_core 0000:12:00.0: Shutdown was called ... [ 757.937260] BUG: unable to handle kernel NULL pointer dereference at (null) [ 758.031397] IP: [<ffffffff8ee11acb>] dma_pool_alloc+0x1ab/0x280 crash> bt ... PID: 12649 TASK: ffff8924108f2100 CPU: 1 COMMAND: "amsd" ... #9 [ffff89240e1a38b0] page_fault at ffffffff8f38c778 [exception RIP: dma_pool_alloc+0x1ab] RIP: ffffffff8ee11acb RSP: ffff89240e1a3968 RFLAGS: 00010046 RAX: 0000000000000246 RBX: ffff89243d874100 RCX: 0000000000001000 RDX: 0000000000000000 RSI: 0000000000000246 RDI: ffff89243d874090 RBP: ffff89240e1a39c0 R8: 000000000001f080 R9: ffff8905ffc03c00 R10: ffffffffc04680d4 R11: ffffffff8edde9fd R12: 00000000000080d0 R13: ffff89243d874090 R14: ffff89243d874080 R15: 0000000000000000 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 #10 [ffff89240e1a39c8] mlx5_alloc_cmd_msg at ffffffffc04680f3 [mlx5_core] #11 [ffff89240e1a3a18] cmd_exec at ffffffffc046ad62 [mlx5_core] #12 [ffff89240e1a3ab8] mlx5_cmd_exec at ffffffffc046b4fb [mlx5_core] #13 [ffff89240e1a3ae8] mlx5_core_access_reg at ffffffffc0475434 [mlx5_core] #14 [ffff89240e1a3b40] mlx5e_get_fec_caps at ffffffffc04a7348 [mlx5_core] #15 [ffff89240e1a3bb0] get_fec_supported_advertised at ffffffffc04992bf [mlx5_core] #16 [ffff89240e1a3c08] mlx5e_get_link_ksettings at ffffffffc049ab36 [mlx5_core] #17 [ffff89240e1a3ce8] __ethtool_get_link_ksettings at ffffffff8f25db46 #18 [ffff89240e1a3d48] speed_show at ffffffff8f277208 #19 [ffff89240e1a3dd8] dev_attr_show at ffffffff8f0b70e3 #20 [ffff89240e1a3df8] sysfs_kf_seq_show at ffffffff8eedbedf #21 [ffff89240e1a3e18] kernfs_seq_show at ffffffff8eeda596 #22 [ffff89240e1a3e28] seq_read at ffffffff8ee76d10 #23 [ffff89240e1a3e98] kernfs_fop_read at ffffffff8eedaef5 #24 [ffff89240e1a3ed8] vfs_read at ffffffff8ee4e3ff #25 [ffff89240e1a3f08] sys_read at ffffffff8ee4f27f #26 [ffff89240e1a3f50] system_call_fastpath at ffffffff8f395f92 crash> net_device.state ffff89443b0c0000 state = 0x5 (__LINK_STATE_START| __LINK_STATE_NOCARRIER) To prevent this scenario, we also make sure that the netdevice is present. Signed-off-by: suresh kumar <suresh2514@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
anthraxx
pushed a commit
that referenced
this pull request
Mar 21, 2022
[ Upstream commit 4224cfd ] When bringing down the netdevice or system shutdown, a panic can be triggered while accessing the sysfs path because the device is already removed. [ 755.549084] mlx5_core 0000:12:00.1: Shutdown was called [ 756.404455] mlx5_core 0000:12:00.0: Shutdown was called ... [ 757.937260] BUG: unable to handle kernel NULL pointer dereference at (null) [ 758.031397] IP: [<ffffffff8ee11acb>] dma_pool_alloc+0x1ab/0x280 crash> bt ... PID: 12649 TASK: ffff8924108f2100 CPU: 1 COMMAND: "amsd" ... #9 [ffff89240e1a38b0] page_fault at ffffffff8f38c778 [exception RIP: dma_pool_alloc+0x1ab] RIP: ffffffff8ee11acb RSP: ffff89240e1a3968 RFLAGS: 00010046 RAX: 0000000000000246 RBX: ffff89243d874100 RCX: 0000000000001000 RDX: 0000000000000000 RSI: 0000000000000246 RDI: ffff89243d874090 RBP: ffff89240e1a39c0 R8: 000000000001f080 R9: ffff8905ffc03c00 R10: ffffffffc04680d4 R11: ffffffff8edde9fd R12: 00000000000080d0 R13: ffff89243d874090 R14: ffff89243d874080 R15: 0000000000000000 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 #10 [ffff89240e1a39c8] mlx5_alloc_cmd_msg at ffffffffc04680f3 [mlx5_core] #11 [ffff89240e1a3a18] cmd_exec at ffffffffc046ad62 [mlx5_core] #12 [ffff89240e1a3ab8] mlx5_cmd_exec at ffffffffc046b4fb [mlx5_core] #13 [ffff89240e1a3ae8] mlx5_core_access_reg at ffffffffc0475434 [mlx5_core] #14 [ffff89240e1a3b40] mlx5e_get_fec_caps at ffffffffc04a7348 [mlx5_core] #15 [ffff89240e1a3bb0] get_fec_supported_advertised at ffffffffc04992bf [mlx5_core] #16 [ffff89240e1a3c08] mlx5e_get_link_ksettings at ffffffffc049ab36 [mlx5_core] #17 [ffff89240e1a3ce8] __ethtool_get_link_ksettings at ffffffff8f25db46 #18 [ffff89240e1a3d48] speed_show at ffffffff8f277208 #19 [ffff89240e1a3dd8] dev_attr_show at ffffffff8f0b70e3 #20 [ffff89240e1a3df8] sysfs_kf_seq_show at ffffffff8eedbedf #21 [ffff89240e1a3e18] kernfs_seq_show at ffffffff8eeda596 #22 [ffff89240e1a3e28] seq_read at ffffffff8ee76d10 #23 [ffff89240e1a3e98] kernfs_fop_read at ffffffff8eedaef5 #24 [ffff89240e1a3ed8] vfs_read at ffffffff8ee4e3ff #25 [ffff89240e1a3f08] sys_read at ffffffff8ee4f27f #26 [ffff89240e1a3f50] system_call_fastpath at ffffffff8f395f92 crash> net_device.state ffff89443b0c0000 state = 0x5 (__LINK_STATE_START| __LINK_STATE_NOCARRIER) To prevent this scenario, we also make sure that the netdevice is present. Signed-off-by: suresh kumar <suresh2514@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
anthraxx
pushed a commit
that referenced
this pull request
Mar 21, 2022
[ Upstream commit 4224cfd ] When bringing down the netdevice or system shutdown, a panic can be triggered while accessing the sysfs path because the device is already removed. [ 755.549084] mlx5_core 0000:12:00.1: Shutdown was called [ 756.404455] mlx5_core 0000:12:00.0: Shutdown was called ... [ 757.937260] BUG: unable to handle kernel NULL pointer dereference at (null) [ 758.031397] IP: [<ffffffff8ee11acb>] dma_pool_alloc+0x1ab/0x280 crash> bt ... PID: 12649 TASK: ffff8924108f2100 CPU: 1 COMMAND: "amsd" ... #9 [ffff89240e1a38b0] page_fault at ffffffff8f38c778 [exception RIP: dma_pool_alloc+0x1ab] RIP: ffffffff8ee11acb RSP: ffff89240e1a3968 RFLAGS: 00010046 RAX: 0000000000000246 RBX: ffff89243d874100 RCX: 0000000000001000 RDX: 0000000000000000 RSI: 0000000000000246 RDI: ffff89243d874090 RBP: ffff89240e1a39c0 R8: 000000000001f080 R9: ffff8905ffc03c00 R10: ffffffffc04680d4 R11: ffffffff8edde9fd R12: 00000000000080d0 R13: ffff89243d874090 R14: ffff89243d874080 R15: 0000000000000000 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 #10 [ffff89240e1a39c8] mlx5_alloc_cmd_msg at ffffffffc04680f3 [mlx5_core] #11 [ffff89240e1a3a18] cmd_exec at ffffffffc046ad62 [mlx5_core] #12 [ffff89240e1a3ab8] mlx5_cmd_exec at ffffffffc046b4fb [mlx5_core] #13 [ffff89240e1a3ae8] mlx5_core_access_reg at ffffffffc0475434 [mlx5_core] #14 [ffff89240e1a3b40] mlx5e_get_fec_caps at ffffffffc04a7348 [mlx5_core] #15 [ffff89240e1a3bb0] get_fec_supported_advertised at ffffffffc04992bf [mlx5_core] #16 [ffff89240e1a3c08] mlx5e_get_link_ksettings at ffffffffc049ab36 [mlx5_core] #17 [ffff89240e1a3ce8] __ethtool_get_link_ksettings at ffffffff8f25db46 #18 [ffff89240e1a3d48] speed_show at ffffffff8f277208 #19 [ffff89240e1a3dd8] dev_attr_show at ffffffff8f0b70e3 #20 [ffff89240e1a3df8] sysfs_kf_seq_show at ffffffff8eedbedf #21 [ffff89240e1a3e18] kernfs_seq_show at ffffffff8eeda596 #22 [ffff89240e1a3e28] seq_read at ffffffff8ee76d10 #23 [ffff89240e1a3e98] kernfs_fop_read at ffffffff8eedaef5 #24 [ffff89240e1a3ed8] vfs_read at ffffffff8ee4e3ff #25 [ffff89240e1a3f08] sys_read at ffffffff8ee4f27f #26 [ffff89240e1a3f50] system_call_fastpath at ffffffff8f395f92 crash> net_device.state ffff89443b0c0000 state = 0x5 (__LINK_STATE_START| __LINK_STATE_NOCARRIER) To prevent this scenario, we also make sure that the netdevice is present. Signed-off-by: suresh kumar <suresh2514@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
anthraxx
pushed a commit
that referenced
this pull request
Mar 25, 2022
[ Upstream commit 4224cfd ] When bringing down the netdevice or system shutdown, a panic can be triggered while accessing the sysfs path because the device is already removed. [ 755.549084] mlx5_core 0000:12:00.1: Shutdown was called [ 756.404455] mlx5_core 0000:12:00.0: Shutdown was called ... [ 757.937260] BUG: unable to handle kernel NULL pointer dereference at (null) [ 758.031397] IP: [<ffffffff8ee11acb>] dma_pool_alloc+0x1ab/0x280 crash> bt ... PID: 12649 TASK: ffff8924108f2100 CPU: 1 COMMAND: "amsd" ... #9 [ffff89240e1a38b0] page_fault at ffffffff8f38c778 [exception RIP: dma_pool_alloc+0x1ab] RIP: ffffffff8ee11acb RSP: ffff89240e1a3968 RFLAGS: 00010046 RAX: 0000000000000246 RBX: ffff89243d874100 RCX: 0000000000001000 RDX: 0000000000000000 RSI: 0000000000000246 RDI: ffff89243d874090 RBP: ffff89240e1a39c0 R8: 000000000001f080 R9: ffff8905ffc03c00 R10: ffffffffc04680d4 R11: ffffffff8edde9fd R12: 00000000000080d0 R13: ffff89243d874090 R14: ffff89243d874080 R15: 0000000000000000 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 #10 [ffff89240e1a39c8] mlx5_alloc_cmd_msg at ffffffffc04680f3 [mlx5_core] #11 [ffff89240e1a3a18] cmd_exec at ffffffffc046ad62 [mlx5_core] #12 [ffff89240e1a3ab8] mlx5_cmd_exec at ffffffffc046b4fb [mlx5_core] #13 [ffff89240e1a3ae8] mlx5_core_access_reg at ffffffffc0475434 [mlx5_core] #14 [ffff89240e1a3b40] mlx5e_get_fec_caps at ffffffffc04a7348 [mlx5_core] #15 [ffff89240e1a3bb0] get_fec_supported_advertised at ffffffffc04992bf [mlx5_core] #16 [ffff89240e1a3c08] mlx5e_get_link_ksettings at ffffffffc049ab36 [mlx5_core] #17 [ffff89240e1a3ce8] __ethtool_get_link_ksettings at ffffffff8f25db46 #18 [ffff89240e1a3d48] speed_show at ffffffff8f277208 #19 [ffff89240e1a3dd8] dev_attr_show at ffffffff8f0b70e3 #20 [ffff89240e1a3df8] sysfs_kf_seq_show at ffffffff8eedbedf #21 [ffff89240e1a3e18] kernfs_seq_show at ffffffff8eeda596 #22 [ffff89240e1a3e28] seq_read at ffffffff8ee76d10 #23 [ffff89240e1a3e98] kernfs_fop_read at ffffffff8eedaef5 #24 [ffff89240e1a3ed8] vfs_read at ffffffff8ee4e3ff #25 [ffff89240e1a3f08] sys_read at ffffffff8ee4f27f #26 [ffff89240e1a3f50] system_call_fastpath at ffffffff8f395f92 crash> net_device.state ffff89443b0c0000 state = 0x5 (__LINK_STATE_START| __LINK_STATE_NOCARRIER) To prevent this scenario, we also make sure that the netdevice is present. Signed-off-by: suresh kumar <suresh2514@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
anthraxx
pushed a commit
that referenced
this pull request
Mar 25, 2022
commit f0cfe17 upstream. Nicolas reported that using: # trace-cmd record -e all -M 10 -p osnoise --poll Resulted in the following kernel warning: ------------[ cut here ]------------ WARNING: CPU: 0 PID: 1217 at kernel/tracepoint.c:404 tracepoint_probe_unregister+0x280/0x370 [...] CPU: 0 PID: 1217 Comm: trace-cmd Not tainted 5.17.0-rc6-next-20220307-nico+ #19 RIP: 0010:tracepoint_probe_unregister+0x280/0x370 [...] CR2: 00007ff919b29497 CR3: 0000000109da4005 CR4: 0000000000170ef0 Call Trace: <TASK> osnoise_workload_stop+0x36/0x90 tracing_set_tracer+0x108/0x260 tracing_set_trace_write+0x94/0xd0 ? __check_object_size.part.0+0x10a/0x150 ? selinux_file_permission+0x104/0x150 vfs_write+0xb5/0x290 ksys_write+0x5f/0xe0 do_syscall_64+0x3b/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7ff919a18127 [...] ---[ end trace 0000000000000000 ]--- The warning complains about an attempt to unregister an unregistered tracepoint. This happens on trace-cmd because it first stops tracing, and then switches the tracer to nop. Which is equivalent to: # cd /sys/kernel/tracing/ # echo osnoise > current_tracer # echo 0 > tracing_on # echo nop > current_tracer The osnoise tracer stops the workload when no trace instance is actually collecting data. This can be caused both by disabling tracing or disabling the tracer itself. To avoid unregistering events twice, use the existing trace_osnoise_callback_enabled variable to check if the events (and the workload) are actually active before trying to deactivate them. Link: https://lore.kernel.org/all/c898d1911f7f9303b7e14726e7cc9678fbfb4a0e.camel@redhat.com/ Link: https://lkml.kernel.org/r/938765e17d5a781c2df429a98f0b2e7cc317b022.1646823913.git.bristot@kernel.org Cc: stable@vger.kernel.org Cc: Marcelo Tosatti <mtosatti@redhat.com> Fixes: 2fac8d6 ("tracing/osnoise: Allow multiple instances of the same tracer") Reported-by: Nicolas Saenz Julienne <nsaenzju@redhat.com> Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
anthraxx
pushed a commit
that referenced
this pull request
Jun 9, 2022
[ Upstream commit afadb04 ] Do what is done in other DMA-enabled MMC host drivers (cf. host/mmci.c) and limit the maximum segment size based on the DMA engine's capabilities. This is needed to avoid warnings like the following with CONFIG_DMA_API_DEBUG=y. ------------[ cut here ]------------ WARNING: CPU: 0 PID: 21 at kernel/dma/debug.c:1162 debug_dma_map_sg+0x2f4/0x39c DMA-API: jz4780-dma 13420000.dma-controller: mapping sg segment longer than device claims to support [len=98304] [max=65536] CPU: 0 PID: 21 Comm: kworker/0:1H Not tainted 5.18.0-rc1 #19 Workqueue: kblockd blk_mq_run_work_fn Stack : 81575aec 00000004 80620000 80620000 80620000 805e7358 00000009 801537ac 814c832c 806276e3 806e34b4 80620000 81575aec 00000001 81575ab8 09291444 00000000 00000000 805e7358 81575958 ffffffea 8157596c 00000000 636f6c62 6220646b 80387a70 0000000f 6d5f6b6c 80620000 00000000 81575ba4 00000009 805e170c 80896640 00000001 00010000 00000000 00000000 00006098 806e0000 ... Call Trace: [<80107670>] show_stack+0x84/0x120 [<80528cd8>] __warn+0xb8/0xec [<80528d78>] warn_slowpath_fmt+0x6c/0xb8 [<8016f1d4>] debug_dma_map_sg+0x2f4/0x39c [<80169d4c>] __dma_map_sg_attrs+0xf0/0x118 [<8016a27c>] dma_map_sg_attrs+0x14/0x28 [<804f66b4>] jz4740_mmc_prepare_dma_data+0x74/0xa4 [<804f6714>] jz4740_mmc_pre_request+0x30/0x54 [<804f4ff4>] mmc_blk_mq_issue_rq+0x6e0/0x7bc [<804f5590>] mmc_mq_queue_rq+0x220/0x2d4 [<8038b2c0>] blk_mq_dispatch_rq_list+0x480/0x664 [<80391040>] blk_mq_do_dispatch_sched+0x2dc/0x370 [<80391468>] __blk_mq_sched_dispatch_requests+0xec/0x164 [<80391540>] blk_mq_sched_dispatch_requests+0x44/0x94 [<80387900>] __blk_mq_run_hw_queue+0xb0/0xcc [<80134c14>] process_one_work+0x1b8/0x264 [<80134ff8>] worker_thread+0x2ec/0x3b8 [<8013b13c>] kthread+0x104/0x10c [<80101dcc>] ret_from_kernel_thread+0x14/0x1c ---[ end trace 0000000000000000 ]--- Signed-off-by: Aidan MacDonald <aidanmacdonald.0x0@gmail.com> Link: https://lore.kernel.org/r/20220411153753.50443-1-aidanmacdonald.0x0@gmail.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
anthraxx
pushed a commit
that referenced
this pull request
Jun 9, 2022
[ Upstream commit afadb04 ] Do what is done in other DMA-enabled MMC host drivers (cf. host/mmci.c) and limit the maximum segment size based on the DMA engine's capabilities. This is needed to avoid warnings like the following with CONFIG_DMA_API_DEBUG=y. ------------[ cut here ]------------ WARNING: CPU: 0 PID: 21 at kernel/dma/debug.c:1162 debug_dma_map_sg+0x2f4/0x39c DMA-API: jz4780-dma 13420000.dma-controller: mapping sg segment longer than device claims to support [len=98304] [max=65536] CPU: 0 PID: 21 Comm: kworker/0:1H Not tainted 5.18.0-rc1 #19 Workqueue: kblockd blk_mq_run_work_fn Stack : 81575aec 00000004 80620000 80620000 80620000 805e7358 00000009 801537ac 814c832c 806276e3 806e34b4 80620000 81575aec 00000001 81575ab8 09291444 00000000 00000000 805e7358 81575958 ffffffea 8157596c 00000000 636f6c62 6220646b 80387a70 0000000f 6d5f6b6c 80620000 00000000 81575ba4 00000009 805e170c 80896640 00000001 00010000 00000000 00000000 00006098 806e0000 ... Call Trace: [<80107670>] show_stack+0x84/0x120 [<80528cd8>] __warn+0xb8/0xec [<80528d78>] warn_slowpath_fmt+0x6c/0xb8 [<8016f1d4>] debug_dma_map_sg+0x2f4/0x39c [<80169d4c>] __dma_map_sg_attrs+0xf0/0x118 [<8016a27c>] dma_map_sg_attrs+0x14/0x28 [<804f66b4>] jz4740_mmc_prepare_dma_data+0x74/0xa4 [<804f6714>] jz4740_mmc_pre_request+0x30/0x54 [<804f4ff4>] mmc_blk_mq_issue_rq+0x6e0/0x7bc [<804f5590>] mmc_mq_queue_rq+0x220/0x2d4 [<8038b2c0>] blk_mq_dispatch_rq_list+0x480/0x664 [<80391040>] blk_mq_do_dispatch_sched+0x2dc/0x370 [<80391468>] __blk_mq_sched_dispatch_requests+0xec/0x164 [<80391540>] blk_mq_sched_dispatch_requests+0x44/0x94 [<80387900>] __blk_mq_run_hw_queue+0xb0/0xcc [<80134c14>] process_one_work+0x1b8/0x264 [<80134ff8>] worker_thread+0x2ec/0x3b8 [<8013b13c>] kthread+0x104/0x10c [<80101dcc>] ret_from_kernel_thread+0x14/0x1c ---[ end trace 0000000000000000 ]--- Signed-off-by: Aidan MacDonald <aidanmacdonald.0x0@gmail.com> Link: https://lore.kernel.org/r/20220411153753.50443-1-aidanmacdonald.0x0@gmail.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
anthraxx
pushed a commit
that referenced
this pull request
Jun 9, 2022
[ Upstream commit afadb04 ] Do what is done in other DMA-enabled MMC host drivers (cf. host/mmci.c) and limit the maximum segment size based on the DMA engine's capabilities. This is needed to avoid warnings like the following with CONFIG_DMA_API_DEBUG=y. ------------[ cut here ]------------ WARNING: CPU: 0 PID: 21 at kernel/dma/debug.c:1162 debug_dma_map_sg+0x2f4/0x39c DMA-API: jz4780-dma 13420000.dma-controller: mapping sg segment longer than device claims to support [len=98304] [max=65536] CPU: 0 PID: 21 Comm: kworker/0:1H Not tainted 5.18.0-rc1 #19 Workqueue: kblockd blk_mq_run_work_fn Stack : 81575aec 00000004 80620000 80620000 80620000 805e7358 00000009 801537ac 814c832c 806276e3 806e34b4 80620000 81575aec 00000001 81575ab8 09291444 00000000 00000000 805e7358 81575958 ffffffea 8157596c 00000000 636f6c62 6220646b 80387a70 0000000f 6d5f6b6c 80620000 00000000 81575ba4 00000009 805e170c 80896640 00000001 00010000 00000000 00000000 00006098 806e0000 ... Call Trace: [<80107670>] show_stack+0x84/0x120 [<80528cd8>] __warn+0xb8/0xec [<80528d78>] warn_slowpath_fmt+0x6c/0xb8 [<8016f1d4>] debug_dma_map_sg+0x2f4/0x39c [<80169d4c>] __dma_map_sg_attrs+0xf0/0x118 [<8016a27c>] dma_map_sg_attrs+0x14/0x28 [<804f66b4>] jz4740_mmc_prepare_dma_data+0x74/0xa4 [<804f6714>] jz4740_mmc_pre_request+0x30/0x54 [<804f4ff4>] mmc_blk_mq_issue_rq+0x6e0/0x7bc [<804f5590>] mmc_mq_queue_rq+0x220/0x2d4 [<8038b2c0>] blk_mq_dispatch_rq_list+0x480/0x664 [<80391040>] blk_mq_do_dispatch_sched+0x2dc/0x370 [<80391468>] __blk_mq_sched_dispatch_requests+0xec/0x164 [<80391540>] blk_mq_sched_dispatch_requests+0x44/0x94 [<80387900>] __blk_mq_run_hw_queue+0xb0/0xcc [<80134c14>] process_one_work+0x1b8/0x264 [<80134ff8>] worker_thread+0x2ec/0x3b8 [<8013b13c>] kthread+0x104/0x10c [<80101dcc>] ret_from_kernel_thread+0x14/0x1c ---[ end trace 0000000000000000 ]--- Signed-off-by: Aidan MacDonald <aidanmacdonald.0x0@gmail.com> Link: https://lore.kernel.org/r/20220411153753.50443-1-aidanmacdonald.0x0@gmail.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
anthraxx
pushed a commit
that referenced
this pull request
Jun 9, 2022
[ Upstream commit afadb04 ] Do what is done in other DMA-enabled MMC host drivers (cf. host/mmci.c) and limit the maximum segment size based on the DMA engine's capabilities. This is needed to avoid warnings like the following with CONFIG_DMA_API_DEBUG=y. ------------[ cut here ]------------ WARNING: CPU: 0 PID: 21 at kernel/dma/debug.c:1162 debug_dma_map_sg+0x2f4/0x39c DMA-API: jz4780-dma 13420000.dma-controller: mapping sg segment longer than device claims to support [len=98304] [max=65536] CPU: 0 PID: 21 Comm: kworker/0:1H Not tainted 5.18.0-rc1 #19 Workqueue: kblockd blk_mq_run_work_fn Stack : 81575aec 00000004 80620000 80620000 80620000 805e7358 00000009 801537ac 814c832c 806276e3 806e34b4 80620000 81575aec 00000001 81575ab8 09291444 00000000 00000000 805e7358 81575958 ffffffea 8157596c 00000000 636f6c62 6220646b 80387a70 0000000f 6d5f6b6c 80620000 00000000 81575ba4 00000009 805e170c 80896640 00000001 00010000 00000000 00000000 00006098 806e0000 ... Call Trace: [<80107670>] show_stack+0x84/0x120 [<80528cd8>] __warn+0xb8/0xec [<80528d78>] warn_slowpath_fmt+0x6c/0xb8 [<8016f1d4>] debug_dma_map_sg+0x2f4/0x39c [<80169d4c>] __dma_map_sg_attrs+0xf0/0x118 [<8016a27c>] dma_map_sg_attrs+0x14/0x28 [<804f66b4>] jz4740_mmc_prepare_dma_data+0x74/0xa4 [<804f6714>] jz4740_mmc_pre_request+0x30/0x54 [<804f4ff4>] mmc_blk_mq_issue_rq+0x6e0/0x7bc [<804f5590>] mmc_mq_queue_rq+0x220/0x2d4 [<8038b2c0>] blk_mq_dispatch_rq_list+0x480/0x664 [<80391040>] blk_mq_do_dispatch_sched+0x2dc/0x370 [<80391468>] __blk_mq_sched_dispatch_requests+0xec/0x164 [<80391540>] blk_mq_sched_dispatch_requests+0x44/0x94 [<80387900>] __blk_mq_run_hw_queue+0xb0/0xcc [<80134c14>] process_one_work+0x1b8/0x264 [<80134ff8>] worker_thread+0x2ec/0x3b8 [<8013b13c>] kthread+0x104/0x10c [<80101dcc>] ret_from_kernel_thread+0x14/0x1c ---[ end trace 0000000000000000 ]--- Signed-off-by: Aidan MacDonald <aidanmacdonald.0x0@gmail.com> Link: https://lore.kernel.org/r/20220411153753.50443-1-aidanmacdonald.0x0@gmail.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
anthraxx
pushed a commit
that referenced
this pull request
Jun 14, 2022
[ Upstream commit 12025ab ] When setting bootparams="trace_event=initcall:initcall_start tp_printk=1" in the cmdline, the output_printk() was called, and the spin_lock_irqsave() was called in the atomic and irq disable interrupt context suitation. On the PREEMPT_RT kernel, these locks are replaced with sleepable rt-spinlock, so the stack calltrace will be triggered. Fix it by raw_spin_lock_irqsave when PREEMPT_RT and "trace_event=initcall:initcall_start tp_printk=1" enabled. BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:46 in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 1, name: swapper/0 preempt_count: 2, expected: 0 RCU nest depth: 0, expected: 0 Preemption disabled at: [<ffffffff8992303e>] try_to_wake_up+0x7e/0xba0 CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.17.1-rt17+ #19 34c5812404187a875f32bee7977f7367f9679ea7 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x60/0x8c dump_stack+0x10/0x12 __might_resched.cold+0x11d/0x155 rt_spin_lock+0x40/0x70 trace_event_buffer_commit+0x2fa/0x4c0 ? map_vsyscall+0x93/0x93 trace_event_raw_event_initcall_start+0xbe/0x110 ? perf_trace_initcall_finish+0x210/0x210 ? probe_sched_wakeup+0x34/0x40 ? ttwu_do_wakeup+0xda/0x310 ? trace_hardirqs_on+0x35/0x170 ? map_vsyscall+0x93/0x93 do_one_initcall+0x217/0x3c0 ? trace_event_raw_event_initcall_level+0x170/0x170 ? push_cpu_stop+0x400/0x400 ? cblist_init_generic+0x241/0x290 kernel_init_freeable+0x1ac/0x347 ? _raw_spin_unlock_irq+0x65/0x80 ? rest_init+0xf0/0xf0 kernel_init+0x1e/0x150 ret_from_fork+0x22/0x30 </TASK> Link: https://lkml.kernel.org/r/20220419013910.894370-1-jun.miao@intel.com Signed-off-by: Jun Miao <jun.miao@intel.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
anthraxx
pushed a commit
that referenced
this pull request
Jun 14, 2022
[ Upstream commit 12025ab ] When setting bootparams="trace_event=initcall:initcall_start tp_printk=1" in the cmdline, the output_printk() was called, and the spin_lock_irqsave() was called in the atomic and irq disable interrupt context suitation. On the PREEMPT_RT kernel, these locks are replaced with sleepable rt-spinlock, so the stack calltrace will be triggered. Fix it by raw_spin_lock_irqsave when PREEMPT_RT and "trace_event=initcall:initcall_start tp_printk=1" enabled. BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:46 in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 1, name: swapper/0 preempt_count: 2, expected: 0 RCU nest depth: 0, expected: 0 Preemption disabled at: [<ffffffff8992303e>] try_to_wake_up+0x7e/0xba0 CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.17.1-rt17+ #19 34c5812404187a875f32bee7977f7367f9679ea7 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x60/0x8c dump_stack+0x10/0x12 __might_resched.cold+0x11d/0x155 rt_spin_lock+0x40/0x70 trace_event_buffer_commit+0x2fa/0x4c0 ? map_vsyscall+0x93/0x93 trace_event_raw_event_initcall_start+0xbe/0x110 ? perf_trace_initcall_finish+0x210/0x210 ? probe_sched_wakeup+0x34/0x40 ? ttwu_do_wakeup+0xda/0x310 ? trace_hardirqs_on+0x35/0x170 ? map_vsyscall+0x93/0x93 do_one_initcall+0x217/0x3c0 ? trace_event_raw_event_initcall_level+0x170/0x170 ? push_cpu_stop+0x400/0x400 ? cblist_init_generic+0x241/0x290 kernel_init_freeable+0x1ac/0x347 ? _raw_spin_unlock_irq+0x65/0x80 ? rest_init+0xf0/0xf0 kernel_init+0x1e/0x150 ret_from_fork+0x22/0x30 </TASK> Link: https://lkml.kernel.org/r/20220419013910.894370-1-jun.miao@intel.com Signed-off-by: Jun Miao <jun.miao@intel.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
anthraxx
pushed a commit
that referenced
this pull request
Jun 14, 2022
[ Upstream commit afadb04 ] Do what is done in other DMA-enabled MMC host drivers (cf. host/mmci.c) and limit the maximum segment size based on the DMA engine's capabilities. This is needed to avoid warnings like the following with CONFIG_DMA_API_DEBUG=y. ------------[ cut here ]------------ WARNING: CPU: 0 PID: 21 at kernel/dma/debug.c:1162 debug_dma_map_sg+0x2f4/0x39c DMA-API: jz4780-dma 13420000.dma-controller: mapping sg segment longer than device claims to support [len=98304] [max=65536] CPU: 0 PID: 21 Comm: kworker/0:1H Not tainted 5.18.0-rc1 #19 Workqueue: kblockd blk_mq_run_work_fn Stack : 81575aec 00000004 80620000 80620000 80620000 805e7358 00000009 801537ac 814c832c 806276e3 806e34b4 80620000 81575aec 00000001 81575ab8 09291444 00000000 00000000 805e7358 81575958 ffffffea 8157596c 00000000 636f6c62 6220646b 80387a70 0000000f 6d5f6b6c 80620000 00000000 81575ba4 00000009 805e170c 80896640 00000001 00010000 00000000 00000000 00006098 806e0000 ... Call Trace: [<80107670>] show_stack+0x84/0x120 [<80528cd8>] __warn+0xb8/0xec [<80528d78>] warn_slowpath_fmt+0x6c/0xb8 [<8016f1d4>] debug_dma_map_sg+0x2f4/0x39c [<80169d4c>] __dma_map_sg_attrs+0xf0/0x118 [<8016a27c>] dma_map_sg_attrs+0x14/0x28 [<804f66b4>] jz4740_mmc_prepare_dma_data+0x74/0xa4 [<804f6714>] jz4740_mmc_pre_request+0x30/0x54 [<804f4ff4>] mmc_blk_mq_issue_rq+0x6e0/0x7bc [<804f5590>] mmc_mq_queue_rq+0x220/0x2d4 [<8038b2c0>] blk_mq_dispatch_rq_list+0x480/0x664 [<80391040>] blk_mq_do_dispatch_sched+0x2dc/0x370 [<80391468>] __blk_mq_sched_dispatch_requests+0xec/0x164 [<80391540>] blk_mq_sched_dispatch_requests+0x44/0x94 [<80387900>] __blk_mq_run_hw_queue+0xb0/0xcc [<80134c14>] process_one_work+0x1b8/0x264 [<80134ff8>] worker_thread+0x2ec/0x3b8 [<8013b13c>] kthread+0x104/0x10c [<80101dcc>] ret_from_kernel_thread+0x14/0x1c ---[ end trace 0000000000000000 ]--- Signed-off-by: Aidan MacDonald <aidanmacdonald.0x0@gmail.com> Link: https://lore.kernel.org/r/20220411153753.50443-1-aidanmacdonald.0x0@gmail.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
anthraxx
pushed a commit
that referenced
this pull request
Jun 14, 2022
[ Upstream commit 12025ab ] When setting bootparams="trace_event=initcall:initcall_start tp_printk=1" in the cmdline, the output_printk() was called, and the spin_lock_irqsave() was called in the atomic and irq disable interrupt context suitation. On the PREEMPT_RT kernel, these locks are replaced with sleepable rt-spinlock, so the stack calltrace will be triggered. Fix it by raw_spin_lock_irqsave when PREEMPT_RT and "trace_event=initcall:initcall_start tp_printk=1" enabled. BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:46 in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 1, name: swapper/0 preempt_count: 2, expected: 0 RCU nest depth: 0, expected: 0 Preemption disabled at: [<ffffffff8992303e>] try_to_wake_up+0x7e/0xba0 CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.17.1-rt17+ #19 34c5812404187a875f32bee7977f7367f9679ea7 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x60/0x8c dump_stack+0x10/0x12 __might_resched.cold+0x11d/0x155 rt_spin_lock+0x40/0x70 trace_event_buffer_commit+0x2fa/0x4c0 ? map_vsyscall+0x93/0x93 trace_event_raw_event_initcall_start+0xbe/0x110 ? perf_trace_initcall_finish+0x210/0x210 ? probe_sched_wakeup+0x34/0x40 ? ttwu_do_wakeup+0xda/0x310 ? trace_hardirqs_on+0x35/0x170 ? map_vsyscall+0x93/0x93 do_one_initcall+0x217/0x3c0 ? trace_event_raw_event_initcall_level+0x170/0x170 ? push_cpu_stop+0x400/0x400 ? cblist_init_generic+0x241/0x290 kernel_init_freeable+0x1ac/0x347 ? _raw_spin_unlock_irq+0x65/0x80 ? rest_init+0xf0/0xf0 kernel_init+0x1e/0x150 ret_from_fork+0x22/0x30 </TASK> Link: https://lkml.kernel.org/r/20220419013910.894370-1-jun.miao@intel.com Signed-off-by: Jun Miao <jun.miao@intel.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
anthraxx
pushed a commit
that referenced
this pull request
Jun 14, 2022
[ Upstream commit 12025ab ] When setting bootparams="trace_event=initcall:initcall_start tp_printk=1" in the cmdline, the output_printk() was called, and the spin_lock_irqsave() was called in the atomic and irq disable interrupt context suitation. On the PREEMPT_RT kernel, these locks are replaced with sleepable rt-spinlock, so the stack calltrace will be triggered. Fix it by raw_spin_lock_irqsave when PREEMPT_RT and "trace_event=initcall:initcall_start tp_printk=1" enabled. BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:46 in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 1, name: swapper/0 preempt_count: 2, expected: 0 RCU nest depth: 0, expected: 0 Preemption disabled at: [<ffffffff8992303e>] try_to_wake_up+0x7e/0xba0 CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.17.1-rt17+ #19 34c5812404187a875f32bee7977f7367f9679ea7 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x60/0x8c dump_stack+0x10/0x12 __might_resched.cold+0x11d/0x155 rt_spin_lock+0x40/0x70 trace_event_buffer_commit+0x2fa/0x4c0 ? map_vsyscall+0x93/0x93 trace_event_raw_event_initcall_start+0xbe/0x110 ? perf_trace_initcall_finish+0x210/0x210 ? probe_sched_wakeup+0x34/0x40 ? ttwu_do_wakeup+0xda/0x310 ? trace_hardirqs_on+0x35/0x170 ? map_vsyscall+0x93/0x93 do_one_initcall+0x217/0x3c0 ? trace_event_raw_event_initcall_level+0x170/0x170 ? push_cpu_stop+0x400/0x400 ? cblist_init_generic+0x241/0x290 kernel_init_freeable+0x1ac/0x347 ? _raw_spin_unlock_irq+0x65/0x80 ? rest_init+0xf0/0xf0 kernel_init+0x1e/0x150 ret_from_fork+0x22/0x30 </TASK> Link: https://lkml.kernel.org/r/20220419013910.894370-1-jun.miao@intel.com Signed-off-by: Jun Miao <jun.miao@intel.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
anthraxx
pushed a commit
that referenced
this pull request
Jun 14, 2022
[ Upstream commit 12025ab ] When setting bootparams="trace_event=initcall:initcall_start tp_printk=1" in the cmdline, the output_printk() was called, and the spin_lock_irqsave() was called in the atomic and irq disable interrupt context suitation. On the PREEMPT_RT kernel, these locks are replaced with sleepable rt-spinlock, so the stack calltrace will be triggered. Fix it by raw_spin_lock_irqsave when PREEMPT_RT and "trace_event=initcall:initcall_start tp_printk=1" enabled. BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:46 in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 1, name: swapper/0 preempt_count: 2, expected: 0 RCU nest depth: 0, expected: 0 Preemption disabled at: [<ffffffff8992303e>] try_to_wake_up+0x7e/0xba0 CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.17.1-rt17+ #19 34c5812404187a875f32bee7977f7367f9679ea7 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x60/0x8c dump_stack+0x10/0x12 __might_resched.cold+0x11d/0x155 rt_spin_lock+0x40/0x70 trace_event_buffer_commit+0x2fa/0x4c0 ? map_vsyscall+0x93/0x93 trace_event_raw_event_initcall_start+0xbe/0x110 ? perf_trace_initcall_finish+0x210/0x210 ? probe_sched_wakeup+0x34/0x40 ? ttwu_do_wakeup+0xda/0x310 ? trace_hardirqs_on+0x35/0x170 ? map_vsyscall+0x93/0x93 do_one_initcall+0x217/0x3c0 ? trace_event_raw_event_initcall_level+0x170/0x170 ? push_cpu_stop+0x400/0x400 ? cblist_init_generic+0x241/0x290 kernel_init_freeable+0x1ac/0x347 ? _raw_spin_unlock_irq+0x65/0x80 ? rest_init+0xf0/0xf0 kernel_init+0x1e/0x150 ret_from_fork+0x22/0x30 </TASK> Link: https://lkml.kernel.org/r/20220419013910.894370-1-jun.miao@intel.com Signed-off-by: Jun Miao <jun.miao@intel.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
anthraxx
pushed a commit
that referenced
this pull request
Jun 14, 2022
[ Upstream commit 12025ab ] When setting bootparams="trace_event=initcall:initcall_start tp_printk=1" in the cmdline, the output_printk() was called, and the spin_lock_irqsave() was called in the atomic and irq disable interrupt context suitation. On the PREEMPT_RT kernel, these locks are replaced with sleepable rt-spinlock, so the stack calltrace will be triggered. Fix it by raw_spin_lock_irqsave when PREEMPT_RT and "trace_event=initcall:initcall_start tp_printk=1" enabled. BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:46 in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 1, name: swapper/0 preempt_count: 2, expected: 0 RCU nest depth: 0, expected: 0 Preemption disabled at: [<ffffffff8992303e>] try_to_wake_up+0x7e/0xba0 CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.17.1-rt17+ #19 34c5812404187a875f32bee7977f7367f9679ea7 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x60/0x8c dump_stack+0x10/0x12 __might_resched.cold+0x11d/0x155 rt_spin_lock+0x40/0x70 trace_event_buffer_commit+0x2fa/0x4c0 ? map_vsyscall+0x93/0x93 trace_event_raw_event_initcall_start+0xbe/0x110 ? perf_trace_initcall_finish+0x210/0x210 ? probe_sched_wakeup+0x34/0x40 ? ttwu_do_wakeup+0xda/0x310 ? trace_hardirqs_on+0x35/0x170 ? map_vsyscall+0x93/0x93 do_one_initcall+0x217/0x3c0 ? trace_event_raw_event_initcall_level+0x170/0x170 ? push_cpu_stop+0x400/0x400 ? cblist_init_generic+0x241/0x290 kernel_init_freeable+0x1ac/0x347 ? _raw_spin_unlock_irq+0x65/0x80 ? rest_init+0xf0/0xf0 kernel_init+0x1e/0x150 ret_from_fork+0x22/0x30 </TASK> Link: https://lkml.kernel.org/r/20220419013910.894370-1-jun.miao@intel.com Signed-off-by: Jun Miao <jun.miao@intel.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
anthraxx
pushed a commit
that referenced
this pull request
Jun 16, 2022
[ Upstream commit 12025ab ] When setting bootparams="trace_event=initcall:initcall_start tp_printk=1" in the cmdline, the output_printk() was called, and the spin_lock_irqsave() was called in the atomic and irq disable interrupt context suitation. On the PREEMPT_RT kernel, these locks are replaced with sleepable rt-spinlock, so the stack calltrace will be triggered. Fix it by raw_spin_lock_irqsave when PREEMPT_RT and "trace_event=initcall:initcall_start tp_printk=1" enabled. BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:46 in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 1, name: swapper/0 preempt_count: 2, expected: 0 RCU nest depth: 0, expected: 0 Preemption disabled at: [<ffffffff8992303e>] try_to_wake_up+0x7e/0xba0 CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.17.1-rt17+ #19 34c5812404187a875f32bee7977f7367f9679ea7 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x60/0x8c dump_stack+0x10/0x12 __might_resched.cold+0x11d/0x155 rt_spin_lock+0x40/0x70 trace_event_buffer_commit+0x2fa/0x4c0 ? map_vsyscall+0x93/0x93 trace_event_raw_event_initcall_start+0xbe/0x110 ? perf_trace_initcall_finish+0x210/0x210 ? probe_sched_wakeup+0x34/0x40 ? ttwu_do_wakeup+0xda/0x310 ? trace_hardirqs_on+0x35/0x170 ? map_vsyscall+0x93/0x93 do_one_initcall+0x217/0x3c0 ? trace_event_raw_event_initcall_level+0x170/0x170 ? push_cpu_stop+0x400/0x400 ? cblist_init_generic+0x241/0x290 kernel_init_freeable+0x1ac/0x347 ? _raw_spin_unlock_irq+0x65/0x80 ? rest_init+0xf0/0xf0 kernel_init+0x1e/0x150 ret_from_fork+0x22/0x30 </TASK> Link: https://lkml.kernel.org/r/20220419013910.894370-1-jun.miao@intel.com Signed-off-by: Jun Miao <jun.miao@intel.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
anthraxx
pushed a commit
that referenced
this pull request
Nov 3, 2022
[ Upstream commit 3a66124 ] DRM commit_tails() will disable downstream crtc/encoder/bridge if both disable crtc is required and crtc->active is set before pushing a new frame downstream. There is a rare case that user space display manager issue an extra screen update immediately followed by close DRM device while down stream display interface is disabled. This extra screen update will timeout due to the downstream interface is disabled but will cause crtc->active be set. Hence the followed commit_tails() called by drm_release() will pass the disable downstream crtc/encoder/bridge conditions checking even downstream interface is disabled. This cause the crash to happen at dp_bridge_disable() due to it trying to access the main link register to push the idle pattern out while main link clocks is disabled. This patch adds atomic_check to prevent the extra frame will not be pushed down if display interface is down so that crtc->active will not be set neither. This will fail the conditions checking of disabling down stream crtc/encoder/bridge which prevent drm_release() from calling dp_bridge_disable() so that crash at dp_bridge_disable() prevented. There is no protection in the DRM framework to check if the display pipeline has been already disabled before trying again. The only check is the crtc_state->active but this is controlled by usermode using UAPI. Hence if the usermode sets this and then crashes, the driver needs to protect against double disable. SError Interrupt on CPU7, code 0x00000000be000411 -- SError CPU: 7 PID: 3878 Comm: Xorg Not tainted 5.19.0-stb-cbq #19 Hardware name: Google Lazor (rev3 - 8) (DT) pstate: a04000c9 (NzCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : __cmpxchg_case_acq_32+0x14/0x2c lr : do_raw_spin_lock+0xa4/0xdc sp : ffffffc01092b6a0 x29: ffffffc01092b6a0 x28: 0000000000000028 x27: 0000000000000038 x26: 0000000000000004 x25: ffffffd2973dce48 x24: 0000000000000000 x23: 00000000ffffffff x22: 00000000ffffffff x21: ffffffd2978d0008 x20: ffffffd2978d0008 x19: ffffff80ff759fc0 x18: 0000000000000000 x17: 004800a501260460 x16: 0441043b04600438 x15: 04380000089807d0 x14: 07b0089807800780 x13: 0000000000000000 x12: 0000000000000000 x11: 0000000000000438 x10: 00000000000007d0 x9 : ffffffd2973e09e4 x8 : ffffff8092d53300 x7 : ffffff808902e8b8 x6 : 0000000000000001 x5 : ffffff808902e880 x4 : 0000000000000000 x3 : ffffff80ff759fc0 x2 : 0000000000000001 x1 : 0000000000000000 x0 : ffffff80ff759fc0 Kernel panic - not syncing: Asynchronous SError Interrupt CPU: 7 PID: 3878 Comm: Xorg Not tainted 5.19.0-stb-cbq #19 Hardware name: Google Lazor (rev3 - 8) (DT) Call trace: dump_backtrace.part.0+0xbc/0xe4 show_stack+0x24/0x70 dump_stack_lvl+0x68/0x84 dump_stack+0x18/0x34 panic+0x14c/0x32c nmi_panic+0x58/0x7c arm64_serror_panic+0x78/0x84 do_serror+0x40/0x64 el1h_64_error_handler+0x30/0x48 el1h_64_error+0x68/0x6c __cmpxchg_case_acq_32+0x14/0x2c _raw_spin_lock_irqsave+0x38/0x4c lock_timer_base+0x40/0x78 __mod_timer+0xf4/0x25c schedule_timeout+0xd4/0xfc __wait_for_common+0xac/0x140 wait_for_completion_timeout+0x2c/0x54 dp_ctrl_push_idle+0x40/0x88 dp_bridge_disable+0x24/0x30 drm_atomic_bridge_chain_disable+0x90/0xbc drm_atomic_helper_commit_modeset_disables+0x198/0x444 msm_atomic_commit_tail+0x1d0/0x374 commit_tail+0x80/0x108 drm_atomic_helper_commit+0x118/0x11c drm_atomic_commit+0xb4/0xe0 drm_client_modeset_commit_atomic+0x184/0x224 drm_client_modeset_commit_locked+0x58/0x160 drm_client_modeset_commit+0x3c/0x64 __drm_fb_helper_restore_fbdev_mode_unlocked+0x98/0xac drm_fb_helper_set_par+0x74/0x80 drm_fb_helper_hotplug_event+0xdc/0xe0 __drm_fb_helper_restore_fbdev_mode_unlocked+0x7c/0xac drm_fb_helper_restore_fbdev_mode_unlocked+0x20/0x2c drm_fb_helper_lastclose+0x20/0x2c drm_lastclose+0x44/0x6c drm_release+0x88/0xd4 __fput+0x104/0x220 ____fput+0x1c/0x28 task_work_run+0x8c/0x100 do_exit+0x450/0x8d0 do_group_exit+0x40/0xac __wake_up_parent+0x0/0x38 invoke_syscall+0x84/0x11c el0_svc_common.constprop.0+0xb8/0xe4 do_el0_svc+0x8c/0xb8 el0_svc+0x2c/0x54 el0t_64_sync_handler+0x120/0x1c0 el0t_64_sync+0x190/0x194 SMP: stopping secondary CPUs Kernel Offset: 0x128e800000 from 0xffffffc008000000 PHYS_OFFSET: 0x80000000 CPU features: 0x800,00c2a015,19801c82 Memory Limit: none Changes in v2: -- add more commit text Changes in v3: -- add comments into dp_bridge_atomic_check() Changes in v4: -- rewording the comment into dp_bridge_atomic_check() Changes in v5: -- removed quote x at end of commit text Changes in v6: -- removed quote x at end of comment in dp_bridge_atomic_check() Fixes: 8a3b4c1 ("drm/msm/dp: employ bridge mechanism for display enable and disable") Reported-by: Leonard Lausen <leonard@lausen.nl> Suggested-by: Rob Clark <robdclark@gmail.com> Closes: https://gitlab.freedesktop.org/drm/msm/-/issues/17 Signed-off-by: Kuogee Hsieh <quic_khsieh@quicinc.com> Reviewed-by: Abhinav Kumar <quic_abhinavk@quicinc.com> Patchwork: https://patchwork.freedesktop.org/patch/505331/ Link: https://lore.kernel.org/r/1664408211-25314-1-git-send-email-quic_khsieh@quicinc.com Signed-off-by: Abhinav Kumar <quic_abhinavk@quicinc.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
anthraxx
pushed a commit
that referenced
this pull request
Mar 14, 2023
commit 95ab80a upstream. Otherwise the kernel can BUG with: [ 2245.426978] ============================================================================= [ 2245.435155] BUG bt_work (Tainted: G B W ): Objects remaining in bt_work on __kmem_cache_shutdown() [ 2245.445233] ----------------------------------------------------------------------------- [ 2245.445233] [ 2245.454879] Slab 0x00000000b0ce2b30 objects=64 used=2 fp=0x000000000a3c6a4e flags=0x17ffffc0000200(slab|node=0|zone=2|lastcpupid=0x1fffff) [ 2245.467300] CPU: 7 PID: 10805 Comm: lvm Kdump: loaded Tainted: G B W 6.0.0-rc2 #19 [ 2245.476078] Hardware name: Dell Inc. PowerEdge R7525/0590KW, BIOS 2.5.6 10/06/2021 [ 2245.483646] Call Trace: [ 2245.486100] <TASK> [ 2245.488206] dump_stack_lvl+0x34/0x48 [ 2245.491878] slab_err+0x95/0xcd [ 2245.495028] __kmem_cache_shutdown.cold+0x31/0x136 [ 2245.499821] kmem_cache_destroy+0x49/0x130 [ 2245.503928] btracker_destroy+0x12/0x20 [dm_cache] [ 2245.508728] smq_destroy+0x15/0x60 [dm_cache_smq] [ 2245.513435] dm_cache_policy_destroy+0x12/0x20 [dm_cache] [ 2245.518834] destroy+0xc0/0x110 [dm_cache] [ 2245.522933] dm_table_destroy+0x5c/0x120 [dm_mod] [ 2245.527649] __dm_destroy+0x10e/0x1c0 [dm_mod] [ 2245.532102] dev_remove+0x117/0x190 [dm_mod] [ 2245.536384] ctl_ioctl+0x1a2/0x290 [dm_mod] [ 2245.540579] dm_ctl_ioctl+0xa/0x20 [dm_mod] [ 2245.544773] __x64_sys_ioctl+0x8a/0xc0 [ 2245.548524] do_syscall_64+0x5c/0x90 [ 2245.552104] ? syscall_exit_to_user_mode+0x12/0x30 [ 2245.556897] ? do_syscall_64+0x69/0x90 [ 2245.560648] ? do_syscall_64+0x69/0x90 [ 2245.564394] entry_SYSCALL_64_after_hwframe+0x63/0xcd [ 2245.569447] RIP: 0033:0x7fe52583ec6b ... [ 2245.646771] ------------[ cut here ]------------ [ 2245.651395] kmem_cache_destroy bt_work: Slab cache still has objects when called from btracker_destroy+0x12/0x20 [dm_cache] [ 2245.651408] WARNING: CPU: 7 PID: 10805 at mm/slab_common.c:478 kmem_cache_destroy+0x128/0x130 Found using: lvm2-testsuite --only "cache-single-split.sh" Ben bisected and found that commit 0495e33 ("mm/slab_common: Deleting kobject in kmem_cache_destroy() without holding slab_mutex/cpu_hotplug_lock") first exposed dm-cache's incomplete cleanup of its background tracker work objects. Reported-by: Benjamin Marzinski <bmarzins@redhat.com> Tested-by: Benjamin Marzinski <bmarzins@redhat.com> Cc: stable@vger.kernel.org # 6.0+ Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
anthraxx
pushed a commit
that referenced
this pull request
Apr 1, 2023
[ Upstream commit 4e264be ] When a system with E810 with existing VFs gets rebooted the following hang may be observed. Pid 1 is hung in iavf_remove(), part of a network driver: PID: 1 TASK: ffff965400e5a340 CPU: 24 COMMAND: "systemd-shutdow" #0 [ffffaad04005fa50] __schedule at ffffffff8b3239cb #1 [ffffaad04005fae8] schedule at ffffffff8b323e2d #2 [ffffaad04005fb00] schedule_hrtimeout_range_clock at ffffffff8b32cebc #3 [ffffaad04005fb80] usleep_range_state at ffffffff8b32c930 #4 [ffffaad04005fbb0] iavf_remove at ffffffffc12b9b4c [iavf] #5 [ffffaad04005fbf0] pci_device_remove at ffffffff8add7513 #6 [ffffaad04005fc10] device_release_driver_internal at ffffffff8af08baa #7 [ffffaad04005fc40] pci_stop_bus_device at ffffffff8adcc5fc #8 [ffffaad04005fc60] pci_stop_and_remove_bus_device at ffffffff8adcc81e #9 [ffffaad04005fc70] pci_iov_remove_virtfn at ffffffff8adf9429 #10 [ffffaad04005fca8] sriov_disable at ffffffff8adf98e4 #11 [ffffaad04005fcc8] ice_free_vfs at ffffffffc04bb2c8 [ice] #12 [ffffaad04005fd10] ice_remove at ffffffffc04778fe [ice] #13 [ffffaad04005fd38] ice_shutdown at ffffffffc0477946 [ice] #14 [ffffaad04005fd50] pci_device_shutdown at ffffffff8add58f1 #15 [ffffaad04005fd70] device_shutdown at ffffffff8af05386 #16 [ffffaad04005fd98] kernel_restart at ffffffff8a92a870 #17 [ffffaad04005fda8] __do_sys_reboot at ffffffff8a92abd6 #18 [ffffaad04005fee0] do_syscall_64 at ffffffff8b317159 #19 [ffffaad04005ff08] __context_tracking_enter at ffffffff8b31b6fc #20 [ffffaad04005ff18] syscall_exit_to_user_mode at ffffffff8b31b50d #21 [ffffaad04005ff28] do_syscall_64 at ffffffff8b317169 #22 [ffffaad04005ff50] entry_SYSCALL_64_after_hwframe at ffffffff8b40009b RIP: 00007f1baa5c13d7 RSP: 00007fffbcc55a98 RFLAGS: 00000202 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f1baa5c13d7 RDX: 0000000001234567 RSI: 0000000028121969 RDI: 00000000fee1dead RBP: 00007fffbcc55ca0 R8: 0000000000000000 R9: 00007fffbcc54e90 R10: 00007fffbcc55050 R11: 0000000000000202 R12: 0000000000000005 R13: 0000000000000000 R14: 00007fffbcc55af0 R15: 0000000000000000 ORIG_RAX: 00000000000000a9 CS: 0033 SS: 002b During reboot all drivers PM shutdown callbacks are invoked. In iavf_shutdown() the adapter state is changed to __IAVF_REMOVE. In ice_shutdown() the call chain above is executed, which at some point calls iavf_remove(). However iavf_remove() expects the VF to be in one of the states __IAVF_RUNNING, __IAVF_DOWN or __IAVF_INIT_FAILED. If that's not the case it sleeps forever. So if iavf_shutdown() gets invoked before iavf_remove() the system will hang indefinitely because the adapter is already in state __IAVF_REMOVE. Fix this by returning from iavf_remove() if the state is __IAVF_REMOVE, as we already went through iavf_shutdown(). Fixes: 9745780 ("iavf: Add waiting so the port is initialized in remove") Fixes: a841733 ("iavf: Fix race condition between iavf_shutdown and iavf_remove") Reported-by: Marius Cornea <mcornea@redhat.com> Signed-off-by: Stefan Assmann <sassmann@kpanic.de> Reviewed-by: Michal Kubiak <michal.kubiak@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
anthraxx
pushed a commit
that referenced
this pull request
Apr 1, 2023
[ Upstream commit 4e264be ] When a system with E810 with existing VFs gets rebooted the following hang may be observed. Pid 1 is hung in iavf_remove(), part of a network driver: PID: 1 TASK: ffff965400e5a340 CPU: 24 COMMAND: "systemd-shutdow" #0 [ffffaad04005fa50] __schedule at ffffffff8b3239cb #1 [ffffaad04005fae8] schedule at ffffffff8b323e2d #2 [ffffaad04005fb00] schedule_hrtimeout_range_clock at ffffffff8b32cebc #3 [ffffaad04005fb80] usleep_range_state at ffffffff8b32c930 #4 [ffffaad04005fbb0] iavf_remove at ffffffffc12b9b4c [iavf] #5 [ffffaad04005fbf0] pci_device_remove at ffffffff8add7513 #6 [ffffaad04005fc10] device_release_driver_internal at ffffffff8af08baa #7 [ffffaad04005fc40] pci_stop_bus_device at ffffffff8adcc5fc #8 [ffffaad04005fc60] pci_stop_and_remove_bus_device at ffffffff8adcc81e #9 [ffffaad04005fc70] pci_iov_remove_virtfn at ffffffff8adf9429 #10 [ffffaad04005fca8] sriov_disable at ffffffff8adf98e4 #11 [ffffaad04005fcc8] ice_free_vfs at ffffffffc04bb2c8 [ice] #12 [ffffaad04005fd10] ice_remove at ffffffffc04778fe [ice] #13 [ffffaad04005fd38] ice_shutdown at ffffffffc0477946 [ice] #14 [ffffaad04005fd50] pci_device_shutdown at ffffffff8add58f1 #15 [ffffaad04005fd70] device_shutdown at ffffffff8af05386 #16 [ffffaad04005fd98] kernel_restart at ffffffff8a92a870 #17 [ffffaad04005fda8] __do_sys_reboot at ffffffff8a92abd6 #18 [ffffaad04005fee0] do_syscall_64 at ffffffff8b317159 #19 [ffffaad04005ff08] __context_tracking_enter at ffffffff8b31b6fc #20 [ffffaad04005ff18] syscall_exit_to_user_mode at ffffffff8b31b50d #21 [ffffaad04005ff28] do_syscall_64 at ffffffff8b317169 #22 [ffffaad04005ff50] entry_SYSCALL_64_after_hwframe at ffffffff8b40009b RIP: 00007f1baa5c13d7 RSP: 00007fffbcc55a98 RFLAGS: 00000202 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f1baa5c13d7 RDX: 0000000001234567 RSI: 0000000028121969 RDI: 00000000fee1dead RBP: 00007fffbcc55ca0 R8: 0000000000000000 R9: 00007fffbcc54e90 R10: 00007fffbcc55050 R11: 0000000000000202 R12: 0000000000000005 R13: 0000000000000000 R14: 00007fffbcc55af0 R15: 0000000000000000 ORIG_RAX: 00000000000000a9 CS: 0033 SS: 002b During reboot all drivers PM shutdown callbacks are invoked. In iavf_shutdown() the adapter state is changed to __IAVF_REMOVE. In ice_shutdown() the call chain above is executed, which at some point calls iavf_remove(). However iavf_remove() expects the VF to be in one of the states __IAVF_RUNNING, __IAVF_DOWN or __IAVF_INIT_FAILED. If that's not the case it sleeps forever. So if iavf_shutdown() gets invoked before iavf_remove() the system will hang indefinitely because the adapter is already in state __IAVF_REMOVE. Fix this by returning from iavf_remove() if the state is __IAVF_REMOVE, as we already went through iavf_shutdown(). Fixes: 9745780 ("iavf: Add waiting so the port is initialized in remove") Fixes: a841733 ("iavf: Fix race condition between iavf_shutdown and iavf_remove") Reported-by: Marius Cornea <mcornea@redhat.com> Signed-off-by: Stefan Assmann <sassmann@kpanic.de> Reviewed-by: Michal Kubiak <michal.kubiak@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This creates a CONFIG_SECURITY_MODHARDEN option and when enabled, restricts module auto-loading to root. This is based on GRKERNSEC_MODHARDEN.