drm/amdgpu: skip mes self test after s0i3 resume for MES IP v11.0
authorTim Huang <tim.huang@amd.com>
Mon, 19 Dec 2022 10:32:32 +0000 (18:32 +0800)
committerAlex Deucher <alexander.deucher@amd.com>
Tue, 20 Dec 2022 18:23:05 +0000 (13:23 -0500)
MES is part of gfxoff and MES suspend and resume are skipped for S0i3.
But the mes_self_test call path is still in the amdgpu_device_ip_late_init.
it's should also be skipped for s0ix as no hardware re-initialization
happened.

Besides, mes_self_test will free the BO that triggers a lot of warning
messages while in the suspend state.

[   81.656085] WARNING: CPU: 2 PID: 1550 at drivers/gpu/drm/amd/amdgpu/amdgpu_object.c:425 amdgpu_bo_free_kernel+0xfc/0x110 [amdgpu]
[   81.679435] Call Trace:
[   81.679726]  <TASK>
[   81.679981]  amdgpu_mes_remove_hw_queue+0x17a/0x230 [amdgpu]
[   81.680857]  amdgpu_mes_self_test+0x390/0x430 [amdgpu]
[   81.681665]  mes_v11_0_late_init+0x37/0x50 [amdgpu]
[   81.682423]  amdgpu_device_ip_late_init+0x53/0x280 [amdgpu]
[   81.683257]  amdgpu_device_resume+0xae/0x2a0 [amdgpu]
[   81.684043]  amdgpu_pmops_resume+0x37/0x70 [amdgpu]
[   81.684818]  pci_pm_resume+0x5c/0xa0
[   81.685247]  ? pci_pm_thaw+0x90/0x90
[   81.685658]  dpm_run_callback+0x4e/0x160
[   81.686110]  device_resume+0xad/0x210
[   81.686529]  async_resume+0x1e/0x40
[   81.686931]  async_run_entry_fn+0x33/0x120
[   81.687405]  process_one_work+0x21d/0x3f0
[   81.687869]  worker_thread+0x4a/0x3c0
[   81.688293]  ? process_one_work+0x3f0/0x3f0
[   81.688777]  kthread+0xff/0x130
[   81.689157]  ? kthread_complete_and_exit+0x20/0x20
[   81.689707]  ret_from_fork+0x22/0x30
[   81.690118]  </TASK>
[   81.690380] ---[ end trace 0000000000000000 ]---

v2: make the comment clean and use adev->in_s0ix instead of
adev->suspend

Signed-off-by: Tim Huang <tim.huang@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.0, 6.1
drivers/gpu/drm/amd/amdgpu/mes_v11_0.c

index 5459366f49ffe47e6798a82c70a2b56e53f58b0c..970b066b37bb917a0b99fdf54c8855bab86511e1 100644 (file)
@@ -1342,7 +1342,8 @@ static int mes_v11_0_late_init(void *handle)
 {
        struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
-       if (!amdgpu_in_reset(adev) &&
+       /* it's only intended for use in mes_self_test case, not for s0ix and reset */
+       if (!amdgpu_in_reset(adev) && !adev->in_s0ix &&
            (adev->ip_versions[GC_HWIP][0] != IP_VERSION(11, 0, 3)))
                amdgpu_mes_self_test(adev);