sfrench/cifs-2.6.git
4 years ago drm/amdgpu: initialize new parameters and functions for amdgpu_umc structure
Tao Zhou [Mon, 29 Jul 2019 06:28:35 +0000 (14:28 +0800)]
drm/amdgpu: initialize new parameters and functions for amdgpu_umc structure

add initialization for new members of amdgpu_umc structure

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years ago drm/amdgpu: add more parameters and functions to amdgpu_umc structure
Tao Zhou [Mon, 29 Jul 2019 06:10:54 +0000 (14:10 +0800)]
drm/amdgpu: add more parameters and functions to amdgpu_umc structure

expose more parameters and functions of specific umc version to common
umc layer, so amdgpu_umc layer and other blocks could access them

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years ago drm/amdgpu: remove the clear of MCA_ADDR
Tao Zhou [Mon, 29 Jul 2019 02:28:57 +0000 (10:28 +0800)]
drm/amdgpu: remove the clear of MCA_ADDR

clearing MCA_STATUS is enough to reset the whole MCA, writing zero to
MCA_ADDR is unnecessary

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years ago drm/amd/powerplay: honor hw limit on fetching metrics data for navi10
Kevin Wang [Fri, 2 Aug 2019 04:01:00 +0000 (12:01 +0800)]
drm/amd/powerplay: honor hw limit on fetching metrics data for navi10

Updating the metrics table too frequently will cause an SMU internal error.
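
A minimal sketch of the rate-limiting idea described above, assuming a cached
copy of the table and a minimum interval between real fetches. The names
metrics_cache, metrics_should_refetch and the 100 ms interval are hypothetical
and are not taken from the driver:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical minimum spacing between two real table fetches. */
#define METRICS_MIN_INTERVAL_MS 100u

struct metrics_cache {
    bool     valid;          /* true once the table was fetched at least once */
    uint64_t last_fetch_ms;  /* timestamp of the last real fetch */
    uint32_t fetch_count;    /* number of real fetches, for illustration */
};

/*
 * Returns true when the caller should fetch the table from firmware again,
 * false when the cached copy is recent enough to be reused.
 */
static bool metrics_should_refetch(struct metrics_cache *c, uint64_t now_ms)
{
    if (c->valid && now_ms - c->last_fetch_ms < METRICS_MIN_INTERVAL_MS)
        return false;

    c->valid = true;
    c->last_fetch_ms = now_ms;
    c->fetch_count++;
    return true;
}
```

Callers that ask again within the interval simply reuse the cached data, so
the firmware is not queried more often than the chosen limit.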

Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years ago drm/amd/display: Don't replace the dc_state for fast updates
Nicholas Kazlauskas [Wed, 31 Jul 2019 14:33:54 +0000 (10:33 -0400)]
drm/amd/display: Don't replace the dc_state for fast updates

[Why]
DRM private objects have no hw_done/flip_done fencing mechanism on their
own and cannot be used to sequence commits accordingly.

When issuing commits that don't touch the same set of hardware resources
like page-flips on different CRTCs we can run into the issue below
because of this:

1. Client requests non-blocking Commit #1, has a new dc_state #1,
state is swapped, commit tail is deferred to work queue

2. Client requests non-blocking Commit #2, has a new dc_state #2,
state is swapped, commit tail is deferred to work queue

3. Commit #2 work starts, commit tail finishes,
atomic state is cleared, dc_state #1 is freed

4. Commit #1 work starts,
commit tail encounters null pointer deref on dc_state #1

In order to change the DC state in the private object we need to
ensure that we wait for all outstanding commits to finish and that
any other pending commits wait for the current one to finish as
well.

We do this for MEDIUM and FULL updates, but not for FAST updates, nor
would we want to, since it would cause stuttering from the delays.

FAST updates that go through dm_determine_update_type_for_commit always
create a new dc_state and lock the DRM private object if there are
any changed planes.

We need the old state to validate, but we don't actually need the new
state here.

[How]
If the commit isn't a full update then the use after free can be
resolved by simply discarding the new state entirely and retaining
the existing one instead.

With this change the sequence above can be reexamined. Commit #2 will
still free Commit #1's reference, but before this happens we actually
added an additional reference as part of Commit #2.

If an update comes in during this that needs to change the dc_state
it will need to wait on Commit #1 and Commit #2 to finish. Then it'll
swap the state, finish the work in commit tail and drop the last
reference on Commit #2's dc_state.
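
The reference counting argument above can be illustrated with a small
user-space sketch. This is not the DC code; struct state and the
state_create/state_retain/state_release helpers are hypothetical stand-ins
for a refcounted dc_state:

```c
#include <stdlib.h>

/* Hypothetical refcounted state object. */
struct state {
    int  refcount;
    int *freed_flag;  /* set to 1 when the object is destroyed (demo only) */
};

/* Creates a state holding one reference; returns NULL on allocation failure. */
static struct state *state_create(int *freed_flag)
{
    struct state *s = malloc(sizeof(*s));

    if (!s)
        return NULL;
    s->refcount = 1;
    s->freed_flag = freed_flag;
    return s;
}

/* Takes an additional reference, as a commit that retains the state would. */
static void state_retain(struct state *s)
{
    s->refcount++;
}

/* Drops one reference and destroys the object when the last one is gone. */
static void state_release(struct state *s)
{
    if (--s->refcount == 0) {
        *s->freed_flag = 1;
        free(s);
    }
}
```

If the second commit retains the existing state before its tail drops a
reference, the object stays alive until the first commit's work has finished
as well.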

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=204181
Fixes: 004b3938e637 ("drm/amd/display: Check scaling info when determing update type")
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: David Francis <david.francis@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years ago drm/amd/display: Skip determining update type for async updates
Nicholas Kazlauskas [Wed, 31 Jul 2019 13:45:16 +0000 (09:45 -0400)]
drm/amd/display: Skip determining update type for async updates

[Why]
By passing through the dm_determine_update_type_for_commit for atomic
commits that can be done asynchronously we are incurring a
performance penalty by locking access to the global private object
and holding that access until the end of the programming sequence.

This is also allocating a new large dc_state on every access in addition
to retaining all the references on each stream and plane until the end
of the programming sequence.

[How]
Shift the determination for async update before validation. Return early
if it's going to be an async update.

Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: David Francis <david.francis@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years ago drm/amd/display: Allow cursor async updates for framebuffer swaps
Nicholas Kazlauskas [Mon, 10 Jun 2019 12:47:57 +0000 (08:47 -0400)]
drm/amd/display: Allow cursor async updates for framebuffer swaps

[Why]
We previously allowed framebuffer swaps as async updates for cursor
planes but had to disable them due to a bug in DRM with async update
handling and incorrect ref counting. The check to block framebuffer
swaps has been added to DRM for a while now, so this check is redundant.

The real fix that allows this to work properly in DRM has also finally been
merged and is being backported into stable branches, so now seems to be
the right time to drop this check.

[How]
Drop the redundant check for old_fb != new_fb.

With the proper fix in DRM, this should also fix some cursor stuttering
issues with xf86-video-amdgpu since it double buffers the cursor.

IGT tests that swap framebuffers (-varying-size for example) should
also pass again.

Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: David Francis <david.francis@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years ago drm/amdgpu: fix unsigned variable instance compared to less than zero
Colin Ian King [Thu, 1 Aug 2019 11:01:45 +0000 (12:01 +0100)]
drm/amdgpu: fix unsigned variable instance compared to less than zero

Currently the error check on variable instance is always false because
it is a uint32_t type and this is never less than zero. Fix this by
making it an int type.
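
A small sketch of why the check was dead, assuming a lookup that returns -1
on failure. The function find_instance and both checker functions are
hypothetical and only illustrate the type issue:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical lookup: returns an instance index, or -1 when not found. */
static int find_instance(int client_id)
{
    return (client_id >= 0 && client_id < 2) ? client_id : -1;
}

/*
 * Buggy variant: the result is stored in a uint32_t, so -1 wraps to
 * UINT32_MAX. A "less than zero" test on that value can never be true;
 * the widened copy below shows the value such a test actually sees.
 */
static bool error_detected_unsigned(int client_id)
{
    uint32_t instance = (uint32_t)find_instance(client_id);
    int64_t seen = (int64_t)instance;

    return seen < 0;  /* always false */
}

/* Fixed variant: a signed int keeps the negative error value intact. */
static bool error_detected_signed(int client_id)
{
    int instance = find_instance(client_id);

    return instance < 0;
}
```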

Addresses-Coverity: ("Unsigned compared against 0")
Fixes: 7d0e6329dfdc ("drm/amdgpu: update more sdma instances irq support")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years ago drm/amd/powerplay: Allow changing of fan_control in smu_v11_0
Matt Coffin [Wed, 31 Jul 2019 20:14:35 +0000 (14:14 -0600)]
drm/amd/powerplay: Allow changing of fan_control in smu_v11_0

[Why]
Before this change, the fan control state on smu_v11 could not be
changed because the check for the fan control capability was
inverted.

[How]
Invert the capability check for fan control in smu_v11_0_auto_fan_control
so that it correctly checks for the absence, instead of the presence, of
fan control capabilities.
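
As an illustration only, the effect of an inverted capability check can be
sketched as below. struct fan_ctx and both functions are hypothetical, and
the early "return 0" on a missing capability is an assumption, not a copy of
the driver code:

```c
#include <stdbool.h>

/* Hypothetical device context. */
struct fan_ctx {
    bool has_fan_control;  /* capability bit */
    bool auto_mode;        /* current fan control state */
};

/* Buggy: returns early when the capability IS present, so nothing changes. */
static int set_auto_fan_buggy(struct fan_ctx *c, bool on)
{
    if (c->has_fan_control)
        return 0;
    c->auto_mode = on;
    return 0;
}

/* Fixed: returns early only when the capability is absent. */
static int set_auto_fan_fixed(struct fan_ctx *c, bool on)
{
    if (!c->has_fan_control)
        return 0;
    c->auto_mode = on;
    return 0;
}
```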

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Matt Coffin <mcoffin13@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years ago drm/amd/powerplay: fix a few spelling mistakes
Colin Ian King [Thu, 1 Aug 2019 08:39:41 +0000 (09:39 +0100)]
drm/amd/powerplay: fix a few spelling mistakes

There are a few spelling mistakes "unknow" -> "unknown" and
"enabeld" -> "enabled". Fix these.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years ago gpu: drm: radeon: Fix a possible null-pointer dereference in radeon_connector_set_pro...
Jia-Ju Bai [Mon, 29 Jul 2019 08:36:44 +0000 (16:36 +0800)]
gpu: drm: radeon: Fix a possible null-pointer dereference in radeon_connector_set_property()

In radeon_connector_set_property(), there is an if statement on line 743
to check whether connector->encoder is NULL:
    if (connector->encoder)

When connector->encoder is NULL, it is used on line 755:
    if (connector->encoder->crtc)

Thus, a possible null-pointer dereference may occur.

To fix this bug, connector->encoder is checked before being used.
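
The guard can be sketched in isolation as follows. The structs are reduced,
hypothetical versions of the DRM types, and connector_crtc is an illustrative
helper rather than the actual patch:

```c
#include <stddef.h>

/* Reduced, hypothetical stand-ins for the DRM structures. */
struct crtc      { int id; };
struct encoder   { struct crtc *crtc; };
struct connector { struct encoder *encoder; };

/*
 * Returns the CRTC behind a connector, or NULL when there is no encoder.
 * The encoder pointer is checked before it is dereferenced.
 */
static struct crtc *connector_crtc(const struct connector *c)
{
    if (!c || !c->encoder)
        return NULL;
    return c->encoder->crtc;
}
```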

This bug is found by a static analysis tool STCheck written by us.

Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years ago drm/amd/powerplay: fix off-by-one upper bounds limit checks
Colin Ian King [Thu, 1 Aug 2019 11:15:41 +0000 (12:15 +0100)]
drm/amd/powerplay: fix off-by-one upper bounds limit checks

There are two occurrences of off-by-one upper bound checking of indexes
causing potential out-of-bounds array reads. Fix these.
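
A generic sketch of this class of bug, using a hypothetical name table (the
enum, the table and feature_name are illustrative, not the powerplay code).
An index equal to the element count is already out of range, so the check
must use ">=" rather than ">":

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical feature list; FEATURE_COUNT is the number of valid entries. */
enum feature { FEATURE_A, FEATURE_B, FEATURE_C, FEATURE_COUNT };

static const char *const feature_names[FEATURE_COUNT] = {
    "FEATURE_A", "FEATURE_B", "FEATURE_C",
};

/*
 * Valid indexes are 0 .. FEATURE_COUNT - 1. Testing "idx > FEATURE_COUNT"
 * would let idx == FEATURE_COUNT through and read one past the array.
 */
static const char *feature_name(unsigned int idx)
{
    if (idx >= FEATURE_COUNT)
        return "unknown feature";
    return feature_names[idx];
}
```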

Addresses-Coverity: ("Out-of-bounds read")
Fixes: cb33363d0e85 ("drm/amd/powerplay: add smu feature name support")
Fixes: 6b294793e384 ("drm/amd/powerplay: add smu message name support")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years ago drm/radeon: Fix EEH during kexec
KyleMahlkuch [Wed, 31 Jul 2019 22:10:14 +0000 (17:10 -0500)]
drm/radeon: Fix EEH during kexec

During kexec some adapters hit an EEH since they are not properly
shut down in the radeon_pci_shutdown() function. Adding
radeon_suspend_kms() fixes this issue.

Signed-off-by: KyleMahlkuch <kmahlkuc@linux.vnet.ibm.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years ago drm/amdkfd: Extend CU mask to 8 SEs (v3)
Jay Cornwall [Thu, 18 Jul 2019 21:57:22 +0000 (16:57 -0500)]
drm/amdkfd: Extend CU mask to 8 SEs (v3)

Following bitmap layout logic introduced by:
"drm/amdgpu: support get_cu_info for Arcturus".

v2: squash in fixup for gfx_v9_0.c (Alex)
v3: squash in debug print output fix

Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years ago drm/amdgpu: support get_cu_info for Arcturus
Le Ma [Mon, 8 Jul 2019 12:17:48 +0000 (20:17 +0800)]
drm/amdgpu: support get_cu_info for Arcturus

This change is needed because the SE/SH layout on Arcturus is 8*1, different
from 4*2 (or 4*1) on Vega ASICs.

Currently the cu bitmap array is 4x4 in size, and the bitmap is used widely
across the SW stack. To minimize the scale of impact, we make the cu bitmap
array compatible with the SE/SH layout on Arcturus. The cu bits of each
shader array for Arcturus are then stored as below:
    SE0,SH0 --> bitmap[0][0]
    SE1,SH0 --> bitmap[1][0]
    SE2,SH0 --> bitmap[2][0]
    SE3,SH0 --> bitmap[3][0]
    SE4,SH0 --> bitmap[0][1]
    SE5,SH0 --> bitmap[1][1]
    SE6,SH0 --> bitmap[2][1]
    SE7,SH0 --> bitmap[3][1]
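
The table above is consistent with a simple index calculation for SH0, shown
here as a sketch. struct cu_slot and arcturus_cu_slot are hypothetical names;
only the eight listed mappings come from the commit message:

```c
/* Position of one shader array inside the 4x4 cu bitmap. */
struct cu_slot {
    unsigned int row;  /* first index of bitmap[][] */
    unsigned int col;  /* second index of bitmap[][] */
};

/*
 * Maps shader engine se (0..7, SH0 only) to its bitmap slot:
 * SE0..SE3 fill column 0, SE4..SE7 wrap around into column 1.
 */
static struct cu_slot arcturus_cu_slot(unsigned int se)
{
    struct cu_slot slot;

    slot.row = se % 4;
    slot.col = se / 4;
    return slot;
}
```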

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years ago drm/amdgpu: Fix pcie_bw on Vega20
Kent Russell [Wed, 31 Jul 2019 13:24:32 +0000 (09:24 -0400)]
drm/amdgpu: Fix pcie_bw on Vega20

The registers used for VG20 are different in that certain performance
counters were split off to TXCLK3/4. Vega10/12 doesn't have this, so add
a new vg20_get_pcie_usage to reflect this change.

Signed-off-by: Kent Russell <kent.russell@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years ago drm/amdgpu: Update NBIO headers to add TXCLK3/4
Kent Russell [Wed, 31 Jul 2019 13:23:45 +0000 (09:23 -0400)]
drm/amdgpu: Update NBIO headers to add TXCLK3/4

These are added for VG20, and are needed for PCIe bandwidth.

Signed-off-by: Kent Russell <kent.russell@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years ago drm/amdgpu: Add amdgpu_asic_funcs.reset_method for Vega20
Andrey Grodzovsky [Thu, 1 Aug 2019 15:44:17 +0000 (11:44 -0400)]
drm/amdgpu: Add amdgpu_asic_funcs.reset_method for Vega20

Fixes GPU reset crash.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years ago drm/amdgpu: Mark KFD VRAM allocations for wipe on release
Felix Kuehling [Tue, 9 Jul 2019 00:01:22 +0000 (20:01 -0400)]
drm/amdgpu: Mark KFD VRAM allocations for wipe on release

Memory used by KFD applications can contain sensitive information that
should not be leaked to other processes. The current approach to prevent
leaks is to clear VRAM at allocation time. This is not effective because
memory can be reused in other ways without being cleared. Synchronously
clearing memory on the allocation path also carries a significant
performance penalty.

Stop clearing memory at allocation time. Instead mark the memory for
wipe on release.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years ago drm/amdgpu: Implement VRAM wipe on release
Felix Kuehling [Tue, 9 Jul 2019 23:12:44 +0000 (19:12 -0400)]
drm/amdgpu: Implement VRAM wipe on release

Wipe VRAM memory containing sensitive data when moving or releasing
BOs. Clearing the memory is pipelined to minimize any impact on
subsequent memory allocation latency. Use of a poison value should
help debug future use-after-free bugs.

When moving BOs, the existing ttm_bo_pipelined_move ensures that the
memory won't be reused before being wiped.

When releasing BOs, the BO is fenced with the memory fill operation,
which results in queuing the BO for a delayed delete.
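
The wipe-on-release idea can be sketched in plain C as below. This is a
synchronous user-space illustration, not the pipelined GPU fill the commit
describes; struct vram_buf, buffer_release and the WIPE_POISON value are
hypothetical:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical poison pattern; a recognizable value helps spot stale reads. */
#define WIPE_POISON 0xdeadbeefu

/* Hypothetical buffer object with a per-buffer wipe flag. */
struct vram_buf {
    uint32_t *words;
    size_t    nwords;
    bool      wipe_on_release;  /* set for buffers holding sensitive data */
};

/* Overwrites flagged buffers with the poison pattern before they are reused. */
static void buffer_release(struct vram_buf *b)
{
    size_t i;

    if (!b->wipe_on_release)
        return;
    for (i = 0; i < b->nwords; i++)
        b->words[i] = WIPE_POISON;
}
```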

v2: Move amdgpu_amdkfd_unreserve_memory_limit into
amdgpu_bo_release_notify so that KFD can use memory that's still
being cleared in the background

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years ago drm/amdgpu: Add flag to wipe VRAM on release
Felix Kuehling [Tue, 9 Jul 2019 00:09:21 +0000 (20:09 -0400)]
drm/amdgpu: Add flag to wipe VRAM on release

This memory allocation flag will be used to indicate BOs containing
sensitive data that should not be leaked to other processes.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years ago drm/ttm: Add release_notify callback to ttm_bo_driver
Felix Kuehling [Tue, 9 Jul 2019 23:09:42 +0000 (19:09 -0400)]
drm/ttm: Add release_notify callback to ttm_bo_driver

This notifies the driver that a BO is about to be released.

Releasing a BO also invokes the move_notify callback from
ttm_bo_cleanup_memtype_use, but that happens too late for anything
that would add fences to the BO and require a delayed delete.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years ago drm/amd/display: Use switch table for dc_to_smu_clock_type
Leo Li [Thu, 25 Jul 2019 17:12:24 +0000 (13:12 -0400)]
drm/amd/display: Use switch table for dc_to_smu_clock_type

Using a static int array will cause errors if the given dm_pp_clk_type
is out-of-bounds. For robustness, use a switch table, with a default
case to handle all invalid values.

v2: 0 is a valid clock type for smu_clk_type. Return SMU_CLK_COUNT
    instead on invalid mapping.
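
A sketch of the switch-based mapping with a sentinel for invalid input. The
enum members and to_smu_clk are hypothetical; the point taken from the commit
is that 0 is a valid target value, so a separate COUNT value signals an
invalid mapping:

```c
/* Hypothetical source enum. */
enum dm_clk { DM_CLK_DISPLAY, DM_CLK_ENGINE, DM_CLK_MEMORY };

/* Hypothetical target enum; value 0 is valid, COUNT marks "no mapping". */
enum smu_clk { SMU_CLK_GFX, SMU_CLK_MEM, SMU_CLK_DCEF, SMU_CLK_COUNT };

/*
 * Unlike a lookup in a static array, the default case handles every
 * out-of-range input without reading past a table.
 */
static enum smu_clk to_smu_clk(int dm_type)
{
    switch (dm_type) {
    case DM_CLK_DISPLAY:
        return SMU_CLK_DCEF;
    case DM_CLK_ENGINE:
        return SMU_CLK_GFX;
    case DM_CLK_MEMORY:
        return SMU_CLK_MEM;
    default:
        return SMU_CLK_COUNT;
    }
}
```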

Signed-off-by: Leo Li <sunpeng.li@amd.com>
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years ago drm/amd/display: Use proper enum conversion functions
Nathan Chancellor [Thu, 4 Jul 2019 05:52:16 +0000 (22:52 -0700)]
drm/amd/display: Use proper enum conversion functions

clang warns:

drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm_pp_smu.c:336:8:
warning: implicit conversion from enumeration type 'enum smu_clk_type'
to different enumeration type 'enum amd_pp_clock_type'
[-Wenum-conversion]
                                        dc_to_smu_clock_type(clk_type),
                                        ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm_pp_smu.c:421:14:
warning: implicit conversion from enumeration type 'enum
amd_pp_clock_type' to different enumeration type 'enum smu_clk_type'
[-Wenum-conversion]
                                        dc_to_pp_clock_type(clk_type),
                                        ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

There are functions to properly convert between all of these types, use
them so there are no longer any warnings.

Fixes: a43913ea50a5 ("drm/amd/powerplay: add function get_clock_by_type_with_latency for navi10")
Fixes: e5e4e22391c2 ("drm/amd/powerplay: add interface to get clock by type with latency for display (v2)")
Link: https://github.com/ClangBuiltLinux/linux/issues/586
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Reviewed-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years ago drm/amdgpu: fix double ucode load by PSP(v3)
Monk Liu [Wed, 31 Jul 2019 08:47:56 +0000 (16:47 +0800)]
drm/amdgpu: fix double ucode load by PSP(v3)

previously the ucode loading of PSP was repeated: once in
phase_1 init/re-init/resume and once in the fw_loading routine

Avoid this double loading by clearing ip_blocks.status.hw in suspend or reset
prior to the FW loading and any block's hw_init/resume

v2:
still do the smu fw loading since it is needed by bare-metal

v3:
drop the change in reinit_early_sriov, just clear all block's status.hw
in the head place and set the status.hw after hw_init done is enough

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years ago drm/amdgpu: fix incorrect judge on sos fw version
Monk Liu [Tue, 30 Jul 2019 09:32:27 +0000 (17:32 +0800)]
drm/amdgpu: fix incorrect judge on sos fw version

for SRIOV the SOS fw of PSP is loaded in the hypervisor, so the
guest cannot tell its version, and judging features by
reading the sos fw version on the guest side is completely wrong

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years ago drm/amdgpu: cleanup vega10 SRIOV code path
Monk Liu [Tue, 30 Jul 2019 09:21:19 +0000 (17:21 +0800)]
drm/amdgpu: cleanup vega10 SRIOV code path

we can simplify all those unnecessary functions under
SRIOV for vega10 since:
1) PSP L1 policy is by force enabled in SRIOV
2) the original logic always sets all flags, which makes it
   a dummy step

besides,
1) the ih_doorbell_range set should also be skipped
for VEGA10 SRIOV.
2) the gfx_common registers should also be skipped
for VEGA10 SRIOV.

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years ago drm/amd/powerplay: sort feature status index by asic feature id for smu
Kevin Wang [Wed, 31 Jul 2019 07:37:07 +0000 (15:37 +0800)]
drm/amd/powerplay: sort feature status index by asic feature id for smu

before this change, the pp_features sysfs file showed the feature enable
state ordered by logical feature id, which is not easy to read.
this change sorts the pp_features output by asic feature id.

before:
features high: 0x00000623 low: 0xb3cdaffb
00. DPM_PREFETCHER       ( 0) : enabeld
01. DPM_GFXCLK           ( 1) : enabeld
02. DPM_UCLK             ( 3) : enabeld
03. DPM_SOCCLK           ( 4) : enabeld
04. DPM_MP0CLK           ( 5) : enabeld
05. DPM_LINK             ( 6) : enabeld
06. DPM_DCEFCLK          ( 7) : enabeld
07. DS_GFXCLK            (10) : enabeld
08. DS_SOCCLK            (11) : enabeld
09. DS_LCLK              (12) : disabled
10. PPT                  (23) : enabeld
11. TDC                  (24) : enabeld
12. THERMAL              (33) : enabeld
13. RM                   (35) : disabled
......

after:
features high: 0x00000623 low: 0xb3cdaffb
00. DPM_PREFETCHER       ( 0) : enabeld
01. DPM_GFXCLK           ( 1) : enabeld
02. DPM_GFX_PACE         ( 2) : disabled
03. DPM_UCLK             ( 3) : enabeld
04. DPM_SOCCLK           ( 4) : enabeld
05. DPM_MP0CLK           ( 5) : enabeld
06. DPM_LINK             ( 6) : enabeld
07. DPM_DCEFCLK          ( 7) : enabeld
08. MEM_VDDCI_SCALING    ( 8) : enabeld
09. MEM_MVDD_SCALING     ( 9) : enabeld
10. DS_GFXCLK            (10) : enabeld
11. DS_SOCCLK            (11) : enabeld
12. DS_LCLK              (12) : disabled
13. DS_DCEFCLK           (13) : enabeld
......

Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years ago drm/amdkfd: enable KFD support for navi14
Alex Deucher [Fri, 26 Jul 2019 19:15:12 +0000 (14:15 -0500)]
drm/amdkfd: enable KFD support for navi14

Same as navi10.

Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years ago drm/amdgpu: disable inject for failed subblocks of gfx
Dennis Li [Tue, 23 Jul 2019 10:23:44 +0000 (18:23 +0800)]
drm/amdgpu: disable inject for failed subblocks of gfx

some subblocks of gfx fail in inject test, disable them

Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years ago drm/amdgpu: support gfx ras error injection and err_cnt query
Dennis Li [Wed, 31 Jul 2019 12:45:50 +0000 (20:45 +0800)]
drm/amdgpu: support gfx ras error injection and err_cnt query

check gfx error count in both ras query function and
ras interrupt handler.

gfx ras is still disabled by default due to known stability
issue found in gpu reset.

Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years ago drm/amdgpu: add RAS callback for gfx
Dennis Li [Wed, 31 Jul 2019 12:42:15 +0000 (20:42 +0800)]
drm/amdgpu: add RAS callback for gfx

Add functions for RAS error inject and query error counter

Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years ago drm/amdgpu: add define for gfx ras subblock
Dennis Li [Fri, 19 Jul 2019 07:22:29 +0000 (15:22 +0800)]
drm/amdgpu: add define for gfx ras subblock

Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years ago drm/amd/include: add define of TCP_EDC_CNT_NEW
Dennis Li [Fri, 19 Jul 2019 06:50:25 +0000 (14:50 +0800)]
drm/amd/include: add define of TCP_EDC_CNT_NEW

Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years ago drm/amd/include: add bitfield define for EDC registers
Dennis Li [Fri, 19 Jul 2019 06:42:49 +0000 (14:42 +0800)]
drm/amd/include: add bitfield define for EDC registers

Add EDC registers to support VEGA20 RAS

Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years ago drm/amdgpu: remove ras_reserve_vram in ras injection
Tao Zhou [Wed, 24 Jul 2019 03:19:56 +0000 (11:19 +0800)]
drm/amdgpu: remove ras_reserve_vram in ras injection

error injection address is not in gpu address space

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Dennis Li <dennis.li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years ago drm/amdgpu: add check for ras error type
Tao Zhou [Tue, 23 Jul 2019 05:07:24 +0000 (13:07 +0800)]
drm/amdgpu: add check for ras error type

only ue and ce errors are supported

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Dennis Li <dennis.li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years ago drm/amdgpu: update interrupt callback for all ras clients
Tao Zhou [Mon, 22 Jul 2019 12:33:39 +0000 (20:33 +0800)]
drm/amdgpu: update interrupt callback for all ras clients

add err_data parameter in interrupt cb for ras clients

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Dennis Li <dennis.li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years ago drm/amdgpu: allow ras interrupt callback to return error data
Tao Zhou [Mon, 22 Jul 2019 12:27:25 +0000 (20:27 +0800)]
drm/amdgpu: allow ras interrupt callback to return error data

add error data as parameter for ras interrupt cb and process it

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Dennis Li <dennis.li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years ago drm/amdgpu: query umc ras error address
Tao Zhou [Wed, 24 Jul 2019 13:43:45 +0000 (21:43 +0800)]
drm/amdgpu: query umc ras error address

query umc ras error address, translate it to gpu 4k page view
and save it.
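
As a rough illustration of a "4k page view", an address can be reduced to its
page number by dropping the low 12 bits. This sketch covers only that last
step; the UMC-specific channel and address translation is not shown, and the
names below are hypothetical:

```c
#include <stdint.h>

/* 4 KiB pages: the low 12 bits are the offset inside the page. */
#define PAGE_SHIFT_4K 12u

/* Returns the 4k page number that contains the given address. */
static uint64_t addr_to_4k_page(uint64_t addr)
{
    return addr >> PAGE_SHIFT_4K;
}

/* Returns the byte offset of the address inside its 4k page. */
static uint64_t addr_offset_in_4k_page(uint64_t addr)
{
    return addr & ((1ull << PAGE_SHIFT_4K) - 1);
}
```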

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Dennis Li <dennis.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years ago drm/amdgpu: add structures for umc error address translation
Tao Zhou [Mon, 22 Jul 2019 10:30:59 +0000 (18:30 +0800)]
drm/amdgpu: add structures for umc error address translation

add related registers, callback function and channel index table

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years ago drm/amdgpu: add support for recording ras error address
Tao Zhou [Mon, 22 Jul 2019 11:20:29 +0000 (19:20 +0800)]
drm/amdgpu: add support for recording ras error address

more than one error address may be recorded in one query

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Dennis Li <dennis.li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years ago drm/amdgpu: update algorithm of umc uncorrectable error counting
Tao Zhou [Tue, 23 Jul 2019 04:25:16 +0000 (12:25 +0800)]
drm/amdgpu: update algorithm of umc uncorrectable error counting

remove the check of ErrorCodeExt

v2: refine the if condition for ue counting

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Dennis Li <dennis.li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years ago drm/amdgpu: switch to amdgpu_umc structure
Tao Zhou [Tue, 23 Jul 2019 04:18:39 +0000 (12:18 +0800)]
drm/amdgpu: switch to amdgpu_umc structure

create new amdgpu_umc structure to allow for more umc
settings in the future and switch to the new structure

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Dennis Li <dennis.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years ago drm/amdgpu: use 64bit operation macros for umc
Tao Zhou [Tue, 23 Jul 2019 03:57:15 +0000 (11:57 +0800)]
drm/amdgpu: use 64bit operation macros for umc

replace some 32bit macros with 64bit operations to simplify code
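
To show why a 64-bit accessor simplifies callers, the sketch below wraps two
32-bit reads in a single helper. It uses a fake in-memory register file, and
rreg32/rreg64 are hypothetical helpers, not the driver's RREG32/RREG64
macros. Composing the value from two halves is an assumption made for the
example:

```c
#include <stdint.h>

/* Fake register file standing in for MMIO space (illustration only). */
static uint32_t fake_regs[4];

/* Hypothetical 32-bit read helper. */
static uint32_t rreg32(unsigned int idx)
{
    return fake_regs[idx];
}

/*
 * Hypothetical 64-bit read helper: callers get the whole value in one call
 * instead of reading the low and high halves and combining them by hand.
 */
static uint64_t rreg64(unsigned int idx)
{
    uint64_t lo = rreg32(idx);
    uint64_t hi = rreg32(idx + 1);

    return lo | (hi << 32);
}
```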

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Dennis Li <dennis.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years ago drm/amdgpu: add RREG64/WREG64(_PCIE) operations
Tao Zhou [Wed, 24 Jul 2019 07:13:27 +0000 (15:13 +0800)]
drm/amdgpu: add RREG64/WREG64(_PCIE) operations

add 64 bits register access functions

v2: implement 64 bit functions in low level

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Dennis Li <dennis.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years ago drm/amdgpu: add ras error count after each query (v2)
Tao Zhou [Wed, 31 Jul 2019 12:28:13 +0000 (20:28 +0800)]
drm/amdgpu: add ras error count after each query (v2)

v1: increase ras ce/ue error count
v2: log the number of correctable and uncorrectable errors

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Dennis Li <dennis.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years ago drm/amdgpu: querry umc error count
Hawking Zhang [Wed, 17 Jul 2019 13:49:53 +0000 (21:49 +0800)]
drm/amdgpu: querry umc error count

check umc error count in both ras query function and
ras interrupt handler

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Dennis Li <dennis.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years ago drm/amdgpu: init umc v6_1 functions for vega20
Hawking Zhang [Wed, 17 Jul 2019 13:47:44 +0000 (21:47 +0800)]
drm/amdgpu: init umc v6_1 functions for vega20

init umc callback function for vega20 in sw early init phase

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Dennis Li <dennis.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years ago drm/amdgpu: add umc v6_1 query error count support
Hawking Zhang [Wed, 31 Jul 2019 12:23:01 +0000 (20:23 +0800)]
drm/amdgpu: add umc v6_1 query error count support

Implement umc query_ras_error_count function to support querying
both correctable and uncorrectable errors

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Dennis Li <dennis.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years ago drm/amdgpu: add umc v6_1_1 IP headers
Hawking Zhang [Wed, 24 Jul 2019 06:36:49 +0000 (14:36 +0800)]
drm/amdgpu: add umc v6_1_1 IP headers

the change introduces IP headers for unified memory controller (umc)

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Dennis Li <dennis.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years ago drm/amdgpu: add rsmu v_0_0_2 ip headers
Hawking Zhang [Wed, 24 Jul 2019 06:13:53 +0000 (14:13 +0800)]
drm/amdgpu: add rsmu v_0_0_2 ip headers

remote smu (rsmu) is a sub-block used for the ip register interface,
error handling, reset generation, etc.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Dennis Li <dennis.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years ago drm/amdgpu: add amdgpu_umc_functions structure
Hawking Zhang [Tue, 23 Jul 2019 11:42:03 +0000 (19:42 +0800)]
drm/amdgpu: add amdgpu_umc_functions structure

This is a common structure for UMC callback functions

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Dennis Li <dennis.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years ago drm/amdgpu: init RSMU and UMC ip base address for vega20
Hawking Zhang [Wed, 17 Jul 2019 09:52:28 +0000 (17:52 +0800)]
drm/amdgpu: init RSMU and UMC ip base address for vega20

the driver needs to program RSMU and UMC registers to
support vega20 RAS feature

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Dennis Li <dennis.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: move some ras data structure to amdgpu_ras.h
Hawking Zhang [Wed, 17 Jul 2019 09:34:46 +0000 (17:34 +0800)]
drm/amdgpu: move some ras data structure to amdgpu_ras.h

These are common structures that can be included by IP specific
source files

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Dennis Li <dennis.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: drop drmP.h from vcn_v2_5.c
Alex Deucher [Wed, 31 Jul 2019 15:47:26 +0000 (10:47 -0500)]
drm/amdgpu: drop drmP.h from vcn_v2_5.c

Unused.

Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: drop drmP.h from vcn_v2_0.c
Alex Deucher [Wed, 31 Jul 2019 15:45:52 +0000 (10:45 -0500)]
drm/amdgpu: drop drmP.h from vcn_v2_0.c

And fix the fallout.

Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: drop drmP.h from sdma_v5_0.c
Alex Deucher [Wed, 31 Jul 2019 15:43:40 +0000 (10:43 -0500)]
drm/amdgpu: drop drmP.h from sdma_v5_0.c

And fix the fallout.

Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: drop drmP.h from nv.c
Alex Deucher [Wed, 31 Jul 2019 15:39:40 +0000 (10:39 -0500)]
drm/amdgpu: drop drmP.h from nv.c

And fix up the fallout.

Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: drop drmP.h from navi10_ih.c
Alex Deucher [Wed, 31 Jul 2019 15:34:39 +0000 (10:34 -0500)]
drm/amdgpu: drop drmP.h from navi10_ih.c

And fix the fallout.

Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: drop drmP.h in gfx_v10_0.c
Alex Deucher [Wed, 31 Jul 2019 15:31:44 +0000 (10:31 -0500)]
drm/amdgpu: drop drmP.h in gfx_v10_0.c

And fix the fallout.

Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: drop drmP.h from amdgpu_amdkfd_gfx_v10.c
Alex Deucher [Wed, 31 Jul 2019 15:27:57 +0000 (10:27 -0500)]
drm/amdgpu: drop drmP.h from amdgpu_amdkfd_gfx_v10.c

Unused.

Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: drop drmP.h in amdgpu_amdkfd_arcturus.c
Alex Deucher [Wed, 31 Jul 2019 15:26:39 +0000 (10:26 -0500)]
drm/amdgpu: drop drmP.h in amdgpu_amdkfd_arcturus.c

Unused.

Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/powerplay: determine the features to enable by pptable only
Evan Quan [Thu, 25 Jul 2019 08:40:51 +0000 (16:40 +0800)]
drm/amd/powerplay: determine the features to enable by pptable only

With the current logic, the features to enable are determined jointly
by the driver and the pptable. This is inefficient when co-debugging
with the firmware team.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: correct irq type used for sdma ecc
Hawking Zhang [Thu, 25 Jul 2019 09:22:01 +0000 (17:22 +0800)]
drm/amdgpu: correct irq type used for sdma ecc

we should pass irq type, instead of irq client id,
to irq_get/put interface

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/powerplay: make power limit retrieval as asic specific
Evan Quan [Wed, 31 Jul 2019 03:52:37 +0000 (22:52 -0500)]
drm/amd/powerplay: make power limit retrieval as asic specific

Power limit retrieval should be done per asic, since we may need
to look it up in the pptable and that is asic specific.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/powerplay: correct arcturus current clock level calculation
Evan Quan [Tue, 23 Jul 2019 12:28:14 +0000 (20:28 +0800)]
drm/amd/powerplay: correct arcturus current clock level calculation

There may be a 1 MHz delta between the target and actual frequency. That
should be taken into consideration for the current level check.
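
The tolerance described above can be sketched as follows. This is a
minimal illustration, not the arcturus code: the helper names, the level
table and the CLK_TOLERANCE_MHZ constant are hypothetical, and only the
1 MHz figure comes from the message.

```c
#include <stdint.h>

/* Assumed tolerance, taken from the 1 MHz delta mentioned in the message. */
#define CLK_TOLERANCE_MHZ 1u

/* Returns 1 when the measured clock is within tolerance of a level's target. */
static int clk_matches_level(uint32_t actual_mhz, uint32_t target_mhz)
{
    uint32_t delta = actual_mhz > target_mhz ? actual_mhz - target_mhz
                                             : target_mhz - actual_mhz;
    return delta <= CLK_TOLERANCE_MHZ;
}

/* Returns the index of the first matching level, or -1 if none matches. */
static int find_current_level(const uint32_t *levels_mhz, int count,
                              uint32_t actual_mhz)
{
    for (int i = 0; i < count; i++) {
        if (clk_matches_level(actual_mhz, levels_mhz[i]))
            return i;
    }
    return -1;
}
```

With levels of 500, 1000 and 1500 MHz, a reading of 999 or 1001 MHz maps
to the 1000 MHz level, while an exact-equality check would report no
current level at all.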

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/powerplay: support UMD PSTATE settings on arcturus
Evan Quan [Tue, 23 Jul 2019 09:30:35 +0000 (17:30 +0800)]
drm/amd/powerplay: support UMD PSTATE settings on arcturus

Enable arcturus UMD PSTATE support.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/powerplay: fix arcturus real-time clock frequency retrieval
Evan Quan [Tue, 23 Jul 2019 03:42:24 +0000 (11:42 +0800)]
drm/amd/powerplay: fix arcturus real-time clock frequency retrieval

Make sure we can still get the accurate gfxclk/uclk/socclk frequency
even when dpm is disabled.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/powerplay: remove redundancy debug log in smu
Kevin Wang [Fri, 19 Jul 2019 08:06:29 +0000 (16:06 +0800)]
drm/amd/powerplay: remove redundancy debug log in smu

remove redundant debug logs in smu.
e.g.:
[ 6897.969447] amdgpu: [powerplay] smu 11 clk dpm feature 1 is not enabled
[ 6897.969448] amdgpu: [powerplay] smu 11 clk dpm feature 1 is not enabled
[ 6897.969448] amdgpu: [powerplay] smu 11 clk dpm feature 1 is not enabled
[ 6899.024114] amdgpu: [powerplay] Unsupported SMU message: 38
[ 6899.024151] amdgpu: [powerplay] smu 11 clk dpm feature 1 is not enabled
[ 6899.024151] amdgpu: [powerplay] smu 11 clk dpm feature 1 is not enabled
[ 6899.024152] amdgpu: [powerplay] smu 11 clk dpm feature 1 is not enabled
[ 6900.078296] amdgpu: [powerplay] Unsupported SMU message: 38
[ 6900.078332] amdgpu: [powerplay] smu 11 clk dpm feature 1 is not enabled
[ 6900.078332] amdgpu: [powerplay] smu 11 clk dpm feature 1 is not enabled
[ 6900.078333] amdgpu: [powerplay] smu 11 clk dpm feature 1 is not enabled
[ 6901.133230] amdgpu: [powerplay] Unsupported SMU message: 38

Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/powerplay: correct the bitmask used in arcturus
Evan Quan [Mon, 22 Jul 2019 09:03:02 +0000 (17:03 +0800)]
drm/amd/powerplay: correct the bitmask used in arcturus

The bitmasks prefixed with "SMU_" should be used.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/powerplay: add missing arcturus feature maps
Evan Quan [Mon, 22 Jul 2019 08:26:04 +0000 (16:26 +0800)]
drm/amd/powerplay: add missing arcturus feature maps

Add missing feature maps for arcturus.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/powerplay: support fan speed retrieval on arcturus
Evan Quan [Mon, 22 Jul 2019 04:09:38 +0000 (12:09 +0800)]
drm/amd/powerplay: support fan speed retrieval on arcturus

Support arcturus fan speed retrieval.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/powerplay: support real-time clock retrieval on arcturus
Evan Quan [Fri, 19 Jul 2019 09:18:34 +0000 (17:18 +0800)]
drm/amd/powerplay: support real-time clock retrieval on arcturus

Enable arcturus real-time clock retrieval.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/powerplay: support sensor reading on arcturus
Evan Quan [Mon, 29 Jul 2019 18:18:37 +0000 (13:18 -0500)]
drm/amd/powerplay: support sensor reading on arcturus

Support sensor reading for gpu load, power and
temperatures.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/powerplay: init arcturus SMU metrics table on bootup
Evan Quan [Mon, 22 Jul 2019 07:55:52 +0000 (15:55 +0800)]
drm/amd/powerplay: init arcturus SMU metrics table on bootup

Initialize arcturus SMU metrics table.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/powerplay: correct UVD/VCE/VCN power status retrieval
Evan Quan [Mon, 22 Jul 2019 02:42:29 +0000 (10:42 +0800)]
drm/amd/powerplay: correct UVD/VCE/VCN power status retrieval

VCN should be used for ASICs later than Vega20, while UVD and VCE
are for earlier ASICs.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/powerplay: correct Navi10 VCN powergate control (v2)
Evan Quan [Mon, 22 Jul 2019 02:27:21 +0000 (10:27 +0800)]
drm/amd/powerplay: correct Navi10 VCN powergate control (v2)

No VCN DPM bit check as that's different from VCN PG. Also
no extra check for possible double enablement/disablement
as that's already done by VCN.

v2: check return value of smu_feature_set_enabled

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/powerplay: support VCN powergate status retrieval for SW SMU
Evan Quan [Mon, 22 Jul 2019 01:57:27 +0000 (09:57 +0800)]
drm/amd/powerplay: support VCN powergate status retrieval for SW SMU

Commonly used for VCN powergate status retrieval for SW SMU.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/powerplay: support VCN powergate status retrieval on Raven
Evan Quan [Mon, 22 Jul 2019 01:55:36 +0000 (09:55 +0800)]
drm/amd/powerplay: support VCN powergate status retrieval on Raven

Enable VCN powergate status report on Raven.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/powerplay: add new sensor type for VCN powergate status
Evan Quan [Mon, 22 Jul 2019 01:51:59 +0000 (09:51 +0800)]
drm/amd/powerplay: add new sensor type for VCN powergate status

VCN is widely used in new ASICs and is different from the traditional
UVD and VCE.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: update more sdma instances irq support
Le Ma [Tue, 16 Jul 2019 07:21:54 +0000 (15:21 +0800)]
drm/amdgpu: update more sdma instances irq support

Update for sdma ras ecc_irq and other minor changes.

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/include: adjust base offset of SMUIO and THM for Arcturus
Le Ma [Mon, 15 Jul 2019 10:00:50 +0000 (18:00 +0800)]
drm/amd/include: adjust base offset of SMUIO and THM for Arcturus

Arcturus has different _BASE_IDX value in some HWIP_offset.h. To make source
files like smu_v11_0.c and soc15.c that include HWIP_offset.h of Vega20
reusable for Arcturus, align this base offset with Vega20.

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/powerplay: hold on the arcturus gfx dpm support in driver
Evan Quan [Wed, 17 Jul 2019 01:34:13 +0000 (09:34 +0800)]
drm/amd/powerplay: hold on the arcturus gfx dpm support in driver

For now, only "Prefetcher" is guaranteed to work by the
SMU firmware.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: correct VCN powergate routine for acturus
Evan Quan [Tue, 16 Jul 2019 03:03:10 +0000 (11:03 +0800)]
drm/amdgpu: correct VCN powergate routine for acturus

Arcturus VCN should powergate in the same way as Navi.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/powerplay: enable arcturus powerplay
Evan Quan [Fri, 12 Jul 2019 08:53:28 +0000 (16:53 +0800)]
drm/amd/powerplay: enable arcturus powerplay

Arcturus powerplay is ready to use.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/powerplay: initialize arcturus MP1 and THM base address
Evan Quan [Fri, 12 Jul 2019 08:50:52 +0000 (16:50 +0800)]
drm/amd/powerplay: initialize arcturus MP1 and THM base address

Initialize base address for those IPs which are used in powerplay.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/powerplay: enable SW SMU routine support for arcturus
Evan Quan [Wed, 31 Jul 2019 04:30:07 +0000 (23:30 -0500)]
drm/amd/powerplay: enable SW SMU routine support for arcturus

Enable arcturus SW SMU routines.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/powerplay: update arcturus_ppt.c/h V3
Evan Quan [Mon, 29 Jul 2019 17:43:28 +0000 (12:43 -0500)]
drm/amd/powerplay: update arcturus_ppt.c/h V3

Arcturus ASIC specific powerplay interfaces.

V2: correct SMU msg naming
    drop unnecessary debugs

V3: rebase (Alex)

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/powerplay: update arcturus_ppsmc.h
Evan Quan [Fri, 12 Jul 2019 08:28:02 +0000 (16:28 +0800)]
drm/amd/powerplay: update arcturus_ppsmc.h

Correct header and fix typo.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/powerplay: update smu11_driver_if_arcturus.h
Evan Quan [Fri, 12 Jul 2019 08:24:34 +0000 (16:24 +0800)]
drm/amd/powerplay: update smu11_driver_if_arcturus.h

It defines how the driver should interface with the SMU on arcturus.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/powerplay: add SW SMU interface for dumping pptable out (v2)
Evan Quan [Wed, 31 Jul 2019 03:50:14 +0000 (22:50 -0500)]
drm/amd/powerplay: add SW SMU interface for dumping pptable out (v2)

This is especially useful in early bring up phase.

v2: disabled by default (Alex)

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/powerplay: add smcdpminfo table v4_6 support
Evan Quan [Wed, 10 Jul 2019 01:29:57 +0000 (09:29 +0800)]
drm/amd/powerplay: add smcdpminfo table v4_6 support

New smcdpminfo table used in arcturus.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdkfd: Save/restore vcc on gfx10
Jay Cornwall [Sun, 28 Jul 2019 21:00:59 +0000 (16:00 -0500)]
drm/amdkfd: Save/restore vcc on gfx10

VCC moved out of user SGPR allocation in gfx10. It's now stored
in SGPRs 106-107.

Also fixes incorrect SGPR read offsets.

Cc: Shaoyun Liu <shaoyun.liu@amd.com>
Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: shaoyunl <shaoyun.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdkfd: Save/restore flat_scratch_lo/hi on gfx10
Jay Cornwall [Sun, 28 Jul 2019 20:25:05 +0000 (15:25 -0500)]
drm/amdkfd: Save/restore flat_scratch_lo/hi on gfx10

These moved from SGPRs in gfx9 to HWREG in gfx10.

Cc: Shaoyun Liu <shaoyun.liu@amd.com>
Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: shaoyunl <shaoyun.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdkfd: Fix gfx10 wave64 VGPR context restore
Jay Cornwall [Sun, 28 Jul 2019 20:24:40 +0000 (15:24 -0500)]
drm/amdkfd: Fix gfx10 wave64 VGPR context restore

Copy/paste error, first 4 VGPRs are separated by 64 dwords (256 bytes).
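
The stride quoted above follows from the wave size: 64 lanes of one dword
each. A minimal sketch of that arithmetic, assuming each VGPR is saved as
one contiguous row of lanes; the helper name and the layout are
hypothetical and are not taken from the trap handler itself.

```c
#include <stdint.h>

#define DWORD_BYTES 4u

/*
 * Byte offset of a VGPR within a save area, assuming one dword per lane
 * and VGPRs stored back to back (an assumption made for illustration).
 */
static uint32_t vgpr_save_offset(uint32_t vgpr_index, uint32_t wave_size)
{
    return vgpr_index * wave_size * DWORD_BYTES;
}
```

Under that assumption consecutive wave64 VGPRs sit 256 bytes apart, and
the same formula gives 128 bytes for wave32.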

Cc: Shaoyun Liu <shaoyun.liu@amd.com>
Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: shaoyunl <shaoyun.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/display: Support uclk switching for DCN2
Nicholas Kazlauskas [Tue, 30 Jul 2019 13:45:33 +0000 (09:45 -0400)]
drm/amd/display: Support uclk switching for DCN2

[Why]
We were previously forcing the uclk for every state to max and reducing
the switch time to prevent uclk switching from occurring. This workaround
was previously needed in order to avoid hangs + underflow under certain
display configurations.

Now that DC has the proper fix complete we can drop the hacks and
improve power for most display configurations.

[How]
We still need the function pointers hooked up to grab the real uclk
states from pplib. The rest of the prior hack can be reverted.

The key requirements here are really just DC support, updated firmware,
and support for disabling p-state support when needed in pplib/smu.

When these requirements are met uclk switching works without underflow
or hangs.

Fixes: 02316e963a5a ("drm/amd/display: Force uclk to max for every state")
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/display: Embed DCN2 SOC bounding box
Nicholas Kazlauskas [Tue, 30 Jul 2019 13:08:34 +0000 (09:08 -0400)]
drm/amd/display: Embed DCN2 SOC bounding box

[Why]
In order to support uclk switching on NV10 the SOC bounding box
needs to be updated.

[How]
We currently read the constants from the gpu info FW, but supporting
workarounds in DC for different versions of the FW adds additional
complexity to the codebase.

NV10 has been released so it's cleanest to keep the bounding box and
source code in sync by embedding the bounding box like we do for
other ASICs.

Fixes: 02316e963a5a ("drm/amd/display: Force uclk to max for every state")
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: fix a potential information leaking bug
Wang Xiayang [Sat, 27 Jul 2019 09:30:30 +0000 (17:30 +0800)]
drm/amdgpu: fix a potential information leaking bug

Coccinelle reports a path on which the array "data" is never initialized.
The path skips the checks in the conditional branches when either
of the callback functions, read_wave_vgprs and read_wave_sgprs, is not
registered. Later, the uninitialized "data" array is read
in the while-loop below and passed to put_user().

Fix the path by allocating the array with kcalloc().

The patch is simpler than adding a fall-back branch that explicitly
calls memset(data, 0, ...). Also it does not need the multiplication
1024*sizeof(*data) as the size parameter for memset() though there is
no risk of integer overflow.
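
The idea can be illustrated in userspace, with calloc() standing in for
kcalloc(). This is only a sketch: collect_wave_data(), read_regs_fn and
WAVE_DATA_WORDS are hypothetical names, and the callback models the
optional read_wave_vgprs/read_wave_sgprs hooks.

```c
#include <stdint.h>
#include <stdlib.h>

#define WAVE_DATA_WORDS 1024

/* Optional reader callback; may be NULL when nothing is registered. */
typedef void (*read_regs_fn)(uint32_t *dst, size_t count);

/*
 * Returns a buffer the caller must free(), or NULL on allocation failure.
 * calloc() zeroes the memory, so skipping the callback can never expose
 * stale heap contents to whoever copies the buffer out afterwards.
 */
static uint32_t *collect_wave_data(read_regs_fn read_regs)
{
    uint32_t *data = calloc(WAVE_DATA_WORDS, sizeof(*data));

    if (!data)
        return NULL;
    if (read_regs)
        read_regs(data, WAVE_DATA_WORDS);
    return data;
}
```

Passing the element count and element size separately also leaves the
overflow check to the allocator instead of an open-coded multiplication.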

Signed-off-by: Wang Xiayang <xywang.sjtu@sjtu.edu.cn>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: fix error handling in amdgpu_cs_process_fence_dep
Christian König [Tue, 30 Jul 2019 09:17:03 +0000 (11:17 +0200)]
drm/amdgpu: fix error handling in amdgpu_cs_process_fence_dep

We always need to drop the ctx reference and should check
for errors first and then dereference the fence pointer.
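
The ordering described above can be sketched in plain C. All names here
(struct ctx, struct fence, lookup_fence, process_fence_dep) are
hypothetical stand-ins, and a counter models the context reference; this
is not the amdgpu code.

```c
#include <errno.h>
#include <stddef.h>

struct ctx { int refcount; };
struct fence { int seqno; };

/* Stand-in lookup: fails with -EINVAL for negative handles. */
static int lookup_fence(struct ctx *c, int handle, struct fence **out)
{
    static struct fence f = { .seqno = 42 };

    (void)c;
    if (handle < 0) {
        *out = NULL;
        return -EINVAL;
    }
    *out = &f;
    return 0;
}

static int process_fence_dep(struct ctx *c, int handle, int *seqno_out)
{
    struct fence *fence;
    int r;

    c->refcount++;                        /* take the context reference */
    r = lookup_fence(c, handle, &fence);
    c->refcount--;                        /* always drop it, on every path */

    if (r)                                /* check for the error first ... */
        return r;
    *seqno_out = fence->seqno;            /* ... and only then dereference */
    return 0;
}
```

Dropping the reference before the error check keeps the count balanced on
both the success and failure paths.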

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>