drm/i915/execlists: Reset CSB pointers on canceling requests (wedging)
author Chris Wilson <chris@chris-wilson.co.uk>
Fri, 14 Sep 2018 08:00:17 +0000 (09:00 +0100)
committer Chris Wilson <chris@chris-wilson.co.uk>
Fri, 14 Sep 2018 14:21:58 +0000 (15:21 +0100)
We previously assumed that we did not need to reset the CSB when
cancelling the outstanding requests on wedging, as it would be cleaned
up by the subsequent reset before restarting the GPU. However, that
overlooked that in preparing for the reset we try to process the
outstanding CSB entries. If the GPU happened to complete a CS event just
as we were cancelling the requests, that event would be kept in the CSB
until the reset -- but our bookkeeping had already been cleared, causing
confusion when trying to complete the CS event.
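
The race described above can be illustrated with a small standalone
model. This is only a sketch and not the driver code: struct toy_engine,
the toy_* helpers and TOY_CSB_ENTRIES are hypothetical names invented
for this example, and the CSB is reduced to a pair of ring indices. It
shows a completion event arriving just before the bookkeeping is
cleared, and how discarding the unread entries avoids processing an
event that no longer has a matching request.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_CSB_ENTRIES 6 /* hypothetical ring size for the model */

/* Hypothetical stand-in for an engine; not the i915 structure. */
struct toy_engine {
	unsigned int csb_write; /* advanced by the "hardware" */
	unsigned int csb_read;  /* advanced by the driver */
	unsigned int inflight;  /* driver bookkeeping of submitted requests */
};

/* Driver submits one request to the hardware. */
static void toy_submit(struct toy_engine *e)
{
	assert(e->inflight < TOY_CSB_ENTRIES);
	e->inflight++;
}

/* Hardware reports one context-switch event in the CSB. */
static void toy_hw_event(struct toy_engine *e)
{
	e->csb_write = (e->csb_write + 1) % TOY_CSB_ENTRIES;
}

/* Cancel everything; optionally discard unread CSB entries as well. */
static void toy_cancel_requests(struct toy_engine *e, bool reset_csb)
{
	e->inflight = 0;
	if (reset_csb)
		e->csb_read = e->csb_write;
}

/* Drain the CSB; return how many events had no matching request. */
static unsigned int toy_process_csb(struct toy_engine *e)
{
	unsigned int orphans = 0;

	while (e->csb_read != e->csb_write) {
		e->csb_read = (e->csb_read + 1) % TOY_CSB_ENTRIES;
		if (e->inflight)
			e->inflight--;
		else
			orphans++; /* event outlived its bookkeeping */
	}
	return orphans;
}

/* Replay the race: the event lands just before the cancellation. */
static unsigned int toy_race(bool reset_csb)
{
	struct toy_engine e = { 0 };

	toy_submit(&e);
	toy_hw_event(&e);
	toy_cancel_requests(&e, reset_csb);
	return toy_process_csb(&e);
}
```

In this model, toy_race(false) leaves one orphaned event to confuse the
later processing, while toy_race(true) leaves none. The model discards
the stale entries at cancellation time purely for brevity; the patch
itself performs the cleanup on unwedge (see the v2 note).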

v2: Use a sanitize on unwedge to avoid interfering with eio suspend
(where we intentionally disable GPU reset).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107925
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180914080017.30308-3-chris@chris-wilson.co.uk
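diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index e3c2492438b8762f792a002af5b5c9839f7a7a71..a94d5a308c4d6083e6d844761a862489d9ecda68 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c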
@@ -3438,6 +3438,9 @@ bool i915_gem_unset_wedged(struct drm_i915_private *i915)
        i915_retire_requests(i915);
        GEM_BUG_ON(i915->gt.active_requests);
 
+       if (!intel_gpu_reset(i915, ALL_ENGINES))
+               intel_engines_sanitize(i915);
+
        /*
         * Undo nop_submit_request. We prevent all new i915 requests from
         * being queued (by disallowing execbuf whilst wedged) so having