IPoIB: Fix crash due to skb double destruct
authorShlomo Pongratz <shlomop@mellanox.com>
Mon, 4 Feb 2013 15:29:10 +0000 (15:29 +0000)
committerRoland Dreier <roland@purestorage.com>
Tue, 5 Feb 2013 17:35:06 +0000 (09:35 -0800)
After commit b13912bbb4a2 ("IPoIB: Call skb_dst_drop() once skb is
enqueued for sending"), using connected mode and running multithreaded
iperf for long time, ie

    iperf -c <IP> -P 16 -t 3600

results in a crash.

After the above-mentioned patch, the driver is calling skb_orphan() and
skb_dst_drop() after calling post_send() in ipoib_cm.c::ipoib_cm_send()
(also in ipoib_ib.c::ipoib_send())

The problem with this is, as is written in a comment in both routines,
"it's entirely possible that the completion handler will run before we
execute anything after the post_send()."  This leads to running the
skb cleanup routines simultaneously in two different contexts.

The solution is to always perform the skb_orphan() and skb_dst_drop()
before queueing the send work request.  If an error occurs, then it
will be no different than the regular case where dev_free_skb_any() in
the completion path, which is assumed to be after these two routines.

Signed-off-by: Shlomo Pongratz <shlomop@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
drivers/infiniband/ulp/ipoib/ipoib_cm.c
drivers/infiniband/ulp/ipoib/ipoib_ib.c

index 03103d2bd641e715ce8ab18c147173bdc5624f21..67b0c1d23678d26565981cf9815993027c0a1b5e 100644 (file)
@@ -741,6 +741,9 @@ void ipoib_cm_send(struct net_device *dev, struct sk_buff *skb, struct ipoib_cm_
 
        tx_req->mapping = addr;
 
+       skb_orphan(skb);
+       skb_dst_drop(skb);
+
        rc = post_send(priv, tx, tx->tx_head & (ipoib_sendq_size - 1),
                       addr, skb->len);
        if (unlikely(rc)) {
@@ -752,9 +755,6 @@ void ipoib_cm_send(struct net_device *dev, struct sk_buff *skb, struct ipoib_cm_
                dev->trans_start = jiffies;
                ++tx->tx_head;
 
-               skb_orphan(skb);
-               skb_dst_drop(skb);
-
                if (++priv->tx_outstanding == ipoib_sendq_size) {
                        ipoib_dbg(priv, "TX ring 0x%x full, stopping kernel net queue\n",
                                  tx->qp->qp_num);
index a1bca70e20aa5b0ea33d4d5410e35b79531f77b6..2cfa76f5d99eac87bf788eb41bb2a7c3815ec700 100644 (file)
@@ -600,6 +600,9 @@ void ipoib_send(struct net_device *dev, struct sk_buff *skb,
                netif_stop_queue(dev);
        }
 
+       skb_orphan(skb);
+       skb_dst_drop(skb);
+
        rc = post_send(priv, priv->tx_head & (ipoib_sendq_size - 1),
                       address->ah, qpn, tx_req, phead, hlen);
        if (unlikely(rc)) {
@@ -615,9 +618,6 @@ void ipoib_send(struct net_device *dev, struct sk_buff *skb,
 
                address->last_send = priv->tx_head;
                ++priv->tx_head;
-
-               skb_orphan(skb);
-               skb_dst_drop(skb);
        }
 
        if (unlikely(priv->tx_outstanding > MAX_SEND_CQE))