author Pranith Kumar K 2011-09-20 18:30:42 +0530
committer Vijay Bellur 2011-09-21 04:25:15 -0700
commit 03591027b06c556baa95c6fa4569be0bff4adcd8
tree acb3ff42e7df960a3d294916487e27e6757e0258
parent 82d1a445b92526629d699f947a2d2bd029c8db75
cluster/afr: Make local->child_up immutable
An AFR transaction performs the lock, pre-op, op, post-op and unlock steps, in that order. child_up[] is overloaded to also record the children on which the first two steps succeeded. This works fine for transactions, but the locking/unlocking code is re-used by data self-heal, where each loop_frame performs the lock, rchecksum, read-from-source, write-to-sinks and unlock steps.

The rchecksum fop assumes that it has to run on one source plus all sinks, and sets call_count to that number. But if the lock step fails on any of the sinks, the child_up entry of that child is set to 0. This results in a call_count mismatch: the frame hangs, waiting for callbacks that never arrive. When this happens the loop_frame never reaches the unlock step, so operations on that file hang.

Change-Id: I3dd0449cc6193a980bacf637d935881f4b22210a
BUG: 3597
Reviewed-on:
Tested-by: Gluster Build System
Reviewed-by: Amar Tumballi
Reviewed-by: Vijay Bellur
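The call_count mismatch described in this message can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions, not the actual AFR code: the helper names (expected_callbacks, wound_calls, would_hang) and the separate locked_on[] array are hypothetical, and the model assumes that the fop is only sent to children whose child_up entry is non-zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_CHILDREN 16 /* arbitrary bound for this sketch */

/* call_count as the message describes it: one source plus all sinks.
 * is_target[i] is non-zero for the source and for every sink. */
static int expected_callbacks(const unsigned char *is_target, size_t n)
{
    int count = 0;
    assert(n <= MAX_CHILDREN);
    for (size_t i = 0; i < n; i++)
        if (is_target[i])
            count++;
    return count;
}

/* Assumption: the fop is sent only to targets whose child_up entry is
 * non-zero, so this is also the number of callbacks that will arrive. */
static int wound_calls(const unsigned char *is_target,
                       const unsigned char *child_up, size_t n)
{
    int count = 0;
    assert(n <= MAX_CHILDREN);
    for (size_t i = 0; i < n; i++)
        if (is_target[i] && child_up[i])
            count++;
    return count;
}

/* Overloaded behaviour: a lock failure clears the child_up entry. */
static void lock_failed_overloaded(unsigned char *child_up, size_t child)
{
    child_up[child] = 0;
}

/* Hypothetical alternative: record the lock failure in a separate array
 * and leave child_up untouched. */
static void lock_failed_separate(unsigned char *locked_on, size_t child)
{
    locked_on[child] = 0;
}

/* The frame waits forever when fewer callbacks arrive than call_count. */
static bool would_hang(int call_count, int callbacks_arriving)
{
    return callbacks_arriving < call_count;
}
```

With three targets and a lock failure on one sink, the overloaded variant yields a call_count of 3 but only 2 callbacks, so would_hang() is true. If child_up is left unmodified, both numbers stay at 3 and the frame can proceed to the unlock step.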
