path: root/xlators/debug
diff options
authorAnand Avati <>2012-08-30 13:15:39 -0700
committerAnand Avati <>2012-09-12 14:29:51 -0700
commit4f87fd0ae2ce629576ca5f647a99888d31a46815 (patch)
treeb548adb73477f5e23905adc18fefd90fa9c9e3e8 /xlators/debug
parentc3d7286a67ce0ac4db9cb8fa079a48f423245000 (diff)
dht: improve dht_fix_layout_of_directory for better re-assignment
Jeff Darcy wrote: > AFAICT, the fix-layout code doesn't do the same rotation that the > new-directory code does. Therefore, the new bricks always claim > completely predictable hash ranges for every directory, leading to > either a 0-1-2-3 pattern or a 1-0-2-3 pattern. In other words, a > file whose hash falls into the second quarter of the range will always > be assigned to brick 2, and a file whose hash falls into the fourth > quarter will always be assigned to brick 3. The rest will be split > according to the original pattern. Put still another way, instead of > same-named files in different directories being spread across N bricks, > they might be spread across only two bricks (bad) or totally > concentrated on one brick (worse) regardless of N. The current dht_fix_layout_of_directory() code, in an attempt to maximize overlap of new layout with existing layout (to minimize movement of data) fails to do a good job of randomizing new assignment even when it could do a better job. In an example where we expand from 2 nodes to 4 nodes, the current possibilities are limited in the following way - (theoretical hash range: 00 - 99) OLD 1 ----- server1: 00 - 49 server2: 50 - 99 NEW 1 ----- server1: 00 - 24 server2: 50 - 74 server3: 25 - 49 server4: 75 - 99 OLD 2 ----- server1: 50 - 99 server2: 00 - 49 NEW 2 ------ server1: 50 - 74 server2: 00 - 24 server3: 25 - 49 server4: 75 - 99 The above shows that when add-brick from 2 bricks to 4 bricks, server3 and server4 always get the _same_ hash range no matter what the original hash range assignment was. The fix in this patch is first do the standard new directory assignment to a directory (with rotation etc.) and then do the reassignment to maximize overlap. This way newly added servers still get random ranges and existing servers have a probability of getting either of the quarters which were part of its half previously. The same principles hold for all add-brick from M to M+N. Change-Id: I0cbbf3bfa334645728072d66aaaa80120d0b295f BUG: 853258 Signed-off-by: Anand Avati <> Reviewed-on: Tested-by: Gluster Build System <>
Diffstat (limited to 'xlators/debug')
0 files changed, 0 insertions, 0 deletions