k-Balanced sorting and skew join in MPI and MapReduce

  • Silu Huang
  • Ada Wai-Chee Fu

2014 IEEE International Conference on Big Data (Big Data)

We consider algorithms for sorting and skew equi-join operations on computer clusters. The proposed algorithms achieve the best known theoretical workload-balancing guarantee and exhibit close-to-optimal balancing in our experiments. Our empirical studies also show that the proposed sorting algorithm is up to 30% faster than the state-of-the-art algorithm.