DataFrame.rebalance(factor=None, axis=0, num_partitions=None, reassign_worker=True)

Make data more balanced across the entire cluster.

Parameters

  • factor (float) – If specified, the number of chunks after rebalancing is the total CPU count of the cluster multiplied by factor.

  • axis (int) – The axis along which to rebalance.

  • num_partitions (int) – If specified, the number of chunks after rebalancing is at most num_partitions.

  • reassign_worker (bool) – If True, workers will be reassigned.
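
The chunk-count arithmetic described by factor and num_partitions can be sketched in plain Python. This is only an illustration under stated assumptions, not the library's implementation: the helper names target_chunk_count and even_chunk_sizes are hypothetical, the truncation to an integer is assumed, and treating "balanced" as "chunk sizes differ by at most one row" is also an assumption.

```python
def target_chunk_count(total_cpus, factor=None, num_partitions=None):
    """Hypothetical helper: chunk count implied by the parameters above."""
    if num_partitions is not None:
        # num_partitions is described as an upper bound on the chunk count;
        # this sketch simply uses the bound itself.
        return num_partitions
    if factor is not None:
        # Described as "total CPU count of cluster * factor". Truncating to
        # an int and keeping at least one chunk are assumptions of this sketch.
        return max(1, int(total_cpus * factor))
    # The behaviour when neither argument is given is not stated in this
    # reference, so the sketch refuses to guess.
    raise ValueError("this sketch needs factor or num_partitions")


def even_chunk_sizes(n_rows, n_chunks):
    """Split n_rows into n_chunks sizes that differ by at most one row."""
    base, extra = divmod(n_rows, n_chunks)
    return [base + 1 if i < extra else base for i in range(n_chunks)]


# A cluster with 8 CPUs and factor=1.5 would target 12 chunks.
chunks = target_chunk_count(total_cpus=8, factor=1.5)
sizes = even_chunk_sizes(n_rows=100, n_chunks=chunks)
print(chunks, sizes)
```

With these assumptions, 100 rows on an 8-CPU cluster with factor=1.5 end up in 12 chunks of 8 or 9 rows each.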


Returns

The DataFrame or Series after rebalancing.

Return type

Series or DataFrame