[QUESTION] When a Spark reduce stage needs to sort, is a large local disk still required? #2146
Replies: 5 comments 1 reply
-
Actually, supporting sorting directly on the Celeborn side may not be a good choice. Celeborn already aggregates the shuffle data of all jobs; sorting on top of that would put a lot of pressure on it. And since Celeborn serves multiple Spark jobs, if Celeborn did the sorting, the resources it needs might exceed those of the Spark jobs themselves. Is such an implementation really reasonable? Compared with this request, I would rather lean toward remote spill.
-
But if sorting is not handled on the Celeborn side, Spark executors will still need large memory and large disks on top of everything else. That makes storage-compute separation incomplete: the resources are never fully decoupled. In real production, once Celeborn is deployed, we don't want to also require Spark executors to take large memory and large storage disks.
-
Then you might consider remote spill.
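For context, here is a conceptual sketch of what "remote spill" could mean: instead of writing sorted spill runs to the executor's local disk, they are written to a Hadoop-compatible remote filesystem. This is only an illustration under that assumption; it is not an existing Celeborn or Spark feature, and the record type, path, and helper name are made up for the example.

```scala
// Conceptual sketch only: "remote spill" here simply means writing a sorted
// spill run to a Hadoop-compatible remote filesystem (HDFS, S3A, ...) instead
// of the executor's local disk. Not an existing Celeborn or Spark API.
import java.net.URI
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

object RemoteSpillSketch {
  def spillSortedRun(records: Iterator[(String, Int)], remotePath: String): Unit = {
    val fs  = FileSystem.get(new URI(remotePath), new Configuration())
    val out = fs.create(new Path(remotePath), true) // overwrite = true
    try {
      // The caller is assumed to have sorted `records` in memory already, so
      // each spilled file is one sorted run that a later merge pass can consume.
      records.foreach { case (k, v) =>
        out.writeUTF(k)
        out.writeInt(v)
      }
    } finally {
      out.close()
    }
  }
}
```

With an approach like this, the executor's local-disk requirement shrinks to roughly the in-memory sort buffer, at the cost of extra network I/O on every spill and on the merge pass that reads the runs back.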
-
Is remote spill something Celeborn supports?
-
I feel sorting is best built in but disabled by default, letting the client side decide whether to sort.
-
Looking at the current code, Celeborn does not sort data by key on the worker side. So on the reader side, if the reduce stage needs ordering, it has to perform a local sort using memory plus disk, which requires the Spark executor to have fairly large memory and disk (a simplified sketch of that reader-side external sort is included below this question). Will sorting directly on the Celeborn server side be considered later?
/cc @who-need-to-know
/assign @who-can-help-you
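For reference, here is a minimal sketch of the reader-side external sort described above, assuming simple (String, Int) records and a fixed record-count spill threshold (these are illustration-only assumptions; Spark's own ExternalSorter sizes its buffer from execution memory instead). Records are buffered in memory, spilled to local disk as sorted runs under memory pressure, and the runs are then merged with a priority queue — which is why the executor needs both sizable memory and local disk when the shuffle data arrives unsorted.

```scala
// Minimal external-sort sketch (an illustration, not Spark's ExternalSorter):
// it shows why a reduce-side sort needs both memory (an in-flight buffer) and
// local disk (spilled sorted runs) when shuffle data arrives unsorted.
import java.io.{DataInputStream, DataOutputStream, EOFException, File, FileInputStream, FileOutputStream}
import scala.collection.mutable

object LocalExternalSortSketch {
  // A fixed record threshold is an assumption for illustration only.
  private val maxInMemoryRecords = 100000

  def sort(input: Iterator[(String, Int)], tmpDir: File): Iterator[(String, Int)] = {
    val buffer = mutable.ArrayBuffer.empty[(String, Int)]
    val spills = mutable.ArrayBuffer.empty[File]

    // When the in-memory buffer fills up, write it to LOCAL DISK as one
    // sorted run. This is the part that requires disk on the executor.
    def spill(): Unit = {
      val f = File.createTempFile("spill-", ".bin", tmpDir)
      val out = new DataOutputStream(new FileOutputStream(f))
      try buffer.sortBy(_._1).foreach { case (k, v) => out.writeUTF(k); out.writeInt(v) }
      finally out.close()
      buffer.clear()
      spills += f
    }

    input.foreach { rec =>
      buffer += rec
      if (buffer.size >= maxInMemoryRecords) spill()
    }
    if (buffer.nonEmpty) spill()

    // k-way merge of the sorted runs, smallest key first (min-heap via reversed ordering).
    val readers = spills.map(f => new DataInputStream(new FileInputStream(f)))
    def readNext(in: DataInputStream): Option[(String, Int)] =
      try Some((in.readUTF(), in.readInt())) catch { case _: EOFException => in.close(); None }

    val heap = mutable.PriorityQueue.empty[((String, Int), Int)](
      Ordering.by[((String, Int), Int), String](_._1._1).reverse)
    readers.zipWithIndex.foreach { case (r, i) => readNext(r).foreach(rec => heap.enqueue((rec, i))) }

    new Iterator[(String, Int)] {
      def hasNext: Boolean = heap.nonEmpty
      def next(): (String, Int) = {
        val (rec, i) = heap.dequeue()
        readNext(readers(i)).foreach(nextRec => heap.enqueue((nextRec, i)))
        rec
      }
    }
  }
}
```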