pyspark LEAK: ByteBuf.release() was not called before it's garbage-collected. Enable advanced leak

A PySpark job hangs at a certain stage and reports the following error:

LEAK: ByteBuf.release() was not called before it's garbage-collected. Enable advanced leak reporting

Cause:

The distributed dataset is too large, so collecting it onto a single machine triggers the error.

Solution:

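In distributed computation, avoid as far as possible the operators that pull data back to one machine for local processing, such as collect and countByKey, and write the results directly to files on HDFS instead.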
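As a minimal sketch of this advice, the per-key count that countByKey would return to a single machine as a local dict can instead be computed with reduceByKey and written out by the executors with saveAsTextFile. The function name count_by_key_to_files, the tab-separated output format, and the output path are assumptions made for illustration, not part of the original job.

```python
from operator import add


def count_by_key_to_files(pairs, output_path):
    """Count records per key without collecting them locally.

    `pairs` is assumed to be a PySpark RDD of (key, value) tuples.
    `output_path` is a hypothetical target directory, e.g. an hdfs:// URI.
    """
    counts = (
        pairs
        .mapValues(lambda _: 1)  # replace every value with 1
        .reduceByKey(add)        # sum the ones per key; result stays an RDD
    )
    # Format each (key, count) pair as one tab-separated line and let the
    # executors write the lines out, so nothing is gathered on one machine.
    counts.map(lambda kv: f"{kv[0]}\t{kv[1]}").saveAsTextFile(output_path)
    return counts
```

With a live SparkContext sc, this might be called as count_by_key_to_files(sc.parallelize([("a", 1), ("b", 2), ("a", 3)]), "hdfs:///tmp/key_counts"), where the path is a placeholder. If a small sample is still needed locally for inspection, take(n) bounds how many records are brought back, unlike collect.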
