The Small-File Problem in Spark SQL

Date: 2021-01-17

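References: https://github.com/Intel-bigdata/spark-adaptive and http://spark.apache.org/docs/latest/configuration.html

Processing data with the Spark SQL APIs tends to produce a large number of small files, a problem that is common in distributed computing. There are generally three ways to deal with it: setting spark.sql.shuffle.part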
