After packaging a Spark project with Maven and running the resulting jar (for example: java -jar DataAnalygis.jar hdfs://server1:8020/tasks/files), the following exception is thrown at runtime:
Exception in thread "main" java.lang.RuntimeException: java.io.IOException: No FileSystem for scheme: file
at org.apache.hadoop.mapred.JobConf.getWorkingDirectory(JobConf.java:657)
at org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:389)
at org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:362)
at org.apache.spark.SparkContext$$anonfun$15.apply(SparkContext.scala:391)
at org.apache.spark.SparkContext$$anonfun$15.apply(SparkContext.scala:391)
at org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$1.apply(HadoopRDD.scala:111)
at org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$1.apply(HadoopRDD.scala:111)
at scala.Option.map(Option.scala:145)
Solution:
1) Check the META-INF/services/org.apache.hadoop.fs.FileSystem file inside the generated jar (for example with unzip -p DataAnalygis.jar META-INF/services/org.apache.hadoop.fs.FileSystem). This file must list the FileSystem implementations, in particular:

    org.apache.hadoop.fs.LocalFileSystem  # the class that handles the local "file" scheme

This entry is commonly lost when building a fat jar, because hadoop-common and hadoop-hdfs each ship their own copy of this services file and the last copy written overwrites the others; see the maven-shade-plugin sketch after this list.
2) Another possibility is that the classpath is missing hadoop-hdfs.jar, though this is less likely. Normally, when the project is packaged with Maven, the correct hadoop-client dependency is already declared, so this error is usually not caused by that.
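The usual build-time fix for the clobbered services file is to merge all META-INF/services entries instead of letting one jar's copy overwrite the others. A minimal maven-shade-plugin sketch, assuming a standard Maven build (the plugin version shown is illustrative):

    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-shade-plugin</artifactId>
      <version>3.2.4</version>
      <executions>
        <execution>
          <phase>package</phase>
          <goals>
            <goal>shade</goal>
          </goals>
          <configuration>
            <transformers>
              <!-- Merge every META-INF/services/* file found across the
                   dependency jars, so LocalFileSystem stays registered -->
              <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
            </transformers>
          </configuration>
        </execution>
      </executions>
    </plugin>

After rebuilding, re-check the services file in the jar; it should now contain the entries from both hadoop-common and hadoop-hdfs.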
In addition, when running a plain Hadoop jar reports this same error, the solution above also applies.
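If rebuilding the jar is not convenient, the FileSystem implementations can also be registered explicitly on the Hadoop Configuration before any path is read, bypassing the services-file lookup entirely. A minimal Scala sketch, assuming a Spark driver program (the object name and the textFile usage are illustrative, not taken from the original project):

    import org.apache.hadoop.conf.Configuration
    import org.apache.spark.{SparkConf, SparkContext}

    object DataAnalysisApp {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("DataAnalysisApp"))
        val conf: Configuration = sc.hadoopConfiguration
        // Register the schemes directly instead of relying on the
        // (possibly clobbered) META-INF/services lookup.
        conf.set("fs.file.impl", classOf[org.apache.hadoop.fs.LocalFileSystem].getName)
        conf.set("fs.hdfs.impl", classOf[org.apache.hadoop.hdfs.DistributedFileSystem].getName)

        val lines = sc.textFile(args(0)) // e.g. hdfs://server1:8020/tasks/files
        println(s"lines: ${lines.count()}")
        sc.stop()
      }
    }

Note that referencing DistributedFileSystem this way requires hadoop-hdfs on the compile classpath, which also covers the classpath check in point 2).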