Download the spark-2.3.1-bin-hadoop2.7 package and unzip it, then configure the environment variables:

HADOOP_HOME: D:\softwares\Java\hadoop-2.7.7
SPARK_HOME: D:\softwares\Java\spark-2.3.1-bin-hadoop2.7
PATH: %SPARK_HOME%\bin;%HADOOP_HOME%\bin;
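A quick way to confirm the variables above are actually visible to Python is a small sketch like this (the variable names are the real ones used by Spark; the values printed will be whatever your system has set):

```python
import os

def check_env(names=("HADOOP_HOME", "SPARK_HOME")):
    """Return a dict of the requested environment variables (None if unset)."""
    return {name: os.environ.get(name) for name in names}

# Print what Python actually sees for the variables configured above
print(check_env())
```

If either value comes back as None, the variable was set after the interpreter (or IDE) was started; restart it and check again.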
Find py4j-0.10.7-src.zip under D:\softwares\Java\spark-2.3.1-bin-hadoop2.7\python\lib, unzip it, and copy the extracted py4j directory into the Python site-packages directory D:\softwares\Java\Python36\Lib\site-packages.
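The manual unzip-and-copy step can also be scripted. This is a hedged sketch of that step using only the standard library; the two paths are the example paths from this walkthrough, so adjust them to your own install:

```python
import os
import zipfile

# Example paths from this walkthrough; change these to match your machine.
src_zip = r"D:\softwares\Java\spark-2.3.1-bin-hadoop2.7\python\lib\py4j-0.10.7-src.zip"
site_packages = r"D:\softwares\Java\Python36\Lib\site-packages"

def extract_py4j(zip_path, dest_dir):
    """Unzip the py4j source archive into dest_dir (the py4j/ folder inside the zip lands there)."""
    with zipfile.ZipFile(zip_path) as zf:
        zf.extractall(dest_dir)

if os.path.exists(src_zip):
    extract_py4j(src_zip, site_packages)
```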
Then test the setup with the following script:

```python
from pyspark import SparkConf, SparkContext

conf = SparkConf().setMaster('local').setAppName('JackManWu')
sc = SparkContext(conf=conf)
# Use a raw string so the backslashes in the Windows path are not treated as escapes
lines = sc.textFile(r"D:\softwares\Java\spark-2.3.1-bin-hadoop2.7\README.md")
print(lines.count())
```
Running the script prints a few warnings followed by the line count of README.md:

```
2018-08-20 17:30:13 WARN NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
2018-08-20 17:30:15 WARN Utils:66 - Service 'SparkUI' could not bind on port 4040. Attempting port 4041.
103
```
In the IDE, however, pyspark, SparkConf, and SparkContext are marked with red squiggly underlines, and there is no intelligent Spark code completion or suggestion support, which is very inconvenient. The following method fixes this:
Open Project Structure and, in the right-hand panel, click Add Content Root, then add the pyspark.zip file from the Spark directory D:\softwares\Java\spark-2.3.1-bin-hadoop2.7\python\lib to the project. This resolves both the red squiggles and the missing completion.
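The IDE fix above only helps the editor. As a hedged alternative that also works from a plain interpreter, you can make pyspark importable by putting Spark's python directory and its bundled py4j zip on sys.path yourself (this is the same idea the findspark package automates; the path below is the example install path from this walkthrough):

```python
import os
import sys

def add_spark_to_path(spark_home):
    """Prepend Spark's python dir and the bundled py4j source zip to sys.path."""
    py_dir = os.path.join(spark_home, "python")
    py4j_zip = os.path.join(py_dir, "lib", "py4j-0.10.7-src.zip")
    sys.path[:0] = [py_dir, py4j_zip]

# Example path from this walkthrough; adjust to your install.
add_spark_to_path(r"D:\softwares\Java\spark-2.3.1-bin-hadoop2.7")
```

After this call, `from pyspark import SparkConf, SparkContext` works in any script without touching the IDE configuration.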