Setting Up a Hadoop Development Environment on Windows XP with Cygwin

1. Install Cygwin

Reference post: http://hi.baidu.com/%BD%AB%D6%AE%B7%E7_%BE%B2%D6%AE%D4%A8/blog/item/8832551c7598551f314e15c2.html

Q1. In my install, step 9 went a bit differently from the post's description ("open Cygwin to configure it: first type ssh-host-config and press Enter; when asked yes/no, answer no and press Enter; when you see Have fun! it succeeded"). My session looked like this:

Administrator@03ad6b3ba2f34fe ~
$ ssh-host-config

*** Info: Generating /etc/ssh_host_key
*** Info: Generating /etc/ssh_host_rsa_key
*** Info: Generating /etc/ssh_host_dsa_key
*** Info: Generating /etc/ssh_host_ecdsa_key
*** Info: Creating default /etc/ssh_config file
*** Info: Creating default /etc/sshd_config file
*** Info: Privilege separation is set to yes by default since OpenSSH 3.3.
*** Info: However, this requires a non-privileged account called 'sshd'.
*** Info: For more info on privilege separation read /usr/share/doc/openssh/README.privsep.
*** Query: Should privilege separation be used? (yes/no) no
*** Info: Updating /etc/sshd_config file
*** Info: Added ssh to C:\WINDOWS\system32\drivers\etc\services

*** Query: Do you want to install sshd as a service?
*** Query: (Say "no" if it is already installed as a service) (yes/no) yes
*** Query: Enter the value of CYGWIN for the daemon: []              -- just press Enter here

*** Info: The sshd service has been installed under the LocalSystem
*** Info: account (also known as SYSTEM). To start the service now, call
*** Info: `net start sshd' or `cygrunsrv -S sshd'.  Otherwise, it
*** Info: will start automatically after the next reboot.

*** Info: Host configuration finished. Have fun!
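
Once the script reports success, the service still has to be started and tested. A minimal check, using the commands the Info lines above already name:

net start sshd     # start the sshd service (cygrunsrv -S sshd also works)
ssh localhost      # accept the host key, enter your Windows password, and you should get a shell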

Q2. During my first install the computer froze at the step that creates the desktop icons. Cygwin was already runnable, but I still wanted to reinstall from scratch, so I went looking for a way to uninstall it. Some people said to run the setup program and mark everything as Uninstall; I believed them, and it ended badly: it does not remove things cleanly. I then hunted for a complete-removal method and tried "delete every Cygwin folder, then clean the registry entries that contain cygwin", which worked this time (a rough sketch follows below). Never use setup to uninstall!!
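
For anyone who needs the manual route, the cleanup boils down to those two steps. A sketch from the Windows command prompt, assuming the default C:\cygwin install root (adjust the path to yours, and be careful when deleting registry keys):

rmdir /s /q C:\cygwin     delete the whole Cygwin installation folder
regedit                   then search (Ctrl+F) for "cygwin" and delete the matching keys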

2. Install the JDK and Eclipse. No problems here; I have been writing Java for more than a year since graduating.

3. Configure Eclipse for Hadoop

Reference post: http://hi.baidu.com/%BD%AB%D6%AE%B7%E7_%BE%B2%D6%AE%D4%A8/blog/item/a0ebb1db953a772033fa1c9a.html

Q1. Following the author's step 4, running ./hadoop jar ./../hadoop-0.20.2-examples.jar wordcount testin testout started throwing errors:

INFO input.FileInputFormat: Total input paths to process : 2
INFO mapred.JobClient: Running job: job_201202131412_0007
INFO mapred.JobClient:  map 0% reduce 0%
INFO mapred.JobClient: Task Id : attempt_201202131412_0007_m_000003_0, Status : FAILED
java.io.FileNotFoundException: File D:/hadoop/temp/taskTracker/jobcache/job_201202131412_0007/attempt_201202131412_0007_m_000003_0/work/tmp does not exist.
        at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:361)
        at

Yes, the commenter under that post was me. However you read it, the error means a file cannot be found. I found a fix online: in mapred-site.xml, add

<property>
  <name>mapred.child.tmp</name>
  <value>/hadoop/tmp</value>
</property>

Everything after that worked without a hitch.
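
One note from my later runs: to be safe after editing mapred-site.xml, restart the daemons so every component picks up the new value before re-running the job (paths as in section 4 below):

cd /cygdrive/d/hadoop-0.20.2/bin
./stop-all.sh     # stop HDFS and MapReduce
./start-all.sh    # start them again so the new mapred.child.tmp value takes effect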

4. Frequently used commands
ssh localhost  log in
cd /cygdrive/d/hadoop-0.20.2  change into the Hadoop directory
ls  list all files in the current directory
In the /cygdrive/d/hadoop-0.20.2/bin directory:
./hadoop namenode -format  format a new HDFS
./start-all.sh  start HDFS and MapReduce together
./hadoop dfs -mkdir testin  create the directory testin
./hadoop dfs -put /test/*.java testin  copy all the .java files under /test into testin
./hadoop dfs -ls testin  list all the files in testin
./hadoop dfs -rmr testout  delete the testout folder
./hadoop jar ./../hadoop-0.20.2-examples.jar wordcount testin testout  run the WordCount example
./hadoop dfs -cat testout/part-r-00000  view the part-r-00000 file under testout
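
Putting these together, a complete WordCount run from a fresh start looks like this (same paths as above; the format step is only for a brand-new HDFS):

ssh localhost
cd /cygdrive/d/hadoop-0.20.2/bin
./hadoop namenode -format
./start-all.sh
./hadoop dfs -mkdir testin
./hadoop dfs -put /test/*.java testin
./hadoop jar ./../hadoop-0.20.2-examples.jar wordcount testin testout
./hadoop dfs -cat testout/part-r-00000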

================================

Remaining issues

1. Many blogs say that Hadoop 0.20.2 runs into lots of problems, and that "when configuring Hadoop with Cygwin on Windows you must pick version 0.19.2". I have not hit this yet. For anyone who needs it, 0.19.2 can be downloaded from http://archive.apache.org/dist/hadoop/core/hadoop-0.19.2/  I have also uploaded it to CSDN, or leave your e-mail address and I will send it to you.

2. The WordCount that runs fine under Cygwin keeps failing when run from Eclipse, with the same file-not-found error as at the start. This still needs to be sorted out.

Note: additional reference: http://wildrain.iteye.com/blog/1164608

 

--- Head down pulling the cart, head up watching the road.
