Hadoop provides a C API for accessing HDFS. Below is a brief introduction.
Environment: Ubuntu 14.04, Hadoop 1.0.1, JDK 1.7.0_51
The functions for accessing HDFS are declared mainly in hdfs.h, which lives under hadoop-1.0.1/src/c++/libhdfs/. The corresponding library is libhdfs.so in hadoop-1.0.1/c++/Linux-amd64-64/lib/. In addition, accessing HDFS depends on the JDK: the header directories jdk1.7.0_51/include/ and jdk1.7.0_51/include/linux/, and the library libjvm.so in jdk1.7.0_51/jre/lib/amd64/server/. All of these include and library paths must be supplied at compile/link time. Below is a simple source program, main.c:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <fcntl.h>   /* O_WRONLY, O_CREAT, O_RDONLY */
#include "hdfs.h"

int main(int argc, char **argv)
{
    /*
     * Connect to hdfs.
     */
    hdfsFS fs = hdfsConnect("127.0.0.1", 9000);
    if (!fs) {
        fprintf(stderr, "Failed to connect to hdfs.\n");
        exit(-1);
    }

    /*
     * Create and open a file in hdfs.
     */
    const char* writePath = "/user/root/output/testfile.txt";
    hdfsFile writeFile = hdfsOpenFile(fs, writePath, O_WRONLY|O_CREAT, 0, 0, 0);
    if (!writeFile) {
        fprintf(stderr, "Failed to open %s for writing!\n", writePath);
        exit(-1);
    }

    /*
     * Write data to the file (including the trailing '\0').
     */
    const char* buffer = "Hello, World!";
    tSize num_written_bytes = hdfsWrite(fs, writeFile, (void*)buffer, strlen(buffer)+1);
    if (num_written_bytes < 0) {
        fprintf(stderr, "Failed to write to %s\n", writePath);
        exit(-1);
    }

    /*
     * Flush the write buffer.
     */
    if (hdfsFlush(fs, writeFile)) {
        fprintf(stderr, "Failed to 'flush' %s\n", writePath);
        exit(-1);
    }

    /*
     * Close the file.
     */
    hdfsCloseFile(fs, writeFile);

    /*
     * Open the same file for reading.
     */
    unsigned bufferSize = 1024;
    const char* readPath = "/user/root/output/testfile.txt";
    hdfsFile readFile = hdfsOpenFile(fs, readPath, O_RDONLY, bufferSize, 0, 0);
    if (!readFile) {
        fprintf(stderr, "couldn't open file %s for reading\n", readPath);
        exit(-2);
    }

    // buffer to hold the data read back from the file
    char* rbuffer = (char*)malloc(sizeof(char) * (bufferSize + 1));
    if (rbuffer == NULL) {
        return -2;
    }

    // read from the file until a short (or empty) read signals EOF
    tSize curSize = bufferSize;
    for (; curSize == (tSize)bufferSize;) {
        curSize = hdfsRead(fs, readFile, (void*)rbuffer, bufferSize);
        if (curSize < 0) {
            fprintf(stderr, "Failed to read from %s\n", readPath);
            break;
        }
        rbuffer[curSize] = '\0';
        fprintf(stdout, "read '%s' from file!\n", rbuffer);
    }

    free(rbuffer);
    hdfsCloseFile(fs, readFile);

    /*
     * Disconnect from hdfs.
     */
    hdfsDisconnect(fs);

    return 0;
}
The program is straightforward and the important steps are commented, so I won't walk through it line by line. Its main function is to create a file named testfile.txt under the /user/root/output/ directory in HDFS, write "Hello, World!" into it, then read the string back from the file and print it. If your HDFS does not have a /user/root/output/ directory, you need to create one first or change the path to one that exists.
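If you prefer to create the directory from code rather than with hadoop fs -mkdir, libhdfs also exposes hdfsCreateDirectory. Here is a minimal sketch (the helper ensure_output_dir is my own illustration, not part of the original program):

#include <stdio.h>
#include "hdfs.h"

/* Sketch: make sure /user/root/output exists before running the
 * write/read demo. hdfsCreateDirectory maps to FileSystem.mkdirs,
 * returning 0 on success and -1 on error. */
int ensure_output_dir(hdfsFS fs)
{
    const char* dir = "/user/root/output";
    if (hdfsCreateDirectory(fs, dir) != 0) {
        fprintf(stderr, "Failed to create %s\n", dir);
        return -1;
    }
    return 0;
}

You would call ensure_output_dir(fs) right after hdfsConnect succeeds and before opening writePath.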
The compile/link command on my system is:
g++ main.c -I /root/hadoop-1.0.1/src/c++/libhdfs/ -I /usr/java/jdk1.7.0_51/include/ -I /usr/java/jdk1.7.0_51/include/linux/ -L /root/hadoop-1.0.1/c++/Linux-amd64-64/lib/ -lhdfs -L /usr/java/jdk1.7.0_51/jre/lib/amd64/server/ -ljvm -o hdfs-test
Here g++ is the compiler driver, the paths after -I are header include paths, the paths after -L are library search paths, and -lhdfs and -ljvm name the specific libraries to link. Replace the paths with the corresponding ones on your system. With that, compilation should succeed. At runtime, however, the program will report that libhdfs.so.0 and libjvm.so cannot be found. The fix is to append the directories containing those libraries to /etc/ld.so.conf and then run ldconfig; this effectively registers the libraries with the system's dynamic linker, so they can be located at run time.
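Concretely, assuming the same install locations as in the compile command above, the registration could look like this (run as root):

# Append the two library directories to the dynamic linker's search list
echo "/root/hadoop-1.0.1/c++/Linux-amd64-64/lib" >> /etc/ld.so.conf
echo "/usr/java/jdk1.7.0_51/jre/lib/amd64/server" >> /etc/ld.so.conf
# Rebuild the linker cache
ldconfig
# Now the program should start without the missing-library errors
./hdfs-test

Alternatively, exporting the two directories in LD_LIBRARY_PATH works for a single shell session without touching /etc/ld.so.conf.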