Accessing HDFS through Hadoop's C API

Hadoop provides a C API for accessing HDFS. Below is a brief introduction.

Environment: Ubuntu 14.04, Hadoop 1.0.1, JDK 1.7.0_51

The functions for accessing HDFS are declared mainly in hdfs.h, which lives under hadoop-1.0.1/src/c++/libhdfs/; the corresponding library is libhdfs.so under hadoop-1.0.1/c++/Linux-amd64-64/lib/. Accessing HDFS also depends on the JDK: the header directories are jdk1.7.0_51/include/ and jdk1.7.0_51/include/linux/, and the library is libjvm.so under jdk1.7.0_51/jre/lib/amd64/server/. All of these include directories and libraries must be supplied when compiling and linking. Below is a simple source program, main.cpp.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "hdfs.h"

int main(int argc, char **argv)
{
    /*
     * Connect to HDFS.
     */
    hdfsFS fs = hdfsConnect("127.0.0.1", 9000);
    if (!fs)
    {
        fprintf(stderr, "Failed to connect to hdfs.\n");
        exit(-1);
    }

    /*
     * Create and open a file in HDFS.
     */
    const char* writePath = "/user/root/output/testfile.txt";
    hdfsFile writeFile = hdfsOpenFile(fs, writePath, O_WRONLY|O_CREAT, 0, 0, 0);
    if (!writeFile)
    {
        fprintf(stderr, "Failed to open %s for writing!\n", writePath);
        exit(-1);
    }

    /*
     * Write data to the file.
     */
    const char* buffer = "Hello, World!";
    tSize num_written_bytes = hdfsWrite(fs, writeFile, (void*)buffer, strlen(buffer)+1);
    if (num_written_bytes < 0)
    {
        fprintf(stderr, "Failed to write to %s\n", writePath);
        exit(-1);
    }

    /*
     * Flush the buffer.
     */
    if (hdfsFlush(fs, writeFile))
    {
        fprintf(stderr, "Failed to 'flush' %s\n", writePath);
        exit(-1);
    }

    /*
     * Close the file.
     */
    hdfsCloseFile(fs, writeFile);

    /*
     * Open the file for reading.
     */
    unsigned bufferSize = 1024;
    const char* readPath = "/user/root/output/testfile.txt";
    hdfsFile readFile = hdfsOpenFile(fs, readPath, O_RDONLY, bufferSize, 0, 0);
    if (!readFile) {
        fprintf(stderr, "couldn't open file %s for reading\n", readPath);
        exit(-2);
    }

    // Buffer to hold the data read back from the file.
    char* rbuffer = (char*)malloc(sizeof(char) * (bufferSize + 1));
    if (rbuffer == NULL) {
        return -2;
    }

    // Read from the file; a short read (fewer bytes than requested) means EOF.
    tSize curSize = bufferSize;
    for (; curSize == (tSize)bufferSize;) {
        curSize = hdfsRead(fs, readFile, (void*)rbuffer, bufferSize);
        if (curSize < 0) {
            fprintf(stderr, "hdfsRead failed on %s\n", readPath);
            break;
        }
        rbuffer[curSize] = '\0';
        fprintf(stdout, "read '%s' from file!\n", rbuffer);
    }

    free(rbuffer);
    hdfsCloseFile(fs, readFile);

    /*
     * Disconnect from HDFS.
     */
    hdfsDisconnect(fs);

    return 0;
}

 

The program is straightforward and the important parts are commented, so I won't explain it line by line. Its main function is to create a file named testfile.txt under the HDFS directory /user/root/output/, write "Hello, World!" into it, and then read "Hello, World!" back from that file and print it. If your HDFS does not have a /user/root/output/ directory, you need to create one or change the path to a directory that exists.
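For reference, the target directory can be created with the Hadoop 1.x shell before running the program, and the written file can be inspected afterwards. The path below matches the one hard-coded in the example; adjust it to your setup:

```shell
# Create the output directory used by the example (Hadoop 1.x CLI).
hadoop fs -mkdir /user/root/output

# After running the compiled program, verify the file contents:
hadoop fs -cat /user/root/output/testfile.txt
```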

Here is the compile-and-link command on my system:

g++ main.cpp -I /root/hadoop-1.0.1/src/c++/libhdfs/ -I /usr/java/jdk1.7.0_51/include/ -I /usr/java/jdk1.7.0_51/include/linux/ -L /root/hadoop-1.0.1/c++/Linux-amd64-64/lib/ -lhdfs -L /usr/java/jdk1.7.0_51/jre/lib/amd64/server/ -ljvm -o hdfs-test

Here, g++ is the compiler; the paths after -I are header include paths, the paths after -L are library search paths, and -lhdfs and -ljvm name the specific libraries to link. Replace the paths with the corresponding ones on your system. With that, compilation should succeed. At runtime, however, the program will report that libhdfs.so.0 and libjvm.so cannot be found. The fix is to append the directories containing those libraries to /etc/ld.so.conf and then run the ldconfig command; this registers the libraries with the system's runtime linker so they can be found at run time.
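Concretely, the registration step might look like the sketch below, using the library paths from this article's setup (substitute your own). One additional caveat the compile step does not cover: libhdfs starts a JVM internally, so the Hadoop jars must be on CLASSPATH when the program runs; the loop shown is one crude way to build that CLASSPATH for a Hadoop 1.x install, with HADOOP_HOME assumed to be /root/hadoop-1.0.1:

```shell
# Register the native library directories with the runtime linker (run as root).
echo "/root/hadoop-1.0.1/c++/Linux-amd64-64/lib" >> /etc/ld.so.conf
echo "/usr/java/jdk1.7.0_51/jre/lib/amd64/server" >> /etc/ld.so.conf
ldconfig

# libhdfs embeds a JVM, so the Hadoop jars must be on CLASSPATH at runtime.
export HADOOP_HOME=/root/hadoop-1.0.1
for jar in "$HADOOP_HOME"/*.jar "$HADOOP_HOME"/lib/*.jar; do
  CLASSPATH="$CLASSPATH:$jar"
done
export CLASSPATH

./hdfs-test
```

Without the CLASSPATH step, hdfsConnect typically fails even though the shared libraries themselves are found.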
