Setting up an AI runtime environment

Installing tensorflow

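The target system is CentOS 6.8. After installing tensorflow with pip, it complained that the GLIBC version was too old. Upgrading GLIBC carries some risk, so I decided to build tensorflow from source instead. The basic flow follows this tutorial: http://www.jianshu.com/p/fdb7b54b616e/ , but because I chose somewhat different versions I ran into a few pitfalls of my own, so I am writing the steps up again. To keep the impact on the operating system as small as possible, the installation does not use the root account or sudo; it uses an ordinary account, makeuser (a few steps do need root).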
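Before choosing between a prebuilt wheel and a source build, it can help to compare the system's glibc version with the one named in pip's error message. The following is only a sketch: the helper names glibc_version and version_ge are made up for illustration, the 2.17 threshold is an assumption to be replaced with the version from the actual error, and the comparison relies on GNU sort -V.

```shell
# Print the glibc version reported by getconf (empty on non-glibc systems).
glibc_version() {
    getconf GNU_LIBC_VERSION 2>/dev/null | awk '{print $2}'
}

# version_ge A B: succeed when version A >= version B (uses GNU `sort -V`).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

required="2.17"   # assumption: use the version named in pip's error message
current="$(glibc_version)"
if [ -z "$current" ]; then
    echo "could not determine the glibc version"
elif version_ge "$current" "$required"; then
    echo "glibc $current satisfies >= $required"
else
    echo "glibc $current is older than $required; building from source is the safer route"
fi
```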

Software versions used

  • gcc 4.9.4
  • python 3.5.2
  • bazel 0.4.5
  • tensorflow 1.2.0

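Steps

Building and installing gcc 4.9.4

Reference tutorial: http://blog.csdn.net/xiexievv/article/details/50620170

GCC official site: https://gcc.gnu.org/ . gcc 4.9.4 can be downloaded from the official site; here I simply fetched it from a mirror with wget.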

wget http://mirrors.concertpass.com/gcc/releases/gcc-4.9.4/gcc-4.9.4.tar.gz
tar xf gcc-4.9.4.tar.gz
cd gcc-4.9.4
./contrib/download_prerequisites  # downloads the required components; it worked for me directly. If it fails, download them manually as described in the reference tutorial above

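Once the components are downloaded, configure can be run. This newer gcc is only used to compile tensorflow, and I do not want it to affect the system's original gcc, so I create a separate directory to hold the build toolchain. The --prefix option sets a custom install path.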

cd ..
mkdir gcc-4.9.4-build-temp  # build directory for gcc
mkdir software  # install directory for gcc
cd gcc-4.9.4-build-temp/
../gcc-4.9.4/configure --prefix=/home/makeuser/software --enable-checking=release --enable-languages=c,c++ --disable-multilib
make -j4
make install

After the build finishes, add the new gcc to the environment of user makeuser. Edit ~/.bashrc and add the following environment variables:

export PATH=/home/makeuser/software/bin:$PATH
export CC=/home/makeuser/software/bin/gcc
export CXX=/home/makeuser/software/bin/g++
export C_INCLUDE_PATH=/home/makeuser/software/include
export CPLUS_INCLUDE_PATH=$C_INCLUDE_PATH
export LD_LIBRARY_PATH=/home/makeuser/software/lib:/home/makeuser/software/lib64
export LDFLAGS="-L/home/makeuser/software/lib -L/home/makeuser/software/lib64"
export CXXFLAGS="-L/home/makeuser/software/lib -L/home/makeuser/software/lib64"
export LD_RUN_PATH=/home/makeuser/software/lib/:/home/makeuser/software/lib64/
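After reloading ~/.bashrc, a small check can confirm that the shell really resolves gcc and g++ from the new prefix and not from /usr/bin. This is a sketch only; in_prefix is a hypothetical helper name, and the prefix is the one used in this post.

```shell
prefix="/home/makeuser/software"

# in_prefix CMD PREFIX: succeed when CMD resolves to a file under PREFIX/bin.
in_prefix() {
    case "$(command -v "$1" 2>/dev/null)" in
        "$2"/bin/*) return 0 ;;
        *) return 1 ;;
    esac
}

for tool in gcc g++; do
    if in_prefix "$tool" "$prefix"; then
        echo "$tool -> $(command -v "$tool")"
    else
        echo "$tool is not being picked up from $prefix/bin"
    fi
done
```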

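With the environment variables in place (open a new shell or run source ~/.bashrc), gcc -v should report version 4.9.4, which confirms the installation is correct.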

$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/home/makeuser/software/libexec/gcc/x86_64-unknown-linux-gnu/4.9.4/lto-wrapper
Target: x86_64-unknown-linux-gnu
Configured with: ../gcc-4.9.4/configure --prefix=/home/makeuser/software --enable-checking=release --enable-languages=c,c++ --disable-multilib
Thread model: posix
gcc version 4.9.4 (GCC)

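The reference tutorial goes on to install gdb; I do not need it for now, so I skip it.

Building and installing python 3.5.2

# Note before building: if the compiled python needs to support the sqlite3 module, install sqlite-devel on the system first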
yum install sqlite-devel -y

Reference tutorial: http://www.cnblogs.com/yuechaotian/archive/2013/06/03/3115482.html

Python official site: https://www.python.org/

Again, download with wget and install directly (python needs to be installed under /usr/local for all users, so this step is run as root).

wget https://www.python.org/ftp/python/3.5.2/Python-3.5.2.tgz
tar xf Python-3.5.2.tgz
cd Python-3.5.2
./configure --prefix=/usr/local/python35 --enable-shared
make -j4 && make install

# Replace the original python 2.6 with the newly installed python 3.5
ln -s /usr/local/python35/bin/python3 /usr/bin/python3.5
ln -s /usr/local/python35/lib/libpython3.5m.so.1.0 /usr/lib64/
cd /usr/bin/
ln -s python3.5 python3
mv python python.old
ln -s python3 python

# The system yum command depends on python 2.6, so the interpreter line in /usr/bin/yum must be changed to point to /usr/bin/python.old
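The interpreter change for yum can be done by hand in an editor, or with a one-line sed. The sketch below assumes the first line of /usr/bin/yum is exactly #!/usr/bin/python; repoint_shebang is a hypothetical helper name, and the actual call is left commented out because it must be run as root.

```shell
# repoint_shebang FILE OLD NEW: rewrite line 1 of FILE from "#!OLD" to "#!NEW",
# keeping the original as FILE.bak. Does nothing if line 1 is not exactly "#!OLD".
repoint_shebang() {
    sed -i.bak "1s|^#!$2\$|#!$3|" "$1"
}

# As root:
# repoint_shebang /usr/bin/yum /usr/bin/python /usr/bin/python.old
# head -n 1 /usr/bin/yum
```

Because the pattern is anchored to the whole first line, running it a second time leaves the file unchanged.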

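Install pip and use it to install numpy (I am not sure whether this step is required for building tensorflow; I simply followed the tutorial when installing).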

wget https://bootstrap.pypa.io/get-pip.py --no-check-certificate
python get-pip.py
ln -s /usr/local/python35/bin/pip3 /usr/bin/
ln -s /usr/bin/pip3 /usr/bin/pip
pip install numpy

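Installing bazel 0.4.5

bazel requires a Java 1.8 environment. My server already had jdk-8u40 installed from an rpm, so I could use it directly. If the server has no Java 1.8, a tar.gz Java package can be downloaded and unpacked, with the environment variables set to point at it.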
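For the tar.gz route, the environment variables in question are typically JAVA_HOME and PATH. A minimal sketch follows; use_jdk is a hypothetical helper, and the directory in the commented example is an assumed unpack location, not a path from this setup.

```shell
# use_jdk DIR: point JAVA_HOME at an unpacked JDK and put its bin/ first on PATH.
use_jdk() {
    export JAVA_HOME="$1"
    export PATH="$JAVA_HOME/bin:$PATH"
}

# Hypothetical location; adjust to wherever the tar.gz was unpacked.
# use_jdk /home/makeuser/software/jdk1.8.0_40
# java -version
```

Putting the same two export lines into ~/.bashrc makes the setting persistent for makeuser.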

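The installation procedure for bazel 0.4.5 differs somewhat from that for 0.4.0.

I first tried building with bazel 0.4.0 and hit errors like the ones below; with 0.4.5 the problem did not occur.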

ERROR: /home/krishna/tensorflow/WORKSPACE:3:1: //external:io_bazel_rules_closure: no such attribute 'urls' in 'http_archive' rule.
ERROR: /home/krishna/tensorflow/WORKSPACE:3:1: //external:io_bazel_rules_closure: missing value for mandatory attribute 'url' in 'http_archive' rule.
ERROR: com.google.devtools.build.lib.packages.BuildFileContainsErrorsException: error loading package '': Encountered error while reading extension file 'closure/defs.bzl': no such package '@io_bazel_rules_closure//closure': error loading package 'external': Could not load //external package.

First, download bazel-0.4.5-dist.zip from the bazel releases page on github and upload it to the server, then install it there:

mkdir bazel
mv bazel-0.4.5-dist.zip bazel
cd bazel
unzip bazel-0.4.5-dist.zip
./compile.sh

When the build finishes, copy output/bazel to /home/makeuser/software/bin/ , which is already on PATH.

cp output/bazel /home/makeuser/software/bin/

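Installing tensorflow 1.2.0

Many guides warn at this step that an NFS file system must not be used. My CentOS has never mounted NFS, so I have not verified this.

Download tensorflow 1.2.0 from github and upload it to the server.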

cd
unzip tensorflow-1.2.0.zip
cd tensorflow-1.2.0

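Before running configure, the file tensorflow/tensorflow.bzl in the source tree must be modified, otherwise the build will have problems when it is used.

redhat6/centos6 is too old: on glibc older than 2.17, clock_gettime lives in librt rather than in libc. For tensorflow to run, add librt.so to the link options (otherwise the build succeeds, but at run time after installation you get link symbol errors such as _pywrap_tensorflow_internal.so: undefined symbol: clock_gettime).

In tensorflow.bzl, change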

def tf_extension_linkopts():
  return []  # No extension link opts

to

def tf_extension_linkopts():
  return ["-lrt"]  # link librt, which provides clock_gettime on older glibc
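The same edit can be scripted instead of made by hand. The sketch below restricts the substitution to the body of tf_extension_linkopts so that other functions returning [] stay untouched; add_lrt_linkopt is a hypothetical helper name, and it assumes the function looks exactly like the original shown above.

```shell
# add_lrt_linkopt FILE: inside tf_extension_linkopts(), change `return []`
# to `return ["-lrt"]`, keeping the original as FILE.bak.
add_lrt_linkopt() {
    sed -i.bak '/^def tf_extension_linkopts():/,/return/ s/return \[\]/return ["-lrt"]/' "$1"
}

# Run from the tensorflow-1.2.0 source root:
# add_lrt_linkopt tensorflow/tensorflow.bzl
# grep -A1 '^def tf_extension_linkopts' tensorflow/tensorflow.bzl
```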

While running the build below I also hit errors like these:

bazel-out/host/bin/external/protobuf/protoc: /usr/lib64/libstdc++.so.6: version `GLIBCXX_3.4.20' not found (required by bazel-out/host/bin/external/protobuf/protoc)
bazel-out/host/bin/external/protobuf/protoc: /usr/lib64/libstdc++.so.6: version `CXXABI_1.3.8' not found (required by bazel-out/host/bin/external/protobuf/protoc)
bazel-out/host/bin/external/protobuf/protoc: /usr/lib64/libstdc++.so.6: version `GLIBCXX_3.4.18' not found (required by bazel-out/host/bin/external/protobuf/protoc)
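These errors mean that protoc was linked against the new gcc's libstdc++ but loaded the system one at run time. A quick diagnostic is to look for the version tags from the messages in each library. This is a sketch: the helper names are made up for illustration, and it is a plain substring search over the binary rather than a proper symbol-table query.

```shell
# has_version_tag LIB TAG: succeed when the binary LIB contains the string TAG.
has_version_tag() {
    grep -a -q -F -- "$2" "$1"
}

# check_libstdcxx LIB: report the three tags named in the error messages.
check_libstdcxx() {
    for tag in GLIBCXX_3.4.18 GLIBCXX_3.4.20 CXXABI_1.3.8; do
        if has_version_tag "$1" "$tag"; then
            echo "$1: $tag present"
        else
            echo "$1: $tag MISSING"
        fi
    done
}

# System library versus the one installed with gcc 4.9.4 (paths from this post).
for lib in /usr/lib64/libstdc++.so.6 /home/makeuser/software/lib64/libstdc++.so.6; do
    if [ -e "$lib" ]; then
        check_libstdcxx "$lib"
    fi
done
```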

I eventually used this fix: append the paths that were added to $LD_LIBRARY_PATH in ~/.bashrc to the end of /etc/ld.so.conf, like this:

cat /etc/ld.so.conf

include ld.so.conf.d/*.conf
/home/makeuser/software/lib
/home/makeuser/software/lib64

Then run ldconfig. Once it succeeds, the new gcc's libraries show up in /etc/ld.so.cache:

strings /etc/ld.so.cache |grep software

/home/makeuser/software/lib64/libvtv.so.0
/home/makeuser/software/lib64/libvtv.so
/home/makeuser/software/lib64/libubsan.so.0
…………

This change cannot be made with ordinary user privileges; it must be done as root.
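If the ld.so.conf change is scripted, it is worth making it idempotent so that repeated runs do not append the same path twice. A minimal sketch, with add_ld_dir as a hypothetical helper name and the real calls commented out because they need root:

```shell
# add_ld_dir CONF DIR: append DIR to CONF unless that exact line is already present.
add_ld_dir() {
    grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

# As root:
# add_ld_dir /etc/ld.so.conf /home/makeuser/software/lib
# add_ld_dir /etc/ld.so.conf /home/makeuser/software/lib64
# ldconfig
```

Since ld.so.conf already includes ld.so.conf.d/*.conf, the same helper could instead target a separate file in that directory, which keeps the main file untouched.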

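Now configure can be run. Watch two prompts. First, at "Please specify the location of python.", check that the path offered is the python you intend to use; in my case the environment already pointed at the Python 3.5 installed above, so the default was correct. Second, at "Do you wish to use jemalloc as the malloc implementation?", answer N, otherwise the build fails with: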

ERROR: /home/makeuser/.cache/bazel/_bazel_makeuser/602695da20d6c4d186ee5dce763d82ad/external/jemalloc/BUILD:10:1: C++ compilation of rule '@jemalloc//:jemalloc' failed: gcc failed: error executing command /home/makeuser/software/bin/gcc -U_FORTIFY_SOURCE -fstack-protector -Wall -B/home/makeuser/software/bin -B/usr/bin -Wunused-but-set-parameter -Wno-free-nonheap-object -fno-omit-frame-pointer -g0 -O2 ... (remaining 35 argument(s) skipped): com.google.devtools.build.lib.shell.BadExitStatusException: Process exited with status 1.
external/jemalloc/src/pages.c: In function 'je_pages_huge':
external/jemalloc/src/pages.c:203:30: error: 'MADV_HUGEPAGE' undeclared (first use in this function)
  return (madvise(addr, size, MADV_HUGEPAGE) != 0);
                          ^
external/jemalloc/src/pages.c:203:30: note: each undeclared identifier is reported only once for each function it appears in
external/jemalloc/src/pages.c: In function 'je_pages_nohuge':
external/jemalloc/src/pages.c:217:30: error: 'MADV_NOHUGEPAGE' undeclared (first use in this function)
  return (madvise(addr, size, MADV_NOHUGEPAGE) != 0);
                          ^
external/jemalloc/src/pages.c: In function 'je_pages_huge':
external/jemalloc/src/pages.c:207:1: warning: control reaches end of non-void function [-Wreturn-type]
 }
 ^
external/jemalloc/src/pages.c: In function 'je_pages_nohuge':
external/jemalloc/src/pages.c:221:1: warning: control reaches end of non-void function [-Wreturn-type]
 }
 ^
Target //tensorflow/tools/pip_package:build_pip_package failed to build

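With the pitfalls above out of the way, the build should go through. Start the build now (if memory is tight on the build server, add the parameter --local_resources 2048,.5,1.0 , which gives the available RAM in MB, CPU cores and I/O, to limit the resources the build uses and avoid out-of-memory errors):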

bazel build -c opt //tensorflow/tools/pip_package:build_pip_package
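On a memory-constrained server, the resource limit mentioned above goes directly after the build options. The sketch below assembles the flag from a RAM budget; both helper names are hypothetical, and budgeting half of the machine's RAM is only an assumption for illustration, not a recommendation from the original setup.

```shell
# local_resources_flag RAM_MB: the flag from the text with a chosen RAM budget in MB.
local_resources_flag() {
    printf '%s %s,.5,1.0\n' '--local_resources' "$1"
}

# half_of_ram_mb: half of MemTotal from /proc/meminfo, in MB (Linux only).
half_of_ram_mb() {
    awk '/^MemTotal:/ {print int($2 / 2048)}' /proc/meminfo
}

# Example invocation:
# bazel build -c opt $(local_resources_flag "$(half_of_ram_mb)") //tensorflow/tools/pip_package:build_pip_package
```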

After the build finishes, install:

bazel-bin/tensorflow/tools/pip_package/build_pip_package /home/makeuser/tensorflow_pkg  # generate the whl package
pip install /home/makeuser/tensorflow_pkg/tensorflow-1.2.0-cp35-cp35m-linux_x86_64.whl  # install it
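The wheel's file name encodes the Python version and platform it was built for, so the exact name varies between setups. Instead of typing it out, the file can be located with a glob. A small sketch, with first_wheel as a hypothetical helper name:

```shell
# first_wheel DIR: print the first tensorflow wheel found in DIR; fail if there is none.
first_wheel() {
    for f in "$1"/tensorflow-*.whl; do
        if [ -e "$f" ]; then
            printf '%s\n' "$f"
            return 0
        fi
    done
    return 1
}

# whl="$(first_wheel /home/makeuser/tensorflow_pkg)" && pip install "$whl"
```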

After installation, run a quick test:

$ python
Python 3.5.2 (default, Dec  5 2017, 11:26:25) 
[GCC 4.9.4] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
>>> hello = tf.constant('Hello,Tensorflow~')
>>> sess = tf.Session()
2017-12-05 15:25:55.673343: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-12-05 15:25:55.673435: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-12-05 15:25:55.673454: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-12-05 15:25:55.673470: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2017-12-05 15:25:55.673485: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
>>> print(sess.run(hello))
b'Hello,Tensorflow~'
>>> a = tf.constant(10)
>>> b = tf.constant(32)
>>> print(sess.run(a + b))
42
>>>

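Installing other required components

The steps above install tensorflow for python. Later there was a further requirement for libtensorflow_cc.so, a shared library for use from C++. It is installed as follows: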

cd ~/tensorflow-1.2.0
bazel build //tensorflow:libtensorflow_cc.so
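# Prepare the environment for compiling C++ code against the library (writing under /usr/local requires root)
# In my case, copying the .so file to /usr/local/lib was enough to use it
cp bazel-bin/tensorflow/libtensorflow_cc.so /usr/local/lib/
# Put the required files under /usr/local/include/tf so that they can be found when compiling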
mkdir /usr/local/include/tf
cp -r bazel-genfiles/ /usr/local/include/tf/
cp -r tensorflow/ /usr/local/include/tf/
cp -r third_party/ /usr/local/include/tf/

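Then add /usr/local/lib to /etc/ld.so.conf and run ldconfig.

Installing eigen 3.3.4

# Download eigen 3.3.4 from the official site and upload it to the server
tar xf eigen-eigen-5a0156e40feb.tar.bz2
# The eigen3 installed through yum did not work properly. Download eigen 3.3.4, unpack it under /usr/local/include/ and rename it to eigen3 for it to work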
mv eigen-eigen-5a0156e40feb /usr/local/include/eigen3

Building and installing protobuf 3.2.0

# Prepare the environment
yum install -y autoconf automake libtool
# References: https://github.com/google/protobuf/pull/2599/commits/141a1dac6ca572056c6a8b989e41f6ee213f8445
# http://blog.csdn.net/u012839187/article/details/48025225
# http://blog.csdn.net/cristianojason/article/details/68489595
# http://blog.csdn.net/xiexievv/article/details/47396725

tar xf protobuf-cpp-3.2.0.tar.gz
cd protobuf-3.2.0/

./autogen.sh
./configure --prefix=/usr
vim src/google/protobuf/metadata.h  # manual edit, presumably to apply the change from the pull request commit referenced above
make
make check
make install

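After installation, protoc --version shows whether protobuf is installed correctly. If a shared library cannot be found, try running ldconfig to reload the shared library cache.

In addition, the server needs PuLP, a linear programming library; it can be installed directly with pip.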

pip install pulp

Install jieba, a Chinese word segmentation library needed for the speech recognition work:

pip install jieba