Building a Chinese word-segmentation index with Sphinx + Coreseek

 

1. Install Sphinx as follows:

Download Sphinx 2.2.11, the latest version
cd /opt/sphinx
wget <download URL>
tar xzvf sphinx-2.2.11-release.tar.gz 
cd /opt/sphinx/sphinx-2.2.11-release
./configure --prefix=/opt/soft_install/sphinx/ --with-mysql=/usr/local/mysql
make && make install
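
Before running the configure and make steps above, it can save time to confirm that the build tools are present. The following is a minimal sketch; the function name missing_tools and the tool list in the usage comment are assumptions for illustration, not something Sphinx itself provides.

```shell
#!/bin/sh
# Print the name of every tool from the argument list that is not on PATH.
# missing_tools is a hypothetical helper, not part of Sphinx or Coreseek.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Example (the tool list is an assumption; adjust it to your system):
#   missing="$(missing_tools gcc g++ make)"
#   [ -z "$missing" ] || echo "install first: $missing"
```

If the function prints nothing, every listed tool was found.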

After a successful install, the directory /opt/soft_install/sphinx/ contains four folders:
[root@XL_Php_Mysql sphinx-2.2.11-release]# ls -lht /opt/soft_install/sphinx/
total 16K
drwxr-xr-x 4 root root 4.0K Jul  22 12:05 var
drwxr-xr-x 2 root root 4.0K Jul  22 12:05 etc
drwxr-xr-x 3 root root 4.0K Jul  22 12:05 share
drwxr-xr-x 2 root root 4.0K Jul  22 12:05 bin

Go into the etc directory and copy the configuration file: cp  sphinx-min.conf.dist  sphinx.conf  (sphinx.conf.dist is not used here because it differs from the minimal file only by a large number of comments)

In fact, installing Coreseek alone is enough; Sphinx does not need to be installed separately.


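Install Coreseek
First download the software. The official site is coreseek.cn (it no longer opens, so the package has to be found elsewhere; I got it from CSDN, which cost me 5 points)
Download coreseek-4.1-beta.tar.gz (from CSDN)
cd /opt/coreseek
wget <download URL>
tar xzvf coreseek-4.1-beta.tar.gz
The archive contains three folders: mmseg (the Chinese word-segmentation package), csft (which is really Sphinx), and testpack (a test package). Only the first two need to be installed.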

Create the install directory:
mkdir /opt/soft_install/coreseek

Go back to the Coreseek mmseg directory; mmseg is installed first
cd /opt/coreseek/coreseek-4.1-beta/mmseg-3.2.14/
Configure the mmseg Chinese word segmenter:
./configure --prefix=/opt/soft_install/coreseek/mmseg

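The configure step may fail with: config.status: error: cannot find input file: src/Makefile.in
This is solved by running automake
First check whether libtool is installed; if it is not, install it:
yum -y install libtool
automake
If automake itself fails, the cause may be the following: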
Libtool library used but `LIBTOOL' is undefined
The usual way to define `LIBTOOL' is to add `AC_PROG_LIBTOOL'
to `configure.ac' and run `aclocal' and `autoconf' again.
If `AC_PROG_LIBTOOL' is in `configure.ac', make sure
its definition is in aclocal's search path.

If none of the steps above worked, try the following (run every command below in order and the problem should go away)
aclocal
libtoolize -f    # same as libtoolize --force
automake -a      # same as automake --add-missing
autoconf
autoheader
make clean
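
The regeneration commands above can be wrapped so that the sequence stops at the first failure instead of carrying on silently. This is only a sketch: the function name regen_autotools and the DRY_RUN switch are made up for illustration, and make clean is left out so it can be run by hand afterwards.

```shell
#!/bin/sh
# Regenerate the autotools build files in a source directory, stopping at
# the first command that fails.
# regen_autotools and DRY_RUN are hypothetical names, not part of Coreseek.
regen_autotools() {
    srcdir="$1"
    for cmd in "aclocal" "libtoolize --force" "automake --add-missing" "autoconf" "autoheader"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            # Dry run: only show what would be executed.
            echo "(cd $srcdir && $cmd)"
        else
            # $cmd is left unquoted on purpose so its options are split into words.
            (cd "$srcdir" && $cmd) || { echo "failed: $cmd" >&2; return 1; }
        fi
    done
}

# Example:
#   DRY_RUN=1 regen_autotools /opt/coreseek/coreseek-4.1-beta/mmseg-3.2.14
#   regen_autotools /opt/coreseek/coreseek-4.1-beta/mmseg-3.2.14
```

Running it with DRY_RUN=1 first shows the commands without touching the source tree.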

After running the commands above, run configure again:
./configure --prefix=/opt/soft_install/coreseek/mmseg
This time configure should finish with a summary like the following:
------------------------------------------------------------------------
Configuration:

  Source code location:       .
  Compiler:                   gcc
  Compiler flags:             -g -O2
  Host System Type:           x86_64-redhat-linux-gnu
  Install path:               /opt/soft_install/coreseek/mmseg

  See config.h for further configuration information.
------------------------------------------------------------------------

Run the build and install commands:
make && make install

When they finish, the install directory /opt/soft_install/coreseek/mmseg/ contains four directories:
[root@XL_Php_Mysql mmseg-3.2.14]# ls -lht /opt/soft_install/coreseek/mmseg/
total 16K
drwxr-xr-x 2 root root 4.0K Jul  22 12:31 etc
drwxr-xr-x 3 root root 4.0K Jul  22 12:31 include
drwxr-xr-x 2 root root 4.0K Jul  22 12:31 bin
drwxr-xr-x 2 root root 4.0K Jul  22 12:31 lib
Test whether the install succeeded:
/opt/soft_install/coreseek/mmseg/bin/mmseg -d /opt/soft_install/coreseek/mmseg/etc/ /opt/coreseek/coreseek-4.1-beta/mmseg-3.2.14/src/t1.txt

Next, install csft:
Go into the source directory:
cd /opt/coreseek/coreseek-4.1-beta/csft-4.1
sh buildconf.sh
./configure --prefix=/opt/soft_install/coreseek/csft --with-mysql=/usr/local/mysql --with-mmseg=/opt/soft_install/coreseek/mmseg --with-mmseg-includes=/opt/soft_install/coreseek/mmseg/include/mmseg/ --with-mmseg-libs=/opt/soft_install/coreseek/mmseg/lib/

make && make install

After a successful install, the install directory contains four directories:
[root@XL_Php_Mysql coreseek]# ls -lht /opt/soft_install/coreseek/csft/
total 16K
drwxr-xr-x 4 root root 4.0K Jul  22 12:43 var
drwxr-xr-x 2 root root 4.0K Jul  22 12:43 etc
drwxr-xr-x 3 root root 4.0K Jul  22 12:43 share
drwxr-xr-x 2 root root 4.0K Jul  22 12:43 bin

Once installation is done, note that Coreseek's configuration file is csft.conf, not sphinx.conf
cd /opt/soft_install/coreseek/csft/etc/
cp sphinx.conf.dist csft.conf
vim csft.conf

Test whether the install succeeded
/opt/soft_install/coreseek/csft/bin/indexer -c /opt/soft_install/coreseek/csft/etc/csft.conf 
If this error appears: /opt/soft_install/coreseek/csft/bin/indexer: error while loading shared libraries: libmysqlclient.so.18: cannot open shared object file: No such file or directory
then symlink the MySQL client library into the system library directory:
32-bit systems: ln -s /usr/local/mysql/lib/libmysqlclient.so.18.1.0 /usr/lib/libmysqlclient.so.18
64-bit systems: ln -s /usr/local/mysql/lib/libmysqlclient.so.18.1.0 /usr/lib64/libmysqlclient.so.18
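
The choice between the two symlink targets above can be made from the machine architecture instead of by hand. The sketch below assumes a Red Hat style layout (/usr/lib64 for 64-bit libraries, /usr/lib otherwise); the function name lib_dir_for_arch is hypothetical, and other distributions may place libraries elsewhere.

```shell
#!/bin/sh
# Map a machine architecture string (as printed by `uname -m`) to the system
# library directory, assuming a Red Hat style layout.
# lib_dir_for_arch is a hypothetical helper name.
lib_dir_for_arch() {
    case "$1" in
        x86_64|aarch64) echo /usr/lib64 ;;
        *)              echo /usr/lib ;;
    esac
}

# Example (run as root):
#   libdir="$(lib_dir_for_arch "$(uname -m)")"
#   ln -s /usr/local/mysql/lib/libmysqlclient.so.18.1.0 "$libdir/libmysqlclient.so.18"
```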


Configure and test searching a MySQL data source
cp /opt/coreseek/coreseek-4.1-beta/testpack/etc/csft_mysql.conf /opt/soft_install/coreseek/csft/etc/
[root@XL_Php_Mysql testpack]# ls -lht /opt/soft_install/coreseek/csft/etc/
total 68K
-rwxr-xr-x 1 root root 2.8K Jul  22 15:42 csft_mysql.conf
-rw-r--r-- 1 root root  26K Jul  22 12:43 csft.conf
-rw-r--r-- 1 root root  903 Jul  22 12:43 example.sql
-rw-r--r-- 1 root root  26K Jul  22 12:43 sphinx.conf.dist
-rw-r--r-- 1 root root 1.2K Jul  22 12:43 sphinx-min.conf.dist

mysql -uroot -p test < /opt/coreseek/coreseek-4.1-beta/testpack/var/test/documents.sql 
Create the database account used for indexing
mysql> grant all privileges on test.* to sphinxdbuser@"%" identified by "sphinx07test";
mysql> grant all privileges on recruitment_system.* to sphinxdbuser@"%" identified by "sphinx07test";
mysql> flush privileges;
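
The copied csft_mysql.conf probably still needs to point at this account and at the mmseg dictionary directory installed earlier. The sketch below assumes the file uses the usual `key = value` layout with the sql_user, sql_pass and charset_dictpath keys; check your copy before relying on it. The function name set_conf_value is hypothetical, and values containing `|`, `&` or backslashes are not handled.

```shell
#!/bin/sh
# Replace the value of an uncommented `key = value` line in a config file,
# keeping the original indentation and any trailing comment.
# set_conf_value is a hypothetical helper, not a Coreseek tool.
set_conf_value() {
    conf="$1"; key="$2"; value="$3"
    tmp="$(mktemp)" || return 1
    sed "s|^\([[:space:]]*${key}[[:space:]]*=[[:space:]]*\)[^[:space:]#]*|\1${value}|" "$conf" > "$tmp" \
        && cat "$tmp" > "$conf"
    status=$?
    rm -f "$tmp"
    return $status
}

# Example, using the account and paths from this article:
#   conf=/opt/soft_install/coreseek/csft/etc/csft_mysql.conf
#   set_conf_value "$conf" sql_user sphinxdbuser
#   set_conf_value "$conf" sql_pass sphinx07test
#   set_conf_value "$conf" charset_dictpath /opt/soft_install/coreseek/mmseg/etc/
```

Lines that start with `#` are left alone, so commented-out alternatives in the file are not touched.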

5.2 Start the search service
Build the index, then start the search service; here it runs in the background
# /opt/soft_install/coreseek/csft/bin/indexer -c /opt/soft_install/coreseek/csft/etc/csft_mysql.conf --all
# /opt/soft_install/coreseek/csft/bin/searchd -c /opt/soft_install/coreseek/csft/etc/csft_mysql.conf
Start the search service so that the I/O and CPU statistics of every query are recorded in the log
# /opt/soft_install/coreseek/csft/bin/searchd  --iostats --cpustats -c /opt/soft_install/coreseek/csft/etc/csft_mysql.conf

Stop the search service
# /opt/soft_install/coreseek/csft/bin/searchd -c /opt/soft_install/coreseek/csft/etc/csft_mysql.conf --stop

Check the search service status
# /opt/soft_install/coreseek/csft/bin/searchd -c /opt/soft_install/coreseek/csft/etc/csft_mysql.conf --status
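
Since the index, start, stop and status commands above all repeat the same long paths, a small wrapper can keep them in one place. This is a sketch only: the names CSFT_HOME, CSFT_CONF, csft_command and csft_run are made up for illustration, and the paths are assumed to contain no spaces.

```shell
#!/bin/sh
# Defaults match the install paths used in this article; override them
# through the environment if your layout differs.
CSFT_HOME="${CSFT_HOME:-/opt/soft_install/coreseek/csft}"
CSFT_CONF="${CSFT_CONF:-$CSFT_HOME/etc/csft_mysql.conf}"

# Print the command line for an action without running it.
csft_command() {
    case "$1" in
        index)  echo "$CSFT_HOME/bin/indexer -c $CSFT_CONF --all" ;;
        start)  echo "$CSFT_HOME/bin/searchd -c $CSFT_CONF" ;;
        stop)   echo "$CSFT_HOME/bin/searchd -c $CSFT_CONF --stop" ;;
        status) echo "$CSFT_HOME/bin/searchd -c $CSFT_CONF --status" ;;
        *)      echo "usage: csft_command {index|start|stop|status}" >&2; return 2 ;;
    esac
}

# Run the command for an action, e.g. `csft_run start`.
csft_run() {
    cmd="$(csft_command "$1")" || return $?
    # Unquoted on purpose so the command line is split into words.
    $cmd
}
```

Keeping csft_command separate from csft_run makes it easy to print a command and inspect it before executing it.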

5.3 Local search test
# /opt/soft_install/coreseek/csft/bin/search -c /opt/soft_install/coreseek/csft/etc/csft_mysql.conf -a 百度成立

Stop MySQL and check whether the local search test above still works
# /etc/init.d/mysqld stop

Start MySQL and check whether the local search test above still works
# /etc/init.d/mysqld start

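5.4 PHP client API test from another machine
API path: /opt/coreseek/coreseek-4.1-beta/testpack/api/

Copy test.php and sphinxapi.php to a machine with a working PHP environment, then change the database address, account, password and other settings. Test as follows: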

# php test.php  中國  
Query '中國 ' retrieved 1 of 1 matches in 0.016 sec.
Query stats:
    '中國' found 17 times in 1 documents

At run time this error may appear: Warning: assert() has been disabled for security reasons in /opt/test_coreseek/sphinxapi.php on line 182. If so, edit /etc/php.ini (vi /etc/php.ini), remove assert from the disable_functions = line, restart php-fpm with /etc/init.d/php-fpm restart, and run the test again.
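
Whether assert is actually listed in disable_functions can be checked before and after editing php.ini. A minimal sketch, assuming the setting sits on a single uncommented line; the function name assert_is_disabled is hypothetical.

```shell
#!/bin/sh
# Exit status 0 if the given php.ini lists assert in disable_functions,
# non-zero otherwise. Only exact entries match, so assert_options alone
# does not count.
# assert_is_disabled is a hypothetical helper name.
assert_is_disabled() {
    grep -E '^[[:space:]]*disable_functions[[:space:]]*=' "$1" \
        | sed 's/^[^=]*=//' \
        | tr ',' '\n' \
        | tr -d ' \t"' \
        | grep -qx 'assert'
}

# Example:
#   if assert_is_disabled /etc/php.ini; then
#       echo "assert is still disabled"
#   fi
```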
