Building a Chinese Search Engine with Coreseek + Sphinx + MySQL + PHP

Overview diagram

### 1. Installation

**1. Download and extract the installation package**

    cd /var/install
    wget http://git.oschina.net/tanjiajun/sphinx/raw/master/coreseek-3.2.14.tar.gz
    sudo tar -zxvf coreseek-3.2.14.tar.gz

**2. First install mmseg3 (used for Chinese word segmentation)**

    cd /var/install/coreseek-3.2.14/mmseg-3.2.14
    sudo ./bootstrap
    sudo ./configure --prefix=/usr/local/mmseg3
    sudo make && sudo make install

**3. Install coreseek**

    cd /var/install/coreseek-3.2.14/csft-3.2.14
    sudo sh buildconf.sh
    sudo ./configure --prefix=/usr/local/coreseek --without-unixodbc --with-mmseg --with-mmseg-includes=/usr/local/mmseg3/include/mmseg/ --with-mmseg-libs=/usr/local/mmseg3/lib/ --with-mysql
    sudo make && sudo make install

### 2. Verify the installation


**1. Test mmseg3**

    cd /var/install/coreseek-3.2.14/testpack/var/test
    cat test.xml

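At this point the output looks like this:
![Output of cat test.xml](https://static.oschina.net/uploads/img/201610/31141405_Trx0.png)

Run the mmseg segmenter on the file:

    sudo /usr/local/mmseg3/bin/mmseg -d /usr/local/mmseg3/etc test.xml

At this point the output looks like this:
![Output of mmseg segmentation](https://static.oschina.net/uploads/img/201610/31141405_Trx0.png)

mmseg3 is installed successfully!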

**2. Test index generation with the coreseek indexer**

    cd /var/install/coreseek-3.2.14/testpack/
    sudo /usr/local/coreseek/bin/indexer -c etc/csft.conf --all

This reports an error:
![indexer error message](https://static.oschina.net/uploads/img/201610/31142036_VwtB.png)

Install libexpat and its development package (on Debian/Ubuntu they are named libexpat1 and libexpat1-dev):

    sudo apt-get install libexpat1 libexpat1-dev

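Rebuild and reinstall coreseek from the csft-3.2.14 source directory:

    cd /var/install/coreseek-3.2.14/csft-3.2.14
    sudo make clean
    sudo make && sudo make install

The build fails again, as shown below:
![Build error message](https://static.oschina.net/uploads/img/201610/31142310_YSQk.png)

Edit the Makefiles (paths relative to the csft-3.2.14 source directory):

    sudo vi src/Makefile
    sudo vi Makefile

Change `LIBS = -lm -lexpat -L/usr/local/lib` to `LIBS = -lm -lexpat -liconv -L/usr/local/lib`.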
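The manual edit can also be scripted. The following is a minimal sketch, not part of the original build instructions: `add_iconv` is a hypothetical helper name, and it assumes the `LIBS` line in the Makefile reads exactly as quoted above.

```shell
#!/bin/sh
# Hypothetical helper: add -liconv to the LIBS line of a Makefile.
# Assumes the line is exactly "LIBS = -lm -lexpat -L/usr/local/lib".
add_iconv() {
    makefile="$1"
    # Nothing to do if a LIBS line already links against iconv.
    if grep -q '^LIBS = .*-liconv' "$makefile"; then
        return 0
    fi
    tmp="${makefile}.tmp.$$"
    # Write the patched copy to a temporary file, then replace the original.
    sed 's|^LIBS = -lm -lexpat -L/usr/local/lib|LIBS = -lm -lexpat -liconv -L/usr/local/lib|' \
        "$makefile" > "$tmp" && mv "$tmp" "$makefile"
}
```

Called as `add_iconv src/Makefile` from the csft-3.2.14 directory (as root, if the source tree is owned by root), it leaves every other line untouched, and running it twice is harmless.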

Rebuild and reinstall coreseek:

    sudo make clean
    sudo make && sudo make install

coreseek now installs successfully.
Continue with the index generation test:

    cd /var/install/coreseek-3.2.14/testpack/
    sudo /usr/local/coreseek/bin/indexer -c etc/csft.conf --all

![indexer output](https://static.oschina.net/uploads/img/201610/31142556_I6B1.png)

The index is generated successfully.
Search for the keyword '網絡' ("network"):

    sudo /usr/local/coreseek/bin/search -c etc/csft.conf 網絡

![Search results for the keyword](https://static.oschina.net/uploads/img/201610/31142714_cBQ4.png)

**coreseek installation is complete!**

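### 3. MySQL and coreseek


Here I set up two MySQL data source configurations and start two search services. One uses the database script and configuration that ship with coreseek as an example; the other is a configuration I adapted from that example. The two are largely the same.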

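**1. Create the test database:**

    CREATE DATABASE coreseek_test;

Create the tables:
1. My own table

    CREATE TABLE sphinx_conter (
        count_id   INTEGER PRIMARY KEY NOT NULL AUTO_INCREMENT,
        max_doc_id INTEGER NOT NULL,
        name       VARCHAR(255) NULL,
        `desc`     VARCHAR(255) NULL,
        address    VARCHAR(255) NULL
    );

`desc` is a reserved word in MySQL, so it has to be quoted with backticks. `count_id` is declared `AUTO_INCREMENT` because the `INSERT` in the `sql_query_pre` setting used later does not supply a value for it.

2. The table that ships with coreseek; its script is in the directory we extracted earlier:

    /var/install/coreseek-3.2.14/testpack/var/test/documents.sql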

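Create the data source configuration files.
Coreseek also ships with an example that matches the sample table documents. It is located at

    /var/install/coreseek-3.2.14/testpack/etc/csft_mysql.conf

Copy it twice into the directory

    /usr/local/coreseek/etc

naming one copy sphinx_conter_min.conf and the other csft_mysql.conf (the default example).
My directory then looks like this:

![Directory listing of /usr/local/coreseek/etc](https://static.oschina.net/uploads/img/201610/31142714_cBQ4.png)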

**2. Data source configuration**

(1) Contents of csft_mysql.conf:

    # MySQL data source configuration
    # source definition
    source mysql
    {
        type                    = mysql

        sql_host                = 127.0.0.1
        sql_user                = root
        sql_pass                = root
        sql_db                  = coreseek_test
        sql_port                = 3306
        sql_query_pre           = SET NAMES utf8

        sql_query               = SELECT id, group_id, UNIX_TIMESTAMP(date_added) AS date_added, title, content FROM documents
                                  # the first column of sql_query (id) must be an integer
                                  # title and content are string/text fields and are full-text indexed
        sql_attr_uint           = group_id      # the value read from SQL must be an integer
        sql_attr_timestamp      = date_added    # the value read from SQL must be an integer; used as a time attribute

        sql_query_info_pre      = SET NAMES utf8    # set the correct character set for command-line queries
        sql_query_info          = SELECT * FROM documents WHERE id=$id    # for command-line queries, read the original row from the database
    }

    # index definition
    index mysql
    {
        source              = mysql    # name of the corresponding source
        path                = /usr/local/coreseek/var/data/mysql    # change to the absolute path actually used, e.g. /usr/local/coreseek/var/...
        docinfo             = extern
        mlock               = 0
        morphology          = none
        min_word_len        = 1
        html_strip          = 0

        # Chinese word segmentation settings; for details see http://www.coreseek.cn/products-install/coreseek_mmseg/
        charset_dictpath    = /usr/local/mmseg3/etc/    # for BSD and Linux; must end with /
        #charset_dictpath   = etc/    # for Windows; must end with /, preferably an absolute path, e.g. C:/usr/local/coreseek/etc/...
        charset_type        = zh_cn.utf-8
    }

    # global indexer definition
    indexer
    {
        mem_limit           = 128M
    }

    # searchd service definition
    searchd
    {
        listen              = 9313
        read_timeout        = 5
        max_children        = 30
        max_matches         = 1000
        seamless_rotate     = 0
        preopen_indexes     = 0
        unlink_old          = 1
        pid_file            = /var/sphinx_log/searchd_mysql.p
        log                 = /var/sphinx_log/searchd_mysql.log
        query_log           = /var/sphinx_log/query_mysql.log
    }

(2) Contents of sphinx_conter_min.conf:

    # source definition
    source sphinx_test
    {
        type                    = mysql

        sql_host                = 127.0.0.1
        sql_user                = root
        sql_pass                = root
        sql_db                  = coreseek_test
        sql_port                = 3306

        sql_query_pre           = SET NAMES utf8
        sql_query_pre           = INSERT INTO sphinx_conter (`max_doc_id`,`name`,`desc`,`address`) values(unix_timestamp(now()),'我是程序員','側死','廣州大道中國')
        sql_query               = \
            SELECT count_id, max_doc_id, name, address \
            FROM sphinx_conter
        sql_attr_uint           = count_id
        sql_attr_uint           = max_doc_id
        #sql_attr_timestamp     = date_added
        #sql_field_string       = name
        #sql_field_string       = desc
        #sql_field_string       = address
        sql_query_info          = SELECT * FROM sphinx_conter WHERE count_id=$id
        #sql_query_info         = SELECT * FROM sphinx_conter
    }

    # index definition
    index sphinx_test
    {
        source              = sphinx_test    # name of the corresponding source
        path                = /usr/local/coreseek/var/data/sphinx_test    # change to the absolute path actually used, e.g. /usr/local/coreseek/var/...
        docinfo             = extern
        mlock               = 0
        morphology          = none
        min_word_len        = 1
        html_strip          = 0

        # Chinese word segmentation settings; for details see http://www.coreseek.cn/products-install/coreseek_mmseg/
        charset_dictpath    = /usr/local/mmseg3/etc/    # for BSD and Linux; must end with /
        #charset_dictpath   = etc/    # for Windows; must end with /, preferably an absolute path, e.g. C:/usr/local/coreseek/etc/...
        charset_type        = zh_cn.utf-8
    }

    # global indexer definition
    indexer
    {
        mem_limit           = 128M
    }

    # searchd service definition
    searchd
    {
        listen              = 9312
        read_timeout        = 5
        max_children        = 30
        max_matches         = 1000
        seamless_rotate     = 0
        preopen_indexes     = 0
        unlink_old          = 1
        pid_file            = /var/sphinx_log/searchd_sphinx_test.pid
        log                 = /var/sphinx_log/searchd_sphinx_test.log
        query_log           = /var/sphinx_log/query_sphinx_test.log
    }
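Both searchd sections write their pid and log files under /var/sphinx_log, a directory the steps so far never create, and searchd will most likely fail to start if it is missing. The sketch below lists the directories a config file refers to so they can be created beforehand. `log_dirs` is a hypothetical helper name, and the parsing assumes one `key = value` setting per line with no trailing comment.

```shell
#!/bin/sh
# Hypothetical helper: list the directories that the pid_file, log and
# query_log settings of a Coreseek/Sphinx config file point into.
# Assumes one "key = /path/to/file" setting per line, without trailing comments.
log_dirs() {
    awk -F'=' '
        /^[ \t]*(pid_file|log|query_log)[ \t]*=/ {
            path = $2
            gsub(/^[ \t]+|[ \t]+$/, "", path)   # trim surrounding whitespace
            sub(/\/[^\/]*$/, "", path)          # drop the file name, keep the directory
            print path
        }
    ' "$1" | sort -u
}
```

For example, `for d in $(log_dirs /usr/local/coreseek/etc/csft_mysql.conf); do sudo mkdir -p "$d"; done` creates whatever is missing before searchd is started.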

**3. Build indexes for the configured data sources**

(1) Build the index for csft_mysql.conf:

    sudo /usr/local/coreseek/bin/indexer -c /usr/local/coreseek/etc/csft_mysql.conf --all --rotate

![indexer output for csft_mysql.conf](https://static.oschina.net/uploads/img/201610/31150140_Hv8R.png)

Then start the search service:

    sudo /usr/local/coreseek/bin/searchd -c /usr/local/coreseek/etc/csft_mysql.conf

![searchd startup output](https://static.oschina.net/uploads/img/201610/31150312_hjeC.png)

Then run a search test:

    sudo /usr/local/coreseek/bin/search -c /usr/local/coreseek/etc/csft_mysql.conf Opera

![Search test output](https://static.oschina.net/uploads/img/201610/31150312_hjeC.png)


(2) Do the same for sphinx_conter_min.conf:

    sudo /usr/local/coreseek/bin/indexer -c /usr/local/coreseek/etc/sphinx_conter_min.conf --all --rotate
    sudo /usr/local/coreseek/bin/searchd -c /usr/local/coreseek/etc/sphinx_conter_min.conf

Two search processes are now running, one on port 9312 and one on port 9313. Use the command

    ps -ef | grep coreseek

to check:
![Process list showing both searchd instances](https://static.oschina.net/uploads/img/201610/31150542_s3rR.png)
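As an alternative to reading the `ps` output by eye, each instance can be checked through the pid file named in its `searchd` section. This is a minimal sketch; `searchd_running` is a hypothetical helper name. Because `kill -0` reports failure for a process owned by another user, the check should be run as the user that started searchd (root, if it was started with sudo).

```shell
#!/bin/sh
# Hypothetical helper: succeed if the process recorded in a pid file is alive.
searchd_running() {
    pid_file="$1"
    # A missing or unreadable pid file counts as "not running".
    [ -r "$pid_file" ] || return 1
    pid=$(cat "$pid_file")
    # kill -0 delivers no signal; it only tests whether the process exists
    # and whether the caller is allowed to signal it.
    kill -0 "$pid" 2>/dev/null
}
```

For example, `searchd_running /var/sphinx_log/searchd_sphinx_test.pid && echo "sphinx_test searchd is up"`.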

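**4. Other useful commands:**

Run an incremental index:

    sudo /usr/local/coreseek/bin/indexer -c /usr/local/coreseek/etc/csft_mysql.conf mysql --rotate

![Incremental indexing output](https://static.oschina.net/uploads/img/201610/31150542_s3rR.png)

Merge indexes (this assumes the configuration file defines two indexes, here named main and delta):

    /usr/local/coreseek/bin/indexer -c /usr/local/coreseek/etc/csft_mysql.conf --merge main delta --rotate --merge-dst-range deleted 0 0

These commands can be put in a scheduled script: run the incremental index every minute, merge the indexes every 5 minutes, and rebuild all indexes from scratch at a fixed time.

![Scheduled indexing script](https://static.oschina.net/uploads/img/201610/31150841_5f7j.png)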
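The schedule described above could be expressed as crontab entries along the following lines. This is a sketch under assumptions, not the script from the screenshot: `print_index_crontab` is a hypothetical helper name, the index names `main` and `delta` are taken from the merge command, and 03:00 is an arbitrary choice for the daily full rebuild.

```shell
#!/bin/sh
# Hypothetical helper: print crontab entries for the indexing schedule.
# Assumptions: the config defines two indexes named main and delta, and
# the full rebuild runs daily at 03:00 (an arbitrary time).
print_index_crontab() {
    indexer=/usr/local/coreseek/bin/indexer
    conf=/usr/local/coreseek/etc/csft_mysql.conf
    cat <<EOF
* * * * * $indexer -c $conf delta --rotate
*/5 * * * * $indexer -c $conf --merge main delta --rotate --merge-dst-range deleted 0 0
0 3 * * * $indexer -c $conf --all --rotate
EOF
}
```

Printing the entries instead of installing them keeps the existing crontab safe; review the output and paste it in with `sudo crontab -e`.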

The coreseek search server is now fully set up!

### 4. Connecting to coreseek from PHP


Install sphinxclient (already included in the archive we extracted earlier):

    cd /var/install/coreseek-3.2.14/csft-3.2.14/api/libsphinxclient
    sudo ./configure --prefix=/usr/local/sphinxclient
    sudo make && sudo make install

Install the Sphinx PHP extension:

    cd /var/install/
    sudo wget http://pecl.php.net/get/sphinx-1.3.0.tgz
    sudo tar -zxvf sphinx-1.3.0.tgz
    cd sphinx-1.3.0
    sudo phpize
    sudo ./configure --with-php-config=/usr/local/php/bin/php-config --with-sphinx=/usr/local/sphinxclient
    sudo make && sudo make install

Add `extension=sphinx.so` to php.ini and restart PHP.
Run `php -m` to check that the sphinx extension is loaded.
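The check can be scripted rather than read off the module list. This is a minimal sketch: `has_sphinx_module` is a hypothetical helper name, and it assumes the extension appears in the `php -m` output on a line of its own as `sphinx`.

```shell
#!/bin/sh
# Hypothetical helper: read a module list (as printed by `php -m`) on stdin
# and succeed only if one line is exactly "sphinx", ignoring case.
has_sphinx_module() {
    grep -qix 'sphinx'
}
```

For example, `php -m | has_sphinx_module && echo "sphinx extension loaded"`.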


PHP example code:
https://git.oschina.net/tanjiajun/sphinx.git