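As our movie site's catalogue grew, adding a movie search service became unavoidable. Searching directly against the database multiplied I/O requests and dragged search performance down. An earlier project had given us some experience with Sphinx, and Coreseek adds Chinese-language search on top of it, so we settled on Coreseek, the Sphinx-based search service.
Download and installation steps (I used version 4.1 on Linux; it supports MySQL and XML data sources):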
Download the coreseek-4.1-beta.tar.gz package from the Coreseek download page.
Unpack the archive: tar zxvf coreseek-4.1-beta.tar.gz
Build and install mmseg (the Chinese word-segmentation package), from the mmseg source directory inside the unpacked package:
./bootstrap
./configure --prefix=/usr/local/mmseg3
make && make install
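Build and install Coreseek, from the Coreseek source directory inside the unpacked package:
sh buildconf.sh # warnings in the output can be ignored; any error has to be resolved
./configure --prefix=/usr/local/coreseek --without-unixodbc --with-mmseg --with-mmseg-includes=/usr/local/mmseg3/include/mmseg/ --with-mmseg-libs=/usr/local/mmseg3/lib/ --with-mysql # --with-mmseg-includes and --with-mmseg-libs point at the mmseg install from the previous step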
make && make install
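The movie site is updated infrequently, so Sphinx indexes it with a main index plus a delta (incremental) index, and the two index files are merged afterwards. Now set up the search configuration file:
Go to the etc directory under the Coreseek install directory and create or edit a .conf configuration file.
Configure the data sources: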
source movie
{
type = mysql
sql_host = localhost # MySQL host
sql_user = root # MySQL user
sql_pass = # MySQL password
sql_db = movie # database name
sql_port = 3306 # optional, default is 3306
sql_query_pre = SET NAMES utf8
# snapshot the current max id; it is the boundary between the main and delta indexes
sql_query_pre = REPLACE INTO movie_sph_counter SELECT 1, MAX(id) FROM movie
sql_query = SELECT id, UNIX_TIMESTAMP(cdate) AS date ,id AS movie_id ,name, year, type,status,sync_status FROM movie WHERE id<=( SELECT max_movie_id FROM movie_sph_counter WHERE counter_id=1 )
# attributes returned with search results
sql_attr_uint = movie_id
sql_attr_uint = year
sql_attr_uint = type
sql_attr_uint = date
sql_attr_uint = status
sql_attr_uint = sync_status
sql_field_string = name
sql_query_info_pre = SET NAMES utf8 # set the correct character set for command-line queries
sql_query_info = SELECT * FROM movie WHERE id=$id # for command-line queries, fetch the original row from the database
}
# delta index source
source delta : movie
{
sql_query_pre = SET NAMES utf8
sql_query = SELECT id, UNIX_TIMESTAMP(cdate) AS date ,id AS movie_id ,name , year, type ,status,sync_status FROM movie WHERE id>( SELECT max_movie_id FROM movie_sph_counter WHERE counter_id=1 )
sql_query_post_index = REPLACE INTO movie_sph_counter SELECT 1, MAX(id) FROM movie
}
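The split between the two sources depends entirely on the single row in movie_sph_counter, so it can help to see the boundary move. The following is a minimal sketch, not the author's setup: it uses an in-memory SQLite database as a stand-in for MySQL, the table schemas are assumptions inferred from the column names in the queries above, and the helper names (build_demo_db, snapshot_counter, main_ids, delta_ids) are hypothetical.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory stand-in for the movie database (schema is assumed)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE movie (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute(
        "CREATE TABLE movie_sph_counter ("
        "counter_id INTEGER PRIMARY KEY, max_movie_id INTEGER NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO movie (id, name) VALUES (?, ?)",
        [(1, "movie a"), (2, "movie b"), (3, "movie c")],
    )
    return conn


def snapshot_counter(conn):
    """Same statement as sql_query_pre / sql_query_post_index in the config."""
    conn.execute("REPLACE INTO movie_sph_counter SELECT 1, MAX(id) FROM movie")


def main_ids(conn):
    """Ids the main source would select: everything up to the counter."""
    rows = conn.execute(
        "SELECT id FROM movie WHERE id <= "
        "(SELECT max_movie_id FROM movie_sph_counter WHERE counter_id = 1) "
        "ORDER BY id"
    )
    return [row[0] for row in rows]


def delta_ids(conn):
    """Ids the delta source would select: everything after the counter."""
    rows = conn.execute(
        "SELECT id FROM movie WHERE id > "
        "(SELECT max_movie_id FROM movie_sph_counter WHERE counter_id = 1) "
        "ORDER BY id"
    )
    return [row[0] for row in rows]


conn = build_demo_db()
snapshot_counter(conn)                   # what indexing "movie" does first
conn.executemany(
    "INSERT INTO movie (id, name) VALUES (?, ?)",
    [(4, "movie d"), (5, "movie e")],    # rows added after the main build
)
print(main_ids(conn))   # [1, 2, 3]
print(delta_ids(conn))  # [4, 5]
```

Running snapshot_counter again moves the boundary past the new rows, which is what sql_query_post_index does after a delta build. Because of that, rows from one delta build are only kept if the delta is merged into the main index before the next delta build.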
Configure the indexes:
# index definitions
index movie
{
source = movie # name of the matching source
path = /usr/local/coreseek/var/data/movie # change to the absolute path you actually use, e.g. /usr/local/coreseek/var/...
docinfo = extern
mlock = 0
morphology = none
min_word_len = 1
html_strip = 0
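# Chinese word-segmentation settings; for details see http://www.coreseek.cn/products-install/coreseek_mmseg/
charset_dictpath = /usr/local/mmseg3/etc/ # mmseg dictionary directory on BSD/Linux (matches the --prefix used when installing mmseg); must end with /
charset_type = zh_cn.utf-8 # Chinese encoding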
}
index delta : movie
{
source = delta
path = /usr/local/coreseek/var/data/movie_delta # note: must not be the same path as the main index
docinfo = extern
mlock = 0
morphology = none
min_word_len = 1
html_strip = 0
charset_dictpath = /usr/local/mmseg3/etc/
charset_type = zh_cn.utf-8
}
Configure the search daemon:
# searchd service definition
searchd
{
listen = 9312 # port number; choose your own if you like
read_timeout = 5
max_children = 30
max_matches = 1000
seamless_rotate = 0
preopen_indexes = 0
unlink_old = 1
compat_sphinxql_magics=0
pid_file = /usr/local/coreseek/var/log/searchd_mysql.pid # change to the absolute path you actually use, e.g. /usr/local/coreseek/var/...
log = /usr/local/coreseek/var/log/searchd_mysql.log # change to the absolute path you actually use, e.g. /usr/local/coreseek/var/...
query_log = /usr/local/coreseek/var/log/query_mysql.log # change to the absolute path you actually use, e.g. /usr/local/coreseek/var/...
binlog_path = # empty value disables the binlog
}
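Build all indexes: /usr/local/coreseek/bin/indexer -c movie.conf --all
Start the search daemon in the background: /usr/local/coreseek/bin/searchd -c movie.conf
Set up a scheduled job to rebuild the delta index: /usr/local/coreseek/bin/indexer -c movie.conf delta --rotate
Set up a scheduled job to merge the indexes: /usr/local/coreseek/bin/indexer -c movie.conf --merge movie delta --merge-dst-range deleted 0 0 --rotate
Note that --merge-dst-range deleted 0 0 filters the main index on an attribute named deleted, which the configuration above does not declare; omit the option unless your index has such an attribute.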
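That completes the Sphinx + MySQL search service. The next step is to write the search interface code on top of sphinxapi.php……
This was my first time setting up a Sphinx search service myself, and in the final test searches on the site were very fast.
I'm sharing this in the hope that it helps.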
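The two scheduled jobs listed above (rebuild the delta, then merge it into the main index) depend on each other, so one option is to drive both from a single script and skip the merge when the delta build fails. The following is only a sketch under assumptions: the config path /usr/local/coreseek/etc/movie.conf is a guess at where the file lives, the function names are hypothetical, and the merge command leaves out --merge-dst-range because the configuration shown declares no deleted attribute.

```python
import subprocess

# Assumed locations; adjust to the actual install and config file name.
INDEXER = "/usr/local/coreseek/bin/indexer"
CONFIG = "/usr/local/coreseek/etc/movie.conf"


def delta_command(indexer=INDEXER, config=CONFIG):
    """Command line that rebuilds the delta index and rotates it into searchd."""
    return [indexer, "-c", config, "delta", "--rotate"]


def merge_command(indexer=INDEXER, config=CONFIG):
    """Command line that merges the delta index into the main index."""
    return [indexer, "-c", config, "--merge", "movie", "delta", "--rotate"]


def refresh(run=subprocess.run):
    """Run the delta build, then the merge; stop at the first failure.

    `run` is injectable so the sequencing can be exercised without
    a Coreseek install. Returns True only if both commands exit with 0.
    """
    for cmd in (delta_command(), merge_command()):
        result = run(cmd)
        if result.returncode != 0:
            return False
    return True
```

A cron-scheduled script could call refresh() and exit non-zero when it returns False, so that a failed delta build is visible in the cron mail or log instead of being silently followed by a merge.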