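To support Chinese word segmentation, you also need to download Coreseek. You can find it by searching the official site; I am using version 4.1 here.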
Baidu Cloud download link: https://pan.baidu.com/s/1slNIyHf
tar -zxvf coreseek-4.1-beta.tar.gz
cd coreseek-4.1-beta
cd mmseg-3.2.14/
./bootstrap    # check the build environment
libtoolize: putting auxiliary files in AC_CONFIG_AUX_DIR, `config'.
libtoolize: copying file `config/ltmain.sh'
libtoolize: Consider adding `AC_CONFIG_MACRO_DIR([m4])' to configure.in and
libtoolize: rerunning libtoolize, to keep the correct libtool macros in-tree.
libtoolize: Consider adding `-I m4' to ACLOCAL_AMFLAGS in Makefile.am.
+ autoheader
+ automake --add-missing --copy
+ autoconf
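The log above shows that ./bootstrap calls libtoolize, autoheader, automake and autoconf, so it can help to confirm they are installed before starting. The following is a minimal sketch of such a check; the function name missing_tools is hypothetical and not part of Coreseek.

```shell
# Hypothetical helper: report which of the given commands are missing from PATH.
# Prints one missing name per line and prints nothing when all are present.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}
```

For example, `missing_tools libtoolize autoheader automake autoconf gcc make` lists whatever still has to be installed through the system package manager.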
./configure --prefix=/usr/local/mmseg3
------------------------------------------------------------------------
Configuration:

  Source code location: .
  Compiler: gcc
  Compiler flags: -g -O2
  Host System Type: x86_64-redhat-linux-gnu
  Install path: /usr/local/mmseg3

  See config.h for further configuration information.
------------------------------------------------------------------------
make && make install
Create a text file in the original source directory to test the installation.
cd /usr/local/mmseg3
cd /usr/local/src/coreseek-4.1-beta/mmseg-3.2.14/src
vim test.txt
山東省德州市 北京朝陽市 中國北京 中國德州 中國山東德州
cd /usr/local/mmseg3/bin
./mmseg -d /usr/local/mmseg3/etc/ /usr/local/src/coreseek-4.1-beta/mmseg-3.2.14/src/test.txt
山東省/x 德州市/x /x /x 北京/x 朝陽市/x 中國/x 北京/x 中國/x 德州/x 中國/x 山東/x 德州/x
Word Splite took: 0 ms.
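In the output above, each segmented word is followed by a /x suffix, and whitespace in the input shows up as a bare /x. To inspect the segmentation one word per line, the output can be filtered. This is only a sketch that assumes the output format shown above; the function name mmseg_tokens is hypothetical.

```shell
# Hypothetical helper: read mmseg output on stdin and print one token per line.
# Keeps only fields of the form "word/x" with a non-empty word, so bare "/x"
# entries and the trailing timing message are dropped.
mmseg_tokens() {
    tr ' ' '\n' | sed -n 's#^\(..*\)/x$#\1#p'
}
```

It can be used as `./mmseg -d /usr/local/mmseg3/etc/ test.txt | mmseg_tokens`.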
cd /usr/local/src/coreseek-4.1-beta/csft-4.1    # from here on, csft can be treated as Sphinx
sh buildconf.sh    # run the script as a check; if it finishes without problems, the build can proceed
./configure --prefix=/usr/local/coreseek --without-unixodbc --with-mmseg --with-mmseg-includes=/usr/local/mmseg3/include/mmseg/ --with-mmseg-libs=/usr/local/mmseg3/lib/ --with-mysql
You can now run 'make install' to build and install Sphinx binaries.
On a multi-core machine, try 'make -j4 install' to speed up the build.

Updates, articles, help forum, and commercial support, consulting, training,
and development services are available at http://sphinxsearch.com/

Thank you for choosing Sphinx!
make && make install
make[3]: Entering directory `/usr/local/src/coreseek-4.1-beta/csft-4.1'
mkdir -p /usr/local/coreseek/var/data && mkdir -p /usr/local/coreseek/var/log
make[3]: Leaving directory `/usr/local/src/coreseek-4.1-beta/csft-4.1'
make[2]: Leaving directory `/usr/local/src/coreseek-4.1-beta/csft-4.1'
make[1]: Leaving directory `/usr/local/src/coreseek-4.1-beta/csft-4.1'
Then open the mysql client and create a table for testing.
create table kecheng(id int primary key auto_increment,name varchar(50),info varchar(50))charset utf8;
insert into kecheng(name,info) values('java','java是一門很牛的語言,性能總體來講比PHP要強,可是不如php開發速度快');
insert into kecheng(name,info) values('redis','redis是一種內存緩存數據庫,比memcache支持的數據格式多');
insert into kecheng(name,info) values('memcache','memcache支持簡單的key value形式,不像redis支持持久化');
insert into kecheng(name,info) values('jquery','jquery是一種前端腳本,結合php和java能夠作web開發');
cd /usr/local/coreseek/    # this is effectively the Sphinx directory
cd bin
ls    # the layout is similar to stock Sphinx
cd /usr/local/coreseek/etc
cp sphinx.conf.dist csft.conf
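-- This table stores the highest id that has already been indexed, so the whole table does not have to be reindexed every time
CREATE TABLE index_table(
    Counter_id int unsigned not null primary key auto_increment,
    Max_id int unsigned not null comment 'highest id that has already been indexed'
);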
Edit the configuration file csft.conf:
source src1
{
    # data source type. mandatory, no default value
    # known types are mysql, pgsql, mssql, xmlpipe, xmlpipe2, odbc
    type = mysql # database type

    #####################################################################
    ## SQL settings (for 'mysql' and 'pgsql' types)
    #####################################################################

    # some straightforward parameters for SQL source types
    sql_host = localhost # self-explanatory
    sql_user = root
    sql_pass =
    sql_db = test
    sql_port = 3306 # optional, default is 3306
    .....
    sql_query_pre = SET NAMES utf8 # set the character set
    sql_query_pre = SET SESSION query_cache_type=OFF # turn off the MySQL query cache

    # mandatory, integer document ID field MUST be the first selected column
    # (the default query is commented out)
    #sql_query = \
    # SELECT id, group_id, UNIX_TIMESTAMP(date_added) AS date_added, title, content \
    # FROM documents

    # the data to index; if the table's primary key is not named id, alias it to id,
    # e.g. SELECT tid id FROM tableName;
    sql_query = SELECT id,name,info FROM kecheng

    # SQL run after the main query finishes; index_table stores the last indexed
    # primary key id, so later runs index only the new rows instead of the whole table
    sql_query_post = REPLACE INTO index_table SELECT 1,MAX(id) FROM kecheng;
    .....
    # fields returned for each match when testing with the search tool; all columns here (test only)
    sql_query_info = SELECT * FROM kecheng WHERE id=$id
    .....
index test1
{
    .....
    path = /usr/local/coreseek/var/data/test1 # where the index files are created

    # document attribute values (docinfo) storage mode
    .....
    charset_type = zh_cn.utf-8 # changed to Chinese
    charset_dictpath = /usr/local/mmseg3/etc/ # dictionary directory
    .....
#----------------
source zengliangsuoyin : src1
{
    # fetch the rows that have not been indexed yet
    sql_query = SELECT id,name,info FROM kecheng WHERE id > (SELECT max_id FROM index_table)
    # the query that writes the last id back to index_table does not need repeating, because it is inherited from src1
}
index zengliangsuoyin : test1
{
    source = zengliangsuoyin
    path = /usr/local/coreseek/var/data/zengliangsuoyin # the delta index needs its own path, separate from test1
}
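With a main index (test1) and a delta index (zengliangsuoyin) defined, a periodic update would typically rebuild only the delta and then merge it into the main index. The sketch below assumes the install paths used in this walkthrough, that the delta index is stored at its own path, and that Sphinx's indexer --merge option behaves in Coreseek as it does in stock Sphinx. The function name reindex_delta is hypothetical.

```shell
# Assumed locations from this walkthrough; override via the environment if yours differ.
INDEXER=${INDEXER:-/usr/local/coreseek/bin/indexer}
CONF=${CONF:-/usr/local/coreseek/etc/csft.conf}

# Hypothetical helper: rebuild the delta index, then fold it into the main index.
# The merge runs only if the delta build succeeded.
reindex_delta() {
    "$INDEXER" --config "$CONF" zengliangsuoyin &&
    "$INDEXER" --config "$CONF" --merge test1 zengliangsuoyin
}
```

If searchd is already serving these indexes, both indexer calls would likely also need --rotate so that the running daemon picks up the new files.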
Save and exit.
cd /usr/local/coreseek/bin/
./indexer --all
using config file '/usr/local/coreseek/etc/csft.conf'... -- the config file in use; its name matches the copy made earlier
indexing index 'test1'...
WARNING: attribute 'group_id' not found - IGNORING
WARNING: attribute 'date_added' not found - IGNORING
WARNING: Attribute count is 0: switching to none docinfo
collected 5 docs, 0.0 MB
sorted 0.0 Mhits, 100.0% done
total 5 docs, 351 bytes
total 0.178 sec, 1971 bytes/sec, 28.07 docs/sec
indexing index 'test1stemmed'...
WARNING: attribute 'group_id' not found - IGNORING
WARNING: attribute 'date_added' not found - IGNORING
WARNING: Attribute count is 0: switching to none docinfo
collected 5 docs, 0.0 MB -- five documents were found, i.e. the five MySQL records, so the database connection works
sorted 0.0 Mhits, 100.0% done
total 5 docs, 351 bytes
total 0.007 sec, 47677 bytes/sec, 679.16 docs/sec
skipping non-plain index 'dist1'...
skipping non-plain index 'rt'...
total 4 reads, 0.000 sec, 0.3 kb/call avg, 0.0 msec/call avg
total 12 writes, 0.000 sec, 0.2 kb/call avg, 0.0 msec/call avg
./search php
Coreseek Fulltext 4.1 [ Sphinx 2.0.2-dev (r2922)]
Copyright (c) 2007-2011,
Beijing Choice Software Technologies Inc (http://www.coreseek.com)

using config file '/usr/local/coreseek/etc/csft.conf'...
index 'test1': query 'php ': returned 3 matches of 3 total in 0.000 sec

displaying matches:
1. document=1, weight=2500
    id=1
    group_id=1
    group_id2=5
    date_added=2017-02-08 06:22:36
    title=test one
    content=this is my test document number one. also checking search within phrases.
2. document=2, weight=1500
    id=2
    group_id=1
    group_id2=6
    date_added=2017-02-08 06:22:36
    title=test two
    content=this is my test document number two
3. document=5, weight=1500
    (document not found in db)

words:
1. 'php': 3 documents, 5 hits --- number of occurrences

index 'test1stemmed': query 'php ': returned 3 matches of 3 total in 0.000 sec

displaying matches:
1. document=1, weight=2500
    id=1
    group_id=1
    group_id2=5
    date_added=2017-02-08 06:22:36
    title=test one
    content=this is my test document number one. also checking search within phrases.
2. document=2, weight=1500
    id=2
    group_id=1
    group_id2=6
    date_added=2017-02-08 06:22:36
    title=test two
    content=this is my test document number two
3. document=5, weight=1500
    (document not found in db)

words:
1. 'php': 3 documents, 5 hits
Testing is done. Next comes installing the PHP extension.