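bert-as-service uses BERT as a sentence encoder and hosts it as a service over ZeroMQ, so a sentence can be mapped to a fixed-length vector representation in just two lines of code.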
Test environment: Windows 10 + Python 3.5 + TensorFlow 1.10.0
bert-as-service requires Python ≥ 3.5 and TensorFlow ≥ 1.10.
    pip install bert-serving-server
    pip install bert-serving-client

Download the pre-trained Chinese BERT model:
Model | Configuration |
---|---|
BERT-Base, Uncased | 12-layer, 768-hidden, 12-heads, 110M parameters |
BERT-Large, Uncased | 24-layer, 1024-hidden, 16-heads, 340M parameters |
BERT-Base, Cased | 12-layer, 768-hidden, 12-heads, 110M parameters |
BERT-Large, Cased | 24-layer, 1024-hidden, 16-heads, 340M parameters |
BERT-Base, Multilingual Cased (New) | 104 languages, 12-layer, 768-hidden, 12-heads, 110M parameters |
BERT-Base, Multilingual Cased (Old) | 102 languages, 12-layer, 768-hidden, 12-heads, 110M parameters |
BERT-Base, Chinese | Chinese Simplified and Traditional, 12-layer, 768-hidden, 12-heads, 110M parameters |
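
Before starting the server it can help to confirm that the unzipped model directory is complete. The sketch below is only an illustration: the helper name `missing_model_files` is made up here, and the expected file names (`bert_config.json`, `vocab.txt`, and checkpoint files sharing the `bert_model.ckpt` prefix) are an assumption based on the usual layout of the Google BERT releases.

```python
import os

# Assumption: file names follow the usual Google BERT release layout.
EXPECTED_FILES = ("bert_config.json", "vocab.txt")
CKPT_PREFIX = "bert_model.ckpt"


def missing_model_files(model_dir):
    """Return the expected entries that are absent from model_dir."""
    entries = set(os.listdir(model_dir)) if os.path.isdir(model_dir) else set()
    missing = [name for name in EXPECTED_FILES if name not in entries]
    # The checkpoint is stored as several files sharing one prefix
    # (for example bert_model.ckpt.index), so match on the prefix.
    if not any(name.startswith(CKPT_PREFIX) for name in entries):
        missing.append(CKPT_PREFIX + ".*")
    return missing


if __name__ == "__main__":
    # Path taken from this post; replace it with your own model directory.
    print(missing_model_files(r"D:\env\bert\chinese_L-12_H-768_A-12"))
```

An empty list means every expected file was found; anything else names what is still missing.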
Start the bert-as-service server (change the model path to your own):

    bert-serving-start -model_dir /tmp/english_L-12_H-768_A-12/ -num_worker=2

Startup output:

    usage: xxxx\Anaconda3\envs\py35\Scripts\bert-serving-start -model_dir D:\env\bert\chinese_L-12_H-768_A-12 -num_worker=2
                     ARG   VALUE
    __________________________________________________
               ckpt_name = bert_model.ckpt
             config_name = bert_config.json
                    cors = *
                     cpu = False
              device_map = []
           do_lower_case = True
      fixed_embed_length = False
                    fp16 = False
     gpu_memory_fraction = 0.5
           graph_tmp_dir = None
        http_max_connect = 10
               http_port = None
            mask_cls_sep = False
          max_batch_size = 256
             max_seq_len = 25
               model_dir = D:\env\bert\chinese_L-12_H-768_A-12
    no_position_embeddings = False
        no_special_token = False
              num_worker = 2
           pooling_layer = [-2]
        pooling_strategy = REDUCE_MEAN
                    port = 5555
                port_out = 5556
           prefetch_size = 10
     priority_batch_size = 16
    show_tokens_to_client = False
         tuned_model_dir = None
                 verbose = False
                     xla = False

    I:VENTILATOR:freeze, optimize and export graph, could take a while...
    I:GRAPHOPT:model config: D:\env\bert\chinese_L-12_H-768_A-12\bert_config.json
    I:GRAPHOPT:checkpoint: D:\env\bert\chinese_L-12_H-768_A-12\bert_model.ckpt
    I:GRAPHOPT:build graph...
    I:GRAPHOPT:load parameters from checkpoint...
    I:GRAPHOPT:optimize...
    I:GRAPHOPT:freeze...
    I:GRAPHOPT:write graph to a tmp file: C:\Users\Memento\AppData\Local\Temp\tmpo07002um
    I:VENTILATOR:bind all sockets
    I:VENTILATOR:open 8 ventilator-worker sockets
    I:VENTILATOR:start the sink
    I:SINK:ready
    I:VENTILATOR:get devices
    W:VENTILATOR:no GPU available, fall back to CPU
    I:VENTILATOR:device map:
        worker 0 -> cpu
        worker 1 -> cpu
    I:WORKER-0:use device cpu, load graph from C:\Users\Memento\AppData\Local\Temp\tmpo07002um
    I:WORKER-1:use device cpu, load graph from C:\Users\Memento\AppData\Local\Temp\tmpo07002um
    I:WORKER-0:ready and listening!
    I:WORKER-1:ready and listening!
    I:VENTILATOR:all set, ready to serve request!
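
If the server is launched from a script rather than typed by hand, the command line can be assembled programmatically. The following is a minimal sketch, not part of bert-as-service: `build_start_command` is a hypothetical helper, and it only uses the `-model_dir`, `-num_worker` and `-http_port` flags that appear in the argument list above. It resolves the model path to an absolute path up front.

```python
import os


def build_start_command(model_dir, num_worker=1, http_port=None):
    """Assemble the argument list for bert-serving-start."""
    cmd = [
        "bert-serving-start",
        "-model_dir", os.path.abspath(model_dir),  # always pass an absolute path
        "-num_worker", str(num_worker),
    ]
    if http_port is not None:
        cmd += ["-http_port", str(http_port)]
    return cmd


if __name__ == "__main__":
    # Hypothetical relative path; replace it with your own model directory.
    print(" ".join(build_start_command("chinese_L-12_H-768_A-12", num_worker=2)))
```

The returned list can be handed to `subprocess.run` to actually start the server.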
Encode sentences from a Python client:

    from bert_serving.client import BertClient

    # Connect to the local server; skip the version and length checks.
    bc = BertClient(ip="localhost", check_version=False, check_length=False)
    vec = bc.encode(['你好', '你好呀', '我很好'])
    print(vec)

Output:

    [[ 0.2894022  -0.13572647  0.07591158 ... -0.14091237  0.54630077 -0.30118054]
     [ 0.4535432  -0.03180456  0.3459639  ... -0.3121457   0.42606848 -0.50814617]
     [ 0.6313594  -0.22302179  0.16799903 ... -0.1614125   0.23098437 -0.5840646 ]]
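
One common use of the returned vectors is to compare sentences by cosine similarity. The sketch below assumes nothing about bert-as-service itself: `cosine_similarity` and `most_similar` are illustrative helpers written in plain Python, and the toy 3-dimensional vectors stand in for the 768-dimensional rows that `bc.encode` returns.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


def most_similar(query_vec, candidate_vecs):
    """Return (index, score) of the candidate closest to query_vec."""
    scores = [cosine_similarity(query_vec, c) for c in candidate_vecs]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]


if __name__ == "__main__":
    # Made-up vectors for illustration; real ones would come from bc.encode(...).
    vecs = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 1.0, 0.0]]
    print(most_similar(vecs[0], vecs[1:]))
```

With real output, `most_similar(vec[0], vec[1:])` would report which of the remaining sentences lies closest to the first one.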
Adding the argument -http_port 8081 when starting the server exposes an HTTP query service on port 8081.
Requesting http://localhost:8081/status/server then returns the server status:
    {
      "ckpt_name": "bert_model.ckpt",
      "client": "7a033047-f177-45fd-9ef5-45781b10d322",
      "config_name": "bert_config.json",
      "cors": "*",
      "cpu": false,
      "device_map": [],
      "do_lower_case": true,
      "fixed_embed_length": false,
      "fp16": false,
      "gpu_memory_fraction": 0.5,
      "graph_tmp_dir": null,
      "http_max_connect": 10,
      "http_port": 8081,
      "mask_cls_sep": false,
      "max_batch_size": 256,
      "max_seq_len": 25,
      "model_dir": "D:\\env\\bert\\chinese_L-12_H-768_A-12",
      "no_position_embeddings": false,
      "no_special_token": false,
      "num_concurrent_socket": 8,
      "num_process": 3,
      "num_worker": 1,
      "pooling_layer": [ -2 ],
      "pooling_strategy": 2,
      "port": 5555,
      "port_out": 5556,
      "prefetch_size": 10,
      "priority_batch_size": 16,
      "python_version": "3.5.6 |Anaconda, Inc.| (default, Aug 26 2018, 16:05:27) [MSC v.1900 64 bit (AMD64)]",
      "pyzmq_version": "20.0.0",
      "server_current_time": "2021-03-03 15:53:03.859211",
      "server_start_time": "2021-03-03 10:00:21.128310",
      "server_version": "1.10.0",
      "show_tokens_to_client": false,
      "statistic": {
        "avg_last_two_interval": 1665.306127225,
        "avg_request_per_client": 8.333333333333334,
        "avg_request_per_second": 0.09246377980293276,
        "avg_size_per_request": 102.58333333333333,
        "max_last_two_interval": 17484.7365829,
        "max_request_per_client": 53,
        "max_request_per_second": 0.9194538223647459,
        "max_size_per_request": 601,
        "min_last_two_interval": 1.087602199997491,
        "min_request_per_client": 2,
        "min_request_per_second": 0.00005719274038008647,
        "min_size_per_request": 1,
        "num_active_client": 0,
        "num_data_request": 12,
        "num_max_last_two_interval": 1,
        "num_max_request_per_client": 1,
        "num_max_request_per_second": 1,
        "num_max_size_per_request": 1,
        "num_min_last_two_interval": 1,
        "num_min_request_per_client": 6,
        "num_min_request_per_second": 1,
        "num_min_size_per_request": 1,
        "num_sys_request": 63,
        "num_total_client": 9,
        "num_total_request": 75,
        "num_total_seq": 1231
      },
      "status": 200,
      "tensorflow_version": [ "1", "10", "0" ],
      "tuned_model_dir": null,
      "ventilator -> worker": [
        "tcp://127.0.0.1:52440",
        "tcp://127.0.0.1:52441",
        "tcp://127.0.0.1:52442",
        "tcp://127.0.0.1:52443",
        "tcp://127.0.0.1:52444",
        "tcp://127.0.0.1:52445",
        "tcp://127.0.0.1:52446",
        "tcp://127.0.0.1:52447"
      ],
      "ventilator <-> sink": "tcp://127.0.0.1:52439",
      "verbose": false,
      "worker -> sink": "tcp://127.0.0.1:52467",
      "xla": false,
      "zmq_version": "4.3.3"
    }
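
The same status can be fetched from code, for instance to feed a small monitoring script. This is a rough sketch using only the Python standard library; the function names are made up for illustration, the URL assumes the server was started with -http_port 8081 as above, and the field names are the ones shown in the status output.

```python
import json
from urllib.request import urlopen


def fetch_server_status(base_url="http://localhost:8081", timeout=5):
    """GET /status/server and return the parsed JSON as a dict."""
    with urlopen(base_url + "/status/server", timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def summarize_status(status):
    """Pick a few headline fields out of the status dict."""
    stats = status.get("statistic", {})
    return {
        "server_version": status.get("server_version"),
        "num_worker": status.get("num_worker"),
        "max_seq_len": status.get("max_seq_len"),
        "num_total_request": stats.get("num_total_request"),
        "num_total_seq": stats.get("num_total_seq"),
    }
```

With the server running, `summarize_status(fetch_server_status())` returns a small dict of these fields; any key missing from the response comes back as `None`.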
From here, you can build a front end to visualize this data, or simply use the plugin/dashboard that ships with the bert-as-service project.
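Q: Starting the bert-as-service server reports that the file cudart64_100.dll is missing.
A: Download the DLL (it is part of the NVIDIA CUDA 10.0 runtime), place it in the C:\Windows\System32 directory, then open a new command-line window and run the command again.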
Q: fail to optimize the graph!, TypeError: cannot unpack non-iterable NoneType object
A: Downgrade to TensorFlow 1.10.0, and make sure the model path is an absolute path.
    pip uninstall tensorflow
    pip uninstall tensorflow-estimator
    conda install --channel https://conda.anaconda.org/aaronzs tensorflow