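For business reasons, our ES is Amazon Elasticsearch Service 7.4. To support the developers who use it and to cut unnecessary departmental spending, we periodically back up index snapshots to S3 and delete the corresponding index data from ES.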
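We need to periodically snapshot indices older than a given period (for example, one week) to S3 and delete the corresponding index data from Elasticsearch Service. When required, the backed-up index data must also be restorable. Once the whole process has succeeded, a notification is sent via WeChat or another channel.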
[Figure: flowchart of the ES requirement workflow]
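The current version of Curator is 5.8.3; see the Curator documentation for details.
Curator can perform many different operations on indices and snapshots, including the ones used in this article: taking snapshots, closing, opening, and deleting indices, restoring snapshots, and deleting snapshots.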
Curator can be installed in several ways, such as yum/apt-get, pip, or Docker. Here we use pip, the most common choice.
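# Install the required base packages
yum install -y vim python-pip

# Install virtualenvwrapper
pip install virtualenvwrapper

# Configure the virtual environment by adding the following to /etc/profile:
### virtualenv start ###
# Directory in which virtualenv keeps all environments
export WORKON_HOME=~/Envs
# Extra virtualenvwrapper arguments, to create a clean, isolated environment
#export VIRTUALENVWRAPPER_VIRTUALENV_ARGS='--no-site-packages'
# Python interpreter to use
#export VIRTUALENVWRAPPER_PYTHON=/opt/python36/bin/python3.6
# Location of the virtualenvwrapper script
export VIRTUALENVWRAPPER_SCRIPT=/usr/bin/virtualenvwrapper.sh
source /usr/bin/virtualenvwrapper_lazy.sh
### virtualenv end ###

# Reload the profile
source /etc/profile
# Create the virtual environment used to manage ES
mkvirtualenv es-snapshot
# List the virtual environments to confirm it was created
lsvirtualenv
# Enter the virtual environment
workon es-snapshot

# Install Curator inside the es-snapshot virtual environment
pip install elasticsearch-curator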
# Show details for all indices in the current ES; default host: 127.0.0.1, default port: 9200
curator_cli --host 127.0.0.1 --port 9200 show_indices --verbose
# cat config.yml
# Remember, leave a key empty if there is no value. None will be a string,
# not a Python "NoneType"
client:
  # ES cluster address
  hosts: http://your-domain.com
  # ES port
  port: your-port
  url_prefix:
  use_ssl: False
  # AWS region, e.g. ap-south-1
  aws_region: xxxxx
  aws_sign_request: False
  certificate:
  client_cert:
  client_key:
  ssl_no_validate: False
  http_auth:
  timeout: 30
  master_only: False

logging:
  # Log level
  loglevel: INFO
  # Log file path
  logfile: /var/log/cur-run.log
  logformat: default
  blacklist: ['elasticsearch', 'urllib3']
# cat action.yml
actions:
  1:
    # Snapshot indices older than 7 days
    action: snapshot
    description: >-
      Snapshot logstash- prefixed indices older than 7 days (based on index
      creation_date) with the snapshot name pattern 'es-%Y%m%d%H%M%S'.
      Wait for the snapshot to complete. Skip the repository filesystem
      access check. Use the other options to create the snapshot.
    options:
      # S3 repository name; the timestamp suffix is filled in by the script
      repository: "es_backup_\
      "
      # Leaving name blank will result in the default 'curator-%Y%m%d%H%M%S'
      name: es-%Y%m%d%H%M%S
      ignore_unavailable: False
      include_global_state: True
      partial: True
      wait_for_completion: True
      skip_repo_fs_check: True
      ignore_empty_list: True
      continue_if_exception: False
      disable_action: False
    filters:
    - filtertype: pattern
      kind: regex
      # Match indices whose name contains "logstash-"
      value: 'logstash-'
    - filtertype: age
      source: creation_date
      direction: older
      unit: days
      # Indices older than 7 days
      unit_count: 7
  2:
    # Close logstash- prefixed indices older than 7 days
    action: close
    description: >-
      Close indices older than 7 days (based on index name), for logstash-
      prefixed indices.
    options:
      delete_aliases: False
      timeout_override:
      continue_if_exception: False
    filters:
    - filtertype: pattern
      kind: regex
      value: '^logstash-'
      exclude:
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y.%m.%d'
      unit: days
      unit_count: 7
  3:
    # Delete indices older than 7 days
    action: delete_indices
    description: >-
      Delete indices older than 7 days (based on index name), for logstash-
      prefixed indices such as logstash-2021.04.10. Ignore the error if the
      filter does not result in an actionable list of indices
      (ignore_empty_list) and exit cleanly.
    options:
      ignore_empty_list: True
    filters:
    - filtertype: pattern
      kind: prefix
      # Match indices whose name starts with "logstash-"
      value: logstash-
    - filtertype: age
      # Match on the index name here; other sources (such as a field) are
      # also possible, see the official documentation
      source: name
      direction: older
      # Pattern used to match and extract the timestamp in index or snapshot names
      timestring: '%Y.%m.%d'
      unit: days
      # Indices older than 7 days
      unit_count: 7
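The actions run in this order: snapshot indices older than 7 days --> close indices older than 7 days --> delete indices older than 7 days --> keep indices from the last 7 days. If needed, a snapshot of the older indices can be restored into the current ES.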
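To make the name-based age filter concrete, here is a minimal sketch of the selection it performs: keep the indices with a given prefix whose date, parsed from the name with the timestring pattern, is more than 7 days old. This only approximates the idea and is not Curator's implementation; the function name `older_than` and the sample index names are made up for illustration.

```python
from datetime import datetime, timedelta


def older_than(index_names, prefix, timestring, days, now):
    """Return names that start with `prefix` and whose embedded date is
    more than `days` days before `now`."""
    cutoff = now - timedelta(days=days)
    selected = []
    for name in index_names:
        if not name.startswith(prefix):
            continue
        try:
            stamp = datetime.strptime(name[len(prefix):], timestring)
        except ValueError:
            continue  # no parsable date after the prefix, so skip this name
        if stamp < cutoff:
            selected.append(name)
    return selected


# Hypothetical index names, evaluated as if today were 2021-04-25
indices = ["logstash-2021.04.10", "logstash-2021.04.20", "metrics-2021.04.01"]
print(older_than(indices, "logstash-", "%Y.%m.%d", 7, datetime(2021, 4, 25)))
# prints ['logstash-2021.04.10']
```

Running curator with --dry-run, as in the commented-out line of the backup script, remains the reliable way to see what the real filters select.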
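In action.yml, the repository option is written as:

# S3 repository name; the timestamp suffix is filled in by the script
repository: "es_backup_\
"

It is written this way because running python register-repo.py produces two values: the repository name with a timestamp suffix, for example es_backup_20210424150533, and the timestamp itself, which is written to time_save.txt. The command sed '/es_backup_/r time_save.txt' action_temp.yml -i then inserts the timestamp line into action_temp.yml, directly after the line containing es_backup_. Inside a double-quoted YAML string, a backslash at the end of a line joins that line with the next one, so the resulting lines are read as the single value es_backup_20210424150533.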
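The following sketch mimics that templating step in plain Python, to show how the pieces combine. It is an illustration only: `insert_after_match` stands in for sed's r command using a simple substring test, and `join_continuations` imitates only the backslash line-joining of double-quoted YAML strings. Both function names are invented for this example.

```python
def insert_after_match(lines, marker, extra):
    """Roughly emulate sed '/marker/r file': append `extra` after every
    line that contains `marker`."""
    out = []
    for line in lines:
        out.append(line)
        if marker in line:
            out.extend(extra)
    return out


def join_continuations(lines):
    """Join each line ending in a backslash with the following line,
    dropping that line's leading whitespace."""
    joined = ""
    for line in lines:
        if joined.endswith("\\"):
            joined = joined[:-1] + line.lstrip()
        else:
            joined += line
    return joined


# The two template lines from action.yml and the line from time_save.txt
template = ['    repository: "es_backup_\\', '    "']
time_save = ['    20210424150533\\']

merged = insert_after_match(template, "es_backup_", time_save)
print(join_continuations(merged).strip())
# prints repository: "es_backup_20210424150533"
```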
Note: the numbered keys under actions: continue in sequence:

2: action to run
3: action to run
4: action to run
N: action to run
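Before running curator, we need to create the S3 repository. This requires an IAM role with permission to access Elasticsearch Service; see the AWS Elasticsearch Service documentation on creating snapshots. The script is as follows: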
# cat register-repo.py
import time

import boto3
import requests
from requests_aws4auth import AWS4Auth


def create_s3_register(timeup):
    """Register an S3 snapshot repository named es_backup_<timeup>."""
    host = 'https://your-aws-es-domain.com/'  # include https:// and trailing /
    region = 'ap-south-1'  # e.g. us-west-1
    service = 'es'
    credentials = boto3.Session().get_credentials()
    awsauth = AWS4Auth(credentials.access_key, credentials.secret_key,
                       region, service, session_token=credentials.token)

    # Register repository
    path = '_snapshot/' + 'es_backup_' + timeup  # the Elasticsearch API endpoint
    url = host + path
    payload = {
        "type": "s3",
        "settings": {
            "bucket": "your-s3-bucket",
            "region": "ap-south-1",
            "role_arn": "arn:aws:iam::1234567890:role/your-role-name"
        }
    }
    headers = {"Content-Type": "application/json"}
    r = requests.put(url, auth=awsauth, json=payload, headers=headers)
    print(r.status_code)
    print(r.text)


def var_save(timeup, filename, mode='w'):
    """Write the timestamp followed by a backslash, for insertion into action_temp.yml."""
    # The with block closes the file; the original called file.close without parentheses
    with open(filename, mode) as file:
        file.write(' ' + timeup + '\\' + '\n')


if __name__ == "__main__":
    # Use a separate name so the time module is not overwritten
    timestamp = time.strftime('%Y%m%d%H%M%S', time.localtime(time.time()))
    create_s3_register(timestamp)
    var_save(timestamp, 'time_save.txt')
The following script performs the index snapshot backup, index close, and index deletion, and finally sends a WeChat notification:
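#!/bin/bash
#author: tengfei.wu
#email: tengfei.wu@domain.com
#date:2021/04/25
#version: 2

# Stop at the first failing step, so SUCCESS is only reported when every step succeeded
set -e

# Create the S3 repository
python register-repo.py

# Insert the repository timestamp into a working copy of action.yml
cp action.yml action_temp.yml
sed -i '/es_backup_/r time_save.txt' action_temp.yml

# Snapshot, close, and delete the ES indices
#curator --config config.yml action_temp.yml --dry-run
curator --config config.yml action_temp.yml
rm -rf action_temp.yml

# WeChat notification
content='
【AI test environment】-- ES operation notice
Details: "ES snapshot backup, index close, and index deletion"
Operation details:
  Index snapshot: "indices older than 7 days"
  Index close: "indices older than 7 days"
  Index deletion: "indices older than 7 days"
Status: SUCCESS
Alert created by: "automated script"
Current indices: "indices from the last week are kept"'
curl http://x.x.x.x:4567/send -d "tos=your-IM&content=${content}"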
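Since the requirement is to notify only after the process has succeeded, it can help to derive the reported status from curator's exit code rather than hard-coding it. Below is a minimal sketch of that idea in Python. The endpoint and the tos and content fields are the ones used in the curl call above; the function names, the shortened message text, and the FAILED label are assumptions made for illustration.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def build_notification(exit_code, recipients="your-IM"):
    """Build the form-encoded body, with a status derived from the exit code."""
    status = "SUCCESS" if exit_code == 0 else "FAILED"  # FAILED is an assumed label
    content = "\n".join([
        "[AI test environment] -- ES operation notice",
        'Details: "ES snapshot backup, index close, and index deletion"',
        "Status: " + status,
    ])
    return urlencode({"tos": recipients, "content": content}).encode("utf-8")


def send_notification(body, url="http://x.x.x.x:4567/send"):
    """POST the body to the notification gateway (placeholder address)."""
    with urlopen(url, data=body, timeout=10) as response:
        return response.status


# Example: build the message for a curator run that exited with code 0
body = build_notification(0)
print(body.decode("utf-8"))
```

One way to use it would be to run curator through subprocess.run and pass its returncode to build_notification before calling send_notification.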
# logstash log backup, every Sunday at 9:30 am
30 9 * * 0 cd /root/Envs/es-snapshot/bin && source ./activate && cd /root/Envs/es-snapshot && (/bin/bash ccc) && deactivate
# Run the backup manually in the background
workon es-snapshot && cd /root/Envs/es-snapshot/
sh start_es_backup.sh > /dev/null 2>&1 &
# cat action_restore.yml
actions:
  1:
    action: restore
    description: >-
      Restore all indices in the most recent snapshot with state SUCCESS.
      Wait for the restore to complete before continuing. Do not skip the
      repository filesystem access check. Use the other options to define
      the index/shard settings for the restore.
    options:
      repository: es_backup_20210425054626
      name:
      indices:
      wait_for_completion: True
      #max_wait: 3600
      #wait_interval: 10
    filters:
    - filtertype: state
      state: SUCCESS
      exclude:
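With name left blank, the restore action works on the most recent snapshot that passes the state filter. The sketch below shows that selection on plain dictionaries shaped like the entries of Elasticsearch's snapshot listing (fields snapshot, state, start_time_in_millis). It is not Curator's code, and the function name and sample data are invented.

```python
def latest_successful(snapshots):
    """Return the name of the newest snapshot whose state is SUCCESS, or None."""
    done = [s for s in snapshots if s["state"] == "SUCCESS"]
    if not done:
        return None
    return max(done, key=lambda s: s["start_time_in_millis"])["snapshot"]


# Hypothetical snapshot listing for one repository
snapshots = [
    {"snapshot": "es-20210418093000", "state": "SUCCESS", "start_time_in_millis": 1618738200000},
    {"snapshot": "es-20210425093000", "state": "SUCCESS", "start_time_in_millis": 1619343000000},
    {"snapshot": "es-20210425120000", "state": "PARTIAL", "start_time_in_millis": 1619352000000},
]
print(latest_successful(snapshots))
# prints es-20210425093000
```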
Check the current index status:
curator_cli --host your-domain.es.amazonaws.com --port your-port show_indices --verbose
# cat action_open.yml
actions:
  1:
    action: open
    description: "open selected indices"
    options:
      continue_if_exception: False
      timeout_override: 300
    filters:
    - filtertype: pattern
      kind: regex
      value: '^logstash-'
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y.%m.%d'
      unit: days
      unit_count: 7
# action_delete_snapshot.yml
# Example configuration for deleting snapshots
actions:
  1:
    action: delete_snapshots
    description: "Delete selected snapshots from 'repository'"
    options:
      repository: es_backup_20210424150533
      retry_interval: 120
      retry_count: 3
      timeout_override: 3600
    filters:
    - filtertype: state
      state: SUCCESS
      exclude:
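Note: the action_delete_snapshot.yml configuration only removes the snapshots stored in the es_backup_20210424150533 repository; the repository itself is not deleted. To delete the empty repository:
DELETE /_snapshot/es_backup_20210424150533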
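On Amazon Elasticsearch Service, this DELETE request can be signed and sent in the same way as the PUT in register-repo.py. The sketch below assumes an AWS4Auth object built as in that script; the function names are hypothetical, and the http_delete parameter exists only so the call can be replaced with a stub when trying the code without network access.

```python
def repository_url(host, repository):
    """Build the _snapshot URL; works whether or not host ends with '/'."""
    return host.rstrip("/") + "/_snapshot/" + repository


def delete_repository(host, repository, awsauth, http_delete=None):
    """Send DELETE /_snapshot/<repository> and return the HTTP status code."""
    if http_delete is None:
        import requests  # same HTTP library as register-repo.py
        http_delete = requests.delete
    response = http_delete(repository_url(host, repository), auth=awsauth)
    return response.status_code


# Example (awsauth created as in register-repo.py):
# delete_repository('https://your-aws-es-domain.com/',
#                   'es_backup_20210424150533', awsauth)
print(repository_url("https://your-aws-es-domain.com/", "es_backup_20210424150533"))
# prints https://your-aws-es-domain.com/_snapshot/es_backup_20210424150533
```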