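In modern software systems, deploying a new version or changing configuration without shutting the service down has become a hard requirement. This article walks through different ways to restart an application gracefully and uses examples to dig into the details. The discussion is built around Teleport, which is designed for Kubernetes access control; if you are not familiar with it, see https://gravitational.com/teleport/.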
SO_REUSEPORT vs Duplicating Sockets:
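To make Teleport more highly available, we recently spent some time on gracefully restarting Teleport's TLS and SSH listeners. Our goal was to upgrade the Teleport binary without having to spin up a new instance.
Two common approaches are described in this article, https://blog.cloudflare.com/the-sad-state-of-linux-socket-balancing; roughly, they are:
》You can set SO_REUSEPORT on the socket, which allows multiple processes to bind to the same port. With this approach each process has its own accept queue.
》You can duplicate the socket and pass it to a child process. With this approach the processes share a single accept queue.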
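SO_REUSEPORT has some drawbacks. Our engineers have used it before, and having multiple accept queues can sometimes lead to dropped TCP connections. In addition, Go does not make it easy to set SO_REUSEPORT.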
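The second approach is appealing because most developers are already familiar with the simple Unix fork/exec model. In that model, all file descriptors can be passed on to the child process. Go's os/exec package, however, does not do this by default: perhaps for security reasons, it passes only stdin, stdout, and stderr to the child. The os package does offer lower-level primitives that can pass file descriptors to a child process, and that is what we use.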
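Switching Processes with Signals:
Before getting to the source code, here are the details of how this approach works.
When a new Teleport process starts, it creates a socket listener that receives all traffic sent to the target port. We add a signal handler for SIGUSR2; on this signal, Teleport duplicates the listener socket and spawns a new process, passing it the file descriptor along with metadata about the listener in an environment variable. Once the new process has started, it uses the file descriptor and metadata passed to it to rebuild the socket and begins handling traffic.
Note that once the socket has been duplicated, traffic is balanced across the two sockets in round-robin fashion, as shown in the figure below. This means that each Teleport process accepts a new connection every so often.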
Figure 1: Teleport can duplicate itself and share incoming traffic with the duplicated processes
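Shutting down the parent process (PID 2) works the same way, only in reverse order. Once a Teleport process receives SIGQUIT, it begins shutting down: it first stops accepting new connections and then waits for all existing connections to finish. The parent then closes its own listener socket and exits. From this point on, the kernel sends traffic only to the new process.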
Example:
We wrote a small application that uses this approach; the source code is at the bottom. First, build the application and start it:
$ go build restart.go
$ ./restart &
[1] 95147
$ Created listener file descriptor for :8080.
$ curl http://localhost:8080/hello
Hello from 95147!
Send the USR2 signal to the original process. Now when you send HTTP requests, the responses come back with the PIDs of both processes:
$ kill -SIGUSR2 95147
user defined signal 2 signal received.
Forked child 95170.
$ Imported listener file descriptor for :8080.
$ curl http://localhost:8080/hello
Hello from 95170!
$ curl http://localhost:8080/hello
Hello from 95147!
Kill the original process, and you will see that every response now carries the new PID:
$ kill -SIGTERM 95147
signal: killed
[1]+ Exit 1 go run restart.go
$ curl http://localhost:8080/hello
Hello from 95170!
$ curl http://localhost:8080/hello
Hello from 95170!
Finally, kill the new process, and the service is gone entirely:
$ kill -SIGTERM 95170
$ curl http://localhost:8080/hello
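curl: (7) Failed to connect to localhost port 8080: Connection refused
As you can see, once you understand how it works, writing a gracefully restarting service in Go is easy, and it can greatly improve the availability of your service.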
Golang Graceful Restart Source Example
package main

import (
	"context"
	"encoding/json"
	"flag"
	"fmt"
	"net"
	"net/http"
	"os"
	"os/signal"
	"path/filepath"
	"syscall"
	"time"
)

// listener is the metadata passed to the child in the environment.
type listener struct {
	Addr     string `json:"addr"`
	FD       int    `json:"fd"`
	Filename string `json:"filename"`
}

func importListener(addr string) (net.Listener, error) {
	// Extract the encoded listener metadata from the environment.
	listenerEnv := os.Getenv("LISTENER")
	if listenerEnv == "" {
		return nil, fmt.Errorf("unable to find LISTENER environment variable")
	}

	// Unmarshal the listener metadata.
	var l listener
	err := json.Unmarshal([]byte(listenerEnv), &l)
	if err != nil {
		return nil, err
	}
	if l.Addr != addr {
		return nil, fmt.Errorf("unable to find listener for %v", addr)
	}

	// The file has already been passed to this process, extract the file
	// descriptor and name from the metadata to rebuild/find the *os.File for
	// the listener.
	listenerFile := os.NewFile(uintptr(l.FD), l.Filename)
	if listenerFile == nil {
		// err is nil at this point, so report the file name instead.
		return nil, fmt.Errorf("unable to create listener file: %v", l.Filename)
	}
	defer listenerFile.Close()

	// Create a net.Listener from the *os.File.
	ln, err := net.FileListener(listenerFile)
	if err != nil {
		return nil, err
	}
	return ln, nil
}

func createListener(addr string) (net.Listener, error) {
	ln, err := net.Listen("tcp", addr)
	if err != nil {
		return nil, err
	}
	return ln, nil
}

func createOrImportListener(addr string) (net.Listener, error) {
	// Try and import a listener for addr. If it's found, use it.
	ln, err := importListener(addr)
	if err == nil {
		fmt.Printf("Imported listener file descriptor for %v.\n", addr)
		return ln, nil
	}

	// No listener was imported, that means this process has to create one.
	ln, err = createListener(addr)
	if err != nil {
		return nil, err
	}
	fmt.Printf("Created listener file descriptor for %v.\n", addr)
	return ln, nil
}

func getListenerFile(ln net.Listener) (*os.File, error) {
	switch t := ln.(type) {
	case *net.TCPListener:
		return t.File()
	case *net.UnixListener:
		return t.File()
	}
	return nil, fmt.Errorf("unsupported listener: %T", ln)
}

func forkChild(addr string, ln net.Listener) (*os.Process, error) {
	// Get the file descriptor for the listener and marshal the metadata to pass
	// to the child in the environment.
	lnFile, err := getListenerFile(ln)
	if err != nil {
		return nil, err
	}
	defer lnFile.Close()
	l := listener{
		Addr:     addr,
		FD:       3,
		Filename: lnFile.Name(),
	}
	listenerEnv, err := json.Marshal(l)
	if err != nil {
		return nil, err
	}

	// Pass stdin, stdout, and stderr along with the listener to the child.
	files := []*os.File{
		os.Stdin,
		os.Stdout,
		os.Stderr,
		lnFile,
	}

	// Get current environment and add in the listener to it.
	environment := append(os.Environ(), "LISTENER="+string(listenerEnv))

	// Get current process name and directory.
	execName, err := os.Executable()
	if err != nil {
		return nil, err
	}
	execDir := filepath.Dir(execName)

	// Spawn child process.
	p, err := os.StartProcess(execName, []string{execName}, &os.ProcAttr{
		Dir:   execDir,
		Env:   environment,
		Files: files,
		Sys:   &syscall.SysProcAttr{},
	})
	if err != nil {
		return nil, err
	}
	return p, nil
}

func waitForSignals(addr string, ln net.Listener, server *http.Server) error {
	signalCh := make(chan os.Signal, 1024)
	signal.Notify(signalCh, syscall.SIGHUP, syscall.SIGUSR2, syscall.SIGINT, syscall.SIGQUIT)
	for {
		select {
		case s := <-signalCh:
			fmt.Printf("%v signal received.\n", s)
			switch s {
			case syscall.SIGHUP:
				// Fork a child process.
				p, err := forkChild(addr, ln)
				if err != nil {
					fmt.Printf("Unable to fork child: %v.\n", err)
					continue
				}
				fmt.Printf("Forked child %v.\n", p.Pid)

				// Create a context that will expire in 5 seconds and use this as a
				// timeout to Shutdown.
				ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
				defer cancel()

				// Return any errors during shutdown.
				return server.Shutdown(ctx)
			case syscall.SIGUSR2:
				// Fork a child process.
				p, err := forkChild(addr, ln)
				if err != nil {
					fmt.Printf("Unable to fork child: %v.\n", err)
					continue
				}

				// Print the PID of the forked process and keep waiting for more signals.
				fmt.Printf("Forked child %v.\n", p.Pid)
			case syscall.SIGINT, syscall.SIGQUIT:
				// Create a context that will expire in 5 seconds and use this as a
				// timeout to Shutdown.
				ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
				defer cancel()

				// Return any errors during shutdown.
				return server.Shutdown(ctx)
			}
		}
	}
}

func handler(w http.ResponseWriter, r *http.Request) {
	fmt.Fprintf(w, "Hello from %v!\n", os.Getpid())
}

func startServer(addr string, ln net.Listener) *http.Server {
	http.HandleFunc("/hello", handler)
	httpServer := &http.Server{
		Addr: addr,
	}
	go httpServer.Serve(ln)
	return httpServer
}

func main() {
	// Parse command line flags for the address to listen on.
	var addr string
	flag.StringVar(&addr, "addr", ":8080", "Address to listen on.")

	// Create (or import) a net.Listener and start a goroutine that runs
	// a HTTP server on that net.Listener.
	ln, err := createOrImportListener(addr)
	if err != nil {
		fmt.Printf("Unable to create or import a listener: %v.\n", err)
		os.Exit(1)
	}
	server := startServer(addr, ln)

	// Wait for signals to either fork or quit.
	err = waitForSignals(addr, ln, server)
	if err != nil {
		fmt.Printf("Exiting: %v\n", err)
		return
	}
	fmt.Printf("Exiting.\n")
}
Note: this requires Go 1.8 or later, because graceful shutdown via server.Shutdown was only added in Go 1.8.
Original English article: https://gravitational.com/blog/golang-ssh-bastion-graceful-restarts/