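Our servers are lent out to several partner companies. Some of those companies have no ops staff, so their developers work on the machines directly and come straight to me when something breaks. I have learned a lot from this.
Recently someone did something (he could not say what) that brought a server down outright. I had the datacenter power the machine back on and then went through the logs. Nothing obvious turned up, only a single watchdog message. I searched around for background on watchdog, but most of what I found only scratched the surface, along with a few kernel bug reports that mention watchdog. I could not take the investigation any further (mainly because my own skills are not up to it). My best guess is that an application the developers were running tripped the watchdog: the watchdog check failed within the configured time and that caused a reboot or shutdown. Luckily this machine is only used for testing and carries no important business.
Not long after, they called me again: port 3308 of their MySQL multi-instance setup would not start. So I logged on to investigate.
After logging in, I first checked the listening ports: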
# ss -tunl
Netid  Recv-Q  Send-Q  Local Address:Port  Peer Address:Port
tcp    0       64      :::873              :::*
tcp    0       50      *:3306              *:*
tcp    0       128     :::11211            :::*
tcp    0       128     *:11211             *:*
tcp    0       50      *:3307              *:*
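The same check can be scripted when there are many ports to verify. Below is a minimal sketch in Python, assuming the output has the column layout shown above (local address second from the right). The function names `listening_ports` and `missing_ports` are made up for this example.

```python
# Sample copied from the `ss -tunl` output above.
SS_OUTPUT = """\
Netid Recv-Q Send-Q Local Address:Port Peer Address:Port
tcp 0 64 :::873 :::*
tcp 0 50 *:3306 *:*
tcp 0 128 :::11211 :::*
tcp 0 128 *:11211 *:*
tcp 0 50 *:3307 *:*
"""


def listening_ports(ss_text):
    """Return the set of local ports found in `ss -tunl`-style output."""
    ports = set()
    for line in ss_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 5:
            continue
        local = fields[-2]  # local address:port is the second-to-last column
        port = local.rsplit(":", 1)[-1]
        if port.isdigit():
            ports.add(int(port))
    return ports


def missing_ports(ss_text, expected):
    """Return the expected ports that nothing is listening on, sorted."""
    return sorted(set(expected) - listening_ports(ss_text))


if __name__ == "__main__":
    print(missing_ports(SS_OUTPUT, [3308]))  # prints [3308]
```

On a live machine the text would come from running `ss -tunl` rather than from a pasted sample.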
Sure enough, port 3308 is not in the list. Next I looked at the configuration file to see exactly how they had set things up.
# cat /etc/my.cnf
[client]
#port = 3306
default-character-set = utf8
#socket = /usr/local/mysql/mysql.sock

[mysqld_multi]
mysqld = /data0/mysql/bin/mysqld_safe
mysqladmin = /data0/mysql/bin/mysqladmin

[mysqld1]
socket = /usr/local/mysql/mysql1.sock
port = 3306
pid-file = /usr/local/mysql/mysql1.pid
datadir = /usr/local/mysql/data1
user = mysql
log = /usr/local/msyql/e1.log
server-id = 1
skip-name-resolve
character-set-server = utf8
log-bin-trust-function-creators=1
back_log = 50
max_connections = 500
max_connect_errors = 32
max_allowed_packet = 16M
table_cache = 2048
binlog_cache_size = 1M
max_heap_table_size = 64M
tmp_table_size = 64M
#binlog_format = "MIXED"
key_buffer_size = 32M
read_buffer_size = 2M
read_rnd_buffer_size = 16M
bulk_insert_buffer_size = 64M
sort_buffer_size = 8M
join_buffer_size = 8M
thread_cache_size = 8
thread_concurrency = 8
thread_stack = 192K
slow_query_log
long_query_time = 2
log-short-format
myisam_sort_buffer_size = 128M
myisam_max_sort_file_size = 10G
myisam_repair_threads = 1
myisam_recover

[mysqld2]
socket = /usr/local/mysql/mysql2.sock
port = 3308
pid-file = /usr/local/mysql/mysql2.pid
datadir = /usr/local/mysql/data2
user = mysql
log = /usr/local/msyql/e2.log
server-id = 1
skip-name-resolve
character-set-server = utf8
log-bin-trust-function-creators=1
back_log = 50
max_connections = 500
max_connect_errors = 32
max_allowed_packet = 16M
table_cache = 2048
binlog_cache_size = 1M
max_heap_table_size = 64M
tmp_table_size = 64M
#binlog_format = "MIXED"
key_buffer_size = 32M
read_buffer_size = 2M
read_rnd_buffer_size = 16M
bulk_insert_buffer_size = 64M
sort_buffer_size = 8M
join_buffer_size = 8M
thread_cache_size = 8
thread_concurrency = 8
thread_stack = 192K
slow_query_log
long_query_time = 2
log-short-format
myisam_sort_buffer_size = 128M
myisam_max_sort_file_size = 10G
myisam_repair_threads = 1
myisam_recover

[mysqldump]
quick
max_allowed_packet = 16M

[mysql]
no-auto-rehash

[isamchk]
key_buffer = 512M
sort_buffer_size = 512M
read_buffer = 8M
write_buffer = 8M

[myisamchk]
key_buffer = 512M
sort_buffer_size = 512M
read_buffer = 8M
write_buffer = 8M

[mysqlhotcopy]
interactive-timeout

[mysqld_safe]
open-files-limit = 8192
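That is the configuration file. The things to look at are the data directories, the sock file locations, where the pid files are stored, and where the logs go.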
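With a file this long it helps to pull out just those settings per instance. Here is a minimal sketch using Python's standard `configparser`, run on a trimmed copy of the config above. The helper name `instance_settings` is made up for this example. A real my.cnf may also use features such as `!include` lines or repeated options, which `configparser` does not handle, so treat this as a reading aid rather than a full my.cnf parser.

```python
import configparser

# Trimmed copy of the relevant sections from the my.cnf shown above.
SAMPLE_CNF = """\
[mysqld_multi]
mysqld = /data0/mysql/bin/mysqld_safe
mysqladmin = /data0/mysql/bin/mysqladmin

[mysqld1]
socket = /usr/local/mysql/mysql1.sock
port = 3306
pid-file = /usr/local/mysql/mysql1.pid
datadir = /usr/local/mysql/data1
user = mysql
skip-name-resolve

[mysqld2]
socket = /usr/local/mysql/mysql2.sock
port = 3308
pid-file = /usr/local/mysql/mysql2.pid
datadir = /usr/local/mysql/data2
user = mysql
skip-name-resolve
"""

KEYS = ("port", "socket", "pid-file", "datadir", "user")


def instance_settings(cnf_text):
    """Map each [mysqldN] section to its port, socket, pid-file, datadir and user."""
    # allow_no_value accepts bare flags such as skip-name-resolve;
    # interpolation=None keeps '%' characters literal.
    parser = configparser.ConfigParser(allow_no_value=True, interpolation=None)
    parser.read_string(cnf_text)
    result = {}
    for section in parser.sections():
        suffix = section[len("mysqld"):]
        # Keep mysqld1, mysqld2, ... and skip mysqld_multi, mysqld_safe, mysqldump.
        if section.startswith("mysqld") and suffix.isdigit():
            result[section] = {
                key: parser.get(section, key, fallback=None) for key in KEYS
            }
    return result


if __name__ == "__main__":
    for name, settings in instance_settings(SAMPLE_CNF).items():
        print(name, settings)
```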
Then I went to the configured directory to see what was actually there.
# ls
3307 data1 docs e2.log lib mysql2-slow.log mysql-test sql-bench bin data2 e1.log include libexec mysql1-slow.log mysql2.pid mysql2.sock share var
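This left me baffled, because it does not match the configuration file at all. mysql2.sock and mysql2.pid clearly exist, so why is 3306 the port that is listening? I then ran ps -ef | grep mysql to look at the paths, and things looked even more wrong.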
# ps -ef | grep mysql
root 18634 1 0 Jan12 ? 00:00:00 /bin/sh /usr/local/mysql/bin/mysqld_safe --datadir=/usr/local/mysql/var --pid-file=/usr/local/mysql/var/localhost.localdomain.pid
mysql 18763 18634 0 Jan12 ? 00:00:39 /usr/local/mysql/bin/mysqld --basedir=/usr/local/mysql --datadir=/usr/local/mysql/var --plugin-dir=/usr/local/mysql/lib/plugin --user=mysql --log-error=/usr/local/mysql/var/localhost.localdomain.err --pid-file=/usr/local/mysql/var/localhost.localdomain.pid --socket=/tmp/mysql.sock --port=3306
root 29805 29780 0 14:40 pts/0 00:00:00 grep mysql
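One way to make the mismatch explicit is to compare the options on the running mysqld command line with what my.cnf says each instance should use. The sketch below does that in Python. The command line and the configured values are copied from the output and config above, and the helper names (`parse_options`, `matching_instances`, `differences`) are made up for this example.

```python
import shlex

# Command column of the mysqld process from the ps output above.
PS_CMDLINE = (
    "/usr/local/mysql/bin/mysqld --basedir=/usr/local/mysql "
    "--datadir=/usr/local/mysql/var --plugin-dir=/usr/local/mysql/lib/plugin "
    "--user=mysql --log-error=/usr/local/mysql/var/localhost.localdomain.err "
    "--pid-file=/usr/local/mysql/var/localhost.localdomain.pid "
    "--socket=/tmp/mysql.sock --port=3306"
)

# What /etc/my.cnf says the two instances should look like.
CONFIGURED = {
    "mysqld1": {
        "port": "3306",
        "datadir": "/usr/local/mysql/data1",
        "socket": "/usr/local/mysql/mysql1.sock",
    },
    "mysqld2": {
        "port": "3308",
        "datadir": "/usr/local/mysql/data2",
        "socket": "/usr/local/mysql/mysql2.sock",
    },
}


def parse_options(cmdline):
    """Split a command line into (binary, {option: value}) for --name=value options."""
    argv = shlex.split(cmdline)
    opts = {}
    for arg in argv[1:]:
        if arg.startswith("--") and "=" in arg:
            name, value = arg[2:].split("=", 1)
            opts[name] = value
    return argv[0], opts


def matching_instances(opts, configured):
    """Names of configured instances whose settings all equal the running values."""
    return [
        name
        for name, want in configured.items()
        if all(opts.get(key) == value for key, value in want.items())
    ]


def differences(opts, want):
    """Map each differing setting to (configured value, running value)."""
    return {
        key: (value, opts.get(key))
        for key, value in want.items()
        if opts.get(key) != value
    }


if __name__ == "__main__":
    binary, opts = parse_options(PS_CMDLINE)
    print(binary, matching_instances(opts, CONFIGURED))
    print(differences(opts, CONFIGURED["mysqld1"]))
```

For the data above, no configured instance matches the running process: the port is 3306, but the datadir and socket belong to neither mysqld1 nor mysqld2.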
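Plenty of experts would spot the problem at a glance here. Sadly I am not one of them, and it did not occur to me right away, so I kept working through it step by step. What I did notice was that the default MySQL instance had been started, of all things. So I stopped the instance on port 3306.
I deleted mysql2.sock and the pid file, then started things again: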
mysqld_multi start 1,2
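No luck. Port 3306 came up again and 3308 still did not. Another ps -ef | grep mysql showed two mysqld_safe processes; in fact every process appeared twice with identical arguments. Very odd. So I killed them all again.
For the next start attempt I watched the multi-instance log while it ran: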
tail -f mysqld_multi.log
mysqld_multi log file version 2.16; run: Fri Jan 16 13:53:44 2015
Starting MySQL servers
150116 13:53:44 [ERROR] Fatal error: Please read "Security" section of the manual to find out how to run mysqld as root!
150116 13:53:44 [ERROR] Aborting
150116 13:53:44 [Note] /usr/local/mysql/libexec/mysqld: Shutdown complete
150116 13:53:44 [ERROR] Fatal error: Please read "Security" section of the manual to find out how to run mysqld as root!
150116 13:53:44 [ERROR] Aborting
150116 13:53:44 [Note] /usr/local/mysql/libexec/mysqld: Shutdown complete
mysqld_multi log file version 2.16; run: Fri Jan 16 13:56:32 2015
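It looked like the above. The error roughly says that mysqld must not be started as root, yet the MySQL configuration file explicitly specifies the mysql user. I went on to check the permissions on the data directories and a pile of other things, and everything looked normal. By this point I was thoroughly frustrated. So where was the problem?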
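When the log grows over several start attempts, grouping the errors by run makes it easier to see which attempt produced what. This is a rough Python sketch that assumes every run begins with a "mysqld_multi log file version" header line, as in the excerpt above. The function name `errors_per_run` is made up for this example.

```python
# Excerpt copied from mysqld_multi.log above.
LOG = """\
mysqld_multi log file version 2.16; run: Fri Jan 16 13:53:44 2015
Starting MySQL servers
150116 13:53:44 [ERROR] Fatal error: Please read "Security" section of the manual to find out how to run mysqld as root!
150116 13:53:44 [ERROR] Aborting
150116 13:53:44 [Note] /usr/local/mysql/libexec/mysqld: Shutdown complete
150116 13:53:44 [ERROR] Fatal error: Please read "Security" section of the manual to find out how to run mysqld as root!
150116 13:53:44 [ERROR] Aborting
150116 13:53:44 [Note] /usr/local/mysql/libexec/mysqld: Shutdown complete
mysqld_multi log file version 2.16; run: Fri Jan 16 13:56:32 2015
"""

RUN_HEADER = "mysqld_multi log file version"


def errors_per_run(log_text):
    """Return a list of (run timestamp, [error messages]) in log order."""
    runs = []
    for line in log_text.splitlines():
        if line.startswith(RUN_HEADER):
            # Header format: "... version 2.16; run: <timestamp>"
            stamp = line.split("run:", 1)[1].strip()
            runs.append((stamp, []))
        elif "[ERROR]" in line and runs:
            runs[-1][1].append(line.split("[ERROR]", 1)[1].strip())
    return runs


if __name__ == "__main__":
    for stamp, errors in errors_per_run(LOG):
        print(stamp, len(errors), "error line(s)")
```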
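So I went looking for the error log, only to find that the .err log under var had recorded nothing. I turned it over in my head: could it be that the configuration file was not taking effect? All I could do at that point was stare into space for a while.
Once I had finished staring, I went through the logs and the configuration file again, very carefully this time, and finally spotted the problem. The paths configured under [mysqld_multi] do not match the paths used by mysqld: [mysqld_multi] points at /data0/mysql/bin, while the mysqld sections and the running processes all live under /usr/local/mysql.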
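This kind of mismatch is easy to check mechanically. The sketch below compares the install prefix of the binaries named in [mysqld_multi] with the prefix of the mysqld binary seen in the ps output. It assumes the common `<prefix>/bin/<program>` layout, and the names `install_prefix` and `prefix_mismatches` are made up for this example.

```python
from pathlib import PurePosixPath

# Values copied from the [mysqld_multi] section of /etc/my.cnf.
MULTI_SECTION = {
    "mysqld": "/data0/mysql/bin/mysqld_safe",
    "mysqladmin": "/data0/mysql/bin/mysqladmin",
}

# Binary path of the running server, from the ps output.
RUNNING_BINARY = "/usr/local/mysql/bin/mysqld"


def install_prefix(binary_path):
    """Install prefix of a binary, assuming a <prefix>/bin/<program> layout."""
    return str(PurePosixPath(binary_path).parent.parent)


def prefix_mismatches(multi_section, running_binary):
    """Return the [mysqld_multi] entries whose prefix differs from the running binary's."""
    running = install_prefix(running_binary)
    return {
        key: path
        for key, path in multi_section.items()
        if install_prefix(path) != running
    }


if __name__ == "__main__":
    print(install_prefix(RUNNING_BINARY))
    print(prefix_mismatches(MULTI_SECTION, RUNNING_BINARY))
```

Here both entries are reported, since they live under /data0/mysql while the running server comes from /usr/local/mysql.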
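So I started things once more, this time using the startup script under the configured directory and passing it options. It still did not give the result I wanted, but the log now pointed very clearly at the problem.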
# tail -f mysqld_multi.log
mysqld_multi log file version 2.16; run: Fri Jan 16 14:10:30 2015
Starting MySQL servers
150116 14:10:30 mysqld_safe Logging to '/usr/local/mysql/var/localhost.localdomain.err'.
150116 14:10:30 mysqld_safe Logging to '/usr/local/mysql/var/localhost.localdomain.err'.
cat: /usr/local/mysql/var/localhost.localdomain.pid: No such file or directory
150116 14:10:30 mysqld_safe Starting mysqld daemon with databases from /usr/local/mysql/var
150116 14:10:30 mysqld_safe Starting mysqld daemon with databases from /usr/local/mysql/var
mysqld_multi log file version 2.16; run: Fri Jan 16 14:15:39 2015
So I set up an environment variable for MySQL, and it worked immediately.
Starting MySQL servers
150116 14:15:39 mysqld_safe Logging to '/usr/local/mysql/data1/localhost.localdomain.err'.
150116 14:15:39 mysqld_safe Logging to '/usr/local/mysql/data2/localhost.localdomain.err'.
150116 14:15:39 mysqld_safe Starting mysqld daemon with databases from /usr/local/mysql/data1
150116 14:15:39 mysqld_safe Starting mysqld daemon with databases from /usr/local/mysql/data2
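With that, the problem was finally solved. The lesson: before trying to fix anything, find out what was done to the machine beforehand and what the environment actually looks like.
PS: if anyone knows what the following message means, I would appreciate some pointers. Thanks.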
localhost kernel: iTCO_wdt: Unexpected close, not stopping watchdog!