[Question] Server occasionally reboots on its own

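Author: LIAR (玻璃做的大叔)   2016-03-13 14:13:01
I'm running CentOS 6.7.
I'm sure I never scheduled a reboot. The crontab contains only three jobs: two
delete files older than one month from specific directories, and one runs
yum -y update. All of them run in the early morning.
The server has rebooted twice in the past month. At first I suspected a power
outage or an unstable power supply, but after checking the messages log it
looks like a normal, orderly reboot: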
Mar 9 17:00:45 ORZ kernel: r8169 0000:02:00.0: eth0: link down
Mar 9 17:03:38 ORZ kernel: r8169 0000:02:00.0: eth0: link up
Mar 9 17:03:51 ORZ init: tty (/dev/tty1) main process (2231) killed by TERM signal
Mar 9 17:03:51 ORZ init: tty (/dev/tty2) main process (2233) killed by TERM signal
Mar 9 17:03:51 ORZ init: tty (/dev/tty3) main process (2235) killed by TERM signal
Mar 9 17:03:51 ORZ init: tty (/dev/tty4) main process (2237) killed by TERM signal
Mar 9 17:03:51 ORZ init: tty (/dev/tty5) main process (2239) killed by TERM signal
Mar 9 17:03:51 ORZ init: tty (/dev/tty6) main process (2243) killed by TERM signal
Mar 9 17:03:52 ORZ abrtd: Got signal 15, exiting
Mar 9 17:03:53 ORZ xinetd[2038]: Exiting...
Mar 9 17:03:57 ORZ acpid: exiting
Mar 9 17:04:08 ORZ openvpn[1697]: /sbin/ip route del 192.168.100.0/24
Mar 9 17:04:08 ORZ openvpn[1697]: ERROR: Linux route delete command failed: external program exited with error status: 2
Mar 9 17:04:08 ORZ openvpn[1697]: Closing TUN/TAP interface
Mar 9 17:04:08 ORZ openvpn[1697]: /sbin/ip addr del dev tun0 local 192.168.100.1 peer 192.168.100.2
Mar 9 17:04:08 ORZ openvpn[1697]: Linux ip addr del failed: external program exited with error status: 2
Mar 9 17:04:08 ORZ init: Disconnected from system bus
Mar 9 17:04:08 ORZ openvpn[1697]: SIGTERM[hard,] received, process exiting
Mar 9 17:04:08 ORZ console-kit-daemon[31704]: WARNING: no sender#012
Mar 9 17:04:08 ORZ rpcbind: rpcbind terminating on signal. Restart with "rpcbind -w"
Mar 9 17:04:08 ORZ auditd[1490]: The audit daemon is exiting.
Mar 9 17:04:08 ORZ kernel: type=1305 audit(1457514248.550:7020): audit_pid=0 old=1490 auid=4294967295 ses=4294967295 subj=system_u:system_r:auditd_t:s0 res=1
Mar 9 17:04:08 ORZ kernel: type=1305 audit(1457514248.664:7021): audit_enabled=0 old=1 auid=4294967295 ses=4294967295 subj=system_u:system_r:auditctl_t:s0 res=1
Mar 9 17:04:08 ORZ kernel: Kernel logging (proc) stopped.
Mar 9 17:04:08 ORZ rsyslogd: [origin software="rsyslogd" swVersion="5.8.10" x-pid="1526" x-info="http://www.rsyslog.com"] exiting on signal 15.
Mar 9 17:07:14 ORZ kernel: imklog 5.8.10, log source = /proc/kmsg started.
Mar 9 17:07:14 ORZ rsyslogd: [origin software="rsyslogd" swVersion="5.8.10" x-pid="1520" x-info="http://www.rsyslog.com"] start
Mar 9 17:07:14 ORZ kernel: Initializing cgroup subsys cpuset
Mar 9 17:07:14 ORZ kernel: Initializing cgroup subsys cpu
So a shutdown command must have been issued, right? Also, last reports:
reboot system boot 2.6.32-573.18.1. Wed Mar 9 17:06 - 14:06 (3+20:59)
reboot system boot 2.6.32-573.18.1. Thu Mar 3 03:06 - 17:03 (6+13:57)
Why is it rebooting?
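
One way to tell an orderly reboot from a power cut or crash is to check whether each boot in the log is preceded by the shutdown sequence shown above. The following is only a rough sketch: it assumes the two marker strings copied from this excerpt ("Kernel logging (proc) stopped." and "Initializing cgroup subsys cpuset") appear the same way on your system, which may not hold for other rsyslog or kernel versions. The function name classify_boots is made up for illustration.

```python
import re
from typing import Iterable, List, Tuple

# Marker strings copied from the log excerpt in this post. Treat them as
# assumptions: other rsyslog or kernel versions may word these lines differently.
SHUTDOWN_MARKER = "Kernel logging (proc) stopped."
BOOT_MARKER = "Initializing cgroup subsys cpuset"

# Leading syslog timestamp, e.g. "Mar 9 17:07:14".
TIMESTAMP = re.compile(r"^(\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})")


def classify_boots(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (timestamp, "clean" or "unclean") for every boot marker found.

    A boot counts as "clean" when a shutdown marker was logged since the
    previous boot marker, and "unclean" otherwise.
    """
    results = []
    saw_shutdown = False
    for line in lines:
        if SHUTDOWN_MARKER in line:
            saw_shutdown = True
        elif BOOT_MARKER in line:
            match = TIMESTAMP.match(line)
            stamp = match.group(1) if match else "unknown"
            results.append((stamp, "clean" if saw_shutdown else "unclean"))
            saw_shutdown = False
    return results
```

To try it, pass an open log file, for example `classify_boots(open("/var/log/messages", errors="replace"))`. The first boot in a rotated log can show up as "unclean" simply because its shutdown lines are in the previous file, so check the older logs before drawing conclusions. A "clean" result only says the OS shut down in an orderly way; it does not say who or what requested it.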
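Author: mstar (Wayne Su)   2016-03-13 16:09:00
Maybe check the memory or the motherboard?
Author: johnjohnlin (嗯?)   2016-03-13 20:54:00
I once bought an HP Z620 that behaved similarly; it was fine after the vendor replaced the power supply.
Author: LIAR (玻璃做的大叔)   2016-03-13 22:06:00
Ugh... that's hard to cross-test when it happens so rarely!
Author: sixchen (建六)   2016-03-14 16:04:00
Could the machine be overheating? It looks like the reboot was triggered by hardware. On an HP server you can check the iLO log; on an IBM server, check the IMM log.
Author: LIAR (玻璃做的大叔)   2016-03-20 09:47:00
I'm on CentOS. I'll keep an eye on it for now and back up frequently in the meantime.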
