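IT Feature: Sending Nagios Alerts via hangups - IT001

Environment: CentOS 7.3. Packages: Nagios, hangups

hangups is a third-party package for sending messages through Google Hangouts.

It is written in Python 3.

CentOS 7, however, uses Python 2 as its default interpreter.

That leaves two ways to install it:

1) Switch the system to Python 3 outright, which may break packages that still rely on Python 2.

2) Install and build it inside a Python virtualenv.

This article goes with option 2.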
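
Since the system default is Python 2 and hangups needs Python 3, it can help to confirm which interpreter a shell is really using before going further. The snippet below is only a minimal sketch: the minimum version (3, 5) is an assumption taken from the versions used in this article, and the function names are made up for illustration.

```python
import sys

MIN_VERSION = (3, 5)  # assumption: the lowest Python version used in this article


def in_virtualenv():
    """Return True when the running interpreter belongs to a virtual environment."""
    # Older virtualenv releases set sys.real_prefix; venv and newer virtualenv
    # releases make sys.base_prefix differ from sys.prefix.
    return hasattr(sys, "real_prefix") or getattr(sys, "base_prefix", sys.prefix) != sys.prefix


def version_ok(version_info=None, minimum=MIN_VERSION):
    """Return True when the (major, minor) version is at least the minimum."""
    version_info = sys.version_info if version_info is None else version_info
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    print("Python version OK:", version_ok())
    print("Inside a virtualenv:", in_virtualenv())
```

Run it once with the system python and once after activating the virtualenv; the two results should differ.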

For installing virtualenv, see:

IT Casebook: Installing Python virtualenv

Step 1. Fetch the source from GitHub

# git clone https://github.com/tdryer/hangups.git

A new hangups directory should now be present.

# ls -l

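Create the virtualenv environment (Python 3.5 or Python 3.6 can be used)

Python 3.5 is the recommended choice. The command below was first run with 3.6, and the environment was later rebuilt with 3.5 in the same way, so point -p at whichever interpreter you have installed.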

# virtualenv -p /usr/local/bin/python3.6 Nagioshangups

Activate the virtualenv environment, then install hangups

#cd Nagioshangups

#source ./bin/activate

(Nagioshangups) # cd /root/hangups/

(Nagioshangups) # python3 setup.py install

Run the example program to list the current conversations

(Nagioshangups) # cd /root/hangups/examples/

(Nagioshangups) # python3 -d build_conversation_list.py

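Enter the Google Hangouts account that will be used.

The script then lists every account it finds inside that Hangouts account.

Since this account only contains itself so far, only one ID is returned.

One important point: the script reports that it cannot save the authentication token to refresh_token.txt.

So the file has to be created beforehand.

(At this point a fresh Python 3.5 environment was built to work around the problem; the steps are identical.)

First create the directory. Do this inside the environment where hangups is installed.

Normally, once this has been done, there is no need to log in again.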

(Nagioshangups)# mkdir -p /root/.cache/hangups

(Nagioshangups)# touch /root/.cache/hangups/refresh_token.txt
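
The same preparation can be scripted. This is a rough sketch rather than anything hangups itself provides: the function name is made up, and tightening the file mode to 0600 is my own assumption (the token grants access to the account, so keeping it private seems sensible).

```python
import os
import tempfile
from pathlib import Path


def ensure_token_file(cache_dir):
    """Create cache_dir and an empty refresh_token.txt inside it if they are missing."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)  # same effect as `mkdir -p`
    token_file = cache_dir / "refresh_token.txt"
    token_file.touch(exist_ok=True)               # same effect as `touch`
    os.chmod(token_file, 0o600)                   # assumption: owner-only access
    return token_file


if __name__ == "__main__":
    # Demonstrated in a temporary directory; on the server in this article
    # the argument would be "/root/.cache/hangups".
    with tempfile.TemporaryDirectory() as tmp:
        print(ensure_token_file(Path(tmp) / ".cache" / "hangups"))
```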

Then run it again:

(Nagioshangups) # python3 -d build_conversation_list.py

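This time the token derived from the account credentials is written to the file.

If the normal procedure was not followed and later operations fail as a result,

use the method below.

First copy the Python program from the reference page (it is used to obtain the token).

It must be run in an environment where hangups is installed.

Name the file the same way the reference does.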

=! hangups_manual_login.py =====

import os

import appdirs
import hangups
import requests

# Show the login URL; open it in a browser and sign in
print('Open this URL:')
print(hangups.auth.OAUTH2_LOGIN_URL)

authorization_code = input('Enter oauth_code cookie value: ')

# Exchange the authorization code for an access token and a refresh token
with requests.Session() as session:
    session.headers = {'user-agent': hangups.auth.USER_AGENT}
    access_token, refresh_token = hangups.auth._auth_with_code(
        session, authorization_code
    )

# Store the refresh token where hangups looks for it
dirs = appdirs.AppDirs('hangups', 'hangups')
token_path = os.path.join(dirs.user_cache_dir, 'refresh_token.txt')
hangups.auth.RefreshTokenCache(token_path).set(refresh_token)

=====!=====================

Then run it:

(Nagioshangups)# python3 hangups_manual_login.py

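The script prints a URL. Copy the whole URL to a machine with a graphical desktop

and open it in a browser.

The Google sign-in page appears.

After you enter the password and sign in, a "Please wait" page appears.

This page never redirects anywhere, so the browser's developer tools are needed.

Press F12 in the browser and select "Application".

Under Cookies, select the only URL listed.

Find the value of the oauth_code cookie, then copy and paste it into the terminal where the script is waiting.

However, this produced an error.

It again reports a problem writing the token to refresh_token.txt,

so the failures clearly all come from the token file not being created.

Create the directory first: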

(Nagioshangups)# mkdir -p /root/.cache/hangups

Create the file:

(Nagioshangups)# vim /root/.cache/hangups/refresh_token.txt

Paste in the oauth_code value obtained a moment ago, then save the file.

Then run the script again:

(Nagioshangups)# python3 hangups_manual_login.py

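Fetch the oauth_code value again; it may differ from the previous one.

Once the script finishes, it updates the token on its own.

With the token in place, any hangups command run in this environment no longer asks for the account and password.

Next, go to Hangouts and try adding a user and a group

to see what shows up.

Add a user first.

Entering another user's Gmail account is enough to find them.

Selecting the user automatically adds them to the group invitation list.

Enter a group name, then confirm with the check mark.

Once done, the user appears among the contacts.

At this stage build_conversation_list.py still cannot find the new user.

The group and user information only becomes retrievable after the first message has been sent in the group.

So type anything in the group.

Then run build_conversation_list.py again to fetch the data: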

(Nagioshangups) # python3 -d build_conversation_list.py

The output now includes the other user's account and the newly created group.

A group should normally have an ID as well;

if it never shows up, try the command below.

(Nagioshangups) #python3 build_conversation_list.py -d

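This prints more detailed output (here -d is passed to the script rather than to the Python interpreter).

The end of the output repeats the earlier result and also shows the ID of the new group.

Alternatively, search the output for conversation_id to find the group ID.

Try sending a test message with the Python script: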

(Nagioshangups) #python3.5 send_message.py --conversation-id [group ID] --message-text "[message text]"

Check that the message was delivered.

That completes the hangups preparation.

Leave the Python virtualenv environment:

(Nagioshangups) #deactivate

Now write send_message.sh.

It logs each message to /usr/local/nagios/send_message.log.

Once it is written, Nagios can call send_message.sh directly to send messages.

#vim send_message.sh

=!===

#!/bin/bash
# source and deactivate are bash features, so bash is used instead of sh

pythonEnv=[virtualenv_path]/bin/activate
hangupsPath=[download_path]/hangups/examples
hangupsSendshell=send_message.py
hangupsLog=/usr/local/nagios/send_message.log

conversationId=$1
messageText=$2

# Record what is about to be sent
echo "${conversationId} ${messageText}" >> "${hangupsLog}"

cd "${hangupsPath}"
source "${pythonEnv}"
python "${hangupsSendshell}" --conversation-id "${conversationId}" --message-text "${messageText}"
deactivate

===!=
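
For comparison, the same wrapper can be sketched in Python. This is not part of hangups or Nagios, just an illustration: both paths are placeholders you would have to adjust, and the function names are made up. Calling the virtualenv's own python binary directly avoids the activate/deactivate step, and passing the arguments as a list means no shell quoting is involved.

```python
import shlex
import subprocess
from pathlib import Path

# Assumptions for illustration only: adjust both paths to your installation.
VENV_PYTHON = Path("/path/to/Nagioshangups/bin/python")
SEND_SCRIPT = Path("/path/to/hangups/examples/send_message.py")


def build_command(conversation_id, message_text, python=VENV_PYTHON, script=SEND_SCRIPT):
    """Return the argv list that sends one message."""
    return [str(python), str(script),
            "--conversation-id", conversation_id,
            "--message-text", message_text]


def send(conversation_id, message_text, log_path):
    """Log the command, run it, and return its exit status."""
    argv = build_command(conversation_id, message_text)
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(" ".join(shlex.quote(arg) for arg in argv) + "\n")
    return subprocess.run(argv, cwd=str(SEND_SCRIPT.parent)).returncode
```

Nagios would then call a small script that passes its two arguments on to send().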

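Finally, configure the Nagios templates so that alerts get sent.

Write the contact definitions (this is where the recipients are set up).

When configuring Nagios, these have to be defined first.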

#vim /usr/local/nagios/etc/objects/contacts.cfg

=!===

#####!####

contact

#####!####

define contact{

contact_name nagiosadmin

use generic-contact

alias Nagios Admin

email nagios@localhost

}

# Add a new contact; add one for each Hangouts group

define contact{

contact_name it001_Nagios

use generic-contact

alias Hangouts Nagios

email Ugwuo…. ; Hangouts Group ID

}

#####!####

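In the same config file, add a contactgroup (name it as you like).

Its members must match a define contact entry above.

Add the contact defined above (it001_Nagios) to the group below.

Once this is set, any monitored host whose contact_groups lists this contactgroup

notifies this member (it001_Nagios) the moment a problem is detected,

which means the Hangouts group is informed through hangups.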

contactgroup

#####!####

define contactgroup{

contactgroup_name admins

alias Nagios Administrators

members nagiosadmin

}

define contactgroup{

contactgroup_name hangups

alias Nagios hangups

members it001_Nagios

}

===!=
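
To make the contact / contactgroup relationship concrete, here is a small sketch that looks for members that do not match any contact_name. Nagios performs this check itself during verification; the parser below is a deliberately simplified illustration (it does not handle every syntax Nagios accepts), and all function names are made up.

```python
import re

# Matches "define <type>{ ... }" blocks; good enough for simple files like the one above.
DEFINE_RE = re.compile(r"define\s+(\w+)\s*\{(.*?)\}", re.S)


def parse_objects(text):
    """Parse object definitions into a list of (type, attributes) pairs."""
    objects = []
    for obj_type, body in DEFINE_RE.findall(text):
        attrs = {}
        for line in body.splitlines():
            line = line.split(";", 1)[0].strip()  # drop trailing ';' comments
            if not line or line.startswith("#"):
                continue
            parts = line.split(None, 1)
            attrs[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
        objects.append((obj_type, attrs))
    return objects


def undefined_members(text):
    """Return {contactgroup_name: [members without a matching contact]}."""
    objects = parse_objects(text)
    contacts = {a["contact_name"] for t, a in objects
                if t == "contact" and "contact_name" in a}
    missing = {}
    for obj_type, attrs in objects:
        if obj_type != "contactgroup":
            continue
        members = [m.strip() for m in attrs.get("members", "").split(",") if m.strip()]
        unknown = [m for m in members if m not in contacts]
        if unknown:
            missing[attrs.get("contactgroup_name", "?")] = unknown
    return missing
```

An empty result means every member of every contactgroup has a matching define contact.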

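Write the commands

Add the usage of send_message.sh to the command definitions.

These define how notifications are delivered and what information they carry.

They are adapted from the email definitions, so there is likewise one for hosts and one for services.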

# vim /usr/local/nagios/etc/objects/commands.cfg

=!===

# 'notify-host-by-hangouts' command definition

define command{

command_name notify-host-by-hangouts

command_line [hangups路徑]/examples/send_message.sh $CONTACTEMAIL$ "Host: $HOSTALIAS$ $HOSTNAME$ State: $HOSTSTATE$ Address: $HOSTADDRESS$ Notes: $HOSTNOTES$ Info: $HOSTOUTPUT$ Time: $LONGDATETIME$"

}

# 'notify-service-by-hangouts' command definition

define command{

command_name notify-service-by-hangouts

command_line [hangups路徑]/examples/send_message.sh $CONTACTEMAIL$ "Service: $SERVICEDESC$ Host: $HOSTALIAS$ Address: $HOSTADDRESS$ State: $SERVICESTATE$ Info: $SERVICEOUTPUT$ DateTime: $LONGDATETIME$"

}

===!=
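
The $NAME$ items in command_line are Nagios macros that get replaced with real values when a notification fires. The sketch below only imitates that substitution to show what the message text ends up looking like; it is not how Nagios implements it, and the sample values are invented.

```python
import re

MACRO_RE = re.compile(r"\$([A-Z0-9_]+)\$")


def expand_macros(template, values):
    """Replace each $NAME$ with values[NAME]; unknown macros are left untouched."""
    return MACRO_RE.sub(lambda m: str(values.get(m.group(1), m.group(0))), template)


template = "Host: $HOSTALIAS$ $HOSTNAME$ State: $HOSTSTATE$ Address: $HOSTADDRESS$"
sample = {  # made-up values for illustration
    "HOSTALIAS": "Vigor Company01",
    "HOSTNAME": "vigor-001",
    "HOSTSTATE": "DOWN",
    "HOSTADDRESS": "192.0.2.10",
}
print(expand_macros(template, sample))
# prints: Host: Vigor Company01 vigor-001 State: DOWN Address: 192.0.2.10
```

Note that $CONTACTEMAIL$ is what carries the Hangouts group ID here, because the contact's email field was used to store it.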

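Monitoring commands can also be added to the command file,

designed to suit your own situation.

Here we download check_traffic.sh from the web and add the NRPE plugin for the checks.

The NRPE plugin is what lets Nagios call a remote server to run a check.

Both files need to be placed under /usr/local/nagios/libexec.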

=!==

# SNMP monitoring

define command{

command_name check_traffic

command_line $USER1$/check_traffic.sh -V 2c -C public -H $HOSTADDRESS$ -N $ARG1$ -w $ARG2$ -c $ARG3$ $ARG4$

}

# Use NRPE to confirm that the remote environment is running normally

define command{

command_name check_remote_running

command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c check_local_service -a "$ARG1$"

}

==!=

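Next, design the templates.

This part matters: many later problems come from mistakes made here,

because these definitions directly affect the values in the *.cfg files of the monitored hosts.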

# vim /usr/local/nagios/etc/objects/templates.cfg

These are, again, mostly edited from the sample file.

The contact section at the top needs changing: it originally sends email,

and we change it to use hangups.

=!===

define contact{

name generic-contact

service_notification_period 24x7

host_notification_period 24x7

service_notification_options w,u,c,r,f,s

host_notification_options d,u,r,f,s

service_notification_commands notify-service-by-hangouts

#service_notification_commands notify-service-by-email

host_notification_commands notify-host-by-hangouts

#host_notification_commands notify-host-by-email

}

# These correspond to the definitions in commands.cfg

===!=

After that come the host and service parts.

Host part: checking whether a host is alive

=!host====

define host{

name linux-server

use generic-host

check_period 24x7

check_interval 5

retry_interval 1

max_check_attempts 10

check_command check-host-alive

notification_period workhours

notification_interval 120

notification_options d,u,r

contact_groups hangups

}

define host{

name company-switch

use generic-host

check_period 24x7

check_interval 60

retry_interval 30

max_check_attempts 5

check_command check-host-alive

notification_period 24x7

notification_interval 3600

notification_options d,r

contact_groups hangups

}

define host{

name company-server

use generic-host

check_period 24x7

check_interval 300

retry_interval 60

max_check_attempts 10

check_command check_snmp!-C public -o sysUpTime.0

notification_period 24x7

notification_interval 3600

notification_options d,u,r

contact_groups hangups

}

# check_command refers to a command defined in the commands file.

# contact_groups refers to what was defined in contacts.cfg.

====!=

Service part: checking whether services are running correctly

=!service====

=!===

define service{

name generic-switch-service; The 'name' of this service template

active_checks_enabled 1 ; Active service checks are enabled

passive_checks_enabled 1 ; Passive service checks are enabled/accepted

parallelize_check 1 ; Active service checks should be parallelized (disabling this can lead to major performance problems)

obsess_over_service 1 ; We should obsess over this service (if necessary)

check_freshness 0 ; Default is to NOT check service 'freshness'

notifications_enabled 1 ; Service notifications are enabled

event_handler_enabled 1 ; Service event handler is enabled

flap_detection_enabled 1 ; Flap detection is enabled

process_perf_data 1 ; Process performance data

retain_status_information 1 ; Retain status information across program restarts

retain_nonstatus_information 1 ; Retain non-status information across program restarts

is_volatile 0 ; The service is not volatile

check_period 24x7 ; The service can be checked at any time of the day

max_check_attempts 3 ; Re-check the service up to 3 times in order to determine its final (hard) state

normal_check_interval 600 ; Check the service every 600 time units (one unit is interval_length seconds, 60 by default)

retry_check_interval 120 ; Re-check the service every 120 time units until a hard state can be determined

contact_groups hangups ; Notifications get sent out to everyone in the 'hangups' group

notification_options w,u,c,r ; Send notifications about warning, unknown, critical, and recovery events

notification_interval 3600 ; Re-notify about service problems every 3600 time units

notification_period 24x7 ; Notifications can be sent out at any time

}

define service{

name company-switch

use generic-switch-service

max_check_attempts 6

normal_check_interval 60

retry_check_interval 30

contact_groups hangups

notification_interval 3600

}

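name company-switch: the name of this service template

use generic-switch-service: inherits from the template defined above

(that template already notifies the hangups group)

contact_groups hangups: likewise sends detected problems to the hangups group

(set in contacts.cfg, matching define contactgroup).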

====!=
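
How use inheritance plays out can be sketched as a simple dictionary merge: the child starts from the parent's values and overrides whatever it sets itself. This is only a simplified model (it handles a single parent, while Nagios also accepts several), and the function name is made up.

```python
def resolve(templates, name):
    """Merge a template with what it inherits through 'use'; the child's values win."""
    own = templates[name]
    parent = own.get("use")
    merged = dict(resolve(templates, parent)) if parent in templates else {}
    merged.update({key: value for key, value in own.items() if key != "use"})
    return merged


templates = {
    "generic-switch-service": {
        "name": "generic-switch-service",
        "max_check_attempts": "3",
        "normal_check_interval": "600",
        "contact_groups": "hangups",
    },
    "company-switch": {
        "name": "company-switch",
        "use": "generic-switch-service",
        "max_check_attempts": "6",
        "normal_check_interval": "60",
    },
}

effective = resolve(templates, "company-switch")
print(effective["max_check_attempts"], effective["contact_groups"])
# prints: 6 hangups
```

So company-switch keeps contact_groups from the generic template even if it did not repeat it, while its own check settings replace the inherited ones.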

Monitored hosts:

The cfg files for monitored hosts also go in /usr/local/nagios/etc/objects.

(They are all loaded from nagios.cfg at the end.)

# vim /usr/local/nagios/etc/objects/company-switch.cfg

=!===

define host{

use company-switch ; Inherit default values from a template

host_name vigor-001 ; The name we're giving to this switch

alias Vigor Company01 ; A longer name associated with the switch

address X.X.X.X ; IP address of the switch

hostgroups hangups ; Host groups this switch is associated with

notes Line number (XXX)

}

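# use: which define host in the templates this host is monitored with

# host_name: shown in the Nagios web interface

# alias: a label for identification

# hostgroups: this one is different, it refers to a host group.

# It is configured later and is independent of the settings so far.

# notes: free-form remarks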

define service{

use company-switch

host_name vigor-001

service_description Uptime

check_command check_snmp!-C public -o sysUpTime.0

}

define service{

use company-switch

host_name vigor-001

service_description wan-wan2

check_command check_traffic!wan-wan1!40,5!45,10!-M -b

}

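The use here points to the define service in the templates,

so the service is checked with those settings and a hangups alert is sent when a problem is detected.

# check_command: !40,5 is the warning threshold and !45,10 is the critical threshold

# The trailing -M -b sets the unit to Mb

# More switch hosts can be added to the same file, as long as the definitions are kept clearly separated.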

===!=
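
The ! characters in check_command separate the command name from its arguments, which Nagios then hands to the command definition as $ARG1$, $ARG2$ and so on. The sketch below imitates that mapping for the check_traffic line above; the helper names are made up, and $USER1$ and $HOSTADDRESS$ are left as they are because they come from elsewhere.

```python
def split_check_command(check_command):
    """Split 'name!arg1!arg2...' into the command name and its argument list."""
    name, *args = check_command.split("!")
    return name, args


def fill_args(command_line, args):
    """Substitute $ARG1$, $ARG2$, ... in a command_line with the given arguments."""
    for index, value in enumerate(args, start=1):
        command_line = command_line.replace("$ARG%d$" % index, value)
    return command_line


name, args = split_check_command("check_traffic!wan-wan1!40,5!45,10!-M -b")
line = fill_args(
    "$USER1$/check_traffic.sh -V 2c -C public -H $HOSTADDRESS$ "
    "-N $ARG1$ -w $ARG2$ -c $ARG3$ $ARG4$",
    args,
)
print(name)
print(line)
```

The result shows that 40,5 lands on -w, 45,10 lands on -c, and -M -b is appended at the end.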

A separate file can also be used to manage other devices.

# vim /usr/local/nagios/etc/objects/company-server.cfg

=!===

define host{

use company-server

host_name Server001

alias XX Server 01

address X.X.X.X

hostgroups hangups

}

define service{

use company-server-service

host_name Server001

service_description Filesystem-/

check_command check_snmp_storage!/!85%!92%

notifications_enabled 0

}

# The design is much the same; the main difference is which command is used

===!=

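Special services can be split into additional files as well,

for example a *.cfg for checks run through NRPE:

=!===

# linux-extraService and linux-extraService-service used here both have to be created separately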

define host{

use linux-extraService

host_name company_extraService

alias extraServiceXXX

address X.X.X.X

hostgroups hangups

}

define service{

use linux-extraService-service

host_name company_extraService

service_description extraServiceXXX

check_command check_remote!-s XXX -u XXX -d XXX -p XXX -w 80 -c 90 -t advan

notifications_enabled 1

}

===!=

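One more important item remains.

The host definitions above refer to a hostgroup.

A dedicated config file is needed to define the groups; the hostgroups value used by a host has to match a hostgroup_name defined here.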

#vim /usr/local/nagios/etc/objects/hostgroups.cfg

=!===

define hostgroup{

hostgroup_name switches ; The name of the hostgroup

alias Network Switches ; Long name of the group

}

===!=

Finally, add the new company-switch.cfg, company-server.cfg and hostgroups.cfg

to nagios.cfg.

This file loads all the configuration.

Adding one is simple: append a line of the form cfg_file=<file path>.

=!===

cfg_file=/usr/local/nagios/etc/objects/hostgroups.cfg

cfg_file=/usr/local/nagios/etc/objects/company-server.cfg

cfg_file=/usr/local/nagios/etc/objects/company-switch.cfg

===!=
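
A typo in one of these paths is easy to make, so here is a small sketch that reads the text of nagios.cfg and reports cfg_file entries that do not exist on disk. Nagios reports this itself during verification; the function names are made up and this is only a convenience check.

```python
from pathlib import Path


def cfg_files(main_cfg_text):
    """Return the paths named by cfg_file= lines, skipping comments and other keys."""
    paths = []
    for raw in main_cfg_text.splitlines():
        line = raw.strip()
        if line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        if key.strip() == "cfg_file":
            paths.append(Path(value.strip()))
    return paths


def missing_cfg_files(main_cfg_text):
    """Return the cfg_file paths that are not existing files."""
    return [path for path in cfg_files(main_cfg_text) if not path.is_file()]
```

On the server it could be called as missing_cfg_files(Path("/usr/local/nagios/etc/nagios.cfg").read_text()); an empty list means every listed file was found.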

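Then use the Nagios command to check the configuration for errors.

Because nagios.cfg loads all of our config files, everything can be checked from here.

(contacts, commands, templates and the other files are checked along with it, since they are loaded the same way.)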

#/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg

If nothing is wrong, the check should report no errors.

Then restart the Nagios service and confirm that the hosts have been added.

#systemctl restart nagios
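
The two steps can be chained so that the restart only happens when verification passes. The sketch below uses the same two commands as above and relies on nagios -v returning a non-zero exit status when the configuration has errors; the function name and the run parameter (there to make it testable) are my own additions.

```python
import subprocess

VERIFY_CMD = ["/usr/local/nagios/bin/nagios", "-v", "/usr/local/nagios/etc/nagios.cfg"]
RESTART_CMD = ["systemctl", "restart", "nagios"]


def verify_then_restart(verify_cmd=VERIFY_CMD, restart_cmd=RESTART_CMD, run=subprocess.run):
    """Restart only when verification exits with status 0; return True if restarted."""
    if run(verify_cmd).returncode != 0:
        return False  # configuration has errors, leave the running service alone
    return run(restart_cmd).returncode == 0
```

This avoids taking down a working Nagios because of a typo in a newly added file.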

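Other hosts or switch devices are added the same way.

If notifications are not being sent, check each of the settings above for mistakes.

Nagios configuration is a chain in which each part depends on the others, and it is fairly complex, so that part is left for readers to try and work through themselves.

<End of article>

References: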

https://hangups.readthedocs.io/en/latest/installation.html

https://github.com/tdryer/hangups/issues/350#issuecomment-323553771
