EcAlarmSinkSNMP class documentation

Authors

Richard Frith-Macdonald (rfm@gnu.org)

Copyright: (C) 2012 Free Software Foundation, Inc.

Software documentation for the EcAlarmSinkSNMP class

EcAlarmSinkSNMP : EcAlarmDestination

Declared in:
EcAlarmSinkSNMP.h

The EcAlarmSinkSNMP class implements an alarm destination which terminates a chain of destinations by delivering alarms to an SNMP monitoring system rather than forwarding them to another EcAlarmDestination object.

The SNMP functionality of the EcAlarmSinkSNMP is essentially to maintain a table of active alarms and to send out traps indicating changes of state of that table as follows:

Alarms in the table are each identified by a unique numeric identifier, the 'notificationID' and traps concerning the alarms carry that ID.
For each alarm trap sent out, there is a corresponding trap sent when the alarm is cleared and removed from the table. The clear trap carries the same notification identifier as that of the alarm being cleared, and the pair (alarm/clear) is known as a correlation.
A clear trap is never sent without a corresponding alarm trap having first been sent.
Each trap carries a sequence number so that the SNMP manager is able to detect lost traps.

For purposes of establishing a correlation between any alarms coming in to the system, the managed object, event type, specific problem, and probable cause values must be the same. When this occurs, any matching alarms are given the same notification identifier as the first alarm.
When an alarm is added to the table, only one trap will to notify about it, and no further traps will be sent for any correlated alarms arriving unless those alarms have new perceived severity values.
If an alarm changes its severity, a trap will be sent to clear the alarm (removing it from the table) and the alarm with new severity will be added to the table and its trap sent.

To allow an SNMP manager to resynchronize with the agent, the manager may examine the table of active alarms. This may be needed where there is an agent error, manager error, or loss of connectivity/traps (jumps in the trap sequence number).
The entries in the table contain all the same information as the alarm traps, apart from the trap sequence number.

After a crash/shutdown and subsequent start-up of the agent, the active alarm table is updated with the current situation of the system before the manager is able to respond to the coldstart (see RFC 1157). When the manager receives the agent's coldstart it may being a resynchronization process gathering alarms related to the current situation of the system by examining the alarm table.
There is a resync flag variable which indicates whether a resynchronization process is being carried out or not (set to 1 if resync is in progress, 0 otherwise). Before sending a trap, the variable is checked to know whether the resynchronization process is active or not. The traps will only be sent to the manager if there is no resync in progress, otherwise they will be deferred until the resync has completed.
The manager may therefore set the resync flag to 1, read the contents of the active alarms table, and then set the resync flag to zero to resume active operation.
As a safety measure to protect against manager failure, the resync flag is automatically reset to zero if it is left set to 1 for more than five minutes.

In addition to the traps and the alarms table, a table of managed objects is also maintained. The system guarantees that any managed object referred to in a trap is present in the table before the trap is sent.

In order for the manager to know that the agent is running, a heartbeat trap is sent periodically. The heartbeat contains a notification identifier of zero (never used other than for a heartbeat).
A poll heartbeat interval variable is provided to allow the manager to control the time between heartbeats (in minutes) and may be set to a positive integer value, or queried.

The class works by connecting to an SNMP agent using the Agent-X protocol, and registering as the 'owner' of various OIDs in an SNMP MIB so that SNMP requests made to the agent for those OIDs are forwarded to the EcAlarmSinkSNMP, and so that the EcAlarmSinkSNMP can send SNMP 'traps' via the agent.
To do this, the agent must be configured to accept incoming Agent-X connections from the host on which the EcAlarmSinkSNMP is running, and the alarm sink object must be initialized using the -initWithHost:name: method to specify the host and port on which the agent is listening.
If the EcAlarmSinkSNMP object is initialized without a specific host and name then it assumes that the agent is running on localhost and the standard Agent-X tcp/ip port (705).

To configure a net-snmp agent to work with this software you need to:

Edit /etc/snmp/snmpd.conf to get it to send traps to snmptrapd...

 rwcommunity     public
 trap2sink       localhost public
 
and to accept agentx connections via tcp...
 agentxsocket    tcp:localhost:705
 master          agentx
 
Then restart with '/etc/rc.d/init.d/snmpd restart'

All alarming is done a based on three OIDs which may be defined within the user defaults system. The EcAlarmSinkSNMP instance is responsible for managing those OIDs (and anything below them in the OID hierarchy).
By default the OIDs used are those specified in the GNUstep MIB (GNUSTEP-MIB.txt), but you may override them as specified below:

AlarmsOID
All SNMP alarm data is in a fixed structure relative to this OID in the MIB.
The table of current alarms is at the OID 1 relative to AlarmsOID, so you can interrogate the table using;
snmpwalk -v 1 -c public localhost {AlarmsOID}.1
   
The resync flag is at OID 2 relative to AlarmsOID and can be queried or set using:
snmpget -v 1 -c public localhost {AlarmsOID}.2.0
   
snmpset -v 1 -c public localhost {AlarmsOID}.2.0 i 1
   
The current trap sequence number is at OID 3 relative to AlarmsOID and can be queried using:
snmpget -v 1 -c public localhost {AlarmsOID}.3.0
   
The heartbeat poll interval is at OID 4 relative to AlarmsOID and can be queried or set using:
snmpget -v 1 -c public localhost {AlarmsOID}.4.0
   
snmpset -v 1 -c public localhost {AlarmsOID}.4.0 i 5
   
ObjectsOID
All SNMP managed object values are in a table relative to this OID in the MIB.
SNMP tools are able to interrogate the table at this OID to see which managed objects are currently registered:
snmpwalk -v 1 -c public localhost {ObjectsOID}
   
TrapOID
The definition of the trap which carries alarms to any monitoring system is at this OID.

Each of these OID strings must be a representation of an OID in the integer dotted format (eg. 1.3.6.1.4.1.37374.3.0.1).

Method summary

alarmSinkSNMP 

+ (EcAlarmSinkSNMP*) alarmSinkSNMP;

Returns the singleton alarm sink instance.
This instance is the link between the Objective-C world and the net-snmp world which runs in a separate thread. You should stop the SNMP system by calling -shutdown before process termination.

If the agent is on another host and/or is using a non-standard port, you need to call -initWithHost:name: before and/or instead of calling this method.


initWithHost: name: 

- (id) initWithHost: (NSString*)host name: (NSString*)name;

Overrides the default behavior to specify a host and name for the SNMP agent this instance is to connect to.
The host must be the machine the agent is running on and the name must be the tcp/ip port on which that agent is listening for Agent-X connections.

If an instance has already been created/initialized, this method returns the existing instance and its arguments are ignored.