HP Management Agents for Servers (Server component)

Recovery

» Automatic Server Recovery
» Cooling and Temperature
» Remote Communications
» Firmware Inventory

Automatic Server Recovery

This section provides Automatic Server Recovery (ASR) configuration information, tells you when the server was last reset, and allows you to modify pager settings. You can modify the Status, ASR Reset Boot Option, Pager Status, Pager Dial String, and Pager Message settings.

The following items display on this window.

  • General Information

    • Status displays the status of ASR. The possible values are:

      • Enabled - ASR is enabled for this server.

      • Disabled - ASR is disabled for this server. To change this status, run the HP System Configuration Utility or perform a set on this item.

      • Not Available - ASR is not available for this server or your driver is not loaded. ASR is available only on operating systems using the ASR software support provided by HP.

      • Unknown - You may need to upgrade your support software and/or Server Agent. The Server Agent cannot determine the status.

      OID: 1.3.6.1.4.1.232.6.2.5.1 - cpqHeAsrStatus

    • Last Reset displays how the last server reset was performed. The following values are possible:

      • ASR - The last reset was performed by ASR. Check the Critical Error Log to determine what may have caused ASR.

      • ASR-Cleared - The last reset was performed by ASR. The degraded condition caused by the ASR reset has been cleared. Degraded ASR conditions can be cleared by selecting the [Clear ASR] button on the Auto Server Recovery window.

      • Manual - The last reset was performed manually.

      • Unknown - You may need to upgrade your driver software and/or Server Agent. The Server Agent cannot determine the status of the device.

      OID: 1.3.6.1.4.1.232.6.2.5.7 - cpqHeAsrReset

      If the last reset was an ASR reset, the ASR condition will be degraded.

    • Timeout displays how many minutes ASR will wait before initiating a recovery process. ASR depends on the software support to routinely notify the ASR hardware that the server is operating properly.

      OID: 1.3.6.1.4.1.232.6.2.5.4 - cpqHeAsrTimeout

      To change the timeout setting, use the HP System Configuration Utility. The time you specify for this field should be a prudent period of time before resetting the system and activating the recovery process after a fault occurs. If the timeout period is set too low on a heavily utilized server, the timeout could occur before the software support has time to service the timer.

    • ASR H/W Version displays the version of the hardware supporting ASR. Use this information for identification purposes.

      OID: 1.3.6.1.4.1.232.6.2.5.2 - cpqHeAsrMajorVersion

      OID: 1.3.6.1.4.1.232.6.2.5.3 - cpqHeAsrMinorVersion

  • Reboot

    • Reset Boot Option displays what the server will boot after an ASR reset occurs. When the recovery process is initiated, ASR will reset the server, test all memory, de-allocate any bad memory blocks, and page you (if a modem is present in the server and paging is enabled). The possible values are:

      • other

      • bootOs

      • bootUtilities

      OID: 1.3.6.1.4.1.232.6.2.5.8 - cpqHeAsrReboot

    • ASR Reset Limit displays the number of consecutive times that ASR will attempt recovery. The Automatic Server Recovery (ASR) feature can restart a server after a critical hardware or software error occurs. ASR will attempt the recovery process a limited number of consecutive times. You cannot change this number. If the server continues to experience hardware or software errors and the number of recovery cycles exceeds this limit, the server will log an error to the Critical Error Log and continue to boot the Utilities from the hard drive.

      Use the ASR Reset Limit feature in conjunction with the ASR Reset Count feature in the same window. The ASR Reset Count feature displays the number of times that ASR has rebooted the server. If the ASR Reset Count is approaching the reset limit, immediately investigate the server for problems by checking the Critical Error Log and running Diagnostics.

      OID: 1.3.6.1.4.1.232.6.2.5.9 - cpqHeAsrRebootLimit

    • ASR Reset Count displays how many times the ASR feature has rebooted the server. ASR will reboot (or reset) the server a limited number of times. If the ASR Reset Count is incremented, complete the following:

      1. Check the Critical Error Log to determine if a serious problem exists.

      2. If you suspect a software problem, consult your operating system documentation.

      3. If you suspect a hardware problem, run Diagnostics to determine if a problem exists.

      This count is reset to 0 when the system is reset manually.

      OID: 1.3.6.1.4.1.232.6.2.5.10 - cpqHeAsrRebootCount
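
The guidance above for Last Reset, ASR Reset Count, and ASR Reset Limit can be sketched as a small check. This is only an illustration: the function name, its parameters, and the `margin` used to decide when the count is "approaching" the limit are assumptions, not part of the agent software. The inputs are the values already shown on this window.

```python
def asr_triage(last_reset, reset_count, reset_limit, margin=1):
    """Suggest follow-up based on the values shown on the ASR window.

    last_reset   -- text of the Last Reset field, for example "ASR" or "Manual"
    reset_count  -- ASR Reset Count (cpqHeAsrRebootCount)
    reset_limit  -- ASR Reset Limit (cpqHeAsrRebootLimit)
    margin       -- hypothetical: how close to the limit counts as "approaching"
    """
    # An ASR reset, or any increment of the count, calls for a look at the
    # Critical Error Log (and Diagnostics if hardware is suspected).
    investigate = last_reset == "ASR" or reset_count > 0
    # Assumption: "approaching the reset limit" means within `margin` resets.
    urgent = reset_count > 0 and (reset_limit - reset_count) <= margin
    return {"investigate": investigate, "urgent": urgent}


if __name__ == "__main__":
    print(asr_triage("Manual", 0, 3))
    print(asr_triage("ASR", 2, 3))
```

When `urgent` is true, the section's advice applies directly: check the Critical Error Log and run Diagnostics immediately.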

Cooling and Temperature

This section displays details on the device environment. The following information is available.

  • System

    • Overall Thermal and Fan Status displays the overall condition of the system's thermal environment. The options are:

      • OK - The temperature status, CPU fan status, and system fan status are all OK.

      • Degraded - No status is Failed, and the temperature status, CPU fan status, or system fan status is Degraded.

        Do not operate the system with the cover removed for longer than the recommended service time. Proper airflow is possible only when the cover is in place and properly secured.

      • Failed - The temperature status, CPU fan status, or system fan status is Failed.

        A Failed condition is not normally reported by a running system, because power to the system is cut off if the thermal condition reaches a permanently damaging level.

      • Unknown - The thermal features are not supported.

      OID: 1.3.6.1.4.1.232.6.2.6.1 - cpqHeThermalCondition

  • Temperature Sensors displays an entry for each temperature sensor in the system.

    • Sensor displays the status and index of the sensor. The status of each sensor can be:

      • Unknown - Temperature could not be determined.

      • OK - The temperature sensor is within normal operating range.

      • Degraded - The temperature sensor is outside of normal operating range.

      • Failed - The temperature sensor detects a condition that could permanently damage the system.

      OID (status): 1.3.6.1.4.1.232.6.2.6.8.1.6 - cpqHeTemperatureCondition

      OID (index): 1.3.6.1.4.1.232.6.2.6.8.1.2 - cpqHeTemperatureIndex

    • Location displays the location of the temperature sensor in the system.

      OID: 1.3.6.1.4.1.232.6.2.6.8.1.3 - cpqHeTemperatureLocale

    • Temperature displays the current temperature sensor reading in degrees Celsius. If this value cannot be determined by software, "N/A" is displayed.

      OID: 1.3.6.1.4.1.232.6.2.6.8.1.4 - cpqHeTemperatureCelsius

    • Threshold displays the shutdown threshold setting of the temperature sensor in degrees Celsius. This is the temperature at which the sensor is considered to be in a failed state, causing the system to shut down. If this value cannot be determined by software, "N/A" is displayed.

      OID: 1.3.6.1.4.1.232.6.2.6.8.1.5 - cpqHeTemperatureThreshold

    • Type displays the type of this instance of temperature sensor. This value will be one of the following:

      • Unknown - Temperature threshold type could not be determined.

      • Blowout - If a blowout temperature sensor reaches its threshold, the fan or fans in the area of the temperature sensor will increase in speed in an attempt to reduce the temperature before a caution or critical threshold is reached.

      • Caution - If a caution temperature sensor reaches its threshold, its status will be set to degraded and the system will either continue or shut down, depending on the setting of the "Thermal Degraded Action" field.

      • Critical - If a critical temperature sensor reaches its threshold, its status will be set to failed and the system will shut down.

      OID: 1.3.6.1.4.1.232.6.2.6.8.1.7 - cpqHeTemperatureThresholdType

  • Enclosure Fans - Information about enclosure fans on server blades is available only through Onboard Administrator.

  • Fault Tolerant Fan Groups displays an entry for each of the device or system processor fans.

    • Group Redundancy displays whether the fan group is currently operating in redundant mode. The values are:

      • Redundant

      • Not Redundant

      • Unknown

      OID: 1.3.6.1.4.1.232.6.2.6.7.1.7 - cpqHeFltTolFanRedundant

    • Fan displays the status and index of the fan. The status of each fan can be:

      • OK - The fan is operational.

      • Failed - The fan has failed. The device will shut down automatically to prevent damage to hardware or data loss. Replace the fan.

      • Unknown - The Server Agent cannot determine the status of this setting. You may need to upgrade your driver software or Server Agent.

      OID (status): 1.3.6.1.4.1.232.6.2.6.7.1.9 - cpqHeFltTolFanCondition

      OID (index): 1.3.6.1.4.1.232.6.2.6.7.1.2 - cpqHeFltTolFanIndex

    • Location displays a text description of the hardware location for the fan.

      OID: 1.3.6.1.4.1.232.6.2.6.7.1.11 - cpqHeFltTolFanHwLocation

    • Type displays the type of the fan:

      • Unknown - The type of fan could not be determined.

      • Tach Output - The fan can increase speed for greater cooling. Implies spin detect.

      • Spin Detect - The fan can detect when it stops spinning.

      OID: 1.3.6.1.4.1.232.6.2.6.7.1.5 - cpqHeFltTolFanType

    • Present specifies if the fan described is present in the system. This column will display "Unknown" if this information cannot be determined.

      OID: 1.3.6.1.4.1.232.6.2.6.7.1.4 - cpqHeFltTolFanPresent

    • Hot Pluggable indicates if the fan is capable of being removed and/or inserted while the system is in an operational state. This value will be one of the following:

      • Unknown - The state could not be determined.

      • Not Hot Plug - The fan is not hot plug capable.

      • Hot Plug - The fan is hot plug capable and can be removed if the system is operating in a redundant state. A fan may be added to an empty fan bay.

      OID: 1.3.6.1.4.1.232.6.2.6.7.1.10 - cpqHeFltTolFanHotPlug

    • Current Speed specifies the current speed of a fan in revolutions per minute (rpm). This field will contain "N/A" if the current speed cannot be determined.

      OID: 1.3.6.1.4.1.232.6.2.6.7.1.12 - cpqHeFltTolFanCurrentSpeed

    • Speed specifies the speed of the fan. This field will contain "N/A" if the speed cannot be determined.

      OID: 1.3.6.1.4.1.232.6.2.6.7.1.6 - cpqHeFltTolFanSpeed
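
The status rules described in this section can be sketched in a few lines. The first function rolls the temperature, CPU fan, and system fan statuses up into the Overall Thermal and Fan Status; the second describes the response when a sensor of a given Type reaches its threshold. Both function names are hypothetical, the inputs are the text values shown on this window, and two points are assumptions rather than documented behavior: any input other than OK, Degraded, or Failed yields Unknown, and "reaches its threshold" is read as a reading greater than or equal to the threshold.

```python
def overall_thermal_condition(temperature, cpu_fan, system_fan):
    """Roll three component statuses up into one overall status."""
    statuses = (temperature, cpu_fan, system_fan)
    if "Failed" in statuses:
        return "Failed"          # any failed component fails the whole
    if "Degraded" in statuses:
        return "Degraded"        # nothing failed, but something is degraded
    if all(status == "OK" for status in statuses):
        return "OK"
    return "Unknown"             # assumption: anything else is Unknown


def threshold_response(threshold_type, reading_c, threshold_c,
                       degraded_action="continue"):
    """Describe the response when a sensor reading reaches its threshold.

    degraded_action stands in for the "Thermal Degraded Action" setting.
    """
    if reading_c < threshold_c:
        return "none"
    if threshold_type == "Blowout":
        return "increase fan speed"
    if threshold_type == "Caution":
        # Sensor status becomes Degraded; the outcome follows the setting.
        if degraded_action == "shutdown":
            return "shutdown"
        return "continue (degraded)"
    if threshold_type == "Critical":
        # Sensor status becomes Failed and the system shuts down.
        return "shutdown"
    return "unknown"


if __name__ == "__main__":
    print(overall_thermal_condition("OK", "Degraded", "OK"))
    print(threshold_response("Caution", 70, 70, degraded_action="shutdown"))
```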

Remote Communications

This section displays details about the status of the Integrated Remote Console and the Rapid Recovery communications configuration. The following fields display:

  • Integrated Remote Console

    • Status indicates whether the Integrated Remote Console is supported and enabled. Possible values include Not Supported, Unknown, Enabled, and Disabled.

      • Not Supported - Integrated Remote Console is not present on this device.

      • Unknown - The status of Integrated Remote Console cannot be determined.

      • Enabled - Integrated Remote Console is present and enabled.

      • Disabled - Integrated Remote Console is present but disabled. Three things can cause Integrated Remote Console to be disabled even though you enabled it:

        1. The COM port for which IRC is configured does not exist.

        2. The COM port for which IRC is configured is a PCI device.

        3. The IRQ for which IRC is configured does not match the COM port for which IRC is configured.

      OID: 1.3.6.1.4.1.232.6.2.10.1 - cpqHeIRCStatus

  • Remote PC Communications to System Configuration Utilities

    • Network Access displays the status of the ASR Network Remote Console feature. The following values may display in this field.

      • Enabled - Remote Console network access is enabled. If the server ASR reboots to HP System Configuration Utility (see Reset Boot Option in Automatic Server Recovery Window) or if you reboot to HP System Configuration Utility from HP Systems Insight Manager by pressing the [Reboot] button in the Device View Window, then network remote access is enabled. You may access HP System Configuration Utility through Remote Console.

      • Disabled - Remote Console network access is not enabled.

      • Unknown - The Server Agent cannot determine the status of this setting. You may need to upgrade your driver software or Server Agent.

      OID: 1.3.6.1.4.1.232.6.2.5.21 - cpqHeAsrNetworkAccessStatus

    • Dial-In Status displays whether the ASR feature will put the modem into auto-answer mode after an ASR reboot.

      The following values may be displayed in this field.

      • Enabled - Remote Console dial-in access is enabled by putting the modem into auto-answer mode. If the server ASR reboots to HP System Configuration Utility (see Reset Boot Option in Automatic Server Recovery Window) or if you reboot to HP System Configuration Utility from HP Systems Insight Manager by pressing the Reboot button in the Device View Window, then modem remote access is enabled. You may access HP System Configuration Utility via Remote Console using a modem connection.

      If you have enabled Dial-Out Status, a dial-out connection will be attempted first. If that connection fails, then dial-in access is enabled. If the dial-out connection is successful, then dial-in is enabled after that connection is terminated.

      • Disabled - This feature is not enabled. ASR will not put the modem in auto-answer mode.

      • Unknown - The Server Agent cannot determine the status of this setting. You may need to upgrade your driver software or Server Agent.

      OID: 1.3.6.1.4.1.232.6.2.5.18 - cpqHeAsrDialInStatus

    • Dial-Out Status

      After the ASR feature has attempted to deliver an alarm by means of the pager, if the Dial-Out Status is enabled and a proper Dial-Out String has been provided, ASR will dial a remote PC. When a session is established, the server administrator can use a third-party terminal emulation program to run the HP System Configuration Utility to diagnose the problem.

      Possible values are:

      • Enabled - ASR will dial the Dial-Out String and attempt to set up a connection to a remote PC. ASR will attempt the connection five times. If a connection is not established and the Dial-In Status is enabled, ASR will put the modem into auto-answer mode so that the server administrator can dial-in.

      • Disabled - This feature is not enabled. ASR will not attempt a remote connection. However, if the Dial-In Status is enabled, ASR will put the modem into auto-answer mode so that the server administrator can dial in.

      • Unknown - The Server Agent cannot determine the status of this setting. You may need to upgrade your driver software or Server Agent.

      OID: 1.3.6.1.4.1.232.6.2.5.19 - cpqHeAsrDialOutStatus

    • Dial-Out String

      After the ASR feature has attempted to deliver an alarm by means of the pager, if the Dial-Out Status is enabled and a proper Dial-Out String is provided in this field, ASR will attempt to dial a remote PC. When a session is established, the system administrator can use a third-party terminal emulation program to run the HP System Configuration Utility to diagnose the problem.

      OID: 1.3.6.1.4.1.232.6.2.5.20 - cpqHeAsrDialOutNumber

    • Serial Interface displays the communication port that is enabled for use with the ASR feature. For example, this port might be Serial Port 1. ASR will use this port to page the system administrator, and the administrator will use this port when dialing in to the device.

      OID: 1.3.6.1.4.1.232.6.2.5.13 - cpqHeAsrCommPort
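
The interaction between Dial-Out Status, Dial-Out String, and Dial-In Status described above can be modeled as a short sequence. The sketch below is a toy model, not the ASR firmware: the function name, the event strings, and the `try_connect` callable (which stands in for the modem placing a call) are all hypothetical. It starts after the pager attempt and treats a successful dial-out session as ending immediately.

```python
MAX_DIAL_OUT_ATTEMPTS = 5  # "ASR will attempt the connection five times"


def asr_remote_access(dial_out_enabled, dial_out_string, dial_in_enabled,
                      try_connect):
    """Return the ordered list of modem events after an ASR reboot.

    try_connect -- hypothetical callable taking the dial string and
                   returning True when a session is established.
    """
    events = []
    # Dial-out runs first, and only when it is enabled and a string is set.
    if dial_out_enabled and dial_out_string:
        for attempt in range(1, MAX_DIAL_OUT_ATTEMPTS + 1):
            events.append(f"dial-out attempt {attempt}")
            if try_connect(dial_out_string):
                events.append("dial-out session established")
                events.append("dial-out session terminated")
                break
    # Dial-in follows either a failed dial-out or the end of a successful
    # session; with dial-out disabled it is the only step.
    if dial_in_enabled:
        events.append("modem set to auto-answer")
    return events


if __name__ == "__main__":
    # Placeholder dial string; every call fails in this run.
    for event in asr_remote_access(True, "<dial-out string>", True,
                                   lambda number: False):
        print(event)
```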

Firmware Inventory

This section displays the information about the firmware inventory in the system. The following information is displayed:

Firmware Versions

  • Index displays a unique row number for each firmware version entry.

    OID: 1.3.6.1.4.1.232.2.2.5.4.1.1 - cpqSiFirmwareRevIndex

  • Description displays the description of the firmware.

    OID: 1.3.6.1.4.1.232.2.2.5.4.1.2 - cpqSiFirmwareRevDesc

  • Revision displays the version of the firmware in the system.

    OID: 1.3.6.1.4.1.232.2.2.5.4.1.3 - cpqSiFirmwareRevString

  • Location displays the location of the firmware component.

    OID: 1.3.6.1.4.1.232.2.2.5.4.1.5 - cpqSiFirmwareLocation

  • Status displays the status of the firmware. The status values are:

    • Active

    • Inactive

    • Unknown

    OID: 1.3.6.1.4.1.232.2.2.5.4.1.6 - cpqSiFirmwareStatus
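
When these values are read over SNMP rather than from this page, each table cell is normally addressed as the column OID followed by the row index. The sketch below groups already-retrieved (OID, value) pairs into one record per firmware entry. It assumes the table is indexed by the single Index column and that the pairs arrive as plain strings; the function name is hypothetical and the sample values are placeholders, not real inventory data.

```python
# Column OIDs copied from the list above.
FIRMWARE_COLUMNS = {
    "1.3.6.1.4.1.232.2.2.5.4.1.1": "index",
    "1.3.6.1.4.1.232.2.2.5.4.1.2": "description",
    "1.3.6.1.4.1.232.2.2.5.4.1.3": "revision",
    "1.3.6.1.4.1.232.2.2.5.4.1.5": "location",
    "1.3.6.1.4.1.232.2.2.5.4.1.6": "status",
}


def group_firmware_rows(varbinds):
    """Group (oid, value) pairs into {row_index: {column_name: value}}."""
    rows = {}
    for oid, value in varbinds:
        # Split "<column OID>.<row index>" at the last dot.
        column_oid, _, row = oid.rpartition(".")
        name = FIRMWARE_COLUMNS.get(column_oid)
        if name is None:
            continue  # not a column of this table
        rows.setdefault(int(row), {})[name] = value
    return rows


if __name__ == "__main__":
    sample = [  # placeholder values for illustration only
        ("1.3.6.1.4.1.232.2.2.5.4.1.2.1", "<description 1>"),
        ("1.3.6.1.4.1.232.2.2.5.4.1.3.1", "<revision 1>"),
        ("1.3.6.1.4.1.232.2.2.5.4.1.2.2", "<description 2>"),
    ]
    for index, fields in sorted(group_firmware_rows(sample).items()):
        print(index, fields)
```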