HP Server Manageability Extensions Provider for Storage

Physical Drive Information

This section provides an overview of all disk drives attached to the controller. Each physical drive is listed as a separate entry in the Mass Storage submenu, along with its condition, location, and size. Select any physical drive to display more information about it. The following information is displayed:

  • Status indicates the status of the physical drive. The possible values are:

    • OK- The drive is functioning properly.

    • Failed- The drive is no longer operating and should be replaced.

    • Unknown- The physical drive cannot be monitored at this time. Possible causes:

      The device driver for this drive has been unloaded.

      The logical drive has failed and been deactivated by the operating system. In this case, the last known status was OK.

      The Storage Agents do not recognize the drive. You may need to upgrade your software.

    • Stressed indicates that the element is functioning, but needs attention. Examples of "Stressed" states are overload, overheated, and so on.

    • Predictive Failure indicates that an element is functioning nominally but predicting a failure in the near future.

    • In Service describes an element being configured, maintained, cleaned, or otherwise administered.

    • No Contact indicates that the monitoring system has knowledge of this element, but has never been able to establish communications with it.

    • Lost Communication indicates that the Managed System Element is known to exist and has been contacted successfully in the past, but is currently unreachable.

    • Stopped and Aborted are similar, although the former implies a clean and orderly stop, while the latter implies an abrupt stop where the state and configuration of the element might need to be updated.

    • Dormant indicates that the element is inactive or quiescent.

    • Supporting Entity in Error indicates that this element might be "OK" but that another element, on which it is dependent, is in error. An example is a network service or endpoint that cannot function due to lower-layer networking problems.

    • Completed indicates that the element has completed its operation. This value should be combined with OK, Error, or Degraded so that a client can tell whether the operation Completed with OK (passed), Completed with Error (failed), or Completed with Degraded (the operation finished, but it neither completed with OK nor reported an error).

    • Power Mode indicates that the element has additional power model information contained in the Associated PowerManagementService association.

  • Model displays a description of the physical drive. The text depends on the manufacturer of the drive and the drive type.

    If a drive fails, note the model to identify the type of drive necessary for replacement.

  • Firmware Version- A string representing the complete version information, for example, '12.1(3)T'. This string and the numeric major/minor/revision/build properties are complementary. Because version representations and semantics vary widely, no single representation is assumed to serve both a client that performs computations (which needs numeric values) and a user who must recognize the version (which needs readable values). Hence, both numeric and string representations of the version are provided.

  • Serial Number- A manufacturer-allocated number used to identify the physical drive.

  • Service Hours displays the current number of hours of service (the number of hours that a physical drive has been spinning) since the drive was stamped. The drive was stamped when it left the factory.

    For example, if the Service Hours value is 604, the drive has been operating for 604 hours. If an error occurred at 499 Service Hours, it occurred after 499 hours of service.

  • Capacity displays the size of the physical drive in megabytes. For example, 120 indicates that the physical drive is 120 megabytes.

  • Rotational Speed indicates the rotational speed of the drive in revolutions per minute.

  • Drive Type indicates the type of physical drive. The following values are valid:

    • SCSI- The physical drive is a parallel SCSI drive.

    • SATA- The physical drive is a SATA drive.

    • SAS- The physical drive is a SAS drive.

    • Unknown- The Storage provider cannot determine the drive type.

  • Drive Configuration- Enumeration of types that describe the drive configuration. The enumeration's values are:

    • Unknown

    • Unconfigured

    • Data

    • Spare

    • Non RAID
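
As a worked illustration of the Capacity and Service Hours fields, the following Python sketch converts a capacity reported in megabytes to gigabytes and computes how many service hours have elapsed since a recorded error. The function names are hypothetical and are not part of the provider. The conversion assumes 1 GB = 1024 MB, which this help text does not specify.

```python
def capacity_gb(capacity_mb: int, mb_per_gb: int = 1024) -> float:
    """Convert a Capacity value (megabytes) to gigabytes.

    Assumption: 1 GB = 1024 MB; pass mb_per_gb=1000 for decimal units.
    """
    return capacity_mb / mb_per_gb


def hours_since_error(current_service_hours: int, error_service_hours: int) -> int:
    """Return the service hours elapsed between a recorded error and now."""
    if error_service_hours > current_service_hours:
        raise ValueError("error cannot be later than the current service hours")
    return current_service_hours - error_service_hours


# 604 and 499 are the Service Hours values used in the example above.
print(hours_since_error(604, 499))  # 105

# 2048 MB is an arbitrary sample capacity chosen for illustration.
print(capacity_gb(2048))  # 2.0
```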

Paths

For multipath-capable hardware, this section displays the status and role of each data path to the physical drive. It is not displayed for other hardware.

  • Path- Each path is identified by a descriptor. For example, "Port 2E Box 1 Bay 4" indicates a path from the host adapter external port number 2 ("Port 2E") to the fourth bay in the first box.

  • Status indicates the status of the data path. Possible values are:

    • OK - The path is operational.

    • PATH ERROR - The path is not operational.

    • UNKNOWN - The path status cannot be determined.

  • Role indicates the role of the data path. Possible values are:

    • active - This path is the preferred data path to the physical drive.

    • passive - This path is the alternate data path to the physical drive.

    • Path Error - This path is not operational.

    • unknown - The role of this path cannot be determined.
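
A script that post-processes path descriptors might split them into port, box, and bay. The sketch below is a minimal example that assumes every descriptor follows the "Port <label> Box <n> Bay <n>" layout of the example above; other layouts may exist and would be rejected. The names DrivePath and parse_path are hypothetical.

```python
import re
from typing import NamedTuple


class DrivePath(NamedTuple):
    port: str  # port label as displayed, for example "2E"
    box: int
    bay: int


# Assumed layout, taken from the single example descriptor in this section.
_PATH_RE = re.compile(r"^Port\s+(\w+)\s+Box\s+(\d+)\s+Bay\s+(\d+)$")


def parse_path(descriptor: str) -> DrivePath:
    """Split a path descriptor into its port label, box number, and bay number."""
    match = _PATH_RE.match(descriptor.strip())
    if match is None:
        raise ValueError(f"unrecognized path descriptor: {descriptor!r}")
    port, box, bay = match.groups()
    return DrivePath(port=port, box=int(box), bay=int(bay))


path = parse_path("Port 2E Box 1 Bay 4")
print(path.port, path.box, path.bay)  # 2E 1 4
```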

Identify Drive

Select the length of time to identify the physical drive from the drop-down list box and then select the Start button. The page automatically refreshes and displays an image of an identified drive and a Stop button. Select the Stop button to end identification before the time expires.

After the drive identification completes, refresh the page manually to display the Start button again. Depending on the length of the HP Insight Management Agents data collection interval, there may be a delay before the Start button is displayed.

Only drives in hot plug trays are supported because the LEDs are part of the tray. Only one drive on a selected controller can be identified at a time. If you select a different drive while another drive is being identified, the other drive stops identification and the selected drive is identified.
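
The one-drive-at-a-time rule can be modeled as a small piece of per-controller state. The sketch below only illustrates the behavior described above; it does not call the provider or control any LED, and the class name and drive labels are hypothetical.

```python
from typing import Optional


class IdentifyState:
    """Tracks which drive, if any, is being identified on one controller."""

    def __init__(self) -> None:
        self.active_drive: Optional[str] = None

    def start(self, drive: str) -> Optional[str]:
        """Begin identifying `drive` and return the drive that was stopped, if any."""
        previous = self.active_drive if self.active_drive != drive else None
        self.active_drive = drive
        return previous

    def stop(self) -> None:
        """End identification before the selected time expires."""
        self.active_drive = None


state = IdentifyState()
state.start("Bay 1")
stopped = state.start("Bay 2")  # identifying Bay 2 stops Bay 1
print(stopped, state.active_drive)  # Bay 1 Bay 2
```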

Logical Drive Information

Select one of the listed logical drives to see more information about the drive.

Problem Indicators

Use the Problem Indicators to determine when a drive failure has occurred that may be correctable without replacing the drive. The Problem Indicators are:

  • Fail Recovery Reads shows the number of read errors that occurred while Automatic Data Recovery was being performed from this physical drive to another drive. If a read error occurs, Automatic Data Recovery stops.

  • Other Timeouts shows the number of Other Timeouts recorded for the physical drive.

  • SCSI Bus Faults displays the number of SCSI Bus Faults.

Failure Indicators

Use the Failure Indicators to determine the cause of a drive failure. Typically, the number of failures is zero when the drive is operating normally. If a counter is not zero and the drive has not failed, there could be an intermittent problem that may require the drive to be replaced. The Failure Indicators are:

  • Spinup Errors- When the physical drive fails due to the failure of a spin-up command, a Spinup Error occurs. If the failure count is not zero and the drive has failed, replace the drive.

    If the counter is not zero and the drive is OK (has not failed), there may be an intermittent problem that requires drive replacement. If you observe that the count is increasing over time, replace the drive.

  • Aborted Commands- The Aborted Commands counter records the number of times that a physical SCSI drive returned an Aborted Command status when a SCSI command was attempted. This error count indicates unsuccessful termination of the SCSI command. When the physical drive fails because of aborted commands that could not be retried successfully, Aborted Commands errors occur. If the number of errors is not zero and the drive has failed, replace the drive.

    If the counter is not zero and the drive is OK (has not failed), there may be an intermittent problem that requires drive replacement. If you observe that the count is increasing over time, replace the drive.

  • Media Failures- When this physical drive fails due to unrecoverable media errors, a Media Failure occurs.

    If the number of media failure errors is not zero and the drive has failed, replace the drive. If the counter is not zero and the drive is OK (has not failed), there may be an intermittent problem that requires drive replacement. If you observe that the count is increasing over time, replace the drive.

  • Format Errors- When a format operation fails because the controller was unable to remap a bad sector, a Format Error occurs.

    If the number of format errors is not zero and the drive has failed, replace the drive. If the counter is not zero and the drive is OK (has not failed), there may be an intermittent problem that requires drive replacement. If you observe that the count is increasing over time, replace the drive.

  • Hardware Errors- The Hardware Errors counter records the number of times that a physical SCSI drive returned a Hardware Error status when a SCSI command was attempted. This error status indicates unsuccessful termination of the SCSI command. The controller typically retries this command several times before failing the drive.

    If the number of hardware errors is not zero and the drive has failed, replace the drive. If the counter is not zero and the drive is OK (has not failed), there may be an intermittent problem that requires drive replacement. If you observe that the count is increasing over time, replace the drive.

  • Not Ready Errors- When a physical drive returns a not ready status when it should be ready, a Drive Not Ready Error occurs. This error could occur if a drive spins down unexpectedly or if the drive never becomes ready after the spin up command is issued.

    If the number of not ready errors is not zero and the drive has failed, replace the drive. If the counter is not zero and the drive is OK (has not failed), there may be an intermittent problem that requires drive replacement. If you observe that the count is increasing over time, replace the drive.

  • Bad Target Errors- When a physical drive performs an action that does not conform to the SCSI-2 port protocol, a Bad Target Error occurs and the SCSI port is reset.

    If the number of bad target errors is not zero and the drive has failed, replace the drive. If the counter is not zero and the drive is OK (has not failed), there may be an intermittent problem that requires drive replacement. If you observe that the count is increasing over time, replace the drive.

  • Fail Recovery Writes shows the number of write errors that occurred while Automatic Data Recovery was being performed to this physical drive. If a write error occurs, Automatic Data Recovery stops. These errors indicate that the physical drive has failed.

    If the number of fail recovery writes is not zero and the drive has failed, replace the drive. If the counter is not zero and the drive is OK (has not failed), there may be an intermittent problem that requires drive replacement. If you observe that the count is increasing over time, replace the drive.
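
The replacement guidance repeated for each Failure Indicator follows the same pattern, which can be summarized as a small decision function. This is a sketch of that guidance only; the function name, parameters, and returned strings are hypothetical, and a caller would have to supply an earlier reading of the counter to detect an increase.

```python
from typing import Optional


def failure_counter_advice(
    count: int,
    drive_failed: bool,
    previous_count: Optional[int] = None,
) -> str:
    """Apply the Failure Indicator guidance to one counter reading.

    previous_count is an earlier reading of the same counter, if one is known.
    """
    if count == 0:
        return "no action"
    if drive_failed:
        return "replace drive"
    # Drive is OK but the counter is not zero.
    if previous_count is not None and count > previous_count:
        return "replace drive (count increasing over time)"
    return "monitor (possible intermittent problem)"


print(failure_counter_advice(0, drive_failed=False))
print(failure_counter_advice(3, drive_failed=True))
print(failure_counter_advice(3, drive_failed=False))
print(failure_counter_advice(5, drive_failed=False, previous_count=3))
```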

Statistics

This section displays statistics about a physical drive on the selected drive array controller. You can use the run-time statistics to monitor the health of a specific drive. The following information is displayed:

  • Sectors Read shows the total number of sectors read from the physical drive since the drive was stamped. The drive was stamped when it left the factory.

  • Hard Read Errors displays the number of read errors that could not be recovered by a physical drive's Error Correction Code (ECC) algorithm or through retries. Over time, a drive may produce these errors. If you receive these errors, a problem may exist with your drive.

    The severity of these errors depends on whether the managed system is running in a fault tolerant mode. With fault tolerance, the controller can remap data to eliminate the problems caused by these errors.

  • Recovered Read Errors displays the number of read errors corrected through physical drive retries. Over time, all drives produce these errors. If you notice a rapid increase in the value for Recovered Read Errors or Hard Read Errors, a problem may exist with the drive. Expect more errors for this monitored item than for Hard Read Errors.

  • Total Seeks displays the total number of seek operations during seek tests performed by the physical drive since the drive was stamped. The drive was stamped when it left the factory.

    During normal reads and writes to the drive, the drive does implied seeks to the location where data resides. These are not included in this count.

  • Seek Errors displays the number of seek errors that a physical drive detects. A seek error is a seek that failed. Over time, a drive usually produces these errors. If you notice a rapid increase in the value shown for Seek Errors, this physical drive may be failing. Only an unusually rapid increase in these errors indicates a problem.

  • Sectors Written displays the total number of sectors written to the physical drive since the drive was stamped. The drive was stamped when it left the factory.

  • Hard Write Errors displays the number of write errors that could not be recovered by a physical drive. Over time, a drive may produce these errors. If you notice an increase in the value shown for Hard Write Errors or Recovered Write Errors, a problem may exist with the drive. The counter value increases every time the physical drive detects another error. On average, these errors should occur less frequently than read errors.

  • Recovered Write Errors displays the number of write errors corrected through physical drive retries or recovered by a physical drive on a monitored system. Over time, a drive may produce these errors. If you notice an increase in the value shown for Recovered Write Errors or Hard Write Errors, a problem may exist with the drive.

  • Hotplug Count displays the number of times the physical drive has been hot-plugged.

  • DRQ Timeouts displays the number of times the drive did not respond for a data request within a controller-defined period of time after a command had been issued.
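
Several of these counters matter mainly when they rise quickly. One way to judge that is to compare the error rate between two readings with an earlier baseline rate. The sketch below assumes readings are taken at known Service Hours values. The function names and the factor of 10 are arbitrary assumptions for illustration; this help text does not define what counts as a rapid increase.

```python
def error_rate(prev_count: int, prev_hours: int, curr_count: int, curr_hours: int) -> float:
    """Return errors per service hour between two readings of one counter."""
    elapsed = curr_hours - prev_hours
    if elapsed <= 0:
        raise ValueError("current reading must be later than the previous one")
    return (curr_count - prev_count) / elapsed


def is_rapid_increase(rate: float, baseline_rate: float, factor: float = 10.0) -> bool:
    """Flag a rate that exceeds the baseline by an assumed factor (default 10)."""
    return rate > baseline_rate * factor


# Sample readings invented for illustration: 30 new errors in 10 service hours.
rate = error_rate(prev_count=10, prev_hours=100, curr_count=40, curr_hours=110)
print(rate)  # 3.0
print(is_rapid_increase(rate, baseline_rate=0.2))  # True
```

Choose the factor and the sampling interval to suit the drive population being monitored; a single threshold is unlikely to fit both recovered and hard error counters.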