What is ASR – Automatic System Recovery?

ASR (Automatic System Recovery) is an advanced system recovery mechanism available in HP servers. Supported hardware platforms include ProLiant and Blade servers.

The ASR feature is implemented using a “heartbeat” timer that continually counts down. The Health Monitor frequently reloads the counter to prevent it from counting down to zero. If the ASR timer does count down to zero, it is assumed that the operating system has locked up, and the system automatically attempts to reboot.
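The countdown-and-reload behaviour described above can be pictured with a small simulation. This is only an illustrative sketch, not HP's implementation: the class name `WatchdogTimer`, the function `simulate`, and the use of abstract "ticks" instead of real time are all assumptions made for the example.

```python
class WatchdogTimer:
    """Toy model of a countdown timer that a health monitor must keep reloading."""

    def __init__(self, timeout_ticks):
        self.timeout_ticks = timeout_ticks
        self.remaining = timeout_ticks
        self.expired = False

    def reload(self):
        """Called by the (hypothetical) health monitor while the OS is responsive."""
        if not self.expired:
            self.remaining = self.timeout_ticks

    def tick(self):
        """Advance the countdown by one tick; return True once it has reached zero."""
        if not self.expired:
            self.remaining -= 1
            if self.remaining <= 0:
                self.expired = True
        return self.expired


def simulate(timeout_ticks, hang_at_tick, total_ticks):
    """Return the tick at which the timer expires, or None if it never does."""
    timer = WatchdogTimer(timeout_ticks)
    for now in range(1, total_ticks + 1):
        if now < hang_at_tick:
            timer.reload()  # healthy OS: the monitor runs and reloads the counter
        if timer.tick():
            return now      # counter reached zero: this is where a reset would occur
    return None
```

With a 5-tick timeout and an operating system that stops responding at tick 10, `simulate(5, 10, 30)` returns 13: the last reload happens at tick 9, and the counter then runs down without being refreshed. If the hang never occurs within the simulated window, the function returns `None`.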

ASR can be toggled on or off based on customer requirements. It is generally advised to keep this feature on, so that a server stuck on a “Blue Screen of Death” does not stay down. If such a condition occurs, ASR triggers and reboots the system to reduce downtime.

The ASR Timeout option sets a timeout limit for resetting a server that is not responding. When the server has not responded within the selected amount of time, it automatically resets. The available time increments are:

5 minutes, 10 minutes, 15 minutes, 20 minutes, 30 minutes
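If the timeout is set from a script, it can help to check the requested value against the listed increments first. The helper below is a hypothetical sketch; `asr_timeout_seconds` and `ALLOWED_ASR_TIMEOUTS_MIN` are names invented for this example and are not part of any HP tool.

```python
# Timeout choices listed above, in minutes.
ALLOWED_ASR_TIMEOUTS_MIN = (5, 10, 15, 20, 30)


def asr_timeout_seconds(minutes):
    """Validate a requested timeout and convert it to seconds (hypothetical helper)."""
    if minutes not in ALLOWED_ASR_TIMEOUTS_MIN:
        allowed = ", ".join(str(m) for m in ALLOWED_ASR_TIMEOUTS_MIN)
        raise ValueError(
            f"ASR timeout must be one of {allowed} minutes, got {minutes!r}"
        )
    return minutes * 60
```

For example, `asr_timeout_seconds(10)` returns 600, while a value such as 12 raises a `ValueError`.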

Events which may contribute to the operating system locking up include:

  • A peripheral device, such as a Peripheral Component Interconnect (PCI) adapter, that generates numerous spurious interrupts when it fails.
  • A high-priority software application consumes all the available central processing unit (CPU) cycles and does not allow the operating system scheduler to run the ASR timer reset process.
  • A software or kernel application consumes all available memory, including the virtual memory space (for example, swap). This may cause the operating system scheduler to cease functioning.
  • A critical operating system component, such as a file system, fails and causes the operating system scheduler to cease functioning.
  • Any other event besides an ASR timeout that causes a Non-Maskable Interrupt (NMI) to be generated.

The ASR feature is a hardware-based timer. If a true hardware failure occurs, the Health Monitor might not be called, but the server will be reset as if the power switch were pressed. The ProLiant ROM code may log an event to the IML (Integrated Management Log) when the server reboots.

The Health Monitor is notified of an ASR timeout through an NMI. If possible, the driver attempts to perform the following actions:

  • Displays a message on the console stating the problem.
  • Makes an entry in the IML.
  • Attempts to gracefully shut down the operating system to close the file systems.

There is no guarantee that the operating system will shut down gracefully. This depends on the type of error condition (software or hardware) and its severity. The Health Monitor logs a series of messages when an ASR event occurs. The presence or absence of these messages can provide some insight into the reason for the ASR event. The order of the messages is important, since the ASR event is always a symptom of another error condition.
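The best-effort nature of these actions, and why the presence, absence, and order of the logged messages matter, can be sketched as follows. This is a simplified illustration, not the actual driver logic: the function `handle_asr_timeout`, the action names, and the choice to continue after a failed step are assumptions made for the example.

```python
def handle_asr_timeout(actions):
    """Run best-effort actions in order and record the outcome of each.

    `actions` is a list of (name, callable) pairs. A failing step is
    recorded and the remaining steps are still attempted.
    """
    log = []
    for name, action in actions:
        try:
            action()
            log.append((name, "ok"))
        except Exception as exc:
            log.append((name, f"failed: {exc}"))
    return log


if __name__ == "__main__":
    def show_console_message():
        print("ASR timeout detected")

    def write_iml_entry():
        pass  # placeholder: a real system would write to the IML here

    def graceful_shutdown():
        # Simulate a condition in which the OS cannot shut down cleanly.
        raise RuntimeError("file system unresponsive")

    for entry in handle_asr_timeout([
        ("console message", show_console_message),
        ("IML entry", write_iml_entry),
        ("graceful shutdown", graceful_shutdown),
    ]):
        print(entry)
```

Reading the resulting log from top to bottom shows which steps completed before the reset, which is the kind of clue the ordered messages give when diagnosing the underlying error.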

Reference: http://hp.com