Tag Archives: incident

About Single Bit Errors

If you notice a message like this in syslog:
Jan 20 12:49:05 EMS [1827]: ------ EMS Event Notification ------ Value: "MAJORWARNING (3)" for Resource: "/system/events/memory/192" (Threshold: >= " 3") Execute the following command to obtain event details: /opt/resmon/bin/resdata -R 119734274 -r /system/events/memory/192 -n 119734277 -a
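When such notifications show up regularly, it can help to pull the severity, the resource, and the suggested resdata command out of the log text automatically. The following is only a minimal sketch in Python: the function name `parse_ems_event` and the regular expression are hypothetical and assume that real messages follow the layout of the one quoted above, which may not hold for every EMS monitor.

```python
import re

# Hypothetical pattern, modelled only on the single message quoted above.
EMS_PATTERN = re.compile(
    r'Value:\s*"(?P<value>[^"]+)"\s*for Resource:\s*"(?P<resource>[^"]+)"'
    r'.*?(?P<command>/opt/resmon/bin/resdata\s+-R\s+\d+\s+-r\s+\S+\s+-n\s+\d+\s+-a)',
    re.DOTALL,
)

# Typographic quotes sometimes replace plain ones when logs are pasted
# into a web page; map them back before matching.
QUOTE_FIX = str.maketrans({"\u201c": '"', "\u201d": '"', "\u2033": '"'})


def parse_ems_event(text):
    """Return a dict with value, resource and command, or None if no match."""
    match = EMS_PATTERN.search(text.translate(QUOTE_FIX))
    if match is None:
        return None
    return match.groupdict()


sample = (
    'Jan 20 12:49:05 EMS [1827]: ------ EMS Event Notification ------ '
    'Value: "MAJORWARNING (3)" for Resource: "/system/events/memory/192" '
    '(Threshold: >= " 3") Execute the following command to obtain event '
    'details: /opt/resmon/bin/resdata -R 119734274 -r '
    '/system/events/memory/192 -n 119734277 -a'
)

event = parse_ems_event(sample)
if event is not None:
    print(event["value"])
    print(event["resource"])
    print(event["command"])
```

The extracted command is only printed here, not executed; running resdata is left to the administrator on the affected host.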

Bug in userdel found – affects all of 11i v1, v2, v3

This is the first bug I’ve found in HP-UX. In the following story I have modified some data to protect our customer’s privacy. According to our customer’s white book, a user integrated into an HA package needs to be created with a symlinked home directory like this: # ll -d /home/user1 lrwxr-xr-x 1 root [...]

Interrupting dump after a crash

We have a machine that crashes after a certain period of time and then writes a crash dump before rebooting. HP determined that the firmware of the Management Console should be patched, as we don’t have the latest patch installed. The customer didn’t want this; he was afraid of the side effects on the [...]

Business Copy unleashed

Today we had some trouble with a cluster that uses Business Copy to back up the data while the application is running. A framework automates the backup process: it tells Oracle to prepare for the backup (no more write queries for the next x minutes), then pairs the [...]

The process to replace a failed PCI card with OLA/R

1. Identify the failed PCI card
2. Perform critical resource analysis on the affected PCI card
3. Turn on the attention light for the affected PCI card slot
4. Check that the affected PCI slot is in its own power domain
5. Check that the affected PCI card slot is not a multi-function card
6. Run any associated driver scripts before [...]