
OpenVMS Cluster Systems




Chapter 10
Maintaining an OpenVMS Cluster System

Once your cluster is up and running, you can implement routine, site-specific maintenance operations: for example, backing up disks, adding user accounts, performing software upgrades and installations, running AUTOGEN with the feedback option on a regular basis, and monitoring the system for performance.

You should also maintain records of current configuration data, especially any changes to hardware or software components. If you are managing a cluster that includes satellite nodes, it is important to monitor LAN activity.

From time to time, conditions may occur that require special maintenance operations beyond these routine tasks; such operations are described in the sections that follow.

10.1 Backing Up Data and Files

As a part of the regular system management procedure, you should copy operating system files, application software files, and associated files to an alternate device using the OpenVMS Backup utility.

Some backup operations are the same in an OpenVMS Cluster as they are on a single OpenVMS system. For example, you can perform an incremental backup of a disk while it is in use, or back up a nonshared disk.
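For example, the following commands perform a full image backup of a data disk and, later, an incremental backup of the same disk. The device, directory, and save-set names shown are placeholders only, and disks that are heavily used may require additional handling:

$ ! Full image backup of a data disk to a save set on another disk
$ BACKUP/IMAGE/RECORD $1$DUA2: $1$DUA3:[BACKUPS]WORK2_FULL.BCK/SAVE_SET
$ !
$ ! Incremental backup: copies only files modified since the last
$ ! backup performed with the /RECORD qualifier
$ BACKUP/RECORD/SINCE=BACKUP $1$DUA2:[*...]*.*;* -
_$ $1$DUA3:[BACKUPS]WORK2_INCR.BCK/SAVE_SET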

Backup tools for use in a cluster include those listed in Table 10-1.

Table 10-1 Backup Methods
Tool Usage
Online backup Use from a running system to back up:
  • The system's local disks
  • Cluster-shareable disks other than system disks
  • The system disk or disks

Caution: Files open for writing at the time of the backup procedure may not be backed up correctly.

Menu-driven++ or standalone+ BACKUP Use one of the following methods:
  • If you have access to the OpenVMS Alpha or VAX distribution CD-ROM, back up your system using the menu system provided on that disc. This menu system, which is displayed automatically when you boot the CD-ROM, allows you to:
    • Enter a DCL environment, from which you can perform backup and restore operations on the system disk (instead of using standalone BACKUP).
    • Install or upgrade the operating system and layered products, using the POLYCENTER Software Installation utility.

    Reference: For more detailed information about using the menu-driven procedure, see the OpenVMS Upgrade and Installation Manual and the OpenVMS System Manager's Manual.

  • If you do not have access to the OpenVMS VAX distribution CD-ROM, you should use standalone BACKUP to back up and restore your system disk. Standalone BACKUP:
    • Should be used with caution because it does not:
      1. Participate in the cluster
      2. Synchronize volume ownership or file I/O with other systems in the cluster
    • Can be booted from the system disk instead of from the console media, because standalone BACKUP is built in a reserved root on every system disk.

    Reference: For more information about standalone BACKUP, see the OpenVMS System Manager's Manual.


+VAX specific
++Alpha specific

Plan to perform the backup process regularly, according to a schedule that is consistent with application and user needs. This may require creative scheduling so that you can coordinate backups with times when user and application system requirements are low.
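For example, a site might submit its backup procedure as a batch job scheduled for an off-peak hour. The following sketch assumes a site-supplied command procedure named NIGHTLY_BACKUP.COM (a hypothetical name); such a procedure typically resubmits itself so that the backup runs again the following night:

$ ! Run the site backup procedure at 2:00 tomorrow morning
$ SUBMIT/QUEUE=SYS$BATCH/AFTER="TOMORROW+02:00" -
_$ SYS$MANAGER:NIGHTLY_BACKUP.COM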

Reference: See the OpenVMS System Management Utilities Reference Manual: A--L for complete information about the OpenVMS Backup utility.

10.2 Updating the OpenVMS Operating System

When updating the OpenVMS operating system, follow the steps in Table 10-2.

Table 10-2 Upgrading the OpenVMS Operating System
Step Action
1 Back up the system disk.
2 Perform the update procedure once for each system disk.
3 Install any mandatory updates.
4 Run AUTOGEN on each node that boots from that system disk.
5 Run the user environment test package (UETP) to test the installation.
6 Use the OpenVMS Backup utility to make a copy of the new system volume.

Reference: See the appropriate OpenVMS upgrade and installation manual for complete instructions.
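For step 4 in Table 10-2, AUTOGEN is invoked with a start phase, an end phase, and an execution mode. The following line is one typical invocation, shown for illustration; whether the FEEDBACK mode is appropriate immediately after an upgrade depends on whether valid feedback data has been collected on that node:

$ ! Run AUTOGEN from the GETDATA phase through an automatic reboot
$ @SYS$UPDATE:AUTOGEN GETDATA REBOOT FEEDBACK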

10.2.1 Rolling Upgrades

The OpenVMS operating system allows an OpenVMS Cluster system running on multiple system disks to continue to provide service while the system software is being upgraded. This process is called a rolling upgrade because each node is upgraded and rebooted in turn, until all the nodes have been upgraded.

If you must first migrate your system from running on one system disk to running on two or more system disks, follow these steps:
Step Action
1 Follow the procedures in Section 8.5 to create a duplicate disk.
2 See Section 5.10 for information about coordinating system files.

These sections help you add a system disk and prepare a common user environment on multiple system disks, making shared system files such as the queue database, rightslists, proxies, and mail files available across the OpenVMS Cluster system.

10.3 LAN Network Failure Analysis

The OpenVMS operating system provides a sample program to help you analyze OpenVMS Cluster network failures on the LAN. You can edit and use the SYS$EXAMPLES:LAVC$FAILURE_ANALYSIS.MAR program to detect and isolate failed network components. Using the network failure analysis program can help reduce the time required to detect and isolate a failed network component, thereby providing a significant increase in cluster availability.

Reference: For a description of the network failure analysis program, refer to Appendix D.
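As a rough sketch of the build cycle (Appendix D contains the authoritative procedure, and the assembly step differs between VAX and Alpha systems):

$ ! Copy the sample program to a working directory, then edit it
$ ! to describe your LAN hardware and network topology
$ COPY SYS$EXAMPLES:LAVC$FAILURE_ANALYSIS.MAR SYS$MANAGER:
$ SET DEFAULT SYS$MANAGER:
$ ! Assemble and link the edited program (on Alpha systems, the
$ ! MACRO command may require additional qualifiers; see Appendix D)
$ MACRO LAVC$FAILURE_ANALYSIS.MAR
$ LINK LAVC$FAILURE_ANALYSIS
$ ! Run the program on each system; elevated privileges are required
$ RUN LAVC$FAILURE_ANALYSIS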

10.4 Recording Configuration Data

To maintain an OpenVMS Cluster system effectively, you must keep accurate records about the current status of all hardware and software components and about any changes made to those components. Changes to cluster components can have a significant effect on the operation of the entire cluster. If a failure occurs, you may need to consult your records to aid problem diagnosis.

Maintaining current records for your configuration is necessary both for routine operations and for eventual troubleshooting activities.

10.4.1 Record Information

At a minimum, your configuration records should include current information about all hardware and software components in the cluster and about any changes made to those components.

10.4.2 Satellite Network Data

The first time you execute CLUSTER_CONFIG.COM to add a satellite, the procedure creates the file NETNODE_UPDATE.COM in the boot server's SYS$SPECIFIC:[SYSMGR] directory. (For a common-environment cluster, you must rename this file to the SYS$COMMON:[SYSMGR] directory, as described in Section 5.10.2.) This file, which is updated each time you add or remove a satellite or change its Ethernet or FDDI hardware address, contains all essential network configuration data for the satellite.

If an unexpected condition at your site causes configuration data to be lost, you can use NETNODE_UPDATE.COM to restore it. You can also read the file when you need to obtain data about individual satellites. Note that you may want to edit the file occasionally to remove obsolete entries.
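For example, to restore lost configuration data on a boot server, execute the file from a suitably privileged account. This sketch assumes the file remains in its default location:

$ ! Reload satellite network data into the DECnet database
$ @SYS$SPECIFIC:[SYSMGR]NETNODE_UPDATE.COM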

Example 10-1 shows the contents of the file after satellites EUROPA and GANYMD have been added to the cluster.

Example 10-1 Sample NETNODE_UPDATE.COM File

$ RUN SYS$SYSTEM:NCP 
    define node EUROPA address 2.21 
    define node EUROPA hardware address 08-00-2B-03-51-75 
    define node EUROPA load assist agent sys$share:niscs_laa.exe 
    define node EUROPA load assist parameter $1$DJA11:<SYS10.> 
    define node EUROPA tertiary loader sys$system:tertiary_vmb.exe 
    define node GANYMD address 2.22 
    define node GANYMD hardware address 08-00-2B-03-58-14 
    define node GANYMD load assist agent sys$share:niscs_laa.exe 
    define node GANYMD load assist parameter $1$DJA11:<SYS11.> 
    define node GANYMD tertiary loader sys$system:tertiary_vmb.exe 

Reference: See the DECnet--Plus documentation for equivalent NCL command information.

10.5 Cross-Architecture Satellite Booting

Cross-architecture satellite booting permits VAX boot nodes to provide boot service to Alpha satellites and Alpha boot nodes to provide boot service to VAX satellites. For some OpenVMS Cluster configurations, cross-architecture boot support can simplify day-to-day system operation and reduce the complexity of managing OpenVMS Cluster systems that include both VAX and Alpha systems.

Note: Compaq will continue to provide cross-architecture boot support while it is technically feasible. This support may be removed in future releases of the OpenVMS operating system.

10.5.1 Sample Configurations

The sample configurations that follow show how you might configure an OpenVMS Cluster to include both Alpha and VAX boot nodes and satellite nodes. Note that each architecture must include a system disk that is used for installations and upgrades.

Caution: The OpenVMS operating system and layered product installations and upgrades cannot be performed across architectures. For example, OpenVMS Alpha software installations and upgrades must be performed using an Alpha system. When configuring OpenVMS Cluster systems that use the cross-architecture booting feature, configure at least one system of each architecture with a disk that can be used for installations and upgrades. In the configurations shown in Figure 10-1 and Figure 10-2, one of the workstations has been configured with a local disk for this purpose.

In Figure 10-1, several Alpha workstations have been added to an existing VAXcluster configuration that contains two VAX boot nodes based on the DSSI interconnect and several VAX workstations. For high availability, the Alpha system disk is located on the DSSI for access by multiple boot servers.

Figure 10-1 VAX Nodes Boot Alpha Satellites


In Figure 10-2, the configuration originally consisted of a VAX boot node and several VAX workstations. The VAX boot node has been replaced with a new, high-performance Alpha boot node. Some Alpha workstations have also been added. The original VAX workstations remain in the configuration and still require boot service. The new Alpha boot node can perform this service.

Figure 10-2 Alpha and VAX Nodes Boot Alpha and VAX Satellites


10.5.2 Usage Notes

When using the cross-architecture booting feature, keep in mind the installation restriction described in Section 10.5.1 and the DECnet requirement described in Section 10.5.3.

10.5.3 Configuring DECnet

The following examples show how to configure DECnet databases to perform cross-architecture booting. Note that this feature is available for systems running DECnet for OpenVMS (Phase IV) only.

Customize the command procedures in Examples 10-2 and 10-3 according to the following instructions.
Replace... With...
alpha_system_disk or vax_system_disk The appropriate disk name on the server
label The appropriate label name for the disk on the server
ccc-n The server circuit name
alpha or vax The DECnet node name of the satellite
xx.yyyy The DECnet area.address of the satellite
aa-bb-cc-dd-ee-ff The hardware address of the LAN adapter on the satellite over which the satellite is to be loaded
satellite_root The root on the system disk (for example, SYS10) of the satellite

Example 10-2 shows how to set up a VAX system to serve a locally mounted Alpha system disk.

Example 10-2 Defining an Alpha Satellite in a VAX Boot Node

 
$! VAX system to load Alpha satellite 
$! 
$!  On the VAX system: 
$!  ----------------- 
$! 
$!  Mount the system disk for MOP server access. 
$! 
$ MOUNT /SYSTEM alpha_system_disk: label ALPHA$SYSD 
$! 
$!  Enable MOP service for this server. 
$! 
$ MCR NCP 
NCP> DEFINE CIRCUIT ccc-n SERVICE ENABLED STATE ON 
NCP> SET CIRCUIT ccc-n STATE OFF 
NCP> SET CIRCUIT ccc-n ALL 
NCP> EXIT 
$! 
$!  Configure MOP service for the ALPHA satellite. 
$! 
$ MCR NCP 
NCP> DEFINE NODE alpha ADDRESS xx.yyyy
NCP> DEFINE NODE alpha HARDWARE ADDRESS aa-bb-cc-dd-ee-ff
NCP> DEFINE NODE alpha LOAD ASSIST AGENT SYS$SHARE:NISCS_LAA.EXE 
NCP> DEFINE NODE alpha LOAD ASSIST PARAMETER ALPHA$SYSD:[satellite_root.] 
NCP> DEFINE NODE alpha LOAD FILE APB.EXE 
NCP> SET NODE alpha ALL 
NCP> EXIT 

Example 10-3 shows how to set up an Alpha system to serve a locally mounted VAX system disk.

Example 10-3 Defining a VAX Satellite in an Alpha Boot Node

$! Alpha system to load VAX satellite
$!
$!  On the Alpha system:
$!  --------------------
$!
$!  Mount the system disk for MOP server access.
$!
$ MOUNT /SYSTEM vax_system_disk: label VAX$SYSD
$!
$!  Enable MOP service for this server.
$!
$ MCR NCP
NCP> DEFINE CIRCUIT ccc-n SERVICE ENABLED STATE ON
NCP> SET CIRCUIT ccc-n STATE OFF
NCP> SET CIRCUIT ccc-n ALL
NCP> EXIT
$!
$!  Configure MOP service for the VAX satellite.
$!
$ MCR NCP
NCP> DEFINE NODE vax ADDRESS xx.yyyy
NCP> DEFINE NODE vax HARDWARE ADDRESS aa-bb-cc-dd-ee-ff
NCP> DEFINE NODE vax TERTIARY LOADER SYS$SYSTEM:TERTIARY_VMB.EXE
NCP> DEFINE NODE vax LOAD ASSIST AGENT SYS$SHARE:NISCS_LAA.EXE
NCP> DEFINE NODE vax LOAD ASSIST PARAMETER VAX$SYSD:[satellite_root.]
NCP> SET NODE vax ALL
NCP> EXIT

Then, to boot the satellite, perform these steps:

  1. Execute the appropriate command procedure from a privileged account on the server.
  2. Boot the satellite over the adapter represented by the hardware address that you entered in the command procedure (see the sketch that follows).
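Console boot commands vary by system model and LAN adapter, so the following lines are illustrative only; EWA0 and XQA0 are placeholders for the adapter names on your systems. For an Alpha satellite:

>>> B -FLAGS 0,0 EWA0

For a VAX satellite:

>>> B XQA0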

10.6 Controlling OPCOM Messages

When a satellite joins the cluster, the Operator Communications Manager (OPCOM) applies a set of default states. You can override these defaults as described in Section 10.6.1.

10.6.1 Overriding OPCOM Defaults

Table 10-3 shows how to define the following system logical names in the command procedure SYS$MANAGER:SYLOGICALS.COM to override the OPCOM default states.

Table 10-3 OPCOM System Logical Names
System Logical Name Function
OPC$OPA0_ENABLE If defined to be true, OPA0: is enabled as an operator console. If defined to be false, OPA0: is not enabled as an operator console. DCL considers any string beginning with T or Y, or any odd integer, to be true; all other values are false.
OPC$OPA0_CLASSES Defines the operator classes to be enabled on OPA0:. The logical name can be a search list of the allowed classes, a comma-separated list of classes, or a combination of the two. For example:
$ DEFINE/SYSTEM OPC$OPA0_CLASSES CENTRAL,DISKS,TAPES

$ DEFINE/SYSTEM OPC$OPA0_CLASSES "CENTRAL,DISKS,TAPES"
$ DEFINE/SYSTEM OPC$OPA0_CLASSES "CENTRAL,DISKS",TAPES

You can define OPC$OPA0_CLASSES even if OPC$OPA0_ENABLE is not defined. In this case, the classes are used for any operator consoles that are enabled, but the default is used to determine whether to enable the operator console.

OPC$LOGFILE_ENABLE If defined to be true, an operator log file is opened. If defined to be false, no log file is opened.
OPC$LOGFILE_CLASSES Defines the operator classes to be enabled for the log file. The logical name can be a search list of the allowed classes, a comma-separated list, or a combination of the two. You can define this system logical even when the OPC$LOGFILE_ENABLE system logical is not defined. In this case, the classes are used for any log files that are open, but the default is used to determine whether to open the log file.
OPC$LOGFILE_NAME Supplies information that is used in conjunction with the default name SYS$MANAGER:OPERATOR.LOG to define the name of the log file. If the log file is directed to a disk other than the system disk, you should include commands to mount that disk in the SYLOGICALS.COM command procedure.
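For example, the following lines, placed in SYS$MANAGER:SYLOGICALS.COM, enable OPA0: as an operator console, restrict it to a few classes, and direct the operator log file to a disk other than the system disk. The device name, volume label, and directory shown are placeholders:

$ DEFINE/SYSTEM OPC$OPA0_ENABLE TRUE
$ DEFINE/SYSTEM OPC$OPA0_CLASSES "CENTRAL,DISKS,TAPES"
$ DEFINE/SYSTEM OPC$LOGFILE_ENABLE TRUE
$ ! Mount the log disk before defining the log file name
$ MOUNT/SYSTEM $1$DUA4: LOGDISK
$ DEFINE/SYSTEM OPC$LOGFILE_NAME $1$DUA4:[OPCOM]OPERATOR.LOG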

10.6.2 Example

The following example shows how to use the OPC$OPA0_CLASSES system logical to define the operator classes to be enabled. Because the list includes every class except SECURITY, this command prevents SECURITY class messages from being displayed on OPA0:.


$ DEFINE/SYSTEM OPC$OPA0_CLASSES CENTRAL,PRINTER,TAPES,DISKS,DEVICES, -
_$ CARDS,NETWORK,CLUSTER,LICENSE,OPER1,OPER2,OPER3,OPER4,OPER5, -
_$ OPER6,OPER7,OPER8,OPER9,OPER10,OPER11,OPER12

In large clusters, state transitions (computers joining or leaving the cluster) generate many multiline OPCOM messages on a boot server's console device. You can avoid such messages by including the DCL command REPLY/DISABLE=CLUSTER in the appropriate site-specific startup command file or by entering the command interactively from the system manager's account.

10.7 Shutting Down a Cluster

The SHUTDOWN command of the SYSMAN utility provides five options for shutting down OpenVMS Cluster computers; these options are described in the following sections.
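For illustration, the following sketch shows the general form of a SYSMAN shutdown; the qualifiers shown are examples only, and you should substitute those that correspond to the option you choose:

$ RUN SYS$SYSTEM:SYSMAN
SYSMAN> SET ENVIRONMENT/CLUSTER
SYSMAN> SHUTDOWN NODE/CLUSTER_SHUTDOWN/MINUTES_TO_SHUTDOWN=10 -
_SYSMAN> /REASON="Scheduled cluster maintenance"
SYSMAN> EXIT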

