Article 138855 of comp.os.vms:
Newsgroups: comp.os.vms
From: tillman_brian@si.com (Brian Tillman)
Subject: Queue Manager Secrets (was: uilization of file...)
Content-Type: Text/Plain; charset=US-ASCII
Message-ID: <DLst3p.IHF@esseye.si.com>
Sender: news@esseye.si.com
Nntp-Posting-Host: helpdesk_1.si.com
Organization: Smiths Industries
X-Newsreader: WinVN 0.99.7
References: <764019@MVB.SAIC.COM> <DLqwFG.Cxr@esseye.si.com>
Mime-Version: 1.0
Date: Fri, 26 Jan 1996 17:11:49 GMT

In article <DLqwFG.Cxr@esseye.si.com>, tillman_brian@si.com says...
>
>There are a number of "secret" Queue Manager commands that are very 
>interesting.  DECUServe contains an article describing them.  I could post 
>it, if anyone's interested.

I've received several requests, so here is the information.  I obtained this 
from DECUServe, where it was posted by Dale Coy (coy@eisner.decus.org).  Hope 
you find it interesting.

                 <<< EISNER::$2$DIA7:[NOTES$HIVOL]VMS.NOTE;1 >>>
                         -< VMS and bundled utilities >-
================================================================================
Note 2068.0    Care and Feeding of the New Queue Manager (V5.5+++)    29 replies
EISNER::COY "Dale E. Coy (DECUServe MoS)"            22 lines  22-APR-1993 23:07
--------------------------------------------------------------------------------
    I've gathered some information about the "New" Queue Manager
    (Introduced with VMS V5.5).
    
    Refer to Topics 1318, 1679, 1725, 1779, and undoubtedly other places,
    for previous discussion.
    
    I am indebted to Kim and Pete (of the CSC) for a lot of the following
    insight.  However - please assume that *opinions* are mine.  *facts*
    are theirs.
    
    The new queue manager looks like an excellent attempt to rationalize
    the queue operations on VMS.  Although it has suffered from lots of
    growing pains, with VMS V5.5-2 and the latest patch it is relatively
    stable.
    
    It appears that one design criterion was "ease of backup" - but that
    the designers _assumed_ that it was only really important to easily
    preserve the queues and forms - and that few people really needed to
    preserve the ENTRIES for jobs on the queues.  That was IMO a wrong
    assumption (more later), but it was certainly an understandable
    assumption.
    
================================================================================
Note 2068.1    Care and Feeding of the New Queue Manager (V5.5+++)       1 of 29
EISNER::COY "Dale E. Coy (DECUServe MoS)"            37 lines  22-APR-1993 23:21
             -< Overview of the 3 files, and preserving 2 of them >-
--------------------------------------------------------------------------------
    Three files are needed to totally describe the Queue structure.  All
    are (by default - they can be moved) located in SYS$SYSTEM.
    
    QMAN$MASTER.DAT
    	This file contains _all_ of the information about your FORMS, and a
    bit of information about all of the QUEUES.  
    
    SYS$QUEUE_MANAGER.QMAN$QUEUES
    	This file contains the rest of the information about the QUEUES
    themselves (characteristics, etc.).  
    
    SYS$QUEUE_MANAGER.QMAN$JOURNAL;1
    	This file contains all of the information about the ENTRIES on all
    of the queues.  It's rather dynamic, may not be totally up to date
    (data not flushed to the file), etc.  More on this file later.  Note
    that I gave a version number above.
    
    ==================
    
    The easy part of preserving the queues (in case of disaster or
    corruption) is to keep recent copies of the first two files above.  If
    you have good QMAN$MASTER.DAT and SYS$QUEUE_MANAGER.QMAN$QUEUES file
    copies, that's all you need to restore your queues and forms.  Just
    delete the JOURNAL file, replace the other two with good copies, and
    start the queue manager.  Presto - all of the queues and forms are
    recovered.  No job entries, though (the JOURNAL file held them all).
    
    The even-better news is that these two files never seem to be "locked". 
    There is an article that recommends doing CONVERT/SHARE to make copies
    of the files - or you could use BACKUP/IGNORE=INTERLOCK.  My personal
    preference is for CONVERT/SHARE, and I have never seen it fail to work.
    [Don't use a method where _you_ might lock the file - I would hate to
    confuse the queue manager]
    
    The structure of these two files strongly implies that, even if a file
    is being "updated" as you convert/share it, the copy you get will be
    "rational" and "usable" and not corrupt.
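    
    A minimal sketch of the CONVERT/SHARE approach described above.  The
    directory DISK$BACKUP:[QMAN_SAVE] and the logical name QMAN_SAVE are
    hypothetical, and the input specs assume the files are still in their
    default SYS$SYSTEM location:
    
    ```dcl
    $ ! QMAN_SAVE is a made-up logical name; point it at any directory
    $ ! that is covered by your regular backups.
    $ DEFINE QMAN_SAVE DISK$BACKUP:[QMAN_SAVE]
    $ !
    $ ! CONVERT/SHARE reads the input while other processes have it open.
    $ CONVERT/SHARE SYS$SYSTEM:QMAN$MASTER.DAT -
            QMAN_SAVE:QMAN$MASTER.DAT
    $ CONVERT/SHARE SYS$SYSTEM:SYS$QUEUE_MANAGER.QMAN$QUEUES -
            QMAN_SAVE:SYS$QUEUE_MANAGER.QMAN$QUEUES
    ```
    
    Each run creates new versions of the copies, so purge them from time
    to time.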
================================================================================
Note 2068.2    Care and Feeding of the New Queue Manager (V5.5+++)       2 of 29
EISNER::COY "Dale E. Coy (DECUServe MoS)"            26 lines  22-APR-1993 23:33
                        -< Details of the JOURNAL file >-
--------------------------------------------------------------------------------
    That brings us to SYS$QUEUE_MANAGER.QMAN$JOURNAL;1
    
    As previously stated, this is where all of the ENTRIES (jobs) live. 
    It's the source of several problems and changes.
    
    This file is a "coded" file, with pointers to lots of things and with
    lots of links.  It can have "old" data, with an "execution pointer"
    that points after the old data.  The physical order of things in this
    file is probably the same order in which you NOW see the queue entries
    in show/queue displays.  
    
    The _correct_ copy of this file is maintained IN MEMORY on the node
    that is executing the queue manager.  In the first versions of the new
    queue manager, this data was seldom flushed to the file - probably only
    when the node shut down, and/or the queue manager was shut down.
    
    It is my belief that this file is the source of almost ALL instances
    where people saw "queue corruption".  
    
    The latest version of CSCPAT_1012 (V1.6 at least) changes the behavior
    so that the in-memory structure is flushed to the journal file fairly
    frequently.  The interval appears to be at MOST every hour, and for
    very active queue situations every few minutes.
    
    The size of this file tends to be "around 1000 blocks" (with a large
    variation, of course).
================================================================================
Note 2068.3    Care and Feeding of the New Queue Manager (V5.5+++)       3 of 29
EISNER::COY "Dale E. Coy (DECUServe MoS)"            40 lines  22-APR-1993 23:46
                        -< Backing up the Journal file >-
--------------------------------------------------------------------------------
    So - get the CSCPAT_1012 at some version (I would get at least 1.6).  
    
    Note that the "patch" means that any entry older than the flush
    interval (e.g., any entry older than an hour) will appear in the file. 
    Newer entries _may_ not be flushed yet.
    
    OK, so you really want to back up the journal file, to preserve a
    snapshot of the queue entries (like you did with FIXQUE - see topic
    1318 - in earlier versions).
    
    I'll describe several methods - depending on your aversion to risk.
    
    ================
    METHOD 1:
    
    	BACKUP/IGNORE=INTERLOCK  SYS$QUEUE_MANAGER.QMAN$JOURNAL  whatever
    
    Convert/share won't work (the file is open).  Backup will complain, but
    make a "fair" copy.  The only apparent risk is if you (unluckily) do
    the backup at the exact same time that a "flush" operation is being
    done.  You could get an inconsistent file.  But, IMO, if you kept two
    versions, the probability of a bad result would be vanishingly small.
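    
    A sketch of Method 1 that keeps two generations of the copy, as
    suggested above.  The output directory and the name JOURNAL_COPY.DAT
    are hypothetical.  /NEW_VERSION is added on the assumption that BACKUP
    would otherwise try to reuse the input file's version number and
    fail on the second run:
    
    ```dcl
    $ ! Copy the open journal; expect BACKUP to complain about the interlock.
    $ BACKUP/IGNORE=INTERLOCK -
            SYS$SYSTEM:SYS$QUEUE_MANAGER.QMAN$JOURNAL -
            DISK$BACKUP:[QMAN_SAVE]JOURNAL_COPY.DAT/NEW_VERSION
    $ !
    $ ! Keep the two most recent copies in case one caught a flush in progress.
    $ PURGE/KEEP=2 DISK$BACKUP:[QMAN_SAVE]JOURNAL_COPY.DAT
    ```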
    
    ================
    METHOD 2:
    
    STOP/QUEUE/MANAGER/CLUSTER and then do a backup (or convert/share) of
    the file when the flush is done and the file is released.  [Of course,
    if you aren't in a cluster, you can omit that qualifier]
    
    I didn't extensively test this - but it is the SAFE METHOD RECOMMENDED
    BY THE CSC.  In my testing, I had trouble getting the queue manager to
    release the file - but perhaps I didn't wait long enough or something.
    
    In addition to the disadvantage of having your queues not running for
    the duration of the backup, note that stop/queue/manager will "crash"
    any job that is in EXECUTING state.  This may not be a problem for you
    - but it was for me.
    
    But I agree with the CSC.  It's unconditionally safe.
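    
    A sketch of the Method 2 sequence, under the same hypothetical output
    names as before.  How long to wait for the file to be released is not
    known; the comment marks where that wait belongs:
    
    ```dcl
    $ STOP/QUEUE/MANAGER/CLUSTER        ! omit /CLUSTER on a standalone node
    $ !
    $ ! Wait here until the queue manager has flushed and released the file.
    $ !
    $ BACKUP SYS$SYSTEM:SYS$QUEUE_MANAGER.QMAN$JOURNAL -
            DISK$BACKUP:[QMAN_SAVE]JOURNAL_COPY.DAT/NEW_VERSION
    $ START/QUEUE/MANAGER
    ```
    
    Remember that any job in EXECUTING state is lost at the STOP.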
================================================================================
Note 2068.4    Care and Feeding of the New Queue Manager (V5.5+++)       4 of 29
EISNER::COY "Dale E. Coy (DECUServe MoS)"            70 lines  23-APR-1993 00:09
            -< The real way to save SYS$QUEUE_MANAGER.QMAN$JOURNAL >-
--------------------------------------------------------------------------------
    METHOD 3 - pure unsupported magic.
    
    There is an undocumented and "unsupported" method (or two).
    
    In the VMS 5.5 kit, in Saveset B, there is a program JBC$UPGRADE.EXE. 
    Install left it on my system, but if you don't have it, you can just
    get it from that saveset.  Put it in sys$system.
    
    This is apparently the program that was/is used to convert from old to
    new styles.  But it has other uses!!!
    
    With privs CMKRNL (and maybe others), RUN SYS$SYSTEM:JBC$UPGRADE
    (may need to be on the "execution" node).
    
    You get a prompt:
    
    JBC$UPGRADE>		(More info in next reply, but...)
    
    If you type
    JBC$UPGRADE>  DIAG  0  2         (numbers)
    
    %JBCUPGRAD-E-SHOWOUTPUT, the output from this command is:
    
    Log for playback = 0
    Save old Journal files = 1
    Log all requests = 0
    Dump on error = 0
    
    ...........
    
    That's it.  Wait a while.  You will see a new file in sys$system, named
    	SYS$QUEUE_MANAGER.QMAN$JOURNAL_OBSOLETE
    
    What happens is that, _when_ the "flush" is done, NORMALLY the old file
    is renamed, then a new JOURNAL file is created, and the new file is
    renamed to .QMAN$JOURNAL;1, and the old file is deleted.  With "diag 0
    2", the old file is KEPT.
    
    *NOTE 1* - the first time you do this, SET A VERSION LIMIT on the
    _OBSOLETE file.  Otherwise, you'll get one more every hour or so.  I
    use a version limit of 2.
    
    *NOTE 2* - If you ever have to use it, name it with VERSION NUMBER 1 -
    otherwise the queue manager won't touch it, I'm told.  This is a
    CRITICAL POINT.  In the heat of battle (recovery), it would be very
    easy to forget.
    
    *NOTE 3* - I have been assured that the need/desire for this to be a
    _supported_ capability _will_ be communicated to engineering.
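    
    NOTE 1 can be sketched as follows, assuming the _OBSOLETE file has
    already appeared in SYS$SYSTEM (the limit of 2 is the value mentioned
    above; the DIRECTORY command is only a convenience check):
    
    ```dcl
    $ ! Cap the number of _OBSOLETE versions that accumulate.
    $ SET FILE/VERSION_LIMIT=2 -
            SYS$SYSTEM:SYS$QUEUE_MANAGER.QMAN$JOURNAL_OBSOLETE
    $ !
    $ ! Confirm which versions are being kept, and when they were written.
    $ DIRECTORY/DATE=MODIFIED -
            SYS$SYSTEM:SYS$QUEUE_MANAGER.QMAN$JOURNAL_OBSOLETE;*
    ```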
    
    Now - some other technical details:
    
    The "switching on" applies only to the running copy of the queue manager.
    	1. I _know_ that it doesn't survive a cluster reboot.  My startup
    file now does 
    	$  S_Q_J := $sys$system:jbc$upgrade
    	$  S_Q_J diag 0 2
    for _all_ nodes.
    
    	2. Unknown: what happens when the queue manager fails over to
    another node.  I _suspect_ that the "_obsolete" behavior is maintained.
    
    	3. Strong suspicion that stop/queue/manager/cluster and then
    start/queue/manager would cancel the effect of diag.  But I don't know
    that for sure either.
    
    [Anybody want to run some good tests for 2 and 3?]
    
    	4. If you want to explicitly turn off the behavior, using "diag 0"
    does that, without any shutting down or whatever.
================================================================================
Note 2068.5    Care and Feeding of the New Queue Manager (V5.5+++)       5 of 29
EISNER::COY "Dale E. Coy (DECUServe MoS)"            28 lines  23-APR-1993 00:16
                               -< Diag options >-
--------------------------------------------------------------------------------
    And now for the "really neat/fuzzy/dangerous" stuff.
    
    The format of the command line to JBC$UPGRADE> is 
    	Keyword  Options
    
    For keyword DIAG, we used options 0 and 2 in that order (this is
    apparently just a list of things to do).  Option 0 apparently says
    "cancel everything" and then option 2 says "Save old Journal files".
    
    You can string together as many options as you want (1-6).
    
    The output after the command was:
>    Log for playback = 0
>    Save old Journal files = 1
>    Log all requests = 0
>    Dump on error = 0
    
    I was told that the other options are:
    	1 - "input playback" (playback journal commands)
    	3 - "log ALL requests" (maybe for future playback?)
    	4 - PROCESS dump on error.
    	5 - Diagnostics - I was told that this is like "loopback" for the
    		queue manager.  It wouldn't do anything, but would sit
    		there and process any commands.
    	6 - SYSTEM CRASH on queue manager error.
    
    I just _knew_ you would enjoy option 6.
    	
================================================================================
Note 2068.6    Care and Feeding of the New Queue Manager (V5.5+++)       6 of 29
EISNER::COY "Dale E. Coy (DECUServe MoS)"            19 lines  23-APR-1993 00:23
                     -< A few more (unexplored) keywords >-
--------------------------------------------------------------------------------
    Ah - but we aren't through:
    
>    The format of the command line to JBC$UPGRADE> is 
>    	Keyword  Options
    
    If you just "press return", you get:
    
    %JBCUPGRAD-E-INVFUNC, invalid function
    valid choices are: SAVE, RESTORE, TEST, COMPARE, NEWJBC, DIAGNOSTIC
    
    We already talked about DIAG.  
    
    SAVE writes "everything" to a file (except entries for jobs that are
    executing at the time).  It seems reasonable that SAVE and RESTORE are
    a pair, and that maybe SAVE would substitute for preserving the 3
    files.  SAVE writes _one_ file that looks like it contains everything.
    [Anybody want to test it?]
    
    And of course there are those other keywords...
================================================================================
Note 2068.7    Care and Feeding of the New Queue Manager (V5.5+++)       7 of 29
EISNER::COY "Dale E. Coy (DECUServe MoS)"             6 lines  23-APR-1993 00:24
                        -< OK - what can you tell me? >-
--------------------------------------------------------------------------------
    In summary:
    
    	1. I think the "preservation" question is resolved, without much
    need for re-creating FIXQUE.
    
    	2. I think I like the new queue manager.


-- 
-----------------------------+--------------------------------
 Brian Tillman               | Internet: tillman@swdev.si.com 
 Smiths Industries, Inc.     |           tillman_brian@si.com 
 4141 Eastern Ave., MS239    | Hey, I said this stuff myself. 
 Grand Rapids, MI 49518-8727 | My company has no part in it.
-----------------------------+--------------------------------



