HTML2RNO.PY is a Python program that converts a '.HTML' file to a '.RNO'
file. The '.RNO' file is then run through the RUNOFF text formatter that is
supplied with the OpenVMS operating system.

HTML2RNO understands several HTML tags and tries to convert them to their
RUNOFF 'equivalents'. There is, however, a fundamental difference:

HTML
     describes the contents and the browser is supposed to decide how the
     output is rendered (e.g. lists).

RUNOFF
     is a text formatter that gives the user a large number of commands
     to control the output.

  ------------------------------------------------------------------------
HTML2RNO understands the following HTML tags, which can be mapped quite
easily to RUNOFF commands:

<H1> + </H1> through <H6> + </H6>
     are converted to '.header level' commands.

<PRE> + </PRE>
     is converted to '.literal' and '.end literal'.

<TITLE> + </TITLE>
     is converted to '.subtitle "title-string"'.
     This is done intentionally for the PYVMS documentation. '.title'
     defines the name of the document.

<UL> + </UL>
     is converted to '.list' and '.end list'. HTML2RNO tells RUNOFF the
     'bullet character' to use. You can easily change that - look at the
     implementation of start_ul().

<OL> + </OL>
     is converted to '.list' and '.end list'. HTML2RNO tells RUNOFF to
     prefix the list elements with a decimal number.

<LI>
     is converted to '.list element' which works with both <UL> and <OL>.

<DL>
     is just translated into a new paragraph.

<DT>
     The left margin is re-established.

<DD>
     moves the left margin 8 characters to the right.

</DL>
     is translated into a new paragraph. The left margin is re-established.

<BR>
     is converted to '.break'.

<HR>
     does a line break. Then a line containing dashes (-) is inserted as
     literal text into the RUNOFF file.

<P>
     starts a new line and puts a '.blank' on it.

<CENTER> + </CENTER>
     is converted to '.center;text-to-center'.

<STRONG> + </STRONG>
     enables bold printing in RUNOFF and inserts the control characters
     that mark the begin and the end of the bold text into the RUNOFF file.
     After that, bold printing is turned off again.

<EM> + </EM>
     is usually displayed in italics within a browser. HTML2RNO tells
     RUNOFF to underline the embedded text.
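
The mapping described above can be illustrated with a minimal sketch. It is
not the code of HTML2RNO.PY: it assumes a modern Python with the standard
html.parser module, the names TagMapSketch and convert() are hypothetical,
and only a few of the tags are handled. The exact lines that HTML2RNO.PY
writes (for example the bullet character passed to '.list') are not shown in
this document, so the output below is only an approximation.

```python
from html.parser import HTMLParser

# <H1> .. <H6>; the parser reports tag names in lower case.
HEADINGS = {"h%d" % n for n in range(1, 7)}


class TagMapSketch(HTMLParser):
    """Hypothetical converter: a few HTML tags become RUNOFF command lines."""

    # Command lines taken from the list above (no list options are passed).
    START = {"ul": ".list", "ol": ".list", "li": ".list element",
             "br": ".break", "p": ".blank"}
    END = {"ul": ".end list", "ol": ".end list"}

    def __init__(self):
        super().__init__()
        self.lines = []        # lines of the generated '.RNO' text
        self.heading = None    # pending '.header level N' prefix

    def handle_starttag(self, tag, attrs):
        if tag in HEADINGS:
            # Assumption: the heading text follows the command on one line.
            self.heading = ".header level " + tag[1]
        elif tag in self.START:
            self.lines.append(self.START[tag])

    def handle_endtag(self, tag):
        if tag in self.END:
            self.lines.append(self.END[tag])

    def handle_data(self, data):
        text = " ".join(data.split())   # collapse whitespace like a browser
        if not text:
            return
        if self.heading is not None:
            text = self.heading + " " + text
            self.heading = None
        self.lines.append(text)


def convert(html_text):
    parser = TagMapSketch()
    parser.feed(html_text)
    parser.close()
    return "\n".join(parser.lines)


print(convert("<H2>Tools</H2><UL><LI>one<LI>two</UL>"))
```

<PRE> is left out of this sketch on purpose, because literal text must not
have its whitespace collapsed.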

  ------------------------------------------------------------------------
HTML2RNO uses the <RNO> tag to pass additional data to RUNOFF. As far as I
know, <RNO> is not a valid HTML tag and browsers should ignore it. The
following 'attributes' (@@ is that the correct name?) are implemented:

NOCONVERT
     is used to tell the procedure CVT__HTML.COM that the '.HTML' file need
     not be converted. This tag must be in the first line of the document.

INLINE="data"
     inserts 'data' directly into the '.RNO' file.

     Example:
     aaa<RNO INLINE="|">bbb
     Such a construct can be used to suggest a possible line break to
     RUNOFF. I'm not aware of any browser that right-justifies a paragraph,
     or of any HTML directive that indicates a possible line break within
     a word.

INSERT_LITERAL="data"
     inserts 'data' as a separate line into the '.RNO' file, enclosed
     between an '.end literal' line and a '.literal' line. This avoids the
     unnecessary empty lines that appear if one uses the following
     technique instead:
     text</PRE><RNO LINE=".test page 4"><PRE>

     Example:
     <PRE>
     text-1
     text-2<RNO INSERT_LITERAL=".test page 4">more-text-1
     more-text-2
     </PRE>

     The resulting output in the '.RNO' file looks like:

     : .literal
     :
     : text-1
     : text-2
     : .end literal
     : .test page 4
     : .literal
     : more-text-1
     : more-text-2
     :
     : .end literal

     (There are currently superfluous empty lines if one uses <PRE> and
     </PRE>.)

LINE="data"
     inserts 'data' as a separate line into the '.RNO' file.

     Example:
     <RNO LINE=".test page 10">
     This tells RUNOFF that there must be at least 10 free lines on the
     current page. If not, RUNOFF starts a new page. A browser has only one
     huge 'page' and the user scrolls through that page.

NEWLINE
     just starts a new line in the '.RNO' file.
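
The handling of the <RNO> attributes can be sketched as follows. This is an
illustration under assumptions, not the implementation in HTML2RNO.PY: it
uses the html.parser module of a modern Python, the names RnoTagSketch and
render() are hypothetical, and NOCONVERT is ignored because it is evaluated
by the conversion procedure rather than by the converter. The sketch does not
try to reproduce the superfluous empty lines mentioned above.

```python
from html.parser import HTMLParser


class RnoTagSketch(HTMLParser):
    """Hypothetical handling of <PRE> and the <RNO> attributes."""

    def __init__(self):
        super().__init__()
        self.out = []   # pieces of the generated '.RNO' text

    def _fresh_line(self):
        # Ensure that the next piece starts at the beginning of a line.
        if self.out and not self.out[-1].endswith("\n"):
            self.out.append("\n")

    def handle_starttag(self, tag, attrs):
        if tag == "pre":
            self._fresh_line()
            self.out.append(".literal\n")
        elif tag == "rno":
            # Attribute names arrive in lower case; NEWLINE has value None.
            for name, value in attrs:
                if name == "inline":
                    self.out.append(value)             # directly in the text
                elif name == "line":
                    self._fresh_line()
                    self.out.append(value + "\n")      # on a line of its own
                elif name == "insert_literal":
                    self._fresh_line()
                    self.out.append(".end literal\n" + value + "\n.literal\n")
                elif name == "newline":
                    self.out.append("\n")

    def handle_endtag(self, tag):
        if tag == "pre":
            self._fresh_line()
            self.out.append(".end literal\n")

    def handle_data(self, data):
        self.out.append(data)


def render(html_text):
    parser = RnoTagSketch()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)


print(render('aaa<RNO INLINE="|">bbb'))
print(render('<PRE>a<RNO INSERT_LITERAL=".test page 4">b</PRE>'))
```

Keeping the output as a list of text pieces makes it easy to decide whether a
line break is still needed before a command line is written.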

  ------------------------------------------------------------------------
A test procedure is located in [.VMS.TEST]HTML2RNO-TEST.COM. It converts
[.VMS.TEST]HTML2RNO-TEST.HTML to [.VMS.TMP]HTML2RNO-TEST.RNO with
[.VMS.TOOLS]HTML2RNO.PY. Then the '.RNO' file is converted to a '.MEM' file
by running the RUNOFF text formatter.

[.VMS.TEST]HTML2RNO-TEST.HTML should contain a test case for each HTML tag
that HTML2RNO.PY understands.
@@ to be enhanced
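
Whether a test file really contains every tag could be checked with a small
helper like the one below. It is only a sketch and not part of the PYVMS
distribution: the names UNDERSTOOD, TagCollector and missing_tags() are
hypothetical, the tag set is copied from the lists in this document, and a
modern Python with the html.parser module is assumed.

```python
from html.parser import HTMLParser

# Tags that this document lists as understood by HTML2RNO (lower case,
# because the parser reports tag names in lower case).
UNDERSTOOD = {"h1", "h2", "h3", "h4", "h5", "h6", "pre", "title", "ul",
              "ol", "li", "dl", "dt", "dd", "br", "hr", "p", "center",
              "strong", "em", "rno"}


class TagCollector(HTMLParser):
    """Remembers the name of every start tag that occurs in the input."""

    def __init__(self):
        super().__init__()
        self.seen = set()

    def handle_starttag(self, tag, attrs):
        self.seen.add(tag)


def missing_tags(html_text):
    """Return the understood tags that do not occur in html_text, sorted."""
    collector = TagCollector()
    collector.feed(html_text)
    collector.close()
    return sorted(UNDERSTOOD - collector.seen)


print(missing_tags("<H1>x</H1><P>y<BR>z"))
```

An empty result means that each listed tag occurs at least once. It does not
show that the conversion of the tag is actually correct.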
  ------------------------------------------------------------------------

28-JUL-1998 ZE.
