OpenVMS RTL String Manipulation (STR$) Manual

Document revision date: 30 March 2001

2.3 Selecting String Manipulation Routines

To perform a given string manipulation operation, you can often choose one of several routines from the Run-Time Library. The LIB$, OTS$, and STR$ facilities all contain string copying and dynamic string allocation routines. Furthermore, a MACRO or BLISS program can call several of these routines using either a JSB or CALL entry point.

You should consider the factors discussed in the following sections when choosing a routine to perform the desired operation.

2.3.1 Efficiency

One of the major considerations in choosing among several routines is the efficiency of the various options.

In general, LIB$ and STR$ routines execute more efficiently than the corresponding OTS$ routines. OTS$ routines usually invoke the LIB$ entry point to perform an operation.

JSB entry points usually execute more efficiently than CALL entry points. However, a high-level language cannot explicitly access a JSB entry point. Further, a JSB entry point does not establish a stack frame and executes entirely in the environment of the calling program. This means, for instance, that the called routine cannot establish its own condition handler, so it cannot regain control if an exception occurs during execution. Also, some of the efficiency gained by using the JSB entry point may be lost because the calling routine must explicitly save all of the registers that the called routine uses.

Some routines perform a specific operation that is a subset of a more general capability. These more specialized routines are usually more efficient. For example, if you want to join two strings together, STR$APPEND and STR$PREFIX are more specific, and more efficient, than STR$CONCAT. Similarly, STR$LEFT and STR$RIGHT are subsets of the capabilities of STR$POS_EXTR.

2.3.2 Argument Passing

The mechanism by which a routine passes or receives arguments may also help you to decide among several routines that perform basically the same function.

Routines in the LIB$ and STR$ facilities pass scalar input arguments by reference to CALL entry points and by immediate value to JSB entry points. OTS$ routines pass scalar input arguments by immediate value to all entry points. For most high-level languages, the default passing mechanism is by reference. Thus, if you call a LIB$ or STR$ routine from one of these languages, you do not need to specify the passing mechanism for input scalar arguments.

Some routines require you to set up and pass more arguments than others. For example, some use a single string descriptor, while others require separate arguments for the length and the address of the string. Which routine you choose then depends on the form of the information already available in your program.

2.3.3 Error Handling

Routines from the LIB$, OTS$, and STR$ facilities handle errors in string copying differently:

LIB$
The LIB$ string-copying routines return a completion status. When an output string must be truncated and its length depends on input arguments, LIB$ routines consider this to be a partial success; they therefore return LIB$_STRTRU instead of a severe error. This process corresponds to the convention of many higher-level languages, which do not consider truncation to be an error.
OTS$
The OTS$ string-copying routines signal errors that are considered fatal (such as an invalid descriptor class). In addition, each routine returns in R0 the number of bytes in the source string that were not moved to the destination string. On VAX systems, this matches the result of a MOVC5 instruction. The JSB entry points for OTS$ string-copying routines also leave registers R1 through R5 as they would be after a VAX MOVC5 instruction. See the VAX Architecture Reference Manual for a complete description of the MOVC5 instruction.
STR$
The STR$ string-copying routines generally signal errors instead of returning a completion status. In the case of truncation errors, STR$ routines return an error status with a severity of WARNING (STR$_TRU). STR$ routines consider range errors to be qualified success.
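
As a rough illustration of the OTS$ convention described above, the following Python sketch models a copy into a fixed-length destination and reports how many source bytes were not moved. It is only a model and does not call the Run-Time Library; the function name copy_fixed and the blank fill character are assumptions made for this illustration.

```python
def copy_fixed(source: bytes, dest_length: int, fill: bytes = b" ") -> tuple[bytes, int]:
    """Model a copy into a fixed-length destination of dest_length bytes.

    Returns (destination contents, number of source bytes not moved).
    The blank fill character is an assumption of this sketch.
    """
    moved = min(len(source), dest_length)
    destination = source[:moved] + fill * (dest_length - moved)
    unmoved = len(source) - moved
    return destination, unmoved


# A 5-byte destination cannot hold all 8 source bytes, so 3 are left unmoved.
truncated, left_over = copy_fixed(b"ABCDEFGH", 5)

# A shorter source is padded with the fill character; nothing is left unmoved.
padded, none_left = copy_fixed(b"AB", 5)
```

A nonzero count in this model corresponds to the truncation case that the LIB$ routines report as LIB$_STRTRU and the STR$ routines report as STR$_TRU.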

Table 2-4 indicates the errors and the corresponding message that each facility considers severe.

**Table 2-4 Severe Errors, by Facility**
Error	LIB$_	OTS$_	STR$_
Fatal internal error	FATERRLIB	FATINTERR	FATINTERR
Illegal string class	INVSTRDES	INVSTRDES	ILLSTRCLA
Insufficient virtual memory	INSVIRMEM	INSVIRMEM	INSVIRMEM

Some Run-Time Library routines require you to specify the length of a string or the position of a character within a string. When you refer to character positions in a string, the first position is 1. Given a string with length L, containing a substring specified by character positions M to N, the following evaluation rules apply:

If M is less than 1, M is considered to equal 1.
If M is greater than L, the substring specified is the null string.
If N is greater than L, N is considered to equal the length of the source string.
If M is greater than N, the substring specified is the null string.

When specifying a substring of length L, the following applies:

If L is less than 0, the substring specified is the null string. (A null string is a descriptor with zero length. A descriptor with a nonzero length and a zero pointer generates an error and yields unspecified results.)

If any of these evaluation rules applies, the range error status (qualified success) is returned. STR$POSITION is the exception to this convention: it returns a function value giving the character position of a substring within a string. If the function value is 0, the substring was not found.
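
The evaluation rules for character positions M to N can be sketched in Python as follows. This is a model of the rules as stated above, not Run-Time Library code; the function name substring_by_position and the boolean range-error flag are hypothetical stand-ins for the routine and its qualified-success status.

```python
def substring_by_position(source: str, m: int, n: int) -> tuple[str, bool]:
    """Apply the 1-based M..N evaluation rules to source.

    Returns (substring, range_error), where range_error is True when
    any of the evaluation rules had to be applied.
    """
    length = len(source)
    range_error = False

    if m < 1:            # M less than 1 is treated as 1
        m = 1
        range_error = True
    if n > length:       # N greater than L is treated as L
        n = length
        range_error = True
    if m > length or m > n:
        return "", True  # the substring is the null string

    # Convert the 1-based inclusive range to a Python slice.
    return source[m - 1:n], range_error


in_range = substring_by_position("ABCDE", 2, 4)
clamped_end = substring_by_position("ABCDE", 3, 10)
clamped_start = substring_by_position("ABCDE", 0, 2)
past_end = substring_by_position("ABCDE", 6, 8)
reversed_range = substring_by_position("ABCDE", 4, 2)
```

In this model, only the first call completes without the range-error flag; the other four each trigger one of the rules.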

2.4 Allocating Resources for Dynamic Strings

This section tells how to use the Run-Time Library string resource allocation routines. These routines allocate virtual memory for a dynamic string and place the address of the allocated memory in a descriptor.

Dynamic strings may be the most convenient type to write, since you need not specify constant length, maximum length, or position for them. However, dynamic strings are subject to the following restrictions:

They may cause program execution to be slower at run time.
They require larger address space.
They are not supported by all OpenVMS Alpha and OpenVMS VAX languages.

In most cases, when you call a Run-Time Library routine to manipulate dynamic strings, the Run-Time Library routine itself allocates the required memory for the string. Your program needs to allocate only the descriptors.

For example, if you are copying a source string into a dynamic destination string, simply use one of the library's string-copying routines. Copy the input string into a dynamic string whose length and address are initialized to zero. The string-copying routine then allocates the space that the calling program needs.

However, if your program must explicitly construct or modify a dynamic string descriptor, it must use the Run-Time Library allocation and deallocation routines. This technique may be necessary, for instance, if you are constructing a string out of components that are not themselves in string form. Further, you can use one of the deallocation routines to free the dynamic string after the string resources are no longer needed, in order to optimize the program's use of resources.

The Run-Time Library provides eight entry points for string resource allocation and deallocation, all with slightly different input arguments, calling techniques, or methods of indicating errors. The following tables summarize these routines and their functions.

The following routines allocate a specified number of bytes of dynamic virtual memory to a specified string descriptor.

Routine	JSB Entry Point
LIB$SGET1_DD	LIB$SGET1_DD_R6
LIB$SGET1_DD_64	LIB$SGET1_DD_R6
OTS$SGET1_DD	OTS$SGET1_DD_R6
STR$GET1_DX	STR$GET1_DX_R4
STR$GET1_DX_64	STR$GET1_DX_R4

The following routines return one dynamic string area to free storage, and set the descriptor POINTER and LENGTH fields to zero.

Routine	JSB Entry Point
LIB$SFREE1_DD	LIB$SFREE1_DD6
OTS$SFREE1_DD	OTS$SFREE1_DD6
STR$FREE1_DX	STR$FREE1_DX_R4

The following routines return one or more dynamic string areas to free storage, and set the descriptor POINTER and LENGTH fields to zero.

Routine	JSB Entry Point
LIB$SFREEN_DD	LIB$SFREEN_DD6
OTS$SFREEN_DD	OTS$SFREEN_DD6

When you call the dynamic string allocation routines, consider the following factors:

When your program calls a string allocation routine, it needs to allocate space only for the string descriptor before making the call. Your program does this with the declaration statements of the language in use, either statically at compile time or dynamically in local stack storage or heap storage.
If your routine explicitly allocates dynamic string descriptors in stack storage, it must explicitly free the associated dynamic string areas by calling the LIB$SFREE1_DD, OTS$SFREE1_DD, or STR$FREE1_DX routine. Then your routine must free the storage for the descriptor. After both areas have been freed, your routine can return to the calling program. If the deallocation is not done, the dynamic string area becomes unavailable when the RET instruction removes the descriptors that point to the string area.
If a routine has explicitly allocated dynamic string areas, and the routine is then unwound by the Condition Handling Facility (CHF), the allocated address space cannot be referenced again. For this reason, your program should establish a handler that frees the associated dynamic string areas when the SS$_UNWIND condition is signaled. The handler can free these areas by calling one of the deallocation routines. This technique is especially important if a large amount of address space is involved, or if the routine allocates space within a repeating loop.
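
The descriptor life cycle described in this section can be modeled with a short Python sketch. Everything here is hypothetical bookkeeping for illustration (the class names, the made-up addresses, and the simplification that every allocation starts from a fresh area); it does not reproduce the behavior of the actual allocation routines in detail.

```python
from dataclasses import dataclass


@dataclass
class DynamicDescriptor:
    """Stand-in for a dynamic string descriptor (hypothetical model)."""
    length: int = 0   # LENGTH field; 0 for a null string
    pointer: int = 0  # POINTER field; 0 means no string area is allocated


class StringAreaModel:
    """Toy bookkeeping for string areas; not the Run-Time Library allocator."""

    def __init__(self) -> None:
        self._areas: dict[int, bytearray] = {}
        self._next_address = 0x1000  # arbitrary made-up starting address

    def allocate(self, desc: DynamicDescriptor, nbytes: int) -> None:
        """Give desc a fresh area of nbytes (simplified: always reallocates)."""
        self.free(desc)
        if nbytes > 0:
            desc.pointer = self._next_address
            self._areas[desc.pointer] = bytearray(nbytes)
            self._next_address += nbytes
        desc.length = nbytes

    def copy(self, desc: DynamicDescriptor, source: bytes) -> None:
        """Copy source into desc, allocating the space the string needs."""
        self.allocate(desc, len(source))
        if source:
            self._areas[desc.pointer][:] = source

    def free(self, desc: DynamicDescriptor) -> None:
        """Release the string area and set POINTER and LENGTH to zero."""
        if desc.pointer:
            del self._areas[desc.pointer]
        desc.pointer = 0
        desc.length = 0

    def live_areas(self) -> int:
        """Number of string areas that have not been freed."""
        return len(self._areas)


model = StringAreaModel()
desc = DynamicDescriptor()   # length and address initialized to zero
model.copy(desc, b"HELLO")   # the copy allocates the string area
length_after_copy = desc.length
model.free(desc)             # free the area before the descriptor goes away
```

If the final call to free is skipped and the descriptor is simply discarded, the model still holds the string area with nothing pointing to it, which is the situation the factors above warn about.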

You can call the string resource allocation routines only from user mode, at asynchronous system trap (AST) or non-AST level. However, be extremely careful if you manipulate dynamic strings at AST level. The string manipulation routines in the Run-Time Library do not prevent the strings that they are manipulating at non-AST level from being modified at AST level.

For example, consider the case in which a string manipulation routine has calculated the lengths and addresses involved in a concatenation operation. This string manipulation routine may be interrupted by an AST. The user, at AST level, may write to the same string, changing its length and address. It is then possible to resume execution of the routine with addresses that are no longer allocated or string lengths that are no longer valid. For this reason, if you use dynamic strings at AST level, you should allocate, use, and deallocate them within the AST code.

The dynamic string manipulation routines are intended for use at user mode only. To manipulate dynamic strings at another access mode, you should allocate and deallocate storage for each string at that access mode to avoid side effects. Link each segment of your program that runs at a different access mode with the /NOSYSSHR qualifier. In this way, you establish a separate copy of the string database for each access mode.

2.4.1 String Zone

All virtual memory for dynamic strings is allocated from a Run-Time Library zone called the string zone.

The string zone has the following benefits:

Efficient memory utilization.
Allocation and deallocation of long strings (more than 136 bytes on a VAX system and more than 272 bytes on an Alpha system) are twice as fast.
Elimination of paging contention with the default zone by isolation of the string virtual memory accesses to a separate zone. A direct side effect of this is that corruptions caused by writing into previously freed strings no longer affect items allocated in the default zone, directly easing the debugging effort for such problems.

Table 2-5 shows attribute values for 32-bit and 64-bit string zones. VAX systems have a 32-bit string zone; Alpha systems have both a 32-bit and a 64-bit string zone.

**Table 2-5 String Zone Attributes**
Attribute	32-bit String Zone	64-bit String Zone
Algorithm	Quick fit	Quick fit
Number of lookaside lists	17 (short strings from 8 to 136 bytes)	17 (short strings from 8 to 272 bytes)
Area of initial size	4 pages	4 pages
Area of extension size	32 pages	32 pages
Block size	8 bytes	16 bytes
Alignment	Longword boundary	Quadword boundary
Smallest block size	16 bytes (includes boundary tags)	32 bytes (includes boundary tags)
Boundary tags	Boundary tags are used for long strings	Boundary tags are used for long strings
Page limit	No page limit	No page limit
Fill on allocate	No fill on allocate	No fill on allocate
Fill on free	No fill on free	No fill on free
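
The figures in Table 2-5 are consistent with one lookaside list per multiple of the block size: 17 lists of 8-byte blocks reach 136 bytes, and 17 lists of 16-byte blocks reach 272 bytes. The Python sketch below checks that arithmetic. The mapping from request size to list index is an assumption inferred from the table, not a documented description of the allocator, and the function name is hypothetical.

```python
from typing import Optional


def lookaside_list_index(request_bytes: int, block_size: int,
                         list_count: int = 17) -> Optional[int]:
    """Return a 0-based lookaside list index, or None for a long string.

    Assumes (for illustration only) that list i serves requests of
    (i + 1) * block_size bytes after rounding up to whole blocks.
    """
    if request_bytes < 1:
        raise ValueError("request must be at least 1 byte")
    blocks = -(-request_bytes // block_size)  # round up to whole blocks
    if blocks > list_count:
        return None  # long string: not served from a lookaside list here
    return blocks - 1


# Largest "short string" size implied by 17 lists in each zone.
largest_short_32 = 17 * 8    # 32-bit string zone, 8-byte blocks
largest_short_64 = 17 * 16   # 64-bit string zone, 16-byte blocks
```

Under this assumption, a 137-byte request in the 32-bit zone and a 273-byte request in the 64-bit zone both fall outside the lookaside lists, which matches the long-string thresholds quoted earlier in this section.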

Part 2
STR$ Reference Section

This section contains detailed descriptions of the routines in the OpenVMS RTL String Manipulation (STR$) facility.

STR$ADD

The Add Two Decimal Strings routine adds two decimal strings of digits.

Format

STR$ADD asign ,aexp ,adigits ,bsign ,bexp ,bdigits ,csign ,cexp ,cdigits

RETURNS

OpenVMS usage: cond_value

type: longword (unsigned)

access: write only

mechanism: by value

Arguments

asign

OpenVMS usage: longword_unsigned

type: longword (unsigned)

access: read only

mechanism: by reference

Sign of the first operand. The asign argument is the address of an unsigned longword containing this sign. A value of 0 is considered positive; a value of 1 is considered negative.
aexp

OpenVMS usage: longword_signed

type: longword (signed)

access: read only

mechanism: by reference

Power of 10 by which adigits is multiplied to get the absolute value of the first operand. The aexp argument is the address of a signed longword containing this exponent.
adigits

OpenVMS usage: char_string

type: character string

access: read only

mechanism: by descriptor

Text string of unsigned digits representing the absolute value of the first operand before aexp is applied. The adigits argument is the address of a descriptor pointing to this string. This string must be an unsigned decimal number.
bsign

OpenVMS usage: longword_unsigned

type: longword (unsigned)

access: read only

mechanism: by reference

Sign of the second operand. The bsign argument is the address of an unsigned longword containing the second operand's sign. A value of 0 is considered positive; a value of 1 is considered negative.
bexp

OpenVMS usage: longword_signed

type: longword (signed)

access: read only

mechanism: by reference

Power of 10 by which bdigits is multiplied to get the absolute value of the second operand. The bexp argument is the address of a signed longword containing the second operand's exponent.
bdigits

OpenVMS usage: char_string

type: character string

access: read only

mechanism: by descriptor

Text string of unsigned digits representing the absolute value of the second operand before bexp is applied. The bdigits argument is the address of a descriptor pointing to this string. This string must be an unsigned decimal number.
csign

OpenVMS usage: longword_unsigned

type: longword (unsigned)

access: write only

mechanism: by reference

Sign of the result. The csign argument is the address of an unsigned longword containing the result's sign. A value of 0 is considered positive; a value of 1 is considered negative.
cexp

OpenVMS usage: longword_signed

type: longword (signed)

access: write only

mechanism: by reference

Power of 10 by which cdigits is multiplied to get the absolute value of the result. The cexp argument is the address of a signed longword containing this exponent.
cdigits

OpenVMS usage: char_string

type: character string

access: write only

mechanism: by descriptor

Text string of unsigned digits representing the absolute value of the result before cexp is applied. The cdigits argument is the address of a descriptor pointing to this string. This string is an unsigned decimal number.

Description

STR$ADD adds two strings of decimal numbers (a and b). Each number to be added is passed to STR$ADD in three arguments:

xdigits: the string portion of the number
xexp: the power of 10 needed to obtain the absolute value of the number
xsign: the sign of the number

The value of the number x is derived by multiplying xdigits by 10^xexp and applying xsign. For example, if xdigits is '2', xexp is 3, and xsign is 1, the number represented by the x arguments is 2 * 10^3 with a negative sign, or -2000.
The result of the addition, c, is also returned in those three parts.
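
The three-part representation can be modeled in Python to check values by hand. The sketch below is not STR$ADD itself: the function names are hypothetical, and expressing the result at the smaller of the two exponents is an assumption of this model, so the actual routine may normalize its result differently.

```python
from fractions import Fraction


def decode(sign: int, exp: int, digits: str) -> Fraction:
    """Return the exact value of a (sign, exponent, digits) triple."""
    value = Fraction(int(digits)) * Fraction(10) ** exp
    return -value if sign == 1 else value


def add_decimal_strings(asign: int, aexp: int, adigits: str,
                        bsign: int, bexp: int, bdigits: str) -> tuple[int, int, str]:
    """Add two triples and return (csign, cexp, cdigits)."""
    cexp = min(aexp, bexp)  # assumption: align both operands to the smaller exponent
    a = int(adigits) * 10 ** (aexp - cexp)
    b = int(bdigits) * 10 ** (bexp - cexp)
    if asign == 1:
        a = -a
    if bsign == 1:
        b = -b
    total = a + b
    csign = 1 if total < 0 else 0
    return csign, cexp, str(abs(total))


# A = -1000 and B = 0.0002, written as (sign, exponent, digits).
csign, cexp, cdigits = add_decimal_strings(1, 3, "1", 0, -4, "2")
c_value = decode(csign, cexp, cdigits)
```

For these operands the model yields a sign of 1, an exponent of -4, and the digits '9999998', which decodes to -999.9998.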

Condition Values Returned

SS$_NORMAL Routine successfully completed.

STR$_TRU String truncation warning. The destination string could not contain all the characters in the result string.

Condition Values Signaled

LIB$_INVARG Invalid argument.

STR$_FATINTERR Fatal internal error. An internal consistency check has failed. This usually indicates an internal error in the Run-Time Library and should be reported to your Compaq support representative.

STR$_ILLSTRCLA Illegal string class. The class code found in the class field of a descriptor is not a string class code allowed by the OpenVMS calling standard.

STR$_INSVIRMEM Insufficient virtual memory. STR$ADD could not allocate heap storage for a dynamic or temporary string.

STR$_WRONUMARG Wrong number of arguments.

Example


100 !+ 
    ! This is a sample arithmetic program 
    ! showing the use of STR$ADD to add 
    ! two decimal strings. 
    !- 
 
    ASIGN% = 1% 
    AEXP% = 3% 
    ADIGITS$ = '1' 
    BSIGN% = 0% 
    BEXP% = -4% 
    BDIGITS$ = '2' 
    CSIGN% = 0% 
    CEXP% = 0% 
    CDIGITS$ = '0' 
    PRINT "A = "; ASIGN%; AEXP%; ADIGITS$ 
    PRINT "B = "; BSIGN%; BEXP%; BDIGITS$ 
    CALL STR$ADD        (ASIGN%, AEXP%, ADIGITS$, & 
                        BSIGN%, BEXP%, BDIGITS$,  & 
                        CSIGN%, CEXP%, CDIGITS$) 
    PRINT "C = "; CSIGN%; CEXP%; CDIGITS$ 
999 END

This BASIC example uses STR$ADD to add two decimal strings, where the following values apply:
A = -1000 (ASIGN = 1, AEXP = 3, ADIGITS = '1')
B = .0002 (BSIGN = 0, BEXP = -4, BDIGITS = '2')

The output generated by this program is listed below; note that the decimal value of C equals -999.9998 (CSIGN = 1, CEXP = -4, CDIGITS = '9999998').
A = 1 3 1
B = 0 -4 2
C = 1 -4 9999998
