August, 1998. Beta version; documentation updated August, 1998
This is a string library that is intended to be compatible with the class string library in the December 1996 draft of the C++ standard. My version is for strings of characters of type char only.
It is intended for people who do not have access to an official version of the string library or wish to use a version without templates.
It follows the standard class string as I understand it, except that a few functions that are relevant only to the template version are omitted, all the functions involving iterators are omitted and the input/output routines are not standard.
I use the name String rather than string to prevent conflicts with other string libraries (as in BC 5.0).
I claim copyright for this program. The initial version was taken from Tony Hansen's book The C++ answer book, but very little of Tony's code remains.
Permission is granted to use this, but not to sell. I take no responsibility for errors, omissions etc, but please tell me about them.
This library links into my exception package. You need to edit the file include.h to determine whether to use simulated exceptions or compiler supported exceptions or simply to disable exceptions. More information on the exception package is given in the documentation for my matrix library, newmat09.
The package uses a limited form of copy-on-write (see Tony Hansen's book for more details) and also attempts to avoid repeated reallocation of the string storage during a multiple sum. This results in some saving in space and time for some operations at the expense of an increase in the complexity of the program and an increase in the time used by a few operations. As with newmat09 it is still an open question whether, or under what circumstances, the extra complexity is really warranted.
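To make the copy-on-write idea concrete, here is a minimal sketch in modern C++. It is not the code used in this package: the class CowText and its members are hypothetical names, and std::shared_ptr and std::string stand in for the package's own storage management. Copies share one block of characters until one of them is written to.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical illustration class; not part of the String package.
class CowText {
public:
    explicit CowText(const char* s) : rep_(std::make_shared<std::string>(s)) {}

    // Copying shares the representation; no characters are copied here.
    CowText(const CowText&) = default;
    CowText& operator=(const CowText&) = default;

    std::size_t size() const { return rep_->size(); }
    char get(std::size_t pos) const { return (*rep_)[pos]; }

    // Writing detaches first if the representation is shared
    // (single-threaded sketch; use_count() is not a safe test across threads).
    void set(std::size_t pos, char c) {
        assert(pos < rep_->size());
        if (rep_.use_count() > 1)
            rep_ = std::make_shared<std::string>(*rep_);
        (*rep_)[pos] = c;
    }

    // True if both objects currently share one block of storage.
    bool shares_with(const CowText& other) const { return rep_ == other.rep_; }

private:
    std::shared_ptr<std::string> rep_;
};
```

After CowText b = a; the two objects share storage; the first call to b.set(...) gives b its own copy and leaves a unchanged.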
The following files are included in this package
str.h | header file for the string library |
str.cpp | function bodies |
boolean.h | simulation of the standard boolean type |
myexcept.h | header for the exceptions simulator |
myexcept.cpp | bodies for the exceptions simulator |
include.h | options header file (see documentation in newmat09) |
strtst.cpp | test program |
strtst.txt | output from the test program |
test_exs.cpp | test exceptions |
test_exs.txt | output from test_exs |
readme.txt | readme file |
string.htm | this file |
st_gnu.mak | make file for gnu c++ |
st_cc.mak | make file for CC |
I have tested this program on the Borland 5.0, 4.53 (32 bit only, test program won't run under 16 bit), 3.1, MS VC++ 5, Watcom 10a, gnu 2.7.2, 2.8.0 and Sun CC compilers.
For Borland 5.0, MS VC++ 5 and Gnu you need to edit include.h to disable my simulated Booleans.
CC compilers generate 14 error messages when running the strtst test program. I suspect these are due to a slightly different convention in deleting temporaries and don't matter.
For the indexes, lengths etc I use unsigned integer (typedefed to uint). This is instead of size_type in the official package. Using size_type (or size_t) as a type of variable seems too bizarre for me to use (as yet).
You will need to #include files include.h and str.h in your programs that use this package. Don't forget to edit include.h to determine whether exceptions are to be used, simulated or disabled. If you use the simulated exceptions you should turn off the exception capability of a compiler that does support exceptions. If your compiler supports bool variables edit the option in include.h to disable my simulated bool variables.
static uint npos | String::npos is the largest possible value of uint and is used to indicate that a find function has failed to find its target. All Strings must have length strictly less than String::npos |
String() | construct a String of zero length |
String(const String&str) | copy constructor (not explicitly in standard) |
String(const String&str, uint pos, uint n = npos) | construct a String from str starting at location pos (first location = 0) and continuing for the length of the String or for n characters, whichever occurs first |
String(const char* s, uint n) | construct a String from s taking a maximum of n characters or the length of the c-string s, whichever is smaller |
String(const char* s) | construct a String from s |
String(uint n, char c) | construct a String consisting of n copies of the character c |
~String() | the destructor |
String& operator=(const String& str) | copy a String (except that it may be able to avoid copying) |
String& operator=(const char* s) | set a String equal to a c-style character string pointed to by s |
String& operator=(const char c) | set a String equal to a character |
uint size() const | the length of the String (does not include a trailing zero - in most cases there isn't one) |
uint length() const | same as size |
uint max_size() const | the maximum size of a String - not sure what the standard wants, I have set it to npos-1 |
void resize(uint n, char c = 0) | change the size of a String, either by truncating or filling out with copies of character c (std does default separately) |
uint capacity() const | the total space allocated for a String (always >= size()) |
void reserve(uint res_arg = 0) | change the capacity of a String to the maximum of res_arg and size(). This may be an increase or a decrease in the capacity. |
void clear() | erase the contents of the string |
bool empty() const | true if the String is empty; false otherwise |
char operator[](uint pos) const | return the pos-th character; return 0 if pos = size() |
char& operator[](uint pos) | return a reference to the pos-th character; undefined if pos>=size() - I throw an exception. This reference may become invalid after almost any manipulation of the String |
char at(uint n) const | same as operator[] const |
char& at(uint n) | same as operator[]. Throw an exception if n >= size() |
String& operator+=(const String& rhs) | append rhs to a String (I don't invalidate pointers and references to the stored c-string if the new extended String will fit into the capacity of the old String - see policy on reallocation) |
String& operator+=(const char* s) | append the c-string defined by s to a String - see note above |
String& operator+=(char c) | append the character c to a String - see note above |
String& append(const String& str) | append str to a String - see note above |
String& append(const String& str, uint pos, uint n = npos) | append String(str,pos,n) - see note above |
String& append(const char* s, uint n) | append String(s,n) - see note above |
String& append(const char* s) | append String(s) - see note above |
String& append(uint n, char c = 0) | append n copies of the character c - see note above |
String& assign(const String& str) | replace the String by str - this and the following manipulation functions may invalidate pointers and references to the String storage - see the section policy on reallocation. (this function is not explicitly in the standard) |
String& assign(const String& str, uint pos, uint n = npos) | replace the String by String(str,pos,n) |
String& assign(const char* s, uint n) | replace the String by String(s, n) |
String& assign(const char* s) | replace the String by String(s) |
String& assign(uint n, char c = 0) | replace the String by String(n,c) |
String& insert(uint pos1, const String& str) | insert str before character pos1 (not explicitly in standard) |
String& insert(uint pos1, const String& str, uint pos2, uint n = npos) | insert String(str,pos2,n) before character pos1 |
String& insert(uint pos, const char* s, uint n = npos) | insert String(s,n) before character pos (std does default separately) |
String& insert(uint pos, uint n, char c = 0) | insert n copies of the character c before character pos |
String& erase(uint pos = 0, uint n = npos) | erase characters starting at pos and continuing for n characters or till the end of the String. This was originally called remove |
String& replace(uint pos1, uint n1, const String& str) | erase(pos1,n1); insert(pos1,str) |
String& replace(uint pos1, uint n1, const String& str, uint pos2, uint n2 = npos) | erase(pos1,n1); insert(pos1,str,pos2,n2) |
String& replace(uint pos, uint n1, const char* s, uint n2 = npos) | erase(pos,n1); insert(pos,s,n2); (std does default separately) |
String& replace(uint pos, uint n1, uint n2, char c = 0) | erase(pos,n1); insert(pos,n2,c) |
uint copy(char* s, uint n, uint pos = 0) const | copy a maximum of n characters from a string starting at position pos to memory starting at location given by s. Return the number of characters copied. I assume that the program has already allocated space for the characters |
void swap(String&) | a.swap(b) swaps the contents of Strings a and b. The standard also provides for a function swap(a,b) - see binary operators |
const char* c_str() const | return a pointer to the contents of a String after appending (char)0 to the String. This pointer will be invalidated by almost any operation on the String |
const char* data() const | return a pointer to the contents of a String. This pointer will be invalidated by almost any operation on the String |
uint find(const String& str, uint pos = 0) const | find the first location of str in a String starting at position pos. The location is relative to the beginning of the parent String. Return String::npos if not found |
uint find(const char* s, uint pos, uint n) const | find(String(s,n),pos) |
uint find(const char* s, uint pos = 0) const | find(String(s),pos) |
uint find(const char c, uint pos = 0) const | find(String(1,c),pos) |
uint rfind(const String& str, uint pos = npos) const | find the last location of str in a String starting at position pos. ie begin the search with the first character of str at position pos of the target String. The location is relative to the beginning of the parent String. Return String::npos if not found |
uint rfind(const char* s, uint pos, uint n) const | rfind(String(s,n),pos) |
uint rfind(const char* s, uint pos = npos) const | rfind(String(s),pos) |
uint rfind(const char c, uint pos = npos) const | rfind(String(1,c),pos) |
uint find_first_of(const String& str, uint pos = 0) const | find first of any element in str starting at pos. Return String::npos if not found |
uint find_first_of(const char* s, uint pos, uint n) const | find_first_of(String(s,n),pos) |
uint find_first_of(const char* s, uint pos = 0) const | find_first_of(String(s),pos) |
uint find_first_of(const char c, uint pos = 0) const | find_first_of(String(1,c),pos) |
uint find_last_of(const String& str, uint pos = npos) const | find last of any element in str starting at pos. Return String::npos if not found |
uint find_last_of(const char* s, uint pos, uint n) const | find_last_of(String(s,n),pos) |
uint find_last_of(const char* s, uint pos = npos) const | find_last_of(String(s),pos) |
uint find_last_of(const char c, uint pos = npos) const | find_last_of(String(1,c),pos) |
uint find_first_not_of(const String& str, uint pos = 0) const | find the first character that is not in str, starting at pos. Return String::npos if not found |
uint find_first_not_of(const char* s, uint pos, uint n) const | find_first_not_of(String(s,n),pos) |
uint find_first_not_of(const char* s, uint pos = 0) const | find_first_not_of(String(s),pos) |
uint find_first_not_of(const char c, uint pos = 0) const | find_first_not_of(String(1,c),pos) |
uint find_last_not_of(const String& str, uint pos = npos) const | find the last character that is not in str, starting at pos. Return String::npos if not found |
uint find_last_not_of(const char* s, uint pos, uint n) const | find_last_not_of(String(s,n),pos) |
uint find_last_not_of(const char* s, uint pos = npos) const | find_last_not_of(String(s),pos) |
uint find_last_not_of(const char c, uint pos = npos) const | find_last_not_of(String(1,c),pos) |
String substr(uint pos = 0, uint n = npos) const | return String(*this, pos, n) |
int compare(const String& str) const | a.compare(b) compares a and b in normal sort order. Return -1, 0 or 1 |
int compare(uint pos, uint n, const String& str) const | a.compare(pos,n,b) compares String(a,pos,n) and b in normal sort order. Return -1, 0 or 1 |
int compare(uint pos1, uint n1, const String& str, uint pos2, uint n2) const | a.compare(pos1,n1,b,pos2,n2) compares String(a,pos1,n1) and String(b,pos2,n2) in normal sort order. Return -1, 0 or 1 |
int compare(const char* s) const | return compare(String(s)) |
int compare(uint pos, uint n, const char* s) const | return compare(pos, n, String(s)) |
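The conventions in the table above (a failed find returns npos, substr defaults to the rest of the string, compare returns -1, 0 or 1) can be sketched as follows. So that the example compiles without this package, std::string is used as a stand-in for String; the helper names compare_sign and after_separator are hypothetical. Note that std::string::compare only promises a negative, zero or positive result, so the sketch normalises it to the three values listed here.

```cpp
#include <cassert>
#include <string>

// Stand-in: std::string replaces the package's String so this compiles alone.

// Normalise a compare() result to -1, 0 or 1, the values listed in the table.
int compare_sign(const std::string& a, const std::string& b) {
    int r = a.compare(b);
    return (r > 0) - (r < 0);
}

// Return the text after the first occurrence of sep,
// or an empty string if sep does not occur.
std::string after_separator(const std::string& s, char sep) {
    std::string::size_type pos = s.find(sep);
    if (pos == std::string::npos)   // a failed find reports npos
        return std::string();
    return s.substr(pos + 1);       // n defaults to npos: run to the end
}
```

With this package the same logic would presumably be written with String, uint and String::npos in place of the std:: names.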
+ means concatenate, otherwise the meanings are obvious.
String operator+(const String& lhs, const String& rhs)
String operator+(const char* lhs, const String& rhs)
String operator+(char lhs, const String& rhs)
String operator+(const String& lhs, const char* rhs)
String operator+(const String& lhs, char rhs)
bool operator==(const String& lhs, const String& rhs)
bool operator==(const char* lhs, const String& rhs)
bool operator==(const String& lhs, const char* rhs)
bool operator!=(const String& lhs, const String& rhs)
bool operator!=(const char* lhs, const String& rhs)
bool operator!=(const String& lhs, const char* rhs)
bool operator<(const String& lhs, const String& rhs)
bool operator<(const char* lhs, const String& rhs)
bool operator<(const String& lhs, const char* rhs)
bool operator>(const String& lhs, const String& rhs)
bool operator>(const char* lhs, const String& rhs)
bool operator>(const String& lhs, const char* rhs)
bool operator<=(const String& lhs, const String& rhs)
bool operator<=(const char* lhs, const String& rhs)
bool operator<=(const String& lhs, const char* rhs)
bool operator>=(const String& lhs, const String& rhs)
bool operator>=(const char* lhs, const String& rhs)
bool operator>=(const String& lhs, const char* rhs)
void swap(const String& A, const String& B)
The stream functions - not properly implemented as yet:
istream& operator>>(istream& is, String& str)
... read token from istream - mine reads a whole line
ostream& operator<<(ostream& os, const String& str)
... output a String - mine ignores width setting
istream& getline(istream& is, String& str, char delim = '\n')
... read a line - I haven't implemented this yet.
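The difference between reading a token and reading a whole line can be seen with the standard library's own string, used here only as a stand-in for String. The helper names read_token and read_line are hypothetical. The standard operator>> skips leading whitespace and stops at the next whitespace, whereas, as noted above, the operator>> in this package reads a whole line and so behaves more like the second helper.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Standard operator>>: skips leading whitespace, stops at the next whitespace.
std::string read_token(std::istream& is) {
    std::string s;
    is >> s;
    return s;
}

// std::getline: everything up to, but not including, the delimiter.
std::string read_line(std::istream& is, char delim = '\n') {
    std::string s;
    std::getline(is, s, delim);
    return s;
}
```

Code that relies on token-by-token input should therefore not assume that this package's operator>> splits on spaces.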
This section discusses under what circumstances the String data in a String object will be moved. It is unclear to me what the standard allows. Moving the String data invalidates the const char* returned by .data() and .c_str() and any reference returned by the non-const versions of .at() or operator[] (and any iterators referring to the string).
I describe here what my program does. Another standard String package may (and probably does) follow different rules.
The value returned by .c_str() will most likely become invalid after almost any operation that changes the value of the String. Also, a call to .c_str() will invalidate a const char* returned by .data() and any reference returned by .at() or operator[].
If A is a String that has been assigned a capacity with the reserve function then the following functions will not cause a reallocation (so the value returned by .data() etc. will remain valid)
    A += ...
    A.assign(...)
    A.append(...)
    A.insert(...)
    A.erase(...)
    A.replace(...)
where ... denotes a legitimate argument, providing the resulting String will fit in the assigned capacity (as set by a call to reserve).
If the resulting String will not fit into the assigned capacity the String data will be moved (so the value returned by .data() etc. will not remain valid). Also the String will no longer be regarded as having an assigned capacity.
The concept of having an assigned capacity is important in considering the behaviour of assign, erase and replace when the parameters are such that length of the String is reduced. For example
    String A = "0123456789";
    A.reserve(1);               // will set capacity to A.size() = 10
    const char* d = A.data();
    A.erase(1,9);
will leave a valid value in d whereas
    String A = "0123456789";
    const char* d = A.data();
    A.erase(1,9);
will not leave a valid value in d since the storage of the String data will have been moved.
The operator= does not conform to these rules. A = something will always remove any assigned capacity for A (and will not pick up any capacity from the something).
In this package A.reserve() or A.reserve(0) will remove any assigned capacity, i.e. it will be as though no capacity had ever been assigned. So a subsequent erase or replace that changes the length will cause a reallocation.
But don't expect anyone else's package to follow these rules.
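The rules in this section can be summarised by a small bookkeeping model. This is only a sketch of the policy as described above, not the package's code: the class ToyString and its members are hypothetical, it covers only reserve, append and erase, and it counts the occasions on which the data would be moved rather than managing memory itself.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical model of the "assigned capacity" rules; it only counts moves.
class ToyString {
public:
    explicit ToyString(const std::string& s)
        : data_(s), capacity_(s.size()), assigned_(false), moves_(0) {}

    // reserve(0) drops the assigned capacity; otherwise capacity = max(n, size()).
    void reserve(std::size_t n = 0) {
        std::size_t wanted = n > data_.size() ? n : data_.size();
        if (wanted != capacity_) relocate(wanted);
        assigned_ = (n != 0);
    }

    // Appending moves the data only if the result does not fit the capacity;
    // in that case the String also loses its assigned capacity.
    void append(const std::string& s) {
        std::size_t new_size = data_.size() + s.size();
        if (new_size > capacity_) {
            relocate(new_size);
            assigned_ = false;
        }
        data_ += s;
    }

    // Without an assigned capacity, an erase that shortens the String moves the data.
    void erase(std::size_t pos, std::size_t n) {
        assert(pos <= data_.size());
        std::size_t old_size = data_.size();
        data_.erase(pos, n);
        if (!assigned_ && data_.size() != old_size) relocate(data_.size());
    }

    const std::string& text() const { return data_; }
    std::size_t moves() const { return moves_; }
    bool has_assigned_capacity() const { return assigned_; }

private:
    void relocate(std::size_t new_capacity) { capacity_ = new_capacity; ++moves_; }

    std::string data_;      // the characters; storage behaviour is modelled below
    std::size_t capacity_;  // modelled capacity
    bool assigned_;         // true after reserve(n) with n != 0
    std::size_t moves_;     // number of times the data would have been moved
};
```

Running the two erase examples from this section through the model gives zero moves when reserve(1) was called first and one move when it was not.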
The evaluation of the concatenation expression A+B is delayed until the expression is used or until the value is referred to twice. This means that expressions such as A+B+C are evaluated in one sweep rather than having A+B formed as a temporary before evaluating A+B+C.
Unfortunately, this means that in expressions such as A + c_string the c-string c_string will be converted to a String object, before the overall String is formed. Since c-strings will usually be small I don't see this as a serious problem.
Likewise A+=X or A.append(X) will not be evaluated until the result is used (unless A has been assigned a capacity that is large enough to accommodate X). This means that sequences like
A += X1; A += X2; ...
will not cause repeated reallocations of the space used by the String data.
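The idea of evaluating a multiple sum in one sweep can be sketched as follows. This is an illustration of the general technique only, not the mechanism used inside this package: the class LazySum is a hypothetical name and std::string stands in for String. The operands are merely recorded as they arrive; the storage is sized once, for the total length, when the result is finally asked for.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: records the operands of a sum and builds the result once.
class LazySum {
public:
    // Remember the operand; no characters are copied yet.
    LazySum& add(const std::string& piece) {
        pieces_.push_back(&piece);
        return *this;
    }

    // Temporaries are rejected, since a pointer to one would dangle.
    LazySum& add(std::string&&) = delete;

    // Evaluate in one sweep: reserve the total length once, then copy each piece.
    std::string str() const {
        std::size_t total = 0;
        for (const std::string* p : pieces_) total += p->size();
        std::string result;
        result.reserve(total);
        for (const std::string* p : pieces_) result += *p;
        return result;
    }

    std::size_t pending() const { return pieces_.size(); }

private:
    std::vector<const std::string*> pieces_;  // operands must outlive the LazySum
};
```

A call such as sum.add(a).add(b).add(c) copies nothing until sum.str() is evaluated, which mirrors how A+B+C or a run of A += X statements avoids forming intermediate strings.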