Extending the iostream library

Cay S. Horstmann

Original: http://www.horstmann.com/cpp/iostreams.html

Thanks to James K. Lowden for the conversion to HTML.

This article appeared originally in C++ Report, Volume 6, Number 4, May 1994.

1. Introduction

With the first release of C++ came two libraries, complex.h and stream.h. The complex library makes perfect sense, at least to that mandarin minority of programmers dealing with complex numbers. After all, overloading of the arithmetic operators is just what is needed to handle complex numbers as easily as real numbers. But did the streams really answer a pressing need? Most C programmers are perfectly happy with stdio.h. In fact, I must confess that my original attraction to C was not caused by its charming syntax but because stdio.h file handling was far easier than in Pascal. Many C++ programmers embrace classes and virtual functions but stick with stdio.h for input and output.

Why don't streams get more respect? In a nutshell, formatting. With printf, formatting is phenomenally simple:

     printf("(%8.2f,%8.2f)\n", x, y);

For a long time, it was a well-kept secret that you can do formatting with streams at all. The first description of stream formatting that I came across is [1, appendix A]. I eventually figured out that the stream equivalent of the printf statement above is

     cout << "(" << setprecision(2) << setiosflags(ios::fixed) 
<< setw(8) << x << "," << setw(8) << y << ")" << endl;

This revelation did nothing to increase my enthusiasm for streams.

Of course streams do have two advantages. They are typesafe. You can't make dumb mistakes like

     double x;
scanf("%f", &x); // double* requires %lf

And they are extensible. To provide input and output for complex numbers, one merely needs to implement

     istream& operator>>(istream&, Complex&);
ostream& operator<<(ostream&, Complex);

Now complex numbers can be extracted from, and inserted to, any kind of stream, such as file and string streams, and not just cin and cout.
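A minimal sketch of such a pair of operators, written against the std:: names of current compilers. The bare-bones Complex struct, the (re,im) text form, and the roundTrip helper are illustrative assumptions, not the layout of complex.h:

```cpp
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for the article's Complex class.
struct Complex
{  double re;
   double im;
};

// Insert in the form (re,im).
std::ostream& operator<<(std::ostream& os, Complex c)
{  return os << '(' << c.re << ',' << c.im << ')';
}

// Extract the same form; set failbit if the punctuation is wrong.
std::istream& operator>>(std::istream& is, Complex& c)
{  char open = 0, comma = 0, close = 0;
   double re = 0, im = 0;
   if (is >> open >> re >> comma >> im >> close
      && open == '(' && comma == ',' && close == ')')
   {  c.re = re;
      c.im = im;
   }
   else is.setstate(std::ios::failbit);
   return is;
}

// Round trip through string streams rather than cin and cout.
std::string roundTrip(const std::string& text)
{  std::istringstream in(text);
   Complex c = { 0, 0 };
   in >> c;
   std::ostringstream out;
   out << c;
   return out.str();
}
```

The operators never mention a concrete stream class, which is why the same code serves files, strings, and the console.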

In fact, streams are extensible in another way. Not only can input and output be defined for new data types, but new stream classes that interact with new devices can be derived from the basic stream classes. In this regard, the stream library functions as a framework for new stream classes. Until very recently, deriving new stream classes was a very black art that required reading through the source code for the stream library for guidance. A brand new book [2], devoted entirely to the stream library, removes much of that challenge.

In this article, I present an overview of the formatting and buffering architecture of streams and give two practical applications. I describe a manipulator that makes formatting much easier. For example, the print statement above becomes

     cout << "(" << setformat("%8.2f") << x << ","
          << setformat("%8.2f") << y << ")" << endl;

which is almost tolerable. And I derive a new stream class for Microsoft Windows programming that routes diagnostic messages into a special debug window. You use it just like any other ostream.

     debugout << "(" << x << "," << y << ")" << endl;

This is much nicer than using a message box for displaying diagnostics--you don't need to click `Ok' every time, and the messages stick around in the debug window.

2. Formatting

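The stream library actually performs two unrelated tasks: formatting and buffering. Formatting is the act of translating between binary data and their character representations. It is done by the class ios, the base class for both istream and ostream. The ios class keeps a format state that governs formatting. The format state specifies the field width, the fill character, the field alignment, the integer base, the floating point format and precision, and whether to show a + sign, the integer base, trailing zeros, and uppercase letters.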

The interface for setting the format state is less than stellar. For each item, there are two ways to change the setting, with an ios member function and with a manipulator.

For example, to set the fill character to *, say for printing checks, you either call

     cout.fill('*');

or

     cout << setfill('*');

The fill function is just an ordinary member function of the ios class. I will explain in the next section how the setfill manipulator works. Their effect is the same, but the manipulator is a bit more convenient since it can be combined with other << operations in the same statement. On the other hand, the member function returns the old fill value which is nice if you need to restore it.

To set the field width, you use the member function ios::width, or, somewhat illogically, the manipulator setw.

The other portions of the format state are implemented as bits or groups of bits. Bits can be turned on with the member function ios::setf or a manipulator called--you guessed it--setiosflags. To turn a bit off, use ios::unsetf or the manipulator called (I am not making this up) resetiosflags.

For example, to have a string left aligned in a field with width 20 and fill character *, use either

     cout.setf(ios::left);
cout.fill('*');
cout.width(20);
cout << "Hello, World!";

or

     cout << setiosflags(ios::left) << setfill('*') << setw(20) 
<< "Hello, World!";
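To check that the two forms really are equivalent, here is a small sketch that writes to string streams instead of cout, using the std:: names of current compilers. The function names are made up for this illustration:

```cpp
#include <iomanip>
#include <ios>
#include <sstream>
#include <string>

// Format state set through ios member functions.
std::string withMembers()
{  std::ostringstream os;
   os.setf(std::ios::left);
   os.fill('*');
   os.width(20);
   os << "Hello, World!";
   return os.str();
}

// The same format state set through manipulators.
std::string withManipulators()
{  std::ostringstream os;
   os << std::setiosflags(std::ios::left) << std::setfill('*')
      << std::setw(20) << "Hello, World!";
   return os.str();
}
```

Both functions return the 13 characters of the greeting followed by 7 asterisks.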

By default, floating point numbers are printed in general format: scientific notation is used when necessary to show the significant digits (6 by default), fixed point format otherwise. For example,

      cout << 123.456789 << ' ' << 123456789.0;

prints 123.457 1.23457e+08. (Some older compilers print three exponent digits, as in 1.23457e+008. You can get an uppercase E by setting ios::uppercase.)

You can explicitly choose either scientific or fixed format with the ios flags scientific or fixed.

      cout << setiosflags(ios::fixed);
cout.setf(ios::scientific);

To reset to general format, there isn't a flag ios::general. Instead, you must use

      cout << resetiosflags(ios::floatfield);

or

      cout.unsetf(ios::floatfield);

Don't ask why.
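A short sketch of the round trip from fixed back to general format, again on a string stream with std:: names. The function name is hypothetical:

```cpp
#include <ios>
#include <sstream>
#include <string>

// Print x once in fixed format, then once more after resetting
// the floatfield bits, which restores general format.
std::string fixedThenGeneral(double x)
{  std::ostringstream os;
   os.setf(std::ios::fixed, std::ios::floatfield);
   os << x << ' ';
   os.unsetf(std::ios::floatfield);
   os << x;
   return os.str();
}
```

With the default precision of 6, fixedThenGeneral(123.456789) yields 123.456789 followed by 123.457.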

To see a + sign for positive numbers, use ios::showpos. (This also works for integers.) To see trailing zeros, use ios::showpoint. For example,

      cout.setf(ios::fixed | ios::showpoint | ios::showpos);
cout << 123.456;

prints +123.456000.

You can change the precision from the default 6.

      cout << setprecision(10) << 123.45678;

or

      cout.precision(10);
      cout << 123.45678;

prints 123.45678. In general format, the precision denotes the number of significant digits, in fixed and scientific formats the number of digits after the decimal point. (Not all implementations handle this correctly.)

Integers can be formatted in decimal, hexadecimal or octal.

      int n = 12;
cout << hex << n << ' ';
cout << oct << n << ' ';
cout << dec << n << endl;

prints c 14 12. To show the base and print the letters in uppercase, use:

      cout.setf(ios::showbase | ios::uppercase);

Now the same numbers print as 0XC 014 12.

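Instead of the manipulators, you can also use the ios flags dec, oct and hex. Pass ios::basefield as the second argument so that the previously set base flag is cleared.

      cout.setf(ios::hex, ios::basefield);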

As a final twist, keep in mind that all format state settings persist until they are set to a different value, except for field width. Field width reverts to 0 (i.e. print as many characters as necessary) after each output operation.
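The difference between the persistent fill character and the transient field width is easy to see in a small sketch (string stream, std:: names, hypothetical function name):

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// The fill character stays in effect; the width applies to one item only.
std::string widthIsTransient()
{  std::ostringstream os;
   os << std::setfill('.') << std::setw(5) << 1   // padded to 5
      << 2                                        // width is back to 0
      << std::setw(5) << 3;                       // padded again, same fill
   return os.str();
}
```

The 2 is printed without padding because the width was consumed by the 1, while the fill character set at the start is still used for the 3.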

Table 1. Manipulators and corresponding ios operations

manipulator      ios operation

setfill          fill
setw             width
setprecision     precision
setiosflags      setf
resetiosflags    unsetf
hex              setf(ios::hex, ios::basefield)
oct              setf(ios::oct, ios::basefield)
dec              setf(ios::dec, ios::basefield)

Table 2. ios flags

ios:: flag    meaning

left          left alignment
right         right alignment
internal      sign left, remainder right
dec           decimal base
hex           hex base
oct           octal base
showbase      show integer base
showpos       show + sign
uppercase     uppercase E, X, and hex digits A ... F
fixed         fixed floating point format
scientific    scientific floating point format
showpoint     show trailing decimal point and zeros

3. Manipulators

We have seen a number of manipulators (endl, setw(n), ...) for formatting output. In this section, we will see what they are and how they work. The endl manipulator, like every manipulator that does not take an argument, is a function:

      ostream& endl(ostream& os)
{ os << '\n';
os.flush();
return os;
}

An ostream is prepared to accept a pointer to such a manipulator function:

      ostream& ostream::operator<<(ostream& (*m)(ostream&))
{ return (*m)(*this);
}
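Because the stream accepts any function with this signature, writing your own argument-free manipulator takes only a few lines. A minimal sketch with std:: names follows; the tab manipulator and the demo function are invented for this example:

```cpp
#include <ostream>
#include <sstream>
#include <string>

// A hypothetical manipulator that inserts a tab character.
std::ostream& tab(std::ostream& os)
{  return os << '\t';
}

// Using tab without parentheses passes a function pointer to operator<<.
std::string demoTab()
{  std::ostringstream os;
   os << 1 << tab << 2;
   return os.str();
}
```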

Manipulators that take an argument are more difficult. Consider the setw manipulator that takes an integer argument. It sets the width of either the next input field or the next output field.

The obvious extension of the previous method does not work:

      ios& setw(ios& s, int w)
{ s.width(w);
return s;
}
      os << setw(10) << x;

The expression setw(10) is illegal since the setw function takes two arguments. (We were lucky with functions taking a single argument--omitting the argument results in a legal object, a function pointer.)

Instead, the value of the expression setw(10) must be an object that the stream can accept.

There are two possibilities for setting this up. We can define a class setw with a constructor setw(int). Or we can define a function setw(int) that returns an object of some class. The second approach is preferred because it only introduces a new function for each manipulator, not a new class. Functions are considered lightweight features, but there is always some hesitation to introduce new classes without a compelling reason.

The setw(int) function needs to return an object of some class. That class is shared among all integer manipulators. Other manipulators take other arguments, say of type T, such as long or char*. A template SMANIP<T> defines the return type of the manipulator function. The return object is designed to remember the action to carry out (a pointer to a function that takes the stream and a value of type T) and the value to pass to that function.

Here is the class definition.

      template<class T> class SMANIP
{
public:
SMANIP(ios& (*a)(ios&,T), T v);
friend ios& operator<<(ios&, const SMANIP<T>&);
private:
ios& (*_action)(ios&,T);
T _value;
};

(The SMANIP template produces manipulators for the class ios. There are also templates IMANIP and OMANIP for istream and ostream class manipulators.)

The overloaded << operator applies the stored action on the ios object, using the stored value as the second argument.

      template<class T> ios& operator<<(ios& s, const SMANIP<T>& m)
{ return (*m._action)(s, m._value);
}

The setw(int) function returns a specific applicator, specifying the width setting action, and the integer width to be set. The width setting action is given by a function, named do_setw, with two arguments. It invokes the width operation of the ios class.

      static ios& do_setw(ios& s, int n)
{ s.width(n);
return s;
}

Now we are ready to implement the original setw(int) manipulator function

      SMANIP<int> setw(int n)
{ return SMANIP<int>(do_setw, n);
}
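The SMANIP template is not part of today's standard library, but the technique carries over unchanged. The sketch below is an assumption-laden modern rendering: Manip, doWidth, fieldWidth and demoFieldWidth are hypothetical names, the members are public to avoid the friend declaration, and operator<< returns the ostream so that further insertions can follow.

```cpp
#include <ios>
#include <ostream>
#include <sstream>
#include <string>

// Stores an action and the argument to pass to it (a stand-in for SMANIP).
template<class T> class Manip
{
public:
   Manip(std::ios_base& (*a)(std::ios_base&, T), T v) : action(a), value(v) {}
   std::ios_base& (*action)(std::ios_base&, T);
   T value;
};

// Apply the stored action to the stream, then hand the stream back.
template<class T>
std::ostream& operator<<(std::ostream& os, const Manip<T>& m)
{  m.action(os, m.value);
   return os;
}

// The width setting action.
static std::ios_base& doWidth(std::ios_base& s, int n)
{  s.width(n);
   return s;
}

// Hypothetical counterpart of setw.
Manip<int> fieldWidth(int n)
{  return Manip<int>(doWidth, n);
}

std::string demoFieldWidth()
{  std::ostringstream os;
   os << '[' << fieldWidth(4) << 42 << ']';
   return os.str();
}
```

A second manipulator with an int argument would reuse Manip<int> and add only another action function and another one-line factory function.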

4. A convenient manipulator to set format state

The stream format interface, with all its manipulators, is cumbersome and not terribly intuitive. Many programmers prefer the interface of the printf function.

According to [3, p. 243], printf accepts a format string consisting of flags (in any order), a number specifying the field width, a period followed by the precision (for floating point numbers only), and a type. Flags and types are

Flags:

-        left justification
+        show + sign
0        show leading zeros
#        use `alternate' format--leading 0 or 0x/0X for octal or
         hexadecimal numbers, show decimal point and trailing
         zeros for floating point numbers
space    if the first character is not a sign, prefix a space

Types:

d        decimal
o        octal
x, X     hexadecimal, X uses uppercase letters
f        fixed floating point
e, E     scientific floating point, E uses uppercase E
g, G     general floating point, G uses uppercase E

(Of course, this interface has its problems too, especially the `alternate' format. But it has one thing going for it--tradition.)

Our plan is to write a manipulator setformat that takes a printf style string and translates it into stream formatting instructions. For example,

     cout << setformat("%+06d");

is converted to

      cout << setiosflags(ios::internal|ios::showpos|ios::dec) 
<< setfill('0') << setw(6);

and sets a leading + sign and leading zeros, a field width of 6 and decimal number output. Unlike printf, no errors are introduced when a floating point number is printed inadvertently after a %d. While

      printf("%+06d", M_E);

will do something wrong,

      cout << setformat("%+06d") << M_E;

just prints e with the given style and field width. The d has no effect since it sets integer state to decimal.

Any characters in the format string that are not part of a format specification (starting with %) are passed to the stream verbatim.

      cout << setformat("The result is %8.2f") << x;

This is a minor convenience only. All non format characters in the string are inserted into the stream before any other insertions.

      cout << setformat("(%8.2f,%8.2f)") << x << y; // NO

will not work as one might expect from printf. Also, since setformat merely translates the printf style commands to their ios equivalents, the characteristic ios behavior is unchanged--format state is persistent until changed, except for field width which is reset to 0 after every operation.

To implement the setformat manipulator, we first code a function performing the essential conversion task that we call do_setformat.

      static ostream& do_setformat(ostream& os, const char fmt[])
      {  int i = 0;
         while (fmt[i] != 0)
         {  if (fmt[i] != '%') { os << fmt[i]; i++; }
            else
            {  i++;
               if (fmt[i] == '%') { os << fmt[i]; i++; }
               else
               {  Bool ok = TRUE;
                  int istart = i;
                  Bool more = TRUE;
                  int width = 0;
                  int precision = 6;
                  long flags = 0;
                  char fill = ' ';
                  Bool alternate = FALSE;
                  while (more) // collect the flags, in any order
                  {  switch (fmt[i])
                     {
                        case '+':
                           flags |= ios::showpos;
                           break;
                        case '-':
                           flags |= ios::left;
                           break;
                        case '0':
                           flags |= ios::internal;
                           fill = '0';
                           break;
                        case '#':
                           alternate = TRUE;
                           break;
                        case ' ':
                           break;
                        default:
                           more = FALSE;
                           break;
                     }
                     if (more) i++;
                  }
                  if (isdigit(fmt[i]))
                  {  width = atoi(fmt+i);
                     do i++; while (isdigit(fmt[i]));
                  }
                  if (fmt[i] == '.')
                  {  i++;
                     precision = atoi(fmt+i);
                     while (isdigit(fmt[i])) i++;
                  }
                  switch (fmt[i])
                  {
                     case 'd':
                        flags |= ios::dec;
                        break;
                     case 'x':
                        flags |= ios::hex;
                        if (alternate) flags |= ios::showbase;
                        break;
                     case 'X':
                        flags |= ios::hex | ios::uppercase;
                        if (alternate) flags |= ios::showbase;
                        break;
                     case 'o':
                        flags |= ios::oct;
                        if (alternate) flags |= ios::showbase;
                        break;
                     case 'f':
                        flags |= ios::fixed;
                        if (alternate) flags |= ios::showpoint;
                        break;
                     case 'e':
                        flags |= ios::scientific;
                        if (alternate) flags |= ios::showpoint;
                        break;
                     case 'E':
                        flags |= ios::scientific | ios::uppercase;
                        if (alternate) flags |= ios::showpoint;
                        break;
                     case 'g':
                        if (alternate) flags |= ios::showpoint;
                        break;
                     case 'G':
                        flags |= ios::uppercase;
                        if (alternate) flags |= ios::showpoint;
                        break;
                     default:
                        ok = FALSE;
                        break;
                  }
                  i++;
                  if (fmt[i] != 0) ok = FALSE; // the format must end the string
                  if (ok)
                  {  os.unsetf(ios::adjustfield | ios::basefield |
                        ios::floatfield);
                     os.setf(flags);
                     os.width(width);
                     os.precision(precision);
                     os.fill(fill);
                  }
                  else i = istart; // not a format: print the characters verbatim
               }
            }
         }
         return os;
      }

Imitating the implementation of the setw manipulator, setformat simply becomes

     OMANIP<const char*> setformat(const char* fmt)
     {  return OMANIP<const char*>(do_setformat, fmt);
     }

5. Buffering

The ios class concerns itself with formatting, the conversion between binary data and their character representations. Transporting these characters from and to devices is the responsibility of the streambuf class and its descendants. When a stream object is constructed, it should receive the location of a stream buffer object. The formatting functions call upon that stream buffer object to obtain or transmit characters.

For efficiency, it is usually desirable to buffer input and output, that is, to read and write data chunks of some fixed size at a time. Classes derived from streambuf perform these tasks by overriding three virtual functions:

      virtual int streambuf::overflow(int ch);
virtual int streambuf::underflow();
virtual int streambuf::sync();

Objects of class streambuf manage a buffer area that is dynamically divided into a get area for input and a put area for output. The overflow function must create space in the put area, for example by saving characters to a file, and either save its argument ch as well or place it into the put area. The underflow function must fill the get area with characters from the associated device. The sync function synchronizes the buffer with the external device, by flushing its buffers when possible and adjusting the current device level get and put positions. The streambuf class supplies (non-virtual) member functions to obtain and manipulate the pointers into the get and put areas.

One can dispense with buffering by not supplying a buffer area. Then overflow and underflow interact with the device a character at a time, and sync need not be redefined. Conversely, one can enhance an existing buffer class with working overflow (such as filebuf) and just listen in to the sync activities to route output to a second device. We will see an example of the latter in the next section.
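The unbuffered case can be sketched in a few lines with the standard library of current compilers, where overflow takes and returns int_type. In this illustration a std::string plays the role of the device; StringDeviceBuffer and demoUnbuffered are hypothetical names.

```cpp
#include <ostream>
#include <sstream>
#include <streambuf>
#include <string>

// Unbuffered stream buffer; a std::string stands in for the device.
class StringDeviceBuffer : public std::streambuf
{
public:
   const std::string& device() const { return _device; }
protected:
   // No put area was set up, so every character arrives here.
   virtual int_type overflow(int_type ch)
   {  if (!traits_type::eq_int_type(ch, traits_type::eof()))
         _device += traits_type::to_char_type(ch);
      return traits_type::not_eof(ch);
   }
private:
   std::string _device;
};

std::string demoUnbuffered()
{  StringDeviceBuffer buf;
   std::ostream os(&buf);
   os << "n = " << 42 << '\n';   // reaches the device without a flush
   return buf.device();
}
```

All formatting is still done by the ostream; the buffer class only decides where the finished characters go.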

6. Routing stream output to a debug window

Rather than routing output to a file, as the fstream class does, we will design a class that sends it to a window on the desktop. The window logs up to 16,000 characters, and long text can be retrieved using the scroll bars. Since we will only change the buffering, and not the formatting, the same formatting commands that work for all streams will continue to work. Here is how you use it.

Declare an object of type DebugStream.

     DebugStream cdebug;

To send messages to the debug stream, use the regular stream calls:

     cdebug << "Message: " << setw(5) << n << endl;

You should use << endl, not << "\n", at the end of each line to get the buffer flushed immediately. Otherwise the text stays buffered until the buffer overflows or the stream is flushed or closed.

Of course, by declaring more than one instance of class DebugStream, you can have more than one debug stream window at the same time. You can attach a debug stream to a file

     cdebug.open("debug.dat");

and all debug output is sent both to the window and the file with the given name.

This module builds upon two existing facilities, C++ streams and the Windows EDIT class. The point is not to come up with the most efficient implementation but to show how far one can get while being as lazy as possible.

The DebugStream class is derived from ostream and hence gives the user the same << facilities of regular streams, including manipulators such as setw, endl, and, yes, setformat. It constructs a buffer of type DebugStreamBuffer whose address is passed to the ostream constructor.

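      class DebugStream : public ostream
      {
      public:
         DebugStream() : ostream(new DebugStreamBuffer()), ios(0) {}
         ~DebugStream() { delete rdbuf(); }
         // rdbuf() is the DebugStreamBuffer created in the constructor
         void open(const char fname[] = 0)
         {  ((DebugStreamBuffer*) rdbuf())->open(fname); }
         void close() { ((DebugStreamBuffer*) rdbuf())->close(); }
      };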

The DebugStreamBuffer class is derived from filebuf to achieve file logging for free. It turns out that only the virtual sync() function needs to be overridden to copy the buffered text before it is sent to the file. (Alternatively, if no file logging was desired, one could turn off buffering and override the virtual overflow() function.)

      class DebugStreamBuffer : public filebuf
{
public:
DebugStreamBuffer() { filebuf::open("NUL", ios::out); }
void open(const char fname[]);
void close() { _win.close(); filebuf::close();}
virtual int sync();
private:
DebugStreamWindow _win;
};
      void DebugStreamBuffer::open(const char fname[])
{ close();
filebuf::open(fname ? fname : "NUL",
ios::out | ios::app | ios::trunc);
_win.open(fname);
}
      int DebugStreamBuffer::sync()
{ int count = out_waiting();
_win.append(pbase(), count);
return filebuf::sync();
}
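The same idea of intercepting sync can be tried out without Windows or a file. The sketch below makes several assumptions: it derives from std::streambuf rather than filebuf, uses pptr() - pbase() where the code above calls out_waiting(), and lets a std::string stand in for the debug window. WindowBuffer, BufferingDemo and demoBuffered are hypothetical names.

```cpp
#include <ostream>
#include <streambuf>
#include <string>

// Buffered stream buffer; a std::string stands in for the debug window.
class WindowBuffer : public std::streambuf
{
public:
   WindowBuffer() { setp(_area, _area + SIZE); }
   const std::string& window() const { return _window; }
protected:
   // Put area is full: empty it, then store the pending character.
   virtual int_type overflow(int_type ch)
   {  sync();
      if (!traits_type::eq_int_type(ch, traits_type::eof()))
         sputc(traits_type::to_char_type(ch));
      return traits_type::not_eof(ch);
   }
   // Copy the pending text to the "window" and reset the put area.
   virtual int sync()
   {  _window.append(pbase(), pptr() - pbase());
      setp(_area, _area + SIZE);
      return 0;
   }
private:
   enum { SIZE = 64 };
   char _area[SIZE];
   std::string _window;
};

struct BufferingDemo
{  std::string afterNewline;
   std::string afterEndl;
};

BufferingDemo demoBuffered()
{  WindowBuffer buf;
   std::ostream os(&buf);
   BufferingDemo d;
   os << "first\n";
   d.afterNewline = buf.window();   // still sitting in the put area
   os << "second" << std::endl;     // endl flushes, which calls sync
   d.afterEndl = buf.window();
   return d;
}
```

The demo also shows why each line of debug output should end with endl: text terminated by a plain newline character stays in the put area until something triggers sync.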

All code relating to streams is complete. What remains is to bring up a text window on the screen, and to provide the mechanics of stuffing text into it. To avoid reinventing the wheel, we use a window with an EDIT child window to display the text.

      class DebugStreamWindow
{
public:
DebugStreamWindow() : _win(0), _edit(0) {}
         void open(const char fname[] = 0);
void close() { if (_win) DestroyWindow( _win ); }
void append(const char text[], int count);
         ~DebugStreamWindow() { close(); }
      private:
enum { BUFSIZE = 16000 };
         int removeFirst();
int getLength();
         static long FAR PASCAL _export winProc
(HWND hwnd, UINT message, UINT wParam, LONG lParam);
         HWND _win;
HWND _edit;
};

To open the window, we go through the usual RegisterClass/CreateWindow sequence. The only nonstandard aspect is the way in which we obtain the instance handle of the current application. The window procedure of the DEBUGWIN window class creates the EDIT child window and handles WM_SIZE. Note the usual trick of storing the this pointer of the associated C++ object in the `extra' bytes.

      void DebugStreamWindow::open(const char fname[])
/* PURPOSE: Open a window for output logging
RECEIVES: fname - the file name (for the window title)
*/
{ if (_win) close();
char* title = new char[(fname ? strlen(fname) : 0) + 20];
if (title)
{ strcpy(title, "DebugStream");
if (fname)
{ strcat(title, " - (");
strcat(title, fname);
strcat(title, ")");
}
}
         HTASK hTask = GetCurrentTask();
TASKENTRY te;
te.dwSize = sizeof(te);
TaskFindHandle(&te, hTask);
HINSTANCE hInstance = te.hInst;
         WNDCLASS wndclass ;
wndclass.style = CS_HREDRAW | CS_VREDRAW;
wndclass.lpfnWndProc = winProc;
wndclass.cbClsExtra = 0;
wndclass.cbWndExtra = sizeof(void far*);
wndclass.hInstance = hInstance;
wndclass.hIcon = LoadIcon(NULL, IDI_ASTERISK);
wndclass.hCursor = LoadCursor(NULL, IDC_ARROW);
wndclass.hbrBackground = (HBRUSH) GetStockObject(WHITE_BRUSH);
wndclass.lpszMenuName = NULL;
wndclass.lpszClassName = "DEBUGWIN";
RegisterClass(&wndclass);
         _win = CreateWindow("DEBUGWIN", title,
WS_OVERLAPPEDWINDOW | WS_VISIBLE,
CW_USEDEFAULT, CW_USEDEFAULT,
CW_USEDEFAULT, CW_USEDEFAULT,
NULL, NULL,
hInstance, (LPSTR) this);
         delete[] title;
}
      long FAR PASCAL _export DebugStreamWindow::winProc
( HWND hWnd, UINT message, UINT wParam, LONG lParam )
{ DebugStreamWindow* thisWin =
(DebugStreamWindow*)GetWindowLong( hWnd, 0 );
// caveat: this pointer is 0 until after WM_CREATE;
         switch (message)
{
case WM_CREATE:
{ // braces keep the later case labels from jumping past the lpcs declaration
LPCREATESTRUCT lpcs = (LPCREATESTRUCT) lParam;
thisWin = (DebugStreamWindow*) lpcs->lpCreateParams;
SetWindowLong( hWnd, 0, (LONG) thisWin );
thisWin->_edit = CreateWindow("EDIT", NULL,
WS_CHILD | WS_VISIBLE | WS_HSCROLL | WS_VSCROLL |
ES_LEFT | ES_MULTILINE | ES_READONLY |
ES_AUTOHSCROLL | ES_AUTOVSCROLL,
0, 0, 0, 0,
hWnd, 1,
lpcs->hInstance, NULL);
return 0 ;
}
         case WM_SIZE:
MoveWindow( thisWin->_edit, 0, 0, LOWORD (lParam),
HIWORD (lParam), TRUE) ;
return 0 ;
         case WM_DESTROY:
thisWin->_win = 0;
thisWin->_edit = 0;
break;
}
return DefWindowProc (hWnd, message, wParam, lParam) ;
}

Text is entered into the edit window by the EM_REPLACESEL command, the only method I found of changing the edit text.

      void DebugStreamWindow::append(const char text[], int count)
/* PURPOSE: Append text to the output window
RECEIVES: text - the start of the text
count - the number of bytes
REMARKS: This function performs \n -> \r\n translation
*/
{ if (!_win) open();
if (!_edit || !count) return;
char* t = new char[2*count+1]; // worst case
if (t == 0) return; // out of memory
int tlen = 0; // index into t
for (int i = 0; i < count; i++)
{ if (text[i] == '\n')
t[tlen++] = '\r';
t[tlen++] = text[i];
}
t[tlen] = 0;
         int nchar = getLength();
while (nchar > 0 && nchar + tlen > BUFSIZE)
{ int removed = removeFirst();
if (removed == 0) break; // only one line left; avoid looping forever
nchar -= removed;
}
         SendMessage(_edit, EM_SETSEL, 0, MAKELONG(nchar, nchar));
SendMessage(_edit, EM_REPLACESEL, 0, (long)(const char far*)t);
SendMessage(_edit, EM_SETREADONLY, TRUE, 0);
delete[] t;
}
      int DebugStreamWindow::getLength()
/* PURPOSE: Get the length of the text in the edit box
*/
{ if (!_edit) return 0;
int linecount = SendMessage(_edit, EM_GETLINECOUNT, 0, 0L);
int nlast = SendMessage(_edit, EM_LINEINDEX, linecount-1, 0L);
if (nlast < 0) nlast = 0;
else nlast += SendMessage(_edit, EM_LINELENGTH, nlast, 0L);
return nlast;
}
      int DebugStreamWindow::removeFirst()
/* PURPOSE: Remove the first line in the edit box
RETURNS: The length of the removed line
*/
{ if (!_edit) return 0;
int nfirst = SendMessage(_edit, EM_LINEINDEX, 1, 0L);
if (nfirst >= 0)
{ SendMessage(_edit, EM_SETSEL, 0, MAKELONG(0, nfirst));
SendMessage(_edit, EM_REPLACESEL, 0,
(long)(const char far*)"");
return nfirst;
}
else return 0;
}
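The newline translation performed inside append can be isolated and tested on its own. The following sketch uses std::string in place of the manually allocated array; toCrLf is a hypothetical helper, not part of the code above.

```cpp
#include <string>

// Return a copy of text[0..count) with every \n expanded to \r\n,
// the line ending that the EDIT control expects.
std::string toCrLf(const char text[], int count)
{  std::string t;
   t.reserve(2 * count);   // worst case: every character is \n
   for (int i = 0; i < count; i++)
   {  if (text[i] == '\n') t += '\r';
      t += text[i];
   }
   return t;
}
```

As in append, the worst case doubles the length of the text, which is why the original allocates 2*count+1 characters.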

This completes the implementation of the DebugStream class. Naturally, the same mechanism can be employed to send debug messages to another destination.

7. Conclusion

Common wisdom holds that streams are superior to stdio because they are typesafe and extensible. In this article I showed that the extensibility goes far beyond the addition of new types to the formatting layer. I added a manipulator to convert printf formatting commands to their stream equivalents. And I added a new stream class that displays output in a debug window under Microsoft Windows, fully supporting all features of the ostream class (including the new format manipulator). Both additions require very little code, but the programmer must know where to hook into the streams framework. This is, of course, characteristic of programming with frameworks.

References

[1] S. B. Lippman, C++ Primer, 1st ed. Reading, MA: Addison-Wesley, 1989.

[2] S. Teale, C++ IOStreams Handbook. Reading, MA: Addison-Wesley, 1993.

[3] B. Kernighan and D. Ritchie, The C Programming Language, 2nd ed. Englewood Cliffs, NJ: Prentice Hall, 1989.