John Chamberlain
Developer Diary · You Heard It Here First · Tuesday 17 February 2004
Taking Exception to Exceptions
Most programmers take the error handling mechanism provided by their language for granted, which is too bad, because the existing schemes could stand a lot of improvement. In an earlier column I pointed out that C# is in some respects a superior language to Java without giving any reasons why. One reason is the presence of checked exceptions in Java (C# has none). I dislike the try-catch methodology used by both languages, but Java compounds the mistake by adding checked exceptions, which intensify the pain and annoyance of the try-catch system.

The Ugly History of Structured Error Handling - Early Days

To understand the problems with checked exceptions it may help to review the ugly history of so-called "structured exception handling". First of all, in the beginning there were errors. Fine. Then the microprocessor makers (who are, remember, engineers, not programmers) decided that when the device failed that was an error, but when the program issued a bad instruction that was something different--an "exception". The term originated in the idea that the CPU had used up most of the possible operation codes, so receiving an undefined op code was, literally, exceptional. The term was later applied not just to undefined op codes but to any bad instruction: division by zero, stack overflows, NaN operands, and so on. In the old days the regular response to such an "exception" was for the CPU to fire an interrupt. The program could then have a global error handler if it set up an interrupt handler for that interrupt.

The Ugly History of Structured Error Handling - The Exception Disease Spreads

Once you start executing millions of instructions per second, the odds that at least one of them will be invalid in some way tend toward inevitability. Nevertheless, engineers think like beginners when it comes to programming: they assume everything will work--what QA people call "the positive path". When reality kept hammering them in the head, the engineers at Intel gradually got the idea that exceptions are actually the rule. Their first response was to freeze like bunnies. Their second response was to change their semantics: henceforth all interrupts would be considered a kind of exception. In other words, now any event external to the processor was an error (an "exception"). Not very logical, but remember, we are dealing with engineers here.

The Ugly History of Structured Error Handling - SEH is Born

In fairness to the engineers, the programmers were not doing much better. Operating systems could have used the interrupt-driven scheme to create software-implemented error handling, but no one did it. Unix just dumped core, and MS-DOS handled errors by displaying a blinking underscore in the upper-left corner of the screen. Nice. That left it to the engineers ("vee haf vays of making you handle errors"). When Intel created the 32-bit architecture found in the 80386, they added a special register, FS, to support error handling: if an exception occurred, a handler set up in the FS segment would automatically be called with an error code. Microsoft responded by inventing structured exception handling (SEH) for its first 32-bit operating system, OS/2. In SEH you could chain handlers and make each one either react to an error code or pass it backwards up the chain. The operating system always added a default handler at the end of the chain that would handle any error code. The GPF was born.

(By the way, when all this was going down in Windows and OS/2, the Unix programmers were still stuck in the dark ages dumping core. Nowadays Unix programmers act as if try-catch dropped from the sky and pointedly refuse to recognize that they are using a Microsoft-created invention in their C++ code. Those are the advanced Unix programmers, by the way. A lot of the average Unix programmers are still dumping core.)

The Ugly History of Structured Error Handling - The Semantics of SEH

Long before SEH, the language Lisp had a construct called throw-catch that was a goto (a labeled jump) masquerading as a noble solution to error handling. In reality it differed little from World War II-era solutions. Microsoft liked these exalted semantics and added the try-catch-finally block to the syntax of its C++ for its new SEH methodology, which wrapped the Intel-designed mechanism for handling "exceptions". Thenceforth errors would no longer be called errors. Now they were known by the engineering misnomer "exceptions".

The Ugly History of Structured Error Handling - Java One Ups SEH

When Java was designed, rather than invent a new, better way to handle errors, the inventors chose to use the solution found in Microsoft's C++ SEH: try-catch-finally. In the same way, a lot of Java's design is a blind imitation of C++. Unable to leave well enough (bad enough?) alone, the designers added the concept of checked exceptions. Not only could you throw an exception in Java, you could force the upstream caller to handle it. They also added the not-so-brilliant innovation of defining exceptions as classes instead of error codes and descriptions. In retrospect both of these decisions were so bad that it is probably a good thing they did not attempt to design their own error handling.

The result of these ill-conceived additions is that checked exceptions force good programmers to enclose many statements in clumsy try-catch blocks that interrupt the normal flow of their program logic. Bad programmers, slow programmers, or programmers who simply do not have time swallow all exceptions to get past the checking and ignore every error, completely defeating the entire point of the system. On top of these consequences, the definition of exceptions as classes instead of constants has resulted in thousands upon thousands of unnecessary classes in Java that obscure package structure and use up space both on disk and in memory.
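
The classic form of exception-swallowing is the empty catch block, which satisfies the compiler while silently discarding the failure (the file name here is purely illustrative):

	try {
		java.io.InputStream in = new java.io.FileInputStream("settings.dat"); // hypothetical file
		// ... read configuration from the stream ...
	} catch(Exception ex) {
		// swallowed: the checked exception is "handled" and the error vanishes
	}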

What Could Have Been - A Better Way to Do Error Handling

Ideally, error handling should not be done at all. Instead, programs should have different pathways according to state. For example, you might normally have a success path and a failure path, but in some circumstances you might have a tri-state path, or a single-state path when errors are not expected. This way of programming is possible in first-class languages such as Lisp by using continuation passing; however, practical considerations currently prevent Lisp from being used widely for commercial development.
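
To give the flavor of this in familiar terms, here is a minimal sketch rendered in Java (the two interfaces are invented purely for this illustration; this is the spirit of continuation passing, not the real thing): the routine takes its possible pathways as arguments and invokes exactly one of them.

	interface Success { void proceed(String value); }
	interface Failure { void recover(String reason); }

	static void lookupSetting( String key, Success ok, Failure failed ){
		String value = System.getProperty(key);
		if( value == null ){
			failed.recover("no such setting: " + key); // failure pathway
		} else {
			ok.proceed(value); // success pathway
		}
	}

There is no error-handling machinery here at all: the caller supplies both pathways up front and control simply flows down whichever one matches the state.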

In the procedural programming style found in C and Java, error handling could be improved by modeling methods by return value type and adding an error buffer to the stack. When a failure occurs, the programmer has the option of adding text to the error buffer before returning false/null. In this system you might have the following method modes:

    * binary (returns true on success, false on failure, has buffer)
    * ternary (returns true on success, false on failure, partial on qualified success, has buffer)
    * value (returns value, has buffer)
    * immediate (returns value, no buffer)
    * void (returns nothing, no buffer)

In the first case no keyword would be used--it would be the default. In the other cases a keyword would be required. The buffer could be accessed off the current object instance, or off the thread in the case of a static method. Immediate and void methods could be used when errors are not expected; in those cases, if an error occurred, a global error handler would be called. If I were designing a language I would make binary the default method mode, but I would also allow programmers to declare an entire package or class to default to a particular method mode. For example, the programmer could write "mode immediate" at the beginning of a class and all the methods in that class would default to immediate.
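
For concreteness, here is how the three-valued mode might be approximated in today's Java, using integer constants and an explicit buffer (the constants, the method, and the "timeout" setting are my own illustration of the proposal, not real syntax):

	static final int SUCCESS = 0, PARTIAL = 1, FAILURE = 2;

	static int zLoadSettings( java.util.Properties prop, StringBuffer sbError ){
		if( prop == null ){
			sbError.append("no properties supplied");
			return FAILURE;
		}
		if( prop.getProperty("timeout") == null ){
			sbError.append("timeout missing, default used"); // qualified success
			return PARTIAL;
		}
		return SUCCESS;
	}

In the proposed language the mode keyword and the built-in buffer would make all of this scaffolding disappear.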

A typical binary method (which should be the usual case in a well-written program) would look like this:

	static binary zLoadString( java.io.InputStream inputstreamResource, StringBuffer sbResource ){
		if( inputstreamResource == null ){
			return error("resource not found");
		}
		BufferedReader brFileToBeLoaded = null;
		do {
			on error: return error("Failed to read resource");
			int iFileCharacter;
			brFileToBeLoaded = new BufferedReader(new InputStreamReader(inputstreamResource));
			while(true) {
				iFileCharacter = brFileToBeLoaded.read();
				if(iFileCharacter==-1) break;
				sbResource.append((char)iFileCharacter);
			}
		} finally {
			on error: return error("Failed to close resource");
			if(brFileToBeLoaded!=null) brFileToBeLoaded.close();
		}
		return success;
	}

Compare this to the same method written in Java as it exists today:

	static boolean zLoadString( java.io.InputStream inputstreamResource, StringBuffer sbResource, StringBuffer sbError){
		if( inputstreamResource == null ){
			sbError.append("resource not found");
			return false;
		}
		BufferedReader brFileToBeLoaded = null;
		try {
			int iFileCharacter;
			brFileToBeLoaded = new BufferedReader(new InputStreamReader(inputstreamResource));
			while(true) {
				iFileCharacter = brFileToBeLoaded.read();
				if(iFileCharacter==-1) break;
				sbResource.append((char)iFileCharacter);
			}
		} catch(Exception ex) {
			sbError.append("Failed to read resource: " + ex);
			return false;
		} finally {
			try {
				if(brFileToBeLoaded!=null) brFileToBeLoaded.close();
			} catch(Exception ex) {
				sbError.append("Failed to close resource: " + ex);
				return false;
			}
		}
		return true;
	}

From the above you can see that moded methods would lead to terser code as well as built-in support for the error buffer. The current methodology is inferior on a number of counts. For example, re-throwing errors is so unwieldy that it is almost never done, so nested catch statements make little sense. Also, combining catch with finally is ad hoc, because error handling and finalization are two separate things. The methodology I propose above correctly separates finalization from error handling and makes the error statements linear instead of nested, which is more natural. Having method modes also makes it possible to return errors directly (instead of having to do it in two steps). Java proponents might claim you can do it in Java in one step by throwing an error, but this is not strictly true, because when you throw an error the upstream caller does not know which line of code failed. It is important that the upstream caller be able to test return values for failure in order to return an accurate error message.

As far as I am concerned, the end goal is an accurate error message that describes the error and gives the context in which it occurred. This is where try-catch and global handlers fall down: they lose the context of the error. With try-catch, many lines of code are enclosed and the actual failing line is not identified unless a stack trace is printed. Printing stack traces is a bad solution for two reasons: (1) it is meaningless and scary to users, and (2) much relevant information is lost. If the error occurred in a loop, which item was it: the first, the last, the 439th? If it involved a file, which file? Stack traces do not carry these pieces of information, which may be upstream from the point of error. This is why it is important to accumulate messages in an error buffer. Currently no major computer language that I am aware of supports such an error message buffer.

You can see my solution in the second of the two code examples above: I pass a string buffer into all my error-handled routines to accumulate the error message. A much better solution would be to support this buffer within the syntax of the language instead of using the ham-handed exception methodology.
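
For example, an upstream caller of zLoadString can test the return value and append its own context to the buffer, so that the final message names the exact item that failed (the caller below is hypothetical):

	static boolean zLoadAllResources( java.io.InputStream[] streams, StringBuffer sbResource, StringBuffer sbError ){
		for( int i = 0; i < streams.length; i++ ){
			if( !zLoadString(streams[i], sbResource, sbError) ){
				sbError.append(" (while loading resource " + (i + 1) + " of " + streams.length + ")");
				return false; // the accumulated message now says exactly which item failed
			}
		}
		return true;
	}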

Maybe in a future world programmers will have this ability.
