American Science Institute of Technology  

 

Program Layout

After naming conventions and where to put braces (or begin..end), the other major argument programmers engage in is how to lay out a program, i.e., what indentation should one use in a well-written program? Unfortunately, the ideal program layout varies by language. The layout of an easy-to-read C/C++ program is considerably different from that of an assembly language, Prolog, or Bison/YACC program. As usual, this section will describe those conventions that generally apply to all programs. It will also discuss layouts of the standard control structures described earlier.

According to McConnell (Code Complete), research has shown that there is a strong correlation between program indentation and comprehensibility. Miara et al. ("Program Indentation and Comprehensibility") concluded that indentation in the two to four character range was optimal even though many subjects felt that six-space indentation looked better. The likely explanation is that the eye has to travel less distance to read indented code, so the reader's eyes suffer less fatigue.

Guideline:
Indentation should be three to four spaces in an indented control structure with four spaces probably being the optimal value.
Enforced Rule:
If you use tabs to indent your code, insert a comment at the very beginning of the program that states the number of positions for each tab stop. E.g., "/* This program is formatted using four character position tabstops. */"
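
As a minimal sketch of the guideline and rule above, the following C fragment opens with the tab stop comment and indents each nesting level by four positions. The function CountPositive and its parameters are hypothetical names chosen for illustration, and the brace placement is only one possible choice, not something this section prescribes.

```c
/* This program is formatted using four character position tabstops. */

#include <stddef.h>

/* Hypothetical example: count the positive entries in an array. */

size_t CountPositive( const int *Values, size_t Count )
{
    size_t Positive = 0;

    for ( size_t i = 0; i < Count; ++i )
    {
        if ( Values[ i ] > 0 )
        {
            ++Positive;
        }
    }

    return Positive;
}
```

Each level of nesting (function body, loop body, if body) moves exactly four columns to the right, so the deepest statement sits at column 12.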

Steve McConnell, in Code Complete, mentions several objectives of good program layout:

  • The layout should accurately reflect the logical structure of the code. Code Complete refers to this as the "Fundamental Theorem of Formatting." White space (blank lines and indentation) is the primary tool one can use to show the logical structure of a program.
  • Consistently represent the logical structure of the code. Some common formatting conventions (e.g., those used by many C/C++ programmers) are full of inconsistencies. For example, why does the "{" go on the same line as an "if" but below "int main()" (or any other function declaration)? A good style applies consistently.
  • Improve readability. If the indentation scheme makes a program harder to read, why waste time with it? As pointed out earlier, some schemes make the program look pretty but, in fact, make it harder to read (see the example about 2-4 vs. 6 position indentation, above).
  • Withstand modifications. A good indentation scheme shouldn't force a programmer to modify several lines of code in order to effect a small change to one line. For example, many programmers put a begin..end block (or "{".."}" block) after an if statement even if there is only one statement associated with the if. This allows the programmer to easily add new statements to the then-clause of the if statement without having to add additional syntactical elements later.
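
The "withstand modifications" objective can be sketched in C as follows. ClampToLimit is a hypothetical function used only for illustration; the point is the braces around a then-clause that currently holds a single statement.

```c
/* Hypothetical helper: clamp a value to an upper limit. */

int ClampToLimit( int Value, int Limit )
{
    /* Braces surround the single statement so that a later addition
       (logging, a counter) needs no new syntactic elements. */

    if ( Value > Limit )
    {
        Value = Limit;
    }

    return Value;
}
```

Adding a second statement to the then-clause later touches one line only; without the braces, the same change would also require inserting the block delimiters.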

The principal tool for creating good layout is whitespace (or the lack thereof, that is, grouping objects). The following paragraphs summarize McConnell's findings on the subject:

  • Grouping: Related statements should be grouped together. Statements that logically belong together should contain no arbitrary interleaving whitespace (blank lines or unnecessary indentation).
  • Blank lines: Blank lines should separate declarations from the start of code, logically related statements from unrelated statements, and blocks of comments from blocks of code.
  • Alignment: Align objects that belong together. Examples include type names in a variable declaration section, assignment operators in a sequence of related assignment statements, and columns of initialized data.
  • Indentation: Indenting statements inside block statements improves readability; see the comments and rules earlier in this section.
Rule:
At least one blank line must separate a comment on a line by itself from a line of code following or preceding the comment.
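
A short C sketch of grouping, blank lines, and alignment as summarized above. The Rectangle type and the DescribeRectangle function are hypothetical; they exist only to give the layout something to arrange.

```c
/* Hypothetical example: area and perimeter of a rectangle. */

typedef struct
{
    double  Width;
    double  Height;
    double  Area;
    double  Perimeter;
} Rectangle;

Rectangle DescribeRectangle( double Width, double Height )
{
    Rectangle  Result;
    double     HalfPerimeter;

    /* Related assignments are grouped, with "=" aligned. */

    Result.Width  = Width;
    Result.Height = Height;

    /* Derived values form a second group. */

    HalfPerimeter    = Width + Height;
    Result.Area      = Width * Height;
    Result.Perimeter = 2.0 * HalfPerimeter;

    return Result;
}
```

A blank line separates the declarations from the code, each group of related statements from the next, and every stand-alone comment from the code around it.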

This style guide uses the "Pure Blocks" layout form suggested by McConnell. This is the obvious layout scheme to use when your language supports modern structured statements like if..then..elseif..else..endif. Since this standard requires the emulation of the modern block structured statements, the Pure Blocks layout is appropriate.

Rule:
The standard layout scheme for this coding standard is the Pure Block format. For languages that do not support modern structured control statements, this coding standard specifies an emulation of these statements that allows the use of the Pure Block layout format.

In theory, a line of source code can be arbitrarily long. In practice, there are several practical limitations on source code lines. Paramount is the amount of text that will fit on a given terminal display device (we don't all have 21" high resolution monitors!) and what can be printed on a typical sheet of paper. If this isn't enough to suggest an 80 character limit on source lines, McConnell suggests that longer lines are harder to read (remember, people tend to look at only the left side of the page while skimming through a listing).

Enforced Rule:
Source code lines will not exceed 80 characters in length.

If a statement approaches the maximum limit of 80 characters, it should be broken up at a reasonable point and split across two lines. If the line is a control statement that involves a particularly long logical expression, the expression should be broken up at a logical point (e.g., at the point of a low-precedence operator outside any parentheses) and the remainder of the expression placed underneath the first part of the expression. E.g.,

    if
    (
        ( ( x + y * z ) < ( ComputeProfits( 1980, 1990 ) / 1.0775 ) ) &&
        ( ValueOfStock[ ThisYear ] >= ValueOfStock[ LastYear ] )
    )

            << statements >>

    endif;

Many statements (e.g., IF, WHILE, FOR, and function or procedure calls) contain a keyword followed by a parenthesis. If the expression appearing between the parentheses is too long to fit on one line, consider putting the opening and closing parentheses in the same column as the first character of the start of the statement and indenting the remaining expression elements. The example above demonstrates this for the "IF" statement. The following examples demonstrate this technique for other statements:

    while
    (
        ( NumberOfIterations < MaxCount ) &&
        ( i <= NumberOfIterations )
    )

        << Statements to execute >>

    endwhile;

    fprintf
    (
        stderr,
        "Error in module %s at line #%d, encountered illegal value\n",
        ModuleName,
        LineNumber
    );
Guideline:
For statements that are too long to fit on one physical 80-column line, you should break the statement into two (or more) lines at points in the statement that will have the least impact on the readability of the statement. This situation usually occurs immediately after low-precedence operators or after commas.

For block statements there should always be a blank line between the line containing an if, elseif, else, endif, while, endwhile, repeat, until, etc., and the lines they enclose. This clearly differentiates statements within a block from a possible continuation of the expression associated with the enclosing statement. It also helps clearly show the logical format of the code. Example:

    if ( ( x = y ) and PassingValue( x, y ) ) then

        Output( 'This is done' );

    endif;
Rule:
Always put a blank line between any block statement and the statement(s) it encloses.
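
The line-breaking and blank-line conventions above are shown in pseudocode; a rough C rendering might look like the sketch below. SumUntilLimit is a hypothetical function, and how the blank-line rule maps onto C braces is an assumption made here for illustration, since the text defines the rule in terms of if..endif and while..endwhile.

```c
/* Hypothetical example: add array entries until the next one would
   push the running sum past a limit. */

int SumUntilLimit( const int *Values, int Count, int Limit )
{
    int Sum = 0;
    int i   = 0;

    while
    (
        ( i < Count ) &&
        ( Sum + Values[ i ] <= Limit )
    )
    {

        Sum += Values[ i ];
        ++i;

    }

    return Sum;
}
```

The condition is split after the low-precedence "&&" operator, and the blank lines inside the braces keep the loop body visually distinct from the continued expression.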

If a procedure, function, or other program unit has a particularly long actual or formal parameter list, each parameter should be placed on a separate line. The following (C/C++) examples demonstrate a function declaration and call using this technique:

    int 
    MyFunction
    (
        int    NumberOfDataPoints,
        float  X1Root,
        float  X2Root,
        float  &YIntercept
    );


    x = MyFunction 
        (
            GetNumberOfPoints(RootArray),
            RootArray[ 0 ],
            RootArray[ 1 ],
            Solution
        );
Rule:
If an actual or formal parameter list is too long to fit a function call or definition on a single line, then place each parameter on a separate line and align them so they are easy to read.

Comments and (program) Documentation

Almost everyone agrees that a program should have good comments. Unfortunately, few people agree on the definition of a good comment. Some people, in frustration, feel that minimal comments are the best. Others feel that every line should have two or three comments attached to it. Everyone else wishes they had good comments in their programs but never seems to find the time to put them in.

It is rather difficult to characterize a "good comment." In fact, it's much easier to give examples of bad comments than it is to discuss good comments. The following list describes some of the worst possible comments you can put in a program (from worst up to barely tolerable):

  • The absolute worst comment you can put into a program is an incorrect comment. Consider the following Pascal statement:

        A := 10;  { Set 'A' to 11 }

    It is amazing how many programmers will automatically assume the comment is correct and try to figure out how this code manages to set the variable "A" to the value 11 when the code so obviously sets it to 10.
  • The second worst comment you can place in a program is a comment that explains what a statement is doing. The typical example is something like "A := 10; { Set 'A' to 10 }". Unlike the previous example, this comment is correct. But it is still worse than no comment at all because it is redundant and forces the reader to spend additional time reading the code (reading time is directly proportional to reading difficulty). It also makes the code harder to maintain, since slight changes to the code (e.g., "A := 9") require modifications to the comment that would not otherwise be required.
  • The third worst comment in a program is an irrelevant one. Telling a joke, for example, may seem cute, but it does little to improve the readability of a program; indeed, it offers a distraction that breaks concentration.
  • The fourth worst comment is no comment at all.
  • The fifth worst comment is a comment that is obsolete or out of date (though not incorrect). For example, comments at the beginning of the file may describe the current version of a module and who last worked on it. If the last programmer to modify the file did not update the comments, the comments are now out of date.

Steve McConnell provides a long list of suggestions for writing high-quality comments. These suggestions include:

  • Use commenting styles that don't break down or discourage modification. Essentially, he's saying pick a commenting style that isn't so much work people refuse to use it. He gives an example of a block of comments surrounded by asterisks as being hard to maintain. This is a poor example since modern text editors will automatically "outline" the comments for you. Nevertheless, the basic idea is sound.
  • Comment as you go along. If you put commenting off until the last moment, then it seems like another task in the software development process and management is likely to discourage the completion of the commenting task in hopes of meeting new deadlines.
  • Avoid self-indulgent comments. Also, you should avoid sexist, profane, or other insulting remarks in your comments. Always remember, someone else will eventually read your code.
  • Avoid putting comments on the same physical line as the statement they describe. Such comments are very hard to maintain since there is very little room. McConnell suggests that endline comments are okay for variable declarations. For some this might be true but many variable declarations may require considerable explanation that simply won't fit at the end of a line. One exception to this rule is "maintenance notes." Comments that refer to a defect tracking entry in the defect database are okay (note that the CodeWright text editor provides a much better solution for this -- buttons that can bring up an external file). Endline comments are also useful for marking the end of a control structure (e.g., "end{if};").
  • Write comments that describe blocks of statements rather than individual statements. Comments covering single statements tend to discuss the mechanics of that statement rather than discussing what the program is doing.
  • Focus paragraph comments on the why rather than the how. Comments should explain what the program is doing and why the programmer chose to do it that way rather than explain what each individual statement is doing.
  • Use comments to prepare the reader for what is to follow. Someone reading the comments should be able to have a good idea of what the following code does without actually looking at the code. Note that this rule also suggests that comments should always precede the code to which they apply.
  • Make every comment count. If the reader wastes time reading a comment of little value, the program is harder to read; period.
  • Document surprises and tricky code. Of course, the best solution is not to have any tricky code. In practice, you can't always achieve this goal. When you do need to resort to some tricky code, make sure you fully document what you've done.
  • Avoid abbreviations. While there may be an argument for abbreviating identifiers that appear in a program, no way does this apply to comments.
  • Keep comments close to the code they describe. The prologue to a program unit should give its name, describe the parameters, and provide a short description of the program. It should not go into details about the operation of the module itself. Internal comments should do that.
  • Comments should explain the parameters to a function, any assertions about these parameters, and whether they are input, output, or in/out parameters.
  • Comments should describe a routine's limitations, assumptions, and any side effects.
Rule:
All comments will be high-quality comments that describe the actions of the surrounding code in a concise manner.
Enforced Rule:
All comments will be up to date. If a programmer makes changes to the code, that programmer is responsible for updating the internal comments and any external documentation affected by those changes.
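
Several of the suggestions above (a prologue that documents parameters and limitations, comments that precede a block and explain the why) can be sketched together in C. AverageOf is a hypothetical routine, and the prologue layout is one possible arrangement rather than a format this standard prescribes.

```c
#include <stddef.h>

/*
 * AverageOf (hypothetical routine)
 *
 * Parameters:
 *   Values  (input)   Array of samples; must not be NULL.
 *   Count   (input)   Number of samples in Values.
 *   Mean    (output)  Receives the arithmetic mean.
 *
 * Returns 1 on success, 0 if Count is zero (Mean is left unchanged).
 * Limitations: the running sum is a double, so very large inputs
 * may lose precision. No side effects beyond writing *Mean.
 */

int AverageOf( const double *Values, size_t Count, double *Mean )
{
    double Sum = 0.0;

    /* An empty set has no mean; report failure rather than
       divide by zero. */

    if ( Count == 0 )
    {
        return 0;
    }

    for ( size_t i = 0; i < Count; ++i )
    {
        Sum += Values[ i ];
    }

    *Mean = Sum / (double) Count;
    return 1;
}
```

The only internal comment explains a decision (why the routine returns early) instead of restating what the if statement does.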

 

Unfinished Code

Often it is the case that a programmer will write a section of code that (partially) accomplishes some task but needs further work to complete a feature set, make it more robust, or remove some known defect in the code. It is common for such programmers to place comments into the code like "This needs more work," "Kludge ahead," etc. The problem with these comments is that they are often forgotten. It isn't until the code fails in the field that the section of code associated with these comments is found and their problems corrected.

Ideally, one should never have to put such code into a program. Of course, ideally, programs never have any defects in them, either. Since such code inevitably finds its way into a program, it's best to have a policy in place to deal with it, hence this section.

Unfinished code comes in five general categories: non-functional code, partially functioning code, suspect code, code in need of enhancement, and code changes that affect documentation. Non-functional code might be a stub or driver that needs to be replaced in the future with actual code, or some code that has severe enough defects that it is useless except for some small special cases. This code is really bad; fortunately, its severity prevents you from ignoring it. It is unlikely anyone would miss such a poorly constructed piece of code in early testing prior to release.

Partially functioning code is, perhaps, the biggest problem. This code works well enough to pass some simple tests yet contains serious defects that should be corrected. Moreover, these defects are known. Software often contains a large number of unknown defects; it's a shame to let some (prior) known defects ship with the product simply because a programmer forgot about a defect or couldn't find the defect later.

Suspect code is exactly that: code that is suspicious. The programmer may not be aware of a quantifiable problem but may suspect that a problem exists. Such code will need a later review in order to verify whether it is correct.

The fourth category, code in need of enhancement, is the least serious. For example, to expedite a release, a programmer might choose to use a simple algorithm rather than a complex, faster algorithm. S/he could make a comment in the code like "This linear search should be replaced by a hash table lookup in a future version of the software." Although it might not be absolutely necessary to correct such a problem, it would be nice to know about such problems so they can be dealt with in the future.

The fifth category, documentation, refers to changes made to software that will affect the corresponding documentation (user guide, design document, etc.). The documentation department can search for these defects to bring existing documentation in line with the current code.

This standard defines a mechanism for dealing with these five classes of problems. Any occurrence of unfinished code will be preceded by a comment that takes one of the following forms (where "@" denotes the standard comment delimiters in a given language and "_" denotes a single space):

@_#defect#severe_@
@_#defect#functional_@
@_#defect#suspect_@
@_#defect#enhancement_@
@_#defect#documentation_@

It is important to use all lower case and verify the correct spelling so it is easy to find these comments using a text editor search or a tool like grep. Obviously, a separate comment explaining the situation must follow these comments in the source code.
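
As an illustration of a defect comment followed by its explanatory comment, here is a minimal C sketch built on the linear search example mentioned earlier. FindKey and its parameters are hypothetical names.

```c
/* #defect#enhancement */

/* This linear search should be replaced by a hash table lookup in a
   future version of the software. */

int FindKey( const int *Keys, int Count, int Key )
{
    for ( int i = 0; i < Count; ++i )
    {
        if ( Keys[ i ] == Key )
        {
            return i;
        }
    }

    return -1;
}
```

The tag stands alone in its own comment so that a search for the exact string finds it, and the second comment carries the human-readable explanation.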

Examples in various languages:

Pascal/Delphi:

(* #defect#severe *)
{ #defect#enhancement }
(* #defect#functional *)
{ #defect#suspect }
{ #defect#documentation }

C:

/* #defect#severe */
/* #defect#suspect */
/* #defect#documentation */

C++:

/* #defect#functional */
// #defect#enhancement //

BASIC:

' #defect#functional '

Assembly (80x86):

; #defect#suspect ;

Ada:

-- #defect#enhancement --
-- #defect#documentation --

Notice the use of delimiters on both sides even if the language, technically, doesn't require them (C++, BASIC, assembly, and Ada).

Enforced Rule:
If a module contains some defects that cannot be immediately removed because of time or other constraints, the programmer will insert a standardized comment before the code so that it is easy to locate such problems in the future. The five standardized comments are "@_#defect#severe_@", "@_#defect#functional_@", "@_#defect#suspect_@", "@_#defect#enhancement_@", and "@_#defect#documentation_@" where "@" denotes the comment delimiter and "_" denotes a single space. The spelling and spacing should be exact so it is easy to search for these strings in the source tree.
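
Because the tags are spelled exactly, locating them is a plain substring search. The following C sketch shows one way a small checking tool might classify a source line; FindDefectTag and the DefectTags table are hypothetical and not part of this standard, and in practice a text editor search or grep does the same job.

```c
#include <string.h>

/* The five tag strings defined by the rule above, in the order the
   standard lists them. */

static const char *const DefectTags[] =
{
    "#defect#severe",
    "#defect#functional",
    "#defect#suspect",
    "#defect#enhancement",
    "#defect#documentation"
};

/* Hypothetical helper: return the index of the defect category found
   in a source line, or -1 if the line holds no defect tag. */

int FindDefectTag( const char *Line )
{
    int TagCount = (int) ( sizeof DefectTags / sizeof DefectTags[ 0 ] );

    for ( int i = 0; i < TagCount; ++i )
    {
        if ( strstr( Line, DefectTags[ i ] ) != NULL )
        {
            return i;
        }
    }

    return -1;
}
```

The search is case sensitive, which is why the rule insists on lower case and exact spelling: a tag written as "#Defect#Severe" would go unnoticed.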

Cross References in Code to Other Documents

In many instances a section of code might be intrinsically tied to some other document. For example, you might refer the reader to the user document or the design document within your comments in a program. This document proposes a standard way to do this so that it is relatively easy to locate cross references appearing in source code. The technique is similar to that for defect reporting, except the comments take the form:

		@  text #link#location text @

The "@" represents the comment delimiters. "Text" is optional and represents arbitrary text (although it is really intended for embedding HTML tags to provide hyperlinks to the specified document). "Location" describes the document and section where the associated information can be found.

Examples:
C/C++:

/* #link#User's Guide Section 3.1 */
// #link#Program Design Document, Page 5 //

Pascal:

(* #link#Funcs.pas module, "xyz" function *)
{ <A HREF="DesignDoc.html#xyzfunc"> #link#xyzfunc </a> }
Guideline:
If a module contains some cross references to other documents, there should be a comment that takes the form "@ text #link#location text @" that provides the reference to that other document. In this comment, the "@" represents the language's comment delimiter(s), "text" represents some optional text (typically reserved for HTML tags), and "location" is some descriptive text that describes the document (and a position in that document) related to the current section of code in the program.
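
Cross references can be located the same way as defect tags. The C sketch below finds the "#link#" marker in a source line and returns the text that follows it; FindLinkLocation is a hypothetical helper, and it makes no attempt to strip the closing comment delimiter or any trailing HTML.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return a pointer to the text that follows the
   "#link#" marker in a source line, or NULL if there is no marker.
   The trailing comment delimiter, if any, is not stripped. */

const char *FindLinkLocation( const char *Line )
{
    static const char Marker[] = "#link#";
    const char *Found = strstr( Line, Marker );

    if ( Found == NULL )
    {
        return NULL;
    }

    return Found + strlen( Marker );
}
```

For the first C/C++ example above, the returned pointer refers to the text beginning "User's Guide Section 3.1".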

 


 

 

Copyright © 1997 - 2006 American Science Institute of Technology