Preface

This style guide is a distillation of my own experience with additional examples and verbiage compiled from several sources, including:

In many cases, the above cited references conflict with each other, and with what I have observed as preferable practice. In those situations, I've done my best to state the justification for the recommendation I made.

Fred Koschara
May 2013

Introduction

Source code development usually involves balancing requirements, some of which are often in conflict. For example, from a computer's perspective, the best source code requires the least interpretation at run time to achieve maximum efficiency. From a network's perspective, the best source code has the least total number of bytes transmitted to achieve maximum efficiency. For someone doing maintenance, source code that is neatly formatted, well commented, and blocked into logical groups of statements requires the least effort to work with, and is therefore most efficient. For the original programmer, writing code in a creative streak, using only enough structure to keep the code organized in their mind will seem to be the most efficient pattern - until it's time to debug the code, and to ensure all of the requirements are met. When that mindset change occurs, the original programmer suddenly finds their needs to be quite similar to those of the maintainer, and stopping to fix the code structure and add documentation makes sense.

For compiled languages such as C and C++, the conflict between the formatted source and least interpretation requirements is resolved by the compiler, but that adds at least one more step to the development process, and makes the code able to run on only one type of platform. Languages that use a just-in-time (JIT) "compiler" such as PHP, Python and Java or C# don't require additional steps in development, but the "compiler" must be run each time the code is used, generally producing "bytecodes" that are then interpreted by a virtual machine. This reduces portability issues but increases the processing requirements. If a bytecode cache is used, the JIT compiler only has to be run when a source code change is detected, but the bytecode interpreter is still less efficient than running native code directly on the underlying hardware. In most cases, the difference is negligible, and the increased portability is used to justify the loss of efficiency.

Network efficiency is most often increased by using compression to reduce the amount of data transmitted, but any compression algorithm has to be in place on both ends of the wire (the receiver must use a matching decompression algorithm to recover the original data without loss or corruption) and increases the computation overhead. Compression patterns are generally established in server configuration, and are therefore out of the scope of program development. What an application developer needs to be concerned with, however, is reducing the total amount of data presented to the compressor in the first place. Things such as eliminating extraneous whitespace may not make much of a difference on a page that is only viewed infrequently, but for a Web site with millions of page views a month, it could make as much as a 10% reduction in network traffic.

In order to decide where the "best" balance can be found, a major consideration is how the lowest total lifetime cost of a piece of software can be achieved. For long-lived products, having neatly formatted and well commented source code is a major consideration because the cost of maintenance skyrockets every time someone has to figure out what the existing code is doing. Over the life of a properly maintained program, maintenance costs can be expected to far outweigh the original cost of development. For a throw-away piece of code that will only be used once, it may make sense to forgo formatting and documentation, but doing so can lead to bad programming habits, and if the code is retained as part of the documentation, or could be used "occasionally" rather than once, it should be treated as carefully as any other part of the system.

Similarly, expending original development effort to achieve maximum network performance versus reducing development time should be decided in light of reducing the total lifetime cost of the application. With a very high traffic Web site, sending neatly formatted HTML code to the browser could significantly increase transmission costs, especially for uncompressed traffic. However, the higher cost of discovering an error in an unformatted page could be more than the savings from eliminating whitespace sent to the browser. As the number of times a particular page is served declines, the savings from eliminating HTML formatting drop, but maintenance costs are likely to stay the same. Thus it makes sense to try to send properly formatted HTML code when serving Web pages in nearly every case. Most of what a Web developer can do to reduce network overhead involves things like using tabs rather than spaces, and commenting out code in PHP rather than in HTML so that the comments are never transmitted.
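As a rough illustration of that last point, the sketch below captures the output of a small fragment and shows that a PHP comment never reaches the browser, while an HTML comment is transmitted with every page view. The function name renderFragment() and the markup are invented for this example.

```php
<?php
// Hypothetical helper: renders a fragment and returns what would be sent.
function renderFragment()
{   ob_start();
    // PHP comment: stripped by the parser, never transmitted
    echo "<p>Hello</p>\n";
    echo "<!-- HTML comment: transmitted with every page view -->\n";
    return ob_get_clean();
}

$sent=renderFragment();
echo strlen($sent)." bytes would be sent\n";
```

Only the HTML comment adds to the byte count; the PHP comment costs nothing on the wire.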

Balancing requirements for source formatting, computation and network overhead is only one example of resolving conflicts during program development. Other issues are beyond the scope of this document, however, so these are the ones addressed herein.

Miscellaneous Considerations

Consistent user experience

Within an application, pages should have a consistent look-and-feel so a new user can quickly learn their way around. Having pages that react and/or display information in similar ways makes finding information and using it easier for both novices and experienced users.

Readability

While the original author of a Web page may have a clear understanding of how the code is constructed when they first finish writing it, the structure and functionality may be obscure to another developer who is given the task of debugging or enhancing the page as interactions with other resources and requirements change. Even the original author may have difficulty following the code after a significant time has passed and they have been working on other tasks. This is one of the major reasons why software maintenance costs frequently far outweigh the costs of development. Following this Style Guide when writing source files will make their structure easier to follow.

The original source code for any given program should be human readable. The parsers, interpreters and/or compilers for nearly every programming language ignore whitespace outside of quoted strings. There are rare exceptions, e.g. Python, where indentation rather than delimiters or control structure keywords defines logical structure. In nearly every other language, however, whitespace is allowed purely for the convenience of human readers and writers. Use it to make your source code easier to comprehend. Not only will it make maintaining the program easier, it will also make it much simpler to read the code as it is being written and to verify that it functions correctly, even for large, complex projects.

While it is frequently used to perform server operations not directly visible to a user, PHP is also used to generate source code - HTML, Javascript, CSS - that is sent to a browser for interpretation. Users occasionally look at the source code for Web pages, and skilled QA personnel and maintainers will use the source sent to the browser as a tool in their efforts. Without paying attention, it is trivial to get PHP to create Web source code that is nearly incomprehensible - badly formatted HTML with random line lengths and no rhyme or reason for use of whitespace is a frequent result.

There is no excuse for such a mess: PHP can and should be used to generate HTML code that is just as readable as the original PHP code itself. It doesn't take a lot more work to ensure the generated code is properly formatted in the first place. Once it's done, no more work is required to create comprehensible HTML code unless the PHP source is changed: The server will just as happily emit properly formatted Web source code as not. Use the power of automation to write a better Web!
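One way to get there, sketched here under the assumption that the markup is built as a string before being echoed, is to emit the tabs and newlines explicitly. The helper name tableRows() and its parameters are hypothetical.

```php
<?php
// Hypothetical helper: builds table rows with explicit tabs and newlines
// so the generated HTML is indented like hand-written markup.
function tableRows(array $cells,$depth)
{   $indent=str_repeat("\t",$depth);
    $html='';
    foreach ($cells as $cell)
    {   $html.=$indent."<tr>\n"
              .$indent."\t<td>".htmlspecialchars($cell)."</td>\n"
              .$indent."</tr>\n";
    }
    return $html;
}

echo "<table>\n".tableRows(array('one','two'),1)."</table>\n";
```

Because the indentation depth is a parameter, the same helper can be reused wherever the rows end up nested in the page.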

XHTML compliance

As of this writing (May 2013), most of MIT Sloan’s existing PHP Web pages include a DOCTYPE specification indicating they are compliant with XHTML 1.0 Strict, the most restrictive HTML standard. However, a very large number of the pages are not actually compliant with the DTD that defines the standard, which could result in cross-browser compatibility issues: The fact that some (or most) browsers ignore or fix up coding errors doesn't mean they all do (e.g., Internet Exploiter frequently has its own ideas of how things should work). Validated HTML code is most likely to function in the greatest number of browsers. Using a validation tool, such as the HTML Validator add-on for Firefox, during development is strongly advisable to eliminate errors and warnings.

PHP errors

Code must run error free and not rely on warnings and notices to be hidden to meet this requirement. For instance, never access a variable that you did not set yourself (such as $_POST array keys) without first checking to see that it isset(). When developing code, check the Apache error log frequently to catch problems early, and eliminate warnings and errors as they are introduced.
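A minimal sketch of that check follows. The helper name postValue() and the "username" field are assumptions made for illustration, not part of any existing code base.

```php
<?php
// Hypothetical helper: returns a request value only if it was actually set.
function postValue(array $post,$key,$default='')
{   return isset($post[$key]) ? $post[$key] : $default;
}

// $_POST is only populated for POST requests, so never index it blindly.
$username=postValue($_POST,'username');
if ($username === '')
    echo "No username was submitted.\n";
else echo 'Hello, '.htmlspecialchars($username)."\n";
```

Reading the key through isset() avoids the "undefined index" notice that a direct $_POST['username'] access would raise when the field is missing.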

PHP should be configured to report as many errors and warnings as possible: Use E_ALL, or preferably E_ALL|E_STRICT (E_ALL already includes E_NOTICE, but before PHP 5.4 it does not include E_STRICT). With error logging enabled and error display disabled, even a badly written script won't tell the world everything that's wrong, but the problems will be logged so they can be corrected. If a sufficiently high level of reporting is not configured in php.ini, an error_reporting() call at the start of each script can fix the problem.
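As a sketch of the per-script fallback, the lines below could be placed at the top of a script. Passing -1 to error_reporting() sets every bit, so it covers every level regardless of PHP version; that choice is an assumption of this example rather than something the guide prescribes.

```php
<?php
// Report every error level, whatever the PHP version defines.
error_reporting(-1);
ini_set('display_errors','0');    // don't show problems to visitors
ini_set('log_errors','1');        // but do record them in the error log
```

Setting these values in php.ini is still preferable, since a script that fails to parse never gets to run its own error_reporting() call.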

Debugging code

No debugging code may be left in place when pushing to the production server unless it is commented out, i.e. no active var_dump() or print_r() calls, and no die() or exit() calls that were used solely during development.

Example URLs

Use example.com, example.org and example.net for all example URLs and email addresses, per RFC 2606.

Every source file should have a copyright declaration, even if it is a "copyleft" declaration placing the code into the public domain. This will avoid ambiguities about the author's intent. Note that under U.S. copyright law once a document has been published without making a copyright claim, it cannot be legally copyrighted in the future. Therefore, if any copyrights are going to be reserved, any documents (including source code) must have a copyright declaration affixed before they are first made publicly available.

Similarly, every Web page sent to a browser should have a copyright declaration embedded in the HTML <head> section of the document and a visible copyright claim statement if copyrights are being reserved for the page and/or its contents.


The Style Guide


No Shorthand PHP tags

Never use shorthand PHP start tags. Always use full PHP tags. This is important because

  1. short tags rely on deprecated PHP configuration
  2. short tags violate XML specifications (“<?” starts a PI (processing instruction) and must be followed by a name)
  3. full tags are more readable
<?php
// INCORRECT: ?>
<? ... ?>
<? echo $var ?>

<?php
// CORRECT: ?>
<?php ... ?>
<?php echo $var ?>

Semicolons End Statements

PHP generally uses semicolons “;” to mark the end of a statement. However, if the PHP closing tag is on the same line after a statement, the semicolon is optional and should not be used.

A code block, enclosed in braces “{}” is not a statement, it is a group of statements. Each statement within the code block must be terminated with a semicolon, but the code block itself is NOT terminated with a semicolon.
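The following sketch, with invented variable names, shows both rules together: statements inside a block end with semicolons, the block itself does not, and the statement directly before a closing tag on the same line omits its semicolon.

```php
<?php
$count=3;                       // a statement ends with a semicolon
if ($count > 2)
{   $label='many';              // so does each statement in a block
}                               // but the block itself does not
else
{   $label='few';
}
?>
<p><?php echo $label ?></p>
<?php
//
// EOF: example.php
```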

No PHP Closing Tag At EOF

The closing “?>” tag at the end of a PHP file is optional to the PHP parser and is not required. However, if used, any whitespace following the closing tag, whether introduced by the developer, user, or an FTP application, will be immediately written to the output. This will prevent any more headers from being sent to the browser, can cause PHP errors, and if the latter are suppressed, blank pages. Leaving it out reduces the processing necessary for the module.

For source files that end in PHP mode (as opposed to HTML mode), omit the closing PHP tag and instead use a comment block to mark the end of file:

<?php
//
// EOF: filename.php

where “filename” is replaced with the name of the source file. There should be one newline after the last non-empty line in the file: When the cursor is at the very end of the file, it should be one line below the closing text.

Having the “end of file” marker in place will make it easier to find unwanted truncations of the file that might accidentally occur.

Line Lengths

Source code statements should always be limited to 80 characters per line, taking into account expansion of tabs to four character spaces. While wide desktop display screens and horizontally scrolling editors make it possible to create and view longer lines in an original development environment, the longer lines are more difficult to comprehend, and will cause wrapping issues on less capable displays. (The first time you open a file with long lines in a terminal window on your smart phone you'll understand the problem.) In addition, when attempting to do side-by-side comparisons of different versions of the same file, the longer lines will either cause wrapping or scrolling issues in the comparison utility.

If necessary, an empty PHP segment can be inserted to wrap a long line of HTML code and keep it within the 80 character line length.

<?php
// For example: ?>
<a href="http://www.example.com/n/deeply/nested/page.php" rel="external" <?php
?>
class="anchor-class-name">The Anchor Text</a>

There will be times when limiting lines to 80 characters will be impossible, such as within a "heredoc" where there isn't a mechanism available for folding long lines. Such cases are generally rare, so following the 80 character limit usually should not be an issue.

One Statement Per Line

Do not combine statements on one line. Doing so reduces the line count (volume) but increases the line density, and more dense lines are harder to comprehend. In addition, the fact that multiple statements are combined on one line may be overlooked when quickly reading code during maintenance, resulting in confusion at best, and quite probably, leading to insertion of additional errors.

<?php
// INCORRECT:
$foo='this'; $bar='that'; $bat=str_replace($foo,$bar,$bag);

// CORRECT:
$foo='this';
$bar='that';
$bat=str_replace($foo,$bar,$bag);

There are a few cases where this rule may be broken to improve readability of the code. The permitted exceptions are:

Whitespace

The PHP parser ignores whitespace outside of quoted strings. Whitespace is solely used for the convenience and comprehension of human readers.

No whitespace can precede a file's opening PHP tag or follow a closing PHP tag: extraneous whitespace at the boundaries of your files can cause output to begin before it is supposed to, leading to errors and, potentially, blank pages.

Whitespace is required after the PHP start tag - a single space, one or more tab characters, or a newline. For single line statements or control structures where an opening brace starts the next line, the start tag and statement can be on the same line - use a space as the separator if the current indentation level is less than two tabstops from the left margin, otherwise use the correct number of tabs to indent the statement for the current indentation level.

A single space should be used between the PHP code and the close tag unless the close tag is at the start of a new line where there should be no leading space.

In general, parentheses and brackets should not use any additional spaces. The exception is that a space must always follow PHP control structure keywords that take arguments in parentheses (declare, do-while, while, if-else, switch, for, foreach), to help distinguish them from function calls and increase readability. Function names should not have any whitespace between them and the parentheses enclosing their argument list. When referring to array items, never use spaces around the index.

<?php
// INCORRECT:
foreach( $query->result() as $row )
// CORRECT:
foreach ($query->result() as $row)    // single space after PHP keyword, not within parenthesis

// INCORRECT:
function Foo ( $bar )
{
}
// CORRECT:
function Foo($bar)    // no spaces around parenthesis in function declarations
{
}

// INCORRECT:
$arr[ $foo ] = 'foo';
// CORRECT:
$arr[$foo]='foo';    // no spaces around array keys

Remove trailing spaces

Remove trailing spaces at the end of each line of code. Extraneous whitespace at the end of a line serves no useful purpose, may cause diff errors, and increases network overhead. Your text editor should have an option to assist in meeting this requirement.
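Where an editor option is not available, a short script can do the same job. This is only a sketch: the function name stripTrailingWhitespace() is hypothetical, and it assumes Unix-style line endings.

```php
<?php
// Hypothetical helper: removes spaces and tabs at the end of every line.
// Assumes "\n" line endings; the /m modifier makes $ match at each line end.
function stripTrailingWhitespace($text)
{   return preg_replace('/[ \t]+$/m','',$text);
}

$source="\$foo='this';  \n\$bar='that';\t\n";
echo stripTrailingWhitespace($source);
```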

Alignment of assignments

To support readability, the equal signs may be aligned in block-related assignments:

<?php
$short     =foo($bar);
$longername=foo($baz);

The rule should be broken when the length of the variable name is at least eight characters longer or shorter than the surrounding ones:

<?php
$short=foo($bar);
$thisVariableNameIsVeeeeeeeeeeryLong=foo($baz);

When PHP emits whitespace

One thing that needs to be understood to have PHP generate readable HTML code is when PHP sends whitespace to the browser.

Any whitespace to the left of a PHP opening tag will be emitted - once. This means that if you indent an opening tag for an include statement, the first line, and only the first line, of any HTML or text in the included file will be indented by the whitespace to the left of the opening tag: PHP does not interpret indenting an include statement to mean "indent each line in the file by the amount the include statement was indented by."

On the other hand, unless there are non-whitespace characters on the line after a PHP closing tag, PHP ignores any whitespace up to and including the newline terminating the source statement. This frequently results in run-on HTML statements with embedded tabs where the code's author was expecting nicely formatted output:

<?php
// expected newlines are not emitted ?>
    <tr>
        <td>
            <?php /* intentional indent */ echo $some_variable ?>
        </td>
    </tr>
<?php // yields this HTML output: ?>
    <tr>
        <td>
            some_variable's value        </td>
    </tr>

<?php // better code would be: ?>
    <tr>
        <td><?php echo $some_variable/* non-whitespace after close tag */ ?></td>
    </tr>
<?php // which yields this HTML output: ?>
    <tr>
        <td>some_variable's value</td>
    </tr>

Indentation

The PHP parser doesn't care about indentation; it is used solely for the convenience and comprehension of human readers. Consequently, indentation should be used to enhance readability of the source code. Three simple rules underlie the rest of the indentation patterns to be used:

  1. don't indent unnecessarily
  2. indent on purpose
  3. indent consistently

Your indentation should always reflect the logical structure of the code.

At the start of a code module, the code is at the [local] root level - so start all code lines in column 1 (the leftmost column of the page). Each time a new nesting level is entered, whether braces or parentheses are present or not, indent one tab, and use that level of indentation until the nesting level changes - whether indenting another tab to enter another nesting level, or outdenting when leaving the current nesting level.

Most people find that 4-space tabstops provide the best balance between making indentation visible and using screen space. By using tabs, rather than spaces, for indentation, that preference can be adjusted without reformatting the code: If you think four spaces is too much for each indentation level, set your editor to use two or three space tabstops and the tab width will adjust to suit your view. On the other hand, if you want more indentation, you can use eight space tabstops with the same file. Just be sure that when you save the file that you are using TAB characters, not SPACES, for the indentation written to permanent storage.

Within switch statements, the switch statement is the parent indentation level, the case statements (including the default statement, if present) are the next indentation level, and the action statements within each case, including the break statement, are at the next indentation level. Thus, the correct indentation structure for a switch block is:

<?php
switch (condition)
{
    case 1:
        action1;
        break;

    case 2:
        action2;
        break;

    default:
        defaultaction;
        break;
}

Use real tabs and not spaces, allowing the most flexibility across editors and operating systems. An acceptable exception is if you have a block of code that would be more readable if things are aligned, use spaces:

<?php
[tab]$foo  ='somevalue';
[tab]$foo2 ='somevalue2';
[tab]$foo34='somevalue3';
[tab]$foo5 ='somevalue4';

For associative arrays, values should start on a new line. Note the comma after the last array item: This is recommended because it makes it easier to change the order of the array, and makes for cleaner diffs too. (Unlike Javascript running in InternetExploiter, PHP ignores a trailing comma in an array declaration.)

<?php
$my_array=array
([tab]'foo'  =>'somevalue',
 [tab]'foo2' =>'somevalue2',
 [tab]'foo3' =>'somevalue3',
 [tab]'foo34'=>'somevalue3',
);

The rule of thumb here is that tabs should be used for indentation at the beginning of the line and spaces for alignment within the line.

When concatenating strings in an assignment, long lines should be broken at clauses to improve readability or if the line length limit would be exceeded. In these cases, each successive line should be padded with white space so the "." operator is aligned under the "=" operator:

<?php
$sql='SELECT id,name FROM people '
    ."WHERE name='Susan' "
    .'ORDER BY name ASC';

Exceptions: try and catch

The exception handling try and catch mark control structure blocks, just as if and else do. They are indented to the same level as the surrounding code, with try and catch aligned with each other, and with the braces surrounding their code blocks:

<?php
try
{
    // code that might fail
}
catch (FirstExceptionType $e)
{
    // first catch body
}
catch (OtherExceptionType $e)
{
    // other catch body
}

Interspersed PHP and HTML

When HTML and PHP are interspersed, ALWAYS put PHP start tags (“<?php”) at the left margin unless one of these specific conditions exists:

  1. the PHP code will be emitting [HTML] which should be indented. In this case, include a comment within the PHP to indicate the indentation is intentional:
            <p><strong>Area</strong><br />
                <?php /* intentional indent */ echo $person['AREANAME'] ?></p>
  2. the PHP code is emitting output inline
    <a href="<?php echo $theLink ?>"><?php echo $theAnchorText ?></a>

When HTML and PHP are interspersed, indent the PHP statements at the indentation level that would be in force if there were no HTML tags: The HTML and PHP codes should be considered as maintaining separate indentation levels.

Parentheses

As a general rule, only use parentheses where they are required. Additional parentheses may be used to clarify groupings in complex conditional constructs, but knowing operator precedence should eliminate their necessity.

Do not use parentheses when using language constructs such as echo, print, include, or require. These are not functions and don't require parentheses around their parameters.

When calling class constructors with no arguments, always include parentheses: The constructors are functions, so constructor calls need to look like function calls.
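A short sketch of these two rules side by side; the Greeter class is hypothetical and exists only for this example.

```php
<?php
class Greeter    // hypothetical class, for illustration only
{   public function greet()
    {   return 'Hello';
    }
}

$greeter=new Greeter();         // constructor call keeps its parentheses
echo $greeter->greet()."\n";    // echo is a language construct: no parentheses
```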

Braces

Use Allman style indenting, or preferably, Horstmann style (a.k.a. "compacted Allman") indenting. Braces are never at the end of a line, but rather always placed on a new line, and indented at the same level as the control statement that "owns" them. This makes it easier to find the matching braces and provides a logical view of the structure of the code. Code within a block enclosed by braces must be indented one level from the surrounding code, and all statements at the current nesting level begin in the same vertical column of the page. This makes it easier to identify the structure of the code.

With Allman style code, braces are always on a line by themselves. In Horstmann style, the opening brace is also at the same indentation level as the parent statement, but is followed by a tab and the first (or only) statement within the child block.

<?php
// INCORRECT:    // K & R
function Foo($bar) {
    // ...
    // ...
}
foreach ($arr as $key => $val) {
    // ...
    // ...
}
if ($foo == $bar) {
    // ...
    // ...
}
else {
    // ...
    // ...
}
for ($i = 0; $i < 10; $i++) {
    for ($j = 0; $j < 10; $j++) {
        // ...
    }
    // ...
}
   
<?php
// CORRECT, PREFERRED:    // Horstmann
function Foo($bar)
{   // ...
    // ...
}
foreach ($arr as $key=>$val)
{   // ...
    // ...
}
if ($foo == $bar)
{   // ...
    // ...
}
else
{   // ...
    // ...
}
for ($i=0; $i < 10; $i++)
{   for ($j=0; $j < 10; $j++)
    {   // ...
    }
    // ...
}
   
<?php
// CORRECT, ACCEPTABLE:    // Allman
function Foo($bar)
{
    // ...
    // ...
}
foreach ($arr as $key=>$val)
{
    // ...
    // ...
}
if ($foo == $bar)
{
    // ...
    // ...
}
else
{
    // ...
    // ...
}
for ($i=0; $i < 10; $i++)
{
    for ($j=0; $j < 10; $j++)
    {
        // ...
    }
    // ...
}

If you have a really long block, consider whether it can be broken into two or more shorter blocks or functions. If you consider such a long block unavoidable, put a short comment at the end, on the same line as the closing brace, so people can tell at a glance what that ending brace ends. Typically this is appropriate for a logic block longer than about 35 rows, but any code that’s not intuitively obvious should be commented.

<?php
if (some_condition && !some_other_condition)
{   // ...
    // ...
    // ...
}   // end if (some_condition && !some_other_condition)

while (yet_other_condition)    // describe how this happens
{   // ...
    // ...
    // ...
}   // end of "how this happens"

Single line blocks can omit braces for brevity as long as the indentation rules are followed as though the braces were present. The only exception is single statement else clauses are preferably written on the same line as the else keyword:

<?php
if (condition)
    action1();
else if (condition2)
    action2();
else action3();

If and else

if and else are words, "elseif" is not. Always use else if when specifying an alternate branch in an if control structure, not elseif.

Alternative control structure syntax

PHP supports using an alternative syntax for some of its control structures - if, while, for, foreach, and switch. In each case, the basic form of the alternate syntax is to change the opening brace to a colon (:) and the closing brace to endif;, endwhile;, endfor;, endforeach;, or endswitch;, respectively. Using the alternative syntax yields code EXTREMELY difficult to grasp at a glance (unless you are a BASIC programmer, maybe), especially when interspersed with HTML. It also requires using a whole set of additional keywords instead of consistent braces.

<?php
// INCORRECT:
?>
<table>
    <tbody>
<?php
foreach ($foo as $bar) : ?>
        <tr>
<?php
    if ($bar == 'example') : ?>
            <th>My Heading</th>
<?php
    else : ?>
            <td>My Data</td>
<?php
    endif; ?>
        </tr>
<?php
endforeach; ?>
    </tbody>
</table>
<?php
// CORRECT:
?>
<table>
    <tbody>
<?php foreach ($foo as $bar)
{ ?>
        <tr>
<?php if ($bar == 'example')
    { ?>
            <th>My Heading</th>
<?php
    }
    else
    { ?>
            <td>My Data</td>
<?php
    } ?>
        </tr>
<?php
} ?>
    </tbody>
</table>

When considering using the alternative syntax, keep in mind a simple rule: DON’T DO IT!!

Single and Double Quotes

Single-quoted strings are not examined for escape sequences or embedded variable names and therefore require less processing when a page is being parsed. Always use single quoted strings unless you need variables or escape sequences parsed, or to avoid excessive quote escaping. In most cases where you would want variables embedded in a string parsed, it is preferable to use single-quoted strings concatenated to either side of the variable which is faster to parse. Use double-quoted strings if the string contains single quotes so you do not have to use escape characters.

<?php
// INCORRECT:
"My String" // no variable parsing, so no use for double quotes
"My string $foo"    // not optimal
'SELECT foo FROM bar WHERE baz=\'bag\'' // ugly

$something='foo';
"My string is $something_else"     // PHP looks for $something_else
"My string is ${something}_else"   // ugly and unobvious

// CORRECT:
'My String'
'My string '.$foo   // string catenation is faster than embedding variables
"SELECT foo FROM bar WHERE baz='bag'"

$something='foo';
'My string is '.$something.'_else' // no ambiguity

Comments

In general, code should be commented prolifically. It not only helps describe the flow and intent of the code for less experienced programmers, but can prove invaluable when returning to your own code months down the line. Non-documentation comments - those which describe the logic and flow of the program, rather than the interface with the rest of the system - are strongly encouraged. A general rule of thumb is that if you look at a section of code and think "Wow, I don't want to try and describe that," you need to comment it before you forget how it works.

C style comments (delimited by /* */) should be used for creating large comment blocks, comments within a line of PHP code, or when commenting out sections of code. C++ style inline comments (delimited by //) may be used when the comment extends through the remainder of the current line, or for commenting out single PHP statements. Do not use Perl/shell style inline comments (delimited by #).

When adding end of line comments, separate the code statement from the comment delimiter using a single tab. If multiple statements in a group have end of line comments attached, the comment delimiters can be tab aligned to improve readability.

Do not use C++ style inline comment markers (//) at the start of a series of lines for a multi-line comment: It looks sloppy, takes more typing, increases the file size, and requires more processing power, since the interpreter has to repeatedly go in and out of its "parsing a comment" mode.

Further, N.B. - Do NOT use // comment delimiters to comment out blocks of code: It takes more work to comment/uncomment the block, in addition to looking sloppy. If you want to comment out a block of code, insert a /* before the block and a /**/ after it. Then, if you want to uncomment it, all you have to do is add a */ immediately after the opening mark, or remove the comment delimiters - one or two changes rather than having to modify every line that had been commented out.
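A small sketch of that technique, using an invented $total variable so the effect is visible:

```php
<?php
$total=1;
/*
$total+=10;     // disabled: the comment runs to the next closing delimiter
/**/
$total+=100;
echo $total."\n";
```

As written, the middle statement is commented out. Changing the opening /* to /**/ turns both markers into empty comments, so the statement runs again without touching any other line. Note that the disabled block must not itself contain a C style comment, because the first closing delimiter encountered ends the comment.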

It is sometimes useful to write a case statement that falls through to the next case by not including a break or return within the first case. To distinguish code so constructed from bugs, any case statement where break or return are omitted should contain a comment indicating that the break was intentionally omitted:

<?php
switch (condition)
{
    case 1:
        action1;
        // no "break" here, we drop through

    case 2:
        action2;
        break;

    default:
        defaultaction;
        break;
}

Docblock comments

Complete inline documentation comment blocks (docblocks) must be provided for files, classes and functions (including class methods). A docblock is a special type of comment that provides verbose information about an element in your code. The information can be used by developers to gain understanding of the purpose and operation of a given element. It can also be used by integrated development environments (IDEs) to provide hints and auto-completion, and by automatic tools to generate API documentation.

Two popular programs designed for reading docblocks to produce documentation are phpDocumentor and Doxygen.

Using consistently constructed docblocks makes "newly found" code easier to understand when doing maintenance, and simplifies re-use of existing code by providing a firm grasp of the interface, effects and results of a functional block or class. This is an example docblock for a function:

<?php
/**
* brief description of the function
*
* (optional) longer description of the function, side effects, etc.
* the longer description usually spans more than one line.
*
* @param type $param1, what it's for
* @param type $param2, what it's for, default='something'
* @global type $global1, what's expected in the global variable
* @global type $global2, what's expected in the global variable
* @return type, description of the return value
*/
function FunctionName($param1,$param2='something')
{   global $global1,$global2;
    $internalName=processed($param1,$param2);
    return $internalName;
}

Docblocks must also precede class and method declarations. In this example, some tags needed for publishing classes in repositories such as PEAR (the PHP Extension and Application Repository) are illustrated. They are not, however, required for internal use:

<?php
/**
* Super Class
*
* @package Package Name
* @subpackage Subpackage
* @category Category
* @author Author Name
* @link http://example.com
*/
class Super_class
{
    /**
    * Encodes string for use in XML
    *
    * @access public
    * @param string
    * @return string
    */
    function xml_encode($str)

Docblock format

In general, a docblock comment starts with a C-style comment start tag with an extra asterisk attached /** followed by a newline. Each docblock line starts with an asterisk under the slash of the comment start marker to provide visual continuity of the extent of the docblock. Do not space the column of asterisks over to line up under the first opening asterisk because editors with "smart indentation" (following the indentation of the line above when starting a new line) will improperly indent the first line of the comment or function declaration. Text within the docblock is separated from the column of asterisks by a single space, and the docblock is terminated with a standard C-style comment close tag */ under the column of asterisks and immediately above the code being described (no intervening blank lines).

A docblock generally contains three sections, separated from each other by a blank line (consisting solely of the required asterisk in the left column):

Documentation of any globals used by functions or methods is not required, but should be included.

An @global tag may be used in a docblock preceding the definition of a global variable. Only one @global tag is allowed per global variable docblock. A global variable docblock must be followed by the global variable’s definition before any other element or docblock occurs in the source. The name must be the exact name of the global variable as it is declared in the source.

<?php
/**
* short description of this variable
*
* longer description of the variable, e.g., where it's used and how it affects
* the rest of the application
*
* @global int $foo
*/
$foo=0;

Including Code

Anywhere you are unconditionally including a [class] file, use require_once. Anywhere you are conditionally including a [class] file (for example, factory methods), use include_once. Either of these will ensure that class files are included only once. They share the same file list, so you don't need to worry about mixing them - a file included with require_once will not be included again by include_once.

include_once and require_once are language constructs, not functions. Parentheses should not surround the subject filename.
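The shared file list can be observed with a small self-contained sketch. The temporary file name and the loadCount counter below are hypothetical, created only for this illustration:

```php
<?php
// Write a throwaway include file so the sketch does not depend on other files.
$file=sys_get_temp_dir().'/styleguide_once_demo_'.getmypid().'.php';
file_put_contents($file,'<?php $GLOBALS["loadCount"]++;');
$GLOBALS['loadCount']=0;

require_once $file;     // first request: the file is loaded (no parentheses)
include_once $file;     // already on the shared list: not loaded again
require_once $file;     // still not loaded again

echo $GLOBALS['loadCount'];     // prints 1
unlink($file);
```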

TRUE, FALSE, and NULL

TRUE, FALSE, and NULL are PHP keywords that should always be written fully uppercase.

TRUE and FALSE are BOOLEAN values which express truth (or not), not defined constants with respective values of one and zero. Symbolic constants are specifically designed to always and only reference their constant value. Booleans are not symbolic constants, they are distinct values. TRUE happens to cast to integer 1 when you print it or use it in an expression, but it's not the same as a constant for the integer value 1 and you shouldn't use it as one. FALSE casts to empty when you print it or to zero if you cast it to an integer or use it in an expression. Again, it's not a constant for empty or zero, and should not be used as such.

<?php
echo FALSE;         // prints nothing - FALSE is empty
echo (FALSE);       // prints nothing - FALSE is empty
echo FALSE+FALSE;   // prints 0 - FALSE is cast to integer for addition
echo (FALSE+FALSE); // prints 0 - FALSE is cast to integer for addition
echo intval(FALSE); // prints 0 - FALSE is zero when explicitly cast
echo '"'.FALSE.'"'; // prints "" - FALSE is empty

echo TRUE;          // prints 1
echo (TRUE);        // prints 1
echo TRUE+TRUE;     // prints 2
echo (TRUE+TRUE);   // prints 2
echo intval(TRUE);  // prints 1
echo '"'.TRUE.'"';  // prints "1"

Similarly, NULL is a special value indicating the absence of anything, not a defined constant equal to zero. NULL, zero and FALSE are all empty, but they are not equivalent.
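A short sketch of that distinction follows; the $values and $report names are illustrative only. All three values satisfy empty(), yet strict comparison tells them apart:

```php
<?php
$values=array('NULL'=>NULL,'zero'=>0,'FALSE'=>FALSE);
$report=array();
foreach ($values as $label=>$value)
{   $report[$label]=array(
        'empty'   =>empty($value),      // TRUE for all three values
        'isNull'  =>($value===NULL),    // TRUE only for NULL
        'isZero'  =>($value===0),       // TRUE only for the integer 0
        'isFalse' =>($value===FALSE),   // TRUE only for FALSE
    );
}

var_dump(NULL==0);      // bool(true): loose comparison treats them alike
var_dump(NULL===0);     // bool(false): strict comparison does not
```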

Logical Operators

Always use the || and && operators instead of the words OR and AND because the word operators have lower priority than assignment operators, which can lead to subtle logic errors. For example, what is the value of $z after this code sequence?

<?php
$x=TRUE;
$y=FALSE;
$z=$y OR $x;

ANSWER: $z is FALSE because the last statement is equivalent to ($z=$y) OR $x rather than $z=($y OR $x) as might be expected. On the other hand, after this code sequence:

<?php
$x=TRUE;
$y=FALSE;
$z=$y || $x;

$z is TRUE because the || operator has higher precedence than assignment operators.

Naming

When writing code meant to be shared across more than one application, global names (classes, functions, variables, defines) must be prefixed to prevent name collisions with PHP itself or other code. When selecting a prefix, pick one that is relevant to the code being developed and unlikely to clash with PHP.

Other than in names of constants, or to specify class hierarchy, underscores should not be used within names. They should only be used as a prefix for private members of classes (variables or methods).

Caution: PHP reserves all function names starting with two underscores (__) as magical. Do not use names starting with two underscores unless you want some documented magic functionality.
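For contrast, this is what deliberately asking for documented magic looks like. The Temperature class is a hypothetical example; __construct and __toString are magic methods documented by PHP:

```php
<?php
class Temperature
{
    private $_degrees;              // private property: single underscore prefix

    function __construct($degrees)
    {   $this->_degrees=$degrees;
    }

    // documented magic: called whenever the object is used as a string
    function __toString()
    {   return $this->_degrees.' degrees';
    }
}

$reading=new Temperature(21);
echo 'Currently '.$reading;     // prints "Currently 21 degrees"
```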

Constants and defines

Use all capital letters with underscores separating the words in a name.

<?php
define('A_STRING_CONSTANT','Hello World!');
define('SOME_BOOLEAN',TRUE);
define('ZERO',0);

Functions and variables

Global function names should be ProperCased (a.k.a. StudlyCaps): They start with an uppercase letter, and each new word begins with an uppercase letter. Acronyms are treated as normal words when used as a name: The first letter is capitalized, others are lower case.

Variables should be named concisely, using camelCase (also known as bumpyCase). Make names descriptive without being overly long. Don't create new variables by appending an integer to an existing variable name. Removing vowels from variable names may shorten them, but don't remove so many that the name becomes incomprehensible: don't use indecipherable abbreviations.

<?php
$aGlobalVariable=1;
$someThing=array();
function MyPublicFunction()

Classes and methods

Classes should be given descriptive names. Avoid using abbreviations where possible. Class names should be ProperCased, starting with an uppercase letter, and each new word begins with an uppercase letter. The PEAR class hierarchy is also reflected in class names, where each level of the hierarchy is separated with a single underscore.

Class variables (a.k.a. properties) and methods should be named using camelCase. Private class methods and variables that are only accessed internally by your class, such as utility and helper functions that your public methods use for code abstraction, should have their names prefixed with a single underscore.

<?php
class Log               // shows PEAR hierarchy
class Net_Finger        // shows PEAR hierarchy
class HTML_Upload_Error // shows PEAR hierarchy

class AMoreCompleteExample
{
    public $counter;                // public property
    function connect()              // public method
    function getData()              // public method
    function buildSomeWidget()      // public method

    private $_status;               // private property
    private function _sort()        // private method
    private function _initTree()    // private method

    function convertText()          // public method
    private function _convertText() // private method

Filenames

PHP requires class definitions to be contained within a single file. When writing classes designed for reuse beyond the page they were originally written for, each class should be in its own file whose name is the same as the class. Several closely related classes may be included within one file, e.g., a base class and subclasses directly derived from it. In such a case, the file should be named for the base class.

When several classes are defined specifically for use in a single page, they should be put in a file in an include directory under the one containing the page's script. The filename should then be the page's name with _classes appended, e.g., index.php would use classes found in include/index_classes.php.

Arrays

Assignments in arrays may be aligned. When splitting array definitions onto several lines, the last value should also have a trailing comma. This is valid PHP syntax and helps to keep code diffs minimal.
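A minimal sketch of an aligned, multi-line array definition follows. The $config keys and values are invented for illustration:

```php
<?php
$config=array(
    'host'    =>'localhost',
    'port'    =>3306,
    'charset' =>'utf8',     // trailing comma after the last value
);

echo count($config);        // prints 3
```

Because the last value already ends with a comma, appending another entry later changes only the new line in a diff.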

Function Declarations

Function declarations follow the Horstmann (preferred) or Allman style:

<?php
function FooFunction($arg1,$arg2='')
{   if (condition)
    {   statement;
    }
    return $val;
}

Whenever appropriate, provide default values for function arguments to reduce the overhead required for calling the function. Do not use default values for required parameters to avoid having errors logged when they are not supplied: If a parameter is truly required to be supplied at run time, the logged error message will help debug the faulty code, rather than masking it through use of a default value.

As required by the PHP language specification, arguments with default values must follow ones that do not have default values. When deciding the order of arguments with default values, consider which one(s) will be changed most often: If a non-default value is needed for a function call, all of the default values to the left of the one being modified must be specified with their default value to preserve their default behaviour. Consequently, defaults that are most likely to be kept should be rightmost in the argument list, and the most often changed specified first (leftmost).

<?php
function MyFunction($required,$changed='often',$mostly='static')
{   return $required.' = '.$changed.' '.$mostly;
}
MyFunction();                   // returns " = often static" (NULL used for $required)
        // also logs a "Missing argument 1" warning and an "Undefined variable: required"
        // notice; PHP 7.1 and later throw an ArgumentCountError instead
MyFunction(1);                  // returns "1 = often static"
MyFunction(2,'frequently');     // returns "2 = frequently static"
MyFunction(3,'sometimes');      // returns "3 = sometimes static"
MyFunction(4,'often','noise');  // returns "4 = often noise"
MyFunction(5,'screeching');     // returns "5 = screeching static"
MyFunction(6,'','screeching');  // returns "6 =  screeching"

Always return a meaningful value from a function if one is appropriate. If any code branches return a value from a function, all branches MUST return a value. N.B. The functions used as illustration here will not work as expected - see the notes re. booleans above in the TRUE, FALSE, and NULL section.

<?php
// INCORRECT:
function Broken($param)
{   // ...
    if ($param == 'true')
        return TRUE;
    else if ($param == 123)
        return $param;
    // error: NULL is returned for any other $param values
}

// CORRECT:
function Fixed($param)
{   // ...
    if ($param == 'true')
        return TRUE;
    else if ($param == 123)
        return $param;

    return 'error: invalid param value: '.strval($param);
}

Functions with many parameters may need to be split onto several lines to keep within the 80 characters per line limit. The first parameters may be put onto the same line as the function name if there is enough space. Subsequent parameters on following lines are to be indented two tab stops. The closing parenthesis immediately follows the last parameter. The opening brace follows on the next line, at the same indentation level as the "function" keyword.

<?php
function MyVeryLongFunctionName($firstRequiredParameter,$secondRequiredOne,
        $thirdRequiredParameter,$firstOptionalOne=TRUE,$lastOptional=NULL)
{   // code starts here
    // ...
}

Function Calls

Functions should be called with no spaces in the statement. For example:

<?php
$var=foo($bar,$baz,$quux);

As displayed above, there should be no spaces on either side of an equals sign used to assign the return value of a function to a variable. In the case of a block of related assignments, more space may be inserted left of the equals sign to promote readability:

<?php
$short        =foo($bar);
$long_variable=foo($baz);

Return Values and Typecasting

Some PHP functions return FALSE on failure and also have return values which evaluate to FALSE in loose comparisons, such as an empty string or zero. Be explicit by comparing the variable type when using these return values in conditionals to ensure the return value is indeed what you expect, and not a value that has an equivalent loose-type evaluation.

Use the same stringency in returning and checking your own variables. Use the === and !== comparison operators as necessary.

<?php
// INCORRECT:
/* If 'foo' is at the beginning of the string, strpos will return a 0,
 * resulting in this conditional evaluating as TRUE
 */
if (strpos($str,'foo') == FALSE)
// CORRECT:
if (strpos($str,'foo') === FALSE)

// INCORRECT:
function build_string($str='')
{
    if ($str=='')   // uh-oh! What if FALSE or the integer 0 is passed as an argument?
    {
    }
}
// CORRECT:
function build_string($str='')
{
    if ($str==='')
    {
    }
}

Typecasting has a slightly different effect which may be desirable. When casting a variable as a string, NULL and FALSE become empty strings, zero (and other numbers) become strings of digits, and TRUE becomes '1':

<?php
$str=(string) $str;  // cast $str as a string
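The conversions listed above can be checked with a short sketch; the $samples labels are illustrative only:

```php
<?php
$samples=array('NULL'=>NULL,'FALSE'=>FALSE,'zero'=>0,'forty-two'=>42,'TRUE'=>TRUE);
$asStrings=array();
foreach ($samples as $label=>$value)
    $asStrings[$label]=(string) $value;     // explicit cast to string

foreach ($asStrings as $label=>$text)
    echo $label.' becomes "'.$text.'"'."\n";
// NULL becomes ""
// FALSE becomes ""
// zero becomes "0"
// forty-two becomes "42"
// TRUE becomes "1"
```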

SQL Queries

SQL keywords are always capitalized: SELECT, INSERT, UPDATE, WHERE, AS, JOIN, ON, IN, etc.

Break up long queries into multiple lines for legibility, preferably breaking for each clause. Use string concatenation to allow alignment of the clauses in the source code without introducing extraneous whitespace into the SQL query.

<?php
// INCORRECT:   // keywords are lowercase and query is too long for a
                // single line (... indicates continuation of the line)
$query=$this->db->query("select foo, bar, baz, foofoo, foobar as raboof, foobaz from exp_pre_email_addresses
...where foo != 'oof' and baz != 'zab' order by foobaz limit 5, 100");
// CORRECT:
$query=$this->db->query('SELECT foo,bar,baz,foofoo,foobar AS raboof,foobaz '
                        .'FROM exp_pre_email_addresses '
                        ."WHERE foo!='oof' AND baz!='zab' "
                        .'ORDER BY foobaz LIMIT 5,100');

Design Patterns


Don't Start With A Blank Page

When creating a new Web page, use a prototype to lay out the file structure and implement common elements of the presentation. Using a template helps to ensure the file format will be more readily grasped by new readers, and makes it easier to remember to include all of the necessary details.

Three prototype files are available in the /shared/inc directory that should be used when constructing any new pages:

These three prototypes all use PageFrame.php to build the overall HTML framework for the page. They are heavily commented with instructions for what needs to be placed where and how to customize them for a particular application. In the simplest case, setting the page title and adding some content is all that will be needed to create a new Web page. Additional code can be added as needed to create any level of complexity desired, yet still fit within the overall design framework of the MIT Sloan site.

Separate Logic and Presentation

Avoid heavy logic within presentational code (HTML). Some processing and logic often has to happen in the middle of a tag soup of HTML, but keep in-page coding simple: the presentation parts of a PHP document should not do more than basic foreach (), if (), and $obj->get*() calls.
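One way to keep the two apart is sketched below, assuming a hypothetical UpcomingEvents() helper and invented sample data. The filtering is finished before any markup is written, so the presentation part needs nothing beyond a foreach and an echo:

```php
<?php
// Logic: select the events to show before any HTML is emitted.
// (UpcomingEvents() and the sample records are hypothetical.)
function UpcomingEvents($events,$today)
{   $upcoming=array();
    foreach ($events as $event)
    {   if (strcmp($event['date'],$today)>=0)   // ISO dates sort as strings
            $upcoming[]=$event;
    }
    return $upcoming;
}

$events=UpcomingEvents(array(
    array('date'=>'2012-11-01','title'=>'Lunch seminar'),
    array('date'=>'2013-04-12','title'=>'Guest speaker'),
),'2013-01-01');
?>
<ul>
<?php foreach ($events as $event)
    {
?>
    <li><?php echo htmlspecialchars($event['title']) ?></li>
<?php
    }
?>
</ul>
<?php
// end of the presentation section
```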

Use PHP To Disable HTML

When parts of an HTML page are [temporarily] disabled, they should be commented out using PHP rather than HTML comments: If the code is disabled by PHP, it will not be sent to the browser, reducing network traffic, and eliminating false "hits" by search engines.

In general, HTML comments should only be used to provide hidden information in the HTML source code that may be useful to debuggers and maintainers, such as the date the page was last edited.

<?php
define('LastModified','April 9, 2013 @ 5:29 pm');
/*
 *    Copyright 2013 by MIT Sloan School of Management.  All rights reserved.
 *
 *    $Id: /path/to/page.php,v $
 */ 
?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
    <title><?php echo $pagetitle ?> - MIT Sloan</title>
</head>
<body>

<ul>
<?php // INCORRECT:    // this gets sent to the browser ?>
    <!--<li>November 1, 2012
    <br/>
    <a href="mapsearch=Wong+Auditorium">Wong Auditorium</a>, 12 Noon<br/>
    Light lunch, 11:30am.</li>-->
<?php // CORRECT:    // this stays on the server ?>
<?php 
/*
    <li>November 1, 2012
    <br/>
    <a href="mapsearch=Wong+Auditorium">Wong Auditorium</a>, 12 Noon<br/>
    Light lunch, 11:30am.</li>
*/ 
?>
</ul>

<?php // CORRECT: maintenance documentation ?>
<!-- <?php echo LastModified ?> -->
</body>
</html>
<?php
//
// EOF: page.php

Return Early

To enhance the readability of functions and methods, return early when simple conditions can be checked at the beginning: doing so reduces both the indentation and the brain power needed to follow the code.

<?php
// INCORRECT:    // simple handler gets lost, more indentation
function foo($bar,$baz)
{
    if ($GLOBALS['foo'])
    {
        //assume
        //that
        //here
        //is
        //the
        //whole
        //logic
        //of
        //this
        //method
        return $calculated_value;
    }
    else return NULL;
}
?>
   
<?php
// CORRECT:    // simple handler is obvious, cleaner code
function foo($bar,$baz)
{
    if (!$GLOBALS['foo'])
        return NULL;

    //assume
    //that
    //here
    //is
    //the
    //whole
    //logic
    //of
    //this
    //method
    return $calculated_value;
}

Split Long Statements

Split long if statements

Long if statements may be split onto several lines when the characters-per-line limit would be exceeded. The condition clauses are moved to following line(s), indented one tab. Logical operators (&&, ||, etc.) should be aligned under the opening clause to make it easier to comment (and exclude) the condition. When splitting statements this way, the closing parenthesis may be on its own line, positioned under the opening parenthesis. The opening brace for the conditional code goes on the next line, aligned with the start of the statement.

Keeping the operators at the beginning of the line has two advantages: It is trivial to comment out a particular line during development while keeping syntactically correct code (except for the first line). It also keeps the logic at the front of the code where it's more readily observed. Scanning such conditions is very easy since they are aligned below each other.

<?php
if ((  $condition1
    || $condition2
    )
&&  $condition3
&&  $condition4
   )
{
    //code here
}

The first condition may be aligned to the others.

<?php
if ($condition1
||  $condition2
||  $condition3)   // closing parenthesis on the last clause's line is OK
{
    //code here
}

When an if statement is long enough to need splitting, it might be better to simplify it. In such cases, you could express conditions as variables and compare them in the if statement. This has the benefit of "naming" the condition sets and splitting them into smaller, more understandable chunks. The disadvantage in doing so is the increased processing overhead needed to create the additional variables, which should be avoided within loops.

<?php
$is_foo=($condition1 || $condition2);
$is_bar=($condition3 && $condition4);
if ($is_foo && $is_bar)
{
    // ....
}

Ternary operators

The same rule as for if statements also applies for the ternary operator: It may be split onto several lines, keeping the question mark and the colon at the front.

<?php
$a=$condition1 && $condition2
    ? $foo : $bar;
$b=$condition3 && $condition4
    ? $foo_man_this_is_too_long_what_should_i_do
    : $bar;

Split function call on several lines

The style guide permits a maximum line length of 80 characters. When calling functions or methods with many parameters it may be impossible to respect the line limit. In that case, the function call needs to be split between parameters.

Several parameters per line are allowed, filling the line as much as possible. Subsequent parameter lines need to be indented one tab compared to the level of the function call. If there is room for one or more parameters on the function call line, the opening parenthesis is between the function name and parameter name as usual. The closing parenthesis then immediately follows the last parameter:

<?php
$this->someObject->subObject->callThisFunctionWithALongName($parameterOne,
    $parameterTwo,$aVeryLongParameterThree);

If the function call and the first parameter will not fit on the same line, the opening parenthesis is aligned with the function call on the next line, followed by a tab and the first parameter. Subsequent parameters can be used to fill the rest of the line, or can be specified on following lines, indented to fall under the first parameter. The closing parenthesis then goes on a line after the last parameter, aligned with the opening parenthesis.

The same applies not only for parameter variables, but also for nested function calls and for arrays.

<?php
$this->someObject->subObject->callThisFunctionWithALongName
(   $this->someOtherFunc
    (   $this->someEvenOtherFunc
        (   'Help me!',
            array('foo' =>'bar',
                  'spam'=>'eggs'),
            23
        ),
        $this->someEvenOtherFunc()
    ),
    $this->wowowowowow(12)
);

Nesting those function parameters is allowed if it helps to make the code more readable, not only when it is necessary when the characters per line limit is reached.

Using fluent application programming interfaces often leads to many concatenated function calls. Those calls may be split onto several lines. When doing this, all subsequent lines are indented by one tab and begin with the -> arrow.

<?php
$someObject->someFunction('some','parameter')
    ->someOtherFunc(23,42)
    ->andAThirdFunction();

Split long assignments

Assignments may be split onto several lines when the character/line limit would be exceeded. The equal sign has to be positioned onto the following line, and indented by one tab.

<?php
$GLOBALS['TSFE']->additionalHeaderData[$this->strApplicationName]
    = $this->xajax->getJavascript(t3lib_extMgm::siteRelPath('nr_xajax'));

Code Efficiently

PHP is used as an interpreted language, rather than a compiled one. Each time a page loads, the source file(s) has/have to be read from disk, parsed, reduced to bytecodes, and interpreted by the Zend engine. Modern operating systems cache disk accesses, and using a bytecode cache with enough memory allocated to its buffers can nearly eliminate the necessity of parsing source files once active development is finished. Between the disk and bytecode caches, PHP code can be just as fast as, or even faster than, compiled languages which run on a virtual machine, such as Java. There are no optimizations, however, which can make up for sloppy, inefficient programming: It's up to you, the developer, to be wary of practices that lead to slower page loads and higher processing requirements.

You should always use echo rather than print because it is more flexible and doesn't have the overhead of returning a value. (print always returns 1, so its return value isn't terribly useful in the first place.)
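A brief sketch of the difference, using throwaway strings: echo accepts several comma-separated arguments and has no return value, while print takes a single argument and returns 1.

```php
<?php
echo 'one ','two ',"three\n";   // echo accepts several comma-separated arguments
$result=print "four\n";         // print takes a single argument and always returns 1
```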

Don't call echo repeatedly; use string concatenation instead to eliminate the overhead of multiple statements. Neither echo nor print automatically emits a newline at the end of the string(s) it is passed, so there is absolutely no advantage to having multiple calls in a row.

<?php
// INCORRECT:   // prints "thisisreallybad!"
echo 'this';    // no whitespace in any of these statements
echo 'is';
echo 'really';
echo 'bad!';

// CORRECT:     // prints "This is much cleaner."
echo 'This '    // spaces embedded in each source string
    .'is '
    .'much '
    .'cleaner.';

Create variables used to store "constant" values (ones which do not change with each iteration) before beginning loops. Otherwise, the variable will be created and destroyed during each loop iteration, which can be a very expensive bit of overhead.
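A minimal sketch of hoisting loop-invariant values out of a loop follows. The $items data and the variable names are illustrative only:

```php
<?php
$items=array('alpha','beta','gamma');

// Values that do not change between iterations are computed once, up front.
$count=count($items);
$prefix=strtoupper('item');

$lines=array();
for ($i=0; $i<$count; $i++)
    $lines[]=$prefix.' '.($i+1).': '.$items[$i];

echo implode("\n",$lines);
// ITEM 1: alpha
// ITEM 2: beta
// ITEM 3: gamma
```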

Object oriented vs. procedural code

Object oriented programming seeks to deal with systems and data as objects that interact with each other by passing messages through interfaces. Conceptually it's an attractive programming model because the boundaries between objects are well defined, making it easier to keep track of who owns what and how things can be manipulated. With that surface simplicity, however, comes the burden of increased processing overhead, and, behind the scenes, a much more complex system to support the programming model.

While lean classes can be constructed that provide only the necessary methods for the data they are encapsulating, many classes are built with all sorts of bells and whistles "in case" somebody needs them. There are even style guides suggesting that public properties should not be used and class data members have to be read and written using getters and setters. One problem with writing Web code that way is every time a Web page is loaded, the entire class [file] has to be read and parsed, even if only a small part of the class functionality is being used. On a busy Web server, the extra overhead can add up quickly. Another problem, caused by only having one class per file, is pages that need functionality out of lots of classes will need to open and parse lots of files, which adds to congestion on the server's disk access bus.

Procedural code, on the other hand, requires a more intimate understanding of the data and logic for an application, resulting in an apparently more complex system design. The resulting code is more easily tuned for performance, though, and can be pared down to the minimum required to do the job without a major effort. As a result, pages load faster with smaller server requirements, and they don't have to carry around unused code the way OOP classes often do.

There are many cases where PHP classes are the most logical way to deal with a data set: Having an array of class objects representing data records makes it easy to present the data uniformly - a foreach loop can iterate over the array, calling the same method for each object to write the data to the browser. That makes the body of the loop a single statement, rather than a mass of mingled PHP and HTML that is more easily understood in a class method where the related variables are close at hand for reference. A balance can be found between design and implementation simplicity, but for the best performance, keep in mind that smaller files require less processing power, and will have lower communication overhead, even if the reduction is only internal to the server.
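The record-per-object case might look like the following sketch. The EventRecord class, its render() method and the sample records are hypothetical, chosen only to show a loop body that is a single statement:

```php
<?php
class EventRecord
{
    public $title;
    public $date;

    function __construct($title,$date)
    {   $this->title=$title;
        $this->date=$date;
    }

    // the markup lives next to the data it presents
    function render()
    {   return '<li>'.htmlspecialchars($this->title).', '
            .htmlspecialchars($this->date).'</li>';
    }
}

$events=array(
    new EventRecord('Opening talk','April 12, 2013'),
    new EventRecord('Panel discussion','March 7, 2013'),
);
foreach ($events as $event)
    echo $event->render()."\n";
```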

Reduce Duplicate Code

Duplicated code is A REALLY BAD THING™: Multiple copies of the same code require (approaching exponentially) more effort to debug or update, duplicated code increases the size of files, having repeated copies of the same thing (with the names changed or not) makes the code harder to comprehend, and when changes are introduced, verifying the correctness of the code set becomes "rather" difficult.

Copy-paste-edit development is the cause of most code duplication, often the result of an "add another element to my list" request: An existing list item is copied, pasted in above or below the original, then modified with the different information for the new item. While this procedure will add one item with a minimum of immediate effort, it makes changing the presentation of the data in the list effectively impossible unless an unreasonable amount of effort is expended. In addition, if editing accidentally removes characters from HTML tags - or even removes the tags altogether - fixing the resulting problems will most likely take a significant amount of time.

Avoid copy-paste-edit development as though it were poison ivy! If you need to duplicate some existing HTML code, look at the similarities and differences between the two instances. Write a PHP function (use the language to do what it was designed for) to emit the common elements, and pass parameters to control introduction of the differences. In all but the most trivial cases, the result will be a smaller file that is easier to maintain. Smaller files require less processing and often have lower communication overhead.

While the INCORRECT code in the illustration below appears to be about the same size as the CORRECT code that follows it, there are a couple of problems:

<?php
// INCORRECT:    // copy, paste, edit
?>
<div class="bioblurbs" id="john-doe">
    <img src="/images/johndoe.jpg" alt="" class="headshot" />
    <!-- remove no-image class when photo is added -->
    <div class="infoblock">
        <h3>John Doe</h3>
        <p>Co-founder &amp; CEO, John Doe Company<br/>
            April 12, 2013<br/>
            <a href="mapsearch=Wong+Auditorium">Wong Auditorium</a>,
            12 Noon</p>
        <p>John Doe will speak about doing great stuff.</p>
        <p><a href="/pdf/johndoe.pdf">Full bio &gt;&gt;</a> (PDF)</p>
    </div>
</div>

<div class="bioblurbs" id="jane-smith">
    <img src="/images/janesmith.jpg" alt="" class="headshot" />
    <!-- remove no-image class when photo is added -->
    <div class="infoblock">
        <h3>Jane Smith</h3>
        <p>President and CEO, Company, Inc.<br/>
            March 7, 2013<br/>
            <a href="mapsearch=Wong+Auditorium">Wong Auditorium</a>,
            12 Noon<br/>
            Light lunch, 11:30am</p>
        <p>Jane Smith speaks about helping Company grow.</p>
        <p><a href="/pdf/janesmith.pdf">Full bio &gt;&gt;</a> (PDF)</p>
    </div>
</div>

<div class="bioblurbs" id="joe-doe">
    <img src="/images/joedoe.jpg" alt="" class="headshot" />
    <!-- remove no-image class when photo is added -->
    <div class="infoblock">
        <h3>Joe Doe</h3>
        <p>President, Megagiant Corp.<br/>
            January 23, 2013<br/>
            <a href="mapsearch=Wong+Auditorium">Wong Auditorium</a>,
            12 Noon<br/>
            Light lunch, 11:30am</p>
        <p>Joe Doe tells it like it is.</p>
    </div>
</div>
   
<?php
// CORRECT:
function bioBlurb($name,$bio,$date,$where,$when,
        $program,$img='',$bioURL='')
{   $id=strtolower(str_replace(' ','-',$name));   // 'John Doe' -> 'john-doe'
    $url='mapsearch='.urlencode($where);
?>
<div class="bioblurbs" id="<?php echo $id ?>">
<?php if ($img)
    { ?>
    <img src="/images/<?php echo $img ?>" alt="<?php
        echo $name."'s picture" ?>" class="headshot" />
<?php
    } ?>
    <div class="infoblock<?php if (!$img) echo ' no-image' ?>">
        <h3><?php echo $name ?></h3>
        <p><?php echo $bio ?><br/>
            <?php /* intentional indent */ echo $date ?><br/>
            <a href="<?php echo $url ?>" rel="external"><?php
                echo $where ?></a>, <?php echo $when ?></p>
        <p><?php echo $program ?></p>
<?php if ($bioURL)   // only link to the PDF when one exists
    { ?>
        <p><a href="/pdf/<?php
            echo $bioURL.'.pdf' ?>">Full bio &gt;&gt;</a> (PDF)</p>
<?php
    } ?>
    </div>
</div>
<?php
}

bioBlurb('John Doe','Co-founder &amp; CEO, John Doe Company',
         'April 12, 2013','Wong Auditorium','12 Noon',
         'John Doe will speak about doing great stuff.',
         'johndoe.jpg','johndoe');
bioBlurb('Jane Smith','President and CEO, Company, Inc.',
         'March 7, 2013','Wong Auditorium',
         '12 Noon<br/>Light lunch, 11:30am',
         'Jane Smith speaks about helping Company grow.',
         'janesmith.jpg','janesmith');
bioBlurb('Joe Doe','President, Megagiant Corp.',
         'January 23, 2013','Wong Auditorium',
         '12 Noon<br/>Light lunch, 11:30am',
         'Joe Doe tells it like it is.',
         'joedoe.jpg' /* no bio PDF! ,'joedoe' */);
?>

When building a list such as the one illustrated above, consider the question "what's the data source?" If the information is coming from a database query or an XML feed, the display function can probably be written such that it's fed a record directly from the data source, and building the list becomes nothing more than a foreach that repeatedly calls the function, passing the data records in succession:

<?php
require_once $_SERVER['DOCUMENT_ROOT'].'/path/to/data-source.php';

$Faculty=GetFacultyInGroup($groupName);

/**
* displays the faculty member's name as a link to their profile
*
* @param array $person database record with info about the faculty member
*/
function ShowFaculty($person)
{
?>
    <li><a href="/faculty/profile.php?id=<?php
        echo $person['PERSONID'] ?>"><?php
        echo $person['FULLNAME'] ?></a></li>
<?php
}
?>
<h3>Our Staff</h3>
<ul>
<?php foreach ($Faculty as $person) ShowFaculty($person); ?>
</ul>
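
The same pattern applies when the records live in a plain PHP array. The following is only a minimal sketch: the $speakers array, its 'name' and 'title' keys, and the RenderSpeakerItem() helper are all hypothetical names invented for this illustration. The helper returns its markup as a string rather than echoing it, which makes the output easy to inspect before it is sent to the browser.

```php
<?php
// Hypothetical data: in practice these records might come from a
// database query or an XML feed. The values are assumed to be HTML-safe.
$speakers = array(
    array('name' => 'John Doe',   'title' => 'Co-founder &amp; CEO, John Doe Company'),
    array('name' => 'Jane Smith', 'title' => 'President and CEO, Company, Inc.'),
    array('name' => 'Joe Doe',    'title' => 'President, Megagiant Corp.'),
);

/**
 * builds one list item for a speaker record
 *
 * @param array $speaker record with 'name' and 'title' keys (assumed layout)
 * @return string HTML for a single <li> element
 */
function RenderSpeakerItem($speaker)
{
    // 'John Doe' becomes the id 'john-doe'
    $id = strtolower(str_replace(' ', '-', $speaker['name']));
    return '<li id="' . $id . '"><strong>' . $speaker['name']
         . '</strong>, ' . $speaker['title'] . "</li>\n";
}

// Building the list is nothing more than a loop over the records.
$html = "<ul>\n";
foreach ($speakers as $speaker)
    $html .= RenderSpeakerItem($speaker);
$html .= "</ul>\n";

echo $html;
```

Adding a fourth speaker then means adding one record to the data, and changing the presentation means editing one function.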

Use existing functions when possible

Don't reinvent the wheel. In addition to the extensive library of functions built into PHP, a significant number of routines have been written by other members of the team. There are files of functions that can be used in the /shared/inc directory, and other places on the server. As you find or write other code that is useful in more than just your current project, encapsulate it in a well-documented function (or class, if it's a more complex set of data and operations) and either add it to one of the existing files containing similar functionality, or create a new one with a name describing the type of code to be found inside (almost always required when writing a new class). Building and using such a code library not only reduces the effort to build new applications, but it makes debugging and maintaining the entire code base much more efficient: Rather than having to track down an ill-defined set of copies of similar code if an error is detected, having a common code base means one fix can update a host of applications with the correction.
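
As an illustration of what a small, well-documented library function might look like, here is a sketch of a helper that turns display text into an HTML id value. The name MakeHtmlId() and its exact behavior are assumptions made for this example; it is not an existing routine in /shared/inc.

```php
<?php
/**
 * converts display text into a value usable as an HTML id attribute
 *
 * Hypothetical shared helper; the name and behavior are illustrative only.
 *
 * @param string $text display text, e.g. a person's name
 * @return string lowercase text, with each run of characters other than
 *                a-z and 0-9 replaced by a single hyphen
 */
function MakeHtmlId($text)
{
    // collapse every run of characters other than a-z and 0-9 into one hyphen
    $id = preg_replace('/[^a-z0-9]+/', '-', strtolower($text));
    // drop any hyphens left at the start or end
    return trim($id, '-');
}

echo MakeHtmlId('John Doe'), "\n";
```

Once a routine like this lives in a shared file, every page that needs an id can call it, and a fix to the rules for building ids is made in exactly one place.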

Validation

Always remember the cardinal rule of network security: Data coming from userland cannot be trusted. Even if you build a form that limits line lengths and data values, and uses JavaScript to ensure only valid data can get to the server when the submit button is pressed, there's nothing to stop Joe Hacker from creating a form that connects to your script and sends you a load of bull. Even if it's not Joe, it could be a glitch on the network, or any of a host of other problems that could corrupt the data - it can't be trusted. As a result, YOU MUST ALWAYS VALIDATE DATA ON THE SERVER before using it or passing it to a database (or any other trusting application) if it came from userland - form submissions, email messages, tweets, etc.

There are two basic types of data errors that need to be protected against - invalid data, and incorrect data.

Invalid data includes things such as strings that are longer than the maxlength attribute on an <input> field allows, or values that cannot be chosen from a <select> list. A successful submission of a valid form will not result in receipt of invalid data, so if it is detected, all of the data received must be discarded, and displaying nothing more than a terse "Invalid data" message is an appropriate response. (Don't be too harsh, though - the invalid data could be the result of a network error.)
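
A check for invalid data can be sketched as follows. The field names ('name', 'dept'), the 40-character limit, the option list, and the function name IsFormDataValid() are all assumptions made for this example; in real code they must mirror whatever the form actually enforces.

```php
<?php
/**
 * checks submitted values against the limits the form itself enforces
 *
 * Field names, the length limit and the option list are assumptions
 * made for this example; they must match the real form.
 *
 * @param array $input submitted values, e.g. $_POST
 * @return bool true if the data could have come from the unmodified form
 */
function IsFormDataValid($input)
{
    $maxNameLength = 40;                                    // maxlength="40"
    $allowedDepts = array('sales', 'support', 'billing');   // <select> options

    // both fields must be present and must be plain strings
    if (!isset($input['name']) || !is_string($input['name'])
    ||  !isset($input['dept']) || !is_string($input['dept']))
        return false;
    // strlen() counts bytes; multibyte text may need mb_strlen() instead
    if (strlen($input['name']) > $maxNameLength)
        return false;
    // strict comparison, so only the exact option strings are accepted
    return in_array($input['dept'], $allowedDepts, true);
}

// 'accounting' is not one of the options, so this cannot be a
// legitimate submission of the form.
$submitted = array('name' => 'Pat', 'dept' => 'accounting');
if (!IsFormDataValid($submitted))
    echo "Invalid data\n";
```

In a real script the argument would be $_POST (or $_GET), and a failed check would discard the whole submission rather than trying to repair individual fields.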

Incorrect data, on the other hand, is a common occurrence: Users type their passwords with the CapsLock key on, or enter an email address in a telephone number field, etc. In such cases, the job of the validation code is to detect as many of these types of errors as possible and cause an appropriate error message to be displayed so the user can intelligently correct the problem. When incorrect data is detected, be nice - the error message is supposed to help the user, not belittle them. Don't spend forever trying to make a foolproof system, though, because fools are too ingenious. Besides, only a fool will [be likely to] use a foolproof system...
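
Checking for incorrect data might look like the sketch below. The field names, the message wording, the seven-digit minimum, and the function name FindContactErrors() are assumptions chosen for illustration; filter_var() with FILTER_VALIDATE_EMAIL is PHP's built-in email syntax check.

```php
<?php
/**
 * checks user-entered contact details and collects helpful messages
 *
 * Field names, message wording and the digit minimum are assumptions
 * made for this example.
 *
 * @param array $input submitted values, e.g. $_POST
 * @return array error messages keyed by field name; empty if all is well
 */
function FindContactErrors($input)
{
    $errors = array();
    $email = (isset($input['email']) && is_string($input['email']))
           ? trim($input['email']) : '';
    $phone = (isset($input['phone']) && is_string($input['phone']))
           ? trim($input['phone']) : '';

    if (filter_var($email, FILTER_VALIDATE_EMAIL) === false)
        $errors['email'] = 'Please enter an email address such as name@example.com.';

    if (strpos($phone, '@') !== false)
        $errors['phone'] = 'That looks like an email address. Please enter a telephone number.';
    else
    {   // count the digits, ignoring spaces, dashes, parentheses, etc.
        $digits = preg_replace('/\D/', '', $phone);
        if (strlen($digits) < 7)
            $errors['phone'] = 'Please enter a telephone number with at least 7 digits.';
    }
    return $errors;
}

// An email address typed into the telephone number field
$errors = FindContactErrors(array('email' => 'pat@example.com',
                                  'phone' => 'pat@example.com'));
foreach ($errors as $field => $message)
    echo $field . ': ' . $message . "\n";
```

Keying the messages by field name lets the page show each message next to the field it concerns, so the user can see what to fix without guessing.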


Final Comments


PHP has grown to be an extremely powerful language. When combined with proper server technology (e.g., a bytecode cache), it is also a very efficient one. Use its power, code well, and you can easily build systems that rival anything constructed using proprietary or otherwise closed-source development tools.