Web Authoring FAQ

This list of Frequently Asked Questions is maintained by the WDG and was last updated on August 9, 1999. It may be found at the following URLs:

<URL:http://www.htmlhelp.com/faq/html/> (index of HTML version)
<URL:http://www.htmlhelp.com/faq/html/all.html> (single-file HTML version)
<URL:http://www.htmlhelp.com/faq/html/all.txt> (single-file text version)

If you would like to contribute to this FAQ, please send mail to <darin@htmlhelp.com>. All contributors will be listed at the bottom of the FAQ.

Index

1. Getting Started

1.3. How can I show HTML examples without them being interpreted as part of my document?

Within the HTML example, first replace the "&" character with "&amp;" everywhere it occurs. Then replace the "<" character with "&lt;" and the ">" character with "&gt;" in the same way.

The next Q&A addresses the more general issue of representing arbitrary characters in HTML documents.
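
The replacement order described in 1.3 can be sketched in a few lines of Python. The function name is only illustrative. The point is that "&" is escaped first; doing it last would wrongly re-escape the ampersands introduced by the other two replacements.

```python
def escape_html_example(source: str) -> str:
    """Escape an HTML fragment so a browser displays it literally."""
    # "&" must be handled first; otherwise the "&" in "&lt;" and "&gt;"
    # would itself be escaped on a later pass.
    escaped = source.replace("&", "&amp;")
    escaped = escaped.replace("<", "&lt;")
    escaped = escaped.replace(">", "&gt;")
    return escaped


print(escape_html_example("<B>AT&T</B>"))
```

Python's standard html.escape(source, quote=False) performs the same three replacements.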

1.4. How do I get a so-and-so character in my HTML?

The safest way to write HTML is to use (7-bit) US-ASCII and to express characters from the upper half of the 8-bit code with HTML entities. See the answer to "Which should I use, &entityname; or &#number; ?"

Working with 8-bit characters can also be successful in many practical situations: Unix and MS-Windows (using Latin-1), and also Macs (with some reservations).

The available characters are those in ISO-8859-1, listed at <URL:http://www.htmlhelp.com/reference/charset/>. On the Web, these are the only characters widely supported. In particular, characters 128 through 159 as used in MS-Windows are not part of the ISO-8859-1 code set and will not be displayed as Windows users expect. This includes the em dash, en dash, curly quotes, bullet, and trademark symbol; neither the actual character nor &#nnn; is correct. (See the last paragraph of this answer for more about those characters.)

On platforms whose own character code isn't ISO-8859-1, such as MS-DOS and Macs, there may be problems: you'd have to use text transfer methods that convert between the platform's own code and ISO-8859-1 (e.g. Fetch for the Mac), or convert separately (e.g. GNU recode). Using 7-bit ASCII with entities avoids those problems, and this FAQ is too small to cover other possibilities in detail. Mac users: see the notes at the above URL.

If you run a web server (httpd) on a platform whose own character code isn't ISO-8859-1, such as a Mac, or IBM mainframe, it's the job of the server to convert text documents into ISO-8859-1 code when sending them to the network.

If you want to use characters outside of the ISO-8859-1 repertoire, you must use HTML 4.0 rather than HTML 3.2. See the HTML 4.0 Recommendation at <URL:http://www.w3.org/TR/REC-html40/> and the Babel site at <URL:http://babel.alis.com:8080/> for more details. Another useful resource for internationalization issues is at <URL:http://ppewww.ph.gla.ac.uk/%7Eflavell/charset/>.

1.5. Should I put quotes around attribute values?

It depends. It is never wrong to use them, but you don't have to if the attribute value consists only of letters (A-Za-z), digits, periods and hyphens. This is explained in the HTML 2.0 specs.

Be careful when your attribute value includes double quotes, for instance when you want ALT text like "the "King of Comedy" takes a bow" for an image. Humans can parse that to know where the quoted material ends, but browsers can't. You have to code the attribute value specially so that the first interior quote doesn't terminate the value prematurely. There are two main techniques:

Escape any quotes inside the value with &quot; so you don't terminate the value prematurely: ALT="the &quot;King of Comedy&quot; takes a bow". (&quot; is not part of the formal HTML 3.2 spec, though most current browsers support it.)
Use single quotes to enclose the attribute value: ALT='the "King of Comedy" takes a bow'.

Both these methods are correct according to the spec and are supported by current browsers, but both were poorly supported in some earlier browsers. The only truly safe advice is to rewrite the text so that the attribute value need not contain quotes, or to change the interior double quotes to single quotes, like this: ALT="the 'King of Comedy' takes a bow".

Note that XHTML 1.0 (a reformulation of HTML 4.0 as an XML 1.0 application) requires attribute values to be quoted.
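
If attribute values are generated by a program, the first technique above can be automated. The following is a minimal sketch in Python; the function name is hypothetical, and it relies on the standard html.escape function, which converts interior double quotes to &quot; when quote=True.

```python
import html


def alt_attribute(text: str) -> str:
    """Return an ALT attribute whose value is safely double-quoted."""
    # quote=True also converts '"' to "&quot;" (and "'" to "&#x27;").
    return 'ALT="' + html.escape(text, quote=True) + '"'


print(alt_attribute('the "King of Comedy" takes a bow'))
```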

1.6. How can I include comments in HTML?

A comment declaration starts with "<!", followed by zero or more comments, followed by ">". A comment starts and ends with "--", and does not contain any occurrence of "--" between the beginning and ending pairs. This means that the following are all legal HTML comments:

<!-- Hello -->
<!-- Hello -- -- Hello-->
<!---->
<!------ Hello -->
<!>

But some browsers do not support the full syntax, so we recommend you follow this simple rule to compose valid and accepted comments:

An HTML comment begins with "<!--", ends with "-->" and does not contain "--" or ">" anywhere in the comment.

See <URL:http://www.htmlhelp.com/reference/wilbur/misc/comment.html> for a more complete discussion.
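
The simple rule is easy to check mechanically. Here is a minimal sketch in Python (the function name is hypothetical). It tests the conservative rule only, so it rejects some comments that are legal under the full syntax; it also rejects a "-" directly before the closing "-->", which is an extra precaution assumed here rather than part of the rule as stated.

```python
def is_safe_comment(text: str) -> bool:
    """Check a comment against the simple, widely accepted rule."""
    if len(text) < 7 or not text.startswith("<!--") or not text.endswith("-->"):
        return False
    inner = text[4:-3]  # the part between "<!--" and "-->"
    # Extra caution (an assumption beyond the stated rule): a trailing "-"
    # would join the closing delimiter to form "--->".
    return "--" not in inner and ">" not in inner and not inner.endswith("-")


print(is_safe_comment("<!-- Hello -->"))
print(is_safe_comment("<!-- Hello -- -- Hello-->"))
```

The first call returns True. The second returns False because that comment, although legal, contains "--" inside and may confuse some browsers.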

1.7. How can I check for errors?

The easiest way to catch errors in your HTML is through the use of a program called a validator. A validator is a program which knows all the rules in HTML, reads your source document and outputs a list of mistakes.

While checking for errors in the HTML, it is also a good idea to check for hypertext links which are no longer valid. There are several link checkers available for various platforms which will follow all links on a site and return a list of the ones which are non-functioning.

You can find a list of validators and link checkers at <URL:http://www.htmlhelp.com/links/validators.htm>. Especially recommended is the use of an SGML-based validator such as the WDG HTML Validator <URL:http://www.htmlhelp.com/tools/validator/> or W3C HTML Validation Service <URL:http://validator.w3.org/>.

1.8. What is a DOCTYPE? Which one do I use?

According to HTML standards, each HTML document begins with a DOCTYPE declaration that specifies which version of HTML the document uses. The DOCTYPE declaration is useful primarily to SGML-based tools like HTML validators, which must know which version of HTML to use in checking the document's syntax. Browsers generally ignore DOCTYPE declarations.

See <URL:http://www.htmlhelp.com/tools/validator/doctype.html> for information on choosing an appropriate DOCTYPE declaration.

Note that the public identifier section of the DOCTYPE declaration is case sensitive. Some versions of Netscape Composer are known to insert the lower-case "-//w3c//dtd html 4.0 transitional//en", rather than the correct mixed-case "-//W3C//DTD HTML 4.0 Transitional//EN".

2. Web Publishing

2.1. Where can I put my newly created Web pages?

Many ISPs offer web space to their dial-up customers. Typically this will be less than 5MB, and there may be other restrictions; for example, many do not allow commercial use of this space.

There are several companies and individuals who offer free web space. This usually ranges from 100KB up to 1MB, and again there are often limitations on its use. They may also require a link to their home page from your pages. The following page has pointers to several lists of free web space providers: <URL:http://www.yahoo.com/Business_and_Economy/Companies/Internet_Services/Web_Services/Free_Web_Pages/>.

There are also many web space providers (aka presence providers) who will sell you space on their servers. Prices will range from as little as $1 per month, up to $100 per month or more, depending upon your needs. Non-virtual Web space is typically the cheapest, offering a URL like: http://www.some-provider.com/yourname/ For a little more, plus the cost of registering a domain name, you can get virtual web space, which will allow you to have a URL like http://www.yourname.com/.

If you have a permanent connection to the Internet, perhaps via a leased line from your ISP, then you could install an httpd and operate your own Web server. There are several Web servers available for almost all platforms.

If you just wish to share information with other local users, or people on a LAN or WAN, you could just place your HTML files on the LAN for everyone to access, or alternatively if your LAN supports TCP/IP then install a Web server on your computer.

2.2. Where can I announce my site?

comp.infosystems.www.announce -- a moderated newsgroup specifically geared toward this subject. You need to obtain its FAQ list before posting to it.
http://www.submit-it.com/ lets you submit site information to 10 different major index sites for free. If you wish to pay you may submit your site to more than 400 sites.
http://ep.com/faq/webannounce.html is the How to Announce your New Web Site FAQ.

2.3. Is there a way to get indexed better by the search engines?

Yes. Use a meaningful <TITLE> and headings (<H1>, <H2>, and so on). The indexing programs of some search engines (including AltaVista and Infoseek) will also take into account the following tags in the <HEAD> part of your documents:

<META NAME="keywords" CONTENT="keyword keyword keyword keyword">
<META NAME="description" CONTENT="description of your site">

Both may contain up to 1022 characters, but no markup other than entities. If you use a keyword too often in the <META NAME="keywords"> tag, the indexing program may ignore your keywords list altogether. At this writing, "too often" means "more than 7 times" to some popular engines, but that may change in the future as indexing programs are changed to defend against "cheaters."

Search Engine Watch at <URL:http://searchenginewatch.com/> is a Web site dedicated to search engines and strategies for Web page authors.

2.4. How do I prevent my site from being indexed by search engines?

See <URL:http://info.webcrawler.com/mak/projects/robots/exclusion.html>.

2.5. How do I redirect someone to my new page?

The most reliable way is to configure the server to send out a redirection instruction when the old URL is requested. Then the browser will automatically get the new URL. This is the fastest and most efficient way, and is the only way described here that can convince indexing robots to phase out the old URL. For configuration details consult your server admin or documentation (with NCSA or Apache servers, use a Redirect statement in .htaccess).

If you can't set up a redirect, there are other possibilities. These are inferior because they tell the search engines that there's still a page at the old location, not that the page has moved to a new location. But if it's impossible for you to configure redirection at your server, here are two alternatives:

Put up a small page with text like "This page has moved to http://new.url/ -- please adjust your bookmarks."
A Netscape and MSIE solution, which doesn't work on many other browsers (and screws up the "back" button in Netscape) is:

<META HTTP-EQUIV="Refresh" CONTENT="x; URL=new.URL">
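
To show what the server-side redirection recommended at the start of this answer looks like on the wire, here is a rough sketch in Python. In practice the server does this for you once configured; the table of moved pages, the example.com URL, and the function name are all hypothetical.

```python
from typing import Optional

# Hypothetical table of moved pages: old path -> new absolute URL.
MOVED = {"/old/page.html": "http://www.example.com/new/page.html"}


def redirect_response(path: str) -> Optional[str]:
    """Return the raw HTTP response for a moved path, or None."""
    new_url = MOVED.get(path)
    if new_url is None:
        return None
    # The 301 status plus the Location header tell browsers and
    # indexing robots where the document now lives.
    return (
        "HTTP/1.0 301 Moved Permanently\r\n"
        f"Location: {new_url}\r\n"
        "\r\n"
    )


print(redirect_response("/old/page.html"))
```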

2.6. How do I password protect my web site?

Password protection is done through HTTP authentication. The configuration details vary from server to server, so you should read the authentication section of your server documentation. Contact your server administrator if you need help with this.

For example, if your server is Apache, see <URL:http://www.apache.org/docs/misc/FAQ.html#user-authentication>.

2.7. How do I stop my page from being cached?

Browsers cache web documents; they store local copies of documents to speed up repeated references to documents that haven't changed. Also, many browsers are configured to use public proxy caches, which serve many users (e.g., all customers of an ISP, or all employees behind a corporate firewall). To effectively control how your documents are cached you must configure your server to send appropriate HTTP headers. The configuration details vary from server to server, so check your server documentation.

The Expires header is understood by virtually all caches. The cached document will be retrieved again automatically once it has expired. The Expires header must contain an HTTP date, which must be Greenwich Mean Time (GMT), not local time.
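
As an illustration of the date format, the following Python sketch builds an Expires header from a timestamp and a lifetime. The function name and the one-hour lifetime are assumptions for the example; how the header is actually attached to a response depends on your server.

```python
from email.utils import formatdate


def expires_header(now: float, lifetime_seconds: int) -> str:
    """Build an Expires header for a document valid for lifetime_seconds."""
    # usegmt=True yields the "GMT" suffix that HTTP dates require,
    # regardless of the server's local time zone.
    return "Expires: " + formatdate(now + lifetime_seconds, usegmt=True)


# One hour after the Unix epoch, used here only as a fixed example.
print(expires_header(0, 3600))
```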

HTTP 1.1 introduced the Cache-Control header, which provides more flexibility for telling caches how to handle the document. For more information, see the HTTP 1.1 draft at <http://www.w3.org/Protocols/>.

The Pragma header is generally ineffective because its meaning is not standardized and few caches honor it. Using <META HTTP-EQUIV=...> elements in HTML documents is also generally ineffective; some browsers may honor such markup, but other caches ignore it completely.

Further discussion can be found at <http://www.mnot.net/cache_docs/>.

2.9. How do I detect what browser is being used?

Many browsers identify themselves when they request a document. A CGI script will have this information available in the HTTP_USER_AGENT environment variable, and it can use that to send out a version of the document which is optimized for that browser.

Keep in mind not all browsers identify themselves correctly. Microsoft Internet Explorer, for example, claims to be "Mozilla" to get at Netscape enhanced documents.

And of course, if a cache proxy keeps the Netscape enhanced document, someone with another browser will also get this document if he goes through the cache.

For these reasons and others, it is not a good idea to play the browser guessing game.
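
A short Python sketch makes the problem concrete. The function name is hypothetical, and the User-Agent strings are merely illustrative of the style browsers send; a real CGI script would pass os.environ instead of these sample dictionaries.

```python
def claims_mozilla(environ: dict) -> bool:
    """True if the User-Agent string starts with "Mozilla"."""
    return environ.get("HTTP_USER_AGENT", "").startswith("Mozilla")


# Example strings in the style sent by browsers of the period (illustrative).
netscape = {"HTTP_USER_AGENT": "Mozilla/4.61 [en] (Win98; I)"}
msie = {"HTTP_USER_AGENT": "Mozilla/4.0 (compatible; MSIE 5.0; Windows 98)"}
silent = {}  # some clients send no User-Agent header at all

print(claims_mozilla(netscape), claims_mozilla(msie), claims_mozilla(silent))
```

Both the Netscape and the MSIE strings pass the test, and the client that sends nothing fails it, so a naive check tells you very little about which browser is really on the other end.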

3. Web Design

3.1. How do I include one file in another?

HTML itself offers no way to seamlessly incorporate the content of one file into another.

True dynamic inclusion of one HTML document (even in a different "charset") into another is offered by the OBJECT element, but due to shortcomings of browser versions in current use, it seems unwise to rely on this yet for essential content. The same can be said for IFRAME.

Two popular ways of including the contents of one file seamlessly into another for the WWW are preprocessing and server-side inclusion.

Preprocessing techniques include the C preprocessor and other generic text manipulation methods, and several HTML-specific processors. But beware of making your "source code" non-portable.

The HTML can only be validated after pre-processing, so the typical cycle "Edit, Check, Upload" becomes "Edit, Preprocess, Check, Upload" (here, "Check" includes whatever steps you use to preview your pages: validation, linting, management walk-through etc.; and "upload" means whatever you do to finally publish your new pages to the web server).

A much more powerful and versatile pre-processing technique is to use an SGML processor (such as the SP package) to generate your HTML; this can be self-validating.

Examples of server-side inclusion are Server Side Includes "SSI" (Apache, NCSA and some other web servers) and "ASP"; processing occurs at the time the documents are actually retrieved. A typical inclusion looks like

<!--#include virtual="/urlpath/to/myfile.htm" -->

but be sure to consult your own server's documentation, as the details vary somewhat between implementations. The whole directive gets replaced by the contents of the specified file.
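
To illustrate what "replaced by the contents of the specified file" means, here is a toy Python sketch of the substitution. It is not how any particular server implements SSI: the function name is hypothetical, the dictionary stands in for the server's document tree, and only the include virtual form shown above is handled.

```python
import re

# Matches the directive form shown above and captures the URL path.
INCLUDE = re.compile(r'<!--#include\s+virtual="([^"]+)"\s*-->')


def expand_includes(document: str, files: dict) -> str:
    """Replace each include directive with the named file's contents."""
    # files maps URL paths to contents; it stands in for the server's
    # document tree in this sketch.
    return INCLUDE.sub(lambda match: files[match.group(1)], document)


page = '<BODY><!--#include virtual="/footer.htm" --></BODY>'
print(expand_includes(page, {"/footer.htm": "<P>Footer</P>"}))
```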

Using server-side inclusion (a potentially powerful tool) merely as a way to insert static files such as standard header/footers has implications for perceived access speed and for server load, and is better avoided on heavily loaded servers. If you use it in this way, consider making the result cacheable (e.g., via "XBitHack full" on Apache; setting properties of the "Response" object in ASP). Details are beyond the scope of this FAQ but you may find this useful: http://www.pobox.com/~mnot/cache_docs/

Proper HTML validation of server-side inclusion is only possible after server-side processing is done, e.g. by using an on-line validator that retrieves the document from the server.

3.2. Which should I use, &entityname; or &#number; ?

In HTML, characters can be represented in three ways:

a properly coded character, in the encoding specified by the "charset" attribute of the "Content-type:" header;
a character entity (&entityname;), from the appropriate HTML specification (HTML 2.0/3.2, HTML 4.0 etc.);
a numeric character reference (&#number;) that specifies the Unicode reference of the desired character. We recommend using decimal references; hexadecimal references are less widely supported.

In theory these representations are equally valid. In practice, authoring convenience and limited support by browsers complicate the issue.

HTTP being a guaranteed "8-bit clean" protocol, you can safely send out 8-bit or multibyte coded characters, in the various codings that are supported by browsers.

B. A single repertoire other than Latin-1

In such codings as ISO-8859-7 Greek, koi8-r Russian Cyrillic, and Chinese, Japanese and Korean (CJK) codings, use of coded characters is the most widely supported and used technique.

Although not covered by HTML 3.2, browsers have supported this quite widely for some time now; it is a valid option within the HTML 4.0 specification--use a validator such as the WDG HTML Validator at http://www.htmlhelp.com/tools/validator/ which supports HTML 4.0 and understands different character encodings.

Browser support for coded characters may depend on configuration and font resources. In some cases, additional programs called "helpers" or "add-ins" supply virtual fonts to browsers.

"Add-in" programs have in the past been used to support numeric references to 15-bit or 16-bit code protocols such as Chinese Big5 or Chinese GB2312.

In theory you should be able to include not only coded characters but also Unicode numeric character references, but browser support is generally poor. Numeric references to the "charset-specified" encoding may appear to produce the desired characters on some browsers, but this is wrong behavior and should not be used. Character entities are also problematical, aside from the HTML-significant characters <, & etc.

C. Internationalization per HTML 4.0

Recent versions of the popular browsers have support for some of these features, but at time of writing it seems unwise to rely on this when authoring for a general audience. If you'd like to explore the options, you can find comprehensive background documentation and some practical suggestions at

3.5. Why does my page display fine in browser X but incorrectly or not at all in browser Y?

There are several possibilities.

First, you may have some incorrect HTML. Browsers vary in their ability to guess what you meant. For instance, Netscape is much more fussy about tables than MS Internet Explorer, so a page with incorrect tables may look fine in MSIE but not display at all in Netscape. See the answer to "How can I check for errors?" for tips on finding your HTML errors. (In fact, even correct nested tables may not display correctly in Netscape. See "Can I nest tables within tables?" below for what you can do about that.)

Second, you may have valid HTML that different browsers interpret differently. For instance, it is not clear from the spec what should be done with a string of &nbsp; characters. Some browsers will collapse them for rendering as a single space; others will render one space per &nbsp;.

Third, your server may be sending incorrect MIME types for some of your files. Internet Explorer incorrectly ignores server-provided MIME types, so it sometimes "does the right thing" when the server is misconfigured. Other browsers correctly heed the server-provided MIME types, so they will reveal server misconfigurations.

Other possibilities are a bug in one or the other browser, or different user option settings.

3.7. How do I freeze the URL displayed in a visitor's browser?

This is a "feature" of using frames: The browser displays the URL of the frameset document, rather than that of the framed documents. (See the answer to the question "How do I specify a specific combination of frames instead of the default document?").

However, this behavior can be circumvented easily by the user. Many browsers allow the user to open links in their own windows, to bookmark the document in a specific frame (rather than the frameset document), or to bookmark links. Thus, there is no reliable way to stop a user from getting the URL of a specific document.

Furthermore, preventing users from bookmarking specific documents can only antagonize them. A bookmark or link that doesn't find the desired document is useless, and probably will be ignored or deleted.

3.8. How do I make a table which looks good on non-supporting browsers?

See Alan Flavell's document on tables for a good discussion at <URL:http://ppewww.ph.gla.ac.uk/%7Eflavell/www/tablejob.html>.

4. Hyperlinks

4.1. Should I end my URLs with a slash?

The trailing slash is used to distinguish between directory and file URLs. A file URL is an URL for a file, and a directory URL refers to a directory. For example, the URL for the WDG's HTML reference is http://www.htmlhelp.com/reference/ and the URL for the overview of HTML 3.2 elements is http://www.htmlhelp.com/reference/wilbur/overview.html

If you request a directory URL without the trailing slash, the browser will actually ask for a FILE with that name. This file doesn't exist on the server, so the server sends back a message saying that the browser should ask for the directory. It uses a redirection message for this. The browser then sends another request, this time for the directory, and finally gets what was asked for in the first place. This wastes time and network resources.

When you write a document, all directory URLs should end with a slash. Since you already know you are linking to a directory, why force the user to make that second request, when it could have been done using only one?

And by the way, it is NOT the browser which appends the slash. The browser cannot know if what you are asking for is a file or directory, not even when the final part of the URL does not have an extension. http://www.somewhere.com/src/something/README is a perfectly valid URL, has no extension in the final part, yet refers to a file and not a directory.

The only apparent exception is when you refer to an URL with just a hostname. Since it is obvious that when you use http://www.htmlhelp.com you actually want the main index "/" from our server, you do not have to include the / in this case. It is regarded as good style to do so, however.

For a full discussion of the proper form of URLs, see RFC 1738 at <URL:http://www.cis.ohio-state.edu/htbin/rfc/rfc1738.html> and, for relative URLs, RFC 1808 at <URL:http://www.cis.ohio-state.edu/htbin/rfc/rfc1808.html>.

4.2. How do I link to a location in the middle of an HTML document?

First, identify the destination of the link with a named anchor (an anchor that uses the NAME attribute). For example:

<H2><A NAME="section2">Section 2: Beyond Introductions</A></H2>

Second, link to the named anchor. The URL of the named anchor is the URL of the document, with "#" and the name of the anchor appended. For example, elsewhere in the same document you could use:

<A HREF="#section2">go to Section 2</A>

Similarly, in another document you could use:

<A HREF="thesis.html#section2">go to Section 2 of my thesis</A>
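
To see how these HREF values resolve to full URLs, here is a small Python sketch using the standard urllib.parse module. The base URL is a made-up example standing for the document that contains the links.

```python
from urllib.parse import urldefrag, urljoin

base = "http://www.example.com/papers/index.html"  # hypothetical document URL

# A bare "#name" keeps the current document and adds the anchor name.
same_doc = urljoin(base, "#section2")
# A relative URL with "#name" points at an anchor in another document.
other_doc = urljoin(base, "thesis.html#section2")

print(same_doc)
print(other_doc)
print(urldefrag(other_doc).fragment)  # the anchor name alone
```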

4.3. How do I create a link that opens a new window?

<A TARGET="_blank" HREF=...> opens a new, unnamed window.

<A TARGET="foobar" HREF=...> opens a new window named "foobar", provided that a window or frame by that name does not already exist.

Note that links that open new windows can be annoying to your readers if there is not a good reason (from the reader's perspective) for them.

4.4. How do I get a button which takes me to a new page?

This is best done with a small form:

<FORM ACTION="http://url.you.want.to.go.to/" METHOD=GET>
<INPUT TYPE=submit VALUE="Text on button" NAME=foo>
</FORM>

If you want to line up buttons next to each other, you will have to put them in a one-row table, with each button in a separate cell.

Note that search engines might not find the target document unless there is a normal link somewhere else on the page.

A go-to-other-page button can also be coded in JavaScript, but the above is standard HTML and works for more readers.

4.5. How do I get a back button on my page?

In HTML, this is impossible. Going "back" means that you go to the previous page in your history. You might be able to create a link to the URL specified in the "HTTP_REFERER" environment variable in your document, but that only creates a link to a new location in your history. Even worse, the information in that variable can be plain wrong. Some browsers incorrectly send the variable when you use a bookmark or type in an URL manually, and some don't send that variable at all. This would result in an empty link.

A JavaScript could use "history.back()" to do this, but this only works in Netscape 2 or higher and MSIE 3 or higher, and even then only if the user has not disabled JavaScript.

For a more detailed explanation, please see Abigail's "Simulating the back button" at <URL:http://www.foad.org/%7Eabigail/HTML/Misc/back_button.html>.

4.6. How do I create a link that sends me email?

Use a mailto: link, for example

Send me email at
<A HREF="mailto:me@mydomain.com">me@mydomain.com</A>.

4.7. How do I specify a subject for a mailto: link?

You can't, not in any reliable way. The methods that are frequently posted don't do the job on all browsers (or even all popular browsers), and many of them have an important drawback: if your visitors are using an older browser such as Netscape 1.22, their mail will be lost.

If you really need a subject, you can do it by providing a form on your page, which submits data to a CGI program that emails the form data to you with your desired subject line. However, the form must have an input field for the visitor's email address, and you must hope that the visitor enters it correctly.

Here are some other ways to transmit subject-type information:

Create email aliases that are used only for certain mailto links, so you'll know that anything sent to a given alias is in response to the corresponding Web page(s).
The mail handlers for many Web browsers include an "X-Url" header that specifies the URL of the Web page that contained the mailto link. If you configure your mail reader to display this header, you'll see which Web page the sender is responding to much of the time.
Use <A HREF="mailto:user@site" TITLE="Your Subject">. Most browsers will ignore the TITLE attribute, but some minor browsers will use it as a subject for the email message. All browsers will send the mail.
Use <A HREF="mailto:user@site?subject=Your%20Subject">, which puts "Your Subject" (the space is encoded as "%20") in the "Subject" header field of the email message in most current browsers. The details of this recent RFC can be found at <URL:http://info.internet.isi.edu/in-notes/rfc/files/rfc2368.txt>. Note however that you will lose mail from users of older browsers, so you should consider whether the pre-filled subject is worth lost mail.
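
If you do decide to use the ?subject= form, the subject text has to be encoded. As a sketch, the encoding can be done in Python like this (the function name is hypothetical, and the address is left untouched):

```python
from urllib.parse import quote


def mailto_href(address: str, subject: str) -> str:
    """Build a mailto: URL with an encoded subject."""
    # safe="" makes quote() encode every reserved character in the subject.
    return "mailto:" + address + "?subject=" + quote(subject, safe="")


print(mailto_href("user@site", "Your Subject"))
```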

4.8. How do I link an image to something?

Just use the image as the link content, like this:

<A HREF=...><IMG ...></A>

4.9. How do I eliminate the blue border around linked images?

<IMG ... BORDER=0>

4.10. How do I link different parts of an image to different things?

Use an image map. Client-side image maps don't require server-side processing, so response time is faster. Server-side image maps hide the link definitions from the browser, and can act as a backup for client-side image maps for the few very old browsers that support server-side image maps but not client-side image maps.

The configuration details of server-side image maps vary from server to server. Refer to your server documentation for details.

Client-side image maps are implemented with HTML. The MAP element defines an individual image map and the AREA element defines specific linked areas within that image map. The USEMAP attribute of the IMG element associates an image map with a specific image. A detailed explanation (with examples) is available at <http://www.htmlhelp.com/reference/html40/special/map.html>. A tutorial is available at <http://ppewww.ph.gla.ac.uk/~flavell/www/imgmaptut.html>.

4.11. How do I turn off underlining on my links?

If you want to prevent links on your page being underlined when your visitors see them, there's no way in HTML to accomplish this. You can suggest this presentation using style sheets by defining

a:link, a:visited, a:active {text-decoration: none}

4.12. How can I have two sets of links with different colors?

You can suggest this presentation using style sheets. In your style sheet, define something like this:

a:link        {color: blue;   background: white}
a:visited     {color: purple; background: white}
a:active      {color: red;    background: white}
a.foo:link    {color: yellow; background: black}
a.foo:visited {color: white;  background: black}
a.foo:active  {color: red;    background: black}

Then use CLASS="foo" to identify the links of the second color in your HTML, like this:

<A CLASS="foo" HREF=...>...</A>

4.13. Why are my hyperlinks coming out all wrong or not loading?

Most likely you forgot to close a quote at the end of the HREF attribute. Alternatively, perhaps you used a ">" character somewhere else inside a tag. Although this is legal, several older browsers will think the tag ends there, so the rest is displayed as normal text.

This especially happens if you use comment tags to "comment out" text with HTML tags. (See the answer to "How can I include comments in HTML?") Although the correct syntax is <!-- --> (without "--" occurring anywhere inside the comment), some browsers will think the comment ends at the first ">" they see.

Validators will show you any syntax errors in your markup, but checkers such as Weblint and HTMLchek can show you where you are liable to provoke known browser bugs. See also the answer to "How can I check for errors?"

4.14. Why does my link work in Internet Explorer but not in Netscape?

Is there a space, #, ?, or other special character in the path or filename? Spaces are not legal in URLs. If you encode the space by replacing it with %20, your link will work.

You can encode any character in a URL as % plus the two-digit hex value of the character. (Hex digits A-F can be in upper or lower case.) According to the spec, only alphanumerics and the special characters $-_.,+!*'() need not be encoded.

You should encode all other characters when they occur in a URL, except when they're used for their reserved purposes. For example, if you wanted to pass the value "Jack&Jill" to a CGI script, you would need to encode the "&" character as "%26", which might give you a URL like the following: http://www.foo.com/foo.cgi?rhyme=Jack%26Jill&audience=child. Note that the "?" and other "&" character in this URL are not encoded since they're used for their reserved purposes.

See section 2.2 of RFC 1738 at <URL:http://www.w3.org/Addressing/rfc1738.txt> for the full story.
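
The encoding can be done by hand, but a library routine is less error-prone. The following Python sketch reproduces the examples above with the standard urllib.parse module. Note that Python's idea of which characters are safe is somewhat stricter than the list quoted above, so it may encode a few characters that the spec would let through; that is harmless.

```python
from urllib.parse import quote, urlencode

# Encode a space in a path or filename.
encoded_path = quote("my file.html")
print(encoded_path)

# Encode values for a query string; the "&" separating the two
# parameters is left alone because it serves its reserved purpose.
query = urlencode({"rhyme": "Jack&Jill", "audience": "child"})
url = "http://www.foo.com/foo.cgi?" + query
print(url)
```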

5. Other Media

5.1. How do I let people download a file from my page?

Once the file is uploaded to the server, you need only use an anchor reference tag to link to it. An example would be:

<a href="../files/foo.zip">Download Foo Now! (100kb ZIP)</a>

It is possible that the server might need to be configured for some different file types. (See the next Q&A.)

5.2. Why did my link to a _______ file only download a bunch of characters instead?

If you are trying to link to a particular type of file and it is not returning your desired response, chances are that the server needs to have the type configured. Talk to your system administrator about getting them to add the content type. Here is a list of common types that often need configuring:

Content Type	Description
application/msword	Microsoft Word Document
application/octet-stream	Unclassified binary data (often used for compressed file or executable)
application/pdf	PDF Document
application/wordperfect6.0	WordPerfect 6.0 Document
application/zip	ZIP archive
audio/x-wav	WAV audio format
audio/midi	MIDI audio format
audio/x-pn-realaudio	RealAudio
image/gif	GIF image format
image/jpeg	JPEG image format
image/png	PNG image format
text/html	HTML document
text/plain	Plain text
video/mpeg	MPEG video format
video/quicktime	QuickTime video format
video/x-msvideo	AVI video format

Another method of ensuring that your file is properly sent to the client is to compress it into a standard compression format. Virtually all servers are set to handle the .zip extension and it is widely recognized by users.

Some servers (NCSA, Apache, and others) can be set up so that individual users can configure content types themselves. Details are server dependent, so consult your server admin or documentation.

Note that Internet Explorer incorrectly ignores server-provided MIME types, so it sometimes "does the right thing" when the server is misconfigured. Other browsers correctly heed the server-provided MIME types, so they will reveal server misconfigurations.
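To see why an unconfigured type produces "a bunch of characters", here is a rough JavaScript sketch of the kind of lookup a server performs. It is an illustration only, not any real server's code: the function name, the choice of file extensions, and the text/plain default are all assumptions.

```javascript
// A few of the types from the table above, keyed by file extension.
// The extensions are assumptions; each server keeps its own mapping.
var CONTENT_TYPES = {
  zip: "application/zip",
  pdf: "application/pdf",
  doc: "application/msword",
  mid: "audio/midi",
  wav: "audio/x-wav",
  png: "image/png",
  html: "text/html",
  txt: "text/plain"
};

// Hypothetical lookup: return the configured type for a file name,
// or the server's default type when the extension is unknown.
function contentTypeFor(fileName, defaultType) {
  var dot = fileName.lastIndexOf(".");
  var ext = dot === -1 ? "" : fileName.slice(dot + 1).toLowerCase();
  if (Object.prototype.hasOwnProperty.call(CONTENT_TYPES, ext)) {
    return CONTENT_TYPES[ext];
  }
  return defaultType;
}

console.log(contentTypeFor("foo.zip", "text/plain")); // application/zip
console.log(contentTypeFor("foo.xyz", "text/plain")); // text/plain
```

In the second call the extension is not configured, so the file is labeled with the default type. If that default is text/plain, a browser that heeds the label will display binary data as text.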

5.6. Why am I getting a colored whisker to the left or right of my image?

This is the result of including "white space" (spaces and newlines) before or after an IMG inside an anchor. For example:

<A HREF=...>
<IMG SRC=...>
</A>

will have white space to the left and right of the image. Since many browsers display anchors with colored underscores by default, they will show the spaces to the left and right of the image with colored underscores.

Solution: don't leave any white space between the anchor tags and the IMG tag. If the line gets too long, break it inside the tag rather than outside it, like this:

<A HREF=...><IMG
SRC=...></A>

Style checkers such as Weblint will call attention to this problem in your HTML source.

5.10. How do I get an audio file to play automatically when someone visits my site?

Most browsers support the EMBED element for this, provided that the user has a suitable plug-in for the sound file. You can reach a slightly wider audience if you use BGSOUND as well. To avoid problems with browsers that support both, place the BGSOUND in a NOEMBED container:

<EMBED SRC="your_sound_file" HIDDEN=true AUTOSTART=true>
<NOEMBED><BGSOUND SRC="your_sound_file"></NOEMBED>

For more on the EMBED element, see <URL:http://developer.netscape.com/docs/manuals/htmlguid/tags14.htm#1286379>. See <URL:http://msdn.microsoft.com/developer/sdk/inetsdk/help/dhtml/references/html/BGSOUND.htm> for more information on BGSOUND. Note that these elements are proprietary and not in any HTML standard. (The HTML standard way of doing this is not well supported.)

Be aware that some users find it annoying for music to automatically start playing. They may not have the volume set properly on their speakers, or they may be listening to something else. As a courtesy to your users, you may prefer to offer the sound file as a link:

<A HREF="your_sound_file">Listen to my sound! (5 kB MIDI)</A>

5.11. How can I strip all the HTML from a document to get plain text?

One of the easiest ways is to open a document in a graphical browser such as Internet Explorer or Netscape, select all the text and copy it to the clipboard. Most browsers also have a "save as" function which will allow you to save the file as plain text.

Lynx users can use "lynx -dump http://..." on the command line to print the rendered text, followed by a list of the referenced URLs as footnotes; redirect the output to save it to a file. If you want the output file without the footnotes, use the "p" command to "print" to a text file.

Some HTML authoring tools have an option to strip all HTML as well. Two programs of note are

HomeSite, available from <URL:http://www.allaire.com/>
DiDa, available from <URL:http://www.faico.net/>

If you are looking for another method (in other words you want to make things more difficult on yourself), you can obtain programs which will strip away all HTML markup from a document. Try doing a search at <URL:http://www.altavista.com/> for the phrase "HTML stripper".
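For a sense of what such a stripper does, here is a deliberately naive JavaScript sketch. The function name is hypothetical, and the approach is an assumption about how simple strippers work: it does not handle comments, SCRIPT or STYLE content, or entities other than the three shown, so a browser's "save as text" will generally do a better job.

```javascript
// Naive HTML stripper: remove anything that looks like a tag, then
// decode the three entities that markup itself requires.
function stripTags(html) {
  return html
    .replace(/<[^>]*>/g, "")
    .replace(/&lt;/g, "<")
    .replace(/&gt;/g, ">")
    .replace(/&amp;/g, "&"); // decode "&" last so "&amp;lt;" becomes "&lt;"
}

console.log(stripTags("<P>Tom &amp; Jerry say <B>1 &lt; 2</B></P>"));
// Tom & Jerry say 1 < 2
```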

6. Presentational Effects

6.1. How can I make a custom rule?

Your best option is likely a centered IMG with a line of "--" characters as ALT text:

<P ALIGN=center><IMG SRC="custom-line.gif" ALT="--------------------"></P>

For an experimental but somewhat more graceful approach, read about CSS1 and the Decorative HR at <URL:http://ppewww.ph.gla.ac.uk/%7Eflavell/www/hrstyle.html>.

6.2. How can I make a list with custom bullets?

There are several methods, none completely satisfactory:

Use the list-style property of Cascading Style Sheets. This should be the preferred method of using custom bullets, but unfortunately it's not widely supported by browsers. However, non-supporting browsers will see a normal bullet, so using this method today is not a problem. See <URL:http://www.htmlhelp.com/reference/css/> for more information on style sheets.
Use a <DL> with <DD> tags with preceding images (with ALIGN and suitable ALT text) and no <DT>; this won't be as beautiful as a "real" list.
Use a two-column table, with the bullets in the left column and the text in the right. Since browsers show nothing before downloading the entire table, this can be slow with long lists.
Create the bullet with the indent built in. For example, if you use a bullet that is 10 pixels across you can make the background 25 pixels (transparent) and put the bullet all the way on the right. This will create a 15-pixel indent to the left of the bullet. It will add slightly to the byte size of the graphic but since it is all one color it won't add much. This method doesn't work well with any list items that are longer than a line (and remember that you don't know how long a line will be on the visitor's screen).

6.3. Where can I get a "hit counter"?

A hit counter is a small script or program that increases a number every time a document is accessed from the server.

Why do you want one? If you believe that it will tell you how many times your documents have been accessed, you are mistaken. No counter can keep track of accesses from browser caches or proxy caches. Some counters depend on image-loading to increment; such counters ignore accesses from text-mode browsers, or browsers with image-loading off, or from users who interrupted the transfer. Some counters even require access to a remote site, which may be down or overloaded, causing a delay in displaying your documents.

Most web servers log accesses to documents stored on the server machine. These logs may be processed to gain information about the *relative* number of accesses over an extended period. There is no reason to display this number to your viewers, since they have no reference point to relate this number to. Not all service providers allow access to server logs, but many have scripts that will output information about accesses to a given user's documents. Consult your sysadmin or service provider for details.
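As a rough idea of what such log processing involves, the following JavaScript sketch counts GET requests per document. It assumes the log uses the Common Log Format; the function name and the sample lines are made up for illustration.

```javascript
// Count GET requests per document in Common Log Format lines.
function countAccesses(logLines) {
  var counts = {};
  logLines.forEach(function (line) {
    // The request appears in quotes, e.g. "GET /index.html HTTP/1.0".
    var match = /"GET (\S+)/.exec(line);
    if (match) {
      var path = match[1];
      counts[path] = (counts[path] || 0) + 1;
    }
  });
  return counts;
}

// Made-up sample lines, for illustration only.
var sample = [
  'a.example.com - - [09/Aug/1999:10:00:00 +0000] "GET /index.html HTTP/1.0" 200 2048',
  'b.example.com - - [09/Aug/1999:10:05:00 +0000] "GET /faq.html HTTP/1.0" 200 4096',
  'a.example.com - - [09/Aug/1999:10:07:00 +0000] "GET /index.html HTTP/1.0" 304 -'
];
var counts = countAccesses(sample);
console.log(counts["/index.html"], counts["/faq.html"]); // 2 1
```

Like any counter, this sees only the requests that reached the server, so the totals are useful for comparing documents, not as an absolute number of readers.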

Counter services and information are available from Yahoo's list of counters at <URL:http://www.yahoo.com/Computers/World_Wide_Web/Programming/Access_Counts/>.

Log analysis tools and scripts are listed at <URL:http://www.yahoo.com/Business_and_Economy/Companies/Computers/Software/Internet/World_Wide_Web/Log_Analysis_Tools/>.

<URL:http://www.markwelch.com/bannerad/baf_counter.htm> is another good source for counter information.

6.4. How do I display the current date or time in my document?

With server-side includes. Ask your webmaster whether they are supported, and what the exact syntax is for your server. Note that this displays the local time of the server, not of the client, and if the document is cached, the date will be wrong after some time. JavaScript can display the client's local time, but since most people already have one or more clocks on their screen, why display another one?

If you plan on putting the current date or time on your pages, using a CGI, JavaScript or VBScript, take an extra breath and consider that it will take resources, add time to the loading of the page, and prevent good caching. If you find that you really have a need to use it, for instance to inform readers of the up-times of an FTP server, then by all means do so. If, on the other hand, your only reason is 'it looks cool!' - then reconsider.
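If you do conclude that the page needs it, the client-side approach might look something like the following JavaScript sketch. The helper names and the output format are assumptions chosen for illustration.

```javascript
// Pad a number to two digits, e.g. 5 -> "05".
function pad2(n) {
  return (n < 10 ? "0" : "") + n;
}

// Hypothetical helper: format a Date as "YYYY-MM-DD HH:MM" using the
// local time zone of the machine the script runs on (the reader's).
function formatLocalTime(date) {
  return date.getFullYear() + "-" + pad2(date.getMonth() + 1) + "-" +
    pad2(date.getDate()) + " " +
    pad2(date.getHours()) + ":" + pad2(date.getMinutes());
}

console.log(formatLocalTime(new Date()));
```

In a page, the result would be written into the document by a script rather than logged. Readers without JavaScript will see nothing, so the page should still make sense without the date.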

6.5. How do I get scrolling text in the status bar?

This is not an HTML question; it's done with JavaScript. Check any page which has this feature, and copy the script from the source.

This script has two big problems. First, it usually uses the decrement operator (c--) at some point. Scripts are commonly wrapped in an HTML comment to hide them from older browsers, and the "--" sequence actually closes a comment on some browsers, so your code may "leak" onto the page in those browsers. The same goes for ">".

Second, keep in mind that many people consider this even worse than <BLINK>, and that it also suppresses the status information which normally appears there. It prevents people from knowing where a link goes to.

6.6. How do I right align text or images?

You can use the ALIGN=right attribute on paragraphs, divisions, and headings, just as you use ALIGN=center to create centered paragraphs and such. This will right align your text (ragged left).

Perhaps what you really want is justified text, in which the left and right edges are both aligned so that all lines are the same length. (This is sometimes incorrectly called "right justify".) There's no way to justify text in HTML 3.2, but it can be done in a CSS1 style sheet with "text-align: justify". (Before you do that, a caveat: though justified text may look pretty, human factors analysis shows that ragged right is actually easier to read and understand.)

For images, you can use <IMG ALIGN=right SRC="..." ALT="..."> before the running text. The image will float at the right margin, and the running text will flow around it. Remember to use <BR CLEAR=right> or <BR CLEAR=all> to mark the end of the text that is to be affected in this way.

6.7. How can I specify fonts in my Web pages?

If you want others to view your web page with a specific font, the most appropriate way is to suggest the font with a style sheet. See <URL:http://www.htmlhelp.com/reference/css/font/font-family.html>.

The FONT element can also be used to suggest a specific font, but its use brings numerous usability and accessibility problems; see <URL:http://www.mcsr.olemiss.edu/%7Emudws/font.html>.

More information about the FONT element can be found at <URL:http://www.htmlhelp.com/reference/html40/special/font.html>.

Either way, authors run the risk that a reader's system has a font of the same name that looks significantly different (e.g., "Chicago" can be a nice text font, or a display font with letters formed by "bullet holes", or a dingbat font with building images for creating skylines).

Also, authors are limited to choosing a font (or a group of similar fonts) that is commonly available on many systems. If a reader does not have the font installed on their system, they will see a default font. Some browsers may use a less legible substitute font than their normal default font in cases where "the specified font wasn't found".

6.8. How do I indent the first line in my paragraphs?

Use a style sheet with the following ruleset:

P { text-indent: 5% }

See <URL:http://www.htmlhelp.com/reference/css/> for more information on style sheets.

6.9. How do I indent a lot of text?

Use a style sheet to set a left margin for the whole document or part of it:

  /* Entire document */
  BODY { margin-left: 20% }

  /* Part of a document with CLASS="foo" */
  .foo { margin-left: 15% }

See <URL:http://www.htmlhelp.com/reference/css/> for more information on style sheets.

6.10. How do I do a page break?

Page breaks are offered in Cascading Style Sheets, Level 2, but they are not well supported by browsers. See <URL:http://www.w3.org/TR/REC-CSS2/page.html#page-breaks> for information on CSS2 page breaks.

In general, page breaks are not appropriate on the Web since what makes a nice page break for you with your font and font size may be a poor page break for me with my font and font size.

If you need to produce a nicely formatted printed copy of your HTML documents, you might also consider using special purpose tools rather than your browser's Print function. For example, html2ps generates nicely formatted PostScript output from HTML documents, and HTML Scissor uses special HTML comments for suggesting page breaks.

6.11. How do I have a fixed background image?

Use a style sheet with the following ruleset:

body { color: black; background: white url(foo.gif) fixed }

Note that "fixed" in the above style sheet is a value of the CSS background-attachment property (set here through the background shorthand), not a property itself. It is supported by Internet Explorer 3+, Netscape Navigator 5+, and other browsers. In contrast, the proprietary BGPROPERTIES=fixed attribute is supported only by Internet Explorer 3+.

6.12. How do I have a non-tiled background image?

Use a style sheet with the following ruleset:

body { color: black; background: white url(foo.gif) no-repeat }

7. HTML Forms

7.1. How do I use forms?

Information relating to the use of forms is available at <URL:http://www.hut.fi/u/jkorpela/forms/>.

7.2. Why won't my form email the user's data to me?

Forms that use ACTION="mailto:..." are unreliable. They may work for some of your users, but they will fail for others who have different software configurations.

The only reliable solution is to use a CGI (or other server-side) program to process your forms and mail the results to you. If you can run CGI programs on your server, see the list of prewritten scripts at <URL:http://www.cgi-resources.com/Programs_and_Scripts/>. If you can't run CGI programs on your own server, see the list of remotely hosted form-to-email services at <URL:http://www.cgi-resources.com/Programs_and_Scripts/Remotely_Hosted/Form_Processing/>.

7.3. How do I make a form so it can be submitted by hitting ENTER?

The short answer is that the form should just have one <INPUT TYPE=TEXT> and no TEXTAREA, though it can have other form elements like checkboxes and radio buttons. For a more detailed answer, see <URL:http://ppewww.ph.gla.ac.uk/%7Eflavell/www/formquestion.html>.

7.4. How can I make a form with custom buttons?

Rather than a normal submit button (<INPUT TYPE=submit ...>), you can use an image of a custom submit button. Use <INPUT NAME=foo TYPE=image SRC="http://url.to/image.gif">. There is no way to do this for the reset button.

Most browsers will also send the x and y coordinates of the location where the user clicked on the image to the server. They are available as "foo.x=000&foo.y=000" in the CGI input.
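On the receiving side, the coordinates can be picked out of the decoded form data. The following JavaScript sketch shows the idea; it assumes an environment with the standard URLSearchParams API, and the function name is hypothetical.

```javascript
// Hypothetical helper: extract the click position sent for
// <INPUT NAME=foo TYPE=image ...> from URL-encoded form data.
function clickPosition(formData, name) {
  var fields = new URLSearchParams(formData);
  var x = fields.get(name + ".x");
  var y = fields.get(name + ".y");
  if (x === null || y === null) {
    return null; // the browser sent no coordinates
  }
  return { x: parseInt(x, 10), y: parseInt(y, 10) };
}

var pos = clickPosition("foo.x=12&foo.y=34", "foo");
console.log(pos.x, pos.y); // 12 34
```

Since not every browser sends coordinates, the receiving program should treat a missing pair as a plain submission rather than an error.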

7.5. Can I have two or more Submit buttons in the same form?

Sure. This is part of HTML 2.0 Forms support (some early browsers did not support it, but browser coverage is now excellent).

You will need to give your Submit buttons a NAME attribute, and, optionally, a VALUE attribute. In order to determine which button was used, you will want to use distinctive names, or values, or both. Browsers will display the value, in addition to sending it to the server, so choose something that's meaningful to the user.

Example:

<INPUT TYPE=SUBMIT NAME=join VALUE="I want to join now"> -or-
<INPUT TYPE=SUBMIT NAME=info VALUE="Please send full details">

If you're unsure what results you're going to get when you submit your form, NCSA has a standard script which you can use. Code this, for example (assuming method "post"):

<form method="post" action="http://hoohoo.ncsa.uiuc.edu/htbin-post/post-query">

and then go through the motions of submitting your form. The NCSA server decodes the form input, and displays the result to you.
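A receiving program can tell the buttons apart because only the button that was actually used is included in the submitted data. The following JavaScript sketch illustrates this for the two buttons in the example above; it assumes the standard URLSearchParams API, and the function name and the "email" field are hypothetical.

```javascript
// Hypothetical helper: report which Submit button from the example
// above ("join" or "info") was used, given URL-encoded form data.
function pressedButton(formData) {
  var fields = new URLSearchParams(formData);
  if (fields.has("join")) return "join";
  if (fields.has("info")) return "info";
  return null; // neither button's name was sent
}

var data = "email=me%40example.com&info=Please+send+full+details";
console.log(pressedButton(data)); // info
console.log(new URLSearchParams(data).get("info")); // Please send full details
```

Testing for the name, as here, keeps the program working if you later reword the VALUE that readers see on the button.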

7.6. How can I allow file uploads to my web site?

First of all, the RFC for this is located at <URL:http://www.ics.uci.edu/pub/ietf/html/rfc1867.txt>.

File upload is handled by the CGI.pm Perl5 library available from <URL:http://stein.cshl.org/WWW/software/CGI/cgi_docs.html>. The most recent versions of the cgi-lib.pl library also support file upload.

These things are necessary for Web-based uploads:

An HTTP server that accepts uploads.
Access to the /cgi-bin/ directory (or wherever your server runs scripts from) to install the receiving script.
A form implemented something like this:

<form method="POST" enctype="multipart/form-data" action="fup.cgi">
File to upload: <input type=file name=upfile><br>
Notes about the file: <input type=text name=note><br>
<input type=submit value=Press> to upload the file!
</form>

Not all browsers support form-based file upload, so try to give alternatives where possible. Also, if you need to do file upload in conjunction with form-to-email, the Perl package MIME::Lite handles email attachments.

7.7. How can I use forms for pull-down navigation menus?

There is no way to do this in HTML only; something else must process the form. JavaScript processing will work only for readers with JavaScript-enabled browsers. CGI and other server-side processing is reliable for human readers, but search engines have problems following any form-based navigation.

See <URL:http://www.hut.fi/u/jkorpela/forms/navmenu.html>, which explains how to create pull-down menus, as well as some better navigation alternatives.
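The server-side processing mentioned above amounts to mapping the submitted menu choice to a redirect. Here is a minimal JavaScript sketch of that logic, assuming the standard URLSearchParams API. The field name "page", the menu values, and the example.com URLs are all hypothetical.

```javascript
// Hypothetical menu: option values mapped to the pages they lead to.
var DESTINATIONS = {
  home: "http://www.example.com/",
  faq: "http://www.example.com/faq/",
  ref: "http://www.example.com/reference/"
};

// Decide where to redirect, given the URL-encoded form data.
// Unknown or missing choices fall back to the home page, so the
// script cannot be used to redirect readers to arbitrary URLs.
function redirectFor(formData) {
  var choice = new URLSearchParams(formData).get("page");
  var known = choice !== null &&
    Object.prototype.hasOwnProperty.call(DESTINATIONS, choice);
  return {
    status: 302,
    location: known ? DESTINATIONS[choice] : DESTINATIONS.home
  };
}

console.log(redirectFor("page=faq").location); // http://www.example.com/faq/
```

A real program would send the status and location back as an HTTP redirect response; how that is done depends on the server environment.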

8. HTML Frames

8.1. How do I make a link in one frame update another frame?

In the frameset document (the HTML document containing the <FRAMESET> and <FRAME> elements), make sure to name the individual frames using the NAME attribute. The following example creates a top frame named "navigation" and a bottom frame named "content":

<FRAMESET ROWS="*,3*">
    <FRAME NAME="navigation" SRC="navigation.html">
    <FRAME NAME="content" SRC="content.html">
    <NOFRAMES><BODY>
        <!-- Alternative non-framed version -->
    </BODY></NOFRAMES>
</FRAMESET>

Then, in the document with the link, use the TARGET attribute to specify which frame should be used to display the link. (The value of the TARGET attribute should match the value of the target frame's NAME attribute.) You can specify the target frame for each link individually (e.g., <A TARGET="content" HREF=...>), or you can use <BASE TARGET=...> to set a default target frame for every link in the document.

8.2. Why do my links open new windows rather than update an existing frame?

If there is no existing frame with the name you used for the TARGET attribute, then a new browser window will be opened, and this window will be assigned the name you used. Furthermore, TARGET="_blank" will open a new, unnamed browser window.

In HTML 4.0, the TARGET attribute value is case-insensitive, so that abc and ABC both refer to the same frame/window, and _top and _TOP both have the same meaning. However, most browsers treat the TARGET attribute value as case-sensitive and do not recognize ABC as being the same as abc, or _TOP as having the special meaning of _top.

8.3. How do I update two frames at once?

There are two basic techniques for updating multiple frames with a single link: The HTML-based technique links to a new frameset document that specifies the new combination of frames. The JavaScript-based solution uses the onClick attribute of the link to update the additional frame (or frames).

The HTML-based technique can link to a new frameset document with TARGET="_top" (replacing the entire frameset), but there is an alternative if the frames to be updated are part of a nested frameset. In the initial frameset document, use a secondary frameset document to define the nested frameset. For example:

<FRAMESET COLS="*,3*">
    <FRAME SRC="contents.html" NAME="Contents">
    <FRAME SRC="frameset2.html" NAME="Display">
</FRAMESET>

A link can now use TARGET="Display" to replace simultaneously all the frames defined by frameset2.html.

The JavaScript-based solution uses the onClick attribute of the link to perform the secondary update. For example:

<A HREF="URL1" TARGET=Frame1
   onClick="top.Frame2.location='URL2';">Update frames</A>

The link will update Frame1 with URL1 normally. If the reader's browser supports JavaScript (and has it enabled), then Frame2 will also be updated (with URL2).

8.4. How do I get out of a frameset?

If you are the author, this is easy. You only have to add the TARGET attribute to the link that takes readers to the intended 'outside' document. Give it the value of _top.

If you are the reader, it is harder. In many current browsers, it is not possible to display a frame in the full browser window, at least not very easily. The reader would need to copy the URL of the desired frame and then request that URL manually.

Authors who want to offer readers this option should include in each framed document a link to that same document, with the TARGET attribute set to _top, so that the document displays in the full window if the link is followed.

8.5. How do I make sure my framed documents are displayed inside their frameset?

When the sub-documents of a frameset state are accessed directly, they appear without the context of the surrounding frameset.

If the reader's browser has JavaScript support enabled, the following script will restore the frameset:

<SCRIPT TYPE="text/javascript">
<!--
if (parent.location.href == self.location.href) {
    if (window.location.replace)
        window.location.replace('frameset.html');
    else
        // causes problems with back button, but works
        window.location.href = 'frameset.html';
}
//  -->
</SCRIPT>

A more universal approach is a "restore frames" link:

<A HREF="frameset.html" TARGET="_top">Restore Frames</A>

Note that in either case, you must have a separate frameset document for every content document. If you link to the default frameset document, then your reader will get the default content document, rather than the content document he/she was trying to access. These frameset documents should be generated automatically, to avoid the tedium and inaccuracy of creating them by hand.

Note that you can work around the problem with bookmarking frameset states by linking to these separate frameset documents using TARGET="_top", rather than linking to the individual content documents.
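Generating the per-document framesets automatically could look something like this JavaScript sketch. It reuses the frame names from the example in 8.1; the function names, the file names, and the NOFRAMES content are assumptions for illustration.

```javascript
// Escape the characters that are special in HTML text and attributes.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Hypothetical generator: build a frameset document whose content
// frame shows the given document instead of the default one.
function framesetFor(contentUrl, title) {
  var url = escapeHtml(contentUrl);
  var text = escapeHtml(title);
  return [
    "<HTML><HEAD><TITLE>" + text + "</TITLE></HEAD>",
    '<FRAMESET ROWS="*,3*">',
    '    <FRAME NAME="navigation" SRC="navigation.html">',
    '    <FRAME NAME="content" SRC="' + url + '">',
    "    <NOFRAMES><BODY>",
    '        <A HREF="' + url + '">' + text + "</A>",
    "    </BODY></NOFRAMES>",
    "</FRAMESET></HTML>"
  ].join("\n");
}

console.log(framesetFor("chapter2.html", "Chapter 2"));
```

Running this once per content document, and saving each result under a predictable name, gives every content document its own frameset to restore.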

8.6. Is there a way to prevent getting framed?

"Getting framed" refers to having your documents displayed within someone else's frameset without your permission. This can happen accidentally (the frameset author forgot to use TARGET="_top" when linking to your document) or intentionally (the frameset author wanted to display your content with his/her own navigation or banner frames).

To avoid "framing" other people's documents, you must add TARGET="_top" to all links that lead to documents outside your intended scope.

Unfortunately, there is no reliable way to specify that a particular document should be displayed in the full browser window, rather than in the current frame. If you can configure your server to send the proprietary header Window-Target: _top in the HTTP response, then Netscape browsers will display your document in the full browser window. However, other browsers ignore this header, and it doesn't work to use <META HTTP-EQUIV="Window-target" CONTENT="_top"> in the document itself to mimic the HTTP response.

Another workaround is to use <BASE TARGET="_top"> in the document, but this only specifies the default target frame for links in the current document, not for the document itself.

If the reader's browser has JavaScript enabled, the following script will automatically remove any existing framesets:

<SCRIPT TYPE="text/javascript">
<!--
if (top.frames.length!=0)
    top.location=self.document.location;
// -->
</SCRIPT>

An alternative script is

<SCRIPT TYPE="text/javascript">
<!--
function breakOut() {
    if (self != top) 
        window.open("my URL","_top","");
}
// -->
</SCRIPT>
</HEAD>
<BODY onLoad="breakOut()">

8.9. How do I change the title of a framed document?

The title displayed is the title of the frameset document rather than the titles of any of the pages within frames. To change the title displayed, link to a new frameset document using TARGET="_top" (replacing the entire frameset).

8.11. Are there any problems with using frames?

The fundamental problem with the design of frames is that framesets create states in the browser that are not addressable. Once any of the frames within a frameset changes from its default content, there is no longer a way to address the current state of the frameset. It is difficult to bookmark - and impossible to link or index - such a frameset state. It is impossible to reference such a frameset state in other media. When the sub-documents of such a frameset state are accessed directly, they appear without the context of the surrounding frameset. Basic browser functions (e.g., printing, moving forwards/backwards in the browser's history) behave differently with framesets.

Furthermore, frames focus on layout rather than on information structure, and many authors of framed sites neglect to provide useful alternative content in the <NOFRAMES> element. Both of these factors cause accessibility problems for browsers that differ significantly from the author's expectations and for search engines.

For further discussion, see <URL:http://www.htmlhelp.com/design/frames/whatswrong.html>.