comp.infosystems.www FAQ

1: Recent changes to the FAQ
2: Information about this document
3: Elementary Questions
4: Advanced Questions
5: Credits

1: Recent changes to the FAQ

3/15/94: Information on saving inline images to disk added

2: Information about this document

This is an introduction to the World Wide Web project, describing the concepts, software and access methods. It is aimed at people who know a little about navigating the Internet, but want to know more about WWW specifically. If you don't think you are up to this level, try an introductory Internet book such as Ed Krol's "The Whole Internet" or "Big Dummy's Guide to the Internet". The latter is available electronically by anonymous FTP from ftp.eff.org in the directory pub/Net_info/Big_Dummy.

This informational document is posted to news.answers, comp.infosystems.www, comp.infosystems.gopher, comp.infosystems.wais and alt.hypertext on the 1st and 15th of every month (please allow a day or two for it to propagate to your site). The latest version is always available on the web as <http://siva.cshl.org/~boutell/www_faq.html>. (See the section titled "What is a URL?" to understand what this means.)

The most recently posted version of this document is kept on the news.answers archive on rtfm.mit.edu in /pub/usenet/news.answers/www/faq <file://rtfm.mit.edu/pub/usenet/news.answers/www/faq>. For information on FTP, send e-mail to mail-server@rtfm.mit.edu with "send usenet/news.answers/finding-sources" in the body, instead of asking me.

Thomas Boutell maintains this document. Please send feedback about it by e-mail to boutell@netcom.com.

In all cases, regard this document as out of date. Definitive information should be on the web, and static versions such as this should be considered unreliable at best. Please excuse any formatting inconsistencies in the posted version of this document, as it is automatically generated from the on-line version.

3: Elementary Questions

3.1: What are WWW, hypertext and hypermedia?

WWW stands for "World Wide Web". The WWW project, started by CERN (the European Laboratory for Particle Physics), seeks to build a distributed hypermedia system.

To access the web, you run a browser program. The browser reads documents, and can fetch documents from other sources. Information providers set up hypermedia servers which browsers can get documents from.

The browsers can, in addition, access files by FTP, NNTP (the Internet news protocol), gopher and an ever-increasing range of other methods. On top of these, if the server has search capabilities, the browsers will permit searches of documents and databases.

The documents that the browsers display are hypertext documents. Hypertext is text with pointers to other text. The browsers let you deal with the pointers in a transparent way -- select the pointer, and you are presented with the text that is pointed to.

Hypermedia is a superset of hypertext -- it is any medium with pointers to other media. This means that browsers might not display a text file, but might display images or sound or animations.

3.2: What is a URL?

URL stands for "Uniform Resource Locator". It is a draft standard for specifying an object on the Internet, such as a file or newsgroup.

URLs look like this:

file://wuarchive.wustl.edu/mirrors/msdos/graphics/gifkit.zip
file://wuarchive.wustl.edu/mirrors
http://info.cern.ch:80/default.html
news:alt.hypertext
telnet://dra.com

The first part of the URL, before the colon, specifies the access method. The part of the URL after the colon is interpreted according to the access method. In general, two slashes after the colon introduce a machine name (machine:port is also valid).
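As a rough illustration of the paragraph above, the sketch below splits a URL into access method, machine, port and remainder. It uses the urllib.parse module from Python's standard library, which is not part of the WWW software described in this FAQ; the function name split_url is made up for the example.

```python
from urllib.parse import urlsplit

def split_url(url):
    """Return (access method, machine, port, remainder) for a URL."""
    parts = urlsplit(url)
    # hostname and port are None when the URL has no "//machine" part,
    # as in news:alt.hypertext
    return parts.scheme, parts.hostname, parts.port, parts.path

if __name__ == "__main__":
    for url in ("http://info.cern.ch:80/default.html",
                "news:alt.hypertext",
                "telnet://dra.com"):
        print(split_url(url))
```

Note that a URL without a port, such as telnet://dra.com, reports the port as None; the default port for each access method is left to the program that actually makes the connection.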

In this document, you will often see URLs surrounded by angle brackets. This is done because some newsreaders (I am told) can recognize them and treat them as "buttons". Do not enter the angle brackets when entering a URL by hand to your web browser.
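For readers who want to pull the bracketed URLs out of a saved copy of this document, here is a minimal sketch in Python. The pattern is an assumption about what counts as a URL (an access method, a colon, then anything without spaces or angle brackets); it is not taken from the URL draft standard, and the name urls_in is invented for the example.

```python
import re

# Assumed shape: <method:rest>, with no whitespace or brackets inside.
BRACKETED_URL = re.compile(r"<([A-Za-z][A-Za-z0-9+.-]*:[^<>\s]+)>")

def urls_in(text):
    """Return the URLs found between angle brackets, brackets removed."""
    return BRACKETED_URL.findall(text)
```

The returned strings carry no angle brackets, so they can be handed to a browser as they are.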

3.3: How can I access the web?

You have two options -- either use a browser that can be telnetted to, or use a browser on your machine.

3.3.1: Browsers accessible by telnet

An up-to-date list of these is available on the Web as http://info.cern.ch/hypertext/WWW/FAQ/Bootstrap.html and should be regarded as an authoritative list.

info.cern.ch: No password is required. This is in Switzerland, so continental US users might be better off using a closer browser.
ukanaix.cc.ukans.edu: A full screen browser "Lynx" which requires a vt100 terminal. Log in as www.
www.njit.edu: (or telnet 128.235.163.2) Log in as www. A full-screen browser at the New Jersey Institute of Technology, USA.
vms.huji.ac.il: (IP address 128.139.4.3). A dual-language Hebrew/English database, with links to the rest of the world. The line mode browser, plus extra features. Log in as www. Hebrew University of Jerusalem, Israel.
sun.uakom.cs: Slovakia. Has a slow link; use only from nearby.
info.funet.fi: (or telnet 128.214.6.102). Log in as info. Not working.
fserv.kfki.hu: Hungary. Has a slow link; use only from nearby. Log in as www.

3.3.2: Obtaining browsers

The preferred way to access the Web is to run a browser yourself. Browsers are available for many platforms, both in source and executable forms. Here is a list generated from the authoritative list, http://info.cern.ch/hypertext/WWW/Clients.html.

Terminal based browsers

Line Mode Browser: This program gives W3 readership to anyone with a dumb terminal. A general purpose information retrieval tool. Available by anonymous ftp from info.cern.ch in the directory /pub/www/src.
"Lynx" full screen browser: This is a hypertext browser for vt100s using full screen, arrow keys, highlighting, etc. Available by anonymous FTP from ftp2.cc.ukans.edu.
Tom Fine's perlWWW: A tty-based browser written in perl. Available by anonymous FTP from archive.cis.ohio-state.edu in the directory pub/w3browser as the file w3browser-0.1.shar.
For VMS: Dudu Rashty's full screen client based on VMS's SMG screen management routines. Available by anonymous FTP from vms.huji.ac.il in the directory www/www_client.
NCSA Mosaic for VMS: Browser using X11/DecWindows/Motif. Multimedia magic. Full HTTP 1.0 support including POST-method forms, image maps, etc. Recommended if you can run it. Available by anonymous FTP from ftp.ncsa.uiuc.edu in the directory Mosaic.
Emacs w3-mode: W3 browse mode for emacs. Uses multiple fonts when used with Lemacs or Epoch. See doc. Available by anonymous FTP.

Microsoft Windows

Cello: Browser from Cornell LII. Available by anonymous FTP from fatty.law.cornell.edu in the directory /pub/LII/cello.
Mosaic for Windows: From NCSA. Available by anonymous FTP from ftp.ncsa.uiuc.edu in the directory PC/Mosaic.

Macintosh

NOTE: all of these browsers require that you have SLIP, PPP or other TCP/IP networking on your PC. SLIP and PPP can be accomplished over phone lines, but only with the active cooperation of your network provider or educational institution. If you only have normal dialup shell access, your best option at this time is to run Lynx on the system you call.
Mosaic for Macintosh: From NCSA. Full featured. Available by anonymous FTP from ftp.ncsa.uiuc.edu in the directory Mac/Mosaic.
Samba: From CERN. Basic. Available by anonymous FTP from info.cern.ch in the directory /ftp/pub/www/bin as the file mac.

XWindows

NCSA Mosaic for X: Browser using X11/Motif. Multimedia magic. Full HTTP 1.0 support including POST-method forms, image maps, etc. Recommended if you can run it. Available by anonymous FTP from ftp.ncsa.uiuc.edu in the directory Mosaic.
tkWWW Browser/Editor for X11: Browser/Editor for X11. (Beta test version.) Available for anonymous FTP from export.lcs.mit.edu in the directory contrib as tkWWW-0.10.tar.Z. (Note: this document may be out of date, so you may prefer to ftp to this site by hand and look for an even newer version rather than using the link above.)
MidasWWW Browser: From Tony Johnson. (Beta, works well.)
Chimera: Browser using Athena (doesn't require Motif). Supports forms, inline images, etc.; closest to Mosaic in feel of the non-Motif X11 browsers. Available for anonymous FTP from ftp.cs.unlv.edu in the directory /pub/chimera.

NeXTStep

Browser-Editor on the NeXT: A browser/editor for NeXTStep. Allows wysiwyg hypertext editing. Requires NeXTStep 3.0. Available for anonymous FTP from info.cern.ch in the directory /pub/www/src.

Batch Mode

Batch mode browser: A batch-mode "browser", url_get, which is available through the URL <http://wwwhost.cc.utexas.edu/test/zippy/url_get.html>. (I am not aware of an anonymous FTP site for the same package at present.) This package is intended for use in cron jobs and other settings in which fetching a page in a command-line fashion is useful.
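To give a feel for what command-line fetching involves, here is a minimal sketch in Python. It is not url_get and makes no claim to match its options or behavior; it only shows the general idea using the standard urllib.request module. The function name fetch and the 30-second timeout are assumptions made for this example.

```python
from urllib.request import urlopen

def fetch(url, timeout=30):
    """Return the raw bytes of the document named by url."""
    # urlopen picks the access method from the part before the colon
    with urlopen(url, timeout=timeout) as response:
        return response.read()
```

A cron job could call fetch() on a fixed URL and write the returned bytes to a file; error handling (unreachable hosts, missing pages) is left out of the sketch and would be needed in practice.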

Unreleased or Unsupported

Browser on CERNVM: A full-screen browser for VM. Nonexistent. Use the line mode www. Might arrive suddenly one day.
Dave Raggett's Browser: Unreleased. For X11 (later PC?).
Erwise: X-windows early browser. Unsupported, now of historical interest only.
NJIT's Browser: Assumes a character-grid terminal with cursor addressing, and provides a full-screen interface to the web.

3.4: How can I provide information to the web?

Information providers run programs that the browsers can obtain hypertext from. These programs can be WWW servers that understand the HyperText Transfer Protocol, HTTP (best if you are creating your information database from scratch); "gateway" programs that convert an existing information format to hypertext; or non-HTTP servers that WWW browsers can access, such as anonymous FTP or gopher.

If you only want to provide information to local users, placing your information in local files is also an option. This means, however, that there can be no off-machine access.
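To make the idea of an HTTP server concrete, here is a toy sketch in Python that answers every request with one fixed hypertext document. It is built on the http.server module from Python's standard library and has nothing to do with the CERN or NCSA servers; the page contents, the names OnePageHandler and make_server, and the default port 8000 are all assumptions made for illustration.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

PAGE = b"<TITLE>Hello</TITLE>\n<H1>Hello, web</H1>\n"  # hypothetical document

class OnePageHandler(BaseHTTPRequestHandler):
    """Answer every GET request with the same hypertext document."""

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()
        self.wfile.write(PAGE)

    def log_message(self, format, *args):
        pass  # keep the sketch quiet; a real server would log requests

def make_server(port=8000):
    """Create (but do not start) a server; port 0 picks any free port."""
    return HTTPServer(("127.0.0.1", port), OnePageHandler)
```

Calling make_server().serve_forever() starts answering requests on the local machine only. A real server maps the requested path to a document rather than ignoring it.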

3.4.1: Obtaining Servers

CERN's server is available for anonymous FTP from info.cern.ch and many other places. Use archie to search for "www" or "WWW" to find copies close to you. NCSA has also released a server, available for FTP from ftp.ncsa.uiuc.edu.

See http://info.cern.ch/hypertext/WWW/Daemon/Overview.html for more information on writing servers and gateways in general.

3.4.2: Producing HTML documents

There are several ways to produce HTML. One is to simply write it by hand; try the "source" button of your browser to look at the HTML for an interesting page. The odds are that it'll be a great deal simpler than you would expect. If you're used to marking up text in any way (even red-pencilling it), HTML should be rather intuitive. A beginner's guide to HTML is available at the URL <http://www.ncsa.uiuc.edu/General/Internet/WWW/HTMLPrimer.html>.

Of course, most folks would still prefer to use a friendlier, graphical editor. One option is to use an SGML editor with the HTML DTD. Another, for EMACS fans, is to use EMACS and html-mode.el.

In addition, there are two collections of filters for converting your existing documents (in TeX and other non-HTML formats) into HTML automatically:

Rich Brandwein and Mike Sendall's List at CERN.

NCSA's List of Filters and Editors, which also mentions two editors for MS Windows.
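For a sense of what the simplest such filter does, here is a small sketch in Python that turns plain text into HTML. It is not one of the filters on either list; it assumes paragraphs are separated by blank lines and handles nothing else (no headings, lists or links). The function name text_to_html is invented for the example.

```python
import html

def text_to_html(text, title="Untitled"):
    """Convert plain text with blank-line-separated paragraphs to HTML."""
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    lines = ["<TITLE>%s</TITLE>" % html.escape(title, quote=False)]
    for paragraph in paragraphs:
        # escape &, < and > so they are not read as markup
        lines.append("<P>%s</P>" % html.escape(paragraph, quote=False))
    return "\n".join(lines) + "\n"
```

Real filters for formats such as TeX do a great deal more, since they have to translate the source format's own markup rather than just escape characters.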

Finally, tkWWW (listed above under XWindows browsers) supports HTML editing.

3.5: How does WWW compare to gopher and WAIS?

While all three of these information presentation systems are client-server based, they differ in terms of their model of data. In gopher, data is either a menu, a document, an index or a telnet connection. In WAIS, everything is an index and everything that is returned from the index is a document. In WWW, everything is a (possibly) hypertext document which may be searchable.

In practice, this means that WWW can represent the gopher (a menu is a list of links, a gopher document is a hypertext document without links, searches are the same, telnet sessions are the same) and WAIS (a WAIS index is a searchable page, returning a document with no links) data models as well as providing extra functionality.
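The claim that a gopher menu is just a list of links can be sketched in a few lines of Python. The sketch assumes the usual gopher menu layout (a one-character item type, then the display name, selector, host and port separated by tabs, with a lone "." ending the menu) and treats type 8 as a telnet session. The function name and the choice of an HTML unordered list are assumptions of this example, not a description of how any particular browser does it.

```python
import html
from urllib.parse import quote

def gopher_menu_to_html(menu_text):
    """Render the lines of a gopher menu as an HTML list of links."""
    items = []
    for line in menu_text.splitlines():
        if not line or line == ".":
            continue  # "." marks the end of a gopher menu
        fields = line[1:].split("\t")
        if len(fields) < 4:
            continue  # not a well-formed menu line; skip it
        item_type = line[0]
        name, selector, host, port = fields[:4]
        if item_type == "8":   # telnet session
            url = "telnet://%s:%s" % (host, port)
        else:                  # document, menu, index and so on
            url = "gopher://%s:%s/%s%s" % (host, port, item_type, quote(selector))
        items.append('<LI><A HREF="%s">%s</A>' % (html.escape(url), html.escape(name)))
    return "<UL>\n" + "\n".join(items) + "\n</UL>\n"
```

Each menu entry becomes one link, which is the whole of the mapping; the page has no text of its own beyond the link names.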

The principal difference between the three systems, it turns out, is deployment. WWW does not have as large a user base as gopher, mainly because of the small number of WWW browsers that are out. This is changing as WWW reaches critical mass (usage of the server at CERN doubles every 4 months -- twice the rate of Internet expansion).

3.6: What is on the web?

Currently accessible through the web:

anything served through gopher
anything served through WAIS
anything on an FTP site
anything on Usenet
anything accessible through telnet
anything in hytelnet
anything in hyper-g
anything in techinfo
anything in texinfo
anything in the form of man pages
sundry hypertext documents

One of the few limitations of the current networked information systems is that there is no simple way to find out what has changed, what is new, or even what is out there. As a result, a definitive list of the web's contents is impossible at this moment. There are, however, several resources which provide a great deal of information on new and established servers by topic. These are just two:

The WWW Virtual Library at the URL <http://info.cern.ch/hypertext/DataSources/bySubject/Overview.html>, a good place to find resources on a particular subject
What's New With NCSA Mosaic at the URL <http://www.ncsa.uiuc.edu/SDG/Software/Mosaic/Docs/whats-new.html>, which carries announcements of new servers on the web

3.7: I want to know more

To find out more, use the web. This FAQ hopefully provides enough information for you to locate and install a browser on your system. If you have system specific questions regarding FTP, networking and the like, please consult newsgroups relevant to your particular hardware and operating system!

Later you may return to this FAQ for answers to some of the advanced questions covered in section 4. That section contains the most-asked technical questions in the group.

4: Advanced Questions

4.1: How do I set up a clickable image map?

There are really two issues here: how to indicate in HTML that you want an image to be clickable, and how to configure your server to do something with the clicks returned by Mosaic, Chimera, and other clients capable of delivering them.

You can read about image maps and the NCSA server at the URL <http://hoohoo.ncsa.uiuc.edu/docs/setup/admin/Imagemap.html>.
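The server-side half of the problem can be sketched as follows, in Python. When an inline image is marked as a map, the client appends the click position to the link's URL as a query of the form "x,y"; the program below turns that query into a destination URL. The region table, the example.org URLs and the name resolve_click are all hypothetical, and the table is not in the NCSA server's map file format, which is described at the URL above.

```python
# Hypothetical map: (left, top, right, bottom) rectangles in pixels -> URL.
REGIONS = [
    ((0, 0, 99, 49), "http://www.example.org/north.html"),
    ((0, 50, 99, 99), "http://www.example.org/south.html"),
]
DEFAULT_URL = "http://www.example.org/nowhere.html"

def resolve_click(query, regions=REGIONS, default=DEFAULT_URL):
    """Map a click query of the form "x,y" to the URL of its region."""
    try:
        x, y = (int(n) for n in query.split(","))
    except ValueError:
        return default  # malformed or missing coordinates
    for (left, top, right, bottom), url in regions:
        if left <= x <= right and top <= y <= bottom:
            return url
    return default
```

A click outside every rectangle falls through to the default URL.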

4.2: How do I make a "link" that doesn't load a new page?

Such links are useful when a form is intended to perform some action on the server machine without sending new information to the client, or when a user has clicked in an undefined area in an image map; these are just two possibilities.

Rob McCool of NCSA provided the following wisdom on the subject:

Yechezkal-Shimon Gutfreund (sg04@gte.com) wrote:
: Ok, here is another bizzare request from me:

: I am currently running scripts which I "DO NOT" want to return
: any visible result. That is, not text/plain, not text/HTML, not
: image/gif. The entire results are the side effects of the
: script and nothing should be returned to the viewer.

: It would be nice to have an internally supported null viewer
: so that I could do this, more "cleanly" (ok, ok, I hear your groans).

HTTP now supports a response code of 204, which is no operation. Some
browsers such as Mosaic/X 2.* support it. To use it, make your script a nph
script and output an HTTP/1.0 204 header. Something like:

HTTP/1.0 204 No response
Server: Myscript/NCSA httpd 1.1

(You can learn more about nph scripts from the NCSA server documentation at the URL <http://hoohoo.ncsa.uiuc.edu/docs>.) Essentially they are scripts that handle their own HTTP response codes.
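A script along the lines Rob describes might look like the following Python sketch. It assumes the server runs it as an nph script, so that whatever the script writes is sent to the client unchanged; how a script is marked as nph is a matter for the server documentation. The header values are copied from the example above, and the function names are invented for illustration.

```python
import sys

def no_content_response(server="Myscript/NCSA httpd 1.1"):
    """Build a complete 204 response: status line, headers, blank line."""
    lines = [
        "HTTP/1.0 204 No response",
        "Server: %s" % server,
        "",  # the empty line ends the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")

def main(out=None):
    """Do the script's work, then tell the browser to stay where it is."""
    if out is None:
        out = sys.stdout.buffer
    # ... perform the script's side effects here ...
    out.write(no_content_response())
    out.flush()
```

The script sends no body at all: the response ends with the blank line after the headers.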

4.3: Where can I learn how to create fill-out forms?

You can read about the Common Gateway Interface at the URL <http://hoohoo.ncsa.uiuc.edu:80/cgi/>. In addition to documenting the standard interface for which scripts can now be written for both NCSA and CERN-derived servers, these pages also cover HTML forms and how to handle the results on the server side.
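As a taste of the server side, here is a minimal Python sketch of how a CGI script might read the fields of a submitted form. It relies on the CGI environment variables REQUEST_METHOD, QUERY_STRING and CONTENT_LENGTH, and on the standard urllib.parse module for decoding; the function name read_form and its arguments are assumptions of this example, and the pages at the URL above remain the reference for the interface itself.

```python
import sys
from urllib.parse import parse_qs

def read_form(environ, stdin=None):
    """Return form fields as a dict mapping each name to a list of values."""
    if environ.get("REQUEST_METHOD", "GET").upper() == "POST":
        # POST: the encoded fields arrive on standard input
        length = int(environ.get("CONTENT_LENGTH") or 0)
        stream = stdin if stdin is not None else sys.stdin.buffer
        encoded = stream.read(length).decode("ascii")
    else:
        # GET: the encoded fields follow the "?" in the URL
        encoded = environ.get("QUERY_STRING", "")
    return parse_qs(encoded, keep_blank_values=True)
```

In a real script the first argument would be os.environ. Values come back as lists because a form may send the same field name more than once.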

4.4: How can I save an inline image to disk?

Here are two ways:

1. Turn on "load to local disk" in your browser, if it has such an option; then reload images. You'll be prompted for filenames instead of seeing them on the screen. Be sure to shut it off when you're done with it.

2. Choose "view source" and browse through the HTML source; find the URL for the inline image of interest to you; copy and paste it into the "Open URL" window. This should load it into your image viewer instead, where you can save it and otherwise muck about with it.
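The second method can also be done mechanically. The Python sketch below scans a page's HTML source for inline images and turns each one into a full URL that can be opened directly. It uses the standard html.parser and urllib.parse modules; the class and function names, and the example.org addresses in the usage note, are made up for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class InlineImageFinder(HTMLParser):
    """Collect the SRC attribute of every IMG tag in a page."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us
        if tag == "img":
            for name, value in attrs:
                if name == "src" and value:
                    self.sources.append(value)

def inline_image_urls(page_url, html_source):
    """Return absolute URLs for the inline images in html_source."""
    finder = InlineImageFinder()
    finder.feed(html_source)
    # relative SRC values are resolved against the page's own URL
    return [urljoin(page_url, src) for src in finder.sources]
```

For a page at http://www.example.org/docs/page.html containing an image with SRC="pics/logo.gif", the function returns http://www.example.org/docs/pics/logo.gif, which is the URL to paste into the "Open URL" window.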

5: Credits

Thomas Boutell boutell@netcom.com
Nathan Torkington Nathan.Torkington@vuw.ac.nz
Marc Andreessen marca@ncsa.uiuc.edu
Tony Johnson