HTTP-HYPERTEXT TRANSFER PROTOCOL
The standard web transfer protocol is HTTP (HyperText Transfer Protocol). Each interaction consists of one ASCII request, followed by one MIME-like response. HTTP is constantly evolving. Several versions are in use and others are under development. The HTTP protocol consists of two distinct items: the set of requests from browsers to servers and the set of responses going back the other way.
All the newer versions of HTTP support two kinds of requests: simple requests and full requests. A simple request is just a single GET line naming the page desired, without the protocol version. The response is just the raw page, with no headers, no MIME, and no encoding.
For example, the request
GET /hypertext/WWW/TheProject.html
is a simple request because it omits the HTTP/1.0 protocol version. The page will be returned with no indication of its content type.
Full requests are indicated by the presence of the protocol version on the GET request line. Requests may consist of multiple lines, followed by a blank line to indicate the end of the request. The first line of a full request contains the command (of which GET is but one of the possibilities), the page desired, and the protocol/version. Subsequent lines contain headers.
Although HTTP was designed for use in the Web, it has been intentionally made more general than necessary with an eye to future object-oriented applications. For this reason, the first word on the full request line is simply the name of the method (command) to be executed on the web page (or general object). The built-in methods are listed below. The names are case sensitive, so GET is a legal method but get is not.
Method | Description
GET | Request to read a web page
HEAD | Request to read a web page’s header
PUT | Request to store a web page
POST | Append to a named resource (e.g., a web page)
DELETE | Remove the web page
LINK | Connects two existing resources
UNLINK | Breaks an existing connection between two resources
Figure (a). The built-in HTTP request methods.
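The difference between simple and full requests, and the case-sensitive method names, can be illustrated with a small parser. This is only a sketch: the function name parse_request and the dictionary it returns are assumptions made for this example, not part of HTTP itself.

```python
# Method names from Figure (a); matching is case sensitive.
METHODS = {"GET", "HEAD", "PUT", "POST", "DELETE", "LINK", "UNLINK"}


def parse_request(text):
    """Split a request into method, page, version, and headers.

    A request line with no protocol version is treated as a simple request.
    """
    lines = text.split("\r\n")
    parts = lines[0].split()
    if len(parts) == 2:
        method, page = parts
        version = None  # simple request: no version, no headers
    elif len(parts) == 3:
        method, page, version = parts
    else:
        raise ValueError("malformed request line")
    if method not in METHODS:
        raise ValueError("unknown method: " + method)
    if version is None and method != "GET":
        raise ValueError("a simple request must use GET")

    headers = {}
    if version is not None:
        for line in lines[1:]:
            if line == "":
                break  # a blank line ends the request
            name, _, value = line.partition(":")
            headers[name.strip()] = value.strip()
    return {"method": method, "page": page, "version": version, "headers": headers}


simple = parse_request("GET /hypertext/WWW/TheProject.html")
full = parse_request(
    "GET /hypertext/WWW/TheProject.html HTTP/1.0\r\n"
    "If-Modified-Since: Sat, 29 Oct 1994 19:43:31 GMT\r\n"
    "\r\n"
)
```

Here simple carries no version and no headers, while full carries both. A request line starting with get instead of GET is rejected.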
The GET method requests the server to send the page (by which we mean object, the most general case), suitably encoded in MIME. However, if the GET request is followed by an If-Modified-Since header, the server only sends the data if it has been modified since the date supplied. Using this mechanism, a browser that is asked to display a cached page can conditionally ask for it from the server, giving the modification time associated with the page. If the cached page is still valid, the server just sends back a status line announcing that fact, thus eliminating the overhead of transferring the page again.
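The server-side decision for a conditional GET might look roughly like the following sketch. The function name conditional_get and its return convention are assumptions for illustration; only the date parsing and formatting come from Python's standard email.utils module.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def conditional_get(last_modified, if_modified_since=None):
    """Return (status_code, send_body) for a GET on a page.

    last_modified is the page's modification time (timezone-aware);
    if_modified_since is the raw header value, or None if absent.
    """
    if if_modified_since is not None:
        since = parsedate_to_datetime(if_modified_since)
        if last_modified <= since:
            return 304, False  # cached copy is still valid: status line only
    return 200, True  # send the page itself


page_time = datetime(1994, 10, 29, 19, 43, 31, tzinfo=timezone.utc)
# The value a browser would send back for a page cached at page_time.
cached_at = format_datetime(page_time, usegmt=True)
```

Calling conditional_get(page_time, cached_at) yields the 304 case, so the page is not transferred again. With an older date, or with no header at all, the full page is sent.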
The HEAD method just asks for the message header, without the actual page. This method can be used to get a page’s time of last modification, to collect information for indexing purposes, or just to test a URL for validity. Conditional HEAD requests do not exist.
The PUT method is the reverse of GET: instead of reading the page, it writes the page. This method makes it possible to build a collection of web pages on a remote server. The body of the request contains the page. It may be encoded using MIME, in which case the lines following the PUT might include Content-Type and authentication headers, the latter to prove that the caller indeed has permission to perform the requested operation.
Somewhat similar to PUT is the POST method. It too bears a URL, but instead of replacing the existing data, the new data is “appended” to it in some generalized sense. Posting a message to a newsgroup or adding a file to a bulletin board system are examples of appending in this context.
DELETE removes the page. As with PUT, authentication and permission play a major role here. There is no guarantee that DELETE succeeds, since even if the remote HTTP server is willing to delete the page, the underlying file may have a mode that forbids the HTTP server from modifying or removing it.
The LINK and UNLINK methods allow connections to be established between existing pages or other resources.
Every request gets a response consisting of a status line and possibly additional information (e.g., all or part of a web page). The status line can bear the code 200 (OK) or any one of a variety of other codes, for example 304 (not modified), 400 (bad request), or 403 (forbidden).
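As a rough illustration, a client could split a status line into its parts as follows. The layout "HTTP/1.0 304 Not Modified" (version, numeric code, reason phrase) is the HTTP/1.0 form and is not spelled out in the text above; the function name parse_status_line is an assumption for this example.

```python
def parse_status_line(line):
    """Split a status line such as 'HTTP/1.0 200 OK' into its three parts."""
    version, code, reason = line.split(" ", 2)
    return version, int(code), reason


version, code, reason = parse_status_line("HTTP/1.0 304 Not Modified")
```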
FILE TRANSFER PROTOCOL (FTP)
FTP is a common file transfer protocol. It is a protocol in the TCP/IP suite and is built on the client/server paradigm. A user, interacting with a local FTP program, connects to a remote site also running FTP. This can be done in a couple of ways. The first is simply to enter the command
ftp text-address
which will establish a connection to the specified remote computer. The second way is to enter
ftp
and wait for the prompt ftp>. Next the user enters
ftp> open text-address
to establish the connection. Sometimes connect is used instead of open. Once connected, the user is asked to enter a user identification followed by a password. On entering the appropriate identification and password the user then can peruse subdirectories, get directory lists, and get copies of files.
Many sites make files available to the general Internet community. This means a user can access them without having an account on that machine. When a user connects to the site, he or she usually enters “anonymous” for the account name and either “guest” or his or her email address as the password. The email address lets the site track who is accessing it. This application is often called anonymous FTP.
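An anonymous FTP session can also be scripted. The following sketch uses Python's standard ftplib module; the function name list_anonymous and its parameters are assumptions for illustration, and the host passed in must be a site that actually permits anonymous access.

```python
from ftplib import FTP


def list_anonymous(host, email, directory="/"):
    """Log in anonymously and return the names in one remote directory."""
    with FTP(host, timeout=30) as ftp:
        # "anonymous" as the account name, the email address as the password
        ftp.login(user="anonymous", passwd=email)
        ftp.cwd(directory)
        return ftp.nlst()
```

A call such as list_anonymous("ftp.example.org", "user@example.org") would return a directory listing, where both the host and the address are placeholders.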
FTP allows a user to establish a remote connection. It differs from Telnet in that Telnet provides a full login to the remote machine, whereas FTP primarily provides access to certain files and directories.
TELNET
One example of a network virtual terminal protocol is Telnet. It was designed for the ARPANET and is one of the protocols in the TCP/IP suite. Perhaps most people know Telnet as the application that allows remote logins. A virtual terminal is a data structure maintained by either the application software or a local terminal. Its contents represent the state of the terminal. For example, they may include the current cursor position, reverse video indicator, cursor shape, number of rows and columns, and color. Both the user and the application can reference this structure. The application writes to the virtual terminal without worrying about terminal-specific matters. Virtual terminal software does the required translation, and the data is displayed. When a user enters data, the process works in reverse. Virtual terminal protocols define the format of the data structure, software converts user input to a standard form, and the application then reads the standard screen.
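The idea of a virtual terminal as a shared data structure can be sketched as follows. The class VirtualTerminal, its field names, and its two methods are assumptions made for illustration; they do not reproduce the structure defined by Telnet or any other protocol, and the sketch neither scrolls nor wraps.

```python
from dataclasses import dataclass, field


@dataclass
class VirtualTerminal:
    rows: int = 24
    cols: int = 80
    cursor_row: int = 0
    cursor_col: int = 0
    reverse_video: bool = False
    color: str = "white"
    screen: list = field(default_factory=list)

    def __post_init__(self):
        # Start with a blank screen of rows x cols spaces.
        self.screen = [[" "] * self.cols for _ in range(self.rows)]

    def write(self, text):
        """Application side: write text at the cursor, terminal-independent."""
        for ch in text:
            if ch == "\n":
                self.cursor_row = min(self.cursor_row + 1, self.rows - 1)
                self.cursor_col = 0
                continue
            self.screen[self.cursor_row][self.cursor_col] = ch
            # The cursor stops at the last column instead of wrapping.
            self.cursor_col = min(self.cursor_col + 1, self.cols - 1)

    def line(self, row):
        """Terminal side: read one row for translation to a real display."""
        return "".join(self.screen[row]).rstrip()


vt = VirtualTerminal(rows=2, cols=10)
vt.write("login:\nok")
```

The application only calls write; software on the terminal side reads rows with line and translates them for the actual device.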
To the user, a remote login appears to be no different from a login to a local computer. A user works at a PC (or is connected to another computer) that runs protocols to connect to a network. The protocols establish a connection over the network to a remote computer. The user and remote computer exchange commands and data using protocols such as TCP/IP. The user is working at a higher layer, however, so this is all transparent and appears much like a local login. The only difference may be slight delays between responses, especially if the remote computer is far away or network traffic is heavy.
Telnet works in a client/server mode. That is, a PC (or other computer) runs Telnet (client) locally and transmits data between the user and network protocols. It also can format and send specific commands. The remote computer (server) also runs its version of Telnet. It performs similar functions, exchanging data between network protocols and the operating system and interpreting user-transmitted commands.
A user typically uses Telnet in a couple of ways. One is to log in to a local computer, wait for a system prompt (“>” in our example), and enter the command
> Telnet text-address
The text address specifies the host computer to which the user wants to connect. Telnet then calls on the transport protocol to negotiate and establish a connection with the remote site. Once connected, the user must log in to the remote site by specifying the account number and password. Another way to use Telnet is to enter the command Telnet without a text address. The local system will respond with a Telnet prompt (Telnet >). If you are running on a graphical user interface (GUI), there is typically a Telnet icon you can access. Either way you can enter Telnet commands (or select them from a menu). For example, you can connect to the remote site by entering a connect or open command (depending on the local system) specifying the text address.
Once connected, Telnet works in the background completely transparent to the user. However, the user can escape from the remote login to give subsequent commands to Telnet. This is normally done by entering a control sequence such as Ctrl-]. This returns the Telnet prompt to the user but does not break the remote connection.
CHAT AND BULLETIN BOARDS
Chat is synchronous (happening in real time, like a phone conversation, unlike an e-mail exchange), line-by-line communication with another user over a network. You can search the chat rooms for chats on subjects that interest you; if the room members are discussing the stated topic, you may meet some interesting people. Following are different types of chats:
- With featured chats you can check out the chat rooms.
- In the member chats, you’ll find the same sorts of categories as in the featured chats, but the chat rooms themselves are member created.
- The private chat rooms are private. They are also useful for conversation with more than two other people. You can create a private chat room and invite your friends in to have them in your buddy list. You can also get into private chat rooms by guessing at names.
- With channel chat you can connect to conference rooms, get into game rooms, and look for special someones.
You can protect your right to quality chat by double-clicking the name of a rude chatter and, in the information dialog box that comes up, checking the Ignore Member button. Once a chatter is ignored, his comments won’t show up on your scrolling chat screen. You can also set chat preferences, such as being notified when members arrive or leave. You can double-space incoming messages or alphabetize the member list. You can also enable chat room sounds.
Another way to go about chatting is to search the member directory for people who, say, share your enthusiasm for chess or live in the town where you grew up. You can type location-specific and name-specific search words to narrow down the search. The advanced search offers you the option of filling in everything about the person you seek.
You can chat in style by changing fonts, coloring letters, and using bold, italics, or underlines. You can also use shorthand in chatting. Little pictures can also be drawn with keyboard characters.
Once you enter a chat room, you can create your member profile depending on what kind of attention you want to attract.
Bulletin boards are data banks that allow the free exchange of some software, files, or other information. Electronic bulletin boards are a way to meet other computer users, voice opinions, receive technical help, and download shareware. Bulletin boards are often focused on a particular subject area. Boards are more intimate and personal and provide an easy way for people with similar interests to congregate and interact.
USENET
One of the most popular applications of computer networking is the worldwide system of newsgroups called net news. Often net news is referred to as USENET.
A newsgroup is a worldwide discussion forum on some specific topic. People interested in the subject can “subscribe” to the newsgroup. Subscribers can use a special kind of user agent, a news reader, to read all the articles (messages) posted to the newsgroup. People can also post articles to the newsgroup. Each article posted to a newsgroup is automatically delivered to all the subscribers, wherever they may be in the world. Delivery typically takes between a few seconds and a few hours, depending on how far off the beaten path the sender and receiver are. In effect, a newsgroup is somewhat like a mailing list, but internally it is implemented differently. It can be thought of as a kind of high-level multicast.
The number of newsgroups is so large that they are arranged in a hierarchy to make them manageable. The following figure shows the top levels of the ‘official’ hierarchies.
Name | Topics covered
comp | Computers, computer science, and the computer industry
sci | The physical sciences and engineering
humanities | Literature and the humanities
news | Discussion of the USENET itself
rec | Recreational activities, including sports and music
misc | Everything that does not fit in somewhere else
soc | Socializing and social issues
talk | Diatribes, polemics, debates and arguments galore
alt | Alternative tree covering virtually everything
Figure (b). USENET hierarchies.
Each of the categories listed is broken into subcategories, recursively. For example, rec.sport is about sports, rec.sport.basketball is about basketball, and rec.sport.basketball.women is about women’s basketball.
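The recursive naming scheme can be made concrete with a short sketch that lists every category a newsgroup belongs to, from the top level down. The function name ancestors is an assumption for this example.

```python
def ancestors(newsgroup):
    """Return the chain of categories from the top level down to the group."""
    parts = newsgroup.split(".")
    return [".".join(parts[: i + 1]) for i in range(len(parts))]


chain = ancestors("rec.sport.basketball.women")
# chain is ["rec", "rec.sport", "rec.sport.basketball", "rec.sport.basketball.women"]
```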
Numerous news readers exist. Like email readers, some are keyboard based; others are mouse based. In nearly all cases, when the news reader is started, it checks a file to see which newsgroups the user subscribes to. It then displays a one-line summary of each as-yet-unread article in the first newsgroup and waits for the user to select one or more for reading. The selected articles are then displayed one at a time. After being read, they can be discarded, saved, printed, and so on.
News readers also allow users to subscribe and unsubscribe to newsgroups. Changing a subscription simply means editing the local file listing which newsgroups the user is subscribed to.
News readers also handle posting. The user composes an article and then gives a command or clicks on an icon to send the article on its way. Within a day, it will reach almost everyone in the world subscribing to the newsgroup to which it was posted. It is possible to crosspost an article, that is, to send it to multiple newsgroups with a single command. It is also possible to restrict the geographic distribution of a posting.
With USENET thousands of people who do not know each other can have worldwide discussions on a vast variety of topics. It is possible for someone with a problem to post it to the net. The next day, he may have a number of solutions for it.
To filter out abusive postings, such as those exchanged in a flamewar, an individual user can install a killfile, which specifies that articles with a certain subject or from a certain person are to be discarded upon arrival, prior to being displayed. Most news readers also allow an individual discussion thread to be killed. This is useful when a discussion looks like it is starting to get into an infinite loop.
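A killfile amounts to a filter applied before articles are displayed. The sketch below assumes each article is a dictionary with "sender" and "subject" keys; that representation, the function name apply_killfile, and the sample addresses are all hypothetical.

```python
def apply_killfile(articles, killed_subjects=(), killed_senders=()):
    """Drop articles whose subject or sender appears in the killfile."""
    kept = []
    for article in articles:
        subject = article["subject"]
        # Follow-ups carry "Re: " before the original subject.
        while subject.startswith("Re: "):
            subject = subject[4:]
        if subject in killed_subjects or article["sender"] in killed_senders:
            continue  # discarded on arrival, never displayed
        kept.append(article)
    return kept


incoming = [
    {"sender": "a@example.org", "subject": "MINIX install question"},
    {"sender": "b@example.org", "subject": "Re: Editor wars"},
    {"sender": "troll@example.org", "subject": "Hello"},
]
visible = apply_killfile(
    incoming,
    killed_subjects={"Editor wars"},
    killed_senders={"troll@example.org"},
)
```

Stripping the "Re: " prefix before matching means that killing a subject also kills the follow-ups in that thread.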
If enough subscribers to a group get annoyed with newsgroup pollution, they can propose having the newsgroup be moderated. A moderated newsgroup is one in which only one person, the moderator, can post articles to the newsgroup. All postings to a moderated newsgroup are automatically sent to the moderator, who posts the good ones and discards the bad ones. Some topics have both a moderated newsgroup and an unmoderated one.
Since thousands of people subscribe to USENET for the first time every day, the same beginner's questions tend to be asked over and over. To reduce this traffic, many newsgroups have constructed a FAQ (Frequently Asked Questions) document that tries to answer all the questions that beginners have. Some of these are highly authoritative and run to over 100 pages. The maintainer typically posts them once or twice a month.
USENET is full of jargon such as BTW (By The Way), ROFL (Rolling On the Floor Laughing), and IMHO (In My Humble Opinion). Many people also use little ASCII symbols called smileys or emoticons.
Although most people use their real names in postings, some people wish to remain totally anonymous, especially when posting to controversial newsgroups or when posting personal ads to newsgroups dealing with finding partners. This desire has led to the creation of anonymous remailers, which are servers that accept email messages (including postings) and change the From:, Sender:, and Reply-To: fields to make them point to the remailer instead of the sender. Some of the remailers assign a number to each user and forward email addressed to these numbers, so people can send email replies to anonymous postings.
As more and more people subscribe to USENET, there is a constant demand for new and more specialized newsgroups. Consequently, a procedure has been established for creating new ones. Suppose that somebody likes English movies and wants to talk to other English movie fans. He posts a message to news.groups naming the proposed group, say rec.movies.english, and giving the names of English movies.
Some of the smaller newsgroups are implemented as mailing lists. To post an article to such a mailing list, one sends it to the mailing list address, which causes copies to be sent to each address on the mailing list.
USENET is not generally implemented using mailing lists. Instead, each site (campus, company, or Internet service provider) stores incoming news in a single directory, say, news, with subdirectories for comp, sci, etc. These, in turn, have subdirectories such as news/comp/os/minix. All incoming news is deposited in the appropriate directory. News readers just fetch the articles from there as they need them. This arrangement means that each site needs only one copy of each news article, no matter how many people subscribe to its newsgroup. After a few days, articles time out and are removed from the disk.
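The mapping from a newsgroup name to its directory is mechanical, as the following sketch suggests. The function name spool_directory is an assumption; posixpath is used so that the result uses forward slashes on any platform.

```python
import posixpath


def spool_directory(newsgroup, root="news"):
    """Map a newsgroup name to its directory under the site's news root."""
    return posixpath.join(root, *newsgroup.split("."))


path = spool_directory("comp.os.minix")
# path is "news/comp/os/minix"
```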
To get on USENET, a site must have a newsfeed from another site on USENET. One can think of the set of all sites that get net news as the nodes of a directed graph. The transmission lines connecting pairs of nodes form the arcs of the graph. This graph is USENET. Note that being on the Internet is neither necessary nor sufficient for being on USENET.
Periodically, each site that wants news can poll its newsfeed(s) asking if any new news has arrived since the previous contact. If so, that news is collected and stored in the appropriate subdirectory of news. In this manner, news diffuses around the network. It is equally possible for the newsfeed, rather than the receiver, to take the initiative and make contact when there is enough new news.
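Polling can be sketched with in-memory stand-ins for the newsfeeds. In this hypothetical model each feed is a dictionary from article ID to a (newsgroup, text) pair, and the site keeps only the groups it carries; the function name poll_newsfeeds and the data layout are assumptions for illustration, and no real network transfer is involved.

```python
def poll_newsfeeds(feeds, store, carried):
    """Pull unseen articles for carried groups from each newsfeed.

    feeds: list of dicts mapping article ID -> (newsgroup, text)
    store: dict mapping article ID -> (newsgroup, text), updated in place
    carried: set of newsgroup names this site accepts
    Returns the IDs that were newly stored.
    """
    added = []
    for feed in feeds:
        for article_id, (group, text) in feed.items():
            if group not in carried:
                continue  # this site does not carry the group
            if article_id in store:
                continue  # already received from another newsfeed
            store[article_id] = (group, text)
            added.append(article_id)
    return added


feed_a = {
    "<1@a.example>": ("comp.os.minix", "first article"),
    "<2@a.example>": ("rec.sport.basketball", "second article"),
}
feed_b = {
    "<1@a.example>": ("comp.os.minix", "first article"),
    "<3@b.example>": ("comp.os.minix", "third article"),
}
store = {}
new_ids = poll_newsfeeds([feed_a, feed_b], store, {"comp.os.minix"})
```

Article <1@a.example> is offered by both feeds but stored only once, and the basketball article is skipped because the site does not carry that group.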
Not every site gets all newsgroups. There are several reasons for this. First, the total newsfeed exceeds 500 MB per day and is growing rapidly. Storing it all would require a very large amount of disk space. Second, transmission time and cost are issues. Third, not every site is interested in every topic. Finally, some newsgroups are a bit too funky for the tastes of many system administrators, who then ban them, despite considerable local interest.
News articles have the same format as email messages, but with the addition of a few extra headers. This property makes them easy to transport and compatible with most of the existing email software.
A description of the news headers is:
The Path: header is the list of nodes the message traversed to get from the poster to the recipient. At each hop, the forwarding machine puts its name at the front of the list. This list gives a path back to the poster.
The Newsgroups: header tells which newsgroups the message belongs to. It may contain more than one newsgroup name. Any message crossposted to multiple newsgroups will contain all of their names.
Because multiple names are allowed here, the Followup-To: header is needed to tell people where to post comments and reactions, so that all of the subsequent discussion ends up in one newsgroup.
The Distribution: header tells how far to spread the posting. It may contain one or more state or country codes, the name of a specific site or network, or "world."
The Nntp-Posting-Host: header tells which machine actually posted the article, even if it was composed on a different machine.
The References: header indicates that this article is a response to an earlier article and gives the ID of that article. It is required on all follow-up articles and prohibited when starting a new discussion.
The Organization: header can be used to tell what company, university, or agency the poster is affiliated with. Articles that fill in this header often have a disclaimer at the end saying that if the article is goofy, it is not the organization's fault.
The Lines: header gives the length of the body.
The Subject: line ties discussion threads together. Many news readers have a command to allow the user to see the next article on the current subject, rather than the next article that came in. Also, killfiles and kill commands use this header to know what to reject.
The Summary: header is normally used to summarize the follow-up article. On follow-up articles, the Subject: header contains "Re: " followed by the original subject.
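Because news articles share the email message format, Python's standard email package can read these headers. The sketch below parses a made-up article and shows how a forwarding machine might put its name at the front of the Path: header. The host names and addresses are hypothetical, and the "!" separator between path entries is the conventional USENET form, which the text above does not spell out.

```python
from email import message_from_string

ARTICLE = (
    "Path: hub.example.net!origin.example.org\n"
    "From: poster@example.org\n"
    "Newsgroups: comp.os.minix,comp.os.misc\n"
    "Followup-To: comp.os.minix\n"
    "Subject: Re: Booting from floppy\n"
    "Lines: 1\n"
    "\n"
    "It works for me.\n"
)


def forward(article_text, my_name):
    """Return the article as relayed by my_name, with my_name first in Path."""
    msg = message_from_string(article_text)
    old_path = msg["Path"]
    msg.replace_header("Path", my_name + "!" + old_path)
    return msg


relayed = forward(ARTICLE, "news.example.com")
# A crossposted article lists all of its newsgroups, separated by commas.
groups = [g.strip() for g in relayed["Newsgroups"].split(",")]
```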
NNTP-NETWORK NEWS TRANSFER PROTOCOL
Now let us look at how articles diffuse around the network. The initial algorithm just flooded articles onto every line within USENET. While this worked for a while, eventually the volume of traffic made this scheme impractical, so something better had to be worked out.
Its replacement was a protocol called NNTP (Network News Transfer Protocol). NNTP is somewhat similar to SMTP, with a client issuing commands in ASCII and a server issuing responses as decimal numbers coded in ASCII. Most USENET machines now use NNTP.
NNTP was designed for two purposes. The first goal was to allow news articles to propagate from one machine to another over a reliable connection (e.g., TCP). The second goal was to allow users whose desktop computers cannot receive news to read news remotely.
Two general approaches are possible. In the first one, news pull, the client calls one of its newsfeeds and asks for new news. In the second one, news push, the newsfeed calls the client and announces that it has news. The NNTP commands support both of these approaches, as well as having people read news remotely.
To acquire recent articles, a client must first establish a TCP connection with port 119 on one of its newsfeeds. Behind this port is the NNTP daemon, which is either there all the time waiting for clients or is created as needed. After the connection has been established, the client and server communicate using a sequence of commands and responses. These commands and responses are used to ensure that the client gets all the articles it needs, but no duplicates, no matter how many newsfeeds it uses.
Command | Meaning
LIST | Give me a list of all newsgroups and articles you have
NEWGROUPS date time | Give me a list of newsgroups created after date/time
GROUP grp | Give me a list of all articles in grp
NEWNEWS grps date time | Give me a list of new articles in specified groups
ARTICLE id | Give me a specific article
POST | I have an article for you that was posted here
IHAVE id | I have article id. Do you want it?
QUIT | Terminate the session.
Figure (c)
The main commands used for moving articles between news daemons are listed in Figure (c).
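The news push exchange built on IHAVE can be sketched with a toy, in-memory daemon. Nothing here opens a network connection. The class ToyNewsDaemon and the function push are hypothetical, and the numeric responses (335, 235, 435, 205, 500) are taken from the original NNTP specification rather than from the text above.

```python
class ToyNewsDaemon:
    """In-memory stand-in for a news daemon that accepts pushed articles."""

    def __init__(self):
        self.articles = {}   # article ID -> text
        self.pending = None  # ID offered by IHAVE, awaiting the article text

    def command(self, line):
        verb, _, arg = line.partition(" ")
        if verb == "IHAVE":
            if arg in self.articles:
                return "435 article not wanted"
            self.pending = arg
            return "335 send article"
        if verb == "QUIT":
            return "205 closing connection"
        return "500 command not recognized"

    def receive(self, text):
        """Store the article text that follows a 335 response."""
        if self.pending is None:
            raise RuntimeError("no IHAVE outstanding")
        self.articles[self.pending] = text
        self.pending = None
        return "235 article transferred ok"


def push(daemon, article_id, text):
    """Offer one article; send it only if the daemon asks for it."""
    reply = daemon.command("IHAVE " + article_id)
    if reply.startswith("335"):
        reply = daemon.receive(text)
    return reply


daemon = ToyNewsDaemon()
first = push(daemon, "<1@a.example>", "hello")
second = push(daemon, "<1@a.example>", "hello")
```

The first offer is accepted and the article transferred; the second offer of the same ID is declined, which is how duplicates are avoided when a site has several newsfeeds.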