HTTP: A look behind the scenes of the Internet

If you think about it, it is almost magical. You type a string of characters into your browser's address bar, press Enter, and you see the weather report for the weekend or the latest football results. But have you ever wondered what happens in the background before you discover, to your dismay, that TSV Grünkraut has lost again? HTTP plays a central role here.

The HTTP protocol

As a normal user, you would hardly notice it if it did not appear at the beginning of so many URLs, but communication throughout the World Wide Web is based on HTTP. In this article we want to take a closer look at the so-called HTTP request-response cycle, i.e. everything that happens in the background between entering the URL and seeing the website in the browser.

HTTP messages: request and response

HTTP stands for “Hypertext Transfer Protocol”. Today it is mainly used to load websites from the WWW into a web browser. The “communication units” in HTTP are called messages, and there are two types of them. The request goes from the user (often called the client in this context) to the server. The response travels in the opposite direction, from the server back to the client. Let’s look at an example.

A user (let’s call him Herbert) wants to access a website, e.g. the Stones & Weeds online shop. Herbert would like to take a look at the newly added products in the “Stones” category. He therefore types the following URL into his browser: http://www.SteineundUnkraut.org/Neu/Steine.html The browser then sends an HTTP request to the server that hosts the online shop’s website. The server processes the request and sends back an HTTP response. But what exactly does an HTTP message look like?

HTTP messages: structure

A typical HTTP message consists of three parts: the start line, the message headers and the message body. The information contained in these three parts depends on whether the HTTP message is a request or a response.
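The three-part structure can be made concrete with a small Python sketch. It assumes a text-based HTTP/1.x message in which lines end with CRLF and an empty line separates the headers from the body; the function name split_http_message is made up for this illustration, and the sample message is the request that the following sections build up step by step.

```python
def split_http_message(raw: str):
    """Split a raw HTTP/1.x message into start line, headers and body."""
    # Start line and headers end at the first empty line.
    head, _, body = raw.partition("\r\n\r\n")
    start_line, *header_lines = head.split("\r\n")
    headers = {}
    for line in header_lines:
        # Each header is "Name: value"; split at the first colon only.
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return start_line, headers, body


raw_request = (
    "GET /Neu/Steine.html HTTP/1.0\r\n"
    "Host: www.SteineundUnkraut.org\r\n"
    "Accept-Language: de\r\n"
    "\r\n"
)
start_line, headers, body = split_http_message(raw_request)
print(start_line)  # GET /Neu/Steine.html HTTP/1.0
print(headers)     # {'Host': 'www.SteineundUnkraut.org', 'Accept-Language': 'de'}
print(repr(body))  # ''
```

A real server or browser does considerably more validation; the sketch only shows where the three parts begin and end.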

HTTP request

The start line of an HTTP request begins with the method, a command that tells the server what it should actually do. Two very well-known HTTP methods are GET (asks the server to send data to the client) and POST (submits data to the server for processing, for example the contents of a form that the server then stores in a database). In our example, the method would be GET because Herbert’s browser requests data (the website) from the server.
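To see the two methods side by side, here is a minimal sketch using Python's standard urllib.request module. Nothing is sent over the network; the objects are only prepared. The form field search=granite is a made-up payload for illustration.

```python
from urllib.request import Request

url = "http://www.SteineundUnkraut.org/Neu/Steine.html"

# Without a payload, urllib prepares a GET request.
get_request = Request(url)

# With a payload (a hypothetical form field), urllib switches to POST.
post_request = Request(url, data=b"search=granite")

print(get_request.get_method())   # GET
print(post_request.get_method())  # POST
```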

Next in the start line comes the URI, not to be confused with the URL. URIs identify a resource, while URLs locate it. But be careful: locating something also identifies it, just as you can identify a person by where they live. So every URL is also a URI, but there are URIs that are not URLs. Loosely speaking, SteineundUnkraut.org merely names the shop, whereas http://www.SteineundUnkraut.org is a URL: it not only identifies the website, its http prefix also tells us how to reach the resource (the website). In the start line itself, the browser usually sends only the path part of the URL, in our example /Neu/Steine.html.
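How Herbert's URL breaks down into its components can be checked with Python's standard urllib.parse module. This is only an illustration of the parts of a URL, not of how a browser works internally.

```python
from urllib.parse import urlsplit

parts = urlsplit("http://www.SteineundUnkraut.org/Neu/Steine.html")

print(parts.scheme)  # http  (how to reach the resource)
print(parts.netloc)  # www.SteineundUnkraut.org  (which server to contact)
print(parts.path)    # /Neu/Steine.html  (which resource on that server)
```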

At the end of the start line, the browser states which HTTP version it is using. The start line from our example could look like this:
GET /Neu/Steine.html HTTP/1.0
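Read from the server's side, the start line can be taken apart again at the spaces. The following Python sketch assumes the usual form with exactly three fields and the version written as HTTP/1.0; the function name parse_request_line is hypothetical.

```python
def parse_request_line(line: str):
    """Split a request start line into method, request target and version."""
    method, target, version = line.split(" ")
    if not version.startswith("HTTP/"):
        raise ValueError(f"unexpected version field: {version!r}")
    return method, target, version


method, target, version = parse_request_line("GET /Neu/Steine.html HTTP/1.0")
print(method)   # GET
print(target)   # /Neu/Steine.html
print(version)  # HTTP/1.0
```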

The request headers are essentially pairs consisting of a name (not case sensitive), a colon and a value. Headers convey additional rules and information, e.g. the host: the address of the server to which we send the request. In the headers you can also specify, for example, which language the client prefers for the response (Accept-Language). The request headers in our example could look like this:
Host: www.SteineundUnkraut.org
Accept-Language: de
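
HTTP headers share their name-colon-value format with e-mail headers, so Python's standard email parser can read them. The sketch below parses the two example headers and looks them up; note that the parser matches header names regardless of upper or lower case.

```python
from email.parser import Parser

header_block = (
    "Host: www.SteineundUnkraut.org\r\n"
    "Accept-Language: de\r\n"
    "\r\n"
)

# headersonly=True stops parsing after the header section.
headers = Parser().parsestr(header_block, headersonly=True)

print(headers["Host"])             # www.SteineundUnkraut.org
print(headers["accept-language"])  # de
```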

The request body would follow at this point. It carries the data that a method such as POST sends to the server; the GET request in our example does not need one.

HTTP response

The start line of the HTTP response contains neither a method nor a URI. Instead it holds the HTTP version, a status code and a short description of that code. The status code tells the client whether the request succeeded or failed. Status code 200 stands for “OK” and means that the request was processed successfully and the result is transmitted in the response. The code that probably every Internet user knows, 404, stands for “Not Found” and means that the requested resource could not be found. But there are many others.
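Python ships the standard status codes in its http module, which makes it easy to look up the text that belongs to a code. The helper is_success is a made-up name for this sketch; it relies on the convention that codes from 200 to 299 signal success.

```python
from http import HTTPStatus

for code in (200, 404):
    status = HTTPStatus(code)
    print(code, status.phrase)
# 200 OK
# 404 Not Found


def is_success(code: int) -> bool:
    """Codes in the 2xx range mean the request was processed successfully."""
    return 200 <= code < 300


print(is_success(200))  # True
print(is_success(404))  # False
```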

For example, status code 418 stands for “I’m a teapot” and indicates that the server refuses to make coffee because it is a teapot. This status code is part of the “Hyper Text Coffee Pot Control Protocol”, which was proposed as an extension to HTTP, but only as an April Fool’s joke. Nevertheless, humorous developers have implemented the code in well-known software projects (e.g. Google’s Go programming language).

The response headers have the same format as the request headers. However, they carry different information and rules, for example the time of sending (Date) or information about the web server used (Server). The response headers in our example could look like this:
Date: Tue, 27 Feb 2018 08:12:31 GMT
Server: Apache/1.3.27 (Unix) (Red-Hat/Linux)

The response body then contains the desired data, i.e. the website.
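
Put together, a complete response for Herbert could be read as in the following Python sketch. The status line HTTP/1.0 200 OK and the tiny HTML body are assumptions made for this illustration; the two headers are the ones from the example above.

```python
from email.utils import parsedate_to_datetime

raw_response = (
    "HTTP/1.0 200 OK\r\n"
    "Date: Tue, 27 Feb 2018 08:12:31 GMT\r\n"
    "Server: Apache/1.3.27 (Unix) (Red-Hat/Linux)\r\n"
    "\r\n"
    "<html><body>New stones</body></html>"
)

# An empty line separates start line and headers from the body.
head, _, body = raw_response.partition("\r\n\r\n")
status_line, *header_lines = head.split("\r\n")

# Status line: version, numeric code, then the descriptive text.
version, code, reason = status_line.split(" ", 2)
headers = dict(line.split(": ", 1) for line in header_lines)

# The Date header uses the same date format as e-mail.
sent_at = parsedate_to_datetime(headers["Date"])

print(version, code, reason)  # HTTP/1.0 200 OK
print(sent_at.isoformat())    # 2018-02-27T08:12:31+00:00
print(body)                   # <html><body>New stones</body></html>
```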

Which headers are required depends on the method, the HTTP version and so on; sometimes none are required at all. Beyond that, you can send as many of the official headers defined in the HTTP standard as you deem necessary. As a rule, loading a page involves not just one HTTP request and response but several. For example, each image on the website is fetched in a separate exchange.
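
Why one page leads to several requests becomes clearer when you look at the HTML the browser receives. The Python sketch below scans a made-up page for images and works out which additional URLs would have to be requested. The file name granit.jpg and the class name ImageCollector are invented for this example, and real browsers also fetch stylesheets, scripts and more.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImageCollector(HTMLParser):
    """Collect the absolute URLs of all <img> tags on a page."""

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.image_urls = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                # Resolve relative paths against the URL of the page.
                self.image_urls.append(urljoin(self.base_url, src))


page_url = "http://www.SteineundUnkraut.org/Neu/Steine.html"
html = '<html><body><h1>New stones</h1><img src="granit.jpg"></body></html>'

collector = ImageCollector(page_url)
collector.feed(html)
print(collector.image_urls)
# ['http://www.SteineundUnkraut.org/Neu/granit.jpg']
```

Each URL in that list would trigger its own request and response, following the same pattern as the exchange for the page itself.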

IT security through HTTP headers

HTTP headers also play a major role from a security perspective. For example, they are a simple way to actively protect visitors to a website.
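
As a rough idea of what such a check could look like, the following Python sketch compares a set of response headers against a short list of commonly used security headers. The list is an illustrative assumption, not a recommendation taken from this article, and the function name missing_security_headers is made up.

```python
# Illustrative selection only; not a list taken from this article.
EXPECTED_SECURITY_HEADERS = (
    "Strict-Transport-Security",
    "Content-Security-Policy",
    "X-Content-Type-Options",
    "X-Frame-Options",
)


def missing_security_headers(response_headers: dict) -> list:
    """Return the expected headers that the response does not contain."""
    # Compare in lower case so the spelling of names does not matter.
    present = {name.lower() for name in response_headers}
    return [h for h in EXPECTED_SECURITY_HEADERS if h.lower() not in present]


example_headers = {
    "Date": "Tue, 27 Feb 2018 08:12:31 GMT",
    "Server": "Apache/1.3.27 (Unix) (Red-Hat/Linux)",
    "X-Content-Type-Options": "nosniff",
}
print(missing_security_headers(example_headers))
# ['Strict-Transport-Security', 'Content-Security-Policy', 'X-Frame-Options']
```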

You can find out exactly how this works and which HTTP headers you should definitely set in one of our next blog posts: Data security: Do I need HTTPS?

If you would like to check now whether these important headers are set for you, register with Enginsight and test our platform free of charge for 14 days.
