A simple Python web server

Matt Luther

In this post, we'll build a simple example of a web server in Python using the socket module. This is a basic server that can be accessed with a web browser to see an HTML page. First, we'll show how a socket can be made to listen for requests from a browser and return some HTML. Then, we'll show how the server can inspect the request to decide what to send back.

In [1]:
import socket

Sockets are what we use to transfer data from one place to another across the internet or other computer networks. We can think of a socket as an address plus instructions for where and how we'd like data to be transferred. Python's socket module handles a lot of the details for us.

We're going to make a basic server. A server is supposed to accept requests for information and return that information. Sockets are "where" we will check for requests and send back information.

As an analogy, being a server is like running a physical "help center" where people can write in with letters requesting advice. To run this service, you need a mailing address. People writing in mail their requests in an envelope addressed to your service, and they include a return address so you can write back. You routinely check your incoming mail, see what has been requested, and send back your response.

The analogy matches with how our web server will need to establish a socket, check for incoming requests, and respond to the appropriate address.

In the next block of code, our program will establish its socket and start "listening".

In [2]:
listen_socket = socket.socket()
listen_socket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)

HOST, PORT = '', 8889
listen_socket.bind((HOST,PORT))

listen_socket.listen(1)

print("Server started on port %i" % PORT)
Server started on port 8889

The first two lines get us a socket object named listen_socket and set the SO_REUSEADDR option on it, which lets us bind to the same port again right after restarting the server. The bind call ties our socket to port 8889; the empty string for HOST means the socket accepts connections on any of this machine's network interfaces. That is, the object listen_socket is associated with connections that come in to port 8889 on our server. The listen call makes our socket start actually collecting requests. If we open a browser and go to http://localhost:8889, our browser will send an HTTP request to port 8889, and listen_socket will be able to access that request. Right now, it won't do anything with those requests. We will deal with that soon.

The URL above might look strange if you're used to just going to something like http://www.google.com. Above, we have to specify the port with :8889 because it is not the usual HTTP port 80 that our browser assumes. Also, localhost is a special word that stands for our own computer, as opposed to domain names like google.com, or IP addresses.
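To make the parts of that URL concrete, here is a small sketch that splits it with the standard library. It is written for Python 3 (the urllib.parse module), and the variable names local and google are just for illustration.

```python
from urllib.parse import urlsplit

# Split the URL from this post into its parts.
local = urlsplit("http://localhost:8889/test")
print(local.hostname)  # localhost
print(local.port)      # 8889
print(local.path)      # /test

# With no explicit port, urlsplit reports None. The browser then falls
# back to the default for the scheme, which is 80 for plain HTTP.
google = urlsplit("http://www.google.com")
print(google.port)     # None
```

The host name and port tell the browser where to connect; the path is what later shows up inside the request itself.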

As mentioned above, right now the server just allows requests, but it doesn't do anything with them or return anything.

If we make a request, for example by trying to open http://localhost:8889 in a browser, we won't get anything back. The browser just sends a request and waits. The request goes into a queue known by our socket listen_socket.
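The queueing can be seen without a browser. The following is a minimal sketch in Python 3, not part of the server itself: it uses a separate throwaway socket on a port chosen by the operating system, and a second plain socket plays the role of the browser. The names server, client, and connection are assumptions made for this example.

```python
import socket

# A throwaway listening socket; port 0 asks the OS for any free port.
server = socket.socket()
server.bind(("127.0.0.1", 0))
server.listen(1)
port = server.getsockname()[1]

# The client connects and sends before the server has called accept().
client = socket.create_connection(("127.0.0.1", port))
client.sendall(b"GET / HTTP/1.1\r\n\r\n")

# Only now does the server take the waiting connection out of the queue.
connection, address = server.accept()
data = connection.recv(1024)
print(data)

connection.close()
client.close()
server.close()
```

The connect and sendall calls succeed even though nothing has accepted the connection yet, which is the same waiting state the browser is left in.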

The next block of code will make our server look at requests, print out the request, and then respond. After running the code, we can go to the URL above to see the new behavior. We are putting this code in an infinite loop, so if you're following along, remember to interrupt it when you're done checking your browser.

In [3]:
while True:
    # Wait until a browser connects, then read up to 1024 bytes of its request.
    client_connection, client_address = listen_socket.accept()
    request = client_connection.recv(1024)

    print(request.decode())

    http_response = """\
HTTP/1.1 200 OK

<html>
<title>A test page</title>
<body>
<h1>Testing</h1>
<p>This is a test page.</p>
</body>
</html>
"""

    # Sockets send bytes, so encode the text before sending it.
    client_connection.sendall(http_response.encode())
    client_connection.close()
GET / HTTP/1.1
Host: localhost:8889
Connection: keep-alive
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Windows NT 6.2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.87 Safari/537.36
DNT: 1
Accept-Encoding: gzip, deflate, sdch
Accept-Language: en-US,en;q=0.8

The .accept call on listen_socket makes Python wait for the next connection available to listen_socket. If no new connections have been made at that point, Python just waits at this line until one arrives. When it finally accepts a connection, we get a new socket client_connection, which is already associated with an (IP, port) pair we called client_address. This new socket lets us read the request and send information back to whoever made it.

Our code above then prints out the request, sends back some HTML on the new client socket, and then closes the new client socket.

Notice that our response above does not depend on the request at all. It will always make our browser show a very minimal page saying that it is a test page.
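The response in the loop above is about as small as it can be: a status line, a blank line, and the HTML. Browsers tend to be forgiving about this, but a more complete HTTP response ends each header line with "\r\n" and tells the browser what kind of content is coming and how long it is. Here is a rough Python 3 sketch of that; build_response is a hypothetical helper introduced for this example, not something the server above uses.

```python
def build_response(body_html, status="200 OK"):
    """Return a complete HTTP/1.1 response as bytes (hypothetical helper)."""
    body = body_html.encode("utf-8")
    headers = [
        "HTTP/1.1 " + status,
        "Content-Type: text/html; charset=utf-8",
        "Content-Length: %d" % len(body),
        "Connection: close",
    ]
    # Header lines end with CRLF, and a blank line separates them from the body.
    head = "\r\n".join(headers) + "\r\n\r\n"
    return head.encode("ascii") + body


response = build_response("<html><body><h1>Testing</h1></body></html>")
print(response.decode("utf-8"))
```

The result could be passed straight to client_connection.sendall, since it is already bytes.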

In the next block of code, we will make the response report the path of the URL. That is, if we go to http://localhost:8889/test, then we want our browser to get a page that just says "/test" on it.

Look back at the request that we printed above. For right now, the most important part of the request is the first line, which says "GET / HTTP/1.1". If we go to http://localhost:8889/test, this part of the request will instead be "GET /test HTTP/1.1".

If we pick out that line of the request, we can use it to change the response based on the URL used to access the server. Our code below does this, returning a simple page that states the path we find in the request.

In [4]:
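while True:
    client_connection, client_address = listen_socket.accept()
    # recv returns bytes; decode them so we can split the text.
    request = client_connection.recv(1024).decode()

    # The request line looks like "GET /test HTTP/1.1"; the path is its second word.
    first_line = request.split("\r\n")[0]
    path = first_line.split(' ')[1]

    http_response = """\
HTTP/1.1 200 OK

<html>
<title>A test page</title>
<body>
<h1>Testing</h1>
<p>The server got a request with path %s</p>
</body>
</html>
""" % path

    # Sockets send bytes, so encode the text before sending it.
    client_connection.sendall(http_response.encode())
    client_connection.close()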

Now, our browser will get a different page depending on what URL we use. If we go to http://localhost:8889/test, we get a page that tells us the path is /test. If we go to http://localhost:8889/foo/bar, the page will tell us the path is /foo/bar.
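The parsing step assumes the first line always has the form "method path version". If a connection sends nothing, or sends something that isn't HTTP, indexing with [1] fails with an IndexError and the loop stops. One way to guard against that, sketched here for Python 3, is to move the parsing into a small function that returns None when it can't find a path. The name request_path is made up for this example.

```python
def request_path(request):
    """Return the path from a raw HTTP request, or None if it can't be found.

    request_path is a hypothetical helper, not part of the original server.
    """
    # recv() gives bytes on Python 3, so decode before splitting.
    text = request.decode("iso-8859-1")
    first_line = text.split("\r\n")[0]
    parts = first_line.split(" ")
    # A well-formed request line has three parts: method, path, version.
    if len(parts) != 3:
        return None
    return parts[1]


print(request_path(b"GET /foo/bar HTTP/1.1\r\nHost: localhost:8889\r\n\r\n"))  # /foo/bar
print(request_path(b""))  # None
```

The server loop could then check for None and close the connection, or send back an error page, instead of crashing.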

This provides some basic understanding of how a web server works. To summarize:

  1. The server opens a socket, associates it to a port, and listens for requests throughout the life of the server.
  2. When a browser tries to access the server at that port, the browser sends an HTTP request which includes part of the URL and some other information generated by the browser.
  3. The server processes this request to decide what to send back to the browser.
  4. The server sends some HTML back to the browser.
  5. The browser displays the HTML as a page.
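The five steps above can be sketched in one self-contained Python 3 script. This is an illustration under a few assumptions rather than the server from this post: it handles a single request and then exits, it lets the operating system pick the port, and a plain client socket stands in for the browser. The function name serve_one_request is made up for the example.

```python
import socket
import threading


def serve_one_request(listen_socket):
    """Accept one connection, read the request, and answer with the path."""
    connection, _address = listen_socket.accept()
    request = connection.recv(1024).decode("iso-8859-1")
    parts = request.split("\r\n")[0].split(" ")
    path = parts[1] if len(parts) == 3 else "(unknown)"  # step 3
    body = "<html><body><p>Path: %s</p></body></html>" % path
    response = "HTTP/1.1 200 OK\r\n\r\n" + body
    connection.sendall(response.encode("utf-8"))  # step 4
    connection.close()


# Step 1: open a socket, bind it to a port, and listen.
listen_socket = socket.socket()
listen_socket.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
listen_socket.listen(1)
port = listen_socket.getsockname()[1]

server_thread = threading.Thread(target=serve_one_request, args=(listen_socket,))
server_thread.start()

# Step 2: a plain socket stands in for the browser and sends the request.
client = socket.create_connection(("127.0.0.1", port))
client.sendall(b"GET /test HTTP/1.1\r\nHost: localhost\r\n\r\n")

# Step 5: read until the server closes the connection, then "display" it.
chunks = []
while True:
    chunk = client.recv(1024)
    if not chunk:
        break
    chunks.append(chunk)
reply = b"".join(chunks).decode("utf-8")
print(reply)

client.close()
server_thread.join()
listen_socket.close()
```

Running the server in a thread is only there so that both sides fit in one script; in the post, the browser and the notebook play those two roles.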

Of course, there's still a lot more going on. Some interesting questions at this point are:

  • What is HTTP? We used the acronym above, but what do we actually mean by "an HTTP request or response"? This has to do with the format and extra information in what gets sent around. The socket doesn't actually care what it gets or sends. What makes a server a "web server" is that it is expected to interact correctly with HTTP messages.
  • How does data actually get "sent" on a socket? We'd have to learn more about what Python, the operating system, and our hardware are doing.
  • What happens if we attempt multiple connections at once? Right now, the server only handles one request at a time. So, if there are multiple requests, the later requests will have to wait. This can be avoided by carefully using threading to handle the different connections.
  • How should we actually return web pages? We probably don't want to write all of the HTML for our web pages in one giant Python file. The straightforward way to deal with this is to put our HTML and Python in separate files, and then use the URL path to decide what files to run and return.
  • Should we just use a prebuilt server? Probably. There are lots of popular Python web servers and web frameworks that are already available. But this was fun and now we have a better idea of what's going on underneath it all.
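As a taste of the last two questions, here is a hedged sketch of the same "report the path" page built on Python 3's standard http.server module, using ThreadingHTTPServer (available in Python 3.7 and later) so that each connection gets its own thread. The class name PathEchoHandler is made up for this example, and the script fetches one page from itself and shuts down so that it runs to completion; a real server would keep serving instead.

```python
import threading
from html import escape
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class PathEchoHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that reports the requested path."""

    def do_GET(self):
        # Escape the path so characters like "<" show up as text in the page.
        page = "<html><body><p>Path: %s</p></body></html>" % escape(self.path)
        body = page.encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


# Port 0 lets the OS choose a free port for this demonstration.
server = ThreadingHTTPServer(("127.0.0.1", 0), PathEchoHandler)
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

# Fetch one page from the server, playing the part of the browser.
connection = HTTPConnection("127.0.0.1", port)
connection.request("GET", "/foo/bar")
page = connection.getresponse().read().decode("utf-8")
connection.close()
print(page)

server.shutdown()
server.server_close()
```

The library takes care of parsing the request line, formatting the headers, and juggling connections, which are the parts we wrote by hand above.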