Node's HTTP Module

CSCI-UA.0480-008

A Step Back

What's the internet? a global network of networks

What's the underlying protocol that computers on the internet use to communicate? TCP/IP

What's the world wide web? a bunch of documents connected by hyperlinks … that are retrievable by url

What protocol is the web based on? HTTP

The web is a service built on top of the internet. HTTP is a protocol built on top of TCP/IP (TCP/IP handles the connection, sending/routing/transmitting of data, etc. … while HTTP is the message).

A TCP/IP Server → Web Server

So… previously, we made a simple web server using nothing more than the net module. The TCP/IP part was taken care of by the module, but we had to build http on top of it. That meant:

  • parsing a request
  • manually writing a response back


Also, it had a lot of shortcomings:

  • trailing slashes, casing
  • html as strings!?
  • etc. (NOT FUN!)

The HTTP Module

Sooo… let's use another module that takes care of the http bits for us. The built-in HTTP module gives us:

From the docs, the http module is:

  • a low-level API for creating HTTP
    • servers
    • clients
  • it only parses a message into headers and a body
    • it does not parse the actual headers
    • or the actual body

Using the HTTP Module

How do we bring in a module in node?

Use the require function.

  • …so, how do we bring in the http module?
  • do we have to install it first?



// http is a core node module 
// it's compiled in to the node binary

const http = require('http');

Let's Look at What HTTP Can Do

How can we see what's in the node module?

Just try importing it in the interactive shell, and typing out http:


const http = require('http');
http
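Outside the interactive shell, a short script gives the same overview. This is only a sketch; exportedNames is a name chosen here, and the exact list printed depends on the Node version.

```javascript
const http = require('http');

// Collect the names of everything the http module exports.
const exportedNames = Object.keys(http);

// Print them one per line; the list differs between Node versions.
console.log(exportedNames.join('\n'));
```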

We'll be going over:

  • status codes constant
  • a Server object and its methods
  • the Request object and its methods
  • the Response object and its methods
  • (and a brief detour in time)

About Reading Node's Documentation

Reading Function Signatures

When reading node documentation, note that:

  • all arguments are shown within parentheses after the function name
  • arguments surrounded by brackets are optional



functionName(requiredArg1, [optionalArg2], [optionalArg3]);

// an example from the docs
response.writeHead(statusCode, [reasonPhrase], [headers])

So, Let's Take a Look at Some Details

It's pretty spartan:

  • http.STATUS_CODES
  • http.createServer

http.STATUS_CODES

http.STATUS_CODES is an object that contains:

  • all of the standard HTTP response status codes as properties
  • and their short descriptions as values

{ '100': 'Continue',
  '101': 'Switching Protocols',
  '102': 'Processing',
  '200': 'OK',
  '201': 'Created',
  .
  .
  '418': 'I\'m a teapot',
  .
  .
  '510': 'Not Extended',
  '511': 'Network Authentication Required' }
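As a rough illustration of how the object can be used, the sketch below looks up the description for a numeric code. describeStatus is a hypothetical helper (not part of the http module), and the 'Unknown' fallback is an assumption made for this example.

```javascript
const http = require('http');

// Build a status line such as "404 Not Found" from a numeric code.
// describeStatus is a hypothetical helper, not part of the http module.
function describeStatus(code) {
	const reason = http.STATUS_CODES[code] || 'Unknown';
	return code + ' ' + reason;
}

console.log(describeStatus(200));
console.log(describeStatus(404));
```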

I'm a teapot

createServer

createServer returns a new web server object:


http.createServer([requestListener])
  • note that it takes one optional parameter - a function that handles request events
  • the callback function takes two arguments, a request object and a response object

Server Object

The server object that results from calling http.createServer is simply an object that emits (generates) events, some of which include:

  • request - whenever a new request is received
  • connection - when a tcp connection is made
  • close - emitted when a server stops listening and closes


Additionally, some useful methods that it has are:

  • server.listen - accept connections at the given port number and hostname
  • server.close - stop the server from accepting new connections
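The events and methods above can be tried out with a sketch like the following. It attaches listeners by hand instead of passing a requestListener; startThenStop is a hypothetical helper, and nothing listens until it is called.

```javascript
const http = require('http');

// No requestListener is passed, so we attach the handler ourselves.
const server = http.createServer();

server.on('request', function(req, res) {
	res.end('hello');
});

server.on('connection', function() {
	console.log('a tcp connection was made');
});

server.on('close', function() {
	console.log('server has stopped listening');
});

// Hypothetical helper: start listening, then close right away.
function startThenStop(port) {
	server.listen(port, function() {
		server.close();
	});
}
```

Calling startThenStop(3000) should start the server and close it immediately, which triggers the close listener.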

Back to createServer

http.createServer binds a callback function to a request event. What are the two arguments that this callback function takes?

  • request - an instance of http.IncomingMessage
  • response - an instance of http.ServerResponse

http.IncomingMessage

http.IncomingMessage is an object that represents a client's HTTP request.

  • it's the first argument passed in to the request event's callback
  • some of its properties include:
    • httpVersion - the HTTP version sent by the client
    • url - the url that the client requested
    • headers - the request headers sent by the client (as an object with lowercase header names as properties)
    • method - the request method used by the client (GET, POST, PUT, DELETE, etc.)
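To make those properties concrete, here is a small sketch that builds a one-line summary from them. summarizeRequest and fakeRequest are hypothetical: the plain object only stands in for a real http.IncomingMessage.

```javascript
// summarizeRequest is a hypothetical helper; req is expected to look like
// an http.IncomingMessage (method, url, httpVersion, headers).
function summarizeRequest(req) {
	// header names are lowercased in req.headers
	const agent = req.headers['user-agent'] || 'unknown client';
	return req.method + ' ' + req.url + ' HTTP/' + req.httpVersion + ' (' + agent + ')';
}

// A plain object standing in for a real request, for illustration only.
const fakeRequest = {
	method: 'GET',
	url: '/about',
	httpVersion: '1.1',
	headers: {'user-agent': 'curl/8.0'}
};

console.log(summarizeRequest(fakeRequest));
```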

http.ServerResponse

http.ServerResponse is an object that represents the HTTP response that the server will return.

  • it's the second argument passed in to the request event's callback
  • it's created internally by the HTTP server (not by the user)
  • has two methods of sending headers - explicit and implicit

http.ServerResponse Continued

Some useful properties and methods that a ServerResponse object has are:

  • writeHead(statusCode, [reasonPhrase], [headers]) - explicitly send a response header (status code and headers) to the request
  • setHeader(name, value) - sets a single header value for response for implicit sending of headers
  • getHeader(name) - reads out a header that's been queued for implicit sending (if writeHead wasn't called) to the client
  • removeHeader(name) - removes a header that's been queued for implicit sending (if writeHead wasn't called) to the client
  • write(chunk, [encoding]) - sends a chunk of the response body (causes implicit headers to be sent if writeHead wasn't called)
  • end([data], [encoding]) - signals to the server that all of the response headers and body have been sent
  • statusCode - the status code of the response for implicit sending (if writeHead wasn't called)

response.writeHead()

writeHead(200, {'Content-Type':'text/plain'})

  • writeHead has one required argument, the 3-digit HTTP status code (as a Number)
  • last argument is headers (an object with property names as HTTP response header names)
  • there's an optional second argument for a human readable version of the status code
  • this method must only be called once on a message and it must be called before end() is called
  • if you call write() or response.end() before writeHead, the implicit headers will be determined and writeHead will be called with those headers
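The two ways of sending headers might look like this side by side. The handler names are made up for this sketch; both are meant to be passed to http.createServer.

```javascript
// Explicit: writeHead sends the status code and headers in one call.
function explicitHandler(req, res) {
	res.writeHead(200, {'Content-Type': 'text/plain'});
	res.end('hello');
}

// Implicit: queue the status code and headers;
// they are sent along with the first write() or end().
function implicitHandler(req, res) {
	res.statusCode = 200;
	res.setHeader('Content-Type', 'text/plain');
	res.end('hello');
}
```

Either one could be used as http.createServer(explicitHandler) or http.createServer(implicitHandler); the client should see the same response.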

response.end()

end('<!DOCTYPE html><html><body>hello</body></html>')

  • end takes an optional argument, the final piece of the body of the HTTP response that's being sent back to the client
  • signals to server that the message, the response, is complete
  • end() must be called on each response

writeHead() will (either explicitly or implicitly) be called before end

end() must be called on each response

A Web Server in Node (Revisited)

Let's try writing our own web server, with help from the http module!

  • bring in the http module
  • create a web server object and listen on port 3000… use the callback below
  • create a callback function
    • the function will send back a 200 OK, and the header Content-Type set to 'text/plain'
    • the body will just be 'hello world'

A Web Server in Node, Implemented


const http = require('http');
const port = 3000;

http.createServer(handleRequest).listen(port);
console.log('starting server on ' + port);


function handleRequest(req, res) {
	const responseStatusCode = 200;
	res.writeHead(responseStatusCode, {'Content-Type':'text/plain'});
	res.end('hello world');
}

Adding Some Features

Let's try adding some logging. On the server side, output the requested url to the console whenever a request is made.


// in handleRequest
console.log(req.url);

AND MOAR FEATURES

How about sending back some html?

First let's try actually sending back html in the body…


// in handleRequest
res.end('<!DOCTYPE html>Hi there!');

What happened? The content type needs to be changed to text/html

Response Headers?

Check out the rfc… here are a few that you'll commonly see:

  • cache-control - specifies directives that MUST be obeyed by all caching mechanisms (for example, don't cache this page if no-cache)
  • content-encoding - type of encoding used on data - primarily used to allow a document to be compressed
  • content-type - the MIME type of the response
    • standard identifier used on Internet to indicate the type of data that file contains
    • signals to client (browser, email client) how to display content
  • date - the date and time at which the message was originated
  • last-modified - date and time at which the origin server believes the resource was last modified
  • server - server's name

Some Sample Response Headers


cache-control:no-cache
content-encoding:gzip
content-type:text/html; charset=utf-8
date:Mon, 29 Sep 2014 01:15:00 GMT
server:cloudflare-nginx
status:200 OK
vary:Accept-Encoding
version:HTTP/1.1

Sending Back HTML

Which header(s) will fix our problem, again?


// change Content-Type header to text/html
res.writeHead(200, {'Content-Type':'text/html'});

Aaaand An Aside on Node

Node's primary programming paradigm is event driven programming. Event driven programming is a way of programming where:

  • rather than just the conventional top-to-bottom execution, the flow of the programming is determined by events
    • these events are usually some sort of I/O
    • …such as user input
    • or network events
  • there's generally a main loop (in node's case, that's the event loop)
    • the main loop triggers a callback function when an event is detected
    • what's the event that's being handled in our web server? a request event
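The register-then-trigger pattern can be seen in miniature with Node's events module. This is only a sketch: here the events are emitted by hand, whereas in the web server it is the http module that emits request when a client connects.

```javascript
const EventEmitter = require('events');

const emitter = new EventEmitter();
const log = [];

// Register a callback; nothing runs yet.
emitter.on('request', function(url) {
	log.push('handling ' + url);
});

log.push('listener registered');

// Emitting the event is what triggers the callback.
emitter.emit('request', '/home');
emitter.emit('request', '/about');

console.log(log);
```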

Additional URLs

Great. So, usually there's more than one page on a site, so let's figure out how to serve up additional URLs.

  • serve up different text based on a case insensitive URL
  • the urls and their corresponding response code and text/html body should be as follows:
    • / or /home → 200 OK, "homepage v2, now with routing!"
    • /about → 200 OK, "made with node"
    • any other page → 404 Not Found, "nothing to see here!"
  • on the server side, log both the url and the response code
  • test with /, /home, /about, /about/, /about?q=something, /blog

Additional URLs Implementation

In handleRequest:


  // let, not const: these two are assigned in the branches below
  let resCode, body;
  const headers = {'Content-Type':'text/html'},
    path = req.url.toLowerCase();
  if(path === '/about') {
    resCode = 200;
    body = 'made with node';    
  } else if (path === '/' || path === '/home') {
    resCode = 200;
    body = 'homepage v2, now with routing';    
  } else {
    resCode = 404;
    body = 'nothing to see here';
  }

  res.writeHead(resCode, headers);
  res.end(body);
  console.log(req.url, res.statusCode);
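Testing with /about/ and /about?q=something shows that exact string comparison misses those URLs. One possible way to handle them, sketched here with a hypothetical normalizePath helper, is to parse the URL and compare only a cleaned-up pathname. The base 'http://localhost' is a placeholder needed by the URL constructor.

```javascript
// normalizePath is a hypothetical helper: drop the query string,
// lowercase the path, and strip a trailing slash (except for '/').
function normalizePath(rawUrl) {
	// the base URL is a placeholder; only the pathname is used
	const pathname = new URL(rawUrl, 'http://localhost').pathname.toLowerCase();
	if (pathname.length > 1 && pathname.endsWith('/')) {
		return pathname.slice(0, -1);
	}
	return pathname;
}

console.log(normalizePath('/about/'));
console.log(normalizePath('/About?q=something'));
console.log(normalizePath('/'));
```

In handleRequest, path could then be set with normalizePath(req.url) instead of req.url.toLowerCase().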

Hm… Something Feels Wrong

Hardcoding html in our code, doesn't seem great. How about we just read some files, and serve them up?

The File System Module

The core node module, fs, allows general file I/O. It allows the reading and manipulation of files.

As usual, bring it in to your program by using require.


const fs = require('fs');

Let's Use the File System Module

In order to have our web server read static files, we'll use fs.readFile(). fs.readFile() asynchronously reads the entire contents of a file. You know what that means, right? callback time!


fs.readFile(filename, [options], callback)
  • the filename is the path to the file (absolute, or relative to the current working directory).
  • options is an object specifying details such as encoding {'encoding':'utf-8'}
  • the callback takes two parameters:
    • err - an error object (present if something goes wrong)
    • data - the contents of the file

Using fs.readFile

Let's try printing out the contents of a file…

  • create a file
  • bring in the fs module
  • use readFile by passing in the path to your file, an encoding (use utf-8), and your callback
  • your callback should check if there's an error
    • if there's an error, log a message
    • otherwise, log the contents of the file

Using fs.readFile Example


const fs = require('fs');

fs.readFile('./public/index.html', {'encoding':'utf-8'}, function(err, data) {
	if (err) {
		console.log('uh oh!');
	} else {
		console.log(data);
	}
});

Back to Serving Static Files

We'll serve 3 pages and an image:

  • /, home - read from public/index.html
  • about - read from public/about.html
  • 404 - read from public/404.html
  • magicman.png - an image … read from public/img/magicman.png

Static Files Implementation

Define a function called serveStatic… it should:

  • read a file and send it out as an http response
    • if there's an error, send out a 500
    • if it reads the file successfully, use the file's contents as the body of the response
  • it'll have 4 parameters
    • res - the response object
    • path - the path of the file to read
    • contentType - the file's content type
    • resCode - the response code that will be sent back
      • it will default to a 200

Let's Create Our Static Files

We'll have to create a couple of folders and files:


mkdir -p public/img

Sample body for index.html, home.html


		

homepage v3, now with static files!

And, of course, drop magicman.png into img.

serveStatic


function serveStatic(res, path, contentType, resCode = 200) {
	fs.readFile(path, function(err, data) {
		if (err) {
			res.writeHead(500, { 'Content-Type': 'text/plain' }); 
			res.end('500 - Internal Error');
		} else {
			res.writeHead(resCode, { 'Content-Type': contentType }); 
			res.end(data);
		}
	});
}

Using serveStatic

Let's modify our handleRequest so that it uses serveStatic.

Using serveStatic Continued


const http = require('http'),
	fs = require('fs');
const port = 3000;
http.createServer(handleRequest).listen(port);
console.log('Started server on port', port);

function handleRequest(req, res) {
	if (req.url === '/home' || req.url === '/') {
		serveStatic(res, './public/index.html', 'text/html', 200);
	} else if (req.url === '/about') {
		serveStatic(res, './public/about.html', 'text/html', 200);
	} else if (req.url === '/img/magicman.png') {
		serveStatic(res, './public/img/magicman.png', 'image/png', 200);
	} else {
		serveStatic(res, './public/404.html', 'text/html', 404);
	}
}
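As the number of pages grows, the if/else chain could be replaced by a lookup table. The sketch below is one hypothetical way to do that; routes, notFound and lookupRoute are names invented here, and the file paths mirror the ones used above.

```javascript
// A lookup table from URL to file and content type.
// routes, notFound and lookupRoute are hypothetical names.
const routes = {
	'/': {file: './public/index.html', type: 'text/html'},
	'/home': {file: './public/index.html', type: 'text/html'},
	'/about': {file: './public/about.html', type: 'text/html'},
	'/img/magicman.png': {file: './public/img/magicman.png', type: 'image/png'}
};
const notFound = {file: './public/404.html', type: 'text/html', code: 404};

// Return the file, content type and response code for a URL.
function lookupRoute(url) {
	// hasOwnProperty avoids matching inherited keys such as 'toString'
	if (Object.prototype.hasOwnProperty.call(routes, url)) {
		return {file: routes[url].file, type: routes[url].type, code: 200};
	}
	return notFound;
}
```

handleRequest could then look up const r = lookupRoute(req.url) and call serveStatic(res, r.file, r.type, r.code).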

BTW, Can We Move the Callback Out to a Function Declaration?

Callbacks

Let's try it.

In this case, not really (mainly because of the way we've structured our program):

  • it depends on the closure to get the context of the function that calls it
  • specifically, the callback needs access to res
  • maybe we could use bind… but it seems a little gross

fs.readFile(path, handleFileRead.bind(
	{res:res, contentType:contentType, resCode:resCode}));
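For that bind call to work, handleFileRead would have to read its context from this. The slides don't show such a function, so the version below is only a guess at what it might look like.

```javascript
// One hypothetical handleFileRead: bind() sets `this` to the context object,
// so res, contentType and resCode are read from `this`.
function handleFileRead(err, data) {
	if (err) {
		this.res.writeHead(500, {'Content-Type': 'text/plain'});
		this.res.end('500 - Internal Error');
	} else {
		this.res.writeHead(this.resCode, {'Content-Type': this.contentType});
		this.res.end(data);
	}
}
```

Note that this only works with a regular function; an arrow function ignores the this supplied by bind.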

We'll see a better way to do this later…

Well. That Was Fun.

Great. We just implemented a terrible static file web server (we already have Apache, Nginx, etc. to handle that).

What was difficult to deal with… and what were some shortcomings?

  • urls (we didn't use regexes, and we didn't handle trailing slashes or query strings)
  • mapping to specific static files
  • rewrites / aliases
  • just to name a few…
  • seems like not so great for static sites… but for dynamic?
  • that's where express comes in
  • but before that, I mentioned Apache and Nginx…