Node.js

Behrang Noruzi Niya
September 7, 2011

Node.js

A set of bindings to the V8 JavaScript VM.
Allows one to script programs that do I/O in JavaScript.
Focused on performance.
Like a general purpose programming language.
Used on server, client, shell scripting, ...
http://nodejs.org/

Google V8

Open source JavaScript engine, developed by Google, shipped with Google Chrome browser.

Compiles JavaScript to native machine code instead of executing bytecode or interpreting it.

Employs optimization techniques for better performance.

I/O needs to be done differently

What is the software doing while it queries the database?

			    result = db.query("select * from T");
			    // use result

In many cases, just waiting for the response.

Modern Computer Latency

L1: 3 cycles
L2: 14 cycles
RAM: 250 cycles
-----------------------
DISK: 41,000,000 cycles
NETWORK: 240,000,000 cycles

http://duartes.org/gustavo/blog/post/what-your-computer-does-while-you-wait
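
To put the cycle counts above in human terms, the following sketch converts them to wall-clock time. The clock rate of 3 GHz (3 cycles per nanosecond) is an assumption made for illustration; the table itself gives only cycles.

```javascript
// Convert the cycle counts from the latency table into time.
// ASSUMPTION: a 3 GHz clock, i.e. 3 cycles per nanosecond.
var CYCLES_PER_NS = 3;

var latencyCycles = {
    L1: 3,
    L2: 14,
    RAM: 250,
    DISK: 41000000,
    NETWORK: 240000000
};

// Returns the latency in nanoseconds for a given number of cycles.
function cyclesToNanoseconds(cycles) {
    return cycles / CYCLES_PER_NS;
}

Object.keys(latencyCycles).forEach(function (name) {
    var ns = cyclesToNanoseconds(latencyCycles[name]);
    console.log(name + ': ' + ns.toFixed(0) + ' ns');
});
```

Under that assumption an L1 hit costs about 1 ns, while a disk access costs roughly 13.7 ms and a network round trip 80 ms.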

Non-blocking and Blocking

"Non-blocking"
L1, L2, RAM
-----------------------
"Blocking"
DISK, NETWORK

Multitask

Better software can multitask.

Other threads of execution can run while waiting.

Is that the best that can be done?

Apache vs. NGINX - Requests per Second

Comparing Apache and NGINX by requests per second

http://blog.webfaction.com/a-little-holiday-present

Apache vs. NGINX - Memory Usage

Comparing Apache and NGINX by memory usage

http://blog.webfaction.com/a-little-holiday-present

Apache vs. NGINX - The Difference?

Apache uses one thread per connection.

NGINX doesn't use threads. It uses an event loop.

Event Loop

A loop that waits for events (I/O availability, timeout, UI interaction, ...).

When an event is available, it will be processed by executing a callback.
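
A toy version of such a loop can be sketched in plain JavaScript. This is only an illustration of the idea, not how NGINX or Node.js implement it; the names handlers, queue, on, post and run are all hypothetical.

```javascript
// A toy event loop: events wait in a queue, and each one is processed
// by running the callback registered for its type.
var handlers = {};   // event type -> callback
var queue = [];      // pending events, oldest first

function on(type, callback) {
    handlers[type] = callback;
}

function post(type, payload) {
    queue.push({ type: type, payload: payload });
}

// Process events until the queue is empty. A real loop would instead
// sleep in the kernel until the next I/O event or timeout arrives.
function run() {
    while (queue.length > 0) {
        var event = queue.shift();
        var callback = handlers[event.type];
        if (callback) {
            callback(event.payload);
        }
    }
}

var log = [];
on('data', function (chunk) { log.push('got ' + chunk); });
on('timeout', function () { log.push('timed out'); });

post('data', 'abc');
post('timeout');
post('data', 'def');
run();

console.log(log);
```

Only one callback runs at a time, so the callbacks never need locks to protect the log array.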

Multithread Problems

Context switching is not free
Execution stacks take up memory

For massive concurrency, cannot use an OS thread for each connection.

Threads Are Evil

In systems programming, threads are a necessary evil.

In application programming, threads are just evil.

Threads provide a deceptively simple model of concurrency.

Threads are subject to races and deadlocks.

Mutual Exclusion

Semaphore, Monitor, Rendezvous, Synchronization

This used to be operating system stuff.

It has leaked into applications because of networking and the multi-core problem.

Blocking Code

Code like this:

			    result = query("select...");
			    // use result

either blocks the entire OS thread or implies multiple execution stacks.

Non-blocking Code

But code like this:

			    query("select...", function (result) {
			        // use result
			    });

allows the program to return to the event loop immediately.
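
What "returning immediately" means can be seen by recording the order in which things happen. In this sketch, query is a hypothetical stand-in that simulates database latency with a timer; it is not a real database driver, and the result rows are made up.

```javascript
// Hypothetical stand-in for a database call: it simulates latency with a
// timer and hands the result to a callback instead of returning it.
var order = [];

function query(sql, callback) {
    setTimeout(function () {
        callback(['row 1', 'row 2']);   // made-up result rows
    }, 50);
}

order.push('before query');
query('select * from T', function (result) {
    order.push('callback with ' + result.length + ' rows');
    console.log(order);
});
// query() has already returned here; the callback has not run yet.
order.push('after query');
```

The line after the call runs before the callback does, which is exactly the window in which the event loop can serve other work.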

This is how I/O should be done

			    query("select...", function (result) {
			        // use result
			    });

So why isn't everyone using event loops, callbacks, and non-blocking I/O?

For reasons both cultural and infrastructural.

Cultural Bias

We're taught I/O with this:

			    puts("Enter your name: ");
			    name = gets();
			    puts("Name: " + name);

We're taught to demand input and do nothing until we have it.

Cultural Bias

Code like

			    puts("Enter your name: ");
			    gets(function (name) {
			        puts("Name: " + name);
			    });

is rejected as too complicated.

Missing Infrastructure

So why isn't everyone using event loops?

Single threaded event loops require I/O to be non-blocking.

Most libraries are not.
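
The effect of one blocking call on a single-threaded loop can be demonstrated with a timer. In this sketch, blockingCall is a hypothetical stand-in for a blocking library function; it simply spins for the given time.

```javascript
var started = Date.now();
var fired = false;

// Ask for a callback in 10 ms.
setTimeout(function () {
    fired = true;
    console.log('timer ran after ' + (Date.now() - started) + ' ms');
}, 10);

// Hypothetical stand-in for a blocking library call: spin for `ms`.
function blockingCall(ms) {
    var end = Date.now() + ms;
    while (Date.now() < end) { /* busy-wait */ }
}

blockingCall(100);
// The timer is overdue by now, but its callback could not run while
// the single thread was stuck inside blockingCall().
console.log('timer fired during the blocking call? ' + fired);
```

While the thread is blocked, every other connection served by the same loop waits too, which is why a single blocking library is enough to undermine the model.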

Missing Infrastructure

POSIX async file I/O not available
Man pages don't state whether a function will access the disk
No closures or anonymous functions in C; makes callbacks difficult
Database libraries (e.g. libmysql_client) do not provide support for asynchronous queries
Asynchronous DNS resolution not standard on most systems

Too Much Infrastructure

EventMachine, Twisted, AnyEvent provide very good event platforms.

Easy to create efficient servers.

But users are confused about how to combine them with other available libraries.

Users still require expert knowledge of event loops, non-blocking I/O.

JavaScript To The Rescue

JavaScript was designed specifically to be used with an event loop:

Anonymous functions, closures
Only one callback at a time
I/O through DOM event callbacks

The culture of JavaScript is already geared towards evented programming.

Node.js Project

To provide a purely evented, non-blocking infrastructure to script highly concurrent programs.

Created by Ryan Dahl
Sponsored by Joyent

GitHub's Popular Watched Repositories

Node.js is second in GitHub's Popular Watched Repositories

September 3, 2011 https://github.com/popular/watched

Examples

Hello World

			    console.log('hello world');

Hello World, Advanced!

			    setTimeout(function() {
			        console.log('world');
			    }, 2000);

			    console.log('hello');

A program which prints "hello", waits 2 seconds, outputs "world", and then exits.

Node exits automatically when there is nothing else to do.

Hello Loop

Change the "hello world" program to loop forever, but print an exit message when the user kills it.

The special process object and the SIGINT signal should be used.

Hello Loop

                setInterval(function() {
                    console.log('hello');
                }, 500);

                process.addListener('SIGINT', function() {
                    console.log('goodbye');
                    process.exit(0);
                });

The process object emits an event when it receives a signal. Like in the DOM, you need only to add a listener to catch them.
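
The process object is an EventEmitter, so the same listener pattern applies to any emitter. A minimal sketch using the built-in events module follows; the door emitter and its 'knock' event are hypothetical names chosen for illustration.

```javascript
var EventEmitter = require('events').EventEmitter;

// A hypothetical emitter standing in for any event source.
var door = new EventEmitter();
var heard = [];

door.on('knock', function (who) {
    heard.push('first listener: ' + who);
});
door.on('knock', function (who) {
    heard.push('second listener: ' + who);
});

// emit() calls the listeners synchronously, in registration order.
door.emit('knock', 'Alice');

console.log(heard);
```

Here on is simply another name for addListener, the method used in the slide above.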

DNS Resolver

Write a program which resolves google.com and prints its IP addresses.

The built-in dns module is for working with DNS.

Resolving a name takes some time, but many APIs let you think it happens instantly.

We can allow the program to continue while it waits for the response.

DNS Resolver

                var dns = require('dns');

                console.log('resolving google.com...');

                dns.resolve('google.com', function (err, addresses) {
                    if (err) {
                        throw err;
                    }
                    console.log('found: ', addresses);
                });

DNS Resolver

It runs very quickly, but millions of clock cycles have passed.

We can't perceive a delay of a few milliseconds.

In a server environment handling many clients, waiting millions of clock cycles is wasteful.
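
The gain from not waiting can be sketched without touching the network. Here fakeResolve is a hypothetical helper that simulates a 100 ms lookup with a timer, and the host names are placeholders; no real DNS traffic occurs.

```javascript
// Simulated lookups: each one "waits" 100 ms on a timer.
var pending = 0;
var started = Date.now();

function fakeResolve(host, callback) {
    pending++;
    setTimeout(function () {
        pending--;
        callback(null, host + ' resolved');
    }, 100);
}

['a.example', 'b.example', 'c.example'].forEach(function (host) {
    fakeResolve(host, function (err, message) {
        console.log(message + ' after ' + (Date.now() - started) + ' ms');
    });
});

// All three waits are now in flight at once, so the total is roughly
// 100 ms rather than 300 ms.
console.log('lookups in flight: ' + pending);
```

A blocking version would pay for the three waits one after another; the evented version overlaps them on a single thread.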

A Simple HTTP Server

                var http = require('http');

                var s = http.createServer(function (req, res) {
                    res.writeHead(200);
                    res.end('hello world\n');
                });

                // listen on port 8000
                s.listen(8000);

Streaming HTTP Server

                var http = require('http');

                http.createServer(function (req, res) {
                    res.writeHead(200);

                    setTimeout(function () {
                        res.end('world\n');
                    }, 2000);

                    res.write('hello\n');
                }).listen(8000);

Web Based DNS Resolver

Requesting http://localhost:8000/yahoo.com should return:

                query: yahoo.com
                ["98.137.149.56","209.191.122.70","67.195.160.76"]

Requesting http://localhost:8000/google.com should return:

                query: google.com
                ["74.125.39.99","74.125.39.103","74.125.39.104"]

Web Based DNS Resolver

                var http = require('http');
                var dns = require('dns');

                http.createServer(function (req, res) {
                    var query = req.url.replace('/', '');
                    res.write('query: ' + query + '\n');

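                    dns.resolve(query, function (err, addresses) {
                        if (err) {
                            // report the failure instead of sending "undefined"
                            res.end('error: ' + err.code + '\n');
                            return;
                        }
                        res.end(JSON.stringify(addresses) + '\n');
                    });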

                }).listen(8000);

Web Based DNS Resolver

DNS resolution is async
HTTP server "streams" the response

The overhead of each connection is low, so the server is able to achieve good concurrency: it juggles many connections at a time.

TCP Server

Write a program which:

Starts a TCP server on port 8000
Sends the peer a message
Closes the connection

TCP Server

                var net = require('net');

                net.createServer(function (socket) {
                    socket.end('Goodbye\n');
                }).listen(8000);

TCP Chat Server

                var net = require('net');

                var people = [];

                net.createServer(function (socket) {
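                    people.push(socket);
                    socket.on('data', function (data) {
                        // relay the message to everyone except the sender
                        people.forEach(function (person) {
                            if (person !== socket && person.writable)
                                person.write(data);
                        });
                    });
                    // forget the socket once its connection is gone
                    socket.on('close', function () {
                        people.splice(people.indexOf(socket), 1);
                    });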
                }).listen(8000);

TCP Chat Server With HTTP View

                var http = require('http');
			    // ... TCP chat server code omitted
			    // ... it is listening on 8000
                http.createServer(function (req, res) {
                    res.writeHead(200);
                    people.push(res);
                }).listen(8001);

A single process listens on both port 8000 and port 8001. This works because Node.js uses non-blocking I/O.

Node.js Package Manager

npm is a package manager for node. You can use it to install and publish your node programs. It manages dependencies and does other cool stuff.

curl http://npmjs.org/install.sh | sh

http://npmjs.org/

NPM Usage

npm install -g connect

Connect

Node.js is very powerful, but handling everything a web application needs, such as cookies, sessions, and logging, is hard.
Connect is an extensible HTTP server framework for node, providing high performance "plugins" known as middleware.
Connect is bundled with over 14 commonly used middleware, including a logger, session support, cookie parser, and more.

http://senchalabs.github.com/connect/
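
The idea behind middleware can be sketched in a few lines of plain JavaScript. This illustrates the pattern only and is not Connect's implementation; chain, logger and hello are hypothetical names, and req and res are plain objects rather than real HTTP objects.

```javascript
// Build one handler out of a list of middleware functions. Each middleware
// receives (req, res, next) and calls next() to pass control onward.
function chain(middleware) {
    return function (req, res) {
        var index = 0;
        function next() {
            var layer = middleware[index++];
            if (layer) {
                layer(req, res, next);
            }
        }
        next();
    };
}

// Records each request, then passes it on.
function logger(req, res, next) {
    res.log.push('request for ' + req.url);
    next();
}

// Ends the chain by producing a body; it never calls next().
function hello(req, res, next) {
    res.body = 'hello world\n';
}

var handle = chain([logger, hello]);
var res = { log: [], body: null };
handle({ url: '/' }, res);

console.log(res);
```

Because every layer has the same shape, features such as logging or cookie parsing can be added or removed by editing the list.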

Connect Sample

                var connect = require('connect');

                connect.createServer(
                    connect.static(__dirname + '/public'),
                    connect.favicon(),
                    connect.logger(),
                    connect.errorHandler(),
                    connect.bodyParser(),
                    connect.cookieParser(),
                    connect.session({'secret': 'password'})
                ).listen(8000);

Express

High performance, high class web development for Node.js
It's like Sinatra from the Ruby world

                var express = require('express');
                var app = express.createServer();
                app.get('/', function(req, res){
                    res.send('Hello World');
                });
                app.listen(3000);

http://expressjs.com/

Express Features

Robust routing
Redirection helpers
Dynamic view helpers
Content negotiation
View rendering and partials support
Session based flash notifications
Built on Connect

Parallel Programming

Node.js And Multicore CPUs

Node.js is inherently single-threaded (as, arguably, is JavaScript itself).

Single threaded means that a Node process can only use one core.

Node is good at concurrency, but how can it run code in parallel?

Running Code in Parallel

Use processes
Pass messages
Allow the OS scheduler to run processes

Node's focus on asynchronous networking makes IPC easy, and thus parallel programming easy.

Web Workers

An HTML5 spec adapted for node.js
Each worker is a real OS process
Communicate with each other through sockets
Each process is able to send and receive messages (actor-style)
Useful to take heavy calculations out of a server

Web Workers Example

                var w = new Worker('fib.js');

                w.postMessage({ calculate: 10 });

                w.onmessage = function(m) {
                    console.log('result: %j', m);
                    w.terminate();
                };
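
The slides do not show fib.js itself. A sketch of what such a worker script might contain follows. The reply shape { result: ... } is an assumption, as is the wiring at the bottom, which presumes the HTML5-style onmessage and postMessage globals with the payload arriving as event.data.

```javascript
// fib.js (sketch): the computation a worker could run off the main process.
function fib(n) {
    var a = 0, b = 1;
    for (var i = 0; i < n; i++) {
        var t = a + b;
        a = b;
        b = t;
    }
    return a;
}

// Turn a request like { calculate: 10 } into a reply object.
// ASSUMPTION: the reply shape { result: ... } is made up for illustration.
function handleRequest(request) {
    return { result: fib(request.calculate) };
}

// ASSUMPTION: inside a worker, the HTML5-style globals onmessage and
// postMessage exist. Outside a worker this branch is skipped.
if (typeof postMessage === 'function') {
    onmessage = function (event) {
        postMessage(handleRequest(event.data));
    };
}

console.log(handleRequest({ calculate: 10 }));
```

Keeping the computation in a plain function makes it possible to try it directly, without a worker, before moving it into a separate process.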

How to accept connections on multiple cores?

Spawn a copy of yourself N times
Send the server file descriptor to your clones
Accept connections in each process

This effectively allows the kernel to load-balance connections across processes.

Parent Process

                var http = require('http');
                var net = require('net');
                // Create a web server
                var web = http.Server(function (req, res) {
                    res.writeHead(200);
                    res.end('hello world\n');
                });
                web.listen(8000);
                // File Descriptor server
                net.Server(function (c) {
                    c.write('blah', 'ascii', web.fd);
                    c.end();
                }).listen('/tmp/node_server.sock');

Child Process

                var http = require('http');
                var net = require('net');

                var web = http.Server(function (req, res) {
                    res.writeHead(200);
                    res.end('hello from ' + process.pid + '\n');
                });

                var c = net.createConnection('/tmp/node_server.sock');
                c.on('fd', function (fd) {
                    web.listenFD(fd);
                });

Cluster

Extensible multi-core server management for nodejs.

http://learnboost.github.com/cluster/

References

Thank You

Questions?
