I spent this summer working on a web platform running on Node.js. This was the first time I worked full-time with Node.js, and one thing that became quite apparent after a few weeks was that many developers, including myself at the time, lack clarity on exactly how the asynchronous features of Node.js work and how they are implemented at a lower level. Since I believe the only way to use a platform efficiently is to have a clear understanding of how it works, I decided to dig deeper. This curiosity also made me start playing around with implementing similar asynchronous features in other languages, in particular Python, it being my go-to language for experimenting and learning. This led me to Python 3.4's asynchronous IO library, asyncio, which intersected with my already existing interest in coroutines (see my post on combinatorial generation using coroutines in Python). This post is about exploring the questions and answers that came up while I was learning more about this subject, which I hope can help clarify and answer some questions for others as well.
All the Python code is intended for Python 3.4. This is mostly because Python 3.4 introduces the selectors module as well as asyncio. For earlier versions of Python, libraries such as Twisted, gevent, and tornado provide similar functionality.
In the early examples below, I chose to almost entirely ignore error handling and exceptions. This was done mostly for the sake of simplicity; proper exception handling should be a very important aspect of the kind of code we see below. I will provide a few examples of how Python 3.4's asyncio module handles exceptions at the end.
Let's start by writing a program to solve a very simple problem. We will use this problem and minor variations of it for the rest of the section to demonstrate the ideas.
Write a program to print "Hello world!" every three seconds, and at the same time wait for input from the user. Each line of user input will contain a single positive number n. As soon as input is entered, calculate and output the Fibonacci number F(n) and continue to wait for more input.
Note that there's a chance the periodic "Hello world!" is inserted in the middle of user input, but we do not care about that.
Those of you familiar with Node.js and JavaScript might already have a solution in mind that will likely look something like this:
    var log_execution_time = require('./utils').log_execution_time;

    var fib = function fib(n) {
        if (n < 2) return n;
        return fib(n - 1) + fib(n - 2);
    };

    var timed_fib = log_execution_time(fib);

    var sayHello = function sayHello() {
        console.log(Math.floor((new Date()).getTime() / 1000) + " - Hello world!");
    };

    var handleInput = function handleInput(data) {
        // Declare n locally instead of leaking an implicit global
        var n = parseInt(data.toString(), 10);
        console.log('fib(' + n + ') = ' + timed_fib(n));
    };

    process.stdin.on('data', handleInput);
    setInterval(sayHello, 3000);
As you can see, this is quite easy to do in Node.js. All we have to do is set an interval timer to print "Hello world!" and attach an event handler to the data event of process.stdin, and we are done. Simple to understand on an abstract level, and very easy to use. It just works! But how? To answer this, let's try to do the exact same thing in Python.
Also notice that we use a log_execution_time decorator to output the time it takes to calculate the Fibonacci number. Here's the definition of this decorator in Python:
    from functools import wraps
    from time import time


    def log_execution_time(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            start = time()
            return_value = func(*args, **kwargs)
            message = "Executing {} took {:.03} seconds.".format(func.__name__,
                                                                 time() - start)
            print(message)
            return return_value
        return wrapper
And similarly, in JavaScript:
    // We do not care about handling the "this" parameter correctly in our examples.
    // Do not use this decorator where that's needed!
    module.exports.log_execution_time = function log_execution_time(func) {
        var wrapper = function() {
            // Local variables are declared with var to avoid implicit globals
            var start = (new Date()).getTime();
            var return_value = func.apply(this, arguments);
            var message = "Calculation took " + ((new Date()).getTime() - start) / 1000 + " seconds";
            console.log(message);
            return return_value;
        };
        return wrapper;
    };
The algorithm used here to calculate the Fibonacci numbers is intentionally the slowest one of all (exponential running time). This is because this post is not about Fibonacci numbers (see this post on that subject, as there is a logarithmic-time algorithm), and because I actually want the code to be slow to demonstrate some of the concepts below. Here's the Python code for it, which will be used multiple times below.
    from log_execution_time import log_execution_time


    def fib(n):
        return fib(n - 1) + fib(n - 2) if n > 1 else n


    timed_fib = log_execution_time(fib)
So, back to the task at hand. How do we even begin? Python does not provide a built-in setInterval or setTimeout, so a first possible solution is to use OS-level concurrency. Let's look at using two threads to do what we need. We will look at threads in more detail in a bit.
    from threading import Thread
    from time import sleep
    from time import time

    from fib import timed_fib


    def print_hello():
        while True:
            print("{} - Hello world!".format(int(time())))
            sleep(3)


    def read_and_process_input():
        while True:
            n = int(input())
            print('fib({}) = {}'.format(n, timed_fib(n)))


    def main():
        # Second thread will print the hello message. Starting as a daemon means
        # the thread will not prevent the process from exiting.
        t = Thread(target=print_hello)
        t.daemon = True
        t.start()
        # Main thread will read and process input
        read_and_process_input()


    if __name__ == '__main__':
        main()
Quite simple as well. But are the thread-based Python solution and the Node.js solution equivalent? Let's do an experiment. As we discussed, our Fibonacci number calculation code is very slow, so let's try a rather large number, say 37 for Python and 45 for Node.js (JavaScript is quite a bit faster than Python at numerical calculations).
    $ python3.4 hello_threads.py
    1412360472 - Hello world!
    37
    1412360475 - Hello world!
    1412360478 - Hello world!
    1412360481 - Hello world!
    Executing fib took 8.96 seconds.
    fib(37) = 24157817
    1412360484 - Hello world!
As you can see, the calculation took about 9 seconds to finish, but the "Hello world!" message kept being printed while it was running. Let's try it with Node.js:
    $ node hello.js
    1412360534 - Hello world!
    1412360537 - Hello world!
    45
    Calculation took 12.793 seconds
    fib(45) = 1134903170
    1412360551 - Hello world!
    1412360554 - Hello world!
    1412360557 - Hello world!
With Node.js on the other hand, the printing of the "Hello world!" message is paused while the Fibonacci number is calculated. Let's see how this makes sense.
To understand the difference in behaviour of the two solutions in the previous section, we need to have a simple understanding of threads and event loops. Let's start with threads. Think of a thread as a single sequence of instructions and the CPU's current state in executing them (CPU state refers to e.g. register values, in particular the next instruction register).
A simple synchronous program often runs on a single thread, which is why, if an operation needs to wait for something, say an IO operation or a timer, the execution of the program is paused until the operation is finished. One of the simplest blocking operations is sleep. In fact, that's all sleep does, namely blocking the thread it is executed on for the given length of time. A process can have multiple threads running in it. Threads in the same process share the same process-level resources, such as memory and its address space, file descriptors, etc.
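To make the point about sleep concrete, here is a minimal sketch showing that sleep only blocks the thread that calls it. The names sleeper and demo are made up for this illustration and are not used elsewhere in the post.

```python
import threading
import time


def sleeper(results):
    # sleep() blocks only the thread that calls it, here the worker thread.
    time.sleep(0.2)
    results.append("sleeper done")


def demo():
    results = []
    worker = threading.Thread(target=sleeper, args=(results,))
    start = time.monotonic()
    worker.start()
    # The main thread is not blocked by the worker's sleep and keeps going.
    results.append("main continued")
    # join() blocks the main thread until the worker has finished.
    worker.join()
    elapsed = time.monotonic() - start
    return results, elapsed


if __name__ == "__main__":
    results, elapsed = demo()
    print(results, "elapsed: {:.2f}s".format(elapsed))
```

The main thread records its message well before the worker wakes up, and the total elapsed time is roughly the length of the sleep, because join waits for the worker.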
The operating system is in charge of handling threads, and the scheduler in the OS takes care of jumping between threads in a process (and between processes, but we are not too concerned with that part, since it is outside the scope of this post.) The operating system's scheduler will choose when to put a thread on pause and give control of the CPU to another thread for execution. This is called a context switch, and involves saving of the context of the current thread (e.g. CPU register values) and then loading the state of the target thread. Context switching can be somewhat expensive in that it itself requires CPU cycles.
There are many reasons the OS might choose to switch to another thread. Examples can be that another higher priority process or thread requires immediate attention (for example, code that handles hardware interrupts), that the thread itself asks to be paused for a while (e.g. in sleep), or that the thread has used up the time it was assigned (this is also called the thread quantum) and will have to go back into a queue to be scheduled to continue execution.
Going back to our solutions above, the Python solution is clearly multi-threaded. This explains why the two tasks are run concurrently, and why the calculation of the large Fibonacci number, which is CPU intensive, is not blocking the execution of the other thread.
But what about Node.js? It appears, based on the fact that the calculation is blocking the other task, that our code is running on a single thread. And this is in fact how Node.js is implemented. As far as the operating system is concerned, your application is running in a single thread (I am simplifying things a little bit here, since depending on the platform libuv might use thread pools for some of the IO events, but even that doesn't change the fact that your JavaScript code is still running on a single thread).
There are a few reasons you might want to avoid threads in certain situations. One is that threads can be expensive, both computationally and in terms of resources. The other is that the truly concurrent behaviour of threads, along with shared memory, means concurrency issues such as deadlocks and race conditions enter the picture, leading to more complex code and the need to keep thread safety in mind while programming. (Of course, these are relative, and there's a time and place for threads. But that's beside the point of this article!)
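As a small illustration of the extra care shared memory requires, here is a sketch of a counter that several threads update. The Counter class and the run function are hypothetical names for this example only. Without the lock, the read-modify-write in increment is not guaranteed to be atomic, and updates could be lost.

```python
import threading


class Counter:
    """A shared counter guarded by a lock (illustrative only)."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # "self.value += 1" is a read, an add and a write. The lock makes
        # sure no other thread interleaves between those steps.
        with self._lock:
            self.value += 1


def run(n_threads=4, n_increments=10000):
    counter = Counter()

    def work():
        for _ in range(n_increments):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value


if __name__ == "__main__":
    print(run())
```

This is exactly the kind of bookkeeping a single-threaded event loop lets us avoid, since only one piece of our code runs at any given time.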
Let's see if we can solve the above problem without using multi-threading. To do so, we will imitate what Node.js uses behind the scenes: an event loop. First, we will need a way to poll stdin for input availability, that is, a system call that asks if a file descriptor (in this case stdin) has input available for reading or not. Depending on the operating system, there are a variety of system calls for this, such as poll, select, kqueue, etc. In Python 3.4, the selectors module provides an abstraction over these system calls so you can use them (somewhat) safely on a variety of machines.
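Before using it on stdin, here is a minimal sketch of what polling with the selectors module looks like in isolation. To keep it runnable without a terminal, a connected socket pair stands in for stdin; that substitution, and the helper names poll_once and demo, are my own choices for this illustration.

```python
import selectors
import socket


def poll_once(selector, timeout=0):
    """Return the file objects that are ready for reading."""
    # select() returns a list of (key, events) pairs, one for each
    # registered file object that is ready.
    return [key.fileobj for key, _events in selector.select(timeout)]


def demo():
    # A connected pair of sockets stands in for stdin in this sketch.
    reader, writer = socket.socketpair()
    selector = selectors.DefaultSelector()
    selector.register(reader, selectors.EVENT_READ)
    try:
        before = poll_once(selector)       # nothing has been written yet
        writer.sendall(b"37\n")
        after = poll_once(selector, 1.0)   # now there is data to read
        data = reader.recv(1024)
        return len(before), len(after), data
    finally:
        selector.unregister(reader)
        selector.close()
        reader.close()
        writer.close()


if __name__ == "__main__":
    print(demo())
```

The important part is that select with a timeout never blocks for longer than that timeout, which is what lets a single thread check for input and still get around to other work.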
Once we have the polling functionality, our event loop will be very simple: in each iteration of the loop, we check to see if there's input available for reading, and if so we read and process it. After that, we check to see if more than three seconds have passed since the last printing of "Hello world!" and if so, we print it. Let's give this a shot.
    import selectors
    import sys
    from time import time

    from fib import timed_fib


    def process_input(stream):
        text = stream.readline()
        n = int(text.strip())
        print('fib({}) = {}'.format(n, timed_fib(n)))


    def print_hello():
        print("{} - Hello world!".format(int(time())))


    def main():
        selector = selectors.DefaultSelector()
        # Register the selector to poll for "read" readiness on stdin
        selector.register(sys.stdin, selectors.EVENT_READ)
        last_hello = 0  # Setting to 0 means the timer will start right away
        while True:
            # Wait at most 100 milliseconds for input to be available
            for event, mask in selector.select(0.1):
                process_input(event.fileobj)
            if time() - last_hello > 3:
                last_hello = time()
                print_hello()


    if __name__ == '__main__':
        main()
And the output:
    $ python3.4 hello_eventloop.py
    1412376429 - Hello world!
    1412376432 - Hello world!
    1412376435 - Hello world!
    37
    Executing fib took 9.7 seconds.
    fib(37) = 24157817
    1412376447 - Hello world!
    1412376450 - Hello world!
And as expected, because we are using a single thread, the program acts the same way as Node.js does, that is, the calculation blocks the running of the "Hello world!" task. Great, this is neat! But our solution is rather hard-coded for the specific problem. In the next sections, we will look at generalizing our event loop code to be a bit more powerful and easier to program for, first using callbacks and then using coroutines.
A natural generalization of the previous section's event loop is to allow for generic event handlers. This can be achieved relatively easily using callbacks: for each event type (in our case, we only have two of them, input on stdin and timers going off), allow the user to add arbitrary functions as event handlers. The code is simple enough that we might as well just jump to it directly. There is only one bit that's a bit tricky, and it's the use of bisect.insort to handle timer events. The algorithm here is to keep the list of timer events sorted, with the timers to run earliest first. This way, at each iteration of the event loop, we just have to check to see if there are any timers, and if there are, start at the beginning and run all timers that have expired. bisect.insort makes this easier by inserting the item at the correct index in the list. There are various other approaches to this, but this is the one I opted for.
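As a quick aside before the full event loop, here is the sorted-timer idea on its own, as a minimal sketch. The helpers add_timer and pop_expired and the sequence counter are hypothetical and only for this illustration. The counter is my own addition: it breaks ties so that two timers with the same timestamp never have to be compared by their remaining fields.

```python
from bisect import insort
from itertools import count

# Hypothetical tie-breaker: a monotonically increasing sequence number.
_sequence = count()


def add_timer(timers, timestamp, name):
    """Insert a timer so that the list stays sorted by timestamp."""
    insort(timers, (timestamp, next(_sequence), name))


def pop_expired(timers, now):
    """Remove and return the names of all timers due at or before now."""
    expired = []
    # The list is sorted, so expired timers are always at the front.
    while timers and timers[0][0] <= now:
        expired.append(timers.pop(0)[2])
    return expired


if __name__ == "__main__":
    timers = []
    add_timer(timers, 5.0, "c")
    add_timer(timers, 1.0, "a")
    add_timer(timers, 3.0, "b")
    print(pop_expired(timers, 3.0))
    print(timers)
```

Because the list is always sorted, the loop only ever has to look at the first element to decide whether any timer is due.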
    from bisect import insort
    from collections import namedtuple
    from fib import timed_fib
    from time import time
    import selectors
    import sys

    Timer = namedtuple('Timer', ['timestamp', 'handler'])


    class EventLoop(object):
        """
        Implements a callback based single-threaded event loop as a simple
        demonstration.
        """
        def __init__(self, *tasks):
            self._running = False
            self._stdin_handlers = []
            self._timers = []
            self._selector = selectors.DefaultSelector()
            self._selector.register(sys.stdin, selectors.EVENT_READ)

        def run_forever(self):
            self._running = True
            while self._running:
                # First check for available IO input
                for key, mask in self._selector.select(0):
                    line = key.fileobj.readline().strip()
                    for callback in self._stdin_handlers:
                        callback(line)

                # Handle timer events
                while self._timers and self._timers[0].timestamp < time():
                    handler = self._timers[0].handler
                    del self._timers[0]
                    handler()

        def add_stdin_handler(self, callback):
            self._stdin_handlers.append(callback)

        def add_timer(self, wait_time, callback):
            timer = Timer(timestamp=time() + wait_time, handler=callback)
            insort(self._timers, timer)

        def stop(self):
            self._running = False


    def main():
        loop = EventLoop()

        def on_stdin_input(line):
            if line == 'exit':
                loop