From: Ben Darnell
Date: Wed, 25 Jun 2014 01:58:35 +0000 (-0700)
Subject: Add the start of a new user guide.
X-Git-Tag: v4.0.0b2~3
X-Git-Url: http://git.ipfire.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=5d83890654687fbb4253b0b9b3e3cdafa066ee85;p=thirdparty%2Ftornado.git

Add the start of a new user guide.
---

diff --git a/docs/documentation.rst b/docs/documentation.rst
index c1cec79f1..beac58a32 100644
--- a/docs/documentation.rst
+++ b/docs/documentation.rst
@@ -4,6 +4,7 @@ Tornado Documentation
 .. toctree::
    :titlesonly:
 
+   guide
    overview
    webframework
    networking
diff --git a/docs/guide.rst b/docs/guide.rst
new file mode 100644
index 000000000..2f6087480
--- /dev/null
+++ b/docs/guide.rst
@@ -0,0 +1,8 @@
+User's guide
+============
+
+.. toctree::
+
+   guide/intro
+   guide/async
+   guide/coroutines
diff --git a/docs/guide/async.rst b/docs/guide/async.rst
new file mode 100644
index 000000000..e84126f6e
--- /dev/null
+++ b/docs/guide/async.rst
@@ -0,0 +1,111 @@
+Asynchronous and non-Blocking
+-----------------------------
+
+Real-time web features require a long-lived mostly-idle connection per
+user. In a traditional synchronous web server, this implies devoting
+one thread to each user, which can be very expensive.
+
+To minimize the cost of concurrent connections, Tornado uses a
+single-threaded event loop. This means that all application code
+should aim to be asynchronous and non-blocking because only one
+operation can be active at a time.
+
+The terms asynchronous and non-blocking are closely related and are
+often used interchangeably, but they are not quite the same thing.
+
+Blocking
+~~~~~~~~
+
+A function **blocks** when it waits for something to happen before
+returning. A function may block for many reasons: network I/O, disk
+I/O, mutexes, etc. In fact, *every* function blocks, at least a
+little bit, while it is running and using the CPU (for an extreme
+example that demonstrates why CPU blocking must be taken as seriously
+as other kinds of blocking, consider password hashing functions like
+`bcrypt `_, which by design use
+hundreds of milliseconds of CPU time, far more than a typical network
+or disk access).
+
+A function can be blocking in some respects and non-blocking in
+others. For example, `tornado.httpclient` in the default
+configuration blocks on DNS resolution but not on other network access
+(to mitigate this use `.ThreadedResolver` or a
+``tornado.curl_httpclient`` with a properly-configured build of
+``libcurl``). In Tornado we generally talk about blocking in the
+context of network I/O, although all kinds of blocking
+are to be minimized.
+
+Asynchronous
+~~~~~~~~~~~~
+
+An **asynchronous** function returns before it is finished, and
+generally causes some work to happen in the background before
+triggering some future action in the application (as opposed to normal
+**synchronous** functions, which do everything they are going to do
+before returning). There are many styles of asynchronous interfaces:
+
+* Callback argument
+* Return a placeholder (`.Future`, ``Promise``, ``Deferred``)
+* Deliver to a queue
+* Callback registry (e.g. POSIX signals)
+
+Regardless of which type of interface is used, asynchronous functions
+*by definition* interact differently with their callers; there is no
+free way to make a synchronous function asynchronous in a way that is
+transparent to its callers (systems like `gevent
+`_ use lightweight threads to offer performance
+comparable to asynchronous systems, but they do not actually make
+things asynchronous).
+
+Examples
+~~~~~~~~
+
+Here is a sample synchronous function::
+
+    from tornado.httpclient import HTTPClient
+
+    def synchronous_fetch(url):
+        http_client = HTTPClient()
+        response = http_client.fetch(url)
+        return response.body
+
+And here is the same function rewritten to be asynchronous with a
+callback argument::
+
+    from tornado.httpclient import AsyncHTTPClient
+
+    def asynchronous_fetch(url, callback):
+        http_client = AsyncHTTPClient()
+        def handle_response(response):
+            callback(response.body)
+        http_client.fetch(url, callback=handle_response)
+
+And again with a `.Future` instead of a callback::
+
+    from tornado.concurrent import Future
+
+    def async_fetch_future(url):
+        http_client = AsyncHTTPClient()
+        my_future = Future()
+        fetch_future = http_client.fetch(url)
+        fetch_future.add_done_callback(
+            lambda f: my_future.set_result(f.result()))
+        return my_future
+
+The raw `.Future` version is more complex, but ``Futures`` are
+nonetheless recommended practice in Tornado because they have two
+major advantages. Error handling is more consistent since the
+`.Future.result` method can simply raise an exception (as opposed to
+the ad-hoc error handling common in callback-oriented interfaces), and
+``Futures`` lend themselves well to use with coroutines. Coroutines
+will be discussed in depth in the next section of this guide. Here is
+the coroutine version of our sample function, which is very similar to
+the original synchronous version::
+
+    from tornado import gen
+
+    @gen.coroutine
+    def fetch_coroutine(url):
+        http_client = AsyncHTTPClient()
+        response = yield http_client.fetch(url)
+        return response.body
diff --git a/docs/guide/coroutines.rst b/docs/guide/coroutines.rst
new file mode 100644
index 000000000..bc400833c
--- /dev/null
+++ b/docs/guide/coroutines.rst
@@ -0,0 +1,142 @@
+Coroutines
+==========
+
+**Coroutines** are the recommended way to write asynchronous code in
+Tornado. Coroutines use the Python ``yield`` keyword to suspend and
+resume execution instead of a chain of callbacks (cooperative
+lightweight threads as seen in frameworks like `gevent
+`_ are sometimes called coroutines as well, but
+in Tornado all coroutines use explicit context switches and are called
+as asynchronous functions).
+
+Coroutines are almost as simple as synchronous code, but without the
+expense of a thread. They also `make concurrency easier
+`_ to reason
+about by reducing the number of places where a context switch can
+happen.
+
+Example::
+
+    from tornado import gen
+
+    @gen.coroutine
+    def fetch_coroutine(url):
+        http_client = AsyncHTTPClient()
+        response = yield http_client.fetch(url)
+        # In Python versions prior to 3.3, returning a value from
+        # a generator is not allowed and you must use
+        #   raise gen.Return(response.body)
+        # instead.
+        return response.body
+
+How it works
+~~~~~~~~~~~~
+
+A function containing ``yield`` is a **generator**. All generators
+are asynchronous; when called they return a generator object instead
+of running to completion. The ``@gen.coroutine`` decorator
+communicates with the generator via the ``yield`` expressions, and
+with the coroutine's caller by returning a `.Future`.
+
+Here is a simplified version of the coroutine decorator's inner loop::
+
+    # Simplified inner loop of tornado.gen.Runner
+    def run(self):
+        # send(x) makes the current yield return x.
+        # It returns when the next yield is reached
+        future = self.gen.send(self.next)
+        def callback(f):
+            self.next = f.result()
+            self.run()
+        future.add_done_callback(callback)
+
+The decorator receives a `.Future` from the generator, waits (without
+blocking) for that `.Future` to complete, then "unwraps" the `.Future`
+and sends the result back into the generator as the result of the
+``yield`` expression. Most asynchronous code never touches the `.Future`
+class directly except to immediately pass the `.Future` returned by
+an asynchronous function to a ``yield`` expression.
+
+Coroutine patterns
+~~~~~~~~~~~~~~~~~~
+
+Interaction with callbacks
+^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+To interact with asynchronous code that uses callbacks instead of
+`.Future`, wrap the call in a `.Task`. This will add the callback
+argument for you and return a `.Future` which you can yield::
+
+    @gen.coroutine
+    def call_task():
+        # Note that there are no parens on some_function.
+        # This will be translated by Task into
+        #   some_function(other_args, callback=callback)
+        yield gen.Task(some_function, other_args)
+
+Calling blocking functions
+^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The simplest way to call a blocking function from a coroutine is to
+use a `~concurrent.futures.ThreadPoolExecutor`, which returns
+``Futures`` that are compatible with coroutines::
+
+    thread_pool = ThreadPoolExecutor(4)
+
+    @gen.coroutine
+    def call_blocking():
+        yield thread_pool.submit(blocking_func, args)
+
+Parallelism
+^^^^^^^^^^^
+
+The coroutine decorator recognizes lists and dicts whose values are
+``Futures``, and waits for all of those ``Futures`` in parallel::
+
+    @gen.coroutine
+    def parallel_fetch(url1, url2):
+        resp1, resp2 = yield [http_client.fetch(url1),
+                              http_client.fetch(url2)]
+
+    @gen.coroutine
+    def parallel_fetch_many(urls):
+        responses = yield [http_client.fetch(url) for url in urls]
+        # responses is a list of HTTPResponses in the same order
+
+    @gen.coroutine
+    def parallel_fetch_dict(urls):
+        responses = yield {url: http_client.fetch(url)
+                            for url in urls}
+        # responses is a dict {url: HTTPResponse}
+
+Interleaving
+^^^^^^^^^^^^
+
+Sometimes it is useful to save a `.Future` instead of yielding it
+immediately, so you can start another operation before waiting::
+
+    @gen.coroutine
+    def get(self):
+        fetch_future = self.fetch_next_chunk()
+        while True:
+            chunk = yield fetch_future
+            if chunk is None: break
+            self.write(chunk)
+            fetch_future = self.fetch_next_chunk()
+            yield self.flush()
+
+Looping
+^^^^^^^
+
+Looping is tricky with coroutines since there is no way in Python
+to ``yield`` on every iteration of a ``for`` or ``while`` loop and
+capture the result of the yield. Instead, you'll need to separate
+the loop condition from accessing the results, as in this example
+from `motor `_::
+
+    import motor
+    @gen.coroutine
+    def loop_example(collection):
+        cursor = collection.find()
+        while (yield cursor.fetch_next):
+            doc = cursor.next_object()
diff --git a/docs/guide/intro.rst b/docs/guide/intro.rst
new file mode 100644
index 000000000..b4a057a42
--- /dev/null
+++ b/docs/guide/intro.rst
@@ -0,0 +1,28 @@
+Introduction
+------------
+
+`Tornado `_ is a Python web framework and
+asynchronous networking library, originally developed at `FriendFeed
+`_. By using non-blocking network I/O, Tornado
+can scale to tens of thousands of open connections, making it ideal for
+`long polling `_,
+`WebSockets `_, and other
+applications that require a long-lived connection to each user.
+
+Tornado can be roughly divided into three major components:
+
+* A web framework (including `.RequestHandler` which is subclassed to
+  create web applications, and various supporting classes).
+* Client- and server-side implementations of HTTP (`.HTTPServer` and
+  `.AsyncHTTPClient`).
+* An asynchronous networking library (`.IOLoop` and `.IOStream`),
+  which serve as the building blocks for the HTTP components and can
+  also be used to implement other protocols.
+
+The Tornado web framework and HTTP server together offer a full-stack
+alternative to `WSGI `_.
+While it is possible to use the Tornado web framework in a WSGI
+container (`.WSGIAdapter`), or use the Tornado HTTP server as a
+container for other WSGI frameworks (`.WSGIContainer`), each of these
+combinations has limitations and to take full advantage of Tornado you
+will need to use Tornado's web framework and HTTP server together.