
How does synchronous Python differ from asynchronous Python?


May 31, 2021


This article comes from the public account Architecture Headline. Author: Miguel

Have you ever been told that asynchronous Python code is faster than "normal" (or synchronous) Python code? Is that really the case?

What do "synchronous" and "asynchronous" mean?

Web applications typically process many requests from different clients in a short period of time. To avoid processing delays, you must consider processing multiple requests in parallel, which is often referred to as concurrency.

In this article, I'll continue to use Web applications as an example, but there are other types of applications that also benefit from concurrency, so this discussion is not just about Web applications.

The terms "synchronous" and "asynchronous" refer to two ways of writing concurrent applications. So-called "synchronous" servers use threads and processes supported by the underlying operating system to achieve this concurrency. Here is a diagram of a synchronous deployment:

[Figure 1: a synchronous server deployment]

In this case, we have five clients, all sending requests to the application. The entry point to the application is a Web server that acts as a load balancer, distributing requests across a pool of server workers, which can be implemented as processes, threads, or a combination of both. These workers execute the requests assigned to them by the load balancer. The application logic that you write with a web framework such as Flask or Django runs in these workers.

This scheme works well for servers with multiple CPUs, because you can set the number of workers to match the number of CPUs and make balanced use of your processor cores, something a single Python process cannot do because of the global interpreter lock (GIL).

As for shortcomings, the diagram above also clearly shows the main limitation of this scheme. We have five clients but only four workers. If all five clients send requests at the same time, the load balancer can dispatch all but one of them to workers, and the remaining request has to wait in a queue until a worker becomes available. As a result, four-fifths of the requests are answered immediately, while the remaining fifth has to wait a while. A key part of tuning a synchronous server is choosing the right number of workers to prevent or minimize request blocking for the expected load.
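As an illustration of the worker-pool idea (not tied to any particular web server), here is a minimal sketch that uses a fixed pool of four threads to serve five simultaneous "requests"; the handle_request function and its one-second delay are invented for the example:

```python
# Minimal sketch of a fixed worker pool: 4 workers, 5 simultaneous "requests".
import time
from concurrent.futures import ThreadPoolExecutor

def handle_request(client_id):
    # Pretend each request spends one second blocked on I/O.
    time.sleep(1)
    return f"response for client {client_id}"

start = time.monotonic()
with ThreadPoolExecutor(max_workers=4) as pool:          # 4 workers, as in the diagram
    results = list(pool.map(handle_request, range(5)))   # 5 clients
print(results)
print(f"elapsed: {time.monotonic() - start:.1f}s")       # ~2s: the fifth request waited for a free worker
```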

The configuration of an asynchronous server is harder to draw, but here is my best attempt:

[Figure 2: an asynchronous server deployment]

This type of server runs in a single process and is controlled by an event loop. The loop is a very efficient task manager and scheduler that creates tasks to execute the requests sent by clients. Unlike long-lived server workers, asynchronous tasks are created by the loop to handle a particular request, and when that request is completed the task is destroyed. At any one time, an asynchronous server may have hundreds or thousands of active tasks, all performing their work under the loop's management.

You may wonder how concurrency between asynchronous tasks is achieved. This is the interesting part, because an asynchronous application relies on cooperative multitasking. What does that mean? When a task needs to wait for an external event (for example, a response from a database server), it does not sit and wait the way a synchronous worker does. Instead, it tells the loop what it needs to wait for and returns control to it. The loop can then find another task that is ready to run while the first one is blocked on the database. Eventually the database sends its response, the loop considers the first task ready to run again, and resumes it as soon as possible.
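Here is a minimal asyncio sketch of that idea: while one task waits for a (simulated) database response, the loop resumes another task, so both finish in roughly the time of a single wait. The query_database coroutine and its timing are invented for illustration:

```python
import asyncio

async def query_database(task_id):
    print(f"task {task_id}: waiting for the database, handing control back to the loop")
    await asyncio.sleep(1)      # simulated I/O wait; the loop is free to run other tasks
    print(f"task {task_id}: response received, resumed by the loop")

async def main():
    # Both tasks complete in about 1 second because their waits overlap.
    await asyncio.gather(query_database(1), query_database(2))

asyncio.run(main())
```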

This ability of asynchronous tasks to pause and resume execution can be hard to grasp in the abstract. To relate it to something you already know, think of the await and yield keywords in Python, although you will see later that this is not the only way to implement asynchronous tasks.

Surprisingly, an asynchronous application runs entirely in a single process and a single thread. Of course, this kind of concurrency requires some rules: you cannot let a task hold the CPU for too long, or the remaining tasks will be blocked. For asynchrony to work, all tasks must pause voluntarily from time to time and return control to the loop. To benefit from the asynchronous approach, an application needs tasks that are frequently blocked by I/O and do not have much CPU work. Web applications are usually a good fit, especially when they need to handle a large number of client requests.
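To make the cooperation rule concrete, here is a small sketch contrasting a task that yields control while waiting (await asyncio.sleep) with one that blocks the whole loop (time.sleep); the task names and timings are made up:

```python
import asyncio
import time

async def well_behaved(name):
    await asyncio.sleep(1)      # yields to the loop while waiting
    print(f"{name} finished")

async def badly_behaved(name):
    time.sleep(1)               # blocks the whole loop; no other task can run
    print(f"{name} finished")

async def main():
    t0 = time.monotonic()
    await asyncio.gather(well_behaved("a"), well_behaved("b"))
    print(f"awaiting tasks: {time.monotonic() - t0:.1f}s")    # ~1s, the waits overlap

    t0 = time.monotonic()
    await asyncio.gather(badly_behaved("c"), badly_behaved("d"))
    print(f"blocking tasks: {time.monotonic() - t0:.1f}s")    # ~2s, they run one after another

asyncio.run(main())
```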

When using an asynchronous server and you want to make the most of multiple CPUs, you typically create a hybrid scheme: add a load balancer and run one asynchronous server per CPU, as shown in the following image:

[Figure 3: a hybrid deployment with a load balancer and one asynchronous server per CPU]
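A rough sketch of this hybrid layout, assuming a trivial TCP echo handler and arbitrary port numbers, might start one asyncio server per CPU core in its own process (in practice a load balancer such as nginx would sit in front):

```python
import asyncio
import multiprocessing
import os

async def handle(reader, writer):
    data = await reader.read(1024)      # wait for the client without blocking the loop
    writer.write(b"echo: " + data)
    await writer.drain()
    writer.close()

def run_server(port):
    async def main():
        server = await asyncio.start_server(handle, "127.0.0.1", port)
        async with server:
            await server.serve_forever()
    asyncio.run(main())

if __name__ == "__main__":
    # One asynchronous server per CPU core, each in its own process.
    for i in range(os.cpu_count() or 1):
        multiprocessing.Process(target=run_server, args=(8000 + i,)).start()
```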

Two ways to implement asynchrony in Python

I'm sure you know that to write an asynchronous application in Python you can use the asyncio package, which implements the pause and resume features that all asynchronous applications need, on top of coroutines. The yield keyword, along with the newer async and await keywords, is the foundation on which asyncio builds its asynchronous capabilities.

https://docs.python.org/3/library/asyncio.html
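As a toy illustration of the pause-and-resume mechanism that coroutines are built on, a plain generator can pause at yield and be resumed later by its caller, which is the same basic idea the asyncio loop applies through async and await (the example below is deliberately simplified):

```python
def task():
    print("doing some work")
    data = yield "waiting for data"     # pause here and hand control back to the caller
    print(f"resumed with: {data}")

t = task()
print(next(t))                          # runs until the first yield, prints "waiting for data"
try:
    t.send("a database row")            # resumes the paused function with a value
except StopIteration:
    pass
```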

There are other coroutine-based asynchronous solutions in the Python ecosystem, such as Trio and Curio. And there is Twisted, the oldest of all the coroutine frameworks, which appeared even earlier than asyncio.

If you're interested in writing asynchronous web applications, there are many coroutine-based asynchronous frameworks to choose from, including aiohttp, sanic, FastAPI, and Tornado.

What many people don't know is that coroutines are just one of two ways to write asynchronous code in Python. The second method is based on a library called greenlet, which you can install with pip. Greenlets, like coroutines, allow a Python function to pause execution and resume it later, but they do this in a completely different way, which means the asynchronous ecosystem in Python is split into two broad categories.

The most interesting difference between coroutines and greenlets for asynchronous development is that the former requires language-specific keywords and features to work, while the latter does not. That is, coroutine-based applications must be written with a specific syntax, while greenlet-based applications look almost exactly like normal Python code. This is very useful, because in some cases it allows synchronous code to be executed asynchronously, which coroutine-based solutions such as asyncio cannot do.

So, on the greenlet side, which libraries are the equivalents of asyncio? I know of three greenlet-based asynchronous packages: Gevent, Eventlet, and Meinheld, although the last one is more of a Web server than a general-purpose asynchronous library. They all have their own asynchronous loop implementations, and they all provide an interesting "monkey-patching" feature that replaces blocking functions in the Python standard library, such as those that perform networking and threading, with equivalent non-blocking versions based on greenlets. If you have synchronous code that you want to run asynchronously, these packages will help you do it.
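For example, a minimal gevent sketch (assuming gevent is installed with pip) shows how monkey-patching lets ordinary-looking blocking code run concurrently; the fake_request function and its timing are invented:

```python
from gevent import monkey
monkey.patch_all()          # patch blocking stdlib calls before other imports

import time
import gevent

def fake_request(n):
    time.sleep(1)           # now cooperative: yields to other greenlets while waiting
    print(f"request {n} done")

start = time.monotonic()
gevent.joinall([gevent.spawn(fake_request, i) for i in range(100)])
print(f"100 'requests' in {time.monotonic() - start:.1f}s")   # roughly 1 second
```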

As far as I know, the only Web framework with explicit greenlet support is Flask. The framework automatically detects when it is running on a greenlet Web server and adjusts itself accordingly, without any configuration. When doing this, you need to be careful not to call blocking functions, or, if you do call them, to "fix" those blocking functions with monkey patching.

However, Flask is not the only framework that can benefit from greenlets. Other Web frameworks, such as Django and Bottle, have no native greenlet support, but they can also run asynchronously by combining a greenlet Web server with monkey-patching of blocking functions.
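As a hedged sketch of this setup, an ordinary Flask application can be served by gevent's greenlet-based WSGIServer roughly like this (the route, host, and port are invented for the example):

```python
from gevent import monkey
monkey.patch_all()          # make blocking calls inside views greenlet-friendly

from flask import Flask
from gevent.pywsgi import WSGIServer

app = Flask(__name__)

@app.route("/")
def index():
    return "served by a greenlet worker"

if __name__ == "__main__":
    # Each incoming request is handled in its own greenlet.
    WSGIServer(("127.0.0.1", 8000), app).serve_forever()
```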

Is asynchronous faster than synchronous?

There is a widespread misconception about the performance of synchronous and asynchronous applications: that asynchronous applications are much faster than synchronous ones.

I need to clarify this. Python code runs at almost exactly the same speed whether it is written synchronously or asynchronously. Beyond the code itself, there are two factors that can affect the performance of a concurrent application: context switching and scalability.

Context switching

Sharing the CPU fairly among all running tasks, known as context switching, can affect the performance of an application. For synchronous applications, this is done by the operating system and is essentially a black box with no configuration or fine-tuning options. For asynchronous applications, context switching is done by the event loop.

The default event loop implementation provided by asyncio is written in Python and is not very efficient. The uvloop package, on the other hand, provides an alternative loop that is partly written in C for better performance. The event loops used by Gevent and Meinheld are also written in C. Eventlet uses a loop written in Python.
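Swapping in uvloop is typically a one-line change (assuming the third-party uvloop package is installed with pip); a minimal sketch:

```python
import asyncio
import uvloop

# Use uvloop's C-based event loop instead of asyncio's default Python one.
asyncio.set_event_loop_policy(uvloop.EventLoopPolicy())

async def main():
    await asyncio.sleep(0)      # application code is unchanged; only the loop differs

asyncio.run(main())
```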

A highly optimized asynchronous loop can switch contexts more efficiently than the operating system, but in my experience you have to run at a very high level of concurrency to see a real gain. For most applications, I don't think the performance difference between synchronous and asynchronous context switching is significant.

Scalability

I think the source of the myth that asynchronous is faster is that asynchronous applications tend to use the CPU more efficiently, scale better, and scale more flexibly than synchronous ones.

Think about what would happen if the synchronous server in the diagram above received 100 requests at the same time. This server can only process four requests at a time, so most of the requests sit in a queue and wait until a worker is assigned to them.

In contrast, the asynchronous server immediately creates 100 tasks (or, in the hybrid mode, 25 tasks on each of 4 asynchronous workers). With an asynchronous server, all requests start being processed immediately, without waiting (although, to be fair, there can be other bottlenecks that slow things down, such as a limit on active database connections).

If these 100 tasks mainly use the CPU, then the synchronous and asynchronous scenarios have similar performance, because each CPU runs at a fixed speed, Python executes code at the same speed, and the application has the same work to do. However, if the tasks involve a lot of I/O, the synchronous server can only handle four concurrent requests and cannot reach high CPU utilization. The asynchronous server, on the other hand, is better at keeping the CPU busy because it runs all 100 requests concurrently.
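Here is a small sketch of that I/O-bound case: 100 simulated requests that each spend 0.1 seconds waiting on I/O finish in roughly 0.1 seconds when run concurrently, whereas four synchronous workers would need about 100 / 4 × 0.1 = 2.5 seconds. The fake_io_request coroutine is invented for illustration:

```python
import asyncio
import time

async def fake_io_request(n):
    await asyncio.sleep(0.1)    # stands in for a database or network wait
    return n

async def main():
    start = time.monotonic()
    await asyncio.gather(*(fake_io_request(i) for i in range(100)))
    print(f"100 I/O-bound requests in {time.monotonic() - start:.2f}s")

asyncio.run(main())
```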

You might wonder why you can't just run 100 synchronous workers so that the two servers have the same concurrency. Note that each worker needs its own Python interpreter and all the resources that come with it, plus a separate copy of the application and its resources. The size of your server and of your application determines how many worker instances you can run, and that number is usually not very large. Asynchronous tasks, on the other hand, are lightweight and all run in the context of a single worker process, so they have a clear advantage here.

To sum up, we can say that asynchronous may be faster than synchronous only in the following scenarios:

  • There is a high load (without a high load, high concurrency brings no advantage)
  • Tasks are I/O bound (if the tasks are CPU bound, concurrency beyond the number of CPUs does not help)
  • You measure the average number of requests handled per unit of time. If you look at the processing time of a single request, you will not see a big difference, and asynchronous may even be slightly slower, because there are more concurrent tasks competing for the CPU.

Conclusion

Hopefully, this article clears up some of the confusion and misunderstanding around asynchronous code. I want you to remember the following two key points:

  • Asynchronous applications only outperform synchronous applications under high load
  • Thanks to greenlets, you can benefit from asynchrony even if you write ordinary code and use traditional frameworks such as Flask or Django.
