Finding a Python package for WebSocket connectivity often turns out to be harder than initially planned. Since we want to make challenging things a bit easier for the people around us, we have put together a practitioners' guide to using WebSockets in Python, without async.
But why avoid async in the first place? Simply put, because the async paradigm in Python can prove challenging even for people who are familiar with event-loop-based languages. It is a hard trick to love, and once you add the async keyword to your codebase, it is going to spread.
This is by design, to increase performance and enforce rules that eventually make life easier. For server apps and microservices the performance can be outstanding, which is great for the language as a whole: async shines when functions are short, fast, and block on input/output.
It turns out that most functions written in Python are not quite like that. A lot of people love the language for its ability to simplify the numerical heavy lifting in fields like image processing, data science and machine learning. That kind of code is by no means input/output bound and, on top of everything, it has a lengthy execution time.
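To make the contrast concrete, here is a minimal sketch (the function names and the toy workload are ours, purely for illustration): the first function is exactly what async was built for, while the second gains nothing from running on an event loop.

import asyncio

# short, fast, and blocked on I/O: the event loop can run other work
# while this coroutine waits for the reply
async def fetch_status( reader: asyncio.StreamReader, writer: asyncio.StreamWriter ):
    writer.write( b'PING\n' )
    await writer.drain()
    return await reader.readline()

# long-running and CPU-bound: the event loop is stuck until it returns
def grayscale( image ):
    return [ [ sum( pixel ) // 3 for pixel in row ] for row in image ]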
Consider the following example, in which a single WebSocket connection handles both request/response traffic and state updates for an application such as a chat. How would you start prototyping this in Jupyter?
# jupyter cell 1
sock = socket.open( 'endpoint' )

sock.send( 'query ring' )
sock.recv() # N unrelated
sock.recv() # 'response ring + data'

# update chat messages
while( 'when does this stop?' ):
    sock.recv() # M unrelated

# jupyter cell 2
# unreachable
sock.send( 'query users' )
One method would be to build a single cell that intertwines the recv calls with the legitimate cell contents. If that kind of state management solves your problem, go for it.
Another method is to handle both the receives and the sends in a separate thread. Some coroutine (async) code is required to make this work; otherwise the order of sends and receives would have to be known beforehand.
To isolate the async code from the rest of the implementation, you can use the curio package, which supports running its event loop inside a thread and provides a great UniversalQueue object for communication between the async thread and the main, synchronous one. Framing of the data packets and handshaking are performed using wsproto.
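Here is a minimal sketch of that arrangement, reduced to just the threading and queueing part; the producer coroutine and its messages are placeholders, and a real client would feed the queue from wsproto events instead.

import threading
import curio

# UniversalQueue can be used from async code and from regular threads
queue = curio.UniversalQueue()

async def producer():
    # async side: push items onto the queue from inside the event loop
    for i in range( 3 ):
        await queue.put( 'message %d' % i )
        await curio.sleep( 0.1 )
    await queue.put( None )  # sentinel: nothing more to read

def run_loop():
    # the curio event loop lives entirely inside this background thread
    curio.run( producer )

thread = threading.Thread( target = run_loop, daemon = True )
thread.start()

# synchronous side: a plain blocking get() works from the main thread
while True:
    item = queue.get()
    if item is None:
        break
    print( 'received', item )

thread.join()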
In this scenario, data and events are passed to the client using callbacks. It is not the most 'Pythonic' way of doing things, but it is definitely not the worst.
This is where nwebsocket comes in; the 'n' stands for normal or node-like. The package replicates the API surface of the WebSocket client found in your browser, so Python practitioners don't have to deal with the async compromises. At its core is the magic WebSocket class, which runs callbacks. The first thing you should do to make your life easier is to isolate the callbacks right away and place the received data into instance variables. This can be done by distinguishing between message types, as seen in the example below:
import json

from nwebsocket import WebSocket

class ChatProtocol( WebSocket ):
    def __init__( self, url ):
        super().__init__( url )

        # store an array of messages
        self.messages = []

    def onmessage( self, m ):
        # if string, add to internal variable
        if( isinstance( m, str ) ):
            self.messages.append( json.loads( m )[ 'data' ] )
        # if bytes, reflect payload
        else:
            self.send( m )

    def post( self, message ):
        self.send( json.dumps( dict( type = 'message', data = message ) ) )
In this scenario you can use any ChatProtocol instance to send messages and to read the current messages list from it. The messages property is changed only from inside the class. Any other logic, such as connection state, can also be stored within the class.
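A short usage sketch, assuming a hypothetical chat endpoint URL (in a real script you would also wait for the connection to open before posting):

import time

# the endpoint URL below is a placeholder
chat = ChatProtocol( 'ws://localhost:8000/chat' )

# give the connection a moment to establish, then post and read back
time.sleep( 1 )
chat.post( 'hello from plain, synchronous Python' )

time.sleep( 1 )
print( chat.messages )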
If the endpoint supports some form of request/response messaging, keeping track of which reply belongs to which request can be tricky. A solution is to find a property that acts like a reflect header: whatever information you send to the endpoint is reflected back to your client. This way you can determine precisely when a query has finished and what the server's response was.
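As a sketch of the idea, assume a hypothetical server that reflects a 'tag' field from every request back in its reply; a client could then correlate requests and responses like this:

import json
import time
import uuid

from nwebsocket import WebSocket

class RequestProtocol( WebSocket ):
    def __init__( self, url ):
        super().__init__( url )
        self.responses = {}

    def onmessage( self, m ):
        reply = json.loads( m )
        # file the reply under the tag the server reflected back
        self.responses[ reply[ 'tag' ] ] = reply

    def query( self, payload, timeout = 5.0 ):
        tag = str( uuid.uuid4() )
        self.send( json.dumps( dict( tag = tag, **payload ) ) )

        # poll until the reflected tag shows up or we time out
        deadline = time.time() + timeout
        while time.time() < deadline:
            if tag in self.responses:
                return self.responses.pop( tag )
            time.sleep( 0.01 )

        raise TimeoutError( 'no reply reflected for tag ' + tag )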
So how fast is this approach? It depends. Round-trip numbers on localhost seem to sit fairly low under Windows.
Results on Unix systems seem to be at least one order of magnitude faster, sitting at about 3,000 round-trip queries per second.
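A rough sketch of how such a measurement could be made, assuming a hypothetical local echo endpoint and reusing the subclassing pattern from above:

import time

from nwebsocket import WebSocket

class EchoBenchmark( WebSocket ):
    def __init__( self, url ):
        super().__init__( url )
        self.count = 0

    def onmessage( self, m ):
        self.count += 1

# the echo endpoint URL is a placeholder
bench = EchoBenchmark( 'ws://localhost:8000/echo' )
time.sleep( 1 )  # let the connection establish first

N = 1000
start = time.time()

for i in range( N ):
    bench.send( 'ping' )
    # wait for the echo before issuing the next query
    while bench.count <= i:
        time.sleep( 0 )

print( N / ( time.time() - start ), 'round-trip queries per second' )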
In the end, you have to try it for yourself and, for this reason, we'll finish this article with a guideline on when to use async in Python.
In a perfect world there would be a perfect solution for running hardware-intensive tasks on the event loop but, unfortunately, one hasn't been found yet.
We recommend using async if you write server apps or microservices built from short, fast functions that block on input/output.
It is not advised to use it to parallelize hardware-intensive, long-running functions. For this, there are better existing alternatives: either multiprocessing, or load balancing and auto-scaling.
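For completeness, a minimal sketch of the multiprocessing route (the worker function is a placeholder for your own heavy workload):

import multiprocessing

# a stand-in for a long-running, hardware-intensive function
def heavy( n ):
    return sum( i * i for i in range( n ) )

if __name__ == '__main__':
    with multiprocessing.Pool() as pool:
        # each call runs in its own process, sidestepping the event loop and the GIL
        results = pool.map( heavy, [ 10 ** 6 ] * 8 )
        print( results )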
Otherwise:
pip install nwebsocket
Thanks for reading and best of luck in your trials!