Python, Celery and threading/async

- Wed, 02 Nov 2016 10:36:00 EST gEtE4wNA No.36274
File: 1478097360593.jpg -(59959B / 58.55KB, 900x587)
Hey
What's a proper way to run multiple HTTP requests asynchronously from within a celery task?
Some background: I'm building a web API with Flask; when the API receives a request, it spins up a celery task in the background.
The celery task itself needs to make a lot of (>100) outbound web requests, and the problem is that the whole process is too slow. Since most of the time is spent just waiting for those requests to return, I want to run some of them in parallel to speed things up.

I know how to use multiprocessing.pool.ThreadPool and whatnot, but how would this work if I have a bunch of celery workers spawning multiple threads at the same time? I don't wanna kill the server if I get a spike in traffic, and I assume that if I throw some of those requests over to one of the other celery workers (nesting the tasks) I'll end up with a deadlock.
Can I have something like a shared threadpool across all celery workers, or what is the proper way to handle this?

Thanks
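
For reference, a minimal sketch of the thread-pool option mentioned above: a bounded concurrent.futures pool inside a celery task, so each worker never opens more than max_workers connections at once. The fetch_all task name, the broker URL, and the 20-worker cap are placeholders, not anything from this thread.

from concurrent.futures import ThreadPoolExecutor

import requests
from celery import Celery

app = Celery("tasks", broker="redis://localhost:6379/0")  # placeholder broker URL

def fetch(url):
    # plain blocking request; the pool provides the parallelism
    return requests.get(url, timeout=10).text

@app.task
def fetch_all(urls):
    # most of the task's time is spent waiting on the network, so threads
    # are enough; the pool size caps how many sockets one task opens
    with ThreadPoolExecutor(max_workers=20) as pool:
        return list(pool.map(fetch, urls))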
Jack Mungergold - Wed, 02 Nov 2016 23:10:00 EST 2i397rxA No.36276 Reply
If you're asking a question like this, I would tell you to stay away from the multiprocessing module. Eventlet (green threads) is probably what you want. If you're worried about the thread count or network resources, mock it up and try to run the server into the ground before you over-engineer anything. The more realistic concern is probably rate limiting against third-party services, and I'll bet you can handle that more reliably with Redis than with what Celery gives you out of the box.
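
Roughly what the eventlet suggestion looks like, assuming the usual monkey-patch-then-GreenPool pattern; the URL list, the fetch helper, and the pool size of 20 are made-up placeholders:

import eventlet
eventlet.monkey_patch()  # patch sockets as early as possible, before other imports

import requests

urls = ["https://example.com/item/%d" % i for i in range(100)]  # placeholder URLs

def fetch(url):
    # blocks the green thread on I/O without tying up an OS thread
    return requests.get(url, timeout=10).text

pool = eventlet.GreenPool(size=20)  # cap concurrent outbound requests
results = list(pool.imap(fetch, urls))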
Ebenezer Mavingshit - Thu, 03 Nov 2016 08:59:50 EST gEtE4wNA No.36277 Reply
>>36276
The main reason I'm using celery is that the requests are gonna be coming in through a mobile app from users who are potentially on shoddy connections, so requiring the client to keep an HTTP request open for >30 seconds doesn't seem very reliable. So once I get the request, I just hand it off to celery, respond right away with a task ID, and the app polls for the result every x seconds.

I'm looking into gevent and eventlet right now; sounds like it's what I need. It should give me some concurrency without the complications of spawning multiple threads on each celery worker. Then if I can speed things up enough, I might not even need celery at all.
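
For completeness, the dispatch-and-poll flow described above usually ends up looking something like this sketch; the route paths, the fan_out_requests task, and the tasks module are hypothetical, not taken from this thread:

from celery.result import AsyncResult
from flask import Flask, jsonify

from tasks import celery_app, fan_out_requests  # hypothetical module holding the Celery app and task

app = Flask(__name__)

@app.route("/jobs", methods=["POST"])
def start_job():
    # kick off the slow work and answer immediately with a task ID
    task = fan_out_requests.delay()
    return jsonify({"task_id": task.id}), 202

@app.route("/jobs/<task_id>")
def job_status(task_id):
    # the mobile app polls this every few seconds
    result = AsyncResult(task_id, app=celery_app)
    if result.ready():
        return jsonify({"state": result.state, "result": result.get()})
    return jsonify({"state": result.state}), 202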
