Stream: general

Topic: Async download threads

Paul Faria (Aug 26 2019 at 17:53, on Zulip):

Async question here: I'm trying to download a very large number of files from a remote server. I've already got async/await set up, but spawning a new tokio task per file clearly won't work because I'll run out of file handles (already confirmed this).

If I weren't using async/await, I'd just spawn a fixed number of download threads, use mpsc to send the downloads to those threads with a cycling index, and have each thread process its downloads sequentially.
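
For reference, the thread-based approach described above could be sketched roughly as follows, using only the standard library. This is a minimal sketch, not code from the thread: `fake_download` and `download_with_workers` are hypothetical names, and the "download" just returns the length of the URL string so the example stays self-contained.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical stand-in for a blocking download; returns a fake byte count.
fn fake_download(url: &str) -> usize {
    url.len()
}

/// Hands `urls` to `workers` threads round-robin (the "cycling index") and
/// returns one `(worker_id, url, size)` tuple per URL, in completion order.
fn download_with_workers(urls: Vec<String>, workers: usize) -> Vec<(usize, String, usize)> {
    assert!(workers > 0, "need at least one worker");
    let (result_tx, result_rx) = mpsc::channel();
    let mut senders = Vec::with_capacity(workers);
    let mut handles = Vec::with_capacity(workers);

    for id in 0..workers {
        // Each worker owns its own queue and drains it sequentially.
        let (tx, rx) = mpsc::channel::<String>();
        let result_tx = result_tx.clone();
        senders.push(tx);
        handles.push(thread::spawn(move || {
            for url in rx {
                let size = fake_download(&url);
                result_tx.send((id, url, size)).expect("collector hung up");
            }
        }));
    }
    // Drop the original sender so the result channel closes once workers finish.
    drop(result_tx);

    for (i, url) in urls.into_iter().enumerate() {
        senders[i % workers].send(url).expect("worker hung up");
    }
    // Closing the queues lets each worker's loop end.
    drop(senders);

    let results: Vec<_> = result_rx.iter().collect();
    for handle in handles {
        handle.join().expect("worker panicked");
    }
    results
}
```

Calling `download_with_workers(urls, 3)` never has more than three downloads in progress, which is the property the question wants to keep when moving to async/await.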

With tokio and async/await, it's not clear how to use their existing thread pool to do something similar.

Paul Faria (Aug 26 2019 at 17:53, on Zulip):

Do I need to run a runtime in each thread or is there a way to leverage the existing thread pool in a clean way?

Taylor Cramer (Aug 26 2019 at 18:07, on Zulip):

To clarify, you're trying to limit the number of files that are downloaded at the same time?

Paul Faria (Aug 26 2019 at 19:46, on Zulip):

Yes (sorry for the really slow response)

Taylor Cramer (Aug 26 2019 at 20:40, on Zulip):

no problem!

Taylor Cramer (Aug 26 2019 at 20:41, on Zulip):

You might consider using the .buffered combinator on streams, or the limit argument of for_each_concurrent, to cap the number of running futures actively pulling things from the remote server

Taylor Cramer (Aug 26 2019 at 20:42, on Zulip):

something like stream::iter(vec![thing1, thing2, thing3]).for_each_concurrent(MAX_CONCURRENT_FILES, |thing| async move { ... /* get your file info here */ ... })

Paul Faria (Aug 26 2019 at 20:47, on Zulip):

Ok, I was also looking at ThreadPool in futures while waiting for your response, but I guess that's not the right idea. I saw that futures-cpupool could process multiple requests while limiting IO polling to the main thread (I should have looked at the actual impl). I wasn't sure if ThreadPool was the same idea. I'll code up the for_each_concurrent now. Thanks!

Last update: Jun 04 2020 at 17:40UTC