When to Use Parallel.ForEachAsync vs. Task.WhenAll?

#net8

#concurrency

#async

2025-01-29

👋 Introduction

If you’ve worked with async programming in .NET, you’ve probably come across Parallel.ForEachAsync and Task.WhenAll. At first glance, they seem to do the same thing — helping you run multiple tasks concurrently. But if they serve the same purpose, why do both exist? The reality is, they behave quite differently under the hood, and picking the wrong one for your use case could lead to unexpected performance issues — or even deadlocks.

⏱️ Benchmarking

Preparation

Initially, I planned to use Task.CompletedTask as the workload so that scheduling overhead would stay out of the measurement. Scheduling is not really deterministic, since it depends on many environmental factors, and I didn’t want to measure that part. Then I realized this approach wouldn’t work with Parallel.ForEachAsync, because it always schedules its work no matter what. To work around that, I created a FakeParallel class, which is basically the same as Parallel but without the scheduling logic. That let me measure only the cost of invoking and awaiting the tasks, without the interference of scheduling nuances.
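The source of FakeParallel isn’t shown here, so the following is only a hypothetical sketch of what such a helper might look like. It assumes the same shape as Parallel.ForEachAsync (a source sequence plus a body returning ValueTask) and simply invokes the body inline instead of handing it to worker tasks. The name FakeForEachAsync and its signature are my own. Because the benchmark bodies return already-completed tasks, awaiting each one in turn never suspends, so the sketch makes no attempt to overlap pending work.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for FakeParallel (the real class is not shown in the post).
// The body is invoked inline on the calling thread; nothing is queued to the scheduler.
static async ValueTask FakeForEachAsync<T>(
    IEnumerable<T> source,
    Func<T, CancellationToken, ValueTask> body,
    CancellationToken cancellationToken = default)
{
    foreach (var item in source)
    {
        cancellationToken.ThrowIfCancellationRequested();
        await body(item, cancellationToken); // completes synchronously for completed tasks
    }
}

// Usage: a body that returns an already-completed ValueTask, as in the benchmark idea.
var visited = new List<int>();
await FakeForEachAsync(Enumerable.Range(0, 5), (i, _) =>
{
    visited.Add(i);
    return ValueTask.CompletedTask;
});

Console.WriteLine(visited.Count); // 5
```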

Results

The results were clear: for 25 tasks, Task.WhenAll completed in about 360 ns, while the FakeParallel version of Parallel.ForEachAsync took about 3,197 ns, and the gap stays at roughly seven to nine times across all sizes. This makes sense because Task.WhenAll simply awaits already-started tasks. Even with the scheduling logic stripped out, the Parallel.ForEachAsync code path does more work per item, and the real method adds scheduling overhead on top. In terms of memory, Task.WhenAll allocates less for a small number of tasks (304 B vs. 392 B at 25), but its allocations grow with the task count (2,504 B at 300), while the Parallel.ForEachAsync version stays flat at 392 B, so it scales much better. It’s also worth mentioning that I experimented with other kinds of input besides arrays, such as an IEnumerable built with .Append() or produced by yield return, but all of them were slower and allocated considerably more.

Method	count	Mean	Error	StdDev	Gen0	Allocated
WhenAllUsingArray	2	58.48 ns	1.182 ns	1.048 ns	0.0286	120 B
WhenAllUsingArray	5	98.48 ns	1.275 ns	1.130 ns	0.0343	144 B
WhenAllUsingArray	10	165.26 ns	2.890 ns	2.562 ns	0.0439	184 B
WhenAllUsingArray	25	360.30 ns	3.814 ns	2.978 ns	0.0725	304 B
WhenAllUsingArray	300	4,104.64 ns	57.244 ns	47.801 ns	0.5951	2504 B
FakeParallelForEachWithUnlimitedDegreeOfParallelism	2	411.32 ns	5.423 ns	4.807 ns	0.0935	392 B
FakeParallelForEachWithUnlimitedDegreeOfParallelism	5	801.97 ns	5.680 ns	4.743 ns	0.0935	392 B
FakeParallelForEachWithUnlimitedDegreeOfParallelism	10	1,493.07 ns	12.411 ns	10.364 ns	0.0935	392 B
FakeParallelForEachWithUnlimitedDegreeOfParallelism	25	3,197.74 ns	26.647 ns	23.622 ns	0.0916	392 B
FakeParallelForEachWithUnlimitedDegreeOfParallelism	300	35,937.42 ns	402.805 ns	336.361 ns	0.0610	392 B
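
The Allocated column can be read as a simple linear trend. As a rough check, the sketch below tests whether the WhenAllUsingArray rows fit a model of 104 B fixed cost plus 8 B per task, and where that model meets the flat 392 B of the FakeParallel rows. The model is my own reading of the table, not something the benchmark reports.

```csharp
using System;

// Figures copied from the table above: (task count, bytes allocated by WhenAllUsingArray).
(int Count, int Bytes)[] whenAll = { (2, 120), (5, 144), (10, 184), (25, 304), (300, 2504) };
const int fakeParallelBytes = 392; // identical in every FakeParallel row

// Assumed linear model: 104 B fixed cost + 8 B per task.
static int Model(int count) => 104 + 8 * count;

bool fitsAllRows = true;
foreach (var (count, bytes) in whenAll)
    fitsAllRows &= Model(count) == bytes;

// Task count at which the model reaches the FakeParallel figure.
int breakEven = (fakeParallelBytes - 104) / 8;

Console.WriteLine($"{fitsAllRows} {breakEven}"); // True 36
```

Under that model, Task.WhenAll allocates less below 36 tasks and more above it, which matches the 25-task and 300-task rows.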

Conclusion

This was unexpected at first because both methods seem to serve the same purpose. However, during the preparation phase, it became clear that there are fundamental differences. The key realization is that Parallel.ForEachAsync always schedules the work itself, because what it receives is never started: the provided Func<> is only a delegate, and nothing runs until Parallel.ForEachAsync invokes it. Task.WhenAll, on the other hand, never starts anything. It only waits on tasks that have already been started elsewhere, so it adds no scheduling overhead of its own.
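
The difference between already-started tasks and an unstarted delegate can be seen in a small sketch. WorkAsync is a hypothetical unit of work that counts how many times it has been entered; the counter is read before and after each call.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

int started = 0;

// Hypothetical unit of work: records that it has started, then waits briefly.
async Task WorkAsync(int id)
{
    Interlocked.Increment(ref started);
    await Task.Delay(10);
}

// Building the array already calls WorkAsync three times, so all three tasks
// are in flight before Task.WhenAll is reached.
Task[] hot = Enumerable.Range(0, 3).Select(id => WorkAsync(id)).ToArray();
int startedBeforeWhenAll = started;
await Task.WhenAll(hot);

// Parallel.ForEachAsync gets plain items plus a delegate; nothing new has run
// until the method itself invokes the delegate.
int startedBeforeForEach = started;
await Parallel.ForEachAsync(Enumerable.Range(0, 3),
    async (id, ct) => await WorkAsync(id));
int startedAfterForEach = started;

Console.WriteLine($"{startedBeforeWhenAll} {startedBeforeForEach} {startedAfterForEach}"); // 3 3 6
```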

Another important distinction is that the input to Parallel.ForEachAsync is not a collection of tasks. It takes any sequence of items plus a delegate to run for each one, and ParallelOptions.MaxDegreeOfParallelism caps how many run at once. That makes it the more flexible tool when you need controlled execution and throttling.
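
A minimal sketch of that throttling, assuming a made-up workload of ten integers and a 20 ms delay standing in for real I/O. The running and peak counters exist only to observe how many bodies overlap.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

int running = 0;      // bodies currently executing
int peak = 0;         // highest value of `running` observed
object gate = new();  // protects the two counters

var options = new ParallelOptions { MaxDegreeOfParallelism = 2 };

// The source is a plain sequence of ints, not a collection of tasks.
await Parallel.ForEachAsync(Enumerable.Range(0, 10), options, async (item, ct) =>
{
    lock (gate) { running++; peak = Math.Max(peak, running); }
    await Task.Delay(20, ct); // stands in for real I/O
    lock (gate) { running--; }
});

Console.WriteLine(peak <= 2); // True
```

With Task.WhenAll, the same cap would have to be built by hand, for example with a SemaphoreSlim around each task.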

📋 Summary

So, what’s the takeaway? If you need to start and control a batch of operations (especially when they are not tasks yet), the methods of the Parallel class are the way to go. If your tasks are already running and you just need to wait for them, Task.WhenAll is the better choice. Performance-wise, Task.WhenAll is faster because it does less work. But if you need proper scheduling and throttling, Parallel.ForEachAsync is worth the tradeoff. Understanding these nuances can save you from unexpected performance issues and even prevent your app from deadlocking!