(EFCore Best Practices) Buffering and Streaming of Resultset

A query can first store the entire resultset into memory, and then use a foreach loop to process each item. Alternatively, it can stream each result into a foreach loop one by one. We examine both these possibilities in this tutorial.
(Rev. 31-Oct-2024)

Parveen,

Buffering into Memory

A query that evaluates to a List or Array buffers the entire resultset into memory.

Consider this code, which uses the ToArray method to obtain an array of items. This causes the entire resultset to be buffered into memory. It might be faster and more efficient if the count of items is small.

But if the number of items is large and each item is heavy, buffering carries a memory overhead. Memory requirements grow with the size of the resultset, which can lead to a fall in performance.


// ToArray (like ToList) runs the query and 
// buffers the entire resultset in memory 
var items = _ctx.Items.ToArray();

foreach (var item in items)
{
  // ... 
}

So you should be aware of this possible pitfall when you use the ToList and ToArray methods.
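
The buffering behaviour can be observed without a database. The following is a minimal sketch, assuming a hypothetical ReadRows iterator that stands in for the database cursor (it is not an EF Core API) and a log that records when each row is read and when it is processed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BufferingDemo
{
    // Hypothetical stand-in for a database cursor: yields rows lazily
    // and records the moment each row is produced.
    public static IEnumerable<int> ReadRows(int count, List<string> log)
    {
        for (var i = 1; i <= count; i++)
        {
            log.Add($"read {i}");
            yield return i;
        }
    }

    // Buffers every row with ToArray before processing any of them.
    public static List<string> RunBuffered(int count)
    {
        var log = new List<string>();

        // The whole resultset is pulled into memory on this line.
        var rows = ReadRows(count, log).ToArray();

        foreach (var row in rows)
        {
            log.Add($"process {row}");
        }

        return log;
    }
}
```

Calling BufferingDemo.RunBuffered(3) produces a log in which all three "read" entries come before the first "process" entry, which is the memory cost described above: every row exists in memory before the loop starts.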

Streaming a Resultset

If your resultset is large, then streaming will be a better option.

Consider this code that will stream the items one by one.


var query = _ctx.Items.Where(p => p.Price > 100);

// items are streamed one at a time 
// because we have not used ToList/ToArray 
foreach (var item in query)
{
  // ... 
}

The query variable holds an IQueryable built with the Where operator. The query is not executed until the foreach loop starts enumerating it.

The foreach loop streams the items one by one, so only the current item needs to be held in memory. This is very efficient if the number of items is very large.

Streaming has a fixed overhead that is paid whether we stream a single item or thousands of them, so it is not an every-time solution. It is more suitable if your query is expected to return many items. If your query is expected to return only a few items, buffering would be better.
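
The difference in ordering can be seen with plain LINQ. This is a rough sketch under the same kind of assumption as before: ReadRows is a hypothetical iterator standing in for the database cursor, not part of EF Core, and the log records the order of events.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class StreamingDemo
{
    // Hypothetical stand-in for a database cursor: yields rows lazily
    // and records the moment each row is produced.
    public static IEnumerable<int> ReadRows(int count, List<string> log)
    {
        for (var i = 1; i <= count; i++)
        {
            log.Add($"read {i}");
            yield return i;
        }
    }

    // Streams rows: each row is read only when the loop asks for it.
    public static List<string> RunStreamed(int count)
    {
        var log = new List<string>();

        // Nothing is read yet; Where only describes the query.
        var query = ReadRows(count, log).Where(r => r > 0);

        foreach (var row in query)
        {
            log.Add($"process {row}");
        }

        return log;
    }
}
```

Calling StreamingDemo.RunStreamed(3) produces a log in which "read" and "process" entries alternate: each row is processed before the next one is fetched, so the full resultset is never held in memory at once.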

Chained LINQ Operators

Sometimes your query is built with a chain of operators. It could be more efficient to use AsEnumerable to stream each item before it is processed by subsequent operators. Let's take an example.

var result = _ctx.Items.Where(p => p.Price > 100)
.AsEnumerable() // stream from here on, instead of ToList 
.Where(m => myfunction(m));

In this query, a subset of items is first obtained by using a Where operator. A function called myfunction then has to be called on each of those items. Thus, we have a chain of two Where operators.

There might be a memory overhead if we extract all the results of the first Where operator into a list and then call the next Where operator on the list thus obtained.

It might, in fact, be optimal to use AsEnumerable to first obtain a stream, and then call myfunction on each item as it arrives.
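
The two ways of chaining can be compared side by side. The sketch below is illustrative only and makes two assumptions: ReadRows is a hypothetical iterator standing in for the results of the first Where operator, and MyFunction is a placeholder predicate that keeps even rows. Neither comes from EF Core.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ChainDemo
{
    // Hypothetical stand-in for the rows returned by the first Where.
    public static IEnumerable<int> ReadRows(int count, List<string> log)
    {
        for (var i = 1; i <= count; i++)
        {
            log.Add($"read {i}");
            yield return i;
        }
    }

    // Placeholder for myfunction: logs the call and keeps even rows.
    public static bool MyFunction(int row, List<string> log)
    {
        log.Add($"check {row}");
        return row % 2 == 0;
    }

    // AsEnumerable keeps the pipeline lazy: read one row, check it, repeat.
    public static List<string> RunWithAsEnumerable(int count)
    {
        var log = new List<string>();
        var result = ReadRows(count, log)
            .AsEnumerable()
            .Where(r => MyFunction(r, log));

        foreach (var row in result)
        {
            log.Add($"keep {row}");
        }

        return log;
    }

    // ToList buffers every row first, then the second Where runs over the list.
    public static List<string> RunWithToList(int count)
    {
        var log = new List<string>();
        var result = ReadRows(count, log)
            .ToList()
            .Where(r => MyFunction(r, log));

        foreach (var row in result)
        {
            log.Add($"keep {row}");
        }

        return log;
    }
}
```

With two rows, RunWithAsEnumerable interleaves reading and checking, while RunWithToList reads both rows into a list before the first check. The final results are the same; only the intermediate memory use differs.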

These are subtle things that you can keep in mind. Thanks!


This Blog Post/Article "(EFCore Best Practices) Buffering and Streaming of Resultset" by Parveen is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.