Downloading large files with HttpClient and you see that it takes lots of memory space? This post is probably for you. Let's see how to efficiently streaming large HTTP responses with HttpClient.
@ 05-11-2014
by Tugberk Ugurlu


I see common scenarios where people need to download large files (images, PDF files, etc.) on their .NET projects. What I mean by large files here is probably not what you think. It should be enough to call it large if it’s 500 KB as you will hit a memory limit once you try to download lots of files concurrently in a wrong way as below:

static async Task HttpGetForLargeFileInWrongWay()
{
    using (HttpClient client = new HttpClient())
    {
        const string url = "https://github.com/tugberkugurlu/ASPNETWebAPISamples/archive/master.zip";
        using (HttpResponseMessage response = await client.GetAsync(url))
        using (Stream streamToReadFrom = await response.Content.ReadAsStreamAsync())
        {
            string fileToWriteTo = Path.GetTempFileName();
            using (Stream streamToWriteTo = File.Open(fileToWriteTo, FileMode.Create))
            {
                await streamToReadFrom.CopyToAsync(streamToWriteTo);
            }

            response.Content = null;
        }
    }
}

By calling GetAsync method directly there, we are loading every single byte into memory. You can see this happening in a simple way by opening the Task Manager and observing the memory of the process.

2

We are calling ReadAsStreamAsync on HttpContent after the GetAsync method is completed. This will just get us the MemoryStream, so there is no point there:

Screenshot 2014-05-11 15.18.14

We need a way not to load the response body into memory and have the raw network stream so that we can pass the bytes into another stream without hitting the memory too hard. We can do it by just reading the headers of the response and then getting a handle for the network stream as below:

static async Task HttpGetForLargeFileInRightWay()
{
    using (HttpClient client = new HttpClient())
    {
        const string url = "https://github.com/tugberkugurlu/ASPNETWebAPISamples/archive/master.zip";
        using (HttpResponseMessage response = await client.GetAsync(url, HttpCompletionOption.ResponseHeadersRead))
        using (Stream streamToReadFrom = await response.Content.ReadAsStreamAsync())
        {
            string fileToWriteTo = Path.GetTempFileName();
            using (Stream streamToWriteTo = File.Open(fileToWriteTo, FileMode.Create))
            {
                await streamToReadFrom.CopyToAsync(streamToWriteTo);
            }
        }
    }
}

Notice that we are calling another overload of the GetAsync method by passing the HttpCompletionOption enumeration value as ResponseHeadersRead. This switch tells the HttpClient not to buffer the response. In other words, it will just read the headers and return the control back. This means that the HttpContent is not ready at the time when you get the control back. Afterwards, we are getting the stream and calling the CopyToAsync method on it by passing our FileStream. The result is much better:

3

Resources




Hi πŸ™‹πŸ»β€β™‚οΈ I'm Tugberk Ugurlu.
Coder πŸ‘¨πŸ»β€πŸ’», Speaker πŸ—£, Author πŸ“š, Microsoft MVP πŸ•Έ, Blogger πŸ’», Software Engineering at Deliveroo πŸ•πŸœπŸŒ―, F1 fan πŸŽπŸš€, Loves travelling πŸ›«πŸ›¬
Lives in Cambridge, UK 🏑