A C# Developer's First Thoughts on MongoDB

2014-04-12 14:22
Tugberk Ugurlu


After working with RavenDB over the past year, I have just started looking into MongoDB. I worked with MongoDB a year or so ago on a small project, but my knowledge had gone rusty and I don't want that to happen again :) So here I am, documenting my second thoughts :) The TL;DR: I'm loving it, but the lack of transactions sometimes leaves me adrift on a vast dark sea. It's OK, though. The advantages generally outweigh this disadvantage.

Loved the Mongo Shell

The first thing I liked about MongoDB is its shell (Mongo Shell).

It makes it insanely easy to get used to MongoDB. After running mongod.exe, I fired up a command prompt, navigated to the directory containing mongo.exe and entered the mongo shell. The mongo shell runs pure JavaScript code. That's right! Anything you know about JavaScript is completely valid inside the mongo shell. Let's see a few things that you can do with it.

You can list the databases on your server: show dbs

You can see which database you are connected to: db

You can switch to a different database: use <database name here>

You can see the collections inside the database you are on: show collections

You can save a document inside a collection: db.users.save({ _id: "tugberk", userName: "Tugberk" })

You can list the documents inside a collection: db.users.find().pretty()

You can run a for loop:

for(var i = 0; i < 10; i++) { 
	db.users.save({ 
		_id: "tugberk" + i.toString(), 
		userName: "Tugberk" + i.toString() 
	}) 
}

You can run the code inside a js file

saveCount.js contains the following code and it just gets the count of documents inside the users collection and logs it inside another collection:

(function() {
     var myDb = db.getSiblingDB('myDatabase'),
         usersCount = myDb.users.find().count();
         
     myDb.countLogs.save({
          count: usersCount,
          createdOn: new Date()
     });
}());

All of those and more can be done using the mongo shell. It's a full-blown MongoDB client and probably the best one. I don't know if it's just me, but this type of shell is a big missing feature in RavenDB.

Loved the Updates

Update operations in MongoDB are just fabulous. You can construct many kinds of updates, and the engine lets you do this fairly easily. The one I found most useful is the increment update. It lets you increment a field, and the operation is performed with concurrency in mind: the server applies the increment to the document atomically, so concurrent writers don't overwrite each other's changes:

db.books.update(
   { item: "Divine Comedy" },
   {
      $inc: { stock: 5 }
   }
)

The above query will safely increment the stock field by 5.
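To see why an atomic increment matters, here is a minimal in-process analogy in C#. It is only a sketch: StockCounter and its method are hypothetical names, and Interlocked.Add merely stands in for what the server does with $inc. It does not talk to MongoDB at all.

```csharp
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper, used only to illustrate atomic increments.
public static class StockCounter
{
    // Applies `writers` concurrent increments of `amount` to a shared counter.
    // Each increment is a single atomic operation, so none of them is lost.
    public static int IncrementAtomically(int writers, int amount)
    {
        int stock = 0;
        Parallel.For(0, writers, _ => Interlocked.Add(ref stock, amount));
        return stock;
    }
}
```

With a naive read-then-write (stock = stock + amount), some of the concurrent increments could be lost. That is the kind of problem an increment update protects you from on the server.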

Not much Love for the .NET Client

MongoDB has an official .NET client, but the MongoDB folks decided to call it the "C# driver". This is strange because it works with any other .NET language as well. I have to say that the MongoDB .NET client is not so great in my opinion. Coming from the RavenDB .NET client, using the MongoDB .NET client just feels uncomfortable (however, I'm fairly sure that I'd love its Node.js client, as it would feel very natural).

First of all, it doesn't support asynchronous requests to the MongoDB server. All TCP requests are made synchronously. Also, there is no embedded server support. RavenDB has this and it makes testing a joy. Let's look at the code below, which gets an instance of a database:

MongoClient client = new MongoClient("mongodb://localhost");
MongoServer server = client.GetServer();
MongoDatabase db = server.GetDatabase("mongodemo");

There is too much noise going on here. For example, what is the GetServer method doing there? Instead, I would want to see something like the following:

MongoClient client = new MongoClient("mongodb://localhost");
using(var session = client.OpenSession("myDatabase"))
{
     // work with the session here...
}

Looks familiar :) I bet it does! Other than the above issues, creating map/reduce jobs feels weird as well, because MongoDB performs map/reduce operations in JavaScript and you end up writing JavaScript inside C# strings:

var map =
    "function() {" +
    "    for (var key in this) {" +
    "        emit(key, { count : 1 });" +
    "    }" +
    "}";

var reduce =
    "function(key, emits) {" +
    "    total = 0;" +
    "    for (var i in emits) {" +
    "        total += emits[i].count;" +
    "    }" +
    "    return { count : total };" +
    "}";

var mr = collection.MapReduce(map, reduce);
foreach (var document in mr.GetResults()) {
    Console.WriteLine(document.ToJson());
}

The above code is directly taken from the MongoDB documentation.

Explore Yourself

MongoDB has nice documentation and you can explore it yourself. Besides that, Pluralsight has a pretty nice course on MongoDB: Introduction to MongoDB by Nuri Halperin. Also, don't miss Ben's post on the comparison of Map-Reduce in MongoDB and RavenDB.

Simple OAuth Server: Implementing a Simple OAuth Server with Katana OAuth Authorization Server Components (Part 1)

2014-04-01 14:30
Tugberk Ugurlu


In my previous post, I emphasized a few important facts on my journey of building an OAuth authorization server. As great people say: "Talk is cheap. Show me the code." It is exactly what I'm trying to do in this blog post. Also, this post is the first one in the "Simple OAuth Server" series.

What are We Trying to Solve Here?

What we want to achieve by the end of the next two blog posts is actually very doable. We want to have a console application which calls our protected web service endpoints in a delegated manner, which means that the client will access the resources on behalf of a user (in other words, the resource owner). However, we won't be accessing the web service with the resource owner's credentials (username and password). Instead, we will use the credentials to obtain an access token through the Resource Owner Password Credentials Grant and use that token to access the resources from that point on. After this blog post, we will expand our needs and build on top of our existing solution in the upcoming posts. That's why this post goes into a bit of detail about how you could set up the project, and why it only covers building the OAuth server part.

Building the Application Infrastructure

I'll start by creating the ASP.NET Web API application. As mentioned, our application will evolve over time with the upcoming posts, so this post only covers the minimum requirements. Bear this in mind. I used the project templates provided in Visual Studio 2013 to create the project. For the content of this blog post, we only need the ASP.NET Web API components.

Screenshot 2014-03-31 11.19.58 

Screenshot 2014-03-31 11.22.19

At the time of writing this post, Visual Studio 2013 shipped with the old ASP.NET Web API bits and it's worth updating the package before we continue:

Screenshot 2014-03-31 11.26.09

The OAuth authorization server and the ASP.NET Web API endpoints will be hosted inside the same host in our application here. In a production application, you probably wouldn't want to do this, but for our demo purposes this is simpler.

Now we are ready to build on top of the project template. The first thing we need is a membership storage system. Nothing would be better than the new ASP.NET Identity components. I will use the official Entity Framework port of ASP.NET Identity for our application here. However, you are free to choose your own data storage engine. Scott Allen has a great blog post about the extensibility of ASP.NET Identity, and he listed the available open source projects which provide additional storage options for ASP.NET Identity, such as AspNet.Identity.RavenDB.

Screenshot 2014-03-31 11.36.12

There are two more packages that you need to install. One of them is Microsoft.AspNet.Identity.Owin. This package provides several useful extensions you will use while working with ASP.NET Identity on top of OWIN. The other one is the Microsoft.Owin.Host.SystemWeb package, which enables OWIN-based applications to run on IIS using the ASP.NET request pipeline.

The package we just installed (Microsoft.AspNet.Identity.Owin) also brought down some other packages as its dependencies. One of those is Microsoft.Owin.Security.OAuth, the core package that includes the components to support any standard OAuth 2.0 flow. I just wanted to highlight this fact as it is an important part of the project.

I will create the Entity Framework DbContext which will hold membership and OAuth client data. The ASP.NET Identity Entity Framework package already has a DbContext implementation for the membership storage, and our context class will derive from it.

public class OAuthDbContext : IdentityDbContext
{
    public OAuthDbContext()
        : base("OAuthDbContext")
    {
    }

    public DbSet<Client> Clients { get; set; }
}

The OAuthDbContext class is derived from the IdentityDbContext class, as you can see. Also notice that we have another DbSet property for clients, which will hold the client information. The Client class is shown below:

public class Client
{
    public string Id { get; set; }
    public string Name { get; set; }
    public string ClientSecretHash { get; set; }
    public OAuthGrant AllowedGrant { get; set; }

    public DateTimeOffset CreatedOn { get; set; }
}

This is the minimum that we need from a client to register it in our authorization server. For certain grants the client doesn't need to have a secret, but for the "Resource Owner Password Credentials Grant" it's mandatory. The client is also allowed only one grant. This restriction is not part of the OAuth 2.0 specification, but it's the recommended approach. OAuthGrant is an enum and has the following values:

public enum OAuthGrant
{
    Code = 1,
    Implicit = 2,
    ResourceOwner = 3,
    Client = 4
}

These are all we need for now and we are ready to create the database. I will use Entity Framework Migrations feature to stand up the database and seed some data for demo purposes. As a one time process, I need to enable migrations first by running the "Enable-Migrations" command from the Package Manager Console.

Screenshot 2014-03-31 14.39.34

I will run another command, Add-Migration, to add the migration code that maps my context to a database schema:

Screenshot 2014-03-31 14.41.44

The Enable-Migrations command created an internal class called Configuration, which contains a Seed method. I can use that Seed method to inject some data during the database creation process:

protected override void Seed(SimpleOAuthSample.Models.OAuthDbContext context)
{
    context.Clients.AddOrUpdate(
        client => client.Name,
        new Client
        {
            Id = "42ff5dad3c274c97a3a7c3d44b67bb42",
            Name = "Demo Resource Owner Password Credentials Grant Client",
            ClientSecretHash = new PasswordHasher().HashPassword("client123456"),
            AllowedGrant = OAuthGrant.ResourceOwner,
            CreatedOn = DateTimeOffset.UtcNow
        });

    context.Users.AddOrUpdate(
        user => user.UserName,
        new IdentityUser("Tugberk")
        {
            Id = Guid.NewGuid().ToString("N"),
            PasswordHash = new PasswordHasher().HashPassword("user123456"),
            SecurityStamp = Guid.NewGuid().ToString(),
            Email = "tugberk@example.com",
            EmailConfirmed = true
        });
}

Now, I will use the Update-Database command to create my database:

Screenshot 2014-03-31 17.04.38

This command just created the database with the seed data on my SQL Express:

image

We will interact with our database mostly through the UserManager class which ASP.NET Identity core library provides. However, we will still use the OAuthDbContext directly. To use those classes efficiently, we need to write some setup code. I'll do this inside the OWIN Startup class:

public class Startup
{
    public void Configuration(IAppBuilder app)
    {
        app.CreatePerOwinContext<OAuthDbContext>(() => new OAuthDbContext());
        app.CreatePerOwinContext<UserManager<IdentityUser>>(CreateManager);
    }

    private static UserManager<IdentityUser> CreateManager(
        IdentityFactoryOptions<UserManager<IdentityUser>> options,
        IOwinContext context)
    {
        var userStore =
            new UserStore<IdentityUser>(context.Get<OAuthDbContext>());

        var manager =
            new UserManager<IdentityUser>(userStore);

        return manager;
    }
}

This is the minimum code that we can write to use the UserManager class inside our OWIN components efficiently. Although I'm not a fan of this approach, I chose to do it this way since doing it my way would complicate the post.

OAuth Authorization Server Application with Katana OAuthAuthorizationServerMiddleware

Here we come to the real meat of the post. I will now set up the OAuth 2.0 token endpoint to support Resource Owner Password Credentials Grant by using the OAuthAuthorizationServerMiddleware which comes with the Microsoft.Owin.Security.OAuth library. There is a shorthand extension method on IAppBuilder to use this middleware: UseOAuthAuthorizationServer. I will use this extension method to configure my OAuth 2.0 endpoints through the Configuration method of my Startup class:

public void Configuration(IAppBuilder app)
{
    //... 
	
    app.UseOAuthAuthorizationServer(new OAuthAuthorizationServerOptions
    {
        TokenEndpointPath = new PathString("/oauth/token"),
        Provider = new MyOAuthAuthorizationServerProvider(),
        AccessTokenExpireTimeSpan = TimeSpan.FromMinutes(30),
#if DEBUG
        AllowInsecureHttp = true,
#endif
    });
}

I'm passing an instance of OAuthAuthorizationServerOptions here and setting a few of its properties. Everything is pretty much self-explanatory except for the Provider property. I'm assigning an implementation of IOAuthAuthorizationServerProvider to the Provider property to handle the request at specific points. Fortunately, I didn't have to implement this interface from top to bottom, as there is a default implementation of it (OAuthAuthorizationServerProvider) and I just needed to override the methods that I needed.

Spare some time to read the documentation of the OAuthAuthorizationServerProvider's methods. Those are pretty detailed and should give you a great head start.

public class MyOAuthAuthorizationServerProvider : OAuthAuthorizationServerProvider
{
    public override async Task ValidateClientAuthentication(
        OAuthValidateClientAuthenticationContext context)
    {
        string clientId;
        string clientSecret;

        if (context.TryGetBasicCredentials(out clientId, out clientSecret))
        {
            UserManager<IdentityUser> userManager = 
                context.OwinContext.GetUserManager<UserManager<IdentityUser>>();
            OAuthDbContext dbContext = 
                context.OwinContext.Get<OAuthDbContext>();

            try
            {
                Client client = await dbContext
                    .Clients
                    .FirstOrDefaultAsync(clientEntity => clientEntity.Id == clientId);

                if (client != null &&
                    userManager.PasswordHasher.VerifyHashedPassword(
                        client.ClientSecretHash, clientSecret) == PasswordVerificationResult.Success)
                {
                    // Client has been verified.
                    context.OwinContext.Set<Client>("oauth:client", client);
                    context.Validated(clientId);
                }
                else
                {
                    // Client could not be validated.
                    context.SetError("invalid_client", "Client credentials are invalid.");
                    context.Rejected();
                }
            }
            catch
            {
                // Could not retrieve the client from the database.
                context.SetError("server_error");
                context.Rejected();
            }
        }
        else
        {
            // The client credentials could not be retrieved.
            context.SetError(
                "invalid_client", 
                "Client credentials could not be retrieved through the Authorization header.");

            context.Rejected();
        }
    }

    public override async Task GrantResourceOwnerCredentials(
        OAuthGrantResourceOwnerCredentialsContext context)
    {
        Client client = context.OwinContext.Get<Client>("oauth:client");
        if (client.AllowedGrant == OAuthGrant.ResourceOwner)
        {
            // Client flow matches the requested flow. Continue...
            UserManager<IdentityUser> userManager = 
                context.OwinContext.GetUserManager<UserManager<IdentityUser>>();

            IdentityUser user;
            try
            {
                user = await userManager.FindAsync(context.UserName, context.Password);
            }
            catch
            {
                // Could not retrieve the user.
                context.SetError("server_error");
                context.Rejected();

                // Return here so that we don't process further. Not ideal but needed to be done here.
                return;
            }

            if (user != null)
            {
                try
                {
                    // User is found. Signal this by calling context.Validated
                    ClaimsIdentity identity = await userManager.CreateIdentityAsync(
                        user, 
                        DefaultAuthenticationTypes.ExternalBearer);

                    context.Validated(identity);
                }
                catch
                {
                    // The ClaimsIdentity could not be created by the UserManager.
                    context.SetError("server_error");
                    context.Rejected();
                }
            }
            else
            {
                // The resource owner credentials are invalid or resource owner does not exist.
                context.SetError(
                    "invalid_grant", 
                    "The resource owner credentials are invalid or resource owner does not exist.");

                context.Rejected();
            }
        }
        else
        {
            // Client is not allowed for the 'Resource Owner Password Credentials Grant'.
            context.SetError(
                "unauthorized_client", 
                "Client is not allowed for the 'Resource Owner Password Credentials Grant'");

            context.Rejected();
        }
    }
}

In pretty much all the methods you implement, you are given a context object, and you can signal the validity of the request at any point by calling its Validated and Rejected methods with their provided signatures. I implemented two methods above (ValidateClientAuthentication and GrantResourceOwnerCredentials) and called Validated and Rejected at several points as I saw fit.

An HTTP POST request made to the "/oauth/token" endpoint with the grant_type parameter set to "password" will first arrive at the ValidateClientAuthentication method. This is the place where you should retrieve the client credentials and validate them. According to the OAuth 2.0 specification, the client credentials can also be sent as request parameters. However, I don't think this is such a good idea compared to sending the credentials through basic authentication. That's why I only tried to get them from the "Authorization" header. If the client credentials are valid, the request will continue. If not, it will not be processed further and the error response will be returned as described in the OAuth 2.0 specification.

If the client credentials are valid and the "grant_type" parameter is set to "password", the request will arrive at the GrantResourceOwnerCredentials method. Inside this method there are essentially three things we do:

  • Validate the client's allowed grant: I check whether it is set to ResourceOwner.
  • If the client's grant type is valid, validate the resource owner credentials.
  • If resource owner credentials are valid, generate a claims identity for the resource owner and pass it to the Validated method.

If all goes as expected, the middleware will issue the access token.

Calling the OAuth Token Endpoint and Getting the Access Token

Let's try out the pieces that we have built. As you saw previously, I seeded a sample client and a sample user during the database creation process. I will use that information to generate a valid OAuth 2.0 "Resource Owner Password Credentials Grant" request.

Request:

POST http://localhost:53523/oauth/token HTTP/1.1
User-Agent: Fiddler
Content-Type: application/x-www-form-urlencoded
Authorization: Basic NDJmZjVkYWQzYzI3NGM5N2EzYTdjM2Q0NGI2N2JiNDI6Y2xpZW50MTIzNDU2
Host: localhost:53523
Content-Length: 56

grant_type=password&username=Tugberk&password=user123456

Response:

HTTP/1.1 200 OK
Cache-Control: no-cache
Pragma: no-cache
Content-Length: 550
Content-Type: application/json;charset=UTF-8
Expires: -1
Server: Microsoft-IIS/8.0
X-SourceFiles: =?UTF-8?B?RDpcRHJvcGJveFxBcHBzXFNhbXBsZXNcQXNwTmV0SWRlbnRpdHlTYW1wbGVzXFNpbXBsZU9BdXRoU2FtcGxlXFNpbXBsZU9BdXRoU2FtcGxlXG9hdXRoXHRva2Vu?=
X-Powered-By: ASP.NET
Date: Tue, 01 Apr 2014 13:56:32 GMT

{"access_token":"ydbP24rMOATt7TK3dBCjluD2F5LcLkoX8ud39X135x0a1LEvOgsPf0ekm4Lyu2a06Rv_Z105GRZT_NoclgTTf7Slt5_WNfe68zOUq22j6MqW4Fh__Abzjm6I8otDzxvCJpt5d73R-Um6GwTui3LDbcOk5bH2BZuQLTJsNLknbLPu_FdpgkYfBodUoyPiFhv5-gNBEsfp4gCZYfdKtlhaK0wtloZiIzH1_sNPhBt9FavSfThM5BeoWkz8PFxkv_cOsOhOIzK66nSx7B2XL7K9aLqPSJLxus2ud8GBZyteSeFi26L9oX9do7MyCL1nXa8D9DRWfcIXiQi1v19AwyhoupP3L-k89xOK6_NTSzYOVhSMG9Juz8VYHWGkJeYTmekmnVkCvQe7KMQ6PceeUFJnA88TkiHNhai0hV8j012OUxPpUN5zRPJOU81XywSkQ7oKE0UsX3hQamgFrXV9eA-TSwZd4Qr-P9w6a82OM66Te9E","token_type":"bearer","expires_in":1799}

We successfully retrieved the response and it contains the JSON response body which includes the access token in the format described inside the OAuth 2.0 specification.
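If you would rather build the same request from .NET code than from Fiddler, the following is a minimal sketch using HttpClient types. TokenRequestFactory is a hypothetical helper name of mine, not part of Katana or the sample project. The request is only constructed here; sending it is left to an HttpClient instance of your own.

```csharp
using System;
using System.Collections.Generic;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;

// Hypothetical helper; the names are assumptions made for this sketch.
public static class TokenRequestFactory
{
    // Encodes "clientId:clientSecret" for the HTTP Basic authentication scheme.
    public static string EncodeBasicCredentials(string clientId, string clientSecret)
    {
        byte[] raw = Encoding.UTF8.GetBytes(clientId + ":" + clientSecret);
        return Convert.ToBase64String(raw);
    }

    // Builds (but does not send) a Resource Owner Password Credentials Grant request.
    public static HttpRequestMessage Create(
        Uri tokenEndpoint,
        string clientId,
        string clientSecret,
        string userName,
        string password)
    {
        var request = new HttpRequestMessage(HttpMethod.Post, tokenEndpoint);

        // Client credentials travel in the Authorization header.
        request.Headers.Authorization = new AuthenticationHeaderValue(
            "Basic", EncodeBasicCredentials(clientId, clientSecret));

        // Resource owner credentials travel in the form-encoded body.
        request.Content = new FormUrlEncodedContent(new Dictionary<string, string>
        {
            { "grant_type", "password" },
            { "username", userName },
            { "password", password }
        });

        return request;
    }
}
```

Calling Create with the seeded client id and secret, plus the seeded user name and password, should produce the same Authorization header value as the raw request above.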

Summary and What is Next

In this post, we have set up our authorization server and we have a working OAuth 2.0 token endpoint which only supports the "Resource Owner Password Credentials Grant" for now. The code is available on GitHub if you are interested. In the next post, we will create our web service and protect it using our authorization server. We will also see how we can call this web service successfully from a typical .NET application.

My Baby Steps to OAuth 2.0 Hell (or Should I Call It Heaven)

Securing our HTTP API endpoints is one of the biggest challenges we face when writing so-called modern applications, and this is where OAuth 2.0 comes in. In this post, I will highlight the things that I have found vital over the last couple of months while working on an OAuth 2.0 Server implementation in .NET Framework.
2014-03-16 13:05
Tugberk Ugurlu


Securing our HTTP API endpoints is one of the biggest challenges we face when writing so-called modern applications. There are multiple concerns that we need to cover when security is the issue, but for those modern applications the concerns are much bigger because you are no longer inside the trusted subsystem of your organization's domain. Dominick Baier, Mr. Identity, has covered this topic several times.

This is where OAuth 2.0 comes in. OAuth 2.0 is a specification that defines an authorization framework which enables you to give third-party applications limited access on behalf of the resource owner through one of the defined flows. I'm not going to duplicate RFC 6749 here, but I will highlight the things that I have found vital over the last couple of months while working on an OAuth 2.0 Server implementation in .NET Framework. So, this post is mostly a brain dump.

First of all, there are a few terms that you have to know before you dip your toes in the OAuth water:

  • Resource server: The server that holds the resource.
  • Resource owner: The user who owns the specific resources.
  • Authorization server: A server that authenticates the resource owner and issues access tokens to the client.
  • Client: The application that requests access to a resource on behalf of the resource owner.

There are more terms that you will see when you start reading the specification but the above list should be enough for this blog post. From this point on, I will go through a few bullet points which come in handy as a checklist when designing an OAuth 2.0 authorization server and an HTTP service backed by that authorization server.

Give Scope-based Permissions

The best thing that OAuth specifies is the distinction between the client and the application (resource server), and this gives you a chance to think about your structure again after you read the specification. The application is where you handle your business and expose certain data. The data exposed here may be tied to a resource owner (for example, a user's hotel bookings), but it doesn't have to be. It can also be something that your application deals with directly, without any user context (for example, the list of hotels).

You can probably see where I'm going with this. Each of your exposed HTTP endpoints needs a different level of access grant, permission, whatever you want to call it. OAuth 2.0 helps you handle this in a standard way as well: scopes. A scope defines a level of access that the client has acquired. In each OAuth flow, the client specifies which scopes it needs to obtain. Assume that your application handles data for a travel and tourism company (just like Expedia); the scopes would be similar to the list below:

  • Reading user hotel reservations
  • Changing user hotel reservations
  • Making hotel reservations on behalf of a user

This list can go on. Handling access to a resource based on scopes in your HTTP service layer gives you a natural way of dealing with the resources. The Thinktecture.IdentityModel library has a nice little authorization attribute for your ASP.NET Web API applications: ScopeAuthorizeAttribute. This attribute allows you to protect each HTTP service endpoint based on the required scopes.
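To make the idea concrete, here is a minimal sketch of the check that sits behind any scope-based authorization. It is not how ScopeAuthorizeAttribute is implemented. ScopeChecker and the scope names in the usage note are hypothetical. The only thing taken from the specification is that scopes are transmitted as a space-delimited list of case-sensitive strings.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper; an illustration of the check, not a library API.
public static class ScopeChecker
{
    // Returns true when every required scope is present in the granted scopes.
    // `grantedScopes` is the space-delimited scope string tied to the access token.
    public static bool HasScopes(string grantedScopes, params string[] requiredScopes)
    {
        var granted = new HashSet<string>(
            (grantedScopes ?? string.Empty).Split(
                new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries),
            StringComparer.Ordinal);

        return requiredScopes.All(scope => granted.Contains(scope));
    }
}
```

For example, an endpoint that changes a reservation could require a scope such as "reservations.write" and reject a token that was only granted "reservations.read".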

Every Endpoint Should Require Authentication

It is certainly a valid situation to have some endpoints which don't require resource owner context. For example, a client application may want to get a list of hotels in a region (going with my Expedia sample again). In that case, resource owner permission is not necessary. So, can those endpoints be exposed publicly without any type of authentication and authorization? Absolutely not! In such an environment, don't ever design your HTTP service this way. I have found that requiring an identity on each endpoint of your HTTP service is vital. For cases where resource owner context is not necessary, the client application can obtain an access token through the Client Credentials Grant and hit your HTTP service endpoints with that token.

DO Validate the Redirect URI

This is a must! You should always validate the redirect_uri parameter and match it against the client's pre-registered redirect URIs for Authorization Code Grant and Implicit Grant requests. Unless you do this, your authorization server will be very dangerous! You could potentially end up redirecting the access token anywhere.
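A minimal sketch of such a check could look like the following. RedirectUriValidator is a hypothetical name, and the strict exact-match comparison is my assumption about the safest default. Your server may deliberately allow something looser.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper; exact matching is an assumption of this sketch.
public static class RedirectUriValidator
{
    // Returns true only when the requested redirect_uri is exactly one of the
    // URIs the client registered up front. No prefix or wildcard matching.
    public static bool IsRegistered(
        string requestedRedirectUri,
        IEnumerable<string> registeredRedirectUris)
    {
        if (string.IsNullOrEmpty(requestedRedirectUri))
        {
            return false;
        }

        return registeredRedirectUris.Any(registered =>
            string.Equals(registered, requestedRedirectUri, StringComparison.Ordinal));
    }
}
```

Comparing whole strings keeps a request for something like "https://client.example.com/callback.evil.example" from slipping through a prefix check.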

Only One Grant Type + Client Credentials Grant for a Client

Besides the client credentials grant, only allow one grant type for a client. When you think about it, you should see that a client only needs one type of grant. A server based client application (a web application, for instance) will need Authorization Code Grant. An iPhone application will need Implicit Grant. So, on the way to issuing an access token, validate the grant type as well.
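The rule above fits in a few lines. This sketch reuses the shape of the OAuthGrant enum from the Simple OAuth Server post; GrantPolicy is a hypothetical name introduced only for illustration.

```csharp
// Same values as the OAuthGrant enum in the Simple OAuth Server post.
public enum OAuthGrant
{
    Code = 1,
    Implicit = 2,
    ResourceOwner = 3,
    Client = 4
}

// Hypothetical helper expressing "one grant type + client credentials grant".
public static class GrantPolicy
{
    // A client may always use the client credentials grant; beyond that,
    // only the single grant type it was registered with.
    public static bool IsGrantAllowed(OAuthGrant registeredGrant, OAuthGrant requestedGrant)
    {
        return requestedGrant == OAuthGrant.Client
            || requestedGrant == registeredGrant;
    }
}
```

The token endpoint would run a check like this before it looks at any resource owner credentials.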

Should I await on Task.FromResult Method Calls?

2014-02-24 21:14
Tugberk Ugurlu


The Task class has a static method called FromResult which returns an already completed Task object (in the RanToCompletion status). I have seen a few developers "await"ing Task.FromResult method calls, and this clearly indicates that there is a misunderstanding here. I'm hoping to clear the air a bit with this post.

What is the use of Task.FromResult method?

Imagine a situation where you are implementing an interface which has the following signature:

public interface IFileManager
{
     Task<IEnumerable<File>> GetFilesAsync();
}

Notice that the method is Task-returning, which lets the return value represent an ongoing operation and lets the consumer call the method asynchronously without blocking (of course, only if the underlying layer supports it). However, depending on the case, your operation may not be asynchronous. For example, you may just have the files inside an in-memory collection and want to return them from there. Or you may perform an I/O operation to retrieve the file list asynchronously from a particular data store the first time and cache the results, so that you can return them from the in-memory cache on subsequent calls. These are just some scenarios where you need to return a successfully completed Task object. Here is how you can achieve that without the help of the Task.FromResult method:

public class InMemoryFileManager : IFileManager
{
    IEnumerable<File> files = new List<File>
    {
        //...
    };

    public Task<IEnumerable<File>> GetFilesAsync()
    {
        var tcs = new TaskCompletionSource<IEnumerable<File>>();
        tcs.SetResult(files);

        return tcs.Task;
    }
}

Here we used TaskCompletionSource to produce a successfully completed Task object with the result. Therefore, the caller of the method will have the result immediately. This is what we had been doing until the introduction of .NET 4.5. If you are on .NET 4.5 or above, you can just use Task.FromResult to perform the same operation:

public class InMemoryFileManager : IFileManager
{
    IEnumerable<File> files = new List<File>
    {
        //...
    };

    public Task<IEnumerable<File>> GetFilesAsync()
    {
        return Task.FromResult<IEnumerable<File>>(files);
    }
}
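The caching scenario mentioned earlier can be sketched in the same way. Everything here is an assumption made for illustration: File is a bare stand-in type, SlowFileManager fakes the I/O with Task.Yield, and CachingFileManager is deliberately not thread-safe to keep the sketch short.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;

// Minimal stand-in for the File type used by the interface.
public class File
{
    public string Name { get; set; }
}

public interface IFileManager
{
    Task<IEnumerable<File>> GetFilesAsync();
}

// Hypothetical "real" implementation; Task.Yield stands in for actual I/O.
public class SlowFileManager : IFileManager
{
    public int Calls { get; private set; }

    public async Task<IEnumerable<File>> GetFilesAsync()
    {
        Calls++;
        await Task.Yield();
        return new List<File> { new File { Name = "a.txt" } };
    }
}

// Hypothetical decorator: asynchronous on the first call, completed Task afterwards.
public class CachingFileManager : IFileManager
{
    private readonly IFileManager inner;
    private IEnumerable<File> cachedFiles;

    public CachingFileManager(IFileManager inner)
    {
        this.inner = inner;
    }

    public Task<IEnumerable<File>> GetFilesAsync()
    {
        // Cache hit: nothing asynchronous to do, so hand back a completed Task.
        if (cachedFiles != null)
        {
            return Task.FromResult(cachedFiles);
        }

        // Cache miss: this path genuinely awaits, so it lives in an async method.
        return LoadAndCacheAsync();
    }

    private async Task<IEnumerable<File>> LoadAndCacheAsync()
    {
        cachedFiles = await inner.GetFilesAsync();
        return cachedFiles;
    }
}
```

Note that GetFilesAsync itself is not marked async: only the path that really awaits pays for a state machine.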

Should I await Task.FromResult method calls?

TL;DR version of the answer: absolutely not! If you find yourself needing to use Task.FromResult, it's clear that you are not performing any asynchronous operation. Therefore, just return the Task that Task.FromResult gives you. Is it dangerous to await it? Not really, but it's illogical and has a performance cost.

The long version of the answer is a bit more in depth. Let's first see what happens when you "await" a method which matches the pattern:

IEnumerable<File> files = await fileManager.GetFilesAsync();

This code will be read by the compiler as follows (in its simplest form):

var awaiter = fileManager.GetFilesAsync().GetAwaiter();
if (!awaiter.IsCompleted)
{
    // DO THE AWAIT/RETURN AND RESUME:
    // register a continuation, return to the caller and resume here later.
}

var files = awaiter.GetResult();

Here, we can see that if the awaited Task has already completed, all the await/resume work is skipped and the result is retrieved directly. Besides this, if you put the "async" keyword on a method, a bunch of code (including the state machine) is generated regardless of whether you use the await keyword inside the method or not. Keeping all these facts in mind, implementing IFileManager as below is going to cause nothing but overhead:

public class InMemoryFileManager : IFileManager
{
    IEnumerable<File> files = new List<File>
    {
        //...
    };

    public async Task<IEnumerable<File>> GetFilesAsync()
    {
        return await Task.FromResult<IEnumerable<File>>(files);
    }
}
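To see the two variants side by side, here is a minimal sketch. The class and method names are hypothetical and 42 is an arbitrary value; the point is only that both methods hand back a Task which is already completed by the time the call returns, while the second one also pays for a compiler-generated state machine.

```csharp
using System.Threading.Tasks;

public static class AwaitOverheadDemo
{
    // Returns the already completed Task directly: no async state machine is generated.
    public static Task<int> GetValueDirect()
    {
        return Task.FromResult(42);
    }

    // Wraps the same completed Task in an async state machine for no benefit.
    // Because the awaited Task is already completed, the method runs
    // synchronously to the end and still returns a completed Task.
    public static async Task<int> GetValueWithAwait()
    {
        return await Task.FromResult(42);
    }
}
```

From the caller's point of view the two are indistinguishable: IsCompleted is true on both returned tasks and both yield the same result. The difference is only the extra generated code and allocations behind the async version.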

So, don't ever think about "await"ing on Task.FromResult or I'll hunt you down in your sweet dreams :)

How and Where Concurrent Asynchronous I/O with ASP.NET Web API

When we have multiple uncorrelated I/O operations that need to be kicked off, we have quite a few ways to fire them off, and the way you choose makes a great deal of difference in a .NET server-side application. In this post, we will see how we can handle the different approaches in ASP.NET Web API.
2014-02-21 22:06
Tugberk Ugurlu


When we have multiple uncorrelated I/O operations that need to be kicked off, we have quite a few ways to fire them off, and the way you choose makes a great deal of difference in a .NET server-side application. Pablo Cibraro already has a great post on this topic (await, WhenAll, WaitAll, oh my!!) which I recommend you check out. In this article, I would like to touch on a few more points. Let's look at the options one by one. I will use a multiple HTTP request scenario here, consumed by an ASP.NET Web API application, but this is applicable to any sort of I/O operation (long-running database calls, file system operations, etc.).

We will have two different endpoints which we will hit to consume the data:

  • http://localhost:2700/api/cars/cheap
  • http://localhost:2700/api/cars/expensive

As we can infer from the URI, one of them will get us the cheap cars and the other one will get us the expensive ones. I created a separate ASP.NET Web API application to simulate these endpoints. Each one takes more than 500ms to complete and in our target ASP.NET Web API application, we will aggregate these two resources together and return the result. Sounds like a very common scenario.

Inside our target API controller, we have the following initial structure:

public class Car 
{
    public int Id { get; set; }
    public string Make { get; set; }
    public string Model { get; set; }
    public int Year { get; set; }
    public float Price { get; set; }
}

public class CarsController : BaseController 
{
    private static readonly string[] PayloadSources = new[] { 
        "http://localhost:2700/api/cars/cheap",
        "http://localhost:2700/api/cars/expensive"
    };

    private async Task<IEnumerable<Car>> GetCarsAsync(string uri) 
    {
        using (HttpClient client = new HttpClient()) 
        {
            var response = await client.GetAsync(uri).ConfigureAwait(false);
            var content = await response.Content
                .ReadAsAsync<IEnumerable<Car>>().ConfigureAwait(false);

            return content;
        }
    }

    private IEnumerable<Car> GetCars(string uri) 
    {
        using (WebClient client = new WebClient()) 
        {    
            string carsJson = client.DownloadString(uri);
            IEnumerable<Car> cars = JsonConvert
                .DeserializeObject<IEnumerable<Car>>(carsJson);
                
            return cars;
        }
    }
}

We have a Car class which represents the car object that we are going to deserialize from the JSON payload. Inside the controller, we have our list of endpoints and two private methods which are responsible for making HTTP GET requests against the specified URI. The GetCarsAsync method uses the System.Net.Http.HttpClient class, which was introduced with .NET 4.5, to make the HTTP calls asynchronously. With the new C# 5.0 asynchronous language features (a.k.a. the async modifier and the await operator), it is pretty straightforward to write asynchronous code, as you can see. Note that we called the ConfigureAwait method here, passing false for the continueOnCapturedContext parameter. Why we need to do this is quite a long topic, but briefly: one of the samples we are about to dig into would deadlock if we didn’t use this method.

To be able to measure the performance, we will use a small utility: the Apache HTTP server benchmarking tool (a.k.a. ab.exe). It comes with the Apache web server installation, but you don’t actually need to install the server. When you download the ZIP file for the installation and extract it, you will find ab.exe inside. Alternatively, you may use the Web Capacity Analysis Tool (WCAT) from the IIS team. It’s a lightweight HTTP load generation tool primarily designed to measure the performance of a web server within a controlled environment. However, WCAT is a bit hard to grasp and set up. That’s why we used ab.exe here for simple load tests.

Please note that the comparisons below are crude and are not a real benchmark. They are for demo purposes only and just illustrate the points we are looking at.

Synchronous and not In Parallel

First, we will look at the fully synchronous, not-in-parallel version of the code. This operation will block the running thread for the amount of time it takes to complete two network I/O operations. The code is very simple thanks to LINQ.

[HttpGet]
public IEnumerable<Car> AllCarsSync() {

    IEnumerable<Car> cars =
        PayloadSources.SelectMany(x => GetCars(x));

    return cars;
}

For a single request, we expect this to complete in about a second.

AllCarsSync

The result is not surprising. However, when you have multiple concurrent requests against this endpoint, you will see that the blocked threads become the bottleneck of your application. The following screenshot shows 200 requests sent to this endpoint in blocks of 50 concurrent requests.

AllCarsSync_200

The result is now worse, and we are paying the price for blocking threads during long-running I/O operations. You may think that running these in parallel will reduce the single request time, and you are not wrong, but this has its own caveats, which is the topic of our next section.

Synchronous and In Parallel

This option is almost never good for your application. With this option, you perform the I/O operations in parallel, and the request time is significantly reduced if you measure only a single request. However, in our sample case here, you will be consuming two threads instead of one to process the request, and you will block both of them while waiting for the HTTP requests to complete. Although this reduces the processing time of a single request, it consumes more resources, and you will see the overall request time increase as your request count increases. Let’s look at the code of the ASP.NET Web API controller action method.

[HttpGet]
public IEnumerable<Car> AllCarsInParallelSync() {

    IEnumerable<Car> cars = PayloadSources.AsParallel()
        .SelectMany(uri => GetCars(uri)).AsEnumerable();

    return cars;
}

We used the “Parallel LINQ (PLINQ)” feature of the .NET Framework here to process the HTTP requests in parallel. As you can see, it was just too easy; in fact, it was only one line of digestible code. I tend to see a relationship between the above code and tasty donuts. They all look tasty, but they will work as hard as possible to clog our carotid arteries. The same applies to the above code: it looks really sweet but can make our server application miserable. How so? Let’s send a request to this endpoint to start seeing how.

AllCarsInParallelSync

As you can see, the overall request time has been cut in half. This must be good, right? Not completely. As mentioned before, this is going to hurt us if too many requests come to this endpoint. Let’s simulate this with ab.exe and send 200 requests to this endpoint in blocks of 50 concurrent requests.

AllCarsInParallelSync_200

The overall performance is now significantly worse. So, where would this type of implementation make sense? If your server application has a small number of users (for example, an HTTP API which is consumed by internal applications within your small team), this type of implementation may be beneficial. However, as it’s now annoyingly simple to write asynchronous code with built-in language features, I’d suggest you choose our last option here: “Asynchronous and In Parallel (In a Non-Blocking Fashion)”.

Asynchronous and not In Parallel

Here, we won’t introduce any concurrent operations and we will go through each request one by one but in an asynchronous manner so that the processing thread will be freed up during the dead waiting period.

[HttpGet]
public async Task<IEnumerable<Car>> AllCarsAsync() {

    List<Car> carsResult = new List<Car>();
    foreach (var uri in PayloadSources) {

        IEnumerable<Car> cars = await GetCarsAsync(uri);
        carsResult.AddRange(cars);
    }

    return carsResult;
}

What we do here is quite simple: we iterate through the URI array and make the asynchronous HTTP call for each one. Notice that we were able to use the await keyword inside the foreach loop. This is all fine. The compiler will do the right thing and handle this for us. One thing to keep in mind is that the asynchronous operations won’t run in parallel here. So, we won’t see a difference when we send a single request to this endpoint, as we are going through the HTTP calls one by one.

AllCarsAsync

As expected, it took around a second. When we increase the number of requests and the concurrency level, we will see that the average request time still stays at around a second.

AllCarsAsync_200

This option is certainly better than the previous ones. However, we can still do better in certain cases where we have a limited number of concurrent I/O operations. The last option will look into this solution, but before moving on to that, we will look at one other option which should be avoided where possible.

Asynchronous and In Parallel (In a Blocking Fashion)

Among the options shown here, this is the worst one you can choose. When we have multiple Task-returning asynchronous methods in hand, we can wait for all of them to finish with the static WaitAll method on the Task class. This has several costs: you will be consuming the asynchronous operations in a blocking fashion, and if these asynchronous methods are not implemented correctly, you will end up with deadlocks. At the beginning of this article, we pointed out the usage of the ConfigureAwait method. It was there to prevent deadlocks here. You can learn more about this from the following blog post: Asynchronous .NET Client Libraries for Your HTTP API and Awareness of async/await's Bad Effects.

Let’s look at the code:

[HttpGet]
public IEnumerable<Car> AllCarsInParallelBlockingAsync() {
    
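    // Materialize the tasks once. Re-enumerating the lazy Select query
    // would call GetCarsAsync again and start a second set of requests.
    Task<IEnumerable<Car>>[] allTasks = 
        PayloadSources.Select(uri => GetCarsAsync(uri)).ToArray();

    Task.WaitAll(allTasks);
    return allTasks.SelectMany(task => task.Result);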
}

Let's send a request to this endpoint to see how it performs:

AllCarsInParallelBlockingAsync

It performed really badly, and it gets worse as soon as you increase the concurrency level:

AllCarsInParallelBlockingAsync_200

Never, ever think about implementing this solution. No further discussion is needed here in my opinion.

Asynchronous and In Parallel (In a Non-Blocking Fashion)

Finally, the best solution: Asynchronous and In Parallel (In a Non-Blocking Fashion). The code snippet below says it all, but to go through it quickly: we bundle the Tasks together and await the Task returned by the Task.WhenAll utility method. This performs the operations asynchronously and in parallel.

[HttpGet]
public async Task<IEnumerable<Car>> AllCarsInParallelNonBlockingAsync() {

    IEnumerable<Task<IEnumerable<Car>>> allTasks = PayloadSources.Select(uri => GetCarsAsync(uri));
    IEnumerable<Car>[] allResults = await Task.WhenAll(allTasks);

    return allResults.SelectMany(cars => cars);
}

If we make a request to the endpoint to execute this piece of code, the result will be similar to the previous one:

AllCarsInParallelNonBlockingAsync

However, when we make 50 concurrent requests 4 times, the result shines and lays out the advantages of asynchronous I/O handling:

AllCarsInParallelNonBlockingAsync_200
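The difference between awaiting one by one and awaiting everything together can also be reproduced without any HTTP endpoints. The following is a minimal sketch under stated assumptions, not the code that was measured above: ConcurrencyDemo, FetchAsync and MeasureAsync are hypothetical names, and Task.Delay stands in for the slow HTTP call.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

public static class ConcurrencyDemo
{
    // Hypothetical stand-in for GetCarsAsync: waits, then returns one item.
    private static async Task<IEnumerable<string>> FetchAsync(string source, int delayMs)
    {
        await Task.Delay(delayMs).ConfigureAwait(false);
        return new[] { source + "-car" };
    }

    // One operation at a time, in the style of AllCarsAsync.
    public static async Task<List<string>> SequentialAsync(string[] sources, int delayMs)
    {
        var results = new List<string>();
        foreach (var source in sources)
        {
            results.AddRange(await FetchAsync(source, delayMs).ConfigureAwait(false));
        }

        return results;
    }

    // All operations started up front and awaited together,
    // in the style of AllCarsInParallelNonBlockingAsync.
    public static async Task<List<string>> ConcurrentAsync(string[] sources, int delayMs)
    {
        IEnumerable<string>[] all = await Task.WhenAll(
            sources.Select(source => FetchAsync(source, delayMs))).ConfigureAwait(false);

        return all.SelectMany(items => items).ToList();
    }

    // Measures how long the given operation takes, in milliseconds.
    public static async Task<long> MeasureAsync(Func<Task> operation)
    {
        var stopwatch = Stopwatch.StartNew();
        await operation().ConfigureAwait(false);
        return stopwatch.ElapsedMilliseconds;
    }
}
```

Passing each method to MeasureAsync with two sources and a delay of a few hundred milliseconds should show the sequential version taking roughly the sum of the delays and the concurrent one roughly the longest single delay. Both return the items in the same order, because Task.WhenAll reports results in the order of the tasks it was given.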

Conclusion

At the very basic level, what we can take away from this article is this: do perform load tests against your server applications based on your estimated consumption rates if you have any sort of multiple I/O operations. Two of the above options are what you would want in the case of multiple I/O operations. One of them is "Asynchronous but not In Parallel", which is the safest option in my personal opinion, and the other is "Asynchronous and In Parallel (In a Non-Blocking Fashion)". The latter significantly reduces the request time, depending on the hardware and the number of I/O operations you have, but as our small benchmarking results showed, it may not be a good fit to run many concurrent asynchronous I/O operations inside one request just to reduce the time of a single request. The results will most probably be different under high load.
