Pulling an Old Article From the Coffin: SignalR with Redis Running on a Windows Azure Virtual Machine

A long time ago (at least 5 years), I contributed an article to the SignalR wiki about scaling SignalR with Redis. You can still find the article here. I also blogged about it here. However, over time, the pictures got lost there. I got a few requests from my readers to refresh those images and I was lucky enough to find them :) I decided to publish that article here so that I would have much better control over the content.
2018-08-08 14:32
Tugberk Ugurlu


A long time ago (at least 5 years), I contributed an article to the SignalR wiki about scaling a SignalR application with Redis. You can still find the article here. I also blogged about it here. However, over time, the pictures got lost there. I got a few requests from my readers to refresh those images and I was lucky enough to find them :) I decided to publish the article here so that I would have much better control over the content. So, here is the post :)

Please keep in mind that this is a really old post and lots of things have evolved since then. However, I do believe the concepts still resonate and it's valuable to show how to achieve this within a cloud provider's context.


SignalR with Redis Running on a Windows Azure Virtual Machine

This wiki article will walk you through running your SignalR application on multiple machines with Redis as your backplane, using Windows Azure Virtual Machines for scale-out scenarios.

Creating the Windows Azure Virtual Machines

First of all, we will spin up our virtual machines. We want two Windows Server 2008 R2 virtual machines for our SignalR application, and we will name them Web1-08R2 and Web2-08R2. We will install IIS on both of these servers and, at the end, load balance the requests on port 80.

Our third virtual machine will be another Windows Server 2008 R2 instance, dedicated to our Redis server. We will call this server Redis-08R2.

To spin up the VMs, go to the new Windows Azure Management Portal and hit the New icon at the bottom-right corner.

(screenshot)

Creating a virtual machine running Windows Server 2008 R2 is explained here in detail. We followed the same steps to create our first VM, named Web1-08R2.

The second VM requires a slightly different approach than the first one. Under the hood, every virtual machine is a cloud service instance, and we want to put our second VM (Web2-08R2) under the same cloud service our first web VM is running under. To do that, we follow the same steps as explained in the previously mentioned article, but when we reach the 3rd step in the creation wizard, we choose the Connect to existing Virtual Machine option this time and select the first VM we have just created.

(screenshot)

As the last step, we create our Redis VM, named Redis-08R2. We follow the same steps as we did when creating our second web VM (Web2-08R2).

Setting Up Redis as a Windows Service

To use Redis on a Windows machine, we went to the Redis on Windows prototype GitHub page, cloned the repository and followed the steps explained under the How to build Redis using Visual Studio section.

After you build the project, you will have all the files you need under the msvs\bin\release path as zip files. The redisbin.zip file contains the Redis server, the Redis command line interface and a few other tools. The rediswatcherbin.zip file contains the msi file to install Redis as a Windows service. Copy those zip files to your Redis VM and extract redisbin.zip under c:\redis\bin. Then follow these steps:

  • Currently, there is a bug in the RedisWatcher installer: if you don't have the Microsoft Visual C++ 2010 Redistributable Package installed on your machine, the service won't start. So, I installed it first.

  • Copy this redis.conf file and put it under the c:\redis\bin directory. Open it up and add a password by adding the following line:

    requirepass 1234567

    Take this note into consideration when you are setting up your Redis password:

    Warning: since Redis is pretty fast an outside user can try up to 150k passwords per second against a good box. This means that you should use a very strong password otherwise it will be very easy to break.

  • Then, extract rediswatcherbin.zip somewhere and run InstallWatcher.msi to install the service.

  • Navigate to the C:\Program Files (x86)\RedisWatcher directory. You will see a file named watcher.conf inside this directory. Open this file and replace its entire contents with the following text. The only difference here is that we are supplying the redis.conf file path for the server to use:

    exepath c:\redis\bin
    exename redis-server.exe
    
    {
     workingdir c:\redis\inst1
     runmode hidden
     saveout 1
     cmdparms c:\redis\bin\redis.conf
    }
    
  • Create a folder named inst1 under c:\redis because we have specified this folder as the working directory for our Redis instance.

  • When you search the Windows services in PowerShell, you will see that the RedisWatcherSvc service is installed.

(screenshot)

  • Run the following PowerShell command to start the service for the first time.

    (Get-Service -Name RedisWatcherSvc).Start()
    

Now we have a Redis server running on our VM. To test that it is actually running, open a command window under c:\redis\bin and run the following command (assuming you set your password to 1234567):

redis-cli -h localhost -p 6379 -a 1234567

Now you have a Redis client running.

(screenshot)

Ping the Redis server to see if you are really authenticated:

(screenshot)

Now we are nearly set. As the last step on our Redis server, we need to open up TCP port 6379 for external communication. You can do this from the Windows Firewall with Advanced Security window as explained here.

(screenshot)

Communicating Through Internal Endpoints Between Windows Azure Virtual Machines Under Same Cloud Service

When you are inside one of your web VMs, you can simply look up the Redis VM by hostname.

(screenshot)

The hostname will resolve to the DIP (Dynamic IP Address), which Windows Azure uses internally. We could easily configure public endpoints through the Windows Azure Management Portal, but in that case we would be opening Redis up to the whole world. Also, if we communicated with our Redis server through the VIP (Virtual IP Address), we would always go through the load balancer, which has its own additional cost.

So, we can easily connect to our Redis server from any other connected VM by hostname.

The SignalR Application with Redis

Our SignalR application will not be much different from a normal SignalR application, thanks to the SignalR.Redis project. All you need to do is add the SignalR.Redis NuGet package to your application and configure SignalR to use Redis as the message bus inside the Application_Start method in the Global.asax.cs file:

protected void Application_Start(object sender, EventArgs e)
{
    // Hook up redis
    string server = ConfigurationManager.AppSettings["redis.server"];
    string port = ConfigurationManager.AppSettings["redis.port"];
    string password = ConfigurationManager.AppSettings["redis.password"];

    GlobalHost.DependencyResolver.UseRedis(server, Int32.Parse(port), password, "SignalR.Redis.Sample");
}

For our demo, the appSettings section should look like this:

<appSettings>
    <add key="redis.server" value="Redis-08R2" />
    <add key="redis.port" value="6379" />
    <add key="redis.password" value="1234567" />
</appSettings>

I put the application under IIS on both web servers (Web1-08R2 and Web2-08R2) and configured them to run under a .NET Framework 4.0 integrated application pool.

For this demo, I am using the Redis.Sample chat application included inside the SignalR.Redis project.

Let's test them quickly before going public. I fired up both web applications on the servers and here is the result:

(screenshot)

Running perfectly! Let's open them up to the world.

Opening up Port 80 and Load Balancing the Requests

Our requirement here is to make our application reachable over HTTP and, at the same time, to load balance the requests between our two web servers.

To do that, we need to go to Windows Azure Management portal and set up the TCP endpoints for port 80.

First, we navigate to the dashboard of our Web1-08R2 VM and hit Endpoints from the dashboard menu:

(screenshot)

From there, hit the Add Endpoint icon at the bottom of the page:

(screenshot)

A wizard is going to appear on the screen:

(screenshot)

Click the right-arrow icon and go to the next step, which is the last one, where we will enter the port details:

(screenshot)

After that, our endpoint will be created:

(screenshot)

Follow the same steps for the Web2-08R2 VM as well and open the Add Endpoint wizard. This time, we will be able to select Load-balance traffic on an existing port. Choose the previously created port and continue:

(screenshot)

At the last step, enter the proper details and hit save:

(screenshot)

We will see our new endpoint being created, but this time the Load Balanced column indicates Yes.

(screenshot)

As we configured our web applications without a host name and they are exposed through port 80, we can directly reach our application through the URL or the Public Virtual IP Address (VIP) provided to us. When we run our application, we should see it running as below:

(screenshot)

No matter which server the request goes to, the message will be broadcast to every client because we are using Redis as a message bus.


Graph Depth-First Search (DFS)

A while ago, I wrote about Graphs and gave a few examples of their application to real world problems. In this post, I want to talk about one of the most common graph algorithms, depth-first search (DFS).
2018-07-28 12:32
Tugberk Ugurlu


A while ago, I wrote about Graphs and gave a few examples of their application to real world problems. I absolutely love graphs as they are so powerful for modeling the data behind several key computer science problems. In this post, I want to talk about one of the most common graph algorithms, depth-first search (DFS), and how and where it can be useful.

What is Depth-First Search (DFS)?

DFS is a specific algorithm for traversing and searching a graph data structure. Depending on the type of graph, the algorithm might differ. However, the idea is actually quite simple for a Directed Acyclic Graph (DAG):

  1. You start with a source vertex (let's call it "S").
  2. You visit the first neighbour vertex of that node (let's call it "N").
  3. You do the same for "N" and keep going until you end up at a leaf vertex "L" (a vertex that has no edges to other vertices).
  4. Then you visit the second neighbour of "L"'s parent vertex.
  5. You are done once you exhaust all the vertices.

I must admit that this is a slightly simplified version of the algorithm, even for a DAG. For instance, we didn't touch on the fact that we might end up visiting the same vertex multiple times if we don't account for it in our algorithm. There is a really good visualization of this algorithm here, where you can observe how it works visually through a logical graph representation.

Picture2
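The steps above can be sketched as a short recursive traversal over an adjacency list. The graph, vertex names and `dfs` helper below are illustrative sketches of mine, not from the original post; note the visited set, which is exactly the guard against revisiting vertices mentioned above:

```python
def dfs(graph, source, visited=None):
    """Depth-first traversal of a graph given as an adjacency list.

    `graph` maps each vertex to a list of its neighbours. The `visited`
    set prevents visiting the same vertex twice (e.g. when two vertices
    share a neighbour, as A and B share L below).
    """
    if visited is None:
        visited = set()
    visited.add(source)
    order = [source]
    for neighbour in graph.get(source, []):
        if neighbour not in visited:
            order.extend(dfs(graph, neighbour, visited))
    return order

# A small DAG: S has neighbours A and B; both A and B point to leaf L.
dag = {"S": ["A", "B"], "A": ["L"], "B": ["L"], "L": []}
print(dfs(dag, "S"))  # ['S', 'A', 'L', 'B']
```

Notice how the traversal dives all the way down to the leaf L before backing up to visit S's second neighbour B, matching steps 3 and 4 above.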

Application of Depth-First Search

There are various applications of DFS, used to solve particular problems such as Topological Sorting and detecting cycles in a graph. There are also occasions where DFS is used as part of another well-known algorithm to solve a real world problem. One example of that is Tarjan's Algorithm to find Strongly Connected Components.

This is also a good resource which lists different real world applications of DFS.
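To make one of those applications concrete, here is a sketch of DFS-based cycle detection in a directed graph. The three-colour bookkeeping and all names are my own illustration, not from the post: a vertex currently on the recursion stack is GREY, and reaching a GREY vertex again means we followed a back edge, i.e. found a cycle.

```python
def has_cycle(graph):
    """Return True if the directed graph (adjacency list) contains a cycle."""
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited / on stack / fully explored
    colour = {v: WHITE for v in graph}

    def visit(v):
        colour[v] = GREY
        for n in graph.get(v, []):
            if colour[n] == GREY:               # back edge -> cycle
                return True
            if colour[n] == WHITE and visit(n):
                return True
        colour[v] = BLACK                       # done with this subtree
        return False

    return any(colour[v] == WHITE and visit(v) for v in graph)

print(has_cycle({"A": ["B"], "B": ["C"], "C": ["A"]}))  # True
print(has_cycle({"A": ["B"], "B": ["C"], "C": []}))     # False
```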

Other Graph Traversal Algorithms

As you might guess, DFS is not the only known algorithm for traversing a graph data structure. Breadth-first search (BFS) is another well-known graph traversal algorithm with similar semantics to DFS, but instead of going deep on a vertex, it prefers to visit all the neighbours of the current vertex first. Bidirectional search is another traversal algorithm, mainly used to find the shortest path from an initial vertex to a goal vertex in a directed graph.
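To make the contrast with DFS concrete, here is a minimal BFS sketch (again with made-up graph and names): a queue replaces the recursion, so every neighbour of a vertex is visited before the traversal goes a level deeper.

```python
from collections import deque

def bfs(graph, source):
    """Breadth-first traversal: visit vertices level by level, nearest first."""
    visited = {source}
    order = []
    queue = deque([source])
    while queue:
        vertex = queue.popleft()
        order.append(vertex)
        for neighbour in graph.get(vertex, []):
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(neighbour)
    return order

graph = {"S": ["A", "B"], "A": ["L"], "B": ["L"], "L": []}
print(bfs(graph, "S"))  # ['S', 'A', 'B', 'L'] -- both neighbours before the leaf
```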

Setting up a MongoDB Replica Set with Docker and Connecting to It With a .NET Core App

Easily setting up realistic non-production (e.g. dev, test, QA, etc.) environments is really critical in order to reduce the feedback loop. In this blog post, I want to talk about how you can achieve this if your application relies on MongoDB Replica Set by showing you how to set it up with Docker for non-production environments.
2018-01-31 10:10
Tugberk Ugurlu


Easily setting up realistic non-production (e.g. dev, test, QA, etc.) environments is really critical in order to reduce the feedback loop. In this blog post, I want to talk about how you can achieve this if your application relies on MongoDB Replica Set by showing you how to set it up with Docker for non-production environments.

Hold on! I want to watch, not read!

I got you covered there! I have also recorded a ~5 minute video covering the content of this blog post, where I also walk you through the steps visually. If you find this option useful, let me know through the comments below and I can aim to do it more often :)

What are we trying to do here and why?

Picture2

 

If you have an application which works against a MongoDB database, it's very common to have a replica set in production. This approach ensures high availability of the data, especially for read scenarios. However, applications mostly end up working against a single MongoDB instance during development, because setting up a replica set in isolation is a tedious process. As mentioned at the beginning of the post, we want the process of developing or testing our software to reflect the production environment as much as possible. The reason is to catch unexpected behaviour which may only occur in a production environment. This is valuable because it allows us to reduce the feedback loop on those exceptional cases.

Docker makes this all easy!

This is where Docker enters the picture! Docker is a containerization technology which allows us to have a repeatable process for provisioning environments in a declarative way. It also gives us a try-and-tear-down model where we can experiment and easily start again from the initial state. Docker can also help us easily set up a MongoDB replica set. Within our Docker host, we can create a Docker network, which gives us isolated DNS resolution across containers. Then we can start creating the MongoDB containers. They will initially be unaware of each other. However, we can initialise the replication by connecting to one of the containers and running the replica set initialisation command. Finally, we can deploy our application container under the same Docker network.

Picture1

There are a handful of advantages to setting this up with Docker, and I want to specifically touch on some of them:

  • It can be automated easily. This is especially crucial for test environments which are provisioned on demand.
  • It's repeatable! The declarative nature of the Dockerfile makes it possible to end up with the same environment setup even if you run the scripts months after your initial setup.
  • Familiarity! Docker is a widely known and used tool for lots of other purposes, and familiarity with the tool is high. Of course, this may depend on your development environment.

Let’s make it work!

First of all, I need to create a Docker network. I can achieve this by running the "docker network create" command and giving it a unique name.

docker network create my-mongo-cluster

The next step is to create the MongoDB containers and start them. I can use the "docker run" command for this. MongoDB has an official image on Docker Hub, so I can reuse that to simplify the acquisition of MongoDB. For convenience, I will name the containers with a number suffix. Each container also needs to be tied to the network we previously created. Finally, I need to specify the name of the replica set for each container.

docker run --name mongo-node1 -d --net my-mongo-cluster mongo --replSet "rs0"

The first container is now created, and I need to run the same command to create two more MongoDB containers. The only difference is the container names.

docker run --name mongo-node2 -d --net my-mongo-cluster mongo --replSet "rs0"
docker run --name mongo-node3 -d --net my-mongo-cluster mongo --replSet "rs0"

I can see that all of my MongoDB containers are in the running state by executing the "docker ps" command.

Image

In order to form a replica set, I need to initialise the replication. I will do that by connecting to one of the containers through the “docker exec” command and starting the mongo shell client.

docker exec -it mongo-node1 mongo

Image

As I now have a connection to the server, I can initialise the replication. This requires me to declare a config object which will include connection details of all the servers.

config = {
      "_id" : "rs0",
      "members" : [
          {
              "_id" : 0,
              "host" : "mongo-node1:27017"
          },
          {
              "_id" : 1,
              "host" : "mongo-node2:27017"
          },
          {
              "_id" : 2,
              "host" : "mongo-node3:27017"
          }
      ]
  }

Finally, we can run the "rs.initiate(config)" command to complete the setup.

You will notice that the server I am connected to is shortly elected as the primary of the replica set. By running "rs.status()", I can view the status of the other MongoDB servers within the replica set. We can see that there are two secondaries and one primary in the replica set.

.NET Core Application

As a scenario, I want to run my .NET Core application which writes data to a MongoDB database and then reads it back in a loop. This application will connect to the MongoDB replica set we have just created. This is a standard .NET Core console application which you can create by running the following command:

dotnet new console

The csproj file for this application looks like below.

<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <TargetFramework>netcoreapp2.0</TargetFramework>
  </PropertyGroup>
  <ItemGroup>
    <PackageReference Include="Bogus" Version="18.0.2" />
    <PackageReference Include="MongoDB.Driver" Version="2.4.4" />
    <PackageReference Include="Polly" Version="5.3.1" />
  </ItemGroup>
</Project>

Notice that I have two interesting dependencies there. Polly is used to retry the read calls to MongoDB based on defined policies. This bit is interesting, as I would expect the MongoDB client to handle retries for read calls itself. However, it might also be a good way of explicitly stating which calls can be retried inside your application. Bogus, on the other hand, is just there to create fake names to make the application a bit more realistic :)

Finally, this is the code to make this application work:

partial class Program
{
    static void Main(string[] args)
    {
        var settings = new MongoClientSettings
        {
            Servers = new[]
            {
                new MongoServerAddress("mongo-node1", 27017),
                new MongoServerAddress("mongo-node2", 27017),
                new MongoServerAddress("mongo-node3", 27017)
            },
            ConnectionMode = ConnectionMode.ReplicaSet,
            ReplicaSetName = "rs0"
        };

        var client = new MongoClient(settings);
        var database = client.GetDatabase("mydatabase");
        var collection = database.GetCollection<User>("users");

        System.Console.WriteLine("Cluster Id: {0}", client.Cluster.ClusterId);
        client.Cluster.DescriptionChanged += (object sender, ClusterDescriptionChangedEventArgs foo) => 
        {
            System.Console.WriteLine("New Cluster Id: {0}", foo.NewClusterDescription.ClusterId);
        };

        for (int i = 0; i < 100; i++)
        {
            var user = new User { Id = ObjectId.GenerateNewId(), Name = new Bogus.Faker().Name.FullName() };
            collection.InsertOne(user);
        }

        while (true)
        {
            var randomUser = collection.GetRandom();
            Console.WriteLine(randomUser.Name);

            Thread.Sleep(500);
        }
    }
}

This is not the most beautiful or optimized code ever, but it should demonstrate what we are trying to achieve by having a replica set. It's actually the GetRandom extension method on the MongoDB collection object which handles the retry:

public static class CollectionExtensions 
{
    private readonly static Random random = new Random();

    public static T GetRandom<T>(this IMongoCollection<T> collection) 
    {
        var retryPolicy = Policy
            .Handle<MongoCommandException>()
            .Or<MongoConnectionException>()
            .WaitAndRetry(2, retryAttempt => 
                TimeSpan.FromSeconds(Math.Pow(2, retryAttempt)) 
            );

        return retryPolicy.Execute(() => GetRandomImpl(collection));
    }

    private static T GetRandomImpl<T>(this IMongoCollection<T> collection)
    {
        return collection.Find(FilterDefinition<T>.Empty)
            .Limit(-1)
            .Skip(random.Next(99))
            .First();
    }
}

I will run this through Docker as well, and here is the Dockerfile for it:

FROM microsoft/dotnet:2-sdk

COPY ./mongodb-replica-set.csproj /app/
WORKDIR /app/
RUN dotnet --info
RUN dotnet restore
ADD ./ /app/
RUN dotnet publish -c DEBUG -o out
ENTRYPOINT ["dotnet", "out/mongodb-replica-set.dll"]

When it starts, we can see that it will output the result to the console:

Image

Prove that It Works!

In order to demonstrate the effect of the replica set, I want to take down the primary node. First of all, we need to have a look at the output of the rs.status() command we previously ran in order to identify the primary node. We can see that it's node1!

Image

Secondly, we need to get the container id for that node. 

Image

Finally, we can kill the container by running the "docker stop" command. Once the container is stopped, you will notice that the application gracefully recovers and continues reading the data.

Image

Speaking at SQL in the City 2017, Register Now!

I'm quite happy to tell you that I'll be speaking at SQL in the City 2017 on the 13th of December about Latest SQL Compare features and support for SQL Server 2017 with my colleague and fellow MVP, Steve Jones.
2017-12-05 18:27
Tugberk Ugurlu


I'm quite happy to tell you that I'll be speaking at SQL in the City 2017 on the 13th of December about latest SQL Compare features and support for SQL Server 2017 with my colleague and fellow MVP, Steve Jones.

MVPs in SITCs

SQL in the City is Redgate's annual virtual event, and this year's livestream focuses on enabling you to be more productive. Technical sessions will dive into the latest Microsoft SQL Server releases, and cover topical issues such as data compliance, protection and privacy.

This year's agenda is full of really great sessions, from enabling DevOps for databases by automating your deployments, to rapid (and magic!) database provisioning with SQL Clone. This year, Data Platform MVPs Steve Jones and Grant Fritchey will be joined by Kathi Kellenberger, Editor of Simple Talk, and many more of the people behind the great tools we build!

Register Now!

Register now to confirm your attendance, and be the first to get access to Grant Fritchey’s new eBook, SQL Server Execution Plans, when it's released in 2018.

Understanding Graphs and Their Application on Software Systems

Lately, I wanted to spend a little bit of time going back to fundamental computer science concepts. I am going to start with Graphs, specifically Depth First Traversal (a.k.a. Depth First Search or DFS) and Breadth First Traversal (a.k.a. Breadth First Search or BFS). However, this post is only about the definition of a Graph and its application in software systems.
2017-09-19 17:21
Tugberk Ugurlu


Lately, I wanted to spend a little bit of time going back to fundamental computer science concepts. Hopefully, I will be able to write about these while I am looking into them, in order to offload the knowledge from my brain to the magic hands of the Web :) I am going to start with Graphs, specifically Depth First Traversal (a.k.a. Depth First Search or DFS) and Breadth First Traversal (a.k.a. Breadth First Search or BFS). However, this post is only about the definition of a Graph and its application in software systems.

What is a Graph?

I am sure you are capable of Googling what a graph is, and ironically maybe that's why you are reading this sentence now. However, I am not going to put a fancy explanation of a graph here. Wikipedia already has a great definition of a graph, which can be a useful starting point.

Let’s start with a picture:

Picture1

This is a graph and there are some unique characteristics of this which makes it a graph.

  • Vertices (a.k.a. Nodes): Each circle with a label inside the above picture is called a vertex or node. They are the fundamental building blocks of a graph.
  • Edges (a.k.a. Arc, Line, Link, Branch): A line that joins two vertices together is called an edge. An edge can come in two forms: undirected and directed. We will get to what these actually mean.

At this point you might be asking: what is the difference between a graph and a tree? A tree is actually a graph with some special constraints applied to it. A few of these that I know of:

  • A tree cannot contain a cycle, but a graph can (see the A, B and E nodes and their edges inside the above picture).
  • A tree always has a specific root node, whereas you don't have this concept with a graph.
  • A tree can only have one edge between two of its nodes, whereas we can have unidirectional and bidirectional edges between nodes within a graph.

I am sure there are more, but I believe these are the ones that matter the most.

As we can see from the tree example, graphs come in many forms. There are many types of graphs, and each type has its own unique characteristics and real world use cases. Undirected and directed graphs are two of these types, as I briefly mentioned while explaining edges. I believe the best example to describe the difference between them is to look at the fundamental concepts of Facebook and Twitter.
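The Facebook/Twitter distinction can be shown with two tiny adjacency lists in Python (the people and the data are made up for illustration): friendship is symmetric, so an undirected edge is stored in both directions, while a "follows" edge only goes one way.

```python
# Undirected graph (Facebook-style friendship): every edge goes both ways,
# so it appears in the adjacency list of both endpoints.
friendships = {
    "Alice": ["Bob"],
    "Bob": ["Alice", "Carol"],
    "Carol": ["Bob"],
}

# Directed graph (Twitter-style following): an edge has a direction.
follows = {
    "Alice": ["Bob"],   # Alice follows Bob...
    "Bob": ["Carol"],   # ...but Bob does not follow Alice back
    "Carol": [],
}

# The friendship relation is symmetric:
assert "Alice" in friendships["Bob"] and "Bob" in friendships["Alice"]
# The follow relation is not:
assert "Bob" in follows["Alice"] and "Alice" not in follows["Bob"]
```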

Application of Graphs

Graphs are amazing; I absolutely love the concept of a graph! Everyone interacts with systems every day which somehow make use of graphs. Facebook, Google Maps, Foursquare and the fraud check system that your bank applies all make use of a graph, and there are many, many more. One application of the graph concept which I love is a recommendation engine. There are many forms of this, but a very basic one is called Collaborative Filtering. At its core, it works on the notion of "Steve and Mark liked BMW, Mercedes and Toyota; you like BMW and Toyota, so you may like Mercedes, too".
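That "Steve and Mark" notion can be sketched in a few lines of Python. The data and the `recommend` helper are made-up illustrations of the idea, not a production recommender: we suggest items liked by users whose tastes overlap with ours.

```python
def recommend(likes, user):
    """Suggest items liked by users who share at least one like with `user`."""
    mine = likes[user]
    suggestions = set()
    for other, theirs in likes.items():
        if other != user and mine & theirs:   # overlapping taste
            suggestions |= theirs - mine      # what they like that we don't
    return suggestions

likes = {
    "Steve": {"BMW", "Mercedes", "Toyota"},
    "Mark": {"BMW", "Mercedes", "Toyota"},
    "You": {"BMW", "Toyota"},
}
print(recommend(likes, "You"))  # {'Mercedes'}
```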

There are some really good graph databases with their own query languages as well. One that I love is Neo4j, which uses the Cypher query language to make its data available for consumption. On their web site, there are a few key applications of Neo4j listed, and they are fundamentally real world applications of the graph concept.

You can also come across some interesting problems in mathematics which have solutions based on a type of graph, like the Seven Bridges of Königsberg problem (and I think this problem is the cornerstone of the history of graph theory).
