Understanding Graphs and Their Application on Software Systems
Lately, I wanted to spend a little bit time on going back to fundamental computer science concepts. Hopefully, I will be able to write about these while I am looking into them in order to offload the knowledge from my brain to the magic hands of the Web :) I am going to start with Graphs, specifically Depth First Traversal (a.k.a. Depth First Search or DFS) and Breadth First Traversal (a.k.a Breadth First Search or BFS). However, this post is only about the definition of Graph and its application in software systems.
What is a Graph?
I am sure you are capable of Googling what a Graph is and ironically maybe that’s why you are reading this sentence now. However, I am not going to put the fancy explanation of a Graph here. Wikipedia already has a great definition on a Graph which can be useful to start with.
Let’s start with a picture:
This is a graph and there are some unique characteristics of this which makes it a graph.
- Vertices (a.k.a. Nodes): Each circle with a label inside the above picture is called a vertex or node. They are fundamental building blocks of a graph.
- Edges (a.k.a. Arc, Line, Link, Branch): A line that joins two vertices together is called as edge. An edge could be in three forms: undirected and directed. We will get to what these actually mean.
At this point you might be asking what is the difference between a graph and a tree? A tree is actually a graph with some special constraints applied to. A few of these that I know:
- A tree cannot contain a cycle but a graph can (see the A, B and E nodes and their edges inside the above picture).
- A tree always has a specific root node, whereas you don’t have this concept with a graph.
- A tree can only has one edge between its two nodes whereas we can have unidirectional and bidirectional edges between nodes within a graph
I am sure there are more but I believe these are the ones that matter the most.
As we can see with the tree example, graphs comes in many forms. There are many types of graphs and each type has its own unique characteristics and real world use cases. Undirected and directed graphs are two of these types as I briefly mentioned while explaining the edges. I believe the best example to describe the difference between them is to have a look at the fundamental concept of Facebook and Twitter.
Application of Graphs
Graphs are amazing, I absolutely love the concept of a graph! Everyone interacts with a system everyday which somehow makes use of graphs. Facebook, Google Maps, Foursquare, the fraud check system that your bank applies are all making use of a graph and there are many, many more. One application of graph concept which I love is a recommendation engine. There are many forms of this but a very basis one is called Collaborative Filtering. At its basis, it works under a notion of “Steve and Mark liked BMW, Mercedes and Toyota, you like BMW and Toyota, and you may like Mercedes, too?”.
There are some really good graph databases with their own query languages as well. One that I love about is Neo4j which uses Cypher query language to make its data available to be consumed. On their web site, there are a few key applications of Neo4j listed and they are fundamentally real world applications of the graph concept.
You can also come across some interesting problems in the space of mathematics which has solutions based on a type of graph like Seven Bridges of Königsberg problem (and I think this problem is the cornerstone in the history of graph theory).