Evolution of Graph Computation and Machine Learning

Understanding graph computation, its relevance in ML, and the advancement from traditional graph engines to GNNs

Shaashwat Agrawal
Towards Data Science


“Why is the focus of Deep Learning shifting towards Graph Neural Networks?”, I wondered. Is it the data structure itself, or efficient computation? It all begins with the structure of graphs: how they capture relational problems and support distributed computing. The real evolution happens when that same structure is applied to state-of-the-art machine learning algorithms. From simple matrix factorization and linear regression to Graph Neural Networks, we can trace the evolution of the technology.

In this article, we will go through graphs, their advantages, and how they are implemented in Machine Learning frameworks. We will also discuss their compatibility and evolution through time.

Graphs

Graphs are relational data structures that define information collectively: a compilation of nodes and links in a non-linear format. Real-world information like your LinkedIn and Facebook social networks, the Netflix movie catalog, Google Maps routes, and path-optimization problems is most naturally represented by graphs. Let us take the example of a family tree:

image by author

Each family member is a vertex (V) in the family tree graph (G), with relationships defined by edges (E). If information about a particular family member is to be extracted, then his/her relations must also be known; otherwise the data would seem incomplete. Each node and link holds its own significance and data. Also, the same graph can be represented in many different ways, e.g. that ancestral graph could also be drawn bottom-up with different link values.
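A minimal sketch of such a family tree as a graph, using a plain adjacency list (the names and relation labels here are my own illustrative assumptions, not from the figure):

```python
# Each key is a vertex (V); each list entry is an edge (E) to a
# neighbor, tagged with the relationship it represents.
family = {
    "Grandpa": [("Dad", "parent-of"), ("Aunt", "parent-of")],
    "Dad": [("Me", "parent-of"), ("Sister", "parent-of")],
    "Aunt": [("Cousin", "parent-of")],
}

def relations_of(graph, person):
    """Return the outgoing (neighbor, relation) edges for a node."""
    return graph.get(person, [])

print(relations_of(family, "Dad"))
# [('Me', 'parent-of'), ('Sister', 'parent-of')]
```

Extracting a member without following these edges would, as noted above, leave the data incomplete: the node's meaning lives in its links.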

Computation Graphs

Computation graphs are graphs that encode equations. They are directed graphs that represent a mathematical expression; common examples are postfix, infix, and prefix expression evaluation. Every node in the graph contains an operation, a variable, or a sub-expression. These graphs underlie most of the computation that takes place in computers.

image by author
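Such an expression graph can be sketched as a tiny recursive evaluator (a minimal, illustrative sketch; the tuple encoding of nodes is my own assumption):

```python
import operator

# Each internal node is (op, left_child, right_child); each leaf is a
# variable name. This graph encodes (x1 + x2) * x2.
OPS = {"+": operator.add, "*": operator.mul}

def evaluate(node, values):
    """Recursively evaluate a computation-graph node."""
    if isinstance(node, str):          # leaf: look up the variable
        return values[node]
    op, left, right = node             # internal node: apply the operation
    return OPS[op](evaluate(left, values), evaluate(right, values))

graph = ("*", ("+", "x1", "x2"), "x2")
print(evaluate(graph, {"x1": 2, "x2": 3}))  # 15
```

Deep learning frameworks build and differentiate far larger versions of exactly this kind of structure.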

Advantages of Graphs

Graphs provide a unique structure that maps onto many real-life problems. Unlike typical tables or matrices, element order is given little priority; what matters is how elements relate to one another, and those relations are the core of any hypothesis or prediction built on the data. Their advantages are:

  • Node-Link Structure - The distinct node-link structure of graphs can store a lot of information. Network-based and relational problems are most naturally represented in this format; although other structures like matrices and treemaps can encode graphs, the node-link composition remains primary.
  • Distributed Computing - Huge problems with billions of nodes/elements cannot be processed by a single core or system. Because distributed computing maps directly onto graphs, a lot of computation can be parallelized and overall running time reduced.
  • Relational Problems - Usually, we work with datasets that contain independent input values and their respective output labels. What if I wanted to predict a movie for myself based on my recent watches, favorite actors, music, and so on? This is a relational problem that is best solved through graphs. Even with unsupervised learning, clusters could be predicted, but not the exact label or connection. Let us briefly look at one such problem, Netflix movie prediction:
Photo by Thibault Penin on Unsplash

Imagine genre, actors, language, and release date as primary nodes of a graph, with movies linked to these nodes based on their labels. Depending on the movies I watch, my preferred attribute nodes are recorded. Netflix utilizes a Personalized Video Ranker (PVR) algorithm that predicts movies per genre or title from this stored graph data, and within each genre or title it applies a Top-N Video Ranker algorithm that mixes popular and personal choices to predict movies.
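To make the idea concrete, here is a toy sketch of that attribute-graph intuition; it is purely illustrative and not Netflix's actual PVR algorithm, and the movie and attribute names are invented:

```python
# Movies link to attribute nodes (genre, actor, language). Unseen
# movies are ranked by how many attribute nodes they share with the
# user's watch history.
movie_attrs = {
    "Movie A": {"sci-fi", "actor-1", "english"},
    "Movie B": {"sci-fi", "actor-2", "english"},
    "Movie C": {"drama", "actor-3", "hindi"},
}

def recommend(watched, catalog):
    """Pick the unwatched movie sharing the most attribute nodes."""
    liked = set().union(*(catalog[m] for m in watched))  # nodes reached from history
    scores = {m: len(catalog[m] & liked)
              for m in catalog if m not in watched}
    return max(scores, key=scores.get)

print(recommend({"Movie A"}, movie_attrs))  # Movie B
```

Real ranking systems combine many more signals, but the core graph idea is the same: recommendations follow links through shared attribute nodes.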

Graphs in ML

All neural networks are computation graphs, and even algorithms like linear regression can be represented in the form of graphs. The major difference between traditional graphs and neural networks is implementation: a neural network mimics a computation graph during training, but it cannot handle graph-like data directly. It needs structured (ordered) input in order to work.

Let us understand it in terms of forward propagation in neural networks.

image by author

Consider this to be a graph of 8 nodes and 16 links. The input neurons (nodes) x1 and x2 are densely connected to the hidden-layer nodes, which are in turn connected to the output layer. The values of x1 and x2 are passed to the hidden layer, which computes A = WX + B. An activation function is then applied to these values, H = f(A), before they travel along the links to the output layer, where a similar computation is performed. Overall, this graph is able to represent the equation for forward propagation in neural networks.
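The forward pass just described can be sketched in a few lines of NumPy (the layer sizes, weights, and choice of ReLU are my own assumptions for illustration):

```python
import numpy as np

# 2 inputs -> 4 hidden neurons -> 2 outputs (sizes assumed).
rng = np.random.default_rng(0)

X = np.array([[0.5], [0.2]])                 # inputs x1, x2 as a column vector
W1, b1 = rng.normal(size=(4, 2)), np.zeros((4, 1))
W2, b2 = rng.normal(size=(2, 4)), np.zeros((2, 1))

A = W1 @ X + b1                              # A = WX + B at the hidden layer
H = np.maximum(0, A)                         # H = f(A), here f = ReLU
output = W2 @ H + b2                         # the same process at the output layer

print(output.shape)  # (2, 1)
```

Each matrix multiplication corresponds to one set of links in the graph, and each activation to one layer of nodes.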

Evolution

This is the main section of the article. With the basics covered, we move on to why GNNs came into existence and how they differ from ANNs.

BACKGROUND:

Today, machine learning is present in many autonomous industries and provides state-of-the-art results for many organizations and research efforts. Distributed graph computation, in turn, benefits from efficient parallel computation, stable graph structures, and many real-life applications such as social networks, knowledge graphs, etc. Combining the two technologies promises huge benefits and new areas of research for better development and efficiency.

GRAPH ENGINE FRAMEWORKS:

Many attempts have been made to bridge the gap between graphs and ML algorithms, because graphs lack properties that are essential to training these algorithms. Lack of support for looping, heterogeneity and consistency in data, and data abstraction were the main concerns when combining graph computing and machine learning. Graph engine frameworks like TUX2 and GraphLab proposed models that countered some of these issues: they successfully combined distributed graph computation with matrix factorization and Latent Dirichlet Allocation, but failed to implement neural networks. Unlike deep learning frameworks, which are able to use GPUs for computation, these engines relied on distributed computing alone.

INTRODUCTION OF GNN (ANN vs GNN):

image from http://snap.stanford.edu/decagon/ representing the SNAP dataset using a GCN

Neural networks have substituted many static algorithms and lead the current ML industry. The market demands a graph-based technology with direct deep learning affiliation; the failure of traditional engines and their lack of GPU support led to the introduction of Graph Neural Networks.

Graph Neural Networks are an up-and-coming domain of deep learning that learns from graph data. With the introduction of graph convolutional networks, graph LSTM networks, etc., this domain has shown huge potential. These networks are graph structures themselves and utilize similar data for training. Graph datasets like CORA and SNAP are used for benchmarking them.

If Artificial neural networks are computation graphs then why do we require GNNs?

The answer can be confusing at times, so let us go back to the basics.

  • ANNs take input in a matrix format, i.e. more or less ordered data, whereas problems like social networks prioritize linkage over order. (Technically, an order can be derived from a graph by choosing a root node and a particular traversal through it.)
  • That ANNs are computation graphs simply means they are feed-forward mathematical expressions linked together. The dependency graphs of the two kinds of networks, as well as the data they work with, are different.
  • In terms of layers and functions, both networks contain dense layers, softmax, ReLU, etc., but the calculations and processing differ in each. A normal dense layer implies complete interconnection, but in GNNs that may not be the case.
  • Our traditional neural networks can solve two types of problems, classification and regression. They fail when it comes to social networks or knowledge graphs; these are relational problems that require direct graph input.
  • In an ANN, a model architecture is defined and fed with inputs and their respective outputs. The training and prediction of graph-based networks are somewhat unsupervised: if I want to predict the name of a particular LinkedIn member, I should be able to do it from his/her 1st- and 2nd-degree connections, company, and institute affiliations, without really knowing his/her actual label.
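The differences above come down to how a GNN layer computes. A common formulation (used by graph convolutional networks) is H' = f(Â H W), where Â is the adjacency matrix with self-loops, normalized by node degree, so each node aggregates its neighbors before a shared weight transform. A minimal sketch, with the 3-node graph, features, and weights all assumed for illustration:

```python
import numpy as np

# A tiny path graph: node 0 - node 1 - node 2.
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
A_hat = A + np.eye(3)                        # add self-loops
D_inv = np.diag(1.0 / A_hat.sum(axis=1))     # simple degree normalization
H = np.eye(3)                                # one-hot node features
W = np.full((3, 2), 0.1)                     # shared weights (3 -> 2 features)

# Aggregate each node's neighborhood, then transform and activate.
H_next = np.maximum(0, D_inv @ A_hat @ H @ W)
print(H_next.shape)  # (3, 2)
```

Unlike a dense ANN layer, which connects every input to every output, here the adjacency matrix Â dictates which nodes influence each other, so the data's own graph structure drives the computation.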

Conclusion

We have gone through the basics of graphs, their relevance in machine learning, and their evolution over time. The development of graph neural networks as they stand today is quite intuitive. How they came into existence, why we need them, and how they are unique are some of the questions answered in this article. In the future, I plan to write more detailed articles on GNNs and their implementation in Python. If you want to stay updated, do connect on GitHub and LinkedIn for new projects and articles.

References:
1. A Gentle Introduction to Graph Neural Networks (Basics, DeepWalk, and GraphSage)
2. Distributed Graph Computation Meets Machine Learning
3. Graphlab: A new framework for parallel machine learning
4. The Netflix Recommender system

