The words of Technology Enhanced Learning – an SNA view

Nicolas Balacheff and his team had a dataset of all the terms from the TEL Open Archive, which is a high-quality archive of articles in the area of TEL. One of their aims in this exercise was to find keywords for their TEL thesaurus.
The dataset consists of a directed graph of terms that have been extracted from the paper corpus of the TEL Open Archive.
For example, “open-learning” is linked to “learners” because “learners” is one of the top words appearing in the close context of “open-learning” (the close context being the 50 words before and the 50 words after the word). But “learners” is not linked back to “open-learning”, because other words appear far more often than “open-learning” does in the close context of “learners”.
I had a go at the dataset using the SNA tool Gephi and came up with some interesting findings.
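To make the direction of the links concrete, here is a minimal sketch of how such a graph could be built. It is not the team's actual extraction code: the function name `build_term_graph`, the `top_k` cut-off and the toy token list are my own assumptions, and the window is shrunk to one word so the example stays readable.

```python
from collections import Counter, defaultdict

def build_term_graph(tokens, window=50, top_k=2):
    """Link each term to the top_k words seen most often in its close context.

    Hypothetical helper: `window` is the number of words looked at before
    and after each occurrence, `top_k` is an assumed cut-off for "top words".
    """
    context_counts = defaultdict(Counter)
    for i, term in enumerate(tokens):
        start = max(0, i - window)
        end = min(len(tokens), i + window + 1)
        for j in range(start, end):
            # Skip the occurrence itself and repeats of the same term
            if j != i and tokens[j] != term:
                context_counts[term][tokens[j]] += 1
    # Directed edges: term -> each of its top_k context words
    return {term: [word for word, _ in counts.most_common(top_k)]
            for term, counts in context_counts.items()}

# Toy corpus (made up for illustration)
tokens = ["open-learning", "learners", "learners", "teachers",
          "learners", "teachers", "learners", "teachers"]
graph = build_term_graph(tokens, window=1, top_k=1)
```

In this toy corpus, `graph["open-learning"]` contains "learners", while `graph["learners"]` only contains "teachers", because "teachers" shows up far more often around "learners" than "open-learning" does. That is the asymmetry described above.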

In the first picture, I created a graph without community detection. I filtered out the “long tail” of words based on the degree of each node.
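I did the filtering in Gephi itself, but the idea can be sketched in a few lines of Python. The function name `filter_by_degree`, the threshold and the edge list below are assumptions for illustration only, and the degree is counted once on the unfiltered graph.

```python
from collections import Counter

def filter_by_degree(edges, min_degree):
    """Keep only edges whose two endpoints both have degree >= min_degree.

    The degree here is in-degree plus out-degree, counted on the full graph.
    """
    degree = Counter()
    for source, target in edges:
        degree[source] += 1  # contributes to the out-degree of source
        degree[target] += 1  # contributes to the in-degree of target
    keep = {node for node, d in degree.items() if d >= min_degree}
    return [(s, t) for s, t in edges if s in keep and t in keep]

# Made-up edge list: two well-connected terms and two "long tail" terms
edges = [("open-learning", "learners"), ("learners", "teachers"),
         ("teachers", "learners"), ("moodle", "learners")]
filtered = filter_by_degree(edges, min_degree=2)
```

With a threshold of 2, only the two edges between "learners" and "teachers" survive; "open-learning" and "moodle" each have a degree of 1 and drop out.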

In the next step, I applied a community detection algorithm. A community is a group of nodes that are densely connected to each other and only sparsely connected to nodes in other communities. Each community has its own colour. The size of the nodes is still based on the degree of the node.
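One common way to put a number on "dense inside, sparse between" is the modularity score of a partition. The sketch below only scores a given split into communities; it does not search for the best one, and it is not necessarily what Gephi computed for the figures here. It also treats the graph as undirected, which is a simplifying assumption, and the two-triangle example is made up.

```python
def modularity(edges, community_of):
    """Modularity Q of a partition of an undirected, unweighted graph.

    Q = sum over communities c of  L_c / m - (d_c / (2 * m)) ** 2
    where m is the number of edges, L_c the number of edges inside c,
    and d_c the sum of the degrees of the nodes in c.
    """
    m = len(edges)
    internal = {}    # L_c per community
    degree_sum = {}  # d_c per community
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(internal.get(c, 0) / m - (d / (2 * m)) ** 2
               for c, d in degree_sum.items())

# Two triangles joined by a single edge (c-d)
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]
two_groups = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
one_group = {node: 0 for node in two_groups}

q_split = modularity(edges, two_groups)
q_single = modularity(edges, one_group)
```

Splitting the toy graph into its two triangles gives a clearly positive score (5/14), while lumping everything into one community gives 0. A community detection algorithm looks for the split with a high score.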

In a last step, I changed the ranking for the node size from the node degree to the community degree of each node. Instead of using the degree (in/out-degree) for the size of the nodes, I used the weights of the nodes within each community. This is the difference between figures two and three, and it is also the reason for the different layout of the graph.
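One possible reading of "community degree" is the number of links a node has to members of its own community. The sketch below follows that reading; it is my assumption for illustration, not a description of Gephi's internal ranking, and the function name and toy graph are made up.

```python
def within_community_degree(edges, community_of):
    """Count, for each node, the edges that stay inside its own community."""
    size = {node: 0 for node in community_of}
    for u, v in edges:
        if community_of[u] == community_of[v]:
            size[u] += 1
            size[v] += 1
    return size

# Two triangles joined by a single edge (c-d)
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]
community_of = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
sizes = within_community_degree(edges, community_of)
```

Node "c" has three links in total but only two inside its own community, so under this ranking it is drawn at the same size as "a" and "b". Nodes that mostly bridge communities shrink, while nodes that are central to one topic stay large.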

Just by looking at the last graph, I would say that the connected words make a lot of sense, and some topics become apparent.