Visualising Connectivist Networks
I originally developed the Comment Collector as a ‘scraper’ program that scans the RSS feeds of blogs in connectivist MOOCs and generates brief, summarised versions of participant posts and their comments. The idea was to provide quick and constantly updated impressions of current MOOC activity (see ‘Pages’ menu above for details). I ran the Comment Collector twice daily for periods during several MOOCs, with encouragement from participants and facilitators. The Collector amasses a considerable amount of data during the course of a MOOC, which raises possibilities for visualising the network formed by connections between the authors of blog posts and those who comment on them. Twitter networks have been visualised in a variety of forms, notably by Aras Bozkurt and Martin Hawksey, and with some modification the networks formed by blogs and their commenters can be visualised in similar ways.
I have written a Python program for experimenting with blog and comment visualisation. The data generated by the Comment Collector is used to produce formatted output for display using the excellent Gephi network visualisation and exploration software. The example below illustrates visualisation of a network created by posts and comments from 70 WordPress and Blogger blogs based on an OPML file kindly supplied by Laura Gibbs. This visualisation corresponds to the last 10 days of the Rhizomatic Learning course (Rhizo15).
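As a rough illustration of what ‘formatted output’ for Gephi might look like, here is a minimal sketch that writes an edge table as CSV. It is not the program described above: the function name write_edge_table and the sample counts are hypothetical, and it assumes the commonly used Source, Target, Type and Weight column headers of Gephi's spreadsheet import.

```python
import csv
import io


def write_edge_table(edge_weights, out):
    """Write one CSV row per commenter -> blog link.

    edge_weights maps (commenter_label, blog_label) to a comment count.
    The column names are an assumption about Gephi's spreadsheet import.
    """
    writer = csv.writer(out)
    writer.writerow(["Source", "Target", "Type", "Weight"])
    for (commenter, blog), count in sorted(edge_weights.items()):
        writer.writerow([commenter, blog, "Directed", count])


# Hypothetical anonymised counts, not real Rhizo15 data.
# (31, 31) is a self-loop: a blog author commenting on their own blog.
sample = {(88, 31): 5, (82, 48): 1, (31, 31): 2}

buffer = io.StringIO()
write_edge_table(sample, buffer)
print(buffer.getvalue())
```

Writing to any file-like object keeps the sketch easy to try out. In practice the same function could be given an open file whose contents are then imported into Gephi.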
Names and Numbers – The real names of all blog post authors and commenters have been replaced by numerical labels – blog authors from 0 to 70, others from 71.
Nodes and Connections – Each node represents either a blog (yellow) with at least one post published in the time interval, or a commenter (pink, or yellow if the commenter is also a blog author who posted during the interval). A black line connecting a commenter to a blog represents comments made by the commenter on that blog. The more comments there are, the thicker the connecting line and the arrow head pointing at the blog’s node. A loop to and from the same blog node indicates comments by a blog author on their own blog.
Node Size – The size of a node is proportional to the number of connections made by the node, i.e. the total number of comments to or from the node.
Examples – Commenter 88 (top right) comments on blog 31. The thick connecting line and arrow head indicate several comments (actually 5).
Commenters 5, 11, 82 and 83 comment on blog 48 (left-hand side). The thin lines indicate a small number of comments (actually 1 each). Although 11 denotes a blog (< 71), the node is coloured pink as no posts were made by 11 during the time interval. The absence of a loop on 48 indicates that the blog author made no comments on his or her own blog.
The cluster around blog 40 (bottom left) represents blogs that have posted during the time interval but have not made or received comments. Their nodes appear yellow with no connections.
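The legend above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function summarise_network and its inputs are hypothetical, the data is a toy set loosely modelled on the examples, and a self-comment is assumed to count twice towards a node's size (once outgoing, once incoming).

```python
from collections import Counter


def summarise_network(comments, blogs_with_posts):
    """Derive edge weights, node sizes and node colours.

    comments is a list of (commenter, blog) pairs, one per comment.
    blogs_with_posts is the set of blogs that posted in the interval.
    """
    # Edge weight (line thickness): comments per commenter -> blog pair.
    edge_weights = Counter(comments)

    # Node size: total number of comments to or from the node.
    # Assumption: a self-comment adds to the node's size twice.
    sizes = Counter()
    for (commenter, blog), count in edge_weights.items():
        sizes[commenter] += count
        sizes[blog] += count

    # Blogs that posted but have no comments still appear, with size 0.
    nodes = set(blogs_with_posts) | set(sizes)
    sizes = {node: sizes[node] for node in nodes}

    # Yellow: a blog with at least one post in the interval; pink: otherwise.
    colours = {
        node: "yellow" if node in blogs_with_posts else "pink"
        for node in nodes
    }
    return edge_weights, sizes, colours


# Toy data: 88 comments five times on 31; 11, 82 and 83 once each on 48.
comments = [(88, 31)] * 5 + [(11, 48), (82, 48), (83, 48)]
blogs_with_posts = {31, 40, 48}

edge_weights, sizes, colours = summarise_network(comments, blogs_with_posts)
print(edge_weights[(88, 31)], sizes[48], sizes[40], colours[11], colours[40])
```

In this toy run, blog 40 ends up yellow with no connections, and node 11 is pink even though its label denotes a blog, because it is absent from blogs_with_posts.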
Scope – Data was collected only from WordPress and Blogger blogs with posts carrying the rhizo15 hashtag. These two popular platforms account for the majority of blogs, but several other forms of social media (Twitter, Google+, Facebook etc.) were also used in Rhizo15 and are common in connectivist networks.
Names of Participants – Accurate visualisation depends on accurate identification of names, particularly the names of commenters. If Fred Blogs is a blog author, his blog’s RSS feed will probably output his name consistently as ‘Fred Blogs’, but Fred is more likely to be inconsistent when commenting on other blogs, perhaps appearing as ‘fred blogs’ or ‘FredBlogs’. This can be dealt with by eliminating white space and upper case so that all names reduce to the form ‘fredblogs’, but resolving ambiguities resulting from ‘Fred_Blogs’, ‘fred B’ or just ‘Fred’ is not straightforward. If there is no ambiguity, a look-up table can map ‘Fred’ or Fred’s other known aliases to ‘fredblogs’, but in general the inconsistent use of names by commenters is a problem with no easy solution.
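The normalisation and look-up steps described above might be sketched as follows. The function normalise_name and the ALIASES table are hypothetical, and the table only helps where an alias is known to belong to a single person.

```python
# Hypothetical look-up table: normalised alias -> canonical name.
# Only safe when an alias is known to be unambiguous.
ALIASES = {
    "fred_blogs": "fredblogs",
    "fredb": "fredblogs",
    "fred": "fredblogs",
}


def normalise_name(name, aliases=ALIASES):
    """Reduce a display name to a canonical key.

    Step 1: remove all white space and convert to lower case.
    Step 2: map known aliases to the canonical form.
    """
    key = "".join(name.split()).lower()
    return aliases.get(key, key)


for raw in ["Fred Blogs", "fred blogs", "FredBlogs", "Fred_Blogs", "fred B", "Fred"]:
    print(raw, "->", normalise_name(raw))
```

Names that are not in the table simply pass through in their reduced form. A second ‘Fred’ in the same course would be wrongly merged, which is exactly the ambiguity the look-up table cannot resolve.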
Lost Comments – The Comment Collector normally downloaded RSS feeds every 12 hours. Very occasionally, however, a post attracted a large number of comments immediately after a collection, and some of the new comments were pushed out of the feed by later ones before the next collection time. Checking blog posts and comments against the original posts (rather than the RSS feeds) is a lengthy and time-consuming task. It has not been done rigorously!
Visualisation over Longer Periods
Less detailed but more comprehensive visualisations can be obtained by aggregating data over longer time intervals. The visualisation below (same format as above) corresponds to the last 5 weeks of Rhizo15. It involves posts from 49 blogs and 148 blog authors and commenters.
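Aggregating over a longer interval amounts to counting the same comment records over a wider time window. A minimal sketch, assuming each record carries a timestamp; the function aggregate_comments, the record layout and the dates are all made up for illustration:

```python
from collections import Counter
from datetime import datetime, timedelta


def aggregate_comments(records, end, days):
    """Count comments per (commenter, blog) pair in the last `days` days.

    records is a list of (commenter, blog, timestamp) tuples.
    The window is start < timestamp <= end.
    """
    start = end - timedelta(days=days)
    return Counter(
        (commenter, blog)
        for commenter, blog, when in records
        if start < when <= end
    )


# Made-up timestamps, not the real course dates.
records = [
    (88, 31, datetime(2015, 1, 2)),
    (88, 31, datetime(2015, 1, 22)),
    (82, 48, datetime(2015, 1, 25)),
    (88, 31, datetime(2015, 1, 28)),
]
end = datetime(2015, 1, 30)

last_10_days = aggregate_comments(records, end, days=10)
last_5_weeks = aggregate_comments(records, end, days=35)
print(last_10_days)
print(last_5_weeks)
```

The wider window picks up the earlier comment from 88 on 31, so that connection becomes heavier in the five-week view than in the ten-day view.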
The concentration of the most active bloggers and commenters in the central part of the network, and the extent of their connections with each other and with the wider network, are very evident. Less connected blogs are located on the periphery, and several connect with different clusters of commenters who have no other connections. The four disconnected nodes on the left represent blogs with new posts but without outgoing or incoming comments during the time interval.
My interest in the visualisation of connectivist networks has mainly been in the programming and the creation of objective data that could throw some light on how such networks form and develop. The visualisations above are examples of what might be achieved but are not definitive in any way. Features such as the number of connections per blog or the number of comments connecting a commenter to the same blog could be visualised differently, or perhaps dropped altogether in favour of other, more meaningful measures. How visualisations should be presented and interpreted depends on many factors and I’ve only begun to look at Social Network Analysis. I did try a modularity program available in Gephi that attempts to separate the nodes of a network into distinct communities, but with little success, which suggests that Rhizo15 is a good example of a well-distributed connectivist network!
There is certainly a danger of jumping to highly subjective or even judgemental conclusions when interpreting blog and comment visualisations. For example, bloggers who are unresponsive to comments on their own posts are not necessarily being inconsiderate. Receiving no comments on a post is not necessarily a reflection on its quality or content. There are also wider issues concerning privacy. Although the underlying data is publicly available, detailed revelations about the posting and commenting habits of MOOC participants could be considered inappropriate and tantamount to snooping. This is one reason why participants’ names in the Rhizo15 visualisations were anonymised (although I’ll supply any interested participants with their own number on request).
Are blog and comment visualisations useful? There are definite possibilities for research in conjunction with visualisation software such as Gephi. If privacy concerns are properly addressed then visualisation might also have a part to play during a MOOC but exactly how and in what form is an open question – comments and suggestions are welcome!