Connection not Content

A Blog for MOOCs and Other Animals

Collecting Connected Courses Comments (#ccourses)

with 8 comments

I’ve been running my new Comment Collector program during the Connected Courses (ccourses) MOOC and updating the output daily. The idea is to give a quick impression of current MOOC activity by bringing together, in one place, brief summaries of blog posts and their comments. Posts with comments are displayed for 15 days in order of their latest comments, while posts without comments are flagged ‘New Post’ and displayed for 3 days. These parameters reflect my own ideas of what might be useful and can easily be changed.
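The display rules above can be sketched in a few lines of Python. This is only an illustration, not the Collector's actual code: the `Post` class and `select_for_display` function are hypothetical names, and it assumes that the 15 and 3 day limits are counted from the publication date and that "in order of their latest comments" means most recently commented first.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List, Tuple

COMMENTED_DAYS = 15  # posts with comments stay on display this long
NEW_POST_DAYS = 3    # posts without comments ('New Post') stay this long


@dataclass
class Post:
    # Hypothetical record; the real Collector's data model is not described.
    title: str
    published: datetime
    comment_times: List[datetime] = field(default_factory=list)


def select_for_display(posts: List[Post], now: datetime) -> Tuple[List[Post], List[Post]]:
    """Return (commented posts, 'New Post' posts) that are still on display."""
    commented, new_posts = [], []
    for post in posts:
        age = now - post.published  # assumption: age is measured from publication
        if post.comment_times and age <= timedelta(days=COMMENTED_DAYS):
            commented.append(post)
        elif not post.comment_times and age <= timedelta(days=NEW_POST_DAYS):
            new_posts.append(post)
    # Assumption: the most recently commented post comes first.
    commented.sort(key=lambda p: max(p.comment_times), reverse=True)
    return commented, new_posts
```

Changing the two constants at the top is all it would take to try different display periods.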

The Collector currently scans the RSS post and comment feeds of a subset of blogs taken from the list of syndicated blogs. RSS feeds can lose old data, so the Collector aggregates posts and comments over the 15-day periods. Posts intended as ccourses contributions are recognised by a tag placed in the post. There are currently over 230 syndicated blogs listed, but some are inactive or have posts without a recognisable tag in a label, category or title. Originally, the Collector recognised only ‘ccourses’ as a tag, but this was altered so that variants such as ‘connectedcourses’ or ‘Connected Course’ were also recognised (not ‘cc’ – ‘cute cats’?), resulting in a significant increase in the number of accepted posts. The Collector works with most WordPress or Blogger blogs but not with some other commenting methods (eg tumblr, G+, FeedBurner etc) or with blogs without comment feeds. At present, the Collector scans about 80 blogs with suitable feeds and probably covers the majority of active ccourses bloggers.
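One way to picture the looser tag matching is a case-insensitive regular expression applied to the title and to any labels or categories. The sketch below is an assumption about how such a check might look, not the Collector's own rule: the pattern, the function name `is_ccourses_post`, and the decision to allow spaces, hyphens or underscores between the two words are all illustrative.

```python
import re
from typing import Iterable

# Accepts 'ccourses', 'connectedcourses', 'Connected Course', 'connected-courses'
# and similar variants, but not a bare 'cc'. The pattern is illustrative only.
TAG_PATTERN = re.compile(
    r"\b(?:ccourses|connected[\s_-]*courses?)\b",
    re.IGNORECASE,
)


def is_ccourses_post(title: str, labels: Iterable[str] = ()) -> bool:
    """True if the title or any label/category carries a recognised tag."""
    return any(TAG_PATTERN.search(text) for text in (title, *labels))
```

For example, `is_ccourses_post("My week", ["ConnectedCourses"])` is accepted, while a post titled "cute cats" and labelled only "cc" is not.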

I ran a previous version of the program during the rhizo14 MOOC, producing a graph showing (roughly) how commenting developed with time. The first graph below is similar: it shows the total number of posts (blue) and comments (red) displayed each day (normally evening BST) and published with recognised ccourses tags over the preceding 15-day period. Again, this is no scientific study. The Collector is experimental and adjustments were made during the 31-day period covered by the graph. This applies particularly to the first few days, when blogs were being added and removed and the aggregation period was less than the nominal 15 days. A few blogs were removed because apparently valid RSS feeds could not be accessed by the Collector (reasons beyond me!). A sudden increase in posts and comments on the 25th Sept was caused when the number of recognised tags was increased. Subsequently, the graph is at least indicative of post and comment activity over the 80 or so blogs being scanned, with not much variation around an average of about 60 posts and 225 comments over 15-day periods. For clarity, the average number of comments per post for each period (yellow) is scaled up by a factor of 100.


KEY:   BLUE = No. of posts. RED = No. of comments
YELLOW = Average Comments per post x 100
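To make the yellow line concrete, here is the scaling as a small Python sketch. The function name `scaled_comments_per_post` is hypothetical and the inputs are the rough averages quoted above (about 60 posts and 225 comments per 15-day period), not exact Collector output.

```python
def scaled_comments_per_post(posts: int, comments: int, scale: int = 100) -> float:
    """Average comments per post, multiplied by `scale` so it plots alongside the totals."""
    if posts == 0:
        return 0.0  # no posts in the window, so there is no average to plot
    return scale * comments / posts


# Rough averages from the text: about 60 posts and 225 comments per window.
typical = scaled_comments_per_post(60, 225)  # 225 / 60 = 3.75, plotted as 375.0
```

Without the factor of 100, a value of around 3.75 would sit flat along the bottom of a graph whose other lines are in the tens and hundreds.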

The second graph below is an attempt to estimate the distribution of specific numbers of comments among all recognised posts (495 in total) over the entire period from Sep 24 to Oct 24. For example, the first point indicates that 19 posts received 1 comment. The missing zeroth point, corresponding to posts with zero comments, would have indicated that 78 posts received no comments at all (displaying it would have compressed the vertical scale). This seems high but includes blogs with at least one recognisable post followed by other posts that may or may not have been intended for ccourses but have no recognisable tags. The sample lacks statistical significance, but a cluster of posts with around 2 comments and maybe other clusters are discernible, followed by a long tail of up to 18 comments for some single posts.
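A distribution of this kind can be built by counting how many posts share each comment total. The following is a minimal sketch under that assumption; the function name `comment_distribution` is hypothetical and the sample list is made up for illustration, not taken from the Collector's records.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def comment_distribution(comments_per_post: Iterable[int]) -> Tuple[int, List[Tuple[int, int]]]:
    """Return (number of posts with no comments, sorted (comment count, posts) pairs)."""
    counts = Counter(comments_per_post)
    # The zero-comment count is held back, as in the graph, so that it
    # does not compress the vertical scale of the plotted points.
    zero_comment_posts = counts.pop(0, 0)
    return zero_comment_posts, sorted(counts.items())


# Made-up sample: one entry per post, giving that post's number of comments.
sample = [0, 0, 1, 2, 2, 2, 5, 18]
zero, points = comment_distribution(sample)
```

Each pair in `points` is one point on the graph: the comment count on the horizontal axis and the number of posts with that count on the vertical axis.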


Other quantitative types of analysis are possible and may be useful for research or other purposes. For example, representations of the network of connections created by participants in a MOOC as they comment on each other’s posts could be of interest, maybe along the lines of what Martin Hawksey has done for Twitter. There are other possibilities – ranking people by name in order of number of posts or comments? This seems more questionable than ranking tweets in the same way but where should the line be drawn and why? Advice and suggestions welcome!
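As a rough idea of the raw material such a network representation would need, comments can be reduced to weighted commenter-to-blogger links. This is a hypothetical sketch only: the function name `comment_edges`, the choice to ignore people commenting on their own posts, and the placeholder names are all assumptions, and nothing here reflects what the Collector currently stores.

```python
from collections import Counter
from typing import Iterable, Tuple


def comment_edges(comments: Iterable[Tuple[str, str]]) -> Counter:
    """Count directed links from each commenter to each post author.

    `comments` holds (commenter, post_author) pairs, one per comment.
    """
    edges: Counter = Counter()
    for commenter, author in comments:
        if commenter != author:  # assumption: replies on one's own post are ignored
            edges[(commenter, author)] += 1
    return edges


# Placeholder names, purely for illustration.
links = comment_edges([
    ("alice", "bob"),
    ("alice", "bob"),
    ("bob", "alice"),
    ("bob", "bob"),
])
```

The resulting counts could be handed to a graph drawing tool, with each pair as a directed edge and the count as its weight.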

Thanks to all ccourses folks who have retweeted and favorited the Collector updates. The rapid turnover of ccourses posts and comments has field-tested the Comment Collector well – sometimes to breaking point! I will keep it running until the formal end of Connected Courses and now that the program is reasonably stable it’s little trouble to continue publishing the output. However, there are several other methods available to ccourses participants for monitoring activity such as the blog aggregator, the forum, the Facebook page etc and I’m unsure to what extent the Comment Collector has a useful or distinct role to play.

As always, comments and suggestions are very welcome but at the very least if you find the Collector useful, please ‘like’ this post so I have some measure of the Collector’s value in the context of ccourses – thanks!

Written by Gordon Lockhart

October 26, 2014 at 4:36 pm

Posted in Mooc

8 Responses

  1. Hi, Gordon ~ I’ve been looking at #DALMOOC and reminded myself to drop you a line asking if you are following or scraping there too. Since our main communication channel seems to be blog comments, I’m very glad to see this in my mailbox just when I was thinking about contacting you.

    As useful and potentially informative as a “universal collector” (which I remember both Stephen Downes and Vance Stevens musing on) would be, the downside of inherent intrusiveness could well outweigh the benefits. Hey, isn’t that what the NSA has?

    I’m glad too that you pointed out the gaps. The places I am most active are not on the scraping list. They are, however, the best outlets for my networks and online constituency ~ my priorities. I also question the validity of ranking by number when there are so many gaps.

    On both counts, the data says more about a limited number of high profile participants than about the overall course and the vast middle ground of participants who may not be able to devote as much time to iterating themselves across a limited media set. How good a predictor is it for the long range effect and influence? I can only speak for myself but know the long range influence of a course and connections is not always related to these numbers. They also do not reflect all the other places I share.

    PS this just in today’s catch (I’m in the middle of blogging it for my higher ed advocacy readership on an account not on the scraping list.)



    October 26, 2014 at 6:06 pm

    • Thanks for the comments Vanessa and the links – I’ve been aware of #DALMOOC but too busy to get involved. The ‘multiple learning pathways’ is intriguing – could be an important development in MOOC evolution.

      Having embarked on a bit of data analysis I’m finding it difficult to figure out where to draw lines but I hope to keep in safe territory – not play the evil demon. (eg my gut reaction is that ranking people according to their post or comment numbers would be more questionable than doing it according to their tweets – but I’m not sure why!)

      Yes inevitably the quantitative data is mostly about a limited number of high profile participants but they’re an important if tiny minority since they form the critical mass that can drive the MOOC onwards and serve as exemplars for the vast majority who can’t participate fully for whatever reason. I guess mainly qualitative methods of analysis could be applied to this majority group but I’ve no experience of these.

      Gordon Lockhart

      October 26, 2014 at 11:18 pm

  2. Thanks so much for aggregating and analyzing this data Gordon. It’s very illuminating. As a course facilitator the first graph provides a very useful overview of overall levels of activity week by week that complements the twitter data. While I love the syndicated model of the cMOOC it does make it harder as organizer to get one’s arms around the activity happening in such a diverse range of platforms! I’m still struggling to figure out what the right indicators are for the engagement funnel for connected courses. Or maybe it is more appropriately thought of as different learning pathways or genres. The second graph is one helpful data point in thinking through that. I’ll try to blog some of the initial thoughts I’ve been tossing around with my team on that when I have given it a bit more thought…

    Mimi Ito

    October 28, 2014 at 10:47 pm

    • Thanks Mimi – the first graph just uses post and comment numbers appearing on outputs every evening here so gives an idea of average activity over a 15 day moving time window. The second differs in that the data is processed from the Collector’s internal day-to-day aggregation records and although the results have little statistical significance they serve as an example of what is analytically possible. But I also find it difficult to know what indicators to focus on. Maybe network visualisations of blogger / commenter connections would be of interest but it’s not so easy to do!

      Gordon Lockhart

      October 29, 2014 at 11:38 am

  3. I like it 🙂


    November 2, 2014 at 2:02 pm

  4. Hi Gordon,
    I think this is absolutely wonderful! I was wondering if it would be possible for you to share an overall picture of number of blog posts & comments from the start of Connected Courses until now. I would be very interested in seeing that info. If you have the time I would love to continue this conversation via email. Thanks, Gordon.


    November 21, 2014 at 8:54 pm
