Thoughts on Coursera and Online Courses

I recently completed my first online course (or “MOOC”) on Coursera. It was an interesting experience and I wanted to share some thoughts here.

I decided to take an online course for several reasons. Firstly, the topic, Astrobiology, was fun, and I thought the short course might make an interesting alternative to watching BBC documentaries and US TV box sets. I certainly wasn’t disappointed: the course content was accessible and well presented. As a biology graduate I found much of the content fairly entry-level, but it was nevertheless a good refresher in a number of areas. The mix of biology, astronomy, chemistry and geology was really interesting. The course was very well attended, with around 40,000 registrants and 16,000 active students.

The second reason I wanted to try a course was because MOOCs are so popular at the moment. I was curious how well an online course would work, in terms of both content delivery and the social aspects of learning. Many courses are longer and are more rigorously marked and assessed, but the short Astrobiology course looked like it would still offer some useful insights into online learning.

Clearly some of my experiences will be specific to the particular course and Coursera, but I think some of the comments below will generalise to other platforms.

Firstly, the positives:

  • The course material was clear and well presented.
  • The course tutors appeared to be engaged and actively participated in discussions.
  • The ability to download the video lectures, allowing me to (re)view content whilst travelling, was really appreciated. Flexibility around consuming course content seems like an essential feature to me. While the online experience will undoubtedly be richer, I’m guessing that many people are doing these courses in spare time around other activities. With this in mind, video content needs to be available in easily downloadable chunks.
  • The Coursera site itself was, on the whole, well constructed. It was easy to navigate to the content, tests and discussions. The service offered timely notifications that new content and assessments had been published.
  • Although I didn’t use it myself, the site offered good integration with services like Meetup, allowing students to start their own local groups. This seemed like a good feature, particularly for longer running courses.

However, there were a number of areas in which I thought things could be greatly improved:

  • The online discussion forums very quickly became unmanageable. With so many people contributing, across many different threads, it’s hard to separate the signal from the noise. The community had some interesting extremes: from people associated with the early NASA programme through to alien-contact and conspiracy-theory nut-cases. While those particular extremes are peculiar to this course, I expect other courses may experience similar challenges.
  • Related to the above point, the ability to post anonymously in forums led to trolling on a number of occasions. I’m sensitive to privacy, but perhaps pseudonyms would be better than anonymity?
  • The discussions are divorced from the content: I can’t comment directly on a video, I have to create a new thread for it in a discussion group. I wanted to see something more sophisticated, maybe SoundCloud-style annotations on the videos or per-video discussion threads.
  • No integration with wider social networks: there were discussions also happening on Twitter, G+ and Facebook. Maybe it’s better to just integrate those, rather than offer a separate discussion forum?
  • Students consumed content at different rates which meant that some discussions contained “spoilers” for material I hadn’t yet watched. This is largely a side-effect of the discussions happening independently from the content.
  • Coursera offered a course wiki, but this seemed useless.
  • It wasn’t clear to me during the course what would happen to the discussions after the course ended. Would they be wiped out, preserved, or would later students build on what was there already? Now that it’s finished, it looks like each course is instanced and discussions are preserved as an archive. I’m not sure what the right option is there. Starting with a clean slate seems like a good default, but can particularly useful discussions be highlighted in later courses? The course discussions seem like an interesting thing to mine for links and topics, especially for lecturers.

There are some interesting challenges with designing this kind of product. Unlike almost every other social application, the communities for these courses don’t ramp up over time: they arrive en masse at a particular date and then more or less evaporate overnight.

As a member of that community, I found it very hard to identify which people were worth listening to and which to ignore: all of a sudden I’m surrounded by 16,000 people all talking at once. When things ramp up more slowly, I can build out my social network more easily. Coursera doesn’t have any notion of study groups.

I expect the lecturers face similar challenges, as very quickly they’re confronted with a lot of material that they potentially have to read, review and respond to. This must make engaging with each new intake difficult.

While a traditional discussion forum might provide the basic infrastructure for communication, MOOC platforms need more nuanced social features — for both students and lecturers — that are sensitive to the sudden growth of the community. I found myself wanting to find out things like:

  • Who is posting on which topics and how frequently?
  • Which commentators are getting up-voted (or down-voted) the most?
  • Which community members are at the same stage in the course as me?
  • Which community members have something to offer on a particular topic, e.g. because of their specific background?
  • What links are people sharing in discussions, perhaps filtered by user?
  • What courses are my fellow students undertaking next? Are there shared journeys?
  • Is there anyone watching this material at the same time?

Answering all of these requires more than just mining discussions, but it feels like some useful metrics could nevertheless be derived. For example, one common use of the forums was to share additional material, e.g. recent news reports, scientific papers, YouTube videos, etc. That kind of content could either be collected in other ways, e.g. via a shared reading list, or surfaced automatically from discussions. I ended up sifting through the forums and creating a reading list on Readlists, as well as a YouTube playlist, just to see whether others would find them useful (they did).
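The automatic surfacing of shared links could be sketched quite simply. The example below is a minimal, hypothetical illustration — the post bodies and URLs are made up, and a real forum would need pagination, thread metadata and spam filtering — but it shows the basic idea of mining discussions for the most-shared resources:

```python
import re
from collections import Counter

URL_PATTERN = re.compile(r"https?://\S+")

def shared_links(post_bodies):
    """Count the URLs shared across a list of forum post bodies,
    most frequently shared first."""
    counts = Counter()
    for body in post_bodies:
        counts.update(URL_PATTERN.findall(body))
    return counts.most_common()

# Hypothetical forum posts, purely for illustration.
posts = [
    "Great paper on extremophiles: http://example.org/paper1",
    "See http://example.org/paper1 and this talk http://example.org/video",
    "The aliens are already here, wake up people!",
]

print(shared_links(posts))
# [('http://example.org/paper1', 2), ('http://example.org/video', 1)]
```

The same counting approach would work for the other questions above (posts per user, votes per commentator), given access to the relevant fields.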

We can see all of these challenges playing out in wider social media, but with a MOOC they’re often compressed into relatively short time spans.

(Perhaps inevitably) I also kept thinking that much of the process of creating, delivering and consuming the content could be improved with better linking and annotation tools. Indeed, do we even need specialised MOOC platforms at all? Why not just place all of the content on services like YouTube, Readlists, etc.? Isn’t the web our learning infrastructure?

Well, I think there is a role for these platforms. Their role in certification — these people have taken this course — is clearly going to become more important, for example.

However, I think their real value is in marking out a space within which the learning experience takes place: these people are taking this content during this period. The community needs a focal point, even if it’s short-lived.

If everything were just on the web, with no real definition to the course, the community experience would completely dissolve. Concentrating the community into time-boxed, instanced courses creates a focus that can enrich the experience. The challenge is balancing unwieldy MOOC “flashmobs” against a more diffuse internet community.

The Science of Alien

I’ve been digging through some old files and papers recently, partly prompted by sorting out the loft and also various hard disks with backups of documents and photos.

Amongst the papers I found this fun piece that I wrote back in 1994: A Speculative Paper on Xenomorph Biology.

I wrote it whilst watching a re-run of Alien shortly after finishing my degree. I got to wondering: if we took the events in the films at face value, what could we guess about the Alien’s biology and origin? Reading it back now made me wince quite a bit. Younger me needed an editor. I think I was trying for the feel of an academic paper or report, but it’s also obviously part science-fiction story.

Despite it being a bit sketchy — and clear evidence as to why I never built a career as a writer! — it’s stood up pretty well, I think, even against the revelations in Prometheus. My fictional scientist even guessed that the “Space Jockey” (as it’s now called) was there as part of a terraforming team, and that they were overrun by their own engineered, bio-mechanical servants.

For some better-informed attempts at applying science to sci-fi/fantasy, you might want to look at “Godzilla from a Zoological Perspective” (why isn’t it free?!) or “The pyrophysiology and sexuality of Dragons”. The former is a semi-serious paper, while the latter was published on 1st April 2002. Also, check the lead author’s name.

Anyway, thought I’d post that as a bit of fun for a Friday evening. Have a good weekend.

Ants, Overlays and Open Data

Whilst standing behind the yellow line on the platform this morning, waiting for a train to Oxford, I noticed an ant on the floor wending its way along the tarmac, within the bounds of the thick yellow paint. The little black speck stood out quite sharply against the bright yellow.

Obviously the ant wasn’t following the line, but neither was it moving randomly. It was clearly following its own little invisible marker, an ant scent trail, that just happened to coincide with the platform markings.

Last night BBC 1 showed Britain from Above, an aerial view of Britain during a 24-hour period. The show had some great information visualisations, including traffic patterns for taxis, garbage collection, commuters, shipping and aircraft, as well as more static landmarks such as railway lines, electricity cables, water courses and telephone and network cabling. If you didn’t catch it, the programme is definitely worth a watch.

It was this bird’s eye view of the world that led me to reflect on that ant and its invisible trail. I wonder how many other layers of information could have been added to the human-centric views shown in the programme? Animal migratory paths are an obvious one. Paths of dispersal, ranges and colonisation are some others. It doesn’t take long to come up with many, many more.

The combinations of different paths and layers are also interesting to explore. Are many of these chance overlaps, like the ant on the paint, or are there dependencies or inter-relations? For example, how are migratory routes affected by no-fly zones or shipping lanes? Do migratory pathways begin to align with man-made features like roads and railways? And where have features like fish ladders and toad tunnels been introduced to avoid clashes between competing uses for the same space?

It’s doubtful that these kinds of questions will be answered in the rest of the series. Judging by the trailer for next week’s episode, there seems to be more of a “pop geography” focus. (I’ll be tuning in regardless.)

The truly exciting thing is that we can do this kind of exploration of layered information sources ourselves, through map-based visualisations, using a huge and growing range of commodity tools and data sets.

Whilst watching the programme, what intrigued me more than the admittedly beautiful animations were questions such as: how did they approach the information holders in order to get permission to use their data? What steps were taken towards privacy and anonymity? For the BBC it’s going to be very easy to get access to all kinds of data, not least because they have resources to spend, but also because their reputation precedes them and the result of sharing the data is immediate: “don’t you want to be on the telly?”

Open data advocates may do well to band together to form an organisation that can become a focal point for activism and, importantly, trust. Such an organisation could recommend best practices, including auditing of data for privacy risks. It could also put together a showcase of the end results: creative visualisations of published data. It may be easier to approach data owners as a member or representative of such a collective, open, distributed, collegial organisation than as an independent interested hacker.

But creating a compelling presentation is about more than just having the right technology and data. A good visualisation tells a story. It’s through stories that data really comes alive. The open data movement needs the involvement of strongly creative people as much as (and perhaps more than) technology people.

You need to be able to do more than animate a little black speck against a yellow band: where was that little ant going?

The Modern Palimpsest

The following is a brief summary of a talk I gave recently at the Ingenta Publisher Forum on the 28th November. The slides are available as a PowerPoint presentation.

In the presentation I tried to highlight some of the possibilities that could become available if academic publishers began to share more metadata about the content they publish, ideally by engaging with the scientific community to expose “raw” data and results.


Nature Quote

There’s a short article in Nature (subscribers only, I’m afraid) this week about Google Base and its potential impacts on the science community, in particular whether it might galvanise greater data sharing between scientists.

I’ve been corresponding with Declan Butler, the author of the piece, on this and some related topics recently, and he ended up quoting me:


Alf Eaton posts today to point to the new WebCite service. This is going to be very useful. Don’t think so? Well, there’s plenty of research to show that link atrophy is a big problem in scientific literature:
Persistence of Web References in Scientific Research
See also: A study of missing Web-cites in scholarly articles: towards an evaluation framework, which reports that “[a]fter evaluating 2162 bibliographic references it was found that 48.1% (1041) of all citations used in the papers referred to a Web-located resource. A significant number of references to URLs were found to be missing (45.8%)…”
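The kind of survey those papers describe is easy to reproduce in miniature. The sketch below is purely illustrative — the reference strings are invented, and the set of dead URLs would in practice come from actually crawling each link — but it shows how you might classify citations as web-located and count how many have rotted:

```python
import re

URL_PATTERN = re.compile(r"https?://\S+")

def web_citation_stats(references, dead_urls):
    """Classify reference strings as web-located and report how many
    of their URLs are known to be dead.

    `references` is a list of citation strings; `dead_urls` is a set of
    URLs known (e.g. from an earlier crawl) to no longer resolve.
    """
    web_refs = [r for r in references if URL_PATTERN.search(r)]
    dead = [r for r in web_refs
            if URL_PATTERN.search(r).group() in dead_urls]
    return {"total": len(references),
            "web": len(web_refs),
            "missing": len(dead)}

# Invented references, purely for illustration.
refs = [
    "Smith, J. (2001). Persistence of Web References. http://example.org/a",
    "Jones, K. (1999). Print-only monograph.",
    "Doe, A. (2002). A vanished preprint. http://example.org/b",
]
print(web_citation_stats(refs, {"http://example.org/b"}))
# {'total': 3, 'web': 2, 'missing': 1}
```

Run over a real corpus, the `web / total` and `missing / web` ratios are exactly the 48.1% and 45.8% figures the quoted study reports.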

iSpecies and taxonomy (no, not that kind)

For the last few years I’ve been lurking on a mailing list run by the Taxonomic Databases Working Group. It’s a low-volume list used by scientists interested in capturing and marking up taxonomies. That’s taxonomy in the Linnaean sense, not the semantic web sense. I’ve been lurking there since I wrote this paper a while back, proposing an XML format to replace a text-based format that had been popular.
Yesterday on the list this interesting little mash-up, iSpecies, was announced. It works by searching NCBI, Yahoo Images and Google Scholar to attempt to find relevant information on biological species. Lions, for example.
I found it interesting mainly because it was one of the first mashups I’ve seen that isn’t a combination of the same old APIs (maps, music, bookmarks), but also because it’s clearly focused on a particular scientific community.
The author, Rod Page (apparently a big RDF fan), built this as an off-shoot of a wider project that’s storing phylogenetic data as RDF. His site also has a Taxonomic Search Engine which federates a number of taxonomic name databases. Perform a search and it links you to metadata about the organism. There’s a paper on the application on BioMed Central.
Given an LSID (Life Sciences Identifier) it turns out you can get RDF metadata about the organism. Lions for example.
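LSIDs themselves are just structured URNs of the form `urn:lsid:<authority>:<namespace>:<object>[:<revision>]`, where the authority identifies who can resolve the identifier. A minimal parsing sketch (the example identifier is illustrative, not an endorsement of any particular resolver):

```python
def parse_lsid(lsid):
    """Split an LSID of the form
    urn:lsid:<authority>:<namespace>:<object>[:<revision>]
    into its components."""
    parts = lsid.split(":")
    if len(parts) < 5 or parts[0].lower() != "urn" or parts[1].lower() != "lsid":
        raise ValueError("not a valid LSID: %r" % lsid)
    return {
        "authority": parts[2],   # who can resolve this identifier
        "namespace": parts[3],   # e.g. a database name
        "object": parts[4],      # the record identifier
        "revision": parts[5] if len(parts) > 5 else None,
    }

# Illustrative example in the uBio NameBank style.
print(parse_lsid("urn:lsid:ubio.org:namebank:2476730"))
```

Resolving the authority component to an actual metadata service is where the real protocol machinery lives; the identifier itself is deliberately simple.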
There’s a lot of interesting mash-up potential in this data, as well as that available from a few other projects in this area.
I’ve been keeping half an eye on this space recently, after reading this paper on how bioinformatics researchers are bumping into the limits of XML and looking at RDF instead: “…the syntactic and document-centric XML cannot achieve the level of interoperability required by the highly dynamic and integrated bioinformatics applications”.
These guys have a lot of data that needs integrating and merging. Modern classification is about much more than the old Linnaean system. It has to be able to merge together data sources ranging from molecular biology through to field observations, and depending on what sources you draw on, and at what level, the tree of life can be drawn quite differently.
The early web was pioneered in part by the needs of scientists exchanging research papers. It strikes me that “eScience” and bioinformatics may very well become the driving forces behind a more semantic web.