Here are a couple of links to JISCPress …

Here are a couple of links to JISCPress related stuff elsewhere on the web:

An ‘expert talk’ by Joss and some pitches for the project at the JISCRI conference: http://devcsi.ukoln.ac.uk/demonstrator/tag/jiscpress/

An interview with Alex on the dev8D blog: http://dev8d.jiscinvolve.org/2010/02/24/interview-alex-bilbie/

We were also featured in the ‘Tool Shed’, a newspaper for the JISCRI programme. I don’t think it’s online, but it’s a fine publication!

Eddie talks about digress.it and JISCPress

It’s long, it’s a bit rough, but if you’re interested in the development of CommentPress, digress.it and a major part of the JISCPress project, you might want to set an hour aside…

Questions

  1. Can you tell us a bit about CommentPress and why the move to digress.it? (00:10)
  2. What design decisions have you made for digress.it? Is there anything that other developers should be aware of? (06:00)
  3. What single area of work on the JISCPress Project has been the most time-consuming (and therefore expensive)? (10:45)
  4. What’s been the biggest challenge for you on the JISCPress Project? (20:40)
  5. Paragraph-level trackbacks, and remote embedding of paragraphs which also provided a trackback, were two requirements we kept pushing for. What problems still remain with these features? (23:50)
  6. What software tools or productivity methods do you use and how do you use them? (34:40)
  7. What was the most important thing that brought value to your work? (52:40)
  8. What’s the future of digress.it? How will it be sustained now the JISCPress project has finished? (55:16)
  9. Any more plans for digress.it? (01:02:15)
  10. You’ve started writing a digress.it server, right? (01:04:56)

Eddie Tejeda talks about digress.it and JISCPress from University of Lincoln on Vimeo.

The JISCPress Prototype Demonstrator Platform

Well, here’s what we’ve managed to pull together over the last six months. Many thanks to freelance developers, Eddie Tejeda and Alex Bilbie who developed the WordPress plugins and theme which we discuss below. [This post was written by Joss with help from Tony].

In our original bid, we proposed a ‘prototype demonstrator platform’ for JISC’s Funding Calls and Final Project Reports. We outlined 11 deliverables:

  1. A WordPress Multi-User based platform for authoring and publishing JISC funding calls in a form that allows paragraph-level comment and discussion either locally or remotely.
  2. A meta-site that aggregates all document data into a single site for search, navigation by categories and tags and can syndicate searches, tags and categories.
  3. Develop CommentPress to meet WCAG 2.0 accessibility guidelines, meeting public sector requirements.
  4. Evaluation and integration of “related content” utilities to dynamically link related project calls and reports based on content and/or semantic analysis.
  5. Evaluation and possible integration of remote, realtime messaging services such as Twitter and XMPP integration.
  6. Evaluation and possible integration of enterprise authentication services such as LDAP and Shibboleth.
  7. Evaluation and possible integration of OpenCalais, a semantic tagging service.
  8. Documentation on how to exploit the benefits of AWS and clone the project instance for other uses.
  9. A documented suggested workflow for document authors
  10. Documented examples of how to fully exploit the platform for data extraction and syndication.
  11. Documented ‘user stories’ for the JISC funding call process. Note that we do not guarantee fulfillment of all user stories.

I’ll go through each of these one by one with illustrations where relevant. A more informal reflection is also available (Thoughts on JISCPress).

Paragraph level commenting and discussion of JISC funding calls

This was achieved through the development of the digress.it plugin. digress.it is a rewrite of the original CommentPress WordPress theme which we used on WriteToReply (which JISCPress is based on). I’ve posted a video interview with Eddie Tejeda, developer of the original CommentPress and digress.it, where he discusses the move from CommentPress to digress.it. In terms of local and remote paragraph commenting, the same feature set found in CommentPress has been retained. Remote, document section level comments are possible through the use of trackbacks.

We spent quite some time looking at remote paragraph-level commenting and Eddie expects to support this with digress.it in the near future. We discovered that trackbacks and pingbacks are an unreliable way of guaranteeing ‘comments’ from remote websites. It depends on the CMS being used and the settings of both the remote and local site. Sometimes test comments we made never arrived; other times they did. So, for example, whilst internal links within a WordPress domain may be recognised by other sites on the same platform, links from posts on other blogging platforms may not be. Link tracking using third party services (e.g. Google, Google blogsearch, BackType) relies on links being hardcoded in third party web pages (rather than being added dynamically to a page via Javascript, or within an embed object) and even then links are not detected reliably (it depends on the crawler). Commercial tracking/monitoring services were not explored.

WordPress provides a robust commenting system, with excellent spam filtering and comment moderation features. digress.it leverages this locally to allow commenters to respond at the paragraph, rather than the section (i.e. blog post) level. For more on digress.it, Eddie talks at length about his work in a previously posted video interview.

Paragraph level comments

An aggregated meta-site

Alex has been working on this, which can be seen in the screenshot below (and until we take the server down, can be browsed at http://jiscpress.org).

JISCPress Home Page

What you see here are a number of ways of finding documents on the site. The large tag cloud uses tags generated from Alex’s Open Calais/Yahoo Term Extractor related tags plugin. This plugin uses both these third-party APIs to tag each document and then create intelligent relationships between documents on the site. More on that later. The tags are held in a separate database table to the human created, native WordPress tags, but are equally a source of information that can be used by theme designers and plugin authors. Here the tags are simply being used to display a cloud, similar to the one on http://en.wordpress.com/tags/ and marked up with the rel=”tag” microformat.

Clicking a tag lists the documents by title

Clicking on a tag

As you can see, each result for a tag has an RSS feed, which can be found at the top right of the results. So if you’re interested in watching for key words in JISC documents, this would be a useful way of doing that. RSS feeds can be monitored from feed readers such as Google Reader, or via web desktops, such as Netvibes (e.g. An Example Netvibes Dashboard). You don’t have to use the tag cloud to do that. You can construct your own feed URL and wait for the results to come in, e.g. http://jiscpress.org/?jiscpress_tag=MY_KEYWORD&feed

(Note that WordPress also offers the option of subscribing to a free text search using the default WordPress search utility, e.g. https://jiscpress.blogs.lincoln.ac.uk/feed/?s=jiscpress )
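
If you wanted to build these feed URLs from a script rather than by hand, something like the following rough sketch would do it. The two helper functions are hypothetical and not part of JISCPress; only the jiscpress_tag, feed and s query parameters and the two base URLs come from the examples above.

```python
from urllib.parse import quote_plus

# Base URLs copied from the examples in this post.
SITE = "http://jiscpress.org/"
BLOG_FEED = "https://jiscpress.blogs.lincoln.ac.uk/feed/"


def tag_feed_url(keyword, site=SITE):
    """Build a feed URL for a tag (hypothetical helper)."""
    # Follows the ?jiscpress_tag=MY_KEYWORD&feed pattern shown above.
    return f"{site}?jiscpress_tag={quote_plus(keyword)}&feed"


def search_feed_url(terms, feed=BLOG_FEED):
    """Build a feed URL for a free text search (hypothetical helper)."""
    # Follows the /feed/?s=... pattern of the default WordPress search.
    return f"{feed}?s={quote_plus(terms)}"


print(tag_feed_url("open data"))
print(search_feed_url("jiscpress"))
```

The keyword is URL-encoded so that multi-word tags survive being pasted into a feed reader.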

Similarly, you can do this with ‘Topics’ (otherwise known as WordPress categories), which are aggregated from across all documents and displayed on the right side of the home page. For JISC’s purposes, there is a controlled list of about 40 Topics that are used by the organisation. Our example shows the use of a few of those. Again, if you’re interested in Funding Calls for Data & Text Mining, then you can subscribe to a feed for that Topic.

The main thing to point out about the use of WordPress categories on the Home Page, is that it assumes a controlled list and not a publishing environment where authors make up their own taxonomy. The list would get very very long and unmanageable. It need not use JISC’s Topics. Their Themes and Programmes would also work.

You’ll see that it displays the document title and author’s name. The use of author’s names is worth considering too. While WordPress is a multi-user CMS, it may be that each Programme decides to publish under their Programme’s name rather than the individual’s name. This is just a matter of changing the settings in WordPress, so that all the IE Team publish under their Programme’s name. The choice is up to JISC.

Above the ‘Topics’, is a list of the latest funding calls. It’s just a text box with some HTML links pointing to the latest documents. Nothing fancy nor difficult to maintain either.

In the backend, Alex has provided some options for the Tag Cloud (otherwise known as the JISCPress Browser widget). You can see that we have the option of using User, Open Calais and Yahoo tags, as well as blacklisting the display of certain tags, too. You can also decide how many tags you want to display.

JISCPress Browser widget options

The Topics are listed using a simple WPMU Site-wide categories widget that Alex wrote. It looks at the categories across all documents/blogs and displays them in alphabetical order.

For searching, we’re using the built in BuddyPress search (did I mention we’re using BuddyPress?). It simply allows you to search document titles from the front page or, on the Authors page, you can search authors by name, too.

Each Author has a profile page.

Author profiles

We looked at full-text search and it’s quite possible using this lucene-based plugin, which indexes all document text and all comments, too. It could be integrated into the theme to allow site-wide full text search but we chose not to because of time constraints and are simply using the built-in BuddyPress search. If JISC would like full-text search, it’s something that Alex could do. Of course, full-text search is possible on each document site and constructing searches derived from the Open Calais and Yahoo Term Extraction plugin is also possible as I’ve shown above. Full text search could end up returning more results than are useful. I’ve no strong opinions about this either way.

Finally, you can see in the menu bar on the front page that there’s an ‘About JISCPress’ page and a JISCPress blog. Nothing fancy going on there. We’re just using basic BuddyPress features.

Accessibility

Well, we haven’t ignored accessibility requirements but then again, they haven’t been a driving factor in the development of JISCPress either. I think we’ve improved on CommentPress and have had useful feedback from the British Computer Association of the Blind on the accessibility of digress.it. At one point Eddie included the jQuery.accessible plugin in digress.it but it’s not currently being used. Along similar lines, much time was spent on IE6 compatibility, which has been achieved at some cost to the project. The user experience in IE6 is not as nice as using Chrome, for example, and I wonder whether it was worth the effort in a project such as this that was about producing a ‘prototype demonstrator’. Nevertheless, because digress.it is now in widespread use elsewhere, IE 6 & 7 compatibility was one of the first requests that came through on the mailing list. We did a quick survey of browser use in HEIs and Andy Powell followed this up with something more detailed and wide ranging. What both show is IE 6 & 7 can’t be ignored 🙁

I feel that we didn’t manage to do as much toward accessibility as I originally hoped but it is something that can be worked on with digress.it over time and I know it is something that interests Eddie a great deal. Hopefully as he does work for more organisations, like Cornell and the New York Public Library, their accessibility requirements will filter through into the core code. In addition, Eddie has recently been able to employ a designer to work with him on digress.it so the theme should get more close attention over the next few releases.

Related content utilities

Alex did a lot of work on this and has released his wpmu-related-blogs-and-posts plugin on the official WordPress plugin repository.

Here’s an overview:

The WPMU Site Admin options look like this (click the image to see it full size):

WPMU Related posts admin options

You can see that both or either the Open Calais and Yahoo Term Extraction APIs can be used. The plugin provides a background service which runs via a cron job, which can be set to daily, twice daily or hourly. The cron job can be started manually and the entire platform can be re-tagged at any time. The relationships between document sections and documents can be re-established at any time, too.

Both the relevance of the tags (using features of the APIs) and the relevancy of the posts (when showing related document sections), can be adjusted.

Finally, you can opt to ignore certain blogs/documents, so test sites and the main site don’t mess up the weighting of the relationships made.

As we’ve seen with the JISCPress Browser widget, those tags can be used to provide a way to navigate the entire platform. However, the principal intended use of the Open Calais/Yahoo services is to display related document sections or, optionally, related documents while reading any given document. One potential issue with the auto-tagging services relates to the variable quality and usefulness of the tags they return. One possible way of addressing this would be to limit the range of tags used on the site by filtering the automatically generated tags via a whitelist, blacklist, or based on semantics (e.g. ignore placenames).
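
To make the filtering idea a little more concrete, here is a minimal sketch of what a whitelist/blacklist/semantic filter might look like. It is not code from Alex’s plugin: the filter_tags function, the (label, type) pair layout and the type names ‘City’ and ‘Topic’ are all assumptions made up for illustration.

```python
def filter_tags(tags, whitelist=None, blacklist=None, ignore_types=None):
    """Return the labels of the tags that survive the filters.

    tags is a list of (label, semantic_type) pairs. That layout is an
    assumption for this sketch, not the plugin's actual data model.
    """
    blocked = {t.lower() for t in (blacklist or [])}
    ignored = {t.lower() for t in (ignore_types or [])}
    # None means "no whitelist in force"; an empty whitelist allows nothing.
    allowed = {t.lower() for t in whitelist} if whitelist is not None else None

    kept = []
    for label, semantic_type in tags:
        key = label.lower()
        if key in blocked:
            continue  # explicitly blacklisted
        if semantic_type.lower() in ignored:
            continue  # e.g. ignore placenames
        if allowed is not None and key not in allowed:
            continue  # a whitelist is in force and the tag is not on it
        kept.append(label)
    return kept


# Illustrative data only; the type names are hypothetical.
tags = [("Lincoln", "City"), ("Open Data", "Topic"),
        ("funding", "Topic"), ("spam", "Topic")]
print(filter_tags(tags, blacklist=["spam"], ignore_types=["City"]))
```

Comparing in lower case means ‘Spam’ and ‘spam’ are treated as the same tag, which seems the safer default for automatically generated tags.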

Here are the widget options:

WPMU Related Posts widget

Here’s the widget as displayed to the reader. In principle, it works and should work better as the number of documents on the site grows.

Related documents/blogs:

Related Documents

Related document sections/blog posts:

Related Sections

Realtime messaging

This deliverable has been tackled in two ways. Eddie developed realtime alerts for digress.it so that if someone comments on the site while you’re reading it, the comment will ‘pulse’ in the Comment Box. For remote realtime messaging, things have been taken care of for us. As realtime on the web is becoming very much mainstream, it’s getting easier to push content out through a variety of ways. For example, since the start of our project, WordPress now has RSSCloud and Web Hooks plugins. The latter provides a relatively simple framework for developers to push any notification from WordPress to external services, as I’ve discussed on my blog. There are also SUP (FriendFeed) and PubSubHubbub plugins, too. In addition, Google is now indexing in realtime and Google Alerts are showing up via RSS almost immediately, too. XMPP PubSub is also in use on WordPress.com and that work is due to be released as a plugin for WordPress once it has been well tested. Likewise, their work on a Twitter compatible API for WordPress will also be released. Given all of this, we didn’t think we needed to reinvent the realtime wheel for WordPress.

Authentication Services

I can confirm that LDAP works well with WPMU. I know this because I run WPMU with this plugin at the University of Lincoln. It is a feature rich plugin and well supported. If anyone wants to integrate LDAP with WPMU and is running into problems, please get in touch. Note that there is no reason why much of JISCPress couldn’t be used as a private platform for document discussion and annotation, internal to an organisation.

Shibboleth support was not tested because we don’t use it at Lincoln and I never got around to trying to convince someone to help me test it. However, I have been told by the plugin developer that it is in use at other universities and it, too, is well supported by Will Norris, the developer of the WordPress OpenID plugin. I would be very grateful if someone running a Shibboleth service at their institution would help me test the plugin. I’m pretty confident that it will work.

Open Calais

See above!

Amazon Web Services

I documented our use of Amazon Web Services on our Google Code site. I had intended to leave an image of the JISCPress server on AWS so that anyone could clone it for use or play. However, unless asked, I’m not going to do that now. If you want to use such a system, you can use http://digress.it or http://writetoreply.org, and if you’re skilled enough to work on AWS, then you’re skilled enough to install WPMU and our plugins on your own server for testing purposes. I’d be happy to advise anyone wanting to create their own version of JISCPress and even work on your server if you want me to. Leaving an Amazon Machine Image lying around for testing means that it will soon go out of date as new versions of WordPress are released, and it would need rebuilding anyway.

Workflow for authors

I have begun writing documentation for authors on our Google Code site. I’d be grateful for feedback. This documentation is also used on the http://digress.it site. I also intend to provide documentation for Administrators, too. Although if you are running WPMU, then you will be familiar with much of what you need to know. One of the great things about this project is that we’re using one of the most popular pieces of publishing software on the web and there is already a lot of familiarity with it.

Data extraction and syndication

A lot of work went into developing digress.it with this in mind and Alex has also developed a plugin that posts RDF triples to the Talis Platform from WPMU. I’ll say more about that below, but here’s a post I wrote recently about JISCPress and Open Data. I hope you’ll agree that we’ve made good progress in this area.

A demonstration of how paragraph level content from JISCPress can be transcluded (that is, embedded) on a third party site is available here: Paragraph Embedding from JISCPress. The post includes a Javascript snippet that can pull in a random paragraph from a contiguous set of paragraphs on a single JISCPress page. Such a utility might be used to rotate through headline paragraphs on a JISCPress document front page, for example.
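
The selection logic behind that kind of rotation is simple enough to sketch. The following is not the Javascript from the linked post, just a minimal Python illustration of picking one paragraph at random from a contiguous range; the pick_paragraph function and the sample page are made up for the example, and fetching the paragraphs from a JISCPress page is left out.

```python
import random


def pick_paragraph(paragraphs, first, last, rng=random):
    """Pick one paragraph at random from those numbered first..last.

    Paragraph numbers are 1-based and the range is inclusive.
    Hypothetical helper, not taken from digress.it.
    """
    if not 1 <= first <= last <= len(paragraphs):
        raise ValueError("paragraph range is outside the page")
    number = rng.randint(first, last)  # randint includes both endpoints
    return number, paragraphs[number - 1]


# Illustrative page content only.
page = ["Intro.", "Headline one.", "Headline two.", "Headline three.", "Footer."]
number, text = pick_paragraph(page, 2, 4)
print(number, text)
```

Returning the paragraph number along with the text means the embedding page could still link back to the exact paragraph it is quoting.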

On the subject of web analytics, we reviewed how to collect JICWEBS recommended metrics using Google Analytics (Measuring Website Usage With Google Analytics, Part I), how to monitor traffic coming in from University networks, and how to exploit Google Analytics campaign tracking codes from shared links and via Feedburner. We did intend to look at Piwik, too, as we’ve tinkered with it on WriteToReply, but I’m afraid it didn’t happen. I do want to highlight Piwik to anyone interested in open source analytics software. It allows you to expose the analytics for any given site to the public and provides a number of ways to access the data, via CSV, JSON, XML, RSS. Have a play around on the WriteToReply site to see what I mean. Naturally, there is a WordPress plugin for Piwik, too. Google Analytics has also recently opened up an API that would support the development of custom reporting dashboards. Steph Gray at Helpful Technology has recently described an “ultimate dashboard” built around free tracking/monitoring services – Minding the shop – and is willing to share the code. Such a dashboard could be used to provide a useful overview of document related discussion on the wider web.
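
For anyone unfamiliar with campaign tracking codes, they are just extra query parameters (utm_source, utm_medium and utm_campaign) added to a link before it is shared. A minimal sketch of tagging a link that way follows; the add_campaign helper, the example page URL and the source/medium/campaign values are made up for illustration and are not what we actually used.

```python
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse


def add_campaign(url, source, medium, campaign):
    """Append Google Analytics campaign parameters to a URL.

    Hypothetical helper; any existing query parameters are kept.
    """
    parts = urlparse(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query += [
        ("utm_source", source),      # where the link was shared
        ("utm_medium", medium),      # the kind of channel
        ("utm_campaign", campaign),  # a name for this particular push
    ]
    return urlunparse(parts._replace(query=urlencode(query)))


# Example values are invented for this sketch.
print(add_campaign("http://jiscpress.org/?p=12", "feedburner", "feed", "jisc-calls"))
```

Parsing the URL first, rather than simply appending a string, avoids ending up with two question marks when the link already carries a query string.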

Documented user stories

This is an area where I wonder how we might have improved things. As I’ve written about before, the project team has worked virtually, hardly ever meeting and we worked very much in public, with digress.it having a fairly broad range of users mid-way into the project. For some of our work, we were not short of user feedback but for other areas, we received hardly any. I set up a UserVoice site early on and we’ve asked people to use it and linked to it from our blog, but it’s never seen any use. Feedback on the project in general has been quite low, although we receive informal feedback on how WriteToReply might be improved quite regularly and similarly, there has been a lot of feedback on digress.it over Twitter.

I have tried to document some of the uses of JISCPress that I could think of. They’re not really User Stories or even complete scenarios, but they do sketch out some of the possible uses and I’ll add to them as I think of more. It will be interesting to demonstrate JISCPress to JISC and be able to respond to their feedback now we have a working platform. JISC staff have started using digress.it on WriteToReply (the #jisclms Call was re-published by two JISC staff) and I anticipate there being more opportunities for us to work with JISC staff on using some of the work we’ve done in the near future.

Other work

I mentioned the work that Alex has done on connecting WPMU to the Talis Platform. Using Triplify, he’s developed a plugin that runs a service which posts RDF Linked Data to Talis. It can be downloaded from the WordPress repository.

The plugin has some global options, available to the platform admin, which are pretty self-explanatory (click to enlarge)

Talis Global Options

Each site owner also has the option to participate, too. First you agree:

Talis T&C

and then you set your license:

Talis site options

The RDF data is available at both the http://example.jiscpress.org/triplify endpoint and for query on Talis. This is what you get back, for example:

Talis Platform query result

The configuration file for WPMU is available here. Be sure to check it over to ensure that it exposes the data you want it to expose and no more! Also, do read the Triplify documentation. You’ll see that it does more than produce RDF. There’s JSON output, too.

Final reflections

These final reflections are from Joss.

I’m pretty pleased with the way the project has gone. It’s been a real pleasure to work on the project and to work with Tony, Eddie and Alex. Not once have I felt it a chore even at some of the ridiculous hours of the night that we’ve worked. I’m also very grateful to JISC for thinking that it was a project worth funding and the support from JISC in terms of the JISCRI conference/IE Demonstrator and general lack of interference, but steady encouragement, has been really welcome, too.

I think we’ve delivered enough of the original objectives to have made it all worthwhile. There are areas, such as accessibility and more user testing, which I wish we’d been able to do better, but it wasn’t for want of trying. I really wish we’d cracked simple embedding of paragraphs and paragraph-level trackbacks, too. We’re close, but not quite there, yet.

On the other hand, we have produced some pretty nice plugins for WordPress. For WordPress developers, digress.it provides some significant innovations around document publishing and Open Data in WordPress and the code is free to be built on and improved. Likewise, we’ve also produced the first plugin that allows WPMU admins to run a background service for Open Calais and Yahoo Term Extraction across all blogs/documents. Another ‘first’ for the project, is that we’ve joined WPMU to Triplify and the Talis Platform. Huge amounts of data are generated by WPMU installations and now that can be Linked Data hosted on a well known Data Store. It would not be difficult to bring both plugins together so that the Open Calais semantic tags are included as data that is posted to the Talis Platform.

The legacy of the project will hopefully be similar to what we’re trying to achieve with WriteToReply. Not necessarily a wholesale conversion of organisations to using the platform with all its features that I’ve described above, but for people to pick and choose what works best for them. It may be that the project just inspires something better or different on another platform, like Drupal, and that’s good, too. It’s also about trying to demonstrate how publishing and engagement with public documents on the web can still, very much, be improved. In that sense, there is probably as much value in the blog posts that our work has generated as the software itself. We’ll continue to write about this on the WriteToReply blog.

All the code is open source and available under the GPL (digress.it) or Modified BSD license (Calais and Talis plugins). Eddie continues to maintain digress.it and I know that Alex is keen to ensure that any issues identified with his code are dealt with too. If you use the code and improve it in any way, do tell us as we’re keen to include contributions from other developers.

Final Progress Post

Please also refer to this post, which provides a full and final overview of the project deliverables.

JISCPress: A prototype demonstrator publishing platform for JISC funding calls and project reports.

Screenshots or diagram of prototype

JISCPress Home Page

Description of Prototype

JISCPress allows communities to comment on, discuss, annotate and review documents in considerable detail. As a platform, JISCPress discovers relationships between hosted documents and provides a variety of ways to make discovery easy and useful.

Link to working prototype

http://jiscpress.org <– temporary until Feb 2010 (see also http://writetoreply.org)

Link to end user documentation

http://code.google.com/p/jiscpress/wiki/Documentation

Link to code repository or API

http://code.google.com/p/jiscpress/source/

http://code.google.com/p/digressit/source/

Link to technical documentation

http://code.google.com/p/jiscpress/wiki/Documentation

Date prototype was launched

17/12/09

Project Team Names, Emails and Organisations

Joss Winn, Alex Bilbie (University of Lincoln)

Tony Hirst (Open University)

Eddie Tejeda (Freelance developer, Visudo)

Project Website

https://jiscpress.blogs.lincoln.ac.uk

PIMS entry

https://pims.jisc.ac.uk/projects/view/1348

Table of Contents for Project Posts

Please use the tag cloud in the right-hand sidebar.

Working remotely as a team

One of the things that I wanted to get a feel for during this project was how the university might successfully apply for and work on rapid innovation projects while not actually having development staff of our own. We have staff in ICT and Marketing who work on the corporate web and online services, such as Blackboard and Sharepoint. There are programmers in the School of Computing, too, who have their own research projects on the go, but there is no ‘development team’ who might be approached to work on JISCRI type projects. I am an adequate Linux SysAdmin, but am no ‘developer’. I can’t write a line of code.

The four members of the JISCPress team are:

Joss Winn (Staff, University of Lincoln)

Tony Hirst (Staff, Open University)

Alex Bilbie (Student, University of Lincoln)

Eddie Tejeda (Freelance developer, Visudo, San Francisco)

Project Methodology?

I wrote a bit about project methodology at the start of the project. The emphasis from JISC on using an ‘agile methodology’ just didn’t seem to fit the structure of our Team. All the agile methodologies I knew of emphasised the importance of working closely, sometimes even in pairs, and seemed designed for tightly focused, customer orientated projects with regular iterations around sprints of development. There was no way that I was going to be able to organise Tony, Eddie and Alex around regular sprints and had I tried to crack the whip, I think I would have received far less co-operation from them than I did. As it turned out, the team never actually met in the same room (err, pub…) until the JISC CETIS Conference earlier this month. On a couple of occasions during the summer, Alex worked in my office, but it wasn’t especially productive. It was clear that he was much more comfortable working at home. While Eddie was over for the CETIS conference (he paid his own way here, I want to add!), he came to Lincoln and worked for a day in the office with me. It was a productive day and I could see how teams working together like that could be highly productive both in the development of ideas and code but it wasn’t unlike how Eddie and I would work together virtually.

The other significant contributing factor to the way we worked was the type of development we were doing. Again, agile methodologies seem designed around creating a user-led product. There’s an expectation that the development team are working for a client who is embedded in the decision making process for the project. The client ‘owns’ the project and they ultimately own the code, too.

JISCPress didn’t have this type of client and we weren’t solely focused on the writing of new code either. The project was as much about taking existing code and piecing it together in a fruitful way, developing what we’d already pieced together with WriteToReply. We started with the WordPress Multi-User platform and CommentPress. We looked at some other existing WordPress plugins for working with Open Calais and creating relationships between content and also created a configuration file for Triplify and a plugin to publish data to the Talis Platform. We bootstrapped a complete rewrite of CommentPress, called digress.it and Alex wrote plugins for making WPMU work with OpenCalais and Triplify.

IPR

The University of Lincoln has no interest in ‘owning’ the code (no-one here would maintain it), JISC have no interest in owning the code (no-one there would maintain it) and the code is completely reliant on other open source software and APIs which we don’t maintain either. I was advised by OSS Watch to ensure that the code we funded was, wherever possible, copyright University of Lincoln, but I’ve ignored that advice. I just don’t see who it would benefit. In order to make the work interesting and sustainable, we needed Eddie and Alex to have a vested interest in the code they were writing. Eddie’s work on CommentPress was already credited to him and Jesse Wilbur and licensed under the GPL3. There was no advantage in the University of Lincoln claiming copyright in the digress.it code we were funding and I’m sure that had we insisted on this, Eddie wouldn’t have worked for us. We could have claimed copyright for the code that Alex has written, but as I said, no-one here will maintain it, so I think it’s better to encourage Alex to remain the copyright holder in the hope that he might maintain it for a while. There is no ££ value in WordPress plugin code. There’s only value in providing services around the code and Alex is the person best suited to do that, not the University.

How we actually worked together

My original intention was that we’d work together in public on IRC. Tony and I had started doing this on WriteToReply and thought we’d just continue the weekly meetings we were having. This happened once at the start of the project, but after that, we all just used Google Talk. The reason for this is simple: we all use it anyway. Eddie and I work with GMail open and rather than firing up another application just to ask a quick question, we used GTalk. It allowed us to see when we were online and effortlessly kept a log of everything we were saying. Alex and Tony also seemed to prefer to use it, too. Because it was attached to our personal email addresses, rather than work addresses, it also meant that we were available after 5pm and over the weekends. The convenience of using GTalk meant that on many occasions, I’d be contacted by Eddie as I was going to bed and end up spending late nights testing his latest code revisions and feeding back comments via GTalk. Had we relied on other tools, I wouldn’t have been in the habit of using them at home and consequently Eddie and I would have chatted far less than we did.

Our ‘office’ consisted of the Google Code svn repository and Google Talk. Eddie would check code in, I’d be waiting on the jiscpress.org server to check it out and then we’d chat about it over Google Talk. Anything that required time to fix, was added to the list of Issues. Alex and I worked differently. He’d usually come to my office on campus and show me what he’d done. I preferred to work over svn and Alex taught himself how to use it, but we never worked in real-time over svn as Eddie and I do.

Tony’s role was always one of provocateur and champion for the project. A lot of the best ideas (and hardest to implement) originated with Tony and then I would try to push them into Eddie’s list of tasks, which were hosted on the digress.it Google Code site.

Working with students

I think we’ve been really lucky to have Alex on the team. In the Centre for Educational Research and Development, where I work, we’re keen to work with and undertake research with students and it seemed like a natural extension of this  to be working with Alex. The project was nicely timed over the summer holidays, too, so Alex had quite a bit of time to spare during July – September. Noticeably, when the semester started again, he was very pushed for time partly because I’d found him other paying projects to work on and regrettably I ended up competing for his time! I should add that both Eddie and Alex worked more hours than they were actually paid for – not unusual when working in education, but from my point of view, it meant I was juggling a lot of good will at times.

Alex also came to the JISCRI and CETIS conferences where he seemed to really get a lot out of meeting other developers and talking about the work he has been doing on the project. I think all of us would say that the project has been an opportunity to learn from each other and hopefully we’ve documented the useful bits on this blog for others to learn from too. Eddie talks about his work for the project in a video interview I made with him. I’ll link to it from here when it’s ready.

Who were our users?

Agile methodologies put a lot of emphasis on the user and their stories. We didn’t arrange for a distinct set of users to feed back to the project. The main stakeholders for the project were JISC and the JISC community, not Lincoln staff and students, so we needed to engage them as much as possible. We had already done this to a certain degree through WriteToReply and had republished the JISCRI call specifically to see how the community might use the platform. During the project, JISC launched their draft Strategy and a consultation on the Google Book Settlement, both of which used early versions of the digress.it code we were developing. On a few occasions, I pointed people via Twitter and the blog to our UserVoice site, but no-one offered anything this way. Tony acted as an uber-user, looking at the platform from the point of view of a developer/mashup-artist and consequently fed back lots of thoughts for how it could be improved. Tony and I have also spent hours and hours publishing documents via WriteToReply and we’re acutely aware of the issues that publishing users might face. We also gained a lot of users following the launch of digress.it and it was embraced by the WordPress community (931 downloads and counting) as well as people signing up to Eddie’s site http://digress.it. A mailing list specifically for digress.it was set up and people would feed back issues and requests that way. We also have a public mailing list for the JISCPress project as a whole. It has a few people lurking on it, but has not been a particularly effective method of communication. Perhaps it will be if people pick up the work we’ve done and try to set up their own version of JISCPress, as I hope they will do. As a public open source project, we pursued our original ideas in public and as the code was released, waited for users to feed back to us. Alex, too, got feedback in this way for his OpenCalais plugin and made changes to his code based on a suggestion from another developer who is using it.

Misc.

A couple of things I didn’t plan for when working with Alex and Eddie were payroll related. Customs and Excise have told us that we have to pay VAT on Eddie’s invoices.  Fair enough. It didn’t occur to me because he doesn’t reside in the UK, but it makes sense. I learned a lesson and it didn’t affect the budget because the exchange rate improved in our favour over the course of the project and what we saved there, we spent on VAT.

Alex began as a freelancer with the intention of invoicing the project for his work at the agreed amount, but the university said it would be easier for Alex if he was put on the payroll and then his tax and NI would be dealt with. It affected the project by adding about £30 in NI contributions to the total amount we’d agreed – good value, I think. One less set of invoices that I had to deal with.

Did it work?

As I said at the top of this post, one of the benefits for me and the University of Lincoln was to see how effectively we could run a development project in this way. It’s likely that any development project I get involved in will be based on existing open source software that has an existing community of developers and users, just as WordPress does. I’m just not interested in re-inventing the wheel on something completely new. I prefer to contribute rather than invent. In this respect, I feel like it’s been a success. Within a few days, we’ll have the prototype we set out to create and within a week or two, it will be documented so that anyone should be able to pick JISCPress up and make it their own. JISC have seen the benefits of using it on three documents so far and in one shape or another, I think they’ll continue to use it. We’ve shared our ideas around the project widely, both on this blog and the other sites we use (see sidebar) and have discussed and demonstrated our work at two conferences. digress.it is now a well-regarded WordPress plugin with a growing number of users. Recently, Cornell University have employed Eddie to continue to develop digress.it for a project they are running with the White House. Through the JISCPress project, JISC have directly contributed to the work Cornell are doing. Likewise, the New York Public Library also want Eddie to develop digress.it for them, so I’m confident that this part of our work will be maintained and fed back into the core tree of the code for everyone to benefit from.

WriteToReply has benefited of course. We’ve been able to spend time thinking about and working on ‘stuff’ which we intend to use on http://writetoreply.org. Recently Eduserv offered to help us with hosting WriteToReply which has encouraged Tony and me to keep pursuing our interests in document publishing and public engagement on the web. I hope that we can work with Eduserv over the coming months and pass on what we know about WordPress hosting.

Finally, based on the work we’ve done on JISCPress with Triplify and WPMU, I plan to apply for funding from Talis to develop a ‘wordpress.com for OPACs’. It sounds ambitious, but a lot of the work has already been done on this and the Scriblio project and I think we can show it is a viable idea. You can read more about what we have in mind, here.

Using JISCPress/Digress.it for Reading List Publication

One of the things I’ve been doodling with but not managing to progress much thinking wise (not enough dog walking time lately!) is how we might be able to use the digress.it WordPress theme to support various course related functions in ways that exploit the disaggregating features of the theme.

Chatting with Huw Jones last week about his upcoming Arcadia seminar on “The Problem of Reading Lists” (this coming Tuesday, Nov 24th – all welcome;-) I started thinking again about the potential for using digress.it as a means of publishing, and collecting comments on, reading lists.

So for example, over on the doodlings WriteToReply site I’ve posted an example of how a reading list posted under the theme is automatically disaggregated into separate, uniquely identified references:

The reading list was generated simply by copying and pasting a PDF based reading list into a WordPress blog post. Looking at the format of the list, one could imagine adding further comments or notes relating to each reference using a blog comment. Given that the basis of each paragraph is a citation to a particular work, it might be possible to parse out enough information to generate a link to a search on the University OPAC for the corresponding work (and if so, pull back an indication of the availability of the book as, for example, my Library Traveler script used to do for books viewed on Amazon).

Under the current in-testing digress.it theme, each paragraph on the page can be made available as a separate item in an RSS feed; that is, as well as the standard ‘single item’ RSS page feed that WordPress generates automatically, we can get an N-item feed from the page for the N-paragraphs contained on a page.

Which in turn means that to generate an itemised RSS feed version of a reading list, all I need to do is paste the reading list – with each reference in a separate paragraph – into a single blog post. (The same is true for disaggregating/feed itemising previous exam papers, for example, or I guess video links in order to generate a DeliTV programme bundle…?!)

(For more details of the various ways in which digress.it can automatically disaggregate/atomise a document, see Open Data: What Have We Got?.)

PS just a reminder again – Huw’s Reading List project talk, which is about far more than just reading lists, is on Tuesday in the Old Combination Room, Wolfson College, Cambridge, at 6pm.

Open Data. What have we got?

I attended the ‘Global Graph’ session at the #cetis09 conference and made a largely failed attempt to demo some of the work we’ve been doing with Triplify and the Talis Platform. (In my defence, it wasn’t a planned demo and jiscpress.org was down while Alex was doing some design work).

Anyway, what I would have shown was how each document site on jiscpress.org uses Triplify to provide Linked Data in the form of RDF/N3 triples, which we store on the Talis Platform using a plugin Alex wrote.

Using Alex’s config file for WordPress MultiUser, we drop the triplify directory into the WPMU root directory, alongside wp-admin, wp-includes and all the other WordPress files. You should take a look at the config file and make sure it’s doing what you want it to do, but it will work as it is.  With this in place, Linked Data in the form of an RDF flat file for each document site (blog) is available at http://document.jiscpress.org/triplify or http://jiscpress.org/document/?triplify

(I should warn you that none of the URLs in this post are genuine URLs. They’re examples of syntax. The server at jiscpress.org will stop running at the end of December).

Now, to get that same data onto the Talis Platform, Alex has written a plugin for WPMU that periodically crawls the documents for changes and pushes the new data to a Talis Platform account.  Here are the WPMU site-wide admin options:

Admin settings

and here are the per document site user settings:

User settings

I won’t explain what the plugin does in detail. Just click on those images above and you’ll see the options that are available and if you’re reading this stuff, you know what it’s all about.  The Talis/Triplify plugin for WPMU will appear on  http://wordpress.org/extend/plugins in the next couple of weeks. It’s been tested and it does what we expect it to do but we want to test it more on sub-directory installs before it’s publicly available. Full documentation will appear soon on http://code.google.com/p/jiscpress/wiki/Documentation

We have also developed a WPMU plugin for Open Calais and the Yahoo! Term Extraction API. This provides a background service which indexes each document section (blog post) and creates relationships between content across the platform. We’ll post here about that very soon.

In addition to the Linked Data, JISCPress, using digress.it on WordPress, provides a long list of other open data (not Linked Data) end-points which might be put to good use. Here you go..

Document paragraphs

These are switches that provide individual paragraph data in different formats.

Document sections

This is just the regular WordPress post content in RSS format. In JISCPress terms, it’s the document section which is a single feed item.

http://test.jiscpress.org/2009/07/28/6-how-jisc-invests/feed/?withoutcomments=1

and this is the normal WordPress feed of comments on a particular post/document section.

http://test.jiscpress.org/2009/07/28/6-how-jisc-invests/feed/

We’ve also added the provision of a feed for each document section (‘post’), where each paragraph is a feed item. Note that this makes digress.it a nice tool for building your own feeds out of a single WordPress post.

http://test.jiscpress.org/feed/paragraphlevel/3-jisc-vision-mission-and-objectives/

Per paragraph comments/discussions

For each paragraph, there’s a feed of the comments/discussion.

http://test.jiscpress.org/feed/paragraphcomments/3-jisc-vision-mission-and-objectives,1

Commenter feeds

For each person that comments, there’s a feed of their comments

http://test.jiscpress.org/feed/usercomments/Joss%20Winn
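The three digress.it feed patterns above can be assembled from their parts. This is only a sketch: the function names are made up for illustration and the URL shapes are copied from the example URLs in this post, not from the digress.it source.

```javascript
// Hypothetical helpers: the names are illustrative, the URL patterns are
// taken from the example feed URLs listed in this post.

// Feed for one document section, with one feed item per paragraph.
function paragraphLevelFeed(base, sectionSlug) {
  return base + "/feed/paragraphlevel/" + sectionSlug + "/";
}

// Feed of the comments attached to a single paragraph of a section.
function paragraphCommentsFeed(base, sectionSlug, paragraphNumber) {
  return base + "/feed/paragraphcomments/" + sectionSlug + "," + paragraphNumber;
}

// Feed of every comment left by one commenter; the display name is
// percent-encoded so that spaces become %20.
function userCommentsFeed(base, displayName) {
  return base + "/feed/usercomments/" + encodeURIComponent(displayName);
}
```

For example, `userCommentsFeed("http://test.jiscpress.org", "Joss Winn")` reproduces the commenter feed URL shown above.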

All the other stuff

Don’t forget that the entire document content is also available as a feed

http://test.jiscpress.org/feed/

http://test.jiscpress.org/feed/rss

http://test.jiscpress.org/feed/rss2

http://test.jiscpress.org/feed/atom

http://test.jiscpress.org/feed/rdf

as are all comments from the site, too:

http://test.jiscpress.org/comments/feed

with WordPress, tags also have feeds

http://test.jiscpress.org/tag/tag1/feed

and so do categories

http://test.jiscpress.org/category/category1/feed

You can also combine tags

http://test.jiscpress.org/tag/tag1+tag2+tag3/feed

and you can combine tags and categories

http://test.jiscpress.org/?category_name=category1&tag=tag2,tag3&feed=rss2
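The tag and category combinations can be built the same way. A minimal sketch, assuming the URL shapes shown in the examples above; the function names are hypothetical.

```javascript
// Hypothetical helpers: names are illustrative, URL shapes follow the
// tag and category feed examples in this post.

// Pretty-permalink feed for several tags joined with "+".
function combinedTagFeed(base, tags) {
  const joined = tags.map((tag) => encodeURIComponent(tag)).join("+");
  return base + "/tag/" + joined + "/feed";
}

// Query-string feed that mixes one category with a comma-separated tag list.
// The query is assembled by hand so the commas stay unencoded, as in the
// example URL.
function categoryAndTagFeed(base, category, tags) {
  const query = [
    "category_name=" + encodeURIComponent(category),
    "tag=" + tags.map((tag) => encodeURIComponent(tag)).join(","),
    "feed=rss2",
  ].join("&");
  return base + "/?" + query;
}
```

In WordPress queries a “+” between tags asks for posts carrying all of them, while a comma asks for posts carrying any of them, so the two forms are not interchangeable.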

Finally, authors have a feed, too

http://test.jiscpress.org/author/joss/feed/

Summary

WordPress is a versatile CMS for organising/designing and publishing data as feeds and therefore a useful source of Open Data. JISCPress has extended this versatility by choosing to develop further data end points using digress.it and offering a simple way of publishing Linked Data to the Talis Platform RDF triple store where it can be queried and mashed up using the platform’s API.

I am at Lincoln LocalGovCamp, where 30 o…

I am at Lincoln LocalGovCamp, where 30 or so people have gathered to create an unconference around improving local government online. This morning, I started a session on online consultations where I talked about WriteToReply and the development of our ideas and the platform through the JISCPress project. There was a lot of positive feedback and twitter back channel chat about our work which was really encouraging. People seemed to appreciate our efforts around making the platform a source for open data via the URI switches, RSS feeds and Triplify end points. I’ve just given a five minute video interview where I introduce WriteToReply and JISCPress. It should appear on http://www.lgeoresearch.com/ soon.

A quick update

A lot of development is happening right now, so I thought I’d write a very quick summary to keep people informed.

Firstly, version 2.2 of the digress.it plugin was released yesterday. Remember that the JISCPress project bootstrapped the re-development of CommentPress (which has been at v1.4.1 for over a year now, I think) and we helped Eddie release digress.it v2 back in mid-August. We’ve had seven releases since then and v2.2 finally brings IE6 compatibility with it (IE7 came in v2.1.7). It feels stable now and provides pretty much the same experience across browsers. Performance is superb on a modern browser like Chrome 3, Firefox 3.5 or Safari 4. I’ve found that with wp-super-cache installed, too, pages are rendered in a snap.

I’ve also started to document the features that come with digress.it. Some of the really interesting stuff isn’t immediately obvious, like the incredible range of RSS feeds that are now available and the switches for RSS, JSON, XML, HTML and text. @paulgeraghty asked on Twitter whether this might be ‘micro-content’. I’d be interested to know if there are other CMS platforms that provide a formal method of obtaining document data at the paragraph level.

http://test.jiscpress.org/?p=15&digressit-embed=1&format=xml
http://test.jiscpress.org/?p=15&digressit-embed=1&format=text
http://test.jiscpress.org/?p=15&digressit-embed=1&format=rss
http://test.jiscpress.org/?p=15&digressit-embed=1&format=html
http://test.jiscpress.org/?p=15&digressit-embed=1&format=json

And remember that this is in addition to the full document or document section level RSS feeds that are built into WordPress. We’ve also introduced RSS feeds for each comment author and for the discussion around each paragraph, so if you want to follow one particular person or a discussion around one particular paragraph, you can.
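A rough sketch of how the format switches listed above might be generated for any post. The function names are hypothetical, and reading `digressit-embed` as a paragraph number is an assumption inferred from the example URLs, not something confirmed by the digress.it documentation.

```javascript
// The five output formats named in this post.
const EMBED_FORMATS = ["xml", "text", "rss", "html", "json"];

// Hypothetical helper: builds one paragraph-level URL.
// ASSUMPTION: digressit-embed carries the paragraph number.
function embedUrl(base, postId, paragraphNumber, format) {
  if (!EMBED_FORMATS.includes(format)) {
    throw new Error("Unknown format: " + format);
  }
  return base + "/?p=" + postId +
    "&digressit-embed=" + paragraphNumber +
    "&format=" + format;
}

// Builds the URL for every known format of a single paragraph.
function allEmbedUrls(base, postId, paragraphNumber) {
  return EMBED_FORMATS.map((format) => embedUrl(base, postId, paragraphNumber, format));
}
```

Calling `allEmbedUrls("http://test.jiscpress.org", 15, 1)` yields the same five URLs as the list above, in the same order.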

We’re still working on ways to provide an easy way to copy and paste some code and embed a paragraph in your own site, while at the same time giving us a paragraph-level trackback. We’ve been trying various different methods but none of them have worked so far. We’re close though. If you’ve got any ideas for how this might be achieved, please leave a comment 🙂

Alex has been working hard on platform-wide features. He recently uploaded his ‘related documents’ code which looks across the entire platform of documents and makes suggestions for related document sections in the page sidebar. What’s especially interesting about this is the way this is achieved as a background service that runs periodically (you choose how often) and uses the OpenCalais API to provide contextual tags and the Yahoo! Term Extraction API to extract terms from the document. The relevancy of the tags received can be adjusted and author entered tags are also taken into account. These three different methods of mining the document ensure that the document sections that are ‘advertised’ to readers are relevant to the document they are currently reading.

Alex has also been working on integrating Triplify with JISCPress (and WordPressMU).

Triplify is a small plugin for Web applications, which reveals the semantic structures encoded in relational databases by making database content available as RDF, JSON or Linked Data.

In practice, this means that the semantic structures for each JISCPress document are now available as RDF triples. Click here and you’ll get an XML/RDF file for a single document. Alex has also written a plugin for WPMU which will work with Triplify and allow the document author to include a license of their own choice in the RDF. Finally, he’s been testing this with Talis’ Connected Commons triple store and now has the WPMU plugin pushing RDF triples to Talis where they can be queried and mashed up using the Talis API. His work on this should go up on our Google Code site in the next few days.

It all needs testing and tweaking a little more, but the substantial part of the work on these three plugins has been done and now it’s a matter of refining them and integrating the platform as a whole and documenting it thoroughly. We’re always interested in what you would like to see the JISCPress project achieve, so please take a look at our UserVoice site and add any suggestions you might have.  We’re also tracking Issues about JISCPress on Google Code and Issues specifically about digress.it on the digress.it Google Code site. You can also get the development code for digress.it there, too.

More soon!

Scholarly publishing with WordPress

Working on the JISCPress project, I’ve been thinking quite a lot about scholarly publishing on the web, and in particular with WordPress. This morning, I read a post over on the ArchivePress blog about some WordPress plugins which are useful additions for creating a scholarly blog and it got me thinking a bit more about what features WordPress would need to support scholarly publishing.

JISCPress does away with the idea that WordPress is a blogging tool, and instead uses WordPress Multi-User as a document publishing platform, where one site or ‘blog’ is a document. The way WPMU is structured means that despite serving multiple (potentially millions) of document sites, the platform remains relatively ‘lightweight’ as each document site generates just a handful of additional database tables, while sharing the same administrative core as a single WordPress install. So, 100 WordPress blogs on WPMU is nothing like the equivalent of running 100 separate WordPress blogs, both from the point of resource requirements and administration. In fact, quite soon, there will be no such thing as WPMU as the two products are going to be merged and because they share 90%+ of the same code already, it’s not too difficult to achieve.1

Anyway, my point here is to discuss whether WordPress can be extended to accommodate most conventions found in scholarly publishing and where it is lacking, to identify the development work required to meet the needs of most academics who wish to write on and publish to the web.2

Scholarly publishing extends to a wide variety of published outputs. As a Content Management System (CMS) and technology development platform, I believe that WordPress has the potential to support any type of scholarly publishing that the web supports. It is extremely extensible, as can be seen from the 6000+ plugins that are available. However, what I’m interested in is what can be done now, by an academic wishing to publish their work through the use of WordPress acting as a CMS. What can be achieved with a few quid3 to self-host WordPress so that a few plugins can be installed and a well structured, typical, scholarly paper can be published?

My Dissertation

For some time, I’ve been meaning to publish my MA dissertation. Back in 2002, I undertook some unique research which has not, to my knowledge, been repeated and I think there is some value in having it easily accessible on the web. I have an OpenOffice file and a PDF and, in the course of a morning, have published it under my own domain. The reason I did not publish it on the university WPMU platform is because I have been experimenting with different plugins and did not want to install plugins that were untested or we may not support long-term.  In this case, I’ve used a single WordPress installation, but ideally an individual researcher, group of researchers or research institution, would run a WPMU installation which allowed multiple documents to be authored individually or collaboratively4 and published directly to the web as XHTML.

BuddyPress, by the way, can make the experience even more natural, not only because it is based around a community of like-minded people writing together  on the same web publishing platform, but also because, with a few tweaks here and there, we can move away from the language of blogs and towards the language of documents.


BuddyPress admin bar

Profile menu

Enough of BuddyPress on WPMU for now and back to my dissertation. I set up the site in ten minutes, without using FTP or a command line because I use a host that provides a one-click install of WordPress and WordPress allows you to search for and install plugins from its Dashboard, rather than having to use FTP. Once the site was installed, I then  made some basic changes to the settings, turning on XML-RPC and AtomPub, so that, if I decided to, I could publish to the site using my Word Processor.5 I didn’t use this in the end, but trust me, it works very well using recent versions of MS Word, Open Office (free) and other blogging clients such as MS Live Writer (free).

So, what are the common characteristics of an academic paper? What does WordPress have to support to provide functionality that meets most scholars’ publishing requirements? I scratched my head (and asked on Twitter) and came up with the following:

  • footnotes/endnotes
  • citations
  • use of LaTeX (sciences)
  • tables
  • images
  • bibliography
  • sub-headings
  • annexes
  • appendices
  • dedication
  • abstract
  • table of contents
  • index to figures
  • introduction
  • exposition
  • conclusion

Many of these are supported in WordPress by default and don’t require any additional plugins (tables, images, sub-headings, annexes, appendices, dedication, abstract, introduction, exposition, conclusion, are all either basic literary conventions or just part of a simply structured document).

For additional support, I installed digress.it, which we have funded through the JISCPress project. This is a WordPress plugin which allows readers to comment on the paragraphs of a document, rather than at the document section level. We’re adding a lot more functionality to meet the objectives of the JISCPress project, but I chose digress.it, principally for the reason that it is designed to turn a WordPress blog into a document site. I could have used any other WordPress theme, but digress.it automatically creates a Table of Contents and allows you to re-order WordPress posts when they are read so that you don’t have to author your document in reverse or adjust the publication dates so the document sections appear in the correct order.

My dissertation published using digress.it

I added the abstract for my dissertation to the ‘about’ page, so it shows up on the front of the site. I also uploaded a PDF version so that people can download it directly. You’ll see that I also added some links to a related book and DVD, which will certainly appeal to people who are interested in my dissertation. The links pull an image and some basic metadata from Amazon, using the Amazon Machine Tags plugin. This could be used to link to the book in which your article is published and earn you money in click referrals. An alternative, would be the Open Book Book Data plugin, which retrieves a book cover and metadata from Open Library, where your book may already be catalogued. If it’s not on Open Library, catalogue it!

After setting this up, I installed a few more plugins:

Dublin Core for WordPress: Automatically adds ten Dublin Core metadata elements to the document mark up.

wp-footnotes: This allows you to easily add footnotes to your document by enclosing your footnote in double parentheses.6

OAI-ORE Resource Map: Automatically marks up the document sections with a OAI-ORE 1.0 resource map.

Google Analyticator: Adds Google Analytics support so you can collect statistics on the readership of your document.

WP Calais Archive Tagger: Analyses your entire document and automatically keywords each section, using the Open Calais API.

Search API: WordPress comes with search built in, but there is a new search API which will eventually make its way into the WordPress core. I’ve installed the plugin to provide full-text search across the document. It can also add Google Search to your document site.

wp-super-cache: This is simple to install and will significantly speed up your document site, making it a pleasure to navigate through and read :-)
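To illustrate the double-parentheses footnote convention mentioned for wp-footnotes above, here is a small sketch of how such markup could be turned into numbered markers plus a list of notes. It is not the plugin’s actual code; the function name and the `[n]` marker style are assumptions made for the example.

```javascript
// Hypothetical illustration of a ((footnote)) convention; this is NOT the
// wp-footnotes implementation, just a sketch of the idea.
function extractFootnotes(text) {
  const notes = [];
  // Match an optional leading space followed by ((...)), non-greedily.
  const body = text.replace(/ ?\(\((.+?)\)\)/g, (match, note) => {
    notes.push(note.trim());
    // Replace the inline note with a numbered marker.
    return "[" + notes.length + "]";
  });
  return { body: body, notes: notes };
}

const example = extractFootnotes(
  "A claim ((first note)) and another ((second note))."
);
// example.body  is "A claim[1] and another[2]."
// example.notes is ["first note", "second note"]
```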

Plugins I didn’t use

wp-latex: Although I didn’t need it for my dissertation, it’s worth noting that WordPress supports the use of LaTeX.

Academic Citation: You need to add a line of code to your theme for this to display. It supports the concept of an article being a single blog post, rather than a ‘document site’ and displays a variety of citation formats for readers to use.

Do you know of any other plugins for a scholarly blog?

The Beauty of Feeds

The other useful thing about managing a document using WordPress and in particular, using digress.it, is that you automatically get RSS/Atom feeds for the document. I’ve already discussed these in detail. It means that I was able to read my document in my feed reader, with footnotes and images displayed correctly.

Document in Google Reader

See how nicely the formatting is preserved. LaTeX is also rendered correctly in feed readers.

Document formatted nicely in Google Reader

Reading my dissertation in Google Reader

You’ll see that the document sections are listed in order; that is, first section on top. As I noted above, blogs list posts in reverse (most recent first), so I ran the feed through Yahoo Pipes and sorted the items in ascending order. Yahoo Pipes exports as RSS and it’s that feed that I subscribed to in Google Reader. Wouldn’t it be nice if I could import my document feed into an Institutional Repository? Wait a minute, I can! :-)

Importing an RSS feed into EPrints

Click to see the item in the repository

When importing the default feed, the HTML output is accurate but in reverse order, while the RSS output from Yahoo Pipes didn’t import into EPrints very cleanly at all. I’ll work on this. UPDATE: Forget Yahoo Pipes. WordPress feeds can be sorted with a switch added to the URL: http://example.com/feed/?orderby=post_date&order=ASC
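A small sketch of both approaches to getting the sections in reading order. The `orderby` and `order` parameters are the ones given in the update above; the function names and the shape of the feed items (objects with a `pubDate` string) are assumptions for illustration.

```javascript
// Adds the ordering switch from the update above to any feed URL.
function ascendingFeedUrl(feedUrl) {
  const url = new URL(feedUrl);
  url.searchParams.set("orderby", "post_date");
  url.searchParams.set("order", "ASC");
  return url.toString();
}

// Fallback for feeds that cannot be re-ordered at the source: sort
// already-parsed items oldest first.
// ASSUMPTION: each item is an object with an RSS-style pubDate string.
function sortItemsAscending(items) {
  return items
    .slice() // copy, so the caller's array is left untouched
    .sort((a, b) => Date.parse(a.pubDate) - Date.parse(b.pubDate));
}
```

`ascendingFeedUrl("http://example.com/feed/")` gives the URL quoted in the update.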

So there it is. An academic paper, published to the web using a modern CMS which supports most authoring and publishing requirements. I would favour an institutional WPMU platform for academics to author directly to, publish their pre-print to the web for open access and detailed comment, and import their RSS feed into the repository. As a proof of concept, I’m quite pleased with this. We are currently developing a widget that can be embedded in a web page or WordPress sidebar and allow a member of staff to upload a document or zipped folder of documents to the Institutional Repository. I wonder if we can also support the import of a feed from the widget, too?

So, what would your requirements be? Tell me and I’ll do my best to test WordPress against them.

  1. Has anyone done a diff on the two code bases to measure exactly what percentage of the code is shared between WP and WPMU?
  2. Actually, I think I’ll save the discussion of its shortfalls for my next post. This one is already long enough.
  3. I pay $5/year for my domain name and as many sub-domains as I need. I pay $10/month for my hosting with unlimited storage and bandwidth.
  4. Like any decent CMS, WordPress supports role-based authoring and editing and maintains a revision history of edits, auto-saved once per minute. Revisions can be compared alongside of each other.
  5. On a scholarly WPMU installation, plugins could be pre-installed and activated, a default theme selected and settings tweaked so very little work is required by the academic author prior to writing her document.
  6. I am using the plugin on this blog!

Image Based Quotes from WriteToReply Using Kwout

One of the things we discussed with respect to embedding WriteToReply/JISCPress quotes in third party applications was whether or not we should support an “imagified” embedding – that is, convert a paragraph to a JPG or PNG image format that can then be easily embedded in the third party site.

The advantage? Even if the third party site disallows script, object or embed tags, it will probably allow img tags…

So for example, extending the range of output formats suggested in Taking the Conversation Elsewhere – Embedded Quotes, we might consider something like an &output=png switch that allows us to construct an image embedding code along the lines of:

<img src="http://docserver.example.com?p=POSTNUMBER&digress-embed=PARANUMBER&output=png" longdesc="http://docserver.example.com?p=POSTNUMBER&digress-embed=PARANUMBER">

Once again, there’s a trackback issue, although it’s easy enough to wrap the image tag in an appropriate anchor tag:

<a href="http://docserver.example.com?p=POSTNUMBER&para=PARANUMBER"><img src="http://docserver.example.com?p=POSTNUMBER&digress-embed=PARANUMBER&output=png" longdesc="http://docserver.example.com?p=POSTNUMBER&digress-embed=PARANUMBER"></a>
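The anchor-wrapped image embed above could be generated from a post number and paragraph number along these lines. This is a sketch only: the `output=png` switch is the proposal discussed here rather than an existing feature, and the function names are hypothetical.

```javascript
// Escape characters that are unsafe inside a double-quoted HTML attribute.
function escapeAttribute(value) {
  return value
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;");
}

// Hypothetical helper: builds the linked image embed code.
// ASSUMPTION: output=png is the proposed (not implemented) image switch.
function imageEmbedCode(server, postNumber, paraNumber) {
  const paragraphUrl = server + "?p=" + postNumber + "&digress-embed=" + paraNumber;
  const imageUrl = paragraphUrl + "&output=png";
  const linkUrl = server + "?p=" + postNumber + "&para=" + paraNumber;
  return '<a href="' + escapeAttribute(linkUrl) + '">' +
    '<img src="' + escapeAttribute(imageUrl) + '"' +
    ' longdesc="' + escapeAttribute(paragraphUrl) + '"></a>';
}
```

Escaping the ampersands as `&amp;` keeps the generated markup valid when it is pasted into a third party page.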

However, this facility was seen as non-essential, so I looked on the web for a solution – and found it in the form of the kwout API which can be used to generate an image based representation of text found in a specified div tag (by ID) on a given web page, which can then in turn be embedded in an arbitrary web page. Although the image may be hard to read, this can work to our advantage: it might drive traffic back to the site that originated the quote :-)

The following JavaScript snippet uses the Kwout API to generate an image-based representation of a single paragraph from a WriteToReply republished document:

javascript:window.location='http://kwout.com/grab?address='+encodeURIComponent("http://writetoreply.org/pluralnews/2009/07/03/section-1-securing-plural-sources-of-news-in-the-nations-locally-and-in-the-regions/")+'&block=contentblock_10'

In the API call, “contentblock_10” is the id of the block element to be quoted. Here’s what the kwouted image looks like:

kwouting a paragraph from writetoreply http://kwout.com/quote/nbj4nife

And here’s the original paragraph on WriteToReply:

http://writetoreply.org/pluralnews/2009/07/03/section-1-securing-plural-sources-of-news-in-the-nations-locally-and-in-the-regions/#10 Writetoreply original quote

Note that the link that the kwout script generates is back to the page in the above case, so to link back to the actual paragraph we’d need to specify this in the link:

javascript:window.location='http://kwout.com/grab?address='+encodeURIComponent("http://writetoreply.org/pluralnews/2009/07/03/section-1-securing-plural-sources-of-news-in-the-nations-locally-and-in-the-regions/#10")+'&block=contentblock_10'

A step on the road to full integration (a use of the Kwout API which may or may not be in line with the stated terms and conditions? I don’t know, I haven’t read them…!) is this bookmarklet, which should let you highlight a paragraph number on a WriteToReply document and then take you straight to the Kwout embed page for that paragraph:

javascript:(function(){var l=location.href; window.location='http://kwout.com/grab?address='+encodeURIComponent(l)+'&block=contentblock_'+window.getSelection();})()

Actually, that looks a little cluttered, and the usability is a little off. So a better solution may be to suggest that the user clicks on the paragraph link to get the “paragraph in focus” page, and then clicks on the following bookmarklet:

javascript:(function(){var l=location.href;l=l.split('#');window.location='http://kwout.com/grab?address='+encodeURIComponent(l[0])+'&block=contentblock_'+l[1];})()

(What this does is pull the paragraph identifier out of the URI and then construct the Kwout API call out of it as a result.)

Or if you want the link to go to the “paragraph in focus” page, rather than the top of the page:

javascript:(function(){var l=location.href;window.location='http://kwout.com/grab?address='+encodeURIComponent(l)+'&block=contentblock_'+l.split('#')[1];})()

(Note that neither of these bookmarklets is ideal – a production-stable bookmarklet should be able to cope (or fail gracefully) with the lack of a hash-separated paragraph identifier in the URI.)
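
One way the “fail gracefully” behaviour might look is sketched below. It pulls the URL-building logic into a plain function, here called kwoutGrabUrl (a hypothetical name), that returns null when there is no usable paragraph identifier after the hash. The check that identifiers are purely numeric is an assumption based on the #10 example above, not something the theme guarantees.

```javascript
// Sketch: build the Kwout grab URL for a "paragraph in focus" page, or return
// null when the URI has no usable paragraph identifier after the hash.
function kwoutGrabUrl(href) {
  const hashAt = href.indexOf("#");
  if (hashAt === -1) return null; // no hash at all
  const para = href.slice(hashAt + 1);
  // Assumption: paragraph identifiers are purely numeric, as in "#10".
  if (!/^\d+$/.test(para)) return null;
  return "http://kwout.com/grab?address=" + encodeURIComponent(href) +
    "&block=contentblock_" + para;
}

// In a bookmarklet, the function body would be inlined and the result used as:
//   var u = kwoutGrabUrl(location.href);
//   if (u) { window.location = u; } else { alert("Open a paragraph link first"); }
console.log(kwoutGrabUrl("http://writetoreply.org/doc/#10"));
console.log(kwoutGrabUrl("http://writetoreply.org/doc/"));
```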

Hmm, maybe we need a “labs” area on WriteToReply where we can collect these micro-utilities?

Taking the Conversation Elsewhere – Embedded Quotes

As part of the JISCPress effort, one of the things we’ve been considering is the granularity of appropriate “consultation elements” or “discussion elements”, those pieces of content that people might actually want to reference, question or chat around as compared to a whole 200 page document, for example.

The page and paragraph levels fall out of the CommentPress theme (and its descendants) quite naturally – WordPress gives us the page level (along with a single item RSS feed at the page level), and the theme gives us URIs at the paragraph level.

(Hmmm… I wonder – would it also be useful to provide a multi-item RSS feed, at the page level, with a separate item for each paragraph on that page? Or do we do that already?!)

In many cases, the paragraph level seems to be the most natural chunk for discussion, particularly in an ongoing conversation about a particular document. So a major question for us is how to put those paragraphs to work?

One of the features that Eddie’s been working on as part of the JISCPress project is the ability to embed paragraphs from a document in a third-party web page. This feature will allow us to increase the surface area of the document by allowing third parties to re-present that content elsewhere, whilst also (hopefully) providing a means to link that external conversation directly back to the original document.

So what benefits does embedding have to offer to:

a) the person grabbing and using the embed code;
b) the publisher/whoever’s running the consultation from which the embed code was grabbed?

In a discussion on the JISCPress group, Joss suggested the following:

For the user:

1. More portable transformation of document content into raw data.
2. Personalisation, presentation and ‘ownership’ of documents within their own publishing environment (which is one of the benefits of slideshare/scribd).
3. Direct joined up quoting rather than copying. More aligned with the ideals of the web and linking data. This could also be a benefit to publishers concerned about unattributed copying.

For the publisher:

1. Greater possibilities of content dissemination
2. Greater potential of attracting engagement via trackbacks
3. Further possibility of using JISCPress as an underlying ‘document store’ where authoring, dissemination and engagement occurs mostly remotely via XML-RPC, syndication, embeds and trackbacks.
4. Possibility of site analytics being hooked into embeds so the reach is measurable???? (Analytics can track document types, I’m not sure whether they are used to track embeds…)

So where are we at? Embedding is currently in testing and has the following mechanic. Hovering your mouse cursor over one of the paragraph numbers in a document raises a floating panel that contains a link to the current paragraph, and an embed code. (The panel remains open whilst the cursor is over it, so you can easily grab a copy of the code.)

Embedding in digress.it

Using the embed code in a third party page embeds the corresponding paragraph in that page.

For testing purposes, the pattern we are using for the embed URL is of the form:

http://docserver.example.com?p=POSTNUMBER&digress-embed=PARANUMBER

The POSTNUMBER identifies the actual page (i.e. http://docserver.example.com?p=POSTNUMBER is a valid page URI) and the PARANUMBER identifies the paragraph to be embedded. Note that this is subject to change.

Unfortunately, the simple embed strategy does not trivially generate a linkback (such as a trackback or pingback) to the original document. For these reverse links to be generated automatically, an actual anchor tag linking back to the original page must be present in the page creating the linkback. One commonly used strategy for achieving this is to provide an embed code of the form:

<div>
<object />
<a>Quoted from etc…</a>
</div>

That is, a link is explicitly included in the embed code, although it is easy enough for the person embedding the quote to strip that anchor tag out.

(Although it complicates matters, as the embedded object is being pulled from the document server, I guess that means we could, in principle, generate a linkback by observing the referrer page URIs for requests made on the server for particular embeddable objects and checking those against the current list of trackbacks? Or maybe the embedded object could generate an XML-RPC back to the trackback server itself whenever the page it is embedded in is loaded? [Note to self: can we easily get analytics on third party embeds?] I think Eddie is working on this, so I won’t embarrass myself further wittering on about things I don’t know anything about!;-)
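
To make the referrer idea in that aside a little more concrete, here is a very rough sketch. Everything in it is an assumption for illustration: the shape of the embed request log, the shape of the trackback list, and the function name missingLinkbacks are all hypothetical and not part of digress.it or WordPress. It simply lists third-party pages that have requested an embeddable paragraph but do not yet appear among the known trackbacks.

```javascript
// Sketch only: the request-log and trackback shapes are hypothetical.
// embedRequests:   [{ paragraph: "10", referrer: "http://..." }, ...]
// knownTrackbacks: [{ paragraph: "10", source: "http://..." }, ...]
function missingLinkbacks(embedRequests, knownTrackbacks, ownHost) {
  const known = new Set(knownTrackbacks.map((t) => `${t.paragraph}|${t.source}`));
  const found = new Map();
  for (const req of embedRequests) {
    if (!req.referrer) continue; // the referrer header is often absent
    let host;
    try {
      host = new URL(req.referrer).host;
    } catch {
      continue; // skip malformed referrer values
    }
    if (host === ownHost) continue; // ignore requests from our own pages
    const key = `${req.paragraph}|${req.referrer}`;
    if (!known.has(key) && !found.has(key)) {
      found.set(key, { paragraph: req.paragraph, source: req.referrer });
    }
  }
  return [...found.values()];
}
```

Referrer headers can be missing or spoofed, so anything found this way would at best be a candidate linkback to check, not a confirmed one.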

Note that a similar problem arises when using a Javascript (<script> tag) based embed code: there is no explicit anchor link present. Script tags also have the additional problem that they are often sanitised (i.e. stripped out) of web pages in many institutional web publishing systems. (In some circumstances, a workaround for the institutional case may be possible. For example, if a variant of WTR/JISCPress was running as a white label solution in an institution, a shortcode plugin could be provided that allowed authors to embed paragraphs from documents in that environment within other documents in that environment. See the WordPress shortcode API for more details.)

As well as the straightforward embed code, we’ve also been considering other ways in which paragraph level content can be published so that third parties have convenient access to it in a format that is appropriate for their needs.

And this is what we came up with – an output switch that can be appended to the end of a paragraph URI that allows the paragraph level content to be published in a variety of formats:

  • &output=html
  • &output=rss
  • &output=txt
  • &output=js
  • &output=json
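
As a minimal sketch of how a third party might construct these URIs, the helper below appends the switch to a paragraph URI. The function name withOutput is hypothetical, the formats are simply the five listed above (none of which is confirmed as live yet), and the choice of “?” or “&” as separator is an assumption for URIs that have no query string.

```javascript
// The five formats listed above; availability on the server is not confirmed.
const OUTPUT_FORMATS = ["html", "rss", "txt", "js", "json"];

// Hypothetical helper: append the output switch to a paragraph URI.
function withOutput(paragraphUri, format) {
  if (!OUTPUT_FORMATS.includes(format)) {
    throw new Error(`Unsupported output format: ${format}`);
  }
  // The switch is appended with "&", which assumes a query string already
  // exists; fall back to "?" when it does not.
  const sep = paragraphUri.includes("?") ? "&" : "?";
  return `${paragraphUri}${sep}output=${format}`;
}

// Example: one URI per format for paragraph 3 of post 12.
for (const format of OUTPUT_FORMATS) {
  console.log(withOutput("http://docserver.example.com?p=12&digress-embed=3", format));
}
```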

As and when these come on stream, we’ll publish use-case examples for each of them.

If you have any comments on our “paragraph republishing” strategy, please post a comment below.

Agile methodologies and open source development

In the course of writing the JISCPress Project Plan, I’ve been thinking again about our project methodology. The original funding call asked for projects to adopt an agile methodology like SCRUM or XP, which I am familiar with. We attempted to use XP while I was working at Amnesty International (not long after half the IT department were trained in Prince2!) and like any methodology, it was used in part rather than in whole. We collected user stories, held five-minute stand-up meetings each day and released often and iteratively so that users could feed back on the product. (It may look like an empty repository but over 20,000 assets are available to logged-in Amnesty staff.)

The JISCPress project has four team members, including myself. None of us work on JISCPress full-time, having other work and study commitments. Equally, none of us work together in the same department and only Alex and I work in the same university. Alex is a student of computing at Lincoln, Tony lives on the Isle of Wight and Eddie lives in San Francisco. In addition to this, we’re working wholly with existing open source software (WordPress) that is openly developed, and it has never been an option in my mind to enjoy the benefits of that community but not attempt to contribute back using the same transparency of process. It was also proposed in our funding bid that “the project will seek to promote openness and collaboration from the point of bid announcements onwards.” By this, I was thinking in terms of the open source development process I have seen with WordPress and other projects where asynchronous discussion and contributions take place through mailing lists, IRC, a code repository, issue tracker and a wiki.

Reading the excellent OSS Watch website, I came across a page about the sustainability of projects and open development, and was particularly interested to read a quote from Gianugo Rabellino, CEO of SourceSense:

“If you think that one of the key ideas of agile is the unity of time and location – you need to be in the same place at the same time and doing a lot of discussion face-to-face – and then you have open development which is based on asynchronous, distributed working etc., then it looks like oil and water – they don’t mix”.

This is what I’ve been thinking recently, too. It’s not that they are wholly incompatible methods of developing software, but from what I know about agile methods, there is an assumption that the developers are working together in the same physical location, focused intensively on the same client driven product.

“Scrum enables the creation of self-organizing teams by encouraging colocation of all team members, and verbal communication across all team members and disciplines that are involved in the project.” (Wikipedia: SCRUM)

Frankly, this way of working is impossible for us. On the other hand, projects that are openly developed often don’t have clients but instead have ‘communities’ of users. They rarely have short code sprints; instead, they have open version-controlled repositories that allow anyone to test the code at any time. It’s worth noting that WordPress recently held a code sprint but, given the size of the community, there were relatively few contributions. Many contributors work asynchronously and have other commitments over the course of their day, volunteering their time and effort when they can.

Likewise, JISCPress is intended to serve a community rather than a single client. We hope that it is the JISC community who lead the direction of the project through testing and feedback and who eventually benefit the most from the project. Beyond the JISC community, there is the wider community of users of WordPress and CommentPress who will likewise benefit from the project.

Ross Gardler, OSS Watch manager, describes the Open Development Methodology (ODM) as “a way for distributed team members to collaboratively develop a shared resource in a managed and sustainable way.” The ODM is characterised by:

  1. User engagement
  2. Transparency
  3. Collaboration
  4. Agility
  5. Sustainability

Agility and user engagement are also found in SCRUM and XP, but there is no requirement in these methodologies to be transparent, to be sustainable beyond the client’s specific use for the product, or to cater for a diverse group of asynchronous contributors.

With this in mind, I will continue to learn about and pursue an open development methodology for JISCPress because it is appropriate for our project. It is already part of an existing (WordPress) open development community and we have, from the start of WriteToReply and then the #jiscri call, placed a great deal of emphasis on openness and transparency of process.

It is too early in the project to measure the effectiveness of this approach. Eddie and Alex only joined us in the last few days and we’re still setting up the basic platform to work with. I have noticed that the use of IRC has not taken off despite my fondness for it. This is partly because all of us use GMail and tend to use Google Chat for quick conversations when we see we’re online at the same time, rather than having an IRC client open. Tony and I have an established way of communicating with each other over Twitter, which is public but a poor method of establishing context for the project as Twitter doesn’t archive tweets long-term and searching for anything seems to be hit and miss. I would like to establish weekly IRC meetings soon though. There is also the issue of working in a significantly different timezone to Eddie. IRC is for synchronous chat and when Eddie is at work, the rest of us are thinking of sleep. Eddie is talking about visiting the UK for a few days (paid for out of his own pocket), and I hope that the four of us, and anyone else who is interested, will meet up for a day’s discussion and development.

There are clearly still things to be worked out and a routine to establish that works best for us, but I am keen that if a methodology is to be identified for the project, it is one of ‘open development’ rather than ‘agile’. I intend to devote a lot of my time on the project to ensuring that the wider WordPress community are aware of what we are doing and that they are welcome to contribute in any way they can. I shall write more about how we are addressing Ross’ five characteristics of an Open Development Methodology and am keen to hear from anyone who has an opinion on any of this, including members of the JISCPress team, who I haven’t consulted before writing any of this.

Project SWOTing

One of the JISCRI project reporting requirements is a SWOT analysis of each project. It makes sense to attempt our first SWOT analysis sooner rather than later and update it using the comment form as we work through the project. Your comments are very welcome. Surely a bad SWOT analysis is one undertaken by a single individual (like this one!)

SWOT diagram

I think one of the main strengths of the JISCPress project is that we’ve effectively been developing it since February, when Tony and I set up WriteToReply. JISCPress is basically a re-thinking, re-working and further development of WriteToReply for a specific community and we can apply the lessons learned through WriteToReply to the planning and development of JISCPress. In this sense, we’ve got a decent head start really and both Tony and I know where our own strengths and weaknesses, as well as our own particular interests, lie in the project.

Another strength worth highlighting is the range of skills we bring to the project. I’ve been running different WordPress MU installations here at the University of Lincoln for the last year and have several years’ experience tinkering with Linux servers. I’m finding working on AWS to be an enjoyable and welcome learning experience.  Tony prefers to stay away from the bash command line, instead focusing on the way the data published on JISCPress can be repurposed, cross-referenced, syndicated and mashed up with other web services. He’ll also be looking at what value can be gained from the Google Analytics and Piwik APIs.

Anyone that knows a bit about CommentPress (now called ‘Marginalia’) will understand that Eddie brings an excellent understanding of WordPress code to the project as well as some pretty advanced JavaScript skills. Eddie has always led the development of CommentPress/Marginalia and as it provides core JISCPress functionality such as paragraph level URIs and commenting, it’s a strength of the project that we have him on board. Were he not on board and we were reliant on another developer to work on CommentPress code, I would consider this a risk to the project. To quote from his site:

I believe in rapid prototyping, open-source, collaborations instead of competition, quick releases, small teams, debate, creative thinking, and transparency.

That’s exactly the type of person we need to work on this project.

Finally, Alex is a keen student of computing at the University of Lincoln with good PHP skills. As a student, he’s flexible with his time and not wholly reliant on the project for his income and it’s reassuring to have him working locally (and in the same time zone!). All in all, I think we’ve got a good spread of skills and interests on the team.

While I’m thinking of strengths, I’m confident that using Amazon’s infrastructure to work on the project will prove to be a strength. It allows us to work on the project in an environment that is independent of the University of Lincoln’s IT infrastructure.  I’m very lucky here at Lincoln to have root access to my own Linux server to work on, despite not being a member of the ICT department. I’ve no complaint at all about our ICT department and enjoy working with them, but on a rapid project like JISCPress with four team members working independently and the potential for contributions from the open source community, I’m pleased that we have our own space to work and I don’t need to bother my IT colleagues to restart the virtual machine or make changes to DNS records.

On the other hand, the membership of the team could be seen as a weakness. We’re not a tight team of developers working in the same institution but rather relative strangers working, for the most part, remotely and in Eddie’s case in a considerably different time zone. This could result in poor communication and lack of motivation if we let it and I hope that the pillars of communication in open source projects that we’ve set up (IRC, mailing list, code repository, wiki, blog) will help us stay in touch and motivated.  However, one of the main benefits of this project for both me and my employer, is being able to test this way of working on development projects. We don’t have an in-house team of web developers who could be pulled into this project and as much as I’d like my department to hire a researcher/developer or two, it’s not going to happen. So in order to work on JISCRI and similar funded projects, I need to show that this is an effective way of working. I hope it succeeds, because I like working on these types of projects and in order to innovate in our use of technology to support research, teaching and learning, we need to have the experience and capacity to undertake proper R&D and not just theorise about the potential of technology in the HE sector.

A threat to the project is that JISCPress is principally a tool for JISC document authors to publish funding calls and JISC project managers to publish their final reports. We need their buy-in to the project, not only to make it feel worthwhile but also to steer the direction of feature development. JISCPress might be seen as complicating JISC employees’ work, pushing something on them that they never asked for. It might also be seen as yet another requirement from JISC to Project Managers. I take this threat seriously, but I don’t let it worry me too much. JISC has made the decision to fund JISCPress as a ‘demonstrator prototype’ and there’s no obligation for them to put it into production use. They also recognise that we’re building a platform that could equally be of value to other organisations. WriteToReply and JISCPress are just two examples of what we’re developing. WordPress is a popular CMS and the work on Marginalia and additional features that we’ll be developing can be cherry-picked or taken wholesale and put to good use. All code is developed under a GPL or compatible license. (Note that this has to be the case, because we’re developing for WordPress which is licensed under the GPL, calling functions in WordPress core code – not all WP plugin and theme developers understand this!)

Finally, for now, the project provides opportunities for anyone to get involved and in turn, by working in public on an open source project, I hope we’ll attract others who like what they see and want to contribute in any way at all. Comment, test, review and contribute code, if you can. Join the mailing list and introduce yourself. Working this way, with an emphasis on openness and transparency, I hope that opportunities arise that we don’t yet know about. One that we do know about is Google Wave, due out in September, and if we keep that in the back of our mind, there might be an opportunity to exploit this new and exciting platform and protocol. Maybe we’ll develop a JISCPress gadget for Wave that allows realtime comment and discussion on a document from Wave? Maybe JISCPress will largely become a ‘hidden’ CMS that is used exclusively via publish and subscribe protocols such as RSS, AtomPub, XML-RPC, and Wave/XMPP?

JISCPress: A document discussion platform

We’re very pleased to announce that JISC have agreed to fund JISCPress, a six-month, £32,500 project led by the University of Lincoln, in partnership with the Open University and based on WriteToReply. JISCPress will provide a scalable community platform for publishing and discussing project calls and final reports, in order to support the grant bidding and project dissemination processes.

As you may know, WriteToReply is run in our spare time – lots of late nights and busy lunchtimes. Since launching the re-publication of the Digital Britain – Interim Report, we’ve been looking for ways to bring benefits from our work on WriteToReply, into the Higher Education community where we work. JISC fund much of the UK development and innovation in the use of ICT in teaching and research and in March, announced their Rapid Innovations funding call.

We quickly re-published the call on WriteToReply to demonstrate the benefits of publishing funding calls in this way and then went on to submit a bid which proposed a community platform for the JISC funding call process, based on our experience of setting up and running WriteToReply. As with WriteToReply, this will be an open, public project and all documentation and code will be available under open licenses.

JISCPress is a platform aimed at people working in UK Higher Education, but the platform itself could be easily adapted for other uses, just as WriteToReply is primarily focused on government consultation documents. The final platform will be available as an Amazon Machine Image so anyone will be able to host their own multi-document discussion platform with all the benefits you see on WriteToReply plus the additional features we’ll be developing throughout this project. We’re already advocating the use of the platform in our own universities for the open (and closed) discussion of institutional strategies, for the critique of texts by students and for peer-review of research papers. What might you use it for?

Over on the JISCPress project blog, you’ll find links to a mailing list, wiki and code repository. Feel free to join us if this WriteToReply spin-off appeals to you. If you know anyone that might be interested, please do let them know.

You’re probably already aware that WriteToReply uses WordPress Multi-User and CommentPress. Eddie Tejeda, the developer of CommentPress, will be working with us on the project and this will result in significant further development of CommentPress 2. So, if you’re interested in WPMU and CommentPress (as many people are), please consider following, contributing to and testing JISCPress.

We should also note that while the project is a spin-off of our work on WriteToReply, neither Tony nor Joss is personally receiving any funds from JISC. The contributions from JISC to cover our time on this project are paid directly to our employers and do not result in any financial benefit to us or WriteToReply (which is in the process of being formalised as a non-profit business). In other words, while WriteToReply is a personal project, JISCPress is part of our normal work as employees of our universities (both Tony and I are expected to routinely bid for and win project funds – you get used to it after a while!). Money has been allocated to fund dedicated developer time to the project, which will pay Eddie and Alex, a student at the University of Lincoln, for their work as freelancers.

Anyway, on with the project! Here’s the outline from our original bid document:

This project will deliver a demonstrator prototype publishing platform for the JISC funding call and dissemination process. It will seek to show how WordPress Multi-User (WPMU) can be used as an effective document authoring, publishing, discussion and syndication platform for JISC’s funding calls and final project reports, and demonstrate how the cumulative effect of publishing this way will lead to an improved platform for the discovery and dissemination of grant-related information and project outputs. In so doing, we hope to provide a means by which JISC project investigators can more effectively discover, and hence build on, related JISC projects. In general, the project will seek to promote openness and collaboration from the point of bid announcements onwards.

The proposed platform is inspired and informed by WriteToReply, a service developed by the principal project staff (Joss Winn and Tony Hirst) in Spring 2009 which re-publishes consultation documents for public comment and allows anyone to re-publish a document for comment by their target community. In our view, this model of publishing meets many of the intended benefits and deliverables of the Rapid Innovation call and Information Environment Programme. The project will exploit well understood and popular open source technologies to implement an alternative infrastructure that enables new processes of funding-related content creation, improves communication around funding calls and enables web-centric methods of dissemination and content re-use. The platform will be extensible and could therefore be the object of further future development by the HE developer community through the creation of plugins that provide desired functionality in the future.

Subject to user requirements, our planned project deliverables are:

  • A WordPress Multi-User based platform for authoring and publishing JISC funding calls in a form that allows paragraph-level comment and discussion either locally or remotely.
  • A meta-site that aggregates all document data into a single site for search, navigation by categories and tags and can syndicate searches, tags and categories.
  • Develop CommentPress to meet WCAG 2.0 accessibility guidelines, meeting public sector requirements.
  • Evaluation and integration of “related content” utilities to dynamically link related project calls and reports based on content and/or semantic analysis.
  • Evaluation and possible integration of remote, realtime messaging services such as Twitter and XMPP integration.
  • Evaluation and possible integration of enterprise authentication services such as LDAP and Shibboleth.
  • Evaluation and possible integration of OpenCalais, a semantic tagging service.
  • Documentation on how to exploit the benefits of AWS and clone the project instance for other uses.
  • A documented suggested workflow for document authors
  • Documented examples of how to fully exploit the platform for data extraction and syndication.
  • Documented ‘user stories’ for the JISC funding call process.

If this sounds interesting, please do take a look at the full project proposal and join us on the mailing list.