Using JISCPress/Digress.it for Reading List Publication

One of the things I’ve been doodling with but not managing to progress much thinking wise (not enough dog walking time lately!) is how we might be able to use the digress.it WordPress theme to support various course related functions in ways that exploit the disaggregating features of the theme.

Chatting with Huw Jones last week about his upcoming Arcadia seminar on “The Problem of Reading Lists” (this coming Tuesday, Nov 24th – all welcome;-) I started thinking again about the potential for using digress.it as a means of publishing, and collecting comments on, reading lists.

So for example, over on the doodlings WriteToReply site I’ve posted an example of how a reading list posted under the theme is automatically disaggregated into separate, uniquely identified references:

The reading list was generated simply by copying and pasting a PDF based reading list into a WordPress blog post. Looking at the format of the list, one could imagine adding further comments or notes relating to each reference using a blog comment. Given that the basis of each paragraph is a citation to a particular work, it might be possible to parse out enough information to generate a link to a search on the University OPAC for the corresponding work (and if so, pull back an indication of the availability of the book as, for example, my Library Traveler script used to do for books viewed on Amazon).

Under the current in-testing digress.it theme, each paragraph on the page can be made available as a separate item in an RSS feed; that is, as well as the standard ‘single item’ RSS page feed that WordPress generates automatically, we can get an N-item feed from the page for the N-paragraphs contained on a page.

Which in turn means that to generate an itemised RSS feed version of a reading list, all I need to do is paste the reading list – with each reference in a separate paragraph – into a single blog post. (The same is true for disaggregating/feed itemising previous exam papers, for example, or I guess video links in order to generate a DeliTV programme bundle…?!)
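To make the "one paragraph, one feed item" idea concrete, here is a minimal sketch of how a pasted reading list might be turned into an N-item RSS feed. It is not how digress.it itself does it; the function name, the placeholder references, the example.org URL and the "#n" per-paragraph link scheme are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def paragraphs_to_rss(title, page_url, body):
    """Build an RSS 2.0 feed with one item per non-empty paragraph of body."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = page_url
    ET.SubElement(channel, "description").text = "Paragraph-level feed for " + title

    # Assumption: paragraphs are separated by blank lines in the pasted text.
    paragraphs = [p.strip() for p in body.split("\n\n") if p.strip()]
    for number, text in enumerate(paragraphs, start=1):
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = f"Paragraph {number}"
        # Hypothetical per-paragraph address: page URL plus a numeric fragment.
        ET.SubElement(item, "link").text = f"{page_url}#{number}"
        ET.SubElement(item, "description").text = text
    return ET.tostring(rss, encoding="unicode")


# Placeholder references, one per paragraph (not real citations).
reading_list = """Author A (2001) First Example Title. Example Press.

Author B (2005) Second Example Title. Example Press.

Author C (2009) Third Example Title. Example Press."""

feed_xml = paragraphs_to_rss(
    "Example reading list", "http://example.org/reading-list/", reading_list
)
item_count = len(ET.fromstring(feed_xml).findall("./channel/item"))
print(item_count)
```

Three references in, three feed items out: the feed is simply a re-presentation of the paragraphs that were already in the post.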

(For more details of the various ways in which digress.it can automatically disaggregate/atomise a document, see Open Data: What Have We Got?.)

PS just a reminder again – Huw’s Reading List project talk, which is about far more than just reading lists, is on Tuesday in the Old Combination Room, Wolfson College, Cambridge, at 6pm.

Measuring Website Usage With Google Analytics, Part I

Knowing where to get started with reporting website statistics can often provide new webmasters with something of a challenge. In this post, I’ll quickly review the guidance provided by the Central Office of Information on Measuring Website Usage which:

describes a common approach to measuring website traffic [for central government]. This enables departments to answer Parliamentary Questions and Freedom of Information Requests about website usage consistently and reliably

I’ll also start to explore how to generate reports that satisfy those guidelines using Google Analytics.

The proposed metrics “are defined according to industry standards set by the Joint Industry Committee for Web Standards (JICWEBS)” and specify the following minimal level of reporting (Measuring Website Usage – Reporting requirements):

  1. The following web metrics, as defined by the Joint Industry Committee for Web Standards (JICWEBS), must be measured for each and every publicly accessible website operated by an organisation:
    • Unique User/Browsers
    • Page Impressions
    • Visits
    • Visit Duration
  2. Central government departments must measure Unique User/Browsers, Page Impressions, Visits and Visit Duration starting from 1 April 2009 for every website open on 1 April 2010.
  3. Executive agencies and non-departmental public bodies (NDPBs) must measure Unique User/Browsers, Page Impressions, Visits and Visit Duration starting from 1 April 2010 for every website open on 1 April 2011.
  4. The following information must be provided to COI at the end of each quarter:
    • Number of monthly Unique User/Browsers
    • Number of monthly Page Impressions
    • Number of monthly Visits
    • Number of Visits of at least two Page Impressions
    • Total time in seconds for all Visits of at least two Page Impressions
  5. Each report should contain figures for each of the previous three months. This information should be provided in the format shown in the reporting template in Appendix A. COI Website usage reporting template: http://coi.gov.uk/guidance.php?page=237
  6. All figures should exclude internal web development activity, performance monitoring, automated broken link detection and other types of non-human activity (e.g. robots and spiders). Further details on what to exclude are found in the Page Impressions section.
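As a way of reading the five quarterly figures literally, here is a rough sketch that computes them from a list of visit records for one month. The Visit record, its field names and the sample data are assumptions made for illustration, and the sketch assumes robots and internal traffic have already been filtered out as item 6 requires.

```python
from dataclasses import dataclass


@dataclass
class Visit:
    visitor_id: str        # stands in for a Unique User/Browser identifier
    page_impressions: int  # pages viewed during the visit
    duration_seconds: int  # measured length of the visit


def monthly_summary(visits):
    """Compute the five monthly figures listed in the reporting requirements."""
    multi_page = [v for v in visits if v.page_impressions >= 2]
    return {
        "unique_users": len({v.visitor_id for v in visits}),
        "page_impressions": sum(v.page_impressions for v in visits),
        "visits": len(visits),
        "visits_2plus_pages": len(multi_page),
        "seconds_2plus_pages": sum(v.duration_seconds for v in multi_page),
    }


# Made-up sample: visitor "a" visits twice, once as a single-page bounce.
sample = [
    Visit("a", 1, 0),
    Visit("a", 3, 120),
    Visit("b", 2, 45),
    Visit("c", 1, 0),
]
summary = monthly_summary(sample)
print(summary)
```

Note that the last two figures only count visits of at least two Page Impressions, so the single-page visits contribute to the first three totals but not to the duration figures.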

So what does Google Analytics offer “out of the box”?

Headline report - Google Analytics

The Visitors Overview repeats these figures and additionally provides an indication of the number of ‘unique’ visitors:

Visitors Overview

At face value then, it would appear that Google Analytics is providing at least some of the required stats (though we need to clarify that the numbers as recorded by Google Analytics conform to what the COI has in mind for those reports as described in their guidance on the Minimum standard for web metrics!) But what does that guidance relating to “at least two web pages” mean?

To understand the emphasis on “at least two pages”, it’s worth reflecting on the notion of bounces and the bounce rate. Bounce rate refers to the proportion of visitors to a site who only visit one page on a website before leaving that site, and as such tend to leave no meaningful analytics behind.

According to the ClickTale blog (What Google Analytics Can’t Tell You – Part 1), Google Analytics “has no way of knowing how long a bounced visitor, who only visits one page, spent on your website”. That is, it appears that the time spent looking at a page is not based on the difference between the time when the page has fully loaded (and generated a trackable onload event) and its unload event; instead, it is calculated as the time between loading one page and clicking through to and loading a second page on the same site.

Which is why the emphasis on collecting stats from at least two pages: given that the current crop of analytics tools struggles to do anything meaningful with single page visits, specifying a two page visit means not only that the visits reported are likely to be meaningful ones, but also that the reports are more likely to contain meaningful data. (There is an obvious problem here: if visitors visit two pages, and quickly click to the second from the first before exiting the site from the second page, the time spent on the second page won’t be captured? See for example Time on Site & Time on Page – Google Analytics metric mystery)
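To see why bounces and exit pages leave no timing data behind, here is a small model of the calculation as described above: time on a page is taken as the gap between successive page loads. This is a sketch of the behaviour the ClickTale post describes, not Google's actual implementation, and the function names and timestamps are invented for the example.

```python
def time_on_pages(pageview_times):
    """Given page-load timestamps (in seconds) for one visit, in order,
    return the measurable time on each page. The exit page gets None
    because there is no later page load to subtract from."""
    durations = []
    for current, following in zip(pageview_times, pageview_times[1:]):
        durations.append(following - current)
    if pageview_times:
        durations.append(None)  # exit page: time unknown
    return durations


def time_on_site(pageview_times):
    """Time on site as the span between the first and last page loads."""
    if len(pageview_times) < 2:
        return None  # a bounce leaves nothing to measure
    return pageview_times[-1] - pageview_times[0]


bounce = [0]
two_pages = [0, 5]
three_pages = [0, 40, 100]

for visit in (bounce, two_pages, three_pages):
    print(time_on_pages(visit), time_on_site(visit))
```

In the two page case, the visitor might have read the second page for ten minutes, but under this model all we can ever record is the five seconds spent on the first.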

One of the nice things about Google Analytics is that it lets you create custom views, or “segments” of the data in which you can specify things such as the minimum number of pages visited when generating a particular report. In order to do this, you specify an “Advanced Segment”. Here’s what an Advanced Segment for a “minimum of two pages visited report” might look like:

GA Advanced segment - visited at least two pages

Applying this segment to the same data charted above gives these results:

Segmented goog stats

GA segmented view

So for example, in this version of the report we see that the average number of page views and the average time on site has gone up.

Something I don’t think Google Analytics reports is the total time on site. Bearing in mind the lack of data regarding the time spent on exit pages, the best we can do is multiply the number of visits by the average time on site to get an estimate of the total time on site.
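The calculation is trivial, but for completeness here is a sketch that takes an average time on site in hh:mm:ss form and returns the estimated total in seconds, which is the unit the COI template asks for. The helper names and the figures used are illustrative assumptions, not values read from the reports above.

```python
def hms_to_seconds(hms):
    """Convert an 'hh:mm:ss' string into a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def estimated_total_seconds(visits, average_time_on_site):
    """Estimate total time on site as visits x average time on site."""
    return visits * hms_to_seconds(average_time_on_site)


# Illustrative figures only: 104 visits averaging two and a half minutes.
total = estimated_total_seconds(104, "00:02:30")
print(total)
```

Because the average is itself rounded to the nearest second, the product is an estimate rather than an exact total.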

With just this single advanced segment, a simple calculation, and the out of the can reports from Google Analytics, I think we can deliver on the suggested stats based on a literal reading of the headings, though in a follow up post I’ll check to see if the more detailed spec on the metrics matches the way that Google Analytics defines its metrics.

PS Unfortunately, the segmented report appears to have lost the number of absolute unique visitors (although I think the recommended report wanted the number of uniques, including bounces, to the site?) Anyway, let’s play: the number of visits gives the upper bound on the number of unique visitors, but can we also estimate the lower bound? One heuristic might be to look at the number of visits and uniques in the original report (176 uniques, 245 visits), see how many visits were lost in discounting the bounces (245-104 = 141), assume these were all unique and subtract these from the original number of uniques (176-141=35). I think this gives the lower bound on uniques as recorded by Google Analytics for non-bouncing visitors?
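The heuristic in the PS can be written down as a couple of lines of arithmetic. This is only a sketch of that back-of-the-envelope reasoning: the function name is made up, and clamping the lower bound at zero and capping the upper bound at the original number of uniques are small additions of mine.

```python
def unique_visitor_bounds(total_uniques, total_visits, non_bounce_visits):
    """Bound the number of unique visitors among non-bouncing visits.

    Lower bound: assume every bounced visit came from a different unique
    visitor and subtract them all from the original uniques.
    Upper bound: every non-bounce visit came from a different visitor.
    """
    bounced_visits = total_visits - non_bounce_visits
    lower = max(total_uniques - bounced_visits, 0)   # clamp: my addition
    upper = min(non_bounce_visits, total_uniques)    # cap: my addition
    return lower, upper


# Figures from the reports above: 176 uniques, 245 visits, 104 after segmenting.
bounds = unique_visitor_bounds(176, 245, 104)
print(bounds)
```

With the numbers above this gives somewhere between 35 and 104 unique non-bouncing visitors, which is a wide range, but at least it is a range.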

Google Analytics, Feedburner and Google Reader

Over the last couple of weeks, it seems as if the Goog has been doing a bit of reconciliation on the old analytics front, in particular the ability to track traffic driven back to your website from links contained within a feed published from that site using Feedburner…

The first thing I’d noticed as being different was the appearance of Google Analytics tracking codes on Feedburner powered posts that I was reading in Google Reader – opening such a post in a new window seems to display it with a full-blown set of GA tracking attributes. So for example, opening a post from the Feedburnered OUseful.Info feed results in a URI like this:

http://ouseful.wordpress.com/2009/11/18/under-the-radar/?
utm_source=feedburner&utm_medium=feed
&utm_campaign=Feed%3A+ouseful+%28OUseful+Info%29&utm_content=Google+Reader

…and I’m pretty sure I didn’t put those tracking codes in there explicitly…

In “Campaign” Tracking With Google Analytics, I started sketching out how it might be possible to use Google Analytics campaign tracking codes to track the spread of referrer links to documents or document fragments hosted on WriteToReply or JISCPress, so let’s see how the Feedburner annotations are structured:

  • utm_source=feedburner (that is, the originator of the feed);
  • utm_medium=feed (that is, the means by which the content was transported/syndicated);
  • utm_campaign=Feed: ouseful (OUseful Info) (that is, the name of the Feedburner feed (I think: the feed URL is http://feedburner.com/ouseful), followed by the feed title (OUseful Info));
  • utm_content=Google Reader (that is, the place where I viewed the link).
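If you want to pull those annotations out of a URI programmatically rather than by eye, the Python standard library will do the decoding for you. A minimal sketch, using the example URI from above; the campaign_parameters function name is my own.

```python
from urllib.parse import urlsplit, parse_qs

url = (
    "http://ouseful.wordpress.com/2009/11/18/under-the-radar/?"
    "utm_source=feedburner&utm_medium=feed"
    "&utm_campaign=Feed%3A+ouseful+%28OUseful+Info%29&utm_content=Google+Reader"
)


def campaign_parameters(tagged_url):
    """Return the utm_* query parameters of a URL as a plain dict."""
    query = parse_qs(urlsplit(tagged_url).query)
    # parse_qs returns a list per key; keep the first value of each utm_ key.
    return {key: values[0] for key, values in query.items() if key.startswith("utm_")}


params = campaign_parameters(url)
for key, value in params.items():
    print(key, "=", value)
```

The percent-encoding and plus signs are decoded along the way, so the campaign comes back as "Feed: ouseful (OUseful Info)" and the content as "Google Reader".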

Compare this with the suggestion I made for annotating WriteToReply links:

  • utm_source=twitter.com (that is, the place a link was ‘launched’);
  • utm_medium=question (that is, the type of slug content used to qualify the link);
  • utm_campaign=jiscri (that is, the consultation document linked to, e.g. for the link http://writetoreply.org/jiscri/2009/03/11/rapid-innovation-projects/);
  • utm_content=slug3 (that is, a unique ID to identify the text used to qualify the syndicated link).

So how can you get Googalytics tracking codes on your Feedburner feeds? Details are still sketchy (e.g. see the original announcement on the Google Analytics blog here: An Integration With Feedburner, and the Google AdSense for Feeds blog here: “Afternoon, Frank.” “Hey howdy, George.”), but this Google FAQ post on How do I set up my FeedBurner feed to report feed clicks in Google Analytics? explains:

If you use Google Analytics to track web site visitors, you can see feed clicks originating from your FeedBurner feed by activating an option on the Analyze tab.

When someone clicks one of your feed items and ends up back on your web site, Google Analytics will track that activity and include it in the “Traffic Sources” section.

The post also tells you where you can set up the tracking details – from the Configure Stats menu option. And selecting that, I can now see why my feed links are annotated as they are:

(I’m not sure how the $distributionEndpoint is treated for non-Google properties?)

The Google AdSense for Feeds post suggests that:

By default, these analytics will show up in the “All Traffic Sources” and “Campaigns” views in Google Analytics. You can filter the results just to only the traffic that comes from Google FeedBurner by filtering on “feedburner” on the All Traffic Sources page or “Feed:” on the campaigns view. You can also use these sources in the Advanced Segments views.

which suggests that for sites like JISCPress/WriteToReply that use Google Analytics on the main site and Feedburner for the public/promoted feeds, the Feedburner integration will automatically annotate feed links with tracking codes that can be tracked from the site’s Google Analytics dashboard.

We’ve had some really useful feedback

We’ve had some really useful feedback from the British Computer Association of the Blind on the accessibility of digress.it. Here’s what they have to say:

***
The website looks as though it has some good accessibility built in, but the
JavaScript it uses isn’t particularly accessible. To fully understand, it would be
worth getting the site audited against a recognised benchmark such as the Web Content
Accessibility Guidelines:
http://www.w3.org/TR/WCAG20/

The JavaScript doesn’t seem to be keyboard/screen reader friendly, particularly
on the comments pages. If the jQuery libraries have been used out of the box, it’s
likely they’ll need adapting.

Form fields will also need text labels associating with them. Guidance on
creating accessible forms can be found here:
http://www.accessify.com/features/tutorials/accessible-forms/

It’s great to see that ARIA landmarks have been used. If the main landmark could
be applied consistently across all pages, it would help people rely on it as a means
of navigation. Adding in more landmarks, particularly for search and navigation,
would also help.

IE6 and IE7 make up an enormous chunk of the browser market. Most IE6 users are
within the corporate and public sectors, where upgrading isn’t an option because of
organisational policy.

Representing the full visual design can sometimes be difficult for these
browsers, particularly IE6. In this case though, that doesn’t look as though it
should be the case.

The general building blocks of the website are good. The separation of
presentation from content has been done well, and the code is reasonably clean.
Headings, lists and other standard elements have all been used well.
***
My notes:

The digress.it theme is based on the default theme for WordPress (‘Kubrick’), which is considered a solid and accessible design which many themes are built on. digress.it has also been consciously developed to be relatively easily styled using CSS.

Accessibility has been a constant, though admittedly secondary, requirement in the JISCPress project and Eddie has made specific efforts to improve the accessibility of the plugin over the original CommentPress. I believe digress.it is partially using the ‘accessible jquery’ library, too. I’ll be looking at the WCAG2 document and reviewing, as best as I can, the areas of improvement that can still be made.

Open Data. What have we got?

I attended the ‘Global Graph’ session at the #cetis09 conference and made a largely failed attempt to demo some of the work we’ve been doing with Triplify and the Talis Platform. (In my defence, it wasn’t a planned demo and jiscpress.org was down while Alex was doing some design work).

Anyway, what I would have shown was how each document site on jiscpress.org uses Triplify to provide Linked Data in the form of RDF/N3 triples, which we store on the Talis Platform using a plugin Alex wrote.

Using Alex’s config file for WordPress MultiUser, we drop the triplify directory into the WPMU root directory, alongside wp-admin, wp-includes and all the other WordPress files. You should take a look at the config file and make sure it’s doing what you want it to do, but it will work as it is.  With this in place, Linked Data in the form of an RDF flat file for each document site (blog) is available at http://document.jiscpress.org/triplify or http://jiscpress.org/document/?triplify

(I should warn you that none of the URLs in this post are genuine URLs. They’re examples of syntax. The server at jiscpress.org will stop running at the end of December).

Now, to get that same data onto the Talis Platform, Alex has written a plugin for WPMU that periodically crawls the documents for changes and pushes the new data to a Talis Platform account.  Here are the WPMU site-wide admin options:

Admin settings

and here are the per document site user settings:

User settings

I won’t explain what the plugin does in detail. Just click on those images above and you’ll see the options that are available and if you’re reading this stuff, you know what it’s all about.  The Talis/Triplify plugin for WPMU will appear on  http://wordpress.org/extend/plugins in the next couple of weeks. It’s been tested and it does what we expect it to do but we want to test it more on sub-directory installs before it’s publicly available. Full documentation will appear soon on http://code.google.com/p/jiscpress/wiki/Documentation

We have also developed a WPMU plugin for Open Calais and the Yahoo! Term Extraction API. This provides a background service which indexes each document section (blog post) and creates relationships between content across the platform. We’ll post here about that very soon.

In addition to the Linked Data, JISCPress, using digress.it on WordPress, provides a long list of other open data (not Linked Data) end-points which might be put to good use. Here you go..

Document paragraphs

These are switches that provide individual paragraph data in different formats.

http://test.jiscpress.org/?p=15&digressit-embed=1&format=xml

http://test.jiscpress.org/?p=15&digressit-embed=1&format=text

http://test.jiscpress.org/?p=15&digressit-embed=1&format=rss

http://test.jiscpress.org/?p=15&digressit-embed=1&format=html

http://test.jiscpress.org/?p=15&digressit-embed=1&format=json
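Since the five switches differ only in the format parameter, generating them is a one-liner. Here is a sketch that builds the URLs; the function name is hypothetical, and as noted above the test.jiscpress.org addresses are examples of syntax rather than live URLs.

```python
from urllib.parse import urlencode

# The five output formats shown in the example URLs.
FORMATS = ("xml", "text", "rss", "html", "json")


def paragraph_embed_url(site, post_id, fmt):
    """Build the paragraph-data URL for a post in the requested format."""
    if fmt not in FORMATS:
        raise ValueError(f"unsupported format: {fmt}")
    query = urlencode({"p": post_id, "digressit-embed": 1, "format": fmt})
    return f"{site}/?{query}"


embed_urls = [paragraph_embed_url("http://test.jiscpress.org", 15, f) for f in FORMATS]
for embed_url in embed_urls:
    print(embed_url)
```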

Document sections

This is just the regular WordPress post content in RSS format. In JISCPress terms, it’s the document section which is a single feed item.

http://test.jiscpress.org/2009/07/28/6-how-jisc-invests/feed/?withoutcomments=1

and this is the normal WordPress feed of comments on a particular post/document section.

http://test.jiscpress.org/2009/07/28/6-how-jisc-invests/feed/

We’ve also added the provision of a feed for each document section (‘post’), where each paragraph is a feed item. Note that this makes digress.it a nice tool for building your own feeds out of a single WordPress post.

http://test.jiscpress.org/feed/paragraphlevel/3-jisc-vision-mission-and-objectives/

Per paragraph comments/discussions

For each paragraph, there’s a feed of the comments/discussion.

http://test.jiscpress.org/feed/paragraphcomments/3-jisc-vision-mission-and-objectives,1

Commenter feeds

For each person that comments, there’s a feed of their comments

http://test.jiscpress.org/feed/usercomments/Joss%20Winn
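The three digress.it feed patterns above follow a regular syntax, so they are easy to generate from a post slug, a paragraph number or a commenter name. A sketch, with hypothetical function names, that reproduces the example URLs (again, these are examples of syntax, not live addresses):

```python
from urllib.parse import quote


def paragraph_level_feed(site, post_slug):
    """Feed of a document section in which each paragraph is an item."""
    return f"{site}/feed/paragraphlevel/{post_slug}/"


def paragraph_comments_feed(site, post_slug, paragraph_number):
    """Feed of the comments made on one paragraph of a document section."""
    return f"{site}/feed/paragraphcomments/{post_slug},{paragraph_number}"


def user_comments_feed(site, commenter_name):
    """Feed of all comments by one commenter; spaces become %20."""
    return f"{site}/feed/usercomments/{quote(commenter_name)}"


site = "http://test.jiscpress.org"
slug = "3-jisc-vision-mission-and-objectives"
print(paragraph_level_feed(site, slug))
print(paragraph_comments_feed(site, slug, 1))
print(user_comments_feed(site, "Joss Winn"))
```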

All the other stuff

Don’t forget that the entire document content is also available as a feed

http://test.jiscpress.org/feed/

http://test.jiscpress.org/feed/rss

http://test.jiscpress.org/feed/rss2

http://test.jiscpress.org/feed/atom

http://test.jiscpress.org/feed/rdf

as are all comments from the site, too:

http://test.jiscpress.org/comments/feed

with WordPress, tags also have feeds

http://test.jiscpress.org/tag/tag1/feed

and so do categories

http://test.jiscpress.org/category/category1/feed

You can also combine tags

http://test.jiscpress.org/tag/tag1+tag2+tag3/feed

and you can combine tags and categories

http://test.jiscpress.org/?category_name=category1&tag=tag2,tag3&feed=rss2
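The two combination syntaxes (plus-separated tags in the path, comma-separated tags in the query string) can be generated from lists of tags. A minimal sketch with made-up function names; the comma is kept unescaped so the output matches the example URL above.

```python
from urllib.parse import urlencode


def combined_tag_feed(site, tags):
    """Feed for a combination of tags, joined with '+' in the path."""
    return f"{site}/tag/{'+'.join(tags)}/feed"


def category_and_tags_feed(site, category, tags, feed_type="rss2"):
    """Feed combining one category with a comma-separated list of tags."""
    query = urlencode(
        {"category_name": category, "tag": ",".join(tags), "feed": feed_type},
        safe=",",  # leave the commas between tags unescaped
    )
    return f"{site}/?{query}"


site = "http://test.jiscpress.org"
print(combined_tag_feed(site, ["tag1", "tag2", "tag3"]))
print(category_and_tags_feed(site, "category1", ["tag2", "tag3"]))
```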

Finally, authors have a feed, too

http://test.jiscpress.org/author/joss/feed/

Summary

WordPress is a versatile CMS for organising/designing and publishing data as feeds and therefore a useful source of Open Data. JISCPress has extended this versatility by choosing to develop further data end points using digress.it and offering a simple way of publishing Linked Data to the Talis Platform RDF triple store where it can be queried and mashed up using the platform’s API.

digress.it version 2.3 was released last…

digress.it version 2.3 was released last night and this marks the last major release of this WordPress plugin funded within the time frame of the JISCPress project. It is worth pointing out that our project funding effectively bootstrapped the re-birth of CommentPress and paid for Eddie Tejeda, the original CommentPress developer, to rewrite CommentPress from scratch into digress.it. I was recently told that this work has led to Eddie being asked by Cornell University to work on a really interesting and high-profile digress.it-based project for them which we’ll be announcing soon. It’s great to see JISC’s work sustained in this way and hear that digress.it will be properly maintained through additional funding.

This release brings better IE6 & 7 compatibility, a smoother, better Comment Box, a document section level comment view, an option to parse lists into separately commentable points, BuddyPress compatibility, document section level feeds and a bunch of bug fixes. Overall, it feels like stable, feature rich code.

As noted above, we added one more RSS feature which now means digress.it can be used as an RSS feed builder. Each paragraph in any given blog post/document section, can be extracted as an RSS feed ‘item’. See http://writetoreply.org/jiscstrategyreview/feed/paragraphlevel/8-measuring-success/ for an example (and note the /feed/paragraphlevel/post_slug/ syntax used!)

I’ll be writing more in the next day or so about all the other ‘open data’ end points that we’ve developed during the JISCPress project.

“Campaign” Tracking With Google Analytics

Of the very many things that it’s possible to provide webstats reports about, such as tracking visitors arriving from organisational websites, one of the most useful is being able to track how much traffic has been driven back to your website from a particular link – such as a link included in a particular tweet, or in a particular email announcement, and so on.

If a link to a JISCPress document appears on a third party webpage, and somebody clicks on that link and then lands on the corresponding JISCPress page, Google Analytics will capture where that incoming visitor came from via the Referring Sites report. At the top level this is organised by domain:

Google Analytics - Referring sites

We can then tunnel down to the page level:

More referrers

This is all well and good, but sometimes we also might want to know where the person who posted the referring link on their web page got hold of it. Did they capture it from a tweet, for example, or via an email list? When we release a URI into the wild via some sort of marketing campaign, what sort of life does that URI have, and where will it end up sending traffic back from?

In the Google Analytics FAQ answer How do I tag my links?, a method is described for adding additional tags to a referral URL (that is, a URL that you publish and/or distribute more widely that refers back to your website) that Google Analytics can use to segment traffic referred from that URL. Five tags are available (as described in Understanding campaign variables: The five dimensions of campaign tracking):

Source: Every referral to a web site has an origin, or source. Examples of sources are the Google search engine, the AOL search engine, the name of a newsletter, or the name of a referring web site.
Medium: The medium helps to qualify the source; together, the source and medium provide specific information about the origin of a referral. For example, in the case of a Google search engine source, the medium might be “cost-per-click”, indicating a sponsored link for which the advertiser paid, or “organic”, indicating a link in the unpaid search engine results. In the case of a newsletter source, examples of medium include “email” and “print”.
Term: The term or keyword is the word or phrase that a user types into a search engine.
Content: The content dimension describes the version of an advertisement on which a visitor clicked. It is used in content-targeted advertising and Content (A/B) Testing to determine which version of an advertisement is most effective at attracting profitable leads.
Campaign: The campaign dimension differentiates product promotions such as “Spring Ski Sale” or slogan campaigns such as “Get Fit For Summer”.

(For an alternative description, see Google Analytics Campaign Tracking Pt. 1: Link Tagging.)

The recommendation is that campaign source, campaign medium, and campaign name should always be used.

Elsewhere, (Library Analytics (Part 7), from which elements of this post have been taken), I considered how these codes might be used to track course referrals to Library resources from a VLE (something I need to revisit, now I’ve had a little more time to consider the possible role(s) of these tracking codes). But it also seems to me to be reasonable to raise a few questions about how we might use these tracking codes in the context of a document on JISCPress or WriteToReply in order to track referrals back to the site from social media campaigns highlighting a particular document or section of a document.

So, what are sensible mappings/interpretations for the campaign variables? Remember, these tracking variables are parameters that we might add to a link that we have posted somewhere that is intended to drive traffic back to the site. The tracking variables are there to allow us to see how different links are performing. Thinking about how we might use these five tracking dimensions, whether or not we use them in the “intended” Google Analytics way, may also provide us with some ideas about how to use links to drive traffic back to our site.

To try and ground the exercise, consider this example: a new document is published on JISCPress and we want to see how well links posted on Facebook compare with links posted on Twitter for driving traffic back. For tracking to be most effective, we hope that if a link is rebroadcast or shared, the tracking variables are carried along with it. This means that if a link is posted to Twitter, and then gets shared onto Facebook and onto a blog, we can look at the traffic that comes back, and from where (via the Referral tracking described at the start of this post), for each of the separately released URIs. A second example might relate to a campaign intended to drive traffic back to a particular section or paragraph of a document. This campaign might involve publishing a link back to the same paragraph in a series of separate posts or status updates, each with a different slug or call to action message. That is, each link+message may be published in the same place (and hence have the same referrer information), but at different times and with different link text, or contextual information. A third example might be where there is more than one link back to the same document on a web page, and we want to track how effective each link is compared to the others?

Here are the supported variables again:

  • source: the obvious thing to use this variable for is the domain or URI of the page where the link is published to. So if we tweet a link, twitter.com might be sensible. If we blog it, actually might be best?
  • medium: this is intended to refer to the sort of link that has generated the traffic, such as a banner ad. In our case, we might clarify the intent with which the link was posted, such as announcement, or question;
  • term: this is an optional parameter, and I’m not sure how it should be used or whether it conflicts with other Google services. If we post something with a hashtag on twitter, or a set of tags on delicious, might we use those tags as terms?
  • content: the second optional variable, this is often used to discern A/B test ads. If we tweet the same link with different call to action/prompting questions, maybe this differential content should be uniquely identified with the content field?
  • campaign: typically used for tracking a promotion or campaign, this field might be used to identify a different document when, for example, a link to the top level JISCPress is referred to in an announcement about a particular document?

So for example, we might have something like:
http://writetoreply.org/?utm_campaign=ukgovurisets&utm_medium=announcement&utm_source=actually
appearing as the link for WriteToReply in an announcement about the hosting of the UK Government URI Sets document.

Or maybe a call to action on twitter relating to a particular part of a document:
What benefits would you like to see from #JISCRI calls? http://writetoreply.org/jiscri/2009/03/11/rapid-innovation-projects/#3?utm_campaign=jiscri&utm_medium=question&utm_term=JISCRI&utm_source=twitter.com&utm_content=slug3

To support the generation of tracking URIs, a URL Generator Tool (like the official Tool: URL Builder) that will accept a tweet, for example, along with a JISCPress/WriteToReply URL and then automatically create tracking variable values might be worth considering?
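As a first stab at what the core of such a generator might look like, here is a minimal sketch that adds the campaign variables to an arbitrary link. The tag_link function and its argument names are my own invention; only the utm_* parameter names come from the Google Analytics documentation. Working out sensible values automatically from a tweet is left as the hard part.

```python
from urllib.parse import urlsplit, urlunsplit, urlencode, parse_qsl


def tag_link(url, source, medium, campaign, term=None, content=None):
    """Append Google Analytics campaign parameters to a URL.

    source, medium and campaign are always added; term and content
    are optional. Existing query parameters and any #fragment are kept.
    """
    parts = urlsplit(url)
    params = parse_qsl(parts.query)
    params += [
        ("utm_source", source),
        ("utm_medium", medium),
        ("utm_campaign", campaign),
    ]
    if term:
        params.append(("utm_term", term))
    if content:
        params.append(("utm_content", content))
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(params), parts.fragment)
    )


tagged = tag_link(
    "http://writetoreply.org/jiscri/2009/03/11/rapid-innovation-projects/#3",
    source="twitter.com",
    medium="question",
    campaign="jiscri",
    term="JISCRI",
    content="slug3",
)
print(tagged)
```

Note that the sketch places the #3 paragraph anchor after the query string, which is where standard URL syntax expects a fragment to go.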

Thoughts on JISCPress

As we come to the final month of the JISCPress project, we had some great news over on WriteToReply last week where we were able to announce that Eduserv would be covering our hosting costs for the immediate future (Eduserv funds hosting for WriteToReply, eFoundations: Write To Reply).

So what exactly does the platform we’ve been working on have to offer? Here’s one of the ways I think of it…

A document publishing platform that automatically atomises documents to the paragraph level, allows aggregated commenting at the paragraph and ‘user’ level, and supports the republication and re-presentation of documents in a variety of standard formats at the document level.

The first part of the process is the (manual assisted) ingress stage, in which documents are imported into the WordPress environment such that each substantive document section ideally maps onto a single WordPress “blog post”:

An RSS feed for the document as a whole, with one item per section, is generated automatically by the WordPress platform. A single item RSS feed is also generated for each page (so the content of each page can be easily transported around the web).

The second part of the process is the atomisation of each post, carried out automatically by the Digress.It theme, in which each paragraph in the document is given its own unique URI, derived from the URI of the web page (“blog post”) the paragraph appears on:

Potentially, an RSS feed can also be produced for each page in which each paragraph is a separate feed item, thus allowing a page/section to be transported around the web via a single feed, but in atomised form.

The paragraph level chunks produced by the atomisation process can be transcluded as independent elements in other web documents by a variety of means (as an embeddable object, via XML, txt, JSON, etc):

The default nature of the WordPress platform allows comments to be made at the level of each web page, with an RSS feed of comments for each page being published ‘for free’. JISCPress extends this functionality by allowing comments to be associated with discrete paragraphs. Views over the comments are also available at the user level, (that is, grouped according to the user who made the comments, wheresoever they are made in the document). An additional RSS feed of comments by user is also available, which means that a document on the platform can actually be used as a scaffold for a critical response to the document by a particular user.
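One way to picture the paragraph level and user level views is as two different groupings over the same pool of comments. The following is just an illustrative data model, not how digress.it stores things; the record fields, user names and comment text are all made up.

```python
from collections import defaultdict

# Made-up comments: each one is attached to a paragraph and has an author.
comments = [
    {"user": "reader1", "paragraph": 1, "text": "Needs a definition here."},
    {"user": "reader2", "paragraph": 1, "text": "Agreed."},
    {"user": "reader1", "paragraph": 3, "text": "How will this be funded?"},
]


def group_comments(comments, key):
    """Group comment texts by the given field ('paragraph' or 'user')."""
    grouped = defaultdict(list)
    for comment in comments:
        grouped[comment[key]].append(comment["text"])
    return dict(grouped)


by_paragraph = group_comments(comments, "paragraph")
by_user = group_comments(comments, "user")
print(by_paragraph)
print(by_user)
```

The by-user grouping is the "scaffold" view: read in order, one user's comments amount to their running response to the document.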

A further level of innovation is based on the automated generation of ‘semantic tags’ at the page level. Once generated, tag based collections of posts can be syndicated in the normal way via WordPress generated tag based RSS feeds:

JISCPress also benefits from the Trackback mechanism implemented by WordPress. When a page or paragraph URI is linked to from a third party web page, a trackback to the originating page may be captured, which we interpret as the automated capture of links to remote annotations or comments about the document.

When considered in these terms, the JISCPress/WriteToReply platform is seen to provide a powerful means of publishing documents in which individual sections may carry their own unique URI, and individual paragraphs within a section also contain their own unique URI (which in many situations may be rooted on the section URI).

The platform can also be regarded as republishing – or re-presenting – each section (i.e. page) and each paragraph as an independent entity. That is, whenever a document is published via the platform, each separate paragraph may also be thought of as being independently published “for free”, in the sense that:

– each paragraph is independently addressable,
– each paragraph is independently commentable, and
– each paragraph is independently republishable/syndicatable.

So, given that, can you think of any ways in which the JISCPress/WriteToReply platform can support your document publishing and comment gathering strategy?

I am at Lincoln LocalGovCamp, where 30 or so people have gathered to create an unconference around improving local government online. This morning, I started a session on online consultations where I talked about WriteToReply and the development of our ideas and the platform through the JISCPress project. There was a lot of positive feedback and Twitter back-channel chat about our work, which was really encouraging. People seemed to appreciate our efforts around making the platform a source for open data via the URI switches, RSS feeds and Triplify end points. I’ve just given a five minute video interview where I introduce WriteToReply and JISCPress. It should appear on http://www.lgeoresearch.com/ soon.

Paragraph Embedding from JISCPress

One of the things I was keen to explore within the context of the JISCPress project was the potential for using WordPress as a platform for publishing paragraph level fragments that could be embedded in third party web pages.

As Joss announced on the JISCPress blog (We’ve got paragraph data output switches!), we now have switches that expose paragraph level content through a unique URI in a variety of formats (xml, txt, html, rss and json), as well as object embed codes for each paragraph, though I’m not sure if this is going to be maintained…? E.g. at the moment, I think we’re trialling literal text blockquote embeds:

Blockquote embed

(If the object embed does disappear, similar functionality could be achieved using the JSON feed and a Javascript function, though I guess we need JSON-P, i.e. support for something like &callback=foo, to make that really easy.)
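If the JSON switch did gain JSON-P support, the embed might need only a few lines of script. What follows is just a sketch: the callback parameter, the shape of the JSON payload (a content field) and both function names are assumptions, since the feed does not currently offer a callback option.

```javascript
// Build the JSON-P request URI for one paragraph.
// ASSUMPTION: the server would honour a "callback" parameter; it does not today.
function jsonpUri(paragraphUri, callbackName) {
  var sep = paragraphUri.indexOf('?') === -1 ? '?' : '&';
  return paragraphUri + sep + 'format=json&callback=' +
    encodeURIComponent(callbackName);
}

// Browser only: inject a <script> tag so the feed is fetched and the
// named callback is invoked with the paragraph data.
// callbackName must be a valid Javascript identifier.
function embedParagraph(paragraphUri, targetId, callbackName) {
  window[callbackName] = function (data) {
    // ASSUMPTION: the payload exposes the paragraph text as data.content
    document.getElementById(targetId).textContent = data.content;
  };
  var s = document.createElement('script');
  s.src = jsonpUri(paragraphUri, callbackName);
  document.getElementsByTagName('head')[0].appendChild(s);
}
```

With a div whose id is wtr_embed on the page, a call such as embedParagraph('http://writetoreply.org/googlebooks?p=8&digressit-embed=4', 'wtr_embed', 'foo') would then drop the paragraph text into that div.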

See also: A Quick Update for a review of the latest feature releases within the digress.it theme we’re using.

To demonstrate one possible use case for object embedding, see the post Engaging With the Issues Raised By The Google Book Settlement which includes three embedded paragraphs from the JISC’s current consultation around the Google books settlement.

Embedding content from write to reply

Here’s the actual HTML:

Embedding content from WriteToReply

Note that currently there is an issue with sizing the embed container (can any CSS gurus out there give us a fix?

Object sizing issue with WTR embeds

Ideally we need to identify the container height and then size it automatically so there are no scrollbars? I’m guessing .scrollHeight might have a role to play in autodetecting this?)
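Here is one way the .scrollHeight idea might work, sketched under a big assumption: the embedded document is served from the same origin as the host page. Browsers do not expose contentDocument across origins, so this would not work as-is for a third party page embedding WriteToReply content. The function name fitObjectToContent is made up for this example.

```javascript
// Resize an <object> (or <iframe>) element to the height of the
// document it contains, so that no scrollbars are needed.
// ASSUMPTION: same-origin content; cross-origin embeds give a null contentDocument.
function fitObjectToContent(el) {
  var doc = el.contentDocument;
  if (!doc || !doc.documentElement) {
    return false; // nothing readable, so leave the fixed height alone
  }
  el.style.height = doc.documentElement.scrollHeight + 'px';
  return true;
}

// Browser usage: wait for the embed to load before measuring it.
// var o = document.getElementById('61c197964762012d4819093ebeee4fcf');
// o.onload = function () { fitObjectToContent(o); };
```

For the cross-origin case, the embedded page would probably have to report its own height to the host page instead, which is a different design from the one sketched here.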

One thing you might notice is that the URIs for the embedded consultation questions follow a similar pattern – only the paragraph number identifier changes:
http://writetoreply.org/googlebooks?p=8&digressit-embed=4

What this means is that we should be able to pull in a random paragraph by constructing a URI with a randomly generated paragraph number. So for example:

If you reload the page, you have an 80% chance of seeing a different question…

Here’s the Javascript snippet:

// Pick a random paragraph number from 2 to 6 inclusive (five possible questions)
var n=2+Math.floor(Math.random()*5);
var o=document.createElement('object');
o.setAttribute('style','width: 100%; height:70px;');
o.setAttribute('id','61c197964762012d4819093ebeee4fcf');
// Build the embed URI; only the paragraph number changes
var p='http://writetoreply.org/googlebooks?p=8&digressit-embed='+n;
p=p.replace(/#038;/,''); //get round WordPress escaping everything...
o.setAttribute('data',p);
// Attach the object to the placeholder div
document.getElementById('wtr_embed').appendChild(o);

//There’s a div with an appropriate id attribute (‘wtr_embed’) also added to the page…
//Note that the div needs to be placed before any inline Javascript in the page;-)

I’m not sure yet if we can track the use of embeds (certainly server logs should be able to track calls, but these probably can’t be captured using Google Analytics?), but it’s still early days…

A quick update

A lot of development is happening right now, so I thought I’d write a very quick summary to keep people informed.

Firstly, version 2.2 of the digress.it plugin was released yesterday. Remember that the JISCPress project bootstrapped the re-development of CommentPress (which has been at v1.4.1 for over a year now, I think) and we helped Eddie release digress.it v2 back in mid-August. We’ve had seven releases since then and v2.2 finally brings IE6 compatibility with it (IE7 came in v2.1.7). It feels stable now and provides pretty much the same experience across browsers. Performance is superb on a modern browser like Chrome 3, Firefox 3.5 or Safari 4. I’ve found that with wp-super-cache installed, too, pages are rendered in a snap.

I’ve also started to document the features that come with digress.it. Some of the really interesting stuff isn’t immediately obvious, like the incredible range of RSS feeds that are now available and the switches for RSS, JSON, XML, HTML and text. @paulgeraghty asked on Twitter whether this might be ‘micro-content’. I’d be interested to know if there are other CMS platforms that provide a formal method of obtaining document data at the paragraph level.

http://test.jiscpress.org/?p=15&digressit-embed=1&format=xml
http://test.jiscpress.org/?p=15&digressit-embed=1&format=text
http://test.jiscpress.org/?p=15&digressit-embed=1&format=rss
http://test.jiscpress.org/?p=15&digressit-embed=1&format=html
http://test.jiscpress.org/?p=15&digressit-embed=1&format=json

And remember that this is in addition to the full document or document section level RSS feeds that are built into WordPress. We’ve also introduced RSS feeds for each comment author and for the discussion around each paragraph, so if you want to follow one particular person or a discussion around one particular paragraph, you can.
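Because the paragraph-level switch URIs listed earlier follow a regular pattern, they are easy to generate in code. The sketch below takes the parameter names (p, digressit-embed, format) from those URIs; the function name and argument order are made up for the example, and it assumes a runtime that provides the standard URL object.

```javascript
// Build the URI for one paragraph of one post in a given output format.
// paragraphUri() is a hypothetical helper, not part of digress.it.
function paragraphUri(siteUri, postId, paragraphNumber, format) {
  var u = new URL(siteUri);
  u.searchParams.set('p', String(postId));
  u.searchParams.set('digressit-embed', String(paragraphNumber));
  if (format) {
    u.searchParams.set('format', format);
  }
  return u.toString();
}

// One URI per output format for paragraph 1 of post 15.
var formats = ['xml', 'text', 'rss', 'html', 'json'];
var uris = formats.map(function (f) {
  return paragraphUri('http://test.jiscpress.org/', 15, 1, f);
});
```

Looping over paragraph numbers instead of formats would give every paragraph of a page in a single format.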

We’re still working on ways to provide an easy way to copy and paste some code and embed a paragraph in your own site, while at the same time giving us a paragraph-level trackback. We’ve been trying various different methods but none of them have worked so far. We’re close though. If you’ve got any ideas for how this might be achieved, please leave a comment 🙂

Alex has been working hard on platform-wide features. He recently uploaded his ‘related documents’ code which looks across the entire platform of documents and makes suggestions for related document sections in the page sidebar. What’s especially interesting about this is the way this is achieved as a background service that runs periodically (you choose how often) and uses the OpenCalais API to provide contextual tags and the Yahoo! Term Extraction API to extract terms from the document. The relevancy of the tags received can be adjusted and author entered tags are also taken into account. These three different methods of mining the document ensure that the document sections that are ‘advertised’ to readers are relevant to the document they are currently reading.
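To make the idea concrete, here is a toy sketch of how tags from several sources could be pooled and used to rank related sections. It is not Alex’s code: the data shapes, the Jaccard overlap measure and the minScore threshold are all assumptions chosen for illustration, and the calls to OpenCalais and Yahoo! Term Extraction are left out entirely (their results are simply taken as arrays of strings).

```javascript
// Merge tags from the three sources into one lower-cased, de-duplicated list.
function pooledTags(section) {
  var all = [].concat(
    section.calaisTags || [],
    section.yahooTerms || [],
    section.authorTags || []
  );
  var lowered = all.map(function (t) { return t.toLowerCase(); });
  return Array.from(new Set(lowered));
}

// Jaccard overlap: shared tags divided by all distinct tags across both lists.
function overlap(a, b) {
  var inA = new Set(a);
  var shared = b.filter(function (t) { return inA.has(t); }).length;
  var total = a.length + b.length - shared;
  return total === 0 ? 0 : shared / total;
}

// Return the other sections scoring at or above minScore, best match first.
function relatedSections(current, others, minScore) {
  var tags = pooledTags(current);
  return others
    .map(function (s) {
      return { id: s.id, score: overlap(tags, pooledTags(s)) };
    })
    .filter(function (r) { return r.score >= minScore; })
    .sort(function (x, y) { return y.score - x.score; });
}
```

Raising or lowering minScore plays the role of the adjustable relevancy mentioned above, though the real plugin may well weight the three sources differently.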

Alex has also been working on integrating Triplify with JISCPress (and WordPressMU).

Triplify is a small plugin for Web applications, which reveals the semantic structures encoded in relational databases by making database content available as RDF, JSON or Linked Data.

In practice, this means that the semantic structures for each JISCPress document are now available as RDF triples. Click here and you’ll get an XML/RDF file for a single document. Alex has also written a plugin for WPMU which will work with Triplify and allow the document author to include a license of their own choice in the RDF. Finally, he’s been testing this with Talis’ Connected Commons triple store and now has the WPMU plugin pushing RDF triples to Talis where they can be queried and mashed up using the Talis API. His work on this should go up on our Google Code site in the next few days.
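As a rough illustration of what “database content as RDF” means, the sketch below turns one post record into N-Triples lines. This is not Triplify’s code, nor necessarily the vocabulary it emits: the record fields, the choice of Dublin Core predicates and the function names are assumptions made for the example.

```javascript
// Escape a string for use as an N-Triples literal.
function literal(s) {
  return '"' + String(s)
    .replace(/\\/g, '\\\\')
    .replace(/"/g, '\\"')
    .replace(/\n/g, '\\n')
    .replace(/\r/g, '\\r') + '"';
}

// Map one (hypothetical) post record to subject-predicate-object lines.
function postToTriples(post) {
  var DC = 'http://purl.org/dc/terms/';
  var subject = '<' + post.uri + '>';
  var lines = [
    subject + ' <' + DC + 'title> ' + literal(post.title) + ' .',
    subject + ' <' + DC + 'created> ' + literal(post.created) + ' .'
  ];
  if (post.license) {
    // The licence is a resource, so it is written as a URI, not a literal.
    lines.push(subject + ' <' + DC + 'license> <' + post.license + '> .');
  }
  return lines;
}
```

Each line is one triple: the document URI, a property, and a value. Lines of this kind are what could then be pushed to a triple store and queried.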

It all needs testing and tweaking a little more, but the substantial part of the work on these three plugins has been done and now it’s a matter of refining them and integrating the platform as a whole and documenting it thoroughly. We’re always interested in what you would like to see the JISCPress project achieve, so please take a look at our UserVoice site and add any suggestions you might have.  We’re also tracking Issues about JISCPress on Google Code and Issues specifically about digress.it on the digress.it Google Code site. You can also get the development code for digress.it there, too.

More soon!

We’ve got paragraph data output switches!

http://test.jiscpress.org/?p=15&digressit-embed=1&format=xml
http://test.jiscpress.org/?p=15&digressit-embed=1&format=text
http://test.jiscpress.org/?p=15&digressit-embed=1&format=rss
http://test.jiscpress.org/?p=15&digressit-embed=1&format=html
http://test.jiscpress.org/?p=15&digressit-embed=1&format=json

Err, I’ll let Tony explain what we might do with those…

Yesterday, I posted to the JISCPress mailing list about our use of semantic technologies. It’s a useful summary of where we are. I spoke to Leigh at Talis today and he thinks it’s a good approach with many potential benefits. He’s giving us access to the Talis Connected Commons platform. http://groups.google.com/group/jiscpress/browse_thread/thread/d2e69455c72f724a

Introducing digress.it

At the core of JISCPress is WordPress Multi User and CommentPress. CommentPress is now called digress.it. You can read about, test and download the latest version of digress.it from the WordPress plugin repository.

Commenting in digress.it

It should be pointed out that while the JISCPress project is brand spanking new, the CommentPress/digress.it project is officially two years old and the product of much research, development and testing of document publishing and annotation in a networked environment. I have blogged/raved about CommentPress before, and I urge you to read about the background of CommentPress/digress.it over on the Institute for the Future of the Book’s original CommentPress site.

You’ll see how digress.it has evolved from the original GAM3R 7H30RY 1.1 (Gamer Theory) book site, to Mitchell Stephens’ paper, The Holy of Holies: On the Constituents of Emptiness, which was inspired by Jack Slocum’s WordPress system built for the drafts of version 3 of the GNU General Public License. The next iteration of digress.it was the Iraq Study Group Report and The President’s Address to the Nation, January 10th, 2007. These were followed by HASTAC’s draft paper on The Future of Learning Institutions in a Digital Age and finally by Kathleen Fitzpatrick’s paper, Scholarly Publishing in the Age of the Internet (no longer available).

digress.it is a significant rewrite and development of CommentPress and I’m really pleased that the JISCPress project is not only using it as a core technology but also contributing quite heavily to its further development. CommentPress is already popular in Higher Education for the critique of texts by students, the open peer-review of manuscripts, the peer-review of published books and to solicit comment on Institutions’ policy documents. It has also been used by the UK Government looking for feedback on their Innovation Nation strategy. So just as JISCPress benefits from more than two years of open source development of CommentPress, we hope that apart from the JISCPress platform itself, Educators and the public sector will benefit from the improvements we make to digress.it. We know that difficulty meeting WCAG accessibility guidelines has meant that CommentPress couldn’t be more widely used in the Public Sector and this is one of the first tasks that we’ll be addressing in the JISCPress project.

If you want to have a say about the development of digress.it for JISCPress (remember, all code is open source and can be used for any other WordPress-based project), then post your thoughts to our UserVoice site. We’re always open to suggestions.

#jiscri #talis Alex has checked in his WPMU Triplify script to Google Code. http://j.mp/78ywC This creates Linked Data RDF triples (or JSON) directly from the WPMU MySQL database. I wrote about this a while back: http://j.mp/2zXiR Our next step is to automate the upload of the RDF to Talis Connected Commons for use through their API: http://n2.talis.com/wiki/Platform_API The idea is that if the JISCPress platform were populated with hundreds/thousands of documents, each of these documents could become Linked Data (http://www.w3.org/DesignIssues/LinkedData.html). The content would be stored on the JISCPress platform, but the RDF triples would be pushed nightly to Talis.

Innovation as a Side Effect… JISCPress and the JISC Strategy Review

The eagle eyed amongst you may have noticed that we recently republished the JISC Strategy Review 2010-2012 on WriteToReply in part as a way of field testing the new digress.it theme that has been under development as part of our JISCPress project.

Some time ago, I remember reading a book by Gary Hamel (with Bill Breen) on “The Future of Management” that included a model referred to as the innovation stack.

The model was pyramidal, and comprised four layers – at the bottom, operational innovation; sitting on top of that was product or service innovation, followed by strategic innovation, and at the top, management innovation.

Now I’ve never done an MBA, so my reading of this book may be out of line with a ‘traditional’ reading of it, but here’s what came to my mind when we originally floated the idea of republishing the JISC Strategy Review on WriteToReply, offered as a straw man…

  • operational innovation: Dev8D and the development approaches encouraged in JISCRI projects represent operational innovations; publishing documents on JISCPress is an operational innovation aimed at helping JISC programme managers clarify project calls and JISC project teams shape their bids and disseminate their results;
  • product/service innovation: in many cases, the JISC calls for projects seek to encourage product or service innovations, as well as operational innovations; as a hosted service, the JISCPress platform can be seen as a service innovation, running either as a centrally hosted service, or as a document platform in its own right hosted by an institution itself.
  • strategy innovation: to a certain extent, programmes like the #ukoer programme represent operational steps that may support a strategic innovation in the way HEIs disseminate the fruits of their scholastic endeavours. The idea of Open Repositories and Open Science also operates at the level of strategic innovation. I think I’d be pushing a little more than I already am to find a strategy innovation role for JISCPress!
  • management innovation: JISC Reviews are often disseminated to PVCs and research managers on an “I2I” (institution-to-institution) basis. JISCPress breaks that… badly. JISCPress allows anyone to comment and provide their own response directly to JISC, rather than necessarily representing the traditional response from the top of the strategy/research/IT management hierarchy within the institutions.

So, with that warm up exercise over, it’s time for me to get stuck in to reading the JISC Strategy Review properly… Hmm, now I wonder, does Hamel’s innovation pyramid map in any way onto JISC’s strategy for innovation across the HE and FE sector….?!