This is so far just a proposal – I have not fully tested it yet, but I have explored options for two days and this seems best for my requirements. (And thanks to mantas in the comments, whose tip about Docear led to this major rework of an earlier article, which used FreeMind instead.)

Requirements

  • literature database, with both metadata and fulltexts
  • adding to the literature database collaboratively, synced to everybody in a group
  • creating notes right when reading
  • creating notes on documents collaboratively, synced to everybody in a group
  • flexible means for organizing note collections, with links to the part of the annotated document
  • visually navigable documents when writing (no, not TeX or HTML source)
  • ability to draft scientific articles and theses in a mindmap-style, "notes organizing" mode
  • ability to write scientific articles in WordPress
  • ability to write scientific articles in LibreOffice, with easy web publishing both as HTML in a WordPress site, and as PDF
  • automatic creation of a references list when writing, both in WordPress and LibreOffice
  • both web-published (HTML and PDF) and local versions shall have working hyperlinks to cited works, including links to the page
  • tools should work cross platform (Linux, Windows, Mac)
  • free and open source software as much as possible

Solution Proposal: Docear + Zotero

Tool collection

My proposal uses the following tools, which you should install (available for both Windows and Linux):

Workflow

[TODO: Show how to use Dropbox or an open source replacement for it to sync your fulltext files with annotations (only the cited ones) to your server. And how to password-protect those fulltext files that cannot be published to everybody, using an efficient command that appends to a .htaccess file.]
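
A minimal sketch of the password-protection part, assuming an Apache server with shell access (paths, file name patterns and the user name are just examples):

    # Create the password file once (-c); add further group members without -c:
    htpasswd -c /var/www/.htpasswd-literature alice
    # Protect only the non-publishable fulltext files by appending to .htaccess:
    cat >> /var/www/literature/.htaccess <<'EOF'
    AuthType Basic
    AuthName "Restricted fulltexts"
    AuthUserFile /var/www/.htpasswd-literature
    <FilesMatch "^(Gantt2001|Smith2003)\..*\.pdf$">
      Require valid-user
    </FilesMatch>
    EOF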

  1. Document collecting. Whenever you find an interesting document on the web, use the Zotero browser plugin to create a literature entry in your Zotero group. Use tags to mark documents you have read and want to read. Create attachments to the entries, linking to the online fulltext of these articles.
  2. Finding what to read.
    1. Use your Zotero browser plugin or standalone application to browse through your group's shared bibliography and select items you want to read.
    2. Download them with Zotero to your "Literature" directory and name them according to their key in Zotero, plus your author shortname (example: Gantt2001.matthias.pdf). This is because this file will contain your annotations, while files with other author suffixes will contain those authors' annotations.
    3. Add a (hypothetical) web link to your annotated file as an attachment to the respective literature entry in your Zotero group. This will enable other researchers in your group to see your notes once you have read the file and uploaded it to this location.
  3. Reading and annotating. All files you want to read will now appear in your Incoming.mm in Docear, because you added them to your Literature directory. Order them into your reading list in Literature & Annotations.mm. Click the links there to start reading, with PDF-XChange as your PDF reader. Create annotations, bookmarks and text highlighting as desired in PDF-XChange, and save. To simplify your later task of finishing your own articles, include in every bookmark, annotation and highlighted text the source, in brackets; for example "[Gantt2001, pp.21]".
  4. Mobile reading and annotating (optional). [TODO]
  5. Note organizing and drafting with Docear. 
    1. Click "Refresh" in your Docear's Incoming.mm to get your PDF bookmarks, notes and annotations into Docear.
    2. If you intend to create a thesis, you want to use a dedicated mindmap for its draft, else drafting multiple articles in one mindmap seems most comfortable.
    3. I like to create an "inbox" node in the mindmap and collect all my ideas into that whenever you don't have the time to find a more appropriate place for them.
    4. Create an initial table of contents as a hierarchical node structure in your article / thesis draft mindmap.
    5. Add, copy and move nodes and rework your table of contents hierarchy until the textual part of your thesis is done. For me that means: until the written thoughts form collections that I consider drafts for the article(s) I want to write. Finishing the wording and connecting the sentences properly can be done more comfortably in LibreOffice.
    6. You can click the links of bookmark, annotation and highlighted-text notes to look them up in their original document, which is, in my view, Docear's greatest feature.
  6. Writing and publishing with WordPress.
    1. Sync to Zotero. That is, sync your Zotero literature database to the Zotero server; this is necessary because it will be accessed by your ZotPress and Enhanced-BibliPlug plugins.
    2. Copy and paste your article draft. You can just copy your complete list of draft notes into a WordPress post and start editing it from there.
    3. Finish your article. 
    4. Replace text refs with real literature references. You will of course lose the link information then, but as you added the literature refs to the bookmarks, notes etc. before (like "[Gantt2001, pp.21]"), this is not much of a problem. Just double-click to select the literature key of such a reference, copy it, then insert a corresponding literature reference with ZotPress by pasting the key into its dialog. [TODO: How can this be more automated? Selecting the text and using a key combination would be enough. Maybe possible by configuring the rich text editor to insert a special tag that way, if ZotPress recognizes literature references in text based on tags.]
    5. Insert links to fulltext files (optional). You can make the "p.xy" part of literature references a regular hyperlink to a PDF file, with a subpart marker that takes it to the page or bookmark corresponding to the citation. And for faster own use that also shows your annotations etc., you can add an analogous link afterwards, leading to the corresponding offline file on your own computer, using a file://~/Data/Literature/key.authorname.pdf scheme. I recommend making the public link resolve directly to the original article on the web, if possible via a DOI resolver, and not to your annotated version; this shows that you cite unchanged material. You might add a link to your public, annotated version afterwards. I like to use Unicode icon characters for these last two links for compactness. My proposals:
      • "Squared Plus" ⊞ (U+229E) for "annotated document" and "White Draughts King" ⛁ (U+26C1) for "local hyperlink", because the latter looks like the missing hard disk symbol.
      • "Strictly Equivalent To" ≣ (U+2263) for "annotated document" and "House" ⌂ (U+2302) for "local hyperlink".
    6. Upload fulltext files (optional). Upload them into a directory on your own server; using Zotero server space is also possible, but can be quite expensive. This is needed when you inserted links to individual pages etc. in them, see above. It is also needed when you inserted links to them as files in your reference list, which can be necessary when the work is not publicly accessible. You will have to password-protect access to these links, but at least you can thus share these files with the researchers in your group.
  7. Writing and publishing with LibreOffice. The advantages over WordPress are LibreOffice's more advanced formatting options and the ability to create a high-quality PDF version without manual effort. Both are needed when writing for journals; otherwise, writing directly in WordPress seems the simpler way to write about your research and publish it on the web. Except, of course, when you have no Internet access; then LibreOffice is best (even if you just use it in HTML mode, basically as an HTML editor for later copying into WordPress).
    1. Copy and paste your article draft. Just as when writing for WordPress. If it's a long and structured draft in Docear, like for a thesis, you had better export it as follows, to get your table of contents structure over into LibreOffice as well:
      1. Unfold the mindmap so that all notes belonging to TOC headings are visible, but not more.
      2. Then export to .odt format in Docear. The visible nodes will become the headings there, and will thus be included in the TOC when you insert a table of contents into your LibreOffice document.
      3. Open the .odt document in LibreOffice and transform the formatting from hard-coded to style-based. [TODO How can this be done efficiently?]
    2. Replace text refs with real literature references. Just as when writing for WordPress. [TODO: There has to be a feature to convert selected text into a literature reference.]
    3. Insert links to fulltext files (optional). Just as when writing for WordPress. But to link to local fulltext files, you can't use the "file://" URL scheme in LibreOffice, as such links are automatically converted to "smb://". Instead, use relative paths to files, without any directory prefix. That works because you will view your article locally in PDF format after it's finished, when it resides in your own "Literature" files directory, as your own publication alongside all those of others.
    4. Add a link to the PDF version. At the beginning of the document, create a link to the PDF version of it. As this PDF will be part of your regular literature database, just create an entry in Zotero for it and link to it with the same scheme you use for linking to online fulltext versions of cited works.
    5. Remove links to local fulltext files in a copy (optional). Links to local fulltext files are only relevant for yourself, so you might want to hide them in the version that is published on the web. Removing these links is simple, even though LibreOffice supports no links in conditional text: if you used symbols as link text, just search&replace them with nothing.
    6. Export to PDF. Be sure to check "Export links relative to the filesystem." in the PDF export options in LibreOffice. Place the PDF into your "Literature" directory.
    7. Upload fulltext files (optional). Just as when writing for WordPress. Includes uploading the PDF version of your own article.
    8. Publish in WordPress. Should be as simple as exporting from LibreOffice to HTML, copying the HTML into WordPress and (optionally) replacing the textual literature references with ZotPress references. This, however, means manual work whenever you change the article, so you probably better leave it out.
  8. Viewing your web-published content. The Firefox integrated PDF viewer is not ideal for viewing annotated PDFs; while it correctly understands subpart links (like #2 for "page two") and shows annotations and bookmarks, it does not show highlighted text. So better open links to annotated PDFs in Adobe Reader or PDF-XChange, unless you know you're not interested in the highlighting this time.
  9. Cleaning the literature database. You might think from time to time that some document you once included in the literature database is not worth storing any more. But if you cited it, you should keep it. So there should be a command to determine whether you cited a specific work, identified by its BibTeX key (which is the same as its file basename); see the sketch below. If it is not cited, delete the file and all references to it in Docear mindmaps.
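
A minimal sketch of such a check, assuming the mindmaps live somewhere below ~/Data (the path and the key are examples):

    KEY=Gantt2001
    # Docear/FreeMind mindmaps are XML text, so a plain text search works:
    if grep -rl --include='*.mm' "$KEY" ~/Data/ ; then
      echo "$KEY is still referenced -- keep it"
    else
      echo "$KEY seems unreferenced -- the file and its nodes can go"
    fi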

Reasoning

My last thesis was written in OpenOffice.org (now LibreOffice), and the one feature I really missed there was a comfortable way of moving notes, snippets and other parts of text around. Because getting a thesis done is, for a good part, just that: ordering and integrating notes and snippets that you wrote whenever you had an idea or insight.

Since that thesis, I have become a FreeMind enthusiast, using it for nearly all my personal information management. It's truly great at this "ordering notes" task because three aspects come together: it's speedy enough in displaying and dragging even large mindmaps; it provides good visual cues for quick navigation in a huge hierarchical content structure, making it mentally less strenuous to work with (unlike scrolling text in OOo or using OOo outline mode); and it provides great keyboard shortcuts for quick navigation and reordering of nodes in the hierarchy. So I looked for a mindmap-based tool for scientific writing, and the above is what I found.

My reasons for individual aspects of this solution:

  • Why Zotero. This is my current recommendation for a reference management tool for LibreOffice, see my evaluation of reference managers for LibreOffice. Of course you could try to use Docear with its integrated JabRef reference manager for reference management; see the discussion of the alternative below for the reasons why Zotero is used here. In any case, you will want a separate reference manager rather than treating references as just more nodes that you organize with Docear. Because a mindmapping tool is rather poor at structured data, it is no all-encompassing knowledge management system by itself. Also, collaborating on a common bibliography would be difficult with such an approach.
  • Knowledge, not just cited works. This is a knowledge management system; that's why the literature database contains everything you think is worth storing, and not just works you cite or have annotated or have even read.
  • Docear for note management, not reference management. Note that we don't use Docear for literature reference management at all; we don't export anything from Zotero to Docear (although that's possible). That's because so far, there is no way to insert text with inline references from Docear nodes into LibreOffice so that it will contain LibreOffice literature references, and also no way to copy or insert a complete node with a link to a literature work from Docear into LibreOffice so that it will appear there as a literature reference. The only thing we could do is use JabRef and JabRef plugins for LibreOffice to manually select literature entries from the Docear bibliography and create LibreOffice literature references from them. But that can equally well be done with Zotero and its LibreOffice plugin; plus, Zotero has a more efficient way of collecting document metadata when browsing the web, which is why we avoid JabRef. So there's no point in having the literature database available in Docear at all: Docear is used for notes and drafting, while references are only needed when finishing an article. Of course it still has to be easy to detect to which work of literature a note or quotation in an article drafted in Docear belongs; but that is possible in our system, as the PDF file link will contain the literature key as its filename, a scheme that has been tried and tested in practice.
  • Why not Semantik? Looking for mindmapping software for scientific writing, I found Semantik, a tool developed with just this purpose in mind. However, I found its current (as of 2011-08) version in several respects more limited than FreeMind, while I found Docear to be more powerful than FreeMind, especially because of the PDF annotation management.
  • Why no collaborative annotating? In a group of researchers it would have some benefits if all could annotate the same work of literature and, at the same time, see the annotations of others in there. In our solution, there is one file per author with his / her annotations on the contained work of literature. By downloading others' annotated versions into your Literature directory, Docear will import these annotations as well, so that the set of all annotations from all authors is available in Docear, instead of in a single PDF. This is a much simpler system, as it avoids concurrent edits and sync conflicts when editing one file in parallel. No functionality and not much convenience is given away.
  • Why Chromium? Because some WordPress rich text editors, or at least some of their functions, are horribly slow in Firefox on some systems, due to some JavaScript functions being slow there. If this does not affect you, you may as well use Firefox.
  • Why not CiteULike? Zotero was preferred over CiteULike because it is completely free software (though this is not a big issue, as CiteULike offers your literature data in various download formats).

Alternative Solution Proposal: Docear + JabRef + BibSonomy

There is an alternative solution without Zotero, thus saving one separate application and integrating functionality more tightly. However, it misses the sophisticated metadata-grabbing features of the Zotero browser plugin; also, the Zotero "scientific social bookmarking" online groups are much larger, and the service is better known, than newcomer BibSonomy (which does, though, offer interesting features like a WordPress plugin). Still, some might prefer this solution. In this solution, you would use these tools:

  • BibSonomy website. For reference collecting in your team.
  • JabRef. The reference manager integrated into Docear.
  • JabRef standalone. In addition to the Docear integration, using it as a standalone reference manager for some added options.
  • JabRef plugins for LibreOffice. To insert JabRef references into articles you write.
  • BibSonomy plugin for JabRef. To collaborate with your group via the BibSonomy "scientific social bookmarking" website as an alternative to Zotero online groups and CiteULike.
  • AcademicPress WordPress plugin. Together with an online copy of JabRef's BibTeX file, if you want to write directly in WordPress.
  • BibSonomy CSL plugin for WordPress. To show your complete group bibliography in WordPress.

Setting up WordPress for this

  1. Download the Shoestrap theme [source]:
    cd <DOCUMENT_ROOT>/wp-content/themes/;
    git clone git://github.com/aristath/shoestrap.git;
  2. Create two or more variants:
    1. mv shoestrap shoestrap-default;
      cp -a shoestrap-default shoestrap-special;
    2. Edit the file style.css in both your template copies, changing the "Theme Name:" line to configure a unique name for each.
    3. In the WordPress template list (http://example.com/wp-admin/themes.php), all your template copies should now appear separately with their respective names.
  3. Go to "Appearance -> Shoestrap" and configure it to your needs, incl. a licence key. (It will be used for all installed Shoestrap instances,)
  4. Use the WordPress theme customizer ("Appearance -> Themes -> (choose theme) -> Customize / Live Preview") to customize your theme instances to your needs, and activate the theme instance that you want to be the default. Make sure to choose the correct theme activation options in the additional screen: if you already have a navigation menu, don't let it generate another one.
  5. Install the jonradio Multiple Themes plugin.
  6. Disable the Shoestrap theme activation options screen. This is helpful to save click work, because to configure a non-default theme for jonradio Multiple Themes, you have to activate it briefly, then activate your default theme again ["method 1" in the plugin's FAQ]. To disable this screen, edit the file lib/activation.php in all your Shoestrap instances and replace line 7 with:
    # wp_redirect(admin_url('themes.php?page=theme_activation_options')); # Original.
    wp_redirect(admin_url('themes.php'));

Configuring your multiple themes

In "Appearance -> Themes" you see multiple themes, one for each of the blog's differently styled areas. How to configure them:

  • The activated theme can be configured in the regular WordPress Theme Customizer ("Appearance -> Themes -> (active theme) -> Customize" or the bottom-left blue triangle in the frontend site when you're logged in).
  • To add and configure different (copies of) a theme for different parts of your website, the "Multiple Themes" plugin is used. See "Appearance -> Multiple Themes plugin".
  • To configure the appearance of a theme that is not "activated" as the default but used on some subpart of the site, edit it with the "Appearance -> Themes -> (some theme) -> Live Preview" feature, and after saving, re-activate your default theme. Detailed explanations here (under "method 1").

If you want to edit even more aspects of the templates, you can do so on CSS level (including the use of LESS variables):

  1. You should install the WPide editor plugin and use it to edit the theme's CSS files, as they reside in subdirectories where they can't be edited with the integrated WordPress editor. Alternatively, if you have SSH access, use a console-based editor like vi, of course.
  2. First, choose "Enable Developer Mode" in "Appearance -> Shoestrap" and save that settings page.
  3. Then edit this file in the concerned Shoestrap theme: assets/less/app.less.
  4. After all edits are done, disable developer mode again.


The WordPress Polylang plugin is a nice system to make your WordPress blog multilingual. However, if you had a different solution in use before and have a lot of blog posts, migrating to Polylang is an effort that cannot reasonably be done manually. So I developed a little Ruby script, polyglot2polylang.rb, for that. It is meant to migrate your blog content from the (no longer maintained) Polyglot plugin. However, you can easily adapt it to work for all multilingual plugins that store different language versions inside one WordPress page / post / comment etc. by using special tags, for example <lang_en>…</lang_en>.
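
To illustrate what the script has to untangle, here is a made-up example of the two storage models:

    Polyglot: one post, languages inline
      <lang_en>Welcome!</lang_en><lang_de>Willkommen!</lang_de>

    Polylang: one post per language, linked as a translation group
      post 17 (en) "Welcome!"  <->  post 18 (de) "Willkommen!"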

Installation

  1. Download the Ruby scripts and Gemfile from the polylang-migrate repository at Github, and save them all into one directory. Or just git clone it, of course.
  2. In that directory, execute bundle install to install necessary gems (basically just nokogiri).

Usage

  1. Export your WordPress blog content as WordPress WXR file using the "Tools -> Export" menu item.
  2. Syntax-check your exported WXR file; see the sketch after this list. (Otherwise, unexpected behavior can occur. For example, an unexpected closing tag with no corresponding opening tag will cause nokogiri to consider the current item the last one, without any hint or warning, which means that not all posts of the blog will be processed.) To check the syntax, use for example xmllint --noout, which prints parser errors (on stderr) and nothing if all is well.
  3. Run the polyglot2polylang.rb script on the exported WXR file (see the instructions in the script).
  4. Make a backup of your complete WordPress database.
  5. Make a backup of all your media files: cd httpdocs/wp-content/; cp -a uploads/ uploads.orig/;. This is because when deleting the last media library entry that refers to a specific file, that file will be deleted, too. And we will have to delete all media library entries later!
  6. Install the WordPress Polylang plugin.
  7. Delete all posts, pages, comments, categories, tags and media library entries in your WordPress blog, using the WordPress backend interface. You can use the Bulk Delete plugin to speed up that task a bit.
  8. Make sure the files for the media library entries are available at the URLs mentioned in their <item> tags in the WXR file. Adapt the polyglot2polylang.rb script to modify these URLs accordingly, if necessary.
  9. Import the modified WordPress WXR file. When asked, check "Download and import file attachments."
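
A minimal sketch of steps 2 and 3 on the command line, assuming the export is called export.wxr (the script's exact invocation is described in its header, so the arguments here are illustrative):

    # Step 2: abort if the WXR export is not well-formed XML:
    xmllint --noout export.wxr || exit 1
    # Step 3: run the migration script on the checked file:
    ruby polyglot2polylang.rb export.wxr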

The last step will fail for WXR files with many attachments to download and import (incl. a 500 Internal Server Error), unless your web server is configured to allow very long script execution times. You can either configure it to extend these execution times [instructions, of which step 1 is irrelevant now] or alternatively use this rather clumsy workaround (in analogy to what other people found):

  1. Use wxr-separate-attachments.rb (also in the polylang-migrate repository) to split your WXR file into attachments and content (posts and pages).
  2. Create an item (media file, page or post) with an ID that is higher than every ID that will be used by your imports. This can be done by creating some media file, then modifying its field "ID" in table wp_posts via phpMyAdmin; see the SQL sketch after this list. This step is needed because WordPress will create a media file for every WXR file you upload for import, and if its ID is one that should be available to an imported post, the imported post's ID will get shifted instead, without any error message, and the IDs of all following imported posts likewise, producing total confusion with respect to ID references in the WXR file.
  3. Import the WXR file with attachments into your WordPress blog repeatedly, until no error happens. After every import, some more media files will have been successfully downloaded and imported. Of course, check "Download and import file attachments." every time.
  4. Import the WXR file with posts and pages.
  5. You will now have all content imported, but media library entries will appear as not attached to posts and pages [issue report]. This also happens when importing attachments after posts and pages. The solution is to create the attachments directly in the database, using phpMyAdmin to execute the SQL file <outfile>.attach.sql that was also created when you ran polyglot2polylang.rb.
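
For step 2 of this workaround, the ID change could look like this in SQL (a sketch only; credentials, database name and both IDs are examples):

    mysql -u wpuser -p wordpress <<'SQL'
    -- Move the freshly created media post to an ID above all IDs in the WXR file:
    UPDATE wp_posts SET ID = 100000 WHERE ID = 1234;
    SQL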

Limitations

  • Workaround for title translations getting lost. As documented in the script, the WXR export will not contain Polyglot markup tags in the post and page titles, so these translations get lost during this process. You can, however, get the original titles as an export from your database (for example by exporting a one-column result set to CSV in phpMyAdmin) and then use an adapted version of depolyglot.rb to create SQL statements that convert your artificially-unique titles back to the correct translations that got lost.
  • Changing language-specific slugs. After this process, you will have the original post slugs for one language version, and other language versions with a suffix ("-italiano" in the unmodified scripts). The polyglot2polylang.rb script generates a further SQL file alongside <outfile>.attach.sql that you can edit and execute to adapt these more to your liking, by doing proper translations. The original slugs are better not changed this way, to keep URL compatibility with existing links; but you can edit them in your WordPress backend – WordPress then creates a 302 forwarder for the original one, and this also gets saved when backing up to a WXR file.
  • Better to interface with the database directly. The whole transition process is quite complex, and further complicated by title translations getting lost in a WXR export and by timeouts of the WXR import. For that reason, if I had to re-do this task, I would write a script that directly operates on the WordPress SQL database. The database has a clear structure, and I'd rather deal with that than with all these hacks. If there's no Ruby installed on the server, the script can also run locally and access the database over a remote connection, as sketched below. And, when enabled with some option, it could even ask the user to interactively correct slug names etc.
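
As a taste of that direct approach: nothing more than a remote MySQL connection is needed (host, credentials and database name are examples):

    # List some post slugs (column post_name) as a starting point for corrections:
    mysql -h example.com -u wpuser -p wordpress \
      -e "SELECT ID, post_name FROM wp_posts WHERE post_type = 'post' LIMIT 5;"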

Further Information

WordPress offers different interfaces, and Ruby can access all of them. There are several scripts out there for these, but no "perfect one" yet. They are all in different stages of maturity and age, and none of them seems actively maintained. So, choose according to your personal needs and preferences:

WordPress WXR interface

I would propose you look through this list, ordered by my evaluation of the script's general quality, and choose the first that fits your needs:

  • translatour. A tool dedicated to making WordPress WXR files accessible in Ruby.
  • wp-import-dsl. A Ruby domain-specific language (DSL) to import WordPress WXR files to anything else. However, this seems not to be the best choice if you just want to do a few little changes to the WXR files, as you have to write the complete output rendering yourself. Last update 2011-06.
  • wordpress_import. A minimalist parser for WordPress WXR files, done in Ruby. Filters out junk tags and can be used as a library.
  • wordpress_to_word_to_ebook. x-ian's tiny script to convert a WXR file into a single HTML page. From there, you can follow the instructions on his blog to create an e-book from your WordPress posts, if desired. Uses nokogiri.
  • WordPress WXR to Postmarkdown. Snippets to create a script that can convert a WordPress WXR file into posts for the Postmarkdown gem, using Markdown markup. Quite elegant approach, using the Nokogiri and Upmark gems.
  • wxr_export.rb. A script to output WXR files, using the XML::Builder Ruby gem. It's a bit special because this WXR file is meant to only contain comments and be imported into the Disqus cloud-based comment system.
  • Typo Export to WordPress. Short Ruby on Rails script that can convert content from the Typo blogging system to WordPress.
  • wordpress2blogger.rb. Short Ruby on Rails script that can convert a WordPress WXR file to the XML file format expected for importing into the blogger.com platform. From 2008-04.
  • wordpress_importer. evan's script that can import a "homegrown blog" and convert its content into a WordPress WXR file. Last commit 2010-12.
  • refinerycms-wordpress-import. Script to convert a WordPress WXR file into content for the Refinery CMS. Contains a nice gem for that task, that could be converted to do a different conversion task as well.
  • hackWXR.rb. A tiny script to enclose some tag contents in WordPress WXR files into CDATA tags to prevent them from (slow) XML parsing when handling large WXR files with Ruby code. See also the author's related blog post.
  • splitWXR.rb. A Ruby tool to split a big WXR file into smaller pieces, to avoid errors when importing into WordPress. See also the author's related blog post. There's also a version including a graphical interface: SplitWXR.

WordPress XMLRPC interface

You can also talk to WordPress through its XMLRPC interface. Alternatives, the most recommendable first:

  • WordPressto. A Ruby gem to access the WordPress XMLRPC interface. (This is an actively developed fork of the original johnl/Wordpressto, which was last updated 2010-09.)
  • wp_rpc. Appears like another fork of WordPressto, last updated 2012-10 (as of 2013-03).

WordPress JSON interface

WordPress can provide a JSON interface by using the JSON-API plugin for WordPress. Then, you can interface with this from Ruby by using:

WordPress REST API

You can use these Ruby tools to interface with the wordpress.com REST API:

WordPress database access

Still another way is to interface directly with the MySQL database of WordPress. Alternatives:

  • WP-Ruby. Tools to map Ruby objects to the WordPress database structure. Last updated 2010-07.

This was one of my project proposals for the Interactivos'13 open-source projects workshop in Madrid. It didn't get selected in the end, but if you feel inspired by this or want to implement this … feel free to do so. This is open content licenced under CC BY 3.0 or at your option, any later version.

Project Summary

For many professions, there's a home for collaboration on the Net: programmers have SourceForge and Github (and many more). Electronics engineers have Open Design Engine (opendesignengine.net) and Upverter (upverter.com). Writers have tools like EtherPad Lite and Google Docs. But artists and designers? There's none I'm aware of.

Sure, there's deviantart.com and flickr.com. Huge platforms, but not collaborative at all: the only things you can do there are presenting your work and commenting on that of others. The "Fork Me on Art Hub" project wants to fill exactly that gap. It wants to place artists and designers into an open content "rhizome of graphical knowledge", where it feels like everybody is collaborating with everybody else, and doing so without needing any special invitation.

Here's how: A web-based platform for social collaboration in artwork and design, grown around version management for artwork files with git, the promotion of open content licenses (like CC-BY-SA), and "uninvited contributions" by fellow designers and those previously known as "art consumers". This kind of "uninvited contribution" is well-known in the software world, for example as "forking" and "pull requests" on Github.

(Note: For the version management part, this project is complemented by the "Git for GIMP" project proposal. That project makes the workflow much more pleasant for designers, but initially the platform can also work without it, relying on SparkleShare for example.)

Project Description

The portal's functionality is best explained by assuming a "social network" type portal, plus the following features (listed by importance):

  1. Free git project hosting. Artists and designers can register for free and host art projects for free as long as they assign an open content license to them. All art projects are automatically versioned with git, and the different versions are also accessible via a web interface (gitorious is a nice base software for that). And not just the artwork will be in the git repo, but also utility files and everything else needed to collaborate on an artwork, like scripts for generative art.

  2. "Fork Me" function to create derivatives. Like on Github, there will be a prominent "Fork" feature. Once you click this, it allows to initialize a new own git repo with the artwork in question, and to add own versions by building on original ones. Once you have something you want to contribute back to the original author, you can create a "pull request" for that version.

  3. One-click accepting of contributions. Ideally, it will be possible to include others' updates back into your own work by just clicking "accept" on an appropriate pull-request notification that pops up on the website. It would be possible to get previews of the changes before accepting them, of course.

  4. Embeddable widget with "Fork Me" function. This is one of the most innovative aspects here. For embedding an artwork into a website, whether the artist's own or any other, the Art Hub platform provides auto-generated "embed code". That HTML snippet not only shows the artwork, but also a "Fork Me" button that takes the reader to the Art Hub platform and shows some easy steps to create a derivative artwork. And then, all derivatives are shown in a slideshow that is also accessible from within the embedded HTML. Which means that creating derivative works results in immediate publicity in all publications showcasing the original work – and the "consumer" is no longer a consumer at all, but a co-producer. Especially for art-related publications, it will be a lot of fun for the artists and art-enthusiast readers to see the derivative works produced by their fellow readers.

  5. Social commit messages. To make the Art Hub system more enjoyable in spite of the quite technical version management, the git commit messages for each new version should be split into a technical and a social part. Giving thanks, making a funny comment etc. goes into the social part, and update notifications and pull requests on the web platform should show these social parts of the commit message as well, alongside the author's picture, similar to the update notification feed found in social networks like Facebook.

  6. Derivative graph. With artwork, it's not like with software: given a set of derivatives, people will hardly ever agree on a best version, while in software all improvements are regularly merged into the main version. So with artwork, there will be many forks that do not get merged back into the original, and these should be shown as a tree-like graph of derivative works (incl. preview images) on a project's page.
    This would even be the main feature of this invention: allowing not just one version of a graphic to exist, but a lot of interdependent versions. (They can be all incorporated within one git repo, as branches that branch into even more branches.) Those who search for a work to incorporate can then look through all the variants. And it would be the work of the main graphic project's authors to provide a systematic collection of the derivatives that are the most relevant, in her view.

  7. "Getting derivatives" as reputation. Collaboration is also about culture: on Github, you can estimate the popularity of a project by looking at the number of followers and forks. And similarly, people creating derivative works should be considered a good thing on the Art Hub platform and their number would be shown prominently, to encourage the culture of sharing.

  8. Embedding option with automatic attribution. When generating the HTML snippet with the embed code, the platform also automatically includes proper attribution for all base works, in accordance with the artworks' licenses. This automatic attribution removes a major practical hassle when dealing with open content photography and images: keeping track of sources and attributing correctly.

  9. Art Hub integration into FLOSS graphics apps. There would be plugins for major FLOSS apps (GIMP, Inkscape, MyPaint) to open and fork Art Hub art repositories directly from the Internet. (Note that these app plugins would manage a local git repo invisibly; no need to care about that.) When saving back to the repo (or a newly forked repo) with the graphics application, a "new fork / derivative / pull request" notice will appear on the Art Hub platform. This feature is for workflow improvement only, and not needed for a first working version.

  10. Federation. Some artists may want to fork the platform itself and create their own self-hosted artist community. As the platform software will be free and open source, this is clearly possible. However there should be an actively promoted "federation" feature that allows a global search on all platforms that have it enabled, plus cross-platform forking of artwork projects.

  11. CC licence registry. The platform can also take over the role of a "copyright licence registry", here for open content licenses, as another way to promote collaboration in the arts. It's a platform to record the fact that people licenced their work, to avoid potential legal hassle later.

  12. Automatic pingbacks for derivatives. Of course the platform informs the authors about derivatives created on the platform, but additionally it can search the web (with image similarity search etc.) for other derivatives and likewise create notifications for these.

  13. New collaboration option for large graphics. This software would allow new types of collaboration on large infographics etc., by creating placeholders at first, putting them together into the master graphic, then letting everybody work on fleshing out one placeholder each and feeding the changes automatically into the master graphic.
    Similarly, this kind of distributed, versioned graphics creation system should also work for multi-page DTP documents with lots of illustrations, like by integrating it with Scribus. So a lot of authors (including the general public) can work on creating a complex document, both the text and graphics.

This was one of my project proposals for the Interactivos'13 open-source projects workshop in Madrid. It didn't get selected in the end, but if you feel inspired by this or want to implement this … feel free to do so. This piece is open content licenced under CC BY 3.0 or at your option, any later version.

Project Summary

Version control software like git makes collaboration between programmers quite seamless: it can merge together their changes and lets them revert unwanted changes. Not so for artists and designers, where collaboration still can mean mailing files around with timestamps in filenames. That's slow and error prone, not the fun of simultaneous collaboration.

Projects like SparkleShare improve on that, bringing git to designers (and designers love it). But git was originally made for source code, not images, so it's always a manual editing effort to merge changes from two designers who made parallel changes to the same version of an artwork. Resolving all these conflicts manually is no fun either, and effectively blocks designers from experiencing git's true power: branching, for example. You could make some experimental changes to some artwork, exploring your own path or paths, while your collaborators proceed on the main version, fixing little flaws for example. Once you agree on which experimental changes to include in the main version, git should do the merging for you. For source code, git can do so automatically; for GIMP images (or maybe MyPaint, Inkscape or Scribus files instead), this project will extend git with that ability.
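
For source code, the branching workflow just described looks like this today (a minimal sketch; the branch name and commit message are examples):

    git checkout -b experiment        # explore your own path
    git commit -a -m "try a darker background"
    # collaborators meanwhile fix little flaws on the main branch;
    # once you agree on what to include, git does the merging for you:
    git checkout master
    git merge experiment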

An additional aspect of this project is that it complements the "Fork Me on Art Hub" project proposal, which is a git based art sharing platform with novel features that encourage collaboration between artists and those still considered "consumers of art". However, this project can also function without that special platform, as it can work with every git repository (like from GitHub, Gitorious, Bitbucket, or self-hosted).

Finally, here's the main technical innovation of this "Git for GIMP" project: "change instructions" for raster images. For now, when SparkleShare stores a new image version into a git repository, it does so as a binary file. Git can compare it to the earlier version and store only the binary diff to save space (see "git gc"), but it does not understand the file's inner structure, so it does not know how to merge parallel changes. After this project, git will instead store an image version as aggregated "change instructions" against a base image. Informal examples of change instructions would be:

  • move layer "person 1" by 14 px to right and by 30 px to top
  • change transparency of layer "flare" to 30%
  • change image data of layer "shadow" by combining it with the attached overlay layer (which has RGBA enabled)

The last type even allows git to merge changes to the actual image data of the same layer, namely if they don't conflict (don't affect the same pixels). Note that working with image files is no different with this extended git: when checking out a specific version, git will apply the relevant change instructions to the base version and provide the requested version in your file system.
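
Purely as an illustration, such aggregated change instructions might be serialized like this (an invented notation – no such format exists yet):

    move_layer    "person 1"   dx=14   dy=-30
    set_opacity   "flare"      value=0.30
    combine_layer "shadow"     overlay=shadow-patch.png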

Project Description

The "Project Summary" contains all the major points about this project already, so here are just some more details about the idea and possible implementation, in a quite random order, one detail per paragraph:

The current situation of images in git / svn. There are several options to add handling of binary data to git [examples]. It seems that changes only create small increments in repo size (at least when using "git gc" garbage collection). This would then be the same as in SVN, as discussed with a Pixelnovel Timeline developer. However, in all these cases git and svn do not understand the inner structure of the image files, so they cannot automatically merge non-conflicting changes.

The user's experience. From a user's perspective, the software should act mostly like SparkleShare (and will probably indeed be based on it!). So, a designer's work is synced to a central git repository and from there to teammates automatically whenever a change is saved. However, to enable advanced versioning like git branching, there will be a little git plugin for the chosen graphics application, probably GIMP, to enter the git commit message, choose or create a branch, revert to a prior version (ideally with thumbnail preview) and so on.

GIMP or Inkscape? So far, the proposal is to build a tool for putting OpenRaster images (from GIMP or MyPaint) into git repositories. This requires a completely new tool to extract the "change instructions" mentioned above, and to build new OpenRaster images by applying them. If that's too complex for a two-week workshop, a similar approach can be taken for Inkscape's SVG files. This has the advantage that SVG files are XML text already, so it will require little effort to teach git how to merge parallel changes (see the sketch below). The main effort would then be to develop a user-friendly git plugin for Inkscape that designers will love to use. (It should show incoming "pull request" notifications when others have made changes to an open file, and the designer would accept them with a single click.)
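
Git already has the hook needed for that: a custom merge driver can be registered per file type. A sketch, where the merge program itself (here called svg-merge) is hypothetical and would be written in this project:

    # Route SVG files through a custom merge driver:
    echo '*.svg merge=svgmerge' >> .gitattributes
    git config merge.svgmerge.name "XML-aware SVG merge"
    # %O = common ancestor, %A = our version, %B = their version:
    git config merge.svgmerge.driver "svg-merge %O %A %B"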

OpenRaster, not XCF. In case a pixel-based graphics application is chosen for this project (like GIMP, which is the current proposal), it is advisable to use the OpenRaster format for storing the images. So, not GIMP's native XCF format, which is not recommended as a data interchange format and mostly represents GIMP's internal data structures [source]. OpenRaster has been included in GIMP since version 2.7.1 or 2.8 [source]. The additional advantage of OpenRaster is that it benefits multiple applications (like MyPaint) and allows collaboration between them. A disadvantage is that it is still quite a new, not widely adopted file format – but nonetheless the proposed open standard format for raster images. Apart from OpenRaster and XCF, TIFF would be the only other format that could be used. However, the ways of saving layer metadata etc. in TIFF are normally proprietary, as TIFF is basically just a container format.

Deriving instructions from GIMP history? In GIMP's case, these "change instructions" might be derived from GIMP's history feature. But maybe a better alternative is to derive them by comparing two versions of a saved file directly – as done in the world of source code by "diff"; see the sketch below.
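
The file-comparison variant is plausible because OpenRaster files are just ZIP archives, with the layer structure in a stack.xml file and the pixel data as PNGs. A crude sketch of such a comparison (file names are examples):

    # Unpack two saved versions and compare the layer structure:
    unzip -q -o v1.ora -d v1 && unzip -q -o v2.ora -d v2
    diff -u v1/stack.xml v2/stack.xml    # layer names, offsets, opacity, ...
    # changed pixel data shows up as differing PNG files under data/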

Inspirations from Pixelnovel Timeline and ComparePSD. The closest existing product for version control of images is Pixelnovel Timeline, and it offers a lot of insights for a great workflow and user interface when developing version control software for designers – see http://pixelnovel.com/timeline . It is based on the SVN version control system; however, it can only do linear versioning and rollback, and needs manual merging for changes derived in parallel from the same version. Also interesting for UI design in this project is the Pixelnovel ComparePSD tool for comparing PSD files layer by layer.

Inspirations from Kaleidoscope App. There is an app for visually comparing differing versions of an image, to spot differences optically: Kaleidoscope.