Scientific Writing and Knowledge Management with Docear

So far this is just a proposal: I have not fully tested it yet, but I explored the options for two days and this seems best for my requirements. (Thanks to mantas in the comments, whose tip about Docear led to this major rework of an earlier article, which used FreeMind instead.)

Requirements

  • literature database, with both metadata and fulltexts
  • adding to the literature database collaboratively, synced to everybody in a group
  • creating notes right when reading
  • creating notes on documents collaboratively, synced to everybody in a group
  • flexible means for organizing note collections, with links to the part of the annotated document
  • visually navigable documents when writing (no, not TeX or HTML source)
  • ability to draft scientific articles and theses in a mindmap-style, "notes organizing" mode
  • ability to write scientific articles in WordPress
  • ability to write scientific articles in LibreOffice, with easy web publishing both as HTML in a WordPress site, and as PDF
  • automatic creation of a references list when writing, both in WordPress and LibreOffice
  • both web-published (HTML and PDF) and local versions shall have working hyperlinks to cited works, including links to the page
  • tools should work cross platform (Linux, Windows, Mac)
  • free and open source software as much as possible

Solution Proposal: Docear + Zotero

Tool collection

My proposal uses the following tools, which you should install (available for both Windows and Linux):

Workflow

[TODO: Show how to use Dropbox or an open source replacement for it to sync your (cited, and only those) fulltext files with annotations to your server. And how to password-protect those fulltext files that cannot be published to everybody, using an efficient command that adds to a .htaccess file.]
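
For the password-protection part of this TODO, one possible direction is sketched below in Python. It is only a sketch under assumptions, not a tested setup: it assumes an Apache 2.4 server that honors .htaccess files, a password file that was already created with htpasswd, and the file naming used in this article. The function names, the realm text and the paths are made up for illustration.

```python
from pathlib import Path


def htaccess_block(filename: str, htpasswd_path: str) -> str:
    """Return an Apache <Files> section that protects one fulltext file."""
    return (
        f'<Files "{filename}">\n'
        "    AuthType Basic\n"
        '    AuthName "Restricted literature"\n'  # hypothetical realm text
        f"    AuthUserFile {htpasswd_path}\n"
        "    Require valid-user\n"
        "</Files>\n"
    )


def protect(htaccess: Path, filename: str, htpasswd_path: str) -> bool:
    """Append a block for filename unless one is already present.

    Returns True if the .htaccess file was changed.
    """
    existing = htaccess.read_text(encoding="utf-8") if htaccess.exists() else ""
    if f'<Files "{filename}">' in existing:
        return False
    with htaccess.open("a", encoding="utf-8") as handle:
        # Keep blocks on their own lines if the file lacks a final newline.
        if existing and not existing.endswith("\n"):
            handle.write("\n")
        handle.write(htaccess_block(filename, htpasswd_path))
    return True
```

Calling protect(Path(".htaccess"), "Gantt2001.matthias.pdf", "/path/to/.htpasswd") in a local copy of the upload directory would add one block per restricted file; running it twice for the same file leaves the .htaccess file unchanged.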

  1. Document collecting. Whenever you find an interesting document on the web, use the Zotero browser plugin to create a literature entry in your Zotero group. Use tags to mark documents you have read and those you want to read. Create attachments to the entries, linking to the online fulltext of these articles.
  2. Finding what to read.
    1. Use your Zotero browser plugin or standalone application to browse through your group's shared bibliography and select items you want to read.
    2. Download them with Zotero to your "Literature" directory and name them according to their key in Zotero, plus your author shortname (example: Gantt2001.matthias.pdf). This is because this file will contain your annotations, and files with other author suffixes will contain their annotations.
    3. Add a (hypothetical) web link to your annotated file as an attachment to the respective literature entry in your Zotero group. This will enable other researchers in your group to see your notes once you have read the file and uploaded it to this place.
  3. Reading and annotating. All files you want to read will now appear in your Incoming.mm in Docear, because you added them to your Literature directory. Sort them into your reading list in Literature & Annotations.mm. Click the links there to start reading, with PDF-XChange as your PDF reader. Create annotations, bookmarks and text highlighting as desired in PDF-XChange, and save. To simplify your later task of finishing your own articles, you should include the source, in brackets, in every bookmark, annotation and highlighted text; for example "[Gantt2001, pp.21]".
  4. Mobile reading and annotating (optional). [TODO]
  5. Note organizing and drafting with Docear. 
    1. Click "Refresh" in your Docear's Incoming.mm to get your PDF bookmarks, notes and annotations into Docear.
    2. If you intend to write a thesis, use a dedicated mindmap for its draft; otherwise, drafting multiple articles in one mindmap seems most comfortable.
    3. I like to create an "inbox" node in the mindmap and collect all my ideas in it whenever I don't have the time to find a more appropriate place for them.
    4. Create an initial table of contents as a hierarchical node structure in your article / thesis draft mindmap.
    5. Add, copy, move nodes and rework your table of contents hierarchy until the textual part of your thesis is done. For me that means, until the written thoughts form collections that I consider drafts for the article(s) I want to write. Finishing the wording and getting the sentences connected properly can be done more comfortably in LibreOffice.
    6. You can click the links of bookmark, annotation and highlighted text notes to look them up in their original document, which is in my view the greatest feature in Docear.
  6. Writing and publishing with WordPress.
    1. Sync to Zotero. That is, sync your Zotero literature database to the Zotero server; necessary because it will be accessed by your ZotPress and Enhanced-BibliPlug plugins.
    2. Copy and paste your article draft. You can just copy your complete list of draft notes into a WordPress post and start editing it from there.
    3. Finish your article. 
    4. Replace text refs with real literature references. You will of course lose the link information then, but as you added the literature refs before to the bookmarks, notes etc. (like "[Gantt2001, pp.21]"), this is not much of a problem. Just double-click to select the literature key of such a reference, copy it, then insert a corresponding literature reference with ZotPress by pasting the key into its dialog. [TODO: How can this be more automated? Selecting the text and using a key combination would be enough. Maybe possible by configuring the rich text editor for inserting a special tag that way, if ZotPress recognizes literature references in text based on tags.]
    5. Insert links to fulltext files (optional). You can make the "p.xy" part of literature references a regular hyperlink to a PDF file, with a subpart marker that makes it go to the page or bookmark corresponding to the citation. And for faster personal use that also shows your annotations etc., you can add an analogous link after it, leading to the corresponding offline file on your own computer using a file://~/Data/Literature/key.authorname.pdf scheme. I would recommend making the public link resolve directly to the original article on the web, if possible via a DOI resolver, and not to your annotated version, proving that you cite unchanged material. You might add a link to your public, annotated version afterwards. I like to use Unicode icon characters for these last two links for compactness. My proposals:
      • ⊞ ⛁: "Squared Plus" (U+229E) for "annotated document" and "White Draughts King" (U+26C1) for "local hyperlink", because it looks like the missing hard disk symbol.
      • ≣ ⌂: "Strictly Equivalent To" (U+2263) for "annotated document" and "House" (U+2302) for "local hyperlink".
    6. Upload fulltext files (optional). Upload them into a directory of your own server; using Zotero server space is also possible, but can be quite expensive. Needed when you inserted links to individual pages etc. in them, see above. Also needed when you inserted links to them as files in your reference list, which can be needed when the work is not publicly accessible. You will have to password protect access to these links, but at least you can thus share these files with the researchers in your group.
  7. Writing and publishing with LibreOffice. The advantages over WordPress are the more advanced formatting options in LibreOffice and the option to create a high-quality PDF version without manual effort. Both are needed when writing for journals; otherwise, writing directly in WordPress seems like the simpler variant for writing about your research and publishing it on the web. The exception, of course, is when you have no Internet access; then LibreOffice is best (even if you just use it in HTML mode, basically as an HTML editor for later copying into WordPress).
    1. Copy and paste your article draft. Just as when writing for WordPress. If it's a long and structured draft in Docear, like for a thesis, you had better export it as follows, to get your table of contents structure over into LibreOffice as well:
      1. Unfold the mindmap so that all notes belonging to TOC headings are visible, but not more.
      2. Then export to .odt format in Docear. The visible nodes will become the headings there, and will thus be included in the TOC when you add a "Table of Contents" index to your LibreOffice document.
      3. Open the .odt document in LibreOffice and transform the formatting from hard-coded to style-based. [TODO How can this be done efficiently?]
    2. Replace text refs with real literature references. Just as when writing for WordPress. [TODO: There has to be a feature to convert selected text into a literature reference.]
    3. Insert links to fulltext files (optional). Just as when writing for WordPress. But to link to local fulltext files, you can't use the "file://" URL scheme in LibreOffice, because such links are automatically converted to "smb://". Instead, use relative paths to the files, without any directory prefix. That works because you will view your article locally in PDF format after it's finished, and it then resides in your own "Literature" files directory, as your own publication alongside all those of others.
    4. Add a link to the PDF version. At the beginning of the document, create a link to the PDF version of it. As this PDF will be part of your regular literature database, just create an entry in Zotero for it and link to it with the same scheme you use for linking to online fulltext versions of cited works.
    5. Remove links to local fulltext files in a copy (optional). Links to local fulltext files are only relevant for yourself, so you might want to hide them in the version that is published on the web. Removing these links is simple, even though LibreOffice supports no links in conditional text: if you used symbols as link text, just search and replace them with nothing.
    6. Export to PDF. Be sure to check "Export links relative to the filesystem." in the PDF export options in LibreOffice. Place the PDF into your "Literature" directory.
    7. Upload fulltext files (optional). Just as when writing for WordPress. Includes uploading the PDF version of your own article.
    8. Publish in WordPress. Should be as simple as exporting from LibreOffice to HTML, copying the HTML into WordPress and (optionally) replacing the textual literature references with ZotPress references. This however means manual work whenever you change the article, so you had probably better leave that out.
  8. Viewing your web-published content. It is not ideal to use the PDF viewer integrated into Firefox for viewing annotated PDFs: while it correctly understands subpart links (like #2 for "page two") and shows annotations and bookmarks, it does not show highlighted text. So it is better to open links to annotated PDFs in Adobe Reader or PDF-XChange, unless you know you're not interested in the highlighting this time.
  9. Cleaning the literature database. From time to time you might think that some document you once included in the literature database is not worth storing any more. But if you cited it, you should keep it. So there should be a command to determine whether you cited a specific work, identified by its BibTeX key (which is the same as its file basename). If it is not cited, delete the file and all references to it in Docear mindmaps.
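
The check described in step 9 could be sketched roughly as follows in Python. This is an illustration under assumptions, not an existing Docear or Zotero command: it assumes that citations are written in the text form used above (such as "[Gantt2001, pp.21]"), that drafts are stored as .mm mindmap files (XML, with the node text in a TEXT attribute), and that fulltext files are named key.authorname.pdf. All function names are hypothetical.

```python
import re
import xml.etree.ElementTree as ET
from pathlib import Path

# Matches text references such as "[Gantt2001, pp.21]" or "[Gantt2001, p. 21]".
CITATION = re.compile(r"\[([A-Za-z][\w-]*),\s*pp?\.\s*\d+[^\]]*\]")


def cited_keys(text: str) -> set[str]:
    """Return all literature keys cited in a piece of text."""
    return {match.group(1) for match in CITATION.finditer(text)}


def cited_keys_in_mindmap(path: Path) -> set[str]:
    """Collect cited keys from the TEXT attribute of every node in a .mm file."""
    keys: set[str] = set()
    for node in ET.parse(path).iter("node"):
        keys |= cited_keys(node.get("TEXT", ""))
    return keys


def key_of(pdf: Path) -> str:
    """Literature key of a file named key.authorname.pdf (or key.pdf)."""
    return pdf.name.split(".")[0]


def uncited_files(literature_dir: Path, cited: set[str]) -> list[Path]:
    """List PDFs whose key is not cited; these are candidates for deletion."""
    return sorted(p for p in literature_dir.glob("*.pdf") if key_of(p) not in cited)
```

The sketch only lists candidates instead of deleting anything, so that the result can be reviewed first. Published articles would have to be scanned with cited_keys() as well before a file is really considered uncited.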

Reasoning

My last thesis was written in OpenOffice.org (now LibreOffice), and the one feature I really missed there was a comfortable way of moving notes, snippets and other parts of text around. Because getting a thesis done is, for a good part, just that: ordering and integrating notes and snippets that you wrote whenever you had an idea or insight.

Since that thesis, I have become a FreeMind enthusiast, using it for nearly all my personal information management. It's truly great at this "ordering notes" task because three aspects come together: it's speedy enough in displaying and dragging even large mindmaps; it provides good visual cues for quick navigation in a huge hierarchical content structure, making it mentally less strenuous to work with (unlike scrolling text in OOo or using OOo outline mode); and it provides great keyboard shortcuts for quick navigation and reordering of nodes in the hierarchy. So I looked for a mindmap-based tool for scientific writing, and the above is what I found.

My reasons for individual aspects of this solution:

  • Why Zotero. This is my current recommendation for a reference management tool for LibreOffice; see my evaluation of reference managers for LibreOffice. Of course you could try to use Docear with its integrated JabRef reference manager instead; see the discussion of the alternative below for the reasons why Zotero is used here. In any case, you will want a separate reference manager rather than treating references as just another kind of node that you organize with Docear, because a mindmapping tool is rather poor at handling structured data; it is not an all-understanding knowledge management system by itself. Also, collaborating on a common bibliography would be difficult with such an approach.
  • Knowledge, not just cited works. This is a knowledge management system; that's why the literature database contains everything you think is worth storing, and not just works you cite or have annotated or have even read.
  • Docear for note management, not reference management. Note that we don't use Docear for literature reference management at all; we don't export anything from Zotero to Docear (although that's possible). That's because so far there is no way to insert text with inline references from Docear nodes into LibreOffice so that it contains LibreOffice literature references, and also no way to copy or insert a complete node with a link to a work of literature from Docear into LibreOffice so that it appears there as a literature reference. The only thing we could do is use JabRef and the JabRef plugins for LibreOffice to manually select literature entries from the Docear bibliography and create LibreOffice literature references from them. But that can be done equally well with Zotero and its LibreOffice plugins; in addition, Zotero has a more efficient way of collecting document metadata when browsing the web, which is why we avoid JabRef. So there is no point in having the literature database available in Docear at all: Docear is used for notes and drafting, while references are only needed when finishing an article. Of course it still has to be easy to find out which work of literature a note or quotation in an article drafted in Docear belongs to; but that is possible in our system, as the PDF file link contains the literature key in its filename, a scheme that has been tried and tested in practice.
  • Why not Semantik? Looking for mindmapping software for scientific writing, I found Semantik, a tool developed with just this purpose in mind. However, I found its current (as of 2011-08) version more limited than FreeMind in several respects, while I found Docear to be more powerful than FreeMind, especially because of its PDF annotation management.
  • Why no collaborative annotating? In a group of researchers, it would have some benefits if everybody could annotate the same copy of a work of literature and at the same time see the annotations of the others in it. In our solution, there is instead one file per author, containing his or her annotations on the work of literature. When you download others' annotated versions into your Literature directory, Docear will import these annotations as well, so that the set of all annotations from all authors is then available in Docear instead of in a single PDF. This is a much simpler system, as it avoids concurrent edits and sync conflicts when editing one file in parallel. No functionality and not much convenience is given up.
  • Why Chromium? Because some WordPress rich text editors, or at least some functions of them, are horribly slow on some systems in Firefox because of some JavaScript functions being slow there. If it does not affect you, you may as well use Firefox.
  • Why not CiteULike? Zotero was preferred over CiteULike because it is completely free software (though this is not a big issue, as CiteULike offers your literature data for download in various formats).
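
To illustrate the one-file-per-author scheme for annotations, here is a small Python sketch that groups the files in a Literature directory by literature key and lists whose annotated copies are available for each work. It assumes the key.authorname.pdf naming from the workflow above; the function name is hypothetical and nothing like it is built into Docear.

```python
from collections import defaultdict
from pathlib import Path


def annotators_by_key(literature_dir: Path) -> dict[str, list[str]]:
    """Map each literature key to the authors who have an annotated copy."""
    result: defaultdict[str, list[str]] = defaultdict(list)
    for pdf in sorted(literature_dir.glob("*.pdf")):
        parts = pdf.name.split(".")
        # Expect exactly key.authorname.pdf; skip files that do not follow the scheme.
        if len(parts) != 3:
            continue
        key, author, _extension = parts
        result[key].append(author)
    return dict(result)
```

For a directory containing Gantt2001.matthias.pdf, the result would contain the key "Gantt2001" with "matthias" in its author list; copies downloaded from other group members show up under the same key.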

Alternative Solution Proposal: Docear + JabRef + BibSonomy

There is an alternative solution without Zotero, which saves one separate piece of software and integrates functionality more tightly. However, it lacks the sophisticated metadata grabbing features of the Zotero browser plugin; also, the Zotero "scientific social bookmarking" online groups are much more numerous, and the service is better known than the newcomer BibSonomy (though BibSonomy offers interesting features like a WordPress plugin). Still, some might prefer this solution. In this solution, you would use these tools:

  • BibSonomy website. For reference collecting in your team.
  • JabRef. The reference manager integrated into Docear.
  • JabRef standalone. In addition to the Docear integration, use it as a standalone reference manager for some added options.
  • JabRef plugins for LibreOffice. To insert JabRef references into articles you write.
  • BibSonomy plugin for JabRef. To collaborate with your group via the BibSonomy "scientific social bookmarking" website as an alternative to Zotero online groups and CiteULike.
  • AcademicPress WordPress plugin. Together with an online copy of JabRef's BibTeX file, if you want to write directly in WordPress.
  • BibSonomy CSL plugin for WordPress. To show your complete group bibliography in WordPress.

Comments

9 responses to “Scientific Writing and Knowledge Management with Docear”

  1. tigrmesh

    Freemind .mm file format is actually a type of xml. Try copying your thesis.mm file to thesis.xml and opening it in Firefox or an xml-aware editor, and you'll see what I mean. The nodes all start with <node, and somewhere on that line will be TEXT="Whatever Your Heading Is". The notes all start with <richcontent TYPE="NOTE">.

    This may be more work than it's worth. But it may not… . But if you take a look at the xml, you’ll figure it out.

  2. mantas

    There is an integrated mind map/reference management solution http://www.docear.org/ It can additionally be synchronized with mendeley, probably zotero too. It is based on freemind clone freeplane and can open freemind’s *.mm files.

  3. mantas – many thanks for your hint to Docear. Looks very promising! I still have no idea how I managed to miss this …

  4. AMS

    matthias – Thank you for writing this up! I’m just starting out with Docear (and literature gathering in general), and this post really helped in setting up a framework for workflow as well as introducing me to the awesome reference capture that is Zotero.

    While browsing through the Zotero documentation I came across RTF scan. It sounds like it would automate reference replacement you mentioned?
    http://www.zotero.org/support/rtf_scan

  5. AMS – glad the article was of use to you. I’m still not completely happy with my setup, as you can see from the multiple to-dos, but it’s good to get positive feedback like yours. About the RTF scan, yes that’s exactly the functionality that I look for, but in my case it would have to be within WordPress, not for RTF files.

  6. M@rtin

    thanks for this very interesting description of your setup!
    I’m working quite some time with Freeplane and Zotero and now I want to see how I can make them work together (Docear).

    Kind regards

    M@rtin

  7. Adam

    Hi Matt,

    Just wondering how this has worked for you?

    Adam

  8. Have you considered replacing Zotero with Juris-M in this Workflow?

