To convert the backlog of blog posts I had written over the past years into the Deepest Sender format, I wrote a small Python script that converts every event in an iCalendar (.ics) file into an XML file for Deepest Sender. By the way, this is my first real-world Python script, and I am astonished by the ease, clarity and brevity of Python. Note that you need to save the script posted here with UTF-8 encoding. Any line mangling is only a display problem caused by screen width and the blog template: just copy and paste the source into a text editor and you'll be fine! Have fun!
#! /usr/bin/env python
# -*- coding: utf_8 -*-
#
# converts an iCal file with blog entries (as appointments) to Deepest Sender XML
#
# Arguments (in order):
#   file              the iCal file to convert
#   output directory  directory that receives the output files for Deepest Sender, one per blog post
#
# The appointments in the iCal input file are converted one by one to blog post XML files as understood by the XUL desktop blogging
# plugin "Deepest Sender" (http://deepestsender.mozdev.org). An inferior alternative to this script's approach is to convert an HTML
# table as produced by korganizer's HTML table export format for appointments.
#
# iCal file prerequisites:
#   all VEVENT components have the SUMMARY property (else output file name lacks a title)
#   no two VEVENTs on one day have the same SUMMARY property (else output files are overwritten)
#
# Deepest Sender file structure (note that it is UTF-8 encoded):
#   <?xml version="1.0" encoding="utf-8"?>
#   <entry>
#     <subject><![CDATA[blog entry title]]></subject>
#     <event><![CDATA[blog entry content with HTML markup]]></event>
#   </entry>
#
# TODO: the filename must only contain a date, not a time, even if the DTSTART property contains one
# TODO: write the values of the DTSTART, CREATED and LAST-MODIFIED properties into the blog post text (via component.decoded())
import sys # argv, …
from xml.dom.minidom import parse, parseString
from codecs import open # shadows the built-in open() to enable UTF-8 file access
from icalendar import Calendar, Event
# get it from http://pypi.python.org/pypi/icalendar/1.2 ; if you don’t want to clutter your distro by installing it system-wide,
# copy the directory iCalendar-1.2/src/icalendar/ to the script’s directory
def filenamestr(thestring):
    thestring = thestring.replace(' ', '_')
    thestring = thestring.replace(u'»', '')
    thestring = thestring.replace(u'«', '')
    thestring = thestring.replace('/', 'bzw.') # slash in a filename is really bad …
    # remove trailing dots, as a dot and the filename extension will be appended
    return thestring.rstrip('.')
calfilename = sys.argv[1]
cal = Calendar.from_string(open(calfilename, 'rb').read())
outputdir = sys.argv[2].rstrip('/') # remove trailing slashes if present
entrycount = 0
for event in cal.walk('VEVENT'):
    # decompose blog entry; event.decoded() is Unicode already
    date = event.decoded('DTSTART')
    title = event.decoded('SUMMARY')
    content = event.decoded('DESCRIPTION', '')
    # the simplest means to convert text to HTML, just as Deepest Sender does when writing in WYSIWYG mode;
    # we eliminate \n here as blogger.com would create an additional <br /> from it
    content = content.replace('\n', '<br />')
    print '[processing:', date, title, ']'
    # print event.property_items() # debug utility
    # calculate output file's name
    filename = str(date) + '.PRIVATE.' + filenamestr(title) + '.xml'
    # write blog entry to its output file
    dsfile = open(outputdir + '/' + filename, 'w', 'utf_8') # will only accept Unicode strings!
    dsfile.write(
        u'<?xml version="1.0" encoding="utf-8"?>\n' +
        u'<entry>\n' +
        u' <subject><![CDATA[' + title + u']]></subject>\n' +
        u' <event><![CDATA[' + content + u'<br /><br />Datum: ' + str(date) + u']]></event>\n' +
        u'</entry>\n'
    )
    dsfile.close()
    entrycount += 1
print '----------\nconversion successful (' + str(entrycount) + ' entries processed)'
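The per-entry steps of the script (build a safe file name, wrap title and content in the Deepest Sender XML) can also be tried out without an iCal file or the icalendar module. The following is only a rough sketch under a few assumptions: it is written for Python 3 rather than the Python 2 of the script above, the helper names `filename_part`, `date_only`, `entry_filename` and `entry_xml` are made up for this illustration, and `date_only` is one possible way to address the first TODO (date-only file names), not something the script itself does.

```python
from datetime import date, datetime


def filename_part(title):
    """Make a title usable inside a file name (same rules as filenamestr)."""
    for old, new in ((' ', '_'), ('»', ''), ('«', ''), ('/', 'bzw.')):
        title = title.replace(old, new)
    # strip trailing dots, because '.xml' is appended by the caller
    return title.rstrip('.')


def date_only(start):
    """Reduce a DTSTART value to its calendar date (assumed fix for the first TODO)."""
    # datetime is a subclass of date, so it has to be checked explicitly
    return start.date() if isinstance(start, datetime) else start


def entry_filename(start, title):
    """Build the output file name for one blog entry."""
    return str(date_only(start)) + '.PRIVATE.' + filename_part(title) + '.xml'


def entry_xml(start, title, content):
    """Build the Deepest Sender XML text for one blog entry."""
    body = content.replace('\n', '<br />')  # plain text to minimal HTML
    return (
        '<?xml version="1.0" encoding="utf-8"?>\n'
        '<entry>\n'
        ' <subject><![CDATA[' + title + ']]></subject>\n'
        ' <event><![CDATA[' + body + '<br /><br />Datum: ' + str(start) + ']]></event>\n'
        '</entry>\n'
    )
```

For example, `entry_filename(datetime(2008, 5, 3, 14, 30), 'My post.')` gives `2008-05-03.PRIVATE.My_post.xml`: the time of day is dropped and the trailing dot of the title is removed before the extension is appended.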