The state of syncing in open source

Two recent blog posts, one by Adam and one by Matěj, point out that data synchronization using open source tools still doesn’t work as well as it should.

Nitpicking

First, let me point out some factual mistakes in Adam’s post. *“Maemo’s whole synchronization story”* has never been based on SyncEvolution. SyncEvolution is an add-on, supported on the device entirely on a volunteer basis by Ove Kaveh. If people do not get help for SyncEvolution in the Maemo forums, I can’t do anything about it, because I don’t have the time to keep up with everything that is said (or asked) on the web about SyncEvolution. It’s up to the Maemo community to help their peers. Syncing from SyncEvolution on Linux to the N900 is [broken somewhere outside of SyncEvolution](https://bugs.meego.com/show_bug.cgi?id=4835) (device, Bluetooth stack). Without support from Nokia for their closed-source sync component on the device, I don’t see much chance of fixing it, short of some Bluetooth experts getting involved.

The statements about MeeGo are also incorrect. Evolution Data Server was not the preferred PIM storage for MeeGo 1.2 until recently, so depending on it for CalDAV/CardDAV support was not an option. Adam points to [a bug report](https://bugs.meego.com/show_bug.cgi?id=319) where I captured some thoughts on the technical aspects. Perhaps this was too brief to be understood without context, but I still think that the arguments and conclusion are valid. More on that below.

What exactly is the complaint?

Adam tried SyncEvolution with Horde and eGroupware. Other people were able to get these combinations working, for example just a few days ago [George Runelli with eGroupware](http://www.ruinelli.ch/how-to-sync-egroupware-with-a-tablet-n900-with-syncevolution). It seems to depend a lot on the exact setup on the server side. I had offered to help Adam diagnose his problem with Horde (unexpected slow syncs), but he never replied to my email. I’m still convinced that the problem is not in SyncEvolution but on the server side, because SyncEvolution works fine with a variety of other SyncML servers (Funambol, Memotoo, Synthesis, Mobical, to name just those that I test with nightly). It is not SyncEvolution’s fault that the open source groupware servers seem to have less stable SyncML support.

I tried to work with Horde and eGroupware developers a while back, when I started with SyncEvolution. I had a hard time getting anyone to reply to my questions and emails, even when contacting the original developers directly. If the situation is different today, then I’d be happy to restart that effort.

I’m not sure what kind of problem Matěj had with SyncEvolution. He doesn’t say in his blog post, only that it does not allow him to reliably sync with his server running Zarafa. I’m not surprised. To the best of my knowledge, the two are unable to synchronize with each other by design, because SyncEvolution is based on SyncML and Zarafa on ActiveSync. So is the complaint that SyncEvolution uses an open protocol and not a patent-encumbered proprietary protocol?

Proposed solutions

Adam then goes on to suggest that the data synchronization model itself is flawed and should be replaced with a client/server model where changes are always stored on the server immediately, as in Evolution’s CalDAV and CardDAV backends. This became clearer in an email discussion after he contacted me regarding his blog post. The key difference is this:

  • True synchronization allows offline modification of the data.

  • Capable devices by design store a complete copy of the data, without depending on one particular server to remain online.

Adam argued that a client/server model can be combined with caching of items and changes. But then the client/server model **becomes** synchronization and must deal with the same kind of problems that it was meant to avoid, like conflicts between items on client and server. Adam [later said](http://luther.ceplovi.cz/blog/2011/04/synchronization-sucks/#comment-37) that changes that can no longer be stored should simply be discarded. That doesn’t sound like a very useful approach, because users won’t be able to remember which changes might have been lost, and if they do, they would most likely be forced to redo them manually. PIM data is more complex than plain text, so the merge strategies that programmers know how to use with source code and revision control systems do not apply. Normal users will be even more challenged.

The point about not depending on a central server is important, too. In the SyncML world we have recently seen ScheduleWorld shut down. No data was lost, because by design all users always had a full copy of their data on their own storage. The same can’t be said for all the popular Web 2.0 cloud services…

My devices are not always online, for practical and economic reasons. Therefore I want the ability to make changes while offline and will continue to work on the more capable model.
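To illustrate why a client/server model with cached changes ends up facing the same conflicts as synchronization, here is a minimal sketch. It is not SyncEvolution code and not any real server’s API: the `Item` and `PendingChange` types, the integer revision counter, and the `replay` function are all hypothetical names made up for this example.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Item:
    revision: int  # hypothetical per-item counter maintained by the server
    data: str


@dataclass
class PendingChange:
    item_id: str
    base_revision: int  # server revision the client saw before going offline
    new_data: str


def replay(server: Dict[str, Item], change: PendingChange) -> str:
    """Try to store a change that was made offline and report the outcome."""
    current = server.get(change.item_id)
    if current is None:
        # Someone else deleted the item while the client was offline.
        return "conflict: item deleted on server"
    if current.revision != change.base_revision:
        # Someone else modified the item while the client was offline.
        return "conflict: item modified on server"
    server[change.item_id] = Item(current.revision + 1, change.new_data)
    return "applied"
```

Whatever happens in the two conflict branches (discard, merge, ask the user) is exactly the decision that a synchronization engine has to make, so caching does not make the problem go away.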

SyncML

One other aspect is the question of whether the data synchronization model itself is flawed, or just some protocols implementing it, or only specific implementations of these protocols. I think the approach itself is sound and useful. Adam’s own observation that other implementations of the concept seem to work better confirms that. But SyncML definitely has its flaws, both in the protocol itself and in implementations. SyncML tries to be too flexible for its own good. It allows the implementation of very dumb devices. The downside is increased complexity on the server side. Because of its open nature, there have been a variety of implementations with varying degrees of capability, both in the data that is supported and in the quality of the protocol implementation itself. That makes it challenging today to support the whole range of SyncML-capable peers. In that sense, SyncML is a victim of its own success.

The silver lining

I have some hopes today for CalDAV/CardDAV-based synchronization. SyncEvolution 1.2 will have support for that, natively. A native implementation has the conceptual advantage that it can use metadata (resource URI + ETag) to speed up change detection, something that wouldn’t be possible when going through Evolution Data Server. CalDAV requires that each item have a globally unique ID, which considerably simplifies implementing synchronization. CardDAV unfortunately still doesn’t. Good open source implementations exist, for example Apple’s Calendar server. It passes all of the automated SyncEvolution tests. I also hear good things about DAViCal; unfortunately I haven’t found the time to test with it yet. What might be missing in both cases is good integration into a groupware solution, for those who need that.
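The change detection based on resource URI + ETag mentioned above can be sketched roughly as follows. This is only an illustration of the idea, not the actual SyncEvolution implementation: `detect_changes` is a hypothetical helper, and both dictionaries are assumed to map resource URIs to the ETag strings reported by the server, one remembered from the previous sync and one from a fresh listing.

```python
from typing import Dict, List, Tuple


def detect_changes(
    known: Dict[str, str], listing: Dict[str, str]
) -> Tuple[List[str], List[str], List[str]]:
    """Compare URI -> ETag maps and return (added, updated, removed) URIs."""
    # New on the server: present in the fresh listing, unknown to the client.
    added = sorted(uri for uri in listing if uri not in known)
    # Changed on the server: same URI, but the ETag differs.
    updated = sorted(
        uri for uri in listing if uri in known and listing[uri] != known[uri]
    )
    # Gone from the server: remembered by the client, missing in the listing.
    removed = sorted(uri for uri in known if uri not in listing)
    return added, updated, removed
```

Only the items in `added` and `updated` need to be downloaded; everything else can be skipped without looking at the item data at all, which is where the speed-up comes from.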