title: Collaboration, Operational Transformation, Wave
topics: work, technical, wave
posted: 15:37 -0500

At work, we are building a real-time collaborative editor.  If you've ever used
Google Docs with multiple people working on the same document at the same time,
that's the sort of thing we're trying to do.  I don't think I'm being too bold
in saying that real-time communication and collaboration will soon go from
"killer feature" to "feature that people assume you have and are frustrated
to find missing".  Like WYSIWYG for word processors.

The major issue with collaborative editing is synchronization: making sure that
everyone sees the same thing.  In thinking about synchronization, it is
important to not just consider whether everyone's copy of the document is the
same, but also that the document makes sense.  For example, a text-based
protocol is not suitable for XML-like data, and XML is a bad way of storing
text formatting.  Consider two users editing: "##The quick brown fox jumped over
the lazy dog##".  One user makes "##quick brown##" bold, and another user makes
"##brown fox##" italic.  Using a naive XML method, you would get "##The <b>quick
<i>brown</b> fox</i> ...##", which is not well-formed XML because the ##<b>## and
##<i>## elements overlap.
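
To make the overlap concrete, here is a minimal Python sketch of the naive approach.  The helper names (##tag_insertions##, ##naive_merge##, ##is_well_formed##) are made up for this illustration and are not part of any real protocol; the assumption is that each user's edit is recorded as tag insertions at character offsets in the original text, and that the server simply applies all of them.

```python
import xml.etree.ElementTree as ET

TEXT = "The quick brown fox jumped over the lazy dog"

def tag_insertions(text, phrase, tag):
    """Return (offset, markup) pairs that wrap `phrase` in <tag>...</tag>.

    Offsets refer to the ORIGINAL text, as each user saw it.
    """
    start = text.index(phrase)
    end = start + len(phrase)
    return [(start, f"<{tag}>"), (end, f"</{tag}>")]

def naive_merge(text, insertions):
    """Apply every insertion blindly, with no knowledge of XML nesting."""
    out = text
    # Work from the rightmost offset so earlier offsets stay valid.
    for pos, markup in sorted(insertions, key=lambda p: p[0], reverse=True):
        out = out[:pos] + markup + out[pos:]
    return out

def is_well_formed(fragment):
    """Check whether the fragment parses as XML inside a wrapper element."""
    try:
        ET.fromstring(f"<p>{fragment}</p>")
        return True
    except ET.ParseError:
        return False

bold = tag_insertions(TEXT, "quick brown", "b")
italic = tag_insertions(TEXT, "brown fox", "i")
merged = naive_merge(TEXT, bold + italic)

print(merged)
print("well-formed:", is_well_formed(merged))
```

Each edit on its own produces well-formed markup; only the combination breaks, which is why the problem is easy to miss when testing with a single user.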

For collaborating on textual documents, the [[http://waveprotocol.org|Wave Protocol]]
is a good fit, but it isn't appropriate for everything.
For example, it wouldn't be my first choice for vector graphics
(consider: how do you move an object forward or backward in the drawing
stack?).  Even tables can cause problems unless your server understands them,
has some way of cleaning them up, or you come up with a clever way of
representing them.  Say we have a 2x2 table:
{{{
<table>
  <tr>
    <td></td><td></td>
  </tr>
  <tr>
    <td></td><td></td>
  </tr>
</table>
}}}
One user adds a row (which adds another ##<tr><td></td><td></td></tr>## at the end),
while another user adds a column (which adds a ##<td></td>## to each ##<tr>##).
If both edits happen at the same time, the result will be:
{{{
<table>
  <tr>
    <td></td><td></td><td></td>
  </tr>
  <tr>
    <td></td><td></td><td></td>
  </tr>
  <tr>
    <td></td><td></td>
  </tr>
</table>
}}}
That is, the first two rows will have three columns, and the third row will
have two columns.  There are several ways of dealing with this, but you need to
know about the potential problem before you can address it.
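
As a rough illustration, the Python sketch below reproduces the ragged table and then applies one possible cleanup: padding every row to the widest row.  The table is modelled as a plain list of rows, and the function names (##add_row##, ##add_column##, ##normalize##) are hypothetical.  Padding is only one of the several options, and a real server would need to decide what a sane result is for its own document type.

```python
def add_row(table, width):
    """Append a row with as many cells as the editing user saw."""
    return table + [[""] * width]

def add_column(table, row_count):
    """Append a cell to each of the rows the editing user saw."""
    return [row + [""] if i < row_count else row
            for i, row in enumerate(table)]

def normalize(table):
    """One possible cleanup: pad short rows to the widest row."""
    width = max(len(row) for row in table)
    return [row + [""] * (width - len(row)) for row in table]

original = [["", ""], ["", ""]]  # the 2x2 table

# Both edits were made against the 2x2 original, then applied together.
merged = add_column(add_row(original, width=2), row_count=2)
cleaned = normalize(merged)

print([len(row) for row in merged])   # row widths after the naive merge
print([len(row) for row in cleaned])  # row widths after cleanup
```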

The moral is: make sure that your synchronization method is appropriate for
your document types, and/or make sure that your server understands your
document type well enough to make sane conflict resolution decisions.