Real-Time Collaboration on the Web: CRDTs vs OT, Presence, and Offline Sync

Real-time collaboration—multiple users editing the same document, board, or dataset simultaneously—has moved from novelty to baseline. Teams expect Google-Docs-style coauthoring, cursors dancing, comments syncing, and zero data loss even when the network blips. Delivering this reliably requires more than “send changes over WebSocket.” You need a conflict-tolerant data model, presence semantics, backpressure control, and an offline strategy. This article provides a practical blueprint and helps you choose between CRDTs (Conflict-free Replicated Data Types) and OT (Operational Transform).

Collaboration goals and constraints

A robust system typically targets:

  • Low latency: Sub-100 ms local echo for edits; remote updates streamed ASAP.
  • Consistency under concurrency: Edits from different users converge to the same state.
  • Offline resilience: Users can edit offline and later sync without losing changes.
  • Fairness: No user’s changes are silently “overwritten.”
  • Security and privacy: Auth, ACLs, and optional end-to-end encryption for sensitive content.
  • Observability: Audit trails, version history, and easy diffing.

These goals influence your algorithm and transport choices.

Two core approaches: OT vs CRDTs

Operational Transform (OT)

Idea: Each user produces operations (insert, delete, replace). When two concurrent operations conflict, a transform function rewrites one relative to the other so both can be applied in a consistent order.
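To make the transform step concrete, here is a minimal sketch that handles concurrent inserts only. The `InsertOp` shape, the numeric `site` tie-breaker, and the function names are assumptions for illustration; a real OT engine also needs transforms for deletes, replaces, and every pairing of operation types.

```typescript
// Hypothetical shape for a plain-text insert. `site` is a unique client ID
// used only to break ties between inserts at the same position.
interface InsertOp {
  pos: number;
  text: string;
  site: number;
}

function applyInsert(doc: string, op: InsertOp): string {
  return doc.slice(0, op.pos) + op.text + doc.slice(op.pos);
}

// Rewrite `op` so it can be applied after the concurrent insert `other`
// has already been applied. Both ops were created against the same document.
function transformInsert(op: InsertOp, other: InsertOp): InsertOp {
  const otherLandsFirst =
    other.pos < op.pos || (other.pos === op.pos && other.site < op.site);
  return otherLandsFirst ? { ...op, pos: op.pos + other.text.length } : op;
}

// Two replicas apply the same pair of concurrent inserts in opposite orders.
function convergeDemo(base: string, a: InsertOp, b: InsertOp): [string, string] {
  const replicaA = applyInsert(applyInsert(base, a), transformInsert(b, a));
  const replicaB = applyInsert(applyInsert(base, b), transformInsert(a, b));
  return [replicaA, replicaB];
}
```

For the document "abc", with one client inserting "X" at position 1 (site 1) and another inserting "Y" at position 1 (site 2), both replicas end at "aXYbc" regardless of which op they applied first.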

Pros

  • Mature for text editors; proven in large deployments.
  • Compact ops; often smaller payloads than state diffs.
  • Works well when a central server arbitrates ordering.

Cons

  • Implementing correct, composable transforms for every operation type is complex.
  • Offline and multi-leader scenarios are harder; usually relies on a server as the source of truth.
  • Extending beyond linear text (e.g., trees, tables) increases complexity.

Good fit: Centralized real-time text editing where a server is always reachable and you prioritize minimal payloads.

Conflict-free Replicated Data Types (CRDTs)

Idea: Data structures are designed so that merging any two replicas deterministically yields the same result, without needing a central arbiter. Each change carries metadata that identifies and orders it, such as logical timestamps, site (replica) IDs, or vector clocks that capture causal order.
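One of the simplest CRDTs, a last-writer-wins (LWW) register, shows how that metadata makes merges deterministic. This is a sketch under stated assumptions: the `LwwRegister` shape and the `writeLww`/`mergeLww` helpers are hypothetical names, and the clock is a logical counter rather than wall-clock time.

```typescript
interface LwwRegister<T> {
  value: T;
  clock: number; // logical (Lamport-style) timestamp, not wall-clock time
  site: string;  // unique replica ID, used as a deterministic tie-breaker
}

// A local write bumps the logical clock and stamps the writer's site ID.
function writeLww<T>(current: LwwRegister<T>, value: T, site: string): LwwRegister<T> {
  return { value, clock: current.clock + 1, site };
}

// Merge is commutative, associative, and idempotent, so replicas can exchange
// state in any order, any number of times, and still agree.
function mergeLww<T>(a: LwwRegister<T>, b: LwwRegister<T>): LwwRegister<T> {
  if (a.clock !== b.clock) return a.clock > b.clock ? a : b;
  return a.site >= b.site ? a : b;
}
```

Note the trade-off: the register converges, but one of two concurrent writes is discarded. That can be acceptable for a single-value field such as a title, while list and text CRDTs keep both users' edits by giving each element its own ID.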

Pros

  • Natural support for offline-first: replicas can diverge and later converge.
  • Generalizable to rich structures: lists, maps, sets, graphs, JSON, and rich-text spans.
  • Easier reasoning about correctness in multi-leader topologies.

Cons

  • Metadata overhead (tombstones, per-element IDs) can cause state growth; requires compaction (“garbage collection”).
  • Some implementations use more memory and produce larger messages than equivalent OT ops.
  • Correct design still requires rigor (causal order, idempotency).

Good fit: Whiteboards, notes, structured documents, or apps with intermittent connectivity and decentralized flows.

Rule of thumb: If you require robust offline editing or plan to support multiple leaders (edge peers, P2P), choose CRDTs. If your app is always online with a single authoritative server and focused on linear text, OT remains viable.

Transport: keeping streams healthy

  • WebSocket for low-latency duplex messaging; fall back to SSE/long-poll if needed.
  • Message framing: Use a compact binary format (CBOR/MessagePack) if you push many small ops.
  • Backpressure: Buffer locally with size/time caps; drop presence “heartbeats” before dropping document ops.
  • Ordering: Preserve causal order per document. For CRDTs, include logical timestamps or vector clocks; for OT, use monotonically increasing sequence numbers.
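As an illustration of the ordering metadata, the sketch below compares two vector clocks to decide whether one change causally precedes another or the two are concurrent. The `VectorClock` type and `compareClocks` name are assumptions for this example, not part of any particular library.

```typescript
// Maps a site (replica) ID to the number of events seen from that site.
type VectorClock = Record<string, number>;
type ClockOrdering = "before" | "after" | "equal" | "concurrent";

function compareClocks(a: VectorClock, b: VectorClock): ClockOrdering {
  let aAhead = false;
  let bAhead = false;
  const sites = Object.keys(a).concat(Object.keys(b));
  for (const site of sites) {
    const av = a[site] ?? 0; // a missing entry means "no events seen"
    const bv = b[site] ?? 0;
    if (av > bv) aAhead = true;
    if (bv > av) bAhead = true;
  }
  if (aAhead && bAhead) return "concurrent"; // neither happened-before the other
  if (aAhead) return "after";
  if (bAhead) return "before";
  return "equal";
}
```

A receiver can use the result to buffer a change whose causal predecessors have not arrived yet, and to hand "concurrent" changes to the merge logic.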

Presence, cursors, and awareness

Users need to “feel” collaboration:

  • Presence state: Online/offline, idle/active, role (viewer/editor). Broadcast at a slower cadence (e.g., every 2–5 seconds) to save bandwidth.
  • Cursors & selections: Send high-frequency updates but throttle (e.g., 30–60 Hz max) and mark them as ephemeral (no persistence).
  • Avatars and color assignment: Deterministic mapping from user ID → color to reduce churn.
  • Room membership: Use join/leave events and a server-maintained roster to avoid ghost users when connections drop.
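Two of these ideas are small enough to sketch directly: a deterministic user-to-color mapping and a throttle for ephemeral cursor updates. The function names, the hash, and the HSL values are illustrative assumptions. The throttle takes an injectable clock so it can be tested without real timers.

```typescript
// Hash the user ID to a hue so every client derives the same color
// without any coordination or stored assignment.
function colorForUser(userId: string): string {
  let hash = 0;
  for (let i = 0; i < userId.length; i++) {
    hash = (Math.imul(hash, 31) + userId.charCodeAt(i)) >>> 0;
  }
  return `hsl(${hash % 360}, 70%, 50%)`;
}

// Leading-edge throttle: forwards at most one update per `minIntervalMs`
// and drops the rest, which is acceptable for ephemeral cursor data.
function makeThrottle<T>(
  minIntervalMs: number,
  send: (value: T) => void,
  now: () => number = Date.now,
): (value: T) => boolean {
  let lastSent = -Infinity;
  return (value: T): boolean => {
    const t = now();
    if (t - lastSent < minIntervalMs) return false; // dropped
    lastSent = t;
    send(value);
    return true;
  };
}
```

An interval of about 33 ms corresponds to the 30 Hz cap mentioned above. A production version would likely also flush the last dropped update once the interval elapses, so a cursor does not appear stuck slightly behind its true position.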

Persistence, history, and snapshots

  • Event log: Append operations or CRDT deltas with timestamps and actor IDs.
  • Snapshots: Periodically store compact snapshots so new clients don’t replay the entire history.
  • Compaction: For CRDTs, compact tombstones and coalesce adjacent edits. For OT logs, squash older operations after checkpointing.
  • Versioning: Expose readable versions (e.g., per minute, per explicit save) for audit and undo/redo across sessions.
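The relationship between the event log, snapshots, and compaction can be sketched with a toy key-value document. All type and function names here (`LoggedOp`, `Snapshot`, `restore`, `takeSnapshot`, `squashLog`) are hypothetical; a real system would store CRDT deltas or OT ops instead of plain key/value writes.

```typescript
type DocState = Record<string, string>;

interface LoggedOp {
  seq: number;       // position in the append-only log, starting at 1
  actor: string;
  timestamp: number;
  key: string;
  value: string;
}

interface Snapshot {
  seq: number;       // last log entry already folded into `state`
  state: DocState;
}

function applyOp(state: DocState, op: LoggedOp): DocState {
  return { ...state, [op.key]: op.value };
}

// Rebuild the document from a snapshot plus the log entries recorded after it.
function restore(snapshot: Snapshot, log: LoggedOp[]): DocState {
  return log
    .filter((op) => op.seq > snapshot.seq)
    .sort((x, y) => x.seq - y.seq)
    .reduce((state, op) => applyOp(state, op), snapshot.state);
}

// Fold everything up to `upToSeq` into a compact checkpoint.
function takeSnapshot(log: LoggedOp[], upToSeq: number): Snapshot {
  const covered = log.filter((op) => op.seq <= upToSeq);
  return { seq: upToSeq, state: restore({ seq: 0, state: {} }, covered) };
}

// After checkpointing, entries covered by the snapshot can be dropped.
function squashLog(log: LoggedOp[], snapshot: Snapshot): LoggedOp[] {
  return log.filter((op) => op.seq > snapshot.seq);
}
```

A new client then loads the latest snapshot and replays only the tail, which should produce the same state as replaying the full history from an empty document.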

Offline-first strategy

  1. Local queue: Buffer operations in IndexedDB (web) or a local store (mobile).
  2. Optimistic UI: Apply edits immediately; render remote merges as they arrive.
  3. Reconciliation: On reconnect, send the op tail since the last acknowledged version. With CRDTs, merge; with OT, transform pending ops against server history.
  4. Conflict visibility: For ambiguous merges (e.g., title field), surface minimal UI hints (badges, highlights) without blocking the flow.
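Steps 1 and 3 can be sketched as a pending-op queue that tracks what the server has acknowledged. This version is an in-memory stand-in: the `OfflineQueue` class and `clientSeq` field are assumptions for illustration, and a browser implementation would persist the same records in IndexedDB so they survive a reload.

```typescript
interface PendingOp<T> {
  clientSeq: number; // per-client sequence number, assigned locally
  payload: T;
}

class OfflineQueue<T> {
  private pending: PendingOp<T>[] = [];
  private nextSeq = 1;

  // Called for every local edit, online or offline.
  enqueue(payload: T): PendingOp<T> {
    const op: PendingOp<T> = { clientSeq: this.nextSeq++, payload };
    this.pending.push(op);
    return op;
  }

  // The server confirmed everything up to and including `upToSeq`.
  acknowledge(upToSeq: number): void {
    this.pending = this.pending.filter((op) => op.clientSeq > upToSeq);
  }

  // The "op tail" to resend on reconnect, oldest first.
  unacknowledged(): PendingOp<T>[] {
    return this.pending.slice();
  }
}
```

On reconnect, the client sends `unacknowledged()` in order. With CRDTs the server merges these deltas; with OT it transforms them against the history the client missed. Because a resend can duplicate an op the server already received, the receiving side should treat `clientSeq` as an idempotency key.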

Security and multi-tenant controls

  • Authentication: Short-lived tokens via httpOnly cookies or OAuth/OIDC.
  • Authorization: Room/document ACLs (owner, editor, commenter, viewer). Enforce both server-side and—if needed—at the edge.
  • Row/document-level encryption: For high-sensitivity docs, consider end-to-end encryption (E2EE): encrypt payloads client-side so the server only routes ciphertext. CRDTs can work with E2EE because merging happens on clients, but you must encrypt per field/element and keep unencrypted metadata minimal.
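A server-side authorization gate for the four roles above might look like the following sketch. It assumes a strictly linear hierarchy (viewer < commenter < editor < owner) and a per-document ACL keyed by user ID. Both are simplifications, and the `Action` names are hypothetical.

```typescript
type Role = "viewer" | "commenter" | "editor" | "owner";
type Action = "read" | "comment" | "edit" | "manage";

// Assumption: roles form a simple ladder, lowest privilege first.
const ROLE_ORDER: Role[] = ["viewer", "commenter", "editor", "owner"];

const MINIMUM_ROLE: Record<Action, Role> = {
  read: "viewer",
  comment: "commenter",
  edit: "editor",
  manage: "owner",
};

function canPerform(role: Role, action: Action): boolean {
  return ROLE_ORDER.indexOf(role) >= ROLE_ORDER.indexOf(MINIMUM_ROLE[action]);
}

// Run this before an incoming op reaches the merge or transform step.
// Users without an ACL entry are denied by default.
function authorizeOp(acl: Record<string, Role>, userId: string, action: Action): boolean {
  const role = acl[userId];
  return role !== undefined && canPerform(role, action);
}
```

The check belongs on the server even if the client also hides controls, since a modified client can send any op it likes.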

Performance guardrails

  • Structured batching: Coalesce small ops into frames (e.g., every 10–20 ms or N ops).
  • Priority lanes: Data ops > selection changes > presence. Drop lower-priority messages under congestion.
  • Rendering costs: Use virtualization for long documents; avoid re-rendering the whole tree on each op.
  • Garbage collection: Run CRDT compaction in the background; prune old presence records.
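Batching and priority lanes can be combined in a bounded outbox, sketched below. The `PriorityOutbox` class, lane names, and eviction policy are assumptions for illustration: when the buffer is full, a new message may evict the oldest message from a strictly lower-priority lane, and otherwise it is rejected so the caller can apply backpressure.

```typescript
type Lane = "op" | "selection" | "presence";

interface Outgoing {
  lane: Lane;
  payload: string;
}

const LANES_HIGH_TO_LOW: Lane[] = ["op", "selection", "presence"];

class PriorityOutbox {
  private readonly lanes: Record<Lane, Outgoing[]> = { op: [], selection: [], presence: [] };
  private readonly capacity: number;

  constructor(capacity: number) {
    this.capacity = capacity;
  }

  size(): number {
    return LANES_HIGH_TO_LOW.reduce((n, lane) => n + this.lanes[lane].length, 0);
  }

  // Returns false when the message was rejected (caller should back off).
  push(msg: Outgoing): boolean {
    if (this.size() >= this.capacity && !this.evictBelow(msg.lane)) return false;
    this.lanes[msg.lane].push(msg);
    return true;
  }

  // Drop the oldest message from the lowest-priority non-empty lane below `lane`.
  private evictBelow(lane: Lane): boolean {
    const lower = LANES_HIGH_TO_LOW.slice(LANES_HIGH_TO_LOW.indexOf(lane) + 1).reverse();
    for (const candidate of lower) {
      if (this.lanes[candidate].length > 0) {
        this.lanes[candidate].shift();
        return true;
      }
    }
    return false;
  }

  // Build one frame of up to `maxMessages`, highest priority first.
  drainFrame(maxMessages: number): Outgoing[] {
    const frame: Outgoing[] = [];
    for (const lane of LANES_HIGH_TO_LOW) {
      const queue = this.lanes[lane];
      while (queue.length > 0 && frame.length < maxMessages) {
        frame.push(queue.shift() as Outgoing);
      }
    }
    return frame;
  }
}
```

A timer firing every 10–20 ms would call `drainFrame` and send the result as a single WebSocket message. Document ops are never evicted here; a rejected op should stay in the local offline queue rather than be discarded.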

Testing the hard parts

  • Fuzzing: Generate random concurrent edits across clients; verify convergence (same hash across replicas).
  • Network chaos: Inject latency, packet loss, and reordering; ensure UI stays usable.
  • Persistence restore: Start from snapshots; replay tails; check for identical DOM/JSON state.
  • Scale: Simulate hundreds of “bots” editing to validate throughput and backpressure settings.
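A convergence fuzz test can be quite small. The sketch below uses a grow-only counter (G-Counter) as a stand-in for a real document CRDT, with a seeded pseudo-random generator so a failing run can be replayed. All names are hypothetical, and the random merges only simulate message delivery between replicas.

```typescript
type GCounter = Record<string, number>; // site ID -> increments made at that site

function increment(counter: GCounter, site: string): GCounter {
  return { ...counter, [site]: (counter[site] ?? 0) + 1 };
}

function mergeCounters(a: GCounter, b: GCounter): GCounter {
  const merged: GCounter = { ...a };
  for (const site of Object.keys(b)) {
    merged[site] = Math.max(merged[site] ?? 0, b[site] ?? 0);
  }
  return merged;
}

function counterValue(counter: GCounter): number {
  return Object.keys(counter).reduce((sum, site) => sum + (counter[site] ?? 0), 0);
}

// Order-independent fingerprint, standing in for a state hash.
function fingerprint(counter: GCounter): string {
  return Object.keys(counter).sort().map((s) => `${s}=${counter[s]}`).join(",");
}

// Small deterministic generator so a failing seed can be replayed.
function makeRandom(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (Math.imul(state, 1664525) + 1013904223) >>> 0;
    return state / 4294967296;
  };
}

function fuzzConvergence(seed: number, replicaCount: number, steps: number): boolean {
  const random = makeRandom(seed);
  const pick = (): number => Math.floor(random() * replicaCount);
  let replicas: GCounter[] = [];
  for (let i = 0; i < replicaCount; i++) replicas.push({});
  let increments = 0;

  for (let step = 0; step < steps; step++) {
    const target = pick();
    if (random() < 0.5) {
      // Local edit on one replica.
      replicas = replicas.map((r, i) => (i === target ? increment(r, `site-${i}`) : r));
      increments++;
    } else {
      // Deliver some other replica's state to `target`, in arbitrary order.
      const source = replicas[pick()] ?? {};
      replicas = replicas.map((r, i) => (i === target ? mergeCounters(r, source) : r));
    }
  }

  // Final sync: every replica receives everything, then all must agree.
  const everything = replicas.reduce((acc, r) => mergeCounters(acc, r), {} as GCounter);
  const expected = fingerprint(everything);
  const synced = replicas.map((r) => mergeCounters(r, everything));
  return synced.every((r) => fingerprint(r) === expected) && counterValue(everything) === increments;
}
```

Running `fuzzConvergence` across many seeds in CI gives cheap coverage of interleavings that hand-written tests rarely reach. The same harness shape applies to a text CRDT or OT engine by swapping in its edit and merge functions.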

A minimal system design (reference)

  • Client:
    • Editor surface (ProseMirror/Slate/TipTap/Quill for text; custom canvas/DOM for boards).
    • Collaboration engine (CRDT or OT adapter) with local op queue in IndexedDB.
    • Presence module (throttled cursors, selections).
    • Transport (WebSocket) with heartbeat and reconnect jitter.
  • Server:
    • Stateless gateway handling auth and WebSocket fan-out.
    • Collaboration coordinator:
      • CRDT mode: Accept deltas, broadcast, and append to an event log; optional server-side merge to create snapshots.
      • OT mode: Transform incoming ops against head, update head, broadcast transformed ops.
    • Storage: document snapshots + op log; TTL for presence streams.
    • Observability: per-room metrics (ops/sec, latency, dropped frames), audit logs.
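The reconnect jitter in the client transport is worth spelling out, because many clients retrying on the same schedule after an outage can overload the gateway. A minimal sketch using capped exponential backoff with full jitter follows; the function name and the default values are assumptions, not recommendations.

```typescript
// Returns how long to wait before reconnect attempt number `attempt` (0-based).
// The ceiling doubles with each attempt up to `capMs`, and the actual delay is
// drawn uniformly from [0, ceiling) so clients spread out instead of stampeding.
function reconnectDelayMs(
  attempt: number,
  baseMs: number = 500,
  capMs: number = 30000,
  random: () => number = Math.random,
): number {
  const ceiling = Math.min(capMs, baseMs * Math.pow(2, attempt));
  return Math.floor(random() * ceiling);
}
```

The `random` parameter is injectable only to make the function testable. The attempt counter should reset to zero after a connection has stayed healthy for a while.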

Choosing between CRDT and OT: a simple decision grid

  • Unreliable connectivity / offline editing important? → CRDT.
  • Linear text only, centralized always-online server, strict bandwidth budget? → OT.
  • Rich data types (lists, maps, annotations, shapes)? → CRDT (easier to extend).
  • Existing editor stack already OT-based with mature transforms? → Stick with OT to leverage tooling.
  • End-to-end encryption with minimal server logic? → CRDT (merge on clients).

Common pitfalls (and how to dodge them)

  • Unbounded state growth: Add compaction and snapshotting from day one.
  • Presence overload: Flooding the wire with cursor updates—throttle and prioritize.
  • Ambiguous merges: Even with CRDTs, UX around conflicting intentions matters. Provide subtle in-document markers.
  • Single region bottleneck: Put gateways near users and shard rooms; keep authoritative storage consistent but avoid funneling all traffic through one point.
  • Editor lock-in: Abstract the collaboration core from the UI editor to allow future migrations.

Practical starter checklist

  • Decide CRDT vs OT based on offline needs, data types, and ops complexity.
  • Define document schema and versioning; plan snapshots and compaction.
  • Implement op framing, causal metadata, and retry semantics.
  • Separate channels for ops, presence, and control messages.
  • Add fuzz tests for convergence and chaos tests for the network.
  • Instrument: ops/sec, merge latency, dropped frames, and editor responsiveness (e.g., Interaction to Next Paint, INP).
  • Document failure modes and recovery steps (e.g., forced snapshot on corruption).

Bottom line

Real-time collaboration is a systems problem disguised as a UI feature. Pick a data model that converges under concurrency, shape your transport with backpressure and priorities, and treat offline as a first-class path rather than an edge case. Use OT for centralized, text-heavy apps with tight payload budgets; choose CRDTs for offline-friendly, multi-leader scenarios and richer data structures. Wrap it all with presence that’s informative but lightweight, and you’ll deliver collaboration that feels instant—and stays correct.

