Document Modality

A new Document modality wraps the existing Prose reducer into a proper Modality — with DocumentComponent placements (subtrait of Component, same pattern as WhiteboardComponent), template-driven structure, and a two-phase lifecycle (compiled → materialized). Session<Document> replaces the bespoke ProseSession, gaining undo/redo, chat, generation, export, and the full Session lifecycle for free. Plain documents are single-section; template documents define multiple DocumentComponent placements with properties (e.g. unit plans, activity ideas, ILPs become templates over Document).

A new PropertyType::Prose / PropertyValue::Prose extends the property system to support LoroText-backed content. Prose properties are CRDT containers that support AI streaming via write_prose tool + rig’s StreamingPromptHook, and direct editing via the Prose reducer. This generalizes beyond Document — Whiteboard’s TextComponent adopts the same property type, replacing its current empty-blocks-at-compile-time workaround. TextBoxComponent is deleted; TextComponent with a Prose property handles both live editing and AI-populated template text.

This RFC also splits AiIntent into ChatIntent (agent — multi-turn, tool-calling, open-ended) and GenerateIntent (orchestrator — deterministic pipeline over placements). All Ai-prefixed types are renamed to Chat. Generation becomes a separate session-level concern with a Modality::on_generate_complete() hook for modality-specific post-processing (Document uses this to materialize).

Alongside the new modality, this RFC consolidates persistence naming: Doc<S> → Ledger<S>, StoreBackend → Store, LocalBackend → LocalStore, RemoteBackend → RemoteStore. The docs gain a dedicated Persistence reference section. ProseBlockView is renamed to ProseBlock. The prose crate is restructured: ProseSession and UndoStaging are deleted, commands.rs is split into types.rs (pure data) and intent.rs (reducer protocol), and write_blocks_to_loro_text() is added for materialization.

Depends on Runtime Struct for the Effect return model. Interacts with Embedded Prose Reducer — Document reuses the same embedded Prose pattern as Whiteboard. Interacts with Session Undo/Redo — Session-level undo covers all section edits; chat streaming uses undo groups. Interacts with Session Clipboard — Document implements clipboard methods for section content. Interacts with Selection Trait — Document’s selection type is text ranges/blocks; Whiteboard’s is element IDs.

Today, standalone prose editing uses ProseSession — a custom wrapper with its own LoroDoc, UndoManager, dispatch lifecycle, effect sinks, and versioning. This duplicates everything Session<M> already provides:

  1. Duplicated lifecycle — ProseSession reimplements commit, version increment, snapshot emission, undo/redo staging, and effect draining. Session does all of this generically.

  2. No AI integration — ProseSession has no agent thread, no chat, no orchestrator. Adding AI to standalone documents would require building it from scratch rather than getting it free from Session.

  3. No export — ProseSession has export_bytes() for raw Loro snapshots but no PDF/markdown export pipeline. Session’s export machinery is unavailable.

  4. No template support — ProseSession is a flat text editor. There’s no concept of structured documents with sections, properties, or templates. Schools need unit plans, activity ideas, and ILPs — structured documents with defined sections that can be templated.

  5. Prose is a reducer, not a modality — Prose implements Reducer but not Modality or Component. It can’t participate in the compilation pipeline, property system, or template binding. A Document modality wrapping Prose bridges this gap.

  6. No streamable property type — Property values are all static (strings, numbers, JSON blobs). Rich text content needs to be streamable — AI writes tokens incrementally into LoroText, visible in real-time. There’s no property type that bridges the static property world and the live CRDT world.

  7. Persistence naming is muddled: Doc<S> (the CRDT wrapper) will collide conceptually with Document (the modality). More broadly, Doc, StoreBackend, LoroStore, LocalBackend, and RemoteBackend lack a coherent naming scheme. The persistence layer deserves consolidated semantics and its own documentation section.

PropertyType::Prose — streamable rich text values

A new property type bridges the gap between static property values and live CRDT containers:

// Schema level — declares a property is backed by LoroText
enum PropertyType {
Text { ... },
Number { ... },
// ...existing...
Prose, // backed by a LoroText container in the LoroDoc
}
// Value level — references a LoroText container by ID
enum PropertyValue {
String(String),
Number(f64),
// ...existing...
Prose(String), // container ID → LoroText in the LoroDoc
}

When a template declares a property as PropertyType::Prose, the system creates a LoroText container in the LoroDoc keyed by the container ID. The PropertyValue::Prose(container_id) is the reference.

During compilation: PropertyValue::Prose(container_id) stays as a clean container ID reference in the values map. The Modality’s compile_child() override resolves it: reads the LoroText by container ID, builds Vec<ProseBlock>, and passes the resolved blocks through the component’s metadata (e.g. DocumentComponentMeta.prose). The component’s compile() reads blocks from metadata — never touches LoroDoc directly.

During AI streaming: The AI agent calls a write_prose tool with { target: container_id, content: "markdown..." }. The content argument streams token-by-token from the API. A StreamingPromptHook (rig’s callback trait) intercepts on_tool_call_delta events, extracts markdown chunks from the partial JSON, and inserts them into the target LoroText with appropriate marks. Periodic flushes trigger recompile → snapshot so users see text appearing in real time. Undo groups from the Session Undo/Redo RFC wrap the entire tool call as one undo step. The AI agent retains full context: there is no sub-query and no knowledge drift.

During editing: The Prose reducer operates on the LoroText via the container ID. Same mechanism as embedded Prose in Whiteboard (the Embedded Prose RFC).

This generalizes beyond Document — Whiteboard’s TextComponent and any future component with rich text content can declare PropertyType::Prose properties.
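The container-ID indirection described above can be illustrated with a small, self-contained sketch. It is not the RFC's implementation: the `TextContainers` map is a hypothetical stand-in for the LoroDoc, `resolve_prose` is an invented helper name, and the `PropertyValue` enum is cut down to two variants.

```rust
use std::collections::HashMap;

/// Simplified stand-in for the RFC's PropertyValue (only two variants shown).
#[derive(Debug, Clone, PartialEq)]
enum PropertyValue {
    String(String),
    /// Container ID referencing a text container in the document.
    Prose(String),
}

/// Hypothetical stand-in for a LoroDoc: container ID -> current text.
type TextContainers = HashMap<String, String>;

/// Resolve every Prose value to the text it references, keyed by property key.
/// Non-Prose values are skipped; a missing container resolves to empty text.
fn resolve_prose(
    values: &HashMap<String, PropertyValue>,
    containers: &TextContainers,
) -> HashMap<String, String> {
    values
        .iter()
        .filter_map(|(key, value)| match value {
            PropertyValue::Prose(id) => {
                let text = containers.get(id).cloned().unwrap_or_default();
                Some((key.clone(), text))
            }
            _ => None,
        })
        .collect()
}
```

The values map keeps only the ID, so the same value can be serialized, bound in a template, or handed to the AI tool as a `target` without copying the text itself.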

TextComponent currently produces an empty blocks: Vec::new() at compile time — the actual blocks come from LoroText content via Whiteboard’s compile_child_hash override. This is a workaround for not having a Prose property type.

With PropertyType::Prose, TextComponent gains a "content" property of type Prose. The property value is a container ID referencing the element’s LoroText in the Whiteboard’s LoroDoc. compile() reads the LoroText via the container ID and builds blocks — no more empty-shell hack.

AI populates text content via the same write_prose tool and streaming hook mechanism. This works for both initial template generation and later AI edits to existing text elements.

TextBoxComponent is deleted. TextComponent with its Prose property, plus existing style properties (font, alignment) and optional background/border properties (merged from TextBoxComponent), handles all text element use cases.

AI generates prose content through a write_prose tool whose content argument is streamed directly into LoroText via rig’s StreamingPromptHook:

// Tool schema — AI sees this as a normal tool
struct WriteProseTool {
target: String, // container ID (LoroText to write into)
content: String, // markdown content (streamed by the API)
}

The AI agent calls write_prose(target, content) like any other tool. The difference is in how we handle the streaming argument deltas:

#[derive(Clone)]
struct ProseStreamHook {
session: Weak<Mutex<Session<M>>>,
state: Arc<Mutex<ProseStreamState>>,
}
struct ProseStreamState {
active_tool: Option<String>, // tool call ID being streamed
target: Option<String>, // container ID, extracted from partial JSON
cursor: usize, // insertion cursor in LoroText
buffer: String, // accumulated content for periodic flush
}
impl<M: CompletionModel> StreamingPromptHook<M> for ProseStreamHook {
async fn on_tool_call_delta(
&self,
tool_call_id: &str,
tool_name: Option<&str>,
delta: &str,
_cancel: CancelSignal,
) {
// 1. Detect write_prose tool
if tool_name == Some("write_prose") {
self.state.lock().active_tool = Some(tool_call_id.to_string());
}
let mut state = self.state.lock();
if state.active_tool.as_deref() != Some(tool_call_id) { return; }
// 2. Extract markdown from partial JSON
// Deltas arrive as: {"target":"body","content":"# Intro
// Then: duction\n\nThis unit
// Track which field we're inside, stream "content" chunks
if let Some(chunk) = extract_content_from_json_delta(delta, &mut state) {
state.buffer.push_str(&chunk);
}
// 3. Periodic flush: insert buffered markdown into LoroText
if should_flush(&state) {
// Wait until the target container ID has been parsed instead of unwrapping it.
if let (Some(session), Some(target)) = (self.session.upgrade(), state.target.clone()) {
let session = session.lock();
let text = session.state().ledger.with_doc(|d| d.get_text(target.as_str()));
// Reborrow the guard once so cursor and buffer are borrowed as disjoint fields.
let st = &mut *state;
insert_markdown_at_cursor(&text, &mut st.cursor, &st.buffer);
st.buffer.clear();
// Session recompile + snapshot happens on next dispatch cycle
}
}
}
}

Attached to the agent stream:

let hook = ProseStreamHook { session: weak_session, state: Default::default() };
agent.stream_prompt(message)
.with_hook(hook)
.multi_turn(max_turns)
.await;

This approach:

  • Same agent context — the AI retains full conversation history, no sub-query
  • Native streaming — uses rig’s existing ToolCallDelta infrastructure
  • Cross-provider — works with both Anthropic (input_json_delta) and OpenAI (partial arguments)
  • Generalizes — any tool can use this pattern; Whiteboard TextComponent uses the same write_prose tool
  • Undo-grouped — the Session Undo/Redo RFC undo group wraps the entire tool call as one undo step
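The partial-JSON extraction step that the hook relies on can be sketched as follows. This is one possible approach, not rig's API or the RFC's actual `extract_content_from_json_delta`: `ContentExtractor` and `decode_content_prefix` are hypothetical names. The sketch re-decodes the accumulated arguments on every delta (simple rather than fast), assumes the text `"content"` does not appear earlier as a string value, and replaces surrogate `\u` escapes with U+FFFD.

```rust
/// Tracks the partial JSON arguments of one streamed tool call.
#[derive(Default)]
struct ContentExtractor {
    raw: String,    // all argument JSON received so far
    emitted: usize, // bytes of decoded content already returned
}

impl ContentExtractor {
    /// Feed one delta; returns newly available decoded content, if any.
    fn push(&mut self, delta: &str) -> Option<String> {
        self.raw.push_str(delta);
        let decoded = decode_content_prefix(&self.raw)?;
        if decoded.len() <= self.emitted {
            return None;
        }
        let chunk = decoded[self.emitted..].to_string();
        self.emitted = decoded.len();
        Some(chunk)
    }
}

/// Decode as much of the "content" string value as has arrived so far.
/// Stops early at an escape sequence that is split across deltas.
fn decode_content_prefix(raw: &str) -> Option<String> {
    let key = "\"content\"";
    let after_key = raw.find(key)? + key.len();
    let rest = raw[after_key..].trim_start();
    let rest = rest.strip_prefix(':')?.trim_start();
    let rest = rest.strip_prefix('"')?;

    let mut out = String::new();
    let mut chars = rest.chars();
    while let Some(c) = chars.next() {
        match c {
            '"' => break, // closing quote: the value is complete
            '\\' => {
                let Some(esc) = chars.next() else { break }; // escape not complete yet
                match esc {
                    'n' => out.push('\n'),
                    't' => out.push('\t'),
                    'r' => out.push('\r'),
                    'b' => out.push('\u{0008}'),
                    'f' => out.push('\u{000C}'),
                    'u' => {
                        let hex: String = chars.by_ref().take(4).collect();
                        if hex.chars().count() < 4 {
                            break; // incomplete \uXXXX, wait for more input
                        }
                        let code = u32::from_str_radix(&hex, 16).ok()?;
                        out.push(char::from_u32(code).unwrap_or('\u{FFFD}'));
                    }
                    other => out.push(other), // \" \\ \/
                }
            }
            _ => out.push(c),
        }
    }
    Some(out)
}
```

Because the decoded prefix only ever grows, the extractor can return the suffix past `emitted` on each call, which is the chunk a hook would append to its flush buffer.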

DocumentComponent is a subtrait of Component — same pattern as WhiteboardComponent. Individual components are separate structs, each with their own properties() and compile(), registered via inventory::submit!. ProseComponent is one implementation (the main one — takes a Prose property and returns blocks from LoroText).

pub struct DocumentComponentMeta {
pub id: String,
/// Resolved prose content — keyed by property key.
/// Populated by Document::compile_child() from LoroText containers.
pub prose: HashMap<String, Vec<ProseBlock>>,
}
pub trait DocumentComponent: Component<Metadata = DocumentComponentMeta, Output = Vec<ProseBlock>> {
fn id(&self) -> &'static str;
fn name(&self) -> &'static str;
fn description(&self) -> &'static str;
}
pub struct DocumentComponentEntry(pub &'static dyn DocumentComponent);
inventory::collect!(DocumentComponentEntry);
pub fn resolve_document_component(id: &str) -> Option<&'static dyn DocumentComponent> {
inventory::iter::<DocumentComponentEntry>
.into_iter()
.find(|e| e.0.id() == id)
.map(|e| e.0)
}

ProseComponent — takes a single PropertyType::Prose property, reads resolved blocks from metadata:

pub struct ProseComponent;
inventory::submit!(DocumentComponentEntry(&ProseComponent));
impl Component for ProseComponent {
type Metadata = DocumentComponentMeta;
type Output = Vec<ProseBlock>;
fn properties(&self) -> Vec<PropertySchema> {
vec![PropertySchema {
key: "content".into(),
name: "Content".into(),
property_type: PropertyType::Prose,
..Default::default()
}]
}
fn compile(
&self,
meta: &DocumentComponentMeta,
_values: &HashMap<String, PropertyValue>,
) -> Result<Vec<ProseBlock>, CompileError> {
Ok(meta.prose.get("content").cloned().unwrap_or_default())
}
}
impl DocumentComponent for ProseComponent {
fn id(&self) -> &'static str { "prose" }
fn name(&self) -> &'static str { "Prose" }
fn description(&self) -> &'static str { "Rich text section backed by LoroText." }
}

FigureComponent — structured properties, no Prose property needed:

pub struct FigureComponent;
inventory::submit!(DocumentComponentEntry(&FigureComponent));
impl Component for FigureComponent {
type Metadata = DocumentComponentMeta;
type Output = Vec<ProseBlock>;
fn properties(&self) -> Vec<PropertySchema> {
vec![
PropertySchema { key: "image_url".into(), name: "Image URL".into(), property_type: PropertyType::Url, ..Default::default() },
PropertySchema { key: "caption".into(), name: "Caption".into(), property_type: PropertyType::Text { max_length: None }, ..Default::default() },
PropertySchema { key: "alt_text".into(), name: "Alt Text".into(), property_type: PropertyType::Text { max_length: None }, ..Default::default() },
]
}
fn compile(&self, _meta: &DocumentComponentMeta, values: &HashMap<String, PropertyValue>) -> Result<Vec<ProseBlock>, CompileError> {
let url = bind_string(values, "image_url", "");
let caption = bind_string(values, "caption", "");
let alt = bind_string(values, "alt_text", "");
let mut blocks = vec![ProseBlock {
id: 0, block_type: BlockType::Image { url, alt: Some(alt).filter(|s| !s.is_empty()) },
segments: vec![], indent_level: 0,
}];
if !caption.is_empty() {
blocks.push(ProseBlock {
id: 1, block_type: BlockType::Paragraph,
segments: vec![StyledSegment { text: caption, italic: true, ..Default::default() }],
indent_level: 0,
});
}
Ok(blocks)
}
}
impl DocumentComponent for FigureComponent {
fn id(&self) -> &'static str { "figure" }
fn name(&self) -> &'static str { "Figure" }
fn description(&self) -> &'static str { "Image with optional caption." }
}

Prose resolution in compile_child() — Document resolves PropertyValue::Prose values by reading LoroText containers and enriching the metadata:

// Document's compile_child override
fn compile_child(
&self,
placement: &ComponentPlacement<()>,
meta: &DocumentComponentMeta,
values: &HashMap<String, PropertyValue>,
) -> Result<Vec<ProseBlock>, CompileError> {
let mut enriched = meta.clone();
for (key, val) in values {
if let PropertyValue::Prose(container_id) = val {
let text = self.state().ephemeral.doc.get_text(container_id);
enriched.prose.insert(key.clone(), render_from_loro_text(&text));
}
}
let component = resolve_document_component(&placement.component_id)
.ok_or(CompileError::UnknownComponent(placement.component_id.clone()))?;
component.compile(&enriched, values)
}

| Component | Properties | Output |
| --- | --- | --- |
| ProseComponent | content: Prose | Text blocks from LoroText |
| FigureComponent | image_url, caption, alt_text | Image block + caption paragraph |
| ChecklistComponent | items, checked states | Task list item blocks |

BlockType is extended with non-text variants to support components like FigureComponent:

enum BlockType {
// ...existing (Paragraph, Heading, ListItem, etc.)...
Image { url: String, alt: Option<String> },
}

Image is just another block type that happens to not use segments — same as HorizontalRule. ProseBlock remains the universal element. No separate DocumentBlock enum.

Not every component needs a PropertyType::Prose property. Only components with live-editable rich text (like ProseComponent) use Prose properties. The rest are pure property → block compilers — structured data in, Vec<ProseBlock> out.
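As an illustration of a pure property → block compiler, here is a minimal sketch of what a checklist compile step might look like. The RFC does not define ChecklistComponent's properties or block variant, so `ChecklistItem`, `compile_checklist`, and the `TaskItem` variant are all assumptions, and `ProseBlock` is reduced to three fields.

```rust
/// Simplified stand-in for the RFC's BlockType. `TaskItem` is a hypothetical
/// variant name; the RFC only says "task list item blocks".
#[derive(Debug, Clone, PartialEq)]
enum BlockType {
    TaskItem { checked: bool },
}

/// Simplified stand-in for ProseBlock (segments and indent level omitted).
#[derive(Debug, Clone, PartialEq)]
struct ProseBlock {
    id: u64,
    block_type: BlockType,
    text: String,
}

/// One checklist entry as it might arrive from structured property values.
struct ChecklistItem {
    label: String,
    checked: bool,
}

/// Pure compilation: structured data in, blocks out, no CRDT access.
fn compile_checklist(items: &[ChecklistItem]) -> Vec<ProseBlock> {
    items
        .iter()
        .enumerate()
        .map(|(index, item)| ProseBlock {
            id: index as u64,
            block_type: BlockType::TaskItem { checked: item.checked },
            text: item.label.clone(),
        })
        .collect()
}
```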

Documents have a two-phase lifecycle driven by the compilation and generation pipeline:

Phase 1 — Compiled (template): The template defines placements with various DocumentComponents (ProseComponent, FigureComponent, ChecklistComponent, etc.). Each compiles to Vec<ProseBlock> from its property values — some from LoroText (Prose properties), others from structured properties (formula, image, checklist). The orchestrator generates content per placement. The document is a set of compiled components.

Phase 2 — Materialized (editing): When generation completes, Modality::on_generate_complete() fires. Document’s override materializes the compiled output — takes the Vec<ProseBlock> from all placements (already in ephemeral output), writes them into a single LoroText with appropriate markdown prefixes and marks, and replaces the placements with one body ProseComponent placement backed by that LoroText. Structured components (figure, checklist, etc.) dissolve into the text. Now it’s a Prose editor.

// Document's on_generate_complete — called by Session after GenerateFeedback::Complete
fn on_generate_complete(state: &mut State<DocumentSynced, DocumentEphemeral>) {
// 1. Collect compiled output from all current placements (already in ephemeral)
let all_blocks: Vec<ProseBlock> = state.ephemeral.output.clone();
// 2. Write blocks into a single "body" LoroText
let body_text = state.ephemeral.doc.get_text("body");
write_blocks_to_loro_text(&body_text, &all_blocks);
// 3. Replace all placements with one ProseComponent body placement
state.synced.placements = vec![
ComponentPlacement {
id: "body".into(),
component_id: "prose".into(),
bindings: vec![PropertyBinding {
key: "content".into(),
value: PropertyValue::Prose("body".into()),
}],
position: (),
},
];
}

The transition is not an explicit mode flag — it’s a natural consequence of how generation feedback is handled. The placements change because the synced state changed. Before generation: N diverse DocumentComponent placements. After generation: one body placement over materialized LoroText.

A plain document (no template) starts directly in Phase 2 — single placement, single LoroText, immediate editing.

Document implements Modality with DocumentComponent placements:

pub struct Document {
rt: Runtime<Self>,
}
impl Modality for Document {
type Synced = DocumentSynced;
type Ephemeral = DocumentEphemeral;
type Snapshot = DocumentSnapshot;
type Position = ();
type ChildComponent = dyn DocumentComponent;
fn new(synced: DocumentSynced, services: Arc<Self::Services>, doc: &LoroDoc) -> Self;
fn placements(&self) -> &[ComponentPlacement<()>];
fn on_generate_complete(state: &mut State<Self::Synced, Self::Ephemeral>) { /* materialize */ }
// ...
}

Minimal — just ResourceMeta. Text content lives in LoroText containers within the LoroDoc, mutated directly by the Prose reducer (same pattern as Whiteboard text elements).

#[derive(Clone, Debug, PartialEq, Default, Serialize, Deserialize)]
pub struct DocumentSynced {
pub meta: ResourceMeta,
}
impl Synced for DocumentSynced {
fn from_doc(doc: &LoroDoc) -> Self {
let map = doc.get_map("document");
Self {
meta: ResourceMeta::from_loro_map(&map),
}
}
fn to_doc(&self, doc: &LoroDoc) {
let map = doc.get_map("document");
self.meta.to_loro_map(&map);
}
}

Mirrors the Whiteboard embedded Prose pattern from the Embedded Prose RFC:

pub struct DocumentEphemeral {
/// Active Prose editors — one per focused section.
pub prose_editors: HashMap<String, Prose>,
/// Which section has focus (receives keyboard input).
pub focused_section_id: Option<String>,
/// Cached compiled output.
pub output: Vec<ProseBlock>,
}

When a section gains focus, a Prose instance is created from the section’s LoroText handle. When focus leaves, the Prose instance can be dropped (the LoroText persists in the LoroDoc). This is identical to how Whiteboard manages prose_editors for text elements.

pub enum DocumentIntent {
/// Focus a section for editing.
FocusSection { id: String },
/// Unfocus the current section.
Blur,
/// Edit within the focused section — delegates to Prose.
SectionEdit { id: String, intent: ProseIntent },
/// Add a new section (plain document or template-allowed).
AddSection { after_id: Option<String> },
/// Remove a section.
RemoveSection { id: String },
/// Reorder a section.
MoveSection { id: String, to_index: u32 },
/// Toggle section fold state (ephemeral).
ToggleFold { id: String },
}

SectionEdit routes to the focused section’s Prose reducer. All other intents operate on the document structure itself.
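The routing rule can be sketched with simplified stand-ins. Everything here is illustrative: `DocState` and `reduce` are hypothetical names, `texts` stands in for the LoroText containers, `active_editors` stands in for `prose_editors`, and `ProseIntent` is reduced to one variant. Rejecting an edit aimed at an unfocused section is an assumption; the RFC does not say how that case is handled.

```rust
use std::collections::{HashMap, HashSet};

/// Reduced stand-in for the Prose reducer's intent type.
enum ProseIntent {
    InsertText(String),
}

/// Subset of the RFC's DocumentIntent.
enum DocumentIntent {
    FocusSection { id: String },
    Blur,
    SectionEdit { id: String, intent: ProseIntent },
}

#[derive(Default)]
struct DocState {
    /// Stand-in for LoroText containers: these persist across focus changes.
    texts: HashMap<String, String>,
    /// Stand-in for prose_editors: exists only while a section is focused.
    active_editors: HashSet<String>,
    focused_section_id: Option<String>,
}

fn reduce(state: &mut DocState, intent: DocumentIntent) -> Result<(), String> {
    match intent {
        DocumentIntent::FocusSection { id } => {
            // Drop the previous editor, if any, then create one for the new section.
            if let Some(previous) = state.focused_section_id.take() {
                state.active_editors.remove(&previous);
            }
            state.active_editors.insert(id.clone());
            state.focused_section_id = Some(id);
            Ok(())
        }
        DocumentIntent::Blur => {
            if let Some(id) = state.focused_section_id.take() {
                state.active_editors.remove(&id);
            }
            Ok(())
        }
        DocumentIntent::SectionEdit { id, intent } => {
            // Assumption: edits are only accepted for the focused section.
            if state.focused_section_id.as_deref() != Some(id.as_str()) {
                return Err(format!("section {id} is not focused"));
            }
            match intent {
                ProseIntent::InsertText(text) => {
                    state.texts.entry(id).or_default().push_str(&text);
                }
            }
            Ok(())
        }
    }
}
```

After a Blur the editor entry is gone but the text remains, which mirrors the point that the Prose instance is disposable while the LoroText persists.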

pub struct DocumentSnapshot {
/// Flat list of all blocks across all sections.
pub blocks: Vec<ProseBlock>,
/// Section boundaries for rendering headers/dividers.
pub sections: Vec<SectionInfo>,
/// Focused section and cursor state.
pub focused_section: Option<FocusedSectionSnapshot>,
/// Chat state.
pub chat: ChatSnapshotState,
/// Export state.
pub export: ExportSnapshotState,
/// Document version.
pub version: u64,
}
pub struct SectionInfo {
pub id: String,
pub title: Option<String>,
pub block_range: (u32, u32),
pub collapsed: bool,
}
pub struct FocusedSectionSnapshot {
pub section_id: String,
pub cursor: CursorPosition,
pub selection: Option<(CursorPosition, CursorPosition)>,
}

Blocks are flat (Vec<ProseBlock>). SectionInfo provides boundaries so Dart can render section headers and fold controls. FocusedSectionSnapshot carries cursor state for the active section.
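One way to derive the flat block list and the section boundaries together is sketched below. The function name `flatten_sections` is hypothetical, blocks are plain strings for brevity, and treating `block_range` as a half-open `(start, end)` pair is an assumption, since the RFC does not state whether the end index is inclusive.

```rust
/// Reduced SectionInfo: only the fields needed to show the range math.
#[derive(Debug, Clone, PartialEq)]
struct SectionInfo {
    id: String,
    /// Assumed half-open: blocks[start..end] belong to this section.
    block_range: (u32, u32),
}

/// Concatenate per-section blocks into one flat list and record where
/// each section starts and ends within it.
fn flatten_sections(sections: &[(&str, Vec<String>)]) -> (Vec<String>, Vec<SectionInfo>) {
    let mut blocks = Vec::new();
    let mut infos = Vec::new();
    for (id, section_blocks) in sections {
        let start = blocks.len() as u32;
        blocks.extend(section_blocks.iter().cloned());
        let end = blocks.len() as u32;
        infos.push(SectionInfo {
            id: id.to_string(),
            block_range: (start, end),
        });
    }
    (blocks, infos)
}
```

With a half-open range, an empty section is simply `(n, n)`, so the renderer can still draw its header without special-casing.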

ProseBlockView → ProseBlock throughout. The “View” suffix doesn’t earn its keep: there’s no internal ProseBlock to collide with (BlockType and ParsedBlock are the internal types).

Consolidate persistence naming across the codebase:

| Old | New | Location |
| --- | --- | --- |
| Doc<S> | Ledger<S> | crates/session/src/doc.rs → ledger.rs |
| StoreBackend | Store | crates/core/src/storage.rs |
| NoopBackend | NoopStore | crates/core/src/storage.rs |
| LoroStore | FileStore | crates/platform/store/ |
| InMemoryLoroStore | InMemoryStore | crates/platform/store/ |
| LocalBackend | LocalStore | crates/platform/store/ |
| RemoteBackend | RemoteStore | crates/platform/remote/ |

Ledger<S> is the live CRDT wrapper (LoroDoc + UndoManager) — versioned, in-memory, per-session. Store is the persistence trait (load/save raw bytes to disk/server). Clean separation: Ledger holds state, Store persists it.
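The split between holding state and persisting it can be sketched as below. This is a toy model under stated assumptions: the `Store` method names and signatures are guesses (the RFC only says the trait loads and saves raw bytes), and `Ledger` here wraps a plain string with a version counter instead of a LoroDoc and UndoManager.

```rust
use std::collections::HashMap;

/// Persistence trait: moves raw bytes in and out of storage.
/// Method names and signatures are assumptions for this sketch.
trait Store {
    fn load(&self, key: &str) -> Option<Vec<u8>>;
    fn save(&mut self, key: &str, bytes: Vec<u8>);
}

/// In-memory implementation, useful for tests.
#[derive(Default)]
struct InMemoryStore {
    blobs: HashMap<String, Vec<u8>>,
}

impl Store for InMemoryStore {
    fn load(&self, key: &str) -> Option<Vec<u8>> {
        self.blobs.get(key).cloned()
    }
    fn save(&mut self, key: &str, bytes: Vec<u8>) {
        self.blobs.insert(key.to_string(), bytes);
    }
}

/// Toy stand-in for Ledger<S>: live, versioned, per-session state.
struct Ledger {
    text: String,
    version: u64,
}

impl Ledger {
    fn new() -> Self {
        Ledger { text: String::new(), version: 0 }
    }
    /// Every mutation bumps the in-memory version.
    fn append(&mut self, chunk: &str) {
        self.text.push_str(chunk);
        self.version += 1;
    }
    fn export_bytes(&self) -> Vec<u8> {
        self.text.clone().into_bytes()
    }
    /// A restored ledger starts a fresh session, so its version resets.
    fn from_bytes(bytes: &[u8]) -> Self {
        Ledger { text: String::from_utf8_lossy(bytes).into_owned(), version: 0 }
    }
}
```

The Store never sees anything but bytes, so swapping a local store for a remote one does not touch the Ledger.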

SessionState<M> field rename: pub doc: Doc<M::Synced> → pub ledger: Ledger<M::Synced>.

Persistence moves from reference/session/persistence to its own top-level reference section:

Reference/
  Core/ (Command, Reducer, Effects, Component, Modality, Slot)
  Session/ (Overview, ReduceIntent, Dispatch, Children, Snapshots)
  Persistence/ (Ledger, Store, LocalStore, RemoteStore) ← NEW
  AI/ (Traits, Orchestrator, Subagents, Bridge)
  Properties/ (Overview, Bindings, Validation, CEL)
  FRB/ (Patterns, Codegen, API Layer)

Document modality gets its own page under Modalities/Document/ with Overview, Intents, and Sections subpages.

AiIntent is split into two separate session-level concerns:

ChatIntent — persistent agent thread, multi-turn, tool-calling, open-ended:

pub enum SessionIntent<I> {
Modality(I),
Chat(ChatIntent), // agent
Generate(GenerateIntent), // orchestrator
Export(OutputFormat),
Undo,
Redo,
}
pub enum ChatIntent {
InitAgent { config: AgentConfig },
SendMessage { message: String },
Cancel,
Flush,
}
pub enum ChatFeedback {
StreamDelta { text: String },
ThinkingDelta { text: String },
ToolCallStarted { id: String, name: String, arguments: String },
ToolCallCompleted { id: String, name: String, result: String },
Complete { content: String },
Error { message: String },
}

GenerateIntent — deterministic orchestrator pipeline over placements:

pub enum GenerateIntent {
Start { prompt: Option<String> },
Cancel,
}
pub enum GenerateFeedback {
PlacementResolved { id: String },
Complete,
Error { message: String },
}

All modalities generate the same way (orchestrator runs the defined pipeline: research → fill placements). The modality hooks the completion via Modality::on_generate_complete() — Document uses this to materialize; other modalities default to no-op.
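The default-no-op hook pattern can be sketched as follows. This is a simplified model, not the RFC's trait: the hook takes `&mut self` instead of `state: &mut State<...>`, placements are reduced to ID strings, and `handle_generate_feedback` and `placement_ids` are hypothetical names.

```rust
/// Mirrors the RFC's GenerateFeedback variants.
#[allow(dead_code)]
enum GenerateFeedback {
    PlacementResolved { id: String },
    Complete,
    Error { message: String },
}

/// Reduced Modality trait: just enough to show the hook.
trait Modality {
    fn placement_ids(&self) -> Vec<String>;
    /// Default: nothing to do when generation completes.
    fn on_generate_complete(&mut self) {}
}

struct Whiteboard {
    element_ids: Vec<String>,
}

impl Modality for Whiteboard {
    fn placement_ids(&self) -> Vec<String> {
        self.element_ids.clone()
    }
    // No override: Whiteboard keeps the default no-op.
}

struct Document {
    placement_ids: Vec<String>,
}

impl Modality for Document {
    fn placement_ids(&self) -> Vec<String> {
        self.placement_ids.clone()
    }
    /// Materialize: collapse all placements into a single body placement.
    fn on_generate_complete(&mut self) {
        self.placement_ids = vec!["body".to_string()];
    }
}

/// Session-side handling: the hook fires only on Complete.
fn handle_generate_feedback<M: Modality>(modality: &mut M, feedback: GenerateFeedback) {
    if let GenerateFeedback::Complete = feedback {
        modality.on_generate_complete();
    }
}
```

The session code stays generic over `M`; only Document pays for materialization.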

Full rename table:

| Old | New |
| --- | --- |
| AiIntent | ChatIntent |
| AiFeedback | ChatFeedback |
| AiState | ChatState |
| AiMessage | ChatMessage |
| AiMessageRole | ChatMessageRole |
| AiToolCall | ChatToolCall |
| AiSnapshotState | ChatSnapshotState |
| SessionEphemeral.ai | SessionEphemeral.chat |
| SessionIntent::Ai | SessionIntent::Chat |
| wrap_ai / wrap_ai_feedback | wrap_chat / wrap_chat_feedback |
| reduce_ai_intent | reduce_chat_intent |
| handle_ai_feedback | handle_chat_feedback |
| AiIntent::Generate | GenerateIntent::Start (separate variant on SessionIntent) |

ReduceIntent trait gains wrap_generate / wrap_generate_feedback alongside the existing wrap_chat / wrap_chat_feedback. Generation gets its own deferred field (pending_generate: Option<GenerateIntent>) separate from chat state.

With ProseSession deleted, the prose crate is restructured into three layers (flat — no subdirectories):

Deletions:

  • session.rs: ProseSession, ProseSynced (replaced by Session<Document>)
  • undo.rs: UndoStaging, build_undo_manager (Session’s Ledger handles undo)

Split commands.rs into two modules:

  • types.rs — pure data types consumed by Document, Whiteboard, FRB, AI: ProseBlock (renamed from ProseBlockView), BlockType, StyledSegment, CursorPosition, InlineMark
  • intent.rs — reducer protocol, only needed by Prose dispatchers: ProseIntent, ProseFeedback, ProseError, ProseEffect, KeyInput, SlashMenuItem, DeleteDirection, CursorDirection

Addition to render.rs:

  • write_blocks_to_loro_text() — inverse of render_from_loro_text(). Writes Vec<ProseBlock> into a LoroText as markdown-prefix text with Loro marks. Used by Document materialization.
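The inverse relationship between writing and rendering can be illustrated on plain strings. This sketch is not the real `write_blocks_to_loro_text()`: it ignores Loro marks and styled segments, uses a three-variant `BlockType`, and the names `write_blocks` and `render_blocks` are hypothetical. The round trip only holds when block text is non-empty, contains no newlines, and paragraph text does not itself start with a markdown prefix.

```rust
/// Reduced block model: three block kinds, plain text only.
#[derive(Debug, Clone, PartialEq)]
enum BlockType {
    Paragraph,
    Heading(u8),
    ListItem,
}

#[derive(Debug, Clone, PartialEq)]
struct Block {
    block_type: BlockType,
    text: String,
}

/// Serialize blocks as markdown-prefixed lines, one line per block.
fn write_blocks(blocks: &[Block]) -> String {
    blocks
        .iter()
        .map(|block| {
            let prefix = match block.block_type {
                BlockType::Paragraph => String::new(),
                BlockType::Heading(level) => format!("{} ", "#".repeat(level as usize)),
                BlockType::ListItem => "- ".to_string(),
            };
            format!("{prefix}{}", block.text)
        })
        .collect::<Vec<_>>()
        .join("\n")
}

/// Parse markdown-prefixed lines back into blocks.
fn render_blocks(text: &str) -> Vec<Block> {
    text.lines()
        .map(|line| {
            let hashes = line.chars().take_while(|c| *c == '#').count();
            if hashes > 0 && hashes <= 6 && line[hashes..].starts_with(' ') {
                Block {
                    block_type: BlockType::Heading(hashes as u8),
                    text: line[hashes + 1..].to_string(),
                }
            } else if let Some(rest) = line.strip_prefix("- ") {
                Block { block_type: BlockType::ListItem, text: rest.to_string() }
            } else {
                Block { block_type: BlockType::Paragraph, text: line.to_string() }
            }
        })
        .collect()
}
```

A round-trip check of this kind (`render_blocks(&write_blocks(&blocks)) == blocks`) is a natural property test for the real pair of functions.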

Resulting structure:

prose/src/
├── types.rs # ProseBlock, BlockType, StyledSegment, CursorPosition (pure data)
├── intent.rs # ProseIntent, ProseFeedback, ProseError, ProseEffect (reducer protocol)
├── prose.rs # Prose Reducer + ProseState
├── reduce.rs # reduce_intent()
├── blocks.rs # ParsedBlock from markdown prefixes
├── markdown.rs # parse_blocks()
├── cursor.rs # cursor math
├── styles.rs # Loro marks ↔ styled blocks
├── clipboard.rs # copy/paste
├── table.rs # table operations
├── snapshot.rs # ProseSnapshot, FocusedProseSnapshot
└── render.rs # render_from_loro_text, write_blocks_to_loro_text, markdown_to_blocks

Three layers, one crate:

  • Types (types.rs) — consumed by Document, Whiteboard, FRB, AI
  • Rendering (render.rs, blocks.rs, markdown.rs, styles.rs) — parsing and serialization
  • Editing (prose.rs, reduce.rs, cursor.rs, clipboard.rs, table.rs) — the Reducer

ProseSession is deleted entirely. All standalone prose editing uses Session<Document>:

| ProseSession capability | Session<Document> equivalent |
| --- | --- |
| Own LoroDoc | Ledger<DocumentSynced> |
| UndoManager + staging | Ledger undo/redo + the Session Undo/Redo RFC |
| dispatch(Command<ProseIntent>) | dispatch(Command<SessionIntent<DocumentIntent>>) |
| Snapshot sinks | Session sink machinery |
| Effect sinks | Session effect/clipboard machinery |
| export_bytes() | M::export_bytes() on Modality |
| from_bytes(bytes) | Session::from_ledger(Ledger::from_bytes(bytes)) |

The FRB prose API (open_prose_session, dispatch_prose, etc.) is replaced with open_document_session, dispatch_document.

Embedded Prose RFC: Document uses the same pattern, a prose_editors: HashMap<String, Prose> map with focused_section_id. Whiteboard embeds Prose for text elements; Document embeds Prose for sections. The Prose reducer is shared.

Session Undo/Redo RFC: Session-level undo covers all section edits automatically. Section edits mutate LoroText in the Document’s LoroDoc, and the Ledger’s UndoManager tracks them. Chat streaming wraps its writes in undo groups (begin_undo_group on generate start, end_undo_group on done), so the entire generation is one undo step.
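How an undo group collapses many streamed inserts into one step can be shown with a toy model. `UndoLedger` is a hypothetical append-only stand-in, not the Ledger or Loro's UndoManager; only the `begin_undo_group` / `end_undo_group` names come from the text above.

```rust
/// Toy append-only text with an undo stack of earlier lengths.
#[derive(Default)]
struct UndoLedger {
    text: String,
    /// Each entry is the text length to restore when that step is undone.
    undo_stack: Vec<usize>,
    /// Length at the moment the current group was opened, if one is open.
    group_start: Option<usize>,
}

impl UndoLedger {
    fn insert(&mut self, chunk: &str) {
        // Outside a group every insert is its own undo step.
        if self.group_start.is_none() {
            self.undo_stack.push(self.text.len());
        }
        self.text.push_str(chunk);
    }

    fn begin_undo_group(&mut self) {
        if self.group_start.is_none() {
            self.group_start = Some(self.text.len());
        }
    }

    fn end_undo_group(&mut self) {
        if let Some(start) = self.group_start.take() {
            // Record one step for the whole group, unless nothing changed.
            if self.text.len() != start {
                self.undo_stack.push(start);
            }
        }
    }

    /// Undo the latest step; returns false when there is nothing to undo.
    fn undo(&mut self) -> bool {
        match self.undo_stack.pop() {
            Some(len) => {
                self.text.truncate(len);
                true
            }
            None => false,
        }
    }
}
```

However many flushes happen while the group is open, a single undo removes all of the streamed text.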

Session Clipboard RFC: Document implements copy_selection/paste_rich/paste_text on Modality. When a section is focused, clipboard delegates to Prose-level operations. When no section is focused (or a document-level selection exists), clipboard operates on section structure.

  1. Persistence renames: Doc<S> → Ledger<S>, StoreBackend → Store, LocalBackend → LocalStore, RemoteBackend → RemoteStore, LoroStore → FileStore, NoopBackend → NoopStore. Update all references. Verify: cargo check --workspace

  2. Chat/Generate split: split AiIntent → ChatIntent + GenerateIntent. Rename all Ai-prefixed types to Chat. Add SessionIntent::Generate variant. Add Modality::on_generate_complete() hook (default no-op). Update ReduceIntent trait. Verify: cargo check --workspace

  3. ProseBlock rename + prose crate restructuring: ProseBlockView → ProseBlock. Split commands.rs → types.rs + intent.rs. Delete session.rs (ProseSession, ProseSynced) and undo.rs (UndoStaging). Add write_blocks_to_loro_text() to render.rs. Update lib.rs re-exports. Verify: cargo check --workspace

  4. PropertyType::Prose: add Prose variant to PropertyType and PropertyValue. Verify: cargo check --workspace

  5. DocumentComponent trait: DocumentComponent subtrait, DocumentComponentMeta with prose resolution map, DocumentComponentEntry + resolve_document_component(). Implement ProseComponent and FigureComponent. Register via inventory::submit!. Verify: cargo test --package modality_document

  6. Document modality struct: DocumentSynced, DocumentEphemeral, Document struct implementing Reducer, Component, Modality. compile_child() resolves PropertyValue::Prose → metadata. on_generate_complete() materializes placements. Verify: cargo check --workspace

  7. DocumentIntent + reduce: FocusSection, Blur, SectionEdit, AddSection, RemoveSection, MoveSection, ToggleFold. Route SectionEdit to Prose. Verify: cargo test --package modality_document

  8. DocumentSnapshot: flat blocks + section info + focused section state. Wire Modality::snapshot(). Verify: cargo test --package modality_document

  9. Whiteboard TextComponent migration: add PropertyType::Prose content property to TextComponent. Remove empty-blocks workaround in compile(). Delete TextBoxComponent, merge relevant properties (background, border, padding) into TextComponent. Remove compile_child_hash LoroText override. Verify: cargo test --package modality_whiteboard

  10. Delete ProseSession consumers: replace FRB prose API (open_prose_session, dispatch_prose) with document session API (open_document_session, dispatch_document). Verify: cargo check --workspace

  11. FRB + Dart: regenerate bindings, update example app to use Session<Document>. Verify: flutter analyze

  12. Docs: new Persistence reference section (Ledger, Store, LocalStore, RemoteStore). New Document modality page. Move persistence out of Session reference. Verify: cd docs && npx astro build

ProseSession as-is — keep the bespoke wrapper, don’t create a Document modality. Rejected because it duplicates Session lifecycle, blocks AI/export integration, and can’t support structured template documents.

Document as degenerate Modality — empty placements, no child components, compile directly from LoroText. Rejected because it misses the opportunity for template documents with structured sections. DocumentComponent placements follow the same pattern as WhiteboardComponent elements and LessonPlan slides.

Multiple modalities per document type: UnitPlanDocument, ActivityIdeasDocument, etc. as separate Modality structs. Rejected because they share identical editing behavior. Document types are just templates, that is, different section layouts over the same Document modality.

Explicit mode enum for two-phase lifecycle: DocumentMode::Compiled / DocumentMode::Live on synced state. Rejected because the transition is a natural consequence of generation feedback: placements change because synced state changed. No mode flag needed.

Sub-query streaming — AI tool spawns a separate AI call to generate content. Rejected because the sub-query loses the parent agent’s conversation context, introduces latency, and risks knowledge drift between parent and child.

Full content per AI tool call, no streaming — AI generates entire section content in one tool call, inserted all at once. Rejected because there’s no streaming UX — user sees nothing until the tool call completes.

AI dispatches ProseIntent::InsertText directly — AI calls InsertText for each chunk. Rejected because the AI shouldn’t know about ProseIntent internals. The AI generates markdown; our code handles LoroText insertion.

Begin/end protocol — Agent calls begin_prose(target), generates markdown as plain text, calls end_prose(). Rejected because it requires the agent to follow a multi-step protocol and plain text between begin/end isn’t validated as a tool argument.

Rename Doc<S> to Store<S> — rejected because Store collides semantically with the persistence layer (StoreBackend, LoroStore, LocalBackend). Ledger<S> is unique and evokes versioned history (undo, CRDT sync).

Keep persistence naming as-is: Doc<S> + StoreBackend + LocalBackend. Rejected because Doc<S> collides with Document, and the naming scheme lacks coherence. The consolidated scheme (Ledger + Store + LocalStore/RemoteStore) is clearer.

Structured output (Vec<DocumentSection>) — compile returns sections with titles and block lists instead of flat Vec<ProseBlock>. Deferred — flat blocks with SectionInfo boundaries achieve the same rendering capability without a new wrapper type. Can revisit if Dart needs richer section semantics.

Document types are templates, not modalities

Unit plans, activity ideas, ILPs, vocab lists, and other document types are templates over the Document modality — they define which sections exist, what properties each section has, and what AI generation hints to use. The Document struct is the same regardless of type. Template selection happens at creation time and is stored as metadata on the resource.

A document with no template has a single section (the body). This is the degenerate case — one placement, one ProseComponent, full-document editing. The section structure is invisible to the user; it just works like a text editor.
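The relationship between templates and the plain single-section case can be sketched as follows. This is a minimal illustration, not the RFC's actual types: `SectionSpec`, `DocumentTemplate`, and `initial_sections` are hypothetical names, and the `"body"` key is assumed from the materialization description elsewhere in this RFC.

```rust
/// Hypothetical description of one section in a template.
#[derive(Debug, Clone, PartialEq)]
pub struct SectionSpec {
    pub key: String,
    pub title: String,
    /// Whether the section CAN fold (a template property).
    pub collapsible: bool,
    /// Optional hint passed to AI generation for this section.
    pub ai_hint: Option<String>,
}

/// Hypothetical template: a named list of sections.
#[derive(Debug, Clone, PartialEq)]
pub struct DocumentTemplate {
    pub name: String,
    pub sections: Vec<SectionSpec>,
}

/// Resolve the sections a new document starts with.
/// With no template (or an empty one) the document degenerates to a
/// single untitled "body" section, which behaves like a plain text editor.
pub fn initial_sections(template: Option<&DocumentTemplate>) -> Vec<SectionSpec> {
    match template {
        Some(t) if !t.sections.is_empty() => t.sections.clone(),
        _ => vec![SectionSpec {
            key: "body".to_string(),
            title: String::new(),
            collapsible: false,
            ai_hint: None,
        }],
    }
}
```

Under this sketch, a unit plan and a vocab list differ only in the `DocumentTemplate` value passed in; the code path that builds the document is the same.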

Section fold state (collapsed_sections: HashSet<String>) lives on DocumentEphemeral. It’s per-user view state, not synced via CRDT. Collapsible behavior is a template property on the section — whether a section CAN fold. Whether it IS folded is ephemeral.
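A small sketch of the split between "can fold" and "is folded". Only the `collapsed_sections: HashSet<String>` field on `DocumentEphemeral` comes from the text above; the `toggle_fold` method and its `collapsible` parameter are assumptions made for illustration.

```rust
use std::collections::HashSet;

/// Per-user view state. Not synced via CRDT.
#[derive(Debug, Default)]
pub struct DocumentEphemeral {
    pub collapsed_sections: HashSet<String>,
}

impl DocumentEphemeral {
    /// Toggle the fold state of a section and return whether it is now
    /// collapsed. `collapsible` is the template property for the section;
    /// a section that cannot fold is never recorded as collapsed.
    pub fn toggle_fold(&mut self, section_key: &str, collapsible: bool) -> bool {
        if !collapsible {
            self.collapsed_sections.remove(section_key);
            return false;
        }
        if self.collapsed_sections.remove(section_key) {
            // It was collapsed, so it is now expanded.
            false
        } else {
            self.collapsed_sections.insert(section_key.to_string());
            true
        }
    }
}
```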

Ledger<S> was chosen over Store<S> (collides with persistence layer), Repo<S> (git metaphor less intuitive), SyncDoc<S> (still has “Doc”), and CrdtDoc<S> (verbose). It’s unique, evokes versioned history, and has zero naming collisions in the codebase.

Chat streaming uses rig’s StreamingPromptHook

Chat streams prose content via a write_prose tool. The tool’s content argument is streamed token-by-token from the API. Rig’s StreamingPromptHook::on_tool_call_delta intercepts partial JSON chunks, extracts markdown deltas, and inserts them into the target LoroText. This uses the same agent context (no sub-query), works cross-provider (Anthropic input_json_delta + OpenAI partial arguments), and generalizes to any tool that writes prose (Document sections, Whiteboard text elements). The write_prose tool currently logs ToolCallDelta events in runner.rs — this RFC wires them to LoroText insertion.
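The delta-extraction step can be sketched independently of rig. The code below is an assumption-laden illustration, not the hook implementation: `ContentDeltaExtractor` is a hypothetical helper that a hook could feed with each partial-arguments chunk, the argument name `content` is taken from the text above, and the returned string stands in for what would be inserted into the target LoroText. It re-decodes the buffer on every chunk for simplicity, and it replaces UTF-16 surrogate escapes with U+FFFD rather than pairing them.

```rust
/// Incrementally extracts the decoded `content` string from partial
/// tool-call JSON such as `{"target":"body","content":"# Title\n..."}`.
#[derive(Debug, Default)]
pub struct ContentDeltaExtractor {
    buffer: String, // raw JSON received so far
    emitted: usize, // bytes of decoded content already returned
}

impl ContentDeltaExtractor {
    pub fn new() -> Self {
        Self::default()
    }

    /// Feed one raw chunk; returns the newly decoded markdown (may be empty).
    pub fn push(&mut self, chunk: &str) -> String {
        self.buffer.push_str(chunk);
        let decoded = match find_value_start(&self.buffer, "content") {
            Some(start) => decode_prefix(&self.buffer[start..]),
            None => String::new(),
        };
        let delta = decoded[self.emitted..].to_string();
        self.emitted = decoded.len();
        delta
    }
}

/// Byte index just past the opening quote of the string value for `key`,
/// or None if that point has not been reached yet.
/// Assumption: no earlier string value contains `"key":` in escaped form.
fn find_value_start(json: &str, key: &str) -> Option<usize> {
    let needle = format!("\"{key}\"");
    let mut from = 0;
    while let Some(pos) = json[from..].find(&needle) {
        let after_key = from + pos + needle.len();
        let rest = json[after_key..].trim_start();
        if let Some(after_colon) = rest.strip_prefix(':') {
            let value = after_colon.trim_start();
            if value.starts_with('"') {
                return Some(json.len() - value.len() + 1);
            }
            if value.is_empty() {
                return None; // value not streamed yet
            }
        } else if rest.is_empty() {
            return None; // colon not streamed yet
        }
        from = after_key; // this match was a value, not the key
    }
    None
}

/// Decode a JSON string body up to its closing quote or the end of input,
/// holding back any escape sequence that is still incomplete.
fn decode_prefix(s: &str) -> String {
    let mut out = String::new();
    let mut chars = s.chars();
    while let Some(c) = chars.next() {
        match c {
            '"' => break,
            '\\' => {
                let Some(esc) = chars.next() else { break };
                match esc {
                    'n' => out.push('\n'),
                    't' => out.push('\t'),
                    'r' => out.push('\r'),
                    'b' => out.push('\u{0008}'),
                    'f' => out.push('\u{000C}'),
                    'u' => {
                        let hex: String = chars.by_ref().take(4).collect();
                        if hex.chars().count() < 4 {
                            break;
                        }
                        let code = u32::from_str_radix(&hex, 16).unwrap_or(0xFFFD);
                        out.push(char::from_u32(code).unwrap_or('\u{FFFD}'));
                    }
                    other => out.push(other), // covers \" \\ and \/
                }
            }
            other => out.push(other),
        }
    }
    out
}
```

Because incomplete escapes are held back, each call returns only text that will not change when later chunks arrive, which is the property an append-only insert into a text container needs.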

After GenerateFeedback::Complete, Session calls M::on_generate_complete(). Document’s override collects the compiled Vec<ProseBlock> from ephemeral output, writes them into a single “body” LoroText via write_blocks_to_loro_text() (inverse of render_from_loro_text()), and replaces all placements with a single ProseComponent body placement (PropertyValue::Prose("body")). The LoroText persists structured components (figure → Image block, checklist → TaskListItem blocks) as their markdown-prefix equivalents. Recompile naturally picks up the single placement, and the document is now a live Prose editor.
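The placement swap performed at materialization can be sketched with simplified stand-ins. Everything here is illustrative: `DocumentState`, `Placement`, and `materialize` are hypothetical, a `HashMap<String, String>` stands in for LoroText containers, the input is assumed to be blocks already encoded as markdown lines, and the property key `"content"` is an invented name.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
pub enum PropertyValue {
    Text(String),
    /// Container ID of a prose text container.
    Prose(String),
}

#[derive(Debug, Clone, PartialEq)]
pub struct Placement {
    pub component: String,
    pub properties: HashMap<String, PropertyValue>,
}

#[derive(Debug, Default)]
pub struct DocumentState {
    pub placements: Vec<Placement>,
    /// Stand-in for LoroText containers, keyed by container ID.
    pub texts: HashMap<String, String>,
}

/// Sketch of what Document's on_generate_complete() override does:
/// write the compiled output into one "body" container, then replace
/// every template placement with a single prose placement.
pub fn materialize(state: &mut DocumentState, compiled_lines: &[String]) {
    state
        .texts
        .insert("body".to_string(), compiled_lines.join("\n"));

    let mut properties = HashMap::new();
    properties.insert(
        "content".to_string(), // hypothetical property key
        PropertyValue::Prose("body".to_string()),
    );
    state.placements = vec![Placement {
        component: "ProseComponent".to_string(),
        properties,
    }];
}
```

After this runs, a recompile sees one placement backed by one text container, which matches the "live Prose editor" state described above without any explicit mode flag.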

PropertyValue::Prose resolves through metadata

PropertyValue::Prose(container_id) is a clean container ID reference — storable, serializable, no LoroDoc dependency. At compile time, the Modality’s compile_child() override resolves it: reads the LoroText, builds Vec<ProseBlock>, and places the blocks in DocumentComponentMeta.prose (keyed by property key). The component’s compile() reads resolved blocks from metadata — never touches LoroDoc. Same pattern applies to Whiteboard’s TextComponent migration: Whiteboard’s compile_child() resolves Prose properties into the child metadata.
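A minimal sketch of the resolve-then-compile split, under stated assumptions: the `containers` map stands in for the LoroDoc, `ProseBlock` is reduced to one text field, one line becomes one block, and `resolve_prose` / `compile_component` are hypothetical names for the parent-side and component-side steps. Only `PropertyValue::Prose` and the `prose` map on `DocumentComponentMeta` come from the text above.

```rust
use std::collections::HashMap;

pub enum PropertyValue {
    /// Container ID reference; holds no document handle.
    Prose(String),
    Number(f64),
}

#[derive(Debug, Clone, PartialEq)]
pub struct ProseBlock {
    pub text: String,
}

#[derive(Default)]
pub struct DocumentComponentMeta {
    /// Resolved blocks, keyed by property key.
    pub prose: HashMap<String, Vec<ProseBlock>>,
}

/// Parent side (the role compile_child() plays): look up each Prose
/// property's container and place the resolved blocks in the metadata.
pub fn resolve_prose(
    properties: &HashMap<String, PropertyValue>,
    containers: &HashMap<String, String>,
) -> DocumentComponentMeta {
    let mut meta = DocumentComponentMeta::default();
    for (key, value) in properties {
        if let PropertyValue::Prose(container_id) = value {
            let text = containers
                .get(container_id)
                .map(String::as_str)
                .unwrap_or("");
            let blocks: Vec<ProseBlock> = text
                .lines()
                .map(|line| ProseBlock { text: line.to_string() })
                .collect();
            meta.prose.insert(key.clone(), blocks);
        }
    }
    meta
}

/// Component side: compile reads only the metadata, never the containers.
pub fn compile_component(meta: &DocumentComponentMeta, key: &str) -> Vec<String> {
    meta.prose
        .get(key)
        .map(|blocks| blocks.iter().map(|b| b.text.clone()).collect())
        .unwrap_or_default()
}
```

The point of the split is visible in the signatures: `compile_component` takes no container map, so a component cannot depend on the document store even by accident.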

AiIntent is split into ChatIntent (persistent agent, multi-turn, open-ended) and GenerateIntent (deterministic orchestrator over placements). All Ai-prefixed types rename to Chat. Generation gets its own SessionIntent::Generate variant, its own feedback type (GenerateFeedback), and its own deferred field. The orchestrator is session-level — all modalities generate the same way (research → fill placements). Modality-specific post-processing happens via the on_generate_complete() hook.
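The shape of the split might look roughly like this. `GenerateIntent::Start`, `GenerateFeedback::Complete`, `SessionIntent::Generate`, and `on_generate_complete()` are named in this RFC; every other variant, field, and function below (`Send`, `Cancel`, `Progress`, `SessionIntent::Chat`, `handler_for`, `apply_generate_feedback`, the `materialized` flag) is a placeholder assumed for the sketch.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum ChatIntent {
    Send { message: String },
    Cancel,
}

#[derive(Debug, Clone, PartialEq)]
pub enum GenerateIntent {
    Start,
    Cancel,
}

#[derive(Debug, Clone, PartialEq)]
pub enum GenerateFeedback {
    Progress { filled: usize, total: usize },
    Complete,
}

#[derive(Debug, Clone, PartialEq)]
pub enum SessionIntent {
    Chat(ChatIntent),
    Generate(GenerateIntent),
}

/// Which subsystem handles an intent: the multi-turn agent or the
/// deterministic orchestrator.
pub fn handler_for(intent: &SessionIntent) -> &'static str {
    match intent {
        SessionIntent::Chat(_) => "agent",
        SessionIntent::Generate(_) => "orchestrator",
    }
}

pub trait Modality {
    /// Modality-specific post-processing. Default: nothing to do.
    fn on_generate_complete(&mut self) {}
}

/// A modality that keeps the default hook.
pub struct Whiteboard;
impl Modality for Whiteboard {}

/// Document overrides the hook to materialize (reduced to a flag here).
#[derive(Default)]
pub struct Document {
    pub materialized: bool,
}

impl Modality for Document {
    fn on_generate_complete(&mut self) {
        self.materialized = true;
    }
}

/// Session-level handling: the hook fires only on Complete.
pub fn apply_generate_feedback<M: Modality>(modality: &mut M, feedback: &GenerateFeedback) {
    if *feedback == GenerateFeedback::Complete {
        modality.on_generate_complete();
    }
}
```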

With ProseSession deleted, the prose crate is restructured into three layers in flat src/ (no subdirectories — 12 files don’t warrant folders). commands.rs splits into types.rs (pure data: ProseBlock, BlockType, StyledSegment) and intent.rs (reducer protocol: ProseIntent, ProseFeedback). write_blocks_to_loro_text() is added to render.rs as the inverse of render_from_loro_text(). The three layers: Types (consumed everywhere), Rendering (parsing/serialization), Editing (the Prose Reducer).

DocumentComponent is a trait — there’s no struct to attach shared properties to. Each impl declares its own properties via properties(). If we ever want shared placement-level metadata (title, collapsible, AI hint), that lives on ComponentPlacement, not on the component trait. Not a concern for this RFC.

ProseBlock → LoroText round-trip is sufficient

ProseBlocks are designed to be text-representable. All current block types (Paragraph, Heading, ListItem, CodeBlock, Table, Image) have clear markdown-prefix encodings. Color/background marks have no markdown representation and would be lost during materialization — this is acceptable because template components produce unstyled blocks, and colors are an editing concern post-materialization. Implementation detail, not a design question.
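The round-trip and the loss of color marks can be illustrated with a reduced model. This is a sketch under assumptions, not the crate's code: it covers only three block types, uses a plain `String` in place of LoroText, models color as a single optional field, and the functions `write_blocks_to_text` / `render_from_text` are stand-ins for `write_blocks_to_loro_text()` / `render_from_loro_text()`. It also assumes one line per block and does not escape paragraph text that happens to start with a prefix.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum BlockType {
    Paragraph,
    Heading(u8),
    ListItem,
}

#[derive(Debug, Clone, PartialEq)]
pub struct ProseBlock {
    pub block_type: BlockType,
    pub text: String,
    /// Styling with no markdown representation.
    pub color: Option<String>,
}

/// Encode blocks as markdown-prefixed lines. Color is dropped.
pub fn write_blocks_to_text(blocks: &[ProseBlock]) -> String {
    blocks
        .iter()
        .map(|b| match &b.block_type {
            BlockType::Paragraph => b.text.clone(),
            BlockType::Heading(level) => {
                format!("{} {}", "#".repeat(*level as usize), b.text)
            }
            BlockType::ListItem => format!("- {}", b.text),
        })
        .collect::<Vec<_>>()
        .join("\n")
}

/// Parse markdown-prefixed lines back into blocks (inverse of the above).
pub fn render_from_text(text: &str) -> Vec<ProseBlock> {
    text.lines()
        .map(|line| {
            let hashes = line.chars().take_while(|c| *c == '#').count();
            if (1..=6).contains(&hashes) {
                if let Some(rest) = line[hashes..].strip_prefix(' ') {
                    return ProseBlock {
                        block_type: BlockType::Heading(hashes as u8),
                        text: rest.to_string(),
                        color: None,
                    };
                }
            }
            if let Some(rest) = line.strip_prefix("- ") {
                return ProseBlock {
                    block_type: BlockType::ListItem,
                    text: rest.to_string(),
                    color: None,
                };
            }
            ProseBlock {
                block_type: BlockType::Paragraph,
                text: line.to_string(),
                color: None,
            }
        })
        .collect()
}
```

In this model, unstyled blocks survive the round-trip unchanged, while a block that carried a color comes back with the same type and text but `color: None`.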

The exact orchestrator pipeline (research → fill per-placement, parallelism, error handling, progress reporting, how GenerateIntent::Start dispatches) is spun out to its own RFC. This RFC defines the GenerateIntent / GenerateFeedback types and the on_generate_complete() hook — the orchestrator RFC fills in the pipeline.

None — all design questions resolved. Implementation details (exact write_blocks_to_loro_text handling per block type, generation orchestrator pipeline) are deferred to implementation or a follow-up RFC.