This commit is contained in:
Gedion 2012-09-06 16:32:41 -05:00
parent dea020865f
commit 282bcffb62
180 changed files with 37685 additions and 0 deletions

doc/all.md

@ -0,0 +1,3 @@
@include documentation
@include api/api
@include database

doc/api/api.md

@ -0,0 +1,7 @@
# API
@include embed_parameters
@include http_api
@include hooks_client-side
@include hooks_server-side
@include editorInfo
@include changeset_library


@ -0,0 +1,151 @@
# Changeset Library
```
"Z:z>1|2=m=b*0|1+1$\n"
```
This is a changeset. It's just a string, and it's very difficult to read in this form, but the Changeset Library gives us some tools to read it.
A changeset describes the diff between two revisions of the document. The browser sends changesets to the server, and the server sends them to the other clients to update them. These changesets are also saved in the history of a pad, which allows us to go back to any past revision.
## Changeset.unpack(changeset)
* `changeset` {String}
This function returns an object representation of the changeset, similar to this:
```
{ oldLen: 35, newLen: 36, ops: '|2=m=b*0|1+1', charBank: '\n' }
```
* `oldLen` {Number} the original length of the document.
* `newLen` {Number} the length of the document after the changeset is applied.
* `ops` {String} the actual changes, introduced by this changeset.
* `charBank` {String} All characters that are added by this changeset.
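To make the header fields concrete, here is a minimal sketch of how the example string could be split into these four parts. It assumes the layout visible in the example (a `Z:` prefix, the old length in base 36, `>` or `<` followed by the length change in base 36, then the ops, then `$` and the char bank). `unpackSketch` is a hypothetical name used only for illustration; it is not the library's implementation.

```javascript
// Hypothetical stand-in for Changeset.unpack(), for illustration only.
function unpackSketch(changeset) {
  // Header: "Z:" + old length (base 36) + ">" or "<" + length change (base 36).
  const header = /^Z:([0-9a-z]+)([><])([0-9a-z]+)/.exec(changeset);
  if (!header) throw new Error('Not a changeset: ' + changeset);

  const oldLen = parseInt(header[1], 36);
  const delta = parseInt(header[3], 36);
  const newLen = header[2] === '>' ? oldLen + delta : oldLen - delta;

  // Everything between the header and the first "$" is the ops string;
  // everything after the "$" is the char bank.
  let opsEnd = changeset.indexOf('$');
  if (opsEnd < 0) opsEnd = changeset.length;

  return {
    oldLen: oldLen,
    newLen: newLen,
    ops: changeset.substring(header[0].length, opsEnd),
    charBank: changeset.substring(opsEnd + 1),
  };
}

console.log(unpackSketch('Z:z>1|2=m=b*0|1+1$\n'));
```

For the example changeset, `z` is 35 in base 36 and `>1` means the document grows by one character, which gives the `oldLen` and `newLen` values shown above.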
## Changeset.opIterator(ops)
* `ops` {String} The operators, returned by `Changeset.unpack()`
Returns an operator iterator. This iterator allows us to iterate over all operators that are in the changeset.
You can iterate with an opIterator using its `next()` and `hasNext()` methods. `next()` returns the next operator object, and `hasNext()` indicates whether there are any operators left.
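The iteration pattern can be sketched without the library. The stand-in below assumes each operator is written as optional `*N` attribute references, an optional `|N` line count, an opcode, and a base-36 character count, which is what the examples in this document suggest. `opIteratorSketch` is a hypothetical helper, not the real `Changeset.opIterator`.

```javascript
// Hypothetical stand-in for Changeset.opIterator(), for illustration only.
function opIteratorSketch(ops) {
  const opRegex = /((?:\*[0-9a-z]+)*)(?:\|([0-9a-z]+))?([-+=])([0-9a-z]+)/g;
  let upcoming = opRegex.exec(ops);

  return {
    // True while at least one more operator can be read.
    hasNext() {
      return upcoming !== null;
    },
    // Returns the next operator object and advances the iterator.
    next() {
      if (upcoming === null) throw new Error('no operators left');
      const match = upcoming;
      upcoming = opRegex.exec(ops);
      return {
        opcode: match[3],
        chars: parseInt(match[4], 36),
        lines: match[2] ? parseInt(match[2], 36) : 0,
        attribs: match[1],
      };
    },
  };
}

const iterator = opIteratorSketch('|2=m=b*0|1+1');
while (iterator.hasNext()) {
  console.log(iterator.next());
}
```

With the real library the loop has the same shape: call `hasNext()` as the loop condition and `next()` in the body.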
## The Operator object
There are 3 types of operators: `+`, `-` and `=`. These operators describe different changes to the document, beginning with the first character of the document. A `=` operator doesn't change the text, but it may add or remove text attributes. A `-` operator removes text. A `+` operator adds text and optionally adds some attributes to it.
* `opcode` {String} the operator type
* `chars` {Number} the length of the text changed by this operator.
* `lines` {Number} the number of lines changed by this operator.
* `attribs` {attribs} attributes set on this text.
### Example
```
{ opcode: '+',
chars: 1,
lines: 1,
attribs: '*0' }
```
## APool
```
> var AttributePoolFactory = require("./utils/AttributePoolFactory");
> var apool = AttributePoolFactory.createAttributePool();
> console.log(apool)
{ numToAttrib: {},
attribToNum: {},
nextNum: 0,
putAttrib: [Function],
getAttrib: [Function],
getAttribKey: [Function],
getAttribValue: [Function],
eachAttrib: [Function],
toJsonable: [Function],
fromJsonable: [Function] }
```
This creates an empty apool. An apool saves which attributes were used during the history of a pad. There is one apool for each pad. It only saves the attributes that were actually used; it doesn't save unused attributes. Let's fill this apool with some values:
```
> apool.fromJsonable({"numToAttrib":{"0":["author","a.kVnWeomPADAT2pn9"],"1":["bold","true"],"2":["italic","true"]},"nextNum":3});
> console.log(apool)
{ numToAttrib:
{ '0': [ 'author', 'a.kVnWeomPADAT2pn9' ],
'1': [ 'bold', 'true' ],
'2': [ 'italic', 'true' ] },
attribToNum:
{ 'author,a.kVnWeomPADAT2pn9': 0,
'bold,true': 1,
'italic,true': 2 },
nextNum: 3,
putAttrib: [Function],
getAttrib: [Function],
getAttribKey: [Function],
getAttribValue: [Function],
eachAttrib: [Function],
toJsonable: [Function],
fromJsonable: [Function] }
```
We used the `fromJsonable` function to fill the empty apool with values. The `toJsonable` and `fromJsonable` functions are used to serialize and deserialize an apool. You can see that it stores the relation between numbers and attributes, so for example attribute 1 is the attribute bold and vice versa. An attribute is always a key-value pair. For things like bold and italic it's just `'italic':'true'`. For authors it's `author:$AUTHORID`. So a character can be bold and italic, but it can't belong to multiple authors.
```
> apool.getAttrib(1)
[ 'bold', 'true' ]
```
A simple example of how to get the key-value pair for attribute 1.
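The two-way mapping can be illustrated with a small stand-in. `createMiniPool` below is a hypothetical helper that only mimics the `numToAttrib`, `attribToNum` and `nextNum` fields printed above; the behavior of its `putAttrib` (reusing the number of an already known pair) is an assumption, not taken from the real AttributePoolFactory code.

```javascript
// Hypothetical miniature attribute pool, for illustration only.
function createMiniPool() {
  const pool = { numToAttrib: {}, attribToNum: {}, nextNum: 0 };

  // Registers a [key, value] pair and returns its number.
  // A pair that is already known keeps its existing number.
  pool.putAttrib = function (attrib) {
    const lookupKey = String(attrib); // ['bold', 'true'] -> 'bold,true'
    if (Object.prototype.hasOwnProperty.call(pool.attribToNum, lookupKey)) {
      return pool.attribToNum[lookupKey];
    }
    const num = pool.nextNum++;
    pool.attribToNum[lookupKey] = num;
    pool.numToAttrib[num] = [String(attrib[0]), String(attrib[1])];
    return num;
  };

  // Returns a copy of the [key, value] pair stored under a number.
  pool.getAttrib = function (num) {
    const pair = pool.numToAttrib[num];
    return pair ? [pair[0], pair[1]] : undefined;
  };

  return pool;
}

const miniPool = createMiniPool();
miniPool.putAttrib(['author', 'a.kVnWeomPADAT2pn9']);
miniPool.putAttrib(['bold', 'true']);
console.log(miniPool.getAttrib(1));
```

Because only pairs that were actually put into the pool get a number, the pool stays as small as the set of attributes used in the pad.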
## AText
```
> var atext = {"text":"bold text\nitalic text\nnormal text\n\n","attribs":"*0*1+9*0|1+1*0*1*2+b|1+1*0+b|2+2"};
> console.log(atext)
{ text: 'bold text\nitalic text\nnormal text\n\n',
attribs: '*0*1+9*0|1+1*0*1*2+b|1+1*0+b|2+2' }
```
This is an atext. An atext has two parts: text and attribs. The text is just the text of the pad as a string. We will look closer at the attribs in the next steps.
```
> var opiterator = Changeset.opIterator(atext.attribs)
> console.log(opiterator)
{ next: [Function: next],
hasNext: [Function: hasNext],
lastIndex: [Function: lastIndex] }
> opiterator.next()
{ opcode: '+',
chars: 9,
lines: 0,
attribs: '*0*1' }
> opiterator.next()
{ opcode: '+',
chars: 1,
lines: 1,
attribs: '*0' }
> opiterator.next()
{ opcode: '+',
chars: 11,
lines: 0,
attribs: '*0*1*2' }
> opiterator.next()
{ opcode: '+',
chars: 1,
lines: 1,
attribs: '' }
> opiterator.next()
{ opcode: '+',
chars: 11,
lines: 0,
attribs: '*0' }
> opiterator.next()
{ opcode: '+',
chars: 2,
lines: 2,
attribs: '' }
```
The attribs are again a sequence of operators, like `.ops` in the changeset was. But these operators are only `+` operators. They describe which part of the text has which attributes.
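Putting the pieces together, the sketch below walks the `+` operators of the example atext and pairs each run of text with the attribute keys from the example pool. `describeAText` and the plain `numToAttrib` object are hypothetical helpers for illustration; the parsing assumes the operator layout seen in the output above.

```javascript
// Example data copied from this document.
const exampleNumToAttrib = {
  0: ['author', 'a.kVnWeomPADAT2pn9'],
  1: ['bold', 'true'],
  2: ['italic', 'true'],
};
const exampleAText = {
  text: 'bold text\nitalic text\nnormal text\n\n',
  attribs: '*0*1+9*0|1+1*0*1*2+b|1+1*0+b|2+2',
};

// Hypothetical helper: splits the text into runs and resolves attribute keys.
function describeAText(atext, numToAttrib) {
  const opRegex = /((?:\*[0-9a-z]+)*)(?:\|([0-9a-z]+))?\+([0-9a-z]+)/g;
  const segments = [];
  let position = 0;
  let match;
  while ((match = opRegex.exec(atext.attribs)) !== null) {
    const chars = parseInt(match[3], 36);
    // '*0*1' -> ['0', '1'] -> ['author', 'bold']
    const keys = (match[1].match(/[0-9a-z]+/g) || []).map(function (num) {
      return numToAttrib[parseInt(num, 36)][0];
    });
    segments.push({ text: atext.text.substring(position, position + chars), keys: keys });
    position += chars;
  }
  return segments;
}

console.log(describeAText(exampleAText, exampleNumToAttrib));
```

The character counts of all operators add up to the length of the text (35 in this example), so every character belongs to exactly one run.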
For more information see /doc/easysync/easysync-notes.txt in the source.

doc/api/editorInfo.md

@ -0,0 +1,47 @@
# editorInfo
## editorInfo.ace_replaceRange(start, end, text)
This function replaces a range (from `start` to `end`) with `text`.
## editorInfo.ace_getRep()
Returns the `rep` object.
## editorInfo.ace_getAuthor()
## editorInfo.ace_inCallStack()
## editorInfo.ace_inCallStackIfNecessary(?)
## editorInfo.ace_focus(?)
## editorInfo.ace_importText(?)
## editorInfo.ace_importAText(?)
## editorInfo.ace_exportText(?)
## editorInfo.ace_editorChangedSize(?)
## editorInfo.ace_setOnKeyPress(?)
## editorInfo.ace_setOnKeyDown(?)
## editorInfo.ace_setNotifyDirty(?)
## editorInfo.ace_dispose(?)
## editorInfo.ace_getFormattedCode(?)
## editorInfo.ace_setEditable(bool)
## editorInfo.ace_execCommand(?)
## editorInfo.ace_callWithAce(fn, callStack, normalize)
## editorInfo.ace_setProperty(key, value)
## editorInfo.ace_setBaseText(txt)
## editorInfo.ace_setBaseAttributedText(atxt, apoolJsonObj)
## editorInfo.ace_applyChangesToBase(c, optAuthor, apoolJsonObj)
## editorInfo.ace_prepareUserChangeset()
## editorInfo.ace_applyPreparedChangesetToBase()
## editorInfo.ace_setUserChangeNotificationCallback(f)
## editorInfo.ace_setAuthorInfo(author, info)
## editorInfo.ace_setAuthorSelectionRange(author, start, end)
## editorInfo.ace_getUnhandledErrors()
## editorInfo.ace_getDebugProperty(prop)
## editorInfo.ace_fastIncorp(?)
## editorInfo.ace_isCaret(?)
## editorInfo.ace_getLineAndCharForPoint(?)
## editorInfo.ace_performDocumentApplyAttributesToCharRange(?)
## editorInfo.ace_setAttributeOnSelection(?)
## editorInfo.ace_toggleAttributeOnSelection(?)
## editorInfo.ace_performSelectionChange(?)
## editorInfo.ace_doIndentOutdent(?)
## editorInfo.ace_doUndoRedo(?)
## editorInfo.ace_doInsertUnorderedList(?)
## editorInfo.ace_doInsertOrderedList(?)
## editorInfo.ace_performDocumentApplyAttributesToRange()


@ -0,0 +1,47 @@
# Embed parameters
You can easily embed your etherpad-lite into any webpage by using iframes. You can configure the embedded pad using embed parameters.
Example:
Cut and paste the following code into any webpage to embed a pad. The parameters below will hide the chat and the line numbers.
```
<iframe src='http://pad.test.de/p/PAD_NAME?showChat=false&showLineNumbers=false' width=600 height=400></iframe>
```
## showLineNumbers
* Boolean
Default: true
## showControls
* Boolean
Default: true
## showChat
* Boolean
Default: true
## useMonospaceFont
* Boolean
Default: false
## userName
* String
Default: "unnamed"
Example: `userName=Etherpad%20User`
## noColors
* Boolean
Default: false
## alwaysShowChat
* Boolean
Default: false
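If the embed URL is generated by a script, the parameters can be appended programmatically. The sketch below is one possible way to do this; `buildEmbedUrl` is a hypothetical helper, and the pad URL is the placeholder from the example above.

```javascript
// Hypothetical helper: appends embed parameters to a pad URL.
function buildEmbedUrl(padUrl, params) {
  const query = Object.keys(params)
    .map(function (key) {
      // encodeURIComponent turns a space into %20, as in the userName example.
      return encodeURIComponent(key) + '=' + encodeURIComponent(String(params[key]));
    })
    .join('&');
  return query ? padUrl + '?' + query : padUrl;
}

const embedUrl = buildEmbedUrl('http://pad.test.de/p/PAD_NAME', {
  showChat: false,
  showLineNumbers: false,
  userName: 'Etherpad User',
});
console.log("<iframe src='" + embedUrl + "' width=600 height=400></iframe>");
```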


@ -0,0 +1,181 @@
# Client-side hooks
Most of these hooks are called during or in order to set up the formatting process.
All hooks registered to these events are called with two arguments:
1. name - the name of the hook being called
2. context - an object with some relevant information about the context of the call
## documentReady
Called from: src/templates/pad.html
Things in context:
nothing
This hook proxies the functionality of jQuery's `$(document).ready` event.
## aceDomLineProcessLineAttributes
Called from: src/static/js/domline.js
Things in context:
1. domline - The current DOM line being processed
2. cls - The class of the current block element (useful for styling)
This hook is called for elements in the DOM that have the "lineMarkerAttribute" set. You can add elements into this category with the aceRegisterBlockElements hook below.
The return value of this hook should have the following structure:
`{ preHtml: String, postHtml: String, processedMarker: Boolean }`
The preHtml and postHtml values will be added to the HTML display of the element, and if processedMarker is true, the engine won't try to process it any more.
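As a rough sketch of the return structure, the hook below wraps a line in a heading tag. The `heading:h2` marker inside `cls` is a made-up convention for this example, and whether the plugin framework expects the object directly or wrapped in an array is not stated here, so treat both as assumptions.

```javascript
// Hypothetical plugin hook, for illustration only.
// Assumes context.cls may contain a made-up marker such as "heading:h2".
function aceDomLineProcessLineAttributes(name, context) {
  const match = /(?:^| )heading:(h[1-6])(?: |$)/.exec(context.cls || '');
  if (!match) {
    // Nothing to add; let the engine keep processing this line.
    return { preHtml: '', postHtml: '', processedMarker: false };
  }
  const tag = match[1];
  return {
    preHtml: '<' + tag + '>',
    postHtml: '</' + tag + '>',
    processedMarker: true, // tell the engine this line has been handled
  };
}

console.log(aceDomLineProcessLineAttributes('aceDomLineProcessLineAttributes', {
  domline: {},
  cls: 'ace-line heading:h2',
}));
```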
## aceCreateDomLine
Called from: src/static/js/domline.js
Things in context:
1. domline - the current DOM line being processed
2. cls - The class of the current element (useful for styling)
This hook is called for any line being processed by the formatting engine, unless the aceDomLineProcessLineAttributes hook from above returned true, in which case this hook is skipped.
The return value of this hook should have the following structure:
`{ extraOpenTags: String, extraCloseTags: String, cls: String }`
extraOpenTags and extraCloseTags will be added before and after the element in question, and cls will be the new class of the element going forward.
## acePostWriteDomLineHTML
Called from: src/static/js/domline.js
Things in context:
1. node - the DOM node that just got written to the page
This hook is for right after a node has been fully formatted and written to the page.
## aceAttribsToClasses
Called from: src/static/js/linestylefilter.js
Things in context:
1. linestylefilter - the JavaScript object that's currently processing the ace attributes
2. key - the current attribute being processed
3. value - the value of the attribute being processed
This hook is called during the attribute processing procedure, and should be used to translate key, value pairs into valid HTML classes that can be inserted into the DOM.
The return value for this function should be a list of classes, which will then be parsed into a valid class string.
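A minimal sketch of such a translation might look like this. The `highlight` attribute and the resulting class name are invented for the example; only the `key` and `value` context fields come from the list above.

```javascript
// Hypothetical plugin hook, for illustration only.
// Maps a made-up "highlight" attribute to a CSS class name.
function aceAttribsToClasses(name, context) {
  if (context.key === 'highlight' && context.value) {
    return ['highlight-' + context.value];
  }
  // Attributes this plugin does not know about contribute no classes.
  return [];
}

console.log(aceAttribsToClasses('aceAttribsToClasses', { key: 'highlight', value: 'yellow' }));
```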
## aceGetFilterStack
Called from: src/static/js/linestylefilter.js
Things in context:
1. linestylefilter - the JavaScript object that's currently processing the ace attributes
2. browser - an object indicating which browser is accessing the page
This hook is called to apply custom regular expression filters to a set of styles. The one example available is the ep_linkify plugin, which adds internal links. They use it to find the telltale `[[ ]]` syntax that signifies internal links, and finding that syntax, they add in the internalHref attribute to be later used by the aceCreateDomLine hook (documented above).
## aceEditorCSS
Called from: src/static/js/ace.js
Things in context: None
This hook is provided to allow custom CSS files to be loaded. The return value should be an array of paths relative to the plugins directory.
## aceInitInnerdocbodyHead
Called from: src/static/js/ace.js
Things in context:
1. iframeHTML - the HTML of the editor iframe up to this point, in array format
This hook is called during the creation of the editor HTML. The array should have lines of HTML added to it, giving the plugin author a chance to add in meta, script, link, and other tags that go into the `<head>` element of the editor HTML document.
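For instance, a plugin could append a stylesheet link to the array. This is only a sketch: the plugin name `ep_example` and the stylesheet path are placeholders, not real files.

```javascript
// Hypothetical plugin hook, for illustration only.
// The stylesheet path below is a placeholder.
function aceInitInnerdocbodyHead(name, context) {
  context.iframeHTML.push(
    '<link rel="stylesheet" type="text/css" href="/static/plugins/ep_example/static/css/editor.css"/>'
  );
}

const headLines = ['<meta charset="utf-8"/>'];
aceInitInnerdocbodyHead('aceInitInnerdocbodyHead', { iframeHTML: headLines });
console.log(headLines.join('\n'));
```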
## aceEditEvent
Called from: src/static/js/ace2_inner.js
Things in context:
1. callstack - a bunch of information about the current action
2. editorInfo - information about the user who is making the change
3. rep - information about where the change is being made
4. documentAttributeManager - information about attributes in the document (this is a mystery to me)
This hook is made available to edit the edit events that might occur when changes are made. Currently you can change the editor information, some of the meanings of the edit, and so on. You can also make internal changes (internal to your plugin) that use the information provided by the edit event.
## aceRegisterBlockElements
Called from: src/static/js/ace2_inner.js
Things in context: None
The return value of this hook will add elements into the "lineMarkerAttribute" category, making the aceDomLineProcessLineAttributes hook (documented above) call for those elements.
## aceInitialized
Called from: src/static/js/ace2_inner.js
Things in context:
1. editorInfo - information about the user who will be making changes through the interface, and a way to insert functions into the main ace object (see ep_headings)
2. rep - information about where the user's cursor is
3. documentAttributeManager - some kind of magic
This hook is for inserting further information into the ace engine, for later use in formatting hooks.
## postAceInit
Called from: src/static/js/pad.js
Things in context:
1. ace - the ace object that is applied to this editor.
There doesn't appear to be any example available of this particular hook being used, but it gets fired after the editor is all set up.
## userJoinOrUpdate
Called from: src/static/js/pad_userlist.js
Things in context:
1. info - the user information
This hook is called on the client side whenever a user joins or changes. This can be used to create notifications or an alternate user list.
## collectContentPre
Called from: src/static/js/contentcollector.js
Things in context:
1. cc - the contentcollector object
2. state - the current state of the change being made
3. tname - the tag name of this node currently being processed
4. style - the style applied to the node (probably CSS)
5. cls - the HTML class string of the node
This hook is called before the content of a node is collected by the usual methods. The cc object can be used to do a bunch of things that modify the content of the pad. See, for example, the heading1 plugin for etherpad original.
## collectContentPost
Called from: src/static/js/contentcollector.js
Things in context:
1. cc - the contentcollector object
2. state - the current state of the change being made
3. tname - the tag name of this node currently being processed
4. style - the style applied to the node (probably CSS)
5. cls - the HTML class string of the node
This hook is called after the content of a node is collected by the usual methods. The cc object can be used to do a bunch of things that modify the content of the pad. See, for example, the heading1 plugin for etherpad original.
## handleClientMessage_`name`
Called from: `src/static/js/collab_client.js`
Things in context:
1. payload - the data that got sent with the message (use it for custom message content)
This hook gets called every time the client receives a message of type `name`. This can most notably be used with the new HTTP API call, "sendClientsMessage", which sends a custom message type to all clients connected to a pad. You can also use this to handle existing types.
`collab_client.js` has a pretty extensive list of message types, if you want to take a look.


@ -0,0 +1,143 @@
# Server-side hooks
These hooks are called on server-side.
All hooks registered to these events are called with two arguments:
1. name - the name of the hook being called
2. context - an object with some relevant information about the context of the call
## loadSettings
Called from: src/node/server.js
Things in context:
1. settings - the settings object
Use this hook to receive the global settings in your plugin.
## pluginUninstall
Called from: src/static/js/pluginfw/installer.js
Things in context:
1. plugin_name - self-explanatory
If this hook returns an error, the callback to the uninstall function gets an error as well. This mostly seems useful for handling additional features added in based on the installation of other plugins, which is pretty cool!
## pluginInstall
Called from: src/static/js/pluginfw/installer.js
Things in context:
1. plugin_name - self-explanatory
If this hook returns an error, the callback to the install function gets an error, too. This seems useful for adding in features when a particular plugin is installed.
## init_`<plugin name>`
Called from: src/static/js/pluginfw/plugins.js
Things in context: None
This function is called after a specific plugin is initialized. This would probably be more useful than the previous two functions if you only wanted to add in features to one specific plugin.
## expressConfigure
Called from: src/node/server.js
Things in context:
1. app - the main application object
This is a helpful hook for changing the behavior and configuration of the application. It's called right after the application gets configured.
## expressCreateServer
Called from: src/node/server.js
Things in context:
1. app - the main application object (helpful for adding new paths and such)
This hook gets called after the application object has been created, but before it starts listening. This is similar to the expressConfigure hook, but it's not guaranteed that the application object will have all relevant configuration variables.
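Adding a new path could be sketched as follows. The route `/hello-plugin` is invented for this example, and the three-argument signature with a `callback` mirrors the handleMessage example elsewhere in this document rather than anything stated for this hook, so it is an assumption.

```javascript
// Hypothetical plugin hook, for illustration only.
function expressCreateServer(hookName, context, callback) {
  // context.app is the Express application object described above.
  context.app.get('/hello-plugin', function (req, res) {
    res.send('Hello from a plugin');
  });
  // Assumption: signal completion if the framework passed a callback.
  if (typeof callback === 'function') callback();
}

// Tiny stand-in for the Express app, just to show the call shape.
const registeredRoutes = [];
expressCreateServer('expressCreateServer', {
  app: { get: function (path) { registeredRoutes.push(path); } },
});
console.log(registeredRoutes);
```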
## eejsBlock_`<name>`
Called from: src/node/eejs/index.js
Things in context:
1. content - the content of the block
This hook gets called upon the rendering of an ejs template block. For any specific kind of block, you can change how that block gets rendered by modifying the content object passed in.
Have a look at `src/templates/pad.html` and `src/templates/timeslider.html` to see which blocks are available.
## socketio
Called from: src/node/hooks/express/socketio.js
Things in context:
1. app - the application object
2. io - the socketio object
I have no idea what this is useful for, someone else will have to add this description.
## authorize
Called from: src/node/hooks/express/webaccess.js
Things in context:
1. req - the request object
2. res - the response object
3. next - ?
4. resource - the path being accessed
This is useful for modifying the way authentication is done, especially for specific paths.
## authenticate
Called from: src/node/hooks/express/webaccess.js
Things in context:
1. req - the request object
2. res - the response object
3. next - ?
4. username - the username used (optional)
5. password - the password used (optional)
This is useful for modifying the way authentication is done.
## authFailure
Called from: src/node/hooks/express/webaccess.js
Things in context:
1. req - the request object
2. res - the response object
3. next - ?
This is useful for modifying the way authentication is done.
## handleMessage
Called from: src/node/handler/PadMessageHandler.js
Things in context:
1. message - the message being handled
2. client - the client object from socket.io
This hook will be called once a message arrives. If a plugin calls `callback(null)` the message will be dropped. However, it is not possible to modify the message.
Plugins may also decide to implement custom behavior once a message arrives.
**WARNING**: handleMessage will be called, even if the client is not authorized to send this message. It's up to the plugin to check permissions.
Example:
```
function handleMessage(hook, context, callback) {
  if (context.message.type === 'USERINFO_UPDATE') {
    // If the message type is USERINFO_UPDATE, drop the message
    callback(null);
  } else {
    // Let every other message pass through unchanged
    callback();
  }
}
```

doc/api/http_api.md

@ -0,0 +1,250 @@
# HTTP API
## What can I do with this API?
The API gives another web application control of the pads. The basic functions are
* create/delete pads
* grant/forbid access to pads
* get/set pad content
The API is designed so that you can reuse your existing user system with its permissions and map it to etherpad lite. This means your web application still has to do authentication, but you can tell etherpad lite via the API which visitors should get which permissions. This allows etherpad lite to fit into any web application and extend it with real-time functionality. You can embed the pads via an iframe into your website.
Take a look at [HTTP API client libraries](https://github.com/Pita/etherpad-lite/wiki/HTTP-API-client-libraries) to see if there is a library in your favorite language.
## Examples
### Example 1
A portal (such as WordPress) wants to give a user access to a new pad. Let's assume the user has the internal id 7 and his name is Michael.
Portal maps the internal userid to an etherpad author.
> Request: `http://pad.domain/api/1/createAuthorIfNotExistsFor?apikey=secret&name=Michael&authorMapper=7`
>
> Response: `{code: 0, message:"ok", data: {authorID: "a.s8oes9dhwrvt0zif"}}`
Portal maps the internal userid to an etherpad group:
> Request: `http://pad.domain/api/1/createGroupIfNotExistsFor?apikey=secret&groupMapper=7`
>
> Response: `{code: 0, message:"ok", data: {groupID: "g.s8oes9dhwrvt0zif"}}`
Portal creates a pad in the userGroup
> Request: `http://pad.domain/api/1/createGroupPad?apikey=secret&groupID=g.s8oes9dhwrvt0zif&padName=samplePad&text=This is the first sentence in the pad`
>
> Response: `{code: 0, message:"ok", data: null}`
Portal starts the session for the user on the group:
> Request: `http://pad.domain/api/1/createSession?apikey=secret&groupID=g.s8oes9dhwrvt0zif&authorID=a.s8oes9dhwrvt0zif&validUntil=1312201246`
>
> Response: `{code: 0, message:"ok", data: {sessionID: "s.s8oes9dhwrvt0zif"}}`
Portal places the cookie "sessionID" with the given value on the client and creates an iframe including the pad.
### Example 2
A portal (such as WordPress) wants to transform the contents of a pad that multiple admins edited into a blog post.
Portal retrieves the contents of the pad for entry into the db as a blog post:
> Request: `http://pad.domain/api/1/getText?apikey=secret&padID=g.s8oes9dhwrvt0zif$123`
>
> Response: `{code: 0, message:"ok", data: {text:"Welcome Text"}}`
Portal submits content into new blog post
> Portal.AddNewBlog(content)
>
## Usage
### Request Format
The API is accessible via HTTP. HTTP requests are in the format `/api/$APIVERSION/$FUNCTIONNAME`. Parameters are transmitted via HTTP GET. `$APIVERSION` is 1.
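A request URL in this format can be assembled with a few lines of code. The sketch below is one way to do it; `buildApiUrl` is a hypothetical helper, and the host and API key are the placeholders used in the examples above.

```javascript
// Hypothetical helper: builds /api/$APIVERSION/$FUNCTIONNAME?param=value...
function buildApiUrl(baseUrl, functionName, params, apiVersion) {
  const version = apiVersion === undefined ? 1 : apiVersion;
  const query = Object.keys(params)
    .map(function (key) {
      return encodeURIComponent(key) + '=' + encodeURIComponent(String(params[key]));
    })
    .join('&');
  return baseUrl + '/api/' + version + '/' + functionName + (query ? '?' + query : '');
}

console.log(buildApiUrl('http://pad.domain', 'getText', {
  apikey: 'secret',
  padID: 'g.s8oes9dhwrvt0zif$123',
}));
```

Encoding each value keeps characters such as spaces in the `text` parameter or the `$` in a group padID from breaking the query string.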
### Response Format
Responses are valid JSON in the following format:
```js
{
"code": number,
"message": string,
"data": obj
}
```
* **code** a return code
  * **0** everything ok
  * **1** wrong parameters
  * **2** internal error
  * **3** no such function
  * **4** no or wrong API Key
* **message** a status message. It is `ok` if everything is fine; otherwise it contains an error message
* **data** the payload
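On the calling side, the return codes can be turned into either a payload or an error. The sketch below assumes the response body arrives as a JSON string; `unwrapResponse` and `API_ERRORS` are hypothetical names.

```javascript
// Meaning of the non-zero return codes listed above.
const API_ERRORS = {
  1: 'wrong parameters',
  2: 'internal error',
  3: 'no such function',
  4: 'no or wrong API Key',
};

// Hypothetical helper: returns the payload or throws a descriptive error.
function unwrapResponse(body) {
  const response = JSON.parse(body);
  if (response.code === 0) {
    return response.data;
  }
  const kind = API_ERRORS[response.code] || 'unknown error';
  throw new Error(kind + ': ' + response.message);
}

const payload = unwrapResponse('{"code":0,"message":"ok","data":{"text":"Welcome Text"}}');
console.log(payload.text);
```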
### Overview
![API Overview](http://i.imgur.com/d0nWp.png)
## Data Types
* **groupID** a string, the unique id of a group. Format is g.16RANDOMCHARS, for example g.s8oes9dhwrvt0zif
* **sessionID** a string, the unique id of a session. Format is s.16RANDOMCHARS, for example s.s8oes9dhwrvt0zif
* **authorID** a string, the unique id of an author. Format is a.16RANDOMCHARS, for example a.s8oes9dhwrvt0zif
* **readOnlyID** a string, the unique id of a read-only relation to a pad. Format is r.16RANDOMCHARS, for example r.s8oes9dhwrvt0zif
* **padID** a string, format is GROUPID$PADNAME, for example the pad test of group g.s8oes9dhwrvt0zif has padID g.s8oes9dhwrvt0zif$test
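The padID format can be taken apart again on the application side. The sketch below assumes that the 16 random characters are alphanumeric, which matches the examples but is not spelled out here; `parsePadId` and `GROUP_ID_PATTERN` are hypothetical names.

```javascript
// Assumption: the 16 random characters are letters and digits.
const GROUP_ID_PATTERN = /^g\.[0-9a-zA-Z]{16}$/;

// Hypothetical helper: splits GROUPID$PADNAME into its two parts.
// Pads without a "$" are treated as normal (non-group) pads.
function parsePadId(padID) {
  const separator = padID.indexOf('$');
  if (separator < 0) {
    return { groupID: null, padName: padID };
  }
  return {
    groupID: padID.substring(0, separator),
    padName: padID.substring(separator + 1),
  };
}

const parsed = parsePadId('g.s8oes9dhwrvt0zif$test');
console.log(parsed, GROUP_ID_PATTERN.test(parsed.groupID));
```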
### Authentication
Authentication works via a token that is sent with each request as the `apikey` parameter (see the request examples above). There is a single token per Etherpad-Lite deployment. This token will be a random string, generated by Etherpad-Lite at the first start. It will be saved in APIKEY.txt in the root folder of Etherpad Lite. Only Etherpad Lite and the requesting application know this key. Token management will not be exposed through this API.
### Node Interoperability
All functions will also be available through a node module accessible from other node.js applications.
### JSONP
The API provides _JSONP_ support to allow requests from a server in a different domain.
Simply add `&jsonp=?` to the API call.
Example usage: http://api.jquery.com/jQuery.getJSON/
## API Methods
### Groups
Pads can belong to a group. The padID of a group pad starts with a groupID, like g.asdfasdfasdfasdf$test
* **createGroup()** creates a new group <br><br>*Example returns:*
* `{code: 0, message:"ok", data: {groupID: "g.s8oes9dhwrvt0zif"}}`
* **createGroupIfNotExistsFor(groupMapper)** this function helps you to map your application group ids to etherpad lite group ids <br><br>*Example returns:*
* `{code: 0, message:"ok", data: {groupID: "g.s8oes9dhwrvt0zif"}}`
* **deleteGroup(groupID)** deletes a group <br><br>*Example returns:*
* `{code: 0, message:"ok", data: null}`
* `{code: 1, message:"groupID does not exist", data: null}`
* **listPads(groupID)** returns all pads of this group<br><br>*Example returns:*
* `{code: 0, message:"ok", data: {padIDs : ["g.s8oes9dhwrvt0zif$test", "g.s8oes9dhwrvt0zif$test2"]}}`
* `{code: 1, message:"groupID does not exist", data: null}`
* **createGroupPad(groupID, padName [, text])** creates a new pad in this group <br><br>*Example returns:*
* `{code: 0, message:"ok", data: null}`
* `{code: 1, message:"pad does already exist", data: null}`
* `{code: 1, message:"groupID does not exist", data: null}`
### Author
These authors are bound to the attributes the users choose (color and name).
* **createAuthor([name])** creates a new author <br><br>*Example returns:*
* `{code: 0, message:"ok", data: {authorID: "a.s8oes9dhwrvt0zif"}}`
* **createAuthorIfNotExistsFor(authorMapper [, name])** this function helps you to map your application author ids to etherpad lite author ids <br><br>*Example returns:*
* `{code: 0, message:"ok", data: {authorID: "a.s8oes9dhwrvt0zif"}}`
* **listPadsOfAuthor(authorID)** returns an array of all pads this author contributed to<br><br>*Example returns:*
* `{code: 0, message:"ok", data: {padIDs: ["g.s8oes9dhwrvt0zif$test", "g.s8oejklhwrvt0zif$foo"]}}`
* `{code: 1, message:"authorID does not exist", data: null}`
* **getAuthorName(authorID)** Returns the Author Name of the author <br><br>*Example returns:*
* `{code: 0, message:"ok", data: {authorName: "John McLear"}}`
-> authors can't be deleted, because this would involve scanning all the pads this author contributed to
### Session
Sessions can be created between a group and an author. This allows an author to access more than one group. The sessionID will be set as a cookie on the client and is valid until a certain date. The session cookie can also contain multiple comma-separated sessionIDs, allowing a user to edit pads in different groups at the same time. Only users with a valid session for a group can access its group pads. You can create a session after you have authenticated the user in your web application, to give them access to the pads. You should save the sessionID of this session and delete the session after the user has logged out.
* **createSession(groupID, authorID, validUntil)** creates a new session. validUntil is a Unix timestamp in seconds <br><br>*Example returns:*
* `{code: 0, message:"ok", data: {sessionID: "s.s8oes9dhwrvt0zif"}}`
* `{code: 1, message:"groupID doesn't exist", data: null}`
* `{code: 1, message:"authorID doesn't exist", data: null}`
* `{code: 1, message:"validUntil is in the past", data: null}`
* **deleteSession(sessionID)** deletes a session <br><br>*Example returns:*
* `{code: 0, message:"ok", data: null}`
* `{code: 1, message:"sessionID does not exist", data: null}`
* **getSessionInfo(sessionID)** returns information about a session <br><br>*Example returns:*
* `{code: 0, message:"ok", data: {authorID: "a.s8oes9dhwrvt0zif", groupID: "g.s8oes9dhwrvt0zif", validUntil: 1312201246}}`
* `{code: 1, message:"sessionID does not exist", data: null}`
* **listSessionsOfGroup(groupID)** returns all sessions of a group <br><br>*Example returns:*
* `{"code":0,"message":"ok","data":{"s.oxf2ras6lvhv2132":{"groupID":"g.s8oes9dhwrvt0zif","authorID":"a.akf8finncvomlqva","validUntil":2312905480}}}`
* `{code: 1, message:"groupID does not exist", data: null}`
* **listSessionsOfAuthor(authorID)** returns all sessions of an author <br><br>*Example returns:*
* `{"code":0,"message":"ok","data":{"s.oxf2ras6lvhv2132":{"groupID":"g.s8oes9dhwrvt0zif","authorID":"a.akf8finncvomlqva","validUntil":2312905480}}}`
* `{code: 1, message:"authorID does not exist", data: null}`
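Two details of the session handling are easy to get wrong: `validUntil` is in seconds (JavaScript timestamps are in milliseconds), and several sessionIDs share one cookie as a comma-separated list. The helpers below are a hypothetical sketch of both; their names are not part of the API.

```javascript
// Hypothetical helper: Unix timestamp in seconds, ttlSeconds from now.
// nowMs is optional and only exists to make the function easy to test.
function validUntilFromNow(ttlSeconds, nowMs) {
  const now = nowMs === undefined ? Date.now() : nowMs;
  return Math.floor(now / 1000) + ttlSeconds;
}

// Hypothetical helper: value for the "sessionID" cookie when a user
// holds sessions for several groups at once.
function sessionCookieValue(sessionIDs) {
  return sessionIDs.join(',');
}

console.log(validUntilFromNow(60 * 60)); // one hour from now, in seconds
console.log(sessionCookieValue(['s.s8oes9dhwrvt0zif', 's.oxf2ras6lvhv2132']));
```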
### Pad Content
Pad content can be updated and retrieved through the API
* **getText(padID, [rev])** returns the text of a pad <br><br>*Example returns:*
* `{code: 0, message:"ok", data: {text:"Welcome Text"}}`
* `{code: 1, message:"padID does not exist", data: null}`
* **setText(padID, text)** sets the text of a pad <br><br>*Example returns:*
* `{code: 0, message:"ok", data: null}`
* `{code: 1, message:"padID does not exist", data: null}`
* `{code: 1, message:"text too long", data: null}`
* **getHTML(padID, [rev])** returns the text of a pad formatted as HTML<br><br>*Example returns:*
* `{code: 0, message:"ok", data: {html:"Welcome Text<br>More Text"}}`
* `{code: 1, message:"padID does not exist", data: null}`
### Pad
Group pads are normal pads, but with the name schema GROUPID$PADNAME. Access to them is controlled by a security manager, and it's forbidden for normal pads to include a $ in the name.
* **createPad(padID [, text])** creates a new (non-group) pad. Note that if you need to create a group Pad, you should call **createGroupPad**.<br><br>*Example returns:*
* `{code: 0, message:"ok", data: null}`
* `{code: 1, message:"pad does already exist", data: null}`
* **getRevisionsCount(padID)** returns the number of revisions of this pad <br><br>*Example returns:*
* `{code: 0, message:"ok", data: {revisions: 56}}`
* `{code: 1, message:"padID does not exist", data: null}`
* **padUsersCount(padID)** returns the number of users that are currently editing this pad <br><br>*Example returns:*
* `{code: 0, message:"ok", data: {padUsersCount: 5}}`
* **padUsers(padID)** returns the list of users that are currently editing this pad <br><br>*Example returns:*
* `{code: 0, message:"ok", data: {padUsers: [{colorId:"#c1a9d9","name":"username1","timestamp":1345228793126},{"colorId":"#d9a9cd","name":"Hmmm","timestamp":1345228796042}]}}`
* `{code: 0, message:"ok", data: {padUsers: []}}`
* **deletePad(padID)** deletes a pad <br><br>*Example returns:*
* `{code: 0, message:"ok", data: null}`
* `{code: 1, message:"padID does not exist", data: null}`
* **getReadOnlyID(padID)** returns the read only link of a pad <br><br>*Example returns:*
* `{code: 0, message:"ok", data: {readOnlyID: "r.s8oes9dhwrvt0zif"}}`
* `{code: 1, message:"padID does not exist", data: null}`
* **setPublicStatus(padID, publicStatus)** sets a boolean for the public status of a pad <br><br>*Example returns:*
* `{code: 0, message:"ok", data: null}`
* `{code: 1, message:"padID does not exist", data: null}`
* **getPublicStatus(padID)** returns true or false <br><br>*Example returns:*
* `{code: 0, message:"ok", data: {publicStatus: true}}`
* `{code: 1, message:"padID does not exist", data: null}`
* **setPassword(padID, password)** returns ok or an error message <br><br>*Example returns:*
* `{code: 0, message:"ok", data: null}`
* `{code: 1, message:"padID does not exist", data: null}`
* **isPasswordProtected(padID)** returns true or false <br><br>*Example returns:*
* `{code: 0, message:"ok", data: {passwordProtection: true}}`
* `{code: 1, message:"padID does not exist", data: null}`
* **listAuthorsOfPad(padID)** returns an array of authors who contributed to this pad <br><br>*Example returns:*
* `{code: 0, message:"ok", data: {authorIDs : ["a.s8oes9dhwrvt0zif", "a.akf8finncvomlqva"]}}`
* `{code: 1, message:"padID does not exist", data: null}`
* **getLastEdited(padID)** returns the timestamp of the last revision of the pad <br><br>*Example returns:*
* `{code: 0, message:"ok", data: {lastEdited: 1340815946602}}`
* `{code: 1, message:"padID does not exist", data: null}`
* **sendClientsMessage(padID, msg)** sends a custom message of type `msg` to the pad <br><br>*Example returns:*
* `{code: 0, message:"ok", data: {}}`
* `{code: 1, message:"padID does not exist", data: null}`
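
Every example return above shares the same envelope: a `code`, a `message`, and a `data` payload. The following is a minimal sketch of how a caller might unwrap it, assuming the response has already been parsed into an object and that any nonzero `code` should be treated as an error. The helper name `unwrapApiResponse` is hypothetical and not part of Etherpad Lite.

```javascript
// Hypothetical helper (not part of Etherpad Lite): unwraps the
// {code, message, data} envelope shown in the example returns.
function unwrapApiResponse(response) {
  // Assumption: code 0 means success, any other code is an error.
  if (response.code !== 0) {
    throw new Error("API error " + response.code + ": " + response.message);
  }
  return response.data;
}

// Envelopes copied from the example returns above.
const okResponse = { code: 0, message: "ok", data: { padUsersCount: 5 } };
const errorResponse = { code: 1, message: "padID does not exist", data: null };

console.log(unwrapApiResponse(okResponse).padUsersCount);

try {
  unwrapApiResponse(errorResponse);
} catch (err) {
  // The message from the envelope is carried into the thrown error.
  console.log(err.message);
}
```

Throwing on a nonzero code keeps the calling code free of repeated `code` checks; a caller that prefers to branch on the code can inspect the envelope directly instead.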

64
doc/database.md Normal file

@ -0,0 +1,64 @@
# Database structure
## Keys and their values
### pad:$PADID
Saves all information about a pad
* **atext** - the latest attributed text
* **pool** - the attribute pool
* **head** - the number of the latest revision
* **chatHead** - the number of the latest chat entry
* **public** - flag that disables security for this pad
* **passwordHash** - string that contains a bcrypt hashed password for this pad
### pad:$PADID:revs:$REVNUM
Saves a revision $REVNUM of pad $PADID
* **meta**
* **author** - the authorID of this revision
* **timestamp** - the timestamp of when this revision was created
* **changeset** - the changeset of this revision
### pad:$PADID:chat:$CHATNUM
Saves chat entry number $CHATNUM of pad $PADID
* **text** - the text of this chat entry
* **userId** - the authorID of this chat entry
* **time** - the timestamp of this chat entry
### pad2readonly:$PADID
Translates a padID to a readonlyID
### readonly2pad:$READONLYID
Translates a readonlyID to a padID
### token2author:$TOKENID
Translates a token to an authorID
### globalAuthor:$AUTHORID
Information about an author
* **name** - the name of this author as shown in the pad
* **colorID** - the colorID of this author as shown in the pad
### mapper2group:$MAPPER
Maps an external application identifier to an internal group
### mapper2author:$MAPPER
Maps an external application identifier to an internal author
### group:$GROUPID
A group of pads
* **pads** - object with pad names in it, values are 1
### session:$SESSIONID
A session between an author and a group
* **groupID** - the groupID the session belongs to
* **authorID** - the authorID the session belongs to
* **validUntil** - the timestamp until this session is valid
### author2sessions:$AUTHORID
Saves the sessions of an author
* **sessionsIDs** - object with sessionIDs in it, values are 1
### group2sessions:$GROUPID
Saves the sessions of a group
* **sessionsIDs** - object with sessionIDs in it, values are 1
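
As an illustration of the key layout above, the following sketch builds a few of the keys from their parts. The helper names, IDs, and timestamp are hypothetical; only the key patterns and the field names of the session record come from this document.

```javascript
// Hypothetical helpers that assemble the database keys described above.
function padKey(padID) {
  return "pad:" + padID;
}

function revisionKey(padID, revNum) {
  return padKey(padID) + ":revs:" + revNum;
}

function chatKey(padID, chatNum) {
  return padKey(padID) + ":chat:" + chatNum;
}

function sessionKey(sessionID) {
  return "session:" + sessionID;
}

// Illustrative session record; the IDs and the timestamp are made up.
const exampleSessionRecord = {
  groupID: "g.exampleGroup",
  authorID: "a.exampleAuthor",
  validUntil: 1346966400,
};

console.log(revisionKey("myPad", 0));
console.log(chatKey("myPad", 12));
console.log(sessionKey("s.exampleSession"), exampleSessionRecord);
```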

15
doc/documentation.md Normal file

@ -0,0 +1,15 @@
# About this Documentation
<!-- type=misc -->
The goal of this documentation is to comprehensively explain Etherpad-Lite,
from both a reference and a conceptual point of view.
Where appropriate, property types, method arguments, and the arguments
provided to event handlers are detailed in a list underneath the topic
heading.
Every `.html` file is generated based on the corresponding
`.md` file in the `doc/api/` folder in the source tree. The
documentation is generated using the `tools/doc/generate.js` program.
The HTML template is located at `doc/template.html`.

2
doc/easysync/README.md Normal file

@ -0,0 +1,2 @@
# About this folder
We collected all the documentation we found about the old Etherpad in this folder. Most of it is still valid for Etherpad Lite.

Binary file not shown.


@ -0,0 +1,372 @@
\documentclass{article}
\usepackage{hyperref}
\begin{document}
\title{Etherpad and EasySync Technical Manual}
\author{AppJet, Inc., with modifications by the Etherpad Foundation}
\date{\today}
\maketitle
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\tableofcontents
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Documents}
\begin{itemize}
\item A document is a list of characters, or a string.
\item A document can also be represented as a list of \emph{changesets}.
\end{itemize}
\section{Changesets}
\begin{itemize}
\item A changeset represents a change to a document.
\item A changeset can be applied to a document to produce a new document.
\item When a document is represented as a list of changesets, it is assumed that the first changeset applies to the empty document, [].
\end{itemize}
\section{Changeset representation} \label{representation}
$$(\ell \rightarrow \ell')[c_1,c_2,c_3,...]$$
where
\begin{itemize}
\item[] $\ell$ is the length of the document before the change,
\item[] $\ell'$ is the length of the document after the change,
\item[] $[c_1,c_2,c_3,...]$ is an array of $\ell'$ characters that describes the document after the change.
\end{itemize}
Note that each $c_i$, $1 \leq i \leq \ell'$, is either an integer or a character.
\begin{itemize}
\item Integers represent retained characters in the original document.
\item Characters represent insertions.
\end{itemize}
\section{Constraints on Changesets}
\begin{itemize}
\item Changesets are canonical and therefore comparable. When represented in computer memory, we always use the same representation for the same changeset. If the memory representations of two changesets differ, they must be different changesets.
\item Changesets are compact. Thus, if there are two ways to represent a changeset in computer memory, then we always use the representation that takes up the fewest bytes.
\end{itemize}
Later we will discuss optimizations to changeset
representation (using ``strips'' and other such
techniques). The two constraints must apply to any
representation of changesets.
\section{Notation}
\begin{itemize}
\item We use the algebraic multiplication notation to represent changeset application.
\item While changesets are defined as operations on documents, documents themselves are represented as a list of changesets, initially applying to the empty document.
\end{itemize}
\paragraph{Example}
$A=(0\rightarrow 5)[``hello"]$
$B=(5\rightarrow 11)[0-4, ``\ world"]$
We can write the document ``hello world'' as $A\cdot B$ or
just $AB$. Note that the ``initial document'' can be made
into the changeset $(0\rightarrow
N)[``<\mathit{the\ document\ text}>"]$.
When $A$ and $B$ are changesets, we can also refer to $(AB)$ as ``the composition'' of $A$ and $B$. Changesets are closed under composition.
\section{Composition of Changesets}
For any two changesets $A$, $B$ such that
\begin{itemize}
\item[] $A=(n_1\rightarrow n_2)[\cdots]$
\item[] $B=(n_2\rightarrow n_3)[\cdots]$
\end{itemize}
it is clear that there is a third changeset $C=(n_1\rightarrow n_3)[\cdots]$ such that applying $C$ to a document $X$ yields the same resulting document as does applying $A$ and then $B$. In this case, we write $AB=C$.
Given the representation from Section \ref{representation}, it is straightforward to compute the composition of two changesets.
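\paragraph{Worked example (illustrative)}
As a sketch using the changesets from the Notation section, take
$A=(0\rightarrow 5)[``hello"]$ and $B=(5\rightarrow 11)[0-4, ``\ world"]$.
Each integer in $B$ is replaced by the entry of $A$ at that index, and
each character in $B$ is kept as an insertion:
$$AB=(0\rightarrow 11)[``hello\ world"]$$
The lengths chain as $n_1=0$, $n_2=5$, $n_3=11$.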
\section{Changeset Merging}
Now we come to realtime document editing. Suppose two different users make two different changes to the same document at the same time. It is impossible to compose these changes. For example, if we have the document $X$ of length $n$, we may have $A=(n\rightarrow n_a)[\ldots n_a \mathrm{characters}]$, $B=(n\rightarrow n_b)[\ldots n_b \mathrm{characters}]$ where $n\neq n_a\neq n_b$.
It is impossible to compute $(XA)B$ because $B$ can only be applied to a document of length $n$, and $(XA)$ has length $n_a$. Similarly, $A$ cannot be applied to $(XB)$ because $(XB)$ has length $n_b$.
This is where \emph{merging} comes in. Merging takes two changesets that apply to the same initial document (and that cannot be composed), and computes a single new changeset that preserves the intent of both changes. The merge of $A$ and $B$ is written as $m(A,B)$. For the Etherpad system to work, we require that $m(A,B)=m(B,A)$.
Aside from what we have said so far about merging, there are many different implementations that will lead to a workable system. We have created one implementation for text that has the following constraints.
\section{Follows} \label{follows}
When users $A$ and $B$ have the same document $X$ on their screen, and they proceed to make respective changesets $A$ and $B$, it is no use to compute $m(A,B)$, because $m(A,B)$ applies to document $X$, but the users are already looking at document $XA$ and $XB$. What we really want is to compute $B'$ and $A'$ such that
$$XAB' = XBA' = Xm(A,B)$$
``Following'' computes these $B'$ and $A'$ changesets. The definition of the ``follow'' function $f$ is such that $Af(A,B)=Bf(B,A)=m(A,B)=m(B,A)$. When we compute $f(A,B)$:
\begin{itemize}
\item Insertions in $A$ become retained characters in $f(A,B)$
\item Insertions in $B$ become insertions in $f(A,B)$
\item Retain whatever characters are retained in \emph{both} $A$ and $B$
\end{itemize}
\paragraph{Example}
Suppose we have the initial document $X=(0\rightarrow 8)[``\mathit{baseball}"]$ and user $A$ changes it to ``basil'' with changeset $A$, and user $B$ changes it to ``below'' with changeset $B$.
We have
$X=(0\rightarrow 8)[``\mathit{baseball}"]$ \\
$A=(8\rightarrow 5)[0-1, ``\mathit{si}", 7]$ \\
$B=(8\rightarrow 5)[0, ``\mathit{e}", 6, ``\mathit{ow}"]$ \\
First we compute the merge $m(A,B)=m(B,A)$ according to the constraints
$$m(A,B)=(8\rightarrow 6)[0, ``\mathit{e}", ``\mathit{si}", ``\mathit{ow}"] = (8\rightarrow 6)[0, ``\mathit{esiow}"]$$
Then we need to compute the follows $B'=f(A,B)$ and $A'=f(B,A)$.
$$B'=f(A,B)=(5\rightarrow 6)[0,``\mathit{e}",2,3,``\mathit{ow}"]$$
Note that the numbers $0$, $2$, and $3$ are indices into $A=(8\rightarrow 5)[0,1,``\mathit{si}",7]$
\begin{tabular}{ccccc}
0 & 1 & 2 & 3 & 4 \\
0 & 1 & s & i & 7
\end{tabular}
$A'=f(B,A)=(5\rightarrow 6)[0,1,``\mathit{si}",3,4]$
We can now double check that $AB'=BA'=m(A,B)=(8\rightarrow 6)[0,``\mathit{esiow}"]$.
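\paragraph{Check (illustrative)}
Spelling the check out character by character, as a sketch: $A$ turns
``baseball'' into ``basil'', and $B'$ keeps index $0$ (``b''), inserts
``e'', keeps indices $2$ and $3$ (``s'', ``i''), and inserts ``ow'',
giving ``besiow''. Likewise $B$ turns ``baseball'' into ``below'', and
$A'$ keeps indices $0$ and $1$ (``b'', ``e''), inserts ``si'', and keeps
indices $3$ and $4$ (``o'', ``w''), again giving ``besiow''. Hence
$$XAB' = XBA' = Xm(A,B) = (0\rightarrow 6)[``\mathit{besiow}"]$$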
Now that we have made the mathematical meaning of the
preceding pages complete, we can build a client/server
system to support realtime editing by multiple users.
\section{System Overview}
There is a server that holds the current state of a
document. Clients (users) can connect to the server from
their web browsers. The clients and server maintain state
and can send messages to one another in real-time, but
because we are in a web browser scenario, clients cannot
send each other messages directly, and must go through the
server always. (This may distinguish from prior art?)
The other critical design feature of the system is that
\emph{A client must always be able to edit their local
copy of the document, so the user is never blocked from
typing because of waiting to send or receive data.}
\section{Client State}
At any moment in time, a client maintains its state in the
form of 3 changesets. The client document looks like
$A\cdot X \cdot Y$, where
$A$ is the latest server version, the composition of all
changesets committed to the server, from this client or
from others, that the server has informed this client
about. Initially $A=(0\rightarrow N)[<\mathit{initial\ document\ text}>]$.
$X$ is the composition of all changesets this client has
submitted to the server but has not heard back about yet.
Initially $X=(N\rightarrow N)[0,1,2,\ldots, N-1]$, in
other words, the identity, henceforth denoted $I_N$.
$Y$ is the composition of all changesets this client has
made but has not yet submitted to the server.
Initially $Y=(N\rightarrow N)[0,1,2,\ldots, N-1]$.
\section{Client Operations}
A client can do 5 things.
\begin{enumerate}
\item Incorporate new typing into local state
\item Submit a changeset to the server
\item Hear back acknowledgement of a submitted changeset
\item Hear from the server about other clients' changesets
\item Connect to the server and request the initial document
\end{enumerate}
As these 5 events happen, the client updates its
representation $A\cdot X \cdot Y$ according to the
relations that follow. Changes ``move left'' as time goes
by: into $Y$ when the user types, into $X$ when changesets
are submitted to the server, and into $A$ when the
server acknowledges changesets.
\subsection{New local typing}
When a user makes an edit $E$ to the document, the client
computes the composition $(Y\cdot E)$ and updates its local
state, i.e. $Y \leftarrow Y\cdot E$. In other words, if $Y$ is the
variable holding local unsubmitted changes, it will be
assigned the new value $(Y\cdot E)$.
\subsection{Submitting changesets to server}
When a client submits its local changes to the server, it
transmits a copy of $Y$ and then assigns $Y$ to $X$, and
assigns the identity to $Y$. I.e.,
\begin{enumerate}
\item Send $Y$ to server,
\item $X \leftarrow Y$
\item $Y \leftarrow I_N$
(the identity).
\end{enumerate}
This happens every 500ms, provided the client has received an
acknowledgement of its previous submission: the client must
receive the ACK before submitting again. Note that $X$ is always
equal to the identity before the second step occurs, so no
information is lost.
\subsection{Hear ACK from server}
When the client hears ACK from server,
$A \leftarrow A\cdot X$ \\
$X \leftarrow I_N$
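\paragraph{Example trace (illustrative)}
As a sketch of the three updates described so far, suppose the server
text is ``hello'', so that $A=(0\rightarrow 5)[``hello"]$ and $X=Y=I_5$.
The user then types `` world'', which is the edit
$E=(5\rightarrow 11)[0-4, ``\ world"]$.
\begin{eqnarray*}
\mbox{typing:} && Y \leftarrow Y\cdot E = E \\
\mbox{submit:} && \mbox{send } Y,\quad X \leftarrow E,\quad Y \leftarrow I_{11} \\
\mbox{ACK:} && A \leftarrow A\cdot X = (0\rightarrow 11)[``hello\ world"],\quad X \leftarrow I_{11}
\end{eqnarray*}
The displayed text $A\cdot X\cdot Y$ is ``hello world'' after each of
these steps. Here the identity is taken on the current document length,
$11$; this is an assumption about how $I_N$ should be read once the
length has changed.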
\subsection{Hear about another client's changeset}
When a client hears about another client's changeset $B$,
it computes a new $A$, $X$, and $Y$, which we will call
$A'$, $X'$, and $Y'$ respectively. It also computes a
changeset $D$ which is applied to the current text view on
the client, $V$. Because $AXY$ must always equal the
current view, $AXY=V$ before the client hears about $B$,
and $A'X'Y'=VD$ after the computation is performed.
The steps are:
\begin{enumerate}
\item Compute $A' = AB$
\item Compute $X' = f(B,X)$
\item Compute $Y' = f(f(X,B), Y)$
\item Compute $D=f(Y,f(X,B))$
\item Assign $A \leftarrow A'$, $X \leftarrow X'$, $Y \leftarrow Y'$.
\item Apply $D$ to the current view of the document
displayed on the user's screen.
\end{enumerate}
In steps 2, 3, and 4, $f$ is the follow operation described
in Section \ref{follows}.
\paragraph{Proof that $\mathbf{AXY=V \Rightarrow A'X'Y'=VD}$.}
Substituting $A'X'Y'=(AB)(f(B,X))(f(f(X,B),Y))$, we
recall that merges are commutative. So for any two
changesets $P$ and $Q$,
$$m(P,Q)=m(Q,P)=Qf(Q,P)=Pf(P,Q)$$
Applying this to the relation above, we see
\begin{eqnarray*}
A'X'Y'&=& AB f(B,X) f(f(X,B),Y) \\
&=&AX f(X,B) f(f(X,B),Y) \\
&=&A X Y f(Y, f(X,B)) \\
&=&A X Y D \\
&=&V D
\end{eqnarray*}
As claimed.
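\paragraph{Worked example (illustrative)}
As a sketch, reuse the changesets of Section \ref{follows} in new roles.
Let the server version be $A=(0\rightarrow 8)[``\mathit{baseball}"]$,
let the submitted changeset be
$X=(8\rightarrow 5)[0-1,``\mathit{si}",7]$, let $Y=I_5$, and let the
incoming changeset be
$B=(8\rightarrow 5)[0,``\mathit{e}",6,``\mathit{ow}"]$. The view is
$V=AXY$, the text ``basil''. We assume that $f(P,I)=I$ and $f(I,P)=P$
for identities of the appropriate lengths, which is consistent with the
three rules for $f$ in Section \ref{follows}.
\begin{eqnarray*}
A' &=& AB = (0\rightarrow 5)[``\mathit{below}"] \\
X' &=& f(B,X) = (5\rightarrow 6)[0,1,``\mathit{si}",3,4] \\
f(X,B) &=& (5\rightarrow 6)[0,``\mathit{e}",2,3,``\mathit{ow}"] \\
Y' &=& f(f(X,B),I_5) = I_6 \\
D &=& f(I_5,f(X,B)) = f(X,B)
\end{eqnarray*}
Then $A'X'Y'$ is the text ``besiow'', and applying $D$ to the view
``basil'' also gives ``besiow'', so $A'X'Y'=VD$ in this instance.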
\subsection{Connect to server}
When a client connects to the server for the first time,
it first generates a random unique ID and sends this to
the server. The client remembers this ID and sends it
with each changeset to the server.
The client receives the latest version of the document
from the server, called HEADTEXT. The client then sets
\begin{itemize}
\item[] $A \leftarrow \mathrm{HEADTEXT}$
\item[] $X \leftarrow I_N$
\item[] $Y \leftarrow I_N$
\end{itemize}
And finally, the client displays HEADTEXT on the screen.
\section{Server Overview}
Like the client(s), the server has state and performs
operations. Operations are only performed in response to
messages from clients.
\section{Server State}
The server maintains a document as an ordered list of
\emph{revision records}. A revision record is a data
structure that contains a changeset and authorship
information.
\begin{verbatim}
RevisionRecord = {
ChangeSet,
Source (unique ID),
Revision Number (consecutive order, starting at 0)
}
\end{verbatim}
For efficiency, the server may also store a variable
called HEADTEXT, which is the composition of all
changesets in the list of revision records. This is an
optimization, because clearly this can be computed from
the set of revision records.
\section{Server Operations Overview}
The server does two things in addition to maintaining
state representing the set of connected clients and
remembering what revision number each client is up to date
with:
\begin{enumerate}
\item Respond to a client's connection requesting the initial document.
\item Respond to a client's submission of a new changeset.
\end{enumerate}
\subsection{Respond to client connect}
When a server receives a connection request from a client,
it receives the client's unique ID and stores that in the
server's set of connected clients. It then sends the
client the contents of HEADTEXT, and the corresponding
revision number. Finally the server notes that this
client is up to date with that revision number.
\subsection{Respond to client changeset}
When the server receives information from a client about
the client's changeset $C$, it does five things:
\begin{enumerate}
\item Notes that this change applies to revision number
$r_c$ (the client's latest revision).
\item Creates a new changeset $C'$ that is relative to the
server's most recent revision number, which we call
$r_H$ ($H$ for HEAD). $C'$ can be computed using
follows (Section \ref{follows}). Remember that the server has a series of
changesets,
$$S_0\rightarrow S_1\rightarrow \ldots S_{r_c}\rightarrow S_{r_c+1} \rightarrow \ldots \rightarrow S_{r_H} $$
$C$ is relative to $S_{r_c}$, but we need to compute $C'$ relative to $S_{r_H}$.
We can compute a new $C$ relative to $S_{r_c+1}$ by computing $f(S_{r_c+1},C)$. Similarly we can repeat for
$S_{r_c+2}$ and so forth until we have $C'$ represented relative to $S_{r_H}$.
\item Send $C'$ to all other clients
\item Send ACK back to original client
\item Add $C'$ to the server's list of revision records by creating a new revision record out of this and the client's ID.
\end{enumerate}
\appendix
\section*{Additional topics}
\begin{enumerate}
\item Optimizations (strips, more caching, etc.)
\item Pseudocode for composition, merge, and follow
\item How authorship information is used to color-code the document based on who typed what
\item How persistent connections are maintained between client and server
\end{enumerate}
\end{document}

Binary file not shown.


@ -0,0 +1,200 @@
\documentclass[12pt]{article}
\usepackage[T1]{fontenc}
\usepackage[USenglish]{babel}
\begin{document}
\title{Easysync Protocol}
\author{AppJet, Inc., with modifications by the Etherpad Foundation}
\date{\today}
\maketitle
\section{Attributes}
An ``attribute'' is a (key,value) pair such as
\verb|(author,abc123)| or \verb|(bold,true)|.
Sometimes an attribute is treated as an instruction to add
that attribute, in which case an empty value means to
remove it. So \verb|(bold,)| removes the ``bold''
attribute. Attributes are interned and given numeric IDs,
so the number ``\verb|6|'' could represent
``\verb|(bold,true)|'', for example. This mapping is
stored in an attribute pool which may be shared by
multiple changesets.
Entries in the pool must be unique, so that attributes can
be compared by their IDs. Attribute names cannot contain
commas.
A changeset looks something like the following:
\begin{verbatim}
Z:5g>1|5=2p=v*4*5+1$x
\end{verbatim}
With the corresponding pool containing these entries (among others):
\begin{itemize}
\item[] \verb|4| $\rightarrow$ \verb|(author,1059348573)|
\item[] \verb|5| $\rightarrow$ \verb|(bold,true)|
\end{itemize}
This changeset, together with the attribute pool,
represents inserting a bold letter ``x'' into the middle
of a line.
The string consists of:
\begin{itemize}
\item a letter \verb|Z| (the ``magic character'' and
format version identifier)
\item a series of punctuation marks (operation codes or
``opcodes'' for short), together with alphanumerics
(numeric values in base 36).
\item a dollar sign (\verb|$|)
\item a string of characters used for insertion operations
(the ``char bank'')
\end{itemize}
In the example above, if we separate out the operations
and convert the numbers to base 10, then we get:
\begin{verbatim}
Z :196 >1 |5=97 =31 *4 *5 +1 $x
\end{verbatim}
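The base-36 conversions used above can be checked by hand. In this
sketch the digits \verb|0|--\verb|9| stand for 0--9 and the letters
\verb|a|--\verb|z| for 10--35 (the usual base-36 convention, assumed
here):
\begin{eqnarray*}
\mathtt{5g} &=& 5\cdot 36 + 16 = 196 \\
\mathtt{2p} &=& 2\cdot 36 + 25 = 97 \\
\mathtt{v} &=& 31
\end{eqnarray*}
So the source text has 196 characters, the final text has 197, and the
single inserted character comes after $97+31=128$ kept characters.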
Here are descriptions of the operations, where capital
letters are variables:
\begin{description}
\item{{\bf :N}} \quad \\
Source text has length $N$ (must be first op)
\item{{\bf >N}} \quad \\
Final text is $N$ (positive) characters longer than source
text (must be second op)
\item{{\bf <N }} \quad \\
Final text is $N$ (positive) characters shorter than
source text (must be second op)
\item{{\bf >0 }} \quad \\
Final text is same length as source text
\item{{\bf +N }} \quad \\
Insert $N$ characters from the bank, none of them newlines
\item{{\bf -N}} \quad \\
Skip over (delete) $N$ characters from the source text,
none of them newlines
\item{{\bf =N}} \quad \\
Keep $N$ characters from the source text, none of them newlines
\item{{\bf |L+N}} \quad \\
Insert $N$ characters from the bank, containing $L$
newlines. The last character inserted MUST be a newline,
but not the (new) document's final newline.
\item{{\bf |L-N}} \quad \\
Delete $N$ characters from the source text, containing $L$
newlines. The last character deleted MUST be a newline,
but not the (old) document's final newline.
\item{{\bf |L=N}} \quad \\
Keep $N$ characters from the source text, containing $L$
newlines. The last character kept MUST be a newline, and
the final newline of the document is allowed.
\item{{\bf *I}} \quad \\
Apply attribute $I$ from the pool to the following
\verb|+|, \verb|=|, \verb_|+_, or \verb_|=_ command. In
other words, any number of \verb|*| ops can come before a
\verb_+_, \verb_=_, or \verb_|_ but not between a \verb_|_
and the corresponding \verb_+_ or \verb_=_. If \verb_+_,
text is inserted having this attribute. If \verb_=_, text
is kept but with the attribute applied as an attribute
addition or removal. Consecutive attributes must be sorted
lexically by (key,value) with key and value taken as
strings. It's illegal to have duplicate keys for
(key,value) pairs that apply to the same text. It's
illegal to have an empty value for a key in the case of an
insertion (\verb_+_); the pair should just be omitted.
\end{description}
Characters from the source text that aren't accounted for
are assumed to be kept with the same attributes.
\paragraph{Additional Constraints}
\begin{itemize}
\item Consecutive \verb_+_, \verb_-_, and \verb_=_ ops of
the same type that could be combined are not allowed.
Whether combination is possible depends on the
attributes of the ops and whether each is multiline or
not. For example, two multiline deletions can never be
consecutive, nor can any insertion come after a
non-multiline insertion with the same attributes.
\item ``No-op'' ops are not allowed, such as deleting 0
characters. However, attribute applications that don't
have any effect are allowed.
\item Characters at the end of the source text cannot be
explicitly kept with no changes; if the change doesn't
affect the last $N$ characters, those ``keep'' ops must
be left off.
\item In any consecutive sequence of insertions (\verb_+_)
and deletions (\verb_-_) with no keeps (\verb_=_), the
deletions must come before the insertions.
\item The document text before and after will always end
with a newline. This policy avoids a lot of
special-casing of the end of the document. If a final
newline is always added when importing text and removed
when exporting text, then the changeset representation
can be used to process text files that may or may not
have a final newline.
\end{itemize}
\paragraph{Attribution string}
An \emph{attribution string} is a series of inserts with
no deletions or keeps. For example, ``\verb_*3+8|1+5_''
describes the attributes of a string of length 13, where
the first 8 chars have attribute 3 and the next 5 chars
have no attributes, with the last of these 5 chars being a
newline. Constraints apply similar to those affecting
changesets, but the restriction about the final newline of
the new document being added doesn't apply.
Attributes in an attribution string cannot be empty, like
``\verb|(bold,)|''; they should instead be absent.
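As a quick check of the example above (a sketch, not additional
specification), the two insert ops of \verb_*3+8|1+5_ give
$$8 + 5 = 13$$
characters, and the \verb_|1_ prefix on the second op says that those 5
characters contain one newline.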
\section{Further Considerations}
\begin{itemize}
\item composing changesets/attributions with different
pools.
\item generalizing ``applyToAttribution'' to make
``mutateAttributionLines'' and ``compose''
\end{itemize}
\section{Using Unicode?}
\begin{itemize}
\item no unicode (for efficient escaping, sightliness)
\item efficient operations for ACE and collab (attributed text, etc.)
\item good for time-slider
\item good for API
\item line-ending aware
\item[X] more coherent (deleting or styling text merging with insertion)
\item server-side syntax highlighting?
\item unify author map with attribute pool
\item unify attributed text with changeset rep
\item not: reversible
\item force final newline of document to be preserved
\end{itemize}
\paragraph{Unicode bad!}
\begin{itemize}
\item ugly (hard to read)
\item more complex to parse
\item harder to store and transmit correctly
\item doesn't save all that much space anyway
\item blows up in size when string-escaped
\item embarrassing for API
\end{itemize}
\end{document}


@ -0,0 +1,133 @@
Copied from the old Etherpad. Found in /infrastructure/ace/
Goals:
- no unicode (for efficient escaping, sightliness)
- efficient operations for ACE and collab (attributed text, etc.)
- good for time-slider
- good for API
- line-ending aware
X more coherent (deleting or styling text merging with insertion)
- server-side syntax highlighting?
- unify author map with attribute pool
- unify attributed text with changeset rep
- not: reversible
- force final newline of document to be preserved
- Unicode bad:
- ugly (hard to read)
- more complex to parse
- harder to store and transmit correctly
- doesn't save all that much space anyway
- blows up in size when string-escaped
- embarrassing for API
# Attributes:
An "attribute" is a (key,value) pair such as (author,abc123456) or
(bold,true). Sometimes an attribute is treated as an instruction to
add that attribute, in which case an empty value means to remove it.
So (bold,) removes the "bold" attribute. Attributes are interned and
given numeric IDs, so the number "6" could represent "(bold,true)",
for example. This mapping is stored in an attribute "pool" which may
be shared by multiple changesets.
Entries in the pool must be unique, so that attributes can be compared
by their IDs. Attribute names cannot contain commas.
A changeset looks something like the following:
Z:5g>1|5=2p=v*4*5+1$x
With the corresponding pool containing these entries:
...
4 -> (author,1059348573)
5 -> (bold,true)
...
This changeset, together with the pool, represents inserting
a bold letter "x" into the middle of a line. The string consists of:
- a letter Z (the "magic character" and format version identifier)
- a series of opcodes (punctuation) and numeric values in base 36 (the
alphanumerics)
- a dollar sign ($)
- a string of characters used by insertion operations (the "char bank")
If we separate out the operations and convert the numbers to base 10, we get:
Z :196 >1 |5=97 =31 *4 *5 +1 $"x"
Here are descriptions of the operations, where capital letters are variables:
":N" : Source text has length N (must be first op)
">N" : Final text is N (positive) characters longer than source text (must be second op)
"<N" : Final text is N (positive) characters shorter than source text (must be second op)
">0" : Final text is same length as source text
"+N" : Insert N characters from the bank, none of them newlines
"-N" : Skip over (delete) N characters from the source text, none of them newlines
"=N" : Keep N characters from the source text, none of them newlines
"|L+N" : Insert N characters from the bank, containing L newlines. The last
character inserted MUST be a newline, but not the (new) document's final newline.
"|L-N" : Delete N characters from the source text, containing L newlines. The last
character deleted MUST be a newline, but not the (old) document's final newline.
"|L=N" : Keep N characters from the source text, containing L newlines. The last character
kept MUST be a newline, and the final newline of the document is allowed.
"*I" : Apply attribute I from the pool to the following +, =, |+, or |= command.
In other words, any number of * ops can come before a +, =, or | but not
between a | and the corresponding + or =.
If +, text is inserted having this attribute. If =, text is kept but with
the attribute applied as an attribute addition or removal.
Consecutive attributes must be sorted lexically by (key,value) with key
and value taken as strings. It's illegal to have duplicate keys
for (key,value) pairs that apply to the same text. It's illegal to
have an empty value for a key in the case of an insertion (+); the
pair should just be omitted.
Characters from the source text that aren't accounted for are assumed to be kept
with the same attributes.
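
The operation descriptions above can be turned into a small tokenizer. The following is only a sketch of that reading, not the Etherpad implementation: the function name parseChangeset and the shape of the returned object are assumptions made for illustration.

```javascript
// Sketch of a changeset tokenizer based on the op descriptions above.
// Hypothetical: the name and the returned shape are not Etherpad's API.
function parseChangeset(cs) {
  // Header: magic "Z", ":N" (source length), then ">N" or "<N" (length change).
  const header = /^Z:([0-9a-z]+)([><])([0-9a-z]+)/.exec(cs);
  if (!header) throw new Error("not a changeset: " + cs);
  const bankStart = cs.indexOf("$");
  if (bankStart < 0) throw new Error("missing char bank separator");

  const oldLen = parseInt(header[1], 36); // all numbers are base 36
  const change = parseInt(header[3], 36);
  const newLen = header[2] === ">" ? oldLen + change : oldLen - change;

  // Each op: optional "*I" attributes, optional "|L" line count, opcode, N.
  const opsString = cs.substring(header[0].length, bankStart);
  const opRegex = /((?:\*[0-9a-z]+)*)(?:\|([0-9a-z]+))?([-+=])([0-9a-z]+)/g;
  const ops = [];
  let match;
  while ((match = opRegex.exec(opsString)) !== null) {
    ops.push({
      attribs: match[1] ? match[1].split("*").slice(1).map((a) => parseInt(a, 36)) : [],
      opcode: match[3],
      lines: match[2] ? parseInt(match[2], 36) : 0,
      chars: parseInt(match[4], 36),
    });
  }
  return { oldLen: oldLen, newLen: newLen, ops: ops, charBank: cs.substring(bankStart + 1) };
}

const parsedExample = parseChangeset("Z:5g>1|5=2p=v*4*5+1$x");
console.log(parsedExample.oldLen, parsedExample.newLen, parsedExample.charBank);
console.log(parsedExample.ops);
```

For the example changeset this yields a source length of 196, a final length of 197, three ops (keep 97 chars over 5 lines, keep 31 chars, insert 1 char with attributes 4 and 5), and the char bank "x". The sketch does not validate the additional constraints listed in this file.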
Additional Constraints:
- Consecutive +, -, and = ops of the same type that could be combined are not allowed.
Whether combination is possible depends on the attributes of the ops and whether
each is multiline or not. For example, two multiline deletions can never be
consecutive, nor can any insertion come after a non-multiline insertion with the
same attributes.
- "No-op" ops are not allowed, such as deleting 0 characters. However, attribute
applications that don't have any effect are allowed.
- Characters at the end of the source text cannot be explicitly kept with no changes;
if the change doesn't affect the last N characters, those "keep" ops must be left off.
- In any consecutive sequence of insertions (+) and deletions (-) with no keeps (=),
the deletions must come before the insertions.
- The document text before and after will always end with a newline. This policy avoids
a lot of special-casing of the end of the document. If a final newline is
always added when importing text and removed when exporting text, then the
changeset representation can be used to process text files that may or may not
have a final newline.
Attribution string:
An "attribution string" is a series of inserts with no deletions or keeps.
For example, "*3+8|1+5" describes the attributes of a string of length 13,
where the first 8 chars have attribute 3 and the next 5 chars have no
attributes, with the last of these 5 chars being a newline. Constraints
apply similar to those affecting changesets, but the restriction about
the final newline of the new document being added doesn't apply.
Attributes in an attribution string cannot be empty, like "(bold,)"; they should
instead be absent.
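
The attribution string example can be read mechanically as well. This is a minimal sketch under the same assumptions as the descriptions above (insert ops only, numbers in base 36); the function name readAttribution and its return shape are hypothetical.

```javascript
// Sketch only: reads an attribution string such as "*3+8|1+5".
// Hypothetical helper; it accepts insert (+) ops and ignores anything else.
function readAttribution(attribution) {
  const insertRegex = /((?:\*[0-9a-z]+)*)(?:\|([0-9a-z]+))?\+([0-9a-z]+)/g;
  const runs = [];
  let length = 0;
  let match;
  while ((match = insertRegex.exec(attribution)) !== null) {
    const chars = parseInt(match[3], 36);
    runs.push({
      // Attribute IDs that apply to this run of characters.
      attribs: match[1] ? match[1].split("*").slice(1).map((a) => parseInt(a, 36)) : [],
      // Number of newlines contained in the run (0 when there is no "|L").
      lines: match[2] ? parseInt(match[2], 36) : 0,
      chars: chars,
    });
    length += chars;
  }
  return { length: length, runs: runs };
}

const attributionExample = readAttribution("*3+8|1+5");
console.log(attributionExample.length);
console.log(attributionExample.runs);
```

For "*3+8|1+5" the total length is 13: a run of 8 characters with attribute 3, followed by a run of 5 characters with no attributes that contains one newline.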
-------
Considerations:
- composing changesets/attributions with different pools
- generalizing "applyToAttribution" to make "mutateAttributionLines" and "compose"

23
doc/template.html Normal file

@ -0,0 +1,23 @@
<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>__SECTION__ Etherpad-Lite Manual &amp; Documentation</title>
<link rel="stylesheet" href="style.css">
</head>
<body class="apidoc" id="api-section-__FILENAME__">
<header id="header">
<h1>Etherpad-Lite Manual &amp; Documentation</h1>
</header>
<div id="toc">
<h2>Table of Contents</h2>
__TOC__
</div>
<div id="apicontent">
__CONTENT__
</div>
</body>
</html>