Before this commit, webaccess.checkAccess saved the authorization in
user.padAuthorizations[padId] with padId being the read-only pad ID,
however later stages, e.g. in PadMessageHandler, use the real pad ID for
access checks. This led to authorization being denied.
This commit fixes it by only storing and comparing the real pad IDs and
not read-only pad IDs.
This fixes test case "authn user readonly pad -> 200, ok" in
src/tests/backend/specs/socketio.js.
Normally I would let `eslint --fix` do this for me, but there's a bug
that causes:
const x = function ()
{
// ...
};
to become:
const x = ()
=> {
// ...
};
which ESLint thinks is a syntax error. (It probably is; I don't know
enough about the automatic semicolon insertion rules to be confident.)
This will be a breaking change for some people.
We removed all internal password control logic. If this affects you, you have two options:
1. Use a plugin for authentication and use session based pad access (recommended).
1. Use a plugin for password setting.
The reasoning for removing this feature is to reduce the overall security footprint of Etherpad. It is unnecessary and cumbersome to keep this feature and with the thousands of available authentication methods available in the world our focus should be on supporting those and allowing more granual access based on their implementations (instead of half assed baking our own).
There's no need to perform an authentication check in the socket.io
middleware because `PadMessageHandler.handleMessage` calls
`SecurityMananger.checkAccess` and that now performs authentication
and authorization checks.
This change also improves the user experience: Before, access denials
caused socket.io error events in the client, which `pad.js` mostly
ignores (the user doesn't see anything). Now a deny message is sent
back to the client, which causes `pad.js` to display an obvious
permission denied message.
This also fixes a minor bug: `settings.loadTest` is supposed to bypass
authentication and authorization checks, but they weren't bypassed
because `SecurityManager.checkAccess` did not check
`settings.loadTest`.
Before, a malicious user could bypass authorization restrictions
imposed by the authorize hook:
* Step 1: Fetch any resource that the malicious user is authorized to
access (e.g., static content).
* Step 2: Use the signed express_sid cookie generated in step 1 to
create a socket.io connection.
* Step 3: Perform the CLIENT_READY handshake for the desired pad.
* Step 4: Profit!
Now the authorization decision made by the authorize hook is
propagated to SecurityManager so that it can approve or reject
socket.io messages as appropriate.
This also sets up future support for per-user read-only and
modify-only (no create) authorization levels.
* Move session validity check and session author ID fetch to a
separate function. This separate function can be used by hooks,
making it easier for them to properly determine the author ID.
* Rewrite the remainder of checkAccess. Benefits:
- The function is more readable and maintainable now.
- Vulnerability fix: Before, the session IDs in sessionCookie
were not validated when checking settings.requireSession. Now,
sessionCookie must identify a valid session for the
settings.requireSession test to pass.
- Bug fix: Before, checkAccess would sometimes use the author ID
associated with the token even if sessionCookie identified a
valid session. Now it always uses the author ID associated
with the session if available.
There are two different ways an author ID becomes associated with a
user: either bound to a token or bound to a session ID. (The token and
session ID come from the `token` and `sessionID` cookies, or, in the
case of socket.io messages, from the `token` and `sessionID` message
properties.) When `settings.requireSession` is true or the user is
accessing a group pad, the session ID should be used. Otherwise the
token should be used.
Before this change, the `/p/:pad/import` handler was always using the
token, even when `settings.requireSession` was true. This caused the
following error because a different author ID was bound to the token
versus the session ID:
> Unable to import file into ${pad}. Author ${authorID} exists but he
> never contributed to this pad
This bug was reported in issue #4006. PR #4012 worked around the
problem by binding the same author ID to the token as well as the
session ID.
This change does the following:
* Modifies the import handler to use the session ID to obtain the
author ID (when appropriate).
* Expands the documentation for the SecurityManager checkAccess
function.
* Removes the workaround from PR #4012.
* Cleans up the `bin/createUserSession.js` test script.
"token" is a random token representing the author, of the form
t.randomstring_of_lenght_20. The random string is generated by the client. The
cookie is used for every pad in the web UI, and is not used for HTTP API.
This comes from the discussion at https://github.com/ether/etherpad-lite/issues/3563
Sometimes, RFC 6265-compliant [0] web servers may send back a cookie whose value
is enclosed in double quotes, such as:
Set-Cookie: sessionCookie="s.37cf5299fbf981e14121fba3a588c02b,s.2b21517bf50729d8130ab85736a11346"; Version=1; Path=/; Domain=localhost; Discard
Where the double quotes at the start and the end of the header value are just
delimiters. This is perfectly legal: Etherpad parsing logic should cope with
that, and remove the quotes early in the request phase.
Somehow, this does not happen, and in such cases the actual value that
sessionCookie ends up having is:
sessionCookie = '"s.37cf5299fbf981e14121fba3a588c02b,s.2b21517bf50729d8130ab85736a11346"'
As quick measure, let's strip the double quotes (when present).
Note that here we are being minimal, limiting ourselves to just removing quotes
at the start and the end of the string.
Fixes#3819.
Also, see #3820.
[0] https://tools.ietf.org/html/rfc6265
Steps to reproduce (via HTTP API):
1. create a group via createGroup()
2. create a group pad inside that group via createGroupPad()
3. make that pad public calling setPublicStatus(true)
4. access the pad via a clean web browser (with no sessions)
5. UnhandledPromiseRejectionWarning: apierror: sessionID does not exist
This was due to an overlook in 769933786c: "apierror: sessionID does not
exist" may be a legal condition if we are also visiting a public pad. The
function that could throw that error was sessionManager.getSessionInfo(), and
thus it needed to be inside the try...catch block.
Please note that calling getText() on the pad always return the pad contents,
*even for non-public pads*, because the API bypasses the security checks and
directly talks to the DB layer.
Fixes#3600.
some code chunks previously used `async.parallel` but if you
use `await` that forces them to be run serially. Instead,
you can initiate the operation (getting a Promise) and then
_later_ `await` the result of that Promise.
Also converted the handler functions that depend on checkAccess() into async
functions too.
NB: this commit needs specific attention to it because it touches a lot of
security related code!
This change is in preparation of the future async refactoring by Ray. It tries
to extract as many changes in boolean conditions as possible, in order to make
more evident identifying eventual logic bugs in the future work.
This proved already useful in at least one case.
BEWARE: this commit exposes an incoherency in the DB API, in which, depending
on the driver used, some functions can return null or undefined. This condition
will be externally fixed by the final commit in this series ("db/DB.js: prevent
DB layer from returning undefined"). Until that commit, the code base may have
some bugs.
This change is only cosmetic. Its aim is do make it easier to understand the
async changes that are going to be merged later on. It was extracted from the
original work from Ray Bellis.
To verify that nothing has changed, you can run the following command on each
file touched by this commit:
npm install uglify-es
diff --unified <(uglify-js --beautify bracketize <BEFORE.js>) <(uglify-js --beautify bracketize <AFTER.js>)
This is a complete script that does the same automatically (works from a
mercurial clone):
```bash
#!/usr/bin/env bash
set -eu
REVISION=<THIS_REVISION>
PARENT_REV=$(hg identify --rev "${REVISION}" --template '{p1rev}')
FILE_LIST=$(hg status --no-status --change ${REVISION})
UGLIFYJS="node_modules/uglify-es/bin/uglifyjs"
for FILE_NAME in ${FILE_LIST[@]}; do
echo "Checking ${FILE_NAME}"
diff --unified \
<("${UGLIFYJS}" --beautify bracketize <(hg cat --rev "${PARENT_REV}" "${FILE_NAME}")) \
<("${UGLIFYJS}" --beautify bracketize <(hg cat --rev "${REVISION}" "${FILE_NAME}"))
done
```
This is documented to be more performant.
The substitution was made on frontend code, too (i.e., the one in /static),
because Date.now() is supported since IE 9, and we are life supporting only
IE 11.
Commands:
find . -name *.js | xargs sed --in-place "s/new Date().getTime()/Date.now()/g"
find . -name *.js | xargs sed --in-place "s/(new Date()).getTime()/Date.now()/g"
Not done on jQuery.