Breaking

JWKS rotate / cleanup scheduler

The JWKS rotate scheduler was not started in some recent versions when it should have been. For this reason, depending on which version you were running and for how long, the cleanup scheduler might clean up "too much". It is advisable to trigger a manual JWKS rotation before upgrading:

Admin UI -> Config -> JWKS -> Rotate Keys

If you see errors after the upgrade such as "kid not found" when public keys are fetched for validation, you need to rotate manually once.

`preferred_username` in Tokens

Until now, the preferred_username was always added to both the access_token and the id_token, and it always contained the same value as the email claim. This is NOT the case anymore! The value is now configurable. To match the OIDC spec, it will never be added to the access_token anymore, and it will only exist in the id_token if the client requested the profile scope during login. The value of this claim depends on your configuration. For more details, check the "preferred_username and tz" changes below.

Because of these changes, the email will no longer show up as the username in the response from the OAuth2 /introspect endpoint either.

Note: If you are using the rauthy-client, make sure to upgrade it to 0.11 beforehand.

User Request and Response API data

The user values are much more configurable now (see the changes below). At the same time, the given_name is now always optional in responses from the API. The values required during user registration, if you have an open endpoint and use direct API requests from somewhere else, have changed as well. They now also depend on your configuration.

If you don't change anything in the new [user_values] section, you will not experience any breaking changes for direct API requests.

Changes

`AuthorizedKeys` for PAM users

If a user is linked to an existing PAM user, and the config allows it, users can upload their own public keys. A server can then make use of the AuthorizedKeysCommand via the sshd_config and resolve these public keys dynamically:

AuthorizedKeysCommand
   Specifies a program to be used to look up the user's
   public keys.  The program must be owned by root, not
   writable by group or others and specified by an absolute
   path.  Arguments to AuthorizedKeysCommand accept the
   tokens described in the “TOKENS” section.  If no arguments
   are specified then the username of the target user is
   used.

   The program should produce on standard output zero or more
   lines of authorized_keys output (see “AUTHORIZED_KEYS” in
   sshd(8)).  AuthorizedKeysCommand is tried after the usual
   AuthorizedKeysFile files and will not be executed if a
   matching key is found there.  By default, no
   AuthorizedKeysCommand is run.

rauthy-pam-nss was updated and can work with this new feature. You need to update to v0.2.0 for compatibility.

You will have the following new config options:

[pam.authorized_keys]

# If set to `true`, a user with a linked PAM user can upload
# public SSH keys via the account dashboard. This is disabled
# by default, because the auto-expiring PAM user passwords are
# the safer option.
#
# default: true
# overwritten by: PAM_SSH_AUTHORIZED_KEYS_ENABLE
authorized_keys_enable = true

# By default, even though these are "public" keys, the endpoint
# to retrieve them requires authentication. This will be a `basic`
# `Authorization` header in the form of `host_id:host_secret` of
# any valid PAM host configured on Rauthy.
# If you set it to `false`, the endpoint will be publicly available.
# This is fine in the sense that you cannot leak any keys (they are
# public keys anyway), but the endpoint could be abused for username
# enumeration. Depending on the `include_comments` settings below,
# you might even leak some more information that is not strictly
# sensitive, but could be abused in some other way.
#
# default: true
# overwritten by: PAM_SSH_AUTH_REQUIRED
auth_required = true

# By default, SSH keys that have expired because of
# `forced_key_expiry_days` below will be added to an internal
# blacklist. This blacklist will be checked when a key is added to
# make sure keys were actually rotated and that an old key is not
# added again.
#
# default: true
# overwritten by: PAM_SSH_BLACKLIST_SSH_KEYS
blacklist_used_keys = true

# Configure the days after which blacklisted SSH keys will be
# cleaned up.
#
# default: 730
# overwritten by: PAM_SSH_BLACKLIST_CLEANUP_DAYS
blacklist_cleanup_days = 730

# You can include comments in the public response for the
# `authorized_keys` for each user. This can be helpful for
# debugging, but should generally be disabled to not
# disclose any possibly somewhat "internal" information.
#
# default: true
# overwritten by: PAM_SSH_INCLUDE_COMMENTS
include_comments = true

# You can enforce an SSH key expiry in days. After this time,
# users must generate new keys. This enforces a key rotation
# which is usually overlooked, especially for SSH keys.
# Set to `0` to disable the forced expiry.
#
# default: 365
# overwritten by: PAM_SSH_KEY_EXP_DAYS
forced_key_expiry_days = 365

#1249
#1250
#1253
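To make the options above a bit more concrete, here is a minimal sketch of how a host could build the `basic` header value and how the expiry and blacklist settings interact. This is not Rauthy's or rauthy-pam-nss's actual code. The function names are hypothetical and only mirror the config keys described above.

```python
import base64
from datetime import datetime, timedelta, timezone


def basic_auth_header(host_id: str, host_secret: str) -> str:
    """Build an HTTP Basic header value from `host_id:host_secret`."""
    raw = f"{host_id}:{host_secret}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def key_is_expired(added: datetime, now: datetime, forced_key_expiry_days: int) -> bool:
    """A value of 0 disables the forced expiry, as described in the config."""
    if forced_key_expiry_days == 0:
        return False
    return now >= added + timedelta(days=forced_key_expiry_days)


def may_add_key(public_key: str, blacklist: set[str], blacklist_used_keys: bool) -> bool:
    """Reject a key that is on the blacklist, if the blacklist is enabled."""
    return not (blacklist_used_keys and public_key in blacklist)
```

With the default `forced_key_expiry_days = 365`, a key added on 2024-01-01 counts as expired on 2025-01-01 in this sketch, and it could not be uploaded again while `blacklist_used_keys` is enabled.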

`preferred_username` and `tz`

The custom user values have been expanded. Each user can now provide a preferred_username and a tz (timezone) via the account dashboard. The default timezone will always be UTC, just like it was up until this version. The preferred_username behavior depends on some new configuration values. In addition to that, the requirements for all other already existing values have more config options as well. Everything that is required will also be requested during the initial registration, if you have an open registration endpoint.

Because we have these new values, they will also show up in the id_token if the profile scope was requested. Until now, the preferred_username was always present and simply set to the email. However, this has the potential to produce issues in downstream clients, if they don't handle the preferred_username properly and require some specific value (which they really should not ...). If the user has anything other than UTC or Etc/UTC configured as timezone, the zoneinfo claim will be added to the id_token as well.

CAUTION: If your client does not request the profile scope during login, the preferred_username will NOT be set to the email as was the case up until this version!
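The claim behavior described above can be sketched roughly like this. It is only an illustration and not Rauthy's implementation. The function name and parameters are hypothetical, and it assumes that the zoneinfo claim is tied to the profile scope as well and that the email fallback can be switched off via config.

```python
def id_token_profile_claims(scopes, email, preferred_username=None,
                            tz="UTC", email_fallback=True):
    """Return the username / timezone claims that would land in the id_token."""
    claims = {}
    # Without the `profile` scope, none of these claims are added.
    if "profile" not in scopes:
        return claims
    if preferred_username:
        claims["preferred_username"] = preferred_username
    elif email_fallback:
        # No username set for this user: fall back to the email.
        claims["preferred_username"] = email
    # UTC is the default and does not produce a `zoneinfo` claim.
    if tz not in ("UTC", "Etc/UTC"):
        claims["zoneinfo"] = tz
    return claims
```

A client that only requests `openid` gets an empty result here, which is exactly the situation the CAUTION above warns about.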

These are the new config options:

[user_values]

# In this section, you can configure the requirements for different
# user values to adjust them to your needs. The `preferred_username`
# as a special value provides some additional options.
# The `email` is and always will be mandatory.
#
# A value of `hidden` will only hide these values for normal users
# in the account dashboard. An admin will still see all values.
#
# You can set one of the following values:
# - required
# - optional
# - hidden

# default: 'required'
given_name = 'required'
# default: 'optional'
family_name = 'optional'
# default: 'optional'
birthdate = 'optional'
# default: 'optional'
street = 'optional'
# default: 'optional'
zip = 'optional'
# default: 'optional'
city = 'optional'
# default: 'optional'
country = 'optional'
# default: 'optional'
phone = 'optional'
# default: 'optional'
tz = 'optional'

[user_values.preferred_username]

# If the `preferred_username` is not set for a given user, the
# `email` will be used as a fallback. This can happen, if it is
# not set to `required`, or if you had it optional before and
# then changed it, while the user may have not updated it yet
# according to the new policy.
#
# one of: required, optional, hidden
# default: 'optional'
preferred_username = 'optional'

# The `preferred_username` is an unstable claim by the OIDC RFC.
# This means it MUST NOT be trusted to be unique, be a stable
# map / uid for a user, or anything like that. It is "just
# another value" and should be treated like that.
#
# However, `preferred_username`s from Rauthy will always be
# guaranteed to be unique. You can define if these usernames
# are immutable once they are set, which is the default, or if
# users can change them freely at any time.
#
# default: true
immutable = true

# Provide an array of blacklisted names.
#
# CAUTION: Provide all these names as lowercase! The value
#  submitted via API will be converted to lowercase and
#  then compared to each entry in this list.
#
# default: ['admin', 'administrator', 'root']
blacklist = ['admin', 'administrator', 'root']

# You can define the validation regex / pattern.
#
# The `pattern_html` will be sent to the frontend as a
# String value dynamically. It must be formatted in a way,
# that it will work as a
# [`pattern` attribute](https://developer.mozilla.org/en-US/docs/Web/HTML/Reference/Attributes/pattern)
# after the conversion. If you are unsure if it works, check
# your developer tools console. You will see an error log
# if the conversion fails.
# NOTE: These are NOT Javascript regexes!
#
# By default, the validation matches the Linux username regex,
# but you may want to increase the minimum characters for
# instance.
#
# default: '^[a-zA-Z0-9][a-zA-Z0-9-.]*[a-zA-Z0-9]$'
regex_rust = '^[a-zA-Z0-9][a-zA-Z0-9-.]*[a-zA-Z0-9]$'
# default: '^[a-z][a-z0-9_\-]{1,61}$'
pattern_html = '^[a-z][a-z0-9_\-]{1,61}$'

# If a user does not have a `preferred_username`, the `email`
# can be used as a fallback value for the id token.
#
# default: true
email_fallback = true
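As a rough illustration of how the `blacklist` and `regex_rust` defaults work together, here is a minimal sketch. Python's `re` module is not the Rust regex engine Rauthy uses, so treat this as an approximation of the default pattern only. The function name is hypothetical.

```python
import re

# Defaults copied from the config above.
BLACKLIST = ["admin", "administrator", "root"]
REGEX = re.compile(r"^[a-zA-Z0-9][a-zA-Z0-9-.]*[a-zA-Z0-9]$")


def username_is_valid(name: str) -> bool:
    """Check a submitted `preferred_username` against blacklist and pattern."""
    # The submitted value is lowercased before the blacklist comparison,
    # which is why all blacklist entries must be lowercase.
    if name.lower() in BLACKLIST:
        return False
    return REGEX.fullmatch(name) is not None
```

Note that the default pattern needs at least 2 characters, because it requires a separate first and last alphanumeric character.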

Terms of Service

It is now possible to add Terms of Service (ToS) to Rauthy. These can be found in the Admin UI under Config -> ToS. I am not a lawyer, but I would say the implementation is legally "safe". If you have an open registration, the latest existing ToS are shown to the user and the registration can only be completed after accepting them. The other situation is an update to the ToS for existing users. To be legally correct, a user must accept them in the middle of a login flow.

After ToS have been added, they are immutable in every regard, which is another important thing for legal reasons. However, you can always add a new version, which users then have to accept. It is also possible to enable an optional transition time for new ToS. For instance, if you have Rauthy in front of an application that contains user data, you can make it possible for users to at least access and download their data, even if they don't want to accept the new ToS. Only after the transition time is over does it become mandatory to accept the updated ToS.

It is also possible to check the accept status for each user via the same page on the Admin UI.

While working on this feature, the code logic of the login flow received some major refactoring. The goal was to simplify everything and also make it easier to maintain in the future, because the ToS added some additional complexity.

You can do most things via the Admin UI, but there is a single new config value:

[tos]

# The timeout in seconds for a user to accept updated ToS during the
# login flow.
# The initial lifetime of an AuthCode after a successful authentication
# will be extended by the `accept_timeout`. This gives the user a bit
# more time to read through updated ToS and avoids an AuthCode expiry
# if it takes a bit longer. This is mainly a UX improvement. After the
# ToS have been accepted, the original AuthCode will be re-saved with
# the actual lifetime to not weaken the security in these cases.
#
# CAUTION: Even though you can extend the lifetime on Rauthy's side, you
# can run into issues with logins on the client side. For legal reasons,
# accepting updated ToS must happen after a successful login but before
# providing any access. Login flows are not only time-limited on Rauthy's
# side, but most often also on the client side. This means if it takes
# too long to read and accept updated ToS, the user may run into an auth
# error and have to log in again.
#
# default: 900
# overwritten by: TOS_ACCEPT_TIMEOUT
accept_timeout = 900

#1221

Send custom E-Mail

It is now possible to send custom E-Mails to users and filtered user groups. This is important for instance when you are planning a bigger maintenance window, or maybe you have a deadline for a specific client / user group when you enforce MFA-secured logins, and so on.

You can now find a simple editor in the Admin UI -> Users overview in the navigation. You can decide to send E-Mails to all users, or filter them by:

in group
not in group
has role
has not role

It is also possible to not send out the mails directly, but schedule them to a specific date and time. There is no such thing as embedding images or sending attachments though.

You will have the following new config options:

[email.jobs]

# This section configures email sending to users, which can
# be done via the Admin UI's user page. These settings only
# apply to custom emails sent via UI. All automatic mails like
# a new user registration will be sent immediately.

# If an open email job has not been updated for more than
# `orphaned_seconds` seconds, it will be considered as orphaned.
# In this case, the current cluster leader can pick up this
# job and start after the last successful email sent.
#
# default: 300
# overwritten by: EMAIL_JOBS_ORPHANED_SECONDS
orphaned_seconds = 300

# The interval in seconds at which the scheduler for orphaned
# or scheduled jobs should run and check. Smaller values
# increase precision for scheduled jobs at the cost of slightly
# higher resource usage.
#
# default: 300
# overwritten by: EMAIL_JOBS_SCHED_SECONDS
scheduler_interval_seconds = 300

# Configures the batch size and delay between batches of users
# for sending custom emails. The batch size configures the
# batch of users being retrieved from the DB at once. This means,
# if you have a filter on your email targets, the total amount
# of emails sent can be lower of course. Users are filtered
# on the client side to take the load off the DB.
#
# The default is pretty conservative to not have CPU and memory
# spikes if there is a huge amount of users, and to not overwhelm
# the SMTP server or reach rate limits.
# Depending on the speed of your SMTP server, the conservative
# default will handle ~5000 users in 1 hour. Even if it can
# take a higher load, be careful with sending too quickly to not
# trigger spam filters. Only increase throughput if needed.
#
# Note: If any error comes up during a batch, some users from this
# very batch may get duplicate emails when it is retried after
# being marked as orphaned.
#
# default: 3
# overwritten by: EMAIL_JOBS_BATCH_SIZE
batch_size = 3
#
# Delay in ms between email batches. If you set this to 0,
# Rauthy will send out emails as fast as possible. This
# should be avoided, especially for high user counts.
#
# default: 2000
# overwritten by: EMAIL_JOBS_BATCH_DELAY_MS
batch_delay_ms = 2000
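To see where the "~5000 users in 1 hour" estimate comes from, here is a small back-of-the-envelope sketch. The function is hypothetical, and `send_ms_per_batch` is an assumed value for how long your SMTP server needs per batch. It is not a Rauthy config option.

```python
def max_users_per_hour(batch_size: int, batch_delay_ms: int,
                       send_ms_per_batch: int = 0) -> int:
    """Estimate how many users can be handled per hour with the given settings."""
    # One cycle = sending a batch + the configured delay until the next one.
    cycle_ms = batch_delay_ms + send_ms_per_batch
    if cycle_ms <= 0:
        raise ValueError("without any delay or send time there is no upper bound")
    batches_per_hour = 3_600_000 // cycle_ms
    return batches_per_hour * batch_size
```

With the defaults (`batch_size = 3`, `batch_delay_ms = 2000`), the upper bound is 5400 users per hour if sending took no time at all. Any real SMTP latency pushes this down towards the ~5000 mentioned above.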

#1247

UI Improvements

The UI received some updates and improvements.

#1230
#1268
#1269

Customize Timestamp formatting in E-Mails

You can now customize how the timestamp in E-Mails will be formatted. In combination with the new tz value for users (see above), timestamps can now be formatted specifically for each user to avoid confusion with UTC.

These are the new config options:

[email.tz_fmt]

# The formatting of timestamps in emails can be configured
# depending on the users' language.
#
# You can generally use all options from
# https://docs.rs/chrono/0.4.42/chrono/format/strftime/index.html
#
# default: '%d.%m.%Y %T (%Z)'
# overwritten by: TZ_FMT_DE
de = '%d.%m.%Y %T (%Z)'
# default: '%m/%d/%Y %T (%Z)'
# overwritten by: TZ_FMT_EN
en = '%m/%d/%Y %T (%Z)'
# default: '%Y-%m-%d %T (%Z)'
# overwritten by: TZ_FMT_KO
ko = '%Y-%m-%d %T (%Z)'
# default: '%d.%m.%Y %T (%Z)'
# overwritten by: TZ_FMT_NO
no = '%d.%m.%Y %T (%Z)'
# default: '%d-%m-%Y %T (%Z)'
# overwritten by: TZ_FMT_ZHHANS
zhhans = '%d-%m-%Y %T (%Z)'

# If a user has no timezone set, you can configure a
# fallback. This is useful for instance when you run a
# regional deployment.
#
# default: 'UTC'
# overwritten by: TZ_FALLBACK
tz_fallback = 'UTC'
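To get a feeling for what these formats produce, here is a minimal sketch using Python's `strftime` instead of chrono. `%T` is written out as `%H:%M:%S` here, because Python does not support `%T` on every platform. The fallback to `en` for unknown languages is an assumption of this sketch, not documented behavior.

```python
from datetime import datetime, timezone

# Some of the defaults from above, with `%T` expanded to `%H:%M:%S`.
TZ_FMT = {
    "de": "%d.%m.%Y %H:%M:%S (%Z)",
    "en": "%m/%d/%Y %H:%M:%S (%Z)",
    "ko": "%Y-%m-%d %H:%M:%S (%Z)",
}


def format_timestamp(ts: datetime, lang: str) -> str:
    """Format a timezone-aware timestamp for the given user language."""
    return ts.strftime(TZ_FMT.get(lang, TZ_FMT["en"]))


ts = datetime(2025, 1, 2, 3, 4, 5, tzinfo=timezone.utc)
print(format_timestamp(ts, "de"))  # 02.01.2025 03:04:05 (UTC)
print(format_timestamp(ts, "en"))  # 01/02/2025 03:04:05 (UTC)
```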

#1246

User self-delete

Users can now be allowed to self-delete their accounts. By default, it is disabled, because when you are using e.g. SCIM, a user deletion can trigger quite a few events in other clients as well, and it might delete data that you need to clean up (or archive for legal reasons) before you can fully delete a user. So, it's opt-in, and it probably makes the most sense when you have an open registration as well.

[user_delete]

# You can enable user self-deletion via the Account Dashboard.
# It is disabled by default, because especially if you use things like
# SCIM, the deletion of a user might trigger a series of events which
# will delete other important data as well, that might be linked to a
# user account, and you want to clean up manually before a user is being
# fully deleted.
#
# default: false
# overwritten by: USER_ENABLE_SELF_DELETE
enable_self_delete = false

#1267

Relax input validation for Client URIs

The input validation for different URIs when configuring clients via the Admin UI has been relaxed. This makes it possible to e.g. have an allowed redirect_uri of http://localhost:* to work with dynamic callback ports.
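As an illustration of what such a wildcard entry means for dynamic callback ports, here is a minimal sketch. It assumes that a trailing `*` is treated as a simple prefix wildcard. This is an assumption for the example and not necessarily how Rauthy matches URIs internally. The function name is hypothetical.

```python
def redirect_uri_allowed(candidate: str, allowed: list[str]) -> bool:
    """Check a requested redirect_uri against the allowed entries."""
    for pattern in allowed:
        if pattern.endswith("*"):
            # Assumed semantics: everything before the `*` must match as a prefix.
            if candidate.startswith(pattern[:-1]):
                return True
        elif candidate == pattern:
            return True
    return False
```

With `http://localhost:*` in the list, `http://localhost:8123/callback` would be accepted in this sketch, while `http://localhost.example.com/callback` would not. Plain prefix matching is deliberately naive, so keep wildcard entries as narrow as possible.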

#1243

Rauthy Logo as client fallback

If you have a custom logo for the Rauthy client, the same logo will automatically be used as the fallback for all other clients that do not have a custom one of their own. This complements the fallback to the rauthy client branding in these cases, which was added in an earlier version.

#1242

`skip_okp=true` for GET `/oidc/certs`

As a workaround for some buggy OIDC client implementations like e.g. Cloudflare Zero Trust, you can now add skip_okp=true as a query param to the JWKS URI. If set to true, it will strip all OKP keys from the response. The URI will then look like this:

https://iam.example.com/auth/v1/oidc/certs?skip_okp=true
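The effect of the parameter can be sketched like this. It is only an illustration of the described behavior, not Rauthy's code. Both helper names are hypothetical, and the second function simply mimics on a plain dict what the server does to the response.

```python
from urllib.parse import urlencode


def jwks_uri(base: str, skip_okp: bool) -> str:
    """Build the JWKS URI, optionally with the `skip_okp=true` query param."""
    if not skip_okp:
        return base
    return f"{base}?{urlencode({'skip_okp': 'true'})}"


def strip_okp(jwks: dict) -> dict:
    """Drop all keys with `kty` == `OKP` from a JWKS document."""
    return {"keys": [k for k in jwks.get("keys", []) if k.get("kty") != "OKP"]}
```

A client that cannot handle OKP keys would then only see the remaining key types, e.g. RSA, in the response.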

#1263

HA Stability Improvements

The work was done in hiqlite, but with these updates, Rauthy's stability in HA deployments has improved quite a bit.

The cluster.shutdown_delay_millis config option was removed. It is no longer necessary to set it manually. Instead, more automatic detection is applied, and any delay needed to smooth out rolling releases or to make sure a container's readiness is caught is added without the need for additional config.

Apart from that, lots of improvements have been made to rolling releases and how WebSocket re-connects and node startups are being handled in general. There is a new /ready endpoint on the public API as well. It can be used in e.g. Kubernetes to smooth out rolling releases and detect a pod shutdown before it becomes unable to handle Raft requests. For this to work, periodSeconds must not be too high, and the headless service needs to set publishNotReadyAddresses so that ports are published before the pod is ready, like so:

apiVersion: v1
kind: Service
metadata:
  name: rauthy-headless
spec:
  type: ClusterIP
  clusterIP: None
  # Make sure to only publish them on the headless service 
  # and NOT the one you are using via your reverse proxy!
  publishNotReadyAddresses: true
  sessionAffinity: None
  selector:
    app: rauthy
  ports:
    - name: hiqlite-raft
      protocol: TCP
      port: 8100
      targetPort: 8100
    - name: hiqlite-api
      protocol: TCP
      port: 8200
      targetPort: 8200

Then you can make use of the new readiness check in the StatefulSet:

readinessProbe:
  httpGet:
    scheme: HTTP
    # Hiqlite API port
    port: 8200
    path: /ready
  initialDelaySeconds: 5
  # Do NOT increase this period, because otherwise K8s may not catch
  # a shutting down pod fast enough and may keep routing requests to
  # it while it will be unable to handle them properly because of
  # the shutdown.
  periodSeconds: 3
  # We may get a single failure during leader switches
  failureThreshold: 2
livenessProbe:
  httpGet:
    scheme: HTTP
    # Rauthy API port
    port: 8080
    path: /auth/v1/health
  initialDelaySeconds: 60
  periodSeconds: 30
  # We may get a single failure during leader switches
  failureThreshold: 2

Apart from that, the hiqlite-wal had a bug where the last_purged_log_id was overwritten with None during a log truncation, even if it had a value from a log purge before. If the node restarted before another log purge fixed it, it would result in an error during startup. The new version includes a check + fix, if you start up an instance with a data set that currently has this issue.

NOTE: Rauthy's shutdown in HA deployments can take up to 30 seconds when done gracefully, depending on the config and the current state the cluster is in. Some container runtimes may force-kill a container after only a few seconds by default. Make sure to adjust that timeout.

Bugfix

With a bigger internal code migration and cleanup some time ago, a few housekeeping schedulers got lost and were not started anymore.
#1247
The UI for the device_code flow had a wrong value for the user_code_length inserted via HTML <template>s.
#1258
A query param was missing in the SQL for cleaning up old JWKs.
#1265

sebadob/rauthy v0.33.0 on GitHub