Matrix auto redact bot

What?

The title pretty much says it: the bot gets rid of Matrix messages in specified rooms once a specified time has passed since they were posted.

Why?

Matrix doesn't apply any data retention policy by default, so all messages are kept indefinitely. To fix this, I created the Matrix auto redact bot, which redacts (deletes) messages from a room after N days.

I also got bored on a Friday evening, so I wanted to develop something where I could learn a bit more about Matrix.

How?

  1. Linux (RedHat, not Ubuntu / Debian / Alpine this time)

  2. Python

  3. PostgreSQL

Operation - Usage instructions

  1. Create a room

  2. Invite auto_redact_bot

  3. Bot joins the room automatically after a small delay

  4. Send the command !marb 7 in the room (here N = 7 days)

  5. The bot then deletes (redacts, in Matrix terms) all messages older than the N days given by the user, and keeps doing this until it's stopped, either by kicking the bot out or by giving the !marb stop command.
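The two commands above could be parsed roughly as follows. This is a minimal sketch, not the bot's actual code: the function name parse_marb_command and its return shape are hypothetical, and it assumes the command is the whole message body.

```python
from typing import Optional, Tuple


def parse_marb_command(body: str) -> Optional[Tuple[str, Optional[int]]]:
    """Parse a message body (hypothetical helper, not from the bot itself).

    Returns ("start", days) for "!marb N", ("stop", None) for "!marb stop",
    and None for anything that is not a valid !marb command.
    """
    parts = body.strip().split()
    if len(parts) != 2 or parts[0] != "!marb":
        return None
    arg = parts[1].lower()
    if arg == "stop":
        return ("stop", None)
    # Accept only plain ASCII digits and a positive number of days.
    if arg.isascii() and arg.isdigit() and int(arg) > 0:
        return ("start", int(arg))
    return None
```

With this sketch, "!marb 7" maps to ("start", 7), and ordinary chat messages are ignored.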

Matrix API

The Matrix client-server API features used by the integration:

  1. login

  2. sync

  3. join

  4. context

  5. redact

  6. read_markers

  7. leave

  8. forget
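As an illustration of the redact call in the list above, the sketch below builds the request URL for the client-server redact endpoint, which is sent as an HTTP PUT with a JSON body that may carry an optional "reason". The helper name redact_url is hypothetical, and the "v3" path prefix is an assumption: it depends on the spec version the homeserver supports (older releases used "r0").

```python
from urllib.parse import quote


def redact_url(base_url: str, room_id: str, event_id: str, txn_id: str,
               api_version: str = "v3") -> str:
    """Build the URL for PUT .../rooms/{roomId}/redact/{eventId}/{txnId}.

    Hypothetical helper; api_version is an assumption ("r0" on older servers).
    """
    # Room and event IDs contain characters such as "!", "$" and ":",
    # so every path component is percent-encoded.
    return (
        f"{base_url.rstrip('/')}/_matrix/client/{api_version}"
        f"/rooms/{quote(room_id, safe='')}"
        f"/redact/{quote(event_id, safe='')}"
        f"/{quote(txn_id, safe='')}"
    )
```

The transaction ID only needs to be unique per access token, so that a retried request is not applied twice.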

The bot does not use a push gateway, because the task isn't latency sensitive.

Optimizations

The oldest event timestamp is kept in a local database. If that event hasn't expired yet, it's pointless to check whether there's anything to redact.

The newest event is checked (from sync). If there's nothing new in the room, it's also pointless to fetch the detailed room information.
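These two shortcuts can be sketched as a pair of cheap checks. The function and parameter names are hypothetical, and the sketch assumes timestamps are origin_server_ts values in milliseconds, as Matrix reports them.

```python
from typing import Optional

DAY_MS = 24 * 60 * 60 * 1000  # origin_server_ts is in milliseconds


def oldest_event_expired(oldest_ts_ms: int, retention_days: int, now_ms: int) -> bool:
    """True if the oldest stored event is older than the retention period."""
    return now_ms - oldest_ts_ms > retention_days * DAY_MS


def room_has_new_events(last_seen_event_id: Optional[str], newest_event_id: str) -> bool:
    """True if sync reported an event we have not already processed."""
    return newest_event_id != last_seen_event_id
```

If the first check is false there is nothing to redact yet, and if the second is false there is no reason to fetch more detail for the room.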

If a room has been inactive (idle, stale) for more than 30 days, the bot automatically leaves the room. It also leaves immediately if it is left alone in the room.
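The leave rule could look something like this. It is only a sketch: should_leave_room and its parameters are hypothetical names, and it assumes the bot knows the room's joined member count and the timestamp of the last activity in milliseconds.

```python
DAY_IN_MS = 24 * 60 * 60 * 1000


def should_leave_room(last_activity_ms: int, joined_member_count: int,
                      now_ms: int, idle_days: int = 30) -> bool:
    """Decide whether the bot should leave (and forget) a room."""
    # Alone in the room (only the bot itself is joined): leave right away.
    if joined_member_count <= 1:
        return True
    # Otherwise leave only after the room has been idle for too long.
    return now_ms - last_activity_ms > idle_days * DAY_IN_MS
```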

Other remarks

Encrypted rooms are supported; the bot doesn't require encryption keys

The bot only stores / processes event_ids and room_ids + origin_server_ts (timestamps)

Database contains only room id, event id and oldest timestamp

No logging other than exception logging; even then, the only logged information is room_id and event_id

Currently deletion is rate limited to 400 messages / room / hour

The bot talks to the matrix-client.matrix.org server. The service is behind WireGuard, and the server doesn't reply to any scans / requests other than valid WireGuard packets.

Server runs in Oracle Cloud @ Frankfurt / RedHat.
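The 400 messages / room / hour limit mentioned above could be enforced in several ways. The sketch below uses a sliding window per room, which is an assumption, not necessarily how the bot does it. The class name RoomRateLimiter is hypothetical.

```python
from collections import defaultdict, deque


class RoomRateLimiter:
    """Sliding-window limiter: at most `limit` redactions per room per window."""

    def __init__(self, limit: int = 400, window_s: float = 3600.0) -> None:
        self.limit = limit
        self.window_s = window_s
        self._sent = defaultdict(deque)  # room_id -> timestamps of redactions

    def allow(self, room_id: str, now: float) -> bool:
        """Return True and record the redaction if the room is under its limit."""
        sent = self._sent[room_id]
        # Drop timestamps that have fallen out of the window.
        while sent and now - sent[0] >= self.window_s:
            sent.popleft()
        if len(sent) >= self.limit:
            return False
        sent.append(now)
        return True
```

In use, the bot would call allow(room_id, time.time()) before each redact request and postpone the rest of the room's backlog to a later pass when it returns False.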

Other reasons

The Element UI doesn't make it easy to mass delete messages, requiring multiple "accurate" clicks to delete messages. It's much easier to automate it.

Tech reference

Matrix Specification

Keywords:

Matrix, chat, Element, data retention, lifetime, security, expiry, privacy, enhancing tools, bots, manual, documentation, ttl_bot, time to live, Matrix Auto Redact Bot (MARB)

2021-03-07