Deterministic vs Probabilistic Email Filtering

Every email filter uses AI to guess. We built one that doesn't guess at all. Here's the technical rationale behind Rythm's binary approach.

Every email filter you’ve used makes a guess. Gmail uses TensorFlow and Gemini Nano. Proofpoint uses behavioral AI. SaneBox uses machine learning trained on your habits. They all analyze content, context, and signals to predict whether an email is legitimate.

Rythm doesn’t predict anything. It asks one question: is this sender on your guest list? Yes means deliver. No means hold and request verification.

This is a deliberate architectural choice, not a limitation. It is the foundation of what we call economic email filtering. Here’s why we made it.

The Problem With Prediction

Probabilistic filters work by scoring. Every incoming email gets a spam probability between 0 and 1, say 0.87. If the score is above a threshold (typically 0.5 to 0.7), the email is filtered. If it’s below, it’s delivered.
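To make the scoring step concrete, here’s a minimal sketch of a threshold check. The function name and the 0.6 default are illustrative assumptions, not how any particular filter is implemented.

```python
def probabilistic_verdict(spam_probability: float, threshold: float = 0.6) -> str:
    """Return 'filtered' when the score is above the threshold, else 'delivered'.

    The 0.6 default is an assumption picked from the 0.5 to 0.7 range above.
    """
    if not 0.0 <= spam_probability <= 1.0:
        raise ValueError("spam_probability must be between 0.0 and 1.0")
    return "filtered" if spam_probability > threshold else "delivered"


print(probabilistic_verdict(0.87))                  # filtered
print(probabilistic_verdict(0.55))                  # delivered
print(probabilistic_verdict(0.55, threshold=0.5))   # filtered
```

The same email (score 0.55) gets a different outcome depending on where the threshold sits. Nothing about the email changed, only the cutoff.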

This means every email exists in a gray zone. The filter is never certain. It’s always making a tradeoff between false positives (real email caught as spam) and false negatives (spam that gets through).
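A toy example shows the tradeoff. The scores and labels below are invented for illustration, not measurements from any real filter.

```python
# Hypothetical (score, is_actually_spam) pairs, invented for illustration.
SCORED_MAIL = [
    (0.95, True),
    (0.80, True),
    (0.62, False),  # a real email that happens to look spammy
    (0.55, True),   # well-written phishing that scores low
    (0.30, False),
    (0.10, False),
]


def count_errors(threshold: float) -> tuple[int, int]:
    """Return (false_positives, false_negatives) at the given threshold."""
    false_positives = sum(
        1 for score, is_spam in SCORED_MAIL if score > threshold and not is_spam
    )
    false_negatives = sum(
        1 for score, is_spam in SCORED_MAIL if score <= threshold and is_spam
    )
    return false_positives, false_negatives


for threshold in (0.5, 0.7):
    fp, fn = count_errors(threshold)
    print(f"threshold={threshold}: false positives={fp}, false negatives={fn}")
```

In this made-up sample, a 0.5 threshold catches the real email at 0.62 (one false positive, no false negatives). Raising it to 0.7 frees that email but lets the 0.55 phishing message through (no false positives, one false negative). Moving the threshold only shifts which mistake you make.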

Traditionally, this tradeoff was manageable. Spam was crude. The signals were obvious: bulk sending, known malicious domains, poor grammar, suspicious attachments. A well-trained model could score with high accuracy.

AI changed the equation. We documented the specific attack types in “5 types of phishing emails that fool Gmail.” When a phishing email is written by the same AI models that the filters use for detection, the content-based signals converge. The phishing email looks exactly like a real email because it was generated by the same kind of system that’s trying to detect it.

The arms race isn’t one side getting ahead. It’s both sides using the same weapon, with diminishing returns on each escalation.

The Deterministic Alternative

A deterministic filter doesn’t score. It classifies. Binary. Known sender or unknown sender.

Known sender: Deliver. No scanning, no scoring, no delay.

Unknown sender: Hold. Request verification (in Rythm’s case, a small cover charge).

There’s no gray zone. No probability threshold. No tradeoff between false positives and false negatives. The classification is based on identity, not content.
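As a sketch, the whole decision fits in one set lookup. The function names and the lowercase normalization are assumptions for illustration, not Rythm’s actual code, and the sketch assumes the sender address has already been established as genuine.

```python
def normalize(address: str) -> str:
    """Lowercase and trim an address so lookups are consistent (an assumption)."""
    return address.strip().lower()


def classify(sender: str, guest_list: set[str]) -> str:
    """Return 'deliver' for a known sender, 'hold' for an unknown one.

    The message body is never passed in: the decision depends on identity only.
    """
    return "deliver" if normalize(sender) in guest_list else "hold"


guest_list = {"alice@example.com", "bob@example.com"}

print(classify("Alice@Example.com", guest_list))     # deliver
print(classify("stranger@example.net", guest_list))  # hold
```

There is no score and no threshold to tune. The same sender and the same guest list always produce the same answer.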

How the guest list works

Rythm builds the guest list automatically:

Starred/flagged messages - your VIPs
Contacts - imported from your address book
Sent folder - anyone you’ve emailed
Inbox activity - senders who’ve emailed you regularly

The guest list is dynamic. It updates as you communicate. When you rescue an email from the filtered folder, that sender is added permanently.
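The four sources above can be sketched as a set union. Everything here is hypothetical: the function names, the parameters, and especially the cutoff for what counts as emailing you “regularly” (three messages is an arbitrary assumption).

```python
from collections import Counter


def build_guest_list(starred, contacts, sent_to, inbox_senders, min_inbox_messages=3):
    """Combine the four sources into one set of lowercase addresses.

    min_inbox_messages is an assumed cutoff for a "regular" inbox sender.
    """
    inbox_counts = Counter(address.lower() for address in inbox_senders)
    regular_senders = {
        address for address, count in inbox_counts.items()
        if count >= min_inbox_messages
    }
    guest_list = {address.lower() for address in starred}
    guest_list |= {address.lower() for address in contacts}
    guest_list |= {address.lower() for address in sent_to}
    guest_list |= regular_senders
    return guest_list


def rescue(guest_list, sender):
    """Rescuing a held email adds its sender to the guest list."""
    guest_list.add(sender.lower())


guest_list = build_guest_list(
    starred=["vip@example.com"],
    contacts=["mom@example.com"],
    sent_to=["client@example.com"],
    inbox_senders=["news@example.org"] * 3 + ["once@example.org"],
)
print("news@example.org" in guest_list)  # True
print("once@example.org" in guest_list)  # False

rescue(guest_list, "once@example.org")
print("once@example.org" in guest_list)  # True
```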

Why verification, not blocking

An unknown sender isn’t necessarily unwanted. They might be a potential customer, a journalist, or someone with a legitimate reason to reach you. Blocking them entirely would be a disservice.

Rythm holds their email and requests a cover charge, about 4 cents by default. This is a verification mechanism: it proves a real person made a deliberate decision to send this message. Mass senders can’t afford it at scale. Individuals don’t notice it.
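The arithmetic is simple. This sketch uses the 4-cent default from above; the message volumes are hypothetical.

```python
COVER_CHARGE_CENTS = 4  # "about 4 cents by default"


def total_cost_dollars(message_count: int, charge_cents: int = COVER_CHARGE_CENTS) -> float:
    """Total cover charge, in dollars, for reaching message_count unknown inboxes."""
    return message_count * charge_cents / 100


print(total_cost_dollars(1))          # 0.04
print(total_cost_dollars(1_000_000))  # 40000.0
```

One person sending one message pays four cents. A campaign aimed at a million inboxes, if every one of them held the message, would owe $40,000 before a single email was read.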

The email isn’t deleted. It waits in a labeled folder. If you want to read it without the sender paying, you can. And rescuing it adds the sender to your guest list.

The Tradeoffs (Honest Assessment)

What deterministic filtering does better:

Zero false positives on known senders (if you’ve communicated, they get through)
Immune to content-based attacks (AI phishing, sophisticated impersonation)
No training period, no AI learning curve
Binary decisions are auditable and explainable

What probabilistic filtering does better:

Catches known spam patterns (bulk, obvious scams) without requiring user setup
Works on day one with zero configuration
Handles newsletters and marketing email sorting
Doesn’t require senders to take any action

The honest answer: They’re complementary. Gmail’s probabilistic filter catches 99.9% of obvious junk, though as we explored in why your Gmail spam filter isn’t enough, that remaining 0.1% is increasingly dangerous. Rythm’s deterministic filter catches the sophisticated messages that get through: the ones that look real but come from senders you don’t know.

We didn’t build Rythm to replace Gmail’s filter. We built it to cover the gap that probabilistic filtering can’t close.

The Philosophical Argument

Beyond the technical rationale, there’s a philosophical one.

Probabilistic filtering puts the decision in the algorithm’s hands. It decides what you see. You can influence it (mark as spam, move to inbox), but the core judgment is the system’s, not yours.

Deterministic filtering puts the decision in your hands. You define who’s on the guest list. You set the cover charge. You choose whether to rescue a filtered email. The system executes your rules. It doesn’t make rules for you.

We believe that’s how email should work. Your inbox, your rules. Not your inbox, our algorithm’s best guess.

For the technical details of how payments settle without Rythm touching funds, see our non-custodial architecture deep dive.
