What Comes Next: Multi-Tenancy, Compliance Exports, and Closing the Loop to EventHorizon
A compliance system that can’t be audited is just an expensive alerting system.
That’s been the north star for this project: not just detecting suspicious activity, but producing artifacts that a compliance officer could actually use — narratives grounded in policy documents, a complete audit trail in Postgres, cross-references back to the originating events. The pipeline exists. The artifacts are there. There’s still work to do to make them useful to people other than me sitting in a terminal.
This is where the project stands, what I think comes next, and what I’ve actually learned.
What’s Half-Built
Multi-Tenancy
The placeholder for this is already in routes/web.php — a comment noting where tenant-scoped middleware should wrap the auth group. The shape is clear: every stream key gets a tenant prefix ({tenant}:sentinel:transactions, {tenant}:synapse:axioms), every dashboard query gets a tenant filter on the Postgres side, and the middleware resolves the tenant from the authenticated user’s organization.
flowchart TD
Req[Request → tenant.sentinel.app] --> MW[TenantResolver middleware]
MW --> Resolve[Look up tenant by subdomain<br/>or authenticated user.org]
Resolve --> Ctx[Bind tenant_id into request context]
Ctx --> Route[Route handler]
Route -->|scoped key| Stream["XADD {tenant}:sentinel:transactions"]
Route -->|scoped query| Query["WHERE tenant_id = ?<br/>on compliance_events"]
This is a clean architectural extension to what already exists. The consumer commands take a configurable stream key. The models have a tenant_id column. The missing piece is the resolution layer: how does a request know which tenant it belongs to, and how does that propagate to the stream consumer? Subdomain routing is the most natural answer in a Laravel context — tenant.sentinel.app resolves to the tenant record, which provides the stream key prefix.
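As a rough illustration of that resolution layer, here is a minimal sketch in Python rather than the project's Laravel code. The tenant registry, the `TenantContext` type, and the `resolve_tenant` function are all hypothetical names; only the `tenant.sentinel.app` host shape and the `{tenant}:sentinel:transactions` key shape come from the text above.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical tenant registry; a real app would look this up in the database.
TENANTS = {"acme": 1, "globex": 2}

BASE_DOMAIN = "sentinel.app"  # from the example host tenant.sentinel.app


@dataclass(frozen=True)
class TenantContext:
    slug: str
    tenant_id: int

    def stream_key(self, suffix: str) -> str:
        # Produces keys shaped like {tenant}:sentinel:transactions
        return f"{self.slug}:{suffix}"


def resolve_tenant(host: str) -> Optional[TenantContext]:
    """Resolve a tenant from a host such as 'acme.sentinel.app'."""
    host = host.split(":", 1)[0].lower()  # drop any port
    suffix = "." + BASE_DOMAIN
    if not host.endswith(suffix):
        return None
    slug = host[: -len(suffix)]
    # Reject empty or nested subdomains rather than guessing.
    if not slug or "." in slug:
        return None
    tenant_id = TENANTS.get(slug)
    if tenant_id is None:
        return None
    return TenantContext(slug=slug, tenant_id=tenant_id)
```

The same context object could be handed to a stream consumer at startup, so the consumer and the web request derive their keys from one place.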
I haven’t needed this yet. When there’s a second user, I’ll need it immediately.
Compliance Report Export
The audit trail in compliance_events is exactly what you’d want to export: timestamp, source ID, risk level, narrative, policy references, driver used, whether it was AI-routed. A CSV export endpoint filtered by date range and optional risk level would turn the database into a report.
flowchart LR
UI[Dashboard form<br/>date range + risk level] --> EP["GET /export/compliance.csv"]
EP --> Q["SELECT * FROM compliance_events<br/>WHERE timestamp BETWEEN ?<br/>AND risk_level IN (?)"]
Q --> Stream[Stream rows]
Stream --> CSV["CSV / PDF response<br/>+ Content-Disposition"]
This is straightforward to build and genuinely useful. A compliance officer should be able to pull a quarterly report without needing database access. The data is there; the endpoint isn’t.
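A sketch of what the export could look like, written in Python for brevity rather than as the project's Laravel endpoint. The column names mirror the fields listed above but the exact schema is an assumption, and the filtering is done in memory here only to keep the example self-contained; in practice it would live in the SQL `WHERE` clause.

```python
import csv
import io
from datetime import datetime
from typing import Iterable, Iterator, Optional, Set

# Assumed column names, based on the fields described in the text.
COLUMNS = ["timestamp", "source_id", "risk_level", "narrative",
           "policy_references", "driver", "ai_routed"]


def _drain(buf: io.StringIO) -> str:
    """Return whatever the CSV writer has buffered and reset the buffer."""
    text = buf.getvalue()
    buf.seek(0)
    buf.truncate(0)
    return text


def export_csv(rows: Iterable[dict], start: datetime, end: datetime,
               risk_levels: Optional[Set[str]] = None) -> Iterator[str]:
    """Yield CSV text one row at a time so a large export is never held in memory."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=COLUMNS, extrasaction="ignore")
    writer.writeheader()
    yield _drain(buf)
    for row in rows:
        ts = row["timestamp"]
        if not (start <= ts <= end):
            continue
        if risk_levels is not None and row["risk_level"] not in risk_levels:
            continue
        writer.writerow({**row, "timestamp": ts.isoformat()})
        yield _drain(buf)
```

Yielding chunks maps naturally onto a streamed HTTP response with a `Content-Disposition: attachment` header, which is what lets a quarterly export run without loading the whole table.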
EventHorizon Deep-Link
Every compliance_events row has a source_id column that references the originating event in EventHorizon, the upstream event store that Synapse-L4 taps into. Right now, that column is populated but inert: it’s a string that doesn’t link anywhere.
flowchart LR
Row["compliance_events row<br/>source_id = eh:42a1b..."] --> Dash[Compliance dashboard]
Dash -->|click source_id| URL["URL template<br/>https://eventhorizon/events/{source_id}"]
URL --> EH[EventHorizon detail view<br/>raw telemetry]
The deep-link would be a URL template that takes source_id and constructs a link to the EventHorizon event detail view. From the compliance dashboard, you’d click a row and land directly on the raw event that generated the anomaly. That’s the audit chain: from compliance narrative back to raw telemetry, traceable at every step.
The implementation is simple — a config value for the EventHorizon base URL, a computed property on the model, a link in the dashboard row. The design question is whether EventHorizon is a separate service or a separate domain in the same application. That’s still open.
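The computed property amounts to a few lines. The sketch below is in Python, not the project's model code; the function name is hypothetical, the base URL is the placeholder from the diagram, and percent-encoding the ID is an assumption about what the EventHorizon route would accept.

```python
from urllib.parse import quote

# Placeholder base URL from the diagram; in practice this would be a config value.
EVENTHORIZON_BASE_URL = "https://eventhorizon"


def event_url(source_id: str, base_url: str = EVENTHORIZON_BASE_URL) -> str:
    """Build a link to the EventHorizon detail view for one source_id."""
    if not source_id:
        raise ValueError("source_id is required")
    # Percent-encode the ID so characters like ':' or '/' cannot alter the path.
    return f"{base_url.rstrip('/')}/events/{quote(source_id, safe='')}"
```

Keeping the base URL in config means the open question (separate service or same application) only changes one value, not the dashboard template.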
Decisions Still Open
The 0.90 similarity threshold. ADR-0015 originally proposed 0.95, but empirical testing showed that 0.90 strikes a better balance: with bucketized fingerprints, common transaction patterns (coffee shop, grocery, gas station) cluster tightly enough that false positives are rare, while the threshold stays conservative enough for edge cases near reporting thresholds. I don’t expect to need per-category thresholds, since the bucketing itself provides the semantic specificity.
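To make the effect of the threshold concrete, here is a minimal in-memory sketch of a cosine-similarity cache lookup in Python. The list-of-tuples cache and the function names are assumptions for illustration; the post does not say where the real vectors are stored.

```python
import math
from typing import List, Optional, Sequence, Tuple

SIMILARITY_THRESHOLD = 0.90  # the value discussed above


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def cache_lookup(query: Sequence[float],
                 cache: List[Tuple[Sequence[float], str]],
                 threshold: float = SIMILARITY_THRESHOLD) -> Optional[str]:
    """Return the cached value of the closest entry, or None on a miss."""
    best_score, best_value = -1.0, None
    for vector, value in cache:
        score = cosine_similarity(query, vector)
        if score > best_score:
            best_score, best_value = score, value
    # Only the single closest entry is considered, and only if it clears the bar.
    return best_value if best_score >= threshold else None
```

A query whose best match scores around 0.93 is a hit at 0.90 and a miss at 0.95, which is exactly the band where the choice between the two thresholds matters.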
Amount representation in fingerprints. ADR-0002 is still open. The current buckets (micro/small/medium/large/very_large) work, but the boundaries are fixed. A $9.99 transaction and a $10.01 transaction land in different buckets (micro vs small) despite being nearly identical in risk profile. Softer bucket boundaries — or a continuous representation — might improve hit rate near the edges. I don’t have the data to know whether this matters in practice yet.
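One possible shape for softer boundaries is a linear ramp around each cut-off, sketched below in Python. Only the $10 boundary between micro and small is implied by the example above; the other boundaries, the 10% band, and the function names are hypothetical, and this is one option among several rather than a decision.

```python
import bisect
from typing import Dict

NAMES = ["micro", "small", "medium", "large", "very_large"]
# Only 10.0 is implied by the text; the remaining cut-offs are hypothetical.
BOUNDARIES = [10.0, 100.0, 1000.0, 10000.0]


def hard_bucket(amount: float) -> str:
    """Current behaviour: each amount lands in exactly one bucket."""
    return NAMES[bisect.bisect_right(BOUNDARIES, amount)]


def soft_buckets(amount: float, band: float = 0.10) -> Dict[str, float]:
    """Split weight between two adjacent buckets when near a boundary."""
    for i, boundary in enumerate(BOUNDARIES):
        lo, hi = boundary * (1 - band), boundary * (1 + band)
        if lo <= amount <= hi:
            upper = (amount - lo) / (hi - lo)  # 0 at lo, 1 at hi
            return {NAMES[i]: 1 - upper, NAMES[i + 1]: upper}
    # Away from every boundary, fall back to the hard bucket.
    return {hard_bucket(amount): 1.0}
```

Under this scheme $9.99 and $10.01 still fall in different hard buckets, but their soft weights differ by about one percentage point, so a fingerprint built from the weights would treat them as near neighbours.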
Prompt versioning discipline. All LLM prompt templates live in prompts/ as versioned Markdown files. This has been more valuable than I expected: when narratives started looking off after a driver switch, I could diff the prompt files and immediately see whether anything had changed. The versioning overhead is low (increment a number, add a changelog line) and the diagnostic value is high. I’d call this a clear success and would do it from day one on any future LLM project.
What I’ve Actually Learned
Semantic caching is a fingerprint design problem, not a similarity math problem. The vector operations are easy. Deciding which features of an event carry meaningful signal for similarity is hard and requires domain knowledge. Time-of-day buckets work because compliance analysis cares about when in the day, not the exact minute. Amount tiers work because compliance thresholds are categorical, not continuous. Get the fingerprint wrong and the math can’t save you.
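A small Python sketch of what that design work looks like in practice. The four time-of-day buckets, the output format, and the function names are assumptions for illustration; the point is only that the exact minute is discarded before anything is embedded.

```python
from datetime import datetime


def time_of_day_bucket(ts: datetime) -> str:
    # Hypothetical four-way split; the real bucket edges are not given in the post.
    hour = ts.hour
    if 6 <= hour < 12:
        return "morning"
    if 12 <= hour < 18:
        return "afternoon"
    if 18 <= hour < 23:
        return "evening"
    return "night"


def fingerprint(category: str, amount_tier: str, ts: datetime) -> str:
    """Reduce an event to the categorical features that carry compliance signal."""
    normalized = category.strip().lower()
    return f"category={normalized} amount={amount_tier} time={time_of_day_bucket(ts)}"
```

Two coffee purchases at 08:03 and 08:47 produce the same fingerprint, so they share a cache entry before any vector math happens; one at 23:30 does not.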
Silent failures in multi-stage pipelines are the hardest class of bug. Every stage that can return empty needs to log whether it returned empty. Zero RAG results is not an error; it’s a signal. Zero cache hits is not an error; it’s a signal. If you only log exceptions, you’ll never see the cases where a stage succeeds at returning nothing and all downstream stages gracefully handle the void.
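One way to make the empty case visible is to route every stage through a thin wrapper that always logs a result count. This is a Python sketch with hypothetical names (`run_stage`, the `pipeline` logger), not the project's actual logging code.

```python
import logging
from typing import Callable, Sequence, TypeVar

logger = logging.getLogger("pipeline")
T = TypeVar("T")


def run_stage(name: str, fn: Callable[[], Sequence[T]]) -> Sequence[T]:
    """Run one stage and log its result count, including the empty case."""
    results = fn()
    if len(results) == 0:
        # Not an exception, but worth a louder level than routine traffic.
        logger.warning("stage=%s returned empty", name)
    else:
        logger.info("stage=%s returned %d result(s)", name, len(results))
    return results
```

With this in place, a run where retrieval returns nothing leaves a `stage=rag returned empty` line in the log instead of passing silently to the next stage.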
At-least-once delivery with Redis Streams is the right trade-off for this domain. You’d rather process an event twice (reclaimer picks up a zombie) than never process it. Idempotency on the write side (upsert, not insert) means duplicates are harmless. The reclaimer has fired a handful of times in real operation — always because a worker hit a quota error and exited mid-batch. The messages were recovered within 60 seconds.
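The write-side half of that trade-off can be sketched with an upsert keyed on the event's source ID. The example uses Python with SQLite standing in for Postgres (both accept this `ON CONFLICT ... DO UPDATE` form); the reduced table shape and the choice of `source_id` as the unique key are assumptions.

```python
import sqlite3

# SQLite stands in for Postgres so the example is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE compliance_events (
        source_id  TEXT PRIMARY KEY,  -- assumed unique key for deduplication
        risk_level TEXT NOT NULL,
        narrative  TEXT NOT NULL
    )
    """
)


def record_event(source_id: str, risk_level: str, narrative: str) -> None:
    """Insert the event, or overwrite the existing row if it was already processed."""
    conn.execute(
        """
        INSERT INTO compliance_events (source_id, risk_level, narrative)
        VALUES (?, ?, ?)
        ON CONFLICT(source_id) DO UPDATE SET
            risk_level = excluded.risk_level,
            narrative  = excluded.narrative
        """,
        (source_id, risk_level, narrative),
    )
    conn.commit()
```

If the reclaimer redelivers a message that a crashed worker had already written, the second call touches the same row rather than creating a duplicate.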
The tier-3 fallback earns its existence. I expected the rule-based ThreatAnalysisService to be a rarely-used last resort. It fires on every Gemini rate-limit event, every network timeout, every embedding API blip. In a system that can’t drop events, the fallback isn’t a hedge — it’s load-bearing infrastructure. Design it to be correct, not just present.
Driver abstraction isn’t over-engineering when the domain requires it. The ComplianceDriver interface has one method. It took thirty minutes to write. It saved four hours when quota exhaustion required a same-day backend swap in production. The abstraction boundary existed for exactly the reason you’d hope.
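The two points above, a one-method driver interface and a fallback that carries real load, fit together roughly as follows. This is a Python sketch, not the project's PHP: the method name `analyze`, the `DriverUnavailable` exception, the `RuleBasedFallback` class, and its single amount rule are all hypothetical stand-ins.

```python
from typing import Protocol, Sequence


class DriverUnavailable(Exception):
    """Raised for rate limits, timeouts, and similar transient failures."""


class ComplianceDriver(Protocol):
    # The post says the interface has one method; its real name is not given.
    def analyze(self, event: dict) -> dict: ...


class RuleBasedFallback:
    """Stand-in for the rule-based tier; the threshold rule is illustrative only."""

    def analyze(self, event: dict) -> dict:
        risk = "high" if event.get("amount", 0) >= 10_000 else "low"
        return {"driver": "rule-based", "risk_level": risk}


def analyze_with_fallback(event: dict,
                          drivers: Sequence[ComplianceDriver],
                          fallback: ComplianceDriver) -> dict:
    """Try each driver in order; the fallback guarantees the event is never dropped."""
    for driver in drivers:
        try:
            return driver.analyze(event)
        except DriverUnavailable:
            continue  # move on to the next tier
    return fallback.analyze(event)
```

Because callers only see the one-method interface, swapping the primary backend is a change to the `drivers` list rather than to the pipeline.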
Where This Goes
The pipeline is solid. The audit trail is complete. The missing pieces are UX and integration: exports for compliance officers, deep-links to source events, multi-tenancy for real deployments.
The interesting future work is on the intelligence side. The current system treats each event independently. A more capable version would correlate events over time — identifying patterns across the audit trail, surfacing recurring sources, generating periodic summary reports rather than only per-event narratives. That’s a different problem: not stream processing but longitudinal analysis. It would require a different kind of query and a different kind of prompt.
I’ll get there. For now, the dashboard is refreshing, the stream is flowing, and the narratives are grounded in actual policy text.
That’s enough.