Restore Armis northbound availability updates #3134

Closed
opened 2026-04-13 14:26:27 +00:00 by mfreeman451 · 1 comment
Owner

Summary

ServiceRadar currently supports inbound Armis discovery but no longer performs the northbound update flow that writes post-sweep availability back to Armis.

This needs to be restored using the current architecture, not the old NATS KV-heavy path.

Problem

Today we can:

  • discover devices from Armis
  • populate the database
  • perform ICMP/TCP availability checks through agents

But we do not currently:

  • schedule outbound Armis update runs in a first-class way
  • update Armis device state/tag/custom field by armis_device_id
  • expose northbound controls/status in the UI
  • emit clear per-run metrics/events for northbound updates

Required behavior

  • Use database-backed state as the source of truth for latest consolidated device availability
  • Schedule Armis northbound updates with AshOban/Oban
  • Make the cadence user-configurable from Settings -> Network -> Integrations
  • Support manual "run now"
  • Emit one outbound update per Armis device keyed by armis_device_id
  • Persist northbound run history, status, and failure details separately from inbound sync status
  • Emit metrics and success/failure events for every run
  • Surface the behavior in the Integrations and Jobs UI
  • Use bulk Armis API updates sized for large fleets; do not perform one-at-a-time writes for ~50k devices
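
As an illustration of the last requirement combined with the "one outbound update per Armis device" rule, the sketch below collapses availability rows to the newest one per armis_device_id and then splits them into fixed-size batches. It is a language-neutral sketch in Python, not the Elixir implementation. The `Availability` fields other than `armis_device_id`, the helper names, and the batch size of 500 are all assumptions; the issue does not state the Armis bulk endpoint or its size limits.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)
class Availability:
    armis_device_id: str  # key named in the issue
    is_available: bool    # hypothetical field: latest consolidated result
    checked_at: float     # hypothetical field: Unix timestamp of the check


def latest_per_device(rows: Iterable[Availability]) -> list[Availability]:
    """Keep only the newest row for each armis_device_id."""
    latest: dict[str, Availability] = {}
    for row in rows:
        current = latest.get(row.armis_device_id)
        if current is None or row.checked_at > current.checked_at:
            latest[row.armis_device_id] = row
    return list(latest.values())


def batches(items: list[Availability], size: int) -> Iterator[list[Availability]]:
    """Yield consecutive slices of at most `size` items."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield items[start:start + size]


if __name__ == "__main__":
    # Synthetic fleet of 50,000 devices, one row each.
    fleet = [Availability(str(i), True, 0.0) for i in range(50_000)]
    requests = list(batches(latest_per_device(fleet), 500))
    print(len(requests))  # 100 bulk requests instead of 50,000 single writes
```

With the assumed batch size of 500, a fleet of about 50k devices needs 100 bulk requests per run. The real batch size would have to follow whatever limit the Armis API imposes.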

Implementation notes

  • Do not revive the legacy NATS KV-driven northbound path
  • Prefer the current Elixir control-plane architecture for scheduling/execution
  • Existing related work/history:
    • #726 feat: 2-way integration support

Validation

  • OpenSpec change: add-armis-northbound-availability-updates
  • Demo namespace verification required
  • Resulting work should be deployed and tested in the Kubernetes demo namespace
Author
Owner

closing as completed
Reference
carverauto/serviceradar#3134