Replacing manual workflows and static CSV processes with a rule-based system using live API data — and learning why configuration is a system, not a form.
The product needed to move beyond campaigns that ops teams stitched together by hand — one CSV, one call list, one trigger at a time. The goal was a system that could run itself.
Every campaign previously ran because a person made a series of judgment calls in sequence. Automating the system meant encoding those decisions into configuration — and making sure they composed correctly.
"Small configuration errors can scale into thousands of incorrect calls."
Unlike a single misclick in a manual workflow, a wrong rule in an automated system doesn't fail once — it fails at every execution interval until someone catches it. The design problem was as much about error prevention as it was about capability.
Moving from a static CSV to a live API feed fundamentally changed the nature of the problem. A CSV is a snapshot. An API is a stream. The system had to be designed around data that could change between any two executions.
Static configuration patterns — where you set parameters once and the system runs — don't hold up when the underlying dataset is fluid. Every configuration decision needed to be evaluated against a moving target.
The first design broke the campaign setup into a linear wizard — step 1 through step N. The logic felt clean: each decision had its own screen, and users moved forward once each was complete.
The step-based wizard produced configurations that looked correct in isolation but broke in practice. The most revealing example was a simple scheduling conflict nobody caught until it shipped.
The date range and day-of-week restrictions were configured in different steps — and nothing surfaced the conflict between them. Users finished the wizard confident their campaign was ready.
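This class of conflict is mechanical once both decisions are visible together: the executable schedule is just the intersection of the date range and the day-of-week restriction, and that intersection can be empty. A minimal sketch (the helper and the specific dates are illustrative, not the product's code):

```python
from datetime import date, timedelta

def executable_days(start: date, end: date, allowed_weekdays: set[int]) -> list[date]:
    """Return the dates in [start, end] that fall on an allowed weekday
    (Monday = 0 ... Sunday = 6)."""
    days = []
    current = start
    while current <= end:
        if current.weekday() in allowed_weekdays:
            days.append(current)
        current += timedelta(days=1)
    return days

# The conflict the wizard never caught: a date range whose days are all
# excluded by the weekday restriction. June 1-2, 2024 is a weekend, so a
# weekdays-only campaign over that range has zero executable days.
window = executable_days(date(2024, 6, 1), date(2024, 6, 2), {0, 1, 2, 3, 4})
```

A check like this only becomes possible when both settings reach the validator at the same time, which is exactly what the step-per-screen wizard prevented.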
"The problem wasn't the UI. It was broken decision continuity — each step assumed it was independent, but the decisions weren't."
The wizard model was the wrong mental model from the start. Breaking configuration into steps implied the decisions were sequential and independent. They weren't — every setting was downstream of another.
Scheduling, concurrency, and retry logic form a single operating contract — each parameter constrains the others.
System behavior isn't a property of any single setting. It emerges from combinations — and those combinations need to be visible together.
Related decisions need to occupy the same visual space, not separate steps. Conflict detection requires co-presence.
"You can't prevent bad configurations if the user can only see one decision at a time."
Linear flow, fragmented inputs, no cross-step visibility. Built fast, broke fast. Exposed the fundamental flaw in treating configuration as a sequence.
Moved to a single-page layout with grouped panels. Better than a wizard, but collapsed sections still hid interdependencies. Users could configure retry logic without seeing the termination conditions it interacted with.
All configuration on one surface with a live system summary panel on the right. First time conflicts could surface in real time. Introduced the idea that the summary was as important as the form — it was the system talking back.
Termination as a concept was too final and hard to reason about. Replaced with pause + resume logic — a lighter mental model that matched how operators actually thought about campaign control. Reduced misconfiguration of exit conditions significantly.
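The lighter mental model can be captured as a tiny state machine in which no transition is final. The states and transition table below are an illustrative sketch, not the shipped implementation:

```python
from enum import Enum, auto

class CampaignState(Enum):
    DRAFT = auto()
    RUNNING = auto()
    PAUSED = auto()

# Under pause/resume there is no terminal state: every exit path from
# RUNNING is reversible, so a wrong "exit condition" costs a pause, not
# a dead campaign.
TRANSITIONS = {
    CampaignState.DRAFT: {CampaignState.RUNNING},
    CampaignState.RUNNING: {CampaignState.PAUSED},
    CampaignState.PAUSED: {CampaignState.RUNNING},
}

def can_transition(src: CampaignState, dst: CampaignState) -> bool:
    return dst in TRANSITIONS.get(src, set())
```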
All interdependent settings visible simultaneously. Conflict warnings surfaced inline. Schedule preview showed real execution windows rather than abstract dates. Input complexity reduced by removing options that added configuration burden without offering meaningful control.
The final design consolidated all settings into a single, structured surface. Related decisions sit in proximity. The system previews its own behavior — execution windows, call distribution, conflict warnings — so operators can validate before launch.
Scope decisions were as consequential as design decisions. We deliberately left three areas out of v1 to maintain clarity and build on a stable foundation.
The system doesn't predict call success rates or suggest configuration adjustments based on historical outcomes. We chose to capture data before building prediction on top of it.
Configuration remains entirely manual. The system executes rules as set — it doesn't adjust timing, concurrency, or retry logic based on performance signals. Operator control was prioritized over automation.
Campaigns are independent units. A patient who qualifies for two active campaigns can be targeted by both — deduplication across campaigns was out of scope and would have required significantly deeper data architecture work.
Designing the system didn't eliminate fragility — it relocated it. The remaining risks are structural, not cosmetic.
A valid configuration can produce unexpected system behavior when multiple rule conditions interact under dynamic data conditions. No amount of upfront validation fully eliminates this.
High concurrency settings on a large API dataset can trigger thousands of calls before an operator notices. Cost controls exist but are not automatically enforced at the configuration stage.
Some configuration errors only manifest on specific execution cycles. A conflict between retry intervals and scheduling windows might only surface on day three of a campaign.
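This kind of delayed failure is easy to reproduce in a few lines. A hypothetical campaign with a 48-hour retry interval and a weekdays-only window behaves correctly for early-week cohorts and only breaks once a Thursday cohort exists:

```python
from datetime import datetime, timedelta

ALLOWED_WEEKDAYS = {0, 1, 2, 3, 4}   # Monday through Friday
RETRY_INTERVAL = timedelta(hours=48)

def retry_lands_in_window(first_attempt: datetime) -> bool:
    """Does the scheduled retry fall on an allowed day?"""
    return (first_attempt + RETRY_INTERVAL).weekday() in ALLOWED_WEEKDAYS

# First attempt Monday June 3, 2024: retry lands Wednesday -> True.
retry_lands_in_window(datetime(2024, 6, 3, 10, 0))
# First attempt Thursday June 6, 2024: retry lands Saturday -> False,
# but only once the campaign has run long enough to attempt a Thursday
# cohort does the conflict become visible.
retry_lands_in_window(datetime(2024, 6, 6, 10, 0))
```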
Rule-based targeting against a live patient feed means the system's effective patient pool changes continuously. Operators don't always have visibility into how eligibility shifts mid-campaign.
Campaigns run continuously without manual intervention per cycle.
Unified surface and reduced inputs cut time-to-launch for new campaigns.
System handles thousands of concurrent calls across multiple active campaigns without per-call manual oversight.
The biggest gap in the current system is that operators can't see what their configuration will actually do before it runs. A dry-run mode that simulates execution against a sample of the current patient pool would catch most scheduling and targeting conflicts that currently only surface post-launch.
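A first version of dry-run could be as simple as replaying the campaign's eligibility rule over a sample of the current pool for each scheduled day, without placing any calls. A sketch, with hypothetical names and signature:

```python
import random

def dry_run(patients, is_eligible, schedule_days, sample_size=100, seed=0):
    """Simulate targeting against a sample of the current pool.

    patients:      current snapshot of the live feed
    is_eligible:   the campaign's rule predicate, (patient, day) -> bool
    schedule_days: the configured execution days
    Returns a per-day count of sampled patients the rules would reach.
    """
    rng = random.Random(seed)
    sample = rng.sample(patients, min(sample_size, len(patients)))
    return {day: sum(1 for p in sample if is_eligible(p, day))
            for day in schedule_days}
```

A day that comes back with a zero count is exactly the date-range/weekday class of conflict surfacing before launch instead of after it.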
As execution data accumulates, the system should be able to surface patterns — optimal call windows per patient segment, effective retry intervals, concurrency settings that reduce drop-off. Not prescriptive automation, but informed recommendations operators can choose to apply.
The current design surfaces real-time patient counts at configuration time, but gives no indication of how that pool is likely to change. Even a basic projection — "estimated eligible patients in 7 days based on recent trends" — would help operators configure more confidently against a moving target.