Safety Constraints

Hard-coded safety rules that override the DQN agent's decisions in production to protect fish welfare and equipment.

The DQN agent's decisions pass through a safety constraint layer before reaching the feeder. This layer is implemented in CageFeedingAgent.apply_safety_constraints() and serves as a deterministic guardian that always overrides the model when critical thresholds are breached.

Why a Separate Safety Layer?

Reinforcement learning agents can occasionally produce unsafe actions, especially:

In out-of-distribution states the model has never seen during training
During the early deployment phase when the model hasn't been fine-tuned on real data
When sensor readings are noisy or delayed

The safety layer bounds every feeding decision by fixed limits that protect fish welfare and limit wasted feed, regardless of what the model recommends.

Critical Safety Blocks

These conditions result in complete feed suppression (amount set to 0 kg):

Condition	Threshold	Reason
Critical dissolved oxygen	DO < 4.5 mg/L	Oxygen debt is immediately life-threatening. Feeding increases O₂ demand.
Critical O₂ saturation	Sat < 65%	Same as above, expressed as a percentage.
Extreme heat stress	Temp > 31 °C	Fish metabolic rates spike; feeding adds thermal load and O₂ consumption.
Too cold	Temp < 23 °C	Fish are metabolically inactive; feed will not be consumed.
Maximum daily feeds	feeds_today ≥ 6	Prevents over-stimulation and digestive issues.
Too frequent	time_since_last_feed < 1.5 h	Insufficient digestion time between feeds.
Extreme weather	wind_speed > 15 m/s	Pellets will drift outside the cage; safety risk for equipment.

When a block triggers, the returned decision object has is_safe = False and the override is logged.
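The table above can be read as a list of independent threshold checks. The following is a minimal sketch of that reading, not the actual body of apply_safety_constraints(): the Conditions class, its field names, and the critical_block_reason helper are hypothetical and introduced only for illustration, while the thresholds are taken from the table.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Conditions:
    """Hypothetical snapshot of sensor and feeding state (field names are assumed)."""
    dissolved_oxygen: float        # mg/L
    o2_saturation: float           # %
    temperature: float             # °C
    feeds_today: int
    hours_since_last_feed: float   # h
    wind_speed: float              # m/s


def critical_block_reason(c: Conditions) -> Optional[str]:
    """Return the first critical-block reason that applies, or None if feeding may proceed."""
    checks = [
        (c.dissolved_oxygen < 4.5, "critical dissolved oxygen"),
        (c.o2_saturation < 65.0, "critical O2 saturation"),
        (c.temperature > 31.0, "extreme heat stress"),
        (c.temperature < 23.0, "too cold"),
        (c.feeds_today >= 6, "maximum daily feeds reached"),
        (c.hours_since_last_feed < 1.5, "too frequent"),
        (c.wind_speed > 15.0, "extreme weather"),
    ]
    for triggered, reason in checks:
        if triggered:
            return reason
    return None


calm = Conditions(6.2, 85.0, 27.0, 2, 3.0, 4.0)
hypoxic = Conditions(4.1, 85.0, 27.0, 2, 3.0, 4.0)
print(critical_block_reason(calm))     # None
print(critical_block_reason(hypoxic))  # critical dissolved oxygen
```

A caller would set the amount to 0 kg and is_safe to False whenever the helper returns a reason, and could log that reason alongside the override.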

Amount Reduction Rules

If no critical block applies but conditions are sub-optimal, the dispensed amount is capped rather than zeroed:

Condition	Max allowed	Cap (% of max_feed_kg = 5.0 kg)
Low dissolved oxygen (< 5.5 mg/L) or O₂ sat < 75%	1.5 kg	30%
Declining O₂ trend (3h trend < −0.5 mg/L)	2.0 kg	40%
High temperature (> 29.5 °C)	2.5 kg	50%
Rapid temperature change (|change| > 1.5 °C/h)	3.0 kg	60%
High waste rate (> 30%)	2.5 kg	50%

Several reduction rules can apply at the same time. They do not compound: the most restrictive cap wins.

After all reductions, a floor of 0.3 kg is enforced if the agent intended to feed at all (avoids dispensing negligible amounts that waste motor cycles).
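As an illustration of how the caps and the floor combine, here is a minimal sketch under stated assumptions. The function name capped_amount and its parameter names are hypothetical. The sketch assumes that the floor applies whenever the requested amount is positive and that is_safe is False whenever any reduction rule is active; the production code may differ on both points.

```python
MIN_FEED_KG = 0.3  # floor from the text above


def capped_amount(requested_kg, *, dissolved_oxygen, o2_saturation, o2_trend_3h,
                  temperature, temp_change_per_hour, waste_rate_pct):
    """Return (amount_kg, is_safe) after applying the reduction rules.

    Assumes no critical block has triggered. Cap values in kg come from the table.
    """
    caps = []
    if dissolved_oxygen < 5.5 or o2_saturation < 75.0:
        caps.append(1.5)   # low oxygen
    if o2_trend_3h < -0.5:
        caps.append(2.0)   # declining oxygen trend
    if temperature > 29.5:
        caps.append(2.5)   # high temperature
    if abs(temp_change_per_hour) > 1.5:
        caps.append(3.0)   # rapid temperature change
    if waste_rate_pct > 30.0:
        caps.append(2.5)   # high waste rate

    # The most restrictive active cap wins; with no active caps the request passes through.
    amount = min([requested_kg, *caps])

    # Enforce the floor only if the agent intended to feed at all.
    if requested_kg > 0:
        amount = max(amount, MIN_FEED_KG)

    is_safe = not caps
    return amount, is_safe


# Low oxygen (cap 1.5 kg) and high temperature (cap 2.5 kg) both apply: 1.5 kg wins.
print(capped_amount(4.0, dissolved_oxygen=5.2, o2_saturation=80.0, o2_trend_3h=0.0,
                    temperature=30.0, temp_change_per_hour=0.2, waste_rate_pct=10.0))
# (1.5, False)
```

Taking the minimum over the active caps, rather than multiplying percentages, is what keeps two mild conditions from compounding into a very small feed.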

Decision Flow

Agent recommends X kg
             │
             ▼
    ┌─────────────────┐    YES
    │ Critical block? │────────► amount = 0 kg, is_safe = False
    └────────┬────────┘
             │ NO
             ▼
    ┌─────────────────┐    YES    ┌───────────────────────┐
    │ Sub-optimal?    │──────────►│ Cap amount to safe    │
    │                 │           │ max, is_safe = False  │
    └────────┬────────┘           └───────────┬───────────┘
             │ NO                             │
             ▼                                ▼
    amount = X kg                  amount = min(X, cap)
    is_safe = True                 floor at 0.3 kg

Confidence Score

The agent also computes a confidence score for each decision:

confidence = min(1.0, abs(original_amount) / max_feed_kg)

if safety_override:
    confidence -= 0.3

confidence = max(0.0, confidence)

A low confidence score (especially when combined with is_safe = False) signals to farm operators that human review may be warranted. Because the score is derived from the size of the recommended amount, it is a heuristic rather than a calibrated probability.
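To make the arithmetic concrete, the formula can be wrapped in a small function and evaluated for a few cases. This is a sketch only: the function name confidence_score and the default max_feed_kg of 5.0 are assumptions made for the example.

```python
def confidence_score(original_amount, max_feed_kg=5.0, safety_override=False):
    """Heuristic confidence in [0, 1] following the formula above."""
    confidence = min(1.0, abs(original_amount) / max_feed_kg)
    if safety_override:
        confidence -= 0.3  # fixed penalty when the safety layer intervened
    return max(0.0, confidence)


# 4.0 kg recommended, no override: 4.0 / 5.0
print(round(confidence_score(4.0), 2))                        # 0.8
# Same recommendation, but the safety layer changed it: 0.8 - 0.3
print(round(confidence_score(4.0, safety_override=True), 2))  # 0.5
# Small recommendation with an override: 0.2 - 0.3 is clamped to zero
print(confidence_score(1.0, safety_override=True))            # 0.0
```

The last case shows why the final clamp matters: without it, a small recommendation that was overridden would produce a negative score.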

Monitoring Safety Overrides

Every feeding decision is logged to the in-memory recent_actions deque with:

safety_override: boolean indicating if the safety layer changed the model's recommendation
original_amount: what the model wanted to dispense
feed_amount: what was actually dispensed

When adaptive learning is enabled, the override metadata is also stored in the ExperienceDatabase so that retraining can incorporate the safety layer's corrections.
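For monitoring, the logged fields are enough to compute how often the safety layer intervenes. The sketch below is illustrative and makes several assumptions: the deque length of 100, the log_decision and override_rate helpers, and the rule that safety_override is true whenever the dispensed amount differs from the recommendation are not taken from the actual implementation.

```python
from collections import deque

# Bounded in-memory log; maxlen=100 is an assumed size.
recent_actions = deque(maxlen=100)


def log_decision(original_amount, feed_amount):
    """Append one decision record using the field names listed above."""
    entry = {
        "safety_override": feed_amount != original_amount,
        "original_amount": original_amount,
        "feed_amount": feed_amount,
    }
    recent_actions.append(entry)
    return entry


def override_rate():
    """Fraction of recent decisions that the safety layer changed."""
    if not recent_actions:
        return 0.0
    overridden = sum(1 for e in recent_actions if e["safety_override"])
    return overridden / len(recent_actions)


log_decision(4.0, 1.5)   # capped by a reduction rule
log_decision(2.0, 2.0)   # passed through unchanged
print(override_rate())   # 0.5
```

A persistently high override rate would suggest that the model is often recommending amounts outside the safe envelope and may need retraining.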