Safety Constraints
Hard-coded safety rules that override the DQN agent's decisions in production to protect fish welfare and equipment.
The DQN agent's decisions pass through a safety constraint layer before reaching the feeder. This layer is implemented in CageFeedingAgent.apply_safety_constraints() and serves as a deterministic guardian that always overrides the model when critical thresholds are breached.
Why a Separate Safety Layer?
Reinforcement learning agents can occasionally produce unsafe actions, especially:
- In out-of-distribution states the model has never seen during training
- During the early deployment phase when the model hasn't been fine-tuned on real data
- When sensor readings are noisy or delayed
The safety layer checks every feeding decision against fixed thresholds, so an unsafe recommendation is blocked or reduced before it reaches the feeder, regardless of what the model recommends.
Critical Safety Blocks
These conditions result in complete feed suppression (amount set to 0 kg):
| Condition | Threshold | Reason |
|---|---|---|
| Critical dissolved oxygen | DO < 4.5 mg/L | Oxygen debt is immediately life-threatening. Feeding increases O₂ demand. |
| Critical O₂ saturation | Sat < 65% | Same as above, expressed as a percentage. |
| Extreme heat stress | Temp > 31 °C | Fish metabolic rates spike; feeding adds thermal load and O₂ consumption. |
| Too cold | Temp < 23 °C | Fish are metabolically inactive; feed will not be consumed. |
| Maximum daily feeds | feeds_today ≥ 6 | Prevents over-stimulation and digestive issues. |
| Too frequent | time_since_last_feed < 1.5 h | Insufficient digestion time between feeds. |
| Extreme weather | wind_speed > 15 m/s | Pellets will drift outside the cage; safety risk for equipment. |
When a block triggers, the decision object returns is_safe = False and the override is logged.
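The blocks in the table above could be expressed as an ordered list of checks. The sketch below is illustrative only: `CageState`, its field names, and `critical_block_reason` are hypothetical and are not taken from the actual `CageFeedingAgent.apply_safety_constraints()` implementation. Only the thresholds come from the table.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CageState:
    """Hypothetical snapshot of sensor and schedule readings."""
    dissolved_oxygen: float      # mg/L
    oxygen_saturation: float     # percent
    temperature: float           # degrees Celsius
    feeds_today: int
    time_since_last_feed: float  # hours
    wind_speed: float            # m/s


def critical_block_reason(s: CageState) -> Optional[str]:
    """Return the first critical block that applies, or None if none does."""
    checks = [
        (s.dissolved_oxygen < 4.5, "critical dissolved oxygen"),
        (s.oxygen_saturation < 65.0, "critical O2 saturation"),
        (s.temperature > 31.0, "extreme heat stress"),
        (s.temperature < 23.0, "too cold"),
        (s.feeds_today >= 6, "maximum daily feeds reached"),
        (s.time_since_last_feed < 1.5, "too frequent"),
        (s.wind_speed > 15.0, "extreme weather"),
    ]
    for triggered, reason in checks:
        if triggered:
            return reason
    return None
```

A caller would set the amount to 0 kg and `is_safe = False` whenever the function returns a reason, and could write that reason to the override log.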
Amount Reduction Rules
If no critical block applies but conditions are sub-optimal, the dispensed amount is capped rather than zeroed:
| Condition | Max Allowed | Cap (% of max_feed_kg) |
|---|---|---|
| Low dissolved oxygen (< 5.5 mg/L) or O₂ sat < 75% | 1.5 kg | 30% of 5.0 kg |
| Declining O₂ trend (3h trend < −0.5 mg/L) | 2.0 kg | 40% |
| High temperature (> 29.5 °C) | 2.5 kg | 50% |
| Rapid temperature change (absolute change > 1.5 °C/h) | 3.0 kg | 60% |
| High waste rate (> 30%) | 2.5 kg | 50% |
Multiple reduction rules can apply at the same time; the most restrictive cap wins.
After all reductions, a floor of 0.3 kg is enforced if the agent intended to feed at all (avoids dispensing negligible amounts that waste motor cycles).
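The reduction rules and the floor can be sketched as two small functions. This is a minimal illustration, not the production code: the function names, parameter names, and the choice to express the waste rate as a fraction are assumptions. It also assumes that the floor applies to any positive request, and that `is_safe` is `False` whenever a reduction rule is triggered.

```python
from typing import Optional, Tuple

MIN_FEED_KG = 0.3  # floor described in the text above


def reduction_cap(do_mg_l: float, o2_sat_pct: float, o2_trend_3h: float,
                  temp_c: float, temp_change_per_h: float,
                  waste_rate: float) -> Optional[float]:
    """Return the most restrictive cap in kg, or None if no rule applies."""
    caps = []
    if do_mg_l < 5.5 or o2_sat_pct < 75.0:
        caps.append(1.5)  # low oxygen: 30% of 5.0 kg
    if o2_trend_3h < -0.5:
        caps.append(2.0)  # declining oxygen trend: 40%
    if temp_c > 29.5:
        caps.append(2.5)  # high temperature: 50%
    if abs(temp_change_per_h) > 1.5:
        caps.append(3.0)  # rapid temperature change: 60%
    if waste_rate > 0.30:
        caps.append(2.5)  # high waste rate: 50%
    return min(caps) if caps else None


def apply_reductions(requested_kg: float,
                     cap_kg: Optional[float]) -> Tuple[float, bool]:
    """Return (amount_kg, is_safe) after applying the cap and the floor."""
    is_safe = cap_kg is None
    if requested_kg <= 0:
        return 0.0, is_safe  # the agent did not intend to feed
    amount = requested_kg if cap_kg is None else min(requested_kg, cap_kg)
    amount = max(amount, MIN_FEED_KG)  # avoid negligible dispenses
    return amount, is_safe
```

For example, with low dissolved oxygen and high temperature both present, the caps are 1.5 kg and 2.5 kg, so a 4.0 kg request is reduced to 1.5 kg.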
Decision Flow
Agent recommends X kg
         │
         ▼
┌─────────────────┐  YES
│ Critical block? │────────► amount = 0 kg, is_safe = False
└────────┬────────┘
         │ NO
         ▼
┌─────────────────┐    YES    ┌──────────────────────┐
│ Sub-optimal?    │──────────►│ Cap amount to safe   │
│                 │           │ max, is_safe = False │
└────────┬────────┘           └──────────┬───────────┘
         │ NO                            │
         ▼                               ▼
   amount = X kg                amount = min(X, cap)
   is_safe = True               floor at 0.3 kg

Confidence Score
The agent also computes a confidence score for each decision:
confidence = min(1.0, abs(original_amount) / max_feed_kg)
if safety_override:
    confidence -= 0.3
confidence = max(0.0, confidence)

A low confidence score (especially when combined with is_safe = False) signals to farm operators that the model is uncertain and human review may be warranted.
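To make the arithmetic concrete, the formula above can be wrapped in a small function. The function name and the default `max_feed_kg` of 5.0 (taken from the reduction table) are assumptions for illustration.

```python
def confidence_score(original_amount: float,
                     max_feed_kg: float = 5.0,
                     safety_override: bool = False) -> float:
    """Confidence in [0, 1]: scaled request size, minus 0.3 on override."""
    confidence = min(1.0, abs(original_amount) / max_feed_kg)
    if safety_override:
        confidence -= 0.3  # penalty when the safety layer intervened
    return max(0.0, confidence)
```

A 4.0 kg recommendation scores 0.8 on its own and roughly 0.5 if the safety layer overrode it. A 1.0 kg recommendation that was overridden scores 0.2 − 0.3, which is clamped to 0.0.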
Monitoring Safety Overrides
Every feeding decision is logged to the in-memory recent_actions deque with:
- safety_override: boolean indicating whether the safety layer changed the model's recommendation
- original_amount: what the model wanted to dispense
- feed_amount: what was actually dispensed
When adaptive learning is enabled, the override metadata is also stored in the ExperienceDatabase so that retraining can incorporate the safety layer's corrections.
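One way to picture the log is a bounded `collections.deque` of small records. The sketch below is hypothetical: the `ActionRecord` class, the `maxlen` of 100, and the helpers `log_action` and `override_rate` are assumptions, and the real `recent_actions` entries may carry additional fields.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, Iterable


@dataclass
class ActionRecord:
    original_amount: float  # kg the model wanted to dispense
    feed_amount: float      # kg actually dispensed
    safety_override: bool   # True if the safety layer changed the amount


# Bounded in-memory log; the size limit here is an assumed value.
recent_actions: Deque[ActionRecord] = deque(maxlen=100)


def log_action(original_amount: float, feed_amount: float) -> ActionRecord:
    """Append a record and flag it when the two amounts differ."""
    record = ActionRecord(
        original_amount=original_amount,
        feed_amount=feed_amount,
        safety_override=(feed_amount != original_amount),
    )
    recent_actions.append(record)
    return record


def override_rate(actions: Iterable[ActionRecord]) -> float:
    """Fraction of logged decisions that the safety layer changed."""
    actions = list(actions)
    if not actions:
        return 0.0
    return sum(a.safety_override for a in actions) / len(actions)
```

A rising override rate over recent decisions would be one signal that the model is operating outside the conditions it was trained on.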