Avg reward (24h)
1.55
+0.32 vs prev
Consumption
91%
target ≥ 90%
FCR (7d)
1.41
−0.06 vs baseline
Daily feed
18.5 kg
−2.1 kg waste
Decisions (24h)
142
6 overrides
Cost saved (wk)
$1,840
projected
Selected action · idx 3
Medium · 2.0 kg
High feeding-frenzy score (0.81) · DO normal · Within daily cap
Confidence
87%
No safety override
a0 · Hold
0.0 kg
Q -0.62
a1 · Probe
0.5 kg
Q +0.04
a2 · Modest
1.0 kg
Q +0.41
a3 · Standard
2.0 kg
Q +0.78 · chosen
a4 · Hungry
3.5 kg
Q +0.55
a5 · Optimal
5.0 kg
Q +0.13
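The selected action (idx 3) is the one with the highest Q-value in the list above. A minimal sketch of that selection, assuming plain greedy (argmax) selection with no exploration step; the `ACTIONS` table and `greedy_action` helper are illustrative names, not part of the dashboard:

```python
# Action table copied from the panel above: (label, feed amount in kg, Q-value).
ACTIONS = [
    ("Hold", 0.0, -0.62),
    ("Probe", 0.5, 0.04),
    ("Modest", 1.0, 0.41),
    ("Standard", 2.0, 0.78),
    ("Hungry", 3.5, 0.55),
    ("Optimal", 5.0, 0.13),
]


def greedy_action(actions):
    """Return the index of the action with the highest Q-value (assumed greedy policy)."""
    return max(range(len(actions)), key=lambda i: actions[i][2])


idx = greedy_action(ACTIONS)
label, kg, q = ACTIONS[idx]
print(f"a{idx} · {label} · {kg} kg · Q {q:+.2f}")
```

With the values shown, this picks a3 (2.0 kg), matching the selected action.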
DO 6.4 mg/L
Healthy oxygen, no throttle.
Frenzy 0.81 / Motion 78%
Strong visual hunger signal · supports feeding.
PASS
7 / 7 hard rules · 3 / 3 soft rules.
Inputs to the DQN policy, normalized to [0,1] before inference.
Environmental · 13 dims
Biomass · 6 dims
Computer Vision · 5 dims
Feeding History · 9 dims
Performance · 5 dims
Cage · 6 dims
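The six feature groups add up to a 44-dimensional state vector. The sketch below shows one way the [0,1] normalization could work, assuming simple min-max scaling with clipping; the dashboard does not say which scaling is used, and the bounds in the example are hypothetical:

```python
# Feature-group sizes as listed on the dashboard.
GROUP_DIMS = {
    "environmental": 13,
    "biomass": 6,
    "computer_vision": 5,
    "feeding_history": 9,
    "performance": 5,
    "cage": 6,
}
STATE_DIM = sum(GROUP_DIMS.values())


def normalize(raw, low, high):
    """Min-max scale each feature to [0, 1], clipping out-of-range values.

    `low` and `high` are per-feature bounds (hypothetical; not given in the source).
    """
    scaled = []
    for x, lo, hi in zip(raw, low, high):
        span = hi - lo
        value = (x - lo) / span if span > 0 else 0.0
        scaled.append(min(1.0, max(0.0, value)))
    return scaled


print(STATE_DIM)
# Example: DO of 6.4 mg/L with assumed bounds of 0 to 10 mg/L.
print(normalize([6.4], [0.0], [10.0]))
```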
Mean reward
+1.62
Best decision
+3.0
Worst decision
−3.5
Real-world experiences feed the SB3 replay buffer. Retraining triggers once the buffer reaches the minimum size, and the new policy is auto-deployed if it beats the live one.
Experience buffer
1,842 / 1,000
Live policy
+1.18
Candidate policy
+1.84
+56% reward
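The retrain-and-deploy gate described above can be sketched as follows. This is an illustration under stated assumptions, not the system's actual logic: the function names are hypothetical, "beats" is taken to mean a strictly higher mean reward, and the percentage is computed relative to the live policy's reward.

```python
def should_retrain(buffer_size, min_buffer):
    """Retraining is triggered once the buffer holds at least the minimum number of experiences."""
    return buffer_size >= min_buffer


def relative_gain(candidate_reward, live_reward):
    """Improvement of the candidate over the live policy, as a fraction of the live reward."""
    if live_reward == 0:
        raise ValueError("relative gain is undefined when the live reward is 0")
    return (candidate_reward - live_reward) / abs(live_reward)


def should_deploy(candidate_reward, live_reward):
    """Assumed rule: deploy only if the candidate is strictly better."""
    return candidate_reward > live_reward


# Values shown on the dashboard.
print(should_retrain(1842, 1000))
print(should_deploy(1.84, 1.18))
print(f"+{relative_gain(1.84, 1.18):.0%} reward")
```

With the figures shown, (1.84 − 1.18) / 1.18 ≈ 0.56, which matches the +56% on the candidate card.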
- v3.2.1 · May 8 · +1.84 · deployed
- v3.2.0 · Apr 24 · +1.62 · deployed
- v3.1.4 · Apr 03 · +1.41 · deployed
- v3.1.3 · Mar 19 · +1.18 · rejected
- v3.1.2 · Feb 28 · +1.05 · deployed
Hard rules · block feeding entirely
DO ≥ 4.5 mg/L
Now: 6.4 mg/L
Sat ≥ 65%
Now: 88%
Temp ≤ 31 °C
Now: 27.1 °C
Temp ≥ 23 °C
Now: 27.1 °C
Feeds < 6 / day
Now: 3 / 6
≥ 1.5 h since feed
Now: 2.3 h
Wind < 15 m/s
Now: 4.1 m/s
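A rough sketch of how the seven hard rules could be evaluated against a sensor reading. The thresholds come from the list above; the `Reading` type, its field names, and `failed_hard_rules` are hypothetical names introduced for illustration:

```python
from dataclasses import dataclass


@dataclass
class Reading:
    do_mg_l: float           # dissolved oxygen, mg/L
    sat_pct: float           # oxygen saturation, %
    temp_c: float            # water temperature, °C
    feeds_today: int         # feeds already given today
    hours_since_feed: float  # hours since the last feed
    wind_m_s: float          # wind speed, m/s


# Each rule is (name, predicate); feeding is blocked if any predicate is False.
HARD_RULES = [
    ("DO >= 4.5 mg/L", lambda r: r.do_mg_l >= 4.5),
    ("Sat >= 65%", lambda r: r.sat_pct >= 65),
    ("Temp <= 31 °C", lambda r: r.temp_c <= 31),
    ("Temp >= 23 °C", lambda r: r.temp_c >= 23),
    ("Feeds < 6 / day", lambda r: r.feeds_today < 6),
    (">= 1.5 h since feed", lambda r: r.hours_since_feed >= 1.5),
    ("Wind < 15 m/s", lambda r: r.wind_m_s < 15),
]


def failed_hard_rules(reading):
    """Return the names of all hard rules the reading violates (empty list means PASS)."""
    return [name for name, ok in HARD_RULES if not ok(reading)]


now = Reading(do_mg_l=6.4, sat_pct=88, temp_c=27.1,
              feeds_today=3, hours_since_feed=2.3, wind_m_s=4.1)
failed = failed_hard_rules(now)
print(f"{len(HARD_RULES) - len(failed)} / {len(HARD_RULES)} hard rules")
```

With the current readings every rule passes, consistent with the "7 / 7 hard rules" status.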
Soft rules · throttle the amount
Cap to 30% if DO < 5.5
Now: 6.4 mg/L
Cap to 60% if Δ>1.5°C/h
Now: 0.3 °C/h
Cap if waste > 20%
Now: 8%
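The soft rules can be read as caps on the amount the policy proposes. The sketch below makes several assumptions that the dashboard does not confirm: the percentages apply to the raw proposed amount, the temperature rule uses the magnitude of the change, and when several rules fire the tightest cap wins. The waste rule lists no cap value, so it is left as a parameter (`waste_cap`) rather than guessed:

```python
def throttle(raw_kg, do_mg_l, temp_rate_c_per_h, waste_pct, waste_cap=None):
    """Apply the soft-rule caps to a proposed feed amount (illustrative only).

    waste_cap: fraction to cap at when waste > 20%; not specified in the
    source, so the waste rule is skipped unless the caller provides a value.
    """
    fraction = 1.0
    if do_mg_l < 5.5:
        fraction = min(fraction, 0.30)       # Cap to 30% if DO < 5.5
    if abs(temp_rate_c_per_h) > 1.5:
        fraction = min(fraction, 0.60)       # Cap to 60% if temperature changes faster than 1.5 °C/h
    if waste_pct > 20 and waste_cap is not None:
        fraction = min(fraction, waste_cap)  # Cap if waste > 20% (cap value supplied by caller)
    return raw_kg * fraction


# Current readings: DO 6.4 mg/L, 0.3 °C/h, 8% waste, so no throttling applies.
print(throttle(2.0, do_mg_l=6.4, temp_rate_c_per_h=0.3, waste_pct=8))
```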
| Time | Cage | Raw → Applied | Consumed | Reward | Reason |
|---|---|---|---|---|---|
| 09:42 | L1 | 2.0 kg | 96% | +3.0 | Excellent timing · DO 7.2 |
| 09:38 | L4 | 3.5 kg → 0.0 kg | — | −3.5 | DO 4.3 mg/L < 4.5 threshold |
| 09:30 | L3 | 3.5 kg | 93% | +1.5 | High frenzy · Heavy feed |
| 09:22 | L2 | 2.0 kg → 1.0 kg | 78% | +0.4 | Throttled (DO trending down) |
| 09:15 | L1 | 1.0 kg | 99% | +3.0 | Low-amount probe successful |
| 09:08 | L3 | 0.5 kg | 95% | +1.5 | Appetite probe |
| 09:01 | L4 | 0.0 kg | — | +0.5 | AI chose to wait — DO trending down |
| 08:54 | L2 | 2.0 kg | 91% | +1.5 | Within tolerance |