Commit 75401ac

fix: v1.8.5 — tunnel-node caps TCP drain at 16 MiB to stay under Apps Script body ceiling
@bankbunk reported (#460) that on a 1 Gbps VPS, raw MP4 streams in Full mode died with `batch JSON parse error: EOF while parsing a string at line 1 column 52428685` minutes into playback.

Root cause: `drain_now` took the entire per-session read buffer in one shot. On a high-bandwidth VPS the reader task fills the buffer with tens of MiB between polls; the resulting batch response (raw bytes × 1.33 for base64, plus the JSON envelope) exceeded Apps Script's ~50 MiB hard cap; Apps Script truncated the body mid-base64; the client's serde_json parse hit EOF and the stream tore.

Fix: `drain_now` now returns at most TCP_DRAIN_MAX_BYTES (16 MiB) per call and leaves the tail in the buffer for the next poll. EOF is held back until the buffer is fully drained so partial drains don't tear the session prematurely.

Three regression tests cover the cap, the under-cap pass-through, and the EOF-holdback case (33 tunnel-node tests passing). @bankbunk's wondershaper workaround (a 40 Mbps rate cap on the VPS interface) is no longer necessary — high-bandwidth VPS users can run at line rate again.
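The size arithmetic, as a standalone back-of-envelope sketch (not project code; the 40 MiB buffer size is an illustrative stand-in for "tens of MiB", not a measured figure):

```rust
// Back-of-envelope check: why an uncapped drain can overflow Apps Script's
// ~50 MiB body cap while a 16 MiB drain leaves ample headroom.
fn base64_len(raw: usize) -> usize {
    // Padded base64 emits 4 output bytes for every 3 input bytes.
    4 * ((raw + 2) / 3)
}

fn main() {
    const MIB: usize = 1024 * 1024;
    // 50 MiB = 52_428_800 bytes; the error's column 52_428_685 sits just
    // below it, i.e. the body was cut right at the cap, mid-string.
    const APPS_SCRIPT_CAP: usize = 50 * MIB;

    // 40 MiB stands in for "tens of MiB buffered between polls" (assumption).
    let uncapped = base64_len(40 * MIB); // 55_924_056: over the cap before the envelope
    let capped = base64_len(16 * MIB); // 22_369_624: under half the cap

    assert!(uncapped > APPS_SCRIPT_CAP);
    assert!(capped < APPS_SCRIPT_CAP / 2);
    println!("uncapped: {uncapped} B, capped: {capped} B, cap: {APPS_SCRIPT_CAP} B");
}
```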
1 parent 8273325 commit 75401ac

4 files changed

Lines changed: 102 additions & 7 deletions


Cargo.lock

Lines changed: 1 addition & 1 deletion
Some generated files are not rendered by default.

Cargo.toml

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 [package]
 name = "mhrv-rs"
-version = "1.8.4"
+version = "1.8.5"
 edition = "2021"
 description = "Rust port of MasterHttpRelayVPN -- DPI bypass via Google Apps Script relay with domain fronting"
 license = "MIT"

docs/changelog/v1.8.5.md

Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
+<!-- see docs/changelog/v1.1.0.md for the file format: Persian, then `---`, then English. -->
+• fix tunnel-node: cap each TCP drain at 16 MiB so the batch response doesn't exceed Apps Script's ~50 MiB ceiling ([#460](https://github.com/therealaleph/MasterHttpRelayVPN-RUST/issues/460) by @bankbunk): on high-bandwidth VPS (1 Gbps) the reader task can pile tens of MiB into the per-session buffer before the next poll arrives. Previously `drain_now` took the whole buffer into one batch response, added base64 encoding (~1.33×) plus the JSON envelope, and the result blew past Apps Script's 50 MiB ceiling. Apps Script truncated the body mid-base64, and the client-side `serde_json` parse failed with `EOF while parsing a string at line 1 column 52428685`. For MP4 streaming, or any byte-heavy upstream, this bug crashed the stream over and over. `drain_now` now returns at most 16 MiB per poll and keeps the tail in the buffer for the next poll. EOF is not reported until the buffer is fully drained, so the session doesn't tear prematurely. @bankbunk's earlier workaround (limiting the VPS interface to 40 Mbps with `wondershaper`) is no longer needed — the fix landed server-side, and users get normal VPS throughput again.
+---
+• Fix tunnel-node: cap each TCP drain at 16 MiB so batch responses stay under Apps Script's ~50 MiB body ceiling ([#460](https://github.com/therealaleph/MasterHttpRelayVPN-RUST/issues/460) by @bankbunk): on high-bandwidth VPS (1 Gbps+), the reader task can stuff the per-session read buffer with tens of MiB between client polls. The old `drain_now` took the entire buffer in one shot, base64-encoded it (1.33× overhead), wrapped it in JSON, and the resulting body exceeded Apps Script's hard ~50 MiB Web App response limit. Apps Script truncated the body mid-base64; the client failed `serde_json` parse with `EOF while parsing a string at line 1 column 52428685` (just under 50 MiB) and the stream tore. Most visibly, raw MP4 streams crashed minutes into playback. The fix splits oversized buffers: at most `TCP_DRAIN_MAX_BYTES` (16 MiB) is returned per drain, and the remainder stays in the buffer for the next poll. EOF is held back until the buffer is fully drained so partial drains don't prematurely close the session. Three regression tests cover the cap, the under-cap pass-through, and the EOF-holdback case (33 tunnel-node tests passing). @bankbunk's `wondershaper` workaround (rate-limiting the VPS interface to 40 Mbps) is no longer necessary — high-bandwidth VPS users can let throughput run at line rate again.
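Why the EOF holdback matters end to end: a minimal consumer-side sketch, where the hypothetical `poll_drain` closure stands in for the client's real batch poll (it is not the project's API):

```rust
// Hypothetical consumer loop; `poll_drain` stands in for the client's batch
// poll. Capped drains are invisible to this loop only because the server
// withholds eof until its buffer is empty: an eof returned on a partial
// drain would end the loop with the tail still sitting server-side.
fn read_to_end(mut poll_drain: impl FnMut() -> (Vec<u8>, bool)) -> Vec<u8> {
    let mut out = Vec::new();
    loop {
        let (chunk, eof) = poll_drain();
        out.extend_from_slice(&chunk);
        if eof {
            return out;
        }
    }
}

fn main() {
    // Simulate a session that drains in three batches: two capped chunks,
    // then a final sub-cap chunk that carries eof.
    let mut drains = vec![
        (b"head-".to_vec(), false),
        (b"tail-".to_vec(), false),
        (b"end".to_vec(), true),
    ]
    .into_iter();
    let all = read_to_end(|| drains.next().unwrap());
    assert_eq!(all, b"head-tail-end");
}
```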

tunnel-node/src/main.rs

Lines changed: 96 additions & 5 deletions
@@ -83,6 +83,22 @@ const UDP_QUEUE_LIMIT: usize = 256;
 /// a maximum-size IPv4 datagram without truncation.
 const UDP_RECV_BUF_BYTES: usize = 65536;
 
+/// Maximum raw bytes per TCP drain that we hand back to Apps Script in
+/// one batch response. Apps Script's hard cap on Web App response body
+/// is ~50 MiB. Accounting for base64 encoding (1.33×) and JSON envelope
+/// overhead, the safe ceiling for raw bytes is roughly 32 MiB — but
+/// `serde_json::to_vec` for a single 32-MiB string is also a CPU spike,
+/// so we lean further back at 16 MiB. On a high-bandwidth VPS (1 Gbps+)
+/// the reader task can stuff the per-session buffer with tens of MiB
+/// between polls (issue #460); without this cap, `drain_now` would take
+/// the lot, the response would exceed Apps Script's ceiling, the body
+/// would be truncated mid-base64, and the client would fail JSON parse
+/// with `EOF while parsing a string at line 1 column ~52428685`. By
+/// returning at most this many bytes per drain and leaving the rest in
+/// the read buffer for the next poll, we keep responses comfortably
+/// under the cap and let throughput recover across batches.
+const TCP_DRAIN_MAX_BYTES: usize = 16 * 1024 * 1024;
+
 /// First queue-drop on a session always logs at warn level; subsequent
 /// drops log at debug only every Nth occurrence so a single congested
 /// session can't flood the operator's log.
@@ -324,13 +340,33 @@ async fn udp_reader_task(socket: Arc<UdpSocket>, session: Arc<UdpSessionInner>)
     }
 }
 
-/// Drain whatever is currently buffered — no waiting.
-/// Used by batch mode where we poll frequently.
+/// Drain up to `TCP_DRAIN_MAX_BYTES` from the per-session read buffer —
+/// no waiting. Used by batch mode where we poll frequently.
+///
+/// If the buffer is larger than the cap, we return a prefix of the
+/// data and leave the remainder in the buffer for the next poll. The
+/// cap exists to keep batch responses under Apps Script's ~50 MiB body
+/// ceiling on high-bandwidth VPS — see `TCP_DRAIN_MAX_BYTES` for the
+/// underlying issue (#460).
+///
+/// `eof` is reported as true only when the buffer has been fully
+/// drained AND upstream has signaled EOF — otherwise a partial drain
+/// would prematurely tear the session down on the client side.
 async fn drain_now(session: &SessionInner) -> (Vec<u8>, bool) {
     let mut buf = session.read_buf.lock().await;
-    let data = std::mem::take(&mut *buf);
-    let eof = session.eof.load(Ordering::Acquire);
-    (data, eof)
+    let raw_eof = session.eof.load(Ordering::Acquire);
+    if buf.len() <= TCP_DRAIN_MAX_BYTES {
+        let data = std::mem::take(&mut *buf);
+        (data, raw_eof)
+    } else {
+        // Take the prefix; leave the tail in the buffer.
+        let tail = buf.split_off(TCP_DRAIN_MAX_BYTES);
+        let head = std::mem::replace(&mut *buf, tail);
+        // Don't propagate eof yet — buffer still has data even if upstream
+        // has closed. The client will get eof on the drain that returns
+        // an empty (or sub-cap) buffer.
+        (head, false)
+    }
 }
 
 /// Block until *any* of `inners` has buffered data, hits EOF, or the
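The head/tail split in the new body is two standard-library moves; here is the same pattern as a toy standalone demo, shrunk to a cap of 4 bytes:

```rust
// Standalone demo of the head/tail split drain_now uses. Vec::split_off(CAP)
// allocates a new Vec holding everything from index CAP onward and truncates
// the original to its first CAP bytes; mem::replace then swaps that tail back
// into the buffer slot and yields the head by move.
fn main() {
    const CAP: usize = 4;
    let mut buf: Vec<u8> = (0u8..10).collect();

    let tail = buf.split_off(CAP); // buf = [0,1,2,3], tail = [4..10)
    let head = std::mem::replace(&mut buf, tail); // buf now holds the tail

    assert_eq!(head, [0, 1, 2, 3]);
    assert_eq!(buf, [4, 5, 6, 7, 8, 9]);
}
```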
@@ -1590,6 +1626,61 @@ mod tests {
         })
     }
 
+    #[tokio::test]
+    async fn drain_now_caps_at_tcp_drain_max_bytes() {
+        // Issue #460: a 1 Gbps VPS reader fills the buffer with tens of MiB
+        // between polls; drain_now used to take the lot, the JSON response
+        // exceeded Apps Script's body cap, and the client failed JSON parse.
+        // The cap leaves the tail in the buffer for the next drain.
+        let inner = fake_inner().await;
+        let oversized = TCP_DRAIN_MAX_BYTES + 4096;
+        inner.read_buf.lock().await.resize(oversized, 0xab);
+
+        let (first, eof) = drain_now(&inner).await;
+        assert_eq!(first.len(), TCP_DRAIN_MAX_BYTES);
+        assert!(!eof, "shouldn't propagate eof while buffer still has data");
+
+        // Tail remains for the next poll.
+        assert_eq!(inner.read_buf.lock().await.len(), 4096);
+
+        let (second, _) = drain_now(&inner).await;
+        assert_eq!(second.len(), 4096);
+        assert!(inner.read_buf.lock().await.is_empty());
+    }
+
+    #[tokio::test]
+    async fn drain_now_passes_through_when_under_cap() {
+        let inner = fake_inner().await;
+        inner.read_buf.lock().await.extend_from_slice(b"hello world");
+
+        let (data, eof) = drain_now(&inner).await;
+        assert_eq!(data, b"hello world");
+        assert!(!eof);
+        assert!(inner.read_buf.lock().await.is_empty());
+    }
+
+    #[tokio::test]
+    async fn drain_now_holds_eof_until_buffer_drained() {
+        // If upstream signals EOF while the buffer is still oversized, we
+        // must drain the head, leave the tail, and *not* set eof yet.
+        // Eof flips on the final drain that returns a sub-cap buffer.
+        let inner = fake_inner().await;
+        inner.eof.store(true, Ordering::Release);
+        inner
+            .read_buf
+            .lock()
+            .await
+            .resize(TCP_DRAIN_MAX_BYTES + 100, 0);
+
+        let (head, head_eof) = drain_now(&inner).await;
+        assert_eq!(head.len(), TCP_DRAIN_MAX_BYTES);
+        assert!(!head_eof, "premature eof would tear the session");
+
+        let (tail, tail_eof) = drain_now(&inner).await;
+        assert_eq!(tail.len(), 100);
+        assert!(tail_eof, "eof finally flips when buffer is drained");
+    }
+
     #[tokio::test]
     async fn wait_for_any_drainable_returns_immediately_when_buffer_has_data() {
         let inner = fake_inner().await;
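To run just the new regression tests locally: `cargo test -p tunnel-node drain_now` (this assumes `tunnel-node` is a Cargo workspace member named after its directory).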
