Commit 9b32151

v0.9.4: actionable diagnostic when outbound TLS to Google edge fails (#18 follow-up)
@Behzad9 reports: after the EMFILE fix in v0.9.3 landed cleanly, the relay now fails with a different error, repeated on every request:

    ERROR Relay failed: io: invalid peer certificate: UnknownIssuer

This is rustls (via domain_fronter) rejecting the server cert presented by whatever terminates our TLS connection to google_ip. In practice this means one of three things, in decreasing order of likelihood for an Iranian OpenWRT user:

1. The ISP or a middlebox is intercepting outbound TLS to Google IPs and presenting its own cert. webpki-roots (Mozilla trust store, baked in) correctly rejects it.
2. The user's google_ip setting points at a non-Google host.
3. The router clock is wildly off (NTP not synced), so certs look not-yet-valid.

Before this change: one identical ERROR per failed relay, no guidance. The log filled with the same line.

Now:

- New DomainFronter::log_relay_failure() detects cert-related error strings (UnknownIssuer, CertificateExpired, CertNotValidYet, NotValidForName, 'invalid peer certificate').
- First occurrence logs an ERROR with the three root causes and three concrete fixes: run `mhrv-rs scan-ips` to find a working Google IP, check the system clock, or as a LAST RESORT set verify_ssl=false (with the explicit warning that traffic is then only protected by the Apps Script auth_key, not outer TLS).
- Subsequent occurrences drop to debug so the log stays readable. An AtomicBool gate on the DomainFronter instance tracks whether the hint was shown; it resets on proxy restart.
- Non-cert errors still log at error level, unchanged.

49 tests pass, no code-path regressions (log line content changed, not behavior). Shipping so users who hit this get actionable output.
1 parent 54d6931 commit 9b32151

3 files changed

Lines changed: 44 additions & 3 deletions

Cargo.lock

Lines changed: 1 addition & 1 deletion

Cargo.toml

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 [package]
 name = "mhrv-rs"
-version = "0.9.3"
+version = "0.9.4"
 edition = "2021"
 description = "Rust port of MasterHttpRelayVPN -- DPI bypass via Google Apps Script relay with domain fronting"
 license = "MIT"

src/domain_fronter.rs

Lines changed: 42 additions & 1 deletion
@@ -83,6 +83,9 @@ pub struct DomainFronter {
     /// response cache isn't busted by the constantly-changing `features`
     /// / `fieldToggles` params.
     normalize_x_graphql: bool,
+    /// Set once we've emitted the "UnknownIssuer means ISP MITM" hint,
+    /// so we don't spam it every time a cert-validation error repeats.
+    cert_hint_shown: std::sync::atomic::AtomicBool,
     tls_connector: TlsConnector,
     pool: Arc<Mutex<Vec<PoolEntry>>>,
     cache: Arc<ResponseCache>,
@@ -179,6 +182,7 @@ impl DomainFronter {
             auth_key: config.auth_key.clone(),
             parallel_relay: config.parallel_relay as usize,
             normalize_x_graphql: config.normalize_x_graphql,
+            cert_hint_shown: std::sync::atomic::AtomicBool::new(false),
             script_ids,
             script_idx: AtomicUsize::new(0),
             tls_connector,
@@ -307,6 +311,43 @@ impl DomainFronter {
         );
     }
 
+    /// Log a relay failure with extra guidance on cert-validation cases.
+    /// Rate-limited so a flood of identical "UnknownIssuer" errors doesn't
+    /// fill the log.
+    fn log_relay_failure(&self, e: &FronterError) {
+        let msg = e.to_string();
+        let is_cert_issue = msg.contains("UnknownIssuer")
+            || msg.contains("invalid peer certificate")
+            || msg.contains("CertificateExpired")
+            || msg.contains("CertNotValidYet")
+            || msg.contains("NotValidForName");
+        if is_cert_issue
+            && !self
+                .cert_hint_shown
+                .swap(true, std::sync::atomic::Ordering::Relaxed)
+        {
+            // First time — print the full diagnostic. Subsequent hits
+            // drop to debug so the log stays readable.
+            tracing::error!(
+                "Relay failed: {} — this almost always means one of:\n \
+                 (1) your ISP or a middlebox is intercepting TLS to the Google edge \
+                 (common in Iran / IR);\n \
+                 (2) the `google_ip` in your config is pointing at a non-Google host;\n \
+                 (3) your system clock is way off (NTP not synced).\n\
+                 Fixes (try in order): run `mhrv-rs scan-ips` to find a different Google \
+                 frontend IP that isn't being MITM'd; check `date` on your host; as a \
+                 LAST RESORT set `\"verify_ssl\": false` in config.json — this lets the \
+                 relay work even through a middlebox, but your traffic is then only \
+                 protected by the Apps Script relay's secret `auth_key`, not by outer TLS.",
+                e
+            );
+        } else if is_cert_issue {
+            tracing::debug!("Relay failed (cert): {}", e);
+        } else {
+            tracing::error!("Relay failed: {}", e);
+        }
+    }
+
     fn next_sni(&self) -> String {
         let n = self.sni_hosts.len();
         let i = self.sni_idx.fetch_add(1, Ordering::Relaxed) % n;
@@ -479,7 +520,7 @@ impl DomainFronter {
             Ok(Ok(bytes)) => bytes,
             Ok(Err(e)) => {
                 self.relay_failures.fetch_add(1, Ordering::Relaxed);
-                tracing::error!("Relay failed: {}", e);
+                self.log_relay_failure(&e);
                 return error_response(502, &format!("Relay error: {}", e));
             }
             Err(_) => {
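
The one-shot gate in log_relay_failure() can be tried in isolation. The following is a minimal sketch, not the code in this commit: `HintGate`, `LogAction`, `CERT_MARKERS`, and `is_cert_issue` are hypothetical names introduced for illustration, and the sketch returns a decision instead of calling `tracing`. It relies only on `AtomicBool::swap`, which stores the new value and returns the previous one, so exactly one caller sees `false`.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Hypothetical: what the caller should log for a given failure message.
#[derive(Debug, PartialEq, Eq)]
pub enum LogAction {
    /// First cert-related failure: log the full hint at error level.
    FullHint,
    /// Repeated cert-related failure: log briefly at debug level.
    Debug,
    /// Anything else: log at error level, as before.
    Error,
}

/// Substrings treated as cert-validation failures (same list as the diff).
const CERT_MARKERS: [&str; 5] = [
    "UnknownIssuer",
    "invalid peer certificate",
    "CertificateExpired",
    "CertNotValidYet",
    "NotValidForName",
];

/// Returns true if the message contains any cert-related marker.
pub fn is_cert_issue(msg: &str) -> bool {
    CERT_MARKERS.iter().any(|marker| msg.contains(marker))
}

/// Hypothetical stand-in for the `cert_hint_shown` field on DomainFronter.
pub struct HintGate {
    shown: AtomicBool,
}

impl HintGate {
    pub fn new() -> Self {
        Self { shown: AtomicBool::new(false) }
    }

    /// Decide how to log `msg`. The flag is only touched for cert errors,
    /// so an unrelated failure never consumes the one-time hint.
    pub fn classify(&self, msg: &str) -> LogAction {
        if !is_cert_issue(msg) {
            return LogAction::Error;
        }
        // swap() returns the previous value: false means we are first.
        if self.shown.swap(true, Ordering::Relaxed) {
            LogAction::Debug
        } else {
            LogAction::FullHint
        }
    }
}
```

`Ordering::Relaxed` is enough here because the flag guards no other data; it only decides which log level to use. As in the commit, a fresh gate (a proxy restart) shows the hint again.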
