Kianmhz/GooseRelayVPN v1.7.0 on GitHub

Highlights

connect_data ordering bug — the real cause of "v1.6 feels slower than v1.5"

Session.EnqueueInitialData was prepending each call's bytes instead of appending. When a SOCKS5 client wrote in several small chunks before the SYN had drained — TLS handshake, HTTP/1.1 request line + headers, anything with a length prefix — the chunks shipped to the upstream in reverse order. The upstream either errored, parsed garbage, or (in the bench harness's sized sink) silently misread the body, which is how this regressed unnoticed since the connect_data optimization landed.

Fixes the upload measurement artifact that made v1.5 look 5× faster than v1.6, fixes real-world protocol corruption for fast multi-write clients, and ships a regression test that asserts byte order across multiple EnqueueInitialData calls. Closes #147.
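As a rough illustration of the failure mode, the sketch below replays three small writes through a prepending and an appending buffer. The `initialData` type and its methods are hypothetical stand-ins, not the project's actual `Session` code.

```go
package main

import (
	"bytes"
	"fmt"
)

// initialData is a hypothetical stand-in for the buffer that holds bytes
// written before the SYN has drained. It is not the project's Session type.
type initialData struct {
	buf []byte
}

// enqueuePrepend reproduces the buggy behaviour: each new chunk lands in
// front of the bytes that were queued earlier.
func (d *initialData) enqueuePrepend(p []byte) {
	d.buf = append(append([]byte{}, p...), d.buf...)
}

// enqueueAppend is the fixed behaviour: chunks keep their write order.
func (d *initialData) enqueueAppend(p []byte) {
	d.buf = append(d.buf, p...)
}

// replay feeds the chunks through the chosen enqueue function and returns
// the bytes that would ship upstream.
func replay(chunks [][]byte, prepend bool) []byte {
	var d initialData
	for _, c := range chunks {
		if prepend {
			d.enqueuePrepend(c)
		} else {
			d.enqueueAppend(c)
		}
	}
	return d.buf
}

func main() {
	chunks := [][]byte{
		[]byte("GET / HTTP/1.1\r\n"),
		[]byte("Host: example.com\r\n"),
		[]byte("\r\n"),
	}
	want := bytes.Join(chunks, nil)

	fmt.Println("append preserves order: ", bytes.Equal(replay(chunks, false), want))
	fmt.Println("prepend preserves order:", bytes.Equal(replay(chunks, true), want))
}
```

With a single write the two variants are indistinguishable, which is consistent with the bug only showing up for clients that write several chunks before the SYN drains.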

Worker pool sized by endpoint count again

workersPerEndpoint × len(endpoints) is back — v1.6 had collapsed this to (workersPerEndpoint + idleSlotsPerBucket − 1) × bucketCount, which starved configs with multiple endpoints per account. The per-account idle-slot semaphore that v1.6 introduced is preserved (#56 stays fixed), but it now bounds slot acquisition instead of throttling worker creation. Configs that pair every account with 2+ deployments will see materially more concurrency under bursty load. Closes #146.
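To make the difference concrete, here is a small sketch that evaluates both formulas for one made-up configuration. The function names and the numbers are illustrative assumptions, and it treats one bucket as one account.

```go
package main

import "fmt"

// poolSizeV16 follows the v1.6 formula quoted above:
// (workersPerEndpoint + idleSlotsPerBucket - 1) * bucketCount.
func poolSizeV16(workersPerEndpoint, idleSlotsPerBucket, bucketCount int) int {
	return (workersPerEndpoint + idleSlotsPerBucket - 1) * bucketCount
}

// poolSizeV17 follows the restored formula:
// workersPerEndpoint * len(endpoints).
func poolSizeV17(workersPerEndpoint, endpointCount int) int {
	return workersPerEndpoint * endpointCount
}

func main() {
	// Hypothetical config: 2 accounts (buckets) with 3 deployments each,
	// 3 workers per endpoint, 1 idle slot per bucket.
	const (
		workersPerEndpoint = 3
		idleSlotsPerBucket = 1
		bucketCount        = 2
		endpointCount      = 6
	)
	fmt.Println("v1.6 workers:", poolSizeV16(workersPerEndpoint, idleSlotsPerBucket, bucketCount))
	fmt.Println("v1.7 workers:", poolSizeV17(workersPerEndpoint, endpointCount))
}
```

Under these assumed numbers the v1.6 formula yields 6 workers and the restored one yields 18, because the v1.6 result does not grow with the number of endpoints per account.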

Default `idle_slots_per_bucket` raised from 1 → 2

With the per-account semaphore from #146 enforcing the safety cap, two idle slots per bucket is the better default for the recommended multi-deployment setup. Lower to 1 if every account has only one deployment and you see issue-#56-style 429s; raise to 3 for accounts with 3+ deployments.
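One common way to enforce such a cap in Go is a buffered channel used as a counting semaphore. The sketch below shows that pattern with a capacity of 2. The `slotSem` type is hypothetical and may not match how the carrier implements its per-account semaphore.

```go
package main

import "fmt"

// slotSem is a hypothetical counting semaphore capping how many idle
// slots one account (bucket) may hold at the same time.
type slotSem chan struct{}

// newSlotSem creates a semaphore with room for idleSlotsPerBucket holders.
func newSlotSem(idleSlotsPerBucket int) slotSem {
	return make(slotSem, idleSlotsPerBucket)
}

// tryAcquire takes a slot without blocking; false means the cap is reached.
func (s slotSem) tryAcquire() bool {
	select {
	case s <- struct{}{}:
		return true
	default:
		return false
	}
}

// release returns a slot previously obtained with tryAcquire.
func (s slotSem) release() { <-s }

func main() {
	sem := newSlotSem(2) // mirrors the new default of 2 idle slots per bucket

	first := sem.tryAcquire()
	second := sem.tryAcquire()
	third := sem.tryAcquire() // cap reached, so this one is refused
	fmt.Println(first, second, third)

	sem.release()
	fmt.Println(sem.tryAcquire()) // a freed slot can be taken again
}
```

Because the semaphore limits slot acquisition and not worker creation, the worker pool can stay large while the number of simultaneously idle slots per account stays bounded.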

Faster recovery after local network outages

The carrier now treats local-offline errors (airplane mode, hotel Wi-Fi captive portal, mobile signal drop) differently from upstream Apps Script failures. Endpoints are paused for 15s instead of being escalated into the 30m/1h backoff penalty box, and a 5s recovery probe restores them automatically when the network returns. This resolves a long-standing mobile-client annoyance. Closes #137.
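The policy described above can be sketched as a small decision function. The durations come from these notes, but the `failureKind` classification and the strike-based escalation rule are assumptions made for illustration; the notes do not describe how the carrier detects local-offline errors or moves between the 30m and 1h tiers.

```go
package main

import (
	"fmt"
	"time"
)

// failureKind is a hypothetical classification of a failed request.
type failureKind int

const (
	upstreamFailure failureKind = iota // Apps Script or upstream error
	localOffline                       // no usable local network
)

const (
	localPause    = 15 * time.Second // short pause for local-offline errors
	probeInterval = 5 * time.Second  // recovery probe cadence
	firstBackoff  = 30 * time.Minute // first penalty-box tier
	secondBackoff = time.Hour        // second penalty-box tier
)

// pauseFor returns how long an endpoint stays out of rotation.
// strikes counts earlier consecutive upstream failures (assumed rule).
func pauseFor(kind failureKind, strikes int) time.Duration {
	if kind == localOffline {
		// Local outages never escalate into the long backoff tiers.
		return localPause
	}
	if strikes == 0 {
		return firstBackoff
	}
	return secondBackoff
}

func main() {
	fmt.Println("local offline:    ", pauseFor(localOffline, 3))
	fmt.Println("upstream, first:  ", pauseFor(upstreamFailure, 0))
	fmt.Println("upstream, repeat: ", pauseFor(upstreamFailure, 1))
	fmt.Println("probe interval:   ", probeInterval)
}
```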

Server-side response bounds

The exit server now caps the initial downstream response size and bounds relay body reads, preventing single oversized upstream responses from spiking memory or stalling the long-poll. No protocol changes; pure defense-in-depth. Closes #138, #139.

Apps Script `Code.gs` rewrite

RELAY_URL is now RELAY_URLS (an array). The rewrite also adds a relay-loop guard that rejects URLs pointing at script.google.com (footgun protection), opt-in invocation counting (no per-request cost by default), and improved error reporting for upstream failures. Redeploying your Code.gs is recommended to pick up the loop guard and cost savings. Closes #129.

Diagnostics bundle

goose-client --diag now emits a redacted snapshot (config minus secrets, recent stats lines, build info) suitable for sharing on a GitHub issue without manual scrubbing. Closes #136.

Frame marshaling perf

EncodeBatch migrated to AppendMarshal, dropping per-encode allocations on the hot path. Server-side throughput and client CPU both benefit at high frame rates. Closes #130.
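The append-style pattern generally looks like the sketch below: each frame appends its encoding to a caller-supplied slice, so a batch encoder can reuse one buffer instead of allocating per frame. The `frame` layout and the lowercase function names are hypothetical; the project's real wire format and signatures are not described in these notes.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// frame is a hypothetical layout: 4-byte session ID, 2-byte payload
// length, then the payload. Payloads over 65535 bytes are not handled.
type frame struct {
	sessionID uint32
	payload   []byte
}

// appendMarshal appends the encoded frame to dst and returns the extended
// slice, so the caller controls where the bytes are stored.
func (f frame) appendMarshal(dst []byte) []byte {
	dst = binary.BigEndian.AppendUint32(dst, f.sessionID)
	dst = binary.BigEndian.AppendUint16(dst, uint16(len(f.payload)))
	return append(dst, f.payload...)
}

// encodeBatch resets buf to length zero and encodes every frame into it,
// reusing the existing capacity when it is large enough.
func encodeBatch(buf []byte, frames []frame) []byte {
	buf = buf[:0]
	for _, f := range frames {
		buf = f.appendMarshal(buf)
	}
	return buf
}

func main() {
	buf := make([]byte, 0, 64) // reused across batches by the caller
	frames := []frame{
		{sessionID: 1, payload: []byte("hi")},
		{sessionID: 2, payload: []byte("yo")},
	}
	out := encodeBatch(buf, frames)
	fmt.Printf("%d bytes: % x\n", len(out), out)
}
```

As long as the reused buffer has enough capacity, encoding a batch this way performs no heap allocation, which is the kind of saving the change targets on the hot path.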

Compatibility

Wire protocol: unchanged. v1.7 clients work against v1.6 servers and vice versa.
server_config.json: no breaking changes.
client_config.json: no breaking changes. idle_slots_per_bucket default changes from 1 to 2 if unset; existing configs that explicitly set the value are unaffected.
apps_script/Code.gs: RELAY_URL constant renamed to RELAY_URLS (array). Existing deployments keep working unchanged; the rename only applies once you redeploy.

Upgrading

Update binaries on your client and VPS (use scripts/deploy.sh user@host for the server).
Optional but recommended: redeploy apps_script/Code.gs to pick up the loop guard and the opt-in invocation counter.
No config edits required.

Benchmark deltas vs v1.6.0

Metric	v1.6.0	v1.7.0	Δ
Upload 8 MB / single session	4.49 MB/s	5.61 MB/s	+25%
Upload 8 MB / 4 sessions	21.97 MB/s	22.02 MB/s	≈ tie
Download 8 MB / single session	19.93 MB/s	20.10 MB/s	≈ tie
TTFB p50 / p95 / p99	352 / 352 / 352 ms	352 / 353 / 353 ms	≈ tie

Connect-data bundling verified at ~98% in the worst-case bench shape (ttfb_p50_p95, Dial → Write → Read 50×). The 1–2% race surfaced in #144 is real but bounded, and not worth gating the release on.