TruGrid SecureRDP - Mac Network Monitor Script

TruGrid Mac Network Test


This script is the Mac companion to the Windows .bat ping launcher. Same idea: when an RDP session drops and you do not know whether the problem is the LAN, the ISP, or TruGrid, run this in the background and let it tell you which link broke when the disconnect happened.


Unlike the Windows version, which opens three terminal windows running ping, the Mac version is a single zsh script that probes up to five targets every two seconds (two public DNS servers, the default gateway, the TruGrid front end, and the currently assigned TruGrid relay). The gateway and relay probes are skipped if they cannot be detected. The script clusters any outages it sees and writes a self-contained HTML report to the Desktop with a correlation verdict explaining which side of the path failed.
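
As a rough illustration of what counts as an outage: the script's Config section sets OutageThresholdConsecutive=2, so a single missed probe is ignored and two or more missed probes in a row for the same target are recorded as one outage. The sketch below shows that rule on its own. The function name count_outages and the one-word-per-line input format are assumptions made for the example, not part of the script.

```shell
# Sketch: count outages in a stream of per-sample results for one target.
# Input on stdin: one word per line, OK or FAIL, in sample order.
# An outage is a run of at least <threshold> consecutive FAIL samples.
count_outages() {
  awk -v thr="${1:-2}" '
    $1 == "FAIL" { run++; next }                 # extend the current failure run
    { if (run >= thr) outages++; run = 0 }       # a success closes the run
    END { if (run >= thr) outages++; print outages + 0 }
  '
}

# One isolated failure is ignored; the two-sample failure counts once.
printf '%s\n' OK FAIL OK FAIL FAIL OK | count_outages 2   # prints 1
```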


How to use it

  1. Open TextEdit on the Mac. Choose Format then Make Plain Text before doing anything else, or you will save an RTF and it will not run.
  2. Copy the script (below) into the empty document.
  3. Save the file to ~/Downloads (or anywhere convenient) with the name trugrid-mac-network-test.zsh.
  4. Open Terminal, then make the script executable by running chmod +x ~/Downloads/trugrid-mac-network-test.zsh.
  5. Run the script by typing ~/Downloads/trugrid-mac-network-test.zsh and pressing Return.
  6. Minimize the Terminal window and work normally. The script keeps probing for 60 minutes by default, or until you press Ctrl+C, or until it has caught 10 outages, whichever comes first.
  7. When the script ends, an HTML report opens automatically in the default browser. Attach that file to the support ticket.
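
If the script refuses to run, the most common cause is the RTF problem from step 1. The following is a minimal sketch of a pre-flight check you could paste into Terminal before step 5. It only confirms that the first line of the saved file is the #!/bin/zsh line the script starts with. The function name is_plain_zsh_script is made up for this example, and the path assumes you saved the file to ~/Downloads as in step 3.

```shell
# Sketch: confirm the saved file is a plain-text zsh script, not RTF.
# Returns 0 when the first line begins with the script's zsh shebang.
is_plain_zsh_script() {
  first_line=$(head -n 1 "$1" 2>/dev/null)
  case "$first_line" in
    '#!/bin/zsh'*) return 0 ;;
    *)             return 1 ;;   # missing file, or RTF (which starts with "{\rtf")
  esac
}

script="$HOME/Downloads/trugrid-mac-network-test.zsh"   # assumed save location
if is_plain_zsh_script "$script"; then
  chmod +x "$script"
  echo "Ready to run: $script"
else
  echo "Not a plain-text zsh script (missing, or saved as RTF?): $script" >&2
fi
```

If the check fails, repeat steps 1 to 3 and make sure Make Plain Text was selected before pasting.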


Options


The script accepts a few optional flags. Append them to the run command in step 5.

  • --duration N runs for N minutes instead of the default 60.
  • --indefinite runs until Ctrl+C is pressed. The HTML report is generated when you stop it.
  • --no-open skips auto-opening the HTML report. Useful over SSH or in batch runs.
  • -h or --help prints usage.
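
For batch runs with --no-open, the script prints a line of the form REPORT_PATH=... just before it finishes, so a wrapper can pick up where the report was written. The sketch below is one way to do that. The helper name extract_report_path is hypothetical, and the 30-minute duration in the usage comment is only an example value.

```shell
# Sketch: pull the report path out of the monitor's output.
# Reads the run's stdout on stdin and prints the value of the last
# REPORT_PATH= line.
extract_report_path() {
  sed -n 's/^REPORT_PATH=//p' | tail -n 1
}

# Usage (example values, run from Terminal):
#   report=$(~/Downloads/trugrid-mac-network-test.zsh --duration 30 --no-open | extract_report_path)
#   echo "Report saved to: $report"
```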


If a disconnect is rare and you want the script to run all day until it catches one, use --indefinite.
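
Besides Ctrl+C, the script also stops when it finds a stop-flag file, which it checks for at the start of every sample round (see StopFlagPath in the script's Config section). The sketch below creates that file from a second Terminal window. It assumes the second window belongs to the same user and therefore has the same TMPDIR as the window running the monitor. The function name request_monitor_stop is made up for this example.

```shell
# Sketch: ask a running monitor to stop by creating its stop-flag file.
# The path mirrors StopFlagPath in the script's Config section.
request_monitor_stop() {
  flag_dir="${TMPDIR:-/tmp}/TruGridDiag"
  mkdir -p "$flag_dir" && : > "$flag_dir/netmonitor.stop"
}

# Usage, from a second Terminal window:
#   request_monitor_stop
```

The monitor should notice the flag within one sample interval, generate the report as usual, and remove the flag file itself.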

What the report tells you


The report opens in the browser and has four parts:

  • Run summary: how many sample rounds completed, how many outages were recorded, and why the run stopped.
  • Correlation verdict: a plain-English sentence explaining where the failure was. For example: "At 14:32:18, only TruGrid path failed for 6s. Other connectivity healthy. Suggests a TruGrid Azure or transit issue, not your network."
  • Per-target statistics: success rate and latency stats (mean, median, P95, P99, max) for each probed target.
  • Latency charts: one chart per target with red bands marking outages and a blue line showing latency over time.
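
To make the percentile columns concrete, here is a small sketch of a nearest-rank percentile over a list of latencies, using the same index rounding the script applies to its sorted samples (round n × p to the nearest whole position, clamped to the list). The function name percentile and the sample values are illustrative only.

```shell
# Sketch: nearest-rank percentile over latency values (one per line, in ms).
percentile() {
  sort -n | awk -v p="$1" '
    { v[NR] = $1 }
    END {
      if (NR == 0) { print "-"; exit }
      i = int(NR * p / 100 + 0.5)   # nearest rank, as in the script
      if (i < 1) i = 1
      if (i > NR) i = NR
      print v[i]
    }
  '
}

# Ten samples with one slow outlier: the median barely moves, P95 exposes it.
printf '%s\n' 12 15 11 240 13 14 12 13 16 12 | percentile 50   # prints 13
printf '%s\n' 12 15 11 240 13 14 12 13 16 12 | percentile 95   # prints 240
```

This is why a single slow sample can dominate the mean and the upper percentiles while the median stays close to the typical latency.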

If everything stayed up for the entire run, you will see a green "Connectivity was stable across all probed targets for the entire run" line in place of the verdict. In that case the support team will know the disconnect happened outside the probed paths, most often on the RDP host machine itself.
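
The reasoning behind the verdict can be summarized as: look at which kinds of targets failed together, then pick the part of the path they have in common. The sketch below is a deliberately simplified version of that idea, not the script's actual emit_verdict function. It reuses the script's kind names (internet, gateway, trugrid, relay), but the function name likely_fault and the short labels it prints are invented for illustration, and it treats the relay and the front end as one "TruGrid path".

```shell
# Sketch: map the kinds of targets that failed together to a likely fault area.
# Argument: space-separated kinds, e.g. "internet gateway trugrid".
likely_fault() {
  kinds=" $1 "
  inet=0; gw=0; tg=0
  case "$kinds" in *" internet "*) inet=1 ;; esac
  case "$kinds" in *" gateway "*)  gw=1 ;; esac
  case "$kinds" in *" trugrid "*|*" relay "*) tg=1 ;; esac

  if [ "$inet" -eq 1 ] && [ "$gw" -eq 1 ] && [ "$tg" -eq 1 ]; then
    echo "total connectivity loss"
  elif [ "$gw" -eq 1 ] && [ "$inet" -eq 0 ]; then
    echo "local network"
  elif [ "$inet" -eq 1 ] && [ "$gw" -eq 0 ]; then
    echo "ISP"
  elif [ "$tg" -eq 1 ]; then
    echo "TruGrid path"
  else
    echo "manual review"
  fi
}

likely_fault "trugrid relay"   # prints: TruGrid path
likely_fault "internet"        # prints: ISP
```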


The script

Paste the contents of the script below into your TextEdit document at step 2.


#!/bin/zsh
# trugrid-mac-network-test.zsh
# Mac port of TruGrid SecureRDP Toolkit Test-TruGridNetworkMonitor.ps1
# v1.1
#
# Matches Windows behavior: 5 targets, 2s interval, outage clustering,
# per-target SVG charts, correlation verdict.

emulate -L zsh
setopt null_glob

# ===== Config =====
SampleIntervalSeconds=2
MaxRunDurationMinutes=60 # 60 min default (was 240; can override)
MaxOutageEvents=10
OutageThresholdConsecutive=2
ClusterWindowSeconds=10

ICMP_TIMEOUT_MS=1000
TCP_TIMEOUT_SEC=2
TCP_MAX_TIME_SEC=3

OutputDir="${HOME}/Desktop"
StopFlagPath="${TMPDIR:-/tmp}/TruGridDiag/netmonitor.stop"
# ==================

INDEFINITE=0
NO_OPEN=0
DURATION_OVERRIDE=""

usage() {
cat <<EOF
Usage: ${0##*/} [options]
--indefinite Run until Ctrl+C or stop flag
--duration N Override max duration in minutes (default 60)
--no-open Skip auto-open of HTML report
-h, --help Show this help
EOF
exit 0
}

while (( $# > 0 )); do
case "$1" in
--indefinite) INDEFINITE=1; shift ;;
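--duration)
# Require a whole number of minutes; a missing value would otherwise stall the argument loop.
[[ "$2" =~ '^[0-9]+$' ]] || { echo "--duration requires a whole number of minutes" >&2; exit 1; }
DURATION_OVERRIDE="$2"; shift 2 ;;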
--no-open) NO_OPEN=1; shift ;;
-h|--help) usage ;;
*) echo "Unknown arg: $1" >&2; usage ;;
esac
done

[[ -n "$DURATION_OVERRIDE" ]] && MaxRunDurationMinutes="$DURATION_OVERRIDE"

(( EUID == 0 )) && echo "Warning: running as root, lsof will not see your user's TruGrid client." >&2

RUN_STAMP=$(date +%Y%m%d_%H%M%S)
HOSTNAME_SHORT=$(hostname -s)
HTML_PATH="${OutputDir}/TruGridNetMonitor_${HOSTNAME_SHORT}_${RUN_STAMP}.html"

TMP_DIR="/tmp/trugrid-netmon.$$"
mkdir -p "$TMP_DIR" || { echo "Cannot create $TMP_DIR" >&2; exit 1; }
SAMPLES_TSV="$TMP_DIR/samples.tsv"
: > "$SAMPLES_TSV"

mkdir -p "$(dirname "$StopFlagPath")" 2>/dev/null
rm -f "$StopFlagPath"

INTERRUPTED=0
on_int() { INTERRUPTED=1; }
trap on_int INT TERM
trap "rm -rf $TMP_DIR" EXIT

# ---------- Helpers ----------

get_default_gateway() {
route -n get default 2>/dev/null | awk '/gateway:/{print $2; exit}'
}

get_relay_ip() {
local listen_data client_pid
listen_data=$(lsof -iTCP -nP 2>/dev/null \
| awk '
NR>1 {
local_side = $9
sub(/->.*/, "", local_side)
if (local_side ~ /^127\.0\.0\.1:43[0-9]+$/) print $2 "\t" local_side
}' \
| sort -u)
[[ -z "$listen_data" ]] && return 0
client_pid=$(echo "$listen_data" | sort -t: -k2 -n | tail -1 | cut -f1)

local ws_ip app_ip
ws_ip=$(dig +short +tries=1 +time=2 ws.trugrid.com A 2>/dev/null | grep -E '^[0-9]' | head -1)
app_ip=$(dig +short +tries=1 +time=2 app.trugrid.com A 2>/dev/null | grep -E '^[0-9]' | head -1)

lsof -i -nP -a -p "$client_pid" 2>/dev/null \
| awk '/(ESTABLISHED|CLOSE_WAIT)/ {print $9}' \
| awk -F'->' 'NF==2 {print $2}' \
| grep -E '^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+:443$' \
| grep -vE "^(${ws_ip:-NOMATCH}|${app_ip:-NOMATCH}):" \
| awk -F: '{print $1}' \
| sort -u | head -1
}

probe_icmp() {
local h="$1" out
out=$(ping -c 1 -W $ICMP_TIMEOUT_MS -q "$h" 2>&1)
if echo "$out" | grep -q "round-trip"; then
echo "$out" | grep "round-trip" | awk -F'/' '{print $5}' | awk '{printf "%.0f", $1+0.5}'
fi
}

probe_tcp() {
local h="$1" p="$2" t
t=$(curl -s -o /dev/null -w "%{time_connect}" \
--connect-timeout $TCP_TIMEOUT_SEC --max-time $TCP_MAX_TIME_SEC \
"https://${h}:${p}/" 2>/dev/null)
if [[ -n "$t" ]] && awk -v v="$t" 'BEGIN{exit (v > 0) ? 0 : 1}'; then
awk -v v="$t" 'BEGIN{printf "%.0f", v*1000+0.5}'
fi
}

# Perl-based ISO timestamp to epoch (reliable across timezones, ships on macOS)
iso_to_epoch() {
local iso="$1"
[[ -z "$iso" ]] && { echo 0; return; }
perl -MTime::Piece -e '
my $s = $ARGV[0];
$s =~ s/([+-])(\d{2})(\d{2})$/$1$2:$3/;
my $t = Time::Piece->strptime($s, "%Y-%m-%dT%H:%M:%S%z");
print $t->epoch;
' "$iso" 2>/dev/null || echo 0
}

# ---------- Targets ----------

typeset -a TARGET_NAMES TARGET_TYPES TARGET_HOSTS TARGET_PORTS TARGET_KINDS

add_target() {
TARGET_NAMES+=("$1"); TARGET_TYPES+=("$2"); TARGET_HOSTS+=("$3"); TARGET_PORTS+=("$4"); TARGET_KINDS+=("$5")
}

echo "TruGrid Network Monitor (Mac port v1.1)"
echo "============================================================"
echo ""

add_target "Cloudflare DNS (1.1.1.1)" "icmp" "1.1.1.1" "" "internet"
add_target "Google DNS (8.8.8.8)" "icmp" "8.8.8.8" "" "internet"

GW=$(get_default_gateway)
if [[ -n "$GW" ]]; then
add_target "Local Gateway ($GW)" "icmp" "$GW" "" "gateway"
echo " Detected default gateway: $GW"
else
echo " Note: could not detect default gateway, LAN probe skipped."
fi

add_target "TruGrid Front End (ws.trugrid.com:443)" "tcp" "ws.trugrid.com" "443" "trugrid"

RELAY_IP=$(get_relay_ip)
if [[ -n "$RELAY_IP" ]]; then
add_target "TruGrid Relay ($RELAY_IP)" "icmp" "$RELAY_IP" "" "relay"
echo " Found latest relay IP from lsof: $RELAY_IP"
else
echo " No active TruGrid session detected. Relay probe skipped."
fi

NUM_TARGETS=${#TARGET_NAMES[@]}
echo ""
if (( INDEFINITE )); then
echo "Probing $NUM_TARGETS target(s) every ${SampleIntervalSeconds}s."
echo "Stops after $MaxOutageEvents outage events, stop flag, or Ctrl+C."
else
echo "Probing $NUM_TARGETS target(s) every ${SampleIntervalSeconds}s."
echo "Stops after $MaxRunDurationMinutes min, $MaxOutageEvents outage events, stop flag, or Ctrl+C."
fi
echo "Will write report to: $HTML_PATH"
echo ""

# ---------- Main loop ----------

START_TS=$(date +%s)
END_TS=$(( START_TS + MaxRunDurationMinutes * 60 ))
SAMPLE_IDX=0
TOTAL_OUTAGES=0
STOP_REASON=""

typeset -A CONSEC_FAIL LAST_GLYPH OUTAGE_RECORDED
for ((i=1; i<=NUM_TARGETS; i++)); do
CONSEC_FAIL[${TARGET_NAMES[$i]}]=0
LAST_GLYPH[${TARGET_NAMES[$i]}]="?"
done

while (( INTERRUPTED == 0 )); do
if [[ -f "$StopFlagPath" ]]; then
STOP_REASON="stop flag"
break
fi
if (( INDEFINITE == 0 )) && (( $(date +%s) >= END_TS )); then
STOP_REASON="max duration reached"
break
fi
if (( TOTAL_OUTAGES >= MaxOutageEvents )); then
STOP_REASON="max outage events"
break
fi

SAMPLE_IDX=$(( SAMPLE_IDX + 1 ))
now_iso=$(date +%Y-%m-%dT%H:%M:%S%z)

for ((i=1; i<=NUM_TARGETS; i++)); do
t_type="${TARGET_TYPES[$i]}"
t_host="${TARGET_HOSTS[$i]}"
t_port="${TARGET_PORTS[$i]}"
outfile="$TMP_DIR/probe_$i.out"
if [[ "$t_type" == "icmp" ]]; then
(probe_icmp "$t_host" > "$outfile") &
else
(probe_tcp "$t_host" "$t_port" > "$outfile") &
fi
done
wait

for ((i=1; i<=NUM_TARGETS; i++)); do
name="${TARGET_NAMES[$i]}"
outfile="$TMP_DIR/probe_$i.out"
result=$(cat "$outfile" 2>/dev/null)
rm -f "$outfile"

if [[ -n "$result" ]]; then
printf "%s\t%d\t%s\t%s\tOK\n" "$name" "$SAMPLE_IDX" "$now_iso" "$result" >> "$SAMPLES_TSV"
LAST_GLYPH[$name]="+"
CONSEC_FAIL[$name]=0
unset "OUTAGE_RECORDED[$name]" 2>/dev/null
else
printf "%s\t%d\t%s\t\tFAIL\n" "$name" "$SAMPLE_IDX" "$now_iso" >> "$SAMPLES_TSV"
LAST_GLYPH[$name]="X"
CONSEC_FAIL[$name]=$(( CONSEC_FAIL[$name] + 1 ))
if (( CONSEC_FAIL[$name] == OutageThresholdConsecutive )) && [[ -z "${OUTAGE_RECORDED[$name]:-}" ]]; then
TOTAL_OUTAGES=$(( TOTAL_OUTAGES + 1 ))
OUTAGE_RECORDED[$name]=1
echo ""
echo "[outage] $name failed ${OutageThresholdConsecutive} consecutive samples (event #${TOTAL_OUTAGES})"
fi
fi
done

elapsed=$(( $(date +%s) - START_TS ))
elapsed_str=$(printf '%02d:%02d:%02d' $((elapsed/3600)) $((elapsed%3600/60)) $((elapsed%60)))
status_line="[${elapsed_str}] s=${SAMPLE_IDX}"
for ((i=1; i<=NUM_TARGETS; i++)); do
n="${TARGET_NAMES[$i]}"
short="${n%% \(*}"
status_line+=" ${short}:${LAST_GLYPH[$n]}"
done
printf "\r%s\033[K" "$status_line"

for ((s=0; s<SampleIntervalSeconds; s++)); do
(( INTERRUPTED )) && break
sleep 1
done
done

(( INTERRUPTED )) && STOP_REASON="${STOP_REASON:-Ctrl+C}"
[[ -z "$STOP_REASON" ]] && STOP_REASON="loop exit"

END_TS_ACTUAL=$(date +%s)
ACTUAL_DURATION=$(( END_TS_ACTUAL - START_TS ))
echo ""
echo "Stop reason: $STOP_REASON"
echo "Samples collected: $SAMPLE_IDX rounds across $NUM_TARGETS targets"

rm -f "$StopFlagPath" 2>/dev/null

if (( SAMPLE_IDX == 0 )); then
echo "No samples collected, skipping report."
exit 0
fi

# ---------- Post-processing ----------

echo "Generating report..."

# Stats per target
echo " computing per-target stats..."
compute_stats() {
local target="$1"
local total ok
total=$(awk -F'\t' -v t="$target" '$1==t' "$SAMPLES_TSV" | wc -l | tr -d ' ')
ok=$(awk -F'\t' -v t="$target" '$1==t && $5=="OK"' "$SAMPLES_TSV" | wc -l | tr -d ' ')
: ${total:=0}
: ${ok:=0}

local rate="0.0"
(( total > 0 )) && rate=$(awk -v o="$ok" -v t="$total" 'BEGIN{printf "%.1f", (o/t)*100}')

if (( ok == 0 )); then
echo "${total}|0|${rate}|-|-|-|-|-"
return
fi

local lat_file="$TMP_DIR/lat.$RANDOM"
awk -F'\t' -v t="$target" '$1==t && $5=="OK" {print $4}' "$SAMPLES_TSV" | sort -n > "$lat_file"
local n=$(wc -l < "$lat_file" | tr -d ' ')
: ${n:=0}
local mean=$(awk '{s+=$1; n++} END{if(n>0) printf "%.0f", s/n; else print "-"}' "$lat_file")
local med_i=$(( (n + 1) / 2 ))
local p95_i=$(awk -v n="$n" 'BEGIN{i=int(n*0.95+0.5); if(i<1)i=1; if(i>n)i=n; print i}')
local p99_i=$(awk -v n="$n" 'BEGIN{i=int(n*0.99+0.5); if(i<1)i=1; if(i>n)i=n; print i}')
(( med_i < 1 )) && med_i=1
(( med_i > n )) && med_i=$n
local median=$(sed -n "${med_i}p" "$lat_file")
local p95=$(sed -n "${p95_i}p" "$lat_file")
local p99=$(sed -n "${p99_i}p" "$lat_file")
local maxv=$(tail -1 "$lat_file")
rm -f "$lat_file"

echo "${total}|${ok}|${rate}|${mean}|${median}|${p95}|${p99}|${maxv}"
}

extract_outages() {
local target="$1"
awk -F'\t' -v t="$target" -v thr="$OutageThresholdConsecutive" '
$1==t {
if ($5 == "FAIL") {
if (n == 0) { si = $2; sts = $3 }
n++; ei = $2; ets = $3
} else {
if (n >= thr) printf "%s\t%d\t%d\t%s\t%s\n", t, si, ei, sts, ets
n = 0
}
}
END { if (n >= thr) printf "%s\t%d\t%d\t%s\t%s\n", t, si, ei, sts, ets }
' "$SAMPLES_TSV"
}

ALL_OUTAGES="$TMP_DIR/outages.tsv"
: > "$ALL_OUTAGES"
echo " extracting outage events..."
for ((i=1; i<=NUM_TARGETS; i++)); do
extract_outages "${TARGET_NAMES[$i]}" >> "$ALL_OUTAGES"
done

generate_verdicts() {
local outage_count
outage_count=$(wc -l < "$ALL_OUTAGES" | tr -d ' ')
: ${outage_count:=0}
if (( outage_count == 0 )); then
echo "<p class=\"ok\">Connectivity was stable across all probed targets for the entire run.</p>"
return
fi

local sorted="$TMP_DIR/outages_sorted.tsv"
sort -t$'\t' -k4 "$ALL_OUTAGES" > "$sorted"

echo "<ul class=\"verdicts\">"
local cluster_kinds="" cluster_targets="" cluster_start=""
local cluster_max_dur=0 last_end_epoch=0
while IFS=$'\t' read -r tgt si ei sts ets; do
local s_epoch e_epoch dur kind=""
s_epoch=$(iso_to_epoch "$sts")
e_epoch=$(iso_to_epoch "$ets")
: ${s_epoch:=0}
: ${e_epoch:=0}
dur=$(( e_epoch - s_epoch ))
(( dur < 1 )) && dur=$(( SampleIntervalSeconds * (ei - si + 1) ))

for ((j=1; j<=NUM_TARGETS; j++)); do
[[ "${TARGET_NAMES[$j]}" == "$tgt" ]] && kind="${TARGET_KINDS[$j]}" && break
done

if [[ -z "$cluster_start" ]]; then
cluster_start="$sts"; last_end_epoch="$e_epoch"
cluster_kinds="$kind"; cluster_targets="$tgt"; cluster_max_dur=$dur
elif (( s_epoch - last_end_epoch <= ClusterWindowSeconds )); then
cluster_kinds="$cluster_kinds $kind"
cluster_targets="$cluster_targets|$tgt"
(( e_epoch > last_end_epoch )) && last_end_epoch="$e_epoch"
(( dur > cluster_max_dur )) && cluster_max_dur=$dur
else
emit_verdict "$cluster_start" "$cluster_max_dur" "$cluster_kinds" "$cluster_targets"
cluster_start="$sts"; last_end_epoch="$e_epoch"
cluster_kinds="$kind"; cluster_targets="$tgt"; cluster_max_dur=$dur
fi
done < "$sorted"

[[ -n "$cluster_start" ]] && emit_verdict "$cluster_start" "$cluster_max_dur" "$cluster_kinds" "$cluster_targets"
echo "</ul>"
}

emit_verdict() {
local start_ts="$1" max_dur="$2" kinds="$3" targets="$4"
local time_short
time_short=$(echo "$start_ts" | awk -F'T' '{print $2}' | cut -c1-8)
local has_inet=0 has_gw=0 has_tg=0 has_relay=0
[[ " $kinds " == *" internet "* ]] && has_inet=1
[[ " $kinds " == *" gateway "* ]] && has_gw=1
[[ " $kinds " == *" trugrid "* ]] && has_tg=1
[[ " $kinds " == *" relay "* ]] && has_relay=1

local verdict
if (( has_inet && has_gw && has_tg )); then
verdict="At ${time_short}, total connectivity loss for ${max_dur}s. Likely full ISP or upstream outage."
elif (( has_gw && ! has_inet )); then
verdict="At ${time_short}, gateway failed for ${max_dur}s but internet stayed reachable. Suspicious, possibly a transient routing or LAN issue."
elif (( has_inet && ! has_gw )); then
verdict="At ${time_short}, internet probes failed for ${max_dur}s but gateway stayed up. Likely an ISP-side issue past your gateway."
# Test the relay-only case first; the broader TruGrid-path case below would otherwise match it.
elif (( has_relay && ! has_tg && ! has_inet && ! has_gw )); then
verdict="At ${time_short}, the assigned relay failed for ${max_dur}s but the TruGrid front end stayed reachable. Possible ACI rotation or relay-specific incident."
elif (( (has_tg || has_relay) && ! has_inet && ! has_gw )); then
verdict="At ${time_short}, only TruGrid path failed for ${max_dur}s. Other connectivity healthy. Suggests a TruGrid Azure or transit issue, not your network."
else
local names="${targets//\|/, }"
verdict="At ${time_short}, outage affecting: ${names}. Duration ${max_dur}s. Manual review of the timeline recommended."
fi
echo "<li>${verdict}</li>"
}

echo " building correlation verdict..."
VERDICT_HTML=$(generate_verdicts)

generate_svg() {
local target="$1"
local has_data
has_data=$(awk -F'\t' -v t="$target" '$1==t && $5=="OK" {n++} END{print n+0}' "$SAMPLES_TSV")
: ${has_data:=0}
if (( has_data == 0 )); then
echo "<p class=\"meta\">No successful samples to chart.</p>"
return
fi

awk -F'\t' -v target="$target" -v thr="$OutageThresholdConsecutive" '
BEGIN {
W = 900; H = 200
ML = 50; MR = 15; MT = 15; MB = 30
PW = W - ML - MR; PH = H - MT - MB
max_lat = 0; min_idx = -1; max_idx = -1
}
$1 == target {
if (min_idx < 0) min_idx = $2
max_idx = $2
if ($5 == "OK") {
v = $4 + 0
if (v > max_lat) max_lat = v
ok_idx[$2] = v
} else {
fail_idx[$2] = 1
}
}
END {
if (max_lat < 1) max_lat = 1
span = max_idx - min_idx
if (span < 1) span = 1

ymax = max_lat
if (ymax < 10) ymax = 10
else if (ymax < 50) ymax = int((ymax + 9) / 10) * 10
else if (ymax < 200) ymax = int((ymax + 24) / 25) * 25
else ymax = int((ymax + 99) / 100) * 100

printf "<svg viewBox=\"0 0 %d %d\" xmlns=\"http://www.w3.org/2000/svg\" style=\"width:100%%;max-width:%dpx;height:auto;\">\n", W, H, W
printf "<rect x=\"%d\" y=\"%d\" width=\"%d\" height=\"%d\" fill=\"rgba(0,0,0,0.02)\" stroke=\"#ccc\" stroke-width=\"0.5\"/>\n", ML, MT, PW, PH
for (gy = 1; gy <= 3; gy++) {
ly = MT + (PH * gy / 4)
printf "<line x1=\"%d\" y1=\"%.1f\" x2=\"%d\" y2=\"%.1f\" stroke=\"#ddd\" stroke-width=\"0.5\"/>\n", ML, ly, ML+PW, ly
}

run_start = -1
for (idx = min_idx; idx <= max_idx; idx++) {
if (idx in fail_idx) {
if (run_start < 0) run_start = idx
run_end = idx
} else {
if (run_start >= 0 && (run_end - run_start + 1) >= thr) {
x1 = ML + ((run_start - min_idx) / span) * PW
x2 = ML + ((run_end - min_idx + 1) / span) * PW
printf "<rect x=\"%.1f\" y=\"%d\" width=\"%.1f\" height=\"%d\" fill=\"#c0392b\" opacity=\"0.18\"/>\n", x1, MT, x2-x1, PH
}
run_start = -1
}
}
if (run_start >= 0 && (run_end - run_start + 1) >= thr) {
x1 = ML + ((run_start - min_idx) / span) * PW
x2 = ML + ((run_end - min_idx + 1) / span) * PW
printf "<rect x=\"%.1f\" y=\"%d\" width=\"%.1f\" height=\"%d\" fill=\"#c0392b\" opacity=\"0.18\"/>\n", x1, MT, x2-x1, PH
}

in_seg = 0
for (idx = min_idx; idx <= max_idx; idx++) {
if (idx in ok_idx) {
x = ML + ((idx - min_idx) / span) * PW
y = MT + PH - (ok_idx[idx] / ymax) * PH
if (in_seg == 0) { printf "<polyline points=\""; in_seg = 1 }
printf "%.1f,%.1f ", x, y
} else {
if (in_seg) { printf "\" fill=\"none\" stroke=\"#1972F5\" stroke-width=\"1.4\"/>\n"; in_seg = 0 }
}
}
if (in_seg) printf "\" fill=\"none\" stroke=\"#1972F5\" stroke-width=\"1.4\"/>\n"

printf "<text x=\"%d\" y=\"%d\" font-size=\"10\" text-anchor=\"end\" fill=\"#888\">%dms</text>\n", ML-5, MT+5, ymax
printf "<text x=\"%d\" y=\"%d\" font-size=\"10\" text-anchor=\"end\" fill=\"#888\">%dms</text>\n", ML-5, MT+PH/2+3, ymax/2
printf "<text x=\"%d\" y=\"%d\" font-size=\"10\" text-anchor=\"end\" fill=\"#888\">0ms</text>\n", ML-5, MT+PH+3

for (k = 0; k <= 4; k++) {
tx = ML + (PW * k / 4)
ti = min_idx + int(span * k / 4)
printf "<line x1=\"%.1f\" y1=\"%d\" x2=\"%.1f\" y2=\"%d\" stroke=\"#888\" stroke-width=\"0.5\"/>\n", tx, MT+PH, tx, MT+PH+4
printf "<text x=\"%.1f\" y=\"%d\" font-size=\"10\" text-anchor=\"middle\" fill=\"#888\">s%d</text>\n", tx, MT+PH+16, ti
}

print "</svg>"
}
' "$SAMPLES_TSV"
}

echo " rendering per-target tables..."
STATS_ROWS=""
for ((i=1; i<=NUM_TARGETS; i++)); do
name="${TARGET_NAMES[$i]}"
stats=$(compute_stats "$name")
IFS='|' read -r total ok rate mean median p95 p99 maxv <<< "$stats"
rate_class="ok"
awk -v r="${rate:-0}" 'BEGIN{exit (r < 99) ? 0 : 1}' && rate_class="warn"
awk -v r="${rate:-0}" 'BEGIN{exit (r < 95) ? 0 : 1}' && rate_class="fail"
STATS_ROWS+="<tr><td>${name}</td><td>${total}</td><td class=\"${rate_class}\">${rate}%</td><td>${mean}</td><td>${median}</td><td>${p95}</td><td>${p99}</td><td>${maxv}</td></tr>"
done

echo " generating per-target SVG charts..."
CHARTS_HTML=""
for ((i=1; i<=NUM_TARGETS; i++)); do
name="${TARGET_NAMES[$i]}"
CHARTS_HTML+="<h3>${name}</h3>$(generate_svg "$name")"
done

echo " building outage table..."
OUTAGE_ROWS=""
while IFS=$'\t' read -r tgt si ei sts ets; do
[[ -z "$tgt" ]] && continue
dur_samples=$(( ei - si + 1 ))
s_epoch=$(iso_to_epoch "$sts")
e_epoch=$(iso_to_epoch "$ets")
: ${s_epoch:=0}
: ${e_epoch:=0}
dur=$(( e_epoch - s_epoch ))
(( dur < 1 )) && dur=$(( SampleIntervalSeconds * dur_samples ))
OUTAGE_ROWS+="<tr><td>${tgt}</td><td>${sts}</td><td>${ets}</td><td>${dur}s</td><td>${dur_samples}</td></tr>"
done < "$ALL_OUTAGES"

OUTAGE_SECTION=""
if [[ -n "$OUTAGE_ROWS" ]]; then
OUTAGE_SECTION="<h2>Outage Events</h2><table><tr><th>Target</th><th>Start</th><th>End</th><th>Duration</th><th>Samples</th></tr>${OUTAGE_ROWS}</table>"
fi

echo " writing HTML..."

GEN_NOW=$(date +%Y-%m-%dT%H:%M:%S%z)
START_LOCAL=$(date -r $START_TS +%H:%M:%S)
END_LOCAL=$(date -r $END_TS_ACTUAL +%H:%M:%S)

cat > "$HTML_PATH" <<HTML
<!DOCTYPE html>
<html lang="en"><head><meta charset="utf-8">
<title>TruGrid Network Monitor - ${HOSTNAME_SHORT}</title>
<style>
:root { color-scheme: light dark; }
body { font-family: -apple-system, BlinkMacSystemFont, "Helvetica Neue", sans-serif; max-width: 1100px; margin: 2em auto; padding: 0 1.2em; line-height: 1.45; }
h1 { border-bottom: 3px solid #1972F5; padding-bottom: 0.3em; }
h2 { color: #1972F5; margin-top: 2em; font-size: 1.05em; text-transform: uppercase; letter-spacing: 0.5px; }
h3 { margin-top: 1.5em; font-size: 0.95em; color: #444; }
.meta { color: #888; font-size: 0.9em; }
table { border-collapse: collapse; width: 100%; margin-top: 0.5em; font-size: 0.9em; }
th, td { padding: 0.45em 0.7em; text-align: left; border-bottom: 1px solid #ddd; vertical-align: top; }
th { background: rgba(25,114,245,0.07); }
.ok { color: #1f8a1f; font-weight: 600; }
.warn { color: #b8860b; font-weight: 600; }
.fail { color: #c0392b; font-weight: 600; }
.counts { display: flex; gap: 1em; flex-wrap: wrap; margin: 0.5em 0; }
.counts > div { background: rgba(25,114,245,0.06); padding: 0.5em 0.9em; border-radius: 4px; min-width: 100px; }
.counts > div .num { font-size: 1.4em; font-weight: 700; display: block; line-height: 1.1; }
.counts > div .lbl { font-size: 0.75em; color: #888; text-transform: uppercase; letter-spacing: 0.5px; }
ul.verdicts li { margin-bottom: 0.4em; }
.note { background: #fff8d6; padding: 0.6em 1em; border-left: 4px solid #e0c000; margin: 1.5em 0 0; font-size: 0.85em; }
code { font-family: ui-monospace, "SF Mono", Menlo, monospace; font-size: 0.9em; }
@media (prefers-color-scheme: dark) {
body { background: #1a1a1a; color: #eee; }
th { background: rgba(25,114,245,0.15); }
th, td { border-bottom: 1px solid #333; }
h3 { color: #ccc; }
.counts > div { background: rgba(25,114,245,0.12); }
.note { background: #2a260c; border-left-color: #b08800; color: #eee; }
}
</style></head><body>

<h1>TruGrid Network Monitor</h1>
<p class="meta">Generated ${GEN_NOW} &middot; Host: ${HOSTNAME_SHORT} &middot; Run: ${START_LOCAL} to ${END_LOCAL} (${ACTUAL_DURATION}s)</p>

<h2>Run Summary</h2>
<div class="counts">
<div><span class="num">${SAMPLE_IDX}</span><span class="lbl">Sample rounds</span></div>
<div><span class="num">${NUM_TARGETS}</span><span class="lbl">Targets</span></div>
<div><span class="num">${TOTAL_OUTAGES}</span><span class="lbl">Outage events</span></div>
<div><span class="num" style="font-size:0.95em;">${STOP_REASON}</span><span class="lbl">Stop reason</span></div>
</div>

<h2>Correlation Verdict</h2>
<div>${VERDICT_HTML}</div>

<h2>Per-Target Statistics</h2>
<p class="meta">Latency in milliseconds. Mean is sensitive to outliers; median and p95 are usually the interesting ones.</p>
<table>
<tr><th>Target</th><th>Samples</th><th>Success</th><th>Mean</th><th>Median</th><th>P95</th><th>P99</th><th>Max</th></tr>
${STATS_ROWS}
</table>

<h2>Latency Charts</h2>
${CHARTS_HTML}

${OUTAGE_SECTION}

<div class="note">v1.1 Mac port. Behavior mirrors Windows Test-TruGridNetworkMonitor.ps1. Relay IP discovered via lsof on the loopback 127.0.0.1:43xxx tunnel (Mac log does not record upstream IP, unlike Windows Connector log).</div>

</body></html>
HTML

echo " HTML report: $HTML_PATH"
echo "REPORT_PATH=$HTML_PATH"
echo "Done."

[[ "$NO_OPEN" == "1" ]] || open "$HTML_PATH"


If none of this surfaces the cause, share the HTML report with TruGrid support along with the approximate time of the most recent disconnect.


HTML Report


[Image: screenshot of the resulting HTML report]



Updated on: 19/05/2026
