CyberByrd Essay

The Controls That Should Have Caught It Sooner

A user at a federal agency was working remotely from a foreign country for months, using a consumer VPN to make their access look domestic. The architecture was in place. The logging was active. And nobody noticed until the VPN slipped.

By CyberByrd Field Notes, March 2026

That’s not a failure of technology. That’s a failure of control design.

This is a breakdown of what was in place, what was missing, and what would have shortened the detection window from months to minutes.

What Was in Place (And Why It Wasn’t Enough)

The environment had a solid foundation. Zscaler ZPA for private application access. Microsoft Entra ID as the identity provider. MFA enforced. Device certificates issued through the Zscaler client. SAML-based authentication. Centralized logging through a security operations center.

On paper, that’s a strong stack. In practice, every one of those controls had a gap that this situation walked right through.

Zscaler ZPA trusted the network layer it was handed. ZPA captures the client’s public IP at connection time. If that IP is already masked by a VPN, ZPA logs the VPN’s exit IP, not the real one. There’s no native mechanism in standard ZPA configurations to determine whether the source IP belongs to a commercial VPN provider. The session looked domestic because the first hop was domestic. ZPA did its job. It just didn’t have the context to know the source IP was a mask.

Entra ID authenticated the user, not the location. The identity provider validated credentials, device certificate, and MFA. All passed. Conditional access policies were either not configured for location-based restrictions or were relying on the same masked IP that Zscaler saw. If your conditional access policy says “allow from U.S. IPs” and the user is presenting a U.S. IP through a VPN, the policy passes. The control exists but it’s evaluating spoofed input.

MFA confirmed identity, not geography. Multi-factor authentication proves the person holding the device is who they claim to be. It says nothing about where they are. This is a common misconception in security programs: MFA is treated as a catch-all when it’s actually narrowly scoped. In this case, MFA worked perfectly and caught nothing, because the problem wasn’t identity. It was location.

The monitoring center triggered on the right signal, but only once. When the VPN dropped and the real IP surfaced, the SOC flagged it immediately. Foreign IP, user not on the authorized travel list, alert generated. That response was correct. But it only fired because the VPN failed. Every previous session looked normal in the logs. The monitoring was configured for anomalies, not patterns. It caught the exception. It missed the months of activity that preceded it.

What Was Missing (The Specific Gaps)

Each of these represents a control that, if implemented, would have either prevented the access or detected it significantly earlier.

Geolocation validation at the network level, not just the IP level.

Standard IP geolocation checks are easy to defeat with a VPN. What’s harder to fake is network-level intelligence: ASN reputation, ISP classification, known VPN/proxy provider ranges, and connection latency analysis.

Zscaler does offer capabilities to flag traffic originating from known commercial VPN providers. If the ZPA deployment had been configured to check source IPs against threat intelligence feeds that include consumer VPN exit nodes, the very first session would have raised a flag. Not because the IP was foreign, but because the IP belonged to a VPN provider rather than a residential or corporate ISP.

This isn’t exotic. It’s a configuration decision that most deployments skip because it introduces friction for legitimate remote workers who occasionally use VPNs for personal reasons.
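As a rough illustration of that check, here is a minimal sketch in Python. It assumes you already have a feed of CIDR ranges attributed to consumer VPN exit nodes; the KNOWN_VPN_RANGES list and both function names are hypothetical, and the ranges shown are reserved documentation addresses, not real provider data. It is not how Zscaler implements anything, just the shape of the logic.

```python
import ipaddress

# Hypothetical feed of CIDR ranges attributed to consumer VPN exit nodes.
# These are reserved documentation ranges used as placeholders only.
KNOWN_VPN_RANGES = [
    ipaddress.ip_network("198.51.100.0/24"),
    ipaddress.ip_network("203.0.113.0/25"),
]


def is_known_vpn_exit(source_ip: str) -> bool:
    """Return True if source_ip falls inside any listed VPN range."""
    addr = ipaddress.ip_address(source_ip)
    return any(addr in net for net in KNOWN_VPN_RANGES)


def classify_session(source_ip: str) -> str:
    """Label a session by provider type rather than by country."""
    if is_known_vpn_exit(source_ip):
        return "flag: source IP belongs to a VPN provider range"
    return "ok"


if __name__ == "__main__":
    for ip in ("198.51.100.7", "192.0.2.10"):
        print(ip, "->", classify_session(ip))
```

The point of the design is that the decision keys on who owns the address, not on where the address appears to be.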

Conditional access policies scoped to named/compliant locations only.

Entra ID supports named locations, which are IP ranges or countries/regions that your organization defines, and IP ranges can be marked as trusted. A stronger policy would have restricted access to named locations only, meaning the user would need to connect from a recognized corporate network, a known home IP range, or an explicitly approved location.

The gap here is common: conditional access was configured to block known-bad locations rather than restrict to known-good ones. That’s a permissive model. It allows anything that isn’t explicitly denied. Switching to an allowlist model would have forced the user to either connect from a recognized network or request an exception, which creates a paper trail.
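The difference between the two models is easy to see in a few lines. This sketch is a simplification under stated assumptions: BLOCKED_NETWORKS and TRUSTED_NETWORKS are hypothetical stand-ins for a block policy and a set of trusted named locations, and the addresses are reserved documentation ranges. It does not use any Entra API.

```python
import ipaddress

# Hypothetical policy data; the addresses are documentation ranges.
BLOCKED_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]      # known-bad
TRUSTED_NETWORKS = [ipaddress.ip_network("198.51.100.0/24")]   # known-good


def _in_any(ip: str, networks) -> bool:
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in networks)


def blocklist_allows(ip: str) -> bool:
    """Permissive model: allow anything not explicitly denied."""
    return not _in_any(ip, BLOCKED_NETWORKS)


def allowlist_allows(ip: str) -> bool:
    """Restrictive model: allow only explicitly trusted networks."""
    return _in_any(ip, TRUSTED_NETWORKS)


if __name__ == "__main__":
    vpn_exit_ip = "203.0.113.50"  # unknown to both lists
    print("blocklist allows:", blocklist_allows(vpn_exit_ip))
    print("allowlist allows:", allowlist_allows(vpn_exit_ip))
```

An address that neither list has ever seen passes the blocklist and fails the allowlist, which is exactly the case a fresh VPN exit IP falls into.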

Device posture checks that include network environment signals.

The device itself was legitimate. Registered, certificate-bound, compliant. But device posture checks in most deployments evaluate the device in isolation: is the OS patched, is the endpoint agent running, is the disk encrypted.

What they typically don’t evaluate is the network the device is sitting on. More advanced posture configurations can ask: Is the device connected through a known VPN? What is the DNS configuration? Does the network adapter show characteristics consistent with tunneled traffic? Does the connection latency match the expected geography for the claimed IP?

These checks exist in various forms across Zscaler, Intune, and third-party endpoint tools. They’re rarely turned on because they require tuning and can generate false positives for legitimate mobile workers. But in an environment where foreign access is restricted by policy, they represent a critical detection layer.
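The latency question can be sketched with basic physics. Light in fiber covers roughly 200 km per millisecond, so the distance between the claimed location and the service edge sets a floor on round-trip time. The sketch below assumes you can measure a round-trip time for the session and know that distance; the slack factor and fixed overhead are hypothetical tuning values, and a high reading is only a soft signal, since congestion and bad Wi-Fi also add latency.

```python
FIBER_KM_PER_MS = 200.0  # approximate speed of light in fiber


def min_rtt_ms(distance_km: float) -> float:
    """Physical lower bound on round-trip time over the given distance."""
    return 2.0 * distance_km / FIBER_KM_PER_MS


def latency_suspicious(measured_rtt_ms: float,
                       claimed_distance_km: float,
                       slack_factor: float = 3.0,
                       fixed_overhead_ms: float = 30.0) -> bool:
    """Flag sessions whose RTT is far above what the claimed geography
    would plausibly produce. Thresholds are illustrative, not tuned."""
    threshold = min_rtt_ms(claimed_distance_km) * slack_factor + fixed_overhead_ms
    return measured_rtt_ms > threshold


if __name__ == "__main__":
    # Claimed location is about 500 km from the service edge.
    print(latency_suspicious(25.0, 500.0))    # plausible for the claim
    print(latency_suspicious(180.0, 500.0))   # worth a second look
```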

Impossible travel detection tuned for access patterns, not just alerts.

Entra ID has built-in impossible travel detection that flags when a user authenticates from two geographically distant locations in a timeframe that doesn’t allow for physical travel between them. This works well for catching compromised credentials used simultaneously from different countries.

It works less well when a user is consistently accessing from one location through a VPN. There’s no “travel” to detect because the apparent location never changes.
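To see why, consider a generic version of the check. This is an illustrative sketch, not Microsoft’s algorithm: it computes great-circle distance between two sign-in locations and flags the pair if the implied speed exceeds a hypothetical ceiling of 900 km/h, roughly airliner speed. The coordinates are approximate.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometers."""
    r = 6371.0  # mean Earth radius
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def is_impossible_travel(loc_a, loc_b, hours_apart, max_kmh=900.0):
    """Flag two sign-ins whose implied travel speed exceeds max_kmh."""
    distance = haversine_km(*loc_a, *loc_b)
    if hours_apart <= 0:
        return distance > 0
    return distance / hours_apart > max_kmh


if __name__ == "__main__":
    new_york = (40.71, -74.01)
    lagos = (6.52, 3.38)
    print(is_impossible_travel(new_york, lagos, 1.0))      # two countries, one hour
    print(is_impossible_travel(new_york, new_york, 1.0))   # same VPN exit every time
```

When every session presents the same exit location, the distance is zero and the check has nothing to fire on.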

A more effective approach would layer behavioral analytics on top of raw sign-in data. Track not just where logins originate, but how they originate. Connection times relative to the user’s historical patterns. Session duration distributions. Time zone signals embedded in usage behavior. A user consistently logging in at 2 AM Eastern while claiming to be on the East Coast is a soft signal that, combined with other indicators, builds a case over time.
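One of those soft signals, login time against claimed time zone, can be sketched in a few lines. The function below is a hypothetical example that assumes sign-in timestamps are available in UTC and that the claimed location maps to a fixed UTC offset (it ignores daylight saving time). The night window of midnight to 5 AM is an arbitrary illustrative choice.

```python
from datetime import datetime, timedelta, timezone


def off_hours_fraction(logins_utc, claimed_utc_offset_hours,
                       night_start=0, night_end=5):
    """Fraction of logins that land in the night window when converted
    to the user's claimed local time (fixed offset, no DST handling)."""
    if not logins_utc:
        return 0.0
    tz = timezone(timedelta(hours=claimed_utc_offset_hours))
    night = sum(
        1 for t in logins_utc
        if night_start <= t.astimezone(tz).hour < night_end
    )
    return night / len(logins_utc)


if __name__ == "__main__":
    # Ten sign-ins at 07:00 UTC, which is 2 AM at UTC-5.
    logins = [datetime(2026, 1, d, 7, 0, tzinfo=timezone.utc) for d in range(1, 11)]
    print(off_hours_fraction(logins, -5))  # claimed East Coast
    print(off_hours_fraction(logins, 1))   # same logins at UTC+1
```

A high fraction proves nothing by itself. Night-shift workers exist. It is one input that gains weight when other indicators point the same way.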

Correlation between identity logs and network logs.

This is the gap that matters most, and it’s the one that was explicitly called out during the investigation.

Entra ID logs showed a U.S. IP. When the VPN dropped, Zscaler logs showed a Nigerian IP. Those two data sources were not being correlated in real time. The identity team saw clean U.S. logins. The network team saw one foreign anomaly. Nobody was comparing the two side by side on a continuous basis.

If an automated rule had been in place to flag cases where the Entra sign-in IP and the Zscaler client IP diverge significantly for the same user and session window, this would have surfaced much earlier. The divergence between identity-layer geography and network-layer geography is one of the strongest indicators of VPN masking.

This doesn’t require a SIEM overhaul. It requires a correlation rule that asks: for this user, at this time, do the identity logs and the network logs agree on where they are? If not, investigate.
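A minimal sketch of that rule, assuming both log sources can be exported as simple records already enriched with a country code. The LogEvent fields, the find_divergences name, and the 15-minute window are all hypothetical choices for illustration; a real deployment would express this in whatever query language the SIEM uses.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class LogEvent:
    user: str
    timestamp: datetime
    source_ip: str
    country: str  # assumed to come from a prior geolocation lookup


def find_divergences(identity_events, network_events,
                     window=timedelta(minutes=15)):
    """Return (identity, network) pairs for the same user, close in
    time, where the two layers disagree about the country."""
    by_user = {}
    for ev in network_events:
        by_user.setdefault(ev.user, []).append(ev)

    mismatches = []
    for ident in identity_events:
        for net in by_user.get(ident.user, []):
            close = abs(net.timestamp - ident.timestamp) <= window
            if close and net.country != ident.country:
                mismatches.append((ident, net))
    return mismatches


if __name__ == "__main__":
    t = datetime(2026, 1, 15, 12, 0)
    identity = [LogEvent("user1", t, "198.51.100.7", "US")]
    network = [LogEvent("user1", t + timedelta(minutes=5), "203.0.113.9", "NG")]
    for ident, net in find_divergences(identity, network):
        print(f"{ident.user}: identity={ident.country} network={net.country}")
```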

Periodic access reviews that include location telemetry.

Most access reviews ask: does this user still need access? They rarely ask: where is this user accessing from, and does that match expectations?

A quarterly review that included a summary of each user’s connection geography, even at the country level, would have surfaced this pattern. Not as an alert. Not as an incident. Just as a data point in a review: “This user has connected exclusively from U.S.-based IPs for the past 90 days, with consistent session times between 1 AM and 9 AM Eastern.”

That pattern alone would prompt a question. And a question is all it would have taken.
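Producing that data point is not much work. The sketch below assumes session records can be reduced to a user, a country code, and a session start hour in the user’s claimed local time. The summarize_access function and its output fields are hypothetical, and the sample rows are invented purely to show the output shape.

```python
from collections import Counter, defaultdict


def summarize_access(sessions):
    """Build a per-user review summary from (user, country, start_hour)
    tuples, where start_hour is in the user's claimed local time."""
    countries = defaultdict(Counter)
    hours = defaultdict(list)
    for user, country, hour in sessions:
        countries[user][country] += 1
        hours[user].append(hour)

    return {
        user: {
            "countries": dict(countries[user]),
            "earliest_hour": min(hours[user]),
            "latest_hour": max(hours[user]),
            "sessions": len(hours[user]),
        }
        for user in hours
    }


if __name__ == "__main__":
    # Invented sample rows for illustration only.
    sample = [("user1", "US", 1), ("user1", "US", 3), ("user1", "US", 2),
              ("user2", "US", 9)]
    for user, row in summarize_access(sample).items():
        print(user, row)
```

A reviewer scanning the output sees the country column agree with policy while the hour columns do not, which is the prompt for the question.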

The Bigger Lesson

None of these gaps are unusual. Every one of them exists in environments with mature security programs. The stack was solid. The logging was there. The architecture was sound. What was missing was the connective tissue between controls.

Each system did its job in isolation. Zscaler authenticated the device. Entra validated the identity. The SOC monitored for anomalies. But nobody had built the logic that ties those systems together and asks: taken as a whole, does this access pattern make sense?

That’s the gap that matters for any organization, not just federal ones.

If you’re running a team of 10 or 30 people and your employees are using cloud tools, remote access platforms, or AI applications from wherever they happen to be, the question isn’t whether your tools are logging. They probably are.

The question is whether anyone is reading the logs together instead of apart.

Controls that work in isolation protect against simple threats. Controls that work together protect against the threats that actually keep you up at night: the ones that look completely normal until they don’t.

What to Do With This

If you’re reviewing your own environment after reading this, here’s where to start:

Ask whether your access policies are built on allowlists or blocklists. If you’re only blocking known-bad locations, you’re running a permissive model that a $10 VPN defeats.

Check whether your identity logs and your network logs are being compared. If they live in separate dashboards and nobody’s correlating them, you have a blind spot.

Look at your conditional access policies and ask what they’re actually evaluating. If they’re checking IP geolocation but not source reputation, they’re checking the mask, not the face.

Review your access telemetry over time, not just per event. A single login from anywhere looks fine. Three months of logins at unusual hours from a user who claims to be local tells a different story.

And above all, remember that the most dangerous gaps aren’t the ones where controls are missing. They’re the ones where controls exist but aren’t talking to each other.

That’s where the real work is.

Built from inside real security environments. No hype. Just honest takes.

CyberByrd