HIPAA-Compliant Analytics Setup Guide

Why HIPAA and web analytics are in tension

HIPAA's Privacy Rule protects individually identifiable health information — information that could be used to identify a person and relates to their past, present, or future health condition, treatment, or payment. Web analytics tools, by design, collect detailed behavioral data about individual users across sessions.

The tension arises not because web analytics tools intentionally collect health information, but because the URLs and parameters on health-related websites often contain information that, combined with IP addresses and other identifiers, can constitute PHI. A URL like /patient-portal?condition=diabetes&user_id=12345 passing through GA4 is a PHI problem, even if no one intended to track health conditions.

What counts as PHI in analytics data

The specific analytics data points that create HIPAA risk:

URL parameters containing health information: Query strings with condition names, medication names, diagnostic codes, or treatment references.
URL parameters containing identifiers: Patient IDs, member numbers, or any identifier that could be linked to a health record.
IP addresses combined with health page visits: Visiting a page about a specific condition, combined with an IP address, may be sufficient to constitute PHI in some interpretations.
User IDs in analytics that correspond to patient records: If your analytics user ID can be cross-referenced with a health record system, the analytics data becomes PHI.
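
One way to find out whether the first two items apply to your site is to run your known URLs through a small audit. The sketch below is illustrative only: the key lists are hypothetical and need to be replaced with the parameter names your site actually uses, and it inspects query strings only, not identifiers embedded in the URL path.

```python
from urllib.parse import urlsplit, parse_qsl

# Hypothetical key lists; replace with the parameter names your site uses.
HEALTH_KEYS = {"condition", "medication", "diagnosis", "icd10", "treatment"}
IDENTIFIER_KEYS = {"user_id", "patient_id", "member_id", "mrn"}


def audit_url(url: str) -> dict[str, list[str]]:
    """Return the query parameter keys in `url` that fall into each risk group."""
    pairs = parse_qsl(urlsplit(url).query, keep_blank_values=True)
    keys = [key.lower() for key, _ in pairs]
    return {
        "health": sorted({k for k in keys if k in HEALTH_KEYS}),
        "identifier": sorted({k for k in keys if k in IDENTIFIER_KEYS}),
    }


findings = audit_url("/patient-portal?condition=diabetes&user_id=12345")
print(findings)  # {'health': ['condition'], 'identifier': ['user_id']}
```

A URL that returns hits in both groups, like the example, is the combination most likely to be treated as PHI.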

The risk level depends heavily on what your site actually does. A medical device company marketing to clinicians is in a different risk category than a patient-facing health platform. The former has limited PHI exposure; the latter has significant exposure that requires careful architecture.

GA4 configuration steps for healthcare-tech sites

Redact URL query parameters

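GA4 collects the full URL of each page view, including query parameters. If those parameters contain health information, that information lands in your analytics data. GA4's data redaction setting, found in the web data stream's settings under Admin → Data streams, redacts the values of the query parameter keys you list before the event is sent to Google. It works from a list of keys rather than stripping every parameter, so you need to enumerate each parameter that can carry sensitive data, and it does not touch identifiers embedded in the URL path. If you cannot enumerate the parameters reliably, strip the query string yourself before the page view is sent.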
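
The same idea can be applied in your own code before a URL ever reaches an analytics tool. The sketch below is not GA4's built-in feature. It uses an allowlist (keep only parameters known to be harmless) instead of a list of parameters to remove, on the assumption that new sensitive parameters are easier to miss than to anticipate. The `SAFE_KEYS` set and the example URL are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical allowlist: only parameters known to be harmless are kept.
SAFE_KEYS = frozenset({"utm_source", "utm_medium", "utm_campaign"})


def redact_url(url: str, safe_keys: frozenset[str] = SAFE_KEYS) -> str:
    """Drop every query parameter not on the allowlist, and drop the fragment."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key in safe_keys
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


print(redact_url(
    "https://example.com/patient-portal"
    "?condition=diabetes&user_id=12345&utm_source=newsletter"
))
# https://example.com/patient-portal?utm_source=newsletter
```

The cleaned URL is what you would then hand to whatever records the page view. Identifiers in the path itself (for example /patients/12345) need separate handling.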

Disable Google Signals

Google Signals enables cross-device tracking and audience features by linking GA4 data to signed-in Google accounts. For healthcare-tech sites with any patient-facing components, disable Google Signals in your GA4 property. This limits the cross-device tracking capabilities but eliminates a data linkage that creates PHI risk.

Shorten data retention

GA4 lets you set the retention period for user-level and event-level data. Standard properties offer 2 months or 14 months, and 2 months is the default, so confirm that nobody has extended it to 14. Less retained data means less exposure in the event of a data breach or regulatory inquiry.

Disable advertising features

Turn off all advertising personalization and remarketing features in GA4. These features involve sharing behavioral data with Google's advertising ecosystem — inappropriate for health-related behavioral data.

The BAA problem with GA4

A Business Associate Agreement (BAA) is a contract required by HIPAA between a covered entity and any vendor that handles PHI on its behalf. Google currently does not offer a BAA for GA4. This means that if your site is operated by a covered entity and GA4 is processing PHI, the arrangement is not HIPAA-compliant regardless of how you configure GA4.

For covered entities — hospitals, health plans, healthcare clearinghouses — this is a genuine constraint that configuration alone cannot resolve. The options are: use a privacy-first analytics tool that does not transmit data to Google, implement server-side analytics that keeps data within your own infrastructure, or restrict GA4 to pages that genuinely have no PHI exposure risk.
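
The server-side option can be as simple as counting page views from request records you already hold. The sketch below is a minimal illustration, assuming request records shaped like the hypothetical `sample` list. It keeps only the path and a count, and discards the query string, IP address, and any user identifier, so nothing individual-level is stored.

```python
from collections import Counter
from urllib.parse import urlsplit


def count_page_views(requests: list[dict]) -> Counter:
    """Count views per path, ignoring query strings, IPs, and user IDs."""
    counts: Counter = Counter()
    for request in requests:
        counts[urlsplit(request["url"]).path] += 1
    return counts


# Hypothetical request records, e.g. parsed from your own access logs.
sample = [
    {"url": "/conditions/overview?ref=email", "ip": "203.0.113.7"},
    {"url": "/conditions/overview", "ip": "203.0.113.9"},
    {"url": "/pricing", "ip": "203.0.113.7"},
]
print(count_page_views(sample))
# Counter({'/conditions/overview': 2, '/pricing': 1})
```

This gives up session-level and user-level analysis entirely, which is the point: the aggregate never leaves your infrastructure and cannot be tied back to a visitor.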

Privacy-first analytics alternatives

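For healthcare-tech sites where GA4's BAA gap is a blocker, privacy-first analytics tools provide measurement with far less data sharing. Plausible and Fathom are two widely used options. Both report aggregated traffic data without cookies and without feeding visitor behavior into an advertising ecosystem. Their hosted versions still receive each request on the vendor's servers, so a covered entity should settle the BAA question with the vendor as well, or self-host where the tool allows it (Plausible publishes a self-hosted edition).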

The tradeoff is capability. Plausible and Fathom provide traffic counts, sources, top pages, and goal conversions — but not the exploration-level analysis, funnel visualization, or attribution modeling available in GA4. For marketing teams accustomed to GA4's depth, this is a significant constraint. For compliance teams concerned about PHI exposure, it is the right tradeoff.

What to document

Whatever analytics configuration you implement, document it. If regulators or auditors ever inquire about your data practices, you need to be able to demonstrate that you identified the PHI risk, made deliberate configuration decisions to mitigate it, and have controls in place. A gap analysis that identifies what data you collect, where it goes, and what mitigations are in place is the minimum defensible documentation.
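
A gap analysis of this kind can start as a plain inventory that you keep under version control. The sketch below is one possible shape, not a regulatory template. The `DataFlow` fields and the three example entries are hypothetical and should be replaced with what your site actually collects.

```python
from dataclasses import dataclass, field


@dataclass
class DataFlow:
    data_point: str                 # what is collected
    destination: str                # where it is sent
    phi_risk: bool                  # could it be PHI, alone or in combination?
    mitigations: list[str] = field(default_factory=list)


def unmitigated(flows: list[DataFlow]) -> list[str]:
    """Return data points that carry PHI risk and have no recorded mitigation."""
    return [f.data_point for f in flows if f.phi_risk and not f.mitigations]


# Hypothetical inventory entries.
inventory = [
    DataFlow("page URL with query string", "GA4", True, ["query parameter redaction"]),
    DataFlow("analytics user ID", "GA4", True),
    DataFlow("page title", "GA4", False),
]
print(unmitigated(inventory))  # ['analytics user ID']
```

Anything the function returns is an open gap: either add a mitigation or stop collecting the data point.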

Need analytics that works within your compliance requirements?

I have configured analytics stacks for medical device companies, genomics firms, and healthcare-tech startups — measurement that gives you the data you need without the compliance exposure you do not want.
