Privacy & Benchmarking Explainer
How GigAnalytics protects your income data, what the optional benchmark feature collects, and the technical safeguards (k-anonymity, differential privacy) that make it safe to participate.
1. Privacy-First Architecture
GigAnalytics is built on a data minimization principle: we collect only what's necessary to compute your ROI dashboard, and we store it in a user-partitioned database where your data is isolated by Row Level Security (RLS) from every other user's data.
Encrypted at rest
All data stored in Supabase (Postgres) with AES-256 encryption. TLS 1.3 in transit.
Row Level Security
Postgres RLS ensures every query is automatically scoped to your user_id. No query can return another user's data.
No data selling
We never sell, rent, or license your income data to third parties. The benchmark feature is opt-in and anonymized.
Free users: zero benchmark participation
Free tier users never contribute to or appear in any benchmark dataset. Benchmark contribution is an opt-in Pro feature only. Free users can still view benchmark statistics generated from Pro user contributions.
2. What Data GigAnalytics Stores
Here is a complete inventory of data stored in your GigAnalytics account:
3. Benchmark Feature: Full Data Flow
The benchmark feature allows Pro users to opt in to contributing anonymized aggregate metrics that power the "how do I compare?" insights in the dashboard. Here is the complete data flow, step by step:
Step 1: You enable "Contribute to benchmarks" in Settings → Privacy
This toggle is OFF by default. Enabling it starts the contribution pipeline for your account. You can disable it at any time.
Step 2: Our backend computes aggregate metrics from your raw data
The pipeline runs nightly. It reads your transactions and time entries and computes: hourly rate percentile bucket, revenue range bucket, experience range bucket, platform category, and region (country-level). It does NOT read transaction descriptions, client names, or exact amounts.
Step 3: Buckets are assigned (not raw values)
A raw hourly rate of $87/hr is bucketed to "$80–$90/hr". Revenue of $4,200/mo is bucketed to "$4,000–$5,000/mo". This prevents precise inference about any individual's exact figures.
Step 4: K-anonymity check
Before your bucket is included in any aggregate, the pipeline checks: does this (platform, rate_bucket, region) combination have at least 25 contributors in total, i.e. you plus 24 or more others? If not, your data is suppressed for that segment until the pool grows.
Step 5: Differential privacy noise is added
Even for qualifying buckets, we apply Laplace noise (ε = 0.5) to the aggregate counts before storing them. This means the published percentile is nearly indistinguishable (within a factor of e^ε) from what it would be if any single user were removed from the pool.
Step 6: Contribution is decoupled from user ID
The final aggregated values in the benchmark store are not linked to your user_id. The pipeline uses a one-way hash of (user_id + contribution_date) as a deduplication key that cannot be reversed.
Step 7: Benchmark values are served to Pro users
The published p25/median/p75 rates are served to all Pro users querying their platform + region segment. Your personal data never appears; only the anonymized aggregate does.
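The bucketing and deduplication steps above can be sketched in Python. This is a minimal illustration, not GigAnalytics's actual pipeline code: the bucket boundaries follow the increments described later in this document, and the hash construction assumes SHA-256 (the document only says "one-way hash").

```python
import hashlib

def rate_bucket(hourly_rate: float) -> str:
    """Bucket an hourly rate: $10 increments up to $200, then $50 increments."""
    step = 10 if hourly_rate < 200 else 50
    lo = int(hourly_rate // step) * step
    return f"${lo}-${lo + step}/hr"

def dedup_key(user_id: str, contribution_date: str) -> str:
    """One-way deduplication key derived from (user_id + contribution_date).

    The digest cannot be reversed, so the benchmark store never holds
    a raw user_id.
    """
    return hashlib.sha256(f"{user_id}{contribution_date}".encode()).hexdigest()
```

For example, `rate_bucket(87.0)` returns `"$80-$90/hr"`, matching Step 3 above.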
4. K-Anonymity: The Math
K-anonymity is a formal privacy guarantee. A dataset satisfies k-anonymity if, for every record, at least k−1 other records share the same quasi-identifying attributes.
In GigAnalytics's benchmark dataset, the quasi-identifying attributes are:
- platform_category — bucketed (design, development, writing, consulting, other)
- hourly_rate_bucket — $10 increments up to $200, then $50 increments
- region — country-level (US, UK, CA, AU, DE, other)
- experience_range — <1yr, 1–3yr, 3–7yr, 7yr+
What this means in practice: if you're the only freelance copywriter in Germany earning $90–$100/hr with 3–7 years of experience, that segment has fewer than k = 25 contributors, so your data won't appear in the benchmark for it. You (and everyone else) will see "insufficient data" for that specific combination.
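The suppression check can be sketched as a grouping over the four quasi-identifying attributes above: any group smaller than k is withheld. The record fields here are illustrative, not the actual schema:

```python
from collections import Counter

K = 25  # minimum group size for a segment to be published

def k_anonymous_segments(records: list[dict], k: int = K) -> set[tuple]:
    """Return the quasi-identifier combinations that meet the k threshold.

    Each record is a dict carrying the four quasi-identifying attributes;
    a segment survives only if at least k records share the same values.
    """
    keys = [
        (r["platform_category"], r["hourly_rate_bucket"],
         r["region"], r["experience_range"])
        for r in records
    ]
    counts = Counter(keys)
    return {key for key, n in counts.items() if n >= k}
```

A group of exactly 25 identical tuples passes; a unique tuple is suppressed, which is precisely the "insufficient data" case described above.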
5. Differential Privacy: The Math
K-anonymity alone is vulnerable to attacks where an adversary already knows something about a specific individual. Differential privacy (DP) provides a stronger guarantee: the probability of any inference about an individual changes by at most a factor of e^ε whether or not that individual's data is in the dataset.
GigAnalytics applies the Laplace mechanism to aggregate counts and statistics before publication: each published value is the true value plus noise drawn from Lap(Δf/ε), where Δf is the sensitivity of the statistic (the maximum amount one user's data can change it).
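A minimal sketch of the Laplace mechanism, using NumPy as the document's own pipeline does. The sensitivity value below is a placeholder for illustration; the document does not state GigAnalytics's actual per-statistic sensitivities.

```python
import numpy as np

def laplace_mechanism(true_value: float, sensitivity: float,
                      epsilon: float, rng: np.random.Generator) -> float:
    """Release true_value with Laplace noise calibrated to (sensitivity, epsilon)."""
    scale = sensitivity / epsilon  # Laplace scale b = Δf / ε
    return true_value + rng.laplace(loc=0.0, scale=scale)

rng = np.random.default_rng(42)
# Example: a contributor count of 120 with sensitivity 1 (one user changes
# a count by at most 1) and ε = 0.5 gives noise scale b = 2.
noisy_count = laplace_mechanism(120.0, sensitivity=1.0, epsilon=0.5, rng=rng)
```

Note that a smaller ε inflates the scale b and thus the noise, which is the privacy/accuracy trade-off discussed next.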
Why ε = 0.5? Lower ε = stronger privacy guarantee but noisier statistics. We chose ε = 0.5 as a balance: it satisfies "strong" DP by most academic standards (ε ≤ 1) while keeping the benchmark accuracy high enough to be useful (±$10 noise on a $60–$150 range is acceptable).
The combination of k=25 anonymity and ε=0.5 differential privacy provides defense-in-depth: k-anonymity protects against record linkage attacks; differential privacy protects against membership inference attacks.
6. Benchmark Contribution Pipeline Architecture
Contribution scheduler
Runs nightly at 02:00 UTC. Processes all opted-in Pro users added or modified since the last run.
Aggregation worker
Supabase Edge Function that reads user metrics, applies bucketing, and computes per-segment aggregate stats.
K-anonymity filter
Postgres function that counts contributors per segment and nulls out segments below k=25.
DP noise injector
Python Lambda function using NumPy's Laplace distribution to add calibrated noise to qualifying aggregates.
Benchmark store
Separate Postgres schema with no user_id columns. Tables: benchmark_segments, benchmark_percentiles, benchmark_metadata.
Delayed publish
New contributions are held in staging for 72 hours before going live. This prevents near-real-time membership inference.
Audit log
Append-only log of pipeline runs (no user data). Rotated after 90 days. Used for debugging and compliance review.
8. Data Deletion and Opt-Out
Disable benchmark contribution
Settings → Privacy → Benchmark contribution → toggle off
Your data stops contributing within 24 hours. Future aggregations exclude you. Previously published aggregates remain (they're group statistics, not individual records).
Delete your GigAnalytics account
Settings → Account → Delete Account
All your raw data (transactions, time entries, streams) is permanently deleted from Supabase within 7 days. Your benchmark contributions are removed from future pipeline runs. The benchmark store retains no user-identifying records, so there's nothing to delete there.
GDPR / CCPA data export
Settings → Privacy → Export my data
Downloads a JSON file of all your raw data: transactions, time entries, streams, and account metadata. We don't export benchmark data because it's not linked to you in the benchmark store.
GDPR right to be forgotten
Email hello@hourlyroi.com
We'll confirm deletion of your account and raw data within 30 days per GDPR Article 17.
9. Third-Party Data Processors
10. Security Measures
Auth
Supabase Auth with bcrypt password hashing, JWT tokens with a 1-hour expiry, and optional MFA.
Row Level Security
Every table enforces user_id = auth.uid() via Postgres RLS. No shared table scans.
TLS everywhere
TLS 1.3 for all API calls. HSTS header with 2-year max-age. No plain-HTTP fallback.
Dependency audit
npm audit runs on every PR. Critical vulnerabilities block deployment.
Input validation
All API inputs validated with Zod schemas. SQL injection mitigated by Supabase parameterized queries.
Breach notification
If a breach is detected, we notify the supervisory authority within 72 hours per GDPR Article 33 and inform affected users without undue delay per Article 34.