Limited time: Save up to 33% on every planView pricing
Voibe Logovoibe Resources

AI Tool Privacy Tracker

What every major AI tool actually does with your data. Training behavior, retention, and on-device support β€” verified against primary sources, with a separate row for consumer and business tiers because the answer is different.

Last updated May 12, 202630 tools trackedNext review June 12, 2026

Recent Changes

Dated policy shifts that changed what a tool does with your data. Each entry is linked to a primary source.

  1. Atlassian (Jira / Confluence) β€” forthcoming

    Atlassian will begin using customer data from Jira, Confluence, Jira Service Management, and other Cloud products to train AI offerings including Rovo and Rovo Dev. Affects roughly 300,000 customers. Opt-out is available now via Atlassian Administration β†’ Security β†’ Data contribution and must be set before the August 17 effective date.

    Source: us.seibert.group
  2. GitHub Copilot

    GitHub began using Free / Pro / Pro+ user interaction data, including code snippets, to train AI models by default. Existing opt-outs are honored. Business and Enterprise are unaffected.

    Source: github.blog/news-insights
  3. Granola

    The Verge reported Granola makes meeting notes accessible to anyone with the share link by default and opts users into internal AI training in default settings, contradicting the app's "private by default" marketing. Enterprise tier has training off by default; consumer tier requires manual opt-out in account settings.

    Source: themeridiem.com
  4. X / xAI (Grok)

    X's January 2026 Terms of Service update explicitly classified Grok prompts and outputs as user "Content" available for AI training and fine-tuning. Combined with the platform's prior default-on training of public X posts, this codifies the consumer-tier Grok experience as opt-in-by-default: opt-out paths exist in three separate settings surfaces (Grok on X / Grok mobile / grok.com), each requiring a separate toggle.

    Source: privacy.x.com
  5. Microsoft 365 Copilot

    Microsoft enabled Anthropic as a default subprocessor for Microsoft 365 Copilot β€” used in Researcher, Copilot Studio, Word / Excel / PowerPoint agents, and Agent Mode in Excel. Processing occurs outside the EU Data Boundary; EU and UK tenants have it disabled by default. The shift is opt-out for commercial tenants in unaffected regions. Microsoft 365 Copilot's no-training contractual posture is unchanged β€” the data still isn't used to train foundation models β€” but the surface area for who processes customer content expanded.

    Source: learn.microsoft.com
  6. Meta AI

    Meta began using Meta AI conversations for ad personalization in addition to model training across US and most non-EU regions. EU, UK, and South Korea are carved out pending GDPR/regional clearance. Sensitive categories (health, religion, politics) are excluded from ad targeting. First major consumer AI platform to feed active chat content into ad-targeting pipelines at scale; EPIC requested an FTC suspension in response.

    Source: snopes.com (fact-check Nov 2025)
  7. Fireflies.ai

    Class action filed under Illinois' Biometric Information Privacy Act (BIPA) alleging Fireflies collected voiceprint biometric data without written consent. The "Fireflies.ai Notetaker" bot joins meetings as a visible participant, creating consent complications in two-party consent jurisdictions.

    Source: meetily.ai
  8. OpenAI / ChatGPT

    OpenAI's obligation to indefinitely retain consumer ChatGPT and API content, imposed under the NYT litigation order, ended. Standard 30-day retention practices resumed. Limited April–September 2025 data is still preserved under the order.

    Source: openai.com
  9. Anthropic / Claude (incl. Claude Code on Pro/Max)

    Anthropic shifted consumer Claude (Free, Pro, Max) from "not used for training" to a user-choice model. Users who opt in have data retained up to 5 years (vs. previous 30 days). Existing-user choice deadline: October 8, 2025. The update explicitly applies to Claude Code sessions running on Pro/Max accounts β€” a developer using Claude Code via a Pro/Max account is on consumer terms (default-on training unless toggled off), while Claude Code via API key / Console / Enterprise remains on Commercial Terms (no training).

    Source: anthropic.com/news
  10. Grok / xAI

    Approximately 370,000+ Grok shared chat conversations were exposed publicly on Google Search because Grok's share feature generated unauthenticated URLs without `noindex` directives. The exposed prompts included medical information, business details, passwords, and requests for illegal activity. Disclosed first by Forbes / Fortune / TechCrunch; xAI patched after media intervention. Combined with the July 2025 "MechaHitler" content failure and Turkey's criminal probe + content block, this anchors the consumer-tier track-record floor.

    Source: techcrunch.com
  11. Otter.ai

    Brewer v. Otter.ai filed in the Northern District of California alleging Otter "deceptively and surreptitiously" recorded private conversations and used meeting data to train AI models without explicit permission from all participants. Complaint alleges violations of the Electronic Communications Privacy Act (ECPA), Computer Fraud and Abuse Act (CFAA), and California Invasion of Privacy Act (CIPA). Case ongoing.

    Source: meetily.ai (case summary)
  12. DeepSeek (cumulative bans Jan 2025 – Apr 2026)

    Italy's Garante imposed a 72-hour ban (January 30, 2025) followed by investigations in 13 European jurisdictions and the formation of a dedicated EDPB AI Enforcement Task Force. Government device bans followed in Australia, Taiwan, South Korea, Czech Republic, the Netherlands, Germany, multiple US federal agencies (Pentagon, NASA, US Navy), and several US states. The bipartisan "No DeepSeek on Government Devices Act" is pending in the US Senate. DeepSeek's privacy policy explicitly stores personal data β€” including keystroke patterns, IP addresses, and uploaded files β€” on servers in the People's Republic of China.

    Source: iapp.org

How this tracker is maintained

  • Each cell is verified against the vendor's own privacy policy, terms of service, or technical documentation. We don't paraphrase what the policy "probably" says β€” only what it actually states.
  • We separate consumer and business / API tiers because they operate under different contracts. The tier filter on each table reflects this; conflating them is the most common error in third-party comparisons.
  • "Trains on your data?" answers what happens by default at sign-up. The note under each chip describes the toggle, if one exists.
  • Each tier carries four independent scores on a 0–25 scale: Training, Retention, On-device, and Track record. The four axes sum to a 0–100 composite. We publish per-axis scores because the dimensions trade off differently for different use cases, and the composite for readers who need a single answer.
  • Track record reflects documented incidents, breaches, and unfavorable policy changes β€” sourced from the vendor's own announcements, court filings, and reporting that we link from this page or from related Voibe Resources articles. We don't penalize a vendor for a past mistake they have visibly fixed; we do penalize opaque reverses and silent expansions of data collection.
  • The matrix is reviewed on a roughly monthly cadence and updated immediately whenever a vendor announces a policy change. Each row carries its own Last verified date.
  • Errors get fixed fast: email hi@getvoibe.com with a primary-source link and we'll update the row.

Spotted an error? hi@getvoibe.com. Include the cell and a primary-source link and we'll update on the next pass.

Legend:Yes (default)β€” Trains by default with no opt-out pathYes (opt-out)β€” Trains by default; user can disable in settingsUser choiceβ€” User must actively choose during signup or in settingsNoβ€” Does not train on user data, period

Score legend β€” per axis (0–25)

22–25Strong β€” Architectural or contractual guarantee
17–21Solid β€” Default-off training, short retention, or partial on-device
11–16Mixed β€” User must take action, or policy has caveats
6–10Weak β€” Unfavorable default, opt-out only
1–5Poor β€” No opt-out path, or indefinite retention
0Unclear β€” Policy does not address this dimension

Total score (0–100, sum of four axes)

85–100Excellent
70–84Strong
55–69Adequate
40–54Weak
0–39Poor

Privacy scoreboard

One row per tool, showing its best-scoring tier in the selected tier group. Each axis is scored 0–25; the total adds the four axes for a 0–100 composite. Sorted by total by default. Tools with multiple sub-tiers in the same group (e.g., Superwhisper’s on-device vs cloud modes) are detailed in the matrix tables below.

Sort
#ToolCategoryPlan tierTrainingRetentionOn-deviceTrack recordTotal
1Voice & DictationAll plans2525252398/100Excellent
2Local LLM RuntimesOpen-source (MIT) β€” local install2525252398/100Excellent
3Local LLM RuntimesOpen-source (Apache 2.0) β€” local install2525252398/100Excellent
4Voice & DictationFree build (GPL v3) / Pro $39.99 one-time2525252297/100Excellent
5Local LLM RuntimesFree desktop app β€” Mac / Windows / Linux2523252093/100Excellent
6Voice & DictationOn-device modes (Fast / Nano / Standard / Parakeet β€” Free + Pro)2523251891/100Excellent
7Voice & DictationPro (Gumroad) / Whisper Transcription (App Store)2320232288/100Excellent
8AI AssistantsmacOS / iOS / iPadOS / visionOS (Apple Silicon, supported regions)2322222087/100Excellent
9Coding ToolsOpen-source extension (BYOK)2325132384/100Strong
10Voice & DictationmacOS / iOS (Apple Silicon, supported languages)2013181869/100Adequate
11Coding ToolsClaude Code via Pro / Max account (Aug 2025 consumer terms apply)1515131962/100Adequate
12Voice & DictationFree (1,000 words) / Pro ($144/yr) / Every Bundle ($30/mo)132213856/100Adequate
13AI AssistantsFree / Pro / Max151831551/100Weak
14Voice & DictationFree Trial / Individual ($12/mo annual = $144/yr, Private Mode default)201351351/100Weak
15AI AssistantsFree / Plus / Pro101831546/100Weak
16Voice & DictationFree / Pro15233546/100Weak
17AI AssistantsFree / Copilot Pro $20/mo / Windows Copilot101281040/100Weak
18Coding ToolsIndividual default (Privacy Mode OFF)83131337/100Poor
19Voice & DictationFree (Privacy Mode OFF, default)20331036/100Poor
20Meeting TranscriptionFree / Pro10133834/100Poor
21AI AssistantsFree / Gemini Advanced101031033/100Poor
22Coding ToolsIndividual (no ZDR, default)10531331/100Poor
23Meeting TranscriptionFree / Pro883827/100Poor
24Coding ToolsFree / Pro / Pro+1053826/100Poor
25Voice & DictationFree / Pro / iOS Pro8031526/100Poor
26AI AssistantsFree / Pro / Max833822/100Poor
27Meeting TranscriptionFree (800 min/mo) / Pro853521/100Poor
28AI AssistantsMeta AI in WhatsApp / Instagram / Facebook / Messenger / Threads / Ray-Ban Meta462517/100Poor
29AI AssistantsGrok on X (Free / Premium $16/mo / Premium+ $40/mo) + grok.com + mobile533314/100Poor
30AI AssistantsFree / Pro / Premium (consumer chatbot and mobile app)513110/100Poor

Use case fit: which score is right for what?

A composite score is only useful if you know what threshold to look for. The table below maps four common buyer scenarios to the minimum score we recommend, the reasoning, and the tools that currently meet that bar.

Healthcare, legal, financial β€” regulated work

85+

Patient records, attorney-client privileged content, financial data β€” anywhere a leak triggers regulator notification or contract liability.

Why this floor: Below 85, the vendor either retains data longer than 30 days, lacks a contractual training exclusion, or has a track-record incident in the past 24 months. Regulated work has no margin for any of those.

Watch out for: BAA / DPA availability β€” score doesn't reflect HIPAA contracts. A tool can score 85+ and still be unusable for PHI without a signed BAA. Always verify the contract before using a tool for regulated content.

Tools that meet the bar (business tier)

  • VoibeVoice & Dictation Β· All plans98/100Excellent
  • OllamaLocal LLM Runtimes Β· Open-source (MIT) β€” local install (same posture)98/100Excellent
  • JanLocal LLM Runtimes Β· Open-source (Apache 2.0) β€” local install (same posture)98/100Excellent
  • VoiceInkVoice & Dictation Β· Same posture (open-source local install)97/100Excellent
  • LM StudioLocal LLM Runtimes Β· Free desktop app (same posture)93/100Excellent
  • MacWhisperVoice & Dictation Β· Same posture (no separate enterprise tier)88/100Excellent
  • Apple Intelligence / SiriAI Assistants Β· macOS / iOS (same posture)87/100Excellent

Proprietary code, internal docs, M&A drafts

70+

Sensitive but not strictly regulated. Strategy docs, source code, contract drafts, customer data without compliance overlay.

Why this floor: 70+ means contractual ZDR or short retention with a vendor that has a clean recent track record β€” you trust them not to retain or train, even if you'd still avoid pasting raw secrets.

Watch out for: Consumer-tier mistakes. Most consumer tiers score below 50 β€” make sure your team is on the business plan, not pasting internal docs into ChatGPT Free. The tier filter on each table makes this visible.

Tools that meet the bar (business tier)

  • VoibeVoice & Dictation Β· All plans98/100Excellent
  • OllamaLocal LLM Runtimes Β· Open-source (MIT) β€” local install (same posture)98/100Excellent
  • JanLocal LLM Runtimes Β· Open-source (Apache 2.0) β€” local install (same posture)98/100Excellent
  • VoiceInkVoice & Dictation Β· Same posture (open-source local install)97/100Excellent
  • LM StudioLocal LLM Runtimes Β· Free desktop app (same posture)93/100Excellent
  • MacWhisperVoice & Dictation Β· Same posture (no separate enterprise tier)88/100Excellent
  • Apple Intelligence / SiriAI Assistants Β· macOS / iOS (same posture)87/100Excellent
  • Cline (open-source agent)Coding Tools Β· Open-source extension (BYOK)84/100Strong

Day-to-day drafting, research, light coding

55+

Public-facing content, general knowledge work, code that isn't a trade secret, prompts you wouldn't mind appearing in a leak.

Why this floor: 55+ means the tool is reasonably well-behaved by default or has a clear, working opt-out path that most users will actually flip.

Watch out for: Default settings. Many tools score 55+ only after the privacy toggle is on. Verify each user has actually flipped it β€” onboarding teams to a private-by-default workflow is more reliable than chasing settings.

Tools that meet the bar

  • VoibeVoice & Dictation Β· All plans98/100Excellent
  • OllamaLocal LLM Runtimes Β· Open-source (MIT) β€” local install98/100Excellent
  • JanLocal LLM Runtimes Β· Open-source (Apache 2.0) β€” local install98/100Excellent
  • VoiceInkVoice & Dictation Β· Free build (GPL v3) / Pro $39.99 one-time97/100Excellent
  • LM StudioLocal LLM Runtimes Β· Free desktop app β€” Mac / Windows / Linux93/100Excellent
  • SuperwhisperVoice & Dictation Β· On-device modes (Fast / Nano / Standard / Parakeet β€” Free + Pro)91/100Excellent
  • MacWhisperVoice & Dictation Β· Pro (Gumroad) / Whisper Transcription (App Store)88/100Excellent
  • Apple Intelligence / SiriAI Assistants Β· macOS / iOS / iPadOS / visionOS (Apple Silicon, supported regions)87/100Excellent

Personal, low-stakes use

Any

Notes to self, brainstorming, creative writing β€” content you wouldn't mind seeing in a leak.

Why this floor: Any tool works if you understand the tradeoff. Default consumer tiers of major assistants land in the 30–50 range; that's fine for non-sensitive prompts.

Watch out for: Voice input. Audio is uniquely sensitive β€” even casual dictation may capture identity-revealing details, ambient conversations, or addresses. The On-device score matters more for voice than for text.

Tools that meet the bar

  • VoibeVoice & Dictation Β· All plans98/100Excellent
  • OllamaLocal LLM Runtimes Β· Open-source (MIT) β€” local install98/100Excellent
  • JanLocal LLM Runtimes Β· Open-source (Apache 2.0) β€” local install98/100Excellent
  • VoiceInkVoice & Dictation Β· Free build (GPL v3) / Pro $39.99 one-time97/100Excellent
  • LM StudioLocal LLM Runtimes Β· Free desktop app β€” Mac / Windows / Linux93/100Excellent
  • SuperwhisperVoice & Dictation Β· On-device modes (Fast / Nano / Standard / Parakeet β€” Free + Pro)91/100Excellent
  • MacWhisperVoice & Dictation Β· Pro (Gumroad) / Whisper Transcription (App Store)88/100Excellent
  • Apple Intelligence / SiriAI Assistants Β· macOS / iOS / iPadOS / visionOS (Apple Silicon, supported regions)87/100Excellent

AI Assistants

Chatbots and search assistants. Consumer tiers vary the most β€” check whether your account is logged-in vs. logged-out, and whether you've reviewed your data settings since the last policy change.

ToolPlan tierData collectedTrains on your data?RetentionOn-deviceTrack recordTotalLast verifiedSource
Free / Plus / ProPrompts, outputs, uploaded files, usage, IP, device info, account infoYes (opt-out)

Off via Settings β†’ Data Controls β†’ "Improve the model for everyone." Temporary Chat is never used for training.

Training10/25
30 days after deletion. April–September 2025 data preserved due to NYT order; standard practice resumed Sept 26, 2025.
Retention18/25
No
On-device3/25

March 2023 chat-history bug; April–September 2025 NYT-mandated indefinite retention.

Track record15/25
46/100WeakApr 27, 2026
Free / Pro / MaxChats, coding sessions (when using Claude Code with consumer accounts), feedback (thumbs)User choice

Active choice required during signup or in Privacy Settings ("You can help improve Claude"). Off by default for users who decline. Policy changed August 2025.

Training15/25
30 days if declined. 5 years if enabled. Flagged conversations: 2–7 years for trust & safety.
Retention18/25
No
On-device3/25

August 2025 reversal: consumer Claude moved from 'never used for training' to user choice.

Track record15/25
51/100WeakApr 27, 2026
Free / Gemini AdvancedChats, files, photos, videos, screen content, account info, IP, device infoYes (opt-out)

Off via "Gemini Apps Activity" β†’ Off. Even when off, future chats are kept for 72 hours so Gemini can respond and process feedback.

Training10/25
18 months default (adjustable to 3 months / 36 months / never). Human-reviewed chats retained up to 3 years (disconnected from account).
Retention10/25
No
On-device3/25

Apps Activity 'off' still keeps chats 72h; human-reviewed conversations retained up to 3 years.

Track record10/25
33/100PoorApr 27, 2026
Free / Pro / MaxQueries, prompts, AI responses, usage, device infoYes (opt-out)

Off via Account Settings β†’ Preferences β†’ "AI Data Retention." Logged-out users are trained on by default with no opt-out path.

Training8/25
Threads kept until manually deleted. Account deletion processed within 30 days.
Retention3/25
No (Comet browser stores some data locally β€” separate policy)
On-device3/25

2024 reporting documented robots.txt evasion via undisclosed user-agent; logged-out users still trained on.

Track record8/25
22/100PoorApr 27, 2026
Free / Pro / Premium (consumer chatbot and mobile app)Prompts, outputs, uploaded files, keystroke patterns, IP, device info, account infoYes (default)

DeepSeek's privacy policy stores user data on servers in the People's Republic of China. No documented opt-out path for training in the consumer interface. Note: open-source MIT-licensed weights can be self-hosted, which removes this concern entirely β€” but that is not the consumer chatbot row.

Training5/25
Indefinite on Chinese servers. February 2025 EEA Supplemental Clause acknowledges "your personal data may be processed and stored in our servers in the People's Republic of China."
Retention1/25
No
On-device3/25

Banned by Italy (Garante, Jan 2025); 13-jurisdiction EU probes; US gov device bans (Pentagon, NASA, US Navy); Jan 2025 breach exposed 1M+ records; Feroot Security found code linking to China Mobile authentication registry.

Track record1/25
10/100PoorMay 6, 2026
macOS / iOS / iPadOS / visionOS (Apple Silicon, supported regions)Prompts, optional personal context (only when feature invoked), request size and duration metadata. ChatGPT integration is opt-in and routes through OpenAI's enterprise terms with IP obfuscation.No

Apple's published policy states Apple does not use users' private personal data or user interactions when training its foundation models. ChatGPT integration is a separate, opt-in boundary not covered by Apple's guarantees.

Training23/25
Stateless. Private Cloud Compute processes data only to fulfill the request and returns results to device; data is not stored or made accessible to Apple. Apple collects only request metadata (size, feature, duration) β€” not content.
Retention22/25
Yes (primary). On-device foundation model handles most tasks; Private Cloud Compute is an architectural extension of device security to Apple Silicon servers with verifiable transparency, signed binaries, no SSH or admin access.
On-device22/25

Verifiable transparency model; signed binaries publicly inspectable; no documented incidents in the Apple Intelligence era. ChatGPT integration introduces an opt-in external boundary not covered by PCC guarantees.

Track record20/25
87/100ExcellentMay 6, 2026
Free / Copilot Pro $20/mo / Windows CopilotPrompts, conversation activity, voice inputs, uploaded images and files, Bing / MSN browsing signals, ad interactions, identifiers tied to Microsoft Account when signed in. Verbatim from Microsoft: "your voice and conversation activity with Copilot, including the images or files you upload."Yes (opt-out)

"Except for certain categories of users or users who have opted out, Microsoft uses data from Bing, MSN, Copilot, and interactions with ads on Microsoft for AI training." Opt-out at Profile icon β†’ Privacy β†’ "Training on conversation activity." Retroactive β€” applies to past, present, and future use; propagates within 30 days. Signed-out users are not used for training. Copilot Pro inherits the same consumer training posture as Free β€” no contractual carve-out.

Training10/25
18 months default for conversation activity per Microsoft Support. User can delete individual items or full history; opt-out from training is separate from history deletion.
Retention12/25
Partial β€” Recall and Click to Do run locally on Copilot+ PCs (NPU β‰₯40 TOPS); snapshots stored on-device only. Most Copilot chat (text / voice / image generation) runs in the cloud. Recall is opt-in (off by default) post-2024 backlash, with encrypted local database.
On-device8/25

Recall controversy (2024 β€” Kevin Beaumont / DoublePulsar plaintext-database findings) is the adjacent track-record drag; Copilot itself has no comparable public breach. Microsoft's general advertising / data history weighs on the consumer composite.

Track record10/25
40/100WeakMay 12, 2026
Meta AI in WhatsApp / Instagram / Facebook / Messenger / Threads / Ray-Ban MetaPublic Facebook and Instagram posts, captions, and comments from adults (18+); all interactions with Meta AI (prompts, queries, responses); public profile data; images shared publicly or in which the user is tagged; voice recordings from Ray-Ban / Oakley Meta glasses (stored by default since April 2025, retained up to 1 year); device and location data; ad and interaction signals; public Vibes (AI-generated content shared to feed).Yes (default β€” regional opt-out)

Opt-in by default globally on public posts and Meta AI conversations. EU / EEA users have opt-out via the "Right to Object" form in Privacy Center β†’ "How Meta uses information for generative AI models and features" β€” Meta resumed EU training May 27 2025 under GDPR Article 6(1)(f) "legitimate interest" after a one-year pause. UK opt-out form available after ICO concessions. US / Australia / most non-GDPR jurisdictions have no general opt-out β€” Meta confirmed in Australian Senate hearings (Sept 2024) it scrapes AU public posts back to 2007 and will only offer an opt-out "if governments force it to." WhatsApp 1:1 and group messages remain end-to-end encrypted and are not used for training UNLESS the user explicitly invokes @Meta AI in chat. As of December 16 2025, Meta AI conversations also feed ad personalization.

Training4/25
Indefinite unless user manually deletes (Settings β†’ Data and privacy β†’ Manage your information β†’ Delete all chats and media). No published auto-purge window. Ray-Ban Meta voice recordings: up to 1 year by default. Memory is managed separately via Accounts Center.
Retention6/25
No for the Meta AI assistant. WhatsApp "Private Processing" (2025) is a server-side TEE-based confidential compute layer with hardware attestation, not on-device.
On-device2/25

June 2024 EU training pause forced by Irish DPC + noyb pressure; May 2025 noyb cease-and-desist threatening class action; Sept 2024 admission to AU Senate that AU public posts scraped back to 2007 with no general opt-out; April 2025 Ray-Ban Meta voice-recording-by-default expansion (removed prior user control); Dec 2025 ad-targeting expansion β†’ EPIC FTC complaint. Pattern of expand-first, retreat-only-under-regulator.

Track record5/25
17/100PoorMay 12, 2026
Grok on X (Free / Premium $16/mo / Premium+ $40/mo) + grok.com + mobilePublic X posts, likes, follows, replies, reposts; Grok prompts, inputs, outputs, voice prompts, images uploaded for context; device data (IP, browser, OS, advertising ID, carrier, language, memory, apps installed, battery level); approximate location (IP-derived) plus GPS if granted; account info, contact list if shared; DM contents, recipients, timestamps; Grok memory stores conversation context across sessions.Yes (default)

Opt-in by default for X posts AND Grok chats / outputs. Three separate opt-out paths across surfaces: (1) Grok on X β€” Settings β†’ Privacy and safety β†’ Data sharing and personalization β†’ Grok & xAI β†’ uncheck "Allow your posts as well as your interactions, inputs, and results with Grok and xAI to be used for training and fine-tuning." (2) Grok mobile app β€” Settings β†’ Data Controls β†’ uncheck "Improve the model." (3) Grok web (grok.com) β€” Settings β†’ Data β†’ uncheck "Improve the model." EU / EEA users excluded from X-post training under the Sept 4 2024 DPC undertaking; DPC opened a fresh statutory inquiry into XIUC on April 11 2025. The Jan 15 2026 X ToS update explicitly classifies Grok prompts and outputs as user "Content" available for AI training. Private Chat mode on grok.com auto-opts-out of training for that session.

Training5/25
Indefinite unless user-deleted; deletion processed within 30 days. No tier-based retention differences across Free / Premium / Premium+.
Retention3/25
No β€” cloud-only via xAI's Memphis Colossus / Colossus 2 supercomputer.
On-device3/25

Aug–Sept 2024 Irish DPC High Court action over Grok EU-post training; April 2025 DPC fresh statutory inquiry into XIUC; July 8–12 2025 "MechaHitler" content failure β†’ EU Commission technical meeting + Turkey criminal probe + court-ordered content block + Poland DSA referral; August 20 2025 β€” ~370,000+ Grok shared-chat conversations indexed on Google because share feature generated unauthenticated URLs without `noindex` directives; March 2026 Lieff Cabraser class action over Grok-generated CSAM / non-consensual sexual deepfakes.

Track record3/25
14/100PoorMay 12, 2026

AI Coding Tools

IDE assistants and agents. Consumer defaults shifted in April 2026 (GitHub Copilot now trains on consumer interaction data by default). Most tools offer a Privacy / Zero-Data-Retention mode that flips the answer; check whether yours is on.

ToolPlan tierData collectedTrains on your data?RetentionOn-deviceTrack recordTotalLast verifiedSource
Individual default (Privacy Mode OFF)Code, prompts, editor actions, code snippetsYes (default)

Default for individual accounts ("Share Data" on). Used to improve Cursor's models. Toggle Privacy Mode ON to opt out β€” code never trained on, plaintext discarded after request.

Training8/25
Stored indefinitely (Share Data ON). Privacy Mode ON: plaintext discarded after request; cached files encrypted with client-generated keys.
Retention3/25
No (configurable to use local Ollama / LM Studio models, which bypass Privacy Mode entirely)
On-device13/25

Default 'Share Data' on for individuals; no documented incidents; transparent data-use page.

Track record13/25
37/100PoorApr 27, 2026
Free / Pro / Pro+Inputs, outputs, code snippets, associated contextYes (opt-out)

Policy changed April 24, 2026: GitHub now trains on consumer interaction data by default. Existing opt-outs honored. Toggle in Settings β†’ Privacy.

Training10/25
User Engagement Data: 2 years. Coding Agent session logs: lifetime of account. Private repo code at rest is NOT used for training; in-flight interaction data IS.
Retention5/25
No
On-device3/25

April 2026 reversal trains consumer interaction data by default; pending class action over training data.

Track record8/25
26/100PoorApr 27, 2026
Individual (no ZDR, default)Logs may contain code snippets and user trajectoriesYes (opt-out)

ZDR is opt-in for individuals β€” toggle in profile to enable. With ZDR ON, code submitted is never trained on.

Training10/25
With ZDR ON: in-memory for request lifetime, plus minutes-to-hours for prompt caching. Without ZDR: logs may persist.
Retention5/25
No
On-device3/25

2024 Codeium β†’ Windsurf rebrand; default-off ZDR for individuals; no documented incidents.

Track record13/25
31/100PoorApr 27, 2026
Open-source extension (BYOK)Cline operates no model server. Code goes only to your configured API provider (Anthropic, OpenAI, Bedrock, Gemini, etc.) and is governed by that provider's terms.No (by Cline)

Cline's stated principle: "Code never leaves your machine" toward Cline servers. Anonymous telemetry (features used, task completion) is opt-out via the "Cline Telemetry" setting. Code, file contents, command arguments, and conversation content are not collected by telemetry.

Training23/25
Cline retains nothing about your code. Provider retention applies (e.g., Anthropic API ZDR, OpenAI API 30 days).
Retention25/25
Partial β€” extension runs locally; inference happens at your chosen provider, or fully on-device if you configure Ollama / LM Studio.
On-device13/25

Open-source; no model server; no documented incidents.

Track record23/25
84/100StrongApr 27, 2026
Claude Code via Pro / Max account (Aug 2025 consumer terms apply)All user prompts and model outputs, encrypted via TLS. Includes file contents the model reads, tool-use results, and Bash command outputs that are returned to the model. Additionally: telemetry (latency / reliability / usage metrics β€” no code or file paths per the docs), Sentry error reports, and optional `/feedback` submissions (full conversation history including code).User choice (default-on)

Critical disambiguation: Claude Code run from a Claude.ai Pro or Max account inherits the August 28 2025 consumer-terms update β€” default-on training, user choice required by Oct 8 2025. Anthropic's announcement explicitly states the update applies "when they use Claude Code from accounts associated with those plans." Opt-out at claude.ai/settings/data-privacy-controls.

Training15/25
5 years if model-improvement enabled; 30 days if disabled. Local session transcripts cached in plaintext under ~/.claude/projects/ for 30 days (tunable via cleanupPeriodDays). /feedback transcripts retained 5 years. Optional session-transcript uploads retained 6 months.
Retention15/25
Partial β€” Claude Code binary executes locally (file edits, shell execution, MCP servers, local session log). Inference is cloud-only via Anthropic API. Code never leaves machine unless sent as part of an inference request, telemetry fires (no code in telemetry per docs), Sentry reports an error, or the user runs /feedback.
On-device13/25

No documented Claude Code data-handling incidents to date; newer surface than Cursor (Claude Code GA'd 2025). Slight deduction for the Aug 2025 consumer-terms reversal scope catching some users by surprise (developers assumed API-tier terms).

Track record19/25
62/100AdequateMay 12, 2026

Voice & Dictation

Speech-to-text and dictation tools. Voice input is uniquely sensitive β€” audio carries identity, biometric data, and ambient context β€” so the on-device column matters more than for text-only tools.

ToolPlan tierData collectedTrains on your data?RetentionOn-deviceTrack recordTotalLast verifiedSource
All plansNo audio or transcription leaves the device. Account holders: email (account auth) plus non-identifying usage analytics; crash reports exclude dictated content.No

"The Voibe application processes your voice entirely on your device. No audio is transmitted to our servers at any point."

Training25/25
Audio: not transmitted, not retained. Account email: kept while account is active.
Retention25/25
Yes (only mode) β€” Whisper models running on Apple Silicon Neural Engine
On-device25/25

New entrant; on-device-only architecture removes the surface for retention or training incidents.

Track record23/25
98/100ExcellentApr 27, 2026
Free (Privacy Mode OFF, default)Audio, transcripts, edits, optional Context Awareness (screen content from active app)Yes (opt-in)

After 2024 community backlash, training is now off by default and requires opt-in. Audio retained indefinitely; 30 days for data passed to third-party LLMs (OpenAI, Meta).

Training20/25
Indefinite for retained dictation data; 30 days for third-party LLM passthrough.
Retention3/25
No β€” transcription always happens in the cloud, even in Privacy Mode (zero-retention cloud, not local).
On-device3/25

2024 community backlash forced opt-in training and Privacy Mode; Free tier still indefinite by default.

Track record10/25
36/100PoorApr 27, 2026
On-device modes (Fast / Nano / Standard / Parakeet β€” Free + Pro)None β€” audio processed locally and never transmittedNo

"Your data is not retained on Superwhisper servers" and "not used for training AI models or any other machine learning purposes." Audio recordings are saved to local disk by default β€” opt out in settings.

Training25/25
N/A on servers. Local recordings persist until the user deletes them.
Retention23/25
Yes
On-device25/25

Stable privacy-first stance; cloud modes added without separate disclosure in the public privacy policy.

Track record18/25
91/100ExcellentApr 27, 2026
Cloud modes (Ultra transcription / Super Mode LLMs β€” Pro)Audio sent to Superwhisper's proxy infrastructureNo (per vendor)

Superwhisper says cloud audio is proxied through their infrastructure, third-party providers don't see user account or content, and there is no training or retention. Cloud-mode handling is not currently distinguished in the public privacy policy from on-device modes β€” verify the latest with the vendor before sensitive use.

Training18/25
Stated as not retained on servers; not separately documented for cloud modes.
Retention13/25
No
On-device3/25

Stable privacy-first stance; cloud modes added without separate disclosure in the public privacy policy.

Track record18/25
52/100WeakApr 27, 2026
Pro (Gumroad) / Whisper Transcription (App Store)On-device modes: none transmitted. App Store version discloses "Usage Data" and "Product Interaction" as Data Not Linked to You. Cloud Assistant or BYOK (OpenAI / ElevenLabs) features send audio to those providers under their terms.No (by MacWhisper)

MacWhisper does not train its own models on user audio. Cloud Assistant and BYOK integrations inherit the chosen provider's terms (e.g., OpenAI Whisper API, Anthropic / ElevenLabs).

Training23/25
On-device transcription: not retained. Cloud Assistant / BYOK: per third-party provider's terms.
Retention20/25
Yes (primary mode) β€” local Whisper models plus Apple Foundation Models for AI features. Cloud Assistant is opt-in for higher-quality transcription.
On-device23/25

Long-running indie tool; on-device by default; no documented incidents.

Track record22/25
88/100ExcellentApr 27, 2026
Free / Pro / iOS ProAudio inputs, technical data (IP, browser, OS, performance metrics), session metadata. With Privacy Mode disabled, "we may securely store transcript data on our servers."Yes (opt-out)

Privacy Mode toggle stops transcript storage on Aqua Voice servers; with it enabled, "transcript data is not collected" though session metadata may still be. The privacy policy does not explicitly state whether stored transcript data is used for AI training. SOC 2 Type II certified by Advantage Partners. No HIPAA BAA publicly advertised.

Training8/25
With Privacy Mode disabled: not specified in policy. With Privacy Mode enabled: transcripts not stored; session metadata (timestamps, device type, performance metrics) may be retained.
Retention0/25
No β€” cloud transcription
On-device3/25

SOC 2 Type II via Advantage Partners; privacy policy ambiguous on whether stored transcripts are used for AI training.

Track record15/25
26/100PoorApr 27, 2026
Free / ProAudio plus limited contextual information, processed on Typeless's cloud servers. Subprocessors include third-party LLM providers, analytics, and cloud infrastructure.No (per vendor)

Privacy policy: "Your data is never used to train these services and is configured for zero retention by the providers." Note: the November 2025 reverse-engineering analysis documented in our Typeless privacy issues investigation reported collection beyond what the public policy describes β€” verify against the current policy and subprocessor list before sensitive use.

Training15/25
Per privacy policy, audio + contextual information are "processed in real time on our cloud servers and immediately discarded once the result is returned to your device."
Retention23/25
No β€” cloud-processed in real time
On-device3/25

Nov 2025 reverse-engineering analysis documented collection (URLs, window-title metadata, broad permissions) beyond what the public policy describes.

Track record5/25
46/100WeakApr 27, 2026
macOS / iOS (Apple Silicon, supported languages)Audio inputs, plus contextual data (contacts, app names, etc.) when sent to serversOpt-in only

"Improve Siri & Dictation" must be enabled. Default at setup is to be asked.

Training20/25
If opted in: audio + transcripts kept under a rotating random ID for up to 6 months, dissociated and kept up to 2 years for improvement; reviewed subset retained beyond 2 years. If opted out: not retained for improvement.
Retention13/25
Yes (partially) β€” most languages on Apple Silicon process locally for general text fields (Notes, Mail, Messages). Server fallback applies to unsupported languages, search-box dictation, and some third-party Speech Recognition API uses.
On-device18/25

2019 Siri grading scandal led to opt-in for human review; otherwise privacy-forward.

Track record18/25
69/100AdequateApr 27, 2026
Free Trial / Individual ($12/mo annual = $144/yr, Private Mode default)Private Mode (default): basic technical and account-related data only β€” verbatim: "In private mode, Willow only collects basic technical and account-related data needed to run the app and nothing else. No voice, no dictated text." Opt-In Mode: anonymized text and usage data to improve text-correction models.No (Private Mode default)

Private Mode is the default. Training only occurs if the user explicitly opts into Opt-In Mode β€” "You can allow Willow to collect minimal usage data to help improve our text correction models." This is the inverse of Wispr Flow Free, which had Privacy Mode off by default until the 2024 community backlash. Privacy policy last updated April 30 2025.

Training20/25
Private Mode: no audio or transcript collected on Willow servers. Opt-In Mode: "We keep anonymized text and usage data only as long as needed to train and improve the app." No specific retention window disclosed for the Opt-In path.
Retention13/25
No β€” cloud-first transcription. The privacy policy's "this is only stored locally on your device for you to view" refers to local UI storage of past transcripts, not local transcription. No offline transcription mode advertised on the current pricing page.
On-device5/25

Newer entrant (YC X25 cohort). Marketing pricing page advertises HIPAA + SOC 2 + zero data retention at the Enterprise tier, but the privacy policy text itself only references "GDPR and SOC 2" β€” HIPAA / BAA scope is not documented in the policy. Minor discrepancy between marketing claims and policy substantiation.

Track record13/25
51/100WeakMay 12, 2026
Free (1,000 words) / Pro ($144/yr) / Every Bundle ($30/mo)Per the monologue.to data-privacy page: "No audio files or transcripts are saved on our servers," "Deep context screenshots are deleted immediately," "Zero LLM data retention," and "Custom modes and dictionaries stay on your device." The page does not enumerate categories of data collected beyond these statements.Unknown β€” policy silent

The published monologue.to data-privacy page and the Notion-hosted privacy policy neither explicitly state that user data is used for training nor that it is excluded. This silence is load-bearing: peer cloud dictation products (Wispr Flow, Typeless, Superwhisper, Willow Voice) all explicitly address training one way or the other. Treat as ambiguous until clarified.

Training13/25
Verbatim: "No audio files or transcripts are saved on our servers." "Zero LLM data retention." Deep context screenshots deleted immediately. Custom modes and dictionaries stay on the device.
Retention22/25
Partial β€” "Offline transcription support" is listed as a feature on both Free and Pro tiers per the monologue.to marketing site, and "Custom modes and dictionaries stay on your device," but the marketing site does not document the architecture (cloud-default with offline fallback vs. on-device-default).
On-device13/25

Indie product (1-person team). Privacy policy hosted on Notion at modern-ton-234.notion.site and is sparse β€” no list of data categories collected, no subprocessor list, no retention windows beyond the broad "not saved" claims, no SOC 2 / HIPAA / BAA. Training-silence is the load-bearing gap.

Track record8/25
56/100AdequateMay 12, 2026
Free build (GPL v3) / Pro $39.99 one-timeNone on VoiceInk servers β€” there is no VoiceInk-operated cloud service in the inference path. Audio is transcribed on-device using whisper.cpp.No

VoiceInk README verbatim: "100% offline processing ensures your data never leaves your device." Marketing site: "privacy-focused dictation app for macOS with local transcription." No telemetry, BYOK cloud feature, or AI-command post-processing service documented at this verification pass. Source code is auditable on GitHub under GPL v3.0.

Training25/25
Nothing leaves the device. No vendor server, no retention surface.
Retention25/25
Yes (only mode) β€” local Whisper models via whisper.cpp. Apple Silicon native.
On-device25/25

Open-source GPL v3.0; auditable on GitHub at Beingpax/VoiceInk; 4,900+ stars and 675+ forks; 119 releases; latest v1.76 (May 7 2026). No documented incidents. Mild deduction for being a single-maintainer project (continuity risk distinct from privacy risk).

Track record22/25
97/100ExcellentMay 12, 2026

Meeting Transcription

AI meeting assistants β€” bot-based notetakers and meeting transcription apps. This category had its privacy reckoning in 2025–2026: Otter and Fireflies were both sued; Granola was reported by The Verge for default-public note links and opt-out training. Audio + transcripts of meetings are uniquely high-stakes because participants often haven't consented individually, and 12 US states require all-party consent for recording.

ToolPlan tierData collectedTrains on your data?RetentionOn-deviceTrack recordTotalLast verifiedSource
Free / ProAudio recordings, transcripts, meeting metadata, speaker identification, account infoYes (opt-out)

Otter trains automatically on de-identified user data; recordings and transcripts are not manually reviewed by humans unless the customer gives explicit consent for support troubleshooting. Training data is encrypted. Opt-out exists in account settings.

Training8/25
Conversations stored until manually deleted; trash holds for 30 days then auto-purges.
Retention8/25
No (cloud)
On-device3/25

Brewer v. Otter.ai (N.D. Cal., Aug 2025) β€” class action alleging unauthorized recording and use of meeting data to train AI without all-participant consent. Visible bot creates two-party consent issues. SOC 2 Type 2. Case ongoing.

Track record8/25
27/100PoorMay 6, 2026
Free (800 min/mo) / ProAudio, video recordings, transcripts, speaker identification, voiceprint data, meeting metadataYes (default for consumer)

Fireflies trains on user transcripts in the default consumer configuration. Opt-out exists in admin settings; Enterprise plans differ.

Training8/25
Stored until manual deletion. Data residency options available for international compliance.
Retention5/25
No (cloud)
On-device3/25

December 2025 BIPA class action alleging Fireflies collected voiceprint biometric data without written consent. Visible "Fireflies.ai Notetaker" bot creates two-party consent issues. GDPR compliant. Case ongoing.

Track record5/25
21/100PoorMay 6, 2026
Free / ProAudio is transcribed in real time on macOS/Windows and not stored; transcripts and user notes are stored on AWS. iOS uses temporarily cached audio.Yes (opt-out)

Granola uses de-identified user data to train internal AI models supporting its services by default. Opt-out is in account settings and is not surfaced prominently. Third-party LLM providers (OpenAI, Anthropic) are contractually prohibited from training on Granola data.

Training10/25
Audio not stored. Transcripts and notes stored in US-hosted AWS Virtual Private Cloud, encrypted at rest and in transit, backed up daily. Notes are accessible to anyone with the share link by default β€” link sharing is on, not off, until the user changes it.
Retention13/25
No. Despite real-time transcription on the user's device, audio is sent to cloud transcription providers (Deepgram, AssemblyAI) and summarization uses cloud LLMs (OpenAI, Anthropic).
On-device3/25

April 2 2026 β€” The Verge reported Granola makes notes accessible via shareable link by default and opts users into internal AI training in default settings, contradicting the app's "private by default" marketing. No documented breaches.

Track record8/25
34/100PoorMay 6, 2026

Local LLM Runtimes

Tools for running large language models entirely on the user's own hardware. These tools score uniformly high on Training, Retention, and On-device because there is no vendor server in the inference path β€” the user supplies the model and the compute. They sit in the same architectural category as Voibe and earn the same scores for the same reason.

ToolPlan tierData collectedTrains on your data?RetentionOn-deviceTrack recordTotalLast verifiedSource
Open-source (MIT) β€” local installNone on Ollama servers. Models run locally; OpenAI-compatible local REST API on localhost:11434.No

Ollama is a local runtime β€” there is no Ollama-operated model server to send prompts or completions to.

Training25/25
Nothing leaves the user's machine. No vendor servers; no retention surface.
Retention25/25
Yes (only mode). Models run entirely on local hardware.
On-device25/25

Open-source; over 100,000 GitHub stars; transparent codebase; no documented incidents. Users who deliberately expose the local API beyond localhost create their own attack surface β€” that is a deployment choice, not an Ollama default.

Track record23/25
98/100ExcellentMay 6, 2026
Free desktop app β€” Mac / Windows / LinuxNone on LM Studio servers for inference. Optional LM Link feature shares only device-list metadata (not chats) with LM Studio's backend for device discovery.No

LM Studio runs models locally with an optional OpenAI-compatible local server. The LM Link remote-inference feature uses end-to-end encrypted Tailscale mesh VPNs; chats remain local even across linked devices.

Training25/25
Inference is local. LM Link sends only device list metadata for discovery; nothing else is uploaded to LM Studio's backend.
Retention23/25
Yes (only mode for inference).
On-device25/25

Privately operated; no documented incidents; transparent about data flows on the LM Link product page. Closed-source application.

Track record20/25
93/100ExcellentMay 6, 2026
Open-source (Apache 2.0) β€” local installNone on Jan servers. Optional cloud-provider integrations inherit the chosen provider's terms.No

Jan runs models locally; there is no Jan-operated model server. Optional cloud-provider integrations are explicitly opt-in.

Training25/25
Nothing leaves the device in default operation.
Retention25/25
Yes (default mode). Optional cloud integrations are opt-in.
On-device25/25

Open-source; auditable codebase; active community roadmap; no documented incidents.

Track record23/25
98/100ExcellentMay 6, 2026

Privacy Policy Quick Read: Does Each AI Tool Train on Your Data?

For each of the 30 tools in the matrix above, here is what the vendor's own privacy policy says about training, retention, and on-device support β€” quoted verbatim where the policy text supports a clean citation. Each entry links to the primary source we verified against on .

AI Assistants

Does ChatGPT train on my data?

Yes, by default β€” opt-out available.

ChatGPT's consumer plans (Free, Plus, Pro) train on user prompts, outputs, and uploaded files by default. To opt out, navigate to Settings β†’ Data Controls and disable "Improve the model for everyone." Conversations are retained for 30 days after deletion. Temporary Chat is never used for training. ChatGPT Team, Enterprise, and API plans are explicitly excluded from training under OpenAI's enterprise terms β€” API users can optionally opt in via Playground feedback. Limited April–September 2025 data is preserved due to the NYT litigation order; OpenAI's standard 30-day retention practices resumed September 26, 2025.

Primary source: OpenAI privacy policy

Does Claude train on my data?

User choice required (since Aug 2025).

As of August 28, 2025, Anthropic shifted Claude's consumer plans (Free, Pro, Max) from "not used for training" to a user-choice model. New users must actively choose during signup whether to share data for training; existing users had until October 8, 2025. Users who opt in have their data retained for up to 5 years; users who decline keep the previous 30-day retention window. Flagged conversations are retained 2–7 years for trust & safety review. Claude for Work, the Claude API, Amazon Bedrock, and Google Vertex AI are all contractually excluded from training under Anthropic's Commercial Terms.

Primary source: Anthropic Aug 2025 update

Does Gemini train on my data?

Yes, by default β€” opt-out via "Gemini Apps Activity."

Free Gemini and Gemini Advanced (consumer) train on user conversations by default. Per Google's documentation, when Gemini Apps Activity is on, "Google uses your activity to provide, develop, and improve its services (including training generative AI models)." To opt out, set Apps Activity to OFF β€” but even when off, future chats are saved for 72 hours so Gemini can respond and process feedback. Default retention is 18 months, adjustable to 3 months, 36 months, or never. Human-reviewed conversations are kept up to 3 years (disconnected from your Google Account). Vertex AI customer data is contractually excluded from training: "Google won't use your data to train or fine-tune any AI/ML models without your prior permission or instruction."

Primary source: Gemini Apps Activity controls

Does Perplexity train on my data?

Yes, by default β€” opt-out for logged-in users only.

Perplexity trains on user queries, prompts, and AI responses by default for Free, Pro, and Max plans. The "AI Data Retention" toggle in Account Settings β†’ Preferences disables this. Logged-out users are trained on by default with no opt-out path β€” sign in to gain control. Threads are retained until manually deleted; account deletion is processed within 30 days. The Sonar API offers Zero Data Retention with prompts and responses never stored. Third-party providers (OpenAI, Anthropic) are contractually prohibited from training on Perplexity's API data. Enterprise file uploads are deleted after 7 days.

Primary source: Perplexity data collection policy

Does DeepSeek train on my data?

Yes, by default β€” no documented opt-out for the consumer chatbot.

DeepSeek's published privacy policy stores user data β€” including prompts, outputs, uploaded files, keystroke patterns, IP addresses, device info, and account info β€” on servers in the People's Republic of China. The February 2025 EEA Supplemental Clause acknowledges "your personal data may be processed and stored in our servers in the People's Republic of China." There is no documented opt-out path for training in the consumer interface. Italy's Garante imposed a 72-hour ban in January 2025 followed by 13 European jurisdiction probes; government device bans followed in Australia, Taiwan, South Korea, Czech Republic, the Netherlands, Germany, and multiple US federal agencies (Pentagon, NASA, US Navy). The bipartisan "No DeepSeek on Government Devices Act" is pending in the US Senate. A January 2025 database breach exposed over 1 million records. Open-source MIT-licensed weights can be self-hosted, which removes this concern entirely β€” but that is the self-hosted path, not the consumer chatbot.

Primary source: DeepSeek privacy policy

Does Apple Intelligence train on my data?

No β€” Apple's published policy excludes user interactions from foundation-model training.

Apple's published policy states Apple does not use users' private personal data or user interactions when training its foundation models. The on-device foundation model handles most tasks; for larger requests, Private Cloud Compute extends device security to Apple Silicon servers using a verifiable transparency model β€” signed binaries are publicly inspectable, there is no SSH or admin access, and researchers can audit via the Virtual Research Environment. PCC is stateless: data is processed only to fulfill the request and returned to the device; it is not stored or made accessible to Apple. Apple collects only request metadata (size, feature, duration), not content. The optional ChatGPT integration is a separate, opt-in boundary that routes through OpenAI's enterprise terms with IP obfuscation β€” Apple's guarantees do not extend to ChatGPT requests.

Primary source: Intelligence Engine privacy

Does Microsoft Copilot train on my data?

Consumer: yes by default β€” opt-out available. M365 Copilot business: no, contractually excluded.

First, the disambiguation: Microsoft Copilot is the consumer / Windows / Microsoft 365 assistant β€” it is distinct from GitHub Copilot (the developer product) which is tracked separately in the coding section. For Copilot Free, Copilot Pro ($20/mo), and Windows Copilot, Microsoft trains by default on signed-in users' conversation activity. Microsoft Support states: "Except for certain categories of users or users who have opted out, Microsoft uses data from Bing, MSN, Copilot, and interactions with ads on Microsoft for AI training." Opt-out is at Profile icon β†’ Privacy β†’ "Training on conversation activity" and is retroactive β€” it applies to past, present, and future use, with propagation within 30 days. Signed-out users are not used for training. Conversation activity is retained 18 months by default; user-deletable, but deletion is independent of the training opt-out. On Copilot+ PCs (NPU β‰₯40 TOPS), Recall and Click to Do run locally on-device with snapshots stored on-device; most Copilot chat still runs in the cloud. Microsoft 365 Copilot ($30/user/mo, enterprise) is contractually excluded from training under the Microsoft Products and Services Data Protection Addendum: "Your data isn't used to train foundation models... the prompts, responses, and data accessed through Microsoft Graph aren't used to train foundation models." Retention is tenant-controlled via Microsoft Purview. On January 7 2026, Microsoft enabled Anthropic as a default subprocessor for M365 Copilot (used in Researcher, Copilot Studio, Office agents); EU and UK tenants have it disabled by default, processing occurs outside the EU Data Boundary.

Primary source: Privacy FAQ for Microsoft Copilot

Does Meta AI train on my data?

Yes, by default globally. EU / UK have an opt-out form; US / Australia / most non-GDPR regions do not.

Meta AI β€” the assistant available in WhatsApp, Instagram, Facebook, Messenger, Threads, and Ray-Ban / Oakley Meta β€” is opt-in by default for training on adult users' public Facebook and Instagram posts and on all Meta AI conversations. The regional split is load-bearing. EU / EEA users have opt-out via the "Right to Object" form in Privacy Center β†’ "How Meta uses information for generative AI models and features" β€” Meta resumed EU training on May 27 2025 under GDPR Article 6(1)(f) "legitimate interest" after a one-year pause forced by Irish DPC + noyb pressure. UK opt-out is available after ICO concessions. US, Australia, and most non-GDPR jurisdictions have no general opt-out β€” Meta confirmed in Australian Senate hearings (Sept 2024) that it scrapes Australian public posts back to 2007 and will offer an opt-out only "if governments force it to." WhatsApp 1:1 and group messages remain end-to-end encrypted and are not used for training unless the user explicitly invokes @Meta AI in chat. On December 16 2025, Meta extended Meta AI conversations into ad personalization in addition to model training (US and most non-EU regions; EU, UK, South Korea carved out). Retention is indefinite unless the user manually deletes; Ray-Ban Meta voice recordings are retained up to 1 year by default. No on-device inference for the Meta AI assistant β€” WhatsApp "Private Processing" is server-side TEE confidential compute, not local.

Primary source: Meta Privacy Center β€” How Meta uses information for generative AI

Does Grok train on my data?

Consumer: yes by default β€” three separate opt-out paths. API: no, by default.

Consumer Grok β€” on X (Free, Premium $16/mo, Premium+ $40/mo), grok.com, and the Grok mobile app β€” is opt-in by default for training on both X posts and Grok prompts / outputs. The opt-out lives in three separate places: (1) Grok on X: Settings β†’ Privacy and safety β†’ Data sharing and personalization β†’ Grok & xAI β†’ uncheck "Allow your posts as well as your interactions, inputs, and results with Grok and xAI to be used for training and fine-tuning." (2) Grok mobile app: Settings β†’ Data Controls β†’ uncheck "Improve the model." (3) Grok web at grok.com: Settings β†’ Data β†’ uncheck "Improve the model." EU / EEA users were excluded from X-post training under the September 4 2024 DPC undertaking; the DPC opened a fresh statutory inquiry on April 11 2025. The January 15 2026 X Terms of Service update explicitly classifies Grok prompts and outputs as user "Content" available for AI training. Grok chats are retained indefinitely unless the user deletes them. The xAI API has a materially different posture: "xAI never trains on your API inputs or outputs without your explicit permission," matching the OpenAI and Anthropic API defaults. API retention is 30 days for abuse monitoring; Zero Data Retention is available for enterprise customers via sales@x.ai and on Oracle Cloud Infrastructure since June 17 2025. Track record: "MechaHitler" antisemitic-content failure (July 2025) drew EU Commission and Turkey enforcement; ~370,000 Grok shared chats were indexed on Google in August 2025 because the share feature generated unauthenticated URLs without `noindex`.

Primary source: xAI Privacy Policy

AI Coding Tools

Does Cursor train on my code?

Yes, by default for individuals β€” Privacy Mode opt-out.

For individual accounts, Cursor's "Share Data" mode is enabled by default, sending code, prompts, editor actions, and code snippets to Cursor for model improvement. Toggling Privacy Mode ON prevents training and discards plaintext after each request β€” cached files are encrypted with client-generated keys, with the encryption keys existing on Cursor's servers only for the duration of each request. Team and Enterprise accounts default to Privacy Mode ON, with zero-data-retention agreements with OpenAI, Anthropic, Google, xAI, Fireworks, Baseten, and Together. The strictest tier, Privacy Mode (Legacy), guarantees no code is stored at all, by Cursor or any third party. Cursor can also be configured to use local Ollama or LM Studio models, which bypass Privacy Mode entirely.

Primary source: Cursor data use

Does GitHub Copilot train on my code?

Yes, by default for consumer plans β€” opt-out as of April 24, 2026.

On April 24, 2026, GitHub began using Free, Pro, and Pro+ user interaction data β€” including code snippets β€” to train AI models by default. Existing opt-outs are honored. To disable training going forward, go to Settings β†’ Privacy. User Engagement Data is retained for 2 years; Coding Agent session logs persist for the lifetime of the account. Private repository code at rest is NOT used for training, but in-flight interaction data IS. Business and Enterprise plans are explicitly prohibited from being used for training under GitHub's agreements: subscription Prompts and Suggestions are retained 28 days, and User Engagement Data 2 years.

Primary source: April 2026 policy change

Does Windsurf (Codeium) train on my code?

Yes for individuals by default β€” Zero Data Retention opt-in available.

Windsurf (formerly Codeium) trains on individual user code by default β€” without zero-data-retention enabled, logs may contain code snippets and user trajectories. Individuals can toggle ZDR on in their profile to prevent training; with ZDR on, "the code data submitted by zero-data retention mode users will never be trained on," code is never serialized in plaintext on Windsurf's servers, and is held only in-memory for the request lifetime (plus minutes-to-hours for prompt caching). Teams and Enterprise plans default to ZDR ON. The Enterprise Self-hosted tier deploys via Docker Compose or Helm Charts inside the customer's firewall β€” no traffic leaves customer infrastructure.

Primary source: Windsurf security

Does Cline train on my code?

No β€” Cline operates no model server. Privacy depends on your chosen API provider.

Cline is an open-source VS Code extension that operates no model server of its own. User code is sent only to whichever API provider you configure (Anthropic, OpenAI, AWS Bedrock, Google Gemini, Cerebras, Groq, etc.) and is governed by that provider's terms. Cline's stated principle: "Code never leaves your machine" toward Cline servers. Anonymous telemetry (features used, task completion rates) is collected but can be disabled via the Cline Telemetry setting. Code, file contents, command arguments, and conversation content are explicitly NOT collected by telemetry. For fully on-device use, configure Cline with a local Ollama or LM Studio model.

Primary source: Cline telemetry docs

Does Claude Code train on my code?

Pro/Max account: yes by default β€” opt-out. API / Console / Enterprise: no, contractually excluded.

Claude Code is Anthropic's CLI for Claude β€” separate from Claude.ai consumer chat and from Claude for Work. The data-handling posture depends entirely on how the user is billed. A developer running Claude Code from a Claude.ai Pro or Max account inherits the August 28 2025 consumer-terms update, which made training default-on with user-choice opt-out. Anthropic stated explicitly: "These updates apply to users on our Claude Free, Pro, and Max plans... including when they use Claude Code from accounts associated with those plans." Opt-out lives at claude.ai/settings/data-privacy-controls. Retention with training enabled is 5 years; with training disabled, 30 days. A developer running Claude Code via an API key, Anthropic Console, Claude for Enterprise, AWS Bedrock, Google Vertex AI, or Microsoft Foundry is on Anthropic's Commercial Terms Section B: "Anthropic may not train models on Customer Content from Services." Reinforced in the Claude Code data-usage docs: "Anthropic does not train generative models using code or prompts sent to Claude Code under commercial terms." Bedrock and Vertex users cannot enroll in the Development Partner Program at all. Zero Data Retention is available specifically for Claude Code on Claude for Enterprise per the docs, enabled per-organization by the account team. Architecturally, Claude Code is partial on-device: the binary runs locally (file edits, shell execution, MCP servers, local session log), but inference is cloud-only via the Anthropic API.

Primary source: Claude Code data usage

Voice & Dictation

Does Voibe train on my voice data?

No β€” audio never leaves the device.

Voibe processes audio entirely on your Mac using OpenAI Whisper models running on Apple Silicon's Neural Engine. Per Voibe's privacy policy: "The Voibe application processes your voice entirely on your device. No audio is transmitted to our servers at any point" and "Your dictated content never leaves your Mac and we have no access to it." Because audio never crosses the network, there is no training to opt out of. Account holders provide an email (for authentication) and non-identifying usage analytics; crash reports exclude dictated content. The Free plan does not require an account at all.

Primary source: Voibe privacy policy

Does Wispr Flow train on my voice data?

Off by default since 2024 backlash β€” opt-in for training.

After 2024 community backlash, Wispr Flow shifted training to opt-in. Privacy Mode is OFF by default for Free users, meaning audio, transcripts, edits, and optional Context Awareness (screenshots of the active app's screen) are retained indefinitely. Data passed to third-party LLM providers (OpenAI, Meta) is retained for 30 days. Enterprise plans default to Privacy Mode ON with zero data retention by Wispr or any third party β€” audio is processed and immediately discarded after transcription. A Business Associate Agreement is available for Enterprise; once signed, Privacy Mode locks irreversibly. Transcription always happens in the cloud; even Privacy Mode is "zero-retention cloud," not local processing.

Primary source: Wispr Flow privacy policy

Deep dive: Is Wispr Flow Safe? β€” full investigation

Does Superwhisper train on my voice data?

No β€” verbatim from policy.

Superwhisper's privacy policy states explicitly: "Your data is not retained on Superwhisper servers" and "not used for training AI models or any other machine learning purposes." On-device modes (Fast, Nano, Standard Whisper, Parakeet β€” available on the Free plan and within Pro) process audio entirely locally; nothing is transmitted. Cloud modes (Ultra transcription, Super Mode LLMs β€” Pro tier) proxy audio through Superwhisper's infrastructure with no retention. One caveat: audio recordings are saved to local disk by default. Opt out in settings if local audio retention is a concern. Note: the privacy policy does not currently distinguish between on-device and cloud modes β€” verify cloud-mode specifics with the vendor before sensitive use.

Primary source: Superwhisper privacy

Deep dive: Is Superwhisper Safe? β€” full investigation

Does MacWhisper train on my voice data?

No β€” primarily on-device, with optional cloud + BYOK paths.

MacWhisper does not train its own models on user audio. The on-device transcription path uses local Whisper models that you can download for offline use; Apple Foundation Models also run on-device for AI features. MacWhisper's optional "Assistant" cloud transcription service and BYOK integrations (OpenAI Whisper API, ElevenLabs) inherit those providers' terms when used. The App Store version's privacy disclosure shows only "Usage Data" and "Product Interaction" as Data Not Linked to You. There is no separate enterprise tier; the data-handling architecture is identical for individuals and bulk-licensing customers.

Primary source: App Store listing

Does Aqua Voice train on my voice data?

Privacy policy does not explicitly address training β€” opt-out via Privacy Mode.

Aqua Voice's privacy policy does not explicitly state whether stored data is used for AI training. With Privacy Mode disabled, "we may securely store transcript data on our servers"; with Privacy Mode enabled, "transcript data is not collected" though session metadata (timestamps, device type, performance metrics) may still be. Aqua Voice is SOC 2 Type II certified by Advantage Partners. Teams and Enterprise plans support an org-wide Privacy Mode that applies the same protections across an entire organization. No HIPAA Business Associate Agreement is publicly advertised. Audio is cloud-processed; there is no on-device option.

Primary source: Aqua Voice privacy policy

Deep dive: Is Aqua Voice Safe? β€” full investigation

Does Typeless train on my voice data?

No, per the published privacy policy β€” but verify the architecture.

Typeless's privacy policy states: "Your data is never used to train these services and is configured for zero retention by the providers." Audio plus contextual information is "processed in real time on our cloud servers and immediately discarded once the result is returned to your device." Free and Pro tiers receive the same data-handling treatment. However, a November 2025 reverse-engineering analysis (covered in our Typeless privacy issues investigation) reported collection beyond what the published policy describes β€” including URL capture, window-title metadata via the macOS accessibility API, and broad permission requests. Verify the current subprocessor list at trust.typeless.com/subprocessors before relying on Typeless for sensitive content.

Primary source: Typeless privacy policy

Deep dive: Typeless Privacy Issues β€” full investigation

Does Apple Dictation train on my voice data?

Only if you opt in via "Improve Siri & Dictation."

Apple Dictation only uses your audio to improve its models if you have explicitly enabled "Improve Siri & Dictation" β€” the default at setup is to be asked. If opted in, audio and transcripts are retained under a rotating random ID for up to 6 months, then dissociated and kept for up to 2 years for improvement; a reviewed subset is retained beyond 2 years. If opted out, recordings are not retained for improvement. On Apple Silicon Macs running modern macOS or iOS, most languages process locally for general text fields (Notes, Mail, Messages). Server-side fallback applies to unsupported languages, search-box dictation, and some third-party Speech Recognition API uses. Apple does not sign a Business Associate Agreement for consumer Dictation, so it is not HIPAA-compliant.

Primary source: Ask Siri & Dictation policy

Does Willow Voice train on my voice data?

No by default β€” Private Mode is the default; opt-in required for training.

Willow Voice operates with Private Mode as the default for all users. The privacy policy states verbatim: "In private mode, Willow only collects basic technical and account-related data needed to run the app and nothing else. No voice, no dictated text." Training only occurs if the user explicitly opts into Opt-In Mode: "You can allow Willow to collect minimal usage data to help improve our text correction models." This is the inverse of Wispr Flow's pre-2024 posture, which had Privacy Mode off by default until community backlash forced the switch. With Opt-In Mode enabled, anonymized text and usage data are retained "only as long as needed to train and improve the app" β€” no specific retention window is published. Architecturally, Willow Voice is cloud-first; the privacy policy's reference to local storage applies to the UI display of past transcripts, not to local transcription. The current willowvoice.com pricing page does not advertise an offline transcription mode. The Enterprise tier is marketed as zero data retention with HIPAA and SOC 2 compliance and BAA availability, but the privacy policy itself only references GDPR and SOC 2 β€” request signed BAA terms in writing before processing PHI. Policy last updated April 30 2025.

Primary source: Willow Voice privacy policy

Does Monologue train on my voice data?

Unknown β€” the privacy policy does not address training either way.

Monologue's published data-privacy page makes a narrow set of claims: "No audio files or transcripts are saved on our servers," "Deep context screenshots are deleted immediately," "Zero LLM data retention," and "Custom modes and dictionaries stay on your device." The page does not explicitly state whether de-identified or aggregated data is used to improve models. Peer cloud dictation products β€” Wispr Flow, Typeless, Superwhisper, Willow Voice β€” all explicitly address the training question one way or the other. Monologue's silence is the load-bearing gap. The Notion-hosted privacy policy at modern-ton-234.notion.site is similarly sparse β€” no list of data categories collected, no subprocessor disclosures, no retention windows beyond the broad "not saved" claim, no SOC 2 / HIPAA / BAA mention. Offline transcription is listed as a feature on both the Free and Pro tiers of monologue.to, but the architecture (cloud-default with offline fallback vs on-device-default) is not documented. For regulated work, request explicit training and subprocessor disclosure in writing before use.

Primary source: Monologue data privacy

Does VoiceInk train on my voice data?

No β€” 100% on-device. Source code is auditable on GitHub.

VoiceInk is an open-source Mac dictation app under GPL v3.0. The README states verbatim: "100% offline processing ensures your data never leaves your device." The marketing site at tryvoiceink.com describes it as a "privacy-focused dictation app for macOS with local transcription." Transcription runs locally on Apple Silicon via whisper.cpp; there is no VoiceInk-operated server in the inference path, so there is no retention surface and no training path. No telemetry, BYOK cloud feature, or AI-command post-processing service is documented at this verification pass. The source code is auditable on GitHub at Beingpax/VoiceInk with 4,900+ stars, 675+ forks, and 119 releases (latest v1.76 on May 7 2026). Pricing is $39.99 one-time for the Pro license, or free if built from source under GPL v3. Because the codebase is open, the privacy claim is verifiable β€” the same reason Ollama, Jan, and LM Studio score highly in our Local LLM category.

Primary source: VoiceInk (tryvoiceink.com)

Deep dive: VoiceInk Review β€” full investigation

Meeting Transcription

Does Otter.ai train on my voice data?

Yes, by default on de-identified data β€” opt-out in account settings.

Otter trains automatically on de-identified user data. Recordings and transcripts are not manually reviewed by humans unless the customer gives explicit consent for support troubleshooting; training data is encrypted. Opt-out exists in account settings. Brewer v. Otter.ai was filed in the Northern District of California in August 2025 β€” a class action alleging Otter "deceptively and surreptitiously" recorded private conversations and used meeting data to train AI models without explicit permission from all participants, alleging violations of the Electronic Communications Privacy Act, Computer Fraud and Abuse Act, and California Invasion of Privacy Act. The case is ongoing. The visible "Otter.ai" bot creates consent issues in the 12 US states with two-party consent laws. Conversations are stored until manually deleted; trash holds for 30 days then auto-purges. Otter holds SOC 2 Type 2.

Primary source: Otter privacy & security

Does Fireflies.ai train on my voice data?

Yes, by default for consumer plans β€” opt-out in admin settings.

Fireflies trains on user transcripts in the default consumer configuration. Opt-out exists in admin settings; Enterprise plans differ contractually. A December 2025 class action under Illinois' Biometric Information Privacy Act (BIPA) alleges Fireflies collected voiceprint biometric data without written consent β€” the case is ongoing. The visible "Fireflies.ai Notetaker" bot creates consent complications in two-party consent jurisdictions. Audio, video recordings, transcripts, speaker identification, voiceprint data, and meeting metadata are stored until manual deletion. Data residency options are available for international compliance. Fireflies is GDPR compliant.

Primary source: Fireflies privacy policy

Does Granola train on my meeting data?

Yes, by default for consumer β€” Enterprise plans have training off by default.

Granola uses de-identified user data to train internal AI models supporting its services by default. The opt-out is in account settings and is not surfaced prominently. Enterprise users have training off by default. Third-party LLM providers (OpenAI, Anthropic) are contractually prohibited from training on Granola data. On April 2 2026, The Verge reported that Granola makes meeting notes accessible to anyone with the share link by default and opts users into internal AI training in default settings, contradicting the app's "private by default" marketing. Architecturally, audio is transcribed in real time on macOS/Windows and not stored, but audio is sent to cloud transcription providers (Deepgram, AssemblyAI) and summarization uses cloud LLMs (OpenAI, Anthropic). Transcripts and notes are stored in a US-hosted AWS Virtual Private Cloud, encrypted at rest and in transit, backed up daily. iOS uses temporarily cached audio.

Primary source: Granola privacy policy

Local LLM Runtimes

Does Ollama train on my data?

No β€” Ollama operates no model server.

Ollama is a local runtime: there is no Ollama-operated model server to send prompts or completions to. Models run entirely on local hardware. Ollama exposes an OpenAI-compatible local REST API on localhost:11434; nothing leaves the user's machine in default operation. There is no vendor server, so there is no retention surface and no training path. Ollama is open-source under the MIT license with over 100,000 GitHub stars; the codebase is transparent and there are no documented incidents. Caveat: users who deliberately expose the local API beyond localhost create their own attack surface β€” that is a deployment choice, not an Ollama default.

Primary source: Ollama GitHub repo

Does LM Studio train on my data?

No β€” inference runs locally; no chat data leaves the device.

LM Studio runs models locally on the user's hardware with an optional OpenAI-compatible local server. Inference happens entirely on-device; there are no LM Studio servers in the inference path. The optional LM Link feature uses end-to-end encrypted Tailscale mesh VPNs to connect a user's own devices for remote inference β€” chats remain local even when accessed from another linked device. LM Link shares only device-list metadata (not chats) with LM Studio's backend for device discovery. The application is closed-source but transparent about data flows on the LM Link product page; no documented incidents.

Primary source: LM Studio

Does Jan train on my data?

No β€” Jan operates no model server.

Jan is an open-source local LLM runtime under Apache 2.0. Models run locally; there is no Jan-operated model server, so there is nothing to train on and no retention surface. Optional cloud-provider integrations are explicitly opt-in β€” when enabled, the chosen provider's terms apply for that specific feature. In default operation, nothing leaves the device. The codebase is auditable on GitHub with an active community roadmap; no documented incidents.

Primary source: Jan

Frequently Asked Questions

What does "on-device" actually mean?
On-device means a tool can complete its core workflow without sending your input to a vendor's servers. For dictation, that means audio is captured, transcribed, and discarded entirely on your computer β€” nothing leaves the machine. Most AI assistants and coding tools are not on-device by default: they transmit your prompts and code to a cloud model, even if the vendor doesn't retain or train on it. "Partially on-device" means parts of the workflow are local but specific cases (unsupported languages, agentic operations, large models) fall back to the cloud. Apple Dictation and Cline (when paired with a local Ollama or LM Studio model) are examples of partial on-device.
Why does the same tool show different answers for Free vs Business?
Consumer and business tiers operate under separate contracts. Most major AI vendors train on consumer data (or did until very recently) and explicitly exclude business / API / enterprise data from training under their commercial agreements. The two tiers can use the same underlying model but with different data-handling guarantees. Conflating the two is the most common error in third-party comparison articles. The tier filter at the top of each table separates them so you can answer either question independently.
Can a tool "unlearn" my data after training?
Practically, no. Once a model has been trained on a piece of data, the parameters reflect that training and cannot be cleanly reverted on a per-record basis. Vendors offering deletion typically delete the conversation record but cannot remove its influence on the model that has already absorbed it. This is why the relevant question is "will it be used for training in the first place," not "can I delete it later." Pages like this one focus on the training question because the deletion question rarely changes the outcome for already-trained models.
How often is this updated?
Each row carries a Last verified date. We re-check every cell against its primary source on a roughly monthly cadence, plus immediately whenever a vendor announces a policy change. The Recent Changes timeline at the top of the page lists every dated change we have logged. If you find an outdated cell or a missing change, email hi@getvoibe.com β€” we'll update and credit the report.
How is each tool scored?
Every tier carries four independent scores on a 0–25 scale that sum to a 0–100 composite: Training (does the vendor train on your data by default), Retention (how briefly is data kept), On-device (can the workflow run without sending data to the vendor), and Track record (the vendor's documented incidents, breaches, and unfavorable policy changes). We separate the axes because a tool with a strong contract but a weak track record isn't strictly better or worse than the inverse β€” but we also publish the composite total because most readers ultimately want a single answer. Per-axis buckets: 22–25 architectural or contractual guarantee, 17–21 default-off or short retention, 11–16 user must take action or policy has caveats, 6–10 unfavorable default with opt-out, 1–5 no opt-out or indefinite retention, 0 the policy does not address that dimension. Composite buckets: 85+ Excellent, 70–84 Strong, 55–69 Adequate, 40–54 Weak, below 40 Poor.
What score should I look for?
The right floor depends on the work. For regulated industries (healthcare, legal, financial) or anything that triggers regulator notification on leak, look for 85+ in the business tier and verify the vendor signs a BAA or DPA. For sensitive but unregulated business work β€” proprietary code, internal docs, M&A drafts β€” 70+ is the floor; below that, you are relying on opt-out toggles that team members may not have flipped. For day-to-day drafting, research, and light coding on non-secret content, 55+ is fine. For personal low-stakes use (notes to self, brainstorming) any score works as long as you understand what the tool retains. The Use case fit section on this page lists the tools that meet each threshold.
How do I report an error or a missing tool?
Email hi@getvoibe.com with the tool name, the cell you think is wrong, and a primary-source link. We'll verify and update on the next pass. Tool requests are welcome, but to be added we need a vendor-published privacy or data-handling page that we can cite β€” marketing claims aren't enough.

This tracker is maintained by the team at Voibe. We built it because privacy is the central design constraint of our product, and we kept being asked these questions. Voibe is one of the tools listed β€” the methodology is the same for every row.

Related reading on this site: Is Wispr Flow safe? Β· Typeless privacy issues Β· Apple Dictation privacy Β· Voice data privacy Β· Cloud vs local dictation Β· HIPAA dictation.