Clean, analyze, and optimize AI text
TextPurify removes hidden Unicode artifacts, scores readability and SEO structure, analyzes sentiment, and checks semantic relevance — entirely in your browser.
AI-generated content often contains Unicode artifacts: invisible characters, non-standard hyphens, and smart quotes.
Everything you need in one place
Six specialized tools built for teams that work with AI-generated content every day.
Cleaner
Detect and remove zero-width characters, homoglyphs, non-standard spaces, and typographic artifacts from any text.
Inspector
Explore every character in your text with code point details, Unicode category, and script origin.
SEO & Health
Readability scores (Flesch-Kincaid / LIX), keyword density, water-word analysis, passage structure, and a composite Health Score.
Bulk Upload
Process dozens of files at once. Upload .txt or .md files and get a clean version of each with an artifact report.
API & Keys
Integrate Unicode cleaning into your pipeline. Generate API keys, call the REST endpoint, and process text programmatically.
History
Every text you process is saved locally on your device. Restore any previous version with one click — no account required.
Deeper analysis, in your browser
On-device ML models run locally — your text never leaves your machine.
Health Score
A single 0–100 composite score combining artifact cleanliness, readability, water density, and passage structure. Know at a glance whether your text is publish-ready.
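One simple way to combine four sub-scores into a single 0-100 number is a weighted average. The sketch below is illustrative only: the `SubScores` shape, the `healthScore` function, and the equal weights are assumptions, not TextPurify's actual formula.

```typescript
// Hypothetical sub-scores, each expected on a 0-100 scale.
interface SubScores {
  artifacts: number;   // artifact cleanliness
  readability: number; // readability
  water: number;       // water density (higher = less filler)
  structure: number;   // passage structure
}

// Assumed equal weights; a real product may weight these differently.
const EQUAL_WEIGHTS: SubScores = {
  artifacts: 0.25,
  readability: 0.25,
  water: 0.25,
  structure: 0.25,
};

// Weighted average of the clamped sub-scores, rounded to an integer.
function healthScore(s: SubScores, w: SubScores = EQUAL_WEIGHTS): number {
  const clamp = (v: number): number => Math.min(100, Math.max(0, v));
  const totalWeight = w.artifacts + w.readability + w.water + w.structure;
  if (totalWeight <= 0) return 0;
  const weighted =
    clamp(s.artifacts) * w.artifacts +
    clamp(s.readability) * w.readability +
    clamp(s.water) * w.water +
    clamp(s.structure) * w.structure;
  return Math.round(weighted / totalWeight);
}
```

Dividing by the total weight keeps the result on the 0-100 scale even if the weights do not sum to 1.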
Sentiment Analysis
On-device BERT model classifies your text as Positive, Negative, or Neutral with a confidence score. Runs in your browser — no text is sent to a server.
Semantic Relevance
Enter a topic or query and see how semantically relevant your text is. Sentence embeddings find the most and least relevant paragraphs.
Passage & AEO Score
Measures heading structure, question headings, answer-first patterns, and paragraph length — the factors that determine whether your content gets surfaced by AI overviews.
Four categories of Unicode artifacts.
Zero-Width Characters
Invisible characters with no width — U+200B, U+FEFF, U+200C and more. AI tools insert them silently and they can break search, clipboard copy, and text processing.
Non-Standard Spaces
Unicode has more than a dozen space characters beyond U+0020. Non-breaking spaces (U+00A0), em spaces (U+2003), and thin spaces (U+2009) look like ordinary spaces but behave differently.
Typographic Substitutions
Smart quotes (“”), em dashes (—), and ellipsis (…) replace their ASCII equivalents in AI output. These break code, CSV parsing, and plain-text pipelines.
Homoglyphs
Cyrillic letters like а (U+0430) look identical to Latin a but are different code points. Used for obfuscation, SEO spam, or introduced accidentally by multilingual AI models.
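The four categories above can be handled with one pass of regex replacements per category. This is a minimal sketch, not TextPurify's engine: `cleanText`, the character lists, and the ASCII mappings are assumptions chosen for illustration, and the homoglyph table covers only a handful of Cyrillic look-alikes.

```typescript
// Zero-width / invisible characters: removed outright.
const ZERO_WIDTH_RE = /[\u200B\u200C\u200D\u2060\uFEFF]/g;

// Non-standard spaces: normalized to a plain U+0020 space.
const ODD_SPACE_RE = /[\u00A0\u2000-\u200A\u202F\u205F\u3000]/g;

// Typographic substitutions mapped back to ASCII (illustrative subset).
const TYPOGRAPHIC: Record<string, string> = {
  "\u201C": '"', "\u201D": '"', // curly double quotes
  "\u2018": "'", "\u2019": "'", // curly single quotes
  "\u2013": "-", "\u2014": "-", // en dash, em dash
  "\u2026": "...",              // horizontal ellipsis
};
const TYPOGRAPHIC_RE = /[\u201C\u201D\u2018\u2019\u2013\u2014\u2026]/g;

// A tiny sample of Cyrillic letters that resemble Latin ones.
const HOMOGLYPHS: Record<string, string> = {
  "\u0430": "a", "\u0435": "e", "\u043E": "o",
  "\u0440": "p", "\u0441": "c", "\u0445": "x",
};
const HOMOGLYPH_RE = /[\u0430\u0435\u043E\u0440\u0441\u0445]/g;

interface CleanResult {
  text: string;
  counts: { zeroWidth: number; spaces: number; typographic: number; homoglyphs: number };
}

// Cleans the input and reports how many artifacts of each kind were found.
function cleanText(input: string): CleanResult {
  const counts = { zeroWidth: 0, spaces: 0, typographic: 0, homoglyphs: 0 };
  const text = input
    .replace(ZERO_WIDTH_RE, () => { counts.zeroWidth++; return ""; })
    .replace(ODD_SPACE_RE, () => { counts.spaces++; return " "; })
    .replace(TYPOGRAPHIC_RE, (ch) => { counts.typographic++; return TYPOGRAPHIC[ch] ?? ch; })
    .replace(HOMOGLYPH_RE, (ch) => { counts.homoglyphs++; return HOMOGLYPHS[ch] ?? ch; });
  return { text, counts };
}
```

Two caveats apply to this sketch. U+200C and U+200D are legitimate in some scripts and in emoji sequences, and a blanket homoglyph replacement would corrupt genuine Cyrillic text, so a production cleaner would likely flag only mixed-script words rather than replace every match.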
Not an AI detector
TextPurify does not try to guess whether text was written by AI. It detects specific Unicode artifacts that AI tools happen to leave behind. Works on any text, from any source.
Your text never leaves your browser
Detection and cleaning run entirely in your browser using JavaScript. Nothing is sent to a server for analysis. Your content stays private.
Learn about text quality
Deep dives into Unicode, readability, AEO, and why AI text is full of hidden characters.
Why AI-Generated Text Contains Hidden Characters
Readability Scores Explained: Flesch-Kincaid, LIX, and Why They Matter
Passage Optimization and AEO: Writing for AI Overviews
Your text never leaves your browser
Every computation — from Unicode scanning to sentiment analysis — runs locally. Open DevTools and watch the network tab: no text is ever transmitted.
Unicode detection & cleaning
Pure TypeScript, runs synchronously in the main thread.
The entire artifact detection engine is a ~8 KB JS module. It scans every character against Unicode category tables without any network calls. Results appear in under 5 ms for texts up to 100,000 characters.
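As a rough illustration of scanning characters against Unicode categories without any network call, the sketch below uses JavaScript's built-in Unicode property escapes instead of hand-built tables. The `scanArtifacts` function and the `Finding` shape are hypothetical names, and only two categories (Cf and Zs) are checked.

```typescript
interface Finding {
  index: number;     // UTF-16 offset of the character in the input
  codePoint: string; // formatted as U+XXXX
  kind: string;      // which Unicode general category matched
}

const FORMAT_RE = /\p{Cf}/u; // format characters, e.g. U+200B, U+FEFF
const SPACE_RE = /\p{Zs}/u;  // space separators, e.g. U+00A0, U+2003

// Walks the text one code point at a time and records suspicious characters.
function scanArtifacts(text: string): Finding[] {
  const findings: Finding[] = [];
  let index = 0;
  for (const ch of text) {
    const cp = ch.codePointAt(0) ?? 0;
    let kind: string | undefined;
    if (FORMAT_RE.test(ch)) {
      kind = "format (Cf)";
    } else if (SPACE_RE.test(ch) && ch !== " ") {
      kind = "space separator (Zs)";
    }
    if (kind !== undefined) {
      const hex = cp.toString(16).toUpperCase().padStart(4, "0");
      findings.push({ index, codePoint: "U+" + hex, kind });
    }
    index += ch.length; // astral characters occupy two UTF-16 units
  }
  return findings;
}
```

For example, `scanArtifacts("a\u200Bb\u00A0c")` reports U+200B at offset 1 and U+00A0 at offset 3.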
Readability & SEO metrics
Flesch-Kincaid, LIX, Gunning Fog, passage analysis — all computed locally.
All scoring formulas are implemented in plain TypeScript. No external API is called. The keyword and water-word analysis runs against static dictionaries bundled with the app.
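LIX is a good example of a score that is easy to compute locally: the average sentence length plus the percentage of long words (more than six letters). The sketch below is an assumption-laden illustration, not the app's code: the `lix` function name, the word tokenizer, and the rule that `.`, `!`, and `?` end a sentence are all simplifications.

```typescript
// LIX = (words / sentences) + 100 * (longWords / words)
// where a long word has more than six letters.
function lix(text: string): number {
  // Words: runs of letters or digits in any script.
  const words = text.match(/[\p{L}\p{N}]+/gu) ?? [];
  if (words.length === 0) return 0;

  // Sentences: non-empty chunks between runs of ., ! or ?
  const sentences = text
    .split(/[.!?]+/)
    .filter((s) => s.trim().length > 0);
  const sentenceCount = Math.max(1, sentences.length);

  // Count code points, not UTF-16 units, when measuring word length.
  const longWords = words.filter((w) => Array.from(w).length > 6).length;

  return words.length / sentenceCount + (100 * longWords) / words.length;
}
```

For "The cat sat. Unicode artifacts complicate processing." there are 7 words, 2 sentences, and 4 long words, so the score is 7/2 + 400/7, or about 60.6. Flesch-Kincaid follows the same pattern but also needs a syllable counter, which is language-specific and heuristic.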
Sentiment analysis
DistilBERT model running via WebAssembly (ONNX Runtime Web).
The ~67 MB model weights are downloaded once from Hugging Face CDN and cached in your browser's IndexedDB. After the first load, inference is fully offline. Your text is passed only to the local WASM runtime — never to a remote endpoint.
Semantic relevance
all-MiniLM-L6-v2 sentence embeddings, computed in a Web Worker.
A ~23 MB embedding model runs in a background Web Worker so it never blocks the UI. Cosine similarity between your query and each paragraph is computed in-memory. The Worker is terminated when you leave the tool.
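The ranking step is plain vector math once the embeddings exist. The sketch below assumes the embeddings have already been produced by the model and are available as `number[]` arrays; `cosineSimilarity` and `rankParagraphs` are hypothetical helper names, and the model call itself is not shown.

```typescript
// Cosine similarity: dot(a, b) / (|a| * |b|). Returns 0 for zero vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same dimension");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

interface RankedParagraph {
  index: number; // position of the paragraph in the original text
  score: number; // cosine similarity to the query, in [-1, 1]
}

// Scores every paragraph embedding against the query, most relevant first.
function rankParagraphs(query: number[], paragraphs: number[][]): RankedParagraph[] {
  return paragraphs
    .map((vec, index) => ({ index, score: cosineSimilarity(query, vec) }))
    .sort((x, y) => y.score - x.score);
}
```

The first and last entries of the returned array correspond to the most and least relevant paragraphs.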
Want the technical details?
Read: Why we process everything in your browser →
Three steps. Zero complexity.
Paste your text
Drop in any AI-generated or AI-assisted text. Blog posts, emails, reports — anything.
Detect artifacts
TextPurify scans for patterns across four Unicode artifact categories, flagging every issue instantly.
Get clean text
Copy your cleaned text with one click. Same words, zero hidden characters, no more surprises.
Stop the Unicode aftertaste.
Free tier: up to 5,000 characters without an account. Sign up free for 10,000.
Open TextPurify