Now in public beta

Clean, analyze and optimize AI text

Name: TextPurify
Author: TextPurify

TextPurify removes hidden Unicode artifacts, scores readability and SEO structure, analyzes sentiment, and checks semantic relevance — entirely in your browser.

Try It Free, No Sign-Up | See Pricing

// before → after

INPUT

It's important to note that AI-generated content contains various Unicode artifacts -- invisible characters, non-standard hyphens, and smart quotes.

Detected:

zero-width space, smart quotes x2, non-breaking hyphens x2, em dash

OUTPUT

It's important to note that AI-generated content contains various Unicode artifacts -- invisible characters, non-standard hyphens, and smart quotes.

5 artifacts removed, 0 invisible chars, 100% text preserved

// TOOLS

Everything you need in one place

Six specialized tools built for teams that work with AI-generated content every day.

Cleaner

Detect and remove zero-width characters, homoglyphs, non-standard spaces, and typographic artifacts from any text.

Inspector

Explore every character in your text with code point details, Unicode category, and script origin.
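
As a rough sketch of what a character inspector can report, the TypeScript below lists each code point with a coarse general category. The `CharInfo` type, the `inspect` function, and the category labels are hypothetical names chosen for illustration, not TextPurify's actual API, and script detection is left out for brevity.

```typescript
// Hypothetical shape for one inspected character (an assumption for this sketch).
interface CharInfo {
  char: string;
  codePoint: string; // formatted like "U+200B"
  category: string; // coarse Unicode general category
}

// Coarse general-category checks built on Unicode property escapes.
const CATEGORY_TESTS: Array<[string, RegExp]> = [
  ["Letter", /^\p{L}$/u],
  ["Mark", /^\p{M}$/u],
  ["Number", /^\p{N}$/u],
  ["Punctuation", /^\p{P}$/u],
  ["Symbol", /^\p{S}$/u],
  ["Separator", /^\p{Z}$/u],
  ["Other (control/format)", /^\p{C}$/u],
];

function inspect(text: string): CharInfo[] {
  // Array.from iterates by code point, so surrogate pairs stay together.
  return Array.from(text).map((char) => {
    const cp = char.codePointAt(0) ?? 0;
    const hex = cp.toString(16).toUpperCase().padStart(4, "0");
    const match = CATEGORY_TESTS.find(([, re]) => re.test(char));
    return {
      char,
      codePoint: `U+${hex}`,
      category: match ? match[0] : "Unknown",
    };
  });
}
```

Calling `inspect("a\u200B")` would report the letter `a` as U+0061 and the zero-width space as U+200B in the control/format group.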

SEO & Health

Readability scores (Flesch-Kincaid / LIX), keyword density, water-word (filler) analysis, passage structure, and a composite Health Score.

Bulk Upload

Process dozens of files at once. Upload .txt or .md files and get a clean version of each with an artifact report.

API & Keys

Integrate Unicode cleaning into your pipeline. Generate API keys, call the REST endpoint, and process text programmatically.

History

Every text you process is saved locally on your device. Restore any previous version with one click — no account required.

Open the app

// AI INTELLIGENCE

Deeper analysis, in your browser

On-device ML models run locally — your text never leaves your machine.

Health Score

A single 0–100 composite score combining artifact cleanliness, readability, water density, and passage structure. Know at a glance whether your text is publish-ready.
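
One way to picture a composite like this is a weighted mean of four sub-scores. The sketch below is only an illustration: the `HealthInputs` shape, the equal weights, and the `healthScore` function are assumptions, not the formula TextPurify uses.

```typescript
// Each sub-score is assumed to be on a 0..100 scale where higher is better.
interface HealthInputs {
  artifacts: number; // artifact cleanliness
  readability: number;
  water: number; // low filler density maps to a high score
  structure: number; // passage structure
}

// Equal weights are a placeholder assumption; they sum to 1.
const WEIGHTS: HealthInputs = {
  artifacts: 0.25,
  readability: 0.25,
  water: 0.25,
  structure: 0.25,
};

function healthScore(s: HealthInputs): number {
  // Clamp each input so one out-of-range value cannot skew the result.
  const clamp = (v: number): number => Math.min(100, Math.max(0, v));
  const total =
    clamp(s.artifacts) * WEIGHTS.artifacts +
    clamp(s.readability) * WEIGHTS.readability +
    clamp(s.water) * WEIGHTS.water +
    clamp(s.structure) * WEIGHTS.structure;
  return Math.round(total);
}
```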

Sentiment Analysis

An on-device DistilBERT model classifies your text as Positive, Negative, or Neutral with a confidence score. It runs in your browser, and no text is sent to a server.

Semantic Relevance

Enter a topic or query and see how semantically relevant your text is. Sentence embeddings find the most and least relevant paragraphs.

Passage & AEO Score

Measures heading structure, question headings, answer-first patterns, and paragraph length — the factors that determine whether your content gets surfaced by AI overviews.

// what we catch

Four categories of Unicode artifacts.

Zero-Width Characters

Invisible characters with no width, such as U+200B (zero-width space), U+200C (zero-width non-joiner), and U+FEFF (byte order mark). AI tools can insert them silently, and they can break search, clipboard copy, and text processing.
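
A minimal sketch of this kind of check, assuming a short hand-picked list of zero-width code points (the real detection tables are presumably larger). The `stripZeroWidth` name and its return shape are hypothetical.

```typescript
// Assumed subset: zero-width space, non-joiner, joiner, word joiner, BOM.
const ZERO_WIDTH = /[\u200B\u200C\u200D\u2060\uFEFF]/g;

function stripZeroWidth(text: string): { cleaned: string; removed: number } {
  const matches = text.match(ZERO_WIDTH);
  return {
    cleaned: text.replace(ZERO_WIDTH, ""),
    removed: matches ? matches.length : 0,
  };
}
```

Note that U+200C and U+200D carry meaning in some scripts and in emoji sequences, so a production cleaner would likely need context rules rather than a blanket removal.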

Non-Standard Spaces

Unicode defines more than a dozen space characters beyond U+0020. Non-breaking spaces (U+00A0), em spaces (U+2003), and thin spaces (U+2009) look like ordinary spaces at a glance but differ in width and line-breaking behavior.
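
Folding these into a plain U+0020 can be sketched in a few lines. The character class below is an assumed, non-exhaustive selection, and `normalizeSpaces` is a hypothetical helper name.

```typescript
// NBSP, the U+2000..U+200A typographic spaces, narrow NBSP,
// medium mathematical space, and ideographic space.
const ODD_SPACES = /[\u00A0\u2000-\u200A\u202F\u205F\u3000]/g;

function normalizeSpaces(text: string): string {
  // Each matched character becomes one ordinary space; nothing is collapsed.
  return text.replace(ODD_SPACES, " ");
}
```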

Typographic Substitutions

Smart quotes (“”), em dashes (—), and ellipses (…) replace their ASCII equivalents in AI output. These can break code, CSV parsing, and plain-text pipelines.
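
The substitution back to ASCII is essentially a lookup table. The mapping below is an illustrative assumption (for instance, it turns an em dash into `--`, matching the before/after sample above), and `toAscii` is a hypothetical function name.

```typescript
// Assumed mapping from typographic characters to ASCII stand-ins.
const ASCII_EQUIVALENTS: Record<string, string> = {
  "\u2018": "'", // left single quote
  "\u2019": "'", // right single quote / apostrophe
  "\u201C": '"', // left double quote
  "\u201D": '"', // right double quote
  "\u2011": "-", // non-breaking hyphen
  "\u2013": "-", // en dash
  "\u2014": "--", // em dash
  "\u2026": "...", // horizontal ellipsis
};

function toAscii(text: string): string {
  return text.replace(
    /[\u2018\u2019\u201C\u201D\u2011\u2013\u2014\u2026]/g,
    (ch) => ASCII_EQUIVALENTS[ch] ?? ch,
  );
}
```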

Homoglyphs

Cyrillic letters like а (U+0430) look identical to Latin a but are different code points. Used for obfuscation, SEO spam, or introduced accidentally by multilingual AI models.
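
One common heuristic for spotting homoglyphs is to flag words that mix Latin and Cyrillic letters. The sketch below assumes only basic ASCII Latin and the main Cyrillic block (U+0400 to U+04FF); `findMixedScriptWords` is a hypothetical name, and a real detector would likely use a fuller confusables table.

```typescript
const isLatin = (cp: number): boolean =>
  (cp >= 0x41 && cp <= 0x5a) || (cp >= 0x61 && cp <= 0x7a);

// Basic Cyrillic block only; an assumption made to keep the sketch short.
const isCyrillic = (cp: number): boolean => cp >= 0x0400 && cp <= 0x04ff;

function findMixedScriptWords(text: string): string[] {
  return text.split(/\s+/).filter((word) => {
    let latin = false;
    let cyrillic = false;
    for (const ch of word) {
      const cp = ch.codePointAt(0) ?? 0;
      if (isLatin(cp)) latin = true;
      if (isCyrillic(cp)) cyrillic = true;
    }
    // A word written entirely in one script is not suspicious by itself.
    return latin && cyrillic;
  });
}
```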

Not an AI detector

TextPurify does not try to guess whether text was written by AI. It detects specific Unicode artifacts that AI tools happen to leave behind. Works on any text, from any source.

Your text never leaves your browser

Detection and cleaning run entirely in your browser using JavaScript. Nothing is sent to a server for analysis. Your content stays private.

// BLOG

Learn about text quality

Deep dives into Unicode, readability, AEO, and why AI text is full of hidden characters.

AI text, unicode

// PRIVACY BY DESIGN

Your text never leaves your browser

Every computation — from Unicode scanning to sentiment analysis — runs locally. Open DevTools and watch the network tab: no text is ever transmitted.

Unicode detection & cleaning

Pure TypeScript, runs synchronously on the main thread.

The entire artifact detection engine is a ~8 KB JS module. It scans every character against Unicode category tables without any network calls. Results appear in under 5 ms for texts up to 100,000 characters.

Readability & SEO metrics

Flesch-Kincaid, LIX, Gunning Fog, passage analysis — all computed locally.

All scoring formulas are implemented in plain TypeScript. No external API is called. The keyword and water-word analysis runs against static dictionaries bundled with the app.
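
As an example of how small such a formula is, here is a sketch of LIX: average sentence length plus the percentage of words longer than six letters. The tokenization is a simplifying assumption (ASCII-only punctuation trimming, sentences counted by terminal punctuation), and `lix` is a hypothetical function name.

```typescript
function lix(text: string): number {
  // Words: whitespace-separated tokens with edge punctuation trimmed.
  const words = text
    .split(/\s+/)
    .map((w) => w.replace(/^[^A-Za-z0-9]+|[^A-Za-z0-9]+$/g, ""))
    .filter((w) => w.length > 0);
  if (words.length === 0) return 0;

  // Sentences: runs of ., ! or ?, with a floor of one sentence.
  const sentences = Math.max(1, (text.match(/[.!?]+/g) ?? []).length);

  // Long words: more than six characters.
  const longWords = words.filter((w) => w.length > 6).length;

  return words.length / sentences + (longWords * 100) / words.length;
}
```

Abbreviations such as "e.g." would be counted as sentence ends here, so a more careful splitter is needed for real content.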

Sentiment analysis

DistilBERT model running via WebAssembly (ONNX Runtime Web).

The ~67 MB model weights are downloaded once from the Hugging Face CDN and cached in your browser's IndexedDB. After the first load, inference is fully offline. Your text is passed only to the local WASM runtime, never to a remote endpoint.

Semantic relevance

all-MiniLM-L6-v2 sentence embeddings, computed in a Web Worker.

A ~23 MB embedding model runs in a background Web Worker so it never blocks the UI. Cosine similarity between your query and each paragraph is computed in-memory. The Worker is terminated when you leave the tool.
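
The similarity step itself is straightforward once embeddings exist. The sketch below assumes the vectors have already been produced by the model; `cosineSimilarity` and `rankParagraphs` are hypothetical helper names, and the Worker plumbing is omitted.

```typescript
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  // A zero vector has no direction, so treat its similarity as 0.
  return denom === 0 ? 0 : dot / denom;
}

// Returns paragraph indices ordered from most to least relevant.
function rankParagraphs(
  query: number[],
  paragraphs: number[][],
): Array<{ index: number; score: number }> {
  return paragraphs
    .map((vec, index) => ({ index, score: cosineSimilarity(query, vec) }))
    .sort((x, y) => y.score - x.score);
}
```

The first and last entries of the ranked list correspond to the most and least relevant paragraphs.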

Want the technical details?

Read: Why we process everything in your browser →

// how it works

Three steps. Zero complexity.

Paste your text

Drop in any AI-generated or AI-assisted text. Blog posts, emails, reports — anything.

Detect artifacts

TextPurify scans for patterns across four Unicode artifact categories, flagging every issue instantly.

Get clean text

Copy your cleaned text with one click. Same words, zero hidden characters, no more surprises.

Stop the Unicode aftertaste.

Free tier: up to 5,000 characters without an account. Sign up free for 10,000.

Open TextPurify

Learn about text quality

Why AI-Generated Text Contains Hidden Characters

Readability Scores Explained: Flesch-Kincaid, LIX, and Why They Matter

Passage Optimization and AEO: Writing for AI Overviews
