A client asked me a question last month that I couldn't answer well.
"How do we know AI didn't copy from someone else — or just make the dosage up? Legal needs answers."
I build their content system. Dozens of health and wellness articles a month, written for professionals. AI drafts the first pass. Editors revise. The client publishes.
Their legal team had started asking the right question: what exactly is checking that AI didn't hallucinate a dosage, or copy three sentences from another site?
The honest answer was: nothing systematic. Editors caught what editors caught. Sometimes that was enough. Sometimes it wasn't.
So I built the systematic step.
CheckApp is what came out of that conversation — an open-source quality gate that runs 12 checks on an article before publish. Plagiarism, fact-check, grammar, legal risk, tone, and 7 more. It runs locally, finishes in under 90 seconds, and costs somewhere between nothing and 25 cents per article depending on which providers you configure.
This post is the long version. Why it exists, what's in it, how it runs, and what "free and open source" actually means in practice.
Three Failure Patterns AI Content Reviewers Miss
I spent most of 2025 ghostwriting across fintech, wellness, and SaaS. By mid-year, AI was the default first draft across every client. And three specific failure patterns kept showing up — always subtle, always in articles that had already passed human review.
Pattern 1: Factual drift. One health article bumped a vitamin C dosage from 100mg in the brief to 200mg in the draft. Not a typo. Not a rounding error. The model just generated a plausible-sounding number that happened to be wrong. Three reviewers signed off. The client almost published a dosage recommendation they had never agreed to.
Pattern 2: Accidental plagiarism. Another piece had a paragraph on antioxidants that read almost word-for-word like an existing wellness blog. Not a direct copy — close enough to get flagged if anyone ran it through Copyscape, which nobody was doing by default. Close enough to create a legal-dispute risk if the other site's lawyer ever noticed.
Pattern 3: Confident fabrication. "The average B2B buyer engages with 13 pieces of content before purchasing." That specific statistic, in a thought-leadership piece I was editing. No source. No paper. The number did not exist. It sounded right, which is exactly what LLMs are optimized to produce.
Each article looked clean on first read. Each one passed human review. Each one could have shipped.
Why Grammarly, ChatGPT, and Copyscape Aren't Enough
Before I built anything, I did the obvious thing: looked at what was already out there.
Grammarly is excellent at grammar and style. It's not designed to check whether a dosage is right or whether a statistic has a source. That's fine — grammar isn't the same problem.
ChatGPT as a fact-checker is worse than useless. Ask an LLM to verify a claim and it answers confidently. Ask it to cite sources and it invents URLs that don't exist. LLMs are trained to produce plausible answers. Fact-checking is a retrieval problem, and retrieval is a different skill.
Copyscape is the best plagiarism index on the market — 2 billion+ pages. But running it by hand on every draft is a manual step most teams skip. And it doesn't tell you anything about factual accuracy, legal risk, or tone.
Surfer and Clearscope optimize for keyword coverage against search intent. Great for SEO positioning. Not designed to tell you whether your claims are defensible.
Human review catches structure and flow. Three of my clients' editors caught every typo and every awkward sentence in the articles above. They didn't catch a fake statistic that sounded credible, because human attention doesn't run the same checks every time. It gets tired. It reads for argument, not provenance.
The gap wasn't any of these tools individually. The gap was that nobody had composed them into a single pipeline that ran before publish.
What CheckApp does
CheckApp is the composition.
One CLI command. One file or one Google Doc URL. One report that answers: is this publishable?
npm install -g checkapp
checkapp --setup
checkapp article.md
Under the hood, 12 skills run in parallel. Each one handles one dimension of the "is this publishable" question.
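Structurally, that fan-out is the familiar "run independent I/O-bound checks concurrently" pattern. A minimal asyncio sketch of the idea (skill names illustrative, not CheckApp's actual code):

```python
import asyncio

async def run_skill(name: str, article: str) -> tuple[str, str]:
    # Stand-in for one check; a real skill awaits a provider API call here.
    await asyncio.sleep(0)
    return name, "pass"

async def run_all(article: str) -> dict[str, str]:
    # Fan out: independent skills run concurrently, results collected in order.
    skills = ["plagiarism", "fact-check", "grammar", "legal-risk"]
    results = await asyncio.gather(*(run_skill(s, article) for s in skills))
    return dict(results)

report = asyncio.run(run_all("draft text"))
print(report)
```

Because the skills don't depend on each other, total latency is roughly the slowest provider call, not the sum of all twelve.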

The report shows the overall score, the verdict (pass / warn / fail / skipped), and a per-skill breakdown. Click into any skill for findings — the specific line flagged, the severity, the source evidence when it exists.
The 12 skills
Not all of these run out of the box. The free ones (offline SEO, LanguageTool grammar, Semantic Scholar citations) work on a fresh install with no keys. The rest require an API key from a provider you choose. CheckApp does not provide API keys. It uses yours.
| Skill | What it checks | Default engine | Cost per check |
|---|---|---|---|
| Plagiarism | Full indexed web for copied passages | Copyscape | ~$0.09 |
| AI Detection | Probability the text was AI-generated | Copyscape | ~$0.09 |
| SEO | Word count, headings, readability, keywords, links | Offline (no API) | Free |
| Grammar & Style | Grammar + per-finding rewrites | LanguageTool (free tier) | Free |
| Academic Citations | Merges citations onto fact-check findings | Semantic Scholar | Free |
| Self-Plagiarism | Overlap with your own past articles | Cloudflare Vectorize | ~$0.0001 |
| Fact Check | Retrieves evidence for claims; assesses confidence | Exa Search / Exa Deep Reasoning / Parallel Task | $0.008–$0.03 per claim |
| Tone of Voice | Compares against a brand voice guide; returns rewrites | Claude / MiniMax | ~$0.002 |
| Legal Risk | Scans for health, defamation, false-promise claims | Claude / MiniMax | ~$0.002 |
| Content Summary | Extracts topic, argument, audience, tone | Claude / MiniMax | ~$0.002 |
| Brief Matching | Coverage vs. an uploaded content brief | Claude / MiniMax | ~$0.002 |
| Content Purpose | Classifies article type; flags missing elements | Claude / MiniMax | ~$0.002 |
Two of these are worth calling out individually.
Fact check is where the most interesting work happens. The skill extracts factual claims from the article, sends each one to a retrieval provider (Exa Search for budget, Exa Deep Reasoning for depth, or Parallel Task for the middle ground), and returns sources with relevance scores and a confidence assessment. If the claim is "the average B2B buyer engages with 13 pieces of content before purchasing," the skill tries to find a source. If it can't, that's a finding worth seeing before publish.
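The shape of that pipeline is simple enough to sketch. This is an illustrative Python version, not CheckApp's actual code: the `retrieve` callable is a stand-in for whichever provider you configure, and the two-source confidence threshold is my simplification.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Finding:
    claim: str
    sources: List[str]
    confidence: str  # "supported", "weak", or "unsupported"

def fact_check(claims: List[str],
               retrieve: Callable[[str], List[str]]) -> List[Finding]:
    # `retrieve` stands in for a real provider call (Exa, Parallel, etc.);
    # it returns source URLs that support the claim.
    findings = []
    for claim in claims:
        sources = retrieve(claim)
        if len(sources) >= 2:
            confidence = "supported"
        elif sources:
            confidence = "weak"
        else:
            confidence = "unsupported"  # the finding worth seeing before publish
        findings.append(Finding(claim, sources, confidence))
    return findings
```

An "unsupported" finding is exactly the "13 pieces of content" case: the retrieval step comes back empty, so the claim gets surfaced instead of shipped.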

Self-plagiarism is for people publishing under the same byline or agency on a regular cadence. You run checkapp index <dir> once to ingest your own past articles into a local vector index. After that, every new piece gets compared against your own archive. Catches the subtle "I already said this three months ago, almost word-for-word" pattern that normal plagiarism tools can't see.
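The core idea is easy to sketch without assuming anything about CheckApp's internals: vectorize each past article, vectorize the new draft, and flag high cosine similarity. Real embeddings (Cloudflare Vectorize, in CheckApp's case) replace the toy word-count vectors below, and the 0.8 threshold is my assumption.

```python
import math
from collections import Counter

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two bag-of-words vectors.
    dot = sum(a[w] * b[w] for w in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def self_plagiarism(draft: str, archive: dict,
                    threshold: float = 0.8) -> list:
    # Return titles of past articles that overlap too heavily with the draft.
    draft_vec = Counter(draft.lower().split())
    return [title for title, text in archive.items()
            if cosine(draft_vec, Counter(text.lower().split())) >= threshold]
```

Swap the word-count vectors for real embeddings and the same comparison catches paraphrase-level overlap, not just shared vocabulary.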
Want to try it on your next draft?
npm install -g checkapp
checkapp article.md
Or start at checkapp.xyz if the terminal isn't your thing.
How it runs
Three ways, depending on how you work.
Terminal. checkapp article.md is the default. Pass it a file path and it prints a report to the terminal, exiting with a non-zero code on failure (useful for CI pipelines).
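Because the CLI exits non-zero when a check fails, wiring it into CI is a one-step job. A hypothetical GitHub Actions fragment (the step name and file path are placeholders, not CheckApp conventions):

```yaml
# Hypothetical CI gate: the build fails if the article fails CheckApp.
- name: Content quality gate
  run: |
    npm install -g checkapp
    checkapp drafts/article.md  # non-zero exit blocks the merge
```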
Dashboard. checkapp --ui starts a local web server at localhost:3000. The dashboard shows report history, lets you configure providers, tracks cost, and exports to HTML or JSON. No account, no login — it runs on your machine and writes to SQLite locally.
MCP server. For anyone using Claude Code or Cursor, CheckApp exposes an MCP server. check_article becomes a native tool in your agent workflow. Your agent drafts. Your agent checks. The whole loop stays inside one conversation.
All three surfaces use the same pipeline. The CLI is the ground truth.
The cost reality
"Free, open source, BYOK" means three different things and they matter separately.
Free means CheckApp itself — the CLI, the dashboard, the MCP server — costs nothing. It's MIT-licensed. There's no subscription, no tier, no per-document fee. No login, no account, no "connect your team."
Open source means the code is at github.com/sharonds/checkapp. You can read it, fork it, modify it, run it offline, audit what it does with your content. Nothing leaves your machine except the API calls you choose to enable.
BYOK (Bring Your Own Keys) means that when you configure a paid skill (fact-check via Exa, plagiarism via Copyscape), you use your own API keys. You pay the providers directly. CheckApp takes zero margin.
Three cost tiers exist in practice:
- $0.00 — LanguageTool free tier for grammar, Semantic Scholar for citations, offline SEO. This is a real, useful check that doesn't require a single paid key.
- ~$0.05 per article — Add Exa Search for fact-check and Claude Haiku for LLM skills. This is the configuration most writers land on.
- ~$0.25 per article — Add Exa Deep Reasoning, Sapling for premium grammar, Copyscape for plagiarism. This is the "nothing gets through" configuration for regulated-industry content.
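The ~$0.05 middle tier is just the per-call prices from the skills table multiplied out. A quick sanity check in Python, assuming five extracted claims per article (the claim count is my assumption; it varies with article length):

```python
# Per-call prices taken from the skills table above.
EXA_SEARCH_PER_CLAIM = 0.008  # budget fact-check engine
LLM_SKILL_COST = 0.002        # each Claude/MiniMax-backed skill

claims_per_article = 5  # assumption, not a CheckApp default
llm_skills = 5          # tone, legal risk, summary, brief matching, purpose

total = claims_per_article * EXA_SEARCH_PER_CLAIM + llm_skills * LLM_SKILL_COST
print(f"${total:.2f} per article")  # → $0.05 per article
```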
Compare to Grammarly Premium at $12/month. If you publish 20 articles a month, Grammarly is $0.60 per article for grammar alone. CheckApp's grammar skill is $0.00 for 20 articles a month. The math lands in the user's favor, not the platform's.
This isn't a competitive shot at Grammarly. Grammarly is excellent at grammar. The point is that BYOK removes the platform markup that SaaS pricing depends on. You pay the providers, not the wrapper.
Who it's for
Three kinds of people, same underlying pain. Each one has a specific failure mode that CheckApp catches before it ships.
Professional and agency writers. You draft with AI, edit by hand, and publish under someone else's name. Remember the vitamin C article — 100mg brief, 200mg draft, three reviewers signed off. The cost of that miss isn't an edit round. It's the next retainer conversation. It's the client's legal team asking your agency the hard question instead of a rhetorical one. CheckApp is the systematic step between "writer done" and "editor starts," so editors can read for argument and flow — the things humans are actually good at.
Marketing and content teams. You're shipping 5–20 pieces a month under tight timelines. The founder's next byline could have a made-up statistic — like "the average B2B buyer engages with 13 pieces of content before purchasing." That exact number has been cited in at least a dozen thought-leadership pieces in the last two years. It has no source. When an investor or a journalist notices, the founder is the one holding the bag. A minute of checking prevents the apology post and the deleted tweet.
Anyone publishing in regulated spaces. Health, finance, legal, insurance, supplements. The cost of one wrong claim isn't an embarrassing correction — it's a letter from a regulator or a legal-dispute email. "Clinically proven to reduce inflammation in 72 hours." That's a straight FDA violation in a health context. The legal risk skill scans for the specific patterns — unverified health claims, false promises, defamation risk — that create those letters. One flagged finding costs less than one regulatory response.
If you're a Claude Code or Cursor user, there's a fourth flavor: the MCP server makes check_article a native tool. Your agent drafts, your agent checks, your agent reports — all inside the same session.
What makes open source different here
Most AI content tools are SaaS. Sign up, pick a tier, connect your account, pay monthly. That model exists because hosting, support, and onboarding cost money, and the markup pays for those.
CheckApp doesn't have those costs because it runs on your machine. No hosting. No support infrastructure. No onboarding funnel. Just a CLI you install once and API keys you already have or can get in 10 minutes.
That also means if I disappear tomorrow, CheckApp still works. The code is on GitHub. The providers are standard APIs. There's no platform to sunset. If you want to fork it and build your own version, the MIT license explicitly says you can.
There's another consequence worth naming: I build CheckApp based on what people use it for, not what they pay for it. There are no feature tiers to optimize, no upsell ladder to protect. If you open an issue describing a skill you want, it gets weighed on merit. If the community runs a skill I haven't built, that's the signal for what ships next.
I shipped it publicly because the interesting content quality work happens in the open. The full backstory and 8 more deep-dive posts live at checkapp.xyz/blog — how the fact-check pipeline works, the grammar skill honesty post, the BYOK economics explainer, the studio roadmap, and a real case study on five client articles.
Try CheckApp Free (No API Keys Required to Start)
If you publish content that you'll be judged on, here's the smallest useful thing to do today:
npm install -g checkapp
checkapp --setup
checkapp article.md
Start with the free providers. LanguageTool for grammar, Semantic Scholar for citations, offline SEO. That's a useful check for zero cents. Add Exa Search when you're ready to spend a few cents on fact-check.
If the terminal isn't your thing, the dashboard runs at localhost:3000 after checkapp --ui.
If the whole thing isn't your thing, the repo and the deep-dive blog live here:
- Product: checkapp.xyz
- Repo: github.com/sharonds/checkapp
- Blog with 9 deep-dives: checkapp.xyz/blog
Open an issue if you hit a bug or want a skill added. The feedback loop is short. If you're using it, you shape what ships next.
Phase Two: CheckApp Studio (Inline Fact-Checking)
CheckApp v1.2.0 is the checker. You finish writing, you run it, you fix what it finds. 12 skills, CLI + dashboard + MCP, BYOK, 338 tests passing. That pipeline is stable and shipped today.
Phase two is Studio — a web-based writing editor where findings appear inline as you type, not after you export. Same skills, same providers, same BYOK model. The only thing changing is where the results appear. It's not shipped. I'm building in public. If you have opinions on what an inline-checker should feel like, the GitHub issues are open.
For now: install it on the next draft you're about to hand off. Tell me what breaks.
That's how this thing gets better.
Related reading on this blog: What Is OpenClaw — the personal infrastructure layer I run alongside CheckApp; How I Built an AI Agent That Finds Warm Leads While I Sleep — a different slice of the "AI as infrastructure" thesis.