AI Token Counter — Count Tokens for GPT, Claude, and Gemini for Free
Count tokens in your prompts before sending them to OpenAI, Anthropic, or Google APIs. Estimate costs and stay within context limits. Runs entirely in your browser.
LLMs don’t process text character by character or word by word — they process tokens. A token might be a word, part of a word, punctuation, or whitespace. SimpleTools Token Counter counts tokens for GPT-4, Claude, Gemini, and other models so you can estimate costs, manage context limits, and optimise your prompts.
Why Token Count Matters
Cost estimation: OpenAI, Anthropic, and Google charge per token. Counting tokens before sending lets you estimate costs and avoid surprises.
Context limit management: Every LLM has a maximum context window (e.g., GPT-4o: 128K tokens, Claude 3.5 Sonnet: 200K tokens). If your input + output exceeds this, the API returns an error or truncates the response.
Prompt optimisation: Understanding which parts of your prompt consume the most tokens helps you trim unnecessary content.
Budget tracking: Teams building AI-powered applications need to track token usage per feature to allocate costs.
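The cost-estimation point comes down to simple arithmetic once you have a token count. Here is a minimal sketch, assuming prices are quoted per million tokens. The `EXAMPLE_PRICES` table, the `example-model` name, and the rates in it are placeholders invented for illustration, not real published pricing.

```javascript
// Hypothetical prices in USD per 1 million tokens. Placeholder values only;
// replace them with the provider's current published rates.
const EXAMPLE_PRICES = {
  "example-model": { inputPerMillion: 2.5, outputPerMillion: 10.0 },
};

// Estimate the cost of one request from its input and expected output token counts.
function estimateCostUsd(model, inputTokens, outputTokens, prices = EXAMPLE_PRICES) {
  const price = prices[model];
  if (!price) {
    throw new Error(`No price configured for model: ${model}`);
  }
  const inputCost = (inputTokens / 1_000_000) * price.inputPerMillion;
  const outputCost = (outputTokens / 1_000_000) * price.outputPerMillion;
  return inputCost + outputCost;
}

// 1,200 input tokens and 300 output tokens at the placeholder rates above.
const cost = estimateCostUsd("example-model", 1200, 300);
console.log(cost.toFixed(4));
```

Input and output are priced separately here because providers usually charge different rates for each.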
Token Counting by Model
Different models use different tokenizers, so the same text produces different token counts:
| Model Family | Tokenizer | Tokens for “Hello, world!” |
|---|---|---|
| GPT-4, GPT-3.5 | tiktoken (cl100k_base) | 4 |
| GPT-4o | tiktoken (o200k_base) | 4 |
| Claude 3 (all versions) | Claude tokenizer | ~4 |
| Gemini 1.5 Pro | SentencePiece | ~4 |
| Llama 2 | SentencePiece (BPE) | ~4 |
| Llama 3 | tiktoken-based BPE | ~4 |
For English text, a rough approximation is: 1 token ≈ 4 characters ≈ 0.75 words. But this varies significantly for code, non-English text, and special characters.
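When no tokenizer is available, the rule of thumb above can be turned into a quick estimator. This is only a sketch of the heuristic, not a tokenizer. The function name `estimateTokens` is made up for this example, and the two rules will often disagree with each other and with the real count.

```javascript
// Rough token estimate for English prose using the two rules of thumb:
// about 4 characters per token and about 0.75 words per token.
function estimateTokens(text) {
  const trimmed = text.trim();
  if (trimmed.length === 0) return { byChars: 0, byWords: 0 };
  const wordCount = trimmed.split(/\s+/).length;
  return {
    byChars: Math.ceil(trimmed.length / 4),
    byWords: Math.ceil(wordCount / 0.75),
  };
}

const sample = "The quick brown fox jumps over the lazy dog.";
console.log(estimateTokens(sample)); // { byChars: 11, byWords: 12 }
```

Treat the gap between the two numbers as a reminder that this is a ballpark figure. For code or non-English text, use a real tokenizer instead.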
Features
- Multi-model support: Select from GPT-4o, GPT-3.5, Claude 3 variants, Gemini, Llama, and more
- Live counting: Token count updates as you type
- Cost estimation: Based on current published pricing for the selected model
- Context window indicator: Visual indicator showing what percentage of the model’s context window your text uses
- Token breakdown: Optionally see which tokens map to which text segments
- Multi-segment analysis: Count tokens for system prompt, user message, and assistant response separately
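To illustrate the multi-segment idea, here is a small sketch that counts each part of a conversation separately and reports its share of the total. It is not the tool's actual code. `analyzeSegments` and `approxCount` are hypothetical names, and the default counter is the rough 4-characters-per-token heuristic standing in for a real tokenizer.

```javascript
// Stand-in counter: roughly 4 characters per token. Swap in a real tokenizer
// for the target model when exact counts matter.
function approxCount(text) {
  return Math.ceil(text.length / 4);
}

// Count each named segment separately and report its share of the total.
function analyzeSegments(segments, countTokens = approxCount) {
  const counts = Object.entries(segments).map(([name, text]) => ({
    name,
    tokens: countTokens(text),
  }));
  const total = counts.reduce((sum, c) => sum + c.tokens, 0);
  return {
    total,
    segments: counts.map((c) => ({
      ...c,
      share: total === 0 ? 0 : c.tokens / total,
    })),
  };
}

const report = analyzeSegments({
  system: "You are a concise assistant that answers in plain English.",
  user: "Summarise the attached release notes in three bullet points.",
  assistant: "",
});
console.log(report);
```

Passing the counter in as a parameter keeps the breakdown logic independent of any one model's tokenizer.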
Why Count Tokens in Your Browser?
Prompts often contain proprietary content: system instructions, business logic, sensitive user data. Sending them to a third-party token-counting service is an unnecessary risk. With in-browser counting:
✅ Your prompts never leave your browser
✅ Works offline — count tokens without internet
✅ No API key required
✅ Free for any volume of text
How It Works
For GPT models, the tool uses tiktoken-js — the JavaScript port of OpenAI’s tiktoken tokenizer library, compiled to run in the browser via WebAssembly:
```javascript
import { encoding_for_model } from 'tiktoken';

const enc = encoding_for_model('gpt-4o');
const tokens = enc.encode(text);
console.log(tokens.length); // Number of tokens
enc.free(); // Free WASM memory
```

The tiktoken library uses byte-pair encoding (BPE), the same algorithm used by the OpenAI API, which ensures accurate token counts.
For Claude models, the tool estimates counts from Anthropic’s published tokenizer details, which is why the Claude figures in the table above are marked as approximate.
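To give a feel for what byte-pair encoding does, here is a toy sketch of the merge step. It is not tiktoken's implementation. The `TOY_MERGES` table is invented for this example, and real tokenizers work on bytes with merge tables containing many thousands of entries.

```javascript
// Toy byte-pair encoding: repeatedly merge the adjacent pair with the best
// (lowest) rank until no known pair remains. The merge table is invented for
// this example; entries are "left right" pairs in priority order.
const TOY_MERGES = ["l l", "h e", "ll o", "he llo"];
const RANK = new Map(TOY_MERGES.map((pair, index) => [pair, index]));

function toyBpe(word) {
  let parts = Array.from(word); // start from single characters
  while (parts.length > 1) {
    // Find the adjacent pair with the lowest rank.
    let best = -1;
    let bestRank = Infinity;
    for (let i = 0; i < parts.length - 1; i++) {
      const rank = RANK.get(`${parts[i]} ${parts[i + 1]}`);
      if (rank !== undefined && rank < bestRank) {
        best = i;
        bestRank = rank;
      }
    }
    if (best === -1) break; // no mergeable pair left
    parts.splice(best, 2, parts[best] + parts[best + 1]);
  }
  return parts;
}

console.log(toyBpe("hello")); // [ 'hello' ]
console.log(toyBpe("help"));  // [ 'he', 'l', 'p' ]
```

With this table, "hello" collapses into one token while "help" stays at three, which is the same effect that makes common words cheap and rare strings expensive in real tokenizers.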
How to Use the Token Counter
- Visit simpletools.one/token-counter
- Select the target model from the dropdown
- Paste or type your text (or your full messages array)
- See the token count and estimated cost instantly
- The context bar shows what percentage of the model’s limit you’re using
- Trim your prompt if you’re close to the limit
Token Counting Tips
System prompts consume tokens too: Your system prompt is part of every request, so a 500-token system prompt × 1,000 requests/day = 500K input tokens per day from the system prompt alone.
Few-shot examples are expensive: Adding 5 example user/assistant pairs might add 1000+ tokens to every request.
JSON formatting costs tokens: {"role": "user", "content": "..."} adds overhead compared to plain text.
Code is often less token-efficient than English: Indentation, punctuation, and unusual identifiers tend to split into many short tokens, so source code typically needs more tokens per character than natural language prose.
Non-English text varies: Languages with more complex morphology (Arabic, Finnish, Hungarian) may tokenise less efficiently than English, costing more per word.
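The system-prompt and few-shot tips above are both about fixed overhead that repeats on every request. A minimal sketch of that arithmetic, using the figures from the tips (`dailyFixedTokens` is a hypothetical helper name):

```javascript
// Tokens spent per day on content that is repeated in every request
// (system prompt, few-shot examples), before any user text is added.
function dailyFixedTokens({ systemPromptTokens, fewShotTokens = 0, requestsPerDay }) {
  return (systemPromptTokens + fewShotTokens) * requestsPerDay;
}

// The system-prompt example from the tips: 500 tokens x 1,000 requests.
console.log(dailyFixedTokens({ systemPromptTokens: 500, requestsPerDay: 1000 })); // 500000

// Adding 1,000 tokens of few-shot examples triples the fixed overhead.
console.log(
  dailyFixedTokens({ systemPromptTokens: 500, fewShotTokens: 1000, requestsPerDay: 1000 })
); // 1500000
```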
Context Window Reference
| Model | Context Window | Typical Max Output |
|---|---|---|
| GPT-4o | 128K tokens | 16K tokens |
| GPT-4 Turbo | 128K tokens | 4K tokens |
| Claude 3.5 Sonnet | 200K tokens | 8K tokens |
| Claude 3 Opus | 200K tokens | 4K tokens |
| Gemini 1.5 Pro | 1M tokens | 8K tokens |
| Llama 3 70B | 8K tokens | 4K tokens |
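The table can be used to check whether a prompt plus its expected reply fits in a model's window. Below is a rough sketch with three rows copied from the table. `checkContextFit` is a hypothetical helper, and "K" is treated as exactly 1,000 tokens, which may differ slightly from the limits a provider enforces.

```javascript
// Limits copied from the reference table (in tokens). "K" is treated as
// 1,000 here; providers may define exact limits slightly differently.
const CONTEXT_LIMITS = {
  "GPT-4o": { contextWindow: 128_000, maxOutput: 16_000 },
  "Claude 3.5 Sonnet": { contextWindow: 200_000, maxOutput: 8_000 },
  "Llama 3 70B": { contextWindow: 8_000, maxOutput: 4_000 },
};

// Report how much of the window a prompt uses and whether the requested
// output still fits alongside it.
function checkContextFit(model, inputTokens, reservedOutputTokens) {
  const limits = CONTEXT_LIMITS[model];
  if (!limits) throw new Error(`Unknown model: ${model}`);
  // The reply can never exceed the model's own output cap.
  const outputTokens = Math.min(reservedOutputTokens, limits.maxOutput);
  return {
    percentUsed: (inputTokens / limits.contextWindow) * 100,
    remaining: limits.contextWindow - inputTokens - outputTokens,
    fits: inputTokens + outputTokens <= limits.contextWindow,
  };
}

console.log(checkContextFit("Llama 3 70B", 6000, 4000));
// { percentUsed: 75, remaining: -2000, fits: false }
```

A negative `remaining` value shows how many tokens need to be trimmed from the prompt, or from the reserved output, before the request fits.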
Count your tokens at simpletools.one/token-counter — accurate, private, and completely free.