🧠 Built by SuperML.dev · SuperML.org


AI Token Counter — Count Tokens for GPT, Claude, and Gemini for Free

Count tokens in your prompts before sending them to OpenAI, Anthropic, or Google APIs. Estimate costs and stay within context limits. Runs entirely in your browser.

LLMs don’t process text character by character or word by word — they process tokens. A token might be a word, part of a word, punctuation, or whitespace. SimpleTools Token Counter counts tokens for GPT-4, Claude, Gemini, and other models so you can estimate costs, manage context limits, and optimise your prompts.

Why Token Count Matters

Cost estimation: OpenAI, Anthropic, and Google charge per token. Counting tokens before sending lets you estimate costs and avoid surprises.

Context limit management: Every LLM has a maximum context window (e.g., GPT-4o: 128K tokens, Claude 3.5 Sonnet: 200K tokens). If your input + output exceeds this, the API returns an error or truncates the response.

Prompt optimisation: Understanding which parts of your prompt consume the most tokens helps you trim unnecessary content.

Budget tracking: Teams building AI-powered applications need to track token usage per feature to allocate costs.
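The cost and budget arithmetic above can be sketched in a few lines. This is only an illustration: the per-million-token prices, the function names estimateCostUSD and costByFeature, and the shape of the usage log are assumptions made for the example, not the tool's actual code or any provider's current pricing.

```javascript
// Hypothetical prices in USD per 1 million tokens.
// Replace these with the provider's current published rates.
const PRICE_PER_MILLION = { input: 2.5, output: 10.0 };

// Estimate the cost of one request from its input and output token counts.
function estimateCostUSD(inputTokens, outputTokens, price = PRICE_PER_MILLION) {
  const inputCost = (inputTokens / 1_000_000) * price.input;
  const outputCost = (outputTokens / 1_000_000) * price.output;
  return inputCost + outputCost;
}

// Sum the estimated cost per feature from a usage log (budget tracking).
// Each log entry is assumed to look like { feature, inputTokens, outputTokens }.
function costByFeature(usageLog, price = PRICE_PER_MILLION) {
  const totals = {};
  for (const { feature, inputTokens, outputTokens } of usageLog) {
    totals[feature] =
      (totals[feature] ?? 0) + estimateCostUSD(inputTokens, outputTokens, price);
  }
  return totals;
}

// Example: one request with 1,200 input tokens and 300 output tokens.
const singleRequestCost = estimateCostUSD(1200, 300);
console.log(singleRequestCost.toFixed(4));
```

Input and output tokens are priced separately here because providers typically charge different rates for each.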

Token Counting by Model

Different models use different tokenizers, so the same text produces different token counts:

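| Model family | Tokenizer | Tokens for “Hello, world!” |
|---|---|---|
| GPT-4, GPT-3.5 | tiktoken (cl100k_base) | 4 |
| GPT-4o | tiktoken (o200k_base) | 4 |
| Claude 3 (all versions) | Claude tokenizer | ~4 |
| Gemini 1.5 Pro | SentencePiece | ~4 |
| Llama 2 | SentencePiece (BPE) | ~4 |
| Llama 3 | tiktoken-style BPE | ~4 |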

For English text, a rough approximation is: 1 token ≈ 4 characters ≈ 0.75 words. But this varies significantly for code, non-English text, and special characters.
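As a quick illustration of that rule of thumb, the sketch below estimates a token count two ways, from characters and from words. The function name estimateTokens is made up for this example, and the result is a heuristic, not a tokenizer: it will drift for code, non-English text, and special characters.

```javascript
// Rough estimate from the rule of thumb: 1 token ≈ 4 characters ≈ 0.75 words.
// Note: text.length counts UTF-16 code units, which is close enough for English text.
function estimateTokens(text) {
  const byChars = Math.ceil(text.length / 4);
  const words = text.trim().split(/\s+/).filter(Boolean).length;
  const byWords = Math.ceil(words / 0.75);
  return { byChars, byWords };
}

const helloEstimate = estimateTokens('Hello, world!');
console.log(helloEstimate); // { byChars: 4, byWords: 3 }
```

The two estimates disagree even on this short string, which is a reminder to use a real tokenizer whenever the exact count matters.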

Features

  • Multi-model support: Select from GPT-4o, GPT-3.5, Claude 3 variants, Gemini, Llama, and more
  • Live counting: Token count updates as you type
  • Cost estimation: Based on current published pricing for the selected model
  • Context window indicator: Visual indicator showing what percentage of the model’s context window your text uses
  • Token breakdown: Optionally see which tokens map to which text segments
  • Multi-segment analysis: Count tokens for system prompt, user message, and assistant response separately
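The multi-segment analysis in the list above could be approximated along these lines. This is a sketch under assumptions: countBySegment and approxCount are hypothetical names, and the counter is a placeholder using the 4-characters-per-token approximation where a real tokenizer would normally be plugged in.

```javascript
// Count tokens per message segment with a pluggable counter function.
// `countTokens` is a stand-in; pass a real tokenizer-backed function for exact counts.
function countBySegment(segments, countTokens) {
  const perSegment = {};
  let total = 0;
  for (const [name, text] of Object.entries(segments)) {
    perSegment[name] = countTokens(text);
    total += perSegment[name];
  }
  return { perSegment, total };
}

// Placeholder counter: roughly 4 characters per token (an approximation only).
const approxCount = (text) => Math.ceil(text.length / 4);

const segmentResult = countBySegment(
  {
    system: 'You are a helpful assistant.',
    user: 'Summarise this article.',
    assistant: '',
  },
  approxCount
);
console.log(segmentResult);
```

Keeping the counter as a parameter means the same breakdown logic works for any model once a matching tokenizer is supplied.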

Why Count Tokens in Your Browser?

Prompts often contain proprietary content — system instructions, business logic, sensitive user data. Sending prompts to a third-party token counting service is unnecessary:

  • Your prompts never leave your browser
  • Works offline: count tokens without an internet connection
  • No API key required
  • Free for any volume of text

How It Works

For GPT models, the tool uses a JavaScript build of OpenAI’s tiktoken tokenizer library (imported below from the tiktoken npm package), compiled to run in the browser via WebAssembly:

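```javascript
import { encoding_for_model } from 'tiktoken';

const text = 'Hello, world!'; // the prompt text to measure
const enc = encoding_for_model('gpt-4o'); // load the encoding used by the model
try {
  const tokens = enc.encode(text); // array of token IDs
  console.log(tokens.length); // number of tokens
} finally {
  enc.free(); // release the WASM memory held by the encoder
}
```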

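The tiktoken library implements the same byte-pair encoding (BPE) scheme that OpenAI’s models use, so counts for raw text match what the API tokenises. Chat requests add a few formatting tokens per message on top of the text itself, so a full messages array counts slightly higher than the sum of its contents.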
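To give a feel for how byte-pair encoding turns text into tokens, here is a toy version of the merge loop. It is a simplified sketch: the merge table MERGE_RANKS is invented for this example and is not a real tiktoken vocabulary, the function name bpeEncode is hypothetical, and real tokenizers work on bytes with tens of thousands of learned merges.

```javascript
// Toy byte-pair encoding: repeatedly merge the adjacent pair with the lowest rank.
// MERGE_RANKS is a made-up table for illustration, not a real vocabulary.
// Keys are "left right" pairs, so this toy only handles input without spaces.
const MERGE_RANKS = new Map([
  ['l l', 0],
  ['h e', 1],
  ['ll o', 2],
  ['he llo', 3],
]);

function bpeEncode(word, ranks = MERGE_RANKS) {
  const parts = Array.from(word); // start from single characters
  while (parts.length > 1) {
    // Find the adjacent pair with the lowest (highest-priority) rank.
    let bestIndex = -1;
    let bestRank = Infinity;
    for (let i = 0; i < parts.length - 1; i++) {
      const rank = ranks.get(`${parts[i]} ${parts[i + 1]}`);
      if (rank !== undefined && rank < bestRank) {
        bestRank = rank;
        bestIndex = i;
      }
    }
    if (bestIndex === -1) break; // no mergeable pair left
    // Merge that pair into a single part and try again.
    parts.splice(bestIndex, 2, parts[bestIndex] + parts[bestIndex + 1]);
  }
  return parts;
}

console.log(bpeEncode('hello')); // [ 'hello' ]
console.log(bpeEncode('help'));  // [ 'he', 'l', 'p' ]
```

With this table, "hello" collapses into one token while "help" stays as three, which shows why token counts depend on the vocabulary and not on word or character counts alone.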

For Claude models, the tool estimates counts from the tokenizer details Anthropic has published, so treat those numbers as approximate (hence the ~ in the table above).

How to Use the Token Counter

  1. Visit simpletools.one/token-counter
  2. Select the target model from the dropdown
  3. Paste or type your text (or your full messages array)
  4. See the token count and estimated cost instantly
  5. The context bar shows what percentage of the model’s limit you’re using
  6. Trim your prompt if you’re close to the limit

Token Counting Tips

System prompts consume tokens too: Your system prompt is part of every request — 500-token system prompt × 1000 requests/day = 500K input tokens per day just from the system prompt.

Few-shot examples are expensive: Adding 5 example user/assistant pairs might add 1000+ tokens to every request.

JSON formatting costs tokens: {"role": "user", "content": "..."} adds overhead compared to plain text.

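Code is often less token-efficient than English: Indentation, brackets, operators, and unusual identifiers tend to split into more tokens per character than natural language prose, so measure code-heavy prompts instead of relying on the 4-characters rule.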

Non-English text varies: Languages with more complex morphology (Arabic, Finnish, Hungarian) may tokenise less efficiently than English, costing more per word.
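The overhead arithmetic from the system prompt and few-shot tips above can be checked with a short calculation. The function name dailyOverheadTokens and its parameter names are illustrative assumptions; the numbers are the ones used in the tips.

```javascript
// Daily input tokens spent on fixed per-request overhead
// (system prompt plus any few-shot examples sent with every request).
function dailyOverheadTokens({ systemPromptTokens, fewShotTokens = 0, requestsPerDay }) {
  return (systemPromptTokens + fewShotTokens) * requestsPerDay;
}

// 500-token system prompt, 1000 requests per day.
const systemOnly = dailyOverheadTokens({ systemPromptTokens: 500, requestsPerDay: 1000 });

// Same system prompt plus 1000 tokens of few-shot examples.
const withFewShot = dailyOverheadTokens({
  systemPromptTokens: 500,
  fewShotTokens: 1000,
  requestsPerDay: 1000,
});

console.log(systemOnly);  // 500000
console.log(withFewShot); // 1500000
```

In this example, adding the few-shot examples triples the fixed daily input cost before any user text is counted.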

Context Window Reference

| Model | Context window | Typical max output |
|---|---|---|
| GPT-4o | 128K tokens | 16K tokens |
| GPT-4 Turbo | 128K tokens | 4K tokens |
| Claude 3.5 Sonnet | 200K tokens | 8K tokens |
| Claude 3 Opus | 200K tokens | 4K tokens |
| Gemini 1.5 Pro | 1M tokens | 8K tokens |
| Llama 3 70B | 8K tokens | 4K tokens |
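A context-window check against the table above might look like the following sketch. The limits are copied from the table, reading K as 1,000 and M as 1,000,000 (an assumption, since exact limits may differ slightly), and the names CONTEXT_WINDOW and contextUsage are hypothetical.

```javascript
// Context windows from the table above, in tokens.
// K is read as 1,000 and M as 1,000,000; exact provider limits may differ.
const CONTEXT_WINDOW = {
  'GPT-4o': 128_000,
  'GPT-4 Turbo': 128_000,
  'Claude 3.5 Sonnet': 200_000,
  'Claude 3 Opus': 200_000,
  'Gemini 1.5 Pro': 1_000_000,
  'Llama 3 70B': 8_000,
};

// Report how much of the window a request would use, reserving room for the reply.
function contextUsage(model, inputTokens, reservedOutputTokens = 0) {
  const limit = CONTEXT_WINDOW[model];
  if (limit === undefined) throw new Error(`Unknown model: ${model}`);
  const used = inputTokens + reservedOutputTokens;
  return { percentUsed: (used / limit) * 100, fits: used <= limit };
}

console.log(contextUsage('GPT-4o', 64000));            // { percentUsed: 50, fits: true }
console.log(contextUsage('Llama 3 70B', 6000, 4000));  // { percentUsed: 125, fits: false }
```

Reserving output tokens up front matters because the window covers input and output together, so a prompt that fits on its own can still leave too little room for the response.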

Count your tokens at simpletools.one/token-counter — accurate, private, and completely free.
