LLM Tokens

Understanding how LLM tokens work in AI processing

What are LLM Tokens?

LLM tokens are the fundamental units that large language models use to understand and generate text. Think of them as the "atoms" of AI language processing. Every piece of text you send to an AI model gets broken down into these tokens before the model can process it.

Unlike words in human language, tokens don't always align with word boundaries. A token might be a whole word, part of a word, a single character, or even punctuation and spaces.

How Tokenization Works

AI models don't read text the way humans do. Instead, they use tokenization algorithms such as Byte Pair Encoding (BPE) to break text into smaller pieces. The most common words become single tokens, while rarer or more complex words are split into several.

Examples:

  • "Hello" → 1 token (common word)
  • "artificial" → 1 token (common word)
  • "tokenization" → 2-3 tokens (less common)
  • "NyxoChat" → 3+ tokens (brand name, uncommon)
  • "こんにちは" → 3+ tokens (non-English characters)
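
The merge loop at the heart of BPE can be sketched in a few lines of Python. This is a toy illustration that starts from individual characters and repeatedly merges the most frequent adjacent pair; production tokenizers (such as OpenAI's open-source tiktoken library) operate on bytes and use large, pre-trained merge tables instead.

```python
from collections import Counter

def most_frequent_pair(tokens):
    """Return the most common adjacent token pair, or None if no pairs exist."""
    pairs = Counter(zip(tokens, tokens[1:]))
    return pairs.most_common(1)[0][0] if pairs else None

def merge_pair(tokens, pair):
    """Replace every occurrence of `pair` with a single merged token."""
    merged, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == pair:
            merged.append(tokens[i] + tokens[i + 1])
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged

def toy_bpe(text, num_merges):
    """Toy BPE: start from characters, repeatedly merge the most frequent pair."""
    tokens = list(text)
    for _ in range(num_merges):
        pair = most_frequent_pair(tokens)
        if pair is None:
            break
        tokens = merge_pair(tokens, pair)
    return tokens
```

Frequent character sequences collapse into single tokens after a few merges, which is why common words end up as one token while rare words stay split across several.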

Token Counting Rules

Here are some general rules for estimating token counts:

  • English text: ~4 characters = 1 token (or ~0.75 words per token)
  • Common words: Usually 1 token each
  • Numbers: Digits are often split into individual tokens or small groups
  • Punctuation: Usually 1 token each
  • Spaces: Often merged with adjacent words
  • Code: Variable names and syntax use more tokens
  • Non-English: Characters may use 2-4× more tokens
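
The character-based rule of thumb above can be turned into a quick estimator. This is only a rough heuristic for English text, not an exact count, and the function name is ours, not part of any library:

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate using the ~4-characters-per-token rule for English.

    Real counts vary with the tokenizer and the text (code, numbers, and
    non-English characters typically use more tokens than this predicts).
    """
    return max(1, round(len(text) / 4))
```

For a quick sanity check, a 28-character English sentence comes out at about 7 tokens under this rule.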

Input vs Output Tokens

When you interact with an AI model, there are two types of token usage:

  • Input tokens: Your message, the conversation history, and any system instructions sent to the model
  • Output tokens: The AI's response back to you

Both input and output tokens count toward the total usage. Longer conversations accumulate more input tokens as the history grows.
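
A small sketch of why history matters: assuming each turn resends the system instructions plus the entire conversation so far (the usual chat-API pattern), the input-token cost of each turn grows like this. The function and its parameters are illustrative, not an API.

```python
def input_tokens_per_turn(system_tokens, message_tokens):
    """Input-token cost of each turn when the full history is resent.

    `system_tokens` is the size of the system instructions;
    `message_tokens` lists the token count of each successive message.
    """
    totals, history = [], 0
    for t in message_tokens:
        history += t
        totals.append(system_tokens + history)  # prompt + everything so far
    return totals
```

Even with short messages, the running history makes each successive turn more expensive than the last, which is why long conversations accumulate input tokens quickly.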

Context Window

Every AI model has a context window – the maximum number of tokens it can process at once. This includes your input AND the model's output.

  • Smaller models: 4K - 8K tokens
  • Standard models: 32K - 128K tokens
  • Large context models: 200K - 1M+ tokens

Note: When a conversation exceeds the context window, older messages may be truncated or summarized.
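
A minimal sketch of the truncation strategy, assuming the oldest messages are dropped first and using a caller-supplied token estimate (character length by default, purely for illustration):

```python
def truncate_history(messages, max_tokens, count=len):
    """Drop oldest messages until the history fits within `max_tokens`.

    `messages` is a list of strings; `count` estimates tokens per message
    (defaulting to character length here just to keep the sketch simple).
    """
    kept = list(messages)
    while kept and sum(count(m) for m in kept) > max_tokens:
        kept.pop(0)  # drop the oldest message first
    return kept
```

Real systems vary: some summarize the dropped portion instead of discarding it, and most always preserve the system instructions regardless of age.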

Tips for Efficient Token Usage

While NyxoChat handles token counting for you, being efficient still helps:

  • Be concise: Clear, direct prompts work better than verbose ones
  • Start fresh: Begin new conversations for unrelated topics
  • Avoid repetition: Don't repeat information the AI already knows
  • Use formatting wisely: Bullet points are often more efficient than paragraphs
  • Be specific: Precise questions get precise (shorter) answers

Tokens in NyxoChat

In NyxoChat, you don't need to worry about counting tokens yourself. Our pricing is simplified:

  • Standard Chat: Fixed per-message pricing in Comets
  • Nebula: Every 5,000 AI tokens = 1 message price
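
As an illustration of the Nebula rule, here is one way the conversion could work, assuming any started 5,000-token block bills as a full message (the rounding-up behavior is our assumption; the rule above doesn't specify it):

```python
import math

def nebula_message_units(total_tokens, tokens_per_unit=5000):
    """Convert raw AI token usage into Nebula message-price units.

    Assumes each started block of `tokens_per_unit` tokens counts as one
    unit -- an illustrative assumption, not NyxoChat's documented behavior.
    """
    return math.ceil(total_tokens / tokens_per_unit)
```

So 4,800 tokens would cost one message unit, while 5,100 would cost two under this assumption.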

We handle the complexity behind the scenes so you can focus on your conversations!