You’re deep in a building sesh. You need a better-structured prompt (Cursor’s or Claude Code’s plan mode is overkill), a quick explanation of some output or logs, or a SQL query based on the types in your code. The old workflow: alt-tab to a browser, open a new chat window, wait for the page to load, type your question, paste whatever you brought over from your IDE or terminal, copy the result, switch back. Repeat 47+ times a day.
Keep it in your terminal instead.
Raypaste is an open-source CLI tool that puts AI text generation where devs currently spend most of their time — their IDE/terminal. A single command, a streamed response, output copied to clipboard. No distractions. It’s built on OpenRouter, which means you can route to virtually any LLM — but by default raypaste uses ultra-fast open-source models running on Cerebras chips, delivering responses in ~100–900ms.
Key Features
Raypaste is fast and convenient, fitting nicely in between heavy-hitting AI IDEs and tools like Cursor, Codex, and Claude Code.
- Responses in milliseconds — ~100–900ms using open-source models running on Cerebras hardware.
- Interactive mode with slash commands — stay in the loop, get instant responses, and use `/model`, `/prompt`, and `/length` to update settings on the fly.
- Project context awareness — automatically loads context from `CLAUDE.md`, `AGENTS.md`, or `.cursor/rules/` files to make generated prompts more relevant to your project.
- Prompt templating — add and use your own reusable system prompts.
- Model aliasing — bring your own key and use any model through OpenRouter.
Installation & Quick Start
Getting started takes less than two minutes.
1. Install raypaste
The easiest way on macOS is Homebrew:
brew install --cask raypaste/tap/raypaste
Or install with Go:
go install github.com/raypaste/raypaste-cli@latest
Or build from source:
git clone https://github.com/raypaste/raypaste-cli.git
cd raypaste-cli
./build
2. Get an OpenRouter API key
Raypaste routes requests through OpenRouter. Grab a key at openrouter.ai/keys.
Cerebras BYOK: For maximum speed, you can bring your own Cerebras API key and add it under OpenRouter > Settings > BYOK > Cerebras API key. As of Feb 2026 you get ~1 million free tokens per day from Cerebras.
3. Set your API key
The recommended approach is the config command:
raypaste config set api-key your_api_key_here
Or export it as an environment variable:
export RAYPASTE_API_KEY=your_api_key_here
To make the environment variable permanent, add it to your shell config:
# zsh (macOS default)
echo 'export RAYPASTE_API_KEY=your_api_key_here' >> ~/.zshrc
source ~/.zshrc
Important: Go programs don’t automatically load `.env` files. You must either export the variable or use the config command above. A `config.yaml` in your project directory also won’t work — the CLI reads from `~/.raypaste/config.yaml`.
4. Run your first prompt
raypaste "help me write a blog post about AI tools for developers"
You’ll see the response stream into your terminal. When it finishes, the full text is copied to your clipboard.
Interactive Mode
For longer sessions — brainstorming, shipping a new feature or change, writing docs — keeping a terminal open with Raypaste’s interactive mode has been helpful for my daily work.
Launch it with:
raypaste interactive
# or
raypaste i
Type your text/question that goes with your selected prompt, press Enter, and get an instant response back. Each request is stateless — Raypaste doesn’t maintain conversation history, which keeps token usage predictable and responses fast.
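Under the hood, a stateless request like this is presumably just a single OpenAI-compatible chat completion call to OpenRouter with no prior messages attached. Here’s a minimal sketch of what building one looks like in Go (the endpoint and payload shape are OpenRouter’s documented API; the function and type names are illustrative, not Raypaste’s actual internals):

```go
package main

import (
	"bytes"
	"encoding/json"
	"net/http"
)

// chatRequest is the OpenAI-compatible payload OpenRouter accepts.
type chatRequest struct {
	Model    string    `json:"model"`
	Stream   bool      `json:"stream"`
	Messages []message `json:"messages"`
}

type message struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

// newRequest builds a single, self-contained completion request.
// No conversation history is included, which is what makes each
// call stateless and keeps token usage predictable.
func newRequest(apiKey, model, system, user string) (*http.Request, error) {
	body, err := json.Marshal(chatRequest{
		Model:  model,
		Stream: true,
		Messages: []message{
			{Role: "system", Content: system},
			{Role: "user", Content: user},
		},
	})
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest("POST",
		"https://openrouter.ai/api/v1/chat/completions",
		bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Bearer "+apiKey)
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}
```

Because every request carries only the system prompt and your current input, cost and latency stay flat no matter how long your session runs.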
Slash Commands
| Command | What it does |
|---|---|
| `/model <alias>` | Switch the active model for subsequent requests |
| `/prompt <name>` | Load a prompt template from `~/.raypaste/prompts/` |
| `/length <short\|medium\|long>` | Override the response length |
| `/copy` | Copy the last response to clipboard |
| `/clear` | Clear the screen |
| `/help` | Show available commands |
| `/quit` or `/exit` | Exit interactive mode |
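Dispatching these is a simple split on the first space. A sketch of how a slash command like `/model sonnet-4.6` might be parsed (illustrative only, not Raypaste’s actual parser):

```go
package main

import "strings"

// parseSlash splits an interactive-mode line like "/model sonnet-4.6"
// into a command name and its argument. Lines that don't start with
// "/" are treated as ordinary prompt text, so ok is false.
func parseSlash(line string) (cmd, arg string, ok bool) {
	if !strings.HasPrefix(line, "/") {
		return "", "", false
	}
	parts := strings.SplitN(strings.TrimPrefix(line, "/"), " ", 2)
	cmd = parts[0]
	if len(parts) == 2 {
		arg = strings.TrimSpace(parts[1])
	}
	return cmd, arg, cmd != ""
}
```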
Keyboard Shortcuts
- `Ctrl+C` — cancel the current generation (if you’re fast enough)
- `Ctrl+D` — exit interactive mode
Customization
Out of the box, Raypaste is configured to help write meta-prompts for use in your favorite AI code editor. Here’s how to make it yours.
Custom Prompt Templates
Create and manage reusable prompts via the CLI:
Interactive mode:
raypaste config prompt add ascii-art
Or define inline:
raypaste config prompt add ascii-art \
--description "Convert text to ASCII art" \
--system "You are an ASCII art expert..." \
--medium "400"
Use it:
raypaste "coffee cup" -p ascii-art
Managing Prompts
raypaste config prompt list # List all prompts
raypaste config prompt show <name> # Show details
raypaste config prompt remove <name> # Remove custom prompt
Length Directives
Control output size per mode using either:
- Token counts — plain integers set `max_tokens` directly
- Text directives — injected into `{{.LengthDirective}}` in your system prompt
# Token counts
length_directives:
short: "200"
medium: "600"
# Text directives
length_directives:
short: "Be concise, 2-3 sentences."
medium: "Provide balanced detail."
Omit unwanted lengths to restrict a prompt to specific modes.
Template Variables
- `{{.LengthDirective}}` — Replaced with the active length directive (empty for token counts)
- `{{.Context}}` — Replaced with project context when available
You can also create prompts manually in ~/.raypaste/prompts/<name>.yaml:
name: code-review
description: "Generate a code review prompt"
system: "You are a code review expert... Output length guidance: {{.LengthDirective}}"
length_directives:
short: "200"
medium: "600"
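Those `{{.Variable}}` placeholders are Go `text/template` syntax, so substitution presumably works along these lines (a sketch under that assumption; the struct and function names are mine, not Raypaste’s):

```go
package main

import (
	"bytes"
	"text/template"
)

// vars mirrors the template variables exposed to prompt files.
type vars struct {
	LengthDirective string
	Context         string
}

// render expands {{.LengthDirective}} and {{.Context}} in a
// system prompt before it is sent with the request.
func render(system string, v vars) (string, error) {
	t, err := template.New("prompt").Parse(system)
	if err != nil {
		return "", err
	}
	var b bytes.Buffer
	if err := t.Execute(&b, v); err != nil {
		return "", err
	}
	return b.String(), nil
}
```

So a system prompt of `"Output length guidance: {{.LengthDirective}}"` with the `short` directive active would expand to `"Output length guidance: Be concise, 2-3 sentences."` before the request goes out.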
Model Aliases
Define custom model aliases in ~/.raypaste/config.yaml:
models:
sonnet-4.6:
id: "anthropic/claude-sonnet-4.6"
provider: "anthropic"
tier: "powerful"
Then use the alias with the -m flag:
raypaste "explain this stack trace" -m sonnet-4.6
You can also use any full OpenRouter model ID directly without defining an alias:
raypaste "hello" -m "anthropic/claude-sonnet-4.6"
Built-in Models
| Alias | Model | Provider |
|---|---|---|
| `cerebras-llama-8b` | `meta-llama/llama-3.1-8b-instruct` | Cerebras |
| `cerebras-gpt-oss-120b` | `openai/gpt-oss-120b` | Cerebras |
| `openai-gpt5-nano` | `openai/gpt-5-nano` | OpenAI |
Project Context Awareness
Raypaste automatically loads context from your project to make prompts more relevant. It searches for the following, in order:
- `CLAUDE.md` — Claude-specific guidance
- `AGENTS.md` — Agent configuration
- `.cursor/rules/` — Cursor/Claude AI rules
Found context is injected via {{.Context}} in your prompts. The status message shows which file was loaded.
raypaste "refactor to use interfaces" -p metaprompt
# Context from CLAUDE.md automatically included
Configuration Reference
Configuration is loaded in this order (later sources override earlier ones):
1. Default values
2. `~/.raypaste/config.yaml`
3. Environment variables (`RAYPASTE_*`)
4. CLI flags
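“Later sources override earlier ones” is a plain layered merge; conceptually it reduces to something like this (a sketch of the idea, not Raypaste’s code):

```go
package main

// merge applies configuration layers in order; a key set in a
// later layer wins over the same key in an earlier one. Layers
// would be: defaults, config file, environment, CLI flags.
func merge(layers ...map[string]string) map[string]string {
	out := map[string]string{}
	for _, layer := range layers {
		for k, v := range layer {
			out[k] = v
		}
	}
	return out
}
```

So a `-m` flag beats `RAYPASTE_DEFAULT_MODEL`, which beats `default_model` in the config file, which beats the built-in default.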
A full config file looks like:
# ~/.raypaste/config.yaml
api_key: "your_api_key_here"
default_model: cerebras-gpt-oss-120b
default_length: medium
disable_copy: false
temperature: 0.7
models:
my-custom-model:
id: "provider/model-name"
provider: "provider-name"
tier: "fast"
# more custom models...
Conclusion
Raypaste is a tool I made for myself, and so far it has served me well. It’s for developers who don’t want to context-switch for repetitive AI queries. Sure, you could use a Gemini Gem, and I still do sometimes, but after building Raypaste and using it for a week I can say there’s real value in staying in your terminal with your preset prompts a keystroke away. It’s fast, composable, and built to do one thing well while playing nicely with everything else in your workflow.
If you find yourself alt-tabbing to a chat interface more than a few times a day, try out Raypaste and see if it works for you.
The source code, issues, and contribution guidelines are all at github.com/raypaste/raypaste-cli. PRs welcome — especially new built-in prompt templates and model alias ideas.
Keep building and shipping!