The friction nobody talks about
You use AI tools constantly — ChatGPT for drafts, Claude for analysis, Gemini for quick lookups. The workflow is always the same: switch to the browser, open the chat, type your prompt, wait, copy the response, switch back to your app, paste. That is seven steps for every single AI interaction.
Voice dictation helps with typing speed, but it does nothing about this context-switching. You are still jumping between apps, still copying and pasting, still losing focus on what you were actually doing.
My Agent closes that gap entirely. Hold a hotkey, speak your instruction, and the AI's response is typed directly at your cursor — in VS Code, in Gmail, in Slack, in your terminal. You never leave your app.
How My Agent works
The flow from your voice to the AI's typed output has five steps, all handled automatically:
1. Hold your agent hotkey: a separate, configurable combo distinct from your normal transcription hotkey (e.g. Ctrl+Alt+Space).
2. Speak your instruction: naturally, as you would to a person.
3. Whisper transcribes locally: on your CPU, with no audio sent anywhere.
4. Your text goes to your AI: directly from AirTypes to your API endpoint, not via our servers.
5. The AI's response is typed at your cursor: in whatever app had focus when you pressed the hotkey.
The whole cycle takes a few seconds. Most of that is your AI provider's response time. The Whisper transcription step is sub-second on most hardware.
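The five steps above can be sketched as a small pipeline. The function names here are hypothetical stand-ins used for illustration, not AirTypes' actual internals:

```python
# Illustrative sketch of the agent cycle; each stage is a pluggable
# callable, mirroring the five steps described above.

def run_agent_cycle(record_while_held, transcribe, call_model, type_at_cursor):
    audio = record_while_held()   # steps 1-2: hold the hotkey, speak
    text = transcribe(audio)      # step 3: local Whisper, on-CPU
    reply = call_model(text)      # step 4: direct call to your endpoint
    type_at_cursor(reply)         # step 5: inject into the focused app
    return reply

# Wiring in dummy stages shows the data flow end to end:
result = run_agent_cycle(
    record_while_held=lambda: b"fake-audio",
    transcribe=lambda audio: "summarise this thread",
    call_model=lambda text: f"[model reply to: {text}]",
    type_at_cursor=lambda reply: None,
)
print(result)  # → [model reply to: summarise this thread]
```

The only network hop in the whole cycle is `call_model`, which is why the latency is dominated by your provider's response time.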
Agent Profiles: say the name, get the right AI behaviour
The most powerful part of My Agent is not the hotkey — it is the profile system. You create named profiles, each with a system prompt that tells the AI how to behave. Then you activate a profile simply by saying its name at the start of your speech.
Examples of profiles in practice
Every profile works the same way: say the profile name first, then your instruction. AirTypes matches the name, removes it from what the model sees, and applies that profile's system prompt. Each example below pairs what you say out loud with what ends up typed at your cursor.

- “Email, tell Sarah the sprint review is Thursday at 3pm” → a polished, ready-to-send email body, not a meta explanation.
- “Prompt, I need a dashboard showing weekly revenue by plan type” → a clear, structured AI prompt built from your casual description.
- “Ticket, login button not responding on mobile Safari after update” → a formatted report: repro steps, expected vs actual, context.
- “Casual, can we push standup to ten?” → a short, friendly message in the right register for team chat.
- “Write a one-paragraph summary of this meeting” → no profile name, so the model treats your whole line as a normal instruction, with no extra system prompt applied.
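The name-matching described above can be sketched in a few lines. The exact matching rules (case handling, punctuation) are assumptions; the point is that the profile name is stripped before the model ever sees your text:

```python
# Hypothetical sketch of profile resolution; PROFILES and the matching
# rules are illustrative, not AirTypes' actual implementation.
PROFILES = {
    "email": "You are a professional email writer. Output only the email body.",
    "ticket": "Format the user's note as a bug report: repro steps, expected vs actual.",
}

def resolve_profile(transcript: str):
    """Return (system_prompt, user_text); the profile name, if matched,
    is removed from what the model sees."""
    head, _, rest = transcript.partition(",")
    key = head.strip().lower()
    if key in PROFILES and rest:
        return PROFILES[key], rest.strip()
    return None, transcript  # no named profile: plain instruction

system, user = resolve_profile("Email, tell Sarah the sprint review is Thursday at 3pm")
# system is the Email profile's prompt; user is just the instruction
```

A transcript that starts with no known profile name falls through unchanged, which is exactly the last example above.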
Bring Your Own AI — any OpenAI-compatible provider
My Agent is not tied to any single AI provider. It works with any API that speaks the OpenAI chat completions format — which covers virtually every major provider today:
- OpenAI — GPT-4o, GPT-4 Turbo, GPT-3.5 Turbo
- Anthropic Claude — via OpenRouter (claude-opus-4, claude-sonnet-4)
- Google Gemini — via OpenRouter or Gemini API endpoint
- Groq — ultra-fast Llama 3, Mistral inference
- Ollama — fully local models, zero internet required end-to-end
- Any OpenRouter model — 100+ models via one API key
- Nebius, Together AI, Fireworks — and any other OpenAI-compatible host
To configure My Agent, open Settings → My Agent and enter three things: the endpoint URL (e.g. https://api.openai.com/v1), your API key, and the model name (e.g. gpt-4o). That is it.
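Every provider in the list above accepts the same request shape, which is why those three settings are all that is needed. A hedged sketch of the payload (field names follow the OpenAI chat completions format; the system prompt shown is illustrative):

```python
import json

def build_chat_request(model, system_prompt, user_text):
    """Build an OpenAI-compatible chat completions payload."""
    messages = []
    if system_prompt:  # a profile's system prompt, if one matched
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": user_text})
    return {"model": model, "messages": messages}

payload = build_chat_request(
    model="gpt-4o",
    system_prompt="You are a professional email writer.",
    user_text="tell Sarah the sprint review is Thursday at 3pm",
)
# This JSON is POSTed to <endpoint>/chat/completions with your API key
# in the Authorization header; the reply text comes back in
# response["choices"][0]["message"]["content"].
print(json.dumps(payload, indent=2))
```

Swapping providers means changing only the endpoint URL and model name; the payload stays identical.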
Privacy: your data never touches our servers
This is the part that matters most. When My Agent runs:
- Your API key is stored locally on your device only — never uploaded to AirTypes
- The HTTP call to your AI provider is made directly from the AirTypes desktop app — not via any AirTypes proxy or backend
- AirTypes never sees your prompt, your AI's response, or the content of your voice
- If you use Ollama, there is zero internet traffic at all — Whisper runs locally, the AI runs locally, everything stays on your machine
The privacy architecture is not a setting you have to configure — it is how the feature is built. AirTypes acts as a local bridge between your voice and your AI. We are not in the middle of that conversation.
Your API key never leaves your device. Your transcribed text travels directly from AirTypes to your chosen AI provider, encrypted in transit and never stored by us. We see nothing.
Real use cases people are using it for today
Writing emails without typing
Hold the agent hotkey, say "Email, reply to Marcus telling him the contract is approved and we can start Monday" — and a ready-to-send professional email appears in your compose window. No drafting, no editing, no tab switching.
Creating AI prompts in creative flow
Developers and designers often have an idea mid-flow but break focus to write a prompt. With the Prompt profile, you speak the idea casually and My Agent converts it into a structured, well-formed prompt that produces better AI output.
Bug tickets from a quick voice note
Spot a bug while testing? Hold the hotkey, say "Ticket, the save button on the settings page throws a 500 when the user has no billing info" — and a formatted bug report appears in your Linear or Jira field.
Slack messages in the right tone
Different audiences need different tones. The Casual profile is great for team channels. The Professional profile handles client communication. You speak once, the AI handles the register.
Spoken bullets → polished docs
In Notion, Confluence, or a README, ramble the points you want to cover. My Agent rewrites them as clear prose or a tight outline at your cursor — no second window for cleanup.
Fully local workflows (Ollama)
Combine AirTypes with a local Ollama model and the entire pipeline — voice capture, transcription, AI processing, text injection — runs on your machine. No internet, no API costs, no data leaving your device.
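As a concrete example, the three My Agent settings for a fully local pipeline would look like this (the endpoint and port are Ollama's defaults; the model name is whatever you have pulled locally, and the key can be any placeholder since Ollama ignores it):

```python
# Ollama exposes an OpenAI-compatible API on localhost by default,
# so a fully local My Agent configuration is just:
OLLAMA_SETTINGS = {
    "endpoint": "http://localhost:11434/v1",  # Ollama's default port
    "api_key": "ollama",                      # ignored by Ollama, but a value must be set
    "model": "llama3.1",                      # any model you have pulled, e.g. `ollama pull llama3.1`
}
```

With these settings, step 4 of the agent flow becomes a loopback request; nothing crosses the network interface.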
My Agent vs normal voice dictation: when to use each
Same app, two modes. The scenarios below are grouped by which mode is the better default, four situations per side, so the split stays easy to scan.
Normal dictation (transcription). Speech is converted to text and injected as-is. Choose this when the exact words matter and you do not want the model to rewrite or format.
- Meeting notes, verbatim: keep a faithful record, with no paraphrase or auto-summary.
- Code comments in your own words: technical wording should stay yours; avoid “helpful” rewrites.
- Filling in a form with specific data: names, IDs, and figures need to land exactly as spoken.
- Dictating in a foreign language: preserve the transcribed language and register without the model switching tone or locale.
My Agent (AI-assisted output). Speech becomes a request to your model; the reply is typed at the cursor. Use this when you want structure, tone, or formatting rather than a raw transcript.
- Drafting a professional email: polished body text from a casual voice note, still in your compose window.
- Rambling idea → structured prompt: turn messy speech into a clear prompt you can reuse or refine.
- Quick Slack message in the right tone: profiles match the channel or audience; dictation alone pastes how you talked.
- Bug ticket from a quick description: repro steps, expected vs actual, and structure, not one unstructured blob.
The two modes complement each other. Normal dictation is fastest for direct, literal text. My Agent is the right tool whenever you want the AI to do work on top of what you said.
How to get started with My Agent
- Download AirTypes — available for Linux & macOS; Windows in development
- Set up your API credentials — Settings → My Agent → enter your endpoint URL, API key, and model name
- Configure your agent hotkey — Settings → Hotkey → Agent Hotkey section, then click Save Settings
- Create your first profile — Settings → My Agent → My Profiles → + New. Name it "Email" and write a system prompt like: "You are a professional email writer. Rewrite the user's casual voice note as a clear, polished email. Output only the email body."
- Use it — switch to your email app, hold your agent hotkey, say "Email, [your message]", release
You can create up to 20 profiles, each with its own system prompt. They sync to your account so they work across reinstalls and devices.
Conclusion
Voice dictation and AI tools have been parallel workflows. My Agent connects them. Your voice becomes the input. Your AI becomes the processor. Your cursor — in whatever app you are working in — is where the output lands.
There is no copy-paste. No tab switching. No breaking your flow. And because your API key never leaves your machine, there is no privacy trade-off either.