How Tap2Talk Works: Hold, Speak, Release, Done
A step-by-step walkthrough of how Tap2Talk works. Download, paste your API key, hold Right Alt, speak, release, and text appears at your cursor.
Tap2Talk is a push-to-talk dictation app. You hold a key, speak, release the key, and your words appear as text wherever your cursor is. That’s the whole concept. This post walks through exactly how tap2talk works, step by step, from installation to daily use.
The Core Loop
Everything in Tap2Talk revolves around one action:
- Hold Right Alt (or Right Ctrl)
- Speak
- Release the key
- Text appears at your cursor
That’s it. No app to switch to. No button to click. No voice command to say. You hold a key on your keyboard, talk, and let go. The transcribed, cleaned-up text pastes directly into whatever application has focus — your email client, a chat window, a document, a browser form, a code editor. Anything with a text cursor.
What Happens Under the Hood
When you hold Right Alt and start talking, here’s what Tap2Talk does:
Step 1: Record. Your microphone captures audio for as long as you hold the key. The moment you release, recording stops. No audio is captured before you press or after you release.
Step 2: Transcribe. The audio is sent to Groq’s Whisper API for speech-to-text transcription. Groq runs Whisper on custom hardware that makes transcription extremely fast — you typically get results in under a second, even for longer clips.
Step 3: Clean up. The raw transcript passes through Groq’s LLM (Llama). This step is always on and automatic. It fixes grammar, adds punctuation, removes filler words (“um,” “uh,” “like”), and produces clean, readable text. You didn’t need to say “period” or “comma” — the LLM figures it out from context.
Step 4: Paste. The cleaned text is placed on your clipboard and pasted at your cursor position using Cmd+V (macOS) or Ctrl+V (Windows). It lands exactly where you were typing.
The whole process — record, transcribe, clean, paste — typically takes one to two seconds after you release the key.
Setup: Five Minutes to Working Dictation
1. Download and Install
Tap2Talk runs on macOS (Apple Silicon) and Windows 11. Download from tap2talk.app/buy.
On macOS, open the DMG and drag to Applications. On Windows, run the installer. Standard stuff.
2. Get a Groq API Key
Tap2Talk uses the Groq API for transcription and text cleanup. You need your own API key.
- Go to console.groq.com
- Sign up (free)
- Create an API key
- Copy it
Groq’s free tier includes 2,000 API requests per day, which covers most users at no cost. If you exceed the free tier, paid usage is roughly $0.04 per hour of audio — but most people never pay a cent.
3. Paste Your API Key
Open Tap2Talk’s settings and paste your Groq API key. That’s the only configuration required to start dictating.
4. Start Dictating
Hold Right Alt. Speak. Release. Text appears. You’re done.
The Hotkeys
Tap2Talk uses two hardcoded hotkeys:
- Right Alt — primary hotkey
- Right Ctrl — secondary hotkey (same function, alternative key)
These are global hotkeys. They work no matter which application is in the foreground. You don’t need to have Tap2Talk focused or even visible.
Why Right Alt and Right Ctrl? Because they’re easy to reach, rarely conflict with other shortcuts, and they’re on the right side of the keyboard where your hand naturally rests. They’re not configurable — Tap2Talk keeps things simple by removing that decision.
Lock Mode for Longer Dictation
Holding a key works great for a sentence or two. For longer content, Tap2Talk has lock mode.
Double-tap Right Alt to lock recording on. Now you can let go of the keyboard. Speak for as long as you need — hands free. When you’re done, tap Right Alt once to stop.
Lock mode has a 10-minute timeout for safety. If you forget to stop, it stops automatically.
This is perfect for long emails, reports, meeting notes, or any time you want to dictate more than a quick sentence. Read the full guide: Lock Mode: Hands-Free Dictation Without Always Listening.
LLM Cleanup: Always On, Always Helpful
Every transcription passes through Groq’s LLM before it reaches your cursor. This cleanup step:
- Fixes grammar and sentence structure
- Adds proper punctuation (periods, commas, question marks)
- Removes filler words and false starts
- Produces clean, professional text
You don’t need to speak perfectly. Talk naturally — say “um” and “uh” and “like” as much as you want. The LLM strips them out. Start a sentence, change your mind, restart it — the LLM sorts it out.
This is one of the biggest differences between Tap2Talk and basic dictation tools. You get polished text, not a raw transcript.
Custom Prompt: Make the Cleanup Your Own
The default LLM cleanup works well for general dictation. But if you have specific needs, you can write a custom prompt that tells the LLM exactly how to process your text.
Examples:
- “Always use British English spelling”
- “Format output as bullet points”
- “Use a professional, formal tone”
- “Convert to past tense”
- “Keep it concise — no more than two sentences”
The custom prompt is applied to every transcription. Set it once and forget it, or change it depending on what you’re working on.
Custom Words: Teach It Your Vocabulary
Speech-to-text engines sometimes struggle with uncommon words — brand names, technical terms, jargon, acronyms, proper nouns. Tap2Talk lets you add a custom words list to improve recognition.
If you regularly say “Kubernetes” or “HIPAA” or “Xero” or your company’s product name, add it to the list. The transcription engine will recognize these words more accurately.
Remote Dictation
Tap2Talk works with remote desktop applications:
- Chrome Remote Desktop
- Microsoft RDP
- Parsec
When you’re connected to a remote machine, Tap2Talk detects the remote desktop session and sends the transcribed text to the remote side. You speak on your local machine, and the text appears in the remote application.
Works in Any App
Tap2Talk doesn’t integrate with specific applications. It doesn’t need plugins, extensions, or API connections. It works by pasting text at your cursor — the same way Cmd+V or Ctrl+V works.
This means it works in:
- Email clients (Gmail, Outlook, Apple Mail, Thunderbird)
- Word processors (Google Docs, Word, Pages, Notion)
- Chat apps (Slack, Teams, Discord, WhatsApp Web)
- Browsers (any text field on any website)
- Code editors (VS Code, Cursor, Sublime Text)
- CRM and business tools (Salesforce, HubSpot, Xero)
- Literally any application where you can type
If you can put your cursor in a text field and press Cmd+V, Tap2Talk works there.
Platforms
- macOS: Apple Silicon (M1, M2, M3, M4). Intel Macs are not supported.
- Windows 11: Full support.
Pricing
Tap2Talk is a one-time purchase. Lifetime license. No subscription, no recurring charges. Check tap2talk.app/buy for current pricing.
Groq’s free tier (2,000 requests/day) covers most users at no ongoing cost. If you exceed the free tier, usage runs about $0.04 per hour of dictation.
Alternatively, you can get Tap2Talk free by referring 10 friends.
Try Tap2Talk — one-time purchase, no subscription. Or get it free by referring 10 friends.
FAQ
Do I need to pay for a Groq API key?
Signing up at console.groq.com is free. Groq’s free tier includes 2,000 requests per day, which covers both STT and LLM usage for most users at no cost. If you exceed the free tier, speech-to-text costs roughly $0.04 per hour of audio.
Can I change the hotkey from Right Alt to something else?
No. Tap2Talk uses hardcoded hotkeys — Right Alt (primary) and Right Ctrl (secondary). This is a deliberate design choice to keep the app simple and avoid configuration complexity. Both keys are global and work in any application.
Does Tap2Talk work offline?
No. Tap2Talk requires an internet connection because transcription and LLM cleanup happen via the Groq cloud API. There is no local/offline speech-to-text mode.
Ready to ditch typing?
Tap2Talk is $69 once — no subscription, no limits. Or get it free by referring 10 friends.