Chat LLM & LLM Pro — Support

If you need help with Chat LLM or LLM Pro, including model selection, image generation, API keys, subscriptions, or any other issue, you can contact support directly at:
Email: ferrraridave@icloud.com

Useful links:
- Privacy Policy
- Terms of Use

Frequently Asked Questions

What are Chat LLM and LLM Pro?

Chat LLM is a privacy-focused AI chat app for iPhone and iPad. It offers a free cloud AI model out of the box, an optional on-device AI model that runs entirely offline once downloaded, and optional bring-your-own-key features: access to 300+ models via OpenRouter, real-time voice chat with Google Gemini or xAI Grok, a connection to your own Mac's local models through LM Studio, and a document Reader that speaks PDFs, EPUBs, text, and webpages aloud using your choice of premium text-to-speech voices. Your settings and system prompts back up to your own iCloud automatically.

LLM Pro is a premium AI chat and image generation app for iPhone and iPad. It combines high-performance cloud AI models, real-time web search, vision capabilities, AI image generation, real-time voice chat, and spoken playback of AI replies, with clear usage limits.

Do I need an account or API key?

Chat LLM:
No account or API key is required to get started. The app includes free cloud AI access powered by Groq, which works immediately on first launch. You may optionally add your own API keys to unlock more: OpenRouter (300+ models and web search), Google Gemini (voice chat), xAI (Grok voice chat), and the Reader text-to-speech providers OpenRouter (Kokoro), MiniMax, and ElevenLabs. You can also connect to your own Mac running LM Studio, and optionally add a Groq key for tidier webpage import in Reader.

LLM Pro:
No account or API key is required to get started. The app includes free cloud AI access powered by Groq. A premium subscription unlocks additional cloud models powered by Groq and Cerebras, AI image generation, and spoken playback of AI replies — all included, no third-party keys required. Optional bring-your-own API keys are also supported for advanced users, including an xAI key for real-time voice chat.

What models are supported?

Chat LLM:
- Free cloud AI model powered by Groq (works immediately, no key required)
- Optional on-device AI model (Llama 3.2, downloaded on demand from the model picker — runs locally and works offline once installed)
- Optional access to 300+ third-party models via OpenRouter (requires your own OpenRouter key)
- Optional connection to your own Mac's local models through LM Studio over Tailscale
- Optional real-time voice chat with Google Gemini Live (requires a Gemini API key) or xAI Grok (requires an xAI API key)
- Optional Reader text-to-speech via OpenRouter Kokoro, MiniMax, or ElevenLabs (requires your own key for the chosen provider)

LLM Pro:
- Free cloud AI model powered by Groq
- Premium subscription unlocks curated high-performance cloud AI models powered by Groq and Cerebras
- Vision-capable models for image analysis
- Spoken playback of AI replies powered by Kokoro TTS via Replicate (included with subscription, no key required)
- Optional real-time voice chat powered by xAI's Grok voice model (requires an xAI API key)
- Optional bring-your-own API keys for access to additional third-party models

What is AI image generation in LLM Pro?

LLM Pro includes built-in AI image generation and image-to-image editing. Image generation and editing are processed via Replicate. The premium subscription includes up to 100 images per month. Image usage is clearly displayed in-app.

Connect to your own Mac with LM Studio (Chat LLM only)

Chat LLM can connect to LM Studio running on your own computer and use any model you have loaded there — from your iPhone or iPad, anywhere — over a private Tailscale connection. You can browse, load, and unload your LM Studio models directly from within Chat LLM, with a built-in connection test, and the chat experience is identical to every other provider.

Because the models run on your own hardware, there are no per-message API credits, no rate limits, and no content restrictions — the only cost is your own electricity and broadband. Your chat content is sent over your private network connection to your own machine; it is not sent to us or to any third-party cloud provider.

What you need: LM Studio installed on your computer, and Tailscale installed on both your computer and your iPhone/iPad (both free).

What is real-time web search?

Chat LLM supports two forms of real-time web search:

OpenRouter Web Search (text chat):
When enabled in settings, OpenRouter performs a live web search before the selected model responds and feeds the results into its context. This works with any OpenRouter model and is useful for current events, news, prices, or anything beyond the model's training data. Search results are region-aware: the app passes your device locale to the search step so users outside the US receive region-relevant sources on the first attempt. A small per-search fee (~$0.004) is charged to your OpenRouter account on top of the normal model cost.
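
As a rough illustration of how the search fee adds up, the sketch below multiplies an assumed number of searches by the approximate $0.004 fee quoted above. The function name and the figure of 250 searches are hypothetical. The fee is approximate and is charged by OpenRouter on top of the normal model cost, so treat the result as an estimate only.

```python
from decimal import Decimal

# Approximate per-search fee quoted above (an estimate, not an exact price).
APPROX_FEE_PER_SEARCH_USD = Decimal("0.004")


def estimate_search_fees(searches: int) -> Decimal:
    """Estimate the web search fees in USD, excluding the normal model cost."""
    if searches < 0:
        raise ValueError("searches must be non-negative")
    return APPROX_FEE_PER_SEARCH_USD * searches


# Hypothetical month with 250 web-enabled messages: roughly one US dollar.
print(estimate_search_fees(250))
```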

Gemini Voice Chat Web Search:
Real-time voice chat powered by Gemini Live includes built-in Google Search grounding at no extra cost. Gemini automatically searches the web when you ask about current events, weather, sports results, or other live information, and weaves the results naturally into its spoken reply.

LLM Pro:
Real-time web search is available on supported premium models and requires an internet connection.

What is real-time Voice Chat?

Chat LLM:
Chat LLM offers real-time spoken conversations with AI through two optional backends, each using your own API key:

Google Gemini Live (requires a Gemini API key): natural real-time conversation with a choice of voices, built-in Google Search grounding so it can answer questions about current events and live information, and the ability to attach photos or camera images mid-conversation for Gemini to discuss. Spoken input and conversation content are processed by Google.

xAI Grok (requires an xAI API key): a fast, real-time voice backend with its own set of voices. Spoken input and conversation content are processed by xAI.

Each backend has its own system prompts, so the assistant's personality, accent, speech style, and behaviour can be shaped independently for each one.

LLM Pro:
Real-time Voice Chat enables natural spoken conversations with AI powered by xAI's Grok voice model. This feature requires your own xAI API key. Spoken input and conversation content are processed by xAI. The voice model performs background reasoning so it can think through more challenging questions before answering, with no impact on response latency. The voice assistant's personality, style, and behaviour can be shaped through the system prompt.

How do I hear an AI reply read aloud?

In Chat LLM, spoken playback is handled by Reader, the app's dedicated playback screen. To hear any AI reply aloud, long-press the message and share it into Reader, where it is read using your chosen text-to-speech voice with full transport and lock-screen controls. See Reader mode below for the available voices.

LLM Pro: Spoken playback is included as part of the LLM Pro premium subscription — no API key required. A play button, voice catalogue, speed controls, and audio cache settings are available, with generation costs absorbed by the subscription.

What is Reader mode? (Chat LLM only)

Reader mode turns Chat LLM into an audiobook player for your own content. Select Reader from the model picker, then import a PDF, EPUB, or TXT file, paste any text, or share a webpage or document into the app from the iOS share sheet — Reader speaks it aloud, sentence by sentence.

Choose from three text-to-speech providers, set per document: OpenRouter Kokoro (the dirt-cheap everyday option), MiniMax (very high quality and great value), and ElevenLabs (best-in-class, most expressive narration). Each uses your own API key for the chosen provider.

Tap any sentence to jump straight to it, and adjust voice and playback speed per document. Your position in each document is saved automatically so you can pick up exactly where you left off, and a built-in library keeps all your imported documents in one place. Your MiniMax and ElevenLabs account voices sync automatically each time you open Reader, you can search long voice lists, and you can star favourites. Generation costs are billed to your own provider account — a fraction of a cent per request with Kokoro; MiniMax and ElevenLabs vary by plan. Reader mode is only available in Chat LLM.

Do my settings sync across devices? (Chat LLM)

Yes. Chat LLM backs up your settings and all of your system prompts — including the individual system prompts for each provider and the secondary prompts used by the voice providers — to your own iCloud account, so they restore automatically across your devices and after a reinstall. This uses Apple's iCloud and we have no access to it. Your conversations and imported Reader documents are deliberately not synced.

How is my data handled?

On-Device AI (Local Llama model):
When using the on-device Llama model in Chat LLM, all processing happens entirely on your device. No messages, data, or personal information are sent to any server. The model itself is downloaded from Hugging Face on first use. Only the download itself contacts Hugging Face; no chat content is ever sent there.

Free Cloud AI (Groq):
Messages and conversation content are sent to Groq to generate AI responses. Conversations are not stored on our servers.

LLM Pro – Premium Cloud Models (Groq & Cerebras):
Depending on the model selected, messages and conversation content are sent to Groq or Cerebras to generate AI responses. We do not store conversations, images, or documents on our servers.

Bring Your Own API Key (OpenRouter):
When you use your own OpenRouter API key in Chat LLM, messages and content are sent directly to OpenRouter, which may route requests to different model providers depending on the model selected. When OpenRouter web search is enabled, search queries are also processed by OpenRouter. API keys are stored only on your device.

LM Studio (Chat LLM):
If you connect Chat LLM to LM Studio running on your own computer, your chat content is sent over your private network / Tailscale connection directly to your own machine. It is not sent to us or to any third-party cloud provider.

Voice Chat (Google Gemini, Chat LLM):
When using Gemini voice chat in Chat LLM, spoken input, conversation content, and any images you attach are sent to Google to generate AI responses. Google Search grounding may be used automatically to answer questions requiring live information. No voice data or transcripts are stored on our servers. Your Gemini API key is stored only on your device.

Voice Chat (xAI Grok, Chat LLM and LLM Pro):
When using Grok voice chat, spoken input and conversation content are sent to xAI to generate AI responses. No voice data or transcripts are stored on our servers. Your xAI API key is stored only on your device.

Reader & Text-to-Speech (Chat LLM):
Reader sends the text being synthesised directly from your device to the text-to-speech provider you select — OpenRouter (Kokoro), MiniMax, or ElevenLabs — using your own API key for that provider. The generated audio is downloaded to your device and cached locally so repeat playback does not re-incur a generation cost. No document text or audio is stored on our servers, and your provider API keys are stored only on your device.
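
The caching behaviour described above can be pictured with a minimal sketch. It is not the app's actual implementation: the cache key, the synthesize callback, and the file layout are all assumptions made for illustration. The idea is simply that audio is keyed by provider, voice, and text, so a repeated sentence is read from local storage instead of being generated again.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Callable


def cached_audio(cache_dir: Path, provider: str, voice: str, text: str,
                 synthesize: Callable[[str, str, str], bytes]) -> bytes:
    """Return audio for `text`, calling `synthesize` only on a cache miss."""
    # Hypothetical cache key: a hash of provider, voice, and sentence text.
    raw_key = "\x00".join((provider, voice, text)).encode("utf-8")
    path = cache_dir / f"{hashlib.sha256(raw_key).hexdigest()}.audio"
    if path.exists():
        return path.read_bytes()  # Cache hit: no new generation cost.
    audio = synthesize(provider, voice, text)  # Cache miss: generated once.
    path.write_bytes(audio)
    return audio


# Usage with a stand-in synthesizer that records how often it is called.
calls = []


def fake_synthesize(provider: str, voice: str, text: str) -> bytes:
    calls.append(text)
    return f"{provider}:{voice}:{text}".encode("utf-8")


with tempfile.TemporaryDirectory() as tmp:
    first = cached_audio(Path(tmp), "kokoro", "voice-a", "Hello.", fake_synthesize)
    second = cached_audio(Path(tmp), "kokoro", "voice-a", "Hello.", fake_synthesize)
```

In this sketch the second call returns the same bytes without invoking the synthesizer again, which mirrors why repeat playback does not incur a new generation cost.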

Webpage import cleanup (optional, Chat LLM):
Reader cleans imported webpages on your device by default. If you add your own Groq API key, webpage text can optionally be sent to Groq to be tidied up for clearer narration. This is optional and off unless you add a Groq key.

Settings & System Prompt Sync (Apple iCloud, Chat LLM):
Your settings and system prompts are backed up to your personal iCloud account so they restore across your devices. This uses Apple's iCloud; we have no access to it. Conversations and imported Reader documents are not synced.

Spoken Playback of AI Replies (LLM Pro, Replicate / Kokoro):
In LLM Pro, message text is sent to our backend server, which forwards it to Replicate on your behalf to generate the audio — no API key required. The generated audio is downloaded to your device and cached locally. No message text or audio is stored on our servers.

AI Image Generation (LLM Pro, Replicate):
When generating or editing images in LLM Pro, prompt text and any selected images are sent to Replicate, which processes them to generate the images shown in the app. Images are not stored on our servers.

Can I use the app offline?

Chat LLM: The optional on-device Llama model works fully offline once downloaded from the model picker — no internet connection is required after that initial download. All other features require a connection: Groq free chat, OpenRouter, Gemini and Grok voice chat, Reader text-to-speech, and LM Studio (which needs a network connection to your own computer).

LLM Pro: Requires an internet connection for all features.

Why can't I upload images or documents?

Image and document uploads are only available when using models that support vision or document input. Feature availability depends on the selected model.

What is the LLM Pro subscription?

LLM Pro offers a £9.99/month subscription that unlocks premium cloud AI models powered by Groq and Cerebras, real-time web search on supported models, AI image generation processed via Replicate (up to 100 images per month), and spoken playback of AI replies (Kokoro TTS via Replicate).

What devices are supported?

Both apps support iPhone running iOS 18.0 or later and iPad running iPadOS 18.0 or later.

How do I report a bug?

Should you have a bug to report, please email details and reproduction steps to:
Email: ferrraridave@icloud.com