Fixing DeepSeek on Janitor AI: API Setup and Infinite Loading Fix

Ashique Hussain · May 6, 2026 · 14 min read

You are tired of mainstream LLM APIs silently restricting your outputs, breaking character immersion, or billing you like you are running a small nation-state. You want to know how to set up DeepSeek on Janitor AI. For unrestricted roleplay, deep character memory, and immersive storytelling, DeepSeek is a solid architectural choice: cheaper, highly capable, and fiercely uncensored. But connecting a raw API to a third-party client is not always plug-and-play. Let us break this down into components.

⚡ DeepSeek Janitor AI Quick Configuration Reference

Use these exact parameters in Janitor AI's Custom API (OpenAI-compatible) settings. 95% of setup errors (including the infinite loading screen) are caused by omitting the /v1 endpoint path or writing the wrong model name.

  • API Base URL (Direct): https://api.deepseek.com/v1 (must include the /v1 suffix)
  • API Base URL (OpenRouter): https://openrouter.ai/api/v1 (standard failover gateway)
  • Model Name (Chat Direct): deepseek-v4-flash (the post-July 2026 low-latency, non-thinking option)
  • Model Name (Thinking Direct): deepseek-v4-pro (the post-July 2026 reasoning model option)
  • Model Name (OpenRouter Chat): deepseek/deepseek-v4-flash (routing string for OpenRouter non-thinking)
  • Min Account Balance: $2.00 USD (DeepSeek keys are inactive if the balance is $0)

The Architecture: What Is Actually Happening?

Before we start configuring things, let us clear up a massive misconception. Janitor AI is just a client frontend. It does not host the language models. Think of it as a beautifully styled terminal window.

DeepSeek, on the other hand, is the engine. You are connecting the two via an API key and a base URL. If your API routing is wrong, the entire pipeline fails silently. We are essentially borrowing the standard OpenAI API contract: Janitor AI treats external endpoints as "OpenAI-compatible," so it sends OpenAI-style requests and DeepSeek accepts them.

  • Requirement: Valid API Key. Generated directly from the DeepSeek developer console.
  • Format: OpenAI Compatible. Must use OpenAI request payload formatting.
  • Base URL: Endpoints Matter. Missing the version path will break the integration.

Prerequisites: The $2 Minimum Balance Rule

Before we touch a single API key, we need to address the most common failure state. DeepSeek operates on a prepaid billing model. If you generate a key on a completely empty account ($0 balance), requests made with it will not go through, and Janitor AI gives you no useful error: it will just spin, or return blank responses.

To avoid this, you must add at least $2 to your DeepSeek account. It is a tiny threshold, but it acts as an anti-abuse measure on their end. Go to the billing dashboard, drop in the minimum top-up, and your API keys will immediately activate for external routing.

Step 1: Procuring the DeepSeek API Key

First, head over to the DeepSeek Platform and create an account. Navigate to the API keys dashboard and generate a new key.

A friendly warning from someone who has seen production go down at 3 AM: Treat this API key like a production database password. If you leak it, revoke it immediately unless you want a very expensive lesson in cloud billing.

Step 2: The Critical Configuration in Janitor AI

Now, log into Janitor AI and navigate to the API / Model Settings. Select the Custom API (OpenAI-compatible) option. Here are the precise parameters. Do not ad-lib here; I cannot stress this enough.

  • API Base URL: https://api.deepseek.com/v1 (Direct) or https://openrouter.ai/api/v1 (OpenRouter)
    Notice the /v1. Just one missing version tag and you get infinite loading. If using OpenRouter, use their v1 endpoint.
  • API Key: Paste the exact string you generated from DeepSeek (or OpenRouter, depending on your choice).
  • Model Name: deepseek-v4-flash (for Direct Chat) or deepseek-v4-pro (for Direct Thinking / reasoning), or their OpenRouter equivalents like deepseek/deepseek-v4-flash.

With those values entered, the request body sent to the endpoint looks roughly like this:

{
  "model": "deepseek-v4-flash",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."}
  ],
  "temperature": 1.1,
  "top_p": 1.0,
  "max_tokens": 4096
}
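
To sanity-check the same values outside Janitor AI, here is a minimal Python sketch that builds the request an OpenAI-compatible client would send. The helper name build_chat_request is hypothetical, and the /chat/completions path and Bearer header follow the general OpenAI-compatible convention rather than anything Janitor AI documents. The network call is left commented out so nothing is sent by accident.

```python
import json
import urllib.request


def build_chat_request(base_url: str, api_key: str, model: str, user_text: str) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style chat completion request."""
    base = base_url.rstrip("/")
    if not base.endswith("/v1"):
        # The guide's most common failure: a base URL without the version path.
        raise ValueError(f"Base URL should end with /v1, got: {base_url}")
    payload = {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": user_text},
        ],
        "temperature": 1.1,
        "top_p": 1.0,
        "max_tokens": 4096,
    }
    return urllib.request.Request(
        url=f"{base}/chat/completions",  # assumed OpenAI-compatible route
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_chat_request(
        "https://api.deepseek.com/v1", "sk-placeholder", "deepseek-v4-flash", "Hello"
    )
    print(req.full_url)
    # To actually send it (needs a real key and a funded account):
    # with urllib.request.urlopen(req, timeout=30) as resp:
    #     print(json.load(resp)["choices"][0]["message"]["content"])
```

If this script gets a reply but Janitor AI does not, the problem is in the Janitor settings rather than in your key or balance.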

DeepSeek Direct vs. OpenRouter: Which is Better?

You have two main paths to pipe DeepSeek into Janitor AI: the direct DeepSeek API, or an aggregator like OpenRouter. For normal users, OpenRouter is the standard alternative to direct integration.

The direct API is cheaper if you use it heavily, but you might face geographic routing issues or rate limits during peak Chinese timezone hours. OpenRouter sits in the middle. You pay a slight markup, but OpenRouter handles the failovers, accepts standard US/EU credit cards more easily, and provides a unified dashboard for all your LLM spending. If you want the path of least resistance for immersive storytelling, OpenRouter is excellent.

Choosing the Right Model: DeepSeek V4 Flash vs. DeepSeek V4 Pro (Thinking)

DeepSeek's post-July 2026 lineup features DeepSeek V4 Pro (the reasoning/thinking mode option, replacing the old deepseek-reasoner), which uses chain-of-thought processing to "think" before it speaks. While incredible for coding or math, avoid using thinking-mode models like V4 Pro for standard roleplay in Janitor AI.

Reasoning models output raw <think>...</think> tags containing their internal monologue. Janitor AI’s frontend currently struggles to parse or hide these tags cleanly, which immediately breaks immersion during a character interaction. For seamless roleplay and character interactions, always stick to DeepSeek V4 Flash (deepseek-v4-flash, which replaces the legacy deepseek-chat). It remains lightning fast, highly responsive, completely uncensored, and parses perfectly in Janitor’s interface.
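
If you do end up routing a thinking model through your own script or proxy, the tags can be stripped before the text reaches the chat window. This is a minimal sketch, assuming the reply arrives as a single string with the <think>...</think> block inline, as described above; strip_think_blocks is a hypothetical helper, not something Janitor AI exposes.

```python
import re

# Matches a complete <think>...</think> block, including newlines inside it,
# plus any whitespace that follows the closing tag.
_THINK_BLOCK = re.compile(r"<think>.*?</think>\s*", re.DOTALL)


def strip_think_blocks(text: str) -> str:
    """Remove reasoning blocks so only the in-character reply remains."""
    return _THINK_BLOCK.sub("", text).strip()


if __name__ == "__main__":
    raw = "<think>The user greeted me.\nStay in character.</think>\nWell met, traveler."
    print(strip_think_blocks(raw))
```

The non-greedy match keeps several separate blocks in one reply from swallowing the text between them. An unterminated <think> tag is left untouched, which is the safer failure mode for a streamed reply.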

For a comprehensive evaluation of how DeepSeek V3 fares against the industry's heaviest models in coding, logic, and reasoning benchmarks, read my detailed shootout of the best AI chatbots. If you're architecting a wider technical stack, explore my ultimate AI tools guide listing leading text, image, and automation platforms.

Step 3: Tuning the Hyperparameters

DeepSeek is not Claude or GPT-4. If you leave the default hyperparameters untouched, it might start producing output that resembles a junior dev explaining their latest spaghetti code.

  • Temperature (1.0 - 1.3): Start around 1.0. For more creative prose and less rigid, repetitive patterns in storytelling, safely bump it up toward 1.3.
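  • Top-P (1.0): Controls nucleus sampling, where the model samples only from the smallest set of tokens whose cumulative probability reaches P. At 1.0 nothing is cut off, which leaves temperature as the single creativity knob and is the recommended setting here.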
  • Max Tokens (2048 - 4096): Set this high enough (such as 4096) so long character responses do not abruptly cut off mid-sentence.

Taking Advantage of the 128K Context Window

One of the biggest advantages of DeepSeek's V4 models is their massive 128K context window. In roleplay terms, this means the character can "remember" events that happened dozens of chapters ago without you needing to summarize them manually in the chat memory.

In Janitor AI’s Generation Settings, you can safely slide the Context Size up to 64,000 or even 100,000 if your API balance can handle the token ingestion costs. Be warned: sending 100K tokens per message gets expensive fast, even at DeepSeek's rock-bottom pricing. A sweet spot for deep memory without bankrupting yourself is typically around 32,000 tokens.
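
To see how quickly context size adds up, here is a rough back-of-the-envelope calculation in Python. It assumes every message resends the full context and counts input tokens only. The price below is a placeholder I made up for illustration, not DeepSeek's actual rate, so substitute the number from your billing page.

```python
def estimate_input_cost(context_tokens: int, messages: int, usd_per_million_input_tokens: float) -> float:
    """Rough input-side cost if every message resends `context_tokens` tokens."""
    return context_tokens * messages * usd_per_million_input_tokens / 1_000_000


if __name__ == "__main__":
    # Placeholder price in USD per 1M input tokens; NOT DeepSeek's real rate.
    HYPOTHETICAL_PRICE = 0.30
    for context in (32_000, 64_000, 100_000):
        cost = estimate_input_cost(
            context, messages=200, usd_per_million_input_tokens=HYPOTHETICAL_PRICE
        )
        print(f"{context:>7} tokens x 200 messages: ${cost:.2f}")
```

The estimate ignores output tokens and any provider-side caching discounts, so treat it as an upper-bound intuition for the input side rather than a bill forecast.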

Advanced Architecture: The LiteLLM Proxy (For Power Users)

If you are running this setup daily and want ultimate control, connecting Janitor AI directly to DeepSeek or even OpenRouter might feel limiting. You are one API outage away from a dead roleplay session. For power users, the adult in the room for model orchestration is LiteLLM.

By running a local LiteLLM proxy, you point Janitor AI to http://localhost:4000. LiteLLM handles the routing. I remember a specific Friday night where a major API provider went down, but my LiteLLM router automatically failed over to a local Llama3 instance. The client never noticed the drop. That is the kind of resilience you want.

Here is the exact config.yaml you need to run LiteLLM with DeepSeek and a local fallback:

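model_list:
  # Primary: DeepSeek via its hosted API; the key is read from the environment
  - model_name: deepseek-v4-flash
    litellm_params:
      model: deepseek/deepseek-v4-flash
      api_key: os.environ/DEEPSEEK_API_KEY
  # Fallback: local Ollama model under its own name so the router can target it
  - model_name: local-llama3
    litellm_params:
      model: ollama/llama3
      api_base: http://localhost:11434

router_settings:
  routing_strategy: usage-based-routing
  # Fallback targets are model_name values from model_list, not provider strings
  fallbacks: [{"deepseek-v4-flash": ["local-llama3"]}]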

Run this via Docker, point Janitor AI to your local proxy, and you suddenly have enterprise-grade failovers for your roleplay sessions. (Yes, I know I should have used a Kubernetes cluster. No, I didn't. Move on.)
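
If the failover behavior feels like magic, the core idea fits in a few lines. The Python sketch below is not LiteLLM's implementation, just an illustration of ordered fallback; the backend names and the two stand-in functions are hypothetical.

```python
from typing import Callable, Sequence, Tuple


def complete_with_fallback(
    prompt: str, backends: Sequence[Tuple[str, Callable[[str], str]]]
) -> Tuple[str, str]:
    """Try each backend in order; return (backend_name, reply) from the first success."""
    errors = []
    for name, call in backends:
        try:
            return name, call(prompt)
        except Exception as exc:  # a real router would narrow this to network/API errors
            errors.append(f"{name}: {exc}")
    raise RuntimeError("All backends failed: " + "; ".join(errors))


if __name__ == "__main__":
    def flaky_primary(prompt: str) -> str:
        # Stand-in for a hosted API that is currently down.
        raise ConnectionError("simulated outage")

    def local_fallback(prompt: str) -> str:
        # Stand-in for a local model that always answers.
        return f"(local) echo: {prompt}"

    print(complete_with_fallback(
        "Hello", [("primary", flaky_primary), ("local-fallback", local_fallback)]
    ))
```

A production router adds retries, cooldowns, and timeouts on top of this loop, which is the part worth outsourcing to a proxy.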

Latency: The Silent Killer

Let us talk about Time to First Token (TTFT). If it takes 30 seconds to respond, it is not an "Assistant." It is a pen pal.

DeepSeek is fast, but network routing from certain geographic locations can introduce a 300ms overhead before token generation even begins. If you are experiencing high latency in Janitor AI, check your DNS resolution, or better yet, make sure streaming is toggled on in the Janitor UI. Streaming drastically reduces perceived latency by printing tokens as they arrive rather than waiting for the entire response.
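
For the curious, this is roughly what a streaming client does under the hood. The Python sketch below assumes the "data: {...}" lines and "[DONE]" sentinel used by OpenAI-compatible streaming APIs; iter_stream_text is a hypothetical helper and the sample lines are hand-written, not captured from DeepSeek.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from OpenAI-style server-sent event lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and comments
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        choices = chunk.get("choices") or []
        if not choices:
            continue
        text = choices[0].get("delta", {}).get("content")
        if text:
            yield text


if __name__ == "__main__":
    sample = [
        'data: {"choices":[{"delta":{"role":"assistant"}}]}',
        "",
        'data: {"choices":[{"delta":{"content":"Well met, "}}]}',
        'data: {"choices":[{"delta":{"content":"traveler."}}]}',
        "data: [DONE]",
    ]
    for fragment in iter_stream_text(sample):
        print(fragment, end="", flush=True)
    print()
```

Each fragment can be shown the moment it arrives, which is why the first words appear quickly even when the full reply takes several seconds.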

Debugging the Inevitable Failures

If you have followed these steps and things still are not working, here is your mini post-mortem checklist:

  • The Infinite Loading Screen: You almost certainly messed up the Base URL. Verify the /v1 is present.
  • "Model Not Found" Error: You typo'd the Model ID. It must be exactly deepseek-v4-flash.
  • Silent Empty Replies: Usually an API key issue, or you ran out of credits on the DeepSeek platform. Check your billing dashboard.
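
The first two items on that checklist can be caught before you ever hit Save. Here is a small Python preflight sketch, assuming the base URLs and model names from the quick reference at the top of this article; preflight and KNOWN_MODELS are hypothetical names, and the check cannot see your account balance.

```python
from typing import List
from urllib.parse import urlparse

# Model names taken from the quick reference in this article.
KNOWN_MODELS = {
    "deepseek-v4-flash",
    "deepseek-v4-pro",
    "deepseek/deepseek-v4-flash",
}


def preflight(base_url: str, api_key: str, model: str) -> List[str]:
    """Return a list of likely configuration problems (empty means none found)."""
    problems = []
    parsed = urlparse(base_url.strip())
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        problems.append("Base URL is not a valid http(s) URL.")
    elif not parsed.path.rstrip("/").endswith("/v1"):
        problems.append("Base URL is missing the /v1 path (infinite loading).")
    if not api_key.strip():
        problems.append("API key is empty.")
    elif api_key != api_key.strip():
        problems.append("API key has leading or trailing whitespace.")
    if model not in KNOWN_MODELS:
        problems.append(f"Model name {model!r} is not one of the names listed in this guide.")
    return problems


if __name__ == "__main__":
    for issue in preflight("https://api.deepseek.com", " sk-placeholder", "deepseek-v4-flahs"):
        print("-", issue)
```

An empty list only means the settings are well-formed. An unfunded account or a revoked key will still fail at request time.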

The Verdict

Boring technology is usually the right choice for production, and setting up a clean API pipeline should not be overly complex. Once configured correctly, DeepSeek on Janitor AI provides a robust, highly capable backend at a fraction of the cost of mainstream alternatives.

Now stop over-engineering the prompt settings and get back to your immersive storytelling.

Frequently Asked Questions

How do I set up DeepSeek on Janitor AI?
Go to Janitor AI settings, select Custom API (OpenAI-compatible), enter the base URL as https://api.deepseek.com/v1, paste your DeepSeek API key, and set the model name to deepseek-v4-flash. Save and test the connection.

What is the correct base URL for DeepSeek?
The correct base URL is https://api.deepseek.com/v1, including the /v1 path. Omitting it is the most common cause of the infinite loading screen.

Why is Janitor AI stuck on infinite loading?
The most common cause is an incorrect base URL. Make sure you are using https://api.deepseek.com/v1 with the version path included. Another frequent issue is a $0 account balance; DeepSeek requires at least $2 in credits to activate your API keys.

Should I use OpenRouter or the direct DeepSeek API?
OpenRouter is the standard, easiest alternative for normal users. It handles geographic routing issues, accepts standard credit cards more easily, and provides failovers. The direct DeepSeek API is cheaper but requires prepaying and can suffer from rate limits during peak hours.

Which DeepSeek model should I use for roleplay?
Always use DeepSeek V4 Flash (deepseek-v4-flash) for Janitor AI roleplay. DeepSeek V4 Pro (Thinking) outputs raw <think> tags containing its internal monologue, which breaks character immersion and does not parse cleanly in Janitor’s interface.

Is DeepSeek free?
DeepSeek is not free, but it is significantly cheaper than GPT-4. You pay per token through the DeepSeek platform or OpenRouter. You must have a positive balance (at least $2) to generate a working API key.

What generation settings should I start with?
Start with temperature 1.0–1.3, top-p 1.0, and max tokens 2048–4096. A temperature of 1.1 or 1.2 is a solid sweet spot for creative and engaging storytelling, and top-p at 1.0 keeps the balance of creativity and coherence.
