Mirror of https://github.com/ItzCrazyKns/Perplexica.git, synced 2025-06-21 17:28:43 +00:00.
Made the token stream delay configurable for the reasoning models that use the ReasoningChatModel custom class.

1. Added the STREAM_DELAY parameter to the sample.config.toml file:

       [MODELS.DEEPSEEK]
       API_KEY = ""
       STREAM_DELAY = 20 # Milliseconds between token emissions for reasoning models (higher = slower, 0 = no delay)

2. Updated the Config interface in src/config.ts to include the new parameter:

       DEEPSEEK: {
         API_KEY: string;
         STREAM_DELAY: number;
       };

3. Added a getter function in src/config.ts to retrieve the configured value:

       export const getDeepseekStreamDelay = () =>
         loadConfig().MODELS.DEEPSEEK.STREAM_DELAY || 20; // Default to 20ms if not specified

4. Updated the deepseek.ts provider to use the configured stream delay:

       const streamDelay = getDeepseekStreamDelay();
       logger.debug(`Using stream delay of ${streamDelay}ms for ${model.id}`);

       // Then used in the model configuration
       model: new ReasoningChatModel({
         // ...other params
         streamDelay
       }),

This implementation provides several benefits:

- User-configurable: users can adjust the stream delay without modifying code.
- Descriptive naming: the parameter name STREAM_DELAY clearly indicates its purpose.
- Documented: the comment in the config file explains what the parameter does.
- Fallback default: if not specified, it defaults to 20 ms.
- Logging: debug logging shows the configured value when models are loaded.

To adjust the stream delay, users can simply change the STREAM_DELAY value in their config.toml file. Higher values slow down token emission (making it easier to read in real time), while lower values speed it up. Setting it to 0 disables the delay entirely. A sketch of how such a per-token delay can be applied while streaming follows below.
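For context, the effect of streamDelay inside a streaming chat model is roughly the following. This is a minimal sketch under assumptions, not the actual ReasoningChatModel source: the StreamDelayOptions type, the emitToken callback, and the demo token generator are illustrative names, not identifiers from the repository.

```typescript
// Sketch: pause for a configurable number of milliseconds between token emissions.
interface StreamDelayOptions {
  streamDelay?: number; // ms between token emissions; 0 or undefined means no delay
}

const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

async function streamWithDelay(
  tokens: AsyncIterable<string>,
  emitToken: (token: string) => void,
  options: StreamDelayOptions = {}
): Promise<void> {
  const delay = options.streamDelay ?? 0;
  for await (const token of tokens) {
    emitToken(token);
    // Pause between emissions so reasoning output stays readable in real time.
    if (delay > 0) {
      await sleep(delay);
    }
  }
}

// Example: emit three tokens with a 20 ms pause between them.
async function* demoTokens() {
  yield 'Hello';
  yield ' ';
  yield 'world';
}
streamWithDelay(demoTokens(), (t) => process.stdout.write(t), { streamDelay: 20 });
```

With STREAM_DELAY = 20, each token is followed by a roughly 20 ms pause; with 0, tokens are forwarded as fast as the upstream API delivers them.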
sample.config.toml · 35 lines · 772 B · TOML
[GENERAL]
PORT = 3001 # Port to run the server on
SIMILARITY_MEASURE = "cosine" # "cosine" or "dot"
KEEP_ALIVE = "5m" # How long to keep Ollama models loaded into memory. (Instead of using -1 use "-1m")

[MODELS.OPENAI]
API_KEY = ""

[MODELS.GROQ]
API_KEY = ""

[MODELS.ANTHROPIC]
API_KEY = ""

[MODELS.GEMINI]
API_KEY = ""

[MODELS.DEEPSEEK]
API_KEY = ""
STREAM_DELAY = 5 # Milliseconds between token emissions for reasoning models (higher = slower, 0 = no delay)

[MODELS.OLLAMA]
API_URL = "" # Ollama API URL - http://host.docker.internal:11434

[MODELS.LMSTUDIO]
API_URL = "" # LM STUDIO API URL - http://host.docker.internal:1234

[MODELS.CUSTOM_OPENAI]
API_KEY = ""
API_URL = ""
MODEL_NAME = ""

[API_ENDPOINTS]
SEARXNG = "http://localhost:32768" # SearxNG API URL
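For reference, the commit description above implies a config loader and getter roughly like the following. This is a minimal sketch, not the repository's actual src/config.ts: the config file path, the use of @iarna/toml for parsing, and the trimmed Config shape are assumptions made for illustration.

```typescript
// Sketch of a loader for the TOML file above and the getter described in the
// commit message. Assumes the file is parsed with @iarna/toml; the real
// src/config.ts may differ in path handling and in the full Config shape.
import fs from 'fs';
import path from 'path';
import toml from '@iarna/toml';

interface Config {
  GENERAL: { PORT: number; SIMILARITY_MEASURE: string; KEEP_ALIVE: string };
  MODELS: {
    DEEPSEEK: { API_KEY: string; STREAM_DELAY: number };
    // ...other provider sections omitted for brevity
  };
  API_ENDPOINTS: { SEARXNG: string };
}

const loadConfig = (): Config =>
  toml.parse(
    fs.readFileSync(path.join(process.cwd(), 'config.toml'), 'utf-8')
  ) as unknown as Config;

// Getter with the 20 ms fallback described in the commit message.
export const getDeepseekStreamDelay = () =>
  loadConfig().MODELS.DEEPSEEK.STREAM_DELAY || 20;
```

Under this sketch, raising STREAM_DELAY to 50 in config.toml and restarting the server would cap emission at roughly 20 tokens per second (1000 ms / 50 ms), while the value of 5 shipped in this sample keeps the delay barely noticeable.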