Comparing 3d1199540c...9673f8beba - khoj

LLM/khoj

mirror of https://github.com/khoj-ai/khoj.git synced 2026-05-13 21:41:41 +00:00

Author	SHA1	Message	Date
Debanjum	9673f8beba	Release Khoj version 1.42.5	2025-06-11 13:36:46 -07:00
Debanjum	e87be4edf4	Pin python version used by github workflow to publish to pypi Avoids having to update python path to write web app static build files to every time the patch version of python is updated	2025-06-11 13:30:15 -07:00
Debanjum	eaae1cf74e	Fix rendering thoughts of Gemini reasoning models Previously there was duplication of thought in message to user and in the train of thought. This should be resolved now	2025-06-11 13:09:38 -07:00
Debanjum	4946ea1668	Fix to save organic results to conversation context in DB This bug was introduced in `05d4e19cb`, version 1.42.2, during migration to save deeply typed ChatMessageModel. As the ChatMessageModel did not use the right field name for organic results (since the start). Previously it did not matter as it was storing to DB irrespective but now the mapping of dictionary to ChatMessageModel drops that field before save to conversation in DB. This was resulting in organic context being lost on page reload and only being shown on first response.	2025-06-11 12:52:42 -07:00
Debanjum	30ced1d86c	Log non schema adhering chat message before save to DB	2025-06-11 12:52:42 -07:00
Debanjum	71763684a9	Explicitly drop stream_options if not streaming openai chat response Not sure why but in some cases when interacting with o3 (which needs non-streaming) the stream_options seems to be set. Cannot reproduce but hopefully dropping the stream_options explicitly should resolve this issue. Related `985a98214`	2025-06-11 12:52:42 -07:00
Debanjum	65644f78b0	Set lower max output tokens for non reasoning Gemini models While reasoning models support longer output tokens. Non reasoning models do not. Use a lower max output tokens for them	2025-06-11 11:12:24 -07:00
Debanjum	71221533c8	Release Khoj version 1.42.4	2025-06-10 23:49:30 -07:00
Debanjum	985a982148	Update openai package to stream response by non-reasoning models Older package versions (like 1.84.0) seem to always pass reasoning_effort argument to openai api, which now seems to be throwing unexpected request argument error when used with non-reasoning models (like 4o-mini).	2025-06-10 23:49:04 -07:00
Debanjum	9b767438e2	Update model pricing, default models, context and version metadata	2025-06-10 23:49:04 -07:00
Debanjum	753972997f	Enable non-streaming response via openai api to support o3 models	2025-06-10 23:49:04 -07:00
Debanjum	5110a06085	Fix GET agents API to return agent specific chat model There had been a regression that made all agents display the default chat model instead of the actual chat model associated with the agent. This change resolves that issue by prioritizing agent specific chat model from DB (over user or server chat model).	2025-06-10 15:29:46 -07:00
Debanjum	0cd709caf4	Release Khoj version 1.42.3	2025-06-10 10:20:44 -07:00
Debanjum	313f648bd7	Compile ai message content into single string when using DeepInfra DeepInfra only accepts assistant message.content of string type	2025-06-10 01:58:43 -07:00
Debanjum	9e73309d01	Add no think tag for qwen models msgs over api when no deepthought	2025-06-10 01:58:43 -07:00
Debanjum	64886cd0dd	Fix storing code results on server and rendering them on web app - Fix code context data type for validation on server. This would prevent the chat message from being written to history - Handle null code results on web app	2025-06-09 23:46:12 -07:00
Debanjum	b1a6e53d77	Fix populating chat message history to continue interrupted research We now pass deeply typed chat messages throughout the application to construct tool specific chat history views since `05d4e19cb`. This ChatMessageModel didn't allow intent.query to be unset. But interrupted research iteration history can have unset query. This change makes intent.query optional. It also uses message by user entry to populate user message in tool chat history views. Using query from khoj intent was an earlier shortcut used to not have to deal with message by user. But that doesn't scale to current scenario where turns are not always required to have a single user, assistant message pair. Specifically a chat history can now contain multiple user messages followed by a single khoj message. The new change constructs a chat history that handles this scenario naturally and makes the code more readable. Also now only previous research iterations that completed are populated. Else they do not serve much purpose.	2025-06-09 23:46:12 -07:00
Debanjum	bd928b9f3c	Handle unset agent slug, name. E.g when chat with user created agents	2025-06-09 18:11:25 -07:00
Debanjum	5dd8a9cb24	Only add cache control to last Claude text block if exists, non-empty Otherwise Claude API throws error	2025-06-08 19:41:21 -07:00
Debanjum	d638a49cd9	Release Khoj version 1.42.2	2025-06-07 13:32:12 -07:00
Debanjum	2423db0186	Remove broken link to deprecated summarize slash command in docs	2025-06-07 13:31:21 -07:00
Debanjum	b6ceaeeffc	Execute doc search in parallel using asyncio instead of threadpool	2025-06-07 13:06:49 -07:00
Debanjum	dc1c3561fe	Make search type comparison in document search more robust	2025-06-07 12:52:10 -07:00
Debanjum	b9c6252a4a	Increase scroll amount on horizontal scroll in computer environment	2025-06-07 11:17:52 -07:00
Debanjum	3fc175d27b	Restrict Khoj to work with python <3.13 Python 3.13 not supported by all dependencies yet	2025-06-07 00:44:04 -07:00
Debanjum	1bbf719b04	Apply migrations to db for test runs to install pgvector	2025-06-06 15:47:28 -07:00
Debanjum	77caf183ee	Patch update django, next.js dependencies	2025-06-06 15:39:39 -07:00
Debanjum	c4cc70bcc9	Delete file summarization slash commands docs page File summarization slash commands have been deprecated. Folks can upload files and ask their questions directly	2025-06-06 15:30:21 -07:00
Debanjum	257c238a88	Improve DB clean up after test runs	2025-06-06 15:09:39 -07:00
Debanjum	6ac1530816	More robustly default to searching all content type	2025-06-06 15:09:39 -07:00
Debanjum	b21706aa45	Drop help, summarize and automation /slash commands from chat api Clean non useful slash commands to make chat API more maintainable. - App version, chat model via /help is visible in other parts of the UX. Asking help questions with site:docs.khoj.dev filter isn't used or known to folks - /summarize is esoterically tuned. Should be rewritten if added back. It wasn't being used by /research already - Automations can be configured via UX. It wasn't being shown in UX already	2025-06-06 15:09:39 -07:00
Debanjum	7f6db526c3	Enforce json for non reasoning anthropic models even in deepthought	2025-06-06 13:28:18 -07:00
Debanjum	d2c7e5516f	Fix online chat actor tests, improve offline chat actor tests The chat actor (and director) tests haven't been looked into in a long while. They'd gone stale in how they were calling the functions. And what was required to run them. Now the online chat actor tests work again.	2025-06-06 13:28:18 -07:00
Debanjum	2f4160e24b	Use single extract questions method across all LLMs for doc search Using model specific extract questions was an artifact from older times, with less guidable models. New changes collate and reuse logic - Rely on send_message_to_model_wrapper for model specific formatting. - Use same prompt, context for all LLMs as can handle prompt variation. - Use response schema enforcer to ensure response consistency across models. Extract questions (because of its age) was the only tool directly within each provider code. Put it into helpers to have all the (mini) tools in one place.	2025-06-06 13:28:18 -07:00
Debanjum	c2cd92a454	[Breaking] Move automation api into new router with consistent routes - Rename GET /api/automations to GET /api/automation - Rename POST /api/trigger/automation to POST /api/automation/trigger - Update calls to the automations API from the web app.	2025-06-06 13:28:18 -07:00
Debanjum	7dfa710cb4	Log invalid automation ids for investigation and clean-up	2025-06-06 13:28:18 -07:00
Debanjum	7d59688729	Move document search tool into helpers module with other tools Document search (because of its age) was the only tool directly within an api router. Put it into helpers to have all the (mini) tools in one place.	2025-06-06 13:28:18 -07:00
Debanjum	1dbe60a8a2	Give more readable name to document search tool	2025-06-06 13:28:18 -07:00
Debanjum	38fa34a861	Simplify ai provider converse methods - Add context based on information provided rather than conversation commands. Let caller handle passing appropriate context to ai provider converse methods	2025-06-06 13:28:18 -07:00
Debanjum	bfd4695705	Save conversation in common chat api func instead of each ai provider	2025-06-06 13:28:18 -07:00
Debanjum	e7584bc29d	Remove old "Notes" stop keyword for openai api based models They were needed when passing notes context to dumber models. Not required for most models now.	2025-06-06 13:28:18 -07:00
Debanjum	a9b1a26089	Update gunicorn default timeouts, workers. Configure via env vars Increase timeout to 180 (from 120s previous) and graceful timeout to 90 (from 30s default) to reduce Increase default gunicorn workers and make it configurable to better utilize (v)CPUs. This is manually configured (instead of using multiprocessing.cpu_count()) as VMs/containers may read cpu count of host machine instead of their VMs/containers.	2025-06-06 13:28:18 -07:00
Debanjum	d16f9f272b	Add ability to retry a query from the web app	2025-06-06 13:28:18 -07:00
Debanjum	05d4e19cb8	Pass deep typed chat history for more ergonomic, readable, safe code The chat dictionary is an artifact from earlier non-db chat history storage. We've been ensuring new chat messages have valid type before being written to DB for more than 6 months now. Moving to the deeply typed chat history helps avoid null refs, makes code more readable and easier to reason about. Next Steps: The current update entangles chat_history written to DB with any virtual chat history message generated for intermediate steps. The chat message type written to DB should be decoupled from type that can be passed to AI model APIs (maybe?). For now we've made the ChatMessage.message type looser to allow for list[dict] type (apart from string). But later maybe a good idea to decouple the chat_history received by send_message_to_model from the chat_history saved to DB (which can then have its stricter type check)	2025-06-04 00:03:14 -07:00
Debanjum	430459a338	Release Khoj version 1.42.1	2025-06-03 21:46:16 -07:00
Debanjum	f6e2eebecc	Ignore devcontainer, launch.json from json pre-commit validation They follow jsonc format and allow comments but fail the json validator. This is a spurious error and should be ignored	2025-06-03 21:45:11 -07:00
Debanjum	d618f2d650	Raise value error if research pick next tool isn't a dictionary This will give a better error message with response content than the failed to get errors from non dictionary response we were getting earlier.	2025-06-03 21:45:11 -07:00
Debanjum	65d9ad6cb2	Use tool calls to enforce response schema for anthropic models - Converts response schema into an anthropic tool call definition. - Works with simple enums without needing to rely on $defs, $refs as unsupported by Anthropic API - Do not force specific tool use as not supported with deep thought This puts anthropic models on parity with openai, gemini models for response schema following. Reduces need for complex json response parsing on khoj end.	2025-06-03 21:05:29 -07:00
Debanjum	d45d9d4cfb	Fix malformed user uuids to fix automations [automations data loss] - Malformed automations will be dropped They can't run with malformed user uuid anyway.	2025-06-03 21:05:29 -07:00
Debanjum	4892e73323	Remove unsupported NUL char from file, chat before save to DB	2025-06-03 21:05:29 -07:00
Debanjum	27534f6533	Make query field in context optional Query field isn't set for all context. The current change was preventing save to conversation errors when query unset in context.	2025-06-03 19:59:05 -07:00
Debanjum	6e48f4de84	Fix to switch text to speech model via API	2025-06-03 19:59:05 -07:00
Debanjum	63a1a8e91f	Try pre-install deps, use custom launch.json for dev container Previous attempts have not been sufficient. Let's see if this works	2025-06-03 05:00:05 -07:00
Debanjum	29b973e748	Fix building server and web app dev container	2025-06-03 03:46:08 -07:00
Debanjum	50f37d541a	Pre-install server deps for fast devcontainer start. Fix dev launch.json There seems to be a more standard mechanism of specifying launch.json params for devcontainers. Previous mechanism to write launch.json to .vscode/launch.json in post creation step does not work. Improve default launch.json to include khoj admin username, password with placeholder values to get started with local development faster. Define dockerfile for devcontainer to pre-build server, web app dependencies during dev container image creation stage. So install on dev container startup is sped up as no need to install dependencies.	2025-06-03 01:43:23 -07:00
Debanjum	f3a5fe1ae8	Release Khoj version 1.42.0	2025-06-01 20:52:25 -07:00
Debanjum	82ee0f5451	Revert computer dockerfile startup command to fix operating it	2025-06-01 20:39:58 -07:00
Debanjum	a236288ca9	Fixes to enable dockerized khoj to operate its computer	2025-06-01 19:19:01 -07:00
Debanjum	f95d352eb9	Ensure profile is right border aligned on khoj obsidian settings page On wide screens the header wasn't taking up the full width, so profile picture could hang out in the middle somewhere.	2025-06-01 17:02:08 -07:00
Debanjum	759ffc46b0	Default to read currently open file when chat with Khoj from Obsidian Vault is already indexed, this should make engaging with current context easier.	2025-06-01 16:56:19 -07:00
Debanjum	3fb8f77cd5	Fix terminal tool passed to claude 3.7 sonnet as anthropic operator	2025-06-01 16:55:17 -07:00
Debanjum	ddf028f7af	Fix khoj computer image name used in docker-compose.yml instead	2025-06-01 16:44:28 -07:00
Debanjum	257bdfadef	Setup vscode launch.json and configure pytests for dev container	2025-06-01 16:36:46 -07:00
Debanjum	a98525be01	Add default vscode config for khoj to ease development setup	2025-06-01 16:36:46 -07:00
Debanjum	c6cc709f62	Fix khoj computer image name and only build it once for each arch	2025-06-01 16:36:46 -07:00
Debanjum	a4eb85ac41	Reduce (superficial) xdg dir permissions errors on khoj computer start	2025-06-01 16:36:20 -07:00
sabaimran	e9a107cc06	fix spelling of development	2025-06-01 13:41:39 -07:00
Henri Jamet	dbfac89a0c	Major updates to Obsidian Khoj plugin chat interface and editing features (#1109 ) ## Description This PR introduces significant improvements to the Obsidian Khoj plugin's chat interface and editing capabilities, enhancing the overall user experience and content management functionality. ## Features ### 🔍 Enhanced Communication Mode I've implemented radio buttons below the chat window for easier communication mode selection. The modes are now displayed as emojis in the conversation for a cleaner interface, replacing the previous text-based system (e.g., /default, /research). I've also documented the search mode functionality in the help command. #### Screenshots - Radio buttons for mode selection - Emoji display in conversations ![Recording 2025-02-11 at 18 56 10](https://github.com/user-attachments/assets/798d15df-ad32-45bd-b03f-581f6093575a) ### 💬 Revamped Message Interaction I've redesigned the message buttons with improved spacing and color coding for better visual differentiation. The new edit button allows quick message modifications - clicking it removes the conversation up to that point and copies the message to the input field for easy editing or retrying questions. #### Screenshots - New message styling and color scheme ![Recording 2025-02-11 at 18 44 48](https://github.com/user-attachments/assets/159ece3d-2d80-4583-a7a8-2ef1f253adcc) - Edit button functionality ![Recording 2025-02-11 at 18 47 52](https://github.com/user-attachments/assets/82ee7221-bc49-4088-9a98-744ef74d1e58) ### 🤖 Advanced Agent Selection System I've added a new chat creation button with agent selection capability. Users can now choose from their available agents when starting a new chat. While agents can't be switched mid-conversation to maintain context, users can easily start fresh conversations with different agents. 
#### Screenshots - Agent selection dropdown ![Recording 2025-02-11 at 18 51 27](https://github.com/user-attachments/assets/be4208df-224c-45bf-a5b4-cf0a8068b102) ### 👁️ Real-Time Context Awareness I've added a button that gives Khoj access to read Obsidian opened tabs. This allows Khoj to read open notes and track changes in real-time, maintaining a history of previous versions to provide more contextual assistance. #### Screenshots - Window access toggle ![Recording 2025-02-11 at 18 59 01](https://github.com/user-attachments/assets/b596bfca-f622-41b7-b826-25a8e254d4a2) ### ✏️ Smart Document Editing Inspired by Cursor IDE's intelligent editing and ChatGPT's Canvas functionality, I've implemented a first version of a content creation system we've been discussing. Using a JSON-based modification system, Khoj can now make precise changes to specific parts of files, with changes previewed in yellow highlighting before application. Modification code blocks are neatly organized in collapsible sections with clear action summaries. While this is just a first step, it's working remarkably well and I have several ideas for expanding this functionality to make Khoj an even more powerful content creation assistant. #### Screenshots - JSON modification preview - Change highlighting system - Collapsible code blocks - Accept/cancel controls ![Recording 2025-02-11 at 19 02 32](https://github.com/user-attachments/assets/88826c9e-d0c9-40da-ab78-9976c786aa9e) --------- Co-authored-by: Debanjum <debanjum@gmail.com>	2025-06-01 10:42:36 +05:30
Debanjum	dee767042e	Operate Computer with Khoj Operator (#1190 ) ## Summary - Enable Khoj to operate computers: Add experimental computer operator functionality that allows Khoj to interact with desktop environments, browsers, and terminals to accomplish complex tasks - Multi-environment support: Implement computer environments with GUI, file system, and terminal access. Can control host computer or Docker container computer ## Key Features ### Computer Operation Capabilities - Desktop control (screenshots, clicking, typing, keyboard shortcuts) - File editing and management - Terminal/bash command execution - Web browser automation - Visual feedback via train-of-thought video playback ### Infrastructure & Architecture: - Docker container (ghcr.io/khoj-ai/computer:latest) with Ubuntu 24.04, XFCE desktop, VNC access - Local computer environment support with pyautogui - Modular operator agent system supporting multiple environment types - Trajectory compression and context management for long-running tasks ### Model Integration: - Anthropic models only (Claude Sonnet 4, Claude 3.7 Sonnet, Claude Opus 4) - OpenAI and binary operator agents temporarily disabled - Enhanced caching and context management for operator conversations ### User Experience: - `/operator` command or just ask Khoj to use operator tool to invoke computer operation - Integrate with research mode for extended 30+ minute task execution - Video of computer operation in train of thought for transparency ### Configuration - Set `KHOJ_OPERATOR_ENABLED=True` in `docker-compose.yml` - Requires Anthropic API key - Computer container runs on port 5900 (VNC)	2025-05-31 22:04:12 -07:00
Debanjum	fa2e370ce6	Document how to enable and use computer operator in operator readme	2025-05-31 21:41:23 -07:00
Debanjum	ceb1d82bf6	Create khoj computer via cloud build. Add computer to docker-compose.yml	2025-05-31 21:39:38 -07:00
Debanjum	68f7aae71c	Install claude 4 sonnet, latest gemini 2.5s when configure on first run	2025-05-31 20:52:27 -07:00
Debanjum	b90b724f9a	Disable openai, binary operator agents until they become useful	2025-05-31 20:51:08 -07:00
Debanjum	830a1af69e	Render operator train of thought as video on web app to ease viewing - You can seek through the train of thought video of computer operation or follow it in live mode. - Interleaves video with normal text thoughts. - Video available of old interactions and currently streaming message.	2025-05-31 20:51:08 -07:00
Debanjum	6821bd38ed	Fix mypy typing errors in operator environment files - Add type guards for action.path in drag vs text editor actions - Added type guards for Union type attribute access - Fixed variable naming conflicts between drag and text editor cases - Resolved remaining typing issues in OpenAI, Anthropic agents - Type guard without requiring another code indent level	2025-05-31 20:51:08 -07:00
Debanjum	c5c06a086e	Fix, improve openai operator agent for interrupts, computer environment - Create reusable method to call model - Fix to summarize messages on operator run. - Mark assistant tool calls with role = assistant, not environment - Try fix message format when load after interrupts. Does not work well yet	2025-05-31 20:51:08 -07:00
Debanjum	f517566560	Improve invoking keybindings on computer always using lowercase keys Previously CTRL+A would get triggered instead of ctrl+a. CTRL+A is equivalent to ctrl+shift+a. This isn't intended and should be called directly when required. Now key combos like ctrl+a on computer firefox etc. work as expected	2025-05-31 20:51:08 -07:00
Debanjum	2558ac7f18	Show thinking and engage deep thought for gemini 2.5 model series Gemini models now show (a summary of) their thoughts. Stream this in research mode, similar to how it is done already for claude, deepseek, qwen etc.	2025-05-31 20:51:08 -07:00
Debanjum	cecbfe35e2	Rename compile response into a private operator agents function	2025-05-31 20:51:08 -07:00
Debanjum	ded1db642c	Get max context for user, operator model pair for context compression	2025-05-31 20:51:08 -07:00
Debanjum	7eaf0e80c5	Get max prompt size for given user, model via reusable functions	2025-05-31 20:51:08 -07:00
Debanjum	3797f03625	Log ai model usage on every call to get_chat_usage_metrics in debug mode	2025-05-31 20:51:08 -07:00
Debanjum	4cb900658d	Cache system prompt, tools of anthropic operator agent for efficiency	2025-05-31 20:51:08 -07:00
Debanjum	928e5ee8ad	Cache messages to anthropic models from chat actors for efficiency	2025-05-31 20:51:08 -07:00
Debanjum	0d1e6b0d53	Do not overwrite system_prompt for idempotent AI API calls retry Previously on tenacity retry the system_prompt could get overwritten	2025-05-31 20:51:08 -07:00
Debanjum	e0ea151f20	Implement file editor and terminal tools, in-built in claude This should improve viewing, editing files and viewing terminal command outputs by anthropic operator	2025-05-31 20:51:08 -07:00
Debanjum	21bf7f1d6d	Continue interrupted operator run with new query and previous context Track research and operator results at each nested iteration step using python object references + async events bubbled up from nested iterators. Instantiates operator with interrupted operator messages from research or normal mode. Reflects actual interaction trajectory as closely as possible to agent including conversation history, partial operator trajectory and new query for fine grained, corrigible steerability. Research mode continues with operator tool directly if previous iteration was an interrupted operator run.	2025-05-31 20:51:08 -07:00
Debanjum	de35d91e1d	Pass previous trajectory to operator agents for context	2025-05-31 20:51:08 -07:00
Debanjum	864e0ac8b5	Simplify research iteration and main research function names	2025-05-31 20:51:08 -07:00
Debanjum	6c9d569a22	Fix to get user questions in chat history from user not khoj message Since partial state reload after interrupt drops Khoj messages. The assumption that there will always be a Khoj message after a user message is broken. That is, there can now be multiple user messages preceding a Khoj message. This change allows for user queries to still be extracted for chat history even if no khoj message follows.	2025-05-31 20:51:08 -07:00
Debanjum	b6aa77a6f5	Look back 3 previous turns to select next tool, for questions history	2025-05-31 20:50:03 -07:00
Debanjum	d511cbfa34	Extract constructing question history into shared function for reuse Minor logic update to only include non image inferred queries for gemini, anthropic models as well instead of just for openai models. Apart from that the extracted function should be functionally same.	2025-05-31 16:50:26 -07:00
Debanjum	da663e184c	Type operator results. Enable storing, loading operator trajectories. We were passing operator results as a simple dictionary. Strongly typing it makes sense as operator results becomes more complex. Storing operator results with trajectory on interrupts will allow restarting interrupted operator run with agent messages of interrupted trajectory loaded into operator agents	2025-05-31 16:50:26 -07:00
Debanjum	675fc0ad05	Decouple trajectory compression from `act`. Reuse func to call llm api	2025-05-31 16:50:26 -07:00
Debanjum	b027024c42	Handle failed operator agent calls to anthropic api more gracefully Add anthropic operator api call errors to trajectory instead of erroring out of current operator run	2025-05-31 16:50:26 -07:00
Debanjum	d54bfc19e5	Add trajectory compression to anthropic operator agent - Add compression parameters to base operator agent for reuse - Increase default operator iterations	2025-05-31 16:50:26 -07:00
Debanjum	cb451fa67c	Put default summarize prompt into operator agent This allows: - Each operator agent to own its summarization prompt. That it can tune if it wants - The outer operator loop to pass an override summarize prompt when it invokes the summarize func but it does not have to	2025-05-31 16:50:26 -07:00
Debanjum	99fdd91a01	Latch to bottom instantly and well when auto scroll chat stream on web	2025-05-31 16:50:26 -07:00
Debanjum	253656b634	Fix engaging anthropic api cache for operator trajectories. It had become broken at some point due to refactoring. The cache control was getting added and removed right after in add_action_results What we actually wanted to do is clear the old cache breakpoint and put a new one at the latest operator tool result message. This should improve operator speed and lower costs with anthropic models.	2025-05-31 16:50:26 -07:00
Debanjum	faecbdb7d8	Enable operators to use computers	2025-05-31 16:50:25 -07:00
Debanjum	771909f76a	Implement docker computer environment for operator - Generalize building pyautogui into executable python code snippet. This should work across docker and local. And should be easier to extend to operate a remote computer over the network as well. - Create dockerfile for pyautogui operate-able containerized computer	2025-05-28 17:40:32 -07:00
Debanjum	e117f57f64	Implement local computer environment for operator	2025-05-28 17:40:32 -07:00
Debanjum	7eab87bfdf	Generalize operator to operate multiple types of environment Previously it could only operate a (playwright) browser. Now - The operator logic and naming has been updated assuming multiple environment types can be operated - The operator entrypoint is now at __init__.py to simplify imports and the entrypoint function is called operate_environment - All operator agents have been updated to select their system prompts and tools based on the environment they'll operate	2025-05-27 19:01:36 -07:00
Debanjum	c0689b2740	Easily interrupt and redirect khoj's research direction via chat - Khoj can now save and restore research from partial state This triggers an interrupt that saves the partial research, then when a new query is sent it loads the previous partial research as context and continues, using the new user query to orient its future research - Support natural interrupt and send query behavior from web app This triggers an abort and send when a user sends a chat message while khoj is in the middle of some previous research. This interrupt mechanism enables a more natural, interactive research flow	2025-05-27 17:57:21 -07:00
Debanjum	c9e6b8e88d	Align expected types to actual returned types by AI APIs, operator	2025-05-26 00:39:06 -07:00
Debanjum	c1c1fc6265	Make send message validation more robust on web app	2025-05-26 00:35:10 -07:00
Debanjum	6cb512d9cf	Support natural interrupt and send query behavior from web app - Just send your new query. If a query was running previously it'd be interrupted and new query would start processing. This improves on the previous 2 click interrupt and send ux. - Utilizes partial research for interrupted query, so you can now redirect khoj's research direction. This is useful if you need to share more details, change khoj's research direction in any way or complete research. Khoj's train of thought can be helpful for this.	2025-05-26 00:35:10 -07:00
Debanjum	2b7dd7401b	Continue interrupt queries only after previous query written to DB	2025-05-26 00:35:10 -07:00
Debanjum	3cd6e1a9a6	Save and restore research from partial state	2025-05-26 00:35:09 -07:00
Debanjum	a83c36fa05	Validate operator, research, context.query fields of ChatMessage - Track operator, research context in ChatMessage - Track query field in (document) context field of ChatMessage This allows validating chat message before inserting into DB	2025-05-26 00:03:59 -07:00
Debanjum	02ee4e90a2	Pass doc/web/code/operator context as list[dict] of message content	2025-05-26 00:03:59 -07:00
Debanjum	98b56316e4	Support constructing chat message as a list of dictionaries Research mode recently started passing iteration as list of message content dicts. This change extends to storing it as is in DB.	2025-05-26 00:03:59 -07:00
Debanjum	df9ab51fd0	Track research results as iteration list instead of iteration summaries	2025-05-26 00:03:59 -07:00
Debanjum	5d65fa8698	Use Django timezone funcs to make datetimes in DB timezone aware These seem to be a new class of errors showing up. Explicitly using django timezone functions to add awareness to date time fields stored in DB seems to mitigate the issue. Related #1180	2025-05-25 23:43:06 -07:00
Debanjum	231aa1c0df	Support claude 4 models. Engage reasoning, operator. Track costs etc. - Engage reasoning when using claude 4 models - Allow claude 4 models as monolithic operator agents - Ease identifying which anthropic models can reason, operate GUIs - Track costs, set default context window of claude 4 models - Handle stop reason on calls to new claude 4 models	2025-05-25 23:43:06 -07:00
Debanjum	dca17591f3	Handle parsing json from string with plain text suffix	2025-05-23 19:44:02 -07:00
Debanjum	acebb90643	Mention keys expected in prompt to next research tool selector	2025-05-23 19:44:02 -07:00
Debanjum	e968cca273	Clean usage of conversation_id in chat API function - Normalize conversation_id type to str instead of str or UUID - Do not pass conversation_id to agenerate_chat_response as the associated conversation is also being passed. So can get its id directly.	2025-05-23 19:44:02 -07:00
Debanjum	a76032522e	Add type hints to function args calling anthropic model api	2025-05-22 15:02:45 -07:00
Debanjum	97c5222b04	Set type hints and reorder args of all converse_[provider] methods - Query is more important and should be passed before references - Add type hints to user query and references for code readability	2025-05-22 15:02:45 -07:00
Debanjum	2ea16298aa	Create Operator Framework. Enable Khoj to Operate Web Browser (#1174 ) ## Overview 1. Create base framework to compose different operators and environments for Khoj to operate. 2. Enable Khoj to operate a web browser using anthropic, openai, gemini or open-source models Note: This is an alpha level feature release. It is meant for local testing by contributors and self-hosters. ## Capabilities - Have Khoj operate a web browser to complete tasks that require actions and visual feedback. - Experiment with any vision model as operator. Khoj supports monolithic and binary operators - Monolithic operators rely on a single model like claude, openai to both reason and ground operator actions - Binary operators allow bootstrapping a fully local operator. It can use any vision model for visual reasoning when paired with a capable visual grounding model. ## Limitations - In general, it is slower, more expensive and less comprehensive than standard Khoj for research ## Setup 1. Install Khoj with playwright by either - running `pip install khoj[local]` - installing playwright separately via `pip install playwright` and `playwright install chromium` 2. Set `KHOJ_OPERATOR_ENABLED` env var to true (i.e `KHOJ_OPERATOR_ENABLED=true`) 3. Start Khoj (e.g `USE_EMBEDDED_DB="true" khoj --anonymous-mode -vv`) 4. Add the necessary chat model(s) with `vision enabled` via your [Khoj Admin Panel](http://localhost:42110/server/admin) - To use Anthropic claude: `claude-3.7-sonnet*` chat model is required with vision enabled - To use Openai operator: `gpt-4o` chat model is required with vision enabled - For other operator configurations: a chat model named `ui-tars-1.5` is required with vision enabled This can technically be any visual grounding model served via an openai compatible api. I've just tested with ui-tars-1.5-7b deployed to an HF inference endpoint for now. See [deployment instructions](https://github.com/bytedance/UI-TARS/blob/main/README_deploy.md) 5. 
Set your desired vision chat model via [user settings](http://localhost:42110/settings) to use as operator. 6. Run your queries with either the `/operator` slash command or by just asking Khoj in your query to use the operator tool. You can combine run operator in research mode a well ### Advanced Usage - Reuse Browser Session - Why: Have Khoj operate web services you've logged into. E.g manage your gmail, github, social media etc. - Setup 1. Start Chromium or Edge in Remote Debugging mode. For example, on Mac you can start Edge by running the following in your terminal: `/Applications/Microsoft\ Edge.app/Contents/MacOS/Microsoft\ Edge --remote-debugging-port=9222` 4. Connect Khoj to that browser instance by setting the environment variable `KHOJ_CDP_URL` to its URL. By default you'd set `KHOJ_CDP_URL="http://localhost:9222"` ## Architecture ### Operator Agents \| Type \| Design \| \|----- \|-----\| \| Monolithic \| <img src="https://github.com/user-attachments/assets/7a96440f-1732-482b-9bd9-0920cb0c60890" width=400> \| \| Binary \| <img src="https://github.com/user-attachments/assets/c5d101c0-3475-43c2-a301-daa943cde190" width=400> \|	2025-05-20 01:30:36 -07:00
Debanjum	19b4c18b69	Configure max iterations per operator run via environment variable	2025-05-20 01:03:11 -07:00
Debanjum	06a1a22e3b	Align generic grounding agent's interface with uitars grounding agent The generic grounding agent has not been tested properly but at least it should be aligned with the interface being used by the ui-tars grounding agent which has been tested.	2025-05-20 00:31:56 -07:00
Debanjum	0ce74e0329	Show operator context when using operator in default and research mode	2025-05-20 00:31:56 -07:00
Debanjum	cc355f93fc	Use operator context consistently as a dict[str, str] of query, result	2025-05-20 00:31:56 -07:00
Debanjum	07e33994f0	Reduce scroll amount to have previous page stay a bit on screen	2025-05-20 00:31:56 -07:00
Debanjum	e2c1b1fcd3	Add dev container config to ease setup for remote development	2025-05-19 23:34:31 -07:00
Debanjum	fdb681ca0e	Only install desktop, obsidian app from dev_setup.sh with --full flag	2025-05-19 23:34:31 -07:00
Debanjum	33dd4c8c33	Handle gemini returning simple string in response candidates	2025-05-19 19:45:10 -07:00
Debanjum	626ced8b8b	Fix adding code results to chatml messages context	2025-05-19 19:45:10 -07:00
Debanjum	ded753ff9a	Improve parsing tool use coordinate returned by claude operator agent It sometimes outputs coordinates as a string rather than a list. Make parser more robust to those kinds of errors. Share error with operator agent to fix/iterate on instead of exiting the operator loop.	2025-05-19 16:28:55 -07:00
Debanjum	473dd006d5	Remove unnecessary images conversion to png in binary operator agent. It's handled by the ai model interaction handlers in khoj server core.	2025-05-19 16:28:55 -07:00
Debanjum	9f3fbf9021	Encourage reasoner, grounder to work better together in binary operator - Encourage grounder to adhere to the reasoner's action instruction - Encourage reasoner to explore other actions when stuck in a loop Previously seemed to be forcing it too strongly to choose "single most important" next action. So may not have been exploring other actions to achieve objective on initial failure.	2025-05-19 16:28:55 -07:00
Debanjum	ac19f6d336	Improve operator exception handling - Do not catch error messages just to re-throw them. This results in a confusing "exception happened during handling of an exception" stacktrace, which makes it harder to debug - Log error when action_results.content isn't set or empty to debug this operator run error	2025-05-19 16:28:55 -07:00
Debanjum	59e0e092b0	Remove deprecated prompt for grounding model to choose goto, back func Goto and back functions are chosen by the visual reasoning model for increased reliability in selecting those tools. The ui-tars grounding model seems too tuned to use a specific set of tools.	2025-05-19 16:28:55 -07:00
Debanjum	1442a4f6fb	Handle reasoning messages returned by openai cua model Documentation about this is currently limited, confusing. But it seems like reasoning item should be kept if computer_call after, else drop. Add noop placeholder for reasoning item to prevent termination of operator run on response with just reasoning.	2025-05-19 16:28:55 -07:00
Debanjum	95f211d03c	Resolve mypy typing errors in operator code	2025-05-19 16:28:55 -07:00
Debanjum	33689feb91	Handle more openai response types for better rendering and error avoidance The reasoning messages in openai cua need to be passed back or some such. Else it throws missing response with required id error. Folks are confused about expected behavior for this online as well. The documentation to handle this seems to be sparse, unclear.	2025-05-19 16:28:55 -07:00
Debanjum	3a75cd3c3d	Only trigger claude, openai monolithic operators with specific models To use Anthropic monolithic operator, set chat model to claude-3.7-sonnet To use Openai monolithic operator, set chat model to gpt-4o	2025-05-19 16:28:55 -07:00
Debanjum	258b5a0372	Show operator screenshots with reasoning in train of thought on web app	2025-05-19 16:28:55 -07:00
Debanjum	21a9556b06	Show formatted action, env screenshot after action on each operator step Show natural language, formatted text for each action. Previously we were just showing json dumps of the actions taken. Pass screenshot at each step for openai, anthropic and binary operator agents Use text and image field in json passed to client for rendering both. Show actions, env screenshot after actions applied in train of thought. Showing the post action application screenshot seems more intuitive. Previously we were showing the screenshot used to decide next action. This pre action application screenshot was being shown after next action decided (in train of thought). This was anyway misleading about the actual ordering of events. Rendered response is now a structured payload (dict) passing image and text to be rendered up from operator to clients for rendering of train of thought.	2025-05-19 16:28:55 -07:00
Debanjum	a1d712e031	Add current cursor position to browser screenshots for ai, human view	2025-05-19 16:28:55 -07:00
Debanjum	1be3986537	Require explicit switch to enable operator locally for now Operator is still early in development. To enable it: - Set KHOJ_OPERATOR_ENABLE environment variable to true - Run any one of the commands below: - `pip install khoj[local]' - `pip install khoj[dev]' - `pip install playwright'	2025-05-19 16:28:55 -07:00
Debanjum	b395a438d0	Fix handling multiple actions requested by grounding agent in an iteration	2025-05-19 16:28:55 -07:00
Debanjum	e5415bdaee	Only reasoning agent should terminate run, not the grounding agent. Grounding agent does not have the full context and capabilities to make this call. Only let reasoning agent make termination decision. Add a wait action instead when grounder requests termination.	2025-05-19 16:28:55 -07:00
Debanjum	ffe58d2ec1	Parse goto, back actions directly from instruction for uitars grounder UI tars grounder doesn't like calling non-standard functions like goto, back. Directly parse visual reasoner instruction to bypass uitars grounder model. At least for goto and back functions grounding isn't necessary, so this works well.	2025-05-19 16:28:55 -07:00
Debanjum	7395af3c3a	Allow visual grounder of binary operator agent to see past actions Previously the grounding agent would be reset on every call. So it only saw the most recent instruction and screenshot to make its next action suggestion. This change allows the visual grounders to see past instructions and actions to prevent looping and encourage more exploratory action suggestions by it when stuck or seeing errors.	2025-05-19 16:28:55 -07:00
Debanjum	d8bc6239f8	Bifurcate visual grounder into a ui-tars specific & generic grounder Split visual grounder into two implementations: - A ui-tars specific visual grounder agent. This uses the canonical implementation of ui-tars with specialized system prompt and action parsing. - Fallback to generic visual grounder utilizing tool-use and served over any openai compatible api. This was previously being used for our ui-tars implementation as well.	2025-05-19 16:28:55 -07:00
Debanjum	c3bfb15fab	Support KeyUp, KeyDown operator actions. Make coordinates into floats	2025-05-19 16:28:55 -07:00
Debanjum	b279060e2c	Enable using Operator with Gemini models	2025-05-19 16:28:55 -07:00
Debanjum	0d8fb667ec	Add action results for multiple actions similar to other operator agents Adds the results of each action in a separate item in message content. Previously we were adding this as a single larger text blob. This change adds structure to simplify post processing (e.g truncation). The updated add_action_results should also require less work to generalize if we pass tool call history to grounding model as action results in valid openai format.	2025-05-19 16:28:55 -07:00
Debanjum	e17c06b798	Set operator query on init. Pass summarize prompt to summarize func The initial user query isn't updated during an operator run. So set it when initializing the operator agent. Instead of passing it on every call to act. Pass summarize prompt directly to the summarize function. Let it construct the summarize message to query vision model with. Previously it was being passed to the add_action_results func as previous implementation that did not use a separate summarize func. Also rename chat_model to vision_model for a more pertinent var name. These changes make the code cleaner and implementation more readable.	2025-05-19 16:28:55 -07:00
Debanjum	38bcba2f4b	Make back action in browser environment use goto to avoid timeouts For some reason the page.go_back() action in playwright had a much higher propensity to timeout. Use goto instead to reduce these page traversal timeouts. This requires tracking navigation history.	2025-05-19 16:28:55 -07:00
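A minimal sketch of the navigation-history idea described in the commit above, assuming a simple in-memory list of visited URLs. The `NavigationHistory` class and its method names are hypothetical and not taken from the Khoj codebase; the caller would pass the returned URL to the environment's goto action instead of calling `page.go_back()`.

```python
from typing import List, Optional


class NavigationHistory:
    """Hypothetical helper: track visited URLs so 'back' can be done via goto."""

    def __init__(self) -> None:
        self._visited: List[str] = []

    def record(self, url: str) -> None:
        # Ignore consecutive duplicates, e.g. a reload of the same page.
        if not self._visited or self._visited[-1] != url:
            self._visited.append(url)

    def previous_url(self) -> Optional[str]:
        # Nothing to go back to unless at least two pages were visited.
        if len(self._visited) < 2:
            return None
        # Drop the current page; the new last entry is the 'back' target.
        self._visited.pop()
        return self._visited[-1]
```

With this shape, a back action becomes a lookup followed by an ordinary goto, so it is handled the same way as any other page load.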
Debanjum	fd139d4708	Improve termination on task completion for binary operator agent Only let the visual reasoner handle terminating the operator run. Previously the grounder was also able to trigger termination. Make catching the termination by the reasoner more robust	2025-05-19 16:28:55 -07:00
Debanjum	680c226137	Use any supported vision model as reasoner for binary operator agent	2025-05-19 16:28:55 -07:00
Debanjum	3839d83b90	Modularize operator into separate files for agent, action, environment etc The previous browser_operator.py file had become pretty massive and unwieldy. This change breaks it apart into separate files for - the abstract environment and operator agent base - the concrete agents: anthropic, openai and binary - the concrete environment browser operator - the operator actions used by agents and environment	2025-05-19 16:28:55 -07:00
Debanjum	833c8ed150	Add a flexible operator agent using separate reasoning, grounder models - This operator works with models served over an openai compatible api - It uses separate vision models to reason and ground actions. This improves flexibility in the operator agents that can be created. We no longer need our operator agent to rely on monolithic models that can both reason over visual data and ground their actions. We can create operator agent from 2 separate models: 1. To reason over screenshots to suggest natural language next action 2. To ground those suggestions into visually grounded actions This allows us to create fully local operators or operators combining the best visual reasoner with the best visual grounder models.	2025-05-19 16:28:55 -07:00
Debanjum	773d20a26f	Improve instructions to the openai operator agent. Inform it can only control a single playwright browser page. Previously it was assuming it is operating a whole browser, so would have trouble navigating to different pages. Improve handling of error in action parsing	2025-05-19 16:28:55 -07:00
Debanjum	4db888cd62	Simplify operator loop. Make each OperatorAgent manage state internally. Stop OperatorAgent specific code from leaking out into the operator. The Operator just calls the standard OperatorAgent functions. Each OperatorAgent specific logic is handled by the OperatorAgent internally. This improves the separation of responsibility between the operator, OperatorAgent and the Environment. - Make environment pass screenshot data in agent agnostic format - Have operator agent providers format image data to their AI model specific format - Add environment step type to distinguish image vs text content - Clearly mark major steps in the operator iteration loop - Handle anthropic models returning computer tool actions as normal tool calls by normalizing next action retrieval from response for it - Remove unused ActionResults fields - Remove unnecessary placeholders to content of action results like for screenshot data	2025-05-19 16:28:55 -07:00
Debanjum	a1c9c6b2e3	Add pages visited via browser operator to references returned to clients	2025-05-19 16:28:55 -07:00
Debanjum	e71575ad1a	Render screenshot in train of thought on openai agent screenshot action	2025-05-19 16:28:55 -07:00
Debanjum	78e052bfcb	Decouple environment from operator agent to improve modularity Decouple applying action on Environment from next action decision by OperatorAgent - Create an abstract Environment class with a `step' method and a standardized set of supported actions for each concrete Environment - Wrap playwright page into a concrete Environment class - Create abstract OperatorAgent class with an abstract `act' method - Wrap Openai computer Operator into concrete OperatorAgent class - Wrap Claude computer Operator into a concrete OperatorAgent class Handle interaction between Agent's action	2025-05-19 16:28:55 -07:00
Debanjum	7c60e04efb	Pull out common iteration loop into main browser operator method	2025-05-19 16:28:54 -07:00
Debanjum	08e93c64ab	Render screenshot in train of thought on browser screenshot action Update web app to render screenshot image when screenshot action taken by browser operator	2025-05-19 16:28:54 -07:00
Debanjum	188b3c85ae	Force open links in current page to stay in operator page context Previously some link clicks would open in new tab. This is out of the browser operator's context and so the new page cannot be interacted with by the browser operator. This change catches new page opens and opens them in the context page instead.	2025-05-19 16:28:54 -07:00
Debanjum	20f87542e5	Add cancellation support to browser operator via asyncio.Event	2025-05-19 16:28:54 -07:00
Debanjum	9f75622346	Allow browser operator to use browser with existing context over CDP Give the Khoj browser operator access to browser with existing context (auth, cookies etc.) by starting it with CDP enabled. Process: 1. Start Browser with CDP enabled: `Edge/Chromium/Chrome --remote-debugging-port=9222' 2. Set the KHOJ_CDP_URL env var to the CDP url of the browser to use. 3. Start Khoj and ask it to get browser based work done with operator + research mode	2025-05-19 16:28:54 -07:00
Debanjum	b9ea538b02	Support operating web browser with Anthropic models - Add back() and goto(url) helper functions to operate browser - Cache operator messages to Anthropic API for speed and cost savings	2025-05-19 16:28:54 -07:00
Debanjum	2e86141575	Enable Khoj to use a GUI web browser. Operate it with Openai models	2025-05-19 16:28:54 -07:00
Debanjum	ab5d0b5878	Upgrade server dependencies	2025-05-19 16:28:21 -07:00
Debanjum	22cd638add	Fix handling unset openai_base_url to run eval with openai chat models The github run_eval workflow sets OPENAI_BASE_URL to empty string. The ai model api created during initialization for openai models gets set to empty string rather than None or the actual openai base url. This tries to call llm at the empty string base url instead of the default openai api base url, which obviously fails. Fix is to map empty base urls to the actual openai api base url.	2025-05-19 16:19:43 -07:00
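The fix described in the commit above can be sketched roughly as follows. The function name `resolve_base_url` and the constant are illustrative assumptions, not the names used in Khoj; only the default OpenAI API base URL is a known value.

```python
from typing import Optional

# Default base URL of the OpenAI API.
DEFAULT_OPENAI_BASE_URL = "https://api.openai.com/v1"


def resolve_base_url(configured: Optional[str]) -> str:
    """Hypothetical helper: treat unset or empty base URLs as the default."""
    # An env var set to "" (or whitespace) should behave like an unset one.
    if configured is None or not configured.strip():
        return DEFAULT_OPENAI_BASE_URL
    return configured.strip()
```

Normalizing once, at the point where the configuration is read, means later code never has to tell `None` apart from an empty string.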
Debanjum	cf55582852	Retry on empty response or error in chat completion by llm over api Previously all exceptions were being caught. So retry logic wasn't getting triggered. Exception catching had been added to close llm thread when threads instead of async was being used for final response generation. This isn't required anymore since moving to async. And we can now re-enable retry on failures. Raise error if response is empty to retry llm completion.	2025-05-19 11:27:19 -07:00
Debanjum	7827d317b4	Widen vision support for chat models served via openai compatible api Send image as png to non-openai models served via an openai compatible api. As more models support png than webp. Continue storing images as webp on server for efficiency. Convert to png at the openai api layer and only for non-openai models served via an openai compatible api. Enable using vision models like ui-tars (via llama.cpp server), grok.	2025-05-19 11:27:19 -07:00
Debanjum	4f3fdaf19d	Increase khoj api response timeout on evals call. Handle no decision	2025-05-18 19:14:49 -07:00
Debanjum	31dcc44c20	Output tokens >> reasoning tokens to avoid early response termination.	2025-05-18 14:45:23 -07:00
Debanjum	73e28666b5	Fix to set default chat model for all user tiers via env var	2025-05-18 14:45:23 -07:00
Debanjum	06dcd4426d	Improve Research Mode Context Management (#1179 ) ### Major * Do more granular truncation on hitting context limits * Pack research iterations as list of message content instead of separate messages * Update message truncation logic to truncate items in message content list * Make researcher aware of number of web, doc queries allowed per iteration ### Minor * Prompt web page reader to extract quantitative data as is from pages * Track gemini 2.0 flash lite cost. Reduce max prompt size for 4o-mini * Ensure time to first token logged only once per chat response * Upgrade tenacity to respect min_time passed to exponential backoff with jitter function	2025-05-17 17:38:31 -07:00
Debanjum	fd591c6e6c	Upgrade tenacity to respect min time for exponential backoff Fix for issue is in tenacity 9.0.0. But older langchain required tenacity <9.0.0. Explicitly pin version of langchain sub packages to avoid indexing and doc parsing breakage.	2025-05-17 17:37:15 -07:00
Debanjum	988bde651c	Make researcher aware of no. of web, doc queries allowed per iteration - Construct tool description dynamically based on configurable query count - Inform the researcher how many webpage reads, online searches and document searches it can perform per iteration when it has to decide which next tool to use and the query to send to the tool AI. - Pass the query counts to perform from the research AI down to the tool AIs	2025-05-17 17:37:15 -07:00
Debanjum	417ab42206	Track gemini 2.0 flash lite cost. Reduce max prompt size for 4o-mini	2025-05-17 17:37:15 -07:00
Debanjum	e125e299a7	Ensure time to first token logged only once per chat response Time to first token log lines were shown multiple times if the new chunk being streamed was empty for some reason. This change makes the logic robust to empty chunks being received.	2025-05-17 17:37:15 -07:00
Debanjum	2694734d22	Update truncation logic to handle multi-part message content	2025-05-17 17:37:15 -07:00
Debanjum	a337d9e4b8	Structure research iteration msgs for more granular context management Previously research iterations and conversation logs were added to a single user message. This prevented truncating each past iteration separately on hitting context limits. So the whole past research context had to be dropped on hitting context limits. This change splits each research iteration into a separate item in a message content list. It uses the ability for message content to be a list, that is supported by all major ai model apis like openai, anthropic and gemini. The change in message format seen by pick next tool chat actor: - New Format - System: System Message - User/Assistant: Chat History - User: Raw Query - Assistant: Iteration History - Iteration 1 - Iteration 2 - User: Query with Pick Next Tool Nudge - Old Format - User: System + Chat History + Previous Iterations Message - User: Query - Collateral Changes The construct_structured_message function has been updated to always return a list[dict[str, Any]]. Previously it'd only use list if attached_file_context or vision model with images for wider compatibility with other openai compatible api	2025-05-17 17:37:15 -07:00
Debanjum	0f53a67837	Prompt web page reader to extract quantitative data as is from pages Previously the research agent would have a hard time getting quantitative data extracted by the web page reader tool AI. This change aims to encourage the web page reader tool to extract relevant data in verbatim form for higher granularity research and responses.	2025-05-17 17:37:15 -07:00
Debanjum	99a2305246	Improve tool chat history constructor and fix its usage during research. Code tool should see code context and webpage tool should see online context during research runs. Fix to include code context from past conversations to answer queries. Add all queries to tool chat history when no specific tool to limit extracting inferred queries for is provided.	2025-05-17 17:37:15 -07:00
Debanjum	8050173ee1	Timeout calls to khoj api in evals to continue to next question	2025-05-17 17:37:11 -07:00
Debanjum	442c7b6153	Retry running code on more request exception	2025-05-17 17:37:11 -07:00
Debanjum	10a5d68a2c	Improve retry, increase timeouts of gemini api calls - Catch specific retryable exceptions for retry - Increase httpx timeout from default of 5s to 20s	2025-05-17 16:38:55 -07:00
Debanjum	20f08ca564	Reduce timeouts on calling local and online llms via openai api - Use much larger read, connect timeout if llm served over local url - Use larger timeout duration than default (5s) for online llms too This matches the timeout duration increase for calls to gemini api	2025-05-17 16:38:55 -07:00
Debanjum	e0352cd8e1	Handle unset ttft in metadata of failed chat response. Fixes evals. This was causing evals to stop processing rest of batch as well.	2025-05-17 15:06:22 -07:00
Debanjum	673a15b6eb	Upgrade hf hub package to include hf_xet for faster downloads	2025-05-17 15:06:22 -07:00
Debanjum	d867dca310	Fix send_message_to_model_wrapper by using sync is_user_subscribed check Calling an async function from a sync function wouldn't work.	2025-05-17 15:06:22 -07:00
Sajjad Baloch	a4ab498aec	Update README for better contributions (#1170 ) - Improve overall flow of the contribute section of Readme - Fix where to look for good first issues. The contributors board is outdated. Easier to maintain and view good-first-issue with issue tags directly. Co-authored-by: Debanjum <debanjum@gmail.com>	2025-05-12 09:51:01 -06:00
Debanjum	2feed544a6	Add Gemini 2.0 flash back to default gemini chat models list Remove once gemini 2.5 flash is GA	2025-05-11 19:05:09 -06:00
Debanjum	2e290ea690	Pass conversation history to generate non-streaming chat model responses Allows send_message_to_model_wrapper func to also use conversation logs as context to generate response. This is an optional parameter	2025-05-09 00:02:14 -06:00
Debanjum	8787586e7e	Dedupe code to format messages before sending to appropriate chat model Fallback to assume not a subscribed user if user not passed. This allows user arg to be actually optional in the async send_message_to_model_wrapper function	2025-05-09 00:02:14 -06:00
Debanjum	e94bf00e1e	Add cancellation support to research mode via asyncio.Event	2025-05-09 00:01:45 -06:00
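A minimal sketch of cooperative cancellation with `asyncio.Event`, the mechanism named in the commit above. The `run_iterations` function is a hypothetical stand-in for a research loop and is not taken from the Khoj codebase.

```python
import asyncio
from typing import List


async def run_iterations(max_iterations: int, cancel: asyncio.Event) -> List[int]:
    """Hypothetical loop: stop early once the cancellation event is set."""
    completed: List[int] = []
    for i in range(max_iterations):
        # Check the flag between iterations, never mid-step.
        if cancel.is_set():
            break
        await asyncio.sleep(0)  # stand-in for one unit of real work
        completed.append(i)
    return completed
```

Whoever holds the same event (for example a request handler that notices a client disconnect) calls `cancel.set()`, and the loop exits at its next check without raising.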
Debanjum	1572781946	Parse and show reasoning model thoughts (#1172 ) ### Major All reasoning models return thoughts differently due to lack of standardization. We normalize thoughts by reasoning models and providers to ease handling within Khoj. The model thoughts are parsed during research mode when generating final response. These model thoughts are returned by the chat API and shown in train of thought shown on web app. Thoughts are enabled for Deepseek, Anthropic, Grok and Qwen3 reasoning models served via API. Gemini and Openai reasoning models do not show their thoughts via standard APIs. ### Minor - Fix ability to use Deepseek reasoner for intermediate stages of chat - Enable handling Qwen3 reasoning models	2025-05-02 20:29:38 -06:00
Debanjum	2cd7302966	Parse Grok reasoning model thoughts returned by API	2025-05-02 19:59:17 -06:00
Debanjum	8cadb0dbc0	Parse Anthropic reasoning model thoughts returned by API	2025-05-02 19:59:13 -06:00
Debanjum	ae4e352b42	Fix formatting to use Deepseek reasoner for completion via OpenAI API Previously Deepseek reasoner couldn't be used via API for completion because the additional formatting constraints it required were not being applied in this function. The formatting fix was only being applied in the chat completion endpoint.	2025-05-02 19:11:16 -06:00
Debanjum	61a50efcc3	Parse DeepSeek reasoning model thoughts served via OpenAI compatible API DeepSeek reasoners return reasoning in the reasoning_content field. Create an async stream processor to parse the reasoning out when using the deepseek reasoner model.	2025-05-02 19:11:16 -06:00
Debanjum	16f3c85dde	Handle thinking by reasoning models. Show in train of thought on web client	2025-05-02 19:11:16 -06:00
Debanjum	d10dcc83d4	Only enable reasoning by qwen3 models in deepthought mode	2025-05-02 18:36:49 -06:00
Debanjum	6eaf54eb7a	Parse Qwen3 reasoning model thoughts served via OpenAI compatible API The Qwen3 reasoning models return thoughts within <think></think> tags before response. This change parses the thoughts out from final response from the response stream and returns as structured response with thoughts. These thoughts aren't passed to client yet	2025-05-02 18:36:45 -06:00
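The `<think></think>` convention described in the commit above can be illustrated with a small sketch. It works on a completed response string rather than a live stream, so it is a simplification of the stream parsing the commit describes; `split_thought` is a hypothetical name.

```python
import re
from typing import Tuple

# Non-greedy match so only the first think block is captured.
THINK_BLOCK = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_thought(text: str) -> Tuple[str, str]:
    """Hypothetical helper: return (thought, response) from a full model output."""
    match = THINK_BLOCK.search(text)
    if match is None:
        # No thinking block: the whole text is the response.
        return "", text.strip()
    thought = match.group(1).strip()
    # Remove the matched block and keep everything around it as the response.
    response = (text[: match.start()] + text[match.end():]).strip()
    return thought, response
```

A streaming variant would additionally need to buffer partial tags that are split across chunks, which this sketch does not attempt.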
Debanjum	7b9f2c21c7	Parse thoughts from thinking models served via OpenAI compatible API OpenAI API doesn't support thoughts via chat completion by default. But there are thinking models served via OpenAI compatible APIs like deepseek and qwen3. Add stream handlers and modified response types that can contain thoughts as well apart from content returned by a model. This can be used to instantiate stream handlers for different model types like deepseek, qwen3 etc served over an OpenAI compatible API.	2025-05-02 17:49:16 -06:00
Debanjum	6843db1647	Use conversation specific chat model to respond to free tier users Recent changes enabled free tier users to switch free tier chat models per conversation or the default. This change enables free tier users to generate responses with their conversation specific chat model. Related: #725, #1151	2025-05-02 17:48:48 -06:00
Debanjum	5b5efe463d	Remove inline base64 images from webpages read with Firecrawl	2025-05-02 14:11:27 -06:00
Debanjum	559b323475	Support attaching jupyter/ipython notebooks from the web app to chat	2025-05-02 14:11:27 -06:00
sabaimran	dab6977fed	add number 1 repo of day badge	2025-04-23 16:49:12 -07:00
Debanjum	964a784acf	Release Khoj version 1.41.0	2025-04-23 19:01:27 +05:30
Debanjum	23dae72420	Update default models: Gemini models to 2.5 series, Gpt 4o to 4.1	2025-04-23 18:40:38 +05:30
Debanjum	d84a0f6e2c	Use latest node base image to build web app for khoj docker image	2025-04-23 17:53:33 +05:30
Debanjum	dd46bcabc2	Track gpt-4.1 model costs. Set prompt size of new gemini, openai models	2025-04-23 17:53:33 +05:30
Debanjum	87262d15bb	Save conversation to DB in the background, as an asyncio task	2025-04-22 17:42:33 +05:30
Debanjum	f929ff8438	Simplify AI Chat Response Streaming (#1167 ) Reason --- - Simplify code and logic to stream chat response by solely relying on asyncio event loop. - Reduce overhead of managing threads to increase efficiency and throughput (where possible). Details --- - Use async/await with no threading when generating chat response via OpenAI, Gemini, Anthropic AI model APIs - Use threading for offline chat model as llama-cpp doesn't support async streaming yet	2025-04-21 14:28:02 +05:30
Debanjum	a4b5842ac3	Remove ThreadedGenerator class, previously used to stream chat response	2025-04-21 14:16:40 +05:30
Debanjum	763fa2fa79	Refactor Offline chat response to stream async, with separate thread	2025-04-21 10:48:38 +05:30
Debanjum	932a9615ef	Refactor Anthropic chat response to stream async, no separate thread	2025-04-21 10:46:07 +05:30
Debanjum	a557031447	Refactor Gemini chat response to stream async, no separate thread	2025-04-21 10:46:07 +05:30
Debanjum	0751f2ea30	Refactor Openai chat response to stream async, no separate thread - Refactor chat API to use async/await for Openai streaming - Fix and clean Openai chat response async streaming	2025-04-21 10:44:49 +05:30
Debanjum	c93c0d982e	Create async get anthropic, openai client funcs, move to reusable package This package is where the get openai client functions also reside.	2025-04-21 09:30:26 +05:30
Debanjum	973aded6c5	Fix system prompt to make openai reasoning models md format response	2025-04-20 20:33:45 +05:30
Debanjum	21d19163ba	Just pass user rather than whole request object to doc search func	2025-04-20 20:33:45 +05:30
Debanjum	b2390fa977	Allow attaching typescript files to chat on web app	2025-04-19 19:08:11 +05:30
Debanjum	4d331e5ad2	Bump documentation dependencies	2025-04-19 18:38:31 +05:30
Debanjum	d6aafef464	Fix formatting of FAQ section in README.md	2025-04-19 18:31:16 +05:30
Debanjum	8f9090940b	Resolve datetime utcnow deprecation warnings (#1164 ) # PR Summary This small PR resolves the deprecation warnings on `datetime` in Python3.12+. You can find them in the [CI logs](https://github.com/khoj-ai/khoj/actions/runs/14538833837/job/40792624987#step:9:134): ```python /__w/khoj/khoj/src/khoj/processor/content/images/image_to_entries.py:61: DeprecationWarning: datetime.datetime.utcnow() is deprecated and scheduled for removal in a future version. Use timezone-aware objects to represent datetimes in UTC: datetime.datetime.now(datetime.UTC). timestamp_now = datetime.utcnow().timestamp() ```	2025-04-19 18:26:52 +05:30
Debanjum	5441793a10	Allow AI model switching based on User Tier (#1151 ) Overview --- Enable free tier users to chat with any AI model made available on free tier of production deployments like [Khoj cloud](https://app.khoj.dev). Previously model switching was completely disabled for users on free tier. Details --- - Track price tier of each Chat, Speech, Image, Voice AI model in DB - Update API to allow free tier users to switch between free models - Update web app to allow model switching on agent creation, settings chat page (via right side pane), even for free tier users.	2025-04-19 18:14:37 +05:30
Debanjum	ab29ffd799	Fix web app packaging for pypi since upgrade to python 3.11.12 in CI	2025-04-19 18:03:29 +05:30
Debanjum	79fc911633	Enable free tier users to switch between free tier AI models - Update API to allow free tier users to switch between free models - Update web app to allow model switching on agent creation, settings chat page (via right side pane), even for free tier users. Previously the model switching APIs and UX fields on web app were completely disabled for free tier users	2025-04-19 17:29:53 +05:30
Debanjum	30570e3e06	Track Price tier for each Chat, Speech, Image, Voice AI model in DB Enables users on free plan to choose AI models marked for free tier	2025-04-19 09:44:33 +05:30
Debanjum	fdaf51f0ea	Fix formatting in readme and documentation	2025-04-19 00:36:47 +05:30
Emmanuel Ferdman	fee1d3682b	Resolve datetime deprecation warnings Signed-off-by: Emmanuel Ferdman <emmanuelferdman@gmail.com>	2025-04-18 10:41:16 -07:00
Debanjum	eb1406bcb4	Support deepthought in research mode with new Gemini 2.5 reasoning model The 2.5 flash model is the first hybrid reasoning model by Google - Track costs of thoughts separately as they are priced differently	2025-04-18 17:37:05 +05:30
Debanjum	f95173bb0a	Support deepthought in research mode with new Grok 3 reasoning model Rely on deepthought flag to control reasoning effort of low/high for the grok model This is different from the openai reasoning models which support low/medium/high and for which we use low/medium effort based on the deepthought flag Note: grok is accessible over an openai compatible API	2025-04-18 17:37:05 +05:30
Debanjum	9c70a0f3f5	Support recently released Openai reasoning models - Rely on deepthought flag to control reasoning effort - Generalize Openai reasoning model check for all o- series models	2025-04-18 17:32:29 +05:30
Debanjum	2f8283935a	Warn and drop empty messages when formatting messages for Anthropic Log dropped empty messages to debug this unexpected state. Related `0eb2d17`	2025-04-18 17:32:29 +05:30
Debanjum	51e19c6199	Simplify KHOJ_DOMAIN states. All production deployments should set it. Do not need KHOJ_DOMAIN to be tri-state. KHOJ_DOMAIN set to empty does not change behavior anymore. Related `5a3c7b1`	2025-04-18 17:32:29 +05:30
Debanjum	e072530471	Deduplicate images generated using the e2b code tool Disregard chart types, as rich chart rendering is not used and they duplicate the chart images that are rendered. Disregard text output associated with generated image files	2025-04-18 17:32:29 +05:30
RIKIN BRIGHT	dc398d30f8	Add FAQ section to Readme and Troubleshooting Tips to setup GCP Vertex (#1158) Added a “Troubleshooting & Tips” section to the GCP Vertex documentation. This section provides guidance for self-hosted users on common issues they may encounter when setting up Google Vertex AI integration in Khoj. Topics covered include permissions, region compatibility, prompt size limits, API key testing, and secure key management with environment variables. The goal is to improve the onboarding experience and reduce setup errors for contributors and self-hosters using Vertex AI models like Claude and Gemini. Signed off by: brightally6@gmail.com	2025-04-15 08:19:44 +05:30
sabaimran	6a30da3e9e	Fix default state for tools in the agent settings for the chat sidebar	2025-04-11 11:12:22 -07:00
Debanjum	2470eea421	Release Khoj version 1.40.0	2025-04-11 18:10:56 +05:30
Debanjum	d0a933b072	Add email based rate limiting to email login API endpoint Server: - Rate limit based on unverified email before creating user - Check email address for deliverability before creating user - Track rate limit for unverified email in new non-user keyed table Web app: - Show error in login popup to user on failure/throttling - Simplify login popup logic by moving magic link handling logic into EmailSigninContext instead of passing require props via parent	2025-04-11 17:49:18 +05:30
Debanjum	fe308c2911	Handle scenario where no valid otps for selected users on admin panel	2025-04-11 17:49:18 +05:30
Debanjum	02a6ce9f14	Upgrade server django dependencies	2025-04-11 17:49:18 +05:30
Debanjum	d84a0abb7f	Fix and Improve Chat sidebar and component setup on Web App (#1157) - Set chatSidebar prompt, Setting name fields to empty str if value null - Track if agent modified in chatSidebar to simplify code, fix looping - Suppress spurious dark mode hydration warnings on the web app - Set key for chatMessage parent to get UX efficiently updated by react - Let only root next.js layout handle html, body tags, not child layouts	2025-04-11 16:12:03 +05:30
Dmitry	50b0b8a6e7	Fix typo in Development documentation (#1159)	2025-04-11 16:05:40 +05:30
Darya	f609a2d050	Fix typos in admin panel documentation (#1156) These adjustments should make the text clearer and more accurate.	2025-04-11 07:00:17 +05:30
Debanjum	2935ea52cf	Set chatSidebar prompt, Setting name fields to empty str if value null TextArea and Input field values cannot be null.	2025-04-10 19:59:01 +05:30
Debanjum	aea7b90fec	Track if agent modified in chatSidebar to simplify code, fix looping Previously the sidebar could recurse on opening chat page (from home?) due to child modelSelector component updating parent chatSidebar prop which was passed back down to it in a loop. The chatSidebar decides if agent has been modified in a single useEffect and enables the Save button accordingly. - Track agent modification wrt agent info received from server in chatSidebar instead. - Reduce modelSelector's mandate to just notify when the user changes the model. - Fix to infer, show & update agent state from chat sidebar on web app This logic is fragile and convoluted because: - the default agent chat model is dynamically determined. - need to disambiguate tools not set vs none set vs all set by user The default agent's tool selection is stored as undefined to show not set scenario, which allows for all tools to be dynamically used by agent. But the user can also set no tools or all tools for their agents. All 3 scenarios are handled differently. - Track tools to be displayed vs tools to be stored	2025-04-10 19:59:01 +05:30
Debanjum	e9ee9004fb	Suppress spurious dark mode hydration warnings on the web app This is triggered by mismatch between "dark" class present on server sent layout but not in client sent layout on initial render. That mismatch exists because the server applies dark-mode styling early to avoid FOUC flickering of UX. Related `43e032e`	2025-04-10 19:59:01 +05:30
Debanjum	9ab5ead3ca	Set key for chatMessage parent to get UX efficiently updated by react By fixing the no key prop in ChatHistory error on web app	2025-04-10 19:59:01 +05:30
Debanjum	1ad7314fe6	Let only root next.js layout handle html, body tags, not child layouts Remove html, body elements from child page layouts. Let only the root layout handle it. Next.js router structure mounts child layouts inside parent layouts, as defined by their directory hierarchy. So the html, body component should only be defined in the parent layout. This avoids the child layout mounting its html, body component within the actual root layout's existing html, body component.	2025-04-10 19:59:01 +05:30
Debanjum	33665dee50	Dynamically set default agent chat model to server > user > first chat model Previously the chat model associated with the default agent was always the first chat model populated on the server. This doesn't match behavior of the rest of the system, where the server chat settings is preferred over the user chat settings over the first chat model. This change brings the default agent's chat model in line with the preference order used in the rest of the system.	2025-04-10 19:58:01 +05:30
Debanjum	1eb092010c	Fix handling unset response_schema being passed to gemini models Use of `is_none_or_empty' doesn't work well with classes, which response_schema can get set to.	2025-04-10 19:58:01 +05:30
Debanjum	5b248e8515	Fix fallback to default agent if none set for conversation Previous change to fallback to default agent was not functional. It would error out if the conversation agent wasn't set when trying to get conversation.agent.slug for calling aget_agent_by_slug func	2025-04-10 19:54:01 +05:30
Debanjum	4012a6372f	Fix pgvector docker image pull by using postgres 15 tag We were previously relying on an older, unmaintained version of pgvector docker image, ankane/pgvector. Moving to new docker image requires selecting from tags based on the pg major version (14, 15, 16 or 17). This change uses pg15 tag to resolve image pull. Note: we use postgres 15 for khoj docker images currently Fixes #1154	2025-04-08 18:07:08 +05:30
Debanjum	19618605a5	Upgrade documentation packages	2025-04-07 20:07:10 +05:30
Debanjum	3fc1435cd1	Fix login to local admin panel without need to set KHOJ_DEBUG Issue introduced in commit `5a3c7b1`. Usage of KHOJ_DOMAIN --- KHOJ_DOMAIN is tri-state for local, official and other production deployments: - If KHOJ_DOMAIN is unset (for local): - sets CSRF cookie to localhost - adds khoj.dev variants to ALLOWED_HOSTS, CSRF_TRUSTED_ORIGINS - adds app.khoj.dev variants to CORS origins - If KHOJ_DOMAIN is set to empty (for official): - sets CSRF to khoj.dev - adds khoj.dev variants to ALLOWED_HOSTS, CSRF_TRUSTED_ORIGINS - adds app.khoj.dev variants to CORS origins - If KHOJ_DOMAIN is set (for other prod deployments): - sets CSRF cookie to KHOJ_DOMAIN - adds KHOJ_DOMAIN variants to ALLOWED_HOSTS, CSRF_TRUSTED_ORIGINS - adds KHOJ_DOMAIN variants to CORS origins Related #1137, #1152 Resolves #1123	2025-04-07 19:43:58 +05:30
Debanjum	ed70d2254e	Suppress spurious RequestAborted ASGI errors on the admin panel Unsure why this error triggers on every request to the Django admin panel these days but all the requests are completing fine and the client is clearly not aborting the request when the RequestAborted exception is raised. Suppress these errors for now via middleware to prevent them from unnecessarily cluttering up the server logs and confusing folks. Related #1152	2025-04-07 19:43:58 +05:30
Debanjum	353a4aa597	Upgrade pgvector to use the new official docker image - ankane/pgvector has been deprecated in place of pgvector/pgvector - use full path to image on docker.io for consistency	2025-04-07 19:43:58 +05:30
Debanjum	50508d97f9	Disable telemetry only via KHOJ_TELEMETRY_DISABLE environment variable Do not unnecessarily overload usage of KHOJ_DEBUG environment variable	2025-04-07 19:43:58 +05:30
sabaimran	2ae8c62547	Repopulate the client API-key generation section	2025-04-07 06:54:47 -07:00
Debanjum	6872817d41	Remove request to set default chat model during interactive init It wasn't being set correctly and seems unnecessary as can switch to desired chat models via the settings page or chat side pane easily	2025-04-07 14:38:55 +05:30
Debanjum	0c257c044e	Handle unset response_schema being passed to gemini models	2025-04-05 23:12:35 +05:30
Debanjum	c1912f8ca7	Default eval to use 10 iterations for research mode	2025-04-05 10:09:58 +05:30
Debanjum	645c2bc546	Improve Khoj is ready message	2025-04-05 10:09:28 +05:30
Debanjum	6e61ec64a4	Release Khoj version 1.39.0	2025-04-04 20:35:50 +05:30
Debanjum	751215a701	Improve response quality with Gemini. Improve evaluation harness (#1150) ### Improve Gemini usage - Allow text tool to give agent ability to terminate research - Set default context for gemini 2 flash models 2x context window for small, commercial models to 120K - Default temperature of Gemini models to 1.0 to reduce repetition ### Improve evaluation harness - Add more knobs to control eval workflow - Allow running eval with any chat model served over an openai compatible api - Control random sampling from eval set - Auto read web page - Use embedded postgres instead of postgres server for eval workflow - Use Gemini 2.0 flash as evaluator. Set seed for evaluator to reduce decision variance	2025-04-04 20:17:36 +05:30
Debanjum	7f18bc0840	Add default context for gemini 2 flash. 2x it for small, commercial models Previously Gemini 2 flash and flash lite were using context window of 10K by default as no defaults were added for it. Increase default context for small commercial models to 120K from 60K as cheaper and faster than their pro model equivalents at 60K context.	2025-04-04 20:11:00 +05:30
Debanjum	47a081c7bd	Allow text tool to give agent ability to terminate research We'd moved research planner to only use tools in enum of schema. This enum tool enforcement prevented model from terminating research by setting tool field to empty. Fix the issue by adding text tool to research tools enum and tell model to use that to terminate research and start response instead.	2025-04-04 20:11:00 +05:30
Debanjum	38dd02afbf	Make ordering of fields expected by research planner consistent Make research planner consistently select tool before query. As the model should tune its query for the selected tool. It got space to think about tool to use in the scratchpad already.	2025-04-04 20:11:00 +05:30
Debanjum	443c5a4420	Consistently wrap queries in online search prompt in double quotes The queries field name in the first example isn't wrapped in double quotes, rest are.	2025-04-04 20:11:00 +05:30
Debanjum	ae8fb6f9ac	Default temperature of Gemini models to 1.0 to try to avoid repetition This is the default temperature for non-thinking gemini models on ai studio. See if using this alleviates the problem.	2025-04-04 20:11:00 +05:30
Debanjum	e9928d3c50	Eval more models, control randomization & auto read webpage via workflow - Control auto read webpage via eval workflow. Prefix env var with KHOJ_ Default to false as it is the default that is going to be used in prod going forward. - Set openai api key via input param in manual eval workflow runs - Simplify evaluating other chat models available over openai compatible api via eval workflow. - Mask input api key as secret in workflow. - Discard unnecessary null setting of env vars. - Control randomization of samples in eval workflow. If randomization is turned off, it'll take the first SAMPLE_SIZE items from the eval dataset instead of a random collection of SAMPLE_SIZE items.	2025-04-04 20:11:00 +05:30
Debanjum	911e1bf981	Use gemini 2.0 flash as evaluator. Set seed for it to reduce eval variance. Gemini 2.0 flash model is cheaper and better than Gemini 1.5 pro	2025-04-04 20:11:00 +05:30
Debanjum	0dcb2544d7	Use embedded postgres instead of postgres server for eval workflow	2025-04-04 20:11:00 +05:30
sabaimran	abfdf7b1fb	Temporarily remove the OTP feature on the settings page while we fix our number	2025-04-03 19:41:18 -05:00
Artem Yurchenko	1ef8c37c3a	Implement better template for feature request issue (#1132) This PR implements a new feature request template with a few UX/UI improvements. Key changes: - Use of GitHub forms. - Provide note info for a submitter about feature request submitting rules. - Adds a few handy fields like "Describe the feature" or "Use Case" Overall, with a template like this feature requests will be more structured and meaningful.	2025-04-03 15:41:04 +05:30
Debanjum	17997d26c5	Make generated code block extractor less strict to improve code tool	2025-04-03 14:07:15 +05:30
Debanjum	66e9ddb6be	Support OpenAI (API compatible) models and Firecrawl in eval workflow	2025-04-03 14:03:29 +05:30
Debanjum	ef5b19479f	Improve setup of openai compatible speech to text, image models Only setup speech to text and text to image models served via openai compatible APIs when explicitly specified during initialization. This avoids setup of whisper and dalle when an openai compatible API is being setup instead of the openai API itself.	2025-04-03 13:44:19 +05:30
Debanjum	c2b9006a7a	Improve Gemini Response Reliability (#1148) - Specify min, max number of list items expected in AI response via JSON schema enforcement. Used by Gemini models - Warn and drop invalid/empty messages when formatting messages for Gemini models - Make Gemini response adhere to the order of the schema property definitions - Improve agent creation safety checker by using response schema, better prompt	2025-04-03 13:42:35 +05:30
Debanjum	f77e871cc8	Improve agent creation safety checker with response schema, better prompt	2025-04-03 02:49:40 +05:30
Debanjum	aab010723c	Make Gemini response adhere to the order of the schema property definitions Without explicitly using the property ordering field, gemini returns responses in alphabetically sorted property order. We want the model to respect the schema property definition order. This ensures control during development to maintain response quality. For example in CoT make it fill scratchpad before answers.	2025-04-03 01:19:48 +05:30
Debanjum	ae9ca58ab9	Specify min, max items expected in ai response via schema enforcement Require at least 1 item in lists. Otherwise gemini flash will sometimes return an empty list. For chat actors where max items is known, set that as well. OpenAI API does not support specifying min, max items in response schema lists, so drop those properties when response schema is passed. Add other enforcements to response schema to comply with response schema format expected by OpenAI API.	2025-04-03 01:19:39 +05:30
Debanjum	0eb2d17771	Warn and drop invalid messages when formatting messages for gemini Previously we were setting message content part with empty text. This results in error from Gemini API. Warn and drop such messages instead. Log empty message content found during construction to root-cause the issue but allow Khoj to respond without the offending messages in context for call to Gemini API.	2025-04-03 01:11:22 +05:30
sabaimran	dd957bedd3	Remove pgserver in repo from git tracking	2025-03-31 15:07:24 -05:00
sabaimran	d53b740197	Improve online search and allow server to skip auto webpage read	2025-03-31 13:52:48 -05:00
Debanjum	177560655d	Fix and Improve Online Search and Webpage Read (#1147) New - Support Firecrawl as an online search provider Improve - Fallback to other enabled online search providers on failure - Speed up online search with Jina by excluding webpage content in search results Fix - Fix Jina webpage reader. Improve it to include generated alt text to each image on webpage - Truncate online query to Serper if query exceeds max supported length	2025-04-01 00:09:46 +05:30
Debanjum	d62dd4ef61	Support Firecrawl as an online search provider	2025-03-31 17:06:00 +05:30
Debanjum	3939e995e4	Fallback to enabled, lower priority online search providers on error Make serper.dev higher priority than official google serp api because it provides more detailed results with knowledge cards etc.	2025-03-31 17:05:44 +05:30
Debanjum	9b7442f28f	Truncate online query to Serper if query exceeds max supported length Previously a query to Serper longer than the max supported length would throw an error instead of returning at least some results. Truncating the online search query to Serper to the max supported length mitigates that issue.	2025-03-31 17:01:42 +05:30
Debanjum	db7eba56f6	Fix webpage read and improve web search with Jina - Improve webpage read to include image alt text - Improve Jina webpage search to not include each page content - Use POST instead of GET for web search, webpage read with Jina	2025-03-31 17:01:42 +05:30
Debanjum	db68372b81	Update code sandbox prompts to allow network access when using E2B Tell Khoj code writing chat actor that it has access to the network and can use the python requests library in the E2B code sandbox.	2025-03-31 15:33:47 +05:30
Debanjum	5b8c2989d6	Add hover text on button to unshare a conversation on web app	2025-03-31 15:32:43 +05:30
Debanjum	85d627ceb0	Simplify docs to self-host with pip since can use embedded DB now Remove postgres setup instructions from self host with pip docs. It is unnecessary if embedded postgres DB works on the operating system.	2025-03-30 00:17:24 +05:30
Debanjum	713ba06a8d	Release Khoj version 1.38.0	2025-03-29 18:30:06 +05:30
Debanjum	e9132d4fee	Support attaching programming language file types to web app for chat	2025-03-29 01:22:35 +05:30
Debanjum	bdb6e33108	Install pgserver only when `pip install khoj[local]' is enabled This avoids installing pgserver on linux arm64 docker builds, which it doesn't currently support and isn't required to support as Khoj docker images can use standard postgres server made available via our docker-compose.yml	2025-03-29 00:27:19 +05:30
Debanjum	5ee513707e	Use embedded postgres db to simplify self-hosted setup (#1141) Use pgserver python package as an embedded postgres db, installed directly as a khoj python package dependency. This significantly simplifies self-hosting with just a `pip install khoj'. No need to also install postgres separately. Still use standard postgres server for multi-user, production use-cases.	2025-03-29 00:03:55 +05:30
Debanjum	56b63f95ea	Suggest Google image gen model, new Anthropic chat models on first run - Update default anthropic chat models to latest good models. - Now that Google supports a good text to image model, suggest adding it if Google AI API is setup on first run.	2025-03-28 23:07:17 +05:30
Ikko Eltociear Ashimine	1e34de69e9	Fix spelling in Automations Docs (#1140) Recieve -> Receive	2025-03-28 23:07:06 +05:30
Debanjum	72986c905a	Fix default agent creation to allow chat on first run Previously agent slug was not considered on create even when passed explicitly in agent creation step. This made the default agent slug different until next run when it was updated after creation. And didn't allow chat to work on first run. The fix to use the agent slug when explicitly passed allows users to chat on first run.	2025-03-28 22:49:00 +05:30
Debanjum	03de2803f0	Fallback to default agent for chat when unset in get conversation API	2025-03-28 00:56:18 +05:30
Debanjum	a387f638cd	Enforce json schema on more chat actors to improve schema compliance Including infer webpage urls, gemini documents search, pick default mode tools chat actors	2025-03-28 00:56:18 +05:30
Debanjum	ccd9de7792	Improve safety settings for Gemini chat models - Align remaining harm categories to only refuse in high harm scenarios as well - Handle response for new "negligible" harm probability as well	2025-03-28 00:56:18 +05:30
Debanjum	2ec5cf3ae7	Normalize type of chat messages arg sent to Anthropic completion funcs Previously messages got Anthropic specific formatting done before being passed to Anthropic (chat) completion functions. Move the code to format messages of type list[ChatMessage] into Anthropic specific format down to the Anthropic (chat) completion functions. This allows the rest of the functionality like prompt tracing to work with normalized list[ChatMessage] type of chat messages across AI API providers	2025-03-26 18:24:17 +05:30
Debanjum	4085c9b991	Fix infer webpage url step actor to request up to specified max urls Previously we'd always request up to 3 webpage urls via the prompt but read only one of the requested webpage urls. This would degrade quality of research and default mode. As model may request reading up to 3 webpage links but get only one of the requested webpages read. This change passes the number of webpages to read down to the AI model dynamically via the updated prompt. So number of webpages requested to be read should mostly be same as number of webpages actually read. Note: For now, the max webpages to read is kept same as before at 1.	2025-03-26 18:24:17 +05:30
Debanjum	c337c53452	Fix to use agent chat model for research mode planning Previously the research mode planner ignored the current agent or conversation specific chat model the user was chatting with. Only the server chat settings, user default chat model, first created chat model were considered to decide the planner chat model. This change considers the agent chat model to be used for the planner as well. The actual chat model picked is decided by the existing prioritization of server > agent > user > first chat model.	2025-03-25 18:31:55 +05:30
Debanjum	df090e5226	Enable unsharing of a public conversation (#1135) This change enables the creator of a shared conversation to stop sharing the conversation publicly. ### Details 1. Create an API endpoint to enable the owner of the shared conversation to unshare it 2. Unshare public conversations from the title pane of the public conversation on the web app	2025-03-25 14:24:01 +05:30
Debanjum	9dfa7757c5	Unshare public conversations from the title pane on web app Only show the unshare button on public conversations created by the currently logged in user. Otherwise hide the button Set conversation.isOwner = true only if currently logged in user shared the current conversation. This isOwner information is passed by the get shared conversation API endpoint	2025-03-25 14:05:29 +05:30
Debanjum	d9c758bcd2	Create API endpoint to unshare a public conversation Pass isOwner field from the get shared conversation API endpoint if the currently authenticated user created the requested public conversation	2025-03-25 14:05:29 +05:30
Debanjum	e3f6d241dd	Normalize chat messages sent to gemini funcs to work with prompt tracer Previously messages passed to gemini (chat) completion functions got a little of Gemini specific formatting mixed in. These functions expect a message of type list[ChatMessage] to work with prompt tracer etc. Move the code to format messages of type list[ChatMessage] into gemini specific format down to the gemini (chat) completion functions. This allows the rest of the functionality like prompt tracing to work with normalized list[ChatMessage] type of chat messages across providers	2025-03-25 14:04:16 +05:30
Debanjum	7976aa30f8	Terminate research if query or tool is empty	2025-03-25 14:04:16 +05:30
Debanjum	39aa48738f	Set effort for openai reasoning models to pick tool in research mode This is analogous to how we enable extended thinking for claude models in research mode. Default to medium effort irrespective of deepthought for openai reasoning models as high effort is currently flaky with regular timeouts and low effort isn't great.	2025-03-25 14:04:16 +05:30
Debanjum	b4929905b2	Add costs of ai prompt cache read, write. Use for calls to Anthropic	2025-03-25 14:04:16 +05:30
Debanjum	d4b0ef5e93	Fix ability to disable code and internet providers in eval workflow Sets env vars to empty if condition not met so: - Terrarium (not e2b) used as code sandbox on release triggered eval - Internet turned off for math500 eval	2025-03-25 14:04:16 +05:30
sabaimran	a8285deed7	Release Khoj version 1.37.2	2025-03-23 11:38:25 -07:00
sabaimran	b7ac8771de	Update a few pieces of documentation around data sources.	2025-03-23 11:36:20 -07:00
sabaimran	12e7409da9	Release Khoj version 1.37.1	2025-03-23 11:10:34 -07:00
sabaimran	985f1672ed	Remove eval lists from git tracking	2025-03-23 10:59:32 -07:00
Debanjum	d1df9586ca	Standardize AI model response temperature across provider specific ranges - Anthropic expects a 0-1 range. Gemini & OpenAI expect a 0-2 range - Anneal temperature to explore reasoning trajectories but respond factually - Default send_message_to_model and extract_question temps to the same	2025-03-23 18:09:22 +05:30
Debanjum	55ae0eda7a	Upgrade package dependencies nextjs for web app and torch on server	2025-03-23 17:10:40 +05:30
Debanjum	8409e64ff0	Clean AI model API providers documentation	2025-03-23 16:26:34 +05:30
Debanjum	86a51d84ca	Access Claude and Gemini via GCP Vertex AI (#1134) Support accessing Claude and Gemini AI models via Vertex AI on Google Cloud. See the documentation at docs.khoj.dev for setup details	2025-03-23 16:26:02 +05:30
Debanjum	16ffebf765	Document how to configure using AI models via GCP Vertex AI	2025-03-23 16:12:46 +05:30
Debanjum	7153d27528	Cache Google AI API client for reuse	2025-03-23 16:12:46 +05:30
Debanjum	da33c7d83c	Support access to Gemini models via GCP Vertex AI	2025-03-23 16:12:46 +05:30
Debanjum	603c4bf2df	Support access to Anthropic models via GCP Vertex AI Enable configuring a Khoj AI model API for Vertex AI using GCP credentials. Specifically use the api key & api base url fields of the AI Model API associated with the current chat model to extract gcp region, gcp project id & credentials. This helps create an AnthropicVertex client. The api key field should contain the GCP service account keyfile as a base64 encoded string. The api base url field should be of the form `https://{MODEL_GCP_REGION}-aiplatform.googleapis.com/v1/projects/{YOUR_GCP_PROJECT_ID}` Accepting GCP credentials via the AI model API makes it easy to use across local and cloud environments. As it bypasses the need for a separate service account key file on the Khoj server.	2025-03-23 16:12:46 +05:30
Debanjum	8bebcd5f81	Support longer API key field in DB to store GCP service account keyfile	2025-03-23 14:55:50 +05:30
Debanjum	f2b438145f	Upgrade sentence-transformers. Avoid transformers v4.50.0 as problematic - The 3.4.1 release of sentence-transformers fixes offline load latency of sentence transformer models (and Khoj) by avoiding call to HF - The 4.50.0 release of transformers is resulting in jax error (unexpected keyword argument 'flatten_with_keys') on load.	2025-03-23 09:02:57 +05:30
Debanjum	510cbed61c	Make google auth package dependency explicit to simplify code Previously google auth library was explicitly installed only for the cloud variant of Khoj to minimize packages installed for non production use-cases. But it was being implicitly installed as a dependency of an explicit package in the default installation anyway. Making the dependency on google auth package explicit simplifies the conditional import of google auth in code while not incurring any additional cost in terms of space or complexity.	2025-03-23 09:02:57 +05:30
Debanjum	5fff05add3	Set seed for Google Gemini models using KHOJ_LLM_SEED env variable This env var was already being used to set seed for OpenAI and Offline models	2025-03-22 08:59:31 +05:30
Debanjum	6cc5a10b09	Disable SimpleQA eval on release as saturated & low signal for usecase Reaching >94% in research mode on SimpleQA. When answers can be researched online, it becomes too easy. And the FRAMES eval does a more thorough job of evaluating that use-case anyway.	2025-03-22 08:05:12 +05:30
Debanjum	45015dae27	Limit to json enforcement via json object with DeepInfra hosted models DeepInfra based models do not seem to support json schema. See https://deepinfra.com/docs/advanced/json_mode for reference	2025-03-22 08:04:09 +05:30
Debanjum	dc473015fe	Set default model, sandbox to display in eval workflow summary on release	2025-03-20 14:44:56 +05:30
Debanjum	80d864ada7	Release Khoj version 1.37.0	2025-03-20 14:06:57 +05:30
Debanjum	0c53106b30	Fix passing inline images to vision models - Fix regression: Inline images were not getting passed to the AI models since #992 - Format inline images passed to Gemini models correctly - Format inline images passed to Anthropic models correctly Verified vision working with inline and url images for OpenAI, Anthropic and Gemini models. Resolves #1112	2025-03-20 13:22:46 +05:30
Debanjum	1ce1d2f5ab	Deduplicate, clean code for S3 images uploads	2025-03-20 12:30:07 +05:30
Debanjum	f15a95dccf	Show Khoj agent in agent dropdown by default on mobile in web app home Previously on slow connection you'd see the agent dropdown flicker from undefined to Khoj default agent on phones and other thin screens. This is unnecessary and jarring. Populate with default agent to remove this issue	2025-03-20 12:27:52 +05:30
Debanjum	9a0b126f12	Allow chat input on web app while Khoj responds to speed interactions Previously the chat input area didn't allow inputting text while Khoj is researching and generating response. This change allows the user to add their next text while Khoj responds. This should speed up interaction cycles as user can have their next query ready to send when Khoj finishes its response.	2025-03-19 23:08:22 +05:30
Debanjum	e68428dd24	Support enforcing json schema in supported AI model APIs (#1133) - Trigger Gemini 2.0 Flash doesn't always follow JSON schema in research prompt - Details - Use json schema to enforce generate online queries format - Use json schema to enforce research mode tool pick format - Support constraining Gemini model output to specified response schema - Support constraining OpenAI model output to specified response schema - Only enforce json output in supported AI model APIs - Simplify OpenAI reasoning model specific arguments to OpenAI API	2025-03-19 22:59:23 +05:30
Debanjum	a5627ef787	Use json schema to enforce generate online queries format	2025-03-19 22:32:53 +05:30
Debanjum	2c53eb9de1	Use json schema to enforce research mode tool pick format	2025-03-19 22:32:53 +05:30
Debanjum	6980014838	Support constraining Gemini model output to specified response schema If the response_schema argument is passed to send_message_to_model_wrapper it is used to constrain output by Gemini models	2025-03-19 22:32:53 +05:30
Debanjum	ac4b36b9fd	Support constraining OpenAI model output to specified response schema	2025-03-19 22:32:52 +05:30
Debanjum	4a4d225455	Only enforce json output in supported AI model APIs Deepseek reasoner does not support json object or schema via deepseek API. Azure Ai API does not support json schema. Resolves #1126	2025-03-19 22:32:11 +05:30
Debanjum	d74c3a1db4	Simplify OpenAI reasoning model specific arguments to OpenAI API Previously OpenAI reasoning models didn't support stream_options and response_format Add reasoning_effort arg for calls to OpenAI reasoning models via API. Right now it defaults to medium but can be changed to low or high	2025-03-19 21:12:02 +05:30
Debanjum	9b6d626a09	Fix to store e2b code execution text output file content as string Previously was encoding E2B code execution text output content as b64. This was breaking - The AI model's ability to see the content of the file - Downloading the output text file with appropriately encoded content Issue created when adding E2B code sandbox in #1120	2025-03-19 20:09:41 +05:30
Artem Yurchenko	a7e261a191	Implement better bug issue template (#1129 ) * Implement better bug issue template * Fix IDs in new bug issue template * Reduce, reorder and improve field descriptions in the bug issue template --------- Co-authored-by: Debanjum <debanjum@gmail.com>	2025-03-18 20:53:57 +05:30
Debanjum	931f555cf8	Configure max allowed iterations in research mode via env var	2025-03-18 18:15:50 +05:30
Debanjum	2ab8e711d3	Fix Gemini models to output valid json when configured	2025-03-18 17:02:45 +05:30
sabaimran	ce60cb9779	Remove max-w 80vw, which was smushing AI responses	2025-03-13 13:44:05 -07:00
sabaimran	a3c4347c11	Add a one-click action to export all conversations. Add a self-service delete account action to the settings page	2025-03-12 23:54:02 -07:00
Debanjum	79816d2b9b	Upgrade package dependencies of server, clients and docs	2025-03-12 00:22:08 +05:30
Debanjum	7bb6facdea	Add support for Google Imagen AI models for image generation Use the new Google GenAI client to generate images with Imagen	2025-03-11 23:39:46 +05:30
Debanjum	bd06fcd9be	Stop using old google generativeai package to raise, catch exceptions	2025-03-11 23:39:46 +05:30
Debanjum	bdfa6400ef	Upgrade to new Gemini package to interface with Google AI	2025-03-11 22:18:07 +05:30
Debanjum	2790ba3121	Update default temperature for calls to Gemini models to 0.6 from 0.2 This aligns with default temperature used by google ai studio and may reduce loops and repetitions	2025-03-11 21:28:04 +05:30
Debanjum	50f71be03d	Support Claude 3.7 and use its extended thinking in research mode Claude 3.7 Sonnet is Anthropic's first reasoning model. It provides a single model/api capable of standard and extended thinking. Utilize the extended thinking in Khoj's research mode. Increase default max output tokens to 8K for Anthropic models.	2025-03-11 21:27:59 +05:30
Debanjum	69048a859f	Fix E2B tool description prompt to mention plotly package available	2025-03-11 02:20:06 +05:30
Debanjum	9751adb1a2	Improve Code Tool, Sandbox and Eval (#1120 ) # Improve Code Tool, Sandbox - Improve code gen chat actor to output code in inline md code blocks - Stop code sandbox on request timeout to allow sandbox process restarts - Use tenacity retry decorator to retry executing code in sandbox - Add retry logic to code execution and add health check to sandbox container - Add E2B as an optional code sandbox provider # Improve Gemini Chat Models - Default to non-zero temperature for all queries to Gemini models - Default to Gemini 2.0 flash instead of 1.5 flash on setup - Set default chat model to KHOJ_CHAT_MODEL env var if set	2025-03-09 18:49:59 +05:30
Debanjum	c133d11556	Improvements based on code feedback	2025-03-09 18:23:30 +05:30
Debanjum	94ca458639	Set default chat model to KHOJ_CHAT_MODEL env var if set Simplify code log to set default_use_model during init for readability	2025-03-09 18:23:30 +05:30
Debanjum	7b2d0fdddc	Improve code gen chat actor to output code in inline md code blocks Simplify code gen chat actor to improve correct code gen success, especially for smaller models & models with limited json mode support Allow specify code blocks inline with reasoning to try improve code quality Infer input files based on user file paths referenced in code.	2025-03-09 18:23:30 +05:30
Debanjum	8305fddb14	Default to non-zero temperature for all queries to Gemini models. It may mitigate the intermittent invalid json output issues. The model may be going into repetition loops; a non-zero temperature may avoid that.	2025-03-09 18:23:30 +05:30
Debanjum	45fb85f1df	Add E2B as an optional code sandbox provider - Specify E2B api key and template to use via env variables - Try load, use e2b library when E2B api key set - Fallback to try use terrarium sandbox otherwise - Enable more python packages in e2b sandbox like rdkit via custom e2b template - Use Async E2B Sandbox - Parallelize file IO with sandbox - Add documentation on how to enable E2B as code sandbox instead of Terrarium	2025-03-09 18:23:30 +05:30
Debanjum	b4183c7333	Default to gemini 2.0 flash instead of 1.5 flash on Gemini setup Add price of gemini 2.0 flash for cost calculations	2025-03-07 13:48:15 +05:30
Debanjum	701a7be291	Stop code sandbox on request timeout to allow sandbox process restarts	2025-03-07 13:48:15 +05:30
Debanjum	ecc2f79571	Use tenacity retry decorator to retry executing code in sandbox	2025-03-07 13:48:15 +05:30
sabaimran	4a28714a08	Add retry logic to code execution and add health check to sandbox container	2025-03-07 13:48:15 +05:30
Debanjum	f13bdc5135	Log eval run progress percentage for orientation	2025-03-07 13:48:15 +05:30
Debanjum	bbe1b63361	Improve Obsidian Sync for Large Vaults (#1078 ) - Batch sync files by size to try not exceed API request payload size limits - Fix force sync of large vaults from Obsidian - Add API endpoint to delete all indexed files by file type - Fix to also delete file objects when calling the DELETE content source API	2025-03-07 13:47:21 +05:30
Debanjum	043de068ff	Fix force sync of large vaults from Obsidian Previously if you tried to force sync a vault with more than 1000 files it would only end up keeping the last batch because the PUT API call would delete all previous entries. This change calls DELETE for all previously indexed data first, followed by a PATCH to index current vault on a force sync (regenerate) request. This ensures that files from previous batches are not deleted.	2025-03-07 13:34:48 +05:30
Debanjum	86fa528a73	Add API endpoint to delete all indexed files by file type	2025-03-07 13:28:53 +05:30
Debanjum	b692e690b4	Rename and fix the delete content source API endpoint - Delete file objects on deleting content by source via API Previously only entries were deleted, not the associated file objects - Add new db adapter to delete multiple file objects (by name)	2025-03-07 13:28:53 +05:30
Debanjum	29403551b2	Batch sync files by size to not exceed API request payload size limits This may help mitigate the issue #970	2025-03-04 09:31:48 +05:30
Debanjum	0ddb6a38b8	Update Khoj docs to mark LMStudio as not supported anymore They seem to have deprecated json mode in their openai compatible API which Khoj uses extensively.	2025-02-18 21:33:03 +05:30
sabaimran	de550e5ca7	re-enable markdown formatting when chatting with o3	2025-02-18 21:12:03 +05:30
sabaimran	0016fe06c9	Release Khoj version 1.36.6	2025-02-18 18:54:13 +05:30
sabaimran	7089ea1cf4	Remove experimental parenthesis from research mode ✁	2025-02-18 08:59:12 +05:30
Debanjum	5a3c7b145a	Decouple Django CSRF, ALLOWED_HOST settings for more complex setups - Set KHOJ_ALLOWED_DOMAIN to the domain that Khoj is accessible on from the host machine. This can be the internal IP or domain of the host machine. It can be used by your load balancer/reverse_proxy to access Khoj. For example, if the load balancer service is in the khoj docker network, KHOJ_DOMAIN will be `server` (i.e. the service name). - Set KHOJ_DOMAIN to your externally accessible domain or IP to avoid CSRF trusted origin or unset cookie issues when trying to access the khoj admin panel. Resolves #1114	2025-02-17 15:47:35 +05:30
Debanjum	bb0828b887	Only show notes tool option to llm for selection when user has documents	2025-02-17 15:23:26 +05:30
Debanjum	5dfb59e1ee	Show more references in teaser ref section of chat response on web app	2025-02-15 14:11:17 +05:30
sabaimran	b6e745336b	Add a short description to explain what the create agent button does	2025-02-13 14:08:50 -08:00
sabaimran	848a91313d	Move the create agent button to the bottom of the sidebar and fix experience when resetting settings	2025-02-12 19:24:02 -08:00
sabaimran	d0d30ace06	Add feature to create a custom agent directly from the side panel with currently configured settings - Also, when in not subscribed state, fallback to the default model when chatting with an agent - With conversion, create a brand new agent from inside the chat view that can be managed separately	2025-02-12 18:24:41 -08:00
sabaimran	5d6eca4c22	Fix automation retrieval validity check	2025-02-11 19:17:37 -08:00
sabaimran	51952364ab	Release Khoj version 1.36.5	2025-02-11 13:23:30 -08:00
sabaimran	0211151570	Disable auto-setup of offline models if in non-interactive offline mode	2025-02-10 18:48:00 -08:00
sabaimran	589b047d90	Simplify agent picker selection in homepage mobile view	2025-02-10 15:05:30 -08:00
sabaimran	427ec061b4	Add auto redirect on delete of current conversation	2025-02-09 11:38:44 -08:00
sabaimran	bbb5fd667a	update messaging in the welcome email	2025-02-07 15:10:59 -08:00
sabaimran	ff6cb80c84	Release Khoj version 1.36.4	2025-02-06 16:50:50 -08:00
sabaimran	031bccb628	Fix awkward padding in chat window	2025-02-06 16:31:25 -08:00
sabaimran	43e032e25a	Improve handling of dark mode theme in order to avoid jitter when loading new page	2025-02-06 16:17:58 -08:00
sabaimran	2e01a95cf1	improve system prompt for generating mermaid.js diagrams	2025-02-06 15:05:30 -08:00
sabaimran	a2af6bea8e	Release Khoj version 1.36.3	2025-02-04 13:23:41 -08:00
sabaimran	0d10c5fb02	Improve default selection of models to avoid infinite loops	2025-02-04 11:36:41 -08:00
sabaimran	24b1dd3bff	Release Khoj version 1.36.2	2025-02-03 20:22:49 -08:00
sabaimran	4409a58794	set initial model of default state	2025-02-03 18:17:21 -08:00
sabaimran	51874c25d5	Prevent infinite loops in model selection logic by configuring an initial model state	2025-02-03 18:11:01 -08:00
sabaimran	489fa71143	Update the width for rendering all conversation sessions	2025-02-03 15:49:43 -08:00
sabaimran	b354a37dcd	Release Khoj version 1.36.1	2025-02-02 21:55:59 -08:00
sabaimran	61e48d686e	Let file context buttons route to search page instead of settings for upload/manage	2025-02-02 12:26:13 -08:00
sabaimran	b4c467cd11	Remove shadows from reference panel trigger icons	2025-02-02 12:23:19 -08:00
sabaimran	a3d75e5241	When in mobile view, don't use the hover card in the model selector	2025-02-02 12:22:45 -08:00
sabaimran	4f79abb429	Release Khoj version 1.36.0	2025-02-02 08:39:21 -08:00
sabaimran	60e6913494	Merge pull request #1094 from khoj-ai/features/add-chat-controls Make it easier to determine which model you're chatting with, and to effortlessly update said model from within a given chat. In this change, we introduce a side bar that allows users to quickly change their chat model, tools, custom instructions, and file filters, directly within the chat view. This removes the need for setting up custom agents for simple instructions and mitigates the requirement to go to the settings page to verify the chat model in action. The settings page will still configure a per-user default, but the sidebar will allow for greater customization based on the needs of a conversation. We also extend the chat model to include more attributes that help users make decisions about model selection, including `strengths` and `description`. This can help people quickly understand which model might work best for their use case.	2025-02-01 14:35:47 -08:00
sabaimran	c558bbfd44	Merge branch 'master' of github.com:khoj-ai/khoj into features/add-chat-controls	2025-02-01 14:25:58 -08:00
sabaimran	08bc1d3bcb	Improve sizing / spacing of side bar	2025-02-01 14:25:45 -08:00
sabaimran	f3b2580649	Add back hover state with collapsed references	2025-02-01 14:03:17 -08:00
sabaimran	0645af9b16	Use the agent CM as a backup chat model when available / necessary. Remove automation as agent option.	2025-02-01 13:06:42 -08:00
Debanjum	f2eba667fc	Fallback to schedule automation in UTC timezone if unset - Handle null automation ids in calls to get_automation function	2025-02-01 19:12:51 +05:30
Debanjum	0bfa7c1c45	Add support for o3 mini model by openai	2025-02-01 02:51:13 +05:30
sabaimran	641f1bcd91	Only open the side bar automatically when there is no chat history && no pending messages.	2025-01-30 16:07:27 -08:00
sabaimran	b111a9d6c6	Release Khoj version 1.35.3	2025-01-30 11:48:33 -08:00
Yuto SASAKI	018bc718fc	Fix typo in docs for Chat Model Type for Google Gemini setup (#1098 )	2025-01-30 03:40:37 -08:00
sabaimran	b73f446713	Fallback to show raw outputted diagram if fails rendering	2025-01-29 21:50:28 -08:00
sabaimran	98e3f5162a	Show generated diagram raw code if fails rendering	2025-01-29 21:49:33 -08:00
sabaimran	b5f99dd103	Rename custom instructions -> instructions	2025-01-29 15:06:22 -08:00
sabaimran	0ff33d4347	Merge branch 'master' of github.com:khoj-ai/khoj into features/add-chat-controls	2025-01-29 14:58:15 -08:00
sabaimran	c3cb6086e0	Add list typing to the updated_messages temporary variable	2025-01-29 14:19:57 -08:00
sabaimran	5ea056f03e	Add custom handling logic when speaking with deepseek reasoner	2025-01-29 14:11:27 -08:00
sabaimran	d640299edc	use `is_active` property to determine user subscription status	2025-01-29 14:10:59 -08:00
sabaimran	e2bfd4ac0f	Change name of teams section to teams	2025-01-29 12:52:10 -08:00
sabaimran	58f77edcad	Make conversation sessions rounded-lg	2025-01-29 12:47:36 -08:00
sabaimran	0b2305d8f2	Add an animation to opening and closing the thought process	2025-01-29 12:47:14 -08:00
sabaimran	67b2e9c194	Increase subscribed total entries size to 500 MB	2025-01-29 09:06:35 -08:00
sabaimran	01faae0299	Simplify the train of thought UI	2025-01-28 22:00:09 -08:00
sabaimran	59f0873232	Rename train of thought button	2025-01-28 21:32:21 -08:00
sabaimran	272764d734	Simplify the chat response / user message bubbles	2025-01-28 21:17:56 -08:00
sabaimran	b61226779e	Simplify references section with icons in chat message	2025-01-28 18:13:26 -08:00
sabaimran	58879693f3	Simplify nav menu and add a teams section	2025-01-28 18:12:50 -08:00
sabaimran	e076ebd133	Make tools section of sidebar a popover to prevent increasing height	2025-01-28 18:12:13 -08:00
sabaimran	ee3ae18f55	add code, remove summarize from agent tools	2025-01-28 18:11:25 -08:00
sabaimran	ba7d53c737	Merge branch 'master' of github.com:khoj-ai/khoj into features/add-chat-controls	2025-01-28 13:01:06 -08:00
Debanjum	af49375884	Only show confusing fallback tokenizer used logs in high verbosity mode	2025-01-25 18:35:14 +07:00
Debanjum	aebdd174ec	Make title of PeopleAlsoAsk section of online results optional	2025-01-25 18:35:14 +07:00
sabaimran	43c9ec260d	Allow agent personality to be nullable, in which case the default prompt will be used.	2025-01-24 08:24:53 -08:00
sabaimran	2b9a61c987	Merge branch 'master' of github.com:khoj-ai/khoj into features/add-chat-controls	2025-01-23 11:43:51 -08:00
sabaimran	a3b5ec4737	Release Khoj version 1.35.2	2025-01-22 21:42:14 -08:00
sabaimran	938ef0a27b	Update DB migration for better memory and speed efficiency	2025-01-22 20:48:37 -08:00
sabaimran	9fc825d7a6	Release Khoj version 1.35.1	2025-01-22 19:51:28 -08:00
sabaimran	5a3a897080	Temporarily move logic to associate entry and fileobject objects into the management command, out of automatic migrations	2025-01-22 19:50:22 -08:00
sabaimran	fd90842d38	Bump postgresql server dev version to 16 for latest ubuntu	2025-01-22 19:07:54 -08:00
sabaimran	1a0923538e	Release Khoj version 1.35.0	2025-01-22 19:03:25 -08:00
sabaimran	dc6e9e8667	Skip showing hidden agents in the all conversations agent filter	2025-01-21 12:43:34 -08:00
sabaimran	c1b0a9f8d4	Fix sync to async issue when getting default chat model in hidden agent configuration API	2025-01-21 12:06:09 -08:00
sabaimran	e518626027	Add typing to empty list of operations	2025-01-21 11:56:52 -08:00
sabaimran	e3e93e091d	automatically open the side bar when a new chat is created with the default agent.	2025-01-21 11:56:06 -08:00
sabaimran	c43079cb21	Add merge migration and add a new button for new convo in sidebar	2025-01-21 11:01:08 -08:00
sabaimran	5a36360408	Merge branch 'master' of github.com:khoj-ai/khoj into features/add-chat-controls	2025-01-21 10:32:47 -08:00
sabaimran	f96e5cd047	Allow file filter dropdown to pop up automatically when typing "file:"	2025-01-21 08:30:40 -08:00
sabaimran	6d1b16901d	Fix the naming of framework > file	2025-01-21 07:24:43 -08:00
Debanjum	7fad1f43f6	Fix text quoting and format web app search page with prettier	2025-01-21 18:18:44 +07:00
sabaimran	8022f040d2	Fix spacing of pagination buttons	2025-01-20 17:23:14 -08:00
sabaimran	3b381a5fe8	Improve default state when no documents are found yet	2025-01-20 17:19:15 -08:00
sabaimran	e0dcd11f34	Allow browsing and discovery of knowledge base in the search page #1073 Currently, it's rather opaque and difficult to substantially browse through the uploaded knowledge base. Effectively, you can only do this through the small file modal in the settings page. Update to include all indexed files in the search page for viewing & deletion. Function to delete all files is still in the settings page. Add a migration that associates file objects with `entry`s using a foreign key. Add a migration command that deletes dangling fileobjects.	2025-01-20 16:14:25 -08:00
sabaimran	4b35bee365	Merge branch 'master' of github.com:khoj-ai/khoj into HEAD	2025-01-20 15:35:19 -08:00
sabaimran	8ad60f53d3	Remove export from filefiltercombo box	2025-01-20 15:26:32 -08:00
sabaimran	9f18d6494f	Add a file filter combobox for easier file filter selection	2025-01-20 14:39:29 -08:00
sabaimran	d36f235da5	Rename API to get all files, minor UI updates	2025-01-20 13:30:10 -08:00
sabaimran	849b7c7af6	Refresh current page after file deleted	2025-01-20 13:12:37 -08:00
sabaimran	edfed2d571	Merge branch 'master' of github.com:khoj-ai/khoj into features/add-a-knowledge-base-page	2025-01-20 13:11:45 -08:00
sabaimran	b661d2cba7	Merge pull request #1030 from Yash-1511/feat/autocomplete-file-query ## Description Added file path autocompletion to enhance the search experience in Khoj. Users can now easily search for specific files by typing "file:" followed by the file name, with real-time suggestions appearing as they type. ## Changes - Added new API endpoint `/api/file-suggestions` to get file path suggestions - Enhanced search UI with dropdown suggestions for file paths - Implemented debounced search to optimize API calls - Added keyboard (Enter) and mouse click support for selecting suggestions ## Features - Type "file:" to trigger file path suggestions - Real-time filtering of suggestions as you type - Top 10 alphabetically sorted suggestions - Case-insensitive matching - Keyboard and mouse interaction support - Clear visual feedback with a dropdown UI	2025-01-20 12:35:43 -08:00
sabaimran	0c29c7a5bf	Update layout and rendering of share page for hidden agent	2025-01-20 12:02:07 -08:00
sabaimran	000580cb8a	Improve loading state when files not found and fix default state interpretation for model selector	2025-01-20 11:47:42 -08:00
sabaimran	235114b432	Fix agent data import across chat page +	2025-01-20 11:04:48 -08:00
sabaimran	d681a2080a	Centralize use of useUserConfig and use that to retrieve default model and chat model options	2025-01-20 10:59:02 -08:00
sabaimran	a3fcd6f06e	Fix import of AgentData from agentcard	2025-01-20 10:36:11 -08:00
sabaimran	d7800812ad	Fix default states for the model selector	2025-01-20 10:18:09 -08:00
sabaimran	98baa93a31	Merge branch 'master' of github.com:khoj-ai/khoj into features/add-chat-controls	2025-01-20 09:36:42 -08:00
sabaimran	696551b686	Add typing to operations in merge migration file	2025-01-20 08:36:40 -08:00
sabaimran	d1e7b5b234	Add a merge migration to resolve differences	2025-01-20 08:34:57 -08:00
sabaimran	1f59afe962	Merge branch 'master' of github.com:khoj-ai/khoj into features/add-a-knowledge-base-page	2025-01-20 08:32:17 -08:00
sabaimran	016bcd7674	Revert page size to 10, rather than 1	2025-01-20 08:32:02 -08:00
sabaimran	8fe08eecce	add --break-system-packages to bypass venv requirement	2025-01-20 00:21:27 -08:00
sabaimran	bf58d9430b	downgrade postgres server pkg to 16	2025-01-20 00:15:56 -08:00
sabaimran	95ad1f936e	upgrade postgres server to 17	2025-01-20 00:10:20 -08:00
sabaimran	a214bd4100	upgrade pg server dev version to 15	2025-01-20 00:05:35 -08:00
sabaimran	82ff74cfa9	Run on container with ubuntu latest for pytest gh action workflow	2025-01-19 23:57:57 -08:00
sabaimran	83d856f97d	Add some basic pagination logic to the knowledge page to prevent overloading the Api or the client	2025-01-19 22:48:36 -08:00
sabaimran	59ee6e961a	Only set hasmodified to true during model select if different from original model	2025-01-19 18:19:48 -08:00
sabaimran	e982398c2c	Weird spacing issue resolve (it was because of the footer in the collapsed state still having some width)	2025-01-19 18:16:43 -08:00
sabaimran	dbce039033	Revert the `fixed` hack to hide horizontal spacing issue with sidebar because it breaks the animation on closed. Sigh.	2025-01-19 18:11:53 -08:00
sabaimran	0d38cc9753	Handle further edge cases when setting chat agent data and fix alignment of chat input / side panel	2025-01-19 17:59:37 -08:00
sabaimran	b248123135	Hook up hidden agent creation and update APIs to the UI - This allows users to initiate hidden agent creation from the side bar directly. Any updates can easily be applied to the conversation agent.	2025-01-19 17:36:30 -08:00
sabaimran	be11f666e4	Initialize the concept in the backend of hidden agents A hidden agent basically allows each individual conversation to maintain custom settings, via an agent that's not exposed to the traditional functionalities allotted for manually created agents (e.g., browsing, maintenance in agents page). This will be hooked up to the front-end such that any conversation that's initiated with the default agent can then be given custom settings, which in the background creates a hidden agent. This allows us to repurpose all of our existing agents infrastructure for chat-level customization.	2025-01-19 12:16:37 -08:00
sabaimran	0a0f30c53b	Update relevant agent tool descriptions Remove text (as by default, must output text), and improve the Notes description for clarity	2025-01-19 12:14:28 -08:00
sabaimran	f10b072634	update looks & feel of chat side bar with model selector, checkboxes for tools, and actions (not yet implemented)	2025-01-19 12:13:37 -08:00
sabaimran	7837628bb3	Update existing agentData imports	2025-01-19 12:12:23 -08:00
sabaimran	7998a258b6	Add additional ui components for tooltip, checkbox	2025-01-19 12:08:02 -08:00
sabaimran	00370c70ed	Consolidate the AgentData Type into the agentCard	2025-01-19 12:06:54 -08:00
Debanjum	fde71ded16	Upgrade web app dependencies	2025-01-19 13:44:59 +07:00
Debanjum	2d4633d298	Use encoded email, otp in login URL in email & web app sign-in flow Previously emails with url special characters would not get successfully identified for login. Account creation was fine due to email being in POST request body. But login with such emails did not work due to query params not being escaped before being sent to server This change escapes both the code and email in login URL sent to server. So login with emails containing special characters like `email+khoj@gmail.com` works. It fixes both the URL sent by the web app directly and the magic link sent to users by email This change also fixes accessibility issue of having a DialogTitle in DialogContent for screen readers. Resolves #1090	2025-01-19 13:11:23 +07:00
Debanjum	51f3af11b5	Fix Qwen 2.5 14B model source to use Q4_K_M quantized model The official Qwen2.5 14B model doesn't mention standard quantization suffixes like Q4_K_M, so doesn't work with Khoj	2025-01-19 12:27:35 +07:00
sabaimran	f99bd3f1bc	Merge branch 'master' of github.com:khoj-ai/khoj into features/add-chat-controls	2025-01-17 17:49:08 -08:00
sabaimran	af9e906cb5	Use python3 instead of python when running pip install commands in gh actions	2025-01-17 17:48:42 -08:00
sabaimran	c80e0883ee	Use python3 instead of python for install pip commands in GH action	2025-01-17 17:36:46 -08:00
sabaimran	b8c866014d	Improve instruction description for the agent command description for notes	2025-01-17 17:19:59 -08:00
sabaimran	7481f78f22	Remove unused API request	2025-01-17 17:18:47 -08:00
sabaimran	5aadba20a6	Add backend support for hidden agents (not yet enabled)	2025-01-17 16:46:37 -08:00
sabaimran	2fa212061d	Add a right hand side bar for chat controls	2025-01-17 16:45:50 -08:00
Tuğhan Belbek	849348e638	Handle additional HTTP redirect status code 308 in scheduled chat requests (#1088 ) Closes #1067	2025-01-16 07:03:43 -08:00
Debanjum	00843f4f24	Release Khoj version 1.34.0	2025-01-16 12:11:28 +07:00
Debanjum	ad27f34c96	Support online search using Google Search API (#1087 ) Add official Google Search API as an online search provider. We currently support Serper.dev and Searxng as online search providers.	2025-01-16 11:41:59 +07:00
Debanjum	a649b03658	Support online search using Google Search API	2025-01-16 11:39:03 +07:00
Sam Ho	f8f159efac	feat: add turnId handling to chat messages and history	2025-01-16 00:44:16 +00:00
sabaimran	42d4d15346	Merge pull request #1054 from khoj-ai/features/add-support-for-mermaidjs We've been having issues generating diagrams with Excalidraw that are any degree of complexity. By contrast, LLMs are able to handle Mermaid.js syntax a lot better, as it's much more forgiving and has a simpler declarative style. Refer to https://mermaid.js.org/. Update so that new diagrams are generated with Mermaid.js, while old diagrams generated with Excalidraw can still be viewed.	2025-01-15 11:55:12 -08:00
Debanjum	e2b2b3415e	Fix handling of inline base64 images by Obsidian, Desktop clients Fix for #1082 pushed down adding the `data:image/webp;base64` prefix of the base64 images to the server image gen API. But the code on the Obsidian and Desktop client still add these prefixes. This change stops the Desktop, Obsidian clients from adding the prefix as it is being handled by the API now. It should resolve showing images inline in those clients as well	2025-01-15 23:34:23 +07:00
Debanjum	2e585efd2f	Fix end with newline styling issue in style.css to pass lint checks	2025-01-15 19:43:02 +07:00
Debanjum	ed18c04576	Fix wrapping base64 generated image for inline display Resolves #1082	2025-01-15 19:19:31 +07:00
Debanjum	f8b887cabd	Allow using OpenAI (compatible) API for Speech to Text transcription	2025-01-15 19:19:31 +07:00
Debanjum	182c49b41c	Prefer explicitly configured OpenAI API url, key for image gen model Previously we'd use the general openai client, even if the image generation model has a different api key and base url set. This change uses the openai config of the image generation models when set. Otherwise it falls back to using the first openai api provider set	2025-01-15 19:19:31 +07:00
Debanjum	24204873c8	Use same openai base url env var name as the official openai client This eases re-use of the OpenAI API across all openai clients, including chat, image generation, speech to text. Resolves #1085	2025-01-15 19:19:30 +07:00
Debanjum	63dd3985b5	Support using Embeddings Model exposed via OpenAI (compatible) API (#1051 ) This change adds the ability to use OpenAI, Azure OpenAI or any embedding model exposed behind an OpenAI compatible API (like Ollama, LiteLLM, vLLM etc.). Khoj previously only supported HuggingFace embedding models running locally on device or via HuggingFace inference API endpoint. This allows using commercial embedding models to index your content with Khoj.	2025-01-15 17:39:54 +07:00
Debanjum	a6bf6803b6	Add docs on how to add, edit search model configs when self-hosting	2025-01-15 17:30:18 +07:00
Debanjum	92a1ec7afc	Do not auto restart khoj docker services by default Let folks who want to add that, add it manually if they want to. It creates too much noise for folks having trouble with self-host setup	2025-01-15 13:09:50 +07:00
Debanjum	85c537a1de	Set default PORT arg in Dockerfile to default Khoj port, 42110	2025-01-15 13:09:50 +07:00
Debanjum	9355381fac	Catch error in call to data sources, output format selection tool AI Previously if the call to this tool AI failed, the API call would non-gracefully fail on server. This would leave the client hanging in a weird state (e.g with spinner running on web app with no indication of issue). Also do not show filters in query debug log lines when no filters in query	2025-01-15 13:09:50 +07:00
Debanjum	24ab8450ba	Handle scenario where read chat stream error is not json on web app	2025-01-15 13:09:50 +07:00
sabaimran	0b775c77d3	Merge branch 'master' of github.com:khoj-ai/khoj into features/add-a-knowledge-base-page	2025-01-13 15:07:59 -08:00
Sam Ho	fc6fab4cce	chore: fix format issue from pre-commit hook - trailing-whitespace and end-of-file-fixer	2025-01-13 20:07:48 +00:00
sabaimran	7f329e7e9d	Fix configuration of name field for chatmodel options during initialization	2025-01-12 22:37:08 -08:00
sabaimran	1a00540ee9	Improve error handling in mermaid chart rendering	2025-01-12 22:36:31 -08:00
Osama Ata	96e3d0a7b9	Fix stale lmstudio documentation to set ai model api via admin panel (#1075 ) Use new name `Ai Model API` instead of `OpenAI Processor Conversation Config`	2025-01-12 03:06:01 -08:00
Yash-1511	27165b3f4a	fix: review suggestions	2025-01-12 15:12:14 +05:30
Debanjum	6bd9f6bb61	Give a shorter, simpler name to github workflow to deploy docs	2025-01-12 10:54:56 +07:00
Sam Ho	93687f141a	feat: do not show delete button on system messages	2025-01-11 17:35:57 +00:00
Sam Ho	a9c180d85f	feat: add delete chat message action to the Obsidian plugin	2025-01-11 17:19:40 +00:00
Debanjum	51a774c993	Add contrast to setting card inputs in dark mode on web app	2025-01-11 14:50:47 +07:00
Debanjum	9e8b8dc5a2	Toggle showing api key on web settings page via a visibility toggle - Background Access to the clipboard API is disabled by certain browsers in non localhost http scenarios for security reasons. So the copy API key button doesn't work when khoj is self-hosted with authentication enabled at a non localhost domain - Change This change enables copying API keys by manual text highlight + copy if copy button is disabled Resolves #1070	2025-01-11 14:50:47 +07:00
Debanjum	25c39bd7da	Extract api keys setting card into separate component on web app	2025-01-11 14:50:46 +07:00
sabaimran	c30047e859	Fix Obsidian style.css	2025-01-10 22:18:44 -08:00
sabaimran	da2b89e46a	Merge branch 'master' of github.com:khoj-ai/khoj into features/add-a-knowledge-base-page	2025-01-10 22:18:14 -08:00
sabaimran	f170487338	Fix apostrophe in the add documents modal	2025-01-10 21:58:17 -08:00
sabaimran	be4b091a21	Add new line to styles.css	2025-01-10 21:52:52 -08:00
sabaimran	f398e1eb0c	Add codeblock rendering for the mermaidjs diagram in obsidian	2025-01-10 21:46:39 -08:00
Debanjum	6e955e158b	Use normalized email address for new users Do not check email deliverability for now to allow air-gapped usage or authenticated/multi-user setups with admin managed otp Closes #1069	2025-01-11 12:28:40 +07:00
sabaimran	c441663394	Merge branch 'master' of github.com:khoj-ai/khoj into features/add-support-for-mermaidjs	2025-01-10 21:25:33 -08:00
sabaimran	85c34a5f0f	Merge pull request #1018 from hjamet/master This PR delivers comprehensive improvements to the Khoj plugin across multiple key areas: 🔍 Search Enhancements: - Added visual loading indicators during search operations - Implemented color-coded results to distinguish between vault and external files - Added abort logic for previous requests to improve performance - Enhanced search feedback with clear status indicators - Improved empty state handling 🔄 Synchronization Improvements: - Added configurable sync interval setting in minutes - Implemented manual "Sync new changes" command - Enhanced sync timer management with automatic restart - Improved notification system for sync operations 📁 Folder Management: - Added granular folder selection for sync - Implemented intuitive folder suggestion modal - Enhanced folder list visualization 💅 UI/UX Improvements: - Added loading animations and spinners - Enhanced search results visualization with color coding - Refined chat interface styling - Improved overall settings panel organization 🔧 Technical Improvements: - Refactored search and synchronization logic - Implemented proper request cancellation - Enhanced error handling and user feedback - Improved code organization and maintainability	2025-01-10 21:24:12 -08:00
sabaimran	57545c1485	Fix the migration script to delete orphaned fileobjects - Remove knowledge page from the sidebar - Improve speed and rendering of the documents in the search page	2025-01-10 21:06:48 -08:00
sabaimran	d77984f9d1	Remove separate knowledge base file - consolidated in the search page	2025-01-10 18:57:38 -08:00
sabaimran	f2c6ce2435	Improve rendering of the file objects and sort files by updated_date	2025-01-10 18:18:15 -08:00
Patrick Jackson	6e0c767ff0	Use the configured OpenAI Base URL for Automations (#1065) This change makes Automations (and possibly other entrypoints) use the configured OpenAI-compatible server if that has been set. Without this change it tries to use the hardcoded OpenAI provider.	2025-01-10 17:17:51 -08:00
sabaimran	454a752071	Initial commit: add a dedicated page for managing the knowledge base - One current issue in the Khoj application is that managing the files being referenced as the user's knowledge base is slightly opaque and difficult to access - Add a migration for associating the fileobjects directly with the Entry objects, making it easier to get data via foreign key - Add the new page that shows all indexed files in the search view, also allowing you to upload new docs directly from that page - Support new APIs for getting / deleting files	2025-01-10 16:24:50 -08:00
Debanjum	1b5826d8b6	Support using Embeddings Model exposed via OpenAI (compatible) API	2025-01-10 23:48:04 +07:00
Debanjum	65f1c27963	Remove old, big warning about Khoj not configured on server init - Just say using default config. This old khoj.yml settings mechanism isn't standard, so not having a configured khoj.yml isn't a concern - Deep link to desktop download instead of the whole download page as android etc. are also on it, which don't help with syncing docs	2025-01-10 23:46:27 +07:00
Debanjum	3cc6597b49	Support Azure OpenAI API endpoint (#1048) OpenAI chat models deployed on Azure are (ironically) not OpenAI API compatible endpoints. This change enables using OpenAI chat models deployed on Azure with Khoj.	2025-01-10 08:35:03 -08:00
sabaimran	bac90ad69d	Upgrade deploy-pages action to v4	2025-01-09 19:04:31 -08:00
Debanjum	2069f571c8	Upgrade upload-artifact gh action to v4 as <=v3 deprecated This started failing github workflow jobs	2025-01-10 00:41:24 +07:00
Debanjum	dd63bd8bcf	Fix dark mode dropdown colors of phone no. country code on web settings page Resolves #1046	2025-01-10 00:10:51 +07:00
Debanjum	bb6a6cbe19	Restart Khoj docker services unless stopped. Remove default network - Seems less aggressive to use unless-stopped versus always. - Default network is used anyway, so doesn't seem necessary to specify	2025-01-09 22:08:48 +07:00
Debanjum	01d27f5220	Do not show user logout button on web app side pane in anonymous mode Refer https://github.com/khoj-ai/khoj/issues/1050#issuecomment-2579119234	2025-01-09 21:18:50 +07:00
Debanjum	a739936563	Mark Github integration as unmaintained in documentation Also mention what the Reflective Questions table is about Resolves #1059	2025-01-09 20:42:38 +07:00
Debanjum	1eaa54b0ae	Make PAT token requirement optional for Github indexing for now The github integration has not been tested and may still be broken. This change at least makes it easier to add repositories without needing a PAT token if/when it does work.	2025-01-09 20:42:38 +07:00
sabaimran	ec02757fd1	Add an export feature along with the mermaid diagram. Add sidebar to loading page.	2025-01-08 23:53:58 -08:00
sabaimran	889f34c7bf	Adjust typing and error handling for incorrect diagrams	2025-01-08 23:22:16 -08:00
sabaimran	11fcf2f299	Remove dangling response type	2025-01-08 22:10:59 -08:00
sabaimran	6b0a49b12d	Add the mermaid package and apply front-end parsing - Add the mermaid package and apply front-end parsing for interpreting the diagrams. Retain processing of the excalidraw type for backwards compatibility	2025-01-08 22:09:35 -08:00
sabaimran	539ce99343	Add backend support for parsing and processing and storing mermaidjs diagrams - Replace default diagram output from excalidraw to mermaid - Retain typing of the excalidraw type for backwards compatibility in the chatlog	2025-01-08 22:08:40 -08:00
sabaimran	c448c49811	Clean-up some code on homepage and disable initial card animations because of jitter	2025-01-08 22:07:23 -08:00
Ikko Eltociear Ashimine	9ce9f02886	Fix typo in Admin Doc (#1034) appropiate -> appropriate	2025-01-08 21:27:34 -08:00
sabaimran	ad5f0c7a02	Merge pull request #1029 from DPS0340/master Improve docker-compose.yml - Do not expose dependencies on host internet - Put all services on the same network	2025-01-08 10:42:47 -08:00
Henri Jamet	f42b0cb08c	Refactor comments and CSS for improved clarity in Khoj plugin - Translated comments from French to English for better accessibility and understanding. - Updated CSS comment for loading animation to reflect the change in language. - Enhanced code readability by ensuring consistent language usage across multiple files. - Improved user experience by clarifying the purpose of various functions and settings in the codebase.	2025-01-08 09:31:43 +01:00
sabaimran	875cdde9b9	Release Khoj version 1.33.2	2025-01-07 15:32:18 -08:00
sabaimran	5c5c4a6bbc	Add help text for Enterprises in the README	2025-01-07 14:55:57 -08:00
sabaimran	8d028e10c6	Fix populating login url in sign in email	2025-01-07 14:53:00 -08:00
omahs	36bdaedd2d	Fix typos in Khoj Docs (#1033)	2025-01-07 15:55:57 +07:00
sabaimran	25c1c1c591	Release Khoj version 1.33.1	2025-01-06 09:08:01 -08:00
sabaimran	689d9d8b3a	Update formatting in admin.py and utils.py	2025-01-06 09:07:28 -08:00
thinker007	aa442c28eb	Handle reporting chat estimated cost when some fields unavailable (#1026) Fix AttributeError: 'NoneType' object has no attribute 'model_extra' * `cost = chunk.usage.model_extra.get("estimated_cost", 0) if hasattr(chunk, "usage") and chunk.usage else 0 # Estimated costs returned by DeepInfra API`	2025-01-06 09:03:49 -08:00
sabaimran	4aed6f7e08	Add a link around the header khojlogotype to go home	2025-01-06 08:55:00 -08:00
Debanjum	266d274e21	Make automation should_notify check robust to non json mode chat models Use clean_json to handle automation should_notify check For gemini and other chat models where enforcing json mode is problematic, not supported	2025-01-06 20:16:32 +07:00
Debanjum	9a5e3583cf	Remove bullet styling only from sidebar items on web app Previous fix had removed bullet styling from all components in web app. This made chat messages on the web app lose bullet styling too.	2025-01-06 20:15:42 +07:00
Debanjum	dc0bc5bcca	Evaluate information retrieval quality using eval script - Encode article urls in filename indexed in Khoj KB Makes it easier for humans to compare, trace retrieval performance by looking at logs than using content hash (which was previously explored)	2025-01-06 13:19:52 +07:00
Debanjum	daeba66c0d	Optionally pass references used by agent for response to eval scorers This will allow the eval framework to evaluate retrieval quality too	2025-01-06 13:19:52 +07:00
Debanjum	8231f4bb6e	Return accuracy as decision to generalize across IR & standard scorers	2025-01-06 13:19:52 +07:00
Jiho Lee	c1c086e431	fix: Use localhost on SEARXNG_BASE_URL Co-authored-by: sabaimran <65192171+sabaimran@users.noreply.github.com>	2025-01-06 13:32:06 +09:00
sabaimran	eb9aadf72a	Add an Obsidian README documentation for development	2025-01-05 19:06:27 -08:00
sabaimran	e89e49818b	Merge pull request #1028 from ReallyVirtual/patch-1 Update image_generation.md	2025-01-05 13:53:00 -08:00
sabaimran	616cc189d1	Remove bullet points from li styling explicitly	2025-01-05 13:52:08 -08:00
sabaimran	a5705a5aa6	After agent prompt safe check is parsed as json, load it into a json object	2025-01-05 13:40:20 -08:00
Yash-1511	f306159a5a	feat: add autocomplete suggestions feature in search page	2025-01-05 17:30:00 +05:30
Jiho Lee	a5c7315874	feat: Improve docker-compose.yml - Remove host port mappings on dependencies - Add 'restart: always' - Add default network for lookup by docker dns	2025-01-05 15:17:50 +09:00
Sohaib Athar	95a2387e9b	Update image_generation.md Fixed typo	2025-01-05 07:25:41 +05:00
sabaimran	756f4a2a66	Update regeneration logic to run for all entries now that we have a single search model ID	2025-01-03 15:16:10 -08:00
Debanjum	33f85b4a55	Fix title of Query Filters documentation	2025-01-01 18:56:16 +08:00
sabaimran	0ef787b57c	Set waitBeforeSeconds to 0	2024-12-30 15:08:34 -08:00
Debanjum	90b4e03454	Pull out query filters as top level documentation page - Note perf eval from 2022 - Update links to query-filters in docs - Fix links - Update image model docs	2024-12-30 14:35:05 -08:00
sabaimran	8f69eb949b	Release Khoj version 1.33.0	2024-12-29 10:39:26 -08:00
Henri Jamet	1aff78a969	Enhance Khoj plugin search functionality and loading indicators - Added visual loading indicators to the search modal for improved user experience during search operations. - Implemented logic to check if search results correspond to files in the vault, with color-coded results for better clarity. - Refactored the getSuggestions method to handle loading states and abort previous requests if necessary. - Updated CSS styles to support new loading animations and result file status indicators. - Improved the renderSuggestion method to display file status and provide feedback for files not in the vault.	2024-12-29 16:19:42 +01:00
Henri Jamet	7d28b46ca7	Implement sync interval setting and enhance synchronization timer in Khoj plugin - Added a new setting for users to configure the sync interval in minutes, allowing for more flexible automatic synchronization. - Introduced methods to start and restart the synchronization timer based on the configured interval. - Updated the synchronization logic to use the user-defined interval instead of a fixed 60 minutes. - Improved code readability and organization by refactoring the sync timer logic.	2024-12-29 13:23:08 +01:00
Henri Jamet	c5c9e0072c	Enhance Khoj plugin settings and UI for folder synchronization - Added a new setting to manage sync folders, allowing users to specify which folders to sync or to sync the entire vault. - Implemented a modal for folder suggestions to facilitate folder selection. - Updated the folder list display to show currently selected folders with options to remove them. - Improved CSS styles for chat interface and folder list for better user experience. - Refactored code for consistency and readability across multiple files.	2024-12-29 13:07:21 +01:00
Henri Jamet	833ab52986	Add manual sync command to Khoj plugin - Introduced a new command 'Sync new changes' to allow users to manually synchronize new modifications. - The command updates the content index without regenerating it, ensuring only new changes are synced. - User-triggered notifications are displayed upon successful sync.	2024-12-29 12:51:52 +01:00
Debanjum	1a43ca75f3	Update to latest jinja python package dependency	2024-12-27 01:44:41 -08:00
Debanjum	ca197ba9ba	Improve Automation Flexibility and Automation Email Format (#1016) - Format AI response to send in automation email - Let Khoj infer chat query based on user automation query - Decide if automation emails should be sent based on response - Fix the `to_notify_or_not` decider AI - Ask reason before decision to improve to_notify decider AI - Show error message on web app when fail to create/update automation	2024-12-27 01:36:38 -08:00
Debanjum	90685ccbb0	Show error message if update automation via web app fails	2024-12-26 22:21:56 -08:00
Debanjum	c4bb92076e	Convert async create automation api endpoints to sync	2024-12-26 21:59:55 -08:00
Debanjum	9674de400c	Format AI response to send in automation email Previously we sent the AI response directly. This change post processes the AI response with the awareness that it is to be sent to the user as an email to improve rendering and quality of the emails.	2024-12-26 21:04:50 -08:00
Debanjum	6d219dcc1d	Switch to let Khoj infer chat query based on user automation query This tries to decouple the automation query from the chat query. So the chat model doesn't have to know it is running in an automation context or figure out how to notify user or send automation response. It just has to respond to the AI generated `query_to_run` corresponding to the `scheduling_request` automation by the user. For example, a `scheduling_request` of `notify me when X happens` results in the automation calling the chat api with a `query_to_run` like `tell me about X` and deciding whether to notify based on information gathered about X from the scheduled run. If these two are not decoupled, the chat model may respond with how it can notify about X instead of just asking about it. Swap query_to_run with scheduling_request on the automation web page	2024-12-26 21:04:50 -08:00
Debanjum	3600a9a4f3	Ask reason before decision to improve to_notify decider automation AI Previously it just gave a decision. This was hard to debug during prompt tuning. Asking for reason before decision improves the model's decision quality.	2024-12-26 21:04:50 -08:00
Debanjum	dcc5073d16	Fix the to_notify decider automation chat actor. Add detailed logging	2024-12-26 21:04:50 -08:00
sabaimran	03b4667acb	Merge pull request #1017 from khoj-ai/features/update-home-page - Rather than chunky generic cards, make the suggested actions more action oriented, around the problem a user might want to solve. Give them follow-up options. Design still in progress.	2024-12-24 12:12:37 -08:00
sabaimran	5985ef4c7c	Further improve prompts	2024-12-24 12:11:27 -08:00
sabaimran	012e0ef882	Add tooltip for file input ref.	2024-12-24 11:20:39 -08:00
sabaimran	a58b3b4a37	Remove some of the step one suggestions	2024-12-24 10:52:42 -08:00
sabaimran	cf78f426d3	Merge branch 'master' of github.com:khoj-ai/khoj into features/update-home-page	2024-12-24 09:52:06 -08:00
sabaimran	d27d291584	Merge pull request #1015 from khoj-ai/features/clean-up-authenticated-data - It was being assumed everywhere that if authenticatedData is null, the user is not logged in. This isn't true because the data can still be loading. Update the hook to send additional states. - Bonus: Delete model picker code and a slew of unused imports.	2024-12-24 09:51:39 -08:00
sabaimran	cd4cf4f9f6	Merge pull request #1014 from khoj-ai/features/improve-agent-management - Add support for seeing all steps of the agent modification flow via tabs at the top of the modal - Separate knowledge base & tool selection into two separate parts	2024-12-24 09:50:38 -08:00
sabaimran	3e6ba45cbe	Merge pull request #1013 from khoj-ai/features/use-sidebar Use the [shadcn sidebar](https://ui.shadcn.com/docs/components/sidebar#sidebarmenusub) across Khoj and standardize how the side panel experience works across the app. This helps us generalize the code better, while re-using the same components. Deprecates current usage of the chat history side panel, replacing it with the new `appSidebar.tsx` component. We'll eventually move out the `Manage Files` section and move it into a separate panel dedicated to chat-level controls.	2024-12-24 09:50:09 -08:00
sabaimran	95fdbe13ae	move all conversations button to bottom of side panel	2024-12-23 23:13:19 -08:00
sabaimran	d0256f267e	Fix submit state for form with buttons sticky to button	2024-12-23 23:10:19 -08:00
Debanjum	0d5fc70aa3	Fix colors, title on create agent card	2024-12-23 20:01:56 -08:00
Debanjum	0b91383deb	Make post oauth next url redirect more robust, handle q params better	2024-12-23 17:50:45 -08:00
Debanjum	17f8ba732d	Autofocus on email input box when email sign-in selected on web app	2024-12-23 17:49:41 -08:00
sabaimran	fb111a944b	Use disable_https flag instead of is_in_debug_mode to decide whether to redirect google auth request	2024-12-23 17:23:47 -08:00
sabaimran	3fd0c202ea	Allow better spacing in the agent card and make the buttons sticky	2024-12-23 16:43:50 -08:00
sabaimran	c83709fdd1	Further clean up in home page initial cards experience	2024-12-22 18:11:19 -08:00
sabaimran	4c4f4401b1	Make suggestion cards a little more minimal	2024-12-22 11:54:27 -08:00
sabaimran	90b02b4cfe	Merge branch 'features/clean-up-authenticated-data' of github.com:khoj-ai/khoj into features/update-home-page	2024-12-22 11:26:29 -08:00
sabaimran	798837378f	Improve mobile friendliness and highlight missing necessary data	2024-12-22 11:02:50 -08:00
sabaimran	7032ccf130	Show create agent button when not logged in agents page	2024-12-22 10:01:15 -08:00
sabaimran	0fefbac89f	Improve sidebar experience for not logged in state	2024-12-22 09:58:21 -08:00
sabaimran	60f80548a4	Remove unused span text	2024-12-22 09:21:40 -08:00
sabaimran	9f84f5dcc7	Give more breathing space to the sidebar footer	2024-12-22 09:18:05 -08:00
sabaimran	dc17272f70	Fix some spacing in the nav menu	2024-12-22 09:01:09 -08:00
sabaimran	45da6ec750	Separate the shorthand of each suggestion card from the prefilled text	2024-12-21 20:46:57 -08:00
sabaimran	bf405f50d7	Fix spacing of main content in the settings page	2024-12-21 20:46:21 -08:00
sabaimran	62dd4c55d4	Further improve UX of the suggestion cards	2024-12-21 20:10:54 -08:00
sabaimran	8c9c57e060	Merge branch 'features/clean-up-authenticated-data' of github.com:khoj-ai/khoj into features/update-home-page	2024-12-21 19:32:07 -08:00
sabaimran	8c6b4217ae	Set width of chat layout to 100%	2024-12-21 19:29:37 -08:00
sabaimran	a17cc9db38	Fix handling 403 forbidden error from auth response	2024-12-21 19:19:35 -08:00
sabaimran	95826393e1	Update the home page suggestion cards - Rather than chunky generic cards, make the suggested actions more action oriented, around the problem a user might want to solve. Give them follow-up options. Design still in progress.	2024-12-21 18:57:19 -08:00
Debanjum	8d129c4675	Bump default max prompt size for commercial chat models	2024-12-21 17:31:05 -08:00
Debanjum	37ae48d9cf	Add support for OpenAI o1 model Needed to handle the current API limitations of the o1 model. Specifically its inability to stream responses	2024-12-21 17:25:32 -08:00
sabaimran	2c7c16d93e	Fix conditional reference to is mobile width hook	2024-12-21 12:48:29 -08:00
sabaimran	e9dae4240e	Clean up all references to authenticatedData - It was being assumed everywhere that if authenticatedData is null, the user is not logged in. This isn't true because the data can still be loading. Update the hook to send additional states. - Bonus: Delete model picker code and a slew of unused imports.	2024-12-21 08:45:43 -08:00
sabaimran	b1c5c5bcc9	Fix path to component library in shadcn sidebar	2024-12-20 15:48:57 -08:00
sabaimran	cc7fd1163f	Use tabs to separate sections in the agent mod form - Add knowledge base as a separate section, apart from tools - This makes it easier to navigate the different components quickly	2024-12-20 14:41:15 -08:00
sabaimran	078753df30	Add chatwoot to the frame-src CSP	2024-12-20 13:33:48 -08:00
sabaimran	efe812e323	Add components for tabs in agent page	2024-12-20 13:33:20 -08:00
sabaimran	ba792c02ba	Improve share chat UI for alignment	2024-12-20 12:28:31 -08:00
sabaimran	7770caa793	Add side bar inset to home page. Simplify automations card.	2024-12-20 11:37:23 -08:00
sabaimran	b8a9dcd600	Improve mobile layout with sidebar inset and fix dark mode logo	2024-12-19 23:23:52 -08:00
sabaimran	b1880d9c9d	Add side bar to search page	2024-12-19 22:46:50 -08:00
sabaimran	02c503e966	Further improve mobile layout with side panel	2024-12-19 22:00:32 -08:00
sabaimran	43331f7730	Remove unused css classes	2024-12-19 21:36:35 -08:00
sabaimran	b644bb8628	Further improve mobile friendliness	2024-12-19 21:34:04 -08:00
sabaimran	9d5480d886	Improve mobile friendliness across chat and home pages.	2024-12-19 20:33:53 -08:00
sabaimran	68af10c805	Use the new shadcn sidebar for khoj nav and actions - Use the sidebar across all pages to quickly navigate through the app, access settings, and go to past chats - Pending: mobile friendliness	2024-12-19 20:10:03 -08:00
sabaimran	7eb15bf0a9	Update shadcn components	2024-12-19 14:33:36 -08:00
sabaimran	a4aeb9fdf3	Simplify the home page color scheme and overall design	2024-12-19 14:02:53 -08:00
Debanjum	0ae21e5628	Use icons, not text labels, for sidebar nav items on docs website	2024-12-17 20:49:09 -08:00
sabaimran	cafe1b0655	Remove error log line for payload inclusion	2024-12-17 19:48:22 -08:00
Debanjum	1e3f452d15	Handle sharing old conversations publicly even if they have no slug New conversations have a slug, but older conversations may not. This change allows those older conversations to still be shareable by using a random uuid for constructing their url instead	2024-12-17 19:44:36 -08:00
Debanjum	90b7ba51a4	Make agent prompt safety checker more forgiving and concise - Some of the instructions were duplicated (e.g illegal, harmful) - Return format requested was inconsistent - Safety prompt felt overly prudish, which lowered its utility. Make it laxer for now, add checks later if required	2024-12-17 19:44:36 -08:00
sabaimran	bcee2ea01a	Release Khoj version 1.32.2	2024-12-17 16:03:29 -08:00
sabaimran	92144c8102	Remove release step in todesktop flow, since we need to run releases manually now - Leaving it commented out for the time being so we can revisit automating this later	2024-12-17 16:02:45 -08:00
sabaimran	f291884921	If not in debug mode, force google auth to use the https protocol	2024-12-17 15:44:18 -08:00
sabaimran	bcc1bc6854	Log the payload sent temporarily in order to help with debugging	2024-12-17 15:37:42 -08:00
sabaimran	7ca2553d17	Update login popup copy	2024-12-17 15:20:44 -08:00
sabaimran	ef99d8c28e	Add more descriptive error logs when google auth token verification fails	2024-12-17 15:18:38 -08:00
Debanjum	60d55a83c4	Use khoj logo via url in readme to load in other locations like pypi	2024-12-17 14:00:53 -08:00
Debanjum	10bd56d2b9	Attest Khoj pypi package by upgrading pypi publish gh action - Print hash in CI to ease verifying ci built python package matches khoj package published on pypi - Newer pypi publish github action should speed up workflow by ~30s	2024-12-17 13:40:39 -08:00
sabaimran	2e80a1ce7c	Release Khoj version 1.32.1	2024-12-17 13:28:00 -08:00
Debanjum	df15f00243	Tag docker images with latest tag in dockerize workflow on release	2024-12-17 13:18:51 -08:00
sabaimran	ded168dae3	Release Khoj version 1.32.0	2024-12-17 12:29:20 -08:00
sabaimran	f6abfcfa6b	Use latest release version for pypi gh action to publish	2024-12-17 12:19:42 -08:00
Debanjum	c20364efcb	Upgrade web app next.js, shadcn and other package dependencies	2024-12-17 11:15:22 -08:00
Debanjum	63d2c6f35a	Allow research mode and other conversation commands in automations (#1011) Major --- Previously we couldn't enable research mode or use other slash commands in automated tasks. This change separates determining if a chat query is triggered via automated task from the (other) conversation commands to run the query with. This unlocks the ability to enable research mode in automations apart from other variations like /image or /diagram etc. Minor --- - Clean the code to get data sources and output formats - Have some examples shown on automations page run in research mode now	2024-12-17 00:44:51 -08:00
sabaimran	3b050a33bb	Include resend as a default dependency, rather than restricting to prod	2024-12-16 22:24:41 -08:00
sabaimran	741e9f56f9	Update admin button for getting login url to include user email	2024-12-16 22:16:05 -08:00
Debanjum	88aa8c377a	Support online search with Searxng as zero config, self-hostable solution (#1010) This allows online search to work out of the box again for self-hosting users, as no auth/api key setup required. Docker users do not need to change anything in their setup flow. Direct installers can set up Searxng locally or use public instances if they do not want to use any of the other providers (like Jina, Serper) Resolves #749. Resolves #990	2024-12-16 18:59:09 -08:00
sabaimran	7677bb14f8	Update docs home page	2024-12-16 18:48:42 -08:00
sabaimran	1bd0d46b3d	Update the theme color used in our docs to match the theme in our emails	2024-12-16 18:42:50 -08:00
sabaimran	432e901087	Update the online search self-hosting instructions to reflect new setup	2024-12-16 18:37:20 -08:00
sabaimran	af2553a890	Remove the Searxng API key env variable for simplicity	2024-12-16 18:36:55 -08:00
sabaimran	19d80d190e	Add `latest` tag to the khoj cloud description for prod	2024-12-16 17:56:12 -08:00
sabaimran	d62cc0d539	Merge branch 'master' of github.com:khoj-ai/khoj into support-online-search-via-searxng	2024-12-16 17:55:06 -08:00
sabaimran	6f3218f487	Merge pull request #1008 from khoj-ai/features/new-sign-in-page - Make it easier to quickly create the account without losing track of where you are - Show some capabilities before you sign on	2024-12-16 17:54:43 -08:00
sabaimran	753859fbe0	Make the docs and github buttons on the sign in email less prominent	2024-12-16 17:49:13 -08:00
sabaimran	20888d3930	Clarify some of the language in the sign in email	2024-12-16 17:47:12 -08:00
sabaimran	13e7455f56	Use lowercase c in click	2024-12-16 17:44:08 -08:00
sabaimran	d17a9ba4c8	Fix return data for expired code user	2024-12-16 17:33:53 -08:00
sabaimran	efb0b9f495	Gracefully handle error when user login code is expired	2024-12-16 16:47:54 -08:00
sabaimran	064f7e48ca	Update various copy texts for OG metadata and such	2024-12-16 16:40:46 -08:00
sabaimran	28b8f9105d	Remove parenthetical from email template	2024-12-16 13:40:41 -08:00
Debanjum	9d02978f6e	Support online search with Searxng as turnkey, self-hostable solution This allows online search to work out of the box again for self-hosting users, as no auth/api key setup required. Docker users do not need to change anything in their setup flow. Direct installers can setup searxng locally or use public instances if they do not want to use any of the other providers (like Jina, Serper) Resolves #749. Resolves #990	2024-12-16 12:53:38 -08:00
Debanjum	9c64275dec	Auto redirect requests to use HTTPS if server is using SSL certs	2024-12-16 12:53:38 -08:00
sabaimran	ae9750e58e	Add rate limiting to OTP login attempts and update email template for conciseness	2024-12-16 11:59:45 -08:00
sabaimran	b7783357fa	Decrease timeout limit for verification codes to 5 minutes	2024-12-16 09:07:34 -08:00
sabaimran	5d3da3340f	Include email in verification API	2024-12-15 13:54:41 -08:00
sabaimran	8e3313156e	Simplify the magic link email a little bit	2024-12-14 11:19:31 -08:00
sabaimran	6a56140360	Allow users to directly enter their unique code when logging in - Code automatically becomes invalid after 30 minutes	2024-12-14 11:06:05 -08:00
sabaimran	c25174e8d4	Apply more finished styling to login features, make the pop-up mobile friendly	2024-12-14 09:46:19 -08:00
sabaimran	73c1fe6ae1	Add text overlay to caption the different assets	2024-12-13 13:45:31 -08:00
Debanjum	132f2c987a	Make Khoj email sender configurable for all email variants The welcome, feedback and automation emails were still using the Khoj domain, which wouldn't work for self-hosted users with their RESEND key Resolves #1004	2024-12-13 12:25:23 -08:00
sabaimran	f1fb4525c6	Remove old images	2024-12-13 11:31:14 -08:00
sabaimran	d5681ad1a2	Update image assets to sign up prompt	2024-12-13 11:30:14 -08:00
sabaimran	62545a9af3	Update package path for pypi ci export	2024-12-12 21:25:21 -08:00
sabaimran	e74e922cea	Update file path of python installation	2024-12-12 16:50:32 -08:00
sabaimran	144f283a25	Maintain old login page for posterity and associated API	2024-12-12 16:23:44 -08:00
sabaimran	4697daeb1a	Improve opengraph metadata across front-end pages	2024-12-12 15:56:43 -08:00
sabaimran	dfc150c442	Merge branch 'master' of github.com:khoj-ai/khoj into features/new-sign-in-page	2024-12-12 15:43:06 -08:00
sabaimran	ad3f8a33d1	Add a static login footer that prompts login, disable input box without auth	2024-12-12 14:57:52 -08:00
Debanjum	2db7a1ca6b	Restart code sandbox on crash in eval github workflow (#1007) See `e3fed3750b` for corresponding change to use pm2 to auto-restart code sandbox	2024-12-12 14:32:03 -08:00
Debanjum	12c976dcb2	Track usage costs from DeepInfra OpenAI compatible API	2024-12-12 14:20:34 -08:00
Debanjum	b0abec39d5	Use chat model name var name consistently and type openai chat utils	2024-12-12 14:20:34 -08:00
Debanjum	4915be0301	Fix initializing chat model names parameter after field rename in #1003	2024-12-12 14:20:33 -08:00
sabaimran	a7d0ed8670	Add carousel for navigating images in the sign up modal	2024-12-12 11:47:41 -08:00
Debanjum	9eb863e964	Restart code sandbox on crash in eval github workflow	2024-12-12 11:28:54 -08:00
Debanjum	01bc6d35dc	Rename Chat Model Options table to Chat Model as short & readable (#1003) - Previous was incorrectly plural but was defining only a single model - Rename chat model table field to name - Update documentation - Update references functions and variables to match new name	2024-12-12 11:24:16 -08:00
sabaimran	943065b7b3	Remove dead dependencies and improve the google sign in button	2024-12-12 11:19:19 -08:00
sabaimran	41bb1e60d0	Use the LoginPrompt in the chat history side panel	2024-12-11 22:56:03 -08:00
sabaimran	b60b750555	Update the styling to align with Google branding via the sign in button - Disable the gsi client side code since it's being finicky and inconsistent	2024-12-11 22:49:11 -08:00
sabaimran	0f8b055b42	Improve padding for space, esp in mobile	2024-12-11 18:22:48 -08:00
sabaimran	142239d2c9	Add mobile friendliness and replace the login page redirects	2024-12-11 18:01:04 -08:00
sabaimran	de6ed2352a	Break up the parts of the login dialog into smaller modules to extend for mobile	2024-12-11 17:18:43 -08:00
sabaimran	d35db99c6f	Initial version of a carousel working for sign in with steps for email sign up - Google sign in is pending with the gsi client code. Will see if I can get that working - Check in relevant image assets	2024-12-11 16:54:06 -08:00
aditya218	9be26e1bd2	Fix documentation to point to local environment image_generation.md (#1005) Fix documentation to point to local environment.	2024-12-11 16:12:25 -08:00
sabaimran	530b44cf56	Merge branch 'master' of github.com:khoj-ai/khoj into features/new-sign-in-page	2024-12-11 10:30:13 -08:00
Debanjum	fe09df66b7	Make code sandbox container url accessible to Khoj container in docker compose	2024-12-11 01:14:26 -08:00
Debanjum	59008ae90e	Use buildx to create multi platform docker image	2024-12-11 00:21:29 -08:00
Debanjum	ec797bc6b8	Build docker imgs on native arch runners to avoid manifest list error This also avoids the need to use --amend and annotate steps when creating the multi-arch docker images	2024-12-10 23:16:36 -08:00
Debanjum	5f7b13df2d	Fix new docker tags in workflow to not include forward slashes	2024-12-10 22:55:33 -08:00
Debanjum	ba6237b5c0	Fix to create multi-arch builds. Stop docker image overwrites in workflow	2024-12-10 21:08:17 -08:00
sabaimran	44ede26e67	Temporarily disable cloud arm builds while we disambiguate the build issues	2024-12-10 20:00:59 -08:00
Debanjum	33a5efaf4b	Fix undefined variable exception during openai provider setup on init Resolves #1001	2024-12-10 19:54:00 -08:00
sabaimran	e43341fdcc	Release Khoj version 1.31.0	2024-12-10 19:41:31 -08:00
Debanjum	a757ecfd2b	Put the generated assets message after the user query and fix prompt	2024-12-10 19:40:13 -08:00
Debanjum	40e4f2ec2e	Reduce clutter in chat message ux on Obsidian - Move khoj message border to left like in web ui - Remove user, sender emojis and name - Ensure conversation title always at top of chat sessions view, even if no chat sessions loaded yet, instead of causing layout shift on chat sessions load	2024-12-10 19:34:17 -08:00
sabaimran	1ec1eff57e	Improve mobile header, reduce title bar padding and add conv title	2024-12-10 19:03:57 -08:00
sabaimran	321eeeaed7	Fix setting title of shared conversation, move shared button into the title pane	2024-12-10 18:19:46 -08:00
sabaimran	d7e5a76ace	Add an icon in the input bar for research mode	2024-12-10 17:49:25 -08:00
sabaimran	01d000e570	Merge pull request #1002 from khoj-ai/features/improve-multiple-output-mode-generation Improve handling of multiple output modes - Use the generated descriptions / inferred queries to supply context to the model about what it's created and give a richer response - Stop sending the generated image in user message. This seemed to be confusing the model more than helping. - Collect generated assets in structured objects to provide model context. This seems to help it follow instructions and separate responsibility better - Also, rename the open ai converse method to converse_openai to follow patterns with other providers	2024-12-10 17:06:19 -08:00
sabaimran	2bb14c55a6	Merge branch 'master' of github.com:khoj-ai/khoj into features/improve-multiple-output-mode-generation	2024-12-10 16:56:36 -08:00
sabaimran	6c8007e23b	Improve handling of multiple output modes - Use the generated descriptions / inferred queries to supply context to the model about what it's created and give a richer response - Stop sending the generated image in user message. This seemed to be confusing the model more than helping. - Also, rename the open ai converse method to converse_openai to follow patterns with other providers	2024-12-10 16:54:21 -08:00
Debanjum	4bc5c1357a	Upgrade server, documentation dependencies. Spell fix docker-compose.yml	2024-12-10 15:47:47 -08:00
Debanjum	f8957e52bf	Keep chatml message content simple for wider compat unless attachments This allows for wider compatibility with chat models and openai proxy ai model apis that expect message content to be string format, not objects.	2024-12-10 00:10:56 -08:00
sabaimran	4b4e0e20d4	Make the version number a badge, rather than an independent item in the nav dropdown	2024-12-09 14:45:26 -08:00
sabaimran	eb36492ba5	Update handling of images when included in the chat history with assistant message	2024-12-08 21:46:07 -08:00
Debanjum	b660c494bc	Use recognizable DB model names to ease selection UX on Admin Panel Previously id were used (by default) for model display strings. This made it hard to select chat model options, server chat settings etc. in the admin panel dropdowns. This change uses more recognizable names for the DB objects to ease selection in dropdowns and display in general on the admin panel.	2024-12-08 20:34:50 -08:00
Debanjum	d10dc9cfe1	Inform code tool AI only limited python packages are available to it Reduce code tool failing with module not found errors	2024-12-08 20:34:50 -08:00
Debanjum	3fd8614a4b	Only auto load available chat models from Ollama provider for now Allowing models from any openai proxy service makes it too unwieldy. And a bunch of them do not even support this endpoint.	2024-12-08 20:34:50 -08:00
sabaimran	2c934162d3	Add a data filter for privacy_level of agents	2024-12-08 19:55:57 -08:00
sabaimran	3b9f4c4356	Correct negative for running prod image locally	2024-12-08 19:55:35 -08:00
Debanjum	9dd3782f5c	Rename OpenAIProcessorConversationConfig DB model to more apt AiModelApi (#998 ) * Rename OpenAIProcessorConversationConfig to more apt AiModelAPI The DB model name had drifted from what it is being used for, a general chat api provider that supports other chat api providers like anthropic and google chat models apart from openai based chat models. This change renames the DB model and updates the docs to remove this confusion. Using Ai Model Api we catch most use-cases including chat, stt, image generation etc.	2024-12-08 18:02:29 -08:00
sabaimran	df66fb23ab	Centralize definition of the content security policy and add in-app chat - in-app chat is meant for support requests and currently is only in the settings page, where users are most likely to go if confused IMO	2024-12-08 17:57:27 -08:00
sabaimran	0b87c13f8d	Add khoj_version to the settings menu	2024-12-08 17:55:56 -08:00
sabaimran	47a087c73b	Fix chatwoot import issue by checking whether we're in an execution environment before loading the script	2024-12-08 17:16:20 -08:00
sabaimran	66f59c8d41	Add Chatwoot to documentation See repo: https://github.com/chatwoot/chatwoot	2024-12-08 16:51:43 -08:00
sabaimran	6ed051d631	Merge pull request #994 from khoj-ai/features/update-desktop-app Simplify the desktop app - Make the desktop app mainly a file-syncing client for users who have lots of documents that they need to share with Khoj. This is because the web app provides a fairly robust chat client which can be used by anyone on their computer. - The chat client in the desktop app had significantly drifted from our current brand / theme, and didn't provide enough value add to update. Later, we will make it easier to install the existing web app as a desktop PWA.	2024-12-08 15:05:35 -08:00
sabaimran	05b3911080	Update some button titles and add descriptions for clarity	2024-12-08 14:29:12 -08:00
sabaimran	b78b92d6a0	Merge branch 'master' of github.com:khoj-ai/khoj into features/update-desktop-app	2024-12-08 14:20:20 -08:00
sabaimran	e3789aef49	Merge pull request #992 from khoj-ai/features/allow-multi-outputs-in-chat Currently, Khoj has terminal states with respect to what assets it outputs. We limit it to image, text, and excalidraw diagram. This limitation is unnecessary and provides undue constraints for creating more dynamic, diverse experiences. For instance, we may want the chat view to morph for document editing or generation, in which case having limited output modes would be a detriment. This change allows us to emit generated assets and then continue on to more text generation in final response. It forces text response for all messages. It adds a new stream event, GENERATED_ASSETS, which holds content that the AI is emitting in response to the query.	2024-12-08 14:19:05 -08:00
sabaimran	a2251f01eb	Make result optional for code context, relevant when code execution was unsuccessful	2024-12-08 13:27:33 -08:00
sabaimran	9c403d24e1	Fix reference to directory in the eval workflow for starting terrarium	2024-12-08 13:03:05 -08:00
sabaimran	7cd2855146	Make attributes optional in the knowledge graph model	2024-12-08 12:23:17 -08:00
sabaimran	2af687d1c5	Allow snippetHighlighted to also be nullable	2024-12-08 11:51:24 -08:00
sabaimran	efa23a8ad8	Update validation requirements for online searches	2024-12-08 11:30:17 -08:00
sabaimran	6940c6379b	Add sudo when running installations in order to install relevant packages add --legacy-peer-deps temporarily to see if it helps mitigate the issue	2024-12-08 11:11:13 -08:00
sabaimran	4c4b7120c6	Use Khoj terrarium fork instead of building from official Cohere repo	2024-12-08 11:06:33 -08:00
sabaimran	a138845fea	Merge branch 'master' of github.com:khoj-ai/khoj into features/allow-multi-outputs-in-chat	2024-12-08 10:57:16 -08:00
sabaimran	19832a3ed0	Add note to uncomment line when using the prod image	2024-12-05 18:16:01 -08:00
sabaimran	110c64ba27	Update the desktop instructions	2024-12-05 17:39:16 -08:00
Debanjum	65c5b163c9	Add khoj_lantern svg to web public assets for use by new admin panel	2024-12-05 10:57:18 -08:00
Debanjum	354dc12b3b	Style the Admin Panel with a modern theme and Khoj branding (#999 ) Overview - The default django admin panel UI looks pretty dated and didn't have any Khoj specific branding - Used the Unfold Django admin panel theme for a modern look - Used the Khoj logo and name in Admin panel title, headings, favicons Details: All models shown on Admin panel need to inherit from unfold's ModelAdmin to get styling applied. So - Make all models on Admin panel inherit from unfold's ModelAdmin - Subclassed UserAdmin to inherit from unfold's ModelAdmin - Deregistered the unused Auth Group model from the Admin panel We can add it back when its actually used. Avoid confusion for now - Explicitly register DjangoJobExecution on admin panel and again make it inherit from the unfold.admin.ModelAdmin	2024-12-04 23:53:43 -08:00
Matias Forbord	9cc79c0fb7	Fix broken doc links to query filters from emacs docs page (#1000 ) * docs: repair query filters link * change docs link to be relative	2024-12-04 23:52:27 -08:00
sabaimran	8953ac03ec	Rename additional context for llm response to program execution context	2024-12-04 18:43:41 -08:00
sabaimran	886fe4a0c9	Merge branch 'master' of github.com:khoj-ai/khoj into features/allow-multi-outputs-in-chat	2024-12-03 21:37:00 -08:00
sabaimran	df5e34615a	Fix processing of images field when construct chat messages	2024-12-03 21:26:55 -08:00
sabaimran	3552032827	Rename additional context to additional_context_for_llm_response	2024-12-03 21:23:15 -08:00
sabaimran	d507894546	Simplify the desktop app - Make the desktop app mainly a file-syncing client for users who have lots of documents that they need to share with Khoj. This is because the web app provides a fairly robust chat client which can be used by anyone on their computer. - The chat client in the desktop app had significantly drifted from our current brand / theme, and didn't provide enough value add to update. Later, we will make it easier to install the existing web app as a desktop PWA.	2024-12-02 15:54:05 -08:00
Debanjum	9f7cb335c5	Create Android app for Khoj (#991 ) Use bubblewrap to publish the Khoj Progressive Web App (PWA) as a Trusted Web Activity (TWA) to the Android Play Store	2024-12-02 15:45:04 -08:00
Debanjum	ee28d7f125	Fix Android Studio build warnings by using newer gradle, mavenCentral	2024-12-02 12:48:33 -08:00
Debanjum	5d6fb07066	Fix app icons, orientation. Improve name, id, description in webmanifest - Use bubblewrap generated splash screen, notification icons from 1200x1200 high res khoj icon in assets.khoj.dev. - Discard bubblewrap generated launcher icons. - Fix orientation to respect phone orientation settings. "any" was not.	2024-12-02 12:48:25 -08:00
Debanjum	147c8e9115	Release v3 with high-res splash screen. More details in web, app manifest - Add 512, 192 Khoj maskable icons to web app manifest for android rendering - Add id, categories etc suggested by pwabuilder - Use higher quality icon images for splash screen than what bubblewrap creates by default	2024-12-02 11:28:35 -08:00
Debanjum	d333e10e64	Encode request params as utf-8 to fix multibyte char error in khoj.el Encode api key in header, POST request body and GET query param for search as utf-8 to avoid the multibyte char in request issue when making API calls from khoj.el to khoj server. Resolves #935	2024-12-02 02:00:14 -08:00
Debanjum	db29894038	Do not wrap filepath in Path to fix indexing markdown files on Windows (#993 ) ### Issue - Path with / are converted to \\ on Windows using the `Path' operator. - The `markdown_to_entries' module was trying to normalize file paths with `Path' for some reason. This would store the file paths in DB Entry differently than the file to entries map if Khoj ran on Windows. That'd result in a KeyError when trying to look up the entry file path from `file_to_text_map' in the `text_to_entries:update_embeddings()' function. ### Fix - Removing the unnecessary OS dependent Path normalization in `markdown_to_entries' should keep the file path storage consistent across `file_to_text_map' var, `FileObjectAdaptor', `Entry' DB tables on Windows for Markdown files as well. This issue will affect users hosting Khoj server on Windows and attempting to index markdown files. Resolves #984	2024-12-02 01:02:58 -08:00
Debanjum	47c926b0ff	Add more typing to org\|md_to_entries. Remove redundant f-string wraps - Add type hints to improve maintainability of stabilized indexing code - It shouldn't be necessary to wrap string variables in an f-string This change aims to improve code quality. It should not affect functionality.	2024-12-01 23:02:52 -08:00
Debanjum	dffdd81345	Do not wrap filepath in Path to fix indexing markdown files on Windows Issue - Path with / are converted to \\ on Windows using the Path operator. - The markdown to entries method for some reason was doing this. This would store the file paths in DB entry differently than the file to entries map. Resulting in a KeyError when trying to look up the entry file path from file_to_text_map in the text_to_entries:update_embeddings() function. Fix - Removing the unnecessary OS dependent Path normalization in markdown_to_entries should keep the file path storage consistent across file_to_text_map var, FileObjectAdaptor, Entry DB tables on Windows for Markdown files as well This issue would only affect users hosting Khoj server on Windows and attempting to index markdown files. Resolves #984	2024-12-01 23:00:31 -08:00
sabaimran	c87fce5930	Add a migration to use the new image storage format for past conversations - Added it to the Django migrations so that it auto-triggers when someone updates their server and starts it up again for the first time. This will require that they update their clients as well in order to view/consume image content. - Remove server-side references in the code that allow to parse the text-to-image intent as it will no longer be necessary, given the chat logs will be migrated	2024-12-01 18:35:31 -08:00
Debanjum	9e0a2c7a98	Restrict generated chat title to 200 chars limit allowed for chat slug	2024-11-30 19:12:03 -08:00
Debanjum	8b8e2be82d	Only create subscription object when it does not exist for user This avoids unnecessarily throwing an internal server error when the user tries to sign-up using multiple mechanisms (e.g first by email, then by google oauth)	2024-11-30 19:08:34 -08:00
Debanjum	fc6be543bd	Improve GPQA eval prompt to improve parsing answer from Khoj response	2024-11-30 17:21:09 -08:00
sabaimran	00f48dc1e8	If in the new images format, show the response text in obsidian instead of the inferred query	2024-11-30 14:39:51 -08:00
sabaimran	224abd14e0	Only add the image_url to the constructed chat message if it is a url	2024-11-30 14:39:27 -08:00
sabaimran	991577aa17	Allow a None turnId to accommodate historic chat messages	2024-11-30 14:39:08 -08:00
sabaimran	a539761c49	Fix processing of excalidrawdiagram in json response chunking	2024-11-30 12:35:13 -08:00
sabaimran	dc4a9ee3e1	Ensure that the generated assets are maintained in the chat window after streaming is completed.	2024-11-30 12:31:20 -08:00
sabaimran	e3aee50cf3	Fix parsing of generated_asset response	2024-11-29 18:41:53 -08:00
sabaimran	2b32f0e80d	Remove commented out code blocks	2024-11-29 18:11:50 -08:00
sabaimran	df855adc98	Update response handling in Obsidian to work with new format	2024-11-29 18:10:47 -08:00
sabaimran	512cf535e0	Collapse train of thought when completed during live stream	2024-11-29 18:10:35 -08:00
sabaimran	a0b00ce4a1	Don't include null attributes when filling in stored conversation metadata - Prompt adjustments to indicate to LLM what context it has	2024-11-29 18:10:14 -08:00
sabaimran	c5329d76ba	Merge branch 'master' of github.com:khoj-ai/khoj into features/allow-multi-outputs-in-chat	2024-11-29 14:12:03 -08:00
sabaimran	46f647d91d	Improve image rendering for khoj generated images. Fix typing of stored excalidraw image.	2024-11-29 14:11:48 -08:00
Debanjum	fdf69b7049	Publish second version with new upload key	2024-11-28 22:04:10 -08:00
Debanjum	faf15072b6	Create first version of Khoj Android app from PWA using Bubblewrap	2024-11-28 22:04:10 -08:00
sabaimran	4f6d1211ba	Fix additional context type in anthropic chat	2024-11-28 20:16:36 -08:00
sabaimran	6f408948d3	Fix typing of generated_fiels parameters	2024-11-28 20:15:10 -08:00
sabaimran	439b18c21f	Release Khoj version 1.30.10	2024-11-28 19:43:06 -08:00
sabaimran	2dfd163430	Add more explicit run strategies in the runner matrix	2024-11-28 19:31:34 -08:00
sabaimran	80cd902c86	Since linux/amd64 images aren't being created, try setting a custom description on the image Refer to this GH documentation on working with multi arch images in the container registry: https://docs.github.com/en/packages/working-with-a-github-packages-registry/working-with-the-container-registry#adding-a-description-to-multi-arch-images	2024-11-28 19:14:06 -08:00
sabaimran	40d8a7a581	Release Khoj version 1.30.9	2024-11-28 18:45:50 -08:00
sabaimran	87aa653c7f	Add additional steps in prod.Dockerfile to ensure dependencies are copied over	2024-11-28 18:37:08 -08:00
sabaimran	d91935c880	Initial commit of a functional but not yet elegant prototype for this concept	2024-11-28 17:28:23 -08:00
Debanjum	a552543f4f	Use json5 to parse llm generated questions to query docs and web json5 is more forgiving, handles double quotes, newlines in raw json string	2024-11-28 14:35:34 -08:00
Debanjum	0a69af4f61	Update to latest ToDesktop runtime	2024-11-28 13:56:14 -08:00
Debanjum	1d0fe141dc	Release Khoj version 1.30.8	2024-11-28 13:37:30 -08:00
Debanjum	29e801c381	Add MATH500 dataset to eval Evaluate simpler MATH500 responses with gemini 1.5 flash This improves both the speed and cost of running this eval	2024-11-28 12:48:25 -08:00
Debanjum	22aef9bf53	Add GPQA (diamond) dataset to eval	2024-11-28 12:48:25 -08:00
Debanjum	f1190ccf32	Improve parsing complex json strings returned by LLM (#989 ) - Improve escaping to load complex json objects - Fallback to a more forgiving [json5](https://json5.org/) loader if `json.loads` cannot parse complex json str This should reduce failures to pick research tool and run code by agent	2024-11-28 11:01:39 -08:00
Debanjum	8c120a5139	Fallback to json5 loader if json.loads cannot parse complex json str JSON5 spec is more flexible, try to load using a fast json5 parser if the stricter json.loads from the standard library can't load the raw complex json string into a python dictionary/list	2024-11-26 21:17:00 -08:00
Debanjum	70b7e7c73a	Improve load of complex json objects. Use it to pick tool, run code Gemini doesn't work well when trying to output json objects. Using it to output raw json strings with complex, multi-line structures requires more intense clean-up of raw json string for parsing	2024-11-26 17:37:57 -08:00
Debanjum	8cb0db0051	Fix llama-cpp-python install by pytest github workflow - Use pre-built wheels for torch and llama-cpp-python - Install and link musl as it's used by llama-cpp-python pre-built wheel instead of glibc - Join Install git and Install Dependencies steps in pytest workflow To remove unnecessary steps	2024-11-26 02:04:36 -08:00
Debanjum	29315f44e7	Add assetlinks.json to link android app to app.khoj.dev domain Add sha cert of android upload, signing keys to open debug, prod apps as TWA in fullscreen on android phones	2024-11-26 01:57:54 -08:00
Debanjum	a97a45bf20	Align agent personality with recently updated khoj personality See update to Khoj personality in commit `6eb59464da`	2024-11-26 00:06:16 -08:00
Debanjum	e088fcbc7b	Build for arm64 on arm64 runner. Parallelize arm64, x64 docker builds - Building arm64 image on an ubuntu arm64 runner reduces `yarn build' step time by 75% from 12mins to 3mins. - This is because no QEMU emulation for arm64 on x86 is required now - Parallelizing x64 and arm64 platform builds halves build time on top - Revert to use standard ubuntu-latest runner as large x64 runner doesn't give much more speed improvements This results in an effective additional 50%-66% reduction in build time on top of #987. So a full dockerize workflow run now takes 10 mins vs previous 35+mins. This is a total of 72% improvement in max dockerize run time. Get additional speed improvements when docker layer cache hit.	2024-11-24 23:18:55 -08:00
Debanjum	5723a3778e	Speed up Docker image builds using multi-stage parallel pipelines (#987 ) ## Objective Improve build speed and size of khoj docker images ## Changes ### Improve docker image build speeds - Decouple web app and server build steps - Build the web app and server in parallel - Cache docker layers for reuse across dockerize github workflow runs - Split Docker build layers for improved cacheability (e.g separate `yarn install` and `yarn build` steps) ### Reduce size of khoj docker images - Use an up-to-date `.dockerignore` to exclude unnecessary directories - Do not install cuda python packages for cpu builds ### Improve web app builds - Use consistent mechanism to get fonts for web app - Make tailwind extensions production instead of dev dependencies - Make next.js create production builds for the web app (via `NODE_ENV=production` env var)	2024-11-24 21:49:46 -08:00
Debanjum	4a5646c8da	Cache docker layers, nextjs builds in dockerize github workflow	2024-11-24 21:06:22 -08:00
Debanjum	6a39651ad3	Standardize loading fonts locally across pages on web app	2024-11-24 20:41:15 -08:00
sabaimran	9368699b2c	Migrate the pre-commit config	2024-11-24 14:54:26 -08:00
sabaimran	6eb59464da	Add additional reinforcement to coax gemini into giving a minimum helpful response	2024-11-24 14:53:53 -08:00
sabaimran	15f062b34a	Remove print statement for agent style map	2024-11-24 14:53:53 -08:00
sabaimran	d7e68a2d1b	Wait for iplcodata to load before first message - Fix the console khoj ai ascii art - Remove some not so good suggested prompt	2024-11-24 14:53:53 -08:00
Debanjum	f51e0f7859	Make Next.js create production builds of web app for Docker images	2024-11-24 13:59:40 -08:00
Debanjum	710e00ad9e	Make tailwind extensions prod, instead of dev, deps of web app	2024-11-24 13:59:40 -08:00
Debanjum	4b486ea5f6	Exclude unnecessary directories from final docker builds	2024-11-24 13:59:40 -08:00
Debanjum	78d8ca49ec	Skip Nvidia GPU python packages during Server install in Dockerfiles	2024-11-24 13:59:39 -08:00
Debanjum	37887a175a	Speed up Docker image builds using multi-stage parallel pipelines Decouple web app, server builds in parallel to speed up Docker builds	2024-11-24 12:48:30 -08:00
Debanjum	7c77d65d35	Improve logic to disable telemetry via KHOJ_TELEMETRY_DISABLE env var The newly added KHOJ_TELEMETRY_DISABLE env var knob to disable telemetry should override old config mechanism when set	2024-11-24 00:54:16 -08:00
sabaimran	2d683898c2	Release Khoj version 1.30.7	2024-11-23 22:51:10 -08:00
sabaimran	914ff994f7	Fix cost addition to chat_metadata	2024-11-23 22:50:45 -08:00
Debanjum	caaa127dcf	Release Khoj version 1.30.6	2024-11-23 21:07:00 -08:00
Debanjum	57b8273002	Fix apt install for musl-dev in prod.Dockerfile	2024-11-23 21:06:09 -08:00
Debanjum	8f966b11ec	Release Khoj version 1.30.5	2024-11-23 20:49:05 -08:00
Debanjum	498895a47d	Fix libmusl error using pre-built llama-cpp-python wheel in prod Docker	2024-11-23 20:47:41 -08:00
Debanjum	e5b211a743	Release Khoj version 1.30.4	2024-11-23 19:48:21 -08:00
Debanjum	9848d89d03	Try build docker images with github high cpu, ram runner	2024-11-23 19:09:36 -08:00
Debanjum	04bb3d6f15	Fix libmusl error using pre-built llama-cpp-python wheel via Docker Seems like llama-cpp-python pre-built wheels need libmusl. Otherwise you run into runtime errors on Khoj startup via Docker.	2024-11-23 18:46:44 -08:00
Debanjum	8dd2122817	Set sample size to 200 for automated eval runs as well	2024-11-23 14:48:38 -08:00
Debanjum	c4ef31d86f	Release Khoj version 1.30.3	2024-11-23 14:40:06 -08:00
Debanjum	15ae22bdcf	Use pre-built llama-cpp-python wheel in Khoj docker images Reduces build time and resolves FileNotFoundError 'ninja' during llama-cpp-python local build.	2024-11-23 14:38:07 -08:00
sabaimran	4ac49ca90f	Release Khoj version 1.30.2	2024-11-23 12:00:28 -08:00
sabaimran	eb1b21baaa	Add a new sign in modal that is triggered from the login prompt screen, rather than redirecting user to another screen to sign in	2024-11-23 11:55:34 -08:00
Debanjum	5aa5cb1941	Add "New" section with latest updates to Readme	2024-11-23 01:36:50 -08:00
sabaimran	7f5bf35806	Disambiguate renewal_date type. Previously, being used as None, False, and Datetime in different places.	2024-11-22 12:06:20 -08:00
sabaimran	5e8c824ecc	Improve the experience for finding past conversation - add a conversation title search filter, and an agents filter, for finding conversations - in the chat session api, return relevant agent style data	2024-11-22 12:03:01 -08:00
sabaimran	a761865724	Fix handling of customer.subscription.updated event to process new renewal end date	2024-11-22 12:03:01 -08:00
sabaimran	6a054d884b	Add quicker/easier filtering on auth	2024-11-22 12:03:01 -08:00
Debanjum	b9a889ab69	Fix Khoj responses when code generated charts in response context The current fix should improve Khoj responses when charts in response context. It truncates code context before sharing with response chat actors. Previously Khoj would respond with it not being able to create chart but then have a generated chart in its response in default mode. The truncate code context was added to research chat actor for decision making but it wasn't added to conversation response generation chat actors. When khoj generated charts with code for its response, the images in the context would exceed context window limits. So the truncation logic would drop all past context, including chat history, context gathered for current response. This would result in chat response generator 'forgetting' all for the current response when code generated images, charts in response context.	2024-11-21 14:43:52 -08:00
Debanjum	5475a262d4	Move truncate code context func for reusability across modules It needs to be used across routers and processors. It being in run_code tool makes it hard to be used in other chat provider contexts due to circular dependency issues created by send_message_to_model_wrapper func	2024-11-21 14:27:39 -08:00
Debanjum	f434c3fab2	Fix toggling prompt tracer on/off in Khoj via PROMPTRACE_DIR env var Previous changes to depend on just the PROMPTRACE_DIR env var instead of KHOJ_DEBUG or verbosity flag was partial/incomplete. This fix adds all the changes required to only depend on the PROMPTRACE_DIR env var to enable/disable prompt tracing in Khoj.	2024-11-21 14:06:00 -08:00
Debanjum	4a40cf79c3	Add docs on how to cross-device access self-hosted khoj using tailscale	2024-11-21 11:07:18 -08:00
Debanjum	1f96c13f72	Enable starting khoj uvicorn server with ssl cert file, key for https Pass your domain cert files via the --sslcert, --sslkey cli args. For example, to start khoj at https://example.com, you'd run command: KHOJ_DOMAIN=example.com khoj --sslcert example.com.crt --sslkey example.com.key --host example.com This sets up ssl certs directly with khoj without requiring a reverse proxy like nginx to serve khoj behind https endpoint for simple setups. More complex setups should, of course, still use a reverse proxy for efficient request processing	2024-11-21 11:07:18 -08:00
sabaimran	9fea02f20f	In telemetry, differentiate create_user google and email	2024-11-21 11:01:37 -08:00
sabaimran	9db885b5f7	Limit access to chat models to futurist users	2024-11-21 07:53:24 -08:00
sabaimran	7a00a07398	Add trailing slash to Ollama url in docs	2024-11-21 07:48:18 -08:00
sabaimran	3519dd76f0	Fix type of excalidraw image response	2024-11-20 19:01:13 -08:00
sabaimran	467de76fc1	Improve the image diagramming prompts and response parsing	2024-11-20 18:59:40 -08:00
Debanjum	50d8405981	Enable khoj to use terrarium code sandbox as tool in eval workflow	2024-11-20 14:19:27 -08:00
Debanjum	2203236e4c	Update desktop app dependencies	2024-11-20 13:05:55 -08:00
Debanjum	409204917e	Update documentation website dependencies	2024-11-20 13:05:32 -08:00
Debanjum	6f1adcfe67	Track Usage Metrics in Chat API. Track Running Cost, Accuracy in Evals (#985 ) - Track, return cost and usage metrics in chat api response Track input, output token usage and cost of interactions with openai, anthropic and google chat models for each call to the khoj chat api - Collect, display and store costs & accuracy of eval run currently in progress This provides more insight into eval runs during execution instead of having to wait until the eval run completes.	2024-11-20 12:59:44 -08:00
Debanjum	ffbd0ae3a5	Fix eval github workflow to run on releases, i.e on tags push	2024-11-20 12:57:42 -08:00
Debanjum	ed364fa90e	Track running costs & accuracy of eval runs in progress Collect, display and store running costs & accuracy of eval run. This provides more insight into eval runs during execution instead of having to wait until the eval run completes.	2024-11-20 12:40:51 -08:00
Debanjum	bbd24f1e98	Improve dropdown menus on web app setting page with scroll & min-width - Previously when settings list became long the dropdown height would overflow screen height. Now its max height is clamped and y-scroll - Previously the dropdown content would take width of content. This would mean the menu could sometimes be less wide than the button. It felt strange. Now dropdown content is at least width of parent button	2024-11-20 12:27:13 -08:00
Debanjum	c53c3db96b	Track, return cost and usage metrics in chat api response - Track input, output token usage and cost for interactions via chat api with openai, anthropic and google chat models - Get usage metadata from OpenAI using stream_options - Handle openai proxies that do not support passing usage in response - Add new usage, end response events returned by chat api. - This can be optionally consumed by clients at a later point - Update streaming clients to mark message as completed after new end response event, not after end llm response event - Ensure usage data from final response generation step is included - Pass usage data after llm response complete. This allows gathering token usage and cost for the final response generation step across streaming and non-streaming modes	2024-11-20 12:17:58 -08:00
Debanjum	80df3bb8c4	Enable prompt tracing only when PROMPTRACE_DIR env var set Better to decouple prompt tracing from debug mode or verbosity level and require explicit, independent config to enable prompt tracing	2024-11-20 11:54:02 -08:00
Debanjum	9ab76ccaf1	Skip adding agent to chat metadata when chat unset to avoid null ref	2024-11-19 21:10:23 -08:00
Debanjum	4da0499cd7	Stream responses by openai's o1 model series, as api now supports it Previously o1 models did not support streaming responses via API. Now they seem to.	2024-11-19 21:10:23 -08:00
sabaimran	e5347dac8c	Fix base image used for prod in docs	2024-11-19 15:51:27 -08:00
sabaimran	b943069577	Fix button text, and login url in self-hosted auth docs	2024-11-19 15:50:13 -08:00
sabaimran	3b5e6a9f4d	Update authentication documentation	2024-11-19 15:45:47 -08:00
Debanjum	7bdc9590dd	Fix handling sources, output in chat actor when is automated task Remove unnecessary ```python prefix removal. It isn't being triggered in json deserialize path.	2024-11-19 13:49:27 -08:00
Debanjum	0e7d611a80	Remove ```python codeblock prefix from raw json before deserialize	2024-11-19 12:53:52 -08:00
Debanjum	001c13ef43	Upgrade web app package dependencies	2024-11-19 12:53:52 -08:00
sabaimran	4f5c1eeded	Update some of the open graph data for the documentation website	2024-11-19 11:14:46 -08:00
sabaimran	5134d49d71	Release Khoj version 1.30.1	2024-11-18 17:30:33 -08:00
sabaimran	8bdd0b26d3	Add a connections clean up decorator to all scheduled tasks	2024-11-18 17:19:36 -08:00
Debanjum	817601872f	Update default offline models enabled	2024-11-18 16:38:17 -08:00
Debanjum	45c623f95c	Dedupe, organize chat actor, director tests - Move Chat actor tests that were previously in chat director tests file - Dedupe online, offline io selector chat actor tests	2024-11-18 16:10:50 -08:00
Debanjum	2a76c69d0d	Run online, offline chat actor, director tests for any supported provider - Previously online chat actors, director tests only worked with openai. This change allows running them for any supported online provider including Google, Anthropic and Openai. - Enable online/offline chat actor, director in two ways: 1. Explicitly setting KHOJ_TEST_CHAT_PROVIDER environment variable to google, anthropic, openai, offline 2. Implicitly by the first API key found from openai, google or anthropic. - Default offline chat provider to use Llama 3.1 3B for faster, lower compute test runs	2024-11-18 15:11:37 -08:00
Debanjum	653127bf1d	Improve data source, output mode selection - Set output mode to single string. Specify output schema in prompt - Both these should encourage model to select only 1 output mode instead of encouraging it in prompt too many times - Output schema should also improve schema following in general - Standardize variable, func name of io selector for readability - Fix chat actors to test the io selector chat actor - Make chat actor return sources, output separately for better disambiguation, at least during tests, for now	2024-11-18 15:11:37 -08:00
Debanjum	e3fd51d14b	Pass user arg to create title from query in new automation flow	2024-11-18 12:58:10 -08:00
Debanjum	9e74de9b4f	Improve serializing conversation JSON to print messages on console - Handle chatml message.content with non-json serializable data like WebP image binary data used by Gemini models	2024-11-18 12:57:05 -08:00
sabaimran	3f70d2f685	Add more graceful exception handling when tool selection doesn't work	2024-11-18 09:34:49 -08:00
Debanjum	a2ccf6f59f	Fix github workflow to start Khoj, connect to PG and upload results - Do not trigger tests to run in ci on update to evals	2024-11-18 04:25:15 -08:00
Debanjum	7c0fd71bfd	Add GitHub workflow to quiz Khoj across modes and specified evals (#982) - Evaluate khoj on random 200 questions from each of google frames and openai simpleqa benchmarks across general, default and research modes - Run eval with Gemini 1.5 Flash as test giver and Gemini 1.5 Pro as test evaluator models - Trigger eval workflow on release or manually - Make dataset, khoj mode and sample size configurable when triggered via manual workflow - Enable Web search, webpage read tools during evaluation	2024-11-18 02:19:30 -08:00
sabaimran	f75085dc7a	Release Khoj version 1.30.0	2024-11-17 21:36:22 -08:00
sabaimran	c72813ba67	Merge pull request #981 from rznzippy/bugfix/980/database-connections-leakage Fix database connections leakage (#980)	2024-11-17 21:01:06 -08:00
sabaimran	7d50c6590d	Merge pull request #977 from khoj-ai/features/improve-tool-selection - JSON extract from LLMs is pretty decent now, so get the input tools and output modes all in one go. It'll help the model think through the full cycle of what it wants to do to handle the request holistically. - Make slight improvements to tool selection indicators	2024-11-17 20:08:19 -08:00
sabaimran	282f47e0d6	Add Jina documentation to readme for self-hosting	2024-11-17 17:20:28 -08:00
Debanjum	48567fd468	Do not erase partial message when generation stopped via button on web app Previously, we'd replace the generated message with an error message when message generation stopped via stop button on chat page of web app. So the partially generated message (which could be useful) gets lost. This change just stops generation, while keeping the generated response so any useful information from the partially generated message can be retrieved.	2024-11-17 16:29:18 -08:00
Debanjum	285006d6c9	Sync chat models in Khoj with OpenAI proxies (e.g. Ollama) on startup - Allows managing chat models in the OpenAI proxy service like Ollama. - Removes need to manually add, remove chat models from Khoj Admin Panel for these OpenAI compatible API services when enabled. - Khoj still maintains the chat models configs within Khoj, so they can be configured via the Khoj admin panel as usual.	2024-11-17 15:34:36 -08:00
Debanjum	4a7f5d1abe	Set API keys in docker-compose.yml to enable web search, scrape tools	2024-11-17 15:34:36 -08:00
Debanjum	d6eece63f4	Use Jina API Key of Jina web scraper if configured in DB Previously Jina search didn't need API key. Now that it does need API key, we should re-use the API key set in the Jina web scraper config, otherwise fallback to using JINA_API_KEY from environment variable, if either is present. Resolves #978	2024-11-17 15:34:14 -08:00
sabaimran	6531f24ca0	Further improvements for descriptions to LLM on modes, code, diagram, image.	2024-11-17 13:23:57 -08:00
sabaimran	0eba6ce315	When diagram generation fails, save to conversation log - Update tool name when choosing tools to execute	2024-11-17 13:23:12 -08:00
sabaimran	7e662a05f8	Merge branch 'master' of github.com:khoj-ai/khoj into features/improve-tool-selection	2024-11-17 12:26:55 -08:00
Ilya Khrustalev	00b1af8f99	Fix database connections leakage (#980)	2024-11-17 19:15:05 +01:00
Debanjum	69ef6829c1	Simplify integrating Ollama, OpenAI proxies with Khoj on first run - Integrate with Ollama or other openai compatible APIs by simply setting `OPENAI_API_BASE` environment variable in docker-compose etc. - Update docs on integrating with Ollama, openai proxies on first run - Auto populate all chat models supported by openai compatible APIs - Auto set vision enabled for all commercial models - Minor - Add huggingface cache to khoj_models volume. This is where chat models and (now) sentence transformer models are stored by default - Reduce verbosity of yarn install of web app. Otherwise hit docker log size limit & stops showing remaining logs after web app install - Suggest `ollama pull <model_name>` to start it in background	2024-11-17 02:08:20 -08:00
Debanjum	2366fa08b9	Update default vision supported & anthropic chat models on first run - Update to latest initialize with new claude 3.5 sonnet and haiku models - Update to set vision enabled for google and anthropic models by default. Previously we didn't support but we've supported this for a month or two now	2024-11-17 02:08:20 -08:00
Debanjum	23ab258d78	Improve user conversation config details on Admin panel Show user email and chat model that is associated with the user conversation config	2024-11-17 02:08:20 -08:00
Debanjum	41d9011a26	Move evaluation script into tests/evals directory This should give more space for eval scripts, results and readme	2024-11-17 02:08:20 -08:00
Debanjum	d9d5884958	Enable evaluating Khoj on the OpenAI SimpleQA bench using eval script - Just load the raw csv from OpenAI bucket. Normalize it into FRAMES format - Improve docstring for frames datasets as well - Log the load dataset perf timer at info level	2024-11-17 02:08:20 -08:00
Debanjum	eb5bc6d9eb	Remove Talc search bench from Khoj eval script	2024-11-17 02:08:20 -08:00
Debanjum	fc45aceecf	Delete unused favicon ico in old web app directory	2024-11-17 02:08:20 -08:00
Debanjum	a16fc3ade8	Only add /research prefix when no slash command in message on web app - Explicitly adding a slash command is a higher priority intent than research mode being enabled in the background. Respect that for a more intuitive UX flow. - Explicit slash commands do not currently work in research mode. You've to turn research mode off to use other slash commands. This is strange, unnecessary given intent priority is clear.	2024-11-17 02:08:20 -08:00
sabaimran	a1b4587b34	Remove extract_images flag from PDF loader	2024-11-15 21:46:35 -08:00
sabaimran	15b4cec1e8	Add documentation for how to use the text to image model configs, reduce to Replicate	2024-11-15 15:26:14 -08:00
sabaimran	759873ec44	Add documentation for how to use the text to image model configs	2024-11-15 15:22:06 -08:00
sabaimran	c77dc84a68	Remove output_modes function reference in chat tests	2024-11-15 14:03:07 -08:00
sabaimran	e3f1ea9dee	Improve tool, output mode selection process - JSON extract from LLMs is pretty decent now, so get the input tools and output modes all in one go. It'll help the model think through the full cycle of what it wants to do to handle the request holistically. - Make slight improvements to tool selection indicators	2024-11-15 13:53:53 -08:00
sabaimran	c1a5b32ebf	Do not start server when importing the main.py file, unless gunicorn - Add more graceful shutdown when closing bg scheduler thread	2024-11-14 17:36:51 -08:00
sabaimran	be3ee5ec9f	Add cool new suggestion cards for math, diagramming	2024-11-14 17:36:51 -08:00
Debanjum	9fc44f1a7f	Enable evaluation Khoj on the Talc Search Bench using Eval script - Just load the raw jsonl from Github and normalize it into FRAMES format - Color printed accuracy in eval script to blue for readability	2024-11-13 22:50:14 -08:00
Debanjum	8e009f48ce	Show tool call error in next iteration. Allow rerun if model requests. Previously errors would get eaten up but the model wouldn't see anything. And the model wouldn't be allowed to re-run the same query-tool combination in the next iteration. This update should give it insight into why it didn't get a result. So it can make an informed (hopefully better) decision on what to do next. And re-run the previous query if appropriate.	2024-11-13 22:50:14 -08:00
Debanjum	604da90fa8	Wrap try/catch around online search in research mode like other tools Previously when call to online search API etc. failed, it'd error out of response to query in research mode. Khoj should skip tool use that iteration but continue to try respond.	2024-11-13 16:46:09 -08:00
Debanjum	8851b5f78a	Standardize chat message truncation and serialization before print Previously chatml messages were just strings, now they can be list of strings or list of dicts as well. - Use json serialization to manage their variations and truncate them before printing for context. - Put logic in single function for use across chat models	2024-11-13 16:30:17 -08:00
Debanjum	f4e37209a2	Improve error handling, display and configurability of eval script - Default to evaluation decision of None when either agent or evaluator llm fails. This fixes accuracy calculations on errors - Fix showing color for decision True - Enable arg flags to specify output results file paths	2024-11-13 14:32:22 -08:00
Debanjum	15b0cfa3dd	Improve structured message truncation in logger Previously chatml messages were just strings. Since gemini, anthropic models always have messages as list of strings, truncate those strings instead of the list of message content	2024-11-13 14:32:22 -08:00
Debanjum	153ae8bea9	Cut binary, long output files from code result for context efficiency Removing binary data and truncating large data in output files generated by code runs should improve speed and cost of research mode runs with large or binary output files. Previously binary data in code results was passed around in iteration context during research mode. This made the context inefficient because models have limited efficiency and reasoning capabilities over b64 encoded image (and other binary) data and would hit context limits leading to unnecessary truncation of other useful context Also remove image data when logging output of code execution	2024-11-13 14:32:22 -08:00
sabaimran	de34cc3987	Remove og image url with khoj documentation	2024-11-13 10:23:02 -08:00
sabaimran	4a1b1e8b9a	Add support for interrupting messages after they've been sent.	2024-11-12 22:22:45 -08:00
sabaimran	d607ad7a27	Release Khoj version 1.29.1	2024-11-12 10:32:56 -08:00
sabaimran	8ec1764e42	Handle size calculation more gracefully for converted documents, depending on type	2024-11-12 02:00:29 -08:00
sabaimran	b6714c202f	Increase the title character limit to 500 for conversations	2024-11-12 01:51:19 -08:00
sabaimran	f05e64cf8c	Release Khoj version 1.29.0	2024-11-11 21:46:25 -08:00
sabaimran	47d3c8c235	Remove email query parameter from subscription patch api	2024-11-11 21:39:49 -08:00
sabaimran	d7027109a5	Add null handling for response output_files in code output	2024-11-11 21:14:56 -08:00
sabaimran	d68243a3fb	Revert clean_json logic temporarily. Eventually, we should do better validation here to extract markdown-formatted json.	2024-11-11 21:05:17 -08:00
sabaimran	1cab6c081f	Add better error handling for diagram output, and fix chat history construct - Make the `clean_json` method more robust as well	2024-11-11 20:44:19 -08:00
sabaimran	7bd2f83f97	Wrap test in suggestionCard	2024-11-11 20:12:46 -08:00
Debanjum	48862a8400	Enable Passing External Documents for Analysis in Code Sandbox (#960) - Allow passing user files as input into code sandbox for analysis - Update prompt to give more example of complex, multi-line code - Simplify logic for model. Run one program at a time, instead of allowing model to run multiple programs in parallel - Show Code generated charts and docs in Reference pane of web app and make them downloadable	2024-11-11 19:37:17 -08:00
Debanjum	5078ac0ce2	Await on conversation save when generate conversation title via API	2024-11-11 19:17:39 -08:00
Debanjum	e1d0015248	Allow disabling Khoj telemetry via KHOJ_TELEMETRY_DISABLE env var	2024-11-11 19:17:39 -08:00
Debanjum	a52500d289	Show generated code artifacts before notes and online references	2024-11-11 18:00:22 -08:00
Debanjum	218eed83cd	Show output file not code on hover. Remove reference card title border	2024-11-11 18:00:22 -08:00
Debanjum	b970cfd4b3	Align styling of reference panel card across code, docs, web results - Add a border below heading - Show code snippet in pre block - Overflow-x when reference side panel open to allow seeing whole text via x-scroll - Align header, body position of reference cards with each other - Only show filename in doc reference cards at message bottom. Show full file path in hover and reference side panel	2024-11-11 18:00:22 -08:00
Debanjum	8e9f4262a9	Render code output files with code references in reference section - Improve rendering code reference with better icons, smaller text and different line clamps for better visibility - Show code output files as sub card of code card in reference section - Allow downloading files generated by code instead of rendering it in chat message directly - Show executed code before online references in reference panel	2024-11-11 18:00:22 -08:00
Debanjum	92c1efe6ee	Fixes to render & save code context with non text based output modes - Fix to render code generated chart with images, excalidraw diagrams - Fix to save code context to chat history in image, diagram output modes - Fix bug in image markdown being wrapped twice in markdown syntax - Render newline in code references shown on chat page of web app Previously newlines weren't getting rendered. This made the code executed by Khoj hard to read in references. This change fixes that. `dangerouslySetInnerHTML` usage is justified as rendered code snippet is being sanitized by DOMPurify before rendering.	2024-11-11 18:00:22 -08:00
Debanjum	af0215765c	Decode code text output files from b64 to str to ease client processing	2024-11-11 18:00:22 -08:00
Debanjum	7b39f2014a	Enable analysing user documents in code sandbox and other improvements - Run one program at a time, instead of allowing model to pass multiple programs to run in parallel to simplify logic for model - Update prompt to give more example of complex, multi-line code - Allow passing user files as input into code sandbox for analysis - Log code execution timer at info level to evaluate execution latencies in production - Type the generated code for easier processing by caller functions	2024-11-11 17:59:37 -08:00
sabaimran	dc109559d4	Research mode gray when off, colored when on	2024-11-11 16:35:07 -08:00
sabaimran	cdda9c2e73	Improve text wrapping for attached files and preview context For the research mode toggle, make it not fill when it's off	2024-11-11 13:32:10 -08:00
sabaimran	dd36303bb7	Fix sending file attachments in save_to_conversation method - When files attached but upload fails, don't update the state variables - Make removing null characters in pdf extraction more space efficient	2024-11-11 12:53:06 -08:00
Debanjum	ba2471dc02	Do not CRUD on entries, files & conversations in DB for null user (#958) Increase defense-in-depth by reducing paths to create, read, update or delete entries, files and conversations in DB when user is unset.	2024-11-11 12:47:22 -08:00
Debanjum	536fe994be	Remove unused db adapter methods, like for fact checker data store	2024-11-11 12:22:34 -08:00
Debanjum	10bca6fa8f	Convert required user param check into decorator. Use with more adapters	2024-11-11 12:22:32 -08:00
Debanjum	ff5c10c221	Do not CRUD on entries, files & conversations in DB for null user Increase defense-in-depth by reducing paths to create, read, update or delete entries, files and conversations in DB when user is unset.	2024-11-11 12:20:07 -08:00
sabaimran	27fa39353e	Make custom agent creation flow available to everyone - For private agents, add guardrails to prevent against any misuse or violation of terms of service.	2024-11-11 11:54:59 -08:00
sabaimran	b563f46a2e	Merge pull request #957 from khoj-ai/features/include-full-file-in-convo-with-filter Support including file attachments in the chat message Now that models have much larger context windows, we can reasonably include full texts of certain files in the messages. Do this when an explicit file filter is set in a conversation. Do so in a separate user message in order to mitigate any confusion in the operation. Pipe the relevant attached_files context through all methods calling into models. This breaks certain prior behaviors. We will no longer automatically be processing/generating embeddings on the backend and adding documents to the "brain". You'll have to go to settings and go through the upload documents flow there in order to add docs to the brain (i.e., have search include them during question / response).	2024-11-11 11:34:42 -08:00
sabaimran	2bb2ff27a4	Rename attached_files to query_files. Update relevant backend and client-side code.	2024-11-11 11:21:26 -08:00
sabaimran	47937d5148	Merge branch 'features/include-full-file-in-convo-with-filter' of github.com:khoj-ai/khoj into features/include-full-file-in-convo-with-filter	2024-11-11 09:34:08 -08:00
sabaimran	ae4eb96d48	Consolidate file name to icon mapping	2024-11-11 09:34:04 -08:00
Debanjum	7954f39633	Use accept param to file input to indicate supported file types in web app Remove unused total size calculations in chat input	2024-11-11 04:06:17 -08:00
Debanjum	4223b355dc	Use python stdlib methods to write pdf, docx to temp files for loaders Use python standard method tempfile.NamedTemporaryFile to write, delete temporary files safely.	2024-11-11 03:24:50 -08:00
Debanjum	fd15fc1e59	Move construct chat history back to its original position in file Keeping function where it originally was allows tracking diffs and change history more easily	2024-11-11 03:24:50 -08:00
Debanjum	35d6c792e4	Show snippet of truncated messages in debug logs to avoid log flooding	2024-11-11 02:30:38 -08:00
sabaimran	8805e731fd	Merge branch 'master' of github.com:khoj-ai/khoj into features/include-full-file-in-convo-with-filter	2024-11-10 19:24:11 -08:00
sabaimran	a5e2b9e745	Exit early when running an automation if the conversation for the automation does not exist.	2024-11-10 19:22:21 -08:00
sabaimran	55200be4fa	Apply agent color fill to the toggle both in off and on states	2024-11-10 19:16:43 -08:00
Debanjum	7468f6a6ed	Deduplicate online references returned by chat API to clients This will ensure only unique online references are shown in all clients. The duplication issue was exacerbated in research mode as even with different online search queries, you can get previously seen results. This change does a global deduplication across all online results seen across research iterations before returning them in client response.	2024-11-10 16:10:32 -08:00
Debanjum	137687ee49	Deduplicate searches in normal mode & across research iterations - Deduplicate online, doc search queries across research iterations. This avoids running previously run online, doc searches again and dedupes online, doc context seen by model to generate response. - Deduplicate online search queries generated by chat model for each user query. - Do not pass online, docs, code context separately when generate response in research mode. These are already collected in the meta research passed with the user query - Improve formatting of context passed to generate research response - Use xml tags to delimit context. Pass per iteration queries in each iteration result - Put user query before meta research results in user message passed for generating response These deduplications will improve speed, cost & quality of research mode	2024-11-10 16:10:32 -08:00
Debanjum	306f7a2132	Show error in picking next tool to researcher llm in next iteration Previously the whole research mode response would fail if the pick next tool call to chat model failed. Now instead of it completely failing, the researcher actor is told to try again in next iteration. This allows for a more graceful degradation in answering a research question even if a (few?) calls to the chat model fail.	2024-11-10 14:52:02 -08:00
Debanjum	eb492f3025	Only keep webpage content requested, even if Jina API gets more data Jina search API returns content of all webpages in search results. Previously code wouldn't remove content beyond max_webpages_to_read limit set. Now, webpage content in organic results are explicitly removed beyond the requested max_webpage_to_read limit. This should align behavior of online results from Jina with other online search providers. And restrict llm context to a reasonable size when using Jina for online search.	2024-11-10 14:51:16 -08:00
Debanjum	8ef7892c5e	Exclude non-dictionary doc context from chat history sent to chat models This fixes chat with old chat sessions. Fixes issue with old Whatsapp users can't chat with Khoj because chat history doc context was stored as a list earlier	2024-11-10 14:51:16 -08:00
Debanjum	d892ab3174	Fix handling of command rate limit and improve rate limit messages Command rate limit wouldn't be shown to user as server wouldn't be able to handle HTTP exception in the middle of streaming. Catch exception and render it as LLM response message instead for visibility into command rate limiting to user on client Log rate limit messages for all rate limit events on server as info messages Convert exception messages into first person responses by Khoj to prevent breaking the third wall and provide more details on what happened and possible ways to resolve them.	2024-11-10 14:51:16 -08:00
Debanjum	80ee35b9b1	Wrap messages in web, obsidian UI to stay within screen when long links Wrap long links etc. in chat messages and train of thought lists on web app and obsidian plugin by breaking them into newlines by word	2024-11-10 14:49:51 -08:00
Debanjum	f967bdf702	Show correct example index being currently processed in frames eval Previously the batch start index wasn't being passed so all batches started in parallel were showing the same processing example index This change doesn't impact the evaluation itself, just the index shown of the example currently being evaluated	2024-11-10 14:49:51 -08:00
Debanjum	84a8088c2b	Only evaluate non-empty responses to reduce eval script latency, cost Empty responses by Khoj will always be an incorrect response, so no need to make call to an evaluator agent to check that	2024-11-10 14:49:51 -08:00
sabaimran	170d959feb	Handle offline messages differently, as they don't respond well to the structured messages	2024-11-09 19:52:46 -08:00
sabaimran	2c543bedd7	Add typing to the constructed messages listed	2024-11-09 19:40:27 -08:00
sabaimran	79b15e4594	Only add images when they're present and vision enabled	2024-11-09 19:37:30 -08:00
sabaimran	bd55028115	Fix randint import from random when creating filenames for tmp	2024-11-09 19:17:18 -08:00
sabaimran	92b6b3ef7b	Add attached files to latest structured message in chat ml format	2024-11-09 19:17:00 -08:00
sabaimran	835fa80a4b	Allow docx conversion in the chatFunction.ts	2024-11-09 18:51:00 -08:00
sabaimran	459318be13	Add random suffixes to decrease any clash probability when writing tmp files to disk	2024-11-09 18:46:34 -08:00
sabaimran	dbf0c26247	Remove _summary_ description in function descriptions	2024-11-09 18:42:42 -08:00
sabaimran	e5ac076fc4	Move construct_chat_history method back to conversation.utils.py	2024-11-09 18:27:46 -08:00
sabaimran	bc95a99fb4	Make tracer the last input parameter for all the relevant chat helper methods	2024-11-09 18:22:46 -08:00
sabaimran	ceb29eae74	Add phone number verification and remove telemetry update call from place where authentication middleware isn't yet installed (in the middleware itself).	2024-11-09 12:25:36 -08:00
sabaimran	3badb27744	Remove stored uploaded files after they're processed.	2024-11-08 23:28:02 -08:00
sabaimran	78630603f4	Delete the fact checker application	2024-11-08 17:27:42 -08:00
sabaimran	807687a0ac	Automatically generate titles for conversations from history	2024-11-08 16:02:34 -08:00
sabaimran	7159b0b735	Enforce limits on file size when converting to text	2024-11-08 15:27:28 -08:00
sabaimran	4695174149	Add support for file preview in the chat input area (before message sent)	2024-11-08 15:12:48 -08:00
sabaimran	ad46b0e718	Label pages when extract text from pdf, docs content. Fix scroll area in doc preview.	2024-11-08 14:53:20 -08:00
sabaimran	ee062d1c48	Fix parsing for PDFs via content indexing API	2024-11-07 18:17:29 -08:00
sabaimran	623a97a9ee	Merge branch 'master' of github.com:khoj-ai/khoj into features/include-full-file-in-convo-with-filter	2024-11-07 17:18:23 -08:00
sabaimran	33498d876b	Simplify the share chat page. Don't need it to maintain its own conversation history - When chatting on a shared page, fork and redirect to a new conversation page	2024-11-07 17:14:11 -08:00
sabaimran	4b8be55958	Convert UUID to string when forking a conversation	2024-11-07 17:13:04 -08:00
sabaimran	9bbe27fe36	Set default value of attached files to empty list	2024-11-07 17:12:45 -08:00
sabaimran	3a51996f64	Process attached files in the chat history and add them to the chat message	2024-11-07 16:06:58 -08:00
sabaimran	a89160e2f7	Add support for converting an attached doc and chatting with it - Document is first converted in the chatinputarea, then sent to the chat component. From there, it's sent in the chat API body and then processed by the backend - We couldn't directly use a UploadFile type in the backend API because we'd have to convert the api type to a multipart form. This would require other client side migrations without uniform benefit, which is why we do it in this two-phase process. This also gives us capacity to repurpose the more generic interface down the road.	2024-11-07 16:06:37 -08:00
sabaimran	e521853895	Remove unnecessary console.log statements	2024-11-07 16:03:31 -08:00
sabaimran	92c3b9c502	Add function to get an icon from a file type	2024-11-07 16:02:53 -08:00
sabaimran	140c67f6b5	Remove focus ring from the text area component	2024-11-07 16:02:02 -08:00
sabaimran	b8ed98530f	Accept attached files in the chat API - weave through all subsequent subcalls to models, where relevant, and save to conversation log	2024-11-07 16:01:48 -08:00
sabaimran	ecc81e06a7	Add separate methods for docx and pdf files to just convert files to raw text, before further processing	2024-11-07 16:01:08 -08:00
sabaimran	394035136d	Add an api that gets a document, and converts it to just text	2024-11-07 16:00:10 -08:00
sabaimran	3b1e8462cd	Include attach files in calls to extract questions	2024-11-07 15:59:15 -08:00
sabaimran	de73cbc610	Add support for relaying attached files through backend calls to models	2024-11-07 15:58:52 -08:00
Debanjum	4cad96ded6	Add Script to Evaluate Khoj on Google's FRAMES benchmark (#955) - Why We need better, automated evals to measure performance shifts of Khoj across prompt, model and capability changes. Google's FRAMES benchmark evaluates multi-step retrieval and reasoning capabilities of AI agents. It's a good starter benchmark to evaluate Khoj. - Details This PR adds an eval script to evaluate Khoj responses on the FRAMES benchmark prompts against the ground truth provided by it. Script allows configuring sample size, batch size, sampling queries from the eval dataset. Gemini is used as an LLM Judge to auto grade Khoj responses vs ground truth data from the benchmark.	2024-11-06 17:52:01 -08:00
Debanjum	8679294bed	Remove need to set server chat settings from use openai proxies docs This was previously required, but now it's only useful for more advanced settings, not typical for self-hosting users. With recent updates, the user's selected chat model is used for both Khoj's train of thought and response. This makes it easy to switch your preferred chat model directly from the user settings page and not have to update this in the admin panel as well. Reflect these code changes in the docs, by removing the unnecessary step for self-hosted users to create a server chat setting when using an OpenAI proxy service like Ollama, LiteLLM etc.	2024-11-05 17:10:53 -08:00
Debanjum	05a93fcbed	v-align attach, send buttons with chat input text area on web app Otherwise, those buttons look off-center when images are attached to the chat input area	2024-11-05 17:10:53 -08:00
sabaimran	a0480d5f6c	use fill weight for the toggle right (enabled state) for research mode	2024-11-04 22:01:09 -08:00
sabaimran	dc26da0a12	Add uploaded files in the conversation file filter for a new convo	2024-11-04 22:00:47 -08:00
Debanjum	b51ee644aa	Fix escaping filename when normalizing in org node parser	2024-11-04 20:24:57 -08:00
Debanjum	5724d16a6f	Fix passing images to anthropic chat models to extract questions	2024-11-04 20:24:57 -08:00
sabaimran	cf0bcec0e7	Revert SKIP_TESTS flag in offline chat director tests	2024-11-04 19:06:54 -08:00
sabaimran	1f372bf2b1	Update file summarization unit tests now that multiple files are allowed	2024-11-04 17:45:54 -08:00
sabaimran	7543360210	Merge branch 'master' of github.com:khoj-ai/khoj into features/include-full-file-in-convo-with-filter	2024-11-04 16:55:48 -08:00
sabaimran	b6145df3be	Handle file retrieval when agent is None	2024-11-04 16:55:22 -08:00
sabaimran	3dc9139cee	Add additional handling for when file_object comes back empty	2024-11-04 16:53:07 -08:00
sabaimran	a27b8d3e54	Remove summarize condition for only 1 file filter	2024-11-04 16:51:37 -08:00
sabaimran	362bdebd02	Add methods for reading full files by name and including context Now that models have much larger context windows, we can reasonably include full texts of certain files in the messages. Do this when an explicit file filter is set in a conversation. Do so in a separate user message in order to mitigate any confusion in the operation. Pipe the relevant attached_files context through all methods calling into models. We'll want to limit the file sizes for which this is used and provide more helpful UI indicators that this sort of behavior is taking place.	2024-11-04 16:37:13 -08:00
sabaimran	e3ca52b7cb	Use .get() to get text accompanying image url, instead of subindexing	2024-11-04 16:09:16 -08:00
sabaimran	1e89baca7b	Deprecate the UserSearchModelConfig and remove all references - The server has moved to a model of standardization for the embeddings generation workflow. Remove references to the support for differentiated models. - The migration script for a new model needs to be updated to accommodate full regeneration.	2024-11-04 12:24:41 -08:00
Debanjum	1ccbf72752	Use logger instead of print to track eval	2024-11-04 00:40:26 -08:00
sabaimran	99c1d2831a	Release Khoj version 1.28.3	2024-11-02 12:23:11 -07:00
sabaimran	075b4ecf15	Call subscription_to_state with sync_to_async wrapper when getting user subscription state - This is needed in case the renewal_date is not set and we need to reset it for the user	2024-11-02 12:22:35 -07:00
sabaimran	ec44cbe1e7	Release Khoj version 1.28.2	2024-11-02 07:53:51 -07:00
Debanjum	791eb205f6	Run prompt batches in parallel for faster eval runs	2024-11-02 04:58:03 -07:00
Debanjum	96904e0769	Add script to evaluate khoj on Google's FRAMES benchmark Google's FRAMES benchmark evaluates multi-step retrieval and reasoning capabilities of an agent. The script uses Gemini as an LLM Judge to evaluate Khoj responses to the FRAMES benchmark prompts against the ground truth provided by it.	2024-11-02 04:57:42 -07:00
Debanjum	31b5fde163	Only enable prompt tracer if git python is installed	2024-11-02 02:07:02 -07:00
sabaimran	5b18dc96e0	Release Khoj version 1.28.1	2024-11-01 22:51:51 -07:00
sabaimran	8d1b1bc78e	Move the git python dependency into top level dependencies	2024-11-01 22:51:00 -07:00
Debanjum	e85dd59295	Release Khoj version 1.28.0	2024-11-01 19:06:59 -07:00
Debanjum	1f79a10541	Fix link to code execution feature in docs	2024-11-01 18:22:21 -07:00
Debanjum	cff8e02b60	Research Mode [Part 2]: Improve Prompts, Edit Chat Messages. Set LLM Seed for Reproducibility (#954) - Improve chat actors and their prompts for research mode. - Add documentation to enable the code tool when self-hosting Khoj - Edit Chat Messages - Store Turn Id in each chat message. - Expose API to delete chat message. - Expose delete chat message button to turn delete chat message from web app - Set LLM Generation Seed for Reproducible Debugging and Testing - Setting seed for LLM generation is supported by Llama.cpp and OpenAI models. This can (somewhat) restrain LLM output - Getting fixed responses for fixed inputs helps test, debug longer reasoning chains like used in advanced reasoning	2024-11-01 18:16:42 -07:00
Debanjum	14e453039d	Add prompt tracing, agent personality to infer webpage urls chat actor	2024-11-01 18:12:50 -07:00
Debanjum	ab321dc518	Expect query before tool in response to give think space in research prompt	2024-11-01 17:51:41 -07:00
Debanjum	1a83bbcc94	Clean API chat router. Move FeedbackData response type to router helper	2024-11-01 17:51:41 -07:00
sabaimran	e6eb87bbb5	Merge branch 'improve-debug-reasoning-and-other-misc-fixes' of github.com:khoj-ai/khoj into improve-debug-reasoning-and-other-misc-fixes	2024-11-01 16:48:39 -07:00
sabaimran	a213b593e8	Limit the number of urls the webscraper can extract for scraping	2024-11-01 16:48:36 -07:00
sabaimran	327fcb8f62	create defiltered query after conversation command is extracted	2024-11-01 16:48:03 -07:00
sabaimran	b79a9ec36d	Clarify description of the code evaluation environment: not for document creation	2024-11-01 16:47:27 -07:00
Debanjum	9c7b36dc69	Use standard per minute rate limits across user types	2024-11-01 16:16:06 -07:00
Debanjum	ac21b10dd5	Simplify logic to get default search model. Remove unused import	2024-11-01 15:14:00 -07:00
sabaimran	2b35790165	Merge branch 'master' of github.com:khoj-ai/khoj into improve-debug-reasoning-and-other-misc-fixes	2024-11-01 14:51:26 -07:00
Debanjum	22f3ed3f5d	Research Mode: Give Khoj the ability to perform more advanced reasoning (#952) ## Overview Khoj can now go into research mode and use a python code interpreter. These are experimental features that are being released early for feedback and testing. - Research mode allows Khoj to dynamically select the tools it needs to best answer the question. It is also allowed more iterations to get to a satisfactory answer. Its more dynamic train of thought is shown to improve visibility into its thinking. - Adding ability for Khoj to use a python code interpreter is an adjacent capability. It can help Khoj do some data analysis and generate charts for you. A sandboxed python to run code is provided using [cohere-terrarium](https://github.com/cohere-ai/cohere-terrarium?tab=readme-ov-file), [pyodide](https://pyodide.org/). ## Analysis Research mode (significantly?) improves Khoj's information retrieval for more complex queries requiring multi-step lookups but takes longer to run. It can research for longer, requiring less back-n-forth with the user to find an answer. Research mode gives most gains when used with more advanced chat models (like o1, 4o, new claude sonnet and gemini-pro-002). Smaller models improve their response quality but tend to get into repetitive loops more often. ## Next Steps - Get community feedback on research mode. What works, what fails, what is confusing, what'd be cool to have. - Tune Khoj's capabilities for longer autonomous runs and to generalize across a larger range of model sizes ## Miscellaneous Improvements - Khoj's train of thought is saved and shown for all messages, not just the latest one - Render charts generated by Khoj and code running using the code tool on the web app - Align chat input color to currently selected agent color	2024-11-01 14:46:29 -07:00
sabaimran	baa939f4ce	When running code, strip any code delimiters. Disable application json type specification in Gemini request.	2024-11-01 13:47:39 -07:00
sabaimran	8fd2fe162f	Determine if research mode is enabled by checking the conversation commands and 'linting' them in the selection phase	2024-11-01 13:12:34 -07:00
sabaimran	cead1598b9	Don't reset research mode after completing research execution	2024-11-01 13:00:11 -07:00
Debanjum	c1c779a7ef	Do not yaml format raw code results in context for LLM. It's confusing	2024-11-01 12:45:26 -07:00
sabaimran	b3dad1f393	Standardize rate limits to 1/6 ratio	2024-11-01 12:21:09 -07:00
sabaimran	23a49b6b95	Add documentation for python code execution capability	2024-11-01 12:14:33 -07:00
Debanjum	cd75151431	Do not allow auto selecting research mode as tool for now. You are required to manually turn it on. This takes longer and should be a high intent activity initiated by user	2024-11-01 12:07:52 -07:00
Debanjum	0b0cfb35e6	Simplify in research mode check in api_chat. - Dedent code for readability - Use better name for in research mode check - Continue to remove inferred summarize command when multiple files in file filter even when not in research mode - Continue to show select information source train of thought. It was removed by mistake earlier	2024-11-01 12:07:08 -07:00
sabaimran	ffa7f95559	Add template for a code sandbox to the docker-compose configuration	2024-11-01 11:50:58 -07:00
Debanjum	73750ef286	Merge branch 'master' into features/advanced-reasoning	2024-11-01 11:42:01 -07:00
sabaimran	1fc280db35	Handle case where infer_webpage_url returns no valid urls	2024-11-01 11:41:32 -07:00
Debanjum	1c920273dd	Add Prompt Tracer to Visualize, Analyze and Debug Khoj's Train of Thought (#951) ## Overview Use git to capture prompt traces of khoj's train of thought. View, analyze and debug them using your favorite git client (e.g vscode, magit). - Each commit captures an interaction with an LLM The commit writes the query, response and system message each to a separate file in the repo. The commit message captures the chat model, Khoj version and other metadata - Each conversation turn can have multiple interactions with an LLM (e.g Khoj's train of thought) - Each new conversation turn forks from and merges back into its conversation branch - Each new conversation branches from the user branch - Each new user branches from root commit on the main branch ## Usage 1. Set `KHOJ_DEBUG=true` or start khoj in very verbose mode with `khoj -vv` to turn on prompt tracing 2. Chat with Khoj as usual 3. Open the promptrace git repo to view the generated prompt traces using your favorite git porcelain. The Khoj prompt trace git repo is created at `/tmp/khoj_promptrace` by default. You can configure the prompt trace directory by setting the `PROMPTRACE_DIR` environment variable. ## Implementation - Add utility functions to capture prompt traces using git (via `gitpython`) - Make each model provider in Khoj commit their LLM interactions with promptrace - Weave chat metadata from chat API through all chat actors and commit it to the prompt trace	2024-11-01 11:33:54 -07:00
sabaimran	33d36ee58c	Add experimental notice to research mode tooltip	2024-11-01 11:00:27 -07:00
sabaimran	0145b2a366	Set usage limits on the research mode	2024-11-01 10:29:33 -07:00
sabaimran	3ea94ac972	Only include inferred-queries in chat history when present	2024-10-31 22:01:41 -07:00
sabaimran	149cbe1019	Use bottom anchor for the commandbar popover	2024-10-31 20:40:38 -07:00
sabaimran	21858acccc	Remove conversation command always in query, filter out inferred queries that were not with selected tool when going through tool selection iterations	2024-10-31 20:27:38 -07:00
sabaimran	19241805ee	Merge branch 'master' of github.com:khoj-ai/khoj into improve-debug-reasoning-and-other-misc-fixes	2024-10-31 18:20:23 -07:00
Debanjum	302bd51d17	Improve online chat actor prompt for research and normal mode - Match the online query generator prompt to match the formatting of extract questions - Separate iteration results by newline - Improve webpage and online tool descriptions	2024-10-31 18:17:12 -07:00
Debanjum	52163fe299	Improve research planner prompt to reduce looping	2024-10-31 18:17:01 -07:00
sabaimran	7ebf999688	Merge branch 'master' of github.com:khoj-ai/khoj into features/advanced-reasoning	2024-10-31 18:15:13 -07:00
sabaimran	159ea44883	Remove frame references in the diagramming prompts	2024-10-31 18:14:51 -07:00
Debanjum	89597aefe9	Json dump contents in prompt tracer to make structure discernable	2024-10-31 18:08:42 -07:00
Debanjum	5b15176e20	Only add /research prefix in research mode if not already in user query	2024-10-31 18:08:42 -07:00
sabaimran	559601dd0a	Do not exit if/else loop in research loop when notes not found	2024-10-31 13:51:10 -07:00
sabaimran	a13760640c	Only show trash can when turnId is present	2024-10-31 13:19:16 -07:00
sabaimran	8d1ecb9bd8	Add optional brew steps for docker install	2024-10-31 12:41:53 -07:00
Debanjum	adca6cbe9d	Merge branch 'master' into add-prompt-tracer-for-observability	2024-10-31 02:28:34 -07:00
Debanjum	e17dc9f7b5	Put train of thought ui before Khoj response on web app	2024-10-31 02:24:53 -07:00
Debanjum	e8e6ead39f	Fix deleting new messages generated after conversation load	2024-10-30 20:56:38 -07:00
Debanjum	cb90abc660	Resolve train of thought component needs unique key id error on web app	2024-10-30 14:00:21 -07:00
Debanjum	ca5a6831b6	Add ability to delete messages from the web app	2024-10-30 14:00:21 -07:00
Debanjum	ba15686682	Store turn id with each chat message. Expose API to delete chat turn Each chat turn is a user query, khoj response message pair	2024-10-30 14:00:21 -07:00
Debanjum	f64f5b3b6e	Handle add/delete file filter operation on non-existent conversation	2024-10-30 14:00:21 -07:00
Debanjum	b3a63017b5	Support setting seed for reproducible LLM response generation Anthropic models do not support seed. But offline, gemini and openai models do. Use these to debug and test Khoj via KHOJ_LLM_SEED env var	2024-10-30 14:00:21 -07:00
Debanjum	d44e68ba01	Improve handling embedding model config from admin interface - Allow server to start if loading embedding model fails with an error. This allows fixing the embedding model config via admin panel. Previously server failed to start if embedding model was configured incorrectly. This prevented fixing the model config via admin panel. - Convert boolean string in config json to actual booleans when passed via admin panel as json before passing to model, query configs - Only create default model if no search model configured by admin. Return first created search model if its been configured by admin.	2024-10-30 14:00:21 -07:00
Debanjum	358a6ce95d	Defer turning cursor color to selected agents color for later Capability exists but idea needs to be investigated further	2024-10-30 14:00:21 -07:00
Debanjum	2ac840e3f2	Make cursor in chat input take on selected agent color	2024-10-30 14:00:21 -07:00
Debanjum	1448b8b3fc	Use 3rd person for user in research prompt to reduce person confusion Models were getting a bit confused about who is searching for whose information. Using third person to explicitly call out on whose behalf these searches are running seems to perform better across models (gemini's, gpt etc.), even if the role of the message is user.	2024-10-30 13:49:48 -07:00
Debanjum	b8c6989677	Separate example from actual question in extract question prompt	2024-10-30 13:49:48 -07:00
Debanjum	86ffd7a7a2	Handle \n, dedupe json cleaning into single function for reusability Use placeholder for newline in json object values until json parsed and values extracted. This is useful when research mode models outputs multi-line codeblocks in queries etc.	2024-10-30 13:49:48 -07:00
Debanjum	83ca820abe	Encourage Anthropic models to output json object using { prefill Anthropic API doesn't have ability to enforce response with valid json object, unlike all the other model types. While the model will usually adhere to json output instructions. This step is meant to more strongly encourage it to just output json object when response_type of json_object is requested.	2024-10-30 13:49:48 -07:00
Debanjum	dc8e89b5de	Pass tool AIs iteration history as chat history for better context Separate conversation history with user from the conversation history between the tool AIs and the researcher AI. Tools AIs don't need top level conversation history, that context is meant for the researcher AI. The invoked tool AIs need previous attempts at using the tool in this research runs iteration history to better tune their next run. Or at least that is the hypothesis to break the models looping.	2024-10-30 13:49:48 -07:00
Debanjum	d865994062	Rename code tool arg `previous_iteration_history` to `context`	2024-10-30 13:49:48 -07:00
Debanjum	06aeca2670	Make researcher, docs search AIs ask more diverse retrieval questions Models weren't generating a diverse enough set of questions. They'd do minor variations on the original query. What is required is asking queries from a bunch of different lenses to retrieve the requisite information. This prompt update shows the AIs the breadth of questions to ask by example and instruction. Seems like performance improved based on vibes	2024-10-30 13:49:48 -07:00
Debanjum	01881dc7a2	Revert "Make extract question prompt in 1st person wrt user as its a user message" This reverts commit 6d3602798aa1b95a30c557576fd4f93ddef2ae76.	2024-10-30 13:49:48 -07:00
Debanjum	3e695df198	Make extract question prompt in 1st person wrt user as its a user message Divide Example from Actual chat history section in prompt	2024-10-30 13:49:48 -07:00
Debanjum	a3751d6a04	Make extract relevant information system prompt work for any document Previously it was too strongly tuned for extracting information from only webpages. This shouldn't be necessary	2024-10-30 13:49:48 -07:00
Debanjum	a39e747d07	Improve passing user name in pick next research tool prompt	2024-10-30 13:49:48 -07:00
Debanjum	deff512baa	Improve research mode prompts to reduce looping, increase webpage reads	2024-10-30 13:49:48 -07:00
Debanjum	d3184ae39a	Simplify storing and displaying document results in research mode - Mention count of notes and files discovered - Store query associated with each compiled reference retrieved for easier referencing	2024-10-30 13:49:48 -07:00
Debanjum	8bd94bf855	Do not use a message branch if no msg id provided to prompt tracer	2024-10-30 13:49:48 -07:00
sabaimran	b63fbc5345	Add a simple badge to the dropdown menu that shows subscription status	2024-10-30 13:00:16 -07:00
sabaimran	82f3d79064	Merge branch 'master' of github.com:khoj-ai/khoj into features/advanced-reasoning	2024-10-30 11:32:10 -07:00
sabaimran	2b2564257e	Handle subscription case where it's set to trial, but renewal_date is not set. set the renewal_date for LENGTH_OF_FREE_TRIAL days from subscription creation.	2024-10-30 11:05:31 -07:00
Debanjum	9935d4db0b	Do not use a message branch if no msg id provided to prompt tracer	2024-10-28 17:50:27 -07:00
Debanjum	d184498038	Pass context in separate message from user query to research chat actor	2024-10-28 15:37:28 -07:00
Debanjum	d75ce4a9e3	Format online, notes, code context with YAML to be legible for LLM	2024-10-28 15:37:28 -07:00
sabaimran	5bea0c705b	Use break-words in the train of thought for better formatting	2024-10-28 15:36:06 -07:00
sabaimran	1f1b182461	Automatically carry over research mode from home page to chat - Improve mobile friendliness with new research mode toggle, since chat input area is now taking up more space - Remove clunky title from the suggestion card - Fix fk lookup error for agent.creator	2024-10-28 15:29:24 -07:00
sabaimran	ebaed53069	Merge branch 'master' of github.com:khoj-ai/khoj into features/advanced-reasoning	2024-10-28 12:39:00 -07:00
sabaimran	889dbd738a	Add keyword diagram to diagram output mode description	2024-10-28 12:20:46 -07:00
Debanjum	50ffd7f199	Merge branch 'master' into features/advanced-reasoning	2024-10-28 04:10:59 -07:00
Debanjum	a5d0ca6e1c	Use selected agent color to theme the chat input area on home page	2024-10-28 03:47:40 -07:00
Debanjum	aad7528d1b	Render slash commands popup below chat input text area on home page	2024-10-28 02:06:04 -07:00
Debanjum	3e17ab438a	Separate notes, online context from user message sent to chat models (#950) Overview --- - Put context into separate user message before sending to chat model. This should improve model response quality and truncation logic in code - Pass online context from chat history to chat model for response. This should improve response speed when previous online context can be reused - Improve format of notes, online context passed to chat models in prompt. This should improve model response quality Details --- The document, online search context are now passed as separate user messages to chat model, instead of being added to the final user message. This will improve - Models ability to differentiate data from user query. That should improve response quality and reduce prompt injection probability - Make truncation logic simpler and more robust When context window hit, can simply pop messages to auto truncate context in order of context, user, assistant message for each conversation turn in history until reach current user query The complex, brittle logic to extract user query from context in last user message isn't required.	2024-10-28 02:03:18 -07:00
Debanjum	8ddd70f3a9	Put context into separate message before sending to offline chat model Align context passed to offline chat model with other chat models - Pass context in separate message for better separation between user query and the shared context - Pass filename in context - Add online results for webpage conversation command	2024-10-28 00:22:21 -07:00
Debanjum	ee0789eb3d	Mark context messages with user role as context role isn't being used Context role was added to allow changing message truncation order based on context role as well. Revert it for now since this is not currently being done.	2024-10-28 00:04:14 -07:00
Debanjum	4e39088f5b	Make agent name in home page carousel not text wrap on mobile	2024-10-27 23:03:53 -07:00
Debanjum	94074b7007	Focus chat input on toggle research mode. v-align it with send button	2024-10-27 22:54:55 -07:00
sabaimran	a691ce4aa6	Batch entries into smaller groups to process	2024-10-27 20:43:41 -07:00
sabaimran	2924909692	Add a research mode toggle to the chat input area	2024-10-27 16:37:40 -07:00
sabaimran	68499e253b	Auto-collapse train of thought, show after chat response in history	2024-10-27 15:48:13 -07:00
sabaimran	101ea6efb1	Add research mode as a slash command, remove from default path	2024-10-27 15:47:44 -07:00
sabaimran	0bd78791ca	Let user exit from command mode with esc, click out, etc.	2024-10-27 15:01:49 -07:00
sabaimran	a121d67b10	Persist the train of thought in the conversation history	2024-10-26 23:46:15 -07:00
sabaimran	9e8ac7f89e	Fix input/output mismatches in the /summarize command	2024-10-26 16:37:58 -07:00
sabaimran	e4285941d1	Use the advanced chat model if the user is subscribed	2024-10-26 16:00:54 -07:00
sabaimran	33e48aa27e	Merge branch 'add-prompt-tracer-for-observability' of github.com:khoj-ai/khoj into features/advanced-reasoning	2024-10-26 14:09:00 -07:00
sabaimran	fd71a4b086	Add better exception handling in the prompt trace logic, use default value from parameters	2024-10-26 14:08:00 -07:00
Debanjum	3e5b5ec122	Encourage model to read webpages more often after online search Previously model would rarely read webpages after webpage search. Need the model to read webpages more regularly for deeper research and to stop getting stuck in repetitive online search loops	2024-10-26 10:49:09 -07:00
Debanjum	bf96d81943	Format online results as YAML to pass it in more readable form to model Previous passing of online results as json dump in prompts was less readable for humans, and I'm guessing less readable for models (trained on human data) as well?	2024-10-26 10:49:09 -07:00
Debanjum	3e97ebf0c7	Unescape special characters in prompt traces for better readability	2024-10-26 10:49:09 -07:00
Debanjum	8af9dc3ee1	Unescape special characters in prompt traces for better readability	2024-10-26 10:45:42 -07:00
Debanjum Singh Solanky	0f3927e810	Send gathered references to client after code results calculated	2024-10-26 05:59:10 -07:00
Debanjum Singh Solanky	f04f871a72	Merge branch 'add-prompt-tracer-for-observability' of github.com:khoj-ai/khoj into features/advanced-reasoning - Start from this branches src/khoj/routers/api_chat.py Add tracer to all old and new chat actors that don't have it set when they are called. - Update the new chat actors like apick next tool etc to use tracer too	2024-10-26 05:56:13 -07:00
Debanjum Singh Solanky	ddc6ccde2d	Merge branch 'master' into features/advanced-reasoning - Conflicts: Combine both sides of the conflict in all 3 files below - src/khoj/processor/conversation/utils.py - src/khoj/routers/helpers.py - src/khoj/utils/helpers.py	2024-10-26 05:15:51 -07:00
Debanjum Singh Solanky	ea0712424b	Commit conversation traces using user, chat, message branch hierarchy - Message train of thought forks and merges from its conversation branch - Conversation branches from user branch - User branches from root commit on the main branch - Weave chat tracer metadata from api endpoint through all chat actors and commit it to the prompt trace	2024-10-26 05:08:47 -07:00
Debanjum Singh Solanky	a3022b7556	Allow Offline Chat model calling functions to save conversation traces	2024-10-26 05:08:47 -07:00
Debanjum Singh Solanky	eb6424f14d	Allow Anthropic API calling functions to save conversation traces	2024-10-26 05:08:47 -07:00
Debanjum Singh Solanky	6fcd6a5659	Allow Gemini API calling functions to save conversation traces	2024-10-26 05:08:47 -07:00
Debanjum Singh Solanky	384f394336	Allow OpenAI API calling functions to save conversation traces	2024-10-26 04:59:21 -07:00
Debanjum Singh Solanky	10c8fd3b2a	Save conversation traces to git for visualization	2024-10-26 04:59:19 -07:00
sabaimran	7e0a692d16	Release Khoj version 1.27.1	2024-10-25 15:23:07 -07:00
sabaimran	b257fa1884	Add a None check before doing a DT comparison when getting subscription type	2024-10-25 15:22:48 -07:00
sabaimran	0f6f282c30	Release Khoj version 1.27.0	2024-10-25 14:11:14 -07:00
sabaimran	479e156168	Add to the ConversationCommand.Image description to LLM	2024-10-25 09:14:32 -07:00
sabaimran	a11b5293fb	Add uploaded images to research mode, code slash command, include code references	2024-10-24 23:56:24 -07:00
sabaimran	5acf40c440	Clean up summarization code paths Use assumption of summarization response being a str	2024-10-24 23:56:24 -07:00
sabaimran	12b32a3d04	Resolve merge conflicts	2024-10-24 23:43:55 -07:00
Debanjum	adee5a3e20	Give Vision to Anthropic models in Khoj (#948) ### Major - Give Vision to Anthropic models in Khoj ### Minor - Reuse logic to format messages for chat with anthropic models - Make the get image from url function more versatile and reusable - Encourage output mode chat actor to output only json and nothing else	2024-10-24 18:02:38 -07:00
Debanjum Singh Solanky	01d740debd	Return typed image from image_with_url function for readability	2024-10-24 17:58:46 -07:00
Debanjum Singh Solanky	37317e321d	Dedupe user location passed in image, diagram generation prompts	2024-10-24 01:03:29 -07:00
Debanjum Singh Solanky	2a32836d1a	Log more descriptive error when image gen fails with Replicate	2024-10-24 01:03:29 -07:00
sabaimran	30f9225021	Merge branch 'master' of github.com:khoj-ai/khoj into features/advanced-reasoning	2024-10-23 19:15:51 -07:00
sabaimran	5120597d4e	Remove user customized search model (#946) - Use a single standard search model across the server. There's diminishing benefits for having multiple user-customizable search models. - We may want to add server-level customization for specific tasks - Store the search model used to generate a given entry on the `Entry` object - Remove user-facing APIs and view - Add a management command for migrating the default search model on the server In a future PR (after running the migration), we'll also remove the `UserSearchModelConfig`	2024-10-23 17:38:37 -07:00
Debanjum Singh Solanky	8d588e0765	Encourage output mode chat actor to output only json and nothing else Latest claude model wanted to say more than just give the json output. The updated prompt encourages the model to output just json. This is similar to what is already being done for other prompts	2024-10-23 17:19:21 -07:00
Debanjum Singh Solanky	abad5348a0	Give Vision to Anthropic models in Khoj	2024-10-23 17:19:21 -07:00
Debanjum Singh Solanky	6fd50a5956	Reuse logic to format messages for chat with anthropic models	2024-10-23 17:19:21 -07:00
Debanjum Singh Solanky	82eac5a043	Make the get image from url function more versatile and reusable It was previously added under the google utils. Now it can be used by other conversation processors as well. The updated function - can get both base64 encoded and PIL formatted images from url - will return the media type of the image as well in response	2024-10-23 17:19:20 -07:00
sabaimran	f3ce47b445	Create explicit flow to enable the free trial (#944) * Create explicit flow to enable the free trial The current design is confusing. It obfuscates the fact that the user is on a free trial. This design will make the opt-in explicit and more intuitive. * Use the Subscription Type enum instead of hardcoded strings everywhere * Use length of free trial in the frontend code as well	2024-10-23 15:29:23 -07:00
Debanjum Singh Solanky	bc059eeb0b	Merge branch 'master' into put-retrieved-context-in-separate-chatml-message	2024-10-23 12:55:18 -07:00
Debanjum Singh Solanky	3b978b9b67	Fix chat history construction when generating chatml msgs with context	2024-10-23 12:55:12 -07:00
sabaimran	c5e91c346a	Fix Docker desktop link for Linux	2024-10-23 11:24:54 -07:00
Debanjum Singh Solanky	9f2c02d9f7	Chat with the default agent by default from web app home Had temporarily updated the default selected agent to last used. Revert for now as 1. The previous logic was buggy. It didn't select the default agent even when the last used agent was the default agent. Which would require more work. 2. It may be too early anyway to set the default agent to last used.	2024-10-23 03:43:57 -07:00
Debanjum Singh Solanky	218946edda	Fix copying message with user images on web app Adding div elements to message to render degraded text copied to clipboard for messages with user uploaded images. This change fixes that by separating message to render from message for clipboard. It ensures differently formatted forms of the user images are added to the two to allow proper rendering while still having decently formatted text copied to clipboard	2024-10-23 03:41:25 -07:00
Debanjum Singh Solanky	7d9a06c8ab	Merge branch 'master' into put-retrieved-context-in-separate-chatml-message	2024-10-23 00:13:38 -07:00
sabaimran	7c29af9745	Add link to self-hosted admin page and add docs for building front-end assets. Close #901	2024-10-22 22:42:27 -07:00
Debanjum Singh Solanky	2a50694089	Allow typing multi-line queries from a phone with Enter key Add newline instead of sending message when hit Enter key on mobile displays. As on phones shift key doesn't exist and send button is easily clickable. Limit hitting Enter key to send message to computers = larger display = expected to have full fledged keyboards.	2024-10-22 21:20:22 -07:00
Debanjum Singh Solanky	a134cd835c	Focus on chat input area to enter text after file uploads on web app	2024-10-22 21:19:17 -07:00
Debanjum	c81e708833	Show all agents, smart sorted, in carousel on home screen of web app (#943) ## Overview Allow quickly selecting, switching agents from agents pane on home page of web app ## Details - Show all agents in carousel on home screen agent pane of web app - Smart Sort 1. Pin default agent as first for ease of access 2. Show used agents by MRU for ease of access 3. Shuffle unused agents for discoverability - Select most recently used agent to chat with by default - Push smart sort logic down to API - Common logic can be reused across clients - Agent sort was previously done in web app - Focus on chat input on agent select - Double click agent on home page to open edit agent card on agents page	2024-10-22 21:18:17 -07:00
Debanjum Singh Solanky	750fbce0c2	Merge branch 'master' into improve-agent-pane-on-home-screen	2024-10-22 20:05:29 -07:00
Debanjum Singh Solanky	3be505db48	Only show type of error when image generation fails to clients Rather than showing raw error message from the underlying service as it could contain sensitive information	2024-10-22 20:03:20 -07:00
Debanjum	c6f3253ebd	Chat with Multiple Images. Support Vision with Gemini (#942) ## Overview - Add vision support for Gemini models in Khoj - Allow sharing multiple images as part of user query from the web app - Handle multiple images shared in query to chat API	2024-10-22 19:59:18 -07:00
Debanjum Singh Solanky	b3fff43542	Sanitize user attached images. Constrain chat input width on home page Set max combined images size to 20mb to allow multiple photos to be shared	2024-10-22 19:42:40 -07:00
Debanjum Singh Solanky	6c393800cc	Merge branch 'master' into multi-image-chat-and-vision-for-gemini	2024-10-22 18:38:49 -07:00
Debanjum Singh Solanky	91bbd19333	Close the agent detail hover card when scroll on agent pane	2024-10-22 18:03:17 -07:00
Debanjum Singh Solanky	110c67f083	Improve agent pill, detail card styling. Handle null chatInputRef - Remove border from agent detail hover card on home page - Do not wrap long agent names in agent pills on home page - Handle scenario where chatInputRef is null	2024-10-22 18:03:17 -07:00
Debanjum Singh Solanky	aca8bef024	Only use recent chat sessions for agent MRU. Handle null agent chats	2024-10-22 17:46:45 -07:00
sabaimran	0dad4212fa	Generate dynamic diagrams (via Excalidraw) (#940) Add support for generating dynamic diagrams in flow with Excalidraw (https://github.com/excalidraw/excalidraw). This happens in three steps: 1. Default information collection & intent determination step. 2. Improving the overall guidance of the prompt for generating a JSON, Excalidraw-compatible declaration. 3. Generation of the diagram to output to the final UI. Add support in the web UI.	2024-10-22 16:13:46 -07:00
sabaimran	1e993d561b	Release Khoj version 1.26.4	2024-10-22 13:50:08 -07:00
Debanjum Singh Solanky	e8fb79a369	Rate limit the count and total size of images shared via API	2024-10-22 04:37:54 -07:00
Debanjum Singh Solanky	39a613d3bc	Fix up openai chat actor tests	2024-10-22 03:09:36 -07:00
Debanjum Singh Solanky	0847fb0102	Pass online context from chat history to chat model for response Previously only notes context from chat history was included. This change includes online context from chat history for model to use for response generation. This can reduce need for online lookups by reusing previous online context for faster responses. But will increase overall response time when not reusing past online context, as context builds up faster per conversation. Unsure if inclusion of context is preferable. If not, both notes and online context should be removed.	2024-10-22 03:09:36 -07:00
Debanjum Singh Solanky	0c52a1169a	Put context into separate user message before sending to chat model The document, online search context are now passed as separate user messages to chat model, instead of being added to the final user message. This will improve - Models ability to differentiate data from user query. That should improve response quality and reduce prompt injection probability - Make truncation logic simpler and more robust When context window hit, can simply pop messages to auto truncate context in order of context, user, assistant message for each conversation turn in history until reach current user query The complex, brittle logic to extract user query from context in last user message isn't required. Marking the context message with assistant role doesn't translate well across chat models. E.g - Gemini can't handle consecutive messages by role = model well - Claude will merge consecutive messages by same role. In current message ordering the context message will result get merged into the previous assistant response. And if move context message after user query. The truncation logic will have to hop and skip while doing deletions - GPT seems to handle consecutive roles of any type fine Using context role = user generalizes better across chat models for now and aligns with previous behavior.	2024-10-22 03:09:36 -07:00
Debanjum Singh Solanky	7ac241b766	Improve format of notes, online context passed to chat models in prompt Improve separation of note snippets and show its origin file in notes prompt to have more readable, contextualized text shared with model. Previously the references dict was being directly passed as a string. The documents don't look well formatted and are less intelligible. - Passing file path along with notes snippets will help contextualize the notes better. - Better formatting should help with making notes more readable by the chat model.	2024-10-22 03:09:36 -07:00
sabaimran	892040972f	Replace user_id with server_id in telemetry	2024-10-21 20:47:52 -07:00
sabaimran	db959a504d	Fix the version of pymupdf to avert build errors	2024-10-21 12:56:51 -07:00
sabaimran	21e69b506d	Release Khoj version 1.26.3	2024-10-21 08:19:05 -07:00
Debanjum Singh Solanky	9b554feb91	Show agent details card on hover on agent pill on web app home page - Double click on agent to open edit agent card - Focus on chat input pane when agent selected/clicked for quick, smooth agent switch and message flow - Hover on agent to see agent detail card on non-mobile displays - Use debounce to only show when hover on card for a bit	2024-10-21 00:08:01 -07:00
Debanjum Singh Solanky	220ff1df62	Set chatInputArea forward ref from parent components for control	2024-10-21 00:02:48 -07:00
Debanjum Singh Solanky	54b92eaf73	Extract isUserSubscribed check from Agents page to make it reusable	2024-10-20 23:31:48 -07:00
Debanjum Singh Solanky	bdbe8f003e	Move agent details and edit card out into reusable components on web app	2024-10-20 23:31:47 -07:00
sabaimran	ad197be70c	Fix PDFs unit test, skip OCR	2024-10-20 22:25:41 -07:00
sabaimran	59fec37943	Improve agents management, and limit agents view to private and official agents - Default to None for the input_tools and output_modes so that they can be managed in the admin panel - Hold off on showing off all Public Agents until we have a better experience for user profiles etc.	2024-10-20 22:24:51 -07:00
sabaimran	a979457442	Add unit tests for agents - Add permutations of testing for with, without knowledge base. Private, public, different users.	2024-10-20 20:04:50 -07:00
sabaimran	fc70f25583	Release Khoj version 1.26.2	2024-10-20 18:03:36 -07:00
sabaimran	046de57571	Improve error handling when documents not searched with stack trace - Stop extract OCR content from PDFs - Only use agent knowledge base when user not provided	2024-10-20 18:03:14 -07:00
sabaimran	2b68d61fef	Release Khoj version 1.26.1	2024-10-20 16:21:51 -07:00
Debanjum Singh Solanky	5fca41cc29	Show agents sorted by mru, Select mru agent by default on web app Have get agents API return agents ordered intelligently - Put the default agent first - Sort used agents by most recently chatted with agent for ease of access - Randomly shuffle the remaining unused agents for discoverability	2024-10-20 15:21:25 -07:00
Debanjum Singh Solanky	a6bfdbdbfe	Show all agents in carousel on home screen agent pane of web app This change wraps the agent pane in a scroll area with all agents shown. It allows selecting an agent to chat with directly from the home screen without breaking flow and having to jump to the agents page. The previous flow was not convenient to quickly and consistently start chat with one of your standard agents. This was because a random subset of agents were shown on the home page. To start chat with an agent not shown on home screen load you had to open the agents page and initiate the conversation from there.	2024-10-20 15:21:25 -07:00
Debanjum Singh Solanky	9ffd726799	Allow making sync api requests with body from khoj.el	2024-10-20 15:16:40 -07:00
Debanjum Singh Solanky	ac51920859	Start conversation with Agents from within Emacs Exposes a transient switch with available agents as selectable options in the Khoj chat sub-menu. Currently shows agent slugs instead of agent names as options. This isn't the cleanest but gets the job done for now. Only new conversations with a different agent can be started. Existing conversations will continue with the original agent it was created with. The ability to switch the conversation's agent doesn't exist on the server yet.	2024-10-20 15:16:40 -07:00
Debanjum Singh Solanky	7646ac6779	Style user attached images as carousel on chat input area of web app	2024-10-20 00:40:08 -07:00
sabaimran	5d5bea6a5f	Ensure images are reset after messages processed	2024-10-19 22:02:06 -07:00
sabaimran	1ad6e1749f	Move window redirect to after relevant data is dropped in localStorage on the home page One limitation of this methodology is that localStorage has a limit in how much data it can take. Should add more graceful error handling here as well.	2024-10-19 20:36:13 -07:00
sabaimran	cb6b3ec1e9	Improve mode description given to LLM when determining how to respond. Currently experiencing difficulty instruction following when an image is shared. It's more likely to try and output an image. Update to make a clearer distinction.	2024-10-19 20:35:32 -07:00
sabaimran	545259e308	Remove unused icons in chatInputArea	2024-10-19 16:54:21 -07:00
Debanjum Singh Solanky	3cc1426edf	Style user attached images with fixed height, in a single row on web app	2024-10-19 16:48:36 -07:00
Debanjum Singh Solanky	58a331227d	Display the attached images inside the chat input area on the web app - Put the attached images display div inside the same parent div as the text area - Keep the attachment, microphone/send message buttons aligned with the text area. So the attached images just show up at the top of the text area but everything else stays at the same horizontal height as before. - This improves the UX by - Ensuring that the attached images do not obscure the agents pane above the chat input area - The attached images visually look like they are inside the actual input area, rather than floating above it. So the visual aligns with the semantics	2024-10-19 16:29:45 -07:00
Debanjum Singh Solanky	3e39fac455	Add vision support for Gemini models in Khoj	2024-10-19 15:47:03 -07:00
Debanjum Singh Solanky	0d6a54c10f	Allow sharing multiple images as part of user query from the web app Previously the web app only expected a single image to be shared by the user as part of their query. This change allows sharing multiple images from the web app. Closes #921	2024-10-19 15:47:03 -07:00
Debanjum Singh Solanky	e2abc1a257	Handle multiple images shared in query to chat API Previously Khoj could respond to a single shared image at a time. This change updates the chat API to accept multiple images shared by the user and send them to the appropriate chat actors including the openai response generation chat actor for getting an image aware response	2024-10-19 14:53:33 -07:00
Debanjum Singh Solanky	d55cba8627	Pass user query for chat response when document lookup fails Recent changes made Khoj try respond even when document lookup fails. This change missed handling downstream effects of a failed document lookup, as the defiltered_query was null and so the text response didn't have the user query to respond to. This code initializes defiltered_query to original user query to handle that. Also response_type wasn't being passed via send_message_to_model_wrapper_sync unlike in the async scenario	2024-10-19 14:32:19 -07:00
Debanjum Singh Solanky	a4e6e1d5e8	Share webp images from web, desktop, obsidian app to chat with	2024-10-19 14:32:17 -07:00
sabaimran	dbd9a945b0	Re-evaluate agent private/public filtering after authenticateddata is retrieved. Update selectedAgent check logic to reflect.	2024-10-18 09:31:56 -07:00
Debanjum Singh Solanky	35015e720e	Release Khoj version 1.26.0	2024-10-17 18:25:53 -07:00
Debanjum Singh Solanky	f0dcfe4777	Explicitly ask Gemini models to format their response with markdown Otherwise it can get confused by the format of the passed context (e.g respond in org-mode if context contains org-mode notes)	2024-10-17 18:12:47 -07:00
Debanjum	7fb4c2939d	Make Chat and Online Search Resilient and Faster (#936 ) ## Overview ### New - Support using Firecrawl(https://firecrawl.dev) to read web pages - Add, switch and re-prioritize web page reader(s) to use via the admin panel ### Speed - Improve response speed by aggregating web page read, extract queries to run only once for each web page ### Response Resilience - Fallback through enabled web page readers until web page read - Enable reading web pages on the internal network for self-hosted Khoj running in anonymous mode - Try respond even if web search, web page read fails during chat - Try respond even if document search via inference endpoint fails ### Fix - Return data sources to use if exception in data source chat actor ## Details ### Configure web page readers to use - Only the web scraper set in Server Chat Settings via the Django admin panel, if set - Otherwise use the web scrapers added via the Django admin panel (in order of priority), if set - Otherwise, use all the web scrapers enabled by settings API keys via environment variables (e.g `FIRECRAWL_API_KEY', `JINA_API_KEY' env vars set), if set - Otherwise, use Jina to web scrape if no scrapers explicitly defined For self-hosted setups running in anonymous-mode, the ability to directly read webpages is also enabled by default. This is especially useful for reading webpages in your internal network that the other web page readers will not be able to access. ### Aggregate webpage extract queries to run once for each distinct web page Previously, we'd run separate webpage read and extract relevant content pipes for each distinct (query, url) pair. Now we aggregate all queries for each url to extract information from and run the webpage read and extract relevant content pipes once for each distinct URL. Even though the webpage content extraction pipes were previously being run in parallel. They increased the response time by 1. adding more ~duplicate context for the response generation step to read 2. being more susceptible to variability in web page read latency of the parallel jobs The aggregated retrieval of context for all queries for a given webpage could result in some hit to context quality. But it should improve and reduce variability in response time, quality and costs. This should especially help with speed and quality of online search for offline or low context chat models.	2024-10-17 17:57:44 -07:00
Debanjum Singh Solanky	2c20f49bc5	Return enabled scrapers as WebScraper objects for more ergonomic code	2024-10-17 17:44:09 -07:00
Debanjum Singh Solanky	0db52786ed	Make web scraper priority configurable via admin panel - Simplifies changing order in which web scrapers are invoked to read web page by just changing their priority number on the admin panel. Previously you'd have to delete, re-add the scrapers to change their priority. - Add help text for each scraper field to ease admin setup experience - Friendlier env var to use Firecrawl's LLM to extract content - Remove use of separate friendly name for scraper types. Reuse actual name and just make actual name better	2024-10-17 17:42:42 -07:00
Debanjum Singh Solanky	20b6f0c2f4	Access internal links directly via a simple get request The other webpage scrapers will not work for internal webpages. Try access those urls directly if they are visible to the Khoj server over the network. Only enable this by default for self-hosted, single user setups. Otherwise ability to scan internal network would be a liability! For use-cases where it makes sense, the Khoj server admin can explicitly add the direct webpage scraper via the admin panel	2024-10-17 17:40:49 -07:00
Debanjum Singh Solanky	d94abba2dc	Fallback through enabled scrapers to reduce web page read failures - Set up scrapers via API keys, explicitly adding them via admin panel or enabling only a single scraper to use via server chat settings. - Use validation to ensure only valid scrapers added via admin panel Example API key is present for scrapers that require it etc. - Modularize the read webpage functions to take api key, url as args Removes dependence on constants loaded in online_search. Functions are now mostly self contained - Improve ability to read webpages by using the speed, success rate of different scrapers. Optimal configuration needs to be discovered	2024-10-17 17:40:49 -07:00
Debanjum Singh Solanky	11c64791aa	Allow changing perf timer log level. Info log time for webpage read	2024-10-17 17:40:49 -07:00
Debanjum Singh Solanky	c841abe13f	Change webpage scraper to use via server admin panel	2024-10-17 17:40:49 -07:00
Debanjum Singh Solanky	e47922e53a	Aggregate webpage extract queries to run once for each distinct webpage This should reduce webpage read and response generation time. Previously, we'd run separate webpage read and extract relevant content pipes for each distinct (query, url) pair. Now we aggregate all queries for each url to extract information from and run the webpage read and extract relevant content pipes once for each distinct url. Even though the webpage content extraction pipes were previously being run in parallel. They increased response time by 1. adding more context for the response generation chat actor to respond from 2. and by being more susceptible to page read and extract latencies of the parallel jobs The aggregated retrieval of context for all queries for a given webpage could result in some hit to context quality. But it should improve and reduce variability in response time, quality and costs.	2024-10-17 17:40:49 -07:00
Debanjum Singh Solanky	98f99fa6f8	Allow using Firecrawl to extract web page content Set the FIRECRAWL_TO_EXTRACT environment variable to true to have Firecrawl scrape and extract content from webpage using their LLM This could be faster, not sure about quality as LLM used is obfuscated	2024-10-17 17:40:49 -07:00
Debanjum Singh Solanky	993fd7cd2b	Support using Firecrawl to read webpages Firecrawl is open-source, self-hostable with a default hosted service provided, similar to Jina.ai. So it can be 1. Self-hosted as part of a private Khoj cloud deployment 2. Used directly by getting an API key from the Firecrawl.dev service This is as an alternative to Olostep and Jina.ai for reading webpages.	2024-10-17 17:40:49 -07:00
Debanjum Singh Solanky	731ea3779e	Return data sources to use if exception in data source chat actor Previously no value was returned if an exception got triggered when collecting information sources to search.	2024-10-17 17:40:49 -07:00
Debanjum Singh Solanky	a932564169	Try respond even if web search, webpage read fails during chat Khoj shouldn't refuse to respond to user if web lookups fail. It should transparently mention that online search etc. failed. But try respond as best as it can without those references This change ensures a response to the users query is attempted even when web info retrieval fails.	2024-10-17 17:40:49 -07:00
Debanjum Singh Solanky	1b04b801c6	Try respond even if document search via inference endpoint fails The huggingface endpoint can be flaky. Khoj shouldn't refuse to respond to user if document search fails. It should transparently mention that document lookup failed. But try respond as best as it can without the document references This change provides graceful failover when inference endpoint requests fail either when encoding query or reranking retrieved docs	2024-10-17 17:40:49 -07:00
Debanjum Singh Solanky	9affeb9e85	Fix to log the client app calling the chat API - Remove unused subscribed variable from the chat API - Unexpectedly dropped client app logging when migrated API chat to do advanced streaming in July	2024-10-17 15:24:43 -07:00
Debanjum Singh Solanky	c6c48cfc18	Fix arg to generate_summary_from_file and type of this_iteration	2024-10-17 13:38:48 -07:00
Debanjum Singh Solanky	884fe42602	Allow automation as an output mode supported by custom agents	2024-10-17 11:58:52 -07:00
Debanjum Singh Solanky	c5e19b37ef	Use Khoj icons. Add automation & improve agent text on web login page	2024-10-17 11:58:52 -07:00
Debanjum Singh Solanky	42acc324dc	Handle correctly setting file filters as array when API call fails - Only set addedFiles to selectedFiles when selectedFiles is an array - Only set selectedFiles, addedFiles to API response json when response succeeded. Previously we set it to response json on errors as well. This made the variables into json objects instead of arrays on API call failure - Check if selectedFiles, addedFiles are arrays before running operations on them. Previously the addedFiles.includes was where the code would fail	2024-10-17 11:58:52 -07:00
Debanjum Singh Solanky	7ebfc24a96	Upgrade Django version used by Khoj server	2024-10-17 11:58:52 -07:00
Debanjum Singh Solanky	ea59dde4a0	Upgrade documentation website dependencies	2024-10-17 11:58:52 -07:00
sabaimran	07ab8ab931	Update handling of gemini response with new API changes. Per documentation: finish_reason (google.ai.generativelanguage_v1beta.types.Candidate.FinishReason): Optional. Output only. The reason why the model stopped generating tokens. If empty, the model has not stopped generating the tokens.	2024-10-17 09:00:01 -07:00
Rehan Daphedar	27835628e6	Fix typo in docs for error 400 fix when self-hosting (#938 )	2024-10-16 23:15:43 -07:00
Debanjum Singh Solanky	19c65fb82b	Show user uuid field in django admin panel	2024-10-15 17:59:12 -07:00
Debanjum Singh Solanky	6c5b362551	Remove deprecated GET chat API endpoint	2024-10-15 15:13:09 -07:00
Debanjum Singh Solanky	931c56182e	Fix default chat model to use user model if no server chat model set - Advanced chat model should also fallback to user chat model if set - Get conversation config should fallback to user chat model if set These assume no server chat model settings are configured	2024-10-15 15:13:09 -07:00
Debanjum Singh Solanky	feb6d65ef8	Merge branch 'master' into features/advanced-reasoning	2024-10-15 09:37:56 -07:00
Debanjum Singh Solanky	336c6c3689	Show tool to use decision for next iteration in train of thought	2024-10-15 01:12:18 -07:00
Debanjum Singh Solanky	81fb65fa0a	Return data sources to use if exception in data source chat actor Previously no value was returned if an exception got triggered when collecting information sources to search.	2024-10-14 18:20:20 -07:00
Debanjum Singh Solanky	3c93f07b3f	Try respond even if web search, webpage read fails during chat Khoj shouldn't refuse to respond to user if web lookups fail. It should transparently mention that online search etc. failed. But try respond as best as it can without those references This change ensures a response to the users query is attempted even when web info retrieval fails.	2024-10-14 18:13:26 -07:00
Debanjum Singh Solanky	07ab7ebf07	Try respond even if document search via inference endpoint fails The huggingface endpoint can be flaky. Khoj shouldn't refuse to respond to user if document search fails. It should transparently mention that document lookup failed. But try respond as best as it can without the document references This change provides graceful failover when inference endpoint requests fail either when encoding query or reranking retrieved docs	2024-10-14 18:13:26 -07:00
Debanjum Singh Solanky	d6206aa80c	Remove deprecated GET chat API endpoint	2024-10-14 18:13:26 -07:00
Debanjum Singh Solanky	263eee4351	Fix default chat model to use user model if no server chat model set - Advanced chat model should also fallback to user chat model if set - Get conversation config should fallback to user chat model if set These assume no server chat model settings are configured	2024-10-14 18:13:26 -07:00
sabaimran	81aa1b5589	Update some edge cases and usability of create agent flow - Use the slug to determine which agent to PATCH - Make the agent creation form multi-step to streamline the process	2024-10-14 14:07:31 -07:00
Debanjum Singh Solanky	abcd11cfc0	Merge branch 'master' into features/advanced-reasoning	2024-10-13 03:06:23 -07:00
Debanjum Singh Solanky	9356e66b94	Fix default chat model to use user model if no server chat model set - Advanced chat model should also fallback to user chat model if set - Get conversation config should fallback to user chat model if set These assume no server chat model settings are configured	2024-10-13 03:02:29 -07:00
Debanjum Singh Solanky	9314f0a398	Fix default chat configs to use user model if no server chat model set Post merge cleanup in advanced reasoning to fallback to user chat model if no server chat model defined for advanced and default	2024-10-13 02:59:10 -07:00
Debanjum Singh Solanky	8ff13e4cf6	Update readme. Mention new capabilities	2024-10-13 01:30:53 -07:00
Debanjum Singh Solanky	a2200466b7	Merge branch 'master' into features/advanced-reasoning	2024-10-12 21:01:22 -07:00
Debanjum	c66c571396	Simplify switching chat model when self-hosting (#934 ) # Overview - Default to use user chat models for train of thought when no server chat settings created by admins - Default to not create server chat settings on first run # Details This change simplifies switching chat models for self-hosted setups by just changing the chat model on the user settings page. It falls back to use the user chat model for train of thought if server chat settings have not been created on the admin panel. Server chat settings, when set, controls the chat model used for Khoj's train of thought and the default user chat model. Previously a self-hosted user had to update 1. the server chat settings in the admin panel and 2. their own user chat model in the user settings panel to completely switch to a different chat model for both train of thought & response generation respectively You can still set server chat settings via the admin panel to use a different chat model for train of thought vs response generation. But this is only useful for advanced, multi-user setups.	2024-10-12 19:58:05 -07:00
Debanjum Singh Solanky	90888a1099	Log when new user created via magic link or whatsapp as well	2024-10-12 19:56:01 -07:00
Debanjum Singh Solanky	8222c6629d	Remove unused subscribed argument to read_webpage function	2024-10-12 10:45:39 -07:00
Debanjum Singh Solanky	9daaae0fdb	Render inline any image files output by code in message Update regex to also include any links to code generated images that aren't explicitly meant to be displayed inline. This allows folks to download the image (unlike the fake link that doesn't work created by model)	2024-10-12 10:34:57 -07:00
Debanjum Singh Solanky	20d495c43a	Update the iterative chat director prompt to generalize across chat models These prompts work across o1 and standard openai model. Works with anthropic and google models as well	2024-10-12 10:34:57 -07:00
sabaimran	eb4d598d0f	Eliminate the drawer component from the Agents view	2024-10-10 20:40:59 -07:00
sabaimran	0a1c3e4f41	Release Khoj version 1.25.0	2024-10-10 18:07:30 -07:00
sabaimran	01a58b71a5	Skip image, code generation if in research mode	2024-10-10 18:06:29 -07:00
Debanjum Singh Solanky	1b13d069f5	Pass data collected from various sources to code tool in normal flow too	2024-10-10 05:19:27 -07:00
Debanjum Singh Solanky	f462d34547	Render image files output by code interpreter in message on web app	2024-10-10 05:17:53 -07:00
Debanjum Singh Solanky	564491e164	Extract date filters quoted with non-ascii quotes in query	2024-10-10 04:45:00 -07:00
Debanjum Singh Solanky	6a8fd9bf33	Reorder embeddings search arguments based on argument importance	2024-10-10 04:45:00 -07:00
Debanjum Singh Solanky	0eacc0b2b0	Use consistent name for user, planner to not miss current user query Previously Khoj would start answering the previous query. This maybe because the prompt uses User for prompt in chat history but was using Q for current user prompt.	2024-10-10 04:45:00 -07:00
Debanjum Singh Solanky	284c8c331b	Increase default max iterations for research chat director to 5	2024-10-10 04:45:00 -07:00
Debanjum Singh Solanky	1e390325d2	Let research chat director decide which webpage to read, if any Make webpages to read automatically on search_online configurable via a argument. Set it to default to 1, so other callers of the function are unaffected. But iterative chat director can still decide which, if any, webpages to read based on the online search it performs	2024-10-10 04:45:00 -07:00
Debanjum Singh Solanky	5a699a52d2	Improve webpage summarization prompt to better extract links, excerpts This change allows the iterative director to dive deeper into its research as the data extracted contains relevant links from the webpage Previous summarization prompt didn't extract relevant links from the webpage which limited further explorations from webpages	2024-10-10 04:45:00 -07:00
Debanjum Singh Solanky	61df1d5db8	Pass previous iteration results to code interpreter chat actors This improves the code interpreter chat actors' ability to generate code with data collected during the previous iterations	2024-10-10 04:45:00 -07:00
Debanjum Singh Solanky	9e7025b330	Set python interpret sandbox url via environment variable	2024-10-10 04:45:00 -07:00
Debanjum Singh Solanky	2dc5804571	Extract defilter query into conversation utils for reuse	2024-10-10 04:45:00 -07:00
sabaimran	e69a8382f2	Add a code icon for code-related train of thought	2024-10-09 23:56:57 -07:00
sabaimran	536422a40c	Include code snippets in the reference panel	2024-10-09 23:54:11 -07:00
Debanjum Singh Solanky	8d33c764b7	Allow iterative chat director to use python interpreter as a tool	2024-10-09 23:38:20 -07:00
Debanjum Singh Solanky	b373073f47	Show executed code in web app chat message references	2024-10-09 22:13:18 -07:00
Debanjum Singh Solanky	a98f97ed5e	Refactor Run Code tool into separate module and modularize code functions Move construct_chat_history and ChatEvent enum into conversation.utils and move send_message_to_model_wrapper to conversation.helper to modularize code. And start thinning out the bloated routers.helper - conversation.util components are shared functions that conversation child packages can use. - conversation.helper components can't be imported by conversation packages but it can use these child packages This division allows better modularity while avoiding circular import dependencies	2024-10-09 22:13:17 -07:00
Debanjum Singh Solanky	8044733201	Give Khoj ability to run python code as a tool triggered via chat API Create python code executing chat actor - The chat actor generates python code within sandbox constraints - Run the generated python code in the cohere terrarium, pyodide based sandbox accessible at sandbox url	2024-10-09 21:37:22 -07:00
Debanjum Singh Solanky	4d33239af6	Improve prompts for the iterative chat director	2024-10-09 21:23:18 -07:00
Debanjum Singh Solanky	6ad85e2275	Fix to continue showing retrieved documents in train of thought	2024-10-09 21:20:22 -07:00
sabaimran	a6f6e4f418	Fix notes references and passage of user query in the chat flow	2024-10-09 20:34:20 -07:00
Debanjum Singh Solanky	ec248efd31	Allow iterative chat director to do notes search	2024-10-09 19:04:59 -07:00
Debanjum Singh Solanky	a6905a9f0c	Pass background context to iterating chat director	2024-10-09 19:04:59 -07:00
sabaimran	028b6e6379	Fix yield for scraping direct web page	2024-10-09 18:14:08 -07:00
sabaimran	717d9da8d8	Handle when summarize result is not present, rename variable in for loop from query	2024-10-09 17:57:08 -07:00
sabaimran	03544efde2	Ignore typing of the result dict for online, web page scrape	2024-10-09 17:48:24 -07:00
sabaimran	ab81b01fcb	Fix typing of direct_web_pages and remove the deprecated chat API	2024-10-09 17:46:28 -07:00
sabaimran	5b8d663cf1	Add intermediate summarization of results when planning with o1	2024-10-09 17:40:56 -07:00
sabaimran	7b288a1179	Clean up the function planning prompt a little bit	2024-10-09 16:59:20 -07:00
sabaimran	f71e4969d3	Skip summarize while it's broken, and snip some other parts of the workflow while under construction	2024-10-09 16:40:06 -07:00
sabaimran	f7e6f99a32	add typing for extract document references	2024-10-09 16:05:34 -07:00
sabaimran	6960fb097c	update types of prev iterations response	2024-10-09 16:04:39 -07:00
sabaimran	4978360852	Fix type of previous_iterations	2024-10-09 16:02:41 -07:00
sabaimran	46ef205a75	Add additional type annotations for compiled_references et al	2024-10-09 16:01:52 -07:00
sabaimran	4fbaef10e9	Correct usage of the summarize function	2024-10-09 15:58:05 -07:00
sabaimran	c91678078d	Correct the usage of query passed to summarize function	2024-10-09 15:55:55 -07:00
sabaimran	f867d5ed72	Working prototype of meta-level chain of reasoning and execution - Create a more dynamic reasoning agent that can evaluate information and understand what it doesn't know, making moves to get that information - Lots of hacks and code that needs to be reversed later on before submission	2024-10-09 15:54:25 -07:00
Debanjum	00546c1a63	Fix link to llama-cpp-python setup docs	2024-10-09 01:30:33 -07:00
Debanjum Singh Solanky	05fb0f14d3	Use user chat models for train of thought when no server chat settings Update chat actors to use user's chat model for train of thought. This requires passing the user info as argument to all the chat actors. Whether the user is subscribed or not can be inferred from the user info being passed, so it doesn't need to be passed as a separate argument to chat actor functions Let send_message_to_model function infer chat model instead of passing it as an argument from some chat actors. Better if this logic can be done in a single place.	2024-10-09 00:07:08 -07:00
Debanjum Singh Solanky	ec0c79217f	Do not set server chat settings on first run Server chat settings can be set for advanced self-hosted or multi-user cloud setups. They are not necessary anymore as we fallback to use the users chat model for train of thought now	2024-10-09 00:07:08 -07:00
Debanjum Singh Solanky	a9009ea774	Default to use user chat model if server chat settings not defined Fallback to use user chat model for train of thought if server chat settings not defined. This simplifies switching chat models for single-user, self-hosted setups by just changing the chat model on the user settings page. Server chat settings, when set, controls the default user chat model and the chat model that is used for Khoj's train of thought. Previously a self-hosted user had to update both the server chat settings in the admin panel and their own user chat model in the user settings panel to explicitly switch to a different chat model (i.e to switch to a new model for both train of thought & response generation) You can still set server chat settings to use a different chat model for train of thought and response generation. But this is only necessary for advanced self-hosted or cloud hosted setups of Khoj.	2024-10-09 00:07:08 -07:00
Debanjum Singh Solanky	9a056383e0	Reduce size of start chat and edit buttons on agent card in web app	2024-10-09 00:00:32 -07:00
Debanjum Singh Solanky	dc7f22f76c	Mention no. of docs in agents knowledge base in its badge hover text	2024-10-08 23:51:00 -07:00
Debanjum Singh Solanky	13fb22f7e7	Update agent form data shown in edit card after save operation on web app Previously you had to refresh the page to see the updated data on reopening the agents edit card after a save operation. Now you see the latest saved agent data on reopening the agents edit card. This should avoid confusion on whether the data was saved correctly	2024-10-08 23:26:04 -07:00
Debanjum Singh Solanky	dd770cf1b9	Start chat with public and protected agents when shared via link	2024-10-08 22:10:07 -07:00
Debanjum Singh Solanky	80212c50fd	Use default agent in others chats with an agent if agent made private If a public or protected agent is made private. Other users who were having conversation with that agent will have to carry on their conversation using default agent instead	2024-10-08 22:08:38 -07:00
Debanjum Singh Solanky	d628f89ce9	Prefetch agents related database models	2024-10-08 21:59:15 -07:00
Debanjum Singh Solanky	8de67c5d4d	Fallback to use general command if no tool selected by agent	2024-10-08 19:48:02 -07:00
Debanjum Singh Solanky	b80c4bcfdd	Improve agent command descriptions	2024-10-08 19:47:51 -07:00
Debanjum Singh Solanky	67d0e59eac	Pass chat history to the summarize chat actor	2024-10-08 18:44:52 -07:00
Debanjum Singh Solanky	7e3090060b	Encourage Gemini to output more verbose responses	2024-10-08 18:41:43 -07:00
Debanjum Singh Solanky	bbbdba3093	Time embedding model load for better visibility into app startup time Loading the embeddings model, even locally seems to be taking much longer. Use timer to track visibility into embedding, cross-encoder model load times	2024-10-08 18:41:43 -07:00
Debanjum Singh Solanky	516472a8d5	Switch default tokenizer to tiktoken as more widely used The tiktoken BPE based tokenizers seem more widely used these days. Fallback to gpt-4o tiktoken tokenizer to count tokens for context stuffing	2024-10-08 18:41:43 -07:00
Debanjum Singh Solanky	2b8f7f3efb	Reuse a single func to format conversation for Gemini This deduplicates code and prevents logic from deviating across gemini chat actors	2024-10-08 18:41:42 -07:00
Debanjum Singh Solanky	452e360175	Do not use max prompt size to limit Gemini max output tokens We should start disambiguating the max input from output size. Max prompt size should only be used for the max input context to an LLM. If required max_output_tokens should be set as a separate new field	2024-10-08 15:30:08 -07:00
Debanjum Singh Solanky	bdc36fec5d	Remove unnecessary whitespace indent from personality context	2024-10-08 15:30:08 -07:00
sabaimran	3daa3c003d	When tool selection is not done successfully with an agent, return all agent tools as options	2024-10-08 15:03:58 -07:00
sabaimran	ad716ca58d	Delete associated entries with an agent when it is deleted	2024-10-08 15:00:21 -07:00
sabaimran	f7fc6dbdc8	Limit agent creation and modification to subscribed users	2024-10-08 14:59:57 -07:00
sabaimran	c7638a783e	Dynamically update added files when upload in agent creation	2024-10-07 21:54:11 -07:00
sabaimran	e10a0571ff	Only check the prompt safety if the agent is not private	2024-10-07 21:42:14 -07:00
sabaimran	f700d5bddb	Add summarization capability with agent knowledge base	2024-10-07 21:20:23 -07:00
sabaimran	df3dc33e96	Show reference icon and domain side by side	2024-10-07 20:28:48 -07:00
sabaimran	59e55f981f	Reset agent to default when continuing with deceased agent	2024-10-07 20:28:33 -07:00
sabaimran	874776024a	Handle chat history rendering when agent is deceased	2024-10-07 20:28:10 -07:00
sabaimran	f232c2b059	Allow user to chat with agent knowledge base if general mode	2024-10-07 19:55:33 -07:00
sabaimran	c00654ae58	Update default agent settings	2024-10-07 18:11:24 -07:00
sabaimran	3d0e183bea	Add more log lines when encountering rate limiting	2024-10-07 14:36:12 -07:00
sabaimran	e4a8a69bc8	Add a subtle check mark when the copy button is selected	2024-10-07 09:41:03 -07:00
sabaimran	405c047c0c	Include agent personality through subtasks and support custom agents (#916) Currently, the personality of the agent is only included in the final response that it returns to the user. Historically, this was because models were quite bad at navigating the additional context of personality, and there was a bias towards having more control over certain operations (e.g., tool selection, question extraction). Going forward, it should be more approachable to have prompts included in the sub tasks that Khoj runs in order to respond to a given query. Make this possible in this PR. This also sets us up for agent creation becoming available soon. Create custom agents in #928 Agents are useful insofar as you can personalize them to fulfill specific subtasks you need to accomplish. In this PR, we add support for using custom agents that can be configured with a custom system prompt (aka persona) and knowledge base (from your own indexed documents). Once created, private agents can be accessible only to the creator, and protected agents can be accessible via a direct link. Custom tool selection for agents in #930 Expose the functionality to select which tools a given agent has access to. By default, they have all. Can limit both information sources and output modes. Add new tools to the agent modification form	2024-10-07 00:21:55 -07:00
sabaimran	c0193744f5	Update readme.md	2024-10-06 12:26:53 -07:00
sabaimran	d4ffeca90a	Fix notion indexing with manually set token	2024-10-05 09:13:16 -07:00
sabaimran	29a422b6bc	Remove the single dollar sign delimiters from katex rendering	2024-10-04 12:24:19 -07:00
Debanjum Singh Solanky	e217cb5840	Suggest notification type automation on Automation page of web app	2024-10-03 16:36:23 -07:00
Debanjum Singh Solanky	f626b34436	Update automation docs with screenshots	2024-10-03 16:36:05 -07:00
sabaimran	27c7e54695	Release Khoj version 1.24.1	2024-10-03 13:21:10 -07:00
Debanjum	4a1cb50da3	Make Online Search Location Aware (#929 ) ## Overview Add user country code as context for doing online search with serper.dev API. This should find more user relevant results from online searches by Khoj ## Details ### Major - Default to using system clock to infer user timezone on js clients - Infer country from timezone when only timezone received by chat API - Localize online search results to user country when location available ### Minor - Add `__str__` func to `LocationData` class to deduplicate location string generation	2024-10-03 12:33:47 -07:00
sabaimran	cb4052e333	Bump up rate limit for subscribed users and add an option to create new conversation in the POST request	2024-10-03 12:31:58 -07:00
sabaimran	7a5cd06162	Improve the login page (#931 ) * Init version of improved login page * Use split screen view, add a gradient	2024-10-02 14:26:46 -07:00
Debanjum Singh Solanky	591c582eeb	Fix list references in use openai proxy docs	2024-09-30 23:21:33 -07:00
Debanjum Singh Solanky	852662f946	Use requestAnimationFrame for synced scroll on chat in web app Make all the scroll actions just use requestAnimationFrame instead of setTimeout. It better aligns with browser rendering loop, so better for UX changes than setTimeout	2024-09-30 23:21:10 -07:00
sabaimran	57b4f844b7	Fail app start if initialization fails	2024-09-30 17:30:06 -07:00
Debanjum Singh Solanky	04aef362e2	Default to using system clock to infer user timezone on js clients Using system clock to infer user timezone on clients makes Khoj more robust to provide location aware responses. Previously only ip based location was used to infer timezone via API. This didn't provide any decent fallback when calls to ipapi failed or Khoj was being run in offline mode	2024-09-30 07:08:12 -07:00
Debanjum Singh Solanky	344f3c60ba	Infer country from timezone when only tz received by chat API Timezone is easier to infer using clients system clock. This can be used to infer user country name, country code, even if ip based location cannot be inferred. This makes using location data to contextualize Khoj's responses more robust. For example, online search results are retrieved for user's country, even if call to ipapi.co for ip based location fails	2024-09-30 07:08:11 -07:00
Debanjum Singh Solanky	1fed842fcc	Localize online search results to user country when location available Get country code to server chat API from IP location check on clients. Use country code to get country specific online search results via Serper.dev API	2024-09-30 07:08:11 -07:00
Debanjum Singh Solanky	eb86f6fc42	Add __str__ func to LocationData class to dedupe location string gen Previously the location string from location data was being generated wherever it was being used. By adding a __str__ representation to LocationData class, we can dedupe and simplify the code to get the location string	2024-09-30 07:08:11 -07:00
sabaimran	1dfc89e79f	Store conversation ID for new conversations as a string, not UUID	2024-09-29 18:07:08 -07:00
sabaimran	d92a349292	Improve image generation tool description	2024-09-29 16:20:25 -07:00
Debanjum Singh Solanky	d21a4e73a0	Update docs to use new variables to sync files, directories from khoj.el	2024-09-29 14:03:06 -07:00
Debanjum Singh Solanky	d66a0ccfaa	Update client setup docs with instructions for self-hosting users Resolves #808	2024-09-29 13:58:02 -07:00
Debanjum Singh Solanky	dd44933515	Release Khoj version 1.24.0	2024-09-29 04:56:11 -07:00
Debanjum Singh Solanky	1e8ce52d98	Reduce size of Khoj Docker images by removing layers and caches - Align Dockerfile and prod.Dockerfile code - Reduce Docker image size by 25% by reducing Docker layers and removing package caches	2024-09-29 04:06:35 -07:00
Debanjum Singh Solanky	9b10b3e7a1	Remove unused langchain openai server dependency	2024-09-29 04:06:35 -07:00
Debanjum Singh Solanky	e767b6eba3	Update Documentation with flags to enable GPU on Khoj pip install - Use tabs for the GPU/CPU type Khoj is being installed on - Update CMAKE flags to use to install Khoj with correct GPU support Previous flags used DLLAMA, this has been updated to use DGGML now in llama.cpp	2024-09-29 04:06:35 -07:00
sabaimran	63a2b5b3c4	Remove tools cache in dockerize.yml workflow	2024-09-29 00:27:37 -07:00
Debanjum Singh Solanky	936bc64b82	Render images to take full width of chat message div Remove unnecessary "Inferred Query" heading prefix to image generation prompt used by Khoj. The inferred query in chat message has a heading of its own, so avoid two headings for the image prompt	2024-09-28 23:45:56 -07:00
Debanjum Singh Solanky	4efa7d4464	Upgrade the Next.js web app package dependency	2024-09-28 23:45:56 -07:00
Debanjum Singh Solanky	b3cb417796	Fix spelling of Manage Context in Side Panel of Web App	2024-09-28 23:45:56 -07:00
sabaimran	676ff5fa69	Fix setting title on new conversations, add the action menu	2024-09-28 23:43:27 -07:00
Shantanu Sakpal	65d5e03f7f	Reduce tooltip popup delay duration for Create Agent button on Web app (#926) The tooltip was visible on hover but slow to appear, so users would click the button before it popped up, which stopped the tooltip from showing. Reduce the popup delay to 10ms so that as soon as the user hovers over the button, they see that it's a feature coming soon!	2024-09-28 23:01:40 -07:00
Shantanu Sakpal	be8de1a1bd	Only Auto Scroll when at Page Bottom and Add Button to Scroll to Page Bottom on Web App (#923 ) Improve Scrolling on Chat page of Web app - Details 1. Only auto scroll Khoj's streamed response when scroll is near bottom of page Allows scrolling to other messages in conversation while Khoj is formulating and streaming its response 2. Add button to scroll to bottom of the chat page 3. Scroll to most recent conversation turn on conversation first load It's a better default to anchor to most recent conversation turn (i.e most recent user message) 4. Smooth scroll when Khoj's chat response is streamed Previously the scroll would jitter during response streaming 5. Anchor scroll position when fetch and render older messages in conversation Allow users to keep their scroll position when older messages are fetched from server and rendered Resolves #758	2024-09-28 22:54:34 -07:00
sabaimran	06777e1660	Convert the default conversation id to a uuid, plus other fixes (#918 ) * Update the conversation_id primary key field to be a uuid - update associated API endpoints - this is to improve the overall application health, by obfuscating some information about the internal database - conversation_id type is now implicitly a string, rather than an int - ensure automations are also migrated in place, such that the conversation_ids they're pointing to are now mapped to the new IDs * Update client-side API calls to correctly query with a string field * Allow modifying of conversation properties from the chat title * Improve drag and drop file experience for chat input area * Use a phosphor icon for the copy to clipboard experience for code snippets * Update conversation_id parameter to be a str type * If django_apscheduler is not in the environment, skip the migration script * Fix create automation flow by storing conversation id as string The new UUID used for conversation id can't be directly serialized. Convert to string for serializing it for later execution --------- Co-authored-by: Debanjum Singh Solanky <debanjum@gmail.com>	2024-09-24 14:12:50 -07:00
Debanjum Singh Solanky	0c936cecc0	Release Khoj version 1.23.3	2024-09-24 12:44:09 -07:00
Debanjum Singh Solanky	61c6e742d5	Truncate chat context to max tokens for offline, openai chat actors too	2024-09-24 12:42:32 -07:00
sabaimran	e306e6ca94	Fix file paths used for pypi wheel building	2024-09-22 12:42:08 -07:00
Debanjum	f00e0e6080	Improve Khoj First Run, Docker Setup and Documentation (#919) ## Improve - Intelligently initialize a decent default set of chat model options - Create non-interactive mode. Auto set default server configuration on first run via Docker ## Fix - Make RapidOCR dependency optional as flaky requirements causing docker build failures - Set default openai text to image model correctly during initialization ## Details Improve initialization flow during first run to remove need to configure Khoj: - Set Google, Anthropic Chat models too Previously only Offline, Openai chat models could be set during init - Add multiple chat models for each LLM provider Interactively set a comma separated list of models for each provider - Auto add default chat models for each provider in non-interactive mode if the `{OPENAI,GEMINI,ANTHROPIC}_API_KEY` env var is set - Used when server run via Docker as user input cannot be processed to configure server during first run - Do not ask for `max_tokens`, `tokenizer` for offline models during initialization. Use better defaults inferred in code instead - Explicitly set default chat model to use If unset, it implicitly defaults to using the first chat model. Make it explicit to reduce this confusion Resolves #882	2024-09-21 14:15:45 -07:00
Debanjum Singh Solanky	a6c0b43539	Upgrade documentation package dependencies	2024-09-21 14:06:40 -07:00
Debanjum Singh Solanky	2033f5168e	Modularize chat models initialization with a reusable function The chat model initialize interaction flow is fairly similar across the chat model providers. This should simplify adding new chat model providers and reduce chances of bugs in the interactive chat model initialization flow.	2024-09-21 14:06:40 -07:00
Debanjum Singh Solanky	26c39576df	Add Documentation for the settings on the Khoj Admin Panel This is an initial pass to add documentation for all the knobs available on the Khoj Admin panel. It should shed some light onto what each admin setting is for and how they can be customized when self hosting. Resolves #831	2024-09-21 14:06:40 -07:00
Debanjum Singh Solanky	730e5608bb	Improve Self Hosting Docs. Better Docker, Remote Access Setup Instructions - Improve Self Hosting Docker Instructions - Ask to Install Docker Desktop to not require separate docker-compose install and unify the instruction across OS - To Self Host on Windows, ask to use Docker Desktop with WSL2 backend - Use nested Tab grouping to split Docker vs Pip Self Host Instructions - Reduce Self Host Setup Steps in Documentation after code simplification - First run now avoids need to configure Khoj via admin panel - So move the chat model config steps into optional post setup config section - Improve Instructions to Configure chat models on First Run - Compress configuring chat model providers into a Tab Group - Add Documentation for Remote Access under Advanced Self Hosting	2024-09-21 14:06:17 -07:00
Debanjum Singh Solanky	91c76d4152	Intelligently initialize a decent default set of chat model options Given the LLM landscape is rapidly changing, providing a good default set of options should help reduce decision fatigue to get started Improve initialization flow during first run - Set Google, Anthropic Chat models too Previously only Offline, Openai chat models could be set during init - Add multiple chat models for each LLM provider Interactively set a comma separated list of models for each provider - Auto add default chat models for each provider in non-interactive mode if the {OPENAI,GEMINI,ANTHROPIC}_API_KEY env var is set - Do not ask for max_tokens, tokenizer for offline models during initialization. Use better defaults inferred in code instead - Explicitly set default chat model to use If unset, it implicitly defaults to using the first chat model. Make it explicit to reduce this confusion Resolves #882	2024-09-19 20:32:08 -07:00
Debanjum Singh Solanky	f177723711	Add default server configuration on first run in non-interactive mode This should configure Khoj with decent default configurations via Docker and avoid needing to configure Khoj via admin page to start using dockerized Khoj Update default max prompt size set during khoj initialization as online chat models are cheaper and offline chat models have larger context now	2024-09-19 15:12:55 -07:00
Debanjum Singh Solanky	020167c7cf	Set default openai text to image model correctly during initialization Speech to text model was previously being set to the text to image model!	2024-09-19 15:11:34 -07:00
Debanjum Singh Solanky	077b88bafa	Make RapidOCR dependency optional as flaky requirements RapidOCR depends on OpenCV which by default requires a bunch of GUI parameters. This system package dependency set (like libgl1) is flaky Making the RapidOCR dependency optional should allow khoj to be more resilient to setup/dependency failures Trade-off is that OCR for documents may not always be available and it'll require looking at server logs to find out when this happens	2024-09-19 15:10:31 -07:00
sabaimran	0a568244fd	Revert "Convert conversationId int to string before making api request to bulk update file filters" This reverts commit `c9665fb20b`. Revert "Fix handling for new conversation in agents page" This reverts commit `3466f04992`. Revert "Add a unique_id field for identifying conversations (#914)" This reverts commit `ece2ec2d90`.	2024-09-18 20:36:57 -07:00
Debanjum Singh Solanky	bb2bd77a64	Send chat message to Khoj web app via url query param - This allows triggering khoj chat from the browser addressbar - So now if you add Khoj to your browser bookmark with - URL: https://app.khoj.dev/?q=%s - Keyword: khoj - Then you can type "khoj what is the news today" to trigger Khoj to quickly respond to your query. This avoids having to open the Khoj web app before asking your question	2024-09-17 21:50:47 -07:00
Debanjum Singh Solanky	ecdbcd815e	Simplify code to remove json codeblock from AI response string	2024-09-17 21:50:47 -07:00
sabaimran	e457720e8a	Improve the email templates and better align with new branding	2024-09-17 11:18:25 -07:00
sabaimran	c9665fb20b	Convert conversationId int to string before making api request to bulk update file filters	2024-09-16 15:45:23 -07:00
sabaimran	3466f04992	Fix handling for new conversation in agents page	2024-09-16 15:04:49 -07:00
sabaimran	ece2ec2d90	Add a unique_id field for identifying conversations (#914) * Add a unique_id field to the conversation object - This helps us keep track of the unique identity of the conversation without exposing the internal id - Create three staged migrations in order to first add the field, then add unique values to pre-fill, and then set the unique constraint. Without this, it tries to initialize all the existing conversations with the same ID. * Parse and utilize the unique_id field in the query parameters of the front-end view - Handle the unique_id field when creating a new conversation from the home page - Parse the id field with a lightweight parameter called v in the chat page - Share page should not be affected, as it uses the public slug * Fix suggested card category	2024-09-16 12:19:16 -07:00
sabaimran	e6bc7a2ba2	Fix links to log in email templates	2024-09-15 19:14:19 -07:00
Debanjum Singh Solanky	79980feb7b	Release Khoj version 1.23.2	2024-09-15 03:07:26 -07:00
Debanjum Singh Solanky	575ff103cf	Frame chat response error on web app in a more conversational form Also indicate hitting dislike on the message should be enough to convey the issue to the developers.	2024-09-15 03:00:49 -07:00
Debanjum Singh Solanky	893ae60a6a	Improve handling of harmful categorized responses by Gemini Previously Khoj would stop in the middle of response generation when the safety filters got triggered at default thresholds. This was confusing as it felt like a service error, not expected behavior. Going forward Khoj will - Only block responding to high confidence harmful content detected by Gemini's safety filters instead of using the default safety settings - Show an explanatory, conversational response (w/ harm category) when response is terminated due to Gemini's safety filters	2024-09-15 02:17:54 -07:00
sabaimran	ec1f87a896	Release Khoj version 1.23.1	2024-09-12 22:46:39 -07:00
sabaimran	2a4416d223	Use prefetch_related for the openai_config when retrieving all chatmodeloptions async	2024-09-12 22:45:43 -07:00
sabaimran	253ca92203	Release Khoj version 1.23.0	2024-09-12 20:25:29 -07:00
Debanjum Singh Solanky	178b78f87b	Show debug log, not warning when use default tokenizer for context stuffing	2024-09-12 20:21:01 -07:00
Debanjum	f173188dcf	Support using image generation models like Flux via Replicate (#909 ) - Support using image generation models like Flux via Replicate - Modularize the image generation code - Make generate better image prompt chat actor add composition details - Generate vivid images with DALLE-3	2024-09-12 20:19:46 -07:00
Debanjum Singh Solanky	75d3b34452	Extract image generation code into new image processor for modularity	2024-09-12 20:01:32 -07:00
Debanjum Singh Solanky	84051d7d89	Make generate better image prompt chat actor add composition details	2024-09-12 19:58:57 -07:00
Debanjum Singh Solanky	ed12f45a26	Generate vivid images with DALLE-3 It's apparently the default setting in chatgpt app according to the openai cookbook at https://cookbook.openai.com/articles/what_is_new_with_dalle_3#examples-and-prompts	2024-09-12 19:58:57 -07:00
Debanjum Singh Solanky	1b82aea753	Support using image generation models like Flux via Replicate Enables using any image generation model on Replicate's Predictions API endpoints. The server admin just needs to add text-to-image model on the server/admin panel in organization/model_name format and input their Replicate API key with it Create db migration (including merge)	2024-09-12 19:58:56 -07:00
Brian Kanya	1d512b4986	Use environment variable to set sender email of auth link emails (#907) Set sender email using `RESEND_EMAIL` environment variable for magic link sent via Resend API for authentication. It was previously hard-coded. This prevented hosting Khoj on other domains. Resolves #908	2024-09-12 18:48:11 -07:00
Debanjum	26ca3df605	Support OpenAI's new O1 Model Series (#912 ) - Major - The new O1 series doesn't seem to support streaming, response_format enforcement, stop words or temperature currently. - Remove any markdown json codeblock in chat actors expecting json responses - Minor - Override block display styling of links by Katex in chat messages	2024-09-12 18:42:51 -07:00
Debanjum Singh Solanky	0685a79748	Remove any markdown json codeblock in chat actors expecting json responses Strip any json md codeblock wrapper if exists before processing response by output mode, extract questions chat actor. This is similar to what is already being done by other chat actors Useful for successfully interpreting json output in chat actors when using non (json) schema enforceable models like o1 and gemma-2 Use conversation helper function to centralize the json md codeblock removal code	2024-09-12 18:26:15 -07:00
Debanjum Singh Solanky	6e660d11c9	Override block display styling of links by Katex in chat messages This happens sometimes when LLM response contains [\[1\]] kind of links as reference. Both markdown-it and katex apply styling. Katex's span uses display: block which makes the rendering of these references take up a whole line by themselves. Override block styling of spans within an `a` element to prevent such chat message styling issues	2024-09-12 18:22:46 -07:00
Debanjum Singh Solanky	272eae5d66	Add support for the newly released OpenAI O1 model series for preview The O1 series doesn't seem to support streaming, stop words or temperature, response_format currently.	2024-09-12 18:22:46 -07:00
Alexander Matyasko	9570933506	Support Google's Gemini model series (#902 ) * Add functions to chat with Google's gemini model series * Gracefully close thread when there's an exception in the gemini llm thread * Use enums for verifying the chat model option type * Add a migration to add the gemini chat model type to the db model * Fix chat model selection verification and math prompt tuning * Fix extract questions method with gemini. Enforce json response in extract questions. * Add standard stop sequence for Gemini chat response generation --------- Co-authored-by: sabaimran <narmiabas@gmail.com> Co-authored-by: Debanjum Singh Solanky <debanjum@gmail.com>	2024-09-12 18:17:55 -07:00
Debanjum Singh Solanky	42b727e926	Revert additional logging enabled to debug automation failures in prod Additional logging was enabled to debug automation failures in production since migrating the chat API to use the POST request method (from earlier GET). Redirect from http to https defaulted to using GET instead of POST method to call /api/chat on redirect. This has been resolved now	2024-09-12 17:56:54 -07:00
sabaimran	14a495cbb5	Release Khoj version 1.22.3	2024-09-12 12:39:04 -07:00
sabaimran	91cee2eaa8	Handle redirects when scheduling chats from automations	2024-09-12 11:36:47 -07:00
sabaimran	4555969d38	Add additional log lines	2024-09-12 10:50:36 -07:00
sabaimran	12897a9a62	Update link to gif demo in README to pull from GitHub	2024-09-11 20:09:26 -07:00
sabaimran	9310f88537	Add quadratic equation gif to docs	2024-09-11 20:07:57 -07:00
sabaimran	d042a055cf	Update the demo and simplify the readme	2024-09-11 20:03:48 -07:00
sabaimran	4d3224657f	Update the documentation with swanky new demo videos	2024-09-11 19:57:10 -07:00
Debanjum Singh Solanky	2cc4a0769e	Release Khoj version 1.22.2	2024-09-11 18:39:24 -07:00
Debanjum Singh Solanky	7f186be742	Fix json payload passed by automations to the new POST chat API	2024-09-11 18:35:31 -07:00
sabaimran	5038d15574	Route to config_page, not to deprecated notion_config_page, on notion callback API	2024-09-11 18:30:23 -07:00
Debanjum Singh Solanky	b61d825cbc	Sanitize user attached image in chat message input pane of web app	2024-09-11 18:02:33 -07:00
Debanjum Singh Solanky	de60ad7da6	Update automations to call new POST chat API endpoint	2024-09-11 17:28:40 -07:00
Debanjum Singh Solanky	055ead550c	Update desktop shortcut, web app factchecker to use new POST chat API	2024-09-11 17:28:32 -07:00
Debanjum Singh Solanky	bc2e889d72	Update chat director, client tests to call chat API using new POST method	2024-09-11 17:28:06 -07:00
Debanjum Singh Solanky	3f51af9a96	Keep the GET chat API endpoint for a bit before deprecating it This is to avoid breaking non-updated Khoj clients	2024-09-11 16:50:10 -07:00
Debanjum Singh Solanky	241b9009ba	Update OpenAI chat actor tests to handle more questions being extracted	2024-09-11 16:16:55 -07:00
Debanjum Singh Solanky	03befc9b12	Use consistent user attached image placeholder text for chat actors Get information sources and get output mode don't actually see the images. They just get placeholder text to indicate that the user attached an image to their message for context	2024-09-11 16:16:55 -07:00
Debanjum Singh Solanky	04363a504c	Prompt Whisper to know "Khoj" term for speech to text transcription	2024-09-11 16:16:55 -07:00
Debanjum Singh Solanky	3dcc8695b2	Improve vertical alignment of lists in chat messages on web app - Make train of thought icons top aligned, next to their intermediate step heading - Add margin bottom to ordered, unordered lists in chat message, similar to how it is already added for paragraphs	2024-09-11 16:16:55 -07:00
Debanjum Singh Solanky	179357b28a	Default to gpt-4o-mini as online chat model	2024-09-11 16:16:55 -07:00
sabaimran	ae74c6ca55	Release Khoj version 1.22.1	2024-09-11 13:03:53 -07:00
sabaimran	cd5db277f3	Fix sync to async issue when getting all valid vision configs	2024-09-11 12:57:54 -07:00
sabaimran	9b12290c17	Release Khoj version 1.22.0	2024-09-11 11:21:02 -07:00
sabaimran	2932d305b0	Simplify redundant logic for constructing structured messages with image url	2024-09-10 21:09:43 -07:00
sabaimran	07e2c49a7a	Set default temperature to 0.7 in the extract_questions method	2024-09-10 21:09:21 -07:00
sabaimran	8d40fc0aef	Limit vision_enabled image formatting to OpenAI APIs and send vision to extract_questions query	2024-09-10 20:08:14 -07:00
Debanjum Singh Solanky	aa31d041f3	Style list html elements by default on web app to improve readability Previously list styling was turned off for some reason in Next.js	2024-09-10 17:45:04 -07:00
Debanjum Singh Solanky	d81a050d73	Update documentation package dependencies	2024-09-10 14:18:56 -07:00
Debanjum Singh Solanky	7614718204	Build Khoj cloud docker image for arm64 architecture too	2024-09-10 13:57:09 -07:00
Debanjum Singh Solanky	596db603e0	Pass query params to chat API in POST body instead of URL query string Closes #899, #678	2024-09-10 13:57:03 -07:00
Debanjum Singh Solanky	fc6345e246	Simplify setImagePath for upload from chat input area of web app	2024-09-10 09:18:54 -07:00
Raghav Tirumale	549686a7a4	Add Vision Support (#889 ) # Summary of Changes * New UI to show preview of image uploads * ChatML message changes to support gpt-4o vision based responses on images * AWS S3 image uploads for persistent image context in conversations * Database changes to have `vision_enabled` option in server admin panel while configuring models * Render previously uploaded images in the chat history, show uploaded images for pending msgs * Pass the uploaded_image_url through to subqueries * Allow image to render upon first message from the homepage * Add rendering support for images to shared chat as well * Fix some UI/functionality bugs in the share page * Convert user attached images for chat to webp format before upload * Use placeholder to attached image for data source, response mode actors * Update all clients to call /api/chat as a POST instead of GET request * Fix copying chat messages with images to clipboard TLDR; Add vision support for openai models on Khoj via the web UI! --------- Co-authored-by: sabaimran <narmiabas@gmail.com> Co-authored-by: Debanjum Singh Solanky <debanjum@gmail.com>	2024-09-09 15:22:18 -07:00
Debanjum Singh Solanky	b553bba1d8	Release Khoj version 1.21.6	2024-09-09 14:55:36 -07:00
sabaimran	223d310ea2	CTA in welcome email	2024-09-09 14:33:27 -07:00
Debanjum Singh Solanky	5dea9ef323	Add selected OS tab to url in documentation to ease link sharing	2024-09-09 10:40:53 -07:00
Debanjum Singh Solanky	87c52dfd02	Update Documentation project dependencies Stop wrapping Tabs in explicit mdx-code-blocks as build after upgrade throws error	2024-09-09 10:40:53 -07:00
Debanjum Singh Solanky	7941b12d50	Toggle speak, send buttons based on chat input text entered on Desktop	2024-09-09 10:40:53 -07:00
Debanjum Singh Solanky	b5f6550de2	Move link to source code from Nav pane to About page on Desktop app	2024-09-09 10:40:53 -07:00
Debanjum Singh Solanky	77b44f6db0	Update Desktop app dependencies	2024-09-09 10:40:53 -07:00
Debanjum Singh Solanky	303d8ed64e	Update Obsidian plugin package dependencies	2024-09-09 10:40:53 -07:00
Debanjum Singh Solanky	72fbbc092c	Upgrade Django, FastAPI, Uvicorn packages - Update Django to 5.0.8 - Update Uvicorn to 0.30.6 - Update FastAPI minimum versions to 0.110.0	2024-09-09 10:40:53 -07:00
sabaimran	8e6b9afeb7	Add an automation for research paper summaries	2024-09-08 11:50:49 -07:00
Debanjum	05c169bb37	Set File Types to Sync from Obsidian via Khoj Plugin Settings Page (#904 ) Limit file types to sync with Khoj from Obsidian to: - Avoid hitting per user index-able data limits, especially for folks on the Khoj cloud free tier. E.g by excluding images in Obsidian vault from being synced - Improve context used by Khoj to generate responses	2024-09-05 22:40:30 -07:00
Husain007	4e8ead66a8	Fix URL to web, desktop settings pages on Desktop application (#903 ) Update web and desktop settings URLs on desktop application from previous 'config' path to new 'settings' path	2024-09-05 14:47:43 -07:00
Debanjum Singh Solanky	bc26cf8b2f	Only show updated index notice on success in Obsidian plugin Previously it'd show indexing success notice on error and success	2024-09-04 17:52:32 -07:00
Debanjum Singh Solanky	cb425a073d	Use rich text error to better guide when exceed data sync limits in Obsidian When user exceeds data sync limits. Show error notice with - Link to web app settings page to upgrade subscription - Link to Khoj plugin settings in Obsidian to configure file types to sync from vault to Khoj	2024-09-04 17:52:32 -07:00
Debanjum Singh Solanky	19efc83455	Set File Types to Sync from Obsidian via Khoj Plugin Settings Page Useful to limit file types to sync with Khoj. Avoids hitting indexed data limits, especially for users on the Khoj cloud free tier Closes #893	2024-09-04 16:09:56 -07:00
sabaimran	7216a06f5f	Release Khoj version 1.21.5	2024-09-03 21:58:00 -07:00
sabaimran	895f1c8e9e	Gracefully close thread when there's an exception in the anthropic llm thread. Include full stack traces.	2024-09-03 13:16:51 -07:00
sabaimran	17901406aa	Gracefully close thread when there's an exception in the openai llm thread. Closes #894 .	2024-09-03 13:16:51 -07:00
sabaimran	6ed68b574b	Merge pull request #898 from lvnilesh/patch-1 Handles deprecation of version reference	2024-09-03 12:53:44 -07:00
sabaimran	912cc0074a	Use nonlocal for conversation_id when running the event_generator	2024-09-03 11:55:06 -07:00
sabaimran	591f5a522c	Release Khoj version 1.21.4	2024-09-02 17:45:39 -07:00
sabaimran	9306a0bb2c	Prefetch the settings and openai_config of a texttoimagemodelconfig	2024-09-02 17:35:21 -07:00
sabaimran	132eac0f51	Merge pull request #897 from khoj-ai/features/increase-rate-limits Increase rate limits for data indexing	2024-08-25 23:39:30 -07:00
LV Nilesh	77cc1cd42f	Update docker-compose.yml Handles deprecation of version reference	2024-08-25 17:05:47 -07:00
sabaimran	977001b801	Reduce the test data packet size	2024-08-25 16:14:32 -07:00
sabaimran	6eb06e8626	Downgrade rate limit to 200MB	2024-08-25 15:26:27 -07:00
sabaimran	439a2680fd	Increase rate limits for data indexing	2024-08-25 15:09:30 -07:00
sabaimran	af4e9988c4	Merge pull request #896 from khoj-ai/features/add-support-for-custom-confidence Add support for custom search model-specific thresholds	2024-08-24 20:32:41 -07:00
sabaimran	4b77325f63	Default to infinite distance when using the search API	2024-08-24 19:57:49 -07:00
sabaimran	e919d28f1c	Add support for custom search model-specific thresholds	2024-08-24 19:28:26 -07:00
sabaimran	fa4d808a5f	Encode uri components when sending automations data to the server	2024-08-24 18:45:50 -07:00
sabaimran	387b7c7887	Release Khoj version 1.21.3	2024-08-23 11:15:15 -07:00
sabaimran	7b8b3a66ae	Revert django version to previous patch	2024-08-23 11:12:41 -07:00
Debanjum Singh Solanky	5927ca8032	Properly close chat stream iterator even if response generation fails Previously chat stream iterator wasn't closed when response streaming for offline chat model threw an exception. This would require restarting the application. Now application doesn't hang even if current response generation fails with exception	2024-08-23 02:06:26 -07:00
Debanjum Singh Solanky	bdb81260ac	Update docs to mention using Llama 3.1 and 20K max prompt size for it Update stale credits to better reflect bigger open source dependencies	2024-08-22 20:27:58 -07:00
Debanjum Singh Solanky	238bc11a50	Fix, improve openai chat actor, director tests & online search prompt	2024-08-22 19:09:33 -07:00
Debanjum Singh Solanky	9986c183ea	Default to gpt-4o-mini instead of gpt-3.5-turbo in tests, func args GPT-4o-mini is cheaper, smarter and can hold more context than GPT-3.5-turbo. In production, we also default to gpt-4o-mini, so makes sense to upgrade defaults and tests to work with it	2024-08-22 19:04:49 -07:00
Debanjum Singh Solanky	8a4c20d59a	Enforce json response by offline models when requested by chat actors - Background Llama.cpp allows enforcing response as json object similar to OpenAI API. Pass expected response format to offline chat models as well. - Overview Enforce json output to improve intermediate step performance by offline chat models. This is especially helpful when working with smaller models like Phi-3.5-mini and Gemma-2 2B, that do not consistently respond with structured output, even when requested - Details Enforce json response by extract questions, infer output offline chat actors - Convert prompts to output json objects when offline chat models extract document search questions or infer output mode - Make llama.cpp enforce response as json object - Result - Improve all intermediate steps by offline chat actors via json response enforcement - Avoid the manual, ad-hoc and flaky output schema enforcement and simplify the code	2024-08-22 18:07:44 -07:00
Debanjum Singh Solanky	ab7fb5117c	Release Khoj version 1.21.2	2024-08-20 12:38:54 -07:00
Debanjum Singh Solanky	de24ffcf0d	Upgrade Axios, a desktop app dependency, to version 1.7.4	2024-08-20 12:32:36 -07:00
Debanjum Singh Solanky	a60baa55fb	Upgrade Django, a Khoj server dependency, to version 5.0.8	2024-08-20 12:32:00 -07:00
sabaimran	1ac8de6c3a	Release Khoj version 1.21.1	2024-08-20 11:55:34 -07:00
Debanjum Singh Solanky	5d59acd1f4	Stop pushing deprecated khoj-assistant package to pypi - Also skip uploading package version if it already exists on pypi. This happens when a new khoj tagged release is created	2024-08-20 11:43:02 -07:00
sabaimran	f6ce2fd432	Handle end of chunk logic in openai stream processor	2024-08-20 10:50:09 -07:00
sabaimran	029775420c	Release Khoj version 1.21.0	2024-08-20 10:01:56 -07:00
sabaimran	4808ce778a	Merge pull request #892 from khoj-ai/upgrade-offline-chat-models-support Upgrade offline chat model support. Default to Llama 3.1	2024-08-20 11:51:20 -05:00
Debanjum Singh Solanky	58c8068079	Upgrade default offline chat model to llama 3.1	2024-08-20 09:28:56 -07:00
sabaimran	2d9dd81e76	Re-add authenticated decorator to api_chat.py /chat endpoint	2024-08-19 05:37:18 -05:00
sabaimran	2c5350329a	Remove the hashes from titles in found relevant notes	2024-08-18 22:31:15 -05:00
Debanjum Singh Solanky	acdc3f9470	Unwrap any json in md code block, when parsing chat actor responses This is a more robust way to extract json output requested from gemma-2 (2B, 9B) models which tend to return json in md codeblocks. Other models should remain unaffected by this change. Also removed request to not wrap json in codeblocks from prompts. As code is doing the unwrapping automatically now, when present	2024-08-16 14:16:29 -05:00
Debanjum Singh Solanky	ca45fce8ac	Break long links in train of thought to stay within chat page width	2024-08-16 14:16:29 -05:00
sabaimran	c0316a6b5d	Enable free tier users to have unlimited chats with the default chat model (#886 ) - Allow free tier users to have unlimited chats with default chat model. It'll only be rate-limited and at the same rate as subscribed users - In the server chat settings, replace the concept of default/summarizer models with default/advanced chat models. Use the advanced models as a default for subscribed users. - For each `ChatModelOption' configuration, allow the admin to specify a separate value of `max_tokens' for subscribed users. This allows server admins to configure different max token limits for unsubscribed and subscribed users - Show error message in web app when hit rate limit or other server errors	2024-08-16 12:14:44 -07:00
Debanjum	8dad9362e7	Improve search model config display for admin (#887 ) from aam-at/feature/improve_search_model_config_admin Currently, the search model config display for admins only shows the id of the search model config, which is not very informative. This change enhances the admin console by displaying the name of the search model config (name), as well as the bi-encoder model (bi_encoder) and cross-encoder model (cross_encoder), along with the id.	2024-08-16 07:33:55 -07:00
Debanjum	2b1482d2b4	Fix indexing content from Emacs #883 from aam-at/bugfix/fix_emacs_if Previously `force' was passed as a query param to the single indexing API. After the recent API updates, it is meant to select the API method to use (PUT vs PATCH). Converting the `force' argument to a bool fixes the implementation of this new behavior	2024-08-16 07:32:46 -07:00
Debanjum	0b568e204e	Add model_config for cross-encoder model (#885 ) from aam-at/feature/crossencoder_model_config Add `model_config' for the cross-encoder model, so the server admin can use models which require the `trust_remote_code' argument to run locally	2024-08-16 07:32:19 -07:00
Debanjum	39e566ba91	Improve Document, Online Search to Answer Vague or Meta Questions (#870 ) - Major - Improve doc search actor performance on vague, random or meta questions - Pass user's name to document and online search actors prompts - Minor - Fix and improve openai chat actor tests - Remove unused max tokens arg to extract qs func of doc search actor	2024-08-16 06:46:13 -07:00
Debanjum Singh Solanky	27ad9b1302	Remove unused max tokens arg to extract qs func of doc search actor	2024-08-13 12:53:39 +05:30
Debanjum Singh Solanky	f75606d7f5	Improve doc search actor performance on vague, random or meta questions - Issue Previously the doc search actor wouldn't extract good search queries to run on user's documents for broad, vague questions. - Fix The updated extract questions prompt shows and tells the doc search actor on how to deal with such questions The doc search actor's temperature was also increased to support more creative/random questions. The previous temp of 0 was meant to encourage structured json output. But now with json mode, a low temp is not necessary to get json output	2024-08-13 12:53:39 +05:30
Debanjum Singh Solanky	3675938df6	Support passing temperature to offline chat model chat actors - Use temperature of 0 by default for extract questions offline chat actor - Use temperature of 0.2 for send_message_to_model_offline (this is the default temperature set by llama.cpp)	2024-08-13 12:53:00 +05:30
Shantanu Sakpal	b5bcce7f85	Cycle through chat history in chat input on Obsidian (#861 ) * Add ability to cycle through the chat history in the chat input on Obsidian (similar to terminal history navigation) * Add mod key shortcut to cycle through chat history in chat input * Add shortcut help text in chat input placeholder --------- Co-authored-by: Debanjum Singh Solanky <debanjum@gmail.com>	2024-08-12 23:55:25 -07:00
srikary12	05c0aa3882	Support exclusion file filters (#826 ) ### Overview Support exclude file filter in user search queries ### Details - All of the exclude file filter terms need to be satisfied - Any one of the include file filter terms should be satisfied ### Example - Search Query: what happened yesterday? -file:"tasks.org" -file:"work.md" file:"diary.org" file:"journal.org" - Behavior: Query will try to find relevant notes in any of `journal.org` or `diary.org` and not in `tasks.org` and not in `work.md` ### Details * Add support for exclusion file filters * Translate file filter to valid Django DB entry filter regex * Exclude all files when multiple exclude file filters in query Previously we were applying an "Or" filter, which would exclude any file mentioned in a query with multiple exclude file filters. This is not what we naturally mean when we ask to exclude a file in a query * Rename, rearrange, deduplicate and add file filter tests Closes #728 --------- Co-authored-by: Debanjum Singh Solanky <debanjum@gmail.com>	2024-08-12 05:41:54 -07:00
Alexander Matyasko	2d9bf14ecb	Improve search model config display for admin	2024-08-11 19:13:25 +08:00
Debanjum Singh Solanky	7815e02dd4	Release Khoj version 1.20.4	2024-08-11 16:00:13 +05:30
Debanjum Singh Solanky	d951e36945	Update khoj.el package description, it had gone stale	2024-08-11 15:52:46 +05:30
Debanjum Singh Solanky	16b31c3e35	Refresh automation data shown by edit automation card after update Previously required the automation page to be refreshed to see updates to the automation in the edit automation card. This would be seen when user tries to edit an automation multiple times (without a page refresh)	2024-08-11 15:52:46 +05:30
Debanjum Singh Solanky	f2f37ae444	Fix creating, editing automations that start weekly on Sunday	2024-08-11 15:52:46 +05:30
Debanjum Singh Solanky	ec9add9a51	Fix automation edit cards height. Scroll when card longer than screen	2024-08-11 15:52:46 +05:30
sabaimran	d99f03e4f3	If the list of choices in a chunk is empty, continue in openai response	2024-08-11 15:30:09 +05:30
Alexander Matyasko	f16b0f628b	Fix true/false evaluation in Emacs to prevent unintended index re-indexing Previously, the code incorrectly treated all non-nil values as true, leading to the index being re-indexed with the force flag whenever the user selected to update the index.	2024-08-11 17:24:11 +08:00
Alexander Matyasko	0e9e9648e6	Fix emacs if syntax	2024-08-11 17:24:11 +08:00
sabaimran	6f94a076f7	Add conversation_id parameter to the create_automation method	2024-08-11 10:45:13 +05:30
sabaimran	acb825f4f5	Bug fixes for automations - Pass the new conversation id as kwarg for the scheduled_chat function - For edit automations, re-use the original conversation id - Parse images correctly for image automations	2024-08-11 10:41:43 +05:30
Debanjum Singh Solanky	5075d13902	Give visual feedback when interact with chat message feedback buttons - Use color to provide visual feedback when hover, click on feedback buttons - Use color to provide visual feedback when hover on speech, copy buttons click - Add cooldown period before being able to send feedback on that message again. Avoids inadvertent multiple consecutive clicks on feedback buttons	2024-08-10 20:09:52 +05:30
Debanjum Singh Solanky	b3c6c8c84b	Add OpenGraph metadata to web app pages to improve social share links	2024-08-10 18:14:05 +05:30
Debanjum Singh Solanky	fc411091c8	Add apple favicon, load favicons for each web app page from assets folder	2024-08-10 18:14:05 +05:30
Debanjum Singh Solanky	a7623e64fa	Move Khoj webmanifest, assets to new web app public directory	2024-08-10 18:14:04 +05:30
sabaimran	af1d4b9ba4	Remove the premium requirement for speech for now	2024-08-10 14:10:12 +05:30
sabaimran	1d581464e6	Filter out any undefined agents when rendering the home page	2024-08-10 13:33:55 +05:30
sabaimran	acf1c14122	Release Khoj version 1.20.3	2024-08-09 18:11:11 +05:30
sabaimran	7d3a25f8c0	Handle processing case for the schedule leader process lock when it's empty	2024-08-09 16:37:06 +05:30
sabaimran	faf3584acd	Fix automations edit button	2024-08-09 14:21:11 +05:30
sabaimran	5ef198a5b2	Improve default background color styling for inputs	2024-08-08 18:08:05 +05:30
sabaimran	c08b9e89f0	Update test_db_lock with new function name	2024-08-08 13:03:01 +05:30
sabaimran	64b2073e63	In the time-based job for managing the schedule leader, add logic to create a new lock when the current one is expired.	2024-08-08 12:42:59 +05:30
sabaimran	7ee0d9067d	Fix apostrophe issue in copy text when commandempty in settings page	2024-08-08 11:41:10 +05:30
sabaimran	f28693c8c7	create a useismobilewidth method for standardized mobile view detection.	2024-08-07 21:04:44 +05:30
sabaimran	2943bed5d4	Update category colors	2024-08-07 18:51:31 +05:30
sabaimran	37afa3411f	Improve the file upload experience in the settings page	2024-08-07 18:51:20 +05:30
sabaimran	1ee21f5150	Add support for showing files outside of conversation view and linking people to manage files in settings	2024-08-07 18:50:53 +05:30
sabaimran	93f4ceabc1	Add drag/drop file upload support to the chat input area	2024-08-07 18:50:19 +05:30
sabaimran	370ebdee24	Standardized the mobile width calculation	2024-08-07 18:49:06 +05:30
sabaimran	52fed6023f	Overlay the side panel on top of other content	2024-08-07 18:46:06 +05:30
Alexander Matyasko	823f8d58bb	Add model_config for crossencoder model Add model_config for crossencoder model, so the user can use models which require trust_remote_code.	2024-08-07 18:00:12 +08:00
sabaimran	09b71846be	Remove favicon.ico as it's interfering with favicon rendering in the home page	2024-08-07 11:53:25 +05:30
Debanjum Singh Solanky	167ef000f4	Fix chat API for non-streaming mode json response	2024-08-06 19:27:54 +05:30
sabaimran	00ee4c2697	Release Khoj version 1.20.2	2024-08-06 16:16:33 +05:30
sabaimran	d4a8ff0683	Support workflow dispatch events for running the pypi.yml job	2024-08-06 15:55:39 +05:30
sabaimran	ccccb8e7e6	Just ignore the static directory output by django's static collection	2024-08-06 15:51:54 +05:30
sabaimran	c4be3b43e5	Add the compiled folder to the list of directories to look through for static templates	2024-08-06 14:50:44 +05:30
sabaimran	265d2a79be	Remove duplicate assets from being included in the pypi output	2024-08-06 13:51:37 +05:30
sabaimran	24d0fdb262	Fix directory references in pypi.yml configuration for compiled folder	2024-08-06 13:38:34 +05:30
sabaimran	23b1b36f8c	Fix directory references in pypi.yml configuration for compiled folder	2024-08-06 13:31:42 +05:30
sabaimran	81c75e1024	Fix static file folder path for the pypi build - Since the .gitignore will ignore any of the assets in the src/ folder when building the package wheel, we need to output the static assets to another folder just for the python pypi package. Use /compiled for this.	2024-08-06 13:24:26 +05:30
sabaimran	694f551625	Fix mkdir step when copying generated files	2024-08-06 10:17:56 +05:30
sabaimran	7607abc726	Release Khoj version 1.20.1	2024-08-06 10:05:41 +05:30
sabaimran	e9f9d92989	Try to manually copy the built files into where the src directory should be for the pypi build	2024-08-06 10:05:06 +05:30
Debanjum	c23688e2de	Fixes and Improvements Post Spring UX Release (#880 ) - Auto focus on email input on login screen for smoother login experience - Use file icon associated with search page results. Improve search bar - Show logged in user's email in nav menu for context - Use previous icons with eyes for search, agents and automations items in nav menu	2024-08-05 14:32:31 -07:00
Debanjum Singh Solanky	a4388c5e65	Use custom Khoj Icons for Search, Agents & Automation in Nav Menu - Update agents, automations, search svg icons	2024-08-06 02:55:29 +05:30
sabaimran	e9d6899fc2	Change the way the export is created for the pypi package in order to transfer static files out of the tmp shell	2024-08-05 22:46:54 +05:30
sabaimran	b17577c138	Fix configuration for default voice model settings	2024-08-05 19:57:21 +05:30
Debanjum Singh Solanky	ec106d743d	Use file icon associated with search page results. Improve search bar	2024-08-05 19:24:39 +05:30
Debanjum Singh Solanky	4258392fc7	Auto focus on email input on login screen for smoother login experience	2024-08-05 19:24:16 +05:30
Debanjum Singh Solanky	020a956c89	Show user email address on settings menu for logged in account context	2024-08-05 19:19:47 +05:30
sabaimran	998d08f155	Fix logic for deletion to automatically re-render the side pane	2024-08-05 18:07:20 +05:30
sabaimran	20d95dc45e	Add the favicon.ico file to the public directory of app.khoj.dev	2024-08-05 18:04:03 +05:30
sabaimran	1eab6c8590	Add additional icons for agents, pencil line and chalkboard	2024-08-05 17:23:29 +05:30
sabaimran	bafda233e2	Add standalone khoj_domain for allowed_hosts	2024-08-05 17:11:37 +05:30
Debanjum Singh Solanky	e412ed3bcb	Release Khoj version 1.20.0	2024-08-05 16:25:21 +05:30
Debanjum Singh Solanky	9f785dbafe	Format web app package.json using prettier	2024-08-05 16:23:31 +05:30
Debanjum Singh Solanky	7d3a208f8b	Update bump version script to bump new next.js web app version too	2024-08-05 16:20:47 +05:30
sabaimran	2a63439b16	Merge pull request #879 from khoj-ai/features/migrate-to-spring-ui Migrate all existing pages except login to the new spring ui	2024-08-05 03:45:02 -07:00
sabaimran	b7ed32f455	Merge branch 'master' of github.com:khoj-ai/khoj into features/migrate-to-spring-ui	2024-08-05 16:12:46 +05:30
sabaimran	7e6b611a19	Fix typo for Obsidian	2024-08-05 15:55:06 +05:30
sabaimran	34d54c75f7	Lint new changes again	2024-08-05 15:54:50 +05:30
Debanjum Singh Solanky	7cb14ff07a	Add dev setup script. Run prettier on web app pre-commit	2024-08-05 15:49:31 +05:30
sabaimran	91047d1619	Use a png for the windows desktop icon	2024-08-05 15:29:30 +05:30
sabaimran	1151d14466	Add a separate windows object in the todesktop configuration	2024-08-05 14:27:56 +05:30
sabaimran	c56072aa7b	Update todesktop runtime and use the icns file for the todesktop configuration	2024-08-05 14:19:38 +05:30
sabaimran	484b0aa96b	Use the newer, simpler favicon across desktop and documentation. Update the macos icon set	2024-08-05 14:06:04 +05:30
sabaimran	1b35a3b16e	Fix link to login in the nav menu	2024-08-05 12:32:19 +05:30
sabaimran	5a5bbe3852	Remove deprecated views, assets	2024-08-05 12:31:47 +05:30
sabaimran	c61b289bd1	Migrate all existing pages except login to the new spring ui	2024-08-05 12:17:56 +05:30
sabaimran	f835e330b8	Fix selection of icons, colors, add examples for personal finance	2024-08-05 12:08:18 +05:30
sabaimran	af6a70c9fb	Fix fuschia spelling in the colorutils file as well	2024-08-05 11:51:45 +05:30
sabaimran	e0775446c9	fix spelling of fuschia :(	2024-08-05 11:50:11 +05:30
sabaimran	de1cd8c264	Clean up some of the suggestions code, improve randomness of cards	2024-08-05 11:19:50 +05:30
sabaimran	37e261ff93	Show connected icon when files or notion is indexed	2024-08-05 10:33:18 +05:30
sabaimran	8bc28fb11d	Merge pull request #878 from khoj-ai/features/big-upgrade-chat-ux Spring UI: Modernize UX for normie development	2024-08-04 21:32:18 -07:00
sabaimran	22cfedcaff	In the chat history side panel, order conversations by updated time	2024-08-05 09:48:00 +05:30
sabaimran	8220dc6115	Include the updated_at datetime when returning a conversation session	2024-08-05 09:47:13 +05:30
Debanjum Singh Solanky	e296d387e1	Clean duplicate title shown in reference snippets of hierarchical docs Hierarchical documents like org-mode, markdown have their ancestry shown in first line. Remove it to show cleaner, deduplicated reference text from org-mode, markdown files	2024-08-05 04:59:06 +05:30
Debanjum Singh Solanky	95c2a52775	Show file icons in references for first party supported document types Add org, markdown, pdf, word, icon and default file icons to simplify identifying file type used as reference for generating chat response	2024-08-05 04:59:06 +05:30
Debanjum Singh Solanky	18a973b666	Fix name of Khoj logo component file in web app	2024-08-05 04:59:06 +05:30
Debanjum Singh Solanky	842036688d	Format next.js web app with prettier	2024-08-05 04:59:06 +05:30
Debanjum Singh Solanky	41bdd6d6d9	Throw warning on prettier formatting issues in web app	2024-08-05 03:58:20 +05:30
Debanjum Singh Solanky	1cdfa8087c	Update Khoj tagline to "Your Second Brain"	2024-08-05 02:27:05 +05:30
Debanjum Singh Solanky	46f928165c	Fix deep linking to settings page cards from docs	2024-08-05 02:27:05 +05:30
sabaimran	f7840782a4	Fix broken rendering of math equations via katex	2024-08-05 00:20:43 +05:30
Debanjum Singh Solanky	b803ed19d3	Add simplified, cleaner khoj logo images to web app static dir	2024-08-04 23:40:21 +05:30
sabaimran	69c3635ce7	Merge pull request #877 from khoj-ai/features/fit-and-finish-new-ux Fit and finish updates for the new UX	2024-08-04 10:26:33 -07:00
Debanjum Singh Solanky	51e56e17ee	Align padding of agent pills to home screen chat input on small screens	2024-08-04 21:57:54 +05:30
Debanjum Singh Solanky	b744dffefd	Align voice message button with send chat message button style	2024-08-04 21:04:38 +05:30
Debanjum Singh Solanky	70f670dcf7	Show send button when text in chat input else voice message button Utilize chat footer space more efficiently. This is especially useful on small screens - Send button is anyway only enabled when there is text in chat input - Otherwise voice message button is better to show by default	2024-08-04 19:25:49 +05:30
Debanjum Singh Solanky	c627527a6f	Reorder automation card actions buttons. Put Delete action last	2024-08-04 19:01:11 +05:30
Debanjum Singh Solanky	c7b67a978e	Align agents and automation page structure, widths and spacings - Remove invalid call to styles.main - Remove unnecessary top padding above side pane to keep side pane at consistent position across web app - Use same pageLayout styles and styling structure on agent like automation - Vertically center automation section and page title on its row - Fix applying flex vs grid with tailwind	2024-08-04 19:01:11 +05:30
Debanjum Singh Solanky	60af173c4a	Improve responsive spacing of chat page footer buttons - Remove x axis footer padding on small screens to preserve space, keep equal spacing between footer items - Add 1rem margin to buttons to not have overlap in boundary - Add 1rem y-axis padding to chat footer to not have focus boundary leave the footer boundary on smaller screens	2024-08-04 19:01:10 +05:30
sabaimran	4f2fcc82f0	Make the input area only rounded on the top corners when in mobile view - Create better styling for the input area buttons, resizing in mobile and creating more even height with a more minimal send button	2024-08-04 18:28:33 +05:30
sabaimran	322fb34d4b	Add top padding to the automations header to align it with the agents page	2024-08-04 12:27:37 +05:30
sabaimran	3e1e4a1857	Move the clients section back to the bottom	2024-08-04 11:32:22 +05:30
Debanjum Singh Solanky	caf5c3d74c	Link to Khoj manifest in home page metadata to support PWA install Installing Khoj as PWA was supported in previous web UX as well. This just adds link to the existing webmanifest to continue support for installing Khoj as PWA with new web UX	2024-08-04 05:06:38 +05:30
Debanjum Singh Solanky	692058bbdd	Fix time of day calculation logic Previously between 00:00 - 04:00 it'd trigger afternoon instead of evening	2024-08-04 04:53:50 +05:30
Debanjum Singh Solanky	015c155582	Simplify structure of chat page to match other pages	2024-08-04 04:43:55 +05:30
Debanjum Singh Solanky	bf71e472c4	Load static assets from Khoj server in dev environment	2024-08-04 04:25:48 +05:30
Debanjum Singh Solanky	f38c072f07	Update chat session title in side pane to new title after rename Previously the rename wasn't updating the chat session title. We'd have to refresh the page or side pane to get latest chat session names after rename action.	2024-08-04 04:25:48 +05:30
Debanjum Singh Solanky	2f7a8698a0	Fix width and equalize spacing between buttons in chat footer Previously the footer's right border wasn't visible on small screens due to usage of w-full Use mr-1 on send button instead of px-1 on chat input parent to equalize chat footer buttons spacing	2024-08-04 04:25:48 +05:30
Debanjum Singh Solanky	5541bc09c8	Prefix Khoj page breadcrumbs to chat page title for orientation Allows tab search by looking at standard prefix. Still allows page title based identification of different Khoj chat sessions	2024-08-04 04:25:48 +05:30
Debanjum Singh Solanky	6a9865ace7	Only show API keys card in non anon mode - Show informative toast messages on copy, delete of API keys - Only show API keys card in non anonymous mode. API keys aren't required (and are disabled on server side) in anon mode. Not showing card at all in anon mode reduces chance of unnecessary confusion	2024-08-04 04:25:45 +05:30
Debanjum Singh Solanky	f28208d35b	Only show chat sessions up until last month in side pane - Reduce chat title size	2024-08-04 01:52:08 +05:30
sabaimran	75559a55aa	only show search if logged in. update agents icon	2024-08-03 23:23:03 +05:30
sabaimran	185dcb61f7	Update the settings page to better match the design	2024-08-03 20:49:19 +05:30
sabaimran	3e74d383fe	Strip quotes from the response mode llm response	2024-08-03 17:33:20 +05:30
sabaimran	87e97e40f4	Resolve various warnings during export	2024-08-03 17:33:04 +05:30
sabaimran	5a75f2c00f	Use filled icons when side panel is open	2024-08-03 15:42:49 +05:30
sabaimran	e6260a7bb6	Improve padding for home page chat input area and inc margin on api keys	2024-08-03 15:33:33 +05:30
Debanjum Singh Solanky	7a8a9fc807	Auto focus cursor on search input when open search page	2024-08-03 13:52:36 +05:30
Debanjum Singh Solanky	30304ccc56	Fix session drawer to fit title, action triple-dot in width on mobile	2024-08-03 13:52:36 +05:30
Debanjum Singh Solanky	5b17fa5dda	Set home, chat page height so footer, header visible w/o scroll on phone Set dynamic view height of page to 100%	2024-08-03 13:52:36 +05:30
sabaimran	687a881ad2	Remove the agents header in the loading view	2024-08-03 13:44:56 +05:30
sabaimran	0db630a123	image cards should be /image, not /paint	2024-08-03 13:44:31 +05:30
sabaimran	261f62e353	Fix automations mobile view by using a wrapper component that chooses a dialog or a drawer	2024-08-03 13:44:17 +05:30
sabaimran	4ce17acd00	Set greeting message to longer text in default view. Only show two agents in mobile	2024-08-03 12:14:58 +05:30
sabaimran	6c35ee4960	Revert height of the side panel on the home page	2024-08-03 11:59:07 +05:30
Debanjum Singh Solanky	e66adf60c5	Have the home and chat page take full height, reduce greeting top space	2024-08-03 11:54:12 +05:30
Debanjum Singh Solanky	cf8745ef78	Improve structure of chat footer on mobile to put agents above input	2024-08-03 11:31:57 +05:30
Debanjum Singh Solanky	529ffdb7e3	Make Title, Chat Footer Icons larger to ease click, tap on Mobile	2024-08-03 11:23:29 +05:30
Debanjum Singh Solanky	8d1c5226ec	Remove unnecessary debug logs	2024-08-03 09:55:31 +05:30
sabaimran	f136214290	Improve the nav menu in the not logged in experience	2024-08-03 09:44:04 +05:30
sabaimran	f9606ce9b7	Merge branch 'features/fit-and-finish-new-ux' of github.com:khoj-ai/khoj into features/fit-and-finish-new-ux	2024-08-03 09:34:04 +05:30
Debanjum Singh Solanky	d8fe677933	Prevent overflow on Search page by search results	2024-08-03 07:07:35 +05:30
Debanjum Singh Solanky	f3765a20b9	Improve content alignment on automation page for small screens - Left align email, location, timezone pills on small screens - Indent user enabled automations to improve delineation between sections	2024-08-03 07:05:15 +05:30
Debanjum Singh Solanky	a6e1b2c7cb	Style nav menu button and expand nav menu item click area to full-width Style profile picture button on nav menu - Use primary colored ring around subscribed user profile on nav menu - Use gray colored ring around non-subscribed user profile on nav menu - Use upper case initial as profile pic for user with no profile pic - Click anywhere on nav menu item to trigger action Previously the actual clickable area was smaller than the width of the nav menu item	2024-08-03 05:43:24 +05:30
Debanjum Singh Solanky	eed9e401a2	Improve alignment of title bar elements	2024-08-03 04:11:58 +05:30
sabaimran	f188396395	Prompt to login when authenticated, click on suggestion card - Improve styling for the side panel when not logged in	2024-08-03 01:42:32 +05:30
sabaimran	9c5ff1699a	Use new nav menu alignment in the settings page	2024-08-03 01:42:32 +05:30
sabaimran	b1d3979ed9	Fix navmenu in settings, share/chat pages	2024-08-03 01:42:21 +05:30
sabaimran	5f8b76c8f2	Fix layout/styling of the factchecker app	2024-08-03 01:07:59 +05:30
sabaimran	1bb746aaed	Adjust spacing when side panel is opened	2024-08-03 01:07:59 +05:30
sabaimran	07b3bdf181	Update nav menu styling to include everything in one header - Move the nav menu into the chat history side panel component, so that they both show up on one line - Update all pages to use it with the new formatting - in mobile, present the sidebar button, home button, and profile button evenly centered in the middle	2024-08-03 01:07:55 +05:30
Debanjum Singh Solanky	e62888659f	Only show greeting once userConfig is fetched from server - Pass userConfig from Home as prop to chatBodyData component with loading state - Pass loading state of userConfig to allow components to handle rendering dependent elements once it is loaded	2024-08-02 20:25:09 +05:30
Debanjum Singh Solanky	0adee07d40	Update home page greetings to use user name, when available	2024-08-02 20:25:09 +05:30
sabaimran	bbe7491f2f	Prompt to login when authenticated, click on suggestion card - Improve styling for the side panel when not logged in	2024-08-02 20:12:18 +05:30
sabaimran	d48a789442	Use new nav menu alignment in the settings page	2024-08-02 19:44:30 +05:30
sabaimran	e6014e89bf	Fix navmenu in settings page	2024-08-02 19:28:59 +05:30
sabaimran	1509c536f9	Fix layout/styling of the factchecker app	2024-08-02 19:06:01 +05:30
sabaimran	0d8cdee60a	Adjust spacing when side panel is opened	2024-08-02 17:49:50 +05:30
sabaimran	d3c07a098d	Update nav menu styling to include everything in one header - Move the nav menu into the chat history side panel component, so that they both show up on one line - Update all pages to use it with the new formatting - in mobile, present the sidebar button, home button, and profile button evenly centered in the middle	2024-08-02 17:46:13 +05:30
sabaimran	5a8ea884a9	Use new HTTP stream format within the new UX Use updated format for HTTP streamed responses from the Khoj server in the new chat UX Remove references to the websocket connected field, as websocket use has been deprecated	2024-08-02 02:35:10 -07:00
Debanjum Singh Solanky	02b46a1784	Render references after chat response is streamed for smoother render Otherwise the Khoj's chat response is filling up in between the streamed message and already rendered references section at the bottom of the message Define OnlineContext type to simplify typing online context param across other interfaces and functions	2024-08-02 14:11:34 +05:30
Debanjum Singh Solanky	a733e5c1d4	Remove unused handleCompiledReferences chat functions	2024-08-02 13:18:55 +05:30
Debanjum Singh Solanky	7858aff2e2	Trigger welcomeConsole only once on chat, shared chat page load	2024-08-02 13:18:01 +05:30
Debanjum Singh Solanky	cab0957fd3	Just show Khoj logo on title bar on small screens Continue to show logo + text on larger screens	2024-08-02 13:18:01 +05:30
Debanjum Singh Solanky	3f607b3978	Add icons, improve description of home, chat & search page metadata	2024-08-02 13:18:01 +05:30
Debanjum Singh Solanky	4f783b911c	Update DOMPurify imports correctly to resolve compilation warnings	2024-08-02 13:18:01 +05:30
sabaimran	4492017b96	Move processmessagechunk file into a common chat function	2024-08-02 12:31:43 +05:30
sabaimran	13dee7d89e	Remove status update for understanding query	2024-08-01 19:22:21 +05:30
sabaimran	6babd5c0ce	Merge pull request #876 from khoj-ai/features/use-intl-phone-input-settings Use international phone number input and verify whatsapp flow	2024-08-01 04:52:02 -07:00
sabaimran	1b2cad2a2c	Use af in the default state and configure the phone number input styling	2024-08-01 17:04:57 +05:30
sabaimran	723b37955a	Disable input for phone number only if its pending verification	2024-08-01 16:45:38 +05:30
sabaimran	84dd1b57fe	Use an intl phone input number field and fix the whole verification flow - There were some state mismatches in configuring a whatsapp number. This commit fixes those issues and uses an external library for phone number validation	2024-08-01 16:44:17 +05:30
sabaimran	ed16914ac3	Remove deprecated fields and fix erroneous export in settings page	2024-08-01 14:45:54 +05:30
sabaimran	7941f4d54d	Remove references to deprecated setupwebsocket function	2024-08-01 14:43:17 +05:30
sabaimran	db93ac5d4b	Merge branch 'features/big-upgrade-chat-ux' of github.com:khoj-ai/khoj into features/use-new-sse-in-new-chat-ux	2024-08-01 14:41:50 +05:30
sabaimran	fd0e0405af	Fix logic for setting and sending the initial chat message from the home page - Load agents only once when the page loads, rather than triggering constant re-renders	2024-08-01 13:53:16 +05:30
sabaimran	9a43622cef	Remove usages of the websocketconnected variable	2024-08-01 13:14:23 +05:30
sabaimran	bfeb64b48f	Migrate the shared chat page to also use the new SSE streaming format	2024-08-01 13:14:09 +05:30
sabaimran	833553c3a3	Move conversation commands selection earlier to include in telemetry collected	2024-08-01 12:52:41 +05:30
sabaimran	dbbcf2564f	Remove the usage of emojis in the incremental status updates	2024-08-01 12:52:05 +05:30
sabaimran	cd85a51980	Ingest new format for server sent events within the HTTP streamed response - Note that the SSR for next doesn't support rendering on the client-side, so it'll only update it one big chunk - Fix unique key error in the chatmessage history for incoming messages - Remove websocket value usage in the chat history side panel - Remove other websocket code from the chat page	2024-08-01 12:50:43 +05:30
Debanjum	60870a7a3e	Create Settings Page in new Web App (#872 ) - Details - Add Profile Client, Content Sections - Make Multi Step Cards for Whatsapp, Files, Notion Integrations - Align Settings page with new Baraabar UX	2024-07-30 06:59:42 -07:00
Debanjum Singh Solanky	32ce564b7c	Remove unused Files Connect button and setup Github content card	2024-07-30 18:55:14 +05:30
Debanjum Singh Solanky	ecb873c488	Only allow search model to be updated without being subscribed Do not make fetch request to server if user is not subscribed	2024-07-30 18:50:57 +05:30
Debanjum Singh Solanky	f58cff5bcc	Increase rate limit in the api/content vs deprecated indexer API	2024-07-30 16:09:26 +05:30
Debanjum Singh Solanky	f0bb6883f8	Improve Delete experience on Files Card in Settings Page Improve placeholder text for notion API key and Whatsapp number (mention country code required)	2024-07-30 15:25:14 +05:30
sabaimran	b1eb564706	remove the optional pydantic typing from the files param	2024-07-30 15:25:14 +05:30
sabaimran	4a7efdc552	Use patch in place of put in the indexer API call, ensure that files are not being required in the indexer path	2024-07-30 15:25:14 +05:30
Debanjum Singh Solanky	ffbf57292c	Create synced files management modal on the settings page Use a Command Dialog to allow easier filtering of files to view without having to leave the settings page	2024-07-30 15:25:14 +05:30
Debanjum Singh Solanky	ccc46a09b5	Add new API to batch delete a list of files by filename - Rearrange DELETE content API definitions order to go from more specific to more general - Create batched file deletion DB adapter	2024-07-30 15:25:14 +05:30
Debanjum Singh Solanky	9d86cb57ac	Build UX to Connect and Manage Notion Integration via Settings Page	2024-07-30 15:25:14 +05:30
Debanjum Singh Solanky	7ee179ee1f	Return user's Notion token in API call for detailed user settings	2024-07-30 15:25:14 +05:30
Debanjum Singh Solanky	00a908ae12	Move subscription card to Profile settings section. Remove Billing section - Why Profile section and billing section looked too empty (1 card each). Combining them makes the setting page look more complete. Shows subscription options early on - Details - Made Futurist text orange - Made Unsubscribe a down button instead of cloud slash - Updated toast title to subscription - Improve Futurist trial title and description	2024-07-30 15:25:14 +05:30
Debanjum Singh Solanky	058c902dc7	Delete unused npm package-lock.json as Web app uses yarn.lock instead	2024-07-30 15:25:14 +05:30
Debanjum Singh Solanky	b8c9b3ffa3	Reduce padding height of input area on new home page	2024-07-30 15:25:14 +05:30
Debanjum Singh Solanky	8a447107dd	Set user name on clicking Save button on settings page	2024-07-30 15:25:14 +05:30
Debanjum Singh Solanky	44e0b20202	Align Content, Client & Billing settings sections with new designs	2024-07-30 15:25:14 +05:30
Debanjum Singh Solanky	51e83bcc26	Improve responsive behavior of settings cards	2024-07-30 15:25:14 +05:30
Debanjum Singh Solanky	efcad4996d	Add phone number verification for Whatsapp to new settings page	2024-07-30 15:25:14 +05:30
Debanjum Singh Solanky	48548684c0	Add card to connect Whatsapp to Khoj on settings page	2024-07-30 15:25:14 +05:30
Debanjum Singh Solanky	8ec90f194f	Add title icons for each content section card on settings page	2024-07-30 15:25:14 +05:30
Debanjum Singh Solanky	60cdf61737	Create billing section for managing subscription on settings page - Replicate behavior on current settings.html page - Improve text for each subscription state to make it more informative, fun	2024-07-30 15:25:14 +05:30
Debanjum Singh Solanky	2e165a0e0a	Create client API keys section on settings page - Add table shadcn component to use in API keys settings section - In dev mode, route requests to auth to khoj server at localhost:42110	2024-07-30 15:25:14 +05:30
Debanjum Singh Solanky	00fa4fa0fa	Save model on selecting model in dropdown. No extra save action reqd - Remove now unnecessary button to Save in Card with dropdown - Use toast to show success, failure (not working) - Rename language to search, Move it to features section. Add icon to the card	2024-07-30 15:25:14 +05:30
Debanjum Singh Solanky	13292fc4ca	Add icons to card headings	2024-07-30 15:25:14 +05:30
Debanjum Singh Solanky	a5a06da3fc	Use Dropdown component for model options. Make cards more responsive - Ensure model name doesn't stretch or shrink dropdown width from parent card width - Ensure buttons flex wrap on smaller displays	2024-07-30 15:25:14 +05:30
Debanjum Singh Solanky	ade2f6f5d1	Rename selected voice model in get config API response for consistency - Update references in new and old web client settings - Arrange new client settings props and add header comments similar to - config response for code readability	2024-07-30 15:25:14 +05:30
Debanjum Singh Solanky	b3253562a5	Dynamically set Content cards buttons based on already setup or not	2024-07-30 15:25:14 +05:30
Debanjum Singh Solanky	7e8e80f29e	Create config page using tailwind, shadcn components, styling - Include side pane but with only the account info in it - Replicate styling of the old config page	2024-07-30 15:25:14 +05:30
Debanjum Singh Solanky	88007d7552	Get user config in the new web client from the new user config APIs	2024-07-30 15:25:14 +05:30
sabaimran	a6339bb973	Add more card suggestions and simplify color selection for suggestion cards	2024-07-29 19:11:39 +05:30
sabaimran	551630f0f1	Code clean-up and some fit and finish - Add a lot more suggestions cards, improve mobile rendering of suggestion cards, improve alignment of chat input, shift message when starts recording voice, remove dead code	2024-07-28 15:19:36 +05:30
sabaimran	413255ddc7	Add closing tag to whatsapp qr code image	2024-07-28 13:50:38 +05:30
sabaimran	41eb85c933	Update the docs for whatsapp to include the QR code	2024-07-28 13:43:50 +05:30
sabaimran	1a1d9c7257	Merge branch 'master' of github.com:khoj-ai/khoj into features/big-upgrade-chat-ux	2024-07-27 14:18:05 +05:30
Raghav Tirumale	1685c60e3c	Nav Menu Upgrades and Minor UX Improvements (#869 ) * Converted navigation menu into a dropdown menu * Moved collapsed side panel menu icons into top row * Auto refresh when conversation is deleted to update side panel and route back to main page if deletion is on current conversation * Highlight the current conversation in the side panel * Dynamic homepage messages with current day and time of day. * `colorutils` upgraded to have more expansive tailwind color options and dynamic class name generation. * Converted create agent button alert into shadcn `ToolTip` * Colored lines and icons for agents in chat window * Cleaned up border styling in dark mode * fixed three dot menu in side panel to be easier to click * Add the KhojLogo import in the nav menu and use a default user profile icon when not authenticated * Get rid of custom --box-shadow CSS variable * Pass the agent metadata through the chat body data in order to style the send button * Add login to the unauthenticated login view, redirect to home if conversation history not loaded * Set a max height for the input text area * Simplify tailwind class names --------- Co-authored-by: sabaimran <narmiabas@gmail.com>	2024-07-27 14:12:00 +05:30
Debanjum	8503d7a07b	Split Configure API into Content, Model API paths (#857 ) ## Major: Breaking Changes - Move API endpoints under /configure/<type>/model to /api/model/<type> - Move API endpoints under /api/configure/content/ to /api/content/ - Accept file deletion requests by clients during sync - Split /api/v1/index/update into /api/content PUT, PATCH API endpoints ## Minor: Create New API Endpoint - Create API endpoints to get user content configurations Related: #852	2024-07-26 23:48:41 -07:00
Debanjum Singh Solanky	878cc023a0	Fix and improve openai chat actor tests - Use new form of passing doc references to now passing chat actor test - Fix message list generation from conversation logs provided Strangely the parent conversation_log gets passed down to message_to_log func when the kwarg is not explicitly specified	2024-07-26 23:53:47 +05:30
Debanjum Singh Solanky	a47a54f207	Pass user name to document and online search actors prompts This should improve the quality of personal information extraction from document and online sources. The user name is only used when it is set	2024-07-26 23:53:17 +05:30
sabaimran	e86143dbb0	Merge pull request #867 from khoj-ai/features/search-page-v2 Update the search page	2024-07-26 08:08:04 -07:00
sabaimran	eb5af38f33	Release Khoj version 1.17.0	2024-07-26 20:14:45 +05:30
Raghav Tirumale	5dcac18ba5	New Agents Page User Interface (#866 ) Changes for new agents page - Modernized agent cards - Responsive design to support mobile users - Button for users to create their own agents (coming soon) - Optimized to use tailwind and icon utils - Side panel added for quick access to conversations	2024-07-26 20:12:31 +05:30
Debanjum Singh Solanky	3daef910c0	Remove stale comment from api content	2024-07-26 20:05:35 +05:30
sabaimran	44d34f9090	Update the unit test for the subscribed user	2024-07-26 19:59:01 +05:30
sabaimran	377f7668c5	Merge pull request #858 from khoj-ai/use-sse-instead-of-websocket Use Single HTTP API for Robust, Generalizable Chat Streaming	2024-07-26 07:11:54 -07:00
sabaimran	6607e666dc	Increase rate limit for data upload packet size in indexer.py	2024-07-26 19:35:32 +05:30
Debanjum Singh Solanky	778c571288	Use enum to track chat stream event types in chat api router	2024-07-26 00:19:43 +05:30
sabaimran	7482797605	Add some better default states for no files found, prompt to search. Add link to search in the file search component in side panel	2024-07-25 13:00:28 +05:30
sabaimran	662dffea3b	Press enter to search	2024-07-24 19:28:38 +05:30
sabaimran	19cd607c96	Style the see content button correctly	2024-07-24 18:28:23 +05:30
sabaimran	75a370cc06	Implement focus mode to click into full text of the note	2024-07-24 18:00:33 +05:30
sabaimran	5adbfe14ab	Add a search page that just renders truncated results when you click search	2024-07-24 17:43:19 +05:30
sabaimran	52db15706d	Remove unused styling	2024-07-24 17:42:36 +05:30
sabaimran	cfe7a1068e	Update the navmenu title if prop is updated and undefined	2024-07-24 17:41:31 +05:30
Debanjum Singh Solanky	ebe92ef16d	Do not send references twice in streamed image response Remove unused image content to reduce response payload size. References are collated, sent separately	2024-07-24 17:18:14 +05:30
Debanjum Singh Solanky	37b8fc5577	Extract events even when http chunk contains partial or multiple events Previous logic was more brittle to break with simple unbalanced '{' or '}' string present in the event data. This method of trying to identify valid json obj was fairly brittle. It only allowed json objects or processed event as raw strings. Now we buffer chunk until we see our unicode magic delimiter and only then process it. This is much less likely to break based on event data and the delimiter is more tunable if we want to reduce rendering breakage likelihood further	2024-07-24 17:17:39 +05:30
sabaimran	4d30e5b158	Fix indexing error for notion, expecting image and docx in dict	2024-07-24 16:58:31 +05:30
sabaimran	694bedc25b	Add support for text to speech and speech to text (#863 ) - Add support for text to speech, speech to text. Add loading and responsive indicators to reflect state. - When streaming for speech to text, show incremental transcription in the message input field - When streaming text to speech, and a pause button in the chat message to allow user to stop playback	2024-07-24 14:36:40 +05:30
Raghav Tirumale	3e4325edab	Upgrade: New Home Screen for Khoj (#860 ) * V1 of the new automations page Implemented: - Shareable - Editable - Suggested Cards - Create new cards - added side panel new conversation button - Implement mobile-friendly view for homepage - Fix issue of new conversations being created when selected agent is changed - Improve center of the homepage experience - Fix showing agent during first chat experience - dark mode gradient updates --------- Co-authored-by: sabaimran <narmiabas@gmail.com>	2024-07-24 13:16:19 +05:30
Debanjum Singh Solanky	70201e8db8	Log total, ttft chat response time on start, end llm_response events - Deduplicate code to collect chat telemetry by relying on end_llm_response event - Log time to first token and total chat response time for latency analysis of Khoj as an agent. Not just the latency of the LLM - Remove duplicate timer in the image generation path	2024-07-23 23:21:12 +05:30
Debanjum Singh Solanky	b36a7833a6	Remove the old mechanism of streaming compiled references Do not need response generator to stuff compiled references in chat stream using "### compiled references:" separator. References are now sent to clients as structured json while streaming	2024-07-23 19:53:51 +05:30
Debanjum Singh Solanky	eb4e12d3c5	s/online_context/onlineContext chat API response field for consistency This will align the name of the online context field returned by current chat message and chat history	2024-07-23 19:50:43 +05:30
Debanjum	498fe2458c	Support Gemma 2 Model Family for Offline Chat (#855 ) ## Overview - Gemma 2 is a new open model family by Google. They've released a 9B, 27B param model. A 2B model is also expected. - It performs really well on the Chatbot arena and shows good performance when testing within Khoj as well. - Llama.cpp support for Gemma 2 architecture seems to have stabilized - If Gemma 2 performs well in further testing, it can be made the default offline chat model for Khoj - Once the 2B param model is released, the model size to download can be automatically chosen based on (V)RAM available ## Major - Support Gemma 2 for Offline Chat - Improve and fix chat model prompts for better, consistent context ## Minor - Fix and improve offline chat actor, director tests - Improve offline chat truncation to consider chat message delimiter tokens	2024-07-23 06:57:02 -07:00
Debanjum Singh Solanky	0277d16daf	Share desktop chat streaming utility funcs across chat, shortcut views Null check menu, menuContainer to avoid errors on Khoj mini	2024-07-23 19:16:33 +05:30
Debanjum Singh Solanky	e439a6ddac	Use async/await in web client chat stream instead of promises Align streaming logic across web, desktop and obsidian clients	2024-07-23 18:17:47 +05:30
Debanjum Singh Solanky	fafc467173	Put loading spinner at bottom of chat message in web client	2024-07-23 18:17:47 +05:30
Debanjum Singh Solanky	fc33162ec6	Use new chat streaming API to show Khoj train of thought in Desktop app Show loading spinner at end of current message	2024-07-23 18:17:47 +05:30
Debanjum Singh Solanky	c5ad172616	Keep loading animation at message end & reduce lists padding in Obsidian Previously loading animation would be at top of message. Moving it to bottom is more intuitive and easier to track. Remove white-space: pre from list elements. It was adding too much y axis padding to chat messages (and train of thought)	2024-07-23 17:56:03 +05:30
Debanjum Singh Solanky	54b4203683	Update chat API client tests to mix testing of batch and streaming mode	2024-07-23 17:56:03 +05:30
Debanjum Singh Solanky	3f5f418d0e	Use new chat streaming API to show Khoj train of thought in Obsidian client	2024-07-23 17:56:03 +05:30
Debanjum Singh Solanky	8303b09129	Convert snake case to camel case in chat view of obsidian plugin	2024-07-23 15:29:12 +05:30
Debanjum Singh Solanky	b224d7ffad	Simplify get_conversation_by_user DB adapter code	2024-07-23 14:51:11 +05:30
Debanjum Singh Solanky	daec439d52	Replace old chat router with new chat router with advanced streaming - Details Only return notes refs, online refs, inferred queries and generated response in non-streaming mode. Do not return train of thought and other status messages Incorporate missing logic from old chat API router into new one. - Motivation So we can halve chat API code by getting rid of the duplicate logic for the websocket router The deduplicated code: - Avoids inadvertent logic drift between the 2 routers - Improves dev velocity	2024-07-23 14:51:11 +05:30
Debanjum Singh Solanky	2d4b284218	Simplify streaming chat function in web client	2024-07-23 14:38:55 +05:30
Debanjum Singh Solanky	6b9550238f	Simplify advanced streaming chat API, align params with normal chat API	2024-07-22 22:51:24 +05:30
Debanjum Singh Solanky	b8d3e3669a	Stream Status Messages via Streaming Response from server to web client - Overview Use simpler HTTP Streaming Response to send status messages, alongside response and references from server to clients via API. Update web client to use the streamed response to show train of thought, stream response and render references. - Motivation This should allow other Khoj clients to pass auth headers and receive Khoj's train of thought messages from server over simple HTTP streaming API. It'll also eventually deduplicate chat logic across /websocket and /chat API endpoints and help maintainability and dev velocity - Details - Pass references as a separate streaming message type for simpler parsing. Remove passing "### compiled references" altogether once the original /api/chat API is deprecated/merged with the new one and clients have been updated to consume the references using this new mechanism - Save message to conversation even if client disconnects. This is done by not breaking out of the async iterator that is sending the llm response. As the save conversation is called at the end of the iteration - Handle parsing chunked json responses as a valid json on client. This requires additional logic on client side but makes the client more robust to server chunking json response such that each chunk isn't itself necessarily a valid json.	2024-07-22 15:41:21 +05:30
Debanjum Singh Solanky	91fe41106e	Convert Websocket into Server Side Event (SSE) API endpoint - Convert functions in SSE API path into async generators using yields - Validate image generation, online, notes lookup and general paths of chat request are handled fine by the web client and server API	2024-07-21 14:20:22 +05:30
sabaimran	9cf52bb7e4	Update automations UX for more consistency (#856 ) * Update the automations UI to be a more suitable color distribution based on new designs * Use accented colors for the metadata, update dark mode colors * Update form to use icons as well and render more pretty inline form labels	2024-07-21 12:22:23 +05:30
sabaimran	e694c82343	Fix Docker build issues with yarn / next /node (#859 ) * Rollback node version being installed from nodesource to node 20	2024-07-19 19:11:29 +05:30
sabaimran	1af9dbb083	Switch node/yarn install steps to use more native installation patterns	2024-07-19 17:10:08 +05:30
sabaimran	6d5ca5a3e1	yarn clean cache before build	2024-07-19 16:06:38 +05:30
sabaimran	7f0d1bd414	Add verbose logs when outputting yarn install steps	2024-07-19 15:48:43 +05:30
sabaimran	7426a4f819	Prefetch related agent when retrieving the conversation for performance improvements	2024-07-19 14:43:30 +05:30
Debanjum Singh Solanky	07f36fa95a	Update new web interface with update calls to /content, /model APIs	2024-07-19 12:23:22 +05:30
Debanjum Singh Solanky	f03525f431	Add back /api/configure as /api/settings API endpoint It had been removed during the /api/configure/content to /api/content API migration before	2024-07-19 05:40:34 +05:30
Debanjum Singh Solanky	3832ef0236	Move API endpoints under /api/configure/phone/ to /api/phone/ Pull out /api/configure/phone API endpoints into /api/phone for more concise and sufficiently explanatory API path Refactor Flow 1. Rename /api/configure/phone -> /api/phone	2024-07-19 05:40:34 +05:30
Debanjum Singh Solanky	1197266912	Move API endpoints under /configure/<type>/model to /api/model/<type> Now the API to configure all the AI models is under /api/models. This provides better organization and API hierarchy. The /configure url segment was redundant. - Rename POST /api/phone to PATCH /api/phone - Rename GET /api/configure to GET /api/settings Refactor Flow 1. Move out POST /user/name to main api.py 2. Rename /api/configure/<type>/model -> /api/model/<type> 3. Rename @api_configure to @api_mode 4. Rename file api_config.py to api_model.py	2024-07-19 05:40:34 +05:30
Debanjum Singh Solanky	469a1cb6a2	Move API endpoints under /api/configure/content/ to /api/content/ Pull out /api/configure/content API endpoints into /api/content to allow for more logical organization of API path hierarchy This should make the url more succinct and API request intent more understandable by using existing HTTP method semantics along with the path. The /configure URL path segment was either - redundant (e.g POST /configure/notion) or - incorrect (e.g GET /configure/files) Some example of naming improvements: - GET /configure/types -> GET /content/types - GET /configure/files -> GET /content/files - DELETE /configure/files -> DELETE /content/files This should also align, merge better with the content indexing API triggered via PUT, PATCH /content Refactor Flow 1. Rename /api/configure/types -> /api/content/types 2. Rename /api/configure -> /api 3. Move /api/content to api_content from under api_config	2024-07-19 05:40:34 +05:30
Debanjum Singh Solanky	bba4e0b529	Accept file deletion requests by clients during sync - Remove unused full_corpus boolean. The full_corpus=False code path wasn't being used (except in a test) - The full_corpus=True code path used was ignoring file deletion requests sent by clients during sync. Unclear why this was done - Added unit test to prevent regression and show file deletion by clients during sync not ignored now	2024-07-19 04:53:01 +05:30
Debanjum Singh Solanky	5923b6d89e	Split /api/v1/index/update into /api/content PUT, PATCH API endpoints - This utilizes PUT, PATCH HTTP method semantics to remove need for the "regenerate" query param and "/update" url suffix - This should make the url more succinct and API request intent more understandable by using existing HTTP method semantics	2024-07-19 01:45:53 +05:30
Debanjum Singh Solanky	e9f86e320b	Fix and improve offline chat actor, director tests - Use updated references schema with compiled key - Enable director tests that are now expected to pass and that do pass (with Gemma 2 at least)	2024-07-18 03:43:09 +05:30
Debanjum Singh Solanky	b0ee78586c	Improve offline chat truncation to consider message separator tokens	2024-07-18 03:43:09 +05:30
Debanjum Singh Solanky	6f46e6afc6	Improve and fix chat model prompts for better, consistent context - Add day of week to system prompt of openai, anthropic, offline chat models - Pass more context to offline chat system prompt to - ask follow-up questions - know where to find information about khoj (itself) - Fix output mode selection prompt. Log error if model does not select valid option from list of valid output modes provided - Use consistent names for question, answers passed to extract_questions_offline prompt - Log which model extracts question, what the offline chat model sees as context. Similar to debug log shown for openai models	2024-07-18 03:43:09 +05:30
Debanjum Singh Solanky	53eabe0c06	Support Gemma 2 for Offline Chat - Pass system message as the first user chat message as Gemma 2 doesn't support system messages - Use gemma-2 chat format - Pass chat model name to generic, extract questions chat actors Used to figure out chat template to use for model For generic chat actor argument was anyway available but not being passed, which is confusing	2024-07-18 03:09:38 +05:30
Debanjum Singh Solanky	65dade4838	Create API endpoints to get user content configurations This is to be used by the new Next.js web client	2024-07-17 13:41:14 +05:30
Debanjum	2ab8fb78b1	Migrate the PyPI package to use project name: khoj (#853 ) ### Changes - Deprecate [khoj-assistant](https://pypi.org/project/khoj-assistant) pypi package. Use more accurate and succinct pypi project name, [khoj](https://pypi.org/project/khoj) - Update references to use `khoj` pypi package in docs and code - Update pypi workflow to publish to both khoj, khoj-assistant for now - Update stale python 3.9 support mentioned in our pyproject Can't support python 3.9 as depend on [Django 5.0.7](https://pypi.org/project/Django/5.0.7/) which needs python >=3.10 ### Verify - Updated `pypi.yml` github workflow publishes to both (new) [khoj](https://pypi.org/project/khoj/1.16.1.dev16/), (old) [khoj-assistant](https://pypi.org/project/khoj-assistant/1.16.1.dev16/) pypi projects - Can install Khoj python package with `pip install khoj`	2024-07-17 01:05:51 -07:00
Debanjum	bf815e4463	Refactor Config API and Settings pages for Reuse and Consistency (#852 ) ### Major - Reuse get config data logic across config pages on web client - Make config api endpoint urls and response fields consistent - Rename API path /api/config to /api/configure - Move Web, Desktop client settings page to be under `/settings` from the previous `/config` url path ### Minor - Pass isMobileWidth prop to SidePanel via chat share interface - Turn prettier off instead of throwing error for now - Do not explicitly add line-clamp plugin as it's in Tailwind by default	2024-07-17 01:03:06 -07:00
Debanjum Singh Solanky	a1c362a4f7	Expose web, desktop settings page under /settings, not /configure - Update references to the settings page to use new url across docs and code - Rename desktop and web settings page to settings.html instead of config[ure].html	2024-07-17 13:17:29 +05:30
Debanjum Singh Solanky	b015b0e83d	Arrange config API detailed response fields to improve readability There are a lot of fields being returned. Group returned fields and add comment header to each Group for readability	2024-07-17 13:17:28 +05:30
Debanjum Singh Solanky	71ebf31a54	Make config API detailed response fields more intuitive, consistent - Use name, id for every [search|chat|voice|paint]_model_option - Rename current_model_state field to more intuitive enabled_content_source - Update references to the update fields in config.html	2024-07-17 12:41:01 +05:30
Debanjum Singh Solanky	30d60aaae9	Add, fix Khoj Docker container labels	2024-07-17 10:41:17 +05:30
Debanjum Singh Solanky	583fa3c188	Migrate the pypi package to khoj project name. Update references - Deprecate khoj-assistant pypi package. Use more accurate and succinct pypi project name, khoj - Update references to use khoj pypi package in docs and code instead of the legacy khoj-assistant pypi package - Update pypi workflow to publish to both khoj, khoj-assistant for now - Update stale python 3.9 support mentioned in our pyproject. Can't support python 3.9 as depend on latest django which supports >=3.10	2024-07-17 10:41:16 +05:30
Debanjum Singh Solanky	7316e6b9d3	Pass isMobileWidth prop to SidePanel via chat share interface	2024-07-16 16:13:27 +05:30
Debanjum Singh Solanky	4759c4ac96	Turn prettier off instead of throwing error for now Until web interface code is reformatted with prettier	2024-07-16 16:13:27 +05:30
Debanjum Singh Solanky	466ef3f8f1	Do not explicitly add line-clamp plugin as it's in Tailwind by default	2024-07-16 16:13:27 +05:30
Debanjum Singh Solanky	59000a47cb	Move Desktop config page to /configure from /config url path Update references to point to page at /configure instead of /config	2024-07-16 16:13:27 +05:30
Debanjum Singh Solanky	a5c16ad600	Move Web client config page to /configure from /config url path Update docs, clients and error messages to point to /configure instead of /config	2024-07-16 16:13:27 +05:30
Debanjum Singh Solanky	de15a7a3fc	Rename API path /api/config to /api/configure - Update clients calling /api/config to call /api/configure instead	2024-07-16 16:13:27 +05:30
Debanjum Singh Solanky	dd31936746	Make config api endpoint urls consistent - Consistently use /content/ for data. Remove content-source from path - Remove unnecessary /data/ prefix for API endpoints under /config	2024-07-16 16:13:27 +05:30
Debanjum Singh Solanky	e8176b41ef	Reuse get config data logic across config pages on web client - Put logic to get config data, detailed or basic into router helpers module - Use the get config func across the config pages on web clients - Put configure content and get_notion_auth_url funcs in router helper module to avoid circular import	2024-07-16 16:13:27 +05:30
sabaimran	1a5405e24c	Fix interpretation of day of week in automation form	2024-07-16 10:12:30 +05:30
sabaimran	c837f3779e	Update the agents page with new UX (#850 ) - Use icons/colors for setting the styling of agents - Update automations page to use the shadcn cards: https://github.com/shadcn-ui/ui	2024-07-16 10:10:55 +05:30
sabaimran	1c6ed9bc6d	Migrate the existing automations page to use React (#849 ) Migrates the Automations page to React, mostly keeping the overall design consistent with organization. Use component library, with some changes in color. Add easier management with straightforward form and editing experience. Use system preference for determining dark mode if not explicitly set.	2024-07-15 21:42:33 +05:30
Debanjum	c7764c7470	Fix, Improve Behavior, Styling of Chat View on Web (#851 ) ### Behavior - Close chat sessions side panel on click open a chat session - Show agent profile card with description when hover on agent in chat view - Show action bar on last chat message without hover - Show chat message action buttons without hover on mobile interfaces - Show chat message timestamp on hover in chat view - Show text descriptions of chat message action buttons on hover - Render inline png, webp images generated by Khoj in chat view ### Fixes - Do not render references with broken links in chat view - Fix closing side panel on mobile when click open a chat session - Only open side panel as drawer in mobile view - Constrain chat messages to stay within view port across screen sizes ### Styling: Spacing, Sizing, Mobile Friendly - Make Khoj icon appropriately sized and side panel arrow bold - Conversations list should resize to take max space on side panel - Make loading message, styling configurable. Do not show agent when no data - Improve Train of Thought icons spacing and loading circle - Improve mobile friendly styling of chat session side panel - Improve styling of chat input, references UI across screen sizes - Center cursor in chat input. See upto 2 lines for multi-line context ### Miscellaneous - Add code formatter for web interface with prettier	2024-07-15 08:39:14 -07:00
Debanjum Singh Solanky	6c630bc6c3	Constrain chat messages to stay in view port across screen sizes - Constrain chat messages max width to view port across screen sizes - Wrap references on smaller screens, use tailwind, not js to apply styling	2024-07-15 21:00:50 +05:30
sabaimran	9a5bf4c701	Fix rendering of teaser reference panel in mobile width	2024-07-15 19:40:55 +05:30
sabaimran	2e9275c0f3	Remove side panel padding in desktop view. Fix width in mobile view	2024-07-15 19:33:12 +05:30
Debanjum Singh Solanky	ba0ba6b59f	Merge branch 'features/big-upgrade-chat-ux' of github.com:khoj-ai/khoj into document-styling-on-chat-ux	2024-07-15 10:42:56 +05:30
Debanjum	23f61d49e0	Support syncing, searching images from Obsidian plugin (#847 ) - Sync images from Obsidian vault with Khoj server now that Khoj can OCR images - Support rendering images returned by Khoj search modal	2024-07-14 20:41:39 -07:00
Debanjum Singh Solanky	6f8f846086	Standardize code format for web interface with prettier Use husky, lint-staged to run prettier pre-commit	2024-07-15 00:34:54 +05:30
sabaimran	06dce4729b	Make most major changes for an updated chat UI (#843 ) - Updated references panel - Use subtle coloring for chat cards - Chat streaming with train of thought - Side panel with limited sessions, expandable - Manage conversation file filters easily from the side panel - Updated nav menu, easily go to agents/automations/profile - Upload data from the chat UI (on click attachment icon) - Slash command pop-up menu, scrollable and selectable - Dark mode-enabled - Mostly mobile friendly	2024-07-14 23:18:06 +05:30
Debanjum Singh Solanky	6dd90931e8	Fix closing side panel on mobile when click open a chat session	2024-07-14 22:54:49 +05:30
Debanjum Singh Solanky	47b754c07b	Only open side panel as drawer in mobile view	2024-07-14 14:08:41 +05:30
Debanjum Singh Solanky	b47f30ad77	Make Khoj icon appropriately sized and side panel arrow bold	2024-07-14 14:06:36 +05:30
Debanjum Singh Solanky	e6b21144e2	Conversations list should resize to take max space on side panel	2024-07-14 13:49:36 +05:30
Debanjum Singh Solanky	c2bf405489	Make loading message, styling configurable. Do not show agent when no data - Pass Loading message, class name via props to both inline and normal loading spinners - Pass loading conversation message to loading spinner when chat history is being fetched	2024-07-14 13:00:36 +05:30
Debanjum Singh Solanky	63719747cb	Show agent profile card with description when hover on agent in chat view - Create profile card component. Use it for agent profile card - Pass agent persona from khoj server via API - Put link to agent profile page in the hover card to make it 2 clicks away. Otherwise inadvertent clicks on agent in chat view lead away to agent page - Use tailwind line-clamp extension to clamp card to first two lines	2024-07-14 12:20:11 +05:30
Debanjum Singh Solanky	dbbd4b9777	Show action bar on last chat message without hover	2024-07-14 10:32:31 +05:30
Debanjum Singh Solanky	a0f38e079f	Improve Train of Thought icons spacing and loading circle	2024-07-14 09:35:15 +05:30
Debanjum Singh Solanky	e9567741eb	Improve mobile friendly styling of chat session side panel	2024-07-14 00:57:08 +05:30
Debanjum Singh Solanky	b26a6e25d1	Show chat message action buttons without hover on mobile interfaces This is because hover may be hard to do on mobile devices	2024-07-14 00:54:23 +05:30
Debanjum Singh Solanky	f69f9e3523	Close chat sessions side panel on click open a chat session	2024-07-14 00:53:16 +05:30
Debanjum Singh Solanky	d51011314f	Improve styling of chat input, references UI across screen sizes Use tailwind screen breakpoints shorthand instead of js to apply different styling for different screen sizes	2024-07-13 20:45:34 +05:30
Debanjum Singh Solanky	2668e42e7f	Center cursor in chat input. See up to 2 lines for multi-line context - Reuse class name when get slash command icons - Previous chat input styling didn't have the cursor centered in the chat input text area. But it did allow seeing multi line chat inputs for context	2024-07-13 02:51:29 +05:30
Debanjum Singh Solanky	aeaebfb515	Show chat message timestamp on hover in chat view	2024-07-13 02:51:19 +05:30
Debanjum Singh Solanky	e00c6b486e	Add hover text descriptions of action buttons on chat message in web view	2024-07-12 15:40:51 +05:30
Debanjum Singh Solanky	5fccccfdff	Do not render references with broken links in chat view	2024-07-12 15:14:11 +05:30
Debanjum Singh Solanky	b98a0cfe1b	Render inline png, webp images generated by Khoj in chat view Add spacing between chat message paragraphs	2024-07-12 15:13:19 +05:30
sabaimran	3e7e73ddd6	Switch from using dynamic routes to static routes and extracting slug from URL manually. See https://github.com/vercel/next.js/discussions/64660 for limitations with static export / dynamic routes	2024-07-11 23:06:27 +05:30
sabaimran	bea0aa5445	Improve the logged out share experience	2024-07-11 20:11:21 +05:30
Debanjum Singh Solanky	02658ad4fd	Upgrade Django version	2024-07-11 16:35:10 +05:30
Debanjum Singh Solanky	cbae8b68fb	Add DB migration from making bi_encode configs optional in #834	2024-07-11 16:33:31 +05:30
Debanjum Singh Solanky	3a75838196	Add Keyboard shortcuts to navigate in Khoj Desktop	2024-07-11 16:29:53 +05:30
Debanjum Singh Solanky	6c1861b319	Improve the prompt to generate images with DALLE3 and SD3 - Major - Ask for prompt in prose - Remove seed from SD3 image generation to improve diversity of output for a given prompt Otherwise for conversations with similar sounding prompts, the images would be almost exactly the same. This may be another indicator of SD3's inability to capture detailed instructions - Consistently use "prompt" wording instead of "query" in improved image generation prompts. Previously a mix of those terms were being used, which could confuse the chat model - Minor - Add day of week to prompt - Remove 2-5 sentence limit on instructions to SD3. It seems to be able to follow longer instructions just with less fidelity than DALLE. And the 2-5 sentence instruction limit wasn't being adhered to - Improve ability to edit, improve the image based on follow-up instructions by the user - Align prompts for DALLE and SD3. Only difference is to wrap text to be rendered in quotes for SD3. This improves its ability to render requested text. DALLE cannot render text as well or consistently	2024-07-11 16:29:53 +05:30
Debanjum Singh Solanky	21fe1a917b	Support syncing, searching images from Obsidian plugin	2024-07-11 16:22:31 +05:30
sabaimran	6f1d799759	Modularize code and implement share experience	2024-07-10 23:08:16 +05:30
sabaimran	1b4a51f4a2	Remove print statement for debugging timestamps	2024-07-10 14:54:22 +05:30
sabaimran	0369eb6e0e	Fix timestamp bug for pending message and expand CSP for thumbnails	2024-07-10 14:53:31 +05:30
sabaimran	375685530f	Add content security policy to the chat page	2024-07-10 11:18:41 +05:30
sabaimran	c5cfd0f2cf	Remove unused slash command-related useeffect hook	2024-07-10 10:03:58 +05:30
sabaimran	e1a5c17775	Add DOMPurify for rendering md text. Add an easter egg in the console	2024-07-10 10:03:08 +05:30
sabaimran	e358723baa	Fix image rendering and unique key for pending message?	2024-07-09 21:55:54 +05:30
sabaimran	c8c5d50b1a	Improve command bar slash experience	2024-07-09 21:39:13 +05:30
sabaimran	c25bf97831	Update hover styling for see all button	2024-07-09 20:55:54 +05:30
sabaimran	23b71b0dff	Remove shadow from the slash command bar	2024-07-09 20:52:38 +05:30
sabaimran	998e2aec30	Update dark mode, fix chat message time stamp, fix rendering for new message	2024-07-09 20:50:20 +05:30
sabaimran	0c6b6de09e	Revert web client route chat page rendering logic	2024-07-09 19:47:04 +05:30
sabaimran	cc22e1b013	Add pop-up module for the slash commands	2024-07-09 19:46:17 +05:30
sabaimran	5b69252337	Add hover effects for chat messages	2024-07-09 14:56:57 +05:30
sabaimran	a0e9530fa4	Merge branch 'master' of github.com:khoj-ai/khoj into features/chat-ui-updates-big	2024-07-09 12:57:50 +05:30
sabaimran	260aa61818	Remove tests for python3.9	2024-07-09 12:28:11 +05:30
sabaimran	4471c1e37f	Apply mitigations for piling up open connections - Because we're using a FastAPI api framework with a Django ORM, we're running into some interesting conditions around connection pooling and clean-up. We're ending up with a large pile-up of open, stale connections to the DB recurringly when the server has been running for a while. To mitigate this problem, given starlette and django run in different python threads, add a middleware that will go and call the connection clean up method in each of the threads.	2024-07-09 12:22:58 +05:30
sabaimran	609e7ee19c	Fix width of side panel	2024-07-09 12:02:01 +05:30
Debanjum	0b1b262512	Add system dependencies required by RapidOCR to fix Khoj Docker image (#842 ) - Issue The Khoj docker build would fail with `ImportError: libGL.so.1: cannot open shared object file: No such file or directory`. This was required by the Khoj RapidOCR python package dependency. - Fix A minimal set of system packages have been added to resolve this issue.	2024-07-08 22:16:16 +05:30
kxnarak	43413cd21f	add dependencies required by the RapidOCR python package	2024-07-08 18:26:19 +05:30
sabaimran	bf4c2f219e	Merge branch 'master' of github.com:khoj-ai/khoj into features/chat-ui-updates-big	2024-07-08 17:00:42 +05:30
sabaimran	037e157648	Fix a variety of links	2024-07-08 16:49:13 +05:30
sabaimran	6b80bb3f37	Add a demo for the khoj mini application, minor updates to other pages, remove out of date demos page	2024-07-08 16:33:47 +05:30
Debanjum Singh Solanky	9e31ebff93	Release Khoj version 1.16.0	2024-07-07 18:26:10 +05:30
Debanjum Singh Solanky	54132efd67	Fix Khoj Obsidian plugin build	2024-07-07 18:26:10 +05:30
Debanjum Singh Solanky	510d9b3a29	Add short keys to open chat menu, new chat, search from Obsidian pane	2024-07-07 17:57:17 +05:30
Debanjum Singh Solanky	3e0c882e27	Transcribe only when keyboard shortcut or button pressed in Obsidian - Transcribe on holding Ctrl+s keyboard shortcut - Transcribe on holding the transcribe button pressed via mouse too - Make the transcribe button robust to inadvertent touches by using timeout - Do not transcribe, trigger auto-send on silences. Silence detection is super rudimentary, just blocks standard emanations by whisper when no speech	2024-07-07 17:57:17 +05:30
sabaimran	0eb000c3ea	Add health checks for the django ORM	2024-07-07 16:11:28 +05:30
sabaimran	6f8a65c529	References, mobile friendly chat sessions and file filter	2024-07-07 15:42:29 +05:30
Debanjum Singh Solanky	a31cd0dec1	Fix async batch delete of indexed entries	2024-07-06 22:45:26 +05:30
Debanjum	08b379c2ab	Fix, Improve Indexing, Deleting Files (#840 ) ### Fix - Fix degrade in speed when indexing large files - Resolve org-mode indexing bug by splitting current section only once by heading - Improve summarization by fixing formatting of text in indexed files ### Improve - Improve scaling user, admin flows to delete all entries for a user	2024-07-06 19:52:42 +05:30
Debanjum Singh Solanky	4a471979eb	Upgrade sentence-transformer package to version 3.0.1 Add einops dependency for some sentence transformer models like the nomic-embed	2024-07-06 19:35:59 +05:30
Debanjum Singh Solanky	d693baccbc	Make it optional to set the encoder, cross-encoder configs via admin UI	2024-07-06 19:35:59 +05:30
Debanjum Singh Solanky	1baebb8d0e	Identify markdown headings by any whitespace character after ^#+ Previously only markdown headings with space characters after # would be considered a heading. So ^##\t wouldn't be considered a valid heading	2024-07-06 19:35:59 +05:30
Debanjum Singh Solanky	010486fb36	Split current section once by heading to resolve org-mode indexing bug - Split once by heading (=first_non_empty) to extract current section body Otherwise child headings with same prefix as current heading will cause the section split to go into infinite loop - Also add check to prevent getting into recursive loop while trying to split entry into sub sections	2024-07-06 19:35:59 +05:30
Debanjum Singh Solanky	6a135b1ed7	Fix degrade in speed of indexing large files. Improve summarization Adding files to the DB for summarization was slow, buggy in two ways: - We were updating same text of modified files in DB = no of chunks per file times - The `" ".join(file_content)' code was breaking each character in the file content by a space. This formats the original file content incorrectly before storing in the DB Because this code ran in the main file indexing path, it was slowing down file indexing. Knowledge bases with larger files were impacted more strongly	2024-07-06 19:35:59 +05:30
Debanjum Singh Solanky	e6ffb6b52c	Improve scaling user flow to delete all entries - Delete entries by batch to improve efficiency of query at scale - Share code to delete all user entries between its async, sync methods - Add indicator to show when files being deleted on web config page	2024-07-06 19:35:59 +05:30
Debanjum Singh Solanky	1ab59865b5	Improve scaling admin flow to delete all entries for user	2024-07-06 19:35:59 +05:30
Debanjum	05138cbd0a	Use DOM Scripting, Add CSP to Web config pages. Disable CSP in Obsidian plugin (#834 ) - Add CSP to web config pages. Load phone no. validation js, css from S3 - Construct config page elements on Web via DOM scripting - Disable CSP in Khoj Obsidian as it interferes with Obsidian functionality - Other miscellaneous voice message level improvements (rate limit, listening animation)	2024-07-06 19:30:09 +05:30
Debanjum Singh Solanky	9bdb48807b	Ratelimit text to speech model. Validate share chat url domain - Do not log auth error message on server when Resend setup as Magic links for sign-in are now supported	2024-07-06 12:53:19 +05:30
Debanjum Singh Solanky	b334db0fca	Add CSP to web config pages. Load phone no validation js, css from S3	2024-07-06 12:48:28 +05:30
Debanjum Singh Solanky	2f034f807a	Construct config page elements on Web via DOM scripting. Minimize usage of innerHTML to prevent DOM clobbering and unintended escape by user input	2024-07-06 12:48:28 +05:30
Debanjum Singh Solanky	69c9e8cc08	Disable CSP in Khoj Obsidian as it interferes with Obsidian functionality The Khoj CSP interferes with other Obsidian features and plugins as CSP is applied page wide. For now chat message sanitization via Dompurify should suffice. Enable CSP when can scope it to only the Khoj Obsidian plugin.	2024-07-05 16:10:08 +05:30
Debanjum Singh Solanky	a353d883a0	Make it optional to set the encoder, cross-encoder configs via admin UI Upgrade sentence-transformer, add einops dependency for some sentence transformer models like nomic	2024-07-05 16:09:30 +05:30
Debanjum Singh Solanky	6d59ad7fc9	Add listening circle animation to speak button in Obsidian plugin Use icon active focus as color of animation button	2024-07-05 14:00:53 +05:30
sabaimran	aec44a0b89	Add dark mode toggle! And improve experience for train of thought	2024-07-04 18:29:21 +05:30
Debanjum Singh Solanky	516af86575	Fix add, remove of the text to speech loader element in Obsidian	2024-07-04 17:38:45 +05:30
sabaimran	465ef0b772	Add a loading experience when waiting for khoj response	2024-07-04 13:49:51 +05:30
Debanjum Singh Solanky	814aca6d69	Skip summarize when not triggered via slash cmd and can't summarize Maybe better to fallback to non-summarize behavior if summarize intent is just inferred but we can't actually summarize because the single file added to conversation isn't satisfied	2024-07-04 13:31:00 +05:30
Debanjum	4446de00d3	Enable Voice, Keyboard Shortcuts in Khoj Obsidian Plugin (#837 ) - Simplify quick jump between Khoj side pane and main editor view using keyboard shortcuts - Enable voice chat in Obsidian to make interactions with Khoj more seamless	2024-07-04 13:28:29 +05:30
sabaimran	5ea8b16f84	Fix missing method error	2024-07-04 12:08:22 +05:30
sabaimran	d61bddf56c	Fix retrieving image model by prefetching the openai config in the async method	2024-07-04 11:58:33 +05:30
sabaimran	a129b017b9	Fix image generation on server -- use default config when not set by user	2024-07-04 09:13:23 +05:30
sabaimran	34118078bf	kill the emojis	2024-07-04 00:30:21 +05:30
sabaimran	d5ba916978	Working example of streaming, intersection observer, other UI updates	2024-07-04 00:30:01 +05:30
sabaimran	78d1a29bc1	Finish up file filter side panel menu	2024-07-02 23:32:36 +05:30
sabaimran	6fa2dbc042	Do not use the custom configured max prompt size to send message to anthropic	2024-07-02 21:59:06 +05:30
sabaimran	8a6722ba97	Add basic implementation for chat side panel components	2024-07-02 21:56:43 +05:30
Debanjum Singh Solanky	afcfc60637	Merge DB migrations post merge of SD3 via API support PR	2024-07-02 17:54:58 +05:30
Debanjum	c015eeb5dd	Improve Online Search: Parallelize Search, Use Jina Reader API by default (#832 ) - Overview Khoj will be able to do online search out of the box, even for self-hosted users - Default to Jina search, reader API when no Serper.dev, Olostep API keys - Run online searches in parallel to process multiple queries faster - Details - Jina provides a [reader API](https://github.com/jina-ai/reader) for online search and web page reading It requires no API key. This provides a good default to enable online search for self-hosted readers requiring no additional setup. - Jina search API also returns webpage contents with the results, so just use those directly when Jina Search API used instead of trying to read webpages separately. The extract relevant content from webpage step using a chat model is still used from the `read_webpage_and_extract_content' func in this case. - Parse search results from Jina search API into same format as Serper.dev for accurate rendering of online references by clients - Run online searches in parallel with AsyncIO to process multiple queries faster	2024-07-02 17:44:51 +05:30
Debanjum	826c3dc9cc	Enable using Stable Diffusion 3 for Image Generation via API (#830 ) - Support Stable Diffusion 3 via API Server Admin needs to setup model similar to DALLE-3 via Django Admin Panel - Use shorter prompt generator to prompt SD3 to create better images - Allow users to set paint model to use from web client config page	2024-07-02 17:28:50 +05:30
Debanjum Singh Solanky	d5ceff2691	Update tests and documentation with Jina reader API usage and info Update offline, openai chat actor, director tests to not require Serper to run the online command tests Update documentation for self-hosted online search to mention no setup is required by default. But improvements can be made by using Serper.dev or Olostep	2024-07-02 17:19:09 +05:30
Debanjum Singh Solanky	553beae848	No need to set OpenAI API key from environment variable explicitly It is unnecessary as the OpenAI client automatically tries to use API key from OPENAI_API_KEY env var when the api_key field is unset	2024-07-02 17:19:09 +05:30
Debanjum Singh Solanky	a038e4911b	Default to Jina search, reader API when no Serper.dev, Olostep API keys Jina AI provides a search and webpage reader API that doesn't require an API key. This provides a good default to enable online search for self-hosted readers requiring no additional setup. Jina search API also returns webpage contents with the results, so just use those directly when Jina Search API used instead of trying to read webpages separately. The extract relevant content from webpage step using a chat model is still used from the `read_webpage_and_extract_content' func in this case. Parse search results from Jina search API into same format as Serper.dev for accurate rendering of online references by clients	2024-07-02 17:19:08 +05:30
Debanjum Singh Solanky	ff44734774	Run online searches in parallel to process multiple queries faster	2024-07-02 17:19:08 +05:30
sabaimran	0ee7cc8c47	Change overall architecture of how information is flowing for better statefulness	2024-07-02 12:39:54 +05:30
sabaimran	541ce04ebc	Checkpoint: Updated sidebar panel with new components - Add non-functional UI elements for chat, references, feedback buttons, rename/share session, mic, attachment, websocket connection	2024-07-02 11:18:50 +05:30
Raghav Tirumale	8eccd8a5e4	Support Indexing Images via OCR (#823 ) - Added support for uploading .jpeg, .jpg, and .png files to Khoj from Web, Desktop app - Updating indexer to generate raw text and entries using RapidOCR - Details * added support for indexing images via ocr * fixed pyproject.toml * Update src/khoj/processor/content/images/image_to_entries.py Co-authored-by: Debanjum <debanjum@gmail.com> * Update src/khoj/processor/content/images/image_to_entries.py Co-authored-by: Debanjum <debanjum@gmail.com> * removed redundant try except blocks * updated desktop js file to support image formats * added tests for jpg and png * Fix processing for image to entries files * Update unit tests with working image indexer * Change png test from version verification to open-cv verification --------- Co-authored-by: Debanjum <debanjum@gmail.com> Co-authored-by: sabaimran <narmiabas@gmail.com>	2024-07-01 06:00:00 -07:00
Debanjum Singh Solanky	cffc14a46a	Trigger voice chat via keyboard shortcut in Khoj side pane Quickly trigger voice chat from Khoj side pane using Keyboard shortcuts	2024-07-01 18:06:09 +05:30
Debanjum Singh Solanky	3723904512	Toggle jump between Khoj side pane & previous editor via cmd, kbd shortcut Improve quick navigation to, from Khoj side pane using Keyboard shortcut or Obsidian command	2024-07-01 18:05:59 +05:30
Debanjum Singh Solanky	fbb95ca342	Put cursor on chat input when focus on chat view in Obsidian This should improve fluidity of keyboard interactions with Khoj on Obsidian. Open Khoj chat view via keybinding or command palette and ask question using only the keyboard, with no mouse clicks required	2024-07-01 18:05:55 +05:30
Debanjum Singh Solanky	093e276908	Enable Voice chat in Khoj Obsidian plugin - Automatically carry out voice chats with Khoj from within Obsidian When send voice message, Khoj will auto respond with voice as well - Listen to past Khoj messages as speech - Add circular loading spinner to use while message is being converted to speech	2024-07-01 18:02:28 +05:30
sabaimran	c83b8f2768	Allow just one worker to be the background schedule leader (#836 ) * Add a leader election mechanism to circumvent runtime issues for multiple schedulers - Reduce the load on the DB and risk of issues on the service side by limiting the execution environment to one elected leader at a given time. This one is responsible for managing all of the execution of the jobs, though all workers are capable of adding and removing jobs * Set a max duration for the schedule leader position (12 hrs), add some error if automation not added successfully	2024-06-28 13:13:25 +05:30
sabaimran	80fe5ce182	Fix user not authenticated interpretation error	2024-06-27 21:13:54 +05:30
Raghav Tirumale	24a0d8b073	Add OS Level Shortcut Window for Quick Access to Khoj Desktop (#815 ) * rough sketch of desktop shortcuts. many bugs to fix still * working MVP of desktop shortcut khoj * UI fixes * UI improvements for editable shortcut message * major rendering fix to prevent clipboard text from getting lost * UI improvements and bug fixes * UI upgrades: custom top bar, edit sent message and color matching * removed debug javascript file * font reverted to Noto Sans * cleaning up the code and removing diffs * UX fixes * cleaning up unused methods from html * front end for button to send user back to main window to continue conversation * UX fix for window and continue conversation support added * migrated common js functions into chatutils.js * Fix window closing issue in macos by 1. Use a helper function to determine if the window is open by seeing if there's a browser window with shortcut.html loaded 2. Use the event listener on the window to handle teardown * removed extra comment and renamed continue convo button --------- Co-authored-by: sabaimran <narmiabas@gmail.com>	2024-06-27 07:20:13 -07:00
sabaimran	870d9ecdbf	Add a fact checker feature with updated styling (#835 ) - Add an experimental feature used for fact-checking falsifiable statements with customizable models. See attached screenshot for example. Once you input a statement that needs to be fact-checked, Khoj goes on a research spree to verify or refute it. - Integrate frontend libraries for [Tailwind](https://tailwindcss.com/) and [ShadCN](https://ui.shadcn.com/) for easier UI development. Update corresponding styling for some existing UI components. - Add component for model selection - Add backend support for sharing arbitrary packets of data that will be consumed by specific front-end views in shareable scenarios	2024-06-27 18:45:38 +05:30
sabaimran	3b7a9358c3	Add our first view via Next.js for Agents (#817 ) Initialize our migration to use Next.js for front-end views via Agents. This includes setup for getting authenticated users, reading in available agents, setting up a pop-up modal when you're clicking on an agent, and allowing users to start new conversations with agents. Best attempt at an in-place migration, though there are some noticeable differences. Also adds view for chat that are not being used, but in experimental phase.	2024-06-27 13:56:16 +05:30
Debanjum Singh Solanky	afbeee9e82	Rename copy-button to more general chat-action-button in Obsidian client - Use 4 space indent of activateView function in pane_view component	2024-06-26 18:09:23 +05:30
sabaimran	8c12a69570	Fix issue in anthropic chat when khoj message becomes top message This is because Anthropic requires the first message in the chat history to be from the user.	2024-06-26 12:59:34 +05:30
Debanjum Singh Solanky	4f89319b40	Release Khoj version 1.15.0	2024-06-26 10:38:16 +05:30
Debanjum Singh Solanky	bbfd320ed4	Use Yarn instead of NPM to bump Desktop, Obsidian client versions	2024-06-26 10:37:58 +05:30
Debanjum Singh Solanky	c793d8a69e	Add Validation logic to save PaintModel. Use API key from Paint Model Rename Paint Model, Adapters to TextToImage for consistency	2024-06-26 10:16:26 +05:30
Debanjum Singh Solanky	1acf969c6e	Do not require OpenAI to generate image as local chat + sd3 works now Previously the text_to_image helper would only trigger the image generation flow if OpenAI client was setup. This is not required anymore as offline chat model + sd3 API works. So remove that check	2024-06-26 10:16:26 +05:30
Debanjum Singh Solanky	2c4bf91a61	Allow user to set paint model to use from web client config page	2024-06-26 10:16:26 +05:30
Debanjum Singh Solanky	eb09aba747	Remove quotes wrapping the prompt from being passed to image gen model	2024-06-26 10:16:26 +05:30
Debanjum Singh Solanky	fdd4c02461	Use shorter prompt generator to prompt SD3 to create better images	2024-06-26 10:16:26 +05:30
Debanjum Singh Solanky	eda33e092f	Enable using Stable Diffusion 3 for Image Generation via API	2024-06-26 10:16:26 +05:30
Debanjum	a25689fabf	Use user theme in Obsidian for Khoj plugin styling (#825 ) Makes the Khoj chat in the Obsidian plugin adapt better to the user theme, making it feel more seamless, and helps with dark mode compatibility	2024-06-26 10:14:17 +05:30
Debanjum Singh Solanky	cfe46fd9f5	Add Border Color instead of BG Color for Chat Message in Obsidian	2024-06-26 08:11:04 +05:30
sabaimran	fb818ead60	Use active bg instead of code background for khoj response	2024-06-26 08:05:13 +05:30
sabaimran	a4b2552540	Update conversation session selection menu to use Obsidian theme colors as well	2024-06-26 08:05:13 +05:30
sabaimran	da5b07e913	Remove custom styling on the reference buttons	2024-06-26 08:05:13 +05:30
sabaimran	c4a1ae9375	Make the Khoj Obsidian plugin more user theme friendly Use the CSS variables from the theme for the Khoj UI components	2024-06-26 08:04:17 +05:30
Debanjum Singh Solanky	d6fe5d9a63	Pass current component as arg to markdown renderer in chat view This doesn't work on search modal, but hopefully will get resolved once we migrate search into a view from a modal	2024-06-24 16:12:20 +05:30
Debanjum Singh Solanky	0d04018622	Install pydantic with optional email validator package Otherwise Khoj fails on startup. Not sure why, must be new changes to pydantic?	2024-06-24 16:12:20 +05:30
Debanjum Singh Solanky	6f280b1ccc	Split setup of specific OpenAI API proxies into separate doc pages	2024-06-24 16:12:20 +05:30
Debanjum Singh Solanky	68e7c297e0	Add Advanced Self Hosting Section, Improve Self Hosting, OpenAI Proxy Docs - Add instructions for self-hosted users with info, warning boxes to avoid, fix common issues when setting up Khoj server - Create new Advanced Self Hosting section - Extract Advanced Self-Hosting Sections from the Advanced Page and move them to separate Pages under Advanced Self Hosting section - Improve OpenAI Proxy Docs - Put Ollama setup as a section under OpenAI API Proxy page instead of a separate page - Add Section to use Khoj with chat model from LM Studio - Update LiteLLM docs to use chat model from LM Studio	2024-06-24 16:12:20 +05:30
Debanjum Singh Solanky	732332a3c5	Spell fix s/e.g/e.g./ across code, tests and docs	2024-06-24 15:24:45 +05:30
Debanjum Singh Solanky	8fc7f980aa	Revert KHOJ_DOMAIN to only support single domain. Multiple domain support didn't generalize to other portions where it is used	2024-06-24 15:24:45 +05:30
sabaimran	4110e71e84	Add info in the documentation about text to speech	2024-06-24 12:46:33 +05:30
sabaimran	939811e9b5	Fix conversation look up logic	2024-06-24 09:10:03 +05:30
Debanjum Singh Solanky	a4d88612c1	Just use yarn for package version locking. Remove npm package lock	2024-06-23 16:06:20 +05:30
Debanjum Singh Solanky	55be90cdd2	Sanitize user input fields on Automations page of web client Use Dompurify to sanitize user input	2024-06-23 14:14:47 +05:30
Debanjum Singh Solanky	1c7a562880	Generate automation cards via DOM scripting	2024-06-23 13:22:38 +05:30
Debanjum Singh Solanky	57a36967bf	Run Obsidian version script in bump_version.sh to write to versions.json This handles updates from manifest.json minAppVersion field to the versions.json file. The minAppVersion field is for the minimum Obsidian app version supported by a Khoj plugin version	2024-06-23 08:18:55 +05:30
Debanjum Singh Solanky	c7c32a7467	Improve online chat reference extraction in Khoj.el Emacs package - Handle online references with no title - Improve handling references which are arrays instead of lists	2024-06-23 08:13:36 +05:30
Debanjum Singh Solanky	9d33d8c0fa	Upgrade typescript eslint dev dependency of Khoj Obsidian plugin	2024-06-23 07:36:49 +05:30
Debanjum	a94062469a	Automatically Find Similar Notes on Emacs in Background (#827 ) Khoj will find and display notes similar to the current entry in the side pane when 1. find similar is open in side pane and 2. cursor has moved to a new entry ### Major - Find similar notes to current note at cursor automatically in background - Only show headings of search result and increase default results count ### Minor - Pass absolute path of file to index from khoj.el emacs client - Update help message to only show the smaller set of new keybindings - Fix edge cases in loading some chat sessions	2024-06-23 07:36:11 +05:30
sabaimran	38090b2553	In dockerize.yml file, revert the added configuration	2024-06-22 21:11:25 +05:30
sabaimran	a53178cab9	Add developer support for using next.js to serve generated static files (#814 ) To improve the developer experience for front-end development, we're migrating to Next.js. In order to do this migration page-by-page, we're using static site generation via Next.js. This also helps us avoid making cross site requests from front-end to back-end for the time being, while giving a ramp to separating out server and client if needed for scale down the road. Dev instructions for using the next.js setup are in the added README. This adds scaffolding for including the built files in the python package as well as the docker images. Docker setup has been tested locally. In order to verify the build is working as expected, we can navigate to the {khoj_host}:42110/experimental and verify that the experiment page comes up. This setup works with serving static files included in the src/interface/web folder from the Django app. The key bit for understanding the setup is in the yarn export command in package.json.	2024-06-22 20:12:41 +05:30
Debanjum Singh Solanky	59edb99f04	Simplify, improve bump version development script - Just use in-built `npm version' command to update desktop, obsidian version - Upgrade by major, minor or patch version using new -t flag in script E.g bump_version -t minor	2024-06-22 18:19:38 +05:30
Debanjum Singh Solanky	abd6f58aee	Upgrade Desktop app package dependencies	2024-06-22 17:38:52 +05:30
Debanjum Singh Solanky	f413dc62cd	Upgrade Obsidian plugin dependencies. Add package lock file for it Add it to bump_version script as well.	2024-06-22 17:38:52 +05:30
Debanjum Singh Solanky	1d7d51a7ab	Upgrade Documentation packages	2024-06-22 17:38:48 +05:30
Debanjum Singh Solanky	22f6db0a6b	Upgrade RapidOCR and enable for Python 3.12. Fix PDF OCR test	2024-06-22 16:01:55 +05:30
Debanjum Singh Solanky	55a23eae25	Upgrade pillow to fix pytest workflow failure	2024-06-22 15:17:43 +05:30
Debanjum Singh Solanky	7e277e9381	Fix getting file-toggle-button element in chat of web app	2024-06-21 15:54:38 +05:30
Debanjum Singh Solanky	fa7b40ab86	Automatically respond with Voice if subscribed user sent Voice message	2024-06-21 15:53:01 +05:30
Debanjum Singh Solanky	5e5fe4b7af	Improve font size, spacing of conversation session on desktop app	2024-06-21 12:25:35 +05:30
sabaimran	d3c0111121	Include base URL when using openai api config in extract questions. Close #831	2024-06-21 12:18:50 +05:30
sabaimran	b9966eb3d4	Add support for text to speech in chat responses (#821 ) * Enable speech to text responses in khoj chat - Current issue: reads out all the markdown formatting, plus waits for the whole result to be streamed before playing it * Extract content from markdown-formatted text * Add a loader for while you're waiting for Khoj's response * Add user configuration option for chat model options, allow server side configuration for option list * Join up APIs, views, admin pages to allow configuring custom voice models	2024-06-21 11:30:28 +05:30
Debanjum Singh Solanky	427575e958	Improve khoj chat new, delete session flows When create new conversation session, automatically request query. As that is expected next action after creating new session Pass session-id to khoj-chat to allow reuse from create-new-conversation func When delete conversation session, do not call load chat session. Unnecessary action. Use thread-last to improve code flow in new, delete conversation funcs	2024-06-21 10:54:59 +05:30
Debanjum Singh Solanky	59032a06d5	Improve defaults when extracting fields from online reference in khoj.el	2024-06-21 10:54:59 +05:30
Debanjum Singh Solanky	9262aea7a5	Fix comments, func calls based on melpazoid, checkdoc, package-lint	2024-06-21 10:54:59 +05:30
sabaimran	ff26b19d2b	Add a migration for allowing the docx field in the entries file type	2024-06-21 09:47:49 +05:30
sabaimran	3cfe5aabe5	Add support for magic link email sign-in (#820 ) * Add magic link email sign-in option * Adding backend routes and model changes to keep state of email verification code and status * Test and fix end to end email verification flow * Add documentation for how to use the magic link sign-in when self-hosting Khoj * Add magic link sign in to public conversation page	2024-06-20 13:32:58 +05:30
Debanjum Singh Solanky	0afe66ac39	Restore cursor to original window after opening Khoj side pane Previously the cursor would move to the Khoj side pane on opening it. This would break user's flow, especially when find similar triggers automatically New behavior maintains smoother update of auto find similar without disrupting user browsing	2024-06-20 12:50:13 +05:30
Debanjum Singh Solanky	afe91a2633	Only show headings of search result and increase total count returned Previously it would show complete result body this would make the result width variable and hard to track all the returned results Showing just heading makes it easier to track	2024-06-20 12:50:13 +05:30
Debanjum Singh Solanky	2b12a5514e	Find similar notes to current note at cursor automatically in background - Call find similar on current element if point has moved to new element - Delete the first result from find-similar search results as that'll be the current note (which is trivially most similar to itself) - Determine find-similar based text formatting at the rendering layer rather than at the top level find-similar func	2024-06-20 12:50:13 +05:30
Raghav Tirumale	093eb473cb	Add Documentation for the /summarize Command (#822 ) * added documentation for the /summarize command * Add a hint for natural language usage --------- Co-authored-by: sabaimran <narmiabas@gmail.com>	2024-06-20 12:08:01 +05:30
Raghav Tirumale	bd3b590153	Support Indexing Docx Files (#801 ) * Add support for indexing docx files and associated unit tests --------- Co-authored-by: sabaimran <narmiabas@gmail.com>	2024-06-20 11:18:01 +05:30
Debanjum Singh Solanky	d042e073cc	Pass absolute path of file to index from khoj.el emacs client	2024-06-20 00:26:18 +05:30
Debanjum Singh Solanky	d23f2849d4	Update help message to only show the smaller set of new keybindings	2024-06-20 00:26:18 +05:30
Raghav Tirumale	d4e5c95711	Add Ability to Summarize Documents (#800 ) * Uses entire file text and summarizer model to generate document summary. * Uses the contents of the user's query to create a tailored summary. * Integrates with File Filters #788 for a better UX.	2024-06-18 19:31:07 +05:30
Debanjum Singh Solanky	677d49d438	Release Khoj version 1.14.0	2024-06-18 17:13:46 +05:30
Debanjum Singh Solanky	2930b57c78	Use hashed value to improve deduplication of search results on server	2024-06-18 17:04:25 +05:30
Debanjum Singh Solanky	6814dadd21	Fix opening Web, Desktop setup links on first run from Desktop app Previous version failed to open the setup links	2024-06-18 17:04:25 +05:30
Debanjum Singh Solanky	632f55a9e8	Do not default to rerank if device has GPU	2024-06-18 17:04:25 +05:30
Debanjum Singh Solanky	f1120f24a1	Use solarized light css styling to highlight code in chat messages	2024-06-18 17:04:25 +05:30
Debanjum Singh Solanky	d8a5a01cea	Pass multiple allowed Khoj domains via KHOJ_DOMAIN env var To add multiple allowed Khoj domains pass them as a comma separated list of domains via the KHOJ_DOMAIN environment variable Resolve comment in issue #662	2024-06-18 17:04:25 +05:30
Debanjum Singh Solanky	4daf16e5f9	Only redirect to next url relative to current domain	2024-06-18 17:04:25 +05:30
Debanjum Singh Solanky	86a3505d89	Remove image HTML elements from non whitelisted sources in Obsidian chat Given img src enforcement via CSP required loosening. Soft enforce it via a regex replace of img HTML elements if the src isn't from the whitelisted set of source prefixes. Currently allowed source prefixes are - app: for local images - data: for inline generated images - https://generated.khoj.dev: for cloud generated images	2024-06-18 17:04:25 +05:30
Debanjum Singh Solanky	c7d825bddb	Sanitize markdown in Obsidian after conversion to HTML too - Create and use a function to convert markdown to sanitized html - Remove unused Latex delimiter handling as Katex isn't used in Khoj chat on Obsidian	2024-06-18 17:04:25 +05:30
Debanjum Singh Solanky	08c3aa496d	Loosen CSP in Obsidian to load images, sync and allow Obsidian domain	2024-06-18 17:04:25 +05:30
sabaimran	327045be43	Make some basic updates to the chat documentation. Inc. conversation file filters, new screenshot	2024-06-18 12:14:59 +05:30
sabaimran	76e1bed8f9	Update Obsidian documentation	2024-06-18 08:22:10 +05:30
sabaimran	a57e1e7a14	Fix langchain, tenacity versions	2024-06-17 14:52:11 +05:30
sabaimran	ce9c14f894	Fix more packages related to langchain in the pyproject.toml	2024-06-17 14:38:05 +05:30
sabaimran	ba0187798a	Get conversation id before retrieving relevant notes in non-socket code	2024-06-17 14:26:06 +05:30
Debanjum	d2d9f4888e	Upgrade Khoj Emacs UX (#812 ) - Open Khoj in Emacs Side pane Open Khoj chat, search in right pane to allow for ambient engagement - Improve Khoj Chat - Show online references used for chat - Make chat API call async to not block user interactions - Fix loading chat history, references in khoj.el chat buffer - Improve Khoj Search, Find Similar functions - Make calls to Khoj search API async to not block user interactions - Support Conversation Sessions - Create transient menu to open, create, delete conversation sessions from the Khoj Emacs client	2024-06-16 10:39:48 +05:30
Debanjum Singh Solanky	fe36adb7b9	Remove short keys to switch content type during search to avoid conflict - C-x o to switch to search org content conflicts with switch buffer shortkey This is more apparent in the async search scenario as it prevents perform other actions while async search is in progress - Also switching content type wouldn't scale to all the content types Khoj will support without causing more conflicting keybinding	2024-06-15 17:31:19 +05:30
Debanjum Singh Solanky	2a84524d19	Make khoj.el search, similar API calls async to not block user interactions	2024-06-15 17:30:58 +05:30
Debanjum Singh Solanky	c6b95f8776	Handle rendering messages using the old reference schema in khoj.el Previously references were a list instead of a map	2024-06-15 17:30:58 +05:30
Debanjum Singh Solanky	db056c896d	Delete old conversation sessions from the chat menu in Khoj Emacs	2024-06-15 17:30:58 +05:30
Debanjum Singh Solanky	e3d995a74f	Extract select conversation session logic into func for reusability	2024-06-15 17:30:38 +05:30
Debanjum Singh Solanky	e15dc23bbe	Improve logic to create vs reuse window for khoj side pane logic Khoj side pane occupies a vertically split bottom right side pane. If the bottom right window is not a vertical split, create a new vertical split pane for khoj, otherwise reuse the existing window	2024-06-15 16:37:41 +05:30
Debanjum Singh Solanky	055e5e8d26	Create new conversation from the chat menu in Khoj Emacs	2024-06-15 16:37:41 +05:30
Debanjum Singh Solanky	c33954cd93	Fix loading an empty chat session in Emacs	2024-06-15 16:37:41 +05:30
Debanjum Singh Solanky	e21c0648ae	Create, use reusable function to call Khoj API from elisp	2024-06-15 16:37:41 +05:30
Debanjum Singh Solanky	7bcb49b6e7	Support conversation sessions in the Khoj Emacs client Add option in khoj main transient menu option to open menu to - switch between existing conversations	2024-06-15 13:13:20 +05:30
Debanjum Singh Solanky	df9c5ff263	Show online references used for chat response as footnotes in Emacs Previously online references used weren't being shown	2024-06-15 13:13:19 +05:30
sabaimran	82f37971c5	Fix broken link in automations.md	2024-06-14 16:22:27 +05:30
sabaimran	25d8cdd9cd	Misc fixes: - Fix getting file filters for not found conversations - Allow image rendering in automation emails - Fix nearest 15th minute calculation in automations creation	2024-06-14 16:20:22 +05:30
sabaimran	971f1cd897	Add basic page about automations	2024-06-14 15:52:30 +05:30
sabaimran	17bce930ba	Add a documentation page for keyboard shortcuts	2024-06-14 14:30:31 +05:30
Raghav Tirumale	35715096f4	UX Improvement: Keyboard Shortcuts for Recent Messages (#804 ) * added keyboard shortcuts to access old queries	2024-06-14 12:45:09 +05:30
sabaimran	2dcfb3c2f0	Fix bug for drag and drop single file	2024-06-14 12:01:10 +05:30
sabaimran	7e4a61f2ac	Disable rate limiting if billing is not enabled	2024-06-12 21:39:02 +05:30
Debanjum Singh Solanky	385057f09e	Make khoj.el chat API call async to not block user interactions	2024-06-12 21:04:48 +05:30
sabaimran	45e725ac9c	Use the summarizer model for generating improved image prompts	2024-06-12 17:41:12 +05:30
Raghav Tirumale	673d0d367c	Fix: Adding Support for Uploading Multiple Files (#803 ) * added support for uploading multiple files at a time. * optimized multiple file upload to use a batch upload * allowing files to upload even if there is one unsupported file	2024-06-12 15:51:35 +05:30
Debanjum Singh Solanky	906ebee075	Open Khoj chat, search in right pane to allow for ambient engagement See the currently active window in context while doing chat, search or find similar operations in a side pane. This is similar to how we've moved Khoj on Obsidian into the side pane as well	2024-06-09 23:32:34 +05:30
Debanjum Singh Solanky	cd4baa3fa5	Fix loading chat history, references in khoj.el chat buffer	2024-06-09 18:34:00 +05:30
Debanjum	6afbd8032e	Improve Intermediate Steps in Formulating Chat Response (#799 ) # Major - Disambiguate Text output mode to disambiguate from Default data source lookup - Fix showing headings in intermediate step in generating chat response - Remove "Path" prefix from org ancestor heading in compiled entry # Minor - Fix OpenAI chat actor, director unit tests	2024-06-09 07:55:01 +05:30
Debanjum Singh Solanky	f440ddbe1d	Fix openai chat actor, director tests - Update test ChatModelOptions setup since update to its schema - Fix stale function calls using their updated signatures	2024-06-09 07:24:47 +05:30
sabaimran	2e209ab28b	Handle case where conversation does not (yet) exist	2024-06-08 16:22:12 +05:30
sabaimran	849c38c0a4	Add support for managing audiences for new users	2024-06-08 15:51:17 +05:30
sabaimran	06a47ee457	Add language-specific syntax highlighting via highlight.js (#802 ) * Add language-specific syntax highlighting via highlight.js - Add highlight.js to our assets CDN for fast load and compliance with the CSP - See other stylesheets options here: https://cdnjs.com/libraries/highlight.js * Bonus: set min-height to prevent increasing length of the sessions pane * Fix references rendering and add highlight.js in public conversation	2024-06-08 15:17:09 +05:30
Debanjum Singh Solanky	5f2442450c	Update truncation test to reduce flakiness in cloud tests Removed dependency on faker, factory for the truncation tests as that seems to be the point of flakiness	2024-06-07 19:42:48 +05:30
sabaimran	dbb06466bf	Minor fit/finish updates to the file filter experience	2024-06-07 15:05:00 +05:30
sabaimran	58a02f06ea	Fix multilingual font rendering (#797 ) * Fix multilingual font rendering; fallback to an Arabic language font which contains more Asian characters. Close #756 * Tune font-sizes and styling to accommodate new fonts with old sizing - Move connection-status styling out from inline html into css block - Remove start typing chat-input height jitter - align new-conversation button, text - use relative font sizes instead of absolute font sizes in most places --------- Co-authored-by: Debanjum Singh Solanky <debanjum@gmail.com>	2024-06-07 11:53:47 +05:30
Raghav Tirumale	ba16afd3c2	New Feature: Adding File Filtering to Conversations (#788 ) * UI update for file filtered conversations * Interactive file menu #UI to add/remove files on each conversation as references. * Backend changes implemented to load selected file filters from a conversation into the querying process. --------- Co-authored-by: sabaimran <narmiabas@gmail.com>	2024-06-07 10:53:37 +05:30
Debanjum Singh Solanky	f91cdf8e18	Fix showing headings in intermediate step in generating chat response	2024-06-06 16:52:23 +05:30
Debanjum Singh Solanky	18f7e6e7ed	Remove "Path" prefix from org ancestor heading in compiled entry	2024-06-06 16:51:26 +05:30
sabaimran	8d701ebe22	Add fedCM to accommodate google migration (#798 ) - See migration guidelines here: https://developers.google.com/identity/gsi/web/guides/fedcm-migration#fedcm_flag	2024-06-06 14:23:16 +05:30
Debanjum Singh Solanky	dd2225b1aa	Use Text output mode to disambiguate from Default data source lookup Previously if default output was selected by Khoj, we'd end up doing a documents search as well, even when Khoj selected internet or general data source to lookup. This update disambiguates the default information mode from the text output mode. To avoid doing documents search when not deemed necessary by Khoj	2024-06-06 11:56:48 +05:30
Debanjum Singh Solanky	a1e4f4bde7	Gracefully skip indexing when empty list of docs provided Improve error message when fail to index content	2024-06-05 19:39:15 +05:30
Debanjum Singh Solanky	21987f60c7	Use `-difference' to get files to delete. Make batch size defcustom Improve docstrings to align with `checkdoc' requirement for all args being mentioned	2024-06-05 19:39:15 +05:30
Debanjum	bfacd65971	Batch upload files for indexing from the Emacs client (#735 ) from yuzhou721/master Encode filenames and batch file uploads to improve sending content to index from the Emacs client	2024-06-05 19:31:06 +05:30
sabaimran	a9c383e62c	Use an ASGI application, rather than WSGI - ASGI should be the preferred application, as our codebase runs a lot of async code	2024-06-05 09:25:08 +05:30
sabaimran	0816cec4bc	Manually close old db connections periodically	2024-06-04 22:19:47 +05:30
sabaimran	acfdc8da77	Explicitly set the connection age to 0 in the django settings. Seems to be some strange behavior with async gunicorn + django db	2024-06-04 20:31:51 +05:30
Debanjum Singh Solanky	85a343363b	Release Khoj version 1.13.0	2024-06-04 11:57:44 +05:30
Debanjum	1dfd6d7391	Merge pull request from GHSA-h2q2-vch3-72qm Add CSP and sanitize chat messages in Obsidian, Desktop, Web apps	2024-06-04 11:29:21 +05:30
Debanjum Singh Solanky	b757ba664f	Sanitize chat messages to render in Obsidian, Desktop, Web apps Use DOMPurify to escape any unsafe HTML in chat message before adding it to DOM via innerHTML updates to a HTML element	2024-06-04 10:53:30 +05:30
Debanjum Singh Solanky	9f80c2ab76	Enforce Content-Security-Policy (CSP) in Obsidian, Desktop, Web apps Prevent XSS attacks by enforcing Content-Security-Policy (CSP) in apps. Do not allow loading images, other assets from untrusted domains. - Only allow loading assets from trusted domains like 'self', khoj.dev, ipapi for geolocation, google (fonts, img) - images from khoj domain, google (for profile pic) - assets from khoj domain - Do not allow iframe src - Allow unsafe-inline script and styles for now as markdown-it escapes html in user, khoj chat - Add hostURL to CSP of the Desktop, Obsidian apps Given web client is served by khoj server, it doesn't need to explicitly allow for khoj.dev domain. So if user self-hosting, it'll automatically allow the domain in the CSP (via 'self') Whereas the Obsidian, Desktop clients allow configure the server URL. Note switching server URL breaks CSP until app is reloaded	2024-06-04 10:53:30 +05:30
Debanjum Singh Solanky	179c70dba8	Upgrade Khoj llama-cpp, django and jinja dependencies	2024-06-04 09:05:53 +05:30
Debanjum Singh Solanky	bbcdb8413d	Add null checks, fix build errors in Khoj plugin on newer Obsidian	2024-06-03 18:03:11 +05:30
Debanjum Singh Solanky	d8ace4d34c	Highlight the agents, automation tab when active on the web app	2024-06-03 16:57:03 +05:30
sabaimran	4679f07336	Clean up some of the design of agents, inspired by discussion #792	2024-06-03 12:52:07 +05:30
Debanjum Singh Solanky	8cdab5f31a	Update slash command UX in chat UI of desktop app to match web app Make commands in popup menu on typing slash in chat input selectable	2024-06-02 17:27:37 +05:30
Debanjum Singh Solanky	7828bd6f2e	Hide command popup & focus on chatInput on selecting command in web app Style command popup cursor and add highlight to indicate using slash command	2024-06-02 17:27:37 +05:30
Debanjum	cf8c9c2a3d	Serve image assets from Khoj domain, not directly from S3 bucket (#734 ) - Serve generated images from Khoj domain instead of directly from AWS S3 - Rename assets URL from Khoj S3 bucket to assets.khoj.dev	2024-06-02 17:24:35 +05:30
sabaimran	5bb3689562	Do not stream responses in the scheduled_chat response	2024-06-02 11:31:15 +05:30
sabaimran	5132b01ab1	Remove intent_type from telemetry update in api_chat	2024-06-02 10:21:38 +05:30
Raghav Tirumale	a3934b3aaa	Improved Command Menu and Help Command (#774 ) * The command menu (triggered by "/") now has a clickable list of possible commands, that automatically fill into the chat when pressed. * The `/help` command now searches `khoj.dev` pages to provide useful assistance to the user. --------- Co-authored-by: raghavt3 <raghavt3@illinois.edu> Co-authored-by: sabaimran <65192171+sabaimran@users.noreply.github.com>	2024-06-01 22:33:31 +05:30
sabaimran	6d10f98498	Add additional lines for KHOJ_NO_HTTPS and KHOJ_DOMAIN in the docker-compose	2024-06-01 21:48:43 +05:30
sabaimran	841cbff249	Add documentation for setting up google auth in self-hosted khoj. Closes #771	2024-06-01 21:38:21 +05:30
sabaimran	89178bcebd	Fix formatting issues for task email in mobile	2024-06-01 14:19:12 +05:30
Debanjum	b499b3fe2a	Upgrade Khoj Obsidian: Chat from Side Pane, Stream Intermediate Steps, Copy Message to Clipboard (#736 ) ### Details - Chat with Khoj from right pane on Obsidian - Modal was too ephemeral, couldn't have it open for reference, quick jump to Khoj chat - Stream intermediate steps taken by Khoj for generating response to the chat pane Gives more transparency into Khoj 'thinking' process, e.g internet, notes searches performed, documents read etc. The feedback allows us to tune our messages to elicit better responses by Khoj - Add ability to copy message to clipboard, paste chat messages directly into current file - Jump to Search, Find Similar functions from navigation bar on the Khoj Obsidian side pane - Improve spacing, use consistent colors in chat message references and buttons Resolves #789, #754	2024-06-01 13:29:21 +05:30
sabaimran	8b9c26c468	Remove unused method	2024-06-01 12:54:43 +05:30
sabaimran	5ec641837a	Allow automations to be shareable (#790 ) * Updating the API / UI to support sharing of automations * Allow people to see the automations even when not logged in, and add an overlay effect * Handle unauthenticated users taking actions * Support showing pre-filled automation details on the config automations page * Redirect user to login if they try to add an automation while unauthenticated	2024-06-01 12:44:49 +05:30
Debanjum Singh Solanky	7d7d4cf5c3	Make new chat message text selectable in Obsidian side pane Resolves #789	2024-06-01 11:01:39 +05:30
Debanjum Singh Solanky	7fb7f200b3	Fix rendering text in chat messages with bulleted lists Improves #789	2024-06-01 10:51:22 +05:30
Debanjum Singh Solanky	7a93599fe8	Merge branch 'master' into upgrade-khoj-on-obsidian - Conflicts: - src/khoj/interface/web/chat.html Use our changes with feedback button changes from master	2024-06-01 10:07:43 +05:30
Debanjum Singh Solanky	92bab9fa61	Get Conversation session action buttons out from under the three dot menu	2024-05-31 20:11:00 +05:30
Debanjum Singh Solanky	7fa42daf89	Render action buttons for new Khoj chat responses in Obsidian - Dedupe the code to add action buttons to chat messages - Update the renderIncrementalMessage function to also add the action buttons to newly generated chat messages by Khoj	2024-05-31 20:11:00 +05:30
Debanjum Singh Solanky	2d010db83f	Toggle chat session view on clicking the Obsidian chat sessions button	2024-05-31 20:11:00 +05:30
Debanjum Singh Solanky	275d4877a6	Fix loading spinner visibility by using contrasting background color Fix code formatting of Khoj chat view in Obsidian	2024-05-31 20:09:24 +05:30
sabaimran	2667ef4544	Refresh the conversation from the db in the websocket flow	2024-05-31 16:15:56 +05:30
sabaimran	fd07abbfc8	Decrease the life of one connection	2024-05-31 15:39:15 +05:30
Debanjum	3090b84252	Disable Minutely Recurrence for Automations (#781 ) * Disable automation recurrence at minute level frequency * Set a max lifetime for django's connections to the db * Disable any automation that has a non-numeric first digit (i.e., recurring on the minute level) * Re-enable automations --------- Co-authored-by: sabaimran <narmiabas@gmail.com>	2024-05-31 12:50:19 +05:30
sabaimran	5dca48d9fc	Fix setting of conn_max_age variable	2024-05-31 11:07:13 +05:30
sabaimran	76f941f4e5	Revert email from from to sender again in resend API. keeps switching?	2024-05-31 10:30:18 +05:30
sabaimran	b27f59b12b	Remove all unused code related to websockets	2024-05-30 11:39:04 +05:30
sabaimran	4b3d3fe7ea	/s/sender/from in resend calls	2024-05-30 08:43:46 +05:30
sabaimran	2076543e32	Disable AP Scheduler while performing maintenance	2024-05-30 08:02:59 +05:30
sabaimran	4aac84e1c1	Pin resend version in pyproject.toml	2024-05-30 07:05:11 +05:30
Debanjum Singh Solanky	7823ef09dc	Simplify conditional code. Improve logs to track conversion progress	2024-05-29 17:50:07 +05:30
Debanjum Singh Solanky	215db8cab3	Reduce log level of noisy process lock logs	2024-05-29 13:14:44 +05:30
Debanjum Singh Solanky	7b18919564	Tag external links to open in a separate window on the Desktop app Previously clicking inline links would open the URL directly in the Desktop app. This was strange and it didn't provide any way to go back to Khoj desktop app UI from the opened link	2024-05-29 10:12:50 +05:30
Debanjum Singh Solanky	c957a6cb43	Delete unused base_processor_integration html file from web interface	2024-05-29 08:30:13 +05:30
sabaimran	7dd72c1d25	Fix trailing whitespace issue in development.mdx	2024-05-29 04:36:46 +05:30
sabaimran	cb33fb67fe	Remove the automations-related dead code in the web config	2024-05-29 04:22:45 +05:30
Debanjum Singh Solanky	15c5873c20	Provide more context in docs for self-hosting Khoj on Windows	2024-05-28 20:56:26 +05:30
Debanjum Singh Solanky	7594401461	Fix expand chat reference animation in web, desktop, obsidian clients	2024-05-28 20:56:26 +05:30
Debanjum Singh Solanky	1ea7675fc9	View, switch chat sessions from Obsidian chat pane	2024-05-28 20:33:39 +05:30
Debanjum Singh Solanky	e86899eec4	Click on referenced notes by Khoj chat to open it in Obsidian vault Allow opening Khoj chat references in Obsidian vault if the reference is a heading or file in the current Obsidian vault	2024-05-28 10:16:40 +05:30
Debanjum	39faae68c0	Merge pull request #768 from MythicalCow/documentation/windows-development-fixes Documentation Fixes for Development Page	2024-05-28 00:26:15 +05:30
Raghav Tirumale	4a8920f9a4	formatting fix	2024-05-27 12:52:08 -05:00
Raghav Tirumale	9a11a3cd63	Added installation notes for windows users and added postgres setup instructions.	2024-05-27 12:49:52 -05:00
Desmond	70fea6c6b6	fix: delete file request	2024-05-27 14:46:26 +08:00
sabaimran	607534021b	Add a link to github in the settings menu, improve styling	2024-05-27 11:39:30 +05:30
Desmond	3f49b5a4ab	fix: emacs tests	2024-05-27 10:42:09 +08:00
sabaimran	b97ca9d19d	Skip using max_tokens as input to the extract questions step, as that's not used for max_output	2024-05-27 01:23:54 +05:30
sabaimran	9ebf3a4d80	Improve the admin experience, add more metadata to the list_display - Don't propagate max_tokens to the openai chat completion method. the max for the newer models is fixed at 4096 max output. The token limit is just used for input	2024-05-27 00:49:20 +05:30
sabaimran	01cdc54ad0	Add support for Anthropic models (#760 ) * Add support for chatting with Anthropic's suite of models - Had to use a custom class because there was enough nuance with how the anthropic SDK works that it would be better to simply separate out the logic. The extract questions flow needed modification of the system prompt in order to work as intended with the haiku model	2024-05-26 22:50:34 +05:30
Debanjum Singh Solanky	0f796a79ec	Extract function to get link to entry in Obsidian vault for reuse	2024-05-26 18:03:15 +05:30
Debanjum Singh Solanky	e24ca9ec28	Pass file path of each doc reference in references returned by API - Pass file path of reference along with the compiled reference in list of references returned by chat API converts - Update the structure of references from list of strings to list of dictionary (containing 'compiled' and 'file' keys) - Pull out the compiled reference from the new references data struct wherever it is being used	2024-05-26 18:02:11 +05:30
Debanjum Singh Solanky	ba330712f8	Fix to always pass online results in chat API response	2024-05-26 13:56:55 +05:30
Debanjum Singh Solanky	38d8d2bb56	Show online references used to generate response in Obsidian chat view	2024-05-26 13:55:22 +05:30
Debanjum Singh Solanky	f495d338eb	Modularize render message with references func in web based clients Simplify, reuse, standardize code to render messages with references in the obsidian, web and desktop clients. Specifically: - Reuse function to create reference section, dedupe code - Create reusable function to generate image markdown - Simplify logic to render message with references	2024-05-26 13:55:22 +05:30
Debanjum Singh Solanky	14a2006c76	Stream steps taken to generate response in Obsidian chat pane - Setup websocket using Khoj web app as reference. - Moved the geolocating code to chat view out from the general pane view - Use loading spinner from web instead of the thinking emoji	2024-05-26 13:55:22 +05:30
Debanjum Singh Solanky	afcd22d30c	Improve spacing, colors of chat message references and buttons Works better with dark modes. References have more spacing and adhere to background color of the chat message itself	2024-05-26 13:55:22 +05:30
Debanjum Singh Solanky	bd4931e70b	Add ability to paste chat messages directly into current file It'll replace any highlighted text with the chat message or if no text is highlighted, it'll insert the chat message at the last cursor position in the active file	2024-05-26 13:55:22 +05:30
Debanjum Singh Solanky	032ad3b521	Add ability to copy messages to clipboard from Obsidian Khoj chat	2024-05-26 13:55:22 +05:30
Debanjum Singh Solanky	57f1c53214	Create Nav bar for Obsidian pane. Use abstract View class for reuse - Jump to chat, show similar actions from nav menu of Khoj side pane - Add chat, search icons from web, desktop app - Use lucide icon for find similar (for now) - Match proportions of find similar icon to khoj other icons via css, js - Use KhojPaneView abstract class to allow reuse of common functionality like - Creating the nav bar header in side pane views - Loading geo-location data for chat context This should make creating new views easier	2024-05-26 13:55:22 +05:30
sabaimran	e2922968d6	Move some gifs to the assets s3 bucket and add instructions for Ollama, shareable conversations	2024-05-25 01:08:20 +05:30
sabaimran	e23c803cee	Release Khoj version 1.12.1	2024-05-24 21:42:03 +05:30
sabaimran	0308699849	Use links from assets.khoj.dev to render images in the automations page	2024-05-24 20:18:02 +05:30
sabaimran	3f9c20a399	Make it easier to manage server-level chat settings (#729) * Add support for server-wide model settings fix web page reading results returning logic	2024-05-24 20:15:18 +05:30
sabaimran	cbbbe2da9a	Add a schedule picker and automations preview func (#747 ) * Update suggested automations * add a schedule picker when creating an automation * Create a new conversation in flow of the automation scheduling in order to send a preview and deliver more consistent results * Start adding in scaffolding to manually trigger a test job for an automation * Add support for manually triggering automations for testing * Schedule automation asynchronously * Update styling of the preview button * Improve admin lookup experience and prevent jobs from being scheduled to run every minute of everyday * Ignore mypy issues on job info short description	2024-05-24 19:42:47 +05:30
Ikko Eltociear Ashimine	ac3e5089a2	docs: update typo in desktop.md (#744) reponses -> responses	2024-05-24 03:52:03 +05:30
Md. Shahnewaz Siddique	3af06a3d5a	Updated installation instructions for windows, linux in readme (#741)	2024-05-24 03:51:25 +05:30
sabaimran	4511c6ae7c	Fix bug in chat feedback flow - user message not included during live chat	2024-05-21 14:55:39 -05:00
Desmond	a3c6045328	Merge remote-tracking branch 'origin/master'	2024-05-21 21:55:53 +08:00
Desmond	b0630c1a98	Simplify partition	2024-05-21 21:52:01 +08:00
sabaimran	0b7910d4af	Pin the langchain-community version explicitly	2024-05-21 05:26:17 -05:00
Raghav Tirumale	d57772f9e7	Add Feedback Buttons on Chat (#721) ### Description and Rationale for Changes This feature includes thumbs up and thumbs down buttons on Khoj's chat responses that provide automated feedback. When a thumbs up/down button is clicked, the code sends an email to team@khoj.dev with the following: * user query * khoj's response * whether the sentiment of the user was good or bad. This is critical in improving Khoj's nondeterministic LLM model for a better user experience. ### List of Changes * new endpoint in `api_chat.py` (/feedback) that can be used to trigger mail sending. * thumbs up and thumbs down buttons implemented in `chat.html` * new function in `routers/email.py` to handle feedback email sending via resend * `feedback.html` template for a formatted email with the feedback. --------- Co-authored-by: mythicalcow <mythicalcow@linux.myguest.virtualbox.org> Co-authored-by: sabaimran <narmiabas@gmail.com>	2024-05-20 16:29:08 -05:00
Debanjum	f941948d11	Merge pull request #738 from joshavant/patch-1 Improve telemetry.md disabling instructions in docs	2024-05-17 21:43:42 +05:30
Josh Avant	37ad1d5397	Update telemetry.md disabling instructions	2024-05-15 15:18:00 -05:00
sabaimran	7feaf34702	Fix capitalization, update suggested prompt	2024-05-10 02:36:13 -07:00
sabaimran	b545aceb47	Use a simpler example for the sample automation and put schedule on top of instructions	2024-05-09 13:53:19 -07:00
sabaimran	2b8e5a86cc	Update version for resend library in pyproject.toml	2024-05-09 13:43:27 -07:00
sabaimran	7ae00832bd	Rename from parameter to sender in resend call	2024-05-09 13:29:39 -07:00
sabaimran	fbd76f8ebe	Improve the UX of automations (#737) * Improve the automations UX - Add suggested jobs to eliminate some of the cold start problem - Make each of the task cards clickable/editable * Hide suggested automations that have already been added * Add a footer and reapply styling when a save action is taken on a card	2024-05-09 01:29:48 -07:00
sabaimran	70d0ee4310	Only remove the process lock from a process that created it	2024-05-08 10:14:52 -07:00
Desmond Deng	20303feb3a	Merge branch 'khoj-ai:master' into master	2024-05-08 13:46:34 +08:00
Desmond	150cd18bf3	Update batch-size	2024-05-08 13:44:22 +08:00
Desmond	192cd53003	Batch send of index files	2024-05-08 13:38:40 +08:00
sabaimran	a50deb2762	Add better handling for empty responses	2024-05-07 11:49:33 -07:00
sabaimran	4aed6bd274	Add an admin view for subscriptions	2024-05-07 11:48:52 -07:00
sabaimran	77626d28d1	Include stack trace when automation is not successfully created	2024-05-07 06:52:41 -07:00
sabaimran	0c8c565ab0	Don't include the whole stack trace for an integrity error	2024-05-07 06:48:18 -07:00
Debanjum Singh Solanky	0a1a6cd041	Get detailed user info in Obsidian from the new v1/user API Previously we were just getting user email from the /health API Instead store the retrieved user info in the user settings	2024-05-07 04:37:26 +08:00
Debanjum Singh Solanky	f8f9d066db	Focus on input field, scroll to latest message on opening chat pane Previously scroll and chat input focus weren't applied as view hadn't been rendered yet	2024-05-07 04:37:26 +08:00
Debanjum Singh Solanky	9f65e8de98	Open Khoj Chat as a Pane instead of a Modal - Allows having it open on the side as you traverse your Obsidian notes - Allow faster time to response, having responses visible for context - Enables ambient interactions	2024-05-07 04:37:26 +08:00
sabaimran	9ae828cf11	Use assets.khoj.dev for loading math katex rendering	2024-05-07 01:43:46 +08:00
sabaimran	cf0b7628d0	Add the url scheme to the public share url	2024-05-06 21:37:49 +08:00
sabaimran	f6aaecb04f	Fix construction method for public share conversation URL	2024-05-06 08:32:51 +05:30
sabaimran	14c9bea663	Make conversations optionally shareable (#712) * Make conversations optionally shareable - Shared conversations are viewable by anyone, without a login wall - Can share a conversation from the three dot menu - Add a new model for Public Conversation - The rationale for a separate model is that public and private conversations have different assumptions. Separating them reduces some of the code specificity on our server-side code and allows for easier interpretation and stricter security. Separating the data model makes it harder to accidentally view something that was meant to be private - Add a new, read-only view for public conversations	2024-05-05 23:16:04 +05:30
Debanjum Singh Solanky	80cbaca935	Serve generated images from Khoj domain instead of directly from S3 Use CNAME to forward requests from the khoj subdomain to the equivalent S3 bucket	2024-05-04 20:07:10 +05:30
Debanjum Singh Solanky	425496844b	Rename assets URL from Khoj S3 bucket to assets.khoj.dev Serve Khoj assets from Khoj domain	2024-05-04 20:07:10 +05:30
sabaimran	88daa841fd	Rename process lock migration and add a reverse migration step	2024-05-04 20:05:00 +05:30
sabaimran	509a8a412c	Throw an error if trying to create a process lock that already exists. Names should be unique	2024-05-04 19:03:53 +05:30
sabaimran	7100614de5	Add support for rendering math equations in the web view (#733 ) - Add parsing logic for LaTeX-format math equations in the web chat - Add placeholder delimiters when converting the markdown to HTML in order to avoid removing the escaped characters - Add the `<!DOCTYPE html>` specification to the page	2024-05-04 15:59:17 +05:30
Debanjum Singh Solanky	d9b3482b1a	Show error when required fields to create automation are not set	2024-05-04 11:17:30 +05:30
Debanjum Singh Solanky	91a5643c5c	Use Preview label for Automate feature. Prefix mailto: link to contact	2024-05-04 10:59:17 +05:30
Debanjum Singh Solanky	fd2328ab40	Do not hard code base url of path to automation icon in chat message	2024-05-04 10:59:07 +05:30
sabaimran	a38f3227e2	Revert domain in task send emails	2024-05-03 15:27:27 +05:30
sabaimran	a1263951e9	Use mail to in email contact link	2024-05-03 12:16:56 +05:30
sabaimran	7c9847fe48	Increase jitter to 60	2024-05-03 11:38:22 +05:30
sabaimran	737ebfd521	Make improvements to online search prompts and use a custom domain for automations emails	2024-05-03 10:47:42 +05:30
sabaimran	42e9504ba8	Use a different function for getting last run time, avoid async/sync issues	2024-05-02 12:13:45 +05:30
sabaimran	9e8491b814	Add experimental disclaimers to the automations	2024-05-02 11:40:37 +05:30
sabaimran	c418449311	Add additional robustness in verifying job execution parameters at run time	2024-05-02 11:13:04 +05:30
sabaimran	690e9d8ed3	Collapse the reminders after they're successfully scheduled	2024-05-02 09:55:04 +05:30
sabaimran	6b648ee3ad	Add experimental disclaimer in the automation page	2024-05-02 09:21:27 +05:30
sabaimran	f4fbc91515	Remove the exclamation point from the email	2024-05-01 19:01:51 +05:30
sabaimran	bddd1d0fcb	Quip, smart reminders	2024-05-01 16:39:07 +05:30
sabaimran	bc8b92a77d	Release Khoj version 1.12.0	2024-05-01 16:30:48 +05:30
sabaimran	9d02c354dd	Merge pull request #732 from khoj-ai/fit-and-finish/schedule-tasks Fixes and improves for scheduled tasks	2024-05-01 03:16:09 -07:00
sabaimran	b499851097	Use the cleaned query as the reference query in the email notification	2024-05-01 15:33:11 +05:30
sabaimran	f24495e0e6	Fix time zone used in query history. Closes #694	2024-05-01 15:31:48 +05:30
sabaimran	7fd57d737e	Adjustments to improve overall styling of config page, email template	2024-05-01 14:19:47 +05:30
sabaimran	28578310d1	Add log line when sending a task-related email	2024-05-01 13:56:02 +05:30
sabaimran	a86f95117e	Add the subject generation prompt and helper method	2024-05-01 13:55:32 +05:30
sabaimran	c30ba2e551	Set subject dynamically when creating new tasks, and make some minor improvements to the automations UI	2024-05-01 13:54:59 +05:30
sabaimran	d1b2037676	Shutdown the scheduler when the application is exiting	2024-05-01 13:53:34 +05:30
Debanjum	10f623154e	Enable Creating Automations from Khoj (#731 ) ## Support Scheduling Automations (#695) 1. Detect when user intends to schedule a task, aka reminder - Support new `reminder` output mode to the response type chat actor - Show examples of selecting the reminder output mode to the response type chat actor 2. Extract schedule time (as cron timestring) and inferred query to run from user message 3. Use APScheduler to call chat with inferred query at scheduled time ## Make Automations Persistent (#714) - Make scheduled jobs persistent and work in multiple worker setups - Add new operation Scheduled Job to Operation enum of ProcessLock ## Add UX to Configure Scheduled Tasks (#715) - Add section in settings page to view, delete your scheduled tasks - Add API endpoints to get and delete user scheduled tasks ## Make Automations more Robust. Improve UX (#718) - Decouple Task Run from User Notification - Make Scheduling more Robust - Use JSON mode to get parse-able output from chat model - Make timezone calculation programmatic on server instead of asking chat model - Use django-apscheduler to handle apscheduler and django interfacing - Improve automation UX. Move it out into separate top level page - Allow creating, modifying automations from the automations page - Infer cron from natural language client side to avoid roundtrip	2024-05-01 11:08:19 +05:30
Debanjum Singh Solanky	89a8dbb81a	Fix edit job API. Use user timezone, pass all reqd. params to automation - Pass user and calling_url to the scheduled chat too when modifying params of automation - Update to use user timezone even when update job via API - Move timezone string to timezone object calculation into the schedule automation method	2024-05-01 10:29:49 +05:30
Debanjum Singh Solanky	19c5af3ebc	Handle natural language to cron translation error on web client	2024-05-01 09:10:18 +05:30
Debanjum Singh Solanky	70ee9ddf91	Merge migrations from main with feature branch	2024-05-01 09:10:18 +05:30
Debanjum Singh Solanky	8f28f6cc1e	Remove now unused location data from being passed to automation funcs	2024-05-01 08:48:16 +05:30
Debanjum Singh Solanky	815966cb25	Unify, modularize DB adapters to get automation metadata by user further	2024-05-01 08:47:50 +05:30
Debanjum Singh Solanky	21bdf45d6f	Add link to Automate page in nav pane of the web app	2024-05-01 08:47:50 +05:30
Debanjum Singh Solanky	bd5008136a	Move automations into independent page. Allow direct automation - Previously it was a section in the settings page. Move it to independent, top-level page to improve visibility of feature - Calculate crontime from natural language on web client before sending it to the server for saving new/updated schedule to disk. - Avoids round-trip of call to chat model - Convert POST /api/automation API endpoint into a direct request for automation with query_to_run, subject and schedule provided via the automation page. This allows more granular control to create automation - Make the POST automations endpoint more robust; runs validation checks, normalizes parameters	2024-05-01 08:47:48 +05:30
Debanjum Singh Solanky	cbc8a02179	Make, use func for constructing the automation created response - Dedupe logic across http, ws chat API endpoints - Reduces size of already too long http, ws chat API endpoint funcs	2024-05-01 08:30:10 +05:30
Debanjum Singh Solanky	c52ed333fa	Make content, cards on config pages occupy the whole middle column - Make the config page content use the same top level 3-column layout as the khoj-header-wrapper This ensures the content is aligned with heading pane width - Let cards and other settings sections scale to the width of their grid element. This utilizes more of the screen space and does it consistently across the different settings pages	2024-05-01 08:30:10 +05:30
sabaimran	ad4145e48c	Fix unique hash for job id	2024-05-01 08:30:10 +05:30
sabaimran	311d58e1ed	Ensure the automated_task command is removed from the prepended query	2024-05-01 08:30:10 +05:30
sabaimran	eb65532386	Use Django ap scheduler in place of the sqlalchemy one	2024-05-01 08:30:10 +05:30
sabaimran	06213ea814	Fix token retrieval when executing the job and name async job appropriately	2024-05-01 08:30:10 +05:30
sabaimran	ca8a7d8368	Revert sync -> async in send welcome email method	2024-05-01 08:30:10 +05:30
Debanjum Singh Solanky	6936875a82	Use DB adapter to unify logic to get, delete automation by auth user Use one place for the logic to get, view, delete (and edit soon) automations by (authenticated) user, instead of scattering it across code	2024-05-01 08:30:10 +05:30
Debanjum Singh Solanky	1238cadd31	Allow editing query-to-run from the automation config section	2024-05-01 08:30:10 +05:30
Debanjum Singh Solanky	cb2b1dccc5	Add icon for Automation feature. Replace old icons for delete, new	2024-05-01 08:30:10 +05:30
Debanjum Singh Solanky	23f2057868	Allow creating automations from automation settings section in web ui - Create new POST API endpoint to create automations - Use it in the settings page on the web interface to create new automations This simplified managing automations from the setting page by allowing both delete and create from the same page	2024-05-01 08:30:10 +05:30
Debanjum Singh Solanky	2f9241b5a3	Rename scheduled task to automations across code and UX - Fix query, subject parameters passed to email template - Show 12 hour scheduled time in automation created chat message	2024-05-01 08:30:10 +05:30
Debanjum Singh Solanky	230d160602	Improve rendering task scheduled settings view and message - Render crontime string in natural language in message & settings UI - Show more fields in tasks web config UI - Add link to the tasks settings page in task scheduled chat response - Improve task variables names Rename executing_query to query_to_run. scheduling_query to scheduling_request	2024-05-01 08:30:10 +05:30
Debanjum Singh Solanky	d341b1efe8	Store, retrieve task metadata from the job name field	2024-05-01 08:30:10 +05:30
Debanjum Singh Solanky	ae10ff4a5f	Create create_scheduled_task func to dedupe logic across ws, http APIs Previously, both the websocket and http endpoint were implementing the same logic. This was becoming too unwieldy	2024-05-01 08:30:10 +05:30
Debanjum Singh Solanky	8dfa0bf047	Simplify task scheduler prompt. No timezone conversion. Infer subject - Make timezone aware scheduling programmatic, instead of asking the chat model to do the conversion. This removes the need for scratchpad and may let smaller models handle the task as well - Make chat model infer subject for email. This should make the notification email more readable - Improve email by using subject in email subject, task heading. Move query to email final paragraph, which is where task metadata should go	2024-05-01 08:30:10 +05:30
Debanjum Singh Solanky	2c563ad280	Use hash of query in process lock id to standardize id format - Using inferred_query directly was brittle (like previous job id) - And process lock id had a limited size, so wouldn't work for larger inferred query strings	2024-05-01 08:30:10 +05:30
Debanjum Singh Solanky	3ce06a938c	Render scheduled task response as html to improve readability in email	2024-05-01 08:30:10 +05:30
Debanjum Singh Solanky	c17dbbeb92	Render next run time in user timezone in config, chat UIs - Pass timezone string from ipapi to khoj via clients - Pass this data from web, desktop and obsidian clients to server - Use user tz to render next run time of scheduled task in user tz	2024-05-01 08:30:10 +05:30
Debanjum Singh Solanky	6736551ba3	Improve scheduled task text rendered in UI	2024-05-01 08:30:10 +05:30
Debanjum Singh Solanky	0e01362469	Merge DB migrations from master with those from scheduled task feature	2024-05-01 08:30:10 +05:30
Debanjum Singh Solanky	a5ed4f2af2	Send email to share results of scheduled task	2024-05-01 08:30:10 +05:30
Debanjum Singh Solanky	69775b6d6e	Add /task command. Use it to disable scheduling tasks from tasks This takes the load off the task scheduling chat actor / prompt of having to artificially differentiate a query to create a scheduled task from a scheduled task run.	2024-05-01 08:30:10 +05:30
Debanjum Singh Solanky	22289a0002	Improve task scheduling by using json mode and agent scratchpad - The task scheduling actor was having trouble calculating the timezone. Giving the actor a scratchpad to improve correctness by thinking step by step - Add more examples to reduce chances of the inferred query looping to create another reminder instead of running the query and sharing results with user - Improve task scheduling chat actor test with more tests and by ensuring unexpected words not present in response	2024-05-01 08:30:10 +05:30
Debanjum Singh Solanky	7f5981594c	Only notify when scheduled task results satisfy user's requirements There's a difference between running a scheduled task and notifying the user about the results of running the scheduled task. Decide to notify the user only when the results of running the scheduled task satisfy the user's requirements. Use sync version of send_message_to_model_wrapper for scheduled tasks	2024-05-01 08:30:10 +05:30
Debanjum Singh Solanky	7e084ef1e0	Improve job id. Fix refreshing list of jobs on delete from config page	2024-05-01 08:28:59 +05:30
Debanjum Singh Solanky	a1e5195c8b	Save separate user message time from Khoj response time in chat logs Previously user message time was being stored the same as Khoj response time in conversation logs.	2024-05-01 08:28:59 +05:30
Debanjum Singh Solanky	5133b6e73b	Minor improvements to styling the config page	2024-05-01 08:28:59 +05:30
Debanjum Singh Solanky	648f1a5c71	Suffix chat response element vars with "El" in chat.html of web, desktop apps	2024-05-01 08:28:59 +05:30
Debanjum Singh Solanky	98d0ffecf1	Add section in settings page to view, delete your scheduled tasks	2024-05-01 08:28:59 +05:30
Debanjum Singh Solanky	423d61796d	Add API endpoints to get and delete user scheduled tasks	2024-05-01 08:28:59 +05:30
Debanjum Singh Solanky	af0972c539	Make scheduled jobs persistent and work in multiple worker setups - Store scheduled job state in Postgres so job schedules persist across app restarts - Use Process Locks to only allow single worker to process a given job type. This prevents duplicating job runs across all workers	2024-05-01 08:28:59 +05:30
Debanjum Singh Solanky	fcf878e1f3	Add new operation Scheduled Job to Operation enum of ProcessLock	2024-05-01 08:28:59 +05:30
Debanjum Singh Solanky	c28d7d3414	Add basic chat actor test to infer scheduled queries	2024-05-01 08:28:59 +05:30
Debanjum Singh Solanky	c11742f443	Add chat actor to schedule run query for user at specified times - Detect when user intends to schedule a task, aka reminder Add new output mode: reminder. Add example of selecting the reminder output mode - Extract schedule time (as cron timestring) and inferred query to run from user message - Use APScheduler to call chat with inferred query at scheduled time - Handle reminder scheduling from both websocket and http chat requests - Support constructing scheduled task using chat history as context Pass chat history to scheduled query generator for improved context for scheduled task generation	2024-05-01 08:28:59 +05:30
Debanjum Singh Solanky	9e068fad4f	Handle null ref, when refresh conversation from db in websocket chat	2024-04-30 14:19:07 +05:30
sabaimran	37879a7850	Release Khoj version 1.11.2	2024-04-30 13:31:06 +05:30
sabaimran	93b41170d1	Refresh the conversation log from the db before addressing the next query	2024-04-30 13:27:51 +05:30
Debanjum Singh Solanky	f1545d2b2f	Add, fix help link, improve title style in web ui config pages - Align title text with icon better in all config cards - Fix help link to github setup docs - Fix help link to notion setup docs	2024-04-30 05:50:08 +05:30
Debanjum Singh Solanky	e6da0f9a8c	Fix response type of delete client tokens API endpoint Previously the delete API response failed after deleting the token. Required a page refresh to see that the API token was actually gone. This was happening because the response type of the delete token API endpoint isn't a string, so it failed FastAPI response validation checks.	2024-04-30 02:46:52 +05:30
sabaimran	0f4c3518d3	Allow session cookies to be stored with a lax policy for some localhost scenarios	2024-04-29 15:48:45 +05:30
sabaimran	5beedc9734	Use Secure proxy ssl header only if no https	2024-04-29 15:33:21 +05:30
sabaimran	408f4780ce	Add and update documentation for setting up khoj with an openai proxy server or offline llm	2024-04-27 20:16:32 +05:30
sabaimran	12258f02d7	Release Khoj version 1.11.1	2024-04-27 18:42:24 +05:30
sabaimran	2047b0c973	Support customization of the OpenAI base url in admin settings (#725 ) - Allow self-hosted users to customize their open ai base url. This allows you to easily use a proxy service and extend support for other models. - This also includes a migration that associates any existing openai chat model configuration with an openai processor configuration - Make changing model a paid/subscriber feature - Removes usage of langchain's OpenAI wrapper for better control over parsing input/output	2024-04-27 18:24:35 +05:30
sabaimran	49834e3b00	Add a hero image for the og:image meta tag	2024-04-27 17:07:21 +05:30
sabaimran	138f12f957	Fix indentation and revert first run message link styling to all links	2024-04-27 09:56:58 +05:30
Debanjum Singh Solanky	4395ed8065	Improve extract_questions func. Set message role to user, not assistant Previous behavior of passing message with role = "assistant" was reducing the instruction following quality of the model	2024-04-26 11:55:22 +05:30
Debanjum Singh Solanky	346499f12c	Fix, improve args being passed to chat_completion args - Allow passing completion args through completion_with_backoff - Pass model_kwargs in a separate arg to simplify this - Pass model in `model_name' kwarg from the send_message_to_model func `model_name' kwarg is used by langchain, not `model' kwarg	2024-04-26 11:55:22 +05:30
sabaimran	d8f2eac6e0	Release Khoj version 1.11.0	2024-04-25 17:24:59 +05:30
Debanjum Singh Solanky	1842017393	Skip trying to index deleted files, folders from Desktop app Previously app would crash on startup if desktop app was told to index a file that had been deleted afterwards	2024-04-25 15:23:05 +05:30
Debanjum	17a06f152c	Support Llama 3 and Improve Offline Chat Actors (#724) - Add support for Llama 3 in Khoj offline mode - Make chat actors generate valid json with more local models - Fix offline chat actor tests	2024-04-25 14:00:56 +05:30
Debanjum	220e5516ab	Make Search Models More Configurable. Upgrade Default Cross-Encoder (#722) - Upgrade default cross-encoder to mixedbread ai's mxbai-rerank-xsmall - Support more embedding models by making query, docs encoding configurable	2024-04-25 13:55:49 +05:30
Debanjum Singh Solanky	cf08eaf786	Add comments explaining each field in the search model config in DB	2024-04-25 13:54:13 +05:30
Debanjum	4ee5ac7c20	Fix Chat UI and Indexing on Desktop App (#723 ) - Make valid file extension checking case insensitive on Desktop app - Skip indexing non-existent folders on Desktop app - Pass auth headers to fix lazy load of chat messages on Desktop app - Set chat-message height to height of content in web, desktop	2024-04-24 18:49:03 +05:30
Debanjum Singh Solanky	89ef23de50	Upgrade gunicorn and make it only a production dependency	2024-04-24 11:28:55 +05:30
Debanjum Singh Solanky	799efb5974	Create DB migration to add new fields and change default cross-encoder	2024-04-24 09:50:34 +05:30
Debanjum Singh Solanky	ec41482324	Upgrade default cross-encoder to mixedbread ai's mxbai-rerank-xsmall Previous cross-encoder model was a few years old, newer models should have improved in quality. Model size increases by 50% compared to previous for better performance, at least on benchmarks	2024-04-24 09:50:09 +05:30
Debanjum Singh Solanky	7eaf9367fe	Support more embedding models by making query, docs encoding configurable Most newer, better embeddings models add a query, docs prefix when encoding. Previously Khoj admins couldn't configure these, so it wasn't possible to use these newer models. This change allows configuring the kwargs passed to the query, docs encoders by updating the search config in the database.	2024-04-24 09:49:17 +05:30
Debanjum Singh Solanky	f2db8d7d99	Fix offline chat actor tests Do not check for original q in extracted questions. Since this was removed in a previous commit	2024-04-24 09:40:00 +05:30
Debanjum Singh Solanky	4f7237b158	Make chat actors generate valid json with more local models Improve tool, online search, webpage links, docs search chat actor prompts. Ensure works with hermes-2-pro and llama-3. Be more specific about generating JSON and not saying anything else.	2024-04-24 09:40:00 +05:30
Debanjum Singh Solanky	a2e4e4bede	Add support for Llama 3 in Khoj offline mode - Improve extract question prompts to explicitly request JSON list - Use llama-3 chat format if HF repo_id mentions llama-3. The llama-cpp-python logic for detecting when to use llama-3 chat format isn't robust enough currently	2024-04-24 09:40:00 +05:30
Debanjum Singh Solanky	8e77b3dc82	Fix infer_max_tokens func when configured_max_tokens is set to None	2024-04-24 09:36:29 +05:30
Debanjum Singh Solanky	8196ab62f9	Make valid file extension checking case insensitive on Desktop app	2024-04-24 09:35:20 +05:30
Debanjum Singh Solanky	5def14e3bb	Skip indexing non-existent folders on Desktop app	2024-04-24 09:35:20 +05:30
Debanjum Singh Solanky	cd05f262a6	Pass auth headers to fix lazy load of chat messages on Desktop app	2024-04-24 09:35:20 +05:30
Debanjum Singh Solanky	4d5d3e6433	Set chat-message height to height of content in web, desktop In some cases, especially with image generation requests, this was causing the chat messages to overlap in the chat UI	2024-04-24 09:35:20 +05:30
sabaimran	60658a8037	Get rid of enable flag for the offline chat processor config - Default, assume that offline chat is enabled if there is an offline chat model option configured	2024-04-23 23:08:29 +05:30
sabaimran	ac474fce38	Ensure that the tokenizer and max prompt size are used in the wrapper method	2024-04-23 21:22:23 +05:30
Olatoyan George	ad59180fb8	Added indication in the desktop UI for back-end connectivity (#711 ) * Changed the styling of the link that takes a user to the settings page into a button * added an indicator that shows if a user is connected to the server or not * made a class name more descriptive and also made the text in first run message more intuitive * changed the command to install dependencies in the README.md * changed the class name of the first run message text to be more descriptive * added icons in the desktop UI that shows if a file is synced successfully or not * made the link class name in the homepage more descriptive * fixed the hover issue on status box in the chat header pane * fixed hovering issue on status box on macOS	2024-04-23 16:43:48 +05:30
Debanjum	419b044ac5	Use set, inferred max token limits wherever chat models are used (#713 ) - User configured max tokens limits weren't being passed to `send_message_to_model_wrapper' - One of the load offline model code paths wasn't reachable. Remove it to simplify code - When max prompt size isn't set infer max tokens based on free VRAM on machine - Use min of app configured max tokens, vram based max tokens and model context window	2024-04-23 16:42:35 +05:30
AjaySDwivedi1	abf6f963ea	Replaced reinitialize and save all button to a sync button in config.… (#701 ) Replaced reinitialize and save all button to a sync button in config	2024-04-23 16:42:11 +05:30
Debanjum Singh Solanky	c39c4e4ec4	Improve prompt for online search query generation chat actor - Allow searching github, pypi for information about Khoj - Enable creating multiple search queries by rewording prompt	2024-04-22 01:32:11 +05:30
Debanjum Singh Solanky	175169c156	Use set, inferred max token limits wherever chat models are used - User configured max tokens limits weren't being passed to `send_message_to_model_wrapper' - One of the load offline model code paths wasn't reachable. Remove it to simplify code - When max prompt size isn't set infer max tokens based on free VRAM on machine - Use min of app configured max tokens, vram based max tokens and model context window	2024-04-20 11:23:28 +05:30
Debanjum Singh Solanky	002cd14a65	Only let agent use online search tool if connected to it	2024-04-20 11:19:48 +05:30
Debanjum Singh Solanky	75c9ebbc54	Only show uvicorn debug logs at higher verbosity levels Don't automatically show the uvicorn logs when in_debug_mode, only show them at verbosity of at least 2, i.e. when starting khoj with the -vv flag	2024-04-20 11:18:01 +05:30
sabaimran	c6d668bacf	Bump gunicorn workers per server up to 2	2024-04-18 11:32:51 +05:30
sabaimran	c9a8abafa4	Merge pull request #710 from khoj-ai/add-run-with-process-lock-and-fix-edge-cases Extract run with process lock logic into func. Use it to re-index content	2024-04-17 01:29:02 -07:00
sabaimran	6de4a4873a	Fix image-related client unit test	2024-04-17 13:28:48 +05:30
sabaimran	3132430737	Add tests for the db lock	2024-04-17 13:22:41 +05:30
sabaimran	d11354f9c8	Remove additional references to image content config	2024-04-17 13:00:50 +05:30
sabaimran	105dbf49e4	Fix max_duration_in_seconds for the update_embeddings job	2024-04-17 13:00:18 +05:30
Debanjum Singh Solanky	8e0bae894d	Extract run with process lock logic into func. Use for content reindexing	2024-04-17 12:31:19 +05:30
Debanjum Singh Solanky	e9f608174b	Fix access to Khoj admin panel from non HTTPS custom domains To access the Khoj admin panel from a non HTTPS custom domain the `KHOJ_NO_SSL' and `KHOJ_DOMAIN' env vars need to be explicitly set. See the updated setup docs for details. Resolves #662	2024-04-17 03:20:05 +05:30
sabaimran	46210695b6	pin version of huggingface hub explicitly to ensure relevant constants are present. Closes #708	2024-04-17 01:09:36 +05:30
sabaimran	b0059654c9	Do not create an import error if the resend module is not available	2024-04-17 01:00:22 +05:30
sabaimran	f04ead7c37	Remove setting up log line for configuring image search	2024-04-17 00:45:39 +05:30
sabaimran	0208688801	Increase factor for n_ctx reduction to 2e6	2024-04-17 00:41:36 +05:30
Debanjum Singh Solanky	1f2ffce85b	Copy chat message with its markdown formatting in Web, Desktop apps	2024-04-16 22:10:34 +05:30
sabaimran	91c8b137f1	Add a database lock for jobs that shouldn't be run by multiple workers (#706) * Add a database lock for jobs that shouldn't be run by multiple workers * Import relevant functions from utils.helpers	2024-04-16 21:29:27 +05:30
sabaimran	adb2e8cc5f	Check if n is populated before making a comparison	2024-04-16 02:05:58 +05:30
Debanjum Singh Solanky	6707ccc463	Check before updating "chat" key in meta_log in chat history API endpoint	2024-04-15 21:06:47 +05:30
Debanjum Singh Solanky	4e7812fe55	Use Django management cmd to update inline images in DB to/from WebP/PNG This provides Khoj server admins more control on migrating their S3 images to WebP format from PNG	2024-04-15 20:19:49 +05:30
Debanjum Singh Solanky	7fab8d6586	Only use chat messages count in history API endpoint when set by client	2024-04-15 19:12:57 +05:30
Debanjum	6b3ef61dd2	Improve Chat Page Load Perf, Offline Chat Perf and Miscellaneous Fixes (#703) ### Store Generated Images as WebP - `78bac4ae` Add migration script to convert PNG to WebP references in database - `c6e84436` Update clients to support rendering webp images inline - `d21f22ff` Store Khoj generated images as webp instead of png for faster loading ### Lazy Fetch Chat Messages to Improve Time, Data to First Render This is especially helpful for long conversations with lots of images - `128829c4` Render latest msgs on chat session load. Fetch, render rest as they near viewport - `9e558577` Support getting latest N chat messages via chat history API ### Intelligently set Context Window of Offline Chat to Improve Performance - `4977b551` Use offline chat prompt config to set context window of loaded chat model ### Fixes - `148923c1` Fix to raise error on hitting rate limit during Github indexing - `b8bc6bee` Always remove loading animation on Desktop app if can't login to server - `38250705` Fix `get_user_photo` to only return photo, not user name from DB ### Miscellaneous Improvements - `689202e0` Update recommended CMAKE flag to enable using CUDA on linux in Docs - `b820daf3` Makes logs less noisy	2024-04-15 18:34:29 +05:30
Debanjum Singh Solanky	a352940dfd	Use Django management command to update images URL in DB to WebP This provides Khoj server admins more control on migrating their S3 images to WebP format from PNG	2024-04-15 17:53:41 +05:30
Debanjum Singh Solanky	7d8e8eb0cf	Use Enum to type text-to-image intent of Khoj chat response	2024-04-15 17:53:40 +05:30
Debanjum Singh Solanky	128829c477	Show latest msgs on chat session load. Fetch rest as they near viewport - Reduces time to first render when loading long chat sessions - Limits size of first page load, when loading long chat sessions These performance improvements are maximally felt for large chat sessions with lots of images generated by Khoj Updated web and desktop app to support these changes for now	2024-04-15 16:10:56 +05:30
Debanjum Singh Solanky	9e5585776c	Support getting latest N chat messages via chat history API Get latest N if N > 0, else return all messages except latest N from the conversation	2024-04-15 15:32:32 +05:30
Debanjum Singh Solanky	e5ff85f6fb	Start fetching khoj css before icons to reduce time with no styling This should reduce frequency of page load jitter when icons are loaded before style is applied	2024-04-15 15:32:32 +05:30
Debanjum Singh Solanky	d5de59d411	Do not assume results key present in notion content when indexing	2024-04-15 08:02:20 +05:30
Debanjum Singh Solanky	4977b55106	Use offline chat prompt config to set context window of loaded chat model Previously you couldn't configure the n_ctx of the loaded offline chat model. This made it hard to use good offline chat models (which these days also have larger context) on machines with lower VRAM	2024-04-14 02:35:36 +05:30
Debanjum Singh Solanky	689202e00e	Update recommended CMAKE flag to enable using CUDA on linux in Docs	2024-04-14 02:35:27 +05:30
Debanjum Singh Solanky	148923c13a	Fix to raise error on hitting rate limit during Github indexing	2024-04-13 22:09:13 +05:30
sabaimran	f24d71c71c	Improve the agents UX (#702) - Make the chat buttons look more clickable - Show agent name in new conversation message - Add an icon to the CTA to send agent a message	2024-04-13 20:11:37 +05:30
Debanjum Singh Solanky	78bac4ae05	Add migration script to convert PNG to WebP references in database	2024-04-13 19:06:28 +05:30
Debanjum Singh Solanky	c6e8443631	Update clients to support rendering webp images inline This is for self-hosted scenarios where AWS S3 uploads is not enabled	2024-04-13 13:11:18 +05:30
Debanjum Singh Solanky	d21f22ffa1	Store Khoj generated images as webp instead of png for faster loading	2024-04-13 13:03:32 +05:30
Debanjum Singh Solanky	b820daf38f	Makes logs less noisy - Show telemetry enabled/disabled state on init, not every 2 minutes - Convert no docs synced logs to debug level instead of warning Having synced docs isn't as important to use Khoj now, unlike before	2024-04-13 11:22:58 +05:30
Debanjum Singh Solanky	b8bc6bee83	Always remove loading animation on Desktop app if can't login to server	2024-04-13 11:02:44 +05:30
Debanjum Singh Solanky	382507051f	Fix get_user_photo to only return photo, not user name from DB	2024-04-13 11:02:30 +05:30
sabaimran	f06ec485cb	Fix redirect url process for login flow, existing user	2024-04-12 17:10:05 +05:30
sabaimran	87b9a93fa1	Update assertion line to match new logic	2024-04-12 13:09:19 +05:30
sabaimran	b86e68a29d	Make it easier to view agents in the admin page	2024-04-12 13:02:22 +05:30
sabaimran	e58bd0e485	Remove mbox file from list of files expected to be included	2024-04-12 12:55:22 +05:30
sabaimran	6634d603a8	Add links for contributors to use in the readme	2024-04-12 12:49:12 +05:30
sabaimran	1377a44a1a	Suppress debug logs from uvicorn.error to avoid clutter from websockets - If application is not in DEBUG_MODE	2024-04-12 12:12:16 +05:30
Debanjum Singh Solanky	89b8ec3546	Release Khoj version 1.10.2	2024-04-12 11:53:32 +05:30
Debanjum Singh Solanky	50b4788a91	Remove chat loading animation in login required state on Desktop app	2024-04-12 11:50:54 +05:30
Debanjum Singh Solanky	b3f4794d91	Remove the unnecessary async/await func chains on Desktop app	2024-04-12 11:49:25 +05:30
Debanjum Singh Solanky	1e30a072d4	Just use file ext to identify indexable files to fix Desktop app install - Magika on Desktop app was too bloated (100MB to 250MB) and broke install for some reason. Not sure why it was causing the app install to fail but do not have time to currently investigate - Just use a file extension whitelist, it's good enough for now. Let server handle the deeper identification of file type	2024-04-12 11:16:07 +05:30
Debanjum Singh Solanky	5c7797dbca	Only check content type if file extension cannot identify text file	2024-04-12 03:40:42 +05:30
Debanjum Singh Solanky	7d2ef728e6	Fix identifying pdf files on server A bug introduced in the previous commit stopped PDF files from being indexed, as it checked content_group instead of checking whether mime_type is application/pdf	2024-04-12 03:07:46 +05:30
Debanjum Singh Solanky	07f8fb5c5b	Release Khoj version 1.10.1	2024-04-12 02:18:07 +05:30
Debanjum Singh Solanky	a7d9102c33	Make identifying text, code files with Magika more robust on server Use identified content group rather than mime_type to find text files.	2024-04-12 02:12:26 +05:30
Debanjum Singh Solanky	60337086f9	Release Khoj version 1.10.0	2024-04-12 01:01:02 +05:30
Debanjum Singh Solanky	34c3f70203	Index only files with valid text extension in folders synced by Desktop app This maintains consistent set of indexable files from Desktop app, whether indexing via file or folder filters	2024-04-12 00:59:54 +05:30
Debanjum	9a48f72041	Index more text file types from Desktop, Github (#692) ### Index more text file types - Index all text, code files in Github repos. Not just md, org files - Send more text file types from Desktop app and improve indexing them - Identify file type by content & allow server to index all text files ### Deprecate Github Indexing Features - Stop indexing commits, issues and issue comments in a Github repo - Skip indexing Github repo on hitting Github API rate limit ### Fixes and Improvements - Fix indexing files in sub-folders from Desktop app - Standardize structure of text to entries to match other entry processors	2024-04-12 00:08:29 +05:30
Debanjum Singh Solanky	0819b83d0b	Fix constructing status update strings for intermediate chat steps	2024-04-11 20:31:32 +05:30
Debanjum Singh Solanky	d15b9bc272	Tell doc search actor to not generate online queries for doc search This can pick up irrelevant details from notes	2024-04-11 19:49:41 +05:30
Debanjum Singh Solanky	15a78b19ad	Improve Inferred Document Search Query Extraction from GPT Using stop_words = "\n" was preventing JSON responses with newlines in them	2024-04-11 19:24:04 +05:30
Debanjum Singh Solanky	653681967e	Show inferred document search queries in intermediate chat step on Web app	2024-04-11 19:24:04 +05:30
Debanjum Singh Solanky	997741119a	Show better intermediate steps when responding to chat via web socket - Show internet search, webpage read, image query, image generation steps - Standardize, improve rendering of the intermediate steps on the web app Benefits: 1. Improved transparency, allow users to see what Khoj is doing behind the scenes and modify their query patterns to improve response quality 2. Reduced websocket connection keep alive timeouts for long running steps	2024-04-11 18:04:40 +05:30
sabaimran	fae7900f19	Remove more	2024-04-11 00:27:44 +05:30
sabaimran	5d1dd3e2b7	If resend not enabled, don't send the welcome email	2024-04-10 23:52:42 +05:30
sabaimran	d2f9c43c8e	Use datetime.timezone.utc instead of datetime.utc	2024-04-10 23:07:43 +05:30
Debanjum Singh Solanky	f2dc9709b7	Use Magika to more robustly identify text files to send for indexing - `file-type' doesn't handle mis-labelled files or files without extensions well - Only show supported file types in file selector dialog on Desktop app Use Magika to get list of text file extensions. Combine with other supported extensions to get complete list of supported file extensions. Use it to limit selectable files in the File Open dialog. Note: Folder selector will index text files with no extensions as well	2024-04-10 22:44:24 +05:30
sabaimran	3fe94a67b0	Send welcome emails when a new user signs up (#691) * Don't trigger any re-indexing on server initialization * Integrate Resend to send welcome emails when a new user signs up - Only send if this is the first time they've signed in - Configure welcome email with basic styling, as more complex designs don't work and style tag did not work	2024-04-10 19:57:33 +05:30
Debanjum	6d153022f6	Improve nav pane, chat session UI on Desktop, Web app (#693) ### Enable copying chat messages. Improve copy button behavior and styling - Add button to copy chat messages on Desktop, Web apps - Improve copy button's icon, hover color & click animation in Desktop, Web apps ### Improve Navigation, Chat Session Panes on Desktop, Web apps - Dynamically generate navigation menu based on user info from server - Create API endpoint to get authenticated user information - Collapse navigation tabs into icons on mobile. Add spacing to them - Add Chat navigation tab back to top pane on Web app - Use proper icons for Search, Chat and Agents tab on navigation pane ### Miscellaneous Improvements - Make current chat expand to full width when session panel collapsed on Desktop App - Add chat session loading spinner to Desktop App (same as Web app) ### Fixes - Show title bar in Khoj desktop app on Windows to simplify close, minimize etc. - Only render first run setup message once if error or server not running - Fix showing Search navigation tab from Agent pages on web client	2024-04-10 19:54:12 +05:30
Debanjum Singh Solanky	48d249db9e	Center the nav item text and user profile initial icons	2024-04-10 19:38:43 +05:30
Debanjum Singh Solanky	60f6a1c6f1	Use svg icons in nav pane to standardize styling on Web, Desktop apps Emojis varied based on device. svg icons standardize icon styles of the web, desktop apps	2024-04-10 19:38:43 +05:30
Debanjum Singh Solanky	cccea484e4	Pass username, location context in system prompt instead of chat message The username and location in system prompt should disambiguate user context from user's actual message for the chat model. It doesn't need to be told to not mention the context or acknowledge the context instructions in its response, as it understands that this information is just context and not part of the user's actual message.	2024-04-10 15:05:33 +05:30
Debanjum Singh Solanky	804c04f7b9	Do not render copy message button on every Khoj thinking step Only render copy chat message button once, after message text is rendered	2024-04-10 14:48:36 +05:30
sabaimran	bb15c9605d	Add a sitemap plugin	2024-04-10 14:35:04 +05:30
sabaimran	a4afada746	Remove client-side timeouts for the khoj socket	2024-04-10 13:35:25 +05:30
Debanjum Singh Solanky	cadeaac769	Align conversation sessions side panel on Desktop app with Web app - Move new conversation button to right of "Conversation" title - Reduce size of chat message loading ellipsis animation - Add loading animation for chat session	2024-04-10 10:34:36 +05:30
Debanjum Singh Solanky	1c3d129e08	Add button to copy chat messages on Desktop client	2024-04-10 10:34:36 +05:30
Debanjum Singh Solanky	0a5a91619e	Improve copy button's icon, hover color & click animation in Desktop UI	2024-04-10 10:34:36 +05:30
Debanjum Singh Solanky	184873213c	Add button to copy chat messages on Web client	2024-04-10 10:34:36 +05:30
Debanjum Singh Solanky	f56522cb8e	Improve copy button's icon, hover color & click animation in Web UI	2024-04-10 10:34:36 +05:30
Debanjum Singh Solanky	8ff3890ba8	Dynamically generate navigation menu based on user info from server	2024-04-10 10:34:36 +05:30
Debanjum Singh Solanky	94c69eb8e3	Create API endpoint to get authenticated user information This helps clients render UI with user information	2024-04-09 21:04:44 +05:30
Debanjum Singh Solanky	377e979800	Make current chat expand to full width when session panel collapsed This behavior also matches web client behavior on chat session panel collapse	2024-04-09 21:04:44 +05:30
Debanjum Singh Solanky	913dcdfbcd	Only render first run setup message once if error or server not running	2024-04-09 21:04:44 +05:30
Debanjum Singh Solanky	3b630841bd	s/aget_all_filenames_by_source/get_all_filenames_by_source as sync func	2024-04-09 21:04:44 +05:30
Debanjum Singh Solanky	e45edbb992	Collapse navigation tabs into icons on mobile. Add spacing to them	2024-04-09 21:04:44 +05:30
Debanjum Singh Solanky	93edd5427f	Add Chat navigation tab back to top pane on web client Reduces user confusion on how to go to chat pane Add emojis for each tab to provide cleaner, iconified division between the nav options	2024-04-09 21:04:44 +05:30
Debanjum Singh Solanky	8159d1ab25	Fix showing Search navigation tab from Agent pages on web client The `has_documents' flag wasn't being passed. So the search tab was always showing up as empty instead of being dynamically enabled if documents had been indexed.	2024-04-09 21:04:44 +05:30
Debanjum Singh Solanky	76cb543347	Show title bar in Khoj desktop app on Windows	2024-04-09 21:04:44 +05:30
Debanjum Singh Solanky	f040418cf1	Fix indexing files in sub-folders on the Desktop app - `fs.readdir' func in node version 18.18.2 has buggy `recursive' option See nodejs/node#48640, effect-ts/effect#1801 for details - We were recursing down a folder in two ways on the Desktop app. Remove `recursive: True' option to the `fs.readdirSync' method call to recurse down via app code only	2024-04-09 20:19:40 +05:30
Debanjum Singh Solanky	a8dec1c9d5	Index all text, code files in Github repos. Not just md, org files	2024-04-09 20:19:40 +05:30
Debanjum Singh Solanky	8291b898ca	Standardize structure of text to entries to match other entry processors Add process_single_plaintext_file func etc with similar signatures as org_to_entries and markdown_to_entries processors The standardization makes modifications, abstractions easier to create	2024-04-09 20:19:40 +05:30
Debanjum Singh Solanky	079f409238	Skip indexing Github repo on hitting Github API rate limit Sleeping until the rate limit has passed is too expensive, as it keeps an app worker occupied. Ideally we should schedule a job to continue after the rate limit wait time has passed. But this can only be added once we support jobs scheduling.	2024-04-09 20:19:40 +05:30
Debanjum Singh Solanky	d5c9b5cb32	Stop indexing commits, issues and issue comments in Github indexer Normal indexing quickly hits Github rate limits. Purpose of exposing Github indexer is for indexing content like notes, code and other knowledge base in a repo. The current indexer doesn't scale to index metadata given Github's rate limits, so remove it instead of giving a degraded experience of partially indexed repos	2024-04-09 20:19:40 +05:30
Debanjum Singh Solanky	7ff1bd9f8b	Send more text file types from Desktop app and improve indexing them - Allow syncing more file types from desktop app to index on server - Use `file-type' package to identify valid text file types on Desktop app - Split plaintext entries into smaller logical units than a whole file Since the text splitting upgrades in #645, compiled chunks have more logical splits like paragraph, sentence. Show those (potentially) smaller snippets to the user as references - Tangential Fix: Initialize unbound currentTime variable for error log timestamp	2024-04-09 20:19:40 +05:30
Debanjum Singh Solanky	89915dcb4c	Identify file type by content & allow server to index all text files - Use Magika's AI for a tiny, portable and better file type identification system - Existing file type identification tools like `file' and `magic' require system level packages, that may not be installed by default on all operating systems (e.g `file' command on Windows)	2024-04-09 20:19:39 +05:30
sabaimran	312528d471	Fix typo in SECURE_PROXY_SSL_HEADER settings	2024-04-09 12:33:21 +05:30
sabaimran	e56c5e67dd	Revert SSL Redirect setting as it prevents the admin page from loading	2024-04-09 12:24:48 +05:30
sabaimran	1770bb174b	Add UUID to the KhojUser search fields and inc frequency of telemetry job to 2 mins	2024-04-09 11:51:51 +05:30
sabaimran	ab51ae9091	Use SECURE_SSL_REDIRECT to ensure requests are routed to https always	2024-04-09 10:18:12 +05:30
sabaimran	1c229dad91	Set daily limit for unsubscribed users to 5 in websocket API	2024-04-08 21:16:48 +05:30
sabaimran	27815d982c	Redirect user to the login page when either of the csrf token inputs is missing	2024-04-08 20:22:17 +05:30
sabaimran	d257629f81	Handle case when properties field isn't present in the page	2024-04-08 16:15:47 +05:30
Debanjum	9b68062fa9	Add Sponsors Section to Readme	2024-04-08 03:09:24 -07:00
sabaimran	089e0d028b	Add a more graceful error message when the rate limit is exceeded	2024-04-08 15:20:54 +05:30
Debanjum	11ce3e2268	Update Text Chunking Strategy to Improve Search Context (#645) ## Major - Parse markdown, org parent entries as single entry if fit within max tokens - Parse a file as single entry if it fits with max token limits - Add parent heading ancestry to extracted markdown entries for context - Chunk text in preference order of para, sentence, word, character ## Minor - Create wrapper function to get entries from org, md, pdf & text files - Remove unused Entry to Jsonl converter from text to entry class, tests - Dedupe code by using single func to process an org file into entries Resolves #620	2024-04-08 13:56:38 +05:30
Debanjum Singh Solanky	9239c2c2ed	Update drop large words test to ensure newlines considered word boundary Prevent regression to #620	2024-04-08 13:38:08 +05:30
Debanjum Singh Solanky	67b1178aec	Remove debug logs generated while compiling org-mode entries	2024-04-08 13:01:24 +05:30
Debanjum	4eda79cc3a	Support using Python 3.12 with Khoj (#690) ### Why - Python 3.12 is the default Python on Ubuntu 24.04 LTS, Windows and Mac via Homebrew - Python 3.12 has a bunch of improvements that can be explored with Khoj (e.g per core GIL for performance) ## Changes - The latest PyTorch now supports Python 3.12 - RapidOCR for indexing image PDFs doesn't currently support python 3.12. But it's an optional dependency, so only install it if python < 3.12 ### Testing - Verified Khoj installs fine on Windows and Mac with Python 3.12 - Verified Khoj chat works fine on Mac, Windows with Python 3.12 Resolves #522	2024-04-08 11:43:34 +05:30
sabaimran	731ad03348	Skip indexing commits that are missing properties	2024-04-07 15:19:07 +05:30
sabaimran	376eaf64cd	Check if results are present in the pages or db response in Notion	2024-04-07 15:19:07 +05:30
Debanjum Singh Solanky	8222615280	Do not add original user message to knowledge search queries for offline chat It's not required anymore. The extracted questions by the offline chat model being used should be good enough.	2024-04-07 11:29:35 +05:30
Debanjum Singh Solanky	e3deb29f8e	Upgrade khoj.el workflow to use Python 3.11	2024-04-07 11:24:07 +05:30
Debanjum Singh Solanky	14fbf594b2	Support using Python 3.12 with Khoj - RapidOCR for indexing image PDFs doesn't currently support python 3.12. It's an optional dependency anyway, so only install it if python < 3.12 - Run unit tests with python version 3.12 as well Resolves #522	2024-04-07 11:23:44 +05:30
sabaimran	86c831f7e2	Add a link to the data sources portion in the clients documentation	2024-04-07 09:32:58 +05:30
sabaimran	351fb31a34	Add webpage search to socket codepath, add a feature page for online search	2024-04-07 09:23:29 +05:30
Debanjum Singh Solanky	4be4c53222	Release Khoj version 1.9.0	2024-04-05 17:13:58 +05:30
sabaimran	54db0152b9	Add link to the khoj cloud service for connection to Notion	2024-04-05 15:41:43 +05:30
sabaimran	81f1450c1c	Update yarn.lock to sync with package.json for documentation	2024-04-05 15:36:23 +05:30
sabaimran	d22fd6dfe3	Get rid of unnecessary package-lock.json file	2024-04-05 15:34:02 +05:30
sabaimran	7d7ce92e46	Add updated information in docs about the Notion integration	2024-04-05 15:31:43 +05:30
sabaimran	2aedd3c819	Increase freq. of telemetry upload to every 5 minutes	2024-04-05 14:13:47 +05:30
sabaimran	3b1234d084	Await the calls to the db in the notion.py file	2024-04-05 13:58:14 +05:30
sabaimran	19c10b1418	Upgrade the package versions used in yarn.lock for the documentation project	2024-04-05 13:25:41 +05:30
sabaimran	00a67e9524	Add additional log lines when configuring the Notion settings for a user in the callback	2024-04-05 13:19:24 +05:30
sabaimran	d23f7da8e3	Handle the case where a previous search model isn't set when updating the model	2024-04-05 13:18:51 +05:30
sabaimran	f57f9f672d	Address Notion, Image tech debt in indexing code path (#687) * Add support for using OAuth2.0 in the Notion integration * Add notion to the admin page * Remove unnecessary content_index and image search/setup references * Trigger background job to start indexing Notion after user configures it * Add a log line when a new Notion integration is setup * Fix references to the configure_content methods	2024-04-05 12:10:03 +05:30
sabaimran	69dee75c34	Update the readme for accuracy, updated demos	2024-04-04 10:57:24 +05:30
sabaimran	a60321b68e	Push khoj to include inline references when possible	2024-04-04 10:31:13 +05:30
sabaimran	5bdcb4e69c	Wait for location data to be returned before setting up the socket connection	2024-04-04 10:31:13 +05:30
Debanjum Singh Solanky	00f599ea78	Fix passing flags to re.split to break org, md content by heading level `re.MULTILINE' should be passed to the `flags' argument, not the `maxsplit' argument of the `re.split' func This was messing up the indexing by only allowing a maximum of re.MULTILINE splits. Fixing this improves the search quality to previous state	2024-04-04 02:41:55 +05:30
Debanjum Singh Solanky	32ac0622ff	Extract dates from compiled text entries	2024-04-04 02:41:55 +05:30
Debanjum Singh Solanky	29c1c18042	Increase search distance to get relevant content for chat post indexer update More content indexed per entry would result in an overall scores lowering effect. Increase default search distance threshold to counter that - Details - Fix expected results post indexing updates - Fix search with max distance post indexing updates - Minor - Remove openai chat actor test for after: operator as it's not expected anymore	2024-04-04 02:41:55 +05:30
Debanjum Singh Solanky	ad4fa4b2f4	Fix adding file path instead of stem to markdown entries	2024-04-04 02:41:55 +05:30
sabaimran	720139c3c1	Fix all unit tests for test_text_search	2024-04-04 02:41:55 +05:30
Debanjum Singh Solanky	44b3247869	Update logical splitting of org-mode text into entries - Major - Do not split org file, entry if it fits within the max token limits - Recurse down org file entries, one heading level at a time until reach leaf node or the current parent tree fits context window - Update `process_single_org_file' func logic to do this recursion - Convert extracted org nodes with children into entries - Previously org node to entry code just had to handle leaf entries - Now it receives a list of org node trees - Only add ancestor path to root org-node of each tree - Indent each entry trees headings by +1 level from base level (=2) - Minor - Stop timing org-node parsing vs org-node to entry conversion Just time the wrapping function for org-mode entry extraction This standardizes what is being timed across md, org etc. - Move try/catch to `extract_org_nodes' from `parse_single_org_file' func to standardize this also across md, org	2024-04-04 02:41:55 +05:30
Debanjum Singh Solanky	eaa27ca841	Only add spaces after heading if any tags in orgnode raw entry repr	2024-04-04 02:41:55 +05:30
Debanjum Singh Solanky	2ea8a832a0	Log error when fail to index md file. Fix, improve typing in md_to_entries	2024-04-04 02:41:55 +05:30
Debanjum Singh Solanky	44eab74888	Dedupe code by using single func to process an org file into entries Add type hints to orgnode and org-to-entries packages	2024-04-04 02:41:55 +05:30
Debanjum Singh Solanky	db2581459f	Parse markdown parent entries as single entry if fit within max tokens These changes improve context available to the search model. Specifically this should improve entry context from short knowledge trees, that is knowledge bases with sparse, short heading/entry trees Previously we'd always split markdown files by headings, even if a parent entry was small enough to fit entirely within the max token limits of the search model. This used to reduce the context available to the search model to select appropriate entries for a query, especially from short entry trees Revert back to using regex to parse through markdown file instead of using MarkdownHeaderTextSplitter. It was easier to implement the logical split using regexes rather than bend MarkdownHeaderTextSplitter to implement it. - DFS traverse the markdown knowledge tree, prefix ancestry to each entry	2024-04-04 02:41:55 +05:30
Debanjum Singh Solanky	982ac1859c	Parse markdown file as single entry if it fits with max token limits These changes improve entry context available to the search model Specifically this should improve entry context from short knowledge trees, that is knowledge bases with small files Previously we split all markdown files by their headings, even if the file was small enough to fit entirely within the max token limits of the search model. This used to reduce the context available to select the appropriate entries for a given query for the search model, especially from short knowledge trees	2024-04-04 02:41:55 +05:30
Debanjum Singh Solanky	d8f01876e5	Add parent heading ancestry to extracted markdown entries for context Improve, update the markdown to entries extractor tests	2024-04-04 02:41:55 +05:30
Debanjum Singh Solanky	86575b2946	Chunk text in preference order of para, sentence, word, character - Previous simplistic chunking strategy of splitting text by space didn't capture notes with newlines, no spaces. For e.g in #620 - New strategy will try to chunk the text at more natural points like paragraph, sentence, word first. If none of those work it'll split at character to fit within max token limit - Drop long words while preserving original delimiters Resolves #620	2024-04-04 02:41:55 +05:30
Debanjum Singh Solanky	a627f56a64	Remove unused Entry to Jsonl converter from text to entry class, tests This was earlier used when the index was plaintext jsonl file. Now that documents are indexed in a DB this func is not required. Simplify org,md,pdf,plaintext to entries tests by removing the entry to jsonl conversion step	2024-04-04 02:41:55 +05:30
Debanjum Singh Solanky	28105ee027	Create wrapper function to get entries from org, md, pdf & text files - Convert extract_org_entries function to actually extract org entries Previously it was extracting intermediary org-node objects instead Now it extracts the org-node objects from files and converts them into entries - Create separate, new function to extract_org_nodes from files - Similarly create wrapper funcs for md, pdf, plaintext to entries - Update org, md, pdf, plaintext to entries tests to use the new simplified wrapper function to extract org entries	2024-04-04 02:41:55 +05:30
Debanjum Singh Solanky	f01a12b1d2	Improve styling of chat sessions side panel - Move green server connected dot to the bottom. Show status when disconnected from server - Move "New conversation" button to right of the "Conversation" title - Center alignment of the new conversation and connection status buttons	2024-04-04 01:43:26 +05:30
sabaimran	dd1e5e145a	Use List[Any] for typing	2024-04-03 21:46:41 +05:30
sabaimran	b8087c4c8e	Add typing to empty list variables in github_to_entries	2024-04-03 21:41:36 +05:30
sabaimran	d036fdfc26	If tree is not in the contents, then just return empty files list	2024-04-03 17:55:25 +05:30
Debanjum Singh Solanky	f915b2bd14	Fix passing model_name param to chatml formatter for online chat	2024-04-03 17:21:43 +05:30
sabaimran	6aa88761b8	Skip creating the default agent if there's no default conversation config	2024-04-03 17:21:01 +05:30
sabaimran	9c42c8be6b	Merge pull request #679 from khoj-ai/features/chat-socket-streaming Add a websocket for streaming from the chat UI	2024-04-03 04:43:31 -07:00
sabaimran	b4f71e06b3	Add timeout after 10 minutes of inactivity on socket	2024-04-02 22:12:27 +05:30
sabaimran	f48426623d	resolve merge conflict in chat.html	2024-04-02 17:29:48 +05:30
sabaimran	bf1187f465	Use new online/websearch logic and add agent to chat_metadata	2024-04-02 17:20:38 +05:30
sabaimran	867e1007d1	Remove superfluous newline	2024-04-02 17:20:08 +05:30
sabaimran	228ad68042	Merge with origin/master	2024-04-02 17:02:21 +05:30
sabaimran	776550d5ce	Add a migration for updating the default chat model, update for existing users	2024-04-02 17:01:31 +05:30
sabaimran	47fc7e1ce6	Rebase with master	2024-04-02 16:16:06 +05:30
Debanjum	215ab6e66a	Extract More Dates from entries to improve Date Filter (#683) - Overview - Extract more structured date variants (e.g with dot(.) & slash(/) separators, 2-digit year) - Extract some natural, partial dates as well from entries - Capability Add ability to extract the following additional date forms: - Natural Dates: 21st April 2000, February 29 2024 - Partial Natural Dates: March 24, Mar 2024 - Structured Dates: 20/12/24, 20.12.2024, 2024/12/20 Note: Previously only YYYY-MM-DD ISO-8601 structured date form was extracted for date filters - Performance Using regexes is MUCH faster than using the `dateparser' python library It's a little crude but gives acceptable performance for large datasets	2024-04-02 16:14:53 +05:30
Debanjum	3c3e48b18c	Migrate to Llama.cpp for Offline Chat (#680 ) ## Benefits - Support all GGUF format chat models - Support more GPUs like AMD, Nvidia, Mac, Vulcan (previously just Vulcan, Mac) - Support more capabilities like larger context window, schema enforcement, speculative decoding etc. ## Changes ### Major - Use llama.cpp for offline chat models - Support larger context window - Automatically apply appropriate chat template. So offline chat models not using llama2 format are now supported - Use better default offline chat model, NousResearch/Hermes-2-Pro-Mistral-7B - Enable extract queries actor to improve notes search with offline chat - Update documentation to use llama.cpp for offline chat in Khoj ### Minor - Migrate to use NouseResearch's Hermes-2-Pro 7B as default offline chat model in khoj.yml - Rename GPT4AllChatProcessor to OfflineChatProcessor Config, Model - Only add location to image prompt generator when location known	2024-04-02 15:49:42 +05:30
Debanjum Singh Solanky	7afee2d55c	Let offline chat model set context window. Improve, fix prompts	2024-03-31 16:19:35 +05:30
Debanjum Singh Solanky	4228965c9b	Handle msg truncation when question is larger than max prompt size Notice and truncate the question itself at this point	2024-03-31 15:50:06 +05:30
Debanjum Singh Solanky	c6487f2e48	Fix docs showing how to setup llama-cpp with Khoj	2024-03-31 15:36:40 +05:30
Debanjum Singh Solanky	886d49e3a4	Merge branch 'master' into migrate-to-llama-cpp-for-offline-chat	2024-03-31 00:59:20 +05:30
Debanjum Singh Solanky	4f65dde201	Release Khoj version 1.8.0	2024-03-31 00:06:15 +05:30
sabaimran	c0e78fd56d	Fix broken get-started documentation links	2024-03-30 15:05:12 +05:30
sabaimran	dd2a3f712b	Add more demo videos, images, add feature sections	2024-03-30 14:48:46 +05:30
sabaimran	4cb91a042e	Add an agents feature page, and clarification around custom domains	2024-03-30 14:20:46 +05:30
sabaimran	928f273bbe	Configure production setup for moving to single worker model	2024-03-30 10:35:55 +05:30
Debanjum Singh Solanky	7923903d21	Improve date filter regexes to extract structured, natural, partial dates - Much faster than using dateparser - The improved regex took 2x-4x the time to extract 1-15% more dates - Whereas dateparser took 33x to 100x the time to extract 65%-400% more dates - Improve date extractor tests to test deduping dates, natural, structured date extraction from content - Extract some natural, partial dates and more structured dates Using regex is much faster than using dateparser. It's a little crude but should pay off in performance. Supports dates of form: - (Day-of-Month) Month|AbbreviatedMonth Year|2DigitYear - Month|AbbreviatedMonth (Day-of-Month) Year|2DigitYear	2024-03-30 00:07:19 +05:30
Debanjum Singh Solanky	104eeea274	Extract natural language and locale specific dates in content Previously we just extracted dates in YYYY-MM-DD format from content for date filtering during search. Use dateparser to extract dates across locales and natural language. This should improve notes returned as context when chat searches knowledge base with date filters. Fallback to regex for date parsing from content if dateparser fails - Limit natural date extractor capabilities to improve performance - Assume language is English. Language detection otherwise takes a REALLY long time - Do not extract unix timestamps, timezone - This isn't required, as just using date and approximating dates as UTC	2024-03-30 00:06:56 +05:30
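The regex approach described in the commit above can be illustrated with a minimal sketch. This is not the Khoj implementation: the pattern names, the `extract_dates` helper and the rule that 2-digit years map to 20YY are all assumptions made for illustration, and only two of the listed date forms (day-first structured dates and "Day Month Year" natural dates) are covered.

```python
import re
from datetime import date

# Map three-letter month prefixes to month numbers (assumes English month names).
MONTHS = {name: number for number, name in enumerate(
    ["jan", "feb", "mar", "apr", "may", "jun",
     "jul", "aug", "sep", "oct", "nov", "dec"], start=1)}

# Day-first structured dates such as 20/12/24 or 20.12.2024.
# The backreference \2 requires the same separator on both sides.
DMY = re.compile(r"\b(\d{1,2})([./])(\d{1,2})\2(\d{4}|\d{2})\b")
# Natural dates such as "21st April 2000" or "21 Apr 2000".
NATURAL = re.compile(r"\b(\d{1,2})(?:st|nd|rd|th)?\s+([A-Za-z]{3,9})\s+(\d{4})\b")


def extract_dates(text):
    """Return the sorted, deduplicated dates found in text (hypothetical helper)."""
    found = set()
    for day, _sep, month, year in DMY.findall(text):
        # Assumption: 2-digit years belong to the 2000s.
        full_year = int(year) + 2000 if len(year) == 2 else int(year)
        try:
            found.add(date(full_year, int(month), int(day)))
        except ValueError:
            pass  # skip impossible dates like 40/13/2024
    for day, month_name, year in NATURAL.findall(text):
        # Crude: only the first three letters of the word are checked.
        month = MONTHS.get(month_name[:3].lower())
        if month is None:
            continue
        try:
            found.add(date(int(year), month, int(day)))
        except ValueError:
            pass
    return sorted(found)


dates = extract_dates("Met on 21st April 2000, again on 20/12/24 and 20.12.2024")
```

Collecting matches into a set dedupes dates that appear in several notations, which is the behaviour the commit says its tests check for.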
Debanjum Singh Solanky	90c5b3c410	Update stale Khoj pypi package metadata Use latest License, Intended Audience and Dev Status	2024-03-29 00:06:55 +05:30
sabaimran	1195f843a3	Remove forward slash from the root agents endpoint	2024-03-28 23:06:55 +05:30
Debanjum Singh Solanky	a374288cea	Use OIDC TrustedPublisher to publish khoj python package to PyPi	2024-03-28 22:58:36 +05:30
sabaimran	3417164ec2	Bump gunicorn workers up to 8	2024-03-28 22:34:13 +05:30
sabaimran	a1729b9b9e	Add telemetry for agents used in conversation, increase image width in agents page	2024-03-28 22:18:11 +05:30
sabaimran	d503b3e867	Use Personality vernacular in agent page - When setting up the default agent, configure every conversation that doesn't have an agent to use the Khoj agent - Fix reverse migration for the locale removal migration	2024-03-28 15:07:02 +05:30
sabaimran	e59de8c9b1	Constrain width/size of agent image in agents view	2024-03-28 13:32:11 +05:30
sabaimran	6cb38d92c0	Specify version of pypi gh publish action	2024-03-28 12:47:31 +05:30
sabaimran	56da96b2e9	Increase minimum python required in the pyproject, use python 3.11 for building the wheel in the workflow	2024-03-28 12:19:07 +05:30
sabaimran	22014cfcbc	Merge pull request #682 from khoj-ai/features/full-integration-agents Add support for custom agents configured by the server admin	2024-03-27 23:27:15 -07:00
sabaimran	17776daed8	Merge from master	2024-03-28 11:38:29 +05:30
sabaimran	32a505d841	Revert to using the nvidia base image for the next release	2024-03-28 11:37:37 +05:30
sabaimran	51d0c9b8b0	Add telemetry to keep state of new agents being used	2024-03-28 11:37:24 +05:30
sabaimran	46ebc55e2b	Add a top tab for agents	2024-03-28 11:37:01 +05:30
sabaimran	8397187231	Use default agent when creating a new conversation without agent specified	2024-03-28 11:36:27 +05:30
Debanjum Singh Solanky	8c4ef9270d	Fix using format string for logger in chat API endpoint	2024-03-27 16:31:22 +05:30
Debanjum Singh Solanky	4912c0ee30	Use extract queries actor to improve notes search with offline chat Previously we were skipping the extract questions step for offline chat as default offline chat model wasn't good enough to output proper json given the time it took to extract questions. The new default offline chat model gives json much more regularly and with date filters, so the extract questions step becomes useful given the impact on latency	2024-03-26 22:33:01 +05:30
Debanjum Singh Solanky	1ebd5c3648	Rename GPT4AllChatProcessor* to OfflineChatProcessor Config, Model	2024-03-26 22:33:01 +05:30
Debanjum Singh Solanky	2a0b943bb4	Use Hermes-2-Pro as default offline chat model in khoj.yml	2024-03-26 22:33:01 +05:30
Debanjum Singh Solanky	dcdd1edde2	Update docs to show how to setup llama-cpp with Khoj - How to pip install khoj to run offline chat on GPU After migration to llama-cpp-python more GPU types are supported but require build step so mention how - New default offline chat model - Where to get supported chat models from on HuggingFace	2024-03-26 22:33:01 +05:30
Debanjum Singh Solanky	8ca39a436c	Use llama.cpp for offline chat models - Benefits of moving to llama-cpp-python from gpt4all: - Support for all GGUF format chat models - Support for AMD, Nvidia, Mac, Vulkan GPU machines (instead of just Vulkan, Mac) - Supports models with more capabilities like tools, schema enforcement, speculative decoding, image gen etc. - Upgrade default chat model, prompt size, tokenizer for new supported chat models - Load offline chat model when present on disk without requiring internet - Load model onto GPU if not disabled and device has GPU - Load model onto CPU if loading model onto GPU fails - Create helper function to check and load model from disk, when model glob is present on disk. `Llama.from_pretrained` needs internet to get repo info from HuggingFace. This isn't required, if the model is already downloaded. Didn't find any existing HF or llama.cpp method that looked for model glob on disk without internet	2024-03-26 22:33:01 +05:30
Debanjum Singh Solanky	0a7392f6ec	Only add location to image prompt generator when location known	2024-03-26 22:33:01 +05:30
sabaimran	fdf78525b4	Part 2: Add web UI updates for basic agent interactions (#675) * Initial pass at backend changes to support agents - Add a db model for Agents, attaching them to conversations - When an agent is added to a conversation, override the system prompt to tweak the instructions - Agents can be configured with prompt modification, model specification, a profile picture, and other things - Admin-configured models will not be editable by individual users - Add unit tests to verify agent behavior. Unit tests demonstrate imperfect adherence to prompt specifications * Customize default behaviors for conversations without agents or with default agents * Add a new web client route for viewing all agents * Use agent_id for getting correct agent * Add web UI views for agents - Add a page to view all agents - Add slugs to manage agents - Add a view to view single agent - Display active agent when in chat window - Fix post-login redirect issue * Fix agent view * Spruce up the 404 page and improve the overall layout for agents pages * Create chat actor for directly reading webpages based on user message - Add prompt for the read webpages chat actor to extract, infer webpage links - Make chat actor infer or extract webpage to read directly from user message - Rename previous read_webpage function to more narrow read_webpage_at_url function * Rename agents_page -> agent_page * Fix unit test for adding the filename to the compiled markdown entry * Fix layout of agent, agents pages * Merge migrations * Let the name, slug of the default agent be Khoj, khoj * Fix chat-related unit tests * Add webpage chat command for read web pages requested by user Update auto chat command inference prompt to show example of when to use webpage chat command (i.e when url is directly provided in link) * Support webpage command in chat API - Fallback to use webpage when SERPER not setup and online command was attempted - Do not stop responding if can't retrieve online results. Try to respond without the online context * Test select webpage as data source and extract web urls chat actors * Tweak prompts to extract information from webpages, online results - Show more of the truncated messages for debugging context - Update Khoj personality prompt to encourage it to remember its capabilities * Rename extract_content online results field to webpages * Parallelize simple webpage read and extractor Similar to what is being done with search_online with olostep * Pass multiple webpages with their urls in online results context Previously even if MAX_WEBPAGES_TO_READ was > 1, only 1 extracted content would ever be passed. URL of the extracted webpage content wasn't passed to clients in online results context. This limited them from being rendered * Render webpage read in chat response references on Web, Desktop apps * Time chat actor responses & chat api request start for perf analysis * Increase the keep alive timeout in the main application for testing * Do not pipe access/error logs to separate files. Flow to stdout/stderr * [Temp] Reduce to 1 gunicorn worker * Change prod docker image to use jammy, rather than nvidia base image * Use Khoj icon when Khoj web is installed on iOS as a PWA * Make slug required for agents * Simplify calling logic and prevent agent access for unauthenticated users * Standardize to use personality over tuning in agent nomenclature * Make filtering logic more stringent for accessible agents and remove unused method: * Format chat message query --------- Co-authored-by: Debanjum Singh Solanky <debanjum@gmail.com>	2024-03-26 18:13:24 +05:30
Debanjum Singh Solanky	15ed208996	Use Khoj icon when Khoj web is installed on iOS as a PWA	2024-03-26 00:13:12 +05:30
sabaimran	f8eaff574f	Change prod docker image to use jammy, rather than nvidia base image	2024-03-25 23:09:58 +05:30
sabaimran	2b5341f53a	[Temp] Reduce to 1 gunicorn worker	2024-03-25 16:13:04 +05:30
sabaimran	991f500775	Do not pipe access/error logs to separate files. Flow to stdout/stderr	2024-03-25 16:12:39 +05:30
Debanjum	586654e2af	Allow directly reading web pages, even when SERP not enabled (#676) ### Overview Khoj can now read websites directly without needing to go through the search step first ### Details - Parallelize simple webpage read and extractor - Rename extract_content online results field to web pages - Tweak prompts to extract information from webpages, online results - Test select webpage as data source and extract web urls chat actors - Render webpage read in chat response references on Web, Desktop apps - Pass multiple webpages with their urls in online results context - Support webpage command in chat API - Add webpage chat command for read web pages requested by user - Create chat actor for directly reading webpages based on user message	2024-03-24 16:25:25 +05:30
Debanjum Singh Solanky	9e52ae9e98	Time chat actor responses & chat api request start for perf analysis	2024-03-24 15:47:38 +05:30
Debanjum Singh Solanky	dabf71bc3c	Render webpage read in chat response references on Web, Desktop apps	2024-03-24 15:47:38 +05:30
Debanjum Singh Solanky	a2e79c94be	Pass multiple webpages with their urls in online results context Previously even if MAX_WEBPAGES_TO_READ was > 1, only 1 extracted content would ever be passed. URL of the extracted webpage content wasn't passed to clients in online results context. This limited them from being rendered	2024-03-24 15:47:38 +05:30
Debanjum Singh Solanky	71b6905008	Parallelize simple webpage read and extractor Similar to what is being done with search_online with olostep	2024-03-24 15:46:29 +05:30
Debanjum Singh Solanky	1167f6ddf9	Rename extract_content online results field to webpages	2024-03-24 15:46:29 +05:30
Debanjum Singh Solanky	b22a7dae5d	Tweak prompts to extract information from webpages, online results - Show more of the truncated messages for debugging context - Update Khoj personality prompt to encourage it to remember its capabilities	2024-03-24 15:46:29 +05:30
Debanjum Singh Solanky	85c62efca1	Test select webpage as data source and extract web urls chat actors	2024-03-24 15:46:29 +05:30
Debanjum Singh Solanky	ad6f6bb0ed	Support webpage command in chat API - Fallback to use webpage when SERPER not setup and online command was attempted - Do not stop responding if can't retrieve online results. Try to respond without the online context	2024-03-24 15:46:29 +05:30
Debanjum Singh Solanky	a6b7432837	Add webpage chat command for read web pages requested by user Update auto chat command inference prompt to show example of when to use webpage chat command (i.e when url is directly provided in link)	2024-03-24 15:46:29 +05:30
sabaimran	8abc8ded82	Part 1: Server-side changes to support agents integrated with Conversations (#671) * Initial pass at backend changes to support agents - Add a db model for Agents, attaching them to conversations - When an agent is added to a conversation, override the system prompt to tweak the instructions - Agents can be configured with prompt modification, model specification, a profile picture, and other things - Admin-configured models will not be editable by individual users - Add unit tests to verify agent behavior. Unit tests demonstrate imperfect adherence to prompt specifications * Customize default behaviors for conversations without agents or with default agents * Use agent_id for getting correct agent * Merge migrations * Simplify some variable definitions, add additional security checks for agents * Rename agent.tuning -> agent.personality	2024-03-23 22:09:38 +05:30
sabaimran	4deb849fb1	Merge branch 'features/add-agents-ui' of github.com:khoj-ai/khoj into features/chat-socket-streaming	2024-03-23 14:04:25 +05:30
sabaimran	8edbd7094f	Let the name, slug of the default agent be Khoj, khoj	2024-03-23 14:03:58 +05:30
sabaimran	6b4c4f10b5	Merge branch 'features/add-agents-ui' of github.com:khoj-ai/khoj into features/chat-socket-streaming	2024-03-23 11:22:00 +05:30
sabaimran	20617614ae	Merge branch 'features/customize-chat-with-agents' of github.com:khoj-ai/khoj into features/add-agents-ui	2024-03-23 11:20:57 +05:30
sabaimran	2399d91f61	Merge migrations	2024-03-22 10:05:33 +05:30
sabaimran	d38089ab57	Merge with origin	2024-03-22 09:55:33 +05:30
Debanjum Singh Solanky	7416ca9ae1	Lower the default gunicorn workers running on prod	2024-03-21 04:35:52 +05:30
Debanjum Singh Solanky	aed4313cfc	Fix updating specific conversation by id from the chat API endpoint - Use the conversation id of the retrieved conversation rather than the potentially unset conversation id passed via API - await creating new chat when no chat id provided and no existing conversations exist	2024-03-21 02:46:52 +05:30
Debanjum Singh Solanky	ec6dc0daaf	Bump up the default gunicorn workers running on prod	2024-03-20 22:56:09 +05:30
sabaimran	6ba0d8e379	Add a connected notification if the websocket is connected	2024-03-20 20:53:28 +05:30
sabaimran	255b69dc58	Add a comma delimiter between outputted search queries	2024-03-20 19:43:35 +05:30
sabaimran	d84188b221	Scroll down when a message is added in the chat interface's handle stream response method	2024-03-20 15:04:41 +05:30
sabaimran	70ad78990a	Use a common method for sending a generic message to the client from the server in the ws connection	2024-03-20 15:04:14 +05:30
sabaimran	d4e83b060a	Update the web UI for the chat interface to establish a connection via a socket to the server - Move some common methods into separate functions to make the UI components more efficient - The normal HTTP-based chat connection will still work and serves as a fallback if the websocket is unavailable	2024-03-20 14:34:47 +05:30
sabaimran	a346f79b39	Add support for chatting via the web socket connection - Convert to a model of calling the search API directly with a function call (rather than using the API method) - Gracefully handle websocket connection disconnects - Ensure that the rest of the response is still saved, as it is currently, if the user disconnects from the client - Setup unchangeable context at the beginning of the session when the connection is established (like location, username, etc)	2024-03-20 14:33:33 +05:30
sabaimran	36af9776e6	Add the websockets dependency to pyproject.toml	2024-03-20 14:11:18 +05:30
Debanjum Singh Solanky	62a83dc9bb	Fix online search actor to use natural dates not after: operator The after: operator recently added to the online search actor was too restrictive and gave worse results than just using natural language dates in the search query	2024-03-15 21:50:14 +05:30
Debanjum Singh Solanky	4a1e6a2275	Convert deleted old user requests log line to debug from info	2024-03-15 20:50:10 +05:30
Debanjum Singh Solanky	9a068dadbf	Fix extract questions prompt to use YYYY-MM-DD date filter format	2024-03-15 18:43:18 +05:30
Debanjum	bb2693c792	Improve Chat Session UX, Fix Login, Chat Message Truncation (#677) ### Improve - Improve delete, rename chat session UX in Desktop, Web app - Get conversation by title when requested via chat API ### Fix - Allow unset locale for Google authenticating user - Handle truncation when single long non-system chat message - Fix setting chat session title from Desktop app - Only create new chat on get if a specific chat id, slug isn't requested	2024-03-15 18:19:36 +05:30
Debanjum Singh Solanky	ecddf98430	Handle truncation when single long non-system chat message Previously we assumed the system prompt was always passed as the first message, so at least 2 messages were expected in the logs. This broke chat actors querying with a single long non-system message. A more robust way to extract the system prompt is via the message role instead	2024-03-15 15:58:39 +05:30
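The role-based extraction described in the commit above can be sketched as follows. This is an illustrative sketch only: the `split_system_prompt` name and the plain-dict message shape are assumptions, not the types Khoj uses.

```python
def split_system_prompt(messages):
    """Separate the system prompt from the other messages by role, not position.

    `messages` is assumed to be a list of {"role": ..., "content": ...} dicts.
    Returns (system_message_or_None, other_messages).
    """
    system_messages = [m for m in messages if m["role"] == "system"]
    other_messages = [m for m in messages if m["role"] != "system"]
    system_message = system_messages[0] if system_messages else None
    return system_message, other_messages


# A single long user message with no system prompt no longer breaks extraction.
system, rest = split_system_prompt([{"role": "user", "content": "a very long question"}])
```

Looking the prompt up by role means truncation logic can run on `rest` whether or not a system message was supplied, and regardless of where it sits in the list.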
Debanjum Singh Solanky	ec0c35b7ed	Improve delete, rename chat session UX in Desktop, Web app - Ask for Confirmation before deleting chat session in Desktop, Web app - Save chat session rename on hitting enter in title edit input box - No need to flash previous conversation cleared status message - Move chat session delete button after rename button in Desktop app	2024-03-15 15:58:19 +05:30
Debanjum Singh Solanky	924b1215ce	Allow unset locale for Google authenticated user	2024-03-15 15:35:20 +05:30
Debanjum Singh Solanky	c792fa819f	Fix setting chat session title from Desktop app Pass auth headers to not have the chat session title update request fail	2024-03-15 15:19:20 +05:30
Debanjum Singh Solanky	c9e05dc184	Get conversation by title when requested via chat API	2024-03-15 12:31:50 +05:30
sabaimran	724557fc7b	Merge branch 'master' of github.com:khoj-ai/khoj into features/add-agents-ui	2024-03-15 12:14:34 +05:30
sabaimran	7fc484ba7a	Merge branch 'master' of github.com:khoj-ai/khoj into features/customize-chat-with-agents	2024-03-15 12:13:28 +05:30
Debanjum Singh Solanky	cac26dafe3	Only create new chat on get if a specific chat id, slug isn't requested	2024-03-15 11:58:39 +05:30
sabaimran	416feb13ef	Fix layout of agent, agents pages	2024-03-15 11:17:40 +05:30
sabaimran	1b3fc68a87	Fix unit test for adding the filename to the compiled markdown entry	2024-03-15 11:01:48 +05:30
sabaimran	d734be61cf	Rename agents_page -> agent_page	2024-03-15 10:17:51 +05:30
Debanjum Singh Solanky	8cdfaf41ec	Update project URLs to show on pypi project page	2024-03-15 04:03:39 +05:30
Debanjum Singh Solanky	08993ff109	Add new, remove old known chat models from model to prompt size map	2024-03-15 04:02:25 +05:30
Debanjum Singh Solanky	fba0338787	Release Khoj version 1.7.0	2024-03-15 00:08:32 +05:30
sabaimran	345afec47e	Resolve merge conflicts/ use agent_slug instead of agent_id for lookup	2024-03-14 16:16:07 +05:30
Debanjum Singh Solanky	6118d1ff57	Create chat actor for directly reading webpages based on user message - Add prompt for the read webpages chat actor to extract, infer webpage links - Make chat actor infer or extract webpage to read directly from user message - Rename previous read_webpage function to more narrow read_webpage_at_url function	2024-03-14 14:58:37 +05:30
Debanjum	e549824fe2	Improve OpenAI Chat Actors and their prompts (#673) ### Major - Enforce json mode response from OpenAI chat actors prev using string lists - Use `gpt-4-turbo-preview` as default chat model, extract questions actor - Make Khoj read khoj website to respond with accurate, up-to-date information about itself - Dedupe query in notes prompt. Improve OAI chat actor, director tests ### Minor - Test data source, output mode selector, web search query chat actors - Improve notes search actor to always create a non-empty list of queries - Construct available data sources, output modes as a bullet list in prompts - Use consistent agent name across static and dynamic examples in prompts - Add actor's name to extract questions prompt to improve context for guidance	2024-03-14 12:44:40 +05:30
sabaimran	3caf0a79d8	Spruce up the 404 page and improve the overall layout for agents pages	2024-03-14 11:26:49 +05:30
sabaimran	c45030af44	Fix agent view	2024-03-14 11:13:19 +05:30
Debanjum Singh Solanky	a1ce12296f	Fix rendering online with note references post streaming chat response Previously only the notes references would get rendered post response streaming when both online and notes references were used to respond to the user's message	2024-03-14 03:40:40 +05:30
Debanjum Singh Solanky	1aeea3d854	Fix opening external links from confirmation dialog box on desktop app	2024-03-14 02:29:22 +05:30
Debanjum Singh Solanky	2e5cc49cb3	Enforce json response from OpenAI chat actors prev using string lists - Allow passing response format type to OpenAI API via chat actors - Convert in-context examples to use json objects instead of str lists - Update actors outputting str list to request output to be json_object - OpenAI's json mode enforces the model to output valid json object	2024-03-14 01:22:33 +05:30
Debanjum Singh Solanky	7211eb9cf5	Default to gpt-4-turbo-preview for chat model, extract questions actor GPT-4 is more expensive and generally less capable than gpt-4-turbo-preview	2024-03-14 01:22:33 +05:30
Debanjum Singh Solanky	dd883dc53a	Dedupe query in notes prompt. Improve OAI chat actor, director tests - Remove stale tests - Improve tests to pass across gpt-3.5 and gpt-4-turbo - The haiku creation director was failing because of duplicate query in instantiated prompt	2024-03-14 01:22:33 +05:30
Debanjum Singh Solanky	70b04d16c0	Test data source, output mode selector, web search query chat actors	2024-03-14 01:22:33 +05:30
Debanjum Singh Solanky	14682d5354	Improve notes search actor to always create a non-empty list of queries - Remove the option for Notes search query generation actor to return no queries. Whether search should be performed is decided before, this step doesn't need to decide that - But do not throw warning if the response is a list with no elements	2024-03-14 01:22:33 +05:30
Debanjum Singh Solanky	f5734826cb	Improve pick data source prompt to look online for info about Khoj - Add examples where user queries requesting information about Khoj results in the "online" data source being selected - Add an example for "general" to select chat command prompt	2024-03-14 01:21:13 +05:30
Debanjum Singh Solanky	9a516bed47	Construct available data sources, output modes as a bullet list in prompts	2024-03-14 00:34:57 +05:30
Debanjum Singh Solanky	f28fb89af8	Use consistent agent name across static and dynamic examples in prompts Previously the examples constructed from chat history used "Khoj" as the agent's name but all 3 prompts using the func used static examples with "AI:" as the pertinent agent's name	2024-03-14 00:34:57 +05:30
Debanjum Singh Solanky	f5793149a9	Add actor's name to extract questions prompt to improve context for guidance	2024-03-14 00:34:57 +05:30
Debanjum Singh Solanky	73ad444086	Make online search Actor read khoj.dev for docs, info about Khoj - Add example to read khoj.dev website for up-to-date info to setup, use khoj, discover khoj features etc. - Online search should use site: and after: google search operators - Show example of adding the after: date filter to google search - Give local event lookup example using user's current location in query - Remove unused select search content type prompt	2024-03-14 00:34:57 +05:30
sabaimran	290712c3fe	Add web UI views for agents - Add a page to view all agents - Add slugs to manage agents - Add a view to view single agent - Display active agent when in chat window - Fix post-login redirect issue	2024-03-14 00:07:36 +05:30
Debanjum	3abe7ccb26	Improve Online Search Speed and Context (#670) ### Major - Read web pages in parallel to improve chat response time - Read web pages directly when Olostep proxy not setup - Include search results & web page content in online context for chat response ### Minor - Simplify, modularize and add type hints to online search functions	2024-03-11 22:16:30 +05:30
Debanjum Singh Solanky	dc86e44a07	Include search results & webpage content in online context for chat response Previously if a web page was read for a sub-query, only the extracted web page content was provided as context for the given sub-query. But the google results themselves have relevant snippets. So include them	2024-03-11 18:41:02 +05:30
Debanjum Singh Solanky	d136a6be44	Simplify, modularize and add type hints to online search functions - Simplify content arg to `extract_relevant_info` function. Validate, clean the content arg inside the `extract_relevant_info` function - Extract `search_with_google` function outside the parent function - Call the parent function a more appropriate `search_online` instead of `search_with_google` - Simplify the `search_with_google` function using list comprehension. Drop empty search result fields from chat model context for response to reduce cost and response latency - No need to show stacktrace when unable to read webpage, basic error is enough - Add type hints to online search functions to catch issues with mypy	2024-03-11 18:41:02 +05:30
Debanjum Singh Solanky	88f096977b	Read webpages directly when Olostep proxy not setup This is useful for self-hosted, individual user, low traffic setups where a proxy service is not required	2024-03-11 18:41:02 +05:30
Debanjum Singh Solanky	ca2f962e95	Read, extract information from web pages in parallel to lower response time - Time reading webpage, extract info from webpage steps for perf analysis - Deduplicate webpages to read gathered across separate google searches - Use aiohttp to make API requests non-blocking, pair with asyncio to parallelize all the online search webpage read and extract calls	2024-03-11 18:41:02 +05:30
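A minimal sketch of the dedupe-then-parallelize pattern this commit describes, using only `asyncio`. The `read_webpage` coroutine is a hypothetical stand-in that sleeps instead of making a network request (the commit uses aiohttp for the real requests), and `read_webpages` is likewise an illustrative name.

```python
import asyncio


async def read_webpage(url):
    """Hypothetical stand-in for a non-blocking webpage read; no network is used."""
    await asyncio.sleep(0.01)  # simulate I/O latency
    return f"content of {url}"


async def read_webpages(urls):
    """Read each distinct URL once, running all reads concurrently."""
    # dict.fromkeys dedupes while keeping first-seen order.
    unique_urls = list(dict.fromkeys(urls))
    contents = await asyncio.gather(*(read_webpage(url) for url in unique_urls))
    return dict(zip(unique_urls, contents))


# URLs gathered across separate searches may repeat.
pages = asyncio.run(read_webpages([
    "https://example.com/a",
    "https://example.com/b",
    "https://example.com/a",
]))
```

Because the reads overlap, total wall time is roughly that of the slowest read, not the sum of all reads.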
sabaimran	8e1445b15b	Use agent_id for getting correct agent	2024-03-11 14:44:46 +05:30
sabaimran	6ab649312f	Add a new web client route for viewing all agents	2024-03-11 14:40:40 +05:30
sabaimran	352168d6c2	Customize default behaviors for conversations without agents or with default agents	2024-03-11 14:20:28 +05:30
sabaimran	9b88976f36	Initial pass at backend changes to support agents - Add a db model for Agents, attaching them to conversations - When an agent is added to a conversation, override the system prompt to tweak the instructions - Agents can be configured with prompt modification, model specification, a profile picture, and other things - Admin-configured models will not be editable by individual users - Add unit tests to verify agent behavior. Unit tests demonstrate imperfect adherence to prompt specifications	2024-03-11 12:45:24 +05:30
sabaimran	1da453306e	Add num online for Discord badge	2024-03-10 17:48:30 +05:30
Debanjum	18fa3e2384	Rerank Search Results by Default on GPU machines (#668) - Trigger SentenceTransformer Cross Encoder models now run fast on GPU enabled machines, including Mac ARM devices since UKPLab/sentence-transformers#2463 - Details - Use cross-encoder to rerank search results by default on GPU machines and when using an inference server - Only call search API when pause in typing search query on web, desktop apps	2024-03-10 15:15:25 +05:30
Debanjum Singh Solanky	53d402480c	Rerank search results with cross-encoder when using an inference server If an inference server is being used, we can expect the cross encoder to be running fast enough to rerank search results by default	2024-03-10 15:09:46 +05:30
Debanjum Singh Solanky	44c8d09342	Only call search API when pause in typing search query on web, desktop apps Wait for 300ms since stop typing before calling search API. This smooths out UI jitter when rendering search results, especially now that we're reranking for every search query on GPU enabled devices Emacs already has 300ms debounce time. More convoluted to add debounce time to Obsidian search modal, so not updating that yet	2024-03-10 14:29:24 +05:30
Debanjum Singh Solanky	1105d8814f	Use cross-encoder to rerank search results by default on GPU machines Latest sentence-transformer package uses GPU for cross-encoder. This makes it fast enough to enable reranking on machines with GPU. Enabling search reranking by default allows (at least) users with GPUs to side-step learning the UI affordance to rerank results (i.e hitting Cmd/Ctrl-Enter or ENTER).	2024-03-10 14:29:21 +05:30
Debanjum	8eb3c441ec	Do not create new chat session when an old chat session is deleted (#669) ### Issue Previously deleting a chat session from the side panel on desktop, web app would sometimes result in also creating a new chat session ### Fix `get_conversation_by_user` shouldn't return new conversation if conversation with requested id not found. It should only return new conversation if no specific conversation is requested and no conversations found for user at all ### Miscellaneous Improvements - Chat history load should be logged as call to that chat_history api, not the "chat" api - Show status updates of clearing conversation history in chat input - Simplify web, desktop client code by removing unnecessary new variables ### Repro - Delete a new chat, this calls loadChat via window.onload which calls server /chat/history API endpoint with conversationId set to that of just deleted conversation sporadically The call to GET chat/history API with conversationId set occurs when window.onload triggers before the conversationId is deleted by the delete button after the DELETE /chat/history API call (via race) - In such a scenario, get_conversation_by_user called by chat/history API with conversationId of deleted conversation returns a new conversation	2024-03-10 14:14:43 +05:30
Debanjum Singh Solanky	fd81446ba3	Do not create new chat session when an old chat session is deleted - Fix `get_conversation_by_user` shouldn't return new conversation if conversation with requested id not found. It should only return new conversation if no specific conversation is requested and no conversations found for user at all - Repro - Delete a new chat, this calls loadChat via window.onload which calls server /chat/history API endpoint with conversationId set to that of just deleted conversation sporadically The call to GET chat/history API with conversationId set occurs when window.onload triggers before the conversationId is deleted by the delete button after the DELETE /chat/history API call (via race) - In such a scenario, get_conversation_by_user called by chat/history API with conversationId of deleted conversation returns a new conversation - Miscellaneous - Chat history load should be logged as call to that chat_history api, not the "chat" api - Show status updates of clearing conversation history in chat input - Simplify web, desktop client code by removing unnecessary new variables	2024-03-10 02:17:23 +05:30
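The lookup rule stated under "Fix" above can be sketched with an in-memory store. This is not the Khoj function: `get_conversation`, the dict-based store and the choice to return the highest-id conversation when no id is requested are assumptions for illustration; only the "never create on a missed id" rule comes from the commit message.

```python
def get_conversation(conversations, requested_id=None):
    """Look up a conversation in one user's store (a dict of id -> conversation).

    Hypothetical sketch of the rule in the commit message:
    - a specific id was requested: return it, or None if it no longer exists;
    - no id requested: return an existing conversation, creating one only
      when the user has no conversations at all.
    """
    if requested_id is not None:
        # Never create a new conversation here, even if the id was just deleted.
        return conversations.get(requested_id)
    if conversations:
        # Assumption: the highest id is the most recent conversation.
        return conversations[max(conversations)]
    new_id = 1
    conversations[new_id] = {"id": new_id, "messages": []}
    return conversations[new_id]


store = {1: {"id": 1, "messages": []}, 2: {"id": 2, "messages": []}}
del store[2]  # user deletes chat 2
stale = get_conversation(store, requested_id=2)  # racing history load for the deleted chat
```

With this rule, the stale history request from the race described in the repro gets nothing back instead of silently creating a new chat session.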
Debanjum Singh Solanky	b7fad04870	Use consistent field name for queries in chat history & better image prompt	2024-03-09 19:11:03 +05:30
sabaimran	086d5f8324	Add link to drag drop pdf demo video	2024-03-09 17:02:23 +05:30
sabaimran	6aae9864d3	Fix Notion indexing and add an admin view for Entry objects	2024-03-09 16:25:23 +05:30
sabaimran	b3b6278af2	Update documentation to show how you can upload files	2024-03-09 15:58:13 +05:30
sabaimran	12d6c4da7d	Only include inferred queries in the conversation history for images, not links. Overflow the side panel when too long	2024-03-09 11:59:35 +05:30
Debanjum Singh Solanky	42d4bc6b14	Document installing Khoj on Phone as a Progressive Web App (PWA)	2024-03-08 21:18:06 +05:30
sabaimran	e5cd0237e3	Release Khoj version 1.6.2	2024-03-08 17:04:03 +05:30
Debanjum Singh Solanky	446ac7649d	Remove unused js method in web chat client, add newline to web data in prompt	2024-03-08 16:40:39 +05:30
Debanjum Singh Solanky	12d32ac99c	Increase user visibility into more errors during image generation Catch OpenAI connection error and errors during better image prompt generation	2024-03-08 16:40:39 +05:30
sabaimran	ff31759423	Fix target determination in the copy programmatic output button	2024-03-08 16:33:12 +05:30
sabaimran	9f934929c6	Infer mime type from file ending when not available in browser. Don't output image in conversation turns	2024-03-08 12:34:26 +05:30
sabaimran	81beb7940c	Upload generated images to s3, if AWS credentials and bucket is available (#667 ) * Upload generated images to s3, if AWS credentials and bucket is available. - In clients, render the images via the URL if it's returned with a text-to-image2 intent type * Make the loading screen more intuitive, less jerky and update the programmatic copy button * Update the loading icon when waiting for a chat response	2024-03-08 10:54:13 +05:30
sabaimran	13894e1fd5	add instructions for drag/drop files in sys prompt	2024-03-07 17:57:42 +05:30
sabaimran	7357b6eff1	Revert white-space preline and add more detailed help text when selecting file	2024-03-06 16:47:27 +05:30
sabaimran	b615c0719e	Support upload for files via drag/drop in the web UI (#666 ) * Add additional styling changes for showing UI changes when dragging file to the main screen * Add a loading spinner when file upload is in progress, and don't index github/notion when indexing files * Add an explicit icon for file uploading in the chat button menu * Add appropriate dragover styling when picking a file from the file picker/browser * Add a loading screen when retrieving chat history. Fix width of the chat window. Put attachment icon to the left of chat input	2024-03-06 16:43:05 +05:30
sabaimran	e323a6d69b	Include additional user context in the image generation flow (#660 ) * Make major improvements to the image generation flow - Include user context from online references and personal notes for generating images - Dynamically select the modality that the LLM should respond with - Return the inferred context in the query response for the desktop, web chat views to read * Add unit tests for retrieving response modes via LLM * Move output mode unit tests to the actor suite, rather than director * Only show the references button if there is at least one available * Rename aget_relevant_modes to aget_relevant_output_modes * Use a shared method for generating reference sections, simplify some of the prompting logic * Make out of space errors in the desktop client more obvious	2024-03-06 13:48:41 +05:30
sabaimran	3cbc5b0d52	Add links to blog in docs	2024-03-02 17:37:18 +05:30
sabaimran	880368635e	Set default value of KHOJ_DEBUG to False in the docker-compose file	2024-03-01 21:51:13 +05:30
Debanjum Singh Solanky	2d61591c22	Improve user visibility into errors during image generation	2024-02-29 13:19:13 +05:30
sabaimran	0bbb5cff85	Release Khoj version 1.6.1	2024-02-26 13:27:20 -08:00
sabaimran	c8194a7364	Make out of space errors in the desktop client more obvious	2024-02-26 11:53:36 -08:00
Debanjum Singh Solanky	956dd71d91	Clean entry before adding to DB and log when it fails Remove \0 null characters from entry fields as this is causing indexing errors	2024-02-27 01:19:34 +05:30
Debanjum Singh Solanky	bb613a8e1d	Make indentation styling more compact on Obsidian client	2024-02-25 14:41:45 +05:30
Debanjum Singh Solanky	682b70011f	Set chat body height to remove UX jitter on chat history load in Web, Desktop	2024-02-25 14:40:47 +05:30
Debanjum Singh Solanky	efe86ce159	Fix saved conversation logger to handle image responses	2024-02-25 13:46:32 +05:30
Debanjum Singh Solanky	4839f2901a	Open external links in Desktop app with default app for url on OS - Open external links using the default link handler registered on OS for the link type, e.g http:// -> firefox, mailto: thunderbird etc - Confirm before opening non-http URL using an external app	2024-02-25 13:21:52 +05:30
Debanjum	170bce2c02	Fix, Improve rendering images in Obsidian, Desktop, Web clients (#659 ) - Improve render of inferred query in image chat messages in Web, Desktop apps - Add inferred queries to image chat responses in Obsidian client - Fix rendering images from Khoj response in Obsidian client	2024-02-25 00:56:26 +05:30
Debanjum Singh Solanky	f84606325c	Improve render of inferred query in image chat messages in Web, Desktop apps	2024-02-25 00:47:06 +05:30
Debanjum Singh Solanky	a2e53d5e41	Add inferred queries to image chat responses in Obsidian client	2024-02-25 00:24:58 +05:30
Debanjum Singh Solanky	9b61f0b5f7	Fix rendering images from Khoj response in Obsidian client	2024-02-25 00:11:11 +05:30
sabaimran	b9d0533d92	Misc. fixes to prompting, admin, and others (#658 ) * Simplify and clarify prompt for selecting toolset dynamically * Add error handling around call to OLOSTEP api * Fix conversation admin page * Skip adding none or empty entries in the chunking method	2024-02-24 10:25:42 -08:00
Debanjum Singh Solanky	0e0e751ef7	Improve docstring of entrypoint function to the emacs client	2024-02-24 21:09:41 +05:30
Debanjum	8855529637	Improve Syncing Obsidian Vault, Invalidate Static Assets in Browser Cache in Web Client (#657 ) - Improve - Only send files modified since their last sync for indexing on server from the Obsidian client - Fix - Invalidate static asset browser cache in Web client when Khoj version changes	2024-02-24 20:20:30 +05:30
Debanjum Singh Solanky	a46f70c4b0	Remove deprecated lastSyncedFiles settings field from Obsidian client	2024-02-24 20:18:22 +05:30
Debanjum Singh Solanky	03a6b491b2	Warn when can't identify mimeType of files in Desktop, Obsidian clients	2024-02-24 19:59:03 +05:30
Debanjum Singh Solanky	3675ab4864	Only sync modified files from the Obsidian client Previously we'd send all files in vault and let the server deduplicate. This change takes inspiration from the desktop app, and only pushes files which were modified after their previous sync with the server. This should reduce the processing load on the server	2024-02-24 07:48:40 +05:30
Debanjum Singh Solanky	ddfbf31bc8	Append version query param to web asset URLs to bypass browser cache Ensure latest assets are loaded when khoj version is updated	2024-02-24 06:49:25 +05:30
sabaimran	42773e808c	Retrieve, create, and save conversations differently for ClientApplications (#656 ) * Retrieve, create, and save conversations differently if they're coming from a client application - Not all of our client apps will necessarily maintain state over the conversation IDs available to a user. For some (single-threaded conversations), it should just use a single conversation. Fix the code to do so * Simplify conversation retrieval logic * Keep 0 padding below chat response * Add order_by sorting to retrieving the conversation without id	2024-02-23 11:32:00 -08:00
Debanjum	9afb2a14ef	Fix and Improve Chat UI in Web, Desktop apps (#655 ) ### Improvements to Chat UI on Web, Desktop apps - Improve styling of chat session side panel - Improve styling of chat message bubble in Desktop, Web app - Add frosted, minimal chat UI to background of Login screen - Improve PWA install experience of Khoj ### Fixes to Chat UI on Web, Desktop apps - Fix creating new chat sessions from the Desktop app - Only show 3 starter questions even when consecutive chat sessions created ### Other Improvements - Update Khoj cloud trial period to a fortnight instead of a week - Document using venv to handle dependency conflict on khoj pip install Resolves #276	2024-02-23 19:27:02 +05:30
Debanjum Singh Solanky	c70ca78cdc	Improve PWA install experience for Khoj on Desktop, Mobile - Resolve PWA issues thrown by Chrome/Edge - Add screenshot samples showcasing remember, browse and draw features - This can provide a richer app store like experience when installing Khoj PWA on Mobile or Desktop - Add wide and narrow screenshots to show Mobile vs Desktop UX - Add higher resolution favicon for PWA - Use single web manifest instead of separate ones for Chat, Search - Update manifest description with more details about Khoj features	2024-02-23 18:59:52 +05:30
Debanjum Singh Solanky	e10b260988	Update web login screen to show frosted minimal chat UI in background	2024-02-23 18:59:52 +05:30
Debanjum Singh Solanky	1b0318564e	Log when conversation turn is saved to DB	2024-02-23 18:59:52 +05:30
Debanjum Singh Solanky	4c39960917	Make number of conversation starters to get from DB configurable	2024-02-23 18:59:52 +05:30
Debanjum Singh Solanky	50617594fd	Only show 3 starter questions even when consecutive chat sessions created Reset starter question suggestions before appending in web, desktop app Otherwise previously it'd keep adding to existing starter question suggestions on each new session creation if multiple consecutive new chat sessions created. This would result in more than the 3 expected starter questions being displayed at a time	2024-02-23 18:59:52 +05:30
Debanjum Singh Solanky	102f5c3f53	Improve styling of chat session side panel - Make collapse, expand toggle arrow point in the direction the action will expand the side panel in - Make the collapsed side panel reduce to a 1px sliver	2024-02-23 18:59:52 +05:30
Debanjum Singh Solanky	6283d9fe83	Update Khoj cloud trial period to a fortnight instead of a week - Improve rate limit error message wording - Make the "too many requests" error message more robust. Should throw that exception if self.request >= self.subscribed_requests because upgrading wouldn't fix this rate limiting	2024-02-23 18:33:56 +05:30
Debanjum Singh Solanky	05c1903784	Fix creating new chat sessions from the Desktop app Code wasn't passing the authorization header in the POST request to create new chat session	2024-02-23 18:33:56 +05:30
Debanjum Singh Solanky	8a219b6e9c	Improve styling of chat message bubble in Desktop, Web app - Respect newline with pre-line but not for bullets to improve formatting of responses by Khoj - Respect bold font by loading tajawal font with other weights - Reduce bottom margin in chat message bubble, it's taking too much space	2024-02-23 18:33:56 +05:30
sabaimran	b4902090e7	Misc. chat and application improvements (#652 ) * Document original query when subqueries can't be generated * Only add messages to the chat message log if it's non-empty * When changing the search model, alert the user that all underlying data will be deleted * Adding more clarification to the prompt input for username, location * Check if has_more is in the notion results before getting next_cursor * Update prompt template for user name/location, update confirmation message when changing search model	2024-02-22 19:09:22 -08:00
Debanjum Singh Solanky	57dce91c91	Document using venv to handle dependency conflict on khoj pip install Resolves #276	2024-02-23 02:07:08 +05:30
Debanjum Singh Solanky	7271164256	Set chat session title to textContent of the chat session HTML element We don't expect/want the user to use HTML titles for chat session	2024-02-23 02:07:08 +05:30
sabaimran	f8ec6b4464	Remove backslash for default route in api_chat	2024-02-20 20:09:44 -08:00
sabaimran	699545366b	Set gunicorn config to use 4 workers	2024-02-20 15:06:20 -08:00
sabaimran	b1c86fee3b	Release Khoj version 1.6.0	2024-02-20 14:12:24 -08:00
sabaimran	45c5a2598d	Temp - change gunicorn config to use a single worker	2024-02-20 13:56:51 -08:00
sabaimran	44f8f20ea7	Miscellaneous bugs and fixes for chat sessions (#646 ) * Display given_name field only if it is not None * Add default slugs in the migration script * Ensure that updated_at is saved appropriately, make sure most recent chat is returned for default history * Remove the bin button from the chat interface, given deletion is handled in the drop-down menus * Refresh the side panel when a new chat is created * Improve tool retrieval prompt, don't let /online fail, and improve parsing of extract questions * Fix ending chat response by offline chat on hitting a stop phrase Previously the whole phrase wouldn't be in the same response chunk, so chat response wouldn't stop on hitting a stop phrase Now use a queue to keep track of last 3 chunks, and to stop responding when hit a stop phrase * Make chat on Obsidian backward compatible post chat session API updates - Make chat on Obsidian get chat history from `responseJson.response.chat' when available (i.e when using new api) - Else fallback to loading chat history from responseJson.response (i.e when using old api) * Fix detecting success of indexing update in khoj.el When khoj.el attempts to index on a Khoj server served behind an https endpoint, the success response status contains plist with certs. This doesn't mean the update failed. Look for :errors key in status instead to determine if indexing API call failed. This fixes detecting indexing API call success on the Khoj Emacs client, even for Khoj servers running behind SSL/HTTPS * Fix the mechanism for populating notes references in the conversation primer for both offline and online chat * Return conversation.default when empty list for dynamic prompt selection, send all cmds in telemetry * Fix making chat on Obsidian backward compatible post chat session API updates New API always has conversation_id set, not `chat' which can be unset when chat session is empty. So use conversation_id to decide whether to get chat logs from `responseJson.response.chat' or `responseJson.response' instead --------- Co-authored-by: Debanjum Singh Solanky <debanjum@gmail.com>	2024-02-20 13:55:35 -08:00
sabaimran	138f5223bd	Fix process for generating embeddings for Notion entries (#648 ) * Fix process for generating embeddings for Notion entries * If no title field found, just log a warning and set the title to	2024-02-20 13:46:56 -08:00
Debanjum	43013c4fd4	Make Production Dependencies for Khoj Cloud Optional to Install (#647 ) - Remove unused git dependency from Docker images - Move python packages used for test into dev dependency group - Only enable API token, Whatsapp cards on Web UI when Stripe, Twilio setup - Move production dependencies to prod python packages group - Fix docs links in Khoj welcome chat message	2024-02-16 17:42:23 +05:30
Debanjum Singh Solanky	4696577636	Upgrade python dependencies	2024-02-16 17:41:09 +05:30
Debanjum Singh Solanky	4007c871ae	Remove unused git dependency from Docker images	2024-02-16 17:41:09 +05:30
Debanjum Singh Solanky	e21a8530f3	Move used python packages for test into dev dependency group The test dependency group was being used independently	2024-02-16 17:41:09 +05:30
Debanjum Singh Solanky	4722da9642	Only enable API token, Whatsapp cards on Web UI when Stripe, Twilio setup	2024-02-16 17:41:09 +05:30
Debanjum Singh Solanky	cf4a524988	Move production dependencies to prod python packages group This will reduce khoj dependencies to install for self-hosting users - Move auth production dependencies to prod python packages group - Only enable authentication API router if not in anonymous mode - Improve error with requirements to enable authentication when not in anonymous mode	2024-02-16 17:41:08 +05:30
Debanjum Singh Solanky	d7dbb715ef	Fix docs links in khoj introductory chat message	2024-02-13 22:38:03 +05:30
sabaimran	32ec54172e	Add additional personalization in Chat via Location, Username (#644 ) * Add location metadata to chat history * Add support for custom configuration of the user name * Add region, country, city in the desktop app's URL for context in chat * Update prompts to specify user location, rather than just location. * Add location data to Obsidian chat query * Use first word for first name, last word for last name when setting profile name	2024-02-13 17:05:13 +05:30
sabaimran	a3eb17b7d4	Have Khoj dynamically select conversation command(s) in chat (#641 ) * Have Khoj dynamically select which conversation command(s) are to be used in the chat flow - Intercept the commands if in default mode, and have Khoj dynamically guess which tools would be the most relevant for answering the user's query * Remove conditional for default to enter online search mode * Add multiple-tool examples in the prompt, make prompt for tools more specific to info collection	2024-02-11 17:11:32 +05:30
sabaimran	69344a6aa6	Add support for multiple chat sessions in the desktop application (#639 ) * Add chat sessions to the desktop application * Increase width of the main chat body to 90vw * Update the version of electron * Render the default message if chat history fails to load * Merge conversation migrations and fix slug setting * Update the welcome message, use the hostURL, and update background color for chat actions * Only update the window's web contents if the page is config	2024-02-11 16:05:28 +05:30
sabaimran	1412ed6a00	Support multiple chat sessions within the web UI (#638 ) * Enable support for multiple chat sessions within the web client - Allow users to create multiple chat sessions and manage them - Give chat session slugs based on the most recent message - Update web UI to have a collapsible menu with active chats - Move chat routes into a separate file * Make the collapsible side panel more graceful, improve some styling elements of the new layout * Support modification of the conversation title - Add a new field to the conversation object - Update UI to add a threedotmenu to each conversation * Get the default conversation if a matching one is not found by id	2024-02-11 15:48:28 +05:30
sabaimran	208ccc83ec	Fix version of gpt4all to 2.1.0 as it's not backwards compatible	2024-02-10 09:32:04 +05:30
Debanjum Singh Solanky	70f74cde68	Fix timestamps to separate each logline. Info log response start time	2024-02-07 20:45:16 +05:30
Debanjum Singh Solanky	667b975400	Free space on Github workflow VM to build Khoj docker images	2024-02-06 23:37:51 +05:30
Debanjum Singh Solanky	8e5db72140	Release Khoj version 1.5.1	2024-02-06 23:09:33 +05:30
Debanjum	fc1b8f6fb6	Fix Khoj Obsidian plugin on Obsidian Mobile (#635 ) - Removed node-fetch dependency to work on mobile. - Fix CORS issue for Khoj (streaming) chat on Obsidian mobile - Verified Khoj plugin, search, chat work on Obsidian mobile. ## Details ### Major - Allow calls to Khoj server from Obsidian mobile app to fix CORS issue - Chat stream using default `fetch' not `node-fetch' in obsidian plugin ### Minor - Load chat history after other elements in chat modal on Obsidian are rendered - Scroll to bottom of chat modal on Obsidian across mobile & desktop	2024-02-06 22:03:51 +05:30
Debanjum	c6fa98ce3e	Make Offline Chat Date Aware (#636 ) - Provide more context and instructions to offline chat on Khoj - Upgrade offline chat quality tests to support more use-cases ### Details - Improve offline chat system prompt to think step by step - Make offline chat model current date aware. Improve system prompts - Fix actor, director tests using freeze time by ignoring transformers package	2024-02-06 21:32:34 +05:30
Debanjum Singh Solanky	fd238ff792	Load chat history after other elements in chat modal on Obsidian rendered This reduces laggy feeling due to latency of loading chat history from server	2024-02-06 21:25:43 +05:30
Debanjum Singh Solanky	e06a0c6ae0	Scroll to bottom of chat modal on Obsidian across mobile & desktop Put logic into single reused function	2024-02-06 21:25:43 +05:30
Debanjum Singh Solanky	07dc04f40e	Allow calls to Khoj server from Obsidian mobile app to fix CORS issue - Obsidian mobile uses capacitor js. Requests from it have origin as http://localhost on Android and capacitor://localhost on iOS - Allow those Obsidian mobile origins in CORS middleware of server	2024-02-06 21:25:43 +05:30
Debanjum Singh Solanky	dd4cf66be1	Improve offline chat system prompt to think step by step	2024-02-06 20:23:19 +05:30
Debanjum Singh Solanky	035165b534	Make offline chat model current date aware. Improve system prompts - Can now expect date awareness chat quality test to pass - Prevent offline chat model from printing verbatim user Notes and special tokens - Make it ask follow-up questions if it needs more context	2024-02-06 20:23:19 +05:30
Debanjum Singh Solanky	447904f0ab	Chat stream using default `fetch' not `node-fetch' in obsidian plugin Plugins using NodeJS libraries like `node-fetch' don't work on Obsidian mobile	2024-02-06 03:03:42 +05:30
Debanjum Singh Solanky	0d949140f4	Fix actor, director tests using freeze time by ignoring transformers package transformers package was causing freeze time to fail during setup	2024-02-06 03:00:48 +05:30
Debanjum Singh Solanky	c40f642afa	Move Use OpenAI Compatible LLM Server section to existing advanced page Add footnote on supported chat models to the self-hosting section	2024-02-04 16:16:55 +05:30
Debanjum Singh Solanky	523af5b3aa	Fix docs. Chat model options need to be set if using OpenAI proxy server	2024-02-04 06:42:05 +05:30
Debanjum Singh Solanky	ba79334863	Only log number of day old user requests, not the complete dictionary	2024-02-02 10:33:31 +05:30
Debanjum Singh Solanky	474afa5efe	Document using OpenAI-compatible LLM API server for Khoj chat This allows using open or commercial, local or hosted LLM models that are not supported in Khoj by default. It also allows users to use other local LLM API servers that support their GPU Closes #407	2024-02-02 10:31:27 +05:30
Debanjum Singh Solanky	1c6f1d94f5	Fix styling of Whatsapp card & notify banner in config page of web app - Put Whatsapp card back in Client section. - Fixes side spacing on cards - Improve Whatsapp card row gaps - Hide notification banner on web app load. Previously it showed up as a yellow dot on smaller displays	2024-02-01 22:59:57 +05:30
Debanjum Singh Solanky	e05474e7e0	Say when max-prompt, tokenizer fields needs setup in self-host docs	2024-01-31 08:42:22 +05:30
sabaimran	4daac334bc	Fix subscription state detection for users based on phone numbers, emails (#633 ) * Fix subscription state detection for users based on phone numbers, emails * Fix unit tests for api_user4 * Use a single method for determining subscription from user * Pass user object, rather than user.email for getting subscription state	2024-01-31 07:48:55 +05:30
sabaimran	fc4b57d9f6	Revert styling for white-space pre-line in the chat views as it looks bad	2024-01-29 18:29:54 +05:30
sabaimran	da854703aa	Release Khoj version 1.5.0	2024-01-29 18:05:10 +05:30
Debanjum	d1bfb245df	Improve Khoj Chat and Settings UI (#630 ) * Fix license in pyproject.toml. Remove unused utils.state import * Use single debug mode check function. Disable telemetry in debug mode - Use single logic to check if khoj is running in debug mode. Previously there were 3 different variants of the check - Do not log telemetry if KHOJ_DEBUG is set to true. Previously didn't log telemetry even if KHOJ_DEBUG set to false * Respect line breaks in user, khoj chat messages to improve formatting * Disable Whatsapp config section on web client if Twilio not configured Simplify Whatsapp configuration status checking js by standardizing external input to lower case * Disable Phone API when Twilio not setup and rate limit calls to it - Move phone api to separate router and only enable it if Twilio enabled - Add rate-limiting to OTP and verification calls * Add slugs for phone rate limiting --------- Co-authored-by: sabaimran <narmiabas@gmail.com>	2024-01-29 18:03:43 +05:30
sabaimran	9ad44f0e77	Include info about privacy in the docs (#631 ) * Add a page about privacy and organize some of the documentation * Add notice about telemetry * Improve copy for privacy section, link to telemetry section	2024-01-29 17:47:23 +05:30
sabaimran	4fb8d5c6d4	Store rate limiter-related metadata in the database for more resilience (#629 ) * Store rate limiter-related metadata in the database for more resilience - This helps maintain state even between server restarts - Allows you to scale up workers on your service without having to implement sticky routing * Make the usage exceeded message less abrasive * Fix rate limiter for specific conversation commands and improve the copy	2024-01-29 15:27:06 +05:30
sabaimran	71cbe5160d	Add retries in case the embeddings API fails (#628 ) * Add retries in case the embeddings API fails * Improve error handling in the inference endpoint API request handler - retry only if HTTP exception - use logger to output information about errors	2024-01-29 15:26:34 +05:30
sabaimran	b782683e60	Scrape results from Serper results using Olostep (#627 ) * Initialize changes to incorporate web scraping logic after getting SERP results - Do some minor refactors to pass a symptom prompt to the openai model when making a query - integrate Olostep in order to perform the webscraping * Fix truncation error with new line, fix typing in olostep code * Use the authorization header for the token * Add a small hint/indicator for how to use Khoj's other modalities in the welcome prompt * Add more detailed error message if Olostep query fails * Add unit tests which invoke Olostep in chat director * Add test for olostep tool	2024-01-29 14:16:50 +05:30
sabaimran	360b59cdb2	Add handling for None field values in logs and make telemetry upload more frequent	2024-01-26 00:00:55 +05:30
sabaimran	737fb6417b	Revert none checking in telemetry logs	2024-01-25 23:48:09 +05:30
sabaimran	211c5623e8	Improve error handling for telemetry uploads - Use response.raise_for_status when telemetry upload files - Do not send null packets to the destination server	2024-01-25 20:40:42 +05:30
Debanjum Singh Solanky	098a8e4fb1	Fix evaluating connected to server status in Obsidian plugin Only show welcome status message when khojApiKey not set and khojUrl set to khoj cloud	2024-01-25 18:04:29 +05:30
Debanjum Singh Solanky	518f3c0c99	Update docs to say khoj chat shown on obsidian ribbon now	2024-01-25 18:03:22 +05:30
Debanjum Singh Solanky	1c52ddf792	Bump up server side content indexing interval to ~1 day Reduce server side indexing load and API request failures	2024-01-25 13:33:34 +05:30
sabaimran	0fba1e27c5	Add hint to input text for using slash commands	2024-01-25 11:56:56 +05:30
sabaimran	da6cd5ddc4	Improve subqueries for online search and prompt generation for image (#626 ) * Improve subqueries for online search and prompt generation for image - Include conversation history so that subqueries or intermediate prompts are generated with the appropriate context	2024-01-24 17:42:59 +05:30
sabaimran	dbdca7d8d1	Disable swagger UI docs in production	2024-01-24 15:23:39 +05:30
sabaimran	ddf6fd9c09	Remove valid number alert	2024-01-23 17:57:27 +05:30
Debanjum Singh Solanky	17107a0337	Release Khoj version 1.4.0	2024-01-23 10:18:31 +05:30
Debanjum Singh Solanky	f69eafe95a	Update Readme with updated capabilities	2024-01-23 09:56:01 +05:30
sabaimran	679db51453	Add support for phone number authentication with Khoj (part 2) (#621 ) * Allow users to configure phone numbers with the Khoj server * Integration of API endpoint for updating phone number * Add phone number association and OTP via Twilio for users connecting to WhatsApp - When verified, store the result as such in the KhojUser object * Add a Whatsapp.svg for configuring phone number * Change setup hint depending on whether the user has a number already connected or not * Add an integrity check for the intl tel js dependency * Customize the UI based on whether the user has verified their phone number - Update API routes to make nomenclature for phone addition and verification more straightforward (just /config/phone, etc). - If user has not verified, prompt them for another verification code (if verification is enabled) in the configuration page * Use the verified filter only if the user is linked to an account with an email * Add some basic documentation for using the WhatsApp client with Khoj * Point help text to the docs, rather than landing page info * Update messages on various callbacks and add link to docs page to learn more about the integration	2024-01-22 18:14:58 -08:00
sabaimran	58bf917775	Update the font used across Khoj desktop and web to be Tajawal (#622 )	2024-01-20 23:13:33 +05:30
Debanjum	679f0f24a4	Improve Chat Input Pane Actions. Move to 1 Click Audio Chat on Mobile (#624 ) ## Major ### Move to single click audio chat UX on Obsidian, Desktop, Web clients New default UX has 1 long-press on mobile, 2-click on desktop to send transcribed audio message - New Audio Chat Flow 1. Record audio while microphone button pressed 2. Show auto-send 3s countdown timer UI for audio chat message Provide a visual cue around send button for how long before audio message is automatically sent to Khoj for response 3. Auto-send msg in 3s unless stop send message button clicked - Why - Removes the previous default of 3 clicks required to send audio message The record > stop > send process to send audio messages was unclear and effortful - Still allows stopping message from being sent, to make correction to transcribed audio - Removes inadvertent long audio transcriptions if forget to press stop while recording ### Improve chat input pane actions & icons on Obsidian, Desktop, Web clients - Use SVG icons in chat footer on web, desktop app - Move delete icon to left of chat input. This makes it harder to inadvertently click it - Add send button to chat input pane - Color chat message send button to make it primary CTA - Make chat footer shorter. Use no or round border on action buttons ## Minor - Stop rendering empty starter questions element when no questions present - Add round border, hover color to starter questions in web, desktop apps - Fix auto resizing chat input box when transcribed text added - Convert chat input into a text area in the Obsidian client	2024-01-20 21:52:56 +05:30
Debanjum Singh Solanky	ec3b837d00	Send audio message in 2-clicks on desktop to avoid holding down mic button	2024-01-20 21:40:38 +05:30
Debanjum Singh Solanky	f0daa45ae0	Move to single click audio chat UX on Obsidian client - Capability New default UX has 1 long-press to send transcribed audio message - Removes the previous default of 3 clicks required to send audio message - The record > stop > send process to send audio messages was unclear - Still allows stopping message from being sent, if users want to make correction to transcribed audio - Removes inadvertent long audio transcriptions if user forgets to press stop when recording - Changes - Record audio while microphone button pressed - Show auto-send 3s countdown timer UI for audio chat message Provide a visual cue around send button for how long before audio message is automatically sent to Khoj for response - Auto-send msg in 3s unless stop send message button clicked	2024-01-20 16:07:12 +05:30
Debanjum Singh Solanky	29a581d2b0	Move to single click audio chat UX on desktop app - Capabillity New default UX has 1 long-press to send transcribed audio message - Removes the previous default of 3 clicks required to send audio message - The record > stop > send process to send audio messages was unclear - Still allows stopping message from being sent, if users want to make correction to transcribed audio - Removes inadvertent long audio transcriptions if user forgets to press stop when recording - Changes - Record audio while microphone button pressed - Show auto-send 3s countdown timer UI for audio chat message Provide a visual cue around send button for how long before audio message is automatically sent to Khoj for response - Auto-send msg in 3s unless stop send message button clicked	2024-01-20 16:03:51 +05:30
Debanjum Singh Solanky	699e9ff878	Move to single click audio chat UX on web app - Capabillity New default UX has 1 long-press to send transcribed audio message - Removes the previous default of 3 clicks required to send audio message - The record > stop > send process to send audio messages was unclear - Still allows stopping message from being sent, if users want to make correction to transcribed audio - Removes inadvertent long audio transcriptions if user forgets to press stop when recording - Changes - Record audio while microphone button pressed - Show auto-send 3s countdown timer UI for audio chat message Provide a visual cue around send button for how long before audio message is automatically sent to Khoj for response - Auto-send msg in 3s unless stop send message button clicked	2024-01-20 15:56:46 +05:30
Debanjum Singh Solanky	26bd3533d8	Stop rendering empty starter questions element when no questions present	2024-01-20 11:39:58 +05:30
Debanjum Singh Solanky	7c8c475c3a	Add round border, hover color to starter questions in web, desktop apps	2024-01-20 00:51:11 +05:30
Debanjum Singh Solanky	8a488b9e39	Fix auto resizing chat input box when transcribed text added	2024-01-20 00:48:56 +05:30
Debanjum Singh Solanky	07ca137bdf	Convert chat input into a text area in the Obsidian client This allows for better readability of multi-line messages by users. The chat input is a text area in the other clients as well.	2024-01-20 00:48:56 +05:30
Debanjum Singh Solanky	d4552117f6	Add and improve chat input pane, actions, icons on Obsidian client - Move delete icon to left of chat input. This makes it harder to inadvertently click - Add send button to chat footer. Enter being the only way to send messages is not intuitive, outside standard modern UI patterns - Color chat message send button to make it primary CTA on web client - Make chat footer shorter. Use no or round border on action buttons	2024-01-20 00:48:56 +05:30
Debanjum Singh Solanky	c0ad64d9a3	Add and improve chat input pane, actions, icons on desktop client - Use SVG icons in chat footer on web - Move delete icon to left of chat input. This makes it harder to inadvertently click - Add send button to chat footer. Enter being the only way to send messages is not intuitive, outside standard modern UI patterns - Color chat message send button to make it primary CTA on web client - Make chat footer shorter. Use no or round border on action buttons	2024-01-20 00:29:49 +05:30
Debanjum Singh Solanky	ea85ebdacb	Add and improve chat input pane, actions, icons on web client - Use SVG icons in chat footer on web - Move delete icon to left of chat input. This makes it harder to inadvertently click - Add send button to chat footer. Enter being the only way to send messages is not intuitive, outside standard modern UI patterns - Color chat message send button to make it primary CTA on web client - Make chat footer shorter. Use no or round border on action buttons	2024-01-19 20:40:42 +05:30
sabaimran	039ed78253	Add support for a first-party client app to call into Khoj (Part 1) (#601 ) * Add support for a first party client app - Based on a client id and client secret, allow a first party app to call into the Khoj backend with a phone number identifier - Add migration to add phone numbers to the KhojUser object * Add plus in front of country code when registering a phone number. - Decrease free tier limit to 5 (from 10) - Return a response object when handling stripe webhooks * Fix telemetry method which references authenticated user's client app * Add better error handling for null phone numbers, simplify logic of authenticating user * Pull the client_secret in the API call from the authorization header * Add a migration merge to resolve phone number and other changes	2024-01-18 19:24:14 +05:30
Debanjum Singh Solanky	9dfe1bb003	Fix updating subscription when invoice paid. Revert renewal_date logic The actual issue was that `get_or_create_user_by_email' tried to create a subscription even if it already existed. With updated logic: - New subscription is only created when it doesn't already exist in `get_or_create_user_by_email' - `set_user_subscription' just updates the subscription state as user subscription object creation is already managed by `get_or_create_user_by_email'. So the other conditionals are unnecessary	2024-01-18 16:20:18 +05:30
Debanjum Singh Solanky	9b1a66c969	Fix updating subscription renewal date when invoice paid	2024-01-18 14:46:10 +05:30
sabaimran	93d5cb128c	Initialize embeddings to empty list before processing	2024-01-18 13:27:04 +05:30
Debanjum Singh Solanky	24af888c41	Release Khoj version 1.3.0	2024-01-18 11:42:13 +05:30
Debanjum Singh Solanky	2f1bb5c2c8	Upload Desktop App Artifacts to Github Release	2024-01-18 11:40:04 +05:30
sabaimran	e71ebb8068	Standardize issue templates and make them easier to use	2024-01-18 10:54:05 +05:30
sabaimran	efb4bd6780	Add a template for feature requests	2024-01-18 10:38:53 +05:30
sabaimran	6165ae56c2	Update bug report issue template - collect info about OS, device, server, client, and prompt to include any relevant data	2024-01-18 10:35:02 +05:30
Debanjum	8b4dd16255	Fix markdownRenderer arg to allow chat responses in Obsidian plugin (#619 ) - Issue Users with Dataview plugin would have error as its markdown post-processor expects the sourcePath to be a string This prevents Khoj from responding to chat messages in the Obsidian chat modal. Search via Obsidian still works but it throws the same dataview plugin error - Fix Pass a string as sourcePath to markdownRenderer to fix failing chat response and stop throwing dataview errors on search Resolves #614, Resolves #606	2024-01-18 10:18:31 +05:30
Debanjum	c8dbe8ee7b	Improve server status check and message in Obsidian client (#617 ) - Update health API to pass authenticated users their info - Improve Khoj server status check in Khoj Obsidian client - Show Khoj Obsidian commands even if no connection to server - Show Khoj chat by default in Obsidian side pane instead of search	2024-01-18 10:17:35 +05:30
Debanjum Singh Solanky	f9420e1209	Show Khoj Obsidian commands even if no connection to server Server connection check can be a little flaky in Obsidian. Don't gate the commands behind it to improve usability of Khoj. Previously the commands would get disabled when server connection check failed, even though server was actually accessible	2024-01-18 10:09:20 +05:30
Debanjum Singh Solanky	36bf42a860	Show Khoj chat by default in Obsidian side pane instead of search	2024-01-18 10:09:20 +05:30
Debanjum Singh Solanky	aab75a6ead	Improve Khoj server status check in Khoj Obsidian client - Update server connection status on every edit of khoj url, api key in settings instead of only on plugin load The error message was stale if connection fixed after changes in Khoj plugin settings to URL or API key, like on plugin install - Show better welcome message on first plugin install. Include API key setup instruction - Show logged in user email on Khoj settings page	2024-01-18 10:09:20 +05:30
Debanjum Singh Solanky	1a46734485	Fix markdownRenderer arg to allow chat responses in Obsidian plugin - Issue: Users with Dataview plugin would have error as its markdown post-processor expects the sourcePath to be a string This prevents Khoj from responding to chat messages in the Obsidian chat modal. Search via Obsidian still works but it throws the same dataview error - Fix: Pass a string as sourcePath to markdownRenderer to fix failing chat response Resolves #614, Resolves #606	2024-01-18 10:02:50 +05:30
sabaimran	e9e49ea098	Allow custom inference endpoint for the crossencoder model (#616 ) * Add support for custom inference endpoints for the cross encoder model - Since there's not a good out of the box solution, I've deployed a custom model/handler via huggingface to support this use case. * Use langchain.community for pdf, openai chat modules * Add an explicit stipulation that the api endpoint for crossencoder inference should be for huggingface for now	2024-01-18 10:02:12 +05:30
Debanjum Singh Solanky	08012c71b1	Update Dockerfile with swig system package required by PyMuPDF	2024-01-17 19:24:27 +05:30
Debanjum Singh Solanky	870af19ba4	Update health API to pass authenticated users their info This allows Khoj clients to get email address associated with user's API token for display in client UX In anonymous mode, default user information is passed	2024-01-17 13:38:57 +05:30
Debanjum	4d30f7d1d9	Short-circuit API rate limiter for unauthenticated users (#607 ) ### Major - Short-circuit API rate limiter for unauthenticated user Calls by unauthenticated users were failing at API rate limiter as it failed to access user info object. This is a bug. API rate limiter should short-circuit for unauthenticated users so a proper Forbidden response can be returned by API Add regression test to verify that unauthenticated users get 403 response when calling the /chat API endpoint ### Minor - Remove trailing slash to normalize khoj url in obsidian plugin settings - Move used /api/config API controllers into separate module - Delete unused /api/beta API endpoint - Fix error message rendering in khoj.el, khoj obsidian chat - Handle deprecation warnings for subscribe renew date, langchain, pydantic & logger.warn	2024-01-17 00:59:52 +05:30
Debanjum Singh Solanky	d26a4ffcea	Only run the OpenAI chat client, /online test when API keys are set	2024-01-17 00:36:03 +05:30
Debanjum Singh Solanky	2752e0d607	Update jinja2 and axios min supported package versions	2024-01-16 18:45:38 +05:30
Debanjum Singh Solanky	7039c202c8	Merge branch 'master' into short-circuit-api-rate-limiter	2024-01-16 18:18:34 +05:30
Debanjum Singh Solanky	8917228dbb	Remove unused, deprecated /api/config/data API endpoints - Use /api/health for server up check instead of api/config/default - Remove unused `khoj--post-new-config' method - Remove the now unused /config/data GET, POST API endpoints	2024-01-16 18:15:06 +05:30
Debanjum	51c59d0059	Remove the 1000 files limit when syncing from Desktop, Obsidian clients (#605 ) ### Major - Push 1000 files at a time from the Desktop client for indexing - Push 1000 files at a time from the Obsidian client for indexing - Test 1000 file upload limit to index/update API endpoint ### Minor - Show relevant error message in desktop app, e.g when can't connect to server - Pass indexed filenames in API response for client validation - Collect files to index in single dict to simplify index/update controller Resolves #573	2024-01-16 17:59:26 +05:30
Debanjum Singh Solanky	6ded4c1d75	Merge branch 'master' into fix-1000-file-index-update-limit	2024-01-16 16:50:58 +05:30
sabaimran	c24389cff5	Add Algolia to documentation website for better search	2024-01-16 15:53:53 +05:30
Debanjum	45f892dfdd	Fix Offline Chat without GPU and Decoding Chat Query before Processing - Only run /online command offline chat director test when `SERPER_DEV_API_KEY' present - Decode URL encoded query string in chat API endpoint before processing - Make references and online_results optional params to converse_offline - Pass max context length to fix using updated `GPT4All.list_gpu' method	2024-01-16 14:53:34 +05:30
Debanjum Singh Solanky	e0b381d523	Only run /online command offline chat director test when SERPER KEY present	2024-01-16 13:09:38 +05:30
Debanjum Singh Solanky	16175137e5	Decode URL encoded query string in chat API endpoint before processing	2024-01-16 13:09:28 +05:30
Debanjum Singh Solanky	9fe1c8ae13	Make references and online_results optional params to converse_offline Fixes all the failing GPT4All tests because they were missing the online_results argument	2024-01-16 13:09:28 +05:30
Debanjum Singh Solanky	d74f8e03d3	Pass max context length to fix using updated GPT4All.list_gpu method Its signature was updated in GPT4All 2.1.0 pypi release. Resolves #610	2024-01-16 12:23:45 +05:30
Debanjum Singh Solanky	1ae6669fbf	Correctly handle API response when no files to index	2024-01-16 11:57:40 +05:30
sabaimran	50575b749b	Add option to use HuggingFace's inference endpoint for generating embeddings (#609 ) * Support using hosted Huggingface inference endpoint for embeddings generation * Since the huggingface inference endpoint is model-specific, make the URL an optional property of the search model config * Handle ECONNREFUSED error in desktop app * Drive API key via the search model config model and use more generic names	2024-01-16 08:58:24 +05:30
Debanjum Singh Solanky	ba37b28fb5	Improve batched error handling. Catch can't connect to server error Break out of batch processing when unable to connect to server or when requests throttled by server	2024-01-14 01:04:44 +05:30
Debanjum Singh Solanky	7dfbcd2e5a	Handle subscribe renew date, langchain, pydantic & logger.warn warnings - Ensure langchain less than 0.2.0 is used, to prevent breaking ChatOpenAI, PyMuPDF usage due to their deprecation after 0.2.0 - Set subscription renewal date to a timezone aware datetime - Use logger.warning instead of logger.warn as latter is deprecated - Use `model_dump' not deprecated dict to get all configured content_types	2024-01-12 01:46:52 +05:30
Debanjum Singh Solanky	5f97357fe0	Delete unused /api/beta API endpoint	2024-01-12 01:11:05 +05:30
Debanjum Singh Solanky	bb1c1b39d8	Move /api/config API controllers into separate module for code modularity	2024-01-12 01:11:04 +05:30
Debanjum Singh Solanky	ba99089a12	Short-circuit API rate limiter for unauthenticated user Calls by unauthenticated users were failing at API rate limiter as it failed to access user info object. This is a bug. API rate limiter should short-circuit for unauthenticated users so a proper Forbidden response can be returned by API Add regression test to verify that unauthenticated users get 403 response when calling the /chat API endpoint	2024-01-12 00:23:50 +05:30
Debanjum Singh Solanky	b1269fdad2	Remove trailing slash to normalize khoj url in obsidian plugin settings	2024-01-11 21:56:36 +05:30
Debanjum Singh Solanky	ffdb291fe0	Fix error message rendering in khoj.el, khoj obsidian chat - Fix failed to index error message in khoj.el - Fix chat model not configured message in khoj obsidian chat	2024-01-11 21:55:54 +05:30
Debanjum Singh Solanky	af9ceb00a0	Show relevant error msg in desktop app, e.g when can't connect to server	2024-01-09 23:09:34 +05:30
Debanjum Singh Solanky	43423432ce	Pass indexed filenames in API response for client validation	2024-01-09 23:09:34 +05:30
Debanjum Singh Solanky	5f9ac5a630	Collect files to index in single dict to simplify index/update controller Simplifies code while maintaining typing	2024-01-09 23:09:34 +05:30
Debanjum Singh Solanky	efe41aaaca	Push 1000 files at a time from the Desktop client for indexing FastAPI API endpoints only support uploading 1000 files at a time. So split all files to index into groups of 1000 for upload to index/update API endpoint	2024-01-09 23:09:34 +05:30
sabaimran	02187b19bb	Customize font styling for documentation	2024-01-08 08:50:42 +05:30
sabaimran	8389108653	Fix reference issue for demos in the main README	2024-01-08 08:29:51 +05:30
Debanjum	dbc59b2952	Fix, Improve Khoj Documentation Layout (#604 ) - `26f96e00` Use Khoj Client, Data sources diagrams in feature docs - `c82d34b6` Add Docs footer, nav pane links. Fix tagline, Remove announcement topbar - `d920e4d0` Make the docs overview page as the main docs landing page - `80d1ad5b` Fix image urls on docs overview page. Remove logo header in client docs	2024-01-08 02:00:02 +05:30
Debanjum Singh Solanky	efc7b08cd9	Use Khoj Client, Data sources diagrams in feature docs	2024-01-08 01:58:57 +05:30
Debanjum Singh Solanky	c82d34b659	Add Docs footer, nav pane links. Fix tagline, Remove announcement topbar	2024-01-08 01:17:47 +05:30
Debanjum Singh Solanky	d920e4d0a7	Make the docs overview page as the main docs landing page - Make the docs overview page available at docs.khoj.dev root instead of under docs.khoj.dev/docs path - Remove the new landing page, it is unnecessary. - Remove /docs path prefix from links to internal doc pages - Remove .md path suffix in internal doc pages for consistency	2024-01-08 01:13:42 +05:30
Debanjum Singh Solanky	80d1ad5b6f	Fix image urls on docs overview page. Remove logo header in client docs	2024-01-08 00:30:31 +05:30
sabaimran	ce53bc52c5	Modify permissions of the GITHUB_TOKEN for publishing to gh-pages	2024-01-07 20:53:57 +05:30
sabaimran	740453fa18	Use documentation folder for building project and uploading data	2024-01-07 20:50:15 +05:30
sabaimran	2be7c84203	Enter documentation repository before running yarn build	2024-01-07 20:46:21 +05:30
sabaimran	ad95e88838	Update node version in github action	2024-01-07 20:41:24 +05:30
sabaimran	bd9aa578f4	Add a yarn.lock file and use for node.js setup	2024-01-07 20:36:02 +05:30
sabaimran	9b991eb4fe	Migrate to using docusaurus, rather than docsify for documentation (#603 ) * Add docusaurus documentation (to replace the docsify setup) * Remove older docs * Specify documentation as the gh pages build action working directory	2024-01-07 20:28:15 +05:30
Debanjum Singh Solanky	98081bc0d3	Update Uninstall Documentation for Khoj Server when Self Hosting	2024-01-06 01:37:29 +05:30
Debanjum Singh Solanky	5d52dc5b35	Fix spelling in the development documentation for Khoj	2024-01-04 19:24:58 +05:30
Debanjum Singh Solanky	b6d5392c0c	Release Khoj version 1.2.1	2024-01-04 18:45:37 +05:30
Debanjum Singh Solanky	fca7a5ff32	Push 1000 files at a time from the Obsidian client for indexing FastAPI API endpoints only support uploading 1000 files at a time. So split all files to index into groups of 1000 for upload to index/update API endpoint	2024-01-04 18:43:22 +05:30
Debanjum Singh Solanky	4ded32cc64	Test 1000 file upload limit to index/update API endpoint Due to FastAPI limitation	2024-01-03 22:14:36 +05:30
Debanjum Singh Solanky	4a234c8db3	Use default offline/openai chat model to extract DB search queries Make usage of the first offline/openai chat model as the default LLM to use for background tasks more explicit The idea is to use the default/first chat model for all background activities, like user message to extract search queries to perform. This is controlled by the server admin. The chat model set by the user is used for user-facing functions like generating chat responses	2024-01-03 14:04:49 +05:30
Debanjum Singh Solanky	e28adf2884	Also index pdf, markdown and plaintext files using khoj emacs client Previously you could only index org-mode files and directories from khoj.el Mark the `khoj-org-directories', `khoj-org-files' variables for deprecation, since `khoj-index-directories', `khoj-index-files' replace them as more appropriate names for the more general case Resolves #597	2024-01-03 11:46:17 +05:30
Debanjum Singh Solanky	5abaed9d08	Use user chosen OpenAI model to extract DB search questions from query Previously Khoj was selecting the first OpenAI model configured on server and not the OpenAI model configured by the user for themselves	2024-01-03 11:45:06 +05:30
Debanjum Singh Solanky	e582639efa	Move contributing section back down in sidebar of documentation website	2024-01-03 11:40:14 +05:30
Debanjum Singh Solanky	05536aab6b	Merge how users can share personal information in personality prompt	2024-01-03 11:40:14 +05:30
Liam Swayne	455f78b178	Replace var declarations with let declarations (#576 ) * Replace var declaration with let declaration	2023-12-29 10:20:48 +05:30
sabaimran	79913d4c17	Add isort to the pre-commit configuration and apply it to the whole project (#595 ) * Apply isort to the entire repository * Fix missing import issues in text_to_entries * Fix imports in migration files	2023-12-28 18:04:02 +05:30
sabaimran	738f050086	Merge pull request #587 from khoj-ai/features/search-model-options-custom Support multiple search models, with ability for custom user config	2023-12-28 13:09:49 +05:30
sabaimran	442c913de3	Update telemetry state for search model only if one is found, fix alt text for language setting	2023-12-28 12:53:53 +05:30
sabaimran	d3ab3f1b70	Rename matrix_blog to web and move the language setting into the content section	2023-12-28 12:44:49 +05:30
sabaimran	6946e038c2	Merge pull request #596 from khoj-ai/chore/add-developer-documentation Improve the developer documentation	2023-12-23 18:43:43 +05:30
sabaimran	00af6baeb6	Resolve merge conflicts with intro message in chat.html web view	2023-12-23 17:52:58 +05:30
sabaimran	c10602b6c5	Put contributing higher in the sidebar	2023-12-23 14:04:53 +05:30
sabaimran	fe415e1508	Add tip for using the good-first-issue tag in GH issues	2023-12-23 14:04:05 +05:30
sabaimran	3280715ca0	Update contributor guidelines - Add more accurate steps for building Khoj locally - Remove outdated instructions - Add specific steps to create a Github Issue - Add instructions for Obsidian plugin development	2023-12-23 14:00:52 +05:30
sabaimran	afec4394f9	Merge pull request #592 from ayushjha119/Fixed-Health-Check-to-Khoj-api Fixed health check to khoj api	2023-12-23 13:04:50 +05:30
sabaimran	c50eb8a691	Fix mypy/pre-commit issues	2023-12-23 11:44:37 +05:30
Debanjum Singh Solanky	21c55b4c0d	Release Khoj version 1.2.0	2023-12-22 21:43:47 +05:30
Debanjum Singh Solanky	e42111a8af	Fix bump_version.sh to commit, clean-up after desktop app version bump	2023-12-22 21:42:03 +05:30
Debanjum Singh Solanky	6a8c1fe423	Sanitize rendering chat references in Web, Desktop and Obsidian clients Use textContent instead of innerHTML to append references Resolves #583	2023-12-22 18:11:49 +05:30
Debanjum	6879daccc6	Fix Chat Streaming on Obsidian, Docker Image Version and First-Run, Chat Error Messages in Clients (#589 ) - Fix streaming chat response in Obsidian client - Fix first-run, chat error message in obsidian, desktop and web clients - Set Khoj app version to latest version in Docker images - Tag Khoj Docker image built on release with the `latest` tag This aligns docker image release cadence with client, server releases	2023-12-22 04:13:01 -08:00
Debanjum Singh Solanky	074123b9b9	Merge cloud, local dockerize workflows - Delete unused config directory	2023-12-22 17:11:52 +05:30
Debanjum Singh Solanky	d101297995	Use markdown formatted chat message in chat modal	2023-12-22 17:01:31 +05:30
Debanjum Singh Solanky	350fd89c8d	Clear chat history html in Obsidian if getChatHistory works too	2023-12-22 17:01:31 +05:30
Debanjum Singh Solanky	8d1e988059	Update tagging of the docker image on release, push to master & PR - Tag docker image with `tag_name' on release (i.e tag push) - Else tag with 'pre' on push to master - Else tag with 'dev' on push to PR branch - Only tag the latest release with release tag Previously the latest commit on master was being tagged with the latest tag. This doesn't sync with the release cadence of the rest of Khoj	2023-12-22 17:01:31 +05:30
Debanjum Singh Solanky	b5ae64cb3c	Dynamically set Khoj app version in the Dockerization Github workflows	2023-12-22 17:01:31 +05:30
Debanjum Singh Solanky	d3d47dce0b	Allow setting Khoj app version during docker build via build-args This will allow troubleshooting by getting the actual khoj version being used. Previously it was always set to a static 0.0.0 version Command to build Khoj docker image with dynamically set current app version: `docker-compose build server --build-arg VERSION=$(pipx run hatch version)'	2023-12-22 16:47:13 +05:30
ayushjha119	e487ec5370	fixed app to api health Check	2023-12-21 17:51:30 +05:30
Debanjum Singh Solanky	70607cbbbb	Update FRE message to get any Khoj client to sync files with server	2023-12-21 15:23:47 +05:30
ayushjha119	b3d7d6a79d	used the Response class from fastapi.responses and set the input for status_code to 200	2023-12-21 14:26:40 +05:30
sabaimran	e1aaff2053	Add more details about functionality in Khoj's intro message	2023-12-21 10:09:30 +05:30
sabaimran	a1211f40d7	Fix type declaration for the cross_encoder_model state variable. Update name of the new update API	2023-12-21 09:15:13 +05:30
sabaimran	089e4bee12	Fix unit tests with new search model configurations	2023-12-20 21:50:44 +05:30
Debanjum Singh Solanky	447c1b90e7	Fix streaming chat response in Obsidian client - Convert renderIncrementalMessage to an async method as MarkdownRenderer is an async method - Simplify code, remove unneeded JSON check	2023-12-20 14:51:19 +05:30
sabaimran	aa23da60a3	Add a notification banner to show temporary messages	2023-12-20 14:22:08 +05:30
Debanjum Singh Solanky	e04fe921eb	Fix first-run, chat error message in obsidian, desktop and web clients - Disable chat input field if getChatHistory had error as Khoj may not be setup correctly to chat	2023-12-20 14:03:07 +05:30
sabaimran	5ff9df9d4c	Add support per user for configuring the preferred search model from the config page - Honor this setting across the relevant places where embeddings are used - Convert the VectorField object to have None for dimensions in order to make the search model easily configurable	2023-12-20 13:25:43 +05:30
sabaimran	0f6e4ff683	Add a model that specifies the user's search model configuration - Update all endpoints that generate embeddings to use the new model. Incl. generating text embeddings, creating embeddings for a search query	2023-12-20 09:22:26 +05:30
sabaimran	6dd2b05bf5	Rebase with master	2023-12-19 21:02:49 +05:30
sabaimran	e3557cd8b7	Update the personality prompt to make Khoj aware that users can share data via the desktop app	2023-12-19 16:42:45 +05:30
sabaimran	927e477f68	Ignore typing error in custom action short description	2023-12-19 16:10:58 +05:30
sabaimran	946305d977	Add function to export conversations for debugging	2023-12-19 16:05:20 +05:30
sabaimran	903a01745f	Use 0px for padding for input row buttons in web	2023-12-18 16:09:06 +05:30
sabaimran	1e14a24f06	Merge pull request #586 from khoj-ai/features/misc-image-and-online-improvements Improvements to chat functionality and image generation	2023-12-17 23:28:08 +05:30
sabaimran	5b092d59f4	Ignore dict assignment typing error	2023-12-17 22:34:54 +05:30
sabaimran	03cb86ee46	Update typing and object assignment for new text to image method return	2023-12-17 21:28:33 +05:30
sabaimran	0288804f2e	Render the inferred query along with the image that Khoj returns	2023-12-17 21:02:55 +05:30
sabaimran	49af2148fe	Miscellaneous improvements to image generation - Improve the prompt before sending it for image generation - Update the help message to include online, image functionality - Improve styling for the voice, trash buttons	2023-12-17 20:25:35 +05:30
sabaimran	7cb64cb2f9	Add telemetry for image generation conversation command	2023-12-17 18:25:03 +05:30
sabaimran	e9ea0195b0	Merge pull request #585 from khoj-ai/fix/image-generation-and-csrf-cookie Fix image generation setup bug and CSRF cookie for admin login	2023-12-17 16:55:45 +05:30
sabaimran	09544dee09	Add TextToImageModelConfig to the admin page	2023-12-17 16:44:19 +05:30
sabaimran	0459666beb	CSRF Cookie not set error in prod. Try fixing https forwarding for mitigation	2023-12-17 12:55:18 +05:30
sabaimran	61dde8ed89	If text to image config isn't set, send back an error message to the client	2023-12-17 12:54:50 +05:30
sabaimran	fefaa2271d	Merge pull request #584 from khoj-ai/features/enforce-usage-limits-conversation-type Add a ConversationCommand rate limiter for the chat endpoint	2023-12-17 11:20:35 +05:30
sabaimran	3065cea562	Address mypy typing issues	2023-12-16 09:24:26 +05:30
sabaimran	5f6dcf9f2e	Add a rate limiter for the transcribe API endpoint	2023-12-16 09:18:56 +05:30
sabaimran	73a107690d	Add a ConversationCommand rate limiter for the chat endpoint	2023-12-16 09:03:52 +05:30
sabaimran	9b961ed496	Merge pull request #580 from khoj-ai/fix-upgrade-chat-to-create-images Support Image Generation with Khoj	2023-12-07 21:17:58 +05:30
Debanjum Singh Solanky	7504669f2b	Fix rendering image on chat response in obsidian client	2023-12-05 03:48:07 -05:00
Debanjum Singh Solanky	408b7413e9	Use global openai client for transcribe, image	2023-12-05 03:36:33 -05:00
Debanjum Singh Solanky	162b219f2b	Throw unsupported error when server not configured for image, speech-to-text	2023-12-05 01:51:14 -05:00
Debanjum Singh Solanky	8f2f053968	Fix rendering image on chat response in web, desktop client	2023-12-05 01:51:14 -05:00
Debanjum Singh Solanky	d124266923	Reduce promise based nesting in chat JS func used in desktop, web client Use async/await to reduce .then() based nesting to improve code readability	2023-12-05 01:51:14 -05:00
Debanjum Singh Solanky	6e3f66c0f1	Use base64 encoded image instead of source URL for persistence The source URL returned by OpenAI would expire soon. This would make the chat sessions contain non-accessible images/messages if using OpenAI image URL. Get base64 encoded image from OpenAI and store directly in conversation logs. This resolves the image link expiring issue	2023-12-05 01:51:14 -05:00
Debanjum Singh Solanky	52c5f4170a	Show generated images in the chat modal of the Khoj Obsidian plugin	2023-12-05 01:51:14 -05:00
Debanjum Singh Solanky	8016a57b5e	Show generated images in chat interface on Desktop client	2023-12-05 01:51:14 -05:00
Debanjum Singh Solanky	cc051ceb4b	Show generated images in chat interface on Web client	2023-12-05 01:51:14 -05:00
Debanjum Singh Solanky	252b35b2f0	Support /image slash command to generate images using the chat API	2023-12-05 01:51:14 -05:00
sabaimran	ef21d78c99	Initial changes to support multiple search model configurations - All search models are loaded into memory, and stored in a dictionary indexed by name - Still need to add database migrations and create a UI for user to select their choice. Presently, it uses the default option	2023-12-05 00:35:40 -05:00
Debanjum Singh Solanky	1d9c1333f2	Configure text to image models available on server - Currently supports OpenAI text to image model, by default dall-e-3 - Allow setting the text to image model via CLI during server setup	2023-12-04 21:27:53 -05:00
Debanjum Singh Solanky	f0222f6d08	Make save_to_conversation_log helper function reusable - Move it out to conversation.utils from generate_chat_response function - Log new optional intent_type argument to capture type of response expected. This can be the type of response by Khoj, e.g. speech, image. It can be used to render responses by Khoj appropriately on clients - Make user_message_time argument optional, set the time to now by default if not passed by calling function	2023-12-04 19:42:12 -05:00
sabaimran	d2ddbef08f	Use a unique name for the temp PDF generated	2023-12-04 19:27:00 -05:00
sabaimran	d20746613a	Properly filter out empty PDFs for indexing	2023-12-04 16:15:17 -05:00
Debanjum Singh Solanky	316b7d471a	Handle offline chat model retrieval when no internet Offline chat shouldn't fail on retrieve_model when no internet, if model was previously downloaded and usable offline	2023-12-04 13:46:25 -05:00
Debanjum Singh Solanky	2b09caa237	Make online results an optional argument to the gpt converse method	2023-12-04 12:15:29 -05:00
Debanjum Singh Solanky	7009793170	Migrate to OpenAI Python library >= 1.0	2023-12-03 18:16:00 -05:00
sabaimran	62a89f79b7	Merge pull request #577 from khoj-ai/fix/user-subscription-email-not-exists Fix null exception when user does not exist for subscription	2023-12-03 15:14:31 -08:00
sabaimran	cc064ea57d	Fix circular import issue	2023-12-03 17:46:44 -05:00
sabaimran	21f8d63e89	If a user subscribes to Khoj with an email address that's not present in the DB, create an account	2023-12-03 17:28:40 -05:00
sabaimran	c5d297a9ed	Recursively search through folders for indexing	2023-12-03 16:17:28 -05:00
Debanjum Singh Solanky	a57d529f39	Fix path to system tray icon of Khoj desktop app	2023-12-03 00:12:50 -08:00
Debanjum Singh Solanky	106cdbe455	Release Khoj version 1.1.0	2023-11-30 20:09:08 -08:00
Debanjum Singh Solanky	10ce4ee11c	Ignore null params type check for markdown renderer in Obsidian client	2023-11-30 20:09:08 -08:00
Debanjum	02f40785aa	Merge Github workflows to dockerize for production (#575)	2023-11-30 18:49:16 -08:00
sabaimran	a5ffa2342f	Add documentation for local setup and fix admin panel bugs - Wasn't able to login to the admin panel when KHOJ_DEBUG was not True. Fix this error so self-hosted users can get unblocked from accessing the admin settings - Don't force users to set their KHOJ_DJANGO_SECRET_KEY	2023-11-30 17:55:27 -08:00
Debanjum Singh Solanky	9d4bfdf47c	Merge Github workflows to dockerize for production	2023-11-30 17:18:13 -08:00
Debanjum Singh Solanky	d587632700	Clear result before render thinking placeholder emoji in Obsidian chat	2023-11-30 13:53:09 -08:00
Debanjum	a0686428ff	Render Chat Responses as Markdown in Desktop, Obsidian Client (#571) - Show temporary status message when copied to clipboard - Render chat responses as markdown in Desktop client - Render chat responses as markdown in chat modal of Obsidian client - Render references of new responses in chat modal on Obsidian client. Use new style for references - Properly stop `mediaRecorder` stream to clear microphone in-use state - Render newlines when references expanded in Web, Desktop and Obsidian clients	2023-11-30 13:52:02 -08:00
Debanjum Singh Solanky	48719ee0dd	Render newline separation in chat references to improve readability	2023-11-30 13:16:48 -08:00
Debanjum Singh Solanky	1a31a2efcf	Render Khoj chat streaming response as md & show refs in Obsidian - Use new style references for Khoj chat modal in Obsidian - Khoj Chat responses in Obsidian had regressed to not show references for new questions after modal has been opened. Now even those are rendered, and use new references style - Render chat response as markdown while it's being streamed	2023-11-30 13:02:00 -08:00
Debanjum Singh Solanky	0430fa67b6	Show temporary status message when copied to clipboard	2023-11-29 13:49:33 -08:00
Debanjum Singh Solanky	491a1a949a	Render chat responses as markdown in Desktop client too	2023-11-29 13:49:33 -08:00
Debanjum Singh Solanky	20ef5bfc93	Properly stop mediaRecorder stream to clear microphone in-use state	2023-11-29 13:48:35 -08:00
Debanjum Singh Solanky	8faa63c3c6	Convert config page buttons to use stronger yellow	2023-11-28 19:55:43 -08:00
Debanjum Singh Solanky	de5aa5c32e	Update pillow, aiohttp dependencies	2023-11-28 19:55:43 -08:00
sabaimran	fab57cc395	Fix pgvector installation instructions for Windows, Source	2023-11-28 14:46:09 -08:00
sabaimran	c4dcb51c91	Update headings for installation steps to indicate that local and docker setup are exclusive	2023-11-28 14:38:04 -08:00
Debanjum Singh Solanky	a6ca2076d5	Open link to Khoj app landing page from nav pane in current tab	2023-11-28 14:20:37 -08:00
Debanjum Singh Solanky	643e018947	Handle if user subscription field doesn't exist in telemetry func Avoid null ref in the method when running Khoj server in anon mode	2023-11-28 14:15:14 -08:00
Debanjum Singh Solanky	110d7646fc	Use milder yellow as primary Khoj theme color for chat, buttons etc.	2023-11-28 14:15:14 -08:00
sabaimran	18254850ab	Set a default value for the khoj django secret key and add additional guidance for setting environment variables on first run	2023-11-28 09:39:44 -08:00
sabaimran	24b5aaef0a	Merge pull request #569 from khoj-ai/features/enforce-subscription-status Enforce subscription state on the chat API access	2023-11-27 16:12:26 -08:00
sabaimran	6290b463f5	Compute size of the indexed data only if explicitly requested to avoid heavy load on the DB	2023-11-27 12:05:00 -08:00
sabaimran	eb5e3096e0	Change subscribed scope to premium	2023-11-27 11:39:20 -08:00
sabaimran	6e1ba11e59	Resolve merge conflicts for rendering chat response	2023-11-27 11:33:13 -08:00
sabaimran	239b31bc85	Clarify some of the language in the chat configuration docs	2023-11-27 10:44:05 -08:00
sabaimran	309ba7234c	Add instructions for setting up chat settings when locally hosting Khoj	2023-11-27 10:41:29 -08:00
sabaimran	5d8dbbdba4	Update instructions for Windows setup and add prerequisites for Docker	2023-11-27 10:32:02 -08:00
Debanjum Singh Solanky	71f2d54258	Render chat response as markdown while streaming on Web, Desktop clients	2023-11-26 20:27:10 -08:00
Debanjum Singh Solanky	9e714d032b	Fix Khoj telemetry server. Add server_version column	2023-11-26 15:05:43 -08:00
Debanjum	ebeae543ee	Speak to Khoj via Desktop, Web or Obsidian Client (#566) - Create speech to text API endpoint - Use OpenAI Whisper for ASR offline (by downloading Whisper model) or online (via OpenAI API) - Add speech to text model configuration to Database - Speak to Khoj from the Web, Desktop or Obsidian client	2023-11-26 14:32:11 -08:00
Debanjum Singh Solanky	b249bbb5b5	Limit max audio file size allowed for transcription on API endpoint	2023-11-26 14:19:46 -08:00
sabaimran	e438853b09	Add additional unit tests to verify behavior of unsubscribed/subscribed users	2023-11-26 13:09:00 -08:00
sabaimran	c18d52d1af	Add contributors to the README	2023-11-26 12:05:36 -08:00
Debanjum Singh Solanky	a79604b601	Fix return types of offline, online transcribe methods for python 3.9	2023-11-26 06:26:34 -08:00
Debanjum Singh Solanky	06f99ceb3c	Rename /api/speak API endpoint to /api/transcribe	2023-11-26 06:18:44 -08:00
Debanjum Singh Solanky	56a1a61c77	Remove unused button element retrieval code from web, desktop	2023-11-26 06:17:56 -08:00
Debanjum Singh Solanky	877532a167	Speak to Khoj from the Obsidian client - Add transcription button with mic icon - Collect audio recording on pressing mic - Process and send audio recording to server for transcription - Extract the functionality to flash status in chat input for reuse	2023-11-26 06:17:54 -08:00
Debanjum Singh Solanky	cc9eae5d18	Update default chat model to Mistral in GPT4AllProcessor config	2023-11-26 05:55:43 -08:00
Debanjum Singh Solanky	4636390f7f	Transcribe speech to text offline with Whisper - Allow server admin to configure offline speech to text model during initialization - Use offline speech to text model to transcribe audio from clients - Set offline whisper as default speech to text model as no setup api key reqd	2023-11-26 05:55:11 -08:00
Debanjum Singh Solanky	a0a7ab7ec8	Rename conversation.gpt4all package to conversation.offline	2023-11-26 04:19:32 -08:00
Debanjum Singh Solanky	499adf86a0	Move transcription using OpenAI API into independent package	2023-11-26 04:19:32 -08:00
Debanjum Singh Solanky	897170ab15	Use single db migration script for transcribe model, related updates	2023-11-26 04:19:32 -08:00
Debanjum Singh Solanky	28090216f6	Show transcription error status in chatInput placeholder on web, desktop - Extract flashing status message in chat input placeholder into reusable function - Use emoji prefixes for status messages - Improve alt text of transcribe button to indicate what the button does	2023-11-26 04:19:32 -08:00
Debanjum Singh Solanky	fc040825b2	Default to Offline chat with Mistral as minimal setup, no API key reqd.	2023-11-26 01:07:20 -08:00
Debanjum Singh Solanky	5a6547677c	Add type of operation variable in latest migration	2023-11-26 00:38:52 -08:00
Debanjum Singh Solanky	3e252036c3	Remove whitespace: pre-line from chat html, since markdown rendering	2023-11-26 00:27:29 -08:00
Debanjum Singh Solanky	b484795b8e	Merge branch 'master' into add-speak-to-chat - Conflicts: - src/interface/desktop/chat.html Combine and use common class names for speak component - src/khoj/database/adapters/__init__.py Combine imports - src/khoj/interface/web/chat.html Combine and use common class names for speak component - src/khoj/routers/api.py Combine imports	2023-11-26 00:26:21 -08:00
sabaimran	6233a957b4	Merge branch 'master' of github.com:khoj-ai/khoj into features/enforce-subscription-status	2023-11-25 22:46:10 -08:00
sabaimran	52b88de7f4	Indicate in the desktop if the user gets rate limited for indexing	2023-11-25 22:31:23 -08:00
Debanjum	e0a59cff68	Delete Conversation History from Web, Desktop, Obsidian Clients (#551) Add delete button to clear conversation history from Web, Desktop and Obsidian Khoj clients Resolves #523	2023-11-25 22:24:12 -08:00
Debanjum Singh Solanky	d0e294d8a5	Clear Conversation History from the Obsidian client - Fix font color for Khoj chat responses in Obsidian. Previous color had too low a contrast to be readable	2023-11-25 22:16:13 -08:00
sabaimran	73e38fccf3	Explicitly set billing to off in the test for being able to index a large set of data	2023-11-25 20:48:32 -08:00
sabaimran	b2afbaa315	Add support for rate limiting the amount of data indexed - Add a dependency on the indexer API endpoint that rounds up the amount of data indexed and uses that to determine whether the next set of data should be processed - Delete any files that are being removed for administering the calculation - Show current amount of data indexed in the config page	2023-11-25 20:28:04 -08:00
Debanjum Singh Solanky	07bf365c7c	Clear any network connections to khoj server via khoj.el on reindex - Ignore errors in deleting network requests to khoj server - Also delete open network connection to khoj server on auto reindex Otherwise when server is unreachable a bunch of failed network connections accrue in the processes list	2023-11-25 20:19:41 -08:00
sabaimran	dd1badae81	Use userwithtoken.user when authenticating with an API key	2023-11-24 22:18:45 -08:00
sabaimran	48b9116195	Fix to use user rather than user_with_token in authenticated credentials	2023-11-24 22:18:00 -08:00
sabaimran	771f9bcfa1	If the user subscription was created over 7 days ago, then their trial is expired	2023-11-24 22:08:32 -08:00
sabaimran	e5b1350523	Enforce API use limits depending on whether the server has billing enabled and whether the given user is subscribed	2023-11-24 21:55:16 -08:00
sabaimran	9c868ee10b	Use the state.billing_enabled field to determine whether to use the subscribed scope	2023-11-24 20:41:19 -08:00
sabaimran	69c8f45830	Use scopes to represent whether the user has a valid subscription in the middleware	2023-11-24 20:29:36 -08:00
Debanjum	25f3f2367e	Handle Server Unavailable Error from Khoj.el (#568) - Make auto-update of content index user configurable from khoj.el - Handle server unavailable error on auto-index schedule job in khoj.el Resolves #567	2023-11-24 16:46:07 -08:00
Debanjum Singh Solanky	138f4e3f3c	Make auto-update of content index user configurable from khoj.el	2023-11-24 16:40:50 -08:00
Debanjum Singh Solanky	0885fc6c23	Handle server unavailable error on auto-index schedule job in khoj.el	2023-11-24 16:39:44 -08:00
sabaimran	c13953311a	Add reflective questions to admin pages	2023-11-23 14:01:05 -08:00
sabaimran	c42ec32a95	Merge pull request #552 from khoj-ai/features/internet-enabled-search Support internet-enabled, online searching using Serper.dev	2023-11-23 12:34:05 -08:00
sabaimran	e3b32e412c	Merge pull request #556 from khoj-ai/features/reflective-suggested-questions Add support for suggesting base questions to users	2023-11-23 11:57:02 -08:00
sabaimran	5fac39afed	Fix PYTHONPATH reference in order to maintain appropriate package imports	2023-11-22 20:35:11 -08:00
sabaimran	c641b8df58	Update desktop package version	2023-11-22 17:54:53 -08:00
sabaimran	a1b2289074	Release Khoj version 1.0.1	2023-11-22 17:52:07 -08:00
sabaimran	e34db979b6	Add instructions for using the self hosted URL in clients	2023-11-22 17:32:43 -08:00
sabaimran	b1b037f0ea	Fix URL configuration issues with reorganized subfolders	2023-11-22 17:03:33 -08:00
sabaimran	e0949e232b	Import random in adapters file for selecting reflective question	2023-11-22 07:52:51 -08:00
sabaimran	256e8de40a	Merge with features/internet-enabled-search	2023-11-22 07:25:24 -08:00
Debanjum Singh Solanky	fd60db766e	Clear Conversation History from the Web Client	2023-11-22 03:35:00 -08:00
Debanjum Singh Solanky	d5a4830761	Clear Conversation History from the Desktop Client	2023-11-22 03:35:00 -08:00
Debanjum Singh Solanky	3096544cf2	Create API endpoint to clear user's chat history	2023-11-22 03:34:59 -08:00
Debanjum Singh Solanky	63675b3299	Speak to Khoj from the Desktop client - Use icons to style speech to text recording state	2023-11-22 02:47:17 -08:00
Debanjum Singh Solanky	2951fc92d7	Speak to Khoj from the Web client - Use icons to style speech to text recording state	2023-11-22 02:47:17 -08:00
Debanjum Singh Solanky	cc77bc4076	Create speech to text API endpoint. Use OpenAI whisper for ASR - Wrap audio transcription in try/catch and delete audio file after processing - Use configured speech to text model, else handle error	2023-11-22 02:47:06 -08:00
Debanjum Singh Solanky	1ca99b6eb0	Add speech to text model configuration to Database	2023-11-22 02:24:31 -08:00
sabaimran	60c23d9e3a	Add online search chat director tests	2023-11-21 23:08:36 -08:00
sabaimran	c652a7fd2d	Move text_to_entries under the new content folder	2023-11-21 22:25:17 -08:00
sabaimran	1e2af083f0	Rename the data_sources module to content	2023-11-21 22:11:32 -08:00
sabaimran	4cb28aeffb	Resolve merge conflicts with master	2023-11-21 22:07:41 -08:00
Debanjum Singh Solanky	4cdfe8fc4f	Re-enable Khoj Obsidian plugin for Mobile, as Khoj cloud is available	2023-11-21 16:33:48 -08:00
Debanjum	5d9d50157e	Clean Logs, Improve Message Rendering and Make Khoj Trusted Host Configurable (#561) - Append chat message to chat logs as TextNodes in web, desktop clients - Simplify Code to Identify Files from Github, Notion on Web, Desktop Client - Use file source to find entries from github, notion on web, desktop client - Pass file source to clients via text search API response - Make Django Logs Follow Khoj Log Format, Verbosity - Handle image search setup related warning - Format Django initializing outputs using Khoj logger format - Use `KHOJ_HOST` env var to set allowed/trusted domains to host Khoj	2023-11-21 15:14:34 -08:00
sabaimran	458e794d00	Revert PYTHONPATH to what it was before	2023-11-21 14:40:57 -08:00
Debanjum Singh Solanky	9e736d4340	Use KHOJ_DOMAIN for CORS allow_origins list as well - Default to app.khoj.dev - Remove unnecessary any_path regex in allow_origins. It only cares about host, paths are not set in origin header	2023-11-21 14:02:04 -08:00
sabaimran	5469e81a87	Use full path for the static directory in FastAPI and reflect deeper nesting of the django app	2023-11-21 13:44:45 -08:00
sabaimran	d199c4c35f	Resolve merge conflicts with master	2023-11-21 13:35:56 -08:00
Debanjum Singh Solanky	76d041f633	Use KHOJ_HOST env var to set allowed/trusted domains to host Khoj Allows hosting Khoj behind other, non "khoj.dev" domains	2023-11-21 13:11:45 -08:00
Debanjum Singh Solanky	90d463c12a	Append chat message to chat logs as TextNodes in web, desktop clients	2023-11-21 13:10:50 -08:00
Debanjum Singh Solanky	befcbcdd5d	Use file source to find entries from github, notion on web, desktop client This is a more robust mechanism of identification than via file name including github or notion domain names	2023-11-21 13:10:50 -08:00
Debanjum Singh Solanky	3f0de45ec6	Pass file source to clients via text search API response Source of entry stored in DB is now passed to clients for processing	2023-11-21 13:10:50 -08:00
Debanjum Singh Solanky	4aec581306	Handle image search setup related warning Ideally should rename model_directory to config_directory or some such but the current image search code will need to be migrated soon. So changing the variable name and creating a migration script for old khoj.yml files using model-directory variable isn't worth it Remove the explicit setting of the number of threads to use by pytorch. Use the default used by it.	2023-11-21 13:10:50 -08:00
Debanjum Singh Solanky	b06628ee31	Format Django initializing outputs using Khoj logger format - Collect STDOUT from the `migrate', `collectstatic' commands and output using the Khoj logger format and verbosity settings - Only show Django `collectstatic' command output in verbose mode - Fix showing the Initializing Khoj log line by moving it after logger level set	2023-11-21 13:10:50 -08:00
Debanjum Singh Solanky	6d9091bef5	Disable isort for now	2023-11-21 13:03:18 -08:00
sabaimran	341abf03ff	Handle none for search_type and use equals comparator rather than in for determining Notion type	2023-11-21 12:55:09 -08:00
Debanjum Singh Solanky	19e042037a	Run isort with black profile to avoid conflicts between the two	2023-11-21 12:52:07 -08:00
sabaimran	2bb989e9d8	Resolve merge conflicts and fix some import ordering	2023-11-21 12:30:43 -08:00
sabaimran	244b76ffed	Add isort for automatic import sorting and skip main.py because it's a drama queen 👑	2023-11-21 12:20:41 -08:00
Debanjum	8a0d92e2d7	Fix Connectivity Check in Obsidian Client (#559) from dtkav/bugfix-local-connectivity-check Check connection to Khoj server for self-hosted server. This check had regressed during the cloud rearchitecture	2023-11-21 12:05:16 -08:00
sabaimran	0e6f09b241	Merge pull request #562 from khoj-ai/fix/pypi-package-app-not-included Fix PyPi package app reference issue	2023-11-21 11:54:46 -08:00
sabaimran	61f6b8c0d4	Ignore-check step failed due to unrecognized code. Try using capital letters for indicator	2023-11-21 11:33:43 -08:00
sabaimran	38144a7a69	pull_request path should be src/khoj rather than src/	2023-11-21 11:33:07 -08:00
Debanjum	e5130fb3f3	Fix ranking search results on Obsidian (#560) This bug was causing the search results on the Obsidian client to be shown in the reverse order of their actual relevance. It reversed since entry scores returned by Khoj server are a distance metric since the move to Postgres. So lesser distance is better. Previously higher score was better.	2023-11-21 11:32:47 -08:00
sabaimran	333cb3445c	Use colon rather than equals to indicate typing	2023-11-21 11:28:51 -08:00
Debanjum Singh Solanky	645fd96634	Search across all content types from Khoj Obsidian client Previously it was only searching for PDF and Markdown files. This was meant to show only content from current vault as results. But it has not scaled well as other clients also allow syncing PDF and markdown files now. So remove this content type filter for now. A proper solution would limit by using file/dir filters on server or client side.	2023-11-21 11:19:33 -08:00
sabaimran	a1460a5bf9	Set operations to typed empty list in migration file	2023-11-21 11:14:40 -08:00
sabaimran	8932fc0c36	Ignore w004 check to bypass pypi warnings for check-wheel-contents - PyPi doesn't like to have files that start with numbers, however all of the generated django migration files start with numbers. To accommodate, skip this check. - Refer to https://pypi.org/project/check-wheel-contents/ for documentation and recommendation	2023-11-21 11:12:50 -08:00
sabaimran	71e794c26f	Remove the sys.append line in the main.py file, as it's not required	2023-11-21 10:57:21 -08:00
sabaimran	a474c31e02	Move the django app into the src/khoj folder for better organization and functionality - Our pypi package currently does not work because the django app and associated database is not included. To remedy this issue, move the app into the src/khoj folder. This has the added benefit of improved organization of the codebase, as all server related code is now in a single folder - Update associated file paths and system references	2023-11-21 10:56:04 -08:00
Debanjum Singh Solanky	c89bd49973	Fix ranking search results on Obsidian It's reversed since score of entries is now a distance metric on Khoj server. So lesser distance is better. Previously higher score was better	2023-11-21 01:24:59 -08:00
Debanjum	6d8e889917	Improve Self Hosted Khoj Setup (#557) - `c07401cf` Fix, Improve chat config via CLI on first run by using defaults - `d61b0dd5` Add Khoj Django app package to sys path to load Django module via pip install - `4e98acbc` Update minimum pydantic version to one with model_validate function	2023-11-20 17:25:53 -08:00
Daniel Grossmann-Kavanagh	f142999bce	fix khoj local server usage	2023-11-20 17:07:30 -08:00
Debanjum Singh Solanky	c07401cf76	Fix, Improve chat config via CLI on first run by using defaults - Fix setting prompt size for online chat - generally improve chat config via cli by using default chat model, prompt size for online and offline chat	2023-11-20 17:01:20 -08:00
sabaimran	b142de15a8	Merge branch 'features/internet-enabled-search' of github.com:khoj-ai/khoj into features/reflective-suggested-questions	2023-11-20 15:56:09 -08:00
sabaimran	a9623ef85a	Add requisite imports in order to instantiate offline model in adapters file	2023-11-20 15:27:42 -08:00
sabaimran	a8f13f334f	Fix merging issues with base after popping the stash	2023-11-20 15:22:50 -08:00
sabaimran	8fa0b69c67	Resolve merge issue with adapters methods	2023-11-20 15:21:06 -08:00
sabaimran	fee99779bf	Add subqueries for internet-connected search results and update client-side code accordingly - Add a wrapper method to help make direct queries to the LLM and determine any intermediate responses needed for handling the request	2023-11-20 15:19:15 -08:00
Debanjum Singh Solanky	d61b0dd55c	Add Khoj Django app package to sys path to load Django module via pip install	2023-11-20 14:55:00 -08:00
Debanjum Singh Solanky	4e98acbca7	Update minimum pydantic version to one with model_validate function	2023-11-20 14:52:37 -08:00
sabaimran	b8e6883a81	Merge branch 'master' of github.com:khoj-ai/khoj into features/internet-enabled-search	2023-11-19 16:20:08 -08:00
sabaimran	237195e20e	Make all name-related fields nullable within the GoogleUser	2023-11-19 14:22:32 -08:00
sabaimran	4def8cce36	Merge pull request #541 from asim-shrestha/patch-1 Add test separators	2023-11-19 14:14:34 -08:00
Debanjum	71799add0b	Index Parent Headings of Org-Mode Entries to Improve Search Context (#548) ### Overview The parent hierarchy of org-mode entries can store important context. This change updates OrgNode to track parent headings for each org entry and adds the parent outline for each entry to the index ### Details - Test search uses ancestor headings as context for improved results - Add ancestor headings of each org-mode entry to their compiled form - Track ancestor headings for each org-mode entry in org-node parser Resolves #85	2023-11-19 13:18:19 -08:00
sabaimran	e398a76779	Fix test word filter	2023-11-19 13:14:58 -08:00
sabaimran	33a9304428	Resolve merge conflicts	2023-11-19 12:57:55 -08:00
sabaimran	cfd76b8472	Add open graph links to configure Khoj Docs preview	2023-11-19 12:16:59 -08:00
sabaimran	ef5e9d66c1	Resolve merge conflicts in dependency imports	2023-11-19 11:42:20 -08:00
Debanjum Singh Solanky	c3465d6982	Release Khoj version 1.0.0	2023-11-19 09:50:25 -08:00
Debanjum	736744be3a	Update documentation to reflect new multi-user config scenario (#550) - Update docs to show how to use Khoj Cloud - Move self-hosting Khoj to separate section - Add page to setup Desktop app - Set default URL to Khoj Cloud URL in Obsidian, Emacs clients	2023-11-18 18:22:46 -08:00
Debanjum Singh Solanky	d0e84385f2	Simplify links in Khoj docs to use page_name.md with no prefixes This allows jumping to page via VSCode IDE and on docs website	2023-11-18 18:17:46 -08:00
Debanjum Singh Solanky	fc65d8a9fe	Add documentation page for the Khoj Desktop client	2023-11-18 18:17:35 -08:00
Debanjum Singh Solanky	35b469e488	Simplify setup, features since Khoj cloud in docs - No Khoj server setup required to start using Khoj from Obsidian, Emacs - Use tabs for install, upgrade in Emacs with different package managers - Use default subtitles in Khoj Docs - Deduplicate query filters, remove backend setup instructions in plugin pages - Remove stale Setup demo on Khoj Obsidian plugin docs	2023-11-18 17:25:52 -08:00
Debanjum Singh Solanky	e1bf1f0e86	Update default Khoj server URL to Khoj cloud on Emacs, Obsidian clients	2023-11-18 16:25:45 -08:00
Debanjum Singh Solanky	8775ce730a	Use URL fragments to allow jumping to config page sections on Web app	2023-11-18 16:25:45 -08:00
sabaimran	a5613cb08a	Merge pull request #554 from khoj-ai/fix/issues-with-prod-chat Fix misc. issues with chat configuration	2023-11-18 14:45:06 -08:00
sabaimran	f792b1e301	Remove already defined identical function	2023-11-18 14:08:50 -08:00
sabaimran	e2fff5dc47	Don't explicitly use value to get the model type value	2023-11-18 14:01:01 -08:00
sabaimran	a8a25ceac2	Honor user's chat settings when running the extract questions phase - Add marginally better error handling when GPT gives a messed up response to the extract questions method - Remove debug log lines	2023-11-18 13:31:51 -08:00
sabaimran	67156e6aec	Add new logs for debugging issues with chat references	2023-11-18 12:10:50 -08:00
sabaimran	5de2ab6098	Change parse_obj calls to use model_validate per new pydantic specification	2023-11-18 12:10:36 -08:00
sabaimran	ebdb423d3e	Merge pull request #553 from khoj-ai/features/validation-errors Update types of base config models for pydantic 2.0	2023-11-18 00:42:56 -08:00
sabaimran	6d249645a6	Fix interpretation of the default search type	2023-11-18 00:04:18 -08:00
sabaimran	f180b2ba94	Resolve mypy errors for various data types	2023-11-17 23:26:15 -08:00
sabaimran	3328a41f08	Update types of base config models for pydantic 2.0	2023-11-17 23:08:52 -08:00
sabaimran	f688529150	Update the default configuration for the AppConfig	2023-11-17 19:26:31 -08:00
sabaimran	11ccb92755	Fix formatting of welcome message to use markdown	2023-11-17 18:55:59 -08:00
Debanjum Singh Solanky	ca87b4ede9	Wrap common API query parameters into shared class to deduplicate code - Upgrade FastAPI to >= latest version. Required upgrade of FastAPI. Earlier version didn't support wrapping common query params in class - Use per fixture app instead of a global FastAPI app in conftest - Upgrade minimum required Django version - Fix no notes chat director test with updated no notes message No notes message was updated in commit `118f1143`	2023-11-17 18:43:49 -08:00
sabaimran	262f3ccb59	Resolve mypy issues with formatting	2023-11-17 17:11:00 -08:00
sabaimran	a7e00898cb	Fix rendering even when no online context references are returned	2023-11-17 16:41:28 -08:00
sabaimran	0fcf234f07	Add support for using serper.dev for online queries - Use the knowledgeGraph, answerBox, peopleAlsoAsk and organic responses of serper.dev to provide online context for queries made with the /online command - Add it as an additional tool for doing Google searches - Render the results appropriately in the chat web window - Pass appropriate reference data down to the LLM	2023-11-17 16:19:11 -08:00
Debanjum Singh Solanky	33ad9b8e64	Update text search test since indexing ancestor hierarchy added	2023-11-17 15:26:55 -08:00
Debanjum Singh Solanky	55785d50c3	Use title, when present, as root ancestor of entries instead of file path	2023-11-17 15:03:27 -08:00
sabaimran	bfbe273ffd	Add some styling to the copy button for programmatic output	2023-11-17 12:18:35 -08:00
sabaimran	9ddf3b58c3	Use the markdown parser for rendering the chat messages in the web interface	2023-11-17 12:14:02 -08:00
sabaimran	a0b12b001a	Provide in-line rendering when output matches certain views	2023-11-17 11:04:36 -08:00
sabaimran	ec06d2c446	Move data indexer files into a separate folder under processor. Update assoc UTs	2023-11-16 17:19:55 -08:00
Debanjum Singh Solanky	68ac1e0193	Automate Desktop app builds on new release or push to master branch	2023-11-16 16:09:03 -08:00
sabaimran	45a42faec8	Make adjectives more positive for api token generation	2023-11-16 15:55:35 -08:00
sabaimran	3934633947	Update references to all documentation to reflect instructions for managed service - By default assume the audience of this website is people looking to understand the feature offering of Khoj, and then people who are looking to self-host	2023-11-16 15:26:03 -08:00
sabaimran	7688228b9c	Update docs to reflect new setup processes and instructions based on rearchitecture - Most important updates include the dependency requirement to setup Postgres when running/setting Khoj up locally - Add instructions for Docker - Shift to recommend desktop client and update instructions for how to configure Khoj for user	2023-11-16 12:56:42 -08:00
sabaimran	118f1143ff	When user tries using the notes slash command without having any data indexed	2023-11-16 12:52:39 -08:00
sabaimran	e8a13f0813	Add multi-user support to Khoj and use Postgres for backend storage (#549) - Adds support for multiple users to be connected to the same Khoj instance using their Google login credentials - Moves storage solution from in-memory json data to a Postgres db. This stores all relevant information, including accounts, embeddings, chat history, server side chat configuration - Adds the concept of a Khoj server admin for configuring instance-wide settings regarding search model, and chat configuration - Miscellaneous updates and fixes to the UX, including chat references, colors, and an updated config page - Adds billing to allow users to subscribe to the cloud service easily - Adds a separate GitHub action for building the dockerized production (tag `prod`) and dev (tag `dev`) images, separate from the image used for local building. The production image uses `gunicorn` with multiple workers to run the server. - Updates all clients (Obsidian, Emacs, Desktop) to follow the client/server architecture. The server no longer reads from the file system at all; it only accepts data via the indexer API. In line with that, removes the functionality to configure org, markdown, plaintext, or other file-specific settings in the server. Only leaves GitHub and Notion for server-side configuration. - Changes license to GNU AGPLv3 Resolves #467 Resolves #488 Resolves #303 Resolves #345 Resolves #195 Resolves #280 Resolves #461 Closes #259 Resolves #351 Resolves #301 Resolves #296	2023-11-16 11:48:01 -08:00
sabaimran	1466aef554	Change license to GNU AGPLv3 from GNU GPLv3 - This enforces that upstream consumers of this code should open source their software for any network-distributed services	2023-11-16 11:14:06 -08:00
sabaimran	36d200580b	Use a different name for the production-config containers	2023-11-16 10:28:28 -08:00
sabaimran	ba633c4015	Only build the production docker image when pushing to master	2023-11-16 09:24:57 -08:00
Debanjum Singh Solanky	ddb07def0d	Test search uses ancestor headings as context for improved results - Update test data to add deeper outline hierarchy for testing hierarchy as context - Update collateral tests that need count of entries updated, deleted asserts to be updated	2023-11-16 03:05:19 -08:00
Debanjum Singh Solanky	74403e3536	Add ancestor headings of each org-mode entry to their compiled form Resolves #85	2023-11-16 02:54:41 -08:00
Debanjum Singh Solanky	305c25ae1a	Track ancestor headings for each org-mode entry in org-node parser	2023-11-16 02:39:14 -08:00
Debanjum	208ddddc6a	Make Search Model Configurable on Server (#544) - Make search model configurable on server - Update migration script to get search model from `khoj.yml` to Postgres - Update first run message on Khoj Desktop and Web app landing page - Other miscellaneous bug fixes	2023-11-16 00:11:58 -08:00
Debanjum Singh Solanky	cc05013715	Update first run message on Web app with Chat models setup instructions - Link to Django admin panel for user to create Chat Models on their Khoj server - This should only get hit when user is not using Khoj cloud, as Khoj cloud would already have Chat models configured	2023-11-15 22:44:24 -08:00
Debanjum Singh Solanky	6c1693b8f4	Update first run message on Desktop app with API token setup instructions - Open Web app settings in the default browser via link click - Open Desktop app settings via link click	2023-11-15 22:44:11 -08:00
Debanjum Singh Solanky	922983bd53	Set max cos distance to 0.18. Test search API query with max distance	2023-11-15 20:26:21 -08:00
Debanjum Singh Solanky	18dbad5edb	Use Sigmoid to normalize cross-encoder score between 0-1 - While sigmoid normalization isn't required for reranking. Normalizing score to distance metrics for both encoder and cross encoder scores is useful to reason about them - Softmax wasn't required as don't need probabilities, sigmoid is good enough to get distance metric	2023-11-15 19:31:59 -08:00
sabaimran	0da4db4310	Merge pull request #547 from khoj-ai/features/fix-api-token-generator Update the return type of the API token generator	2023-11-15 19:23:18 -08:00
sabaimran	ea144de438	Merge with master	2023-11-15 18:34:46 -08:00
sabaimran	6b17aeb32d	Resolve merge conflicts in auth.py with remove KhojApiUser import	2023-11-15 17:32:53 -08:00
Debanjum Singh Solanky	348cc0cf0e	Use better name for DB adapter func to create user by Google token	2023-11-15 17:31:50 -08:00
Debanjum Singh Solanky	08a057bdd5	Rename SearchModel to SearchModelConfig DB model, Require Cross-Encoder	2023-11-15 17:31:50 -08:00
Debanjum Singh Solanky	0679b2a7bd	Use embeddings model store from state in text to entries No need to instantiate it separately. In all other places we're using the embeddings model store in global state anyway	2023-11-15 17:31:50 -08:00
sabaimran	f88a5867b4	Allow dockerize step to run for prod from PR temporarily	2023-11-15 17:31:50 -08:00
sabaimran	245a9cbf63	Fix return type of the update_or_create method	2023-11-15 17:31:50 -08:00
sabaimran	10be8dfad9	Rename dockerize dev action to be more accurate	2023-11-15 17:31:50 -08:00
sabaimran	70f5d0ed3c	Add a dev workflow for GitHub actions, change the production workflow to only kick off when pushed to master	2023-11-15 17:31:50 -08:00
sabaimran	bbae7dd83c	Update logic for creating a new user to use aupdate_or_create	2023-11-15 17:31:50 -08:00
sabaimran	154de8c629	Update format for return type of the generate token method	2023-11-15 17:31:12 -08:00
sabaimran	cf74fa4a70	Allow dockerize step to run for prod from PR temporarily	2023-11-15 17:04:48 -08:00
sabaimran	8e62af77b9	Update format for return type of the generate token method	2023-11-15 17:03:01 -08:00
sabaimran	4a487aff23	Fix return type of the update_or_create method	2023-11-15 14:35:42 -08:00
sabaimran	992e54c218	Rename dockerize dev action to be more accurate	2023-11-15 14:09:28 -08:00
sabaimran	99f5a6082e	Add a dev workflow for GitHub actions, change the production workflow to only kick off when pushed to master	2023-11-15 14:07:25 -08:00
sabaimran	b63856ecb4	Update logic for creating a new user to use aupdate_or_create	2023-11-15 12:50:39 -08:00
sabaimran	b8e7488a95	Use a more permissive distance filter for search results from notes	2023-11-15 11:13:47 -08:00
sabaimran	d06b2cf24b	Downgrade pyproject.toml to avert dependency conflict	2023-11-15 10:47:54 -08:00
sabaimran	05b7542115	Remove config lock from the state	2023-11-15 10:44:45 -08:00
sabaimran	ecd005cac0	Check if search model is already in DB before creating a new one	2023-11-15 10:41:35 -08:00
Debanjum Singh Solanky	9c6e7bdea2	Upgrade server, desktop app dependencies to resolve CVE bugs	2023-11-15 01:47:53 -08:00
Debanjum Singh Solanky	5a6ab9cc85	Fix failing client tests	2023-11-15 00:17:44 -08:00
Debanjum Singh Solanky	8f200cf53f	Remove unused parameter from configure_search_type method	2023-11-14 19:09:35 -08:00
Debanjum Singh Solanky	f8e5e118e1	Only create KhojUser on login if doesn't already exist	2023-11-14 19:09:35 -08:00
Debanjum Singh Solanky	3d8d6145f2	Add search model config from khoj.yml to Postgres DB via migration script	2023-11-14 19:09:35 -08:00
Debanjum Singh Solanky	4af194d74b	Make search model configurable on server - Expose ability to modify search model via Django admin interface - Previously the bi_encoder and cross_encoder models to use were set in code - Now it's user configurable but with a default config generated by default	2023-11-14 19:09:35 -08:00
Debanjum	b734984d6d	Fix, Improve Khoj with multi-user, db support for Khoj Cloud Release (#539 ) ### Overview Prepare Khoj with multi-user, db support for Khoj Cloud release ### Details - Add first run experience to configure Khoj via khoj CLI - Improve Web app settings page: Move files data into content section card. Move content index update button(s) to content section - Improve OpenAI chat prompts - Push more general information for OpenAI models into system prompt - Make it more aware of its current capabilities - Weaken asking follow-up questions - Rate-limit calls to the chat API - Add back search results quality threshold - Normalize quality score definitions across cross_encoder, encoder to distance metric - Remove reference to deprecated button - Await result of the search query - Fixed Langchain issue by allowing the Docker image to rebuild with a later package version	2023-11-14 16:55:34 -08:00
Debanjum Singh Solanky	e98141f4c3	Subscribe default user to standard plan with a far away renewal date Self hosted users in anonymous mode have all capabilities unlocked	2023-11-14 16:31:39 -08:00
Debanjum Singh Solanky	9d30fda26d	Deduplicate, improve name of prompt templates for GPT4All chat models - Do not pass unused rerank_results parameter to text_search.query method	2023-11-14 16:31:09 -08:00
Debanjum Singh Solanky	795ec9eb55	Add KHOJ_ prefix to server admin credentials environment variables	2023-11-14 16:13:13 -08:00
sabaimran	ee005de662	Rename django files URL to server instead of django	2023-11-14 12:36:38 -08:00
sabaimran	75e5a6b6de	Remove all the example mounted volumes as they're no longer required in the new architecture	2023-11-14 12:31:24 -08:00
sabaimran	20ce3d0c78	Update default docker compose configuration with Khoj local mode	2023-11-14 12:21:26 -08:00
sabaimran	8c36079f74	Add a first run experience to initialize the admin user if none exists and setup chat models	2023-11-13 21:07:12 -08:00
Debanjum Singh Solanky	e9adb58c16	Rate limit calls to the /chat API per user, per day/minute	2023-11-13 19:41:46 -08:00
Debanjum Singh Solanky	33a8eb0470	Log when new user is created	2023-11-13 19:37:24 -08:00
sabaimran	603f838115	Block input text field when waiting for chat response	2023-11-11 17:14:37 -08:00
Asim Shrestha	0bfc094e18	Add test separators	2023-11-11 17:08:58 -08:00
Debanjum Singh Solanky	9c321ac070	Fix cross encoder to use softmax to convert it to a distance metric	2023-11-11 16:12:24 -08:00
sabaimran	8a824167cf	Merge branch 'fix/imports-and-references' of github.com:khoj-ai/khoj into fix/imports-and-references	2023-11-11 12:59:31 -08:00
sabaimran	fa428932a8	Update URL for downloading the desktop application	2023-11-11 12:59:15 -08:00
Debanjum Singh Solanky	941c7f23a3	Only get text search results above confidence threshold via API - During the migration, the confidence score stopped being used. It was being passed down from API to some point and went unused - Remove score thresholding for images as image search confidence score different from text search model distance score - Default score threshold of 0.15 is experimentally determined by manually looking at search results vs distance for a few queries - Use distance instead of confidence as metric for search result quality Previously we'd moved text search to a distance metric from a confidence score. Now convert even cross encoder, image search scores to distance metric for consistent results sorting	2023-11-11 04:11:33 -08:00
Debanjum Singh Solanky	e44e6df221	Reduce data dumped in console log from web, desktop app	2023-11-11 02:05:07 -08:00
Debanjum Singh Solanky	f044a89d50	Show status in Save, Reinitialize button of config page on web app - Show non-transient error message in status element if action fails - On success, just show temporary success message within button	2023-11-11 02:04:58 -08:00
Debanjum Singh Solanky	f17d9da36c	Move Configure, Reinitialize buttons into the Content section on Web app Remove the Results Count button from the web app. It's hanging weirdly with not much context to its purpose. Reintroduce it in the Search card when created under the Features section	2023-11-11 02:01:39 -08:00
Debanjum Singh Solanky	325cb0f7fb	Show message in Save button of Github, Notion config save in web app Show the success, failure message only temporarily. Previously it stuck around after clicking save until page refresh	2023-11-11 02:01:39 -08:00
Debanjum Singh Solanky	b34d4fa741	Save config, update index on save of Github, Notion config in web app Reduce user confusion by joining config update with index updation for each content type. So only a single click required to configure any content type instead of two clicks on two separate pages	2023-11-11 00:33:49 -08:00
Debanjum Singh Solanky	c4364b9100	Weaken asking follow-up qs and q&a mode in notes prompt to OpenAI models - Notes prompt doesn't need to be so tuned to question answering. User could just want to talk about life. The notes need to be used to respond to those, not necessarily only retrieve answers from notes - System and notes prompts were forcing asking follow-up questions a little too much. Reduce strength of follow-up question asking	2023-11-10 23:36:43 -08:00
Debanjum Singh Solanky	cba371678d	Stop OpenAI chat from emitting reference notes directly in chat body The Chat models sometimes output reference notes directly in the chat body in unformatted form, specifically as Notes:\n['. Prevent that. Reference notes are shown in clean, formatted form anyway	2023-11-10 23:36:43 -08:00
Debanjum Singh Solanky	8585976f37	Revert "Use notes in system prompt, rather than in the user message" This reverts commit `e695b9ab8c`.	2023-11-10 23:36:43 -08:00
Debanjum Singh Solanky	b6441683c6	Increase reference text on 1st expansion to 3 lines and 140 characters	2023-11-10 23:36:43 -08:00
sabaimran	55c97241b5	Merge branch 'fix/imports-and-references' of github.com:khoj-ai/khoj into fix/imports-and-references	2023-11-10 22:38:34 -08:00
sabaimran	e2e96f9aa4	Add default settings to let new users be subscribed on trial - Add the default user to a subscription trial - Update associated unit tests	2023-11-10 22:38:28 -08:00
Debanjum Singh Solanky	501e7606a0	Increase reference text on 1st expansion to 3 lines and 140 characters	2023-11-10 21:27:04 -08:00
sabaimran	0a950d9382	Fix checker to determine if obsidian client is connected	2023-11-10 19:21:58 -08:00
sabaimran	c736604366	Merge with remote	2023-11-10 17:50:15 -08:00
sabaimran	b0b07bde6c	Allow chat reference to expand enough to show the whole reference, rather than constraining the height	2023-11-10 17:49:20 -08:00
sabaimran	14f8c151c8	Fix return type of the generate_chat_response method	2023-11-10 17:48:54 -08:00
Debanjum Singh Solanky	45b8670c25	Fix return type hint for generate_chat_response func	2023-11-10 17:34:19 -08:00
Debanjum Singh Solanky	c9c0ba67c6	Fix chat_client configurations for OpenAI chat director tests	2023-11-10 17:29:23 -08:00
Debanjum Singh Solanky	9b6c5ddba4	Update action row padding in cards on config page of web app	2023-11-10 16:53:25 -08:00
sabaimran	54d4fd0e08	Add chat_model data for logging selected models to telemetry	2023-11-10 16:46:34 -08:00
sabaimran	e695b9ab8c	Use notes in system prompt, rather than in the user message	2023-11-10 15:09:33 -08:00
sabaimran	cec932d88a	Update prompt so that GPT is more context aware with its capabilities	2023-11-10 14:37:11 -08:00
sabaimran	262a8574d1	Add a test to verify that a user without data successfully returns a response to the /search endpoint	2023-11-10 14:00:58 -08:00
sabaimran	e62788ad79	Await result for determining if user has entries	2023-11-10 13:51:56 -08:00
sabaimran	1a56344f12	Remove the old syncData reference as it no longer exists	2023-11-10 10:10:07 -08:00
Debanjum	a348f1a6ab	Reduce Desktop App UX Save, Sync Confusion (#538 ) - Show next sync time to make users aware that data sync is automated - Keep a single Save button to reduce confusion. It does what Save All previously did. Intent to manual sync should Save All - Default to using app.khoj.dev as default Khoj URL to ease Cloud sync setup - Add detailed chat intro message, mention download desktop app for docs sync - Only show search in web app nav pane if user has documents indexed - Hide download desktop app message in web app if synced files exist - Mark generated profile pic with subscription circle in web app	2023-11-10 00:57:45 -08:00
Debanjum Singh Solanky	39ad1c6ce6	Release Khoj version 0.14.0 Fix Khoj subtitle in manifest of Khoj Obsidian plugin	2023-11-10 00:28:33 -08:00
Debanjum Singh Solanky	745d6bfeed	Add detailed intro message, mention download desktop app for docs sync	2023-11-10 00:20:28 -08:00
Debanjum Singh Solanky	6eb7df717c	Only show search in web app nav pane if user has documents indexed	2023-11-09 19:14:54 -08:00
Debanjum Singh Solanky	c0789dc57b	Use email to get_user_subscription from DB and other DB adapters - Needing user subscription requires chaining function - Simplify get_file_sources DB adapter	2023-11-09 19:09:57 -08:00
Debanjum Singh Solanky	841ed95521	Move active user profile halo check into nav pane macro on web app	2023-11-09 18:05:19 -08:00
Debanjum Singh Solanky	ddac693762	Hide download desktop app message in web app if synced files exist	2023-11-09 17:47:00 -08:00
Debanjum Singh Solanky	30a9674f25	Mark generated profile pic with subscription circle in web app	2023-11-09 15:22:38 -08:00
Debanjum Singh Solanky	d6e6ed1cfa	Keep single Save button, Show next sync, default to prod Khoj URL in Desktop app - Make mutable syncing variable not a const - Show next sync time to make users aware that data sync is automated - Keep a single Save button to reduce confusion. It does what Save All previously did. Intent to manual sync should Save All - Default to using app.khoj.dev as default Khoj URL to ease setup	2023-11-09 14:04:58 -08:00
Debanjum Singh Solanky	e1f0128576	Change config migration script to update to 0.15.0 version Next release, 0.14.0 wouldn't contain the migration to Postgres	2023-11-09 12:21:58 -08:00
Debanjum Singh Solanky	17cbbb0b01	Use Consistent Environment Variable for KHOJ_DEBUG	2023-11-09 11:01:28 -08:00
Debanjum Singh Solanky	391db80499	Improve subscribed user profile pictures and nav pane selection - Add yellow halo around subscribed user profile - Fix highlighting current page in header nav pane	2023-11-09 00:57:05 -08:00
Debanjum Singh Solanky	605058c72a	Allow null user profile picture from Google OAuth in DB - Fix width of generated profile picture generated for user - Ignore unused Stripe webhook events	2023-11-09 00:46:59 -08:00
Debanjum	1d3bdf8fdb	Create Billing integration. Improve Settings pages on Desktop, Web apps (#537 ) ### Major - Expose Billing via Stripe on Khoj Web app for Khoj Cloud subscription - Expose card on web app config page to manage subscription to Khoj cloud - Create API webhook, endpoints for subscription payments using Stripe - Put Computer files to index into Card under Content section - Show file type icons for each indexed file in config card of web app - Enable deleting all indexed desktop files from Khoj via Desktop app - Create config page on web app to manage computer files indexed by Khoj - Track data source (computer, github, notion) of each entry - Update content by source via API. Make web client use this API for config - Store the data source of each entry in database ### Cleanup - Set content enabled status on update via config buttons on web app - Delete deprecated content config pages for local files from web client - Rename Sync button, Force Sync toggle to Save, Save All buttons ### Fixes - Prevent Desktop app triggering multiple simultaneous syncs to server - Upgrade langchain version since adding support for OCR-ing PDFs - Bubble up content indexing errors to notify user on client apps	2023-11-08 19:55:35 -08:00
Debanjum Singh Solanky	a2609973b8	Disable Subscription if Stripe environment not setup Deduplicate DJANGO_SECRET_KEY and KHOJ_DJANGO_SECRET_KEY to latter name as prefixed with KHOJ as KHOJ app specific	2023-11-08 19:39:32 -08:00
Debanjum Singh Solanky	09e1235832	Auto update billing card UI on (re/un-)subscribe click on web app Previously required a page load to see the updated billing state after clicking resubscribe or unsubscribe buttons	2023-11-08 18:38:12 -08:00
Debanjum Singh Solanky	8b8bb15866	Keep sync state in memory, initialized to false in Desktop app Prevent deadlock if desktop app killed in middle of syncing	2023-11-08 18:03:08 -08:00
Debanjum Singh Solanky	c043eb54ae	Use typed entry source instead of raw str to map source to conf in api.py	2023-11-08 18:03:08 -08:00
Debanjum Singh Solanky	8178004e6d	Move Subscription data into separate table in DB. Merge migrations	2023-11-08 18:03:08 -08:00
Debanjum Singh Solanky	3bb10128ef	Move subscription API to separate, independent router	2023-11-08 16:20:27 -08:00
Debanjum Singh Solanky	ec1395d072	Clean, merge subscription update events, API and functions - Reduce webhook triggers for subscription updates - Merge subscription update API endpoint, functions for (re/un-)subscribe	2023-11-08 15:55:20 -08:00
Debanjum Singh Solanky	ef5c13f968	Keep user subscription state. Update it when user has unsubscribed	2023-11-08 12:08:36 -08:00
Debanjum Singh Solanky	c52affc6d9	Get Khoj Cloud Subscription URL via environment variable	2023-11-08 12:07:53 -08:00
sabaimran	609d358b1a	Use sql datetime comparison for detecting validity of subscription renewal date - Update the unsubscribe endpoint to use query params - Use subscription id to process unsubscribe endpoint, rather than the customer id	2023-11-07 19:17:36 -08:00
sabaimran	98cf095b65	Fix bug for rendering chat references in LLM response	2023-11-07 16:44:41 -08:00
sabaimran	0e1cdb6536	Add additional error handling for processing unknown Stripe events and fix typo in STRIPE_SIGNING env variable	2023-11-07 16:43:05 -08:00
sabaimran	08c86927cb	Merge branch 'features/multi-user-support-khoj' of github.com:khoj-ai/khoj into fix-improve-config-page-on-desktop-and-web-app	2023-11-07 12:46:49 -08:00
sabaimran	cec54e3a8a	Merge pull request #536 from khoj-ai/features/update-chat-ui Update the chat UI to have richer representation of the references	2023-11-07 12:34:57 -08:00
Debanjum Singh Solanky	f466751f4d	Expose card on web app config page to manage subscription to Khoj cloud	2023-11-07 10:21:00 -08:00
Debanjum Singh Solanky	9aaf475c8a	Create API webhook, endpoints for subscription payments using Stripe - Add fields to mark users as subscribed to a specific plan and subscription renewal date in DB - Add ability to unsubscribe a user using their email address - Expose webhook for stripe to callback confirming payment	2023-11-07 10:20:51 -08:00
Debanjum Singh Solanky	156421d30a	Show file type icons for each indexed file in config card of web app	2023-11-07 05:48:44 -08:00
Debanjum Singh Solanky	045c2252d6	Set content enabled status on update via config buttons on web app Previously hitting configure or disable wouldn't update the state of the content cards. It needed page refresh to see if the content was synced correctly. Now cards automatically get set to new state on hitting disable button on card or global configure buttons	2023-11-07 05:28:13 -08:00
Debanjum Singh Solanky	7c424e0d5f	Enable deleting all indexed desktop files from Khoj via Desktop app	2023-11-07 05:28:13 -08:00
Debanjum Singh Solanky	779fa531a5	Prevent Desktop app triggering multiple simultaneous syncs to server Lock syncing to server if a sync is already in progress. While the sync save button gets disabled while sync is in progress, the background sync job can still trigger a sync in parallel. This sync lock prevents that	2023-11-07 05:28:13 -08:00
Debanjum Singh Solanky	404d47f1a1	Bubble up content indexing errors to notify user on client apps	2023-11-07 05:28:13 -08:00
Debanjum Singh Solanky	6e957584ac	Create config page on web app to manage computer files indexed by Khoj Remove the table of all files indexed by Khoj. This seems overkill and doesn't match the UI semantics of the other data sources like Github, Notion. Create instead a data source card for computer files with the same update, disable semantics of the Github and Notion data source cards Users can disable each data source from its card on the main config page. They can see/delete individual files indexed from the computer data source once they click into the computer files data source card on the config page	2023-11-07 04:42:53 -08:00
Debanjum Singh Solanky	d527b644f4	Update content by source via API. Make web client use this API for config	2023-11-07 03:41:19 -08:00
Debanjum Singh Solanky	9ab327a2b6	Store the data source of each entry in database This will be useful for updating, deleting entries by their data source. Data source can be one of Computer, Github or Notion for now Store each file/entries source in database	2023-11-07 02:18:48 -08:00
Debanjum Singh Solanky	c82cd0862a	Delete deprecated content config pages for local files from web client The desktop app now manages syncing local computer files to index The server only manages "cloud" data source like github and notion.	2023-11-06 23:55:37 -08:00
Debanjum Singh Solanky	9f47fc8e34	Upgrade langchain version since adding support for OCR-ing PDFs	2023-11-06 21:58:33 -08:00
Debanjum Singh Solanky	97cf8339aa	Rename Sync button, Force Sync toggle to Save, Save All buttons	2023-11-06 21:57:37 -08:00
Debanjum Singh Solanky	a08b152358	Improve log messages in text_entries and memory leak unit test	2023-11-06 19:27:31 -08:00
sabaimran	6c8689e4ae	Update corresponding chat UX in the desktop client as well	2023-11-06 16:18:41 -08:00
sabaimran	e01ecf1419	/s/references/reference to fix bug of jumping references	2023-11-06 16:12:25 -08:00
Debanjum	38f24a037d	Improve Indexing Text Entries (#535 ) Major - Ensure search results logic consistent across migration to DB, multi-user - Manually verified search results for sample queries look the same across migration - Flatten indexing code for better indexing progress tracking and code readability Minor - `a4f407f` Test memory leak on MPS device when generating vector embeddings - `ef24485` Improve Khoj with DB setup instructions in the Django app readme (for now) - `f212cc7` Arrange remaining text search tests in arrange, act, assert order - `022017d` Fix text search tests to test updated indexing log messages	2023-11-06 16:01:53 -08:00
sabaimran	270f7b3eb3	Update the chat UI to have richer representation of the references	2023-11-05 15:46:43 -08:00
sabaimran	81a615d7dd	Merge pull request #534 from khoj-ai/features/code-config-cleanup Small fixes and update config UI to manage indexed data	2023-11-05 15:45:45 -08:00
sabaimran	8ebb12820c	Add OCR runtime dependencies to prod Dockerfile as well	2023-11-05 15:40:05 -08:00
sabaimran	d697d752c2	Use repeat rather than manually specify auto in grid-template-rows Co-authored-by: Debanjum <debanjum@gmail.com>	2023-11-05 15:23:42 -08:00
sabaimran	3d6e8d53fe	Try adding dependencies for libgl in order to run OCR in github action unit tests	2023-11-05 15:09:40 -08:00
sabaimran	5f1e37fff0	Adjust indentation for css property	2023-11-05 14:33:23 -08:00
sabaimran	fdd727712f	Rename test files from x_to_jsonl to x_to_entries	2023-11-05 14:33:07 -08:00
Debanjum Singh Solanky	a4f407f595	Test memory leak on MPS device when generating vector embeddings Slope threshold of 2.0 determined qualitatively on local Mac device Minor unused import and clean-up	2023-11-05 03:48:54 -08:00
Debanjum Singh Solanky	ef24485ada	Improve Khoj with DB setup instructions in the Django app readme (for now)	2023-11-05 02:04:52 -08:00
Debanjum Singh Solanky	f212cc7174	Arrange remaining text search tests in arrange, act, assert order	2023-11-05 02:04:52 -08:00
Debanjum Singh Solanky	022017dd0f	Fix text search tests to test updated indexing log messages	2023-11-05 02:04:52 -08:00
sabaimran	084a8becc5	Fix bug to prevent default in chat trigger	2023-11-04 20:13:33 -07:00
Debanjum Singh Solanky	5489e98b9c	Do not index org heading entries by default This is to maintain the previous default behavior	2023-11-04 20:09:25 -07:00
Debanjum Singh Solanky	34b5a86d1d	Use SentenceTransformer to disable progress bar when encoding query The Langchain HuggingFaceEmbeddings wrapper doesn't support disabling the progress bar, especially not for queries alone while keeping it for documents. This makes the logs noisy with an encoding progress bar for each incremental query No features of the Langchain wrapper for SentenceTransformer were being used anyway for now, and we can always switch back to it if required	2023-11-04 20:09:25 -07:00
Debanjum Singh Solanky	dc9946fc03	Flatten nested loops, improve progress reporting in text_to_jsonl indexer Flatten the nested loops to improve visibility into indexing progress Reduce spurious logs, report the logs at aggregated level and update the logging description text to improve indexing progress reporting	2023-11-04 20:09:25 -07:00
sabaimran	88eeee3f4b	Move try/catch for import one line later	2023-11-04 19:46:47 -07:00
sabaimran	dbaa892665	Flip catching modulenotfound to import error exception	2023-11-04 19:34:10 -07:00
sabaimran	8c3d5a49da	Add try/except around image extraction step	2023-11-04 19:27:18 -07:00
sabaimran	fdfab39942	Update the config UI to show all files indexed with option to delete - Given the separation of the client and server now, the web UI will no longer support configuration of local file paths of data to index - Expose a way to show all the files that are currently set for indexing, along with an option to delete all or specific files	2023-11-04 19:03:34 -07:00
sabaimran	800bb4f458	Remove references to demo - The demo setting is no longer necessary for the time being, as we won't have any more demo instances	2023-11-04 17:17:04 -07:00
sabaimran	b5972e9311	Use OCR to extract image text in PDFs	2023-11-04 17:15:28 -07:00
sabaimran	d1d210605e	Merge branch 'features/multi-user-support-khoj' of github.com:khoj-ai/khoj into features/multi-user-support-khoj	2023-11-04 14:29:34 -07:00
sabaimran	3678aa5614	Add tests to validate expected behaviors in the multi-user scenario	2023-11-04 14:29:30 -07:00
Debanjum	12b5ef6540	Improve Theming of Web, Desktop and Obsidian Client App (#532 ) - Update theme for Desktop, Web and Obsidian client apps to use lighter colors - Show splash screen on starting Desktop app - Make chat the landing page on Desktop and Web clients - Simplify style of login page on Web app - Add About page for Desktop app accessible from system tray menu	2023-11-04 12:29:56 -07:00
Debanjum Singh Solanky	8273bf26b7	Fix multi-line chat input and output render on web, desktop clients - Remove spurious whitespace in chat input box on page load being added because text area element was ending on newline - Do not insert newline in message when sending message by hitting enter key This would be more evident when sending message with cursor in the middle of the sentence, as a newline would be inserted at the cursor point - Remove chat message separator tokens from model output. Model sometimes starts to output text in its chat format	2023-11-04 01:09:35 -07:00
Debanjum Singh Solanky	2f1756cc15	Do not use icon for each file, folder to index in desktop app. Other minor fixes based on PR feedback	2023-11-04 00:13:10 -07:00
Debanjum Singh Solanky	e8f568d79c	Make splash screen wider, opaque and fix its spinner radius Radius should be such that final spin doesn't extend out of the circle Opaque background improves contrast for better visual	2023-11-03 23:59:21 -07:00
Debanjum Singh Solanky	3ef05f4803	Use css var for main font color in search, chat page of desktop app	2023-11-03 23:59:21 -07:00
Debanjum Singh Solanky	a19cbde2d7	Add About page for Khoj to Desktop app. Expose it via system tray - Pass current khoj version from package.json to about page via electron IPC between backend js and frontend page - Update Khoj information in default About screen as well, in case it's exposed anywhere else	2023-11-03 23:59:21 -07:00
Debanjum Singh Solanky	a327294ee9	Rename khoj.js to utils.js in web and desktop client apps	2023-11-03 18:13:37 -07:00
Debanjum Singh Solanky	db57eeaefe	Console log a welcome message on loading Desktop client	2023-11-03 05:15:41 -07:00
Debanjum Singh Solanky	6fae6fb2a4	Merge branch 'features/multi-user-support-khoj' into improve-client-app-theming	2023-11-03 04:58:41 -07:00
Debanjum Singh Solanky	4cd76311ad	Slow down spinning at end of splash sequence. Make animation bigger	2023-11-03 04:28:17 -07:00
Debanjum Singh Solanky	34661c33a2	Show splash screen on starting desktop app	2023-11-03 03:19:08 -07:00
Debanjum Singh Solanky	126d3f4563	Render each file, folder to index row with icon in desktop app Make the file, folders to index look less like an editable field	2023-11-03 02:48:42 -07:00
Debanjum Singh Solanky	80ae132cad	Update Desktop, Obsidian client color theme to lighter yellow - Update background color to a different shade of white - Make primary and primary hover colors less intense and more aligned with lantern flame shade - Add water, leaf, flower color variables	2023-11-03 02:48:42 -07:00
sabaimran	fb6ebd19fc	Fix refactor bugs, CSRF token issues for use in production (#531 ) Fix refactor bugs, CSRF token issues for use in production * Add flags for samesite settings to enable django admin login * Include tzdata to dependencies to work around python package issues in linux * Use DJANGO_DEBUG flag correctly * Fix naming of entry field when creating EntryDate objects * Correctly retrieve openai config settings * Fix datefilter with embeddings name for field	2023-11-02 23:02:38 -07:00
Debanjum Singh Solanky	345856e7be	Merge branch 'master' of github.com:khoj-ai/khoj into features/multi-user-support-khoj Merge changes to use latest GPT4All with GPU, GGUF model support into khoj multi-user support rearchitecture branch	2023-11-02 22:44:25 -07:00
Debanjum Singh Solanky	041074ccd6	Make chat the landing page for the desktop app Chat, unlike search, doesn't need knowledge base indexing setup. So you can get started with chat much faster.	2023-11-02 20:42:21 -07:00
Debanjum Singh Solanky	3801105b2a	Make chat the landing page for the web app Chat, unlike search, doesn't need knowledge base indexing setup. So you can get started with chat much faster.	2023-11-02 20:42:21 -07:00
Debanjum Singh Solanky	0d4e7d46c2	Fix color and size of profile picture circle in nav pane	2023-11-02 20:42:21 -07:00
Debanjum Singh Solanky	4fbe8ac6b1	Console log a welcome message on loading web client	2023-11-02 20:42:21 -07:00
Debanjum Singh Solanky	9fc6c97139	Use Khoj standard font family, weight in web client settings page	2023-11-02 20:42:21 -07:00
Debanjum Singh Solanky	b6f07099cd	Simplify login page styling on web client - Center all elements: icon, text and button - Use khoj icon not logo-text - Simplify login title text	2023-11-02 20:42:21 -07:00
Debanjum Singh Solanky	7b7f6d3bc8	Update web client theme to lighter colors - Update background color to a different shade of white - Make primary and primary hover colors less intense and more aligned with lantern flame shade - Add water, leaf, flower color variables	2023-11-02 20:42:21 -07:00
sabaimran	fe860aaf83	Merge branch 'features/multi-user-support-khoj' of github.com:khoj-ai/khoj into features/multi-user-support-khoj	2023-11-02 14:56:01 -07:00
sabaimran	2c9496bcf1	Add additional null checks in the migrate_server_pg script	2023-11-02 14:55:58 -07:00
sabaimran	20df0f5330	Use url_path_for for creating the login page URL in the application	2023-11-02 14:55:14 -07:00
sabaimran	fd11b78552	Fix migration script error when openai not available (#530 )	2023-11-02 11:28:08 -07:00
sabaimran	fe6720fa06	[Multi-User Part 8]: Make conversation processor settings server-wide (#529 ) - Rather than having each individual user configure their conversation settings, allow the server admin to configure the OpenAI API key or offline model once, and let all the users re-use that code. - To configure the settings, the admin should go to the `django/admin` page and configure the relevant chat settings. To create an admin, run `python3 src/manage.py createsuperuser` and enter in the details. For simplicity, the email and username should match. - Remove deprecated/unnecessary endpoints and views for configuring per-user chat settings	2023-11-02 10:43:27 -07:00
Debanjum	0fb81189ca	[Multi-User Part 7]: Improve Sign-In UX & Rename DB Models for Readability (#528 ) ### ✨ New - Create profile pic drop-down menu in navigation pane Put settings page, logout action under drop-down menu ### ⚙️ Fix - Add Key icon for API keys table on Web Client's settings page ### 🧪 Improve - Rename `TextEmbeddings` to `TextEntries` for improved readability - Rename `Db.Models` `Embeddings`, `EmbeddingsAdapter` to `Entry`, `EntryAdapter` - Show truncated API key for identification & restrict table width for config page responsiveness	2023-11-01 18:05:20 -07:00
Debanjum Singh Solanky	12b3eeae9e	Use Khoj fonts on config page of web and desktop apps too Previously pico.css font-families were being selected for the config page. This was different from the fonts used by index.html, chat.html This improves spacing issue of heading further	2023-11-01 17:50:50 -07:00
Debanjum Singh Solanky	022d695309	Switch to narrow view below width of 700px on web client This makes the dropdown menu align better to the profile picture in mobile view	2023-11-01 17:49:44 -07:00
Debanjum Singh Solanky	6a0adfbfbb	Default to profile picture with Initial if user has no profile picture	2023-11-01 17:49:44 -07:00
Tuan Nguyen	354605e73e	Autofocus to chat input when opening chat (#524 )	2023-11-01 16:09:45 -07:00
Debanjum Singh Solanky	d92a2d03a7	Rename Files, Classes from X_To_JSONL to more appropriate X_To_Entries These content processors are converting content into entries in DB instead of entries in JSONL file	2023-11-01 14:51:33 -07:00
Debanjum Singh Solanky	2ad2055bcb	Remove user null check in API controllers that require authentication	2023-11-01 14:38:19 -07:00
Debanjum Singh Solanky	7ac5a4766d	Match spacing of navigation header pane in config vs search/chat pages	2023-11-01 14:38:19 -07:00
Debanjum Singh Solanky	2e3a4a6a9b	Use Jinja macro to deduplicate navigation header HTML	2023-11-01 14:38:12 -07:00
Debanjum Singh Solanky	c631b61a81	Put colors shared by index, chat html into khoj css global variables	2023-11-01 02:13:24 -07:00
Debanjum Singh Solanky	f585a71744	Put logout, settings under dropdown menu with logged in user's profile picture - Create dropdown menu. Put settings page, logout action under it - Make user's profile picture the dropdown menu heading - Create khoj.js to store shared js across web client It currently stores the dropdown menu open, close functionality - Put shared styling for khoj dropdown menu under khoj.css	2023-11-01 02:13:24 -07:00
Debanjum Singh Solanky	58a7171911	Show truncated API key for identification & restrict table width - Use a function to generate API Key table row HTML, to dedup logic - Show delete, copy icon hints on hover - Reduce length of copied message to not expand table width - Truncating API key helps keep the API key table width within width of smaller width displays	2023-10-31 23:10:26 -07:00
Debanjum Singh Solanky	9cebd7f856	Add emoji icons to Search, Chat, Settings items in nav menu of Web client Emoji icons have already been added to the Search, Chat and Settings top navigation menu in the desktop client. This change adds these to the web client as well	2023-10-31 22:38:44 -07:00
Debanjum Singh Solanky	f77336ba61	Add key icon for API keys table in Web client config page	2023-10-31 19:01:09 -07:00
Debanjum Singh Solanky	87e6b1eab9	Rename TextEmbeddings to TextEntries for improved readability Improves readability as name has closer match to underlying constructs	2023-10-31 18:55:59 -07:00
Debanjum Singh Solanky	bcbee05a9e	Rename DbModels Embeddings, EmbeddingsAdapter to Entry, EntryAdapter Improves readability as name has closer match to underlying constructs - Entry is any atomic item indexed by Khoj. This can be an org-mode entry, a markdown section, a PDF or Notion page etc. - Embeddings are semantic vectors generated by the search ML model that encodes for meaning contained in an entry's text. - An "Entry" contains "Embeddings" vectors but also other metadata about the entry like filename etc.	2023-10-31 18:50:54 -07:00
sabaimran	54a387326c	[Multi-User Part 6]: Address small bugs and upstream PR comments (#518 ) - `08654163cb`: Add better parsing for XML files - `f3acfac7fb`: Add a try/catch around the dateparser in order to avoid internal server errors in app - `7d43cd62c0`: Chunk embeddings generation in order to avoid large memory load - `e02d751eb3`: Addresses comments from PR #498 - `a3f393edb4`: Addresses comments from PR #503 - `66eb078286`: Addresses comments from PR #511 - Address various items in https://github.com/khoj-ai/khoj/issues/527	2023-10-31 17:59:53 -07:00
sabaimran	5f3f6b7c61	[Multi-User Part 5]: Add a production Docker file and use a gunicorn configuration with it (#514 ) - Add a productionized setup for the Khoj server using `gunicorn` with multiple workers for handling requests - Add a new Dockerfile meant for production config at `ghcr.io/khoj-ai/khoj:prod`; the existing Docker config should remain the same	2023-10-26 13:15:31 -07:00
Debanjum	9acc722f7f	[Multi-User Part 4]: Authenticate using API Tokens (#513 ) ### ✨ New - Use API keys to authenticate from Desktop, Obsidian, Emacs clients - Create API, UI on web app config page to CRUD API Keys - Create user API keys table and functions to CRUD them in Database ### 🧪 Improve - Default to better search model, [gte-small](https://huggingface.co/thenlper/gte-small), to improve search quality - Only load chat model to GPU if enough space, throw error on load failure - Show encoding progress, truncate headings to max chars supported - Add instruction to create db in Django DB setup Readme ### ⚙️ Fix - Fix error handling when configure offline chat via Web UI - Do not warn in anon mode about Google OAuth env vars not being set - Fix path to load static files when server started from project root	2023-10-26 12:33:03 -07:00
sabaimran	4b6ec248a6	[Multi-User Part 3]: Separate chat sessions based on authenticated users (#511 ) - Add a data model which allows us to store Conversations with users. This does a minimal lift over the current setup, where the underlying data is stored in a JSON file. This maintains parity with that configuration. - There does _seem_ to be some regression in chat quality, which is most likely attributable to search results. This will help us with #275. It should become much easier to maintain multiple Conversations in a given table in the backend now. We will have to do some thinking on the UI.	2023-10-26 11:37:41 -07:00
sabaimran	a8a82d274a	[Multi-User Part 2]: Add login pages and gate access to application behind login wall (#503 ) - Make most routes conditional on authentication if anonymous mode is not enabled. If anonymous mode is enabled, it scaffolds a default user and uses that for all application interactions. - Add a basic login page and add routes for redirecting the user if logged in	2023-10-26 10:17:29 -07:00
sabaimran	216acf545f	[Multi-User Part 1]: Enable storage of settings for plaintext files based on user account (#498 ) - Partition configuration for indexing local data based on user accounts - Store indexed data in an underlying postgres db using the `pgvector` extension - Add migrations for all relevant user data and embeddings generation. Very little performance optimization has been done for the lookup time - Apply filters using SQL queries - Start removing many server-level configuration settings - Configure GitHub test actions to run during any PR. Update the test action to run in a containerized environment with a DB. - Update the Docker image and docker-compose.yml to work with the new application design	2023-10-26 09:42:29 -07:00
Debanjum Singh Solanky	9677eae791	Expose CLI flag to disable using GPU for offline chat model - Offline chat models output gibberish when loaded onto some GPUs. GPU support with Vulkan in GPT4All seems a bit buggy - This change mitigates the upstream issue by allowing user to manually disable using GPU for offline chat Closes #516	2023-10-25 17:51:46 -07:00
Debanjum Singh Solanky	5bb14a05a0	Update system requirements in docs for offline chat models	2023-10-22 19:04:23 -07:00
Debanjum Singh Solanky	0f1ebcae18	Upgrade to latest GPT4All. Use Mistral as default offline chat model GPT4all now supports gguf llama.cpp chat models. Latest GPT4All (+mistral) performs at least 3x faster. On Macbook Pro at ~10s response start time vs 30s-120s earlier. Mistral is also a better chat model, although it hallucinates more than llama-2	2023-10-22 19:04:23 -07:00
sabaimran	6dc0df3afb	Pin pytorch version to 2.0.1 in order to avoid exit code 139 in Docker container (#512 )	2023-10-20 14:10:21 -07:00
sabaimran	963cd165eb	Resolve merge conflicts	2023-10-19 14:39:05 -07:00
Simon Butler	e3f8a95784	Update emacs.md (#510 ) Minor correction for emacs-lisp in minimal install	2023-10-19 12:28:08 -07:00
Debanjum	d93395ae48	Set >=6Gb RAM required for offline chat Llama v2 7B with 4bit quantization technically needs ~3.5Gb RAM (7B * 0.5byte), practically a system with 6Gb of RAM should suffice	2023-10-18 12:05:54 -07:00
Debanjum Singh Solanky	8346e1193c	Release Khoj version 0.13.0	2023-10-18 03:43:54 -07:00
Debanjum Singh Solanky	6631fc38db	Delete plaintext config via API. Catch any offline model loading exception	2023-10-18 03:37:45 -07:00
Debanjum Singh Solanky	53abd1a506	Mark sync completed on desktop client, even when no files to send Previously Sync spinner on desktop config screen would hang when no files to send to server & the Sync button had been manually triggered	2023-10-18 01:30:56 -07:00
Debanjum Singh Solanky	71b0012e8c	Set offline chat config to default value if unset on server load	2023-10-18 00:59:43 -07:00
Debanjum Singh Solanky	cf1cdc3fe1	Disambiguate input_filter variable names in fs_syncer functions	2023-10-17 23:32:10 -07:00
Debanjum Singh Solanky	e3cd8b4150	Only index files returned by input-filter globs in fs_syncer Ignore .org, .pdf etc. suffixed directories under `input-filter' from being evaluated as files. Explicitly filter results by input-filter globs to only index files, not directory for each text type Add test to prevent regression Closes #448	2023-10-17 23:32:10 -07:00
Debanjum Singh Solanky	51363d280d	Do not configure khoj server for pull based indexing from khoj.el Do not make khoj server pull update index on Obsidian plugin load. Index is updated on push from plugin instead now.	2023-10-17 21:47:19 -07:00
Debanjum Singh Solanky	d9d133dfb9	Read text files as utf-8, instead of default os locale On Windows, the default locale isn't utf8. Khoj had regressed to reading files in OS specified locale encoding, e.g cp1252, cp949 etc. It now explicitly uses utf8 encoding to read text files for indexing Resolves #495, resolves #472	2023-10-17 21:47:19 -07:00
Debanjum	3d4576ae38	Fix encoding binary files for sync from the Desktop, Obsidian client (#506 ) - Fix encoding binary files like PDFs for sync from Desktop client - Fix encoding binary files like PDFs for sync from Obsidian client	2023-10-17 15:37:22 -07:00
Debanjum Singh Solanky	c8293998d9	Fix encoding binary files like PDFs for sync from Obsidian client Use readBinary to read binary files like PDFs instead of read	2023-10-17 15:08:30 -07:00
sabaimran	ba60c869c9	Fix encoding binary files like PDFs for sync from Desktop client Use readFileSync, Buffer to pass appropriately formatted binary data	2023-10-17 15:08:23 -07:00
Andrew Spott	3d7381446d	Changed globbing. Now doesn't clobber a user's glob if they want to a… (#496 ) * Changed globbing. Now doesn't clobber a user's glob if they want to add it, but will (if just given a directory), add a recursive glob. Note: python's glob engine doesn't support `{}` globbing, a future option is to warn if that is included. * Fix typo in globformat variable * Use older glob pattern for plaintext files --------- Co-authored-by: Saba <narmiabas@gmail.com>	2023-10-17 11:26:06 -07:00
sabaimran	2646c8554d	Provide a default value to offline_chat configuration of the conversation processor	2023-10-17 10:35:22 -07:00
Debanjum Singh Solanky	b8976426eb	Update offline chat model config schema used by Emacs, Obsidian clients The server uses a new schema for the conversation config. The Emacs, Obsidian clients need to use this schema to update the conversation config	2023-10-17 07:01:35 -07:00
Debanjum	ecc6fbfeb2	Push Files to Index from Emacs, Obsidian & Desktop Clients using Multi-Part Forms (#499 ) ### Overview - Add ability to push data to index from the Emacs, Obsidian client - Switch to standard mechanism of syncing files via HTTP multi-part/form. Previously we were streaming the data as JSON - Benefits of new mechanism - No manual parsing of files to send or receive on clients or server is required as most have in-built mechanisms to send multi-part/form requests - The whole response is not required to be kept in memory to parse content as JSON. As individual files arrive they're automatically pushed to disk to conserve memory if required - Binary files don't need to be encoded on client and decoded on server ### Code Details #### Major - Use multi-part form to receive files to index on server - Use multi-part form to send files to index on desktop client - Send files to index on server from the khoj.el emacs client - Send content for indexing on server at a regular interval from khoj.el - Send files to index on server from the khoj obsidian client - Update tests to test multi-part/form method of pushing files to index #### Minor - Put indexer API endpoint under /api path segment - Explicitly make GET request to /config/data from khoj.el:khoj-server-configure method - Improve emoji, message on content index updated via logger - Don't call khoj server on khoj.el load, only once khoj invoked explicitly by user - Improve indexing of binary files - Let fs_syncer pass PDF files directly as binary before indexing - Use encoding of each file set in indexer request to read file - Add CORS policy to khoj server. Allow requests from khoj apps, obsidian & localhost - Update indexer API endpoint URL to `index/update` from `indexer/batch` Resolves #471 #243	2023-10-17 06:05:15 -07:00
Debanjum Singh Solanky	7b1c62ba53	Mark test_get_configured_types_via_api unit test as flaky It passes locally on running individually but fails when run in parallel on local or CI	2023-10-17 05:56:00 -07:00
Debanjum Singh Solanky	6a4f1b2188	Add more client, request details in logs by index/update API endpoint	2023-10-17 05:43:29 -07:00
Debanjum Singh Solanky	5efae1ad55	Update indexer API endpoint query params for force, content type New URL query params, `force' and `t' match name of query parameter in existing Khoj API endpoints Update Desktop, Obsidian and Emacs client to call using these new API query params. Set `client' query param from each client for telemetry visibility	2023-10-17 04:58:13 -07:00
Debanjum Singh Solanky	84654ffc5d	Update indexer API endpoint URL to index/update from indexer/batch New URL follows action oriented endpoint naming convention used for other Khoj API endpoints Update desktop, obsidian and emacs client to call this new API endpoint	2023-10-17 04:58:13 -07:00
Debanjum Singh Solanky	e347823ff4	Log telemetry for index updates via push to API endpoint	2023-10-17 04:58:13 -07:00
Debanjum Singh Solanky	05be6bd877	Clicking Update Index in Obsidian settings should push files to index Use the indexer/batch API endpoint to regenerate content index rather than the previous pull based content indexing API endpoint	2023-10-17 04:58:13 -07:00
Debanjum Singh Solanky	13a3122bf3	Stop configuring server to pull files to index from Obsidian client Obsidian client now pushes vault files to index instead	2023-10-17 04:58:13 -07:00
Debanjum Singh Solanky	99a2c934a3	Add CORS policy to allow requests from khoj apps, obsidian & localhost Using fetch from Khoj Obsidian plugin was failing due to cross-origin request and method: no-cors didn't allow passing x-api-key custom header. And using Obsidian's request with multi-part/form-data wasn't possible either.	2023-10-17 04:58:13 -07:00
Debanjum Singh Solanky	541cd59a49	Let fs_syncer pass PDF files directly as binary before indexing No need to do unneeded base64 encoding/decoding to pass pdf contents for indexing from fs_syncer to pdf_to_jsonl	2023-10-17 04:58:13 -07:00
Debanjum Singh Solanky	d27dc71dfe	Use encoding of each file set in indexer request to read file Get encoding type from multi-part/form-request body for each file Read text files as utf-8 and pdfs, images as binary	2023-10-17 04:58:12 -07:00
Debanjum Singh Solanky	8e627a5809	Pass any files to be deleted to indexer API via Khoj Obsidian plugin - Keep state of previously synced files to identify files to be deleted - Last synced files stored in settings for persistence of this data across Obsidian reboots	2023-10-17 03:34:49 -07:00
Debanjum Singh Solanky	f2e293a149	Push Vault files to index to Khoj server using Khoj Obsidian plugin Use the multi-part/form-data request to sync Markdown, PDF files in vault to index on khoj server Run scheduled job to push updates to vault for indexing every 1 hour	2023-10-17 03:05:30 -07:00
Debanjum Singh Solanky	6baaaaf91a	Test request body of multi-part form to update content index from khoj.el	2023-10-16 23:54:32 -07:00
Debanjum Singh Solanky	79b3f8273a	Make khoj.el send files to be deleted from index to server	2023-10-16 23:53:02 -07:00
Debanjum Singh Solanky	5dc399b32e	Document system requirements to run offline chat Closes #375	2023-10-16 19:39:06 -07:00
Debanjum Singh Solanky	f64fa06e22	Initialize the Khoj Transient menu on first run instead of load This prevents Khoj from polling the Khoj server until explicitly invoked via `khoj' entrypoint function. Previously it'd make a request to the khoj server every time Emacs or khoj.el was loaded Closes #243	2023-10-16 19:11:46 -07:00
Debanjum	b4949f7f0b	Improve Offline Chat Model Experience (#494 ) - Make offline chat model user configurable. Use `filename` of any [GPT4All supported model](https://github.com/nomic-ai/gpt4all/blob/main/gpt4all-chat/metadata/models.json) like below: - Run GPT4All Chat Model on GPU, when available via [GPT4All Vulkan support](https://blog.nomic.ai/posts/gpt4all-gpu-inference-with-vulkan) - Use default Llama 2 supported by GPT4All - Make `tokenizer` and `max-prompt-size` of chat model user configurable. E.g When using chat models not in [this pre-defined list](https://github.com/khoj-ai/khoj/blob/master/src/khoj/processor/conversation/utils.py) that support larger context window or a different tokenizer. Closes #406, #418	2023-10-16 17:44:49 -07:00
Debanjum Singh Solanky	644c3b787f	Scale no. of chat history messages to use as context with max_prompt_size Previously lookback turns was set to a static 2. But now that we support more chat models, their prompt size vary considerably. Make lookback_turns proportional to max_prompt_size. The truncate_messages can remove messages if they exceed max_prompt_size later This lets Khoj pass more of the chat history as context for models with larger context window	2023-10-16 17:22:28 -07:00
Debanjum Singh Solanky	90e1d9e3d6	Pin gpt4all to 1.0.12 as next version will introduce breaking changes	2023-10-16 10:57:16 -07:00
Debanjum Singh Solanky	1a9023d396	Update Chat Actor test to not incept with prior world knowledge	2023-10-15 17:22:44 -07:00
Debanjum Singh Solanky	df1d74a879	Use max_prompt_size, tokenizer from config for chat model context stuffing	2023-10-15 16:52:53 -07:00
Debanjum Singh Solanky	116595b351	Use chat_model specified in new offline_chat section of config - Dedupe offline_chat_model variable. Only reference offline chat model stored under offline_chat. Delete the previous chat_model field under GPT4AllProcessorConfig - Set offline chat model to use via config/offline_chat API endpoint	2023-10-15 16:37:49 -07:00
Debanjum Singh Solanky	feb4f17e3d	Update chat config schema. Make max_prompt, chat tokenizer configurable This provides flexibility to use non 1st party supported chat models - Create migration script to update khoj.yml config - Put `enable_offline_chat' under new `offline-chat' section Referring code needs to be updated to accommodate this change - Move `offline_chat_model' to `chat-model' under new `offline-chat' section - Put chat `tokenizer` under new `offline-chat' section - Put `max_prompt' under existing `conversation' section As `max_prompt' size affects both openai and offline chat models	2023-10-15 16:35:11 -07:00
sabaimran	c125995d94	[Multi-User]: Part 0 - Add support for logging in with Google (#487 ) * Add concept of user authentication to the request session via GoogleUser	2023-10-14 19:39:13 -07:00
Debanjum Singh Solanky	247e75595c	Use AutoTokenizer to support more tokenizers	2023-10-14 16:54:52 -07:00
Saba	ff2dbadc9d	Use computed plaintext_content to set file content rather than calling f.read again	2023-10-14 13:28:34 -07:00
Debanjum Singh Solanky	1ad8b150e8	Add default tokenizer, max_prompt as fallback for non-default offline chat models Pass user configured chat model as argument to use by converse_offline The proper fix for this would allow users to configure the max_prompt and tokenizer to use (while supplying default ones, if none provided) For now, this is a reasonable start.	2023-10-13 22:48:56 -07:00
Debanjum Singh Solanky	56bd69d5af	Improve Llama v2 extract questions actor and associated prompt - Format extract questions prompt format with newlines and whitespaces - Make llama v2 extract questions prompt consistent - Remove empty questions extracted by offline extract_questions actor - Update implicit qs extraction unit test for offline search actor	2023-10-13 22:48:56 -07:00
sabaimran	09bb3686cc	Strip the incoming query from the slash conversation command (#500 ) * Strip the incoming query from the slash conversation command before passing it to the model or for search * Return q when content index not loaded * Remove -n 4 from pytest ini configuration to isolate test failures	2023-10-13 21:11:23 -07:00
Debanjum Singh Solanky	96c0b21285	Sync desktop app package.json with other Khoj clients metadata - Make `bump_version.sh' script set version for the Khoj desktop app too - Sync Khoj desktop app authors, license, description and version with the other interfaces and server - Update description in packages metadata to match project subtitle on Github	2023-10-13 20:43:55 -07:00
sabaimran	80fb56b8a5	Sync desktop app package version with the other releases	2023-10-13 19:23:00 -07:00
Debanjum Singh Solanky	b669aa2395	Clean and fix the content indexing code in the Emacs client - Pass payloads as unibyte. This was causing the request to fail for files with unicode characters - Suppress messages with file content in on index updates - Fix rendering response from server on index update API call - Extract code to populate body of index update HTTP request with files	2023-10-13 18:00:37 -07:00
Debanjum Singh Solanky	bea196aa30	Explicitly make GET request to /config/data from khoj.el:khoj-server-configure method Previously global state of `url-request-method' would affect the kind of request made to api/config/data API endpoint as it wasn't being explicitly set before calling the API endpoint This was done with the assumption that the default value of GET for url-request-method wouldn't change globally But in some cases, experientially, it can get changed. This was resulting in khoj.el load failing as POST request was being made instead which would throw error	2023-10-12 20:58:52 -07:00
Debanjum Singh Solanky	292f0420ad	Send content for indexing on server at a regular interval from khoj.el - Allow indexing frequency to be configurable by user - Ensure there is only one khoj indexing timer running	2023-10-12 20:58:52 -07:00
Debanjum Singh Solanky	bed3aff059	Update tests to test multi-part/form method of pushing files to index Instead of using the previous method to push data as json payload of POST request pass it as files to upload via the multi-part/form to the batch indexer API endpoint	2023-10-12 20:58:52 -07:00
Debanjum Singh Solanky	fc99431754	Send files to index on server from the khoj.el emacs client - Add elisp variable to set API key to engage with the Khoj server - Use multi-part form to POST the files to index to the indexer API endpoint on the khoj server	2023-10-12 20:58:52 -07:00
Debanjum Singh Solanky	68018ef397	Use multi-part form to send files to index on desktop client - Add typing for variables in for loop and other minor formatting clean-up - Assume utf8 encoding for text files and binary for image, pdf files	2023-10-12 20:58:49 -07:00
Debanjum Singh Solanky	7190b3811d	Remove all filter terms in user query from defiltered_query Previously only the last filter's terms were getting effectively applied as the `filter.defilter' operation was being done on `user_query' but was updating the `defiltered_query'	2023-10-12 20:56:17 -07:00
Debanjum Singh Solanky	72f8fde7ef	Run pytests in parallel on multiple CPU cores using pytest-xdist for speed	2023-10-12 20:56:17 -07:00
Debanjum Singh Solanky	60e9a61647	Use multi-part form to receive files to index on server - This uses existing HTTP affordance to process files - Better handling of binary file formats as removes need to url encode/decode - Less memory utilization than streaming json as files get automatically written to disk once memory utilization exceeds preset limits - No manual parsing of raw files streams required	2023-10-11 23:58:23 -07:00
Debanjum Singh Solanky	9ba173bc2d	Improve emoji, message on content index updated via logger Use mailbox closed with flag down once content index completed. Use standard, existing logger messages in new indexer messages, when files to index sent by clients	2023-10-11 17:12:03 -07:00
Debanjum Singh Solanky	6aa69da3ef	Put indexer API endpoint under /api path segment Update FastAPI app router and desktop app to use new url path to batch indexer API endpoint All api endpoints should exist under /api path segment	2023-10-09 21:35:58 -07:00
Debanjum Singh Solanky	148e8f468f	Restrict openai package version below 1.0.0 to avoid breaking changes	2023-10-09 19:30:58 -07:00
Debanjum Singh Solanky	f6f7a62d80	Wait for user to stop typing to trigger search from khoj.el in Emacs - Improves user experience by aligning idle time with search latency to avoid display jitter (to render results) while user is typing - Makes the idle time configurable Closes #480	2023-10-06 12:44:45 -07:00
sabaimran	5c4f0d42b7	Return new default config in API endpoint	2023-10-06 12:30:09 -07:00
sabaimran	052b25af0a	Update default configuration passed to Khoj clients to circumvent validation issues	2023-10-06 12:29:15 -07:00
Debanjum Singh Solanky	a85ff941ca	Make offline chat model user configurable Only GPT4All supported Llama v2 models will work given the prompt structure is not currently configurable	2023-10-04 20:41:14 -07:00
Debanjum Singh Solanky	d1ff812021	Run GPT4All Chat Model on GPU, when available GPT4All now supports running models on GPU via Vulkan	2023-10-04 18:42:12 -07:00
Debanjum Singh Solanky	13b16a4364	Use default Llama 2 supported by GPT4All Remove custom logic to download custom Llama 2 model. This was added as GPT4All didn't support Llama 2 when it was added to Khoj	2023-10-03 19:01:54 -07:00
sabaimran	4a5ed7f06c	Update Khoj package version for Electron, Desktop app (#492 ) * Address package upgrade for Electron application * Update package version for Electron desktop application	2023-10-03 12:21:32 -07:00
sabaimran	3f962a55c3	Fix Linux Desktop Application (#491 ) * Use separate functions for adding files and folders to configuration for indexing * Add a loading bar while data is syncing * Bump the minor version for the application	2023-10-03 11:43:19 -07:00
sabaimran	63b3696af0	Release Khoj version 0.12.3	2023-09-26 22:41:11 -07:00
sabaimran	d2f9bca1cf	Fix null ref issue in query method and update logic for determining whether khoj is already configured in obsidian	2023-09-26 22:33:44 -07:00
sabaimran	2f18383349	Release Khoj version 0.12.2	2023-09-26 11:59:47 -07:00
sabaimran	588f35b6e9	Add max prompt size for gpt-3.5-turbo-16k	2023-09-26 10:57:35 -07:00
sabaimran	99f9c3f8e2	Update setup instructions	2023-09-26 09:40:36 -07:00
sabaimran	4e370d7a18	Release Khoj version 0.12.1	2023-09-26 09:24:53 -07:00
sabaimran	3675aa348a	Update naming of Khoj in manifest.json for Obsidian	2023-09-26 09:24:36 -07:00
sabaimran	4b6d8af218	Update metadata in manifest.json	2023-09-26 09:19:56 -07:00
sabaimran	a82d1becc3	Release Khoj version 0.12.0	2023-09-26 09:17:56 -07:00
sabaimran	38f0df3d53	Remove unused icons from electron app folder	2023-09-26 07:56:29 -07:00
sabaimran	29a64be939	Deprecate desktop build instructions from old setup	2023-09-25 22:02:02 -07:00
sabaimran	99995b2497	Add basic instructions for setting up the Khoj desktop interface	2023-09-25 21:08:14 -07:00
sabaimran	5e16074b92	Fix comparison for search type in plugins mode	2023-09-25 10:57:17 -07:00
sabaimran	efe5e09c3a	Use jammy for docker base image due to dependency issue with arm64 image	2023-09-18 15:38:18 -07:00
sabaimran	6df728c445	Move bash command in Dockerfile into single line	2023-09-18 15:13:11 -07:00
sabaimran	96a9fa07f0	Fix conf test setup for offline chat	2023-09-18 15:05:15 -07:00
sabaimran	2dd15e9f63	Resolve issues with GPT4All and fix prompt for yesterday extract questions date filter (#483 ) - GPT4All integration had ceased working with 0.1.7 specification. Update to use 1.0.12. At a later date, we should also use first party support for llama v2 via gpt4all - Update the system prompt for the extract_questions flow to add start and end date to the yesterday date filter example. - Update all setup data in conftest.py to use new client-server indexing pattern	2023-09-18 14:41:26 -07:00
sabaimran	8141be97f6	Update date filter test to use compiled rather than raw key	2023-09-18 11:24:56 -07:00
sabaimran	b225d1188c	Fix formatting of gpt.py	2023-09-18 11:09:02 -07:00
Jonny-GM	34b202b868	More lenient date searching (#481 ) * Modify DateFilter to use compiled entry key * Instruct search to include date in query * Minor prompt change * Prompt fix	2023-09-18 10:46:00 -07:00
sabaimran	16874e1953	Provide force fallback for regeneration	2023-09-12 16:35:07 -07:00
sabaimran	9f42a1a036	Propagate flags to configure index command	2023-09-11 10:33:44 -07:00
sabaimran	343854752c	Improve docker builds for local hosting (#476 ) * Remove GPT4All dependency in pyproject.toml and use multiplatform builds in the dockerization setup in GH actions * Move configure_search method into indexer * Add conditional installation for gpt4all * Add hint to go to localhost:42110 in the docs. Addresses #477	2023-09-08 17:07:26 -07:00
sabaimran	dccfae3853	Remove PySide dependency and deprecate desktop builds (#475 ) * Remove PySide, gui option from code * Remove pyside 6 dependency from code * Remove workflows which build desktop applications * Update unit tests and update line in documentation * Remove additional references to pyinstaller, gui * Add uninstall steps to normal uninstall instructions	2023-09-07 11:36:27 -07:00
sabaimran	76562f4250	Add front-end Electron application for Khoj local file syncing (#473 ) * Initial version - setup a file-push architecture for generating embeddings with Khoj * Use state.host and state.port for configuring the URL for the indexer * Fix parsing of PDF files * Read markdown files from streamed data and update unit tests * On application startup, load in embeddings from configurations files, rather than regenerating the corpus based on file system * Init: refactor indexer/batch endpoint to support a generic file ingestion format * Add features to better support indexing from files sent by the desktop client * Initial commit with Electron application - Adds electron app * Add import for pymupdf, remove import for pypdf * Allow user to configure khoj host URL * Remove search type configuration from index.html * Use v1 path for current indexer routes	2023-09-06 12:04:18 -07:00
bholagabbar	205dc90746	Fix notion title bug (#474) * Update notion_to_jsonl.py * Fix try-catch block	2023-09-05 10:47:42 -07:00
sabaimran	922222a813	Fix anyio package version to avoid backwards compatibility issue with start_blocking_portal method	2023-08-31 14:14:13 -07:00
sabaimran	4854258047	Move to a push-first model for retrieving embeddings from local files (#457 ) * Initial version - setup a file-push architecture for generating embeddings with Khoj * Update unit tests to fix with new application design * Allow configure server to be called without regenerating the index; this no longer works because the API for indexing files is not up in time for the server to send a request * Use state.host and state.port for configuring the URL for the indexer * On application startup, load in embeddings from configurations files, rather than regenerating the corpus based on file system	2023-08-31 12:55:17 -07:00
sabaimran	92cbfef7ab	Skip plaintext file indexing if there's a parsing issue and log the file	2023-08-29 14:34:08 -07:00
sabaimran	74409c2c64	Release Khoj version 0.11.4	2023-08-29 11:44:35 -07:00
sabaimran	1b85958bcc	trim chat input start	2023-08-28 19:18:10 -07:00
sabaimran	e592f6eac8	Release Khoj version 0.11.3	2023-08-28 14:46:03 -07:00
sabaimran	7c35da9fc4	Fix bug in /chat endpoint for general and update dependencies	2023-08-28 14:12:11 -07:00
Debanjum Singh Solanky	c93dcc948a	Exclude tests data file from programming stats on Github Git tag tests/data files with the linguist-vendored attribute to prevent github from including them in stats. Otherwise Khoj is getting marked as an HTML project due to the tardigrades html page in tests data, when it's primarily a python project currently	2023-08-28 11:00:52 -07:00
Debanjum Singh Solanky	59ffd1dc94	Document slash command and query filter in docs for chat and search	2023-08-28 11:00:52 -07:00
sabaimran	bc09143856	Release Khoj version 0.11.2	2023-08-28 10:16:13 -07:00
Debanjum	bc5e60defb	Filter knowledge base used by chat to respond (#469) - Overview - Allow applying word, file or date filters on your knowledge base from the chat interface - This will limit the portion of the knowledge base Khoj chat can use to respond to your query	2023-08-28 09:32:33 -07:00
Debanjum Singh Solanky	01b310635e	Enable passing search query filters via chat and test it	2023-08-28 09:24:32 -07:00
Debanjum Singh Solanky	794bad8bcb	Make date_filter.extract_date_range method always return a list type	2023-08-28 00:55:28 -07:00
Debanjum Singh Solanky	d5a2de6222	Add method to extract filter terms from query to all filters - Test the get_filter_term method in all 3 word, file, date filters - Make the existing can_filter method by default in base filter abstract class	2023-08-28 00:55:28 -07:00
Debanjum	150105505b	Add Default chat command. Make Khoj ask clarifying questions (#468) - Make Khoj ask clarifying questions when answer not in provided context - Add default conversation command to auto switch b/w general, notes modes - Show filtered list of commands available with the currently input text - Use general prompt when no references found and not in Notes mode - Test general and notes slash commands in offline chat director tests	2023-08-28 00:52:57 -07:00
Debanjum Singh Solanky	319f066aec	Test general and notes slash commands in offline chat director tests	2023-08-28 00:47:02 -07:00
Debanjum Singh Solanky	eb6cd4f8d0	Use general prompt when no references found and not in Notes mode	2023-08-28 00:47:02 -07:00
Debanjum Singh Solanky	edffbad837	Make Khoj ask clarifying questions when answer not in provided context Previously it would just refuse ask for clarification. This improves the chat quality score for the existing director tests	2023-08-28 00:47:02 -07:00
Debanjum Singh Solanky	75c1016ec0	Show filtered list of commands available with the currently input text	2023-08-28 00:46:10 -07:00
Debanjum Singh Solanky	74605f6159	Add default conversation command to auto switch b/w general, notes modes This was the default behavior but behavior regressed when adding slash commands in PR #463	2023-08-28 00:46:10 -07:00
sabaimran	cbc978ea08	Update help links for notion, github to point to the main docs	2023-08-27 15:02:55 -07:00
sabaimran	b45e1d8c0d	Fix plaintext HTML parsing and rendering (#464) * Store conversation command options in an Enum * Move to slash commands instead of using @ to specify general commands * Calculate conversation command once & pass it as arg to child funcs * Add /notes command to respond using only knowledge base as context This prevents the chat model to try respond using its general world knowledge only without any references pulled from the indexed knowledge base * Test general and notes slash commands in openai chat director tests --------- Co-authored-by: Debanjum Singh Solanky <debanjum@gmail.com>	2023-08-27 11:24:30 -07:00
Debanjum	7919787fb7	Use Slash Commands and Add Notes Slash Command (#463) * Store conversation command options in an Enum * Move to slash commands instead of using @ to specify general commands * Calculate conversation command once & pass it as arg to child funcs * Add /notes command to respond using only knowledge base as context This prevents the chat model to try respond using its general world knowledge only without any references pulled from the indexed knowledge base * Test general and notes slash commands in openai chat director tests * Update gpt4all tests to use md configuration * Add a /help tooltip * Add dynamic support for describing slash commands. Remove default and treat notes as the default type --------- Co-authored-by: sabaimran <narmiabas@gmail.com>	2023-08-26 18:11:18 -07:00
sabaimran	e64357698d	Skip indexing single bad markdown, plaintext file (#460)	2023-08-23 15:34:56 -07:00
sabaimran	84bd579077	Format the chat outputted message with code, bolding, or italics. Add a copy button for code. Closes #445.	2023-08-19 20:02:57 -07:00
sabaimran	f9e09ba490	Do not try downloading model from GPT4All if the user is not connected to the internet	2023-08-19 19:09:21 -07:00
Debanjum Singh Solanky	3ff4e19dd2	Release Khoj version 0.11.1	2023-08-16 22:53:29 -07:00
sabaimran	4fb8c2c5e1	Pass a SIGTERM to tell the uvicorn server to exit and gracefully kill the thread	2023-08-16 21:27:05 -07:00
Debanjum Singh Solanky	34d5cd2bd8	Increase pytests workflow timeout duration to reduce intermittent failures The test workflow fails regularly with an OperationCancelled error. This is an intermittent failure that gets resolved on running the failed workflows a few times.	2023-08-16 20:00:36 -07:00
sabaimran	4e03dfea43	Attach the parent to the server thread, allowing the kill signal to trigger a graceful exit (#446)	2023-08-16 19:36:10 -07:00
Debanjum Singh Solanky	3c58ab5fcb	Unmark Python 3.8 as supported in khoj-assistant pypi package	2023-08-16 00:58:59 -07:00
Debanjum Singh Solanky	26c3977fb9	Remove info hint to reindex khoj on unexpected search results The index corruption issue was resolved a while ago in #325 and hasn't cropped up again	2023-08-16 00:58:59 -07:00
sabaimran	def909a913	Revert "Open Web interface within Desktop app in GUI mode" (#444)	2023-08-15 23:26:28 -07:00
sabaimran	6562ec6531	Release Khoj version 0.11.0	2023-08-14 19:25:03 -07:00
sabaimran	064b2fbc4a	Add a link to the FAQ in our docs (#438) * Add a link to faq.khoj.dev in the docs	2023-08-14 15:05:08 -07:00
sabaimran	0ea901c7c1	Allow indexing to continue even if there's an issue parsing a particular org file (#430) * Allow indexing to continue even if there's an issue parsing a particular org file * Use approximation in pytorch comparison in text_search UT, skip additional file parser errors for org files * Change error of expected failure	2023-08-14 07:56:33 -07:00
sabaimran	7b907add77	Add support for indexing plaintext files (#420) * Add support for indexing plaintext files - Adds backend support for parsing plaintext files generically (.html, .txt, .xml, .csv, .md) - Add equivalent frontend views for setting up plaintext file indexing - Update config, rawconfig, default config, search API, setup endpoints * Add a nifty plaintext file icon to configure plaintext files in the Web UI * Use generic glob path for plaintext files. Skip indexing files that aren't in whitelist	2023-08-09 15:44:40 -07:00
Debanjum Singh Solanky	84d774ea34	Retain desktop builds for 3 days to allow user tests Upgrade minimum tiktoken version to work for encoding gpt4	2023-08-08 23:02:13 -07:00
Ellen7ions	26bddcb65c	Add support for starting a new line with shift-enter (#412 ) * Add support for starting a new line with shift-enter * Remove useless comments. Set font-size: medium. * Update src/khoj/interface/web/chat.html Update the styling to have the padding, margin and line-height like before. Co-authored-by: Debanjum <debanjum@gmail.com> * Update src/khoj/interface/web/chat.html Make the chat-body scroll to the bottom after resizing Co-authored-by: Debanjum <debanjum@gmail.com> --------- Co-authored-by: Debanjum <debanjum@gmail.com>	2023-08-07 19:49:07 -07:00
Debanjum Singh Solanky	97609e4995	Use 500px png of khoj logo instead svg for much smaller asset size The khoj logo svg was 1.3Mb. The 500px png of it is 38Kb. Given all usage of khoj-logo are below 230px this should work fine	2023-08-07 18:27:11 -07:00
Debanjum	14a816d173	Open Web interface within Desktop app in GUI mode (#429) Previously the GUI mode (with khoj --gui or using the desktop app) would open the web interface in the user's default web browser. Now the web interface is just rendered within the app itself using PyQT's Webview. This gives it a more proper app like feel	2023-08-07 17:48:30 -07:00
Debanjum Singh Solanky	378b96ec1b	Open the khoj app window maximized on startup	2023-08-07 15:39:05 -07:00
Debanjum Singh Solanky	ea734ba1c8	Open app in native view on starting it in GUI mode instead of on web browser - Opens settings page on first run and landing page after in GUI mode Previously was only opening the GUI on linux after first run as it doesn't have a system tray - Both the views are from the web interface but are rendered within the app instead of the browser	2023-08-07 13:41:42 -07:00
Debanjum Singh Solanky	9c494705a8	Open the search, chat or config view in app from the system tray menu	2023-08-07 13:41:42 -07:00
Debanjum Singh Solanky	cc36b87345	Render the web interface directly within the desktop app as a webview	2023-08-07 13:41:12 -07:00
Debanjum	c832e456e0	Merge pull request #427 from Comprehensive-Jason/master-2 Update obsidian/manifest.json	2023-08-07 10:46:35 -07:00
Jason Qin	3ef1b7073d	Update obsidian/manifest.json Closes #426	2023-08-07 10:41:39 -07:00
sabaimran	738cf650b3	Explicitly set Khoj to use the default locale of the user (#425) - Explicitly set locale using `locale.setlocale(locale.LC_ALL, '')` for localization. Relevant for datetime libraries. See [Python 3 documentation](https://docs.python.org/3/library/locale.html#locale.setlocale).	2023-08-07 09:23:24 -07:00
Debanjum	cc951450fb	Build Khoj Debian package on Ubuntu 20.04 to work with Glibc 2.31 (#424) Build the Debian package using Ubuntu 20.04 instead of 22.04 as Ubuntu 20.04 comes pre-installed with glibc_2.31 unlike Ubuntu 22.04 which uses glibc_2.35	2023-08-06 23:21:51 -07:00
Debanjum Singh Solanky	75c16432a4	Loosen dateparser dependency to get python3.10 wheel for regex package This should reduce chances of installation errors due to regex package being built from source for python3.11 Previously, the regex dependency of dateparser = 1.1.1 didn't have a wheel for python 3.11. This would trigger building the regex package from scratch which would fail for a lot of folks	2023-08-06 22:48:40 -07:00
Debanjum Singh Solanky	8b41eb9f14	Create Pypi package on Ubuntu 20.04 LTS as well	2023-08-06 21:34:38 -07:00
Debanjum Singh Solanky	1cbacf20dc	Build Khoj Debian package on Ubuntu 20.04 to work with glibc 2.31	2023-08-06 20:02:42 -07:00
Jason Qin	0bb5c808e5	Update manifest.json (#422) Mark plugin as desktop-only in Obsidian to stop 'fails to load' messages in Obsidian Mobile	2023-08-06 14:21:32 -07:00
Muftawo	c8ef619090	fixed reference link to landing page (#417) * Fixed zsh error no matches found * Fixed home page 404 error	2023-08-04 10:38:14 -07:00
Debanjum	952bd39536	Merge pull request #413 from Muftawo/update-docs updated the setup file path to fix the 404 error	2023-08-03 10:44:17 -07:00
Muftawo	18e94d9e60	updated the setup file path to fix the 404 error	2023-08-03 13:35:16 +00:00
sabaimran	78012b8111	Avoid null ref issue when setting model state for web UI. Closes #410	2023-08-03 00:39:06 -07:00
sabaimran	0baed742e4	Add checksums to verify the correct model is downloaded as expected (#405) * Add checksums to verify the correct model is downloaded as expected - This should help debug issues related to corrupted model download - If download fails, let the application continue * If the model is not downloaded as expected, add some indicators in the settings UI * Add exc_info to error log if/when download fails for llamav2 model * Simplify checksum checking logic, update key name in model state for web client	2023-08-02 23:26:52 -07:00
sabaimran	6aa998e047	Add note about system requirements for Linux - debian installation. Closes #378.	2023-08-02 10:57:36 -07:00
sabaimran	d00f51b531	Fix the minimum version of the transformers library required to address #404	2023-08-02 09:10:14 -07:00
Debanjum Singh Solanky	e6e3acdbe4	Release Khoj version 0.10.1	2023-08-01 23:55:13 -07:00
Debanjum Singh Solanky	7c1d70aa17	Bump GPT4All response generation batch size to 512 from 256 A batch size of 512 performs ~20% better on a XPS with no GPU and 16Gb RAM. Seems worth the tradeoff for now	2023-08-01 23:34:02 -07:00
Debanjum Singh Solanky	e42fd8ae91	Make desktop app workflow apt update before install of linux packages - See if this fixes the issue with the workflows failing to install system packages - Make the build desktop app run on changes to the workflow file as well	2023-08-01 23:15:13 -07:00
Debanjum	16c6bfce8e	Improve Quality and Reliability of Offline Chat (#393) # Incoming ## Major ### Fix Prompt Size Exceeded Issue - Fix issues related to prompt size, Closes #386. Use the correct tokenizer to calculate whether the input needs to be truncated or not. ### Improve Llama 2 Model Download - Use the correct download link for LlamaV2 -- should have been using the small model, but was using the medium - Add better downloading logic to retry download if it failed, Closes #379 ### Fix Segmentation Fault due to Race - Add a lock around generating chat responses from the offline model to avoid segmentation faults. Closes #367. - Add a loading symbol to the web chat UI when the model is thinking. Closes #392 ### Improve Chat Response Latency - Improve performance of offline chat by increasing batch size (via `n_batch`) to automatically engage more cores/GPU, using smaller model and fixing prompt vs response token generation numbers. Closes #363 ### Fix Fake Dialogue Continuation - Fix formatting of user query with offline chat, this was contributing to #398 - Stop Llama 2 from Creating Fake Dialogue Continuations. Closes #398 ## Minor - Improve default message for Chat window on web when it's not configured. Include hint to use offline chat. - Add null check in `perform_chat_checks` method - Add offline chat director unit tests ## Performance Analysis (Time to First Token) | | v0.10.0 | this branch | |-|-|-| | Query 1 | 52s | 28s | | Query 2 | 33s | 42s | | Query 3 | 67s | 38s |	2023-08-01 22:07:27 -07:00
Debanjum Singh Solanky	44292afff2	Put offline model response generation behind the chat lock as well Not just the chat response streaming	2023-08-01 21:53:52 -07:00
Debanjum Singh Solanky	1812473d27	Extract new schema version for each migration script into a variable This should ease readability, indicates which version this migration script will update the schema to once applied	2023-08-01 21:41:08 -07:00
Debanjum Singh Solanky	b9937549aa	Simplify migration scripts management. Make them use static version - Only make them update config when their run conditions are satisfied - Use static schema version to simplify reasoning about run conditions	2023-08-01 21:28:20 -07:00
Debanjum Singh Solanky	185a1fbed7	Remove old chat setup timer. It is mislabelled, irrelevant since streaming	2023-08-01 20:52:00 -07:00
Debanjum Singh Solanky	95acb1583d	Update local Chat Actor and Director tests expected to fail	2023-08-01 20:52:00 -07:00
Debanjum Singh Solanky	c2b7a14ed5	Fix context, response size for Llama 2 to stay within max token limits Create regression test to ensure it does not throw the prompt size exceeded context window error	2023-08-01 20:52:00 -07:00
Debanjum Singh Solanky	6e4050fa81	Make Llama 2 stop generating response on hitting specified stop words It would previously sometimes start generating fake dialogue with its internal prompt patterns of <s>[INST] in responses. This is a jarring experience. Stop generating response on hitting <s> Resolves #398	2023-08-01 20:52:00 -07:00
Debanjum Singh Solanky	aa6846395d	Fix offline model migration script to run for version < 0.10.1 - Use same batch_size in extract question actor as the chat actor - Log final location the chat model is to be stored in, instead of its temp filename while it is being downloaded	2023-08-01 20:51:53 -07:00
Ikko Eltociear Ashimine	49abb9df9c	Fix typo in orgnode.py (#397) Fix spelling of Ouput in org parser property drawer comment to Output.	2023-08-01 19:54:57 -07:00
sabaimran	d8fa967b43	Update chat actor unit tests for greater accuracy and benchmarking	2023-08-01 12:24:43 -07:00
sabaimran	f409e16137	Update some of the extract question prompts for llamav2	2023-08-01 12:23:36 -07:00
sabaimran	b11b00a9ff	Add log line for time to first response	2023-08-01 10:57:38 -07:00
sabaimran	778df6be71	Add a logline when the offline model migration script runs	2023-08-01 09:27:42 -07:00
sabaimran	48363ec861	Add additional check for chat_messages length in UT	2023-08-01 09:25:52 -07:00
sabaimran	3a5d93d673	Add migration script for getting the new offline model	2023-08-01 09:25:05 -07:00
sabaimran	90efc2ea7a	Update comments and add explanations	2023-08-01 09:24:03 -07:00
sabaimran	f7e03f6d63	Switch spinner snake case -> camel case	2023-08-01 08:52:25 -07:00
sabaimran	1c52a6993f	add a lock around chat operations to prevent the offline model from getting bombarded and stealing a bunch of compute resources - This also solves #367	2023-08-01 00:23:17 -07:00
sabaimran	6c3074061b	Disable the input bar when chat response is in flight	2023-08-01 00:21:39 -07:00
sabaimran	c14cbe926a	Add a loading symbol to web chat. Closes #392	2023-07-31 23:35:48 -07:00
sabaimran	8054bdc896	Use n_batch parameter to increase resource consumption on host machine (and implicitly engage GPU)	2023-07-31 23:25:08 -07:00
sabaimran	e55e9a7b67	Fix unit tests and truncation logic	2023-07-31 21:37:59 -07:00
sabaimran	2335f11b00	Add better error handling for download processes in case of failure	2023-07-31 21:07:38 -07:00
sabaimran	95c7b07c20	Make the fake message longer	2023-07-31 20:55:19 -07:00
sabaimran	8dd5756ce9	Add new director tests for the offline chat model with llama v2	2023-07-31 20:24:52 -07:00
sabaimran	209975e065	Resolve merge conflicts: let Khoj fail if the model tokenizer is not found	2023-07-31 19:12:26 -07:00
sabaimran	2d6c3cd4fa	Misc. quality improvements for Llama V2 - Fix download url -- was mapping to q3_K_M, but fixed to use q4_K_S - Use a proper Llama Tokenizer for counting tokens for truncation with Llama - Add additional null checks when running	2023-07-31 19:11:20 -07:00
sabaimran	ca195097d7	Update chat hint message at first run	2023-07-31 17:46:09 -07:00
Debanjum Singh Solanky	ded606c7cb	Fix format of user query during general conversation with Llama 2	2023-07-31 17:21:14 -07:00
Debanjum Singh Solanky	48e5ac0169	Do not drop system message when truncating context to max prompt size Previously the system message was getting dropped when the context size with chat history would be more than the max prompt size supported by the chat model. Now only the previous chat messages are dropped or the current message is truncated but the system message is kept to provide guidance to the chat model	2023-07-31 17:21:14 -07:00
Saba	02e216c135	Clarify usage in telemetry.md	2023-07-30 22:37:20 -07:00
Saba	7eabf8ab0f	Add instructions for installing the desktop app and opting out of telemetry	2023-07-30 22:26:52 -07:00
sabaimran	88ef86ad5c	Fix typing issues for mypy (#372)	2023-07-30 19:27:48 -07:00
sabaimran	ca2c942b65	Add typing to compiled_references and inferred_queries	2023-07-30 19:10:30 -07:00
sabaimran	dbb54cfcfa	Merge branch 'master' of github.com:khoj-ai/khoj	2023-07-30 18:52:17 -07:00
sabaimran	3646fd1449	Add a warning to indicate that Khoj is not configured to work with personal data sources	2023-07-30 18:52:10 -07:00
sabaimran	996832dc72	Allow user to chat even if content types aren't configured - use empty references	2023-07-30 18:47:45 -07:00
Debanjum	41d36a5ecc	Merge pull request #371 from felixonmars/patch-1 Correct typos in setup.md in the Khoj documentation	2023-07-30 18:37:22 -07:00
Felix Yan	f4fdfe8d8c	Correct typos in setup.md	2023-07-31 03:32:56 +03:00
Debanjum Singh Solanky	28df08b907	Fix configure openai processor for khoj docker Store khoj search models and embeddings in default location in docker container under /root/.khoj	2023-07-30 02:07:33 -07:00
Debanjum Singh Solanky	dffbfee62b	Fix sample khoj docker config to index test data using new schema	2023-07-30 01:48:18 -07:00
Debanjum Singh Solanky	53810a0ff7	Create khoj config dir if non-existent, before writing to khoj env file	2023-07-30 01:35:36 -07:00
Debanjum Singh Solanky	56394d2879	Update demo video to configure offline chat via the web interface	2023-07-29 19:17:40 -07:00
Debanjum Singh Solanky	b32673db8e	Fix link to Docs website in Khoj readme on Github	2023-07-29 12:50:39 -07:00
Debanjum Singh Solanky	a3d1212e79	Align docs landing page with updated github readme - Screenshots of khoj search, chat - Put quickstart on landing page - Put miscellaneous pages under separate section - Move credits to separate page under miscellaneous	2023-07-29 12:42:36 -07:00
Debanjum Singh Solanky	d7205aed36	Update docs with setup instructions for Offline and Online Chat	2023-07-29 11:18:12 -07:00
Debanjum	0404e33437	Add screenshots, style content in README	2023-07-29 01:22:48 -07:00
sabaimran	f65d157244	Release Khoj version 0.10.0	2023-07-28 19:27:47 -07:00
Debanjum Singh Solanky	f76af869f1	Do not log the gpt4all chat response stream in khoj backend Stream floods stdout and does not provide useful info to user	2023-07-28 19:14:04 -07:00
sabaimran	5ccb01343e	Add Offline chat to Obsidian (#359) * Add support for configuring/using offline chat from within Obsidian * Fix type checking for search type * If Github is not configured, /update call should fail * Fix regenerate tests same as the update ones * Update help text for offline chat in obsidian * Update relevant description for Khoj settings in Obsidian * Simplify configuration logic and use smarter defaults	2023-07-28 18:47:56 -07:00
Debanjum	b3c1507708	Merge pull request #361 from khoj-ai/configure-offline-chat-from-emacs - Configure using Offline Chat from Emacs: - Enable, Disable Offline Chat from Emacs - Use: Enable offline chat with `(setq khoj-chat-offline t)' during khoj setup - Benefits: Offline chat models are better for privacy but not great at answering questions	2023-07-28 18:06:58 -07:00
sabaimran	9f78db0579	Let Offline chat override OpenAI API settings (#362) * Let Offline chat override OpenAI API settings * Download the offline model whenever offline chat is enabled * Add progressbar for download for llamav2 model to track progress * Change ordering of n due to switch of default processor * Flip ordering of offline/openai checks when extracting questions from query	2023-07-28 17:26:20 -07:00
Debanjum Singh Solanky	ebfbef1f68	Configure using offline chat from Emacs Closes #358	2023-07-28 16:07:33 -07:00
Debanjum Singh Solanky	9b1048caf7	Remove asymmetric from name of remaining text search tests Asymmetric search is the only search type used now in khoj.el. So making distinction between symmetric and asymmetric search isn't necessary anymore	2023-07-28 15:33:22 -07:00
sabaimran	12cfb48f16	Fix gpt4all import error in Desktop builds (#356) * Add gpt4all to imports via sysconfig path	2023-07-28 11:54:18 -07:00
Debanjum	4b0639cfbd	Merge pull request #354 from ducksblock/master Fix #353: Remove references to localhost:8000 in docs	2023-07-28 11:00:12 -07:00
ducksblock	cbecd7b66f	Fix #353: Remove references to localhost:8000	2023-07-28 13:57:00 +05:30
sabaimran	702486dab7	Add gpt4all for copying metadata	2023-07-27 22:22:24 -07:00
sabaimran	29081f4429	Adjust parameters for offline chat	2023-07-27 22:22:09 -07:00
sabaimran	124d97c26d	Replace Falcon 🦅 model with Llama V2 🦙 for offline chat (#352) * Working example with LlamaV2 running locally on my machine - Download from huggingface - Plug in to GPT4All - Update prompts to fit the llama format * Add appropriate prompts for extracting questions based on a query based on llama format * Rename Falcon to Llama and make some improvements to the extract_questions flow * Do further tuning to extract question prompts and unit tests * Disable extracting questions dynamically from Llama, as results are still unreliable	2023-07-27 20:51:20 -07:00
sabaimran	55965eea7d	Delete FUNDING.yml Instead of this file, use an organization-level file: https://github.com/khoj-ai/.github	2023-07-27 15:28:47 -07:00
sabaimran	925177b150	Update FUNDING.yml Change to use a single organization (remove list brackets)	2023-07-27 15:19:20 -07:00
sabaimran	78197bb5c3	Create FUNDING.yml - Add github sponsor information directly to khoj project. Closes #302	2023-07-27 15:16:45 -07:00
Debanjum Singh Solanky	da3f4dc7e4	Fix test config to run OpenAI Chat Actor, Director tests OpenAI conversation processor schema had updated but conftest hadn't been updated to reflect the same. Update conftest setup of conversation processor to fix this	2023-07-27 11:30:04 -07:00
Debanjum Singh Solanky	715d56d4f0	Use new schema to update khoj.yml config from khoj.el	2023-07-26 17:34:16 -07:00
sabaimran	8b2af0b5ef	Add support for our first Local LLM 🤖🏠 (#330) * Add support for gpt4all's falcon model as an additional conversation processor - Update the UI pages to allow the user to point to the new endpoints for GPT - Update the internal schemas to support both GPT4 models and OpenAI - Add unit tests benchmarking some of the Falcon performance * Add exc_info to include stack trace in error logs for text processors * Pull shared functions into utils.py to be used across gpt4 and gpt * Add migration for new processor conversation schema * Skip GPT4All actor tests due to typing issues * Fix Obsidian processor configuration in auto-configure flow * Rename enable_local_llm to enable_offline_chat	2023-07-26 16:27:08 -07:00
sabaimran	23d77ee338	Fix import issues in desktop image builds (#343)	2023-07-26 15:45:52 -07:00
Justin Bassett-Green	8dcc21052f	Add chat-model param in sample config yml and document (#341) * add chat-model config param to docs * add chat-model param to sample config yml	2023-07-22 16:53:08 -07:00
Debanjum Singh Solanky	5bb42e56a8	Fix formatting of khoj test config and unused references in conftests	2023-07-22 00:29:26 -07:00
Debanjum Singh Solanky	7722a9c347	Default to using the gpt-3.5-turbo model for chat from khoj.el	2023-07-22 00:29:26 -07:00
Saba	36d25c4f1d	Center the title, add table headers	2023-07-21 23:36:38 -07:00
Saba	01b6a10cd1	Simplify readme	2023-07-21 23:30:44 -07:00
sabaimran	4ce072c4b3	Make the README on our Github minimal (#334) * Make the README on our Github minimal * Add a bit of formatting and more background	2023-07-21 23:29:04 -07:00
Debanjum Singh Solanky	4089e38283	Fix links to demos and screenshots in docs	2023-07-21 20:01:19 -07:00
Debanjum Singh Solanky	89ad362758	Update Screenshots and Demos in Docs	2023-07-21 15:22:35 -07:00
Debanjum Singh Solanky	f0d4a4cf9a	Revert "Make configure_content functional. Do not pass content index state to it." This reverts commit `2ddee7e745` as it broke partial updates of the content index for just the specified content types	2023-07-21 13:59:09 -07:00
sabaimran	82c725817e	Merge branch 'master' of github.com:khoj-ai/khoj	2023-07-21 13:24:05 -07:00
sabaimran	596e11ec6d	Use the same function for computing entries for IDs regardless of whether it has prev entries	2023-07-21 13:23:56 -07:00
Saba	634f0b4cc4	Fix docs indexing issue	2023-07-21 08:30:00 -07:00
Debanjum Singh Solanky	c28755ccd2	Fix diff blocks, links, remove footnotes & rearrange sections in docs Extract performance into separate section instead of shoving it under search Create page for web interface	2023-07-21 00:58:30 -07:00
Debanjum Singh Solanky	2ddee7e745	Make configure_content functional. Do not pass content index state to it.	2023-07-20 23:24:08 -07:00
Debanjum	e92bc0e2e6	Create CNAME to make Docs accessible at docs.khoj.dev	2023-07-20 23:24:08 -07:00
sabaimran	1610d2ebd9	📝 Add a documentation base for Khoj! (#333) * Add docs for more organized, accessible information detailing Khoj setup * Delete duplicated files * Add a coverpage without enabling it. Add logo and theme * Remove obsidian README.md * Add plausible script to index.html via docsify	2023-07-20 22:34:25 -07:00
Debanjum Singh Solanky	3e59be7f1d	Release Khoj version 0.9.0	2023-07-18 19:59:27 -07:00
Debanjum Singh Solanky	d078e7b1f6	Clean up search type usage in khoj server, tests and Readme	2023-07-18 19:57:55 -07:00
Debanjum Singh Solanky	4d910936b7	Fix triggering index update on khoj server from khoj.el	2023-07-18 19:57:54 -07:00
Debanjum Singh Solanky	5c7d7f558d	Make AI model used for Khoj chat configurable from khoj.el - Fix bug. Set the unused model-name to a standard default value	2023-07-18 19:57:54 -07:00
Debanjum	5f2be2a9bb	Merge pull request #298 from HyunggyuJang/patch-1 Encode config as utf-8 during setup in khoj.el. This will allow utf-8 encoded files etc to be passed in config	2023-07-18 17:54:11 -07:00
Debanjum	3a1c5a6dab	Merge pull request #329 from khoj-ai/create-schema-migration-func-and-reindex-to-fix-corruption Create Schema Migrator and Reindex to Apply Index Corruption Fixes - `83e1088` Manage `khoj.yml' config migrations on app start. Version the `khoj.yml' schema - `429e1b4` Regenerate index to apply corruption fixes on first run of this khoj version Otherwise users would need to manually re-index their contents with khoj	2023-07-18 16:43:17 -07:00
Debanjum Singh Solanky	429e1b4b48	Regenerate index to apply corruption fixes on first run of new khoj	2023-07-18 16:10:47 -07:00
Debanjum Singh Solanky	83e1088d42	Manage khoj.yml config migrations on app start. Version the schema - Add version to khoj.yml schema Versioning the khoj.yml config schema will simplify future migrations	2023-07-18 16:10:10 -07:00
Debanjum Singh Solanky	71e8ddd9a2	Check if PDF is configured before showing it as an option in khoj.el	2023-07-17 15:49:20 -07:00
Debanjum	d00c5da8b7	Merge pull request #325 from khoj-ai/stablize-simplify-content-indexing ## Stabilize and Simplify Content Indexing ### Major Updates - `9bcca43` Unify logic to update entries when indexing from scratch or incrementally - `89c7819` Unify logic to update embeddings when indexing from scratch or incrementally - `6a0297c` Stable sort new entries when marking entries for update - `58d86d7` Unify logic to configure server from API or on server start - Create tests to ensure old entries, embeddings in index are unaffected on adding new entries - Refer: `1482fd4`, `7669b85`, `88d1a29` - `ad41ef3` Make normalization of embeddings configurable to test this in `c73feeb` ### Minor Updates - `1673bb5` Add todo state to compiled form of each entry - `6e70b91` Remove unused `dump_jsonl` helper method - `7ad9603` Improve naming of lock - `b02323a` Improve naming text search test methods Resolves #190	2023-07-17 14:51:10 -07:00
Debanjum Singh Solanky	3e3a1ecbc8	Start app even if server init fails to let user fix it Show stacktrace on error to help debugging	2023-07-17 14:33:02 -07:00
Debanjum Singh Solanky	ef6a0044f4	Drop embeddings of deleted text entries from index Previously the deleted embeddings would continue to be in the index, even after the entry was deleted	2023-07-16 03:47:05 -07:00
Debanjum Singh Solanky	c73feebf25	Test index embeddings are stable on incremental update & no norm Ensure order of new embedding insertion on incremental update does not affect the order and value of existing embeddings when normalization is turned off	2023-07-16 02:22:28 -07:00
Debanjum Singh Solanky	ad41ef3991	Make normalizing embeddings configurable	2023-07-16 02:16:33 -07:00
Debanjum Singh Solanky	1482fd4d4d	Test index is stable sorted on incremental update with new entry Ensure order of new embedding, entry insertion on incremental update is stable	2023-07-16 01:45:53 -07:00
Debanjum Singh Solanky	b02323ade6	Improve name of text search test functions Asymmetric was older name used to differentiate between symmetric, asymmetric search. Now that text search just uses asymmetric search stick to simpler name	2023-07-16 01:45:53 -07:00
Debanjum Singh Solanky	89c7819cb7	Unify logic to generate embeddings from scratch and incrementally This simplifies the `compute_embeddings' method and avoids potential later divergence in handling the index regenerate vs update scenarios	2023-07-16 01:45:53 -07:00
Debanjum Singh Solanky	6a0297cc86	Stable sort new entries when marking entries for update	2023-07-16 01:45:53 -07:00
Debanjum Singh Solanky	7669b85da6	Test index is stable sorted on regenerate with new entry	2023-07-16 01:45:53 -07:00
Debanjum Singh Solanky	6e70b914c2	Remove unused dump_jsonl method The entries index is stored in gzipped jsonl files for each content type	2023-07-16 01:45:53 -07:00
Debanjum Singh Solanky	9bcca43299	Use single func to handle indexing from scratch and incrementally Previous regenerate mechanism did not deduplicate entries with same key So entries looked different between regenerate and update Having single func, mark_entries_for_update, to handle both scenarios will avoid this divergence Update all text_to_jsonl methods to use the above method for generating index from scratch	2023-07-16 01:45:53 -07:00
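A rough sketch of why one function can serve both indexing paths: regenerating is just an update against an empty previous index. The function below borrows the name `mark_entries_for_update` from the commit message, but its signature, the choice to keep the first of any duplicates, and the `(needs_embedding, entry)` return shape are assumptions made for illustration.

```python
def mark_entries_for_update(current_entries, previous_entries):
    """Return (needs_embedding, entry) pairs in a stable order.

    Existing entries keep their previous order, new entries are appended
    in the order they were seen, and deleted entries are dropped.
    """
    # dict.fromkeys removes duplicates while preserving first-seen order
    current = list(dict.fromkeys(current_entries))
    previous = list(dict.fromkeys(previous_entries))
    current_set, previous_set = set(current), set(previous)

    existing = [(False, entry) for entry in previous if entry in current_set]
    new = [(True, entry) for entry in current if entry not in previous_set]
    return existing + new

# Indexing from scratch: the previous index is empty
from_scratch = mark_entries_for_update(["a", "b", "a"], [])
# Incremental update: "c" is new, "a" and "b" are already indexed
incremental = mark_entries_for_update(["c", "a", "b", "a"], ["a", "b"])
```

Because both paths deduplicate the same way, the index looks the same whether it was rebuilt or updated in place.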
Debanjum Singh Solanky	1673bb5558	Add todo state to compiled form of each org-mode entry	2023-07-16 01:45:53 -07:00
Debanjum Singh Solanky	88d1a29a84	Test index is stable for duplicate entries across regenerate, update - Current incorrect behavior: All entries with duplicate compiled form are kept on regenerate but on update only the last of the duplicated entries is kept This divergent behavior is not ideal to prevent index corruption across reconfigure and update	2023-07-16 01:45:53 -07:00
Debanjum Singh Solanky	da98b92dd4	Create helper function to test value, order of entries & embeddings This helper should be used to observe if the current embeddings are stable sorted on regenerate and incremental update of index in text search tests	2023-07-16 01:45:53 -07:00
Debanjum Singh Solanky	7ad96036b0	Improve lock name to config_lock instead of search_index_lock It is used to lock updates to all app config state, including processor	2023-07-16 01:45:53 -07:00
Debanjum Singh Solanky	58d86d7876	Use single func to configure server via API and on server start Improve error messages on failure to configure server components	2023-07-16 01:45:53 -07:00
sabaimran	a15711e635	Fix null type checks in get /config	2023-07-15 15:53:56 -07:00
sabaimran	e590d75b20	Start Khoj even when config is not valid (#320 ) * Add icon to indicate bad config, start Khoj even if there was an issue setting up the index	2023-07-15 14:11:54 -07:00
sabaimran	49ab201c30	Fix issues importing PySide in Docker container (#322 ) * Rather than installing PyQT dependencies, remove codepaths that require pyqt files in no-gui mode	2023-07-15 13:33:13 -07:00
sabaimran	ba47f2ab39	Merge branch 'master' of github.com:debanjum/khoj	2023-07-14 22:28:05 -07:00
sabaimran	874cffd256	Add additional support for parsing notion workspaces	2023-07-14 22:27:56 -07:00
Debanjum	52f68167ce	Merge pull request #317 from khoj-ai/reduce-memory-consumption-by-search-model-duplication Reuse Search Models across Content Types to reduce Memory Consumption - Memory consumption now only scales with search models used, not with content types. Previously each content type had its own copy of the search ML models. That'd result in 300+ Mb per enabled text content type - Split model state into 2 separate state objects, `search_models` and `content_index`. This allows loading text_search and image_search models first and then reusing them across all content_types in content_index - The change should cut down memory utilization quite a bit for most users. I see a >50% drop in memory utilization on my Khoj instance. But this will vary for each user based on the amount of content indexed vs number of plugins enabled. - This change does not solve the RAM utilization scaling with size of the index, as the whole content index is still kept in RAM while Khoj is running Should help with #195, #301 and #303	2023-07-14 19:54:12 -07:00
Debanjum Singh Solanky	f08e9539f1	Release lock after updating index even if update fails to prevent deadlock Wrap acquire/release locks in try/catch/finally when updating content index and search models to prevent lock not being released on error and causing a deadlock	2023-07-14 16:57:27 -07:00
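The deadlock fix described above comes down to releasing the lock in a `finally` clause. A minimal sketch, assuming a single `threading.Lock`; the names `config_lock`, `update_index`, and `failing_update` are illustrative and not taken from the Khoj code.

```python
import threading

config_lock = threading.Lock()  # illustrative name for the shared lock

def update_index(update_fn):
    """Run update_fn while holding the lock and always release it afterwards."""
    config_lock.acquire()
    try:
        return update_fn()
    finally:
        # Runs on success and on error, so a failed update cannot leave the lock held
        config_lock.release()

def failing_update():
    raise ValueError("index update failed")

try:
    update_index(failing_update)
except ValueError:
    pass  # the error still propagates to the caller

# The lock is free again, so the next update does not block forever
result = update_index(lambda: "updated")
```

Writing `with config_lock:` gives the same guarantee more compactly; the explicit form is shown only to mirror the commit's wording.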
sabaimran	37f7f9fd1d	Add additional telemetry for system understanding (#316 ) * Add additional telemetry in order to understand which data sources are the most useful * Make actions side by side in the configuration page * Restore main run command * Update links to point to wiki pages for Github, Notion integrations * Standardize nomenclature of the api_type to use _config suffix Remove header fields that aren't actually helpful for understanding config usage	2023-07-14 10:14:07 -07:00
Debanjum Singh Solanky	b9fb656657	Update Tests to setup both content_index, search_models before testing This is required by the updated structure of Khoj setup - Add content_config pytest fixture, pass bi_encoder from search_models.[text\|image]_search	2023-07-14 01:29:48 -07:00
Debanjum Singh Solanky	86e2bec9a0	Reuse Search Models across Content Types to Reduce Memory Consumption - Memory consumption now only scales with search models used, not with content types as well. Previously each content type had its own copy of the search ML models. That'd result in 300+ Mb per enabled content type - Split model state into 2 separate state objects, `search_models' and `content_index'. This allows loading text_search and image_search models first and then reusing them across all content_types in content_index - This should cut down memory utilization quite a bit for most users. I see a ~50% drop in memory utilization. This will, of course, vary for each user based on the amount of content indexed vs number of plugins enabled - This does not solve the RAM utilization scaling with size of the index. As the whole content index is still kept in RAM while Khoj is running Should help with #195, #301 and #303	2023-07-14 01:27:22 -07:00
sabaimran	c2249eadb2	Add a Github workflow that allows you to build dev versions of Desktop applications (#309 ) * Add a Github workflow that allows you to build dev versions of Desktop applications * Add pull_request trigger for testing * Fix errant open quote in Package Khoj App step * Nix the release step, since this isn't associated with any tags - Set retention period for uploaded artifacts to 1 day * Remove pull_request trigger - limit to manual triggers and pushes to master	2023-07-13 22:11:39 -07:00
Debanjum	b2718d330c	Merge pull request #304 from migrate-from-pyqt-to-pyside Migrate from PyQT6 to PySide6	2023-07-13 11:54:47 -07:00
sabaimran	31e933207f	Set default values for sys.stdout if they're unavailable	2023-07-12 22:22:49 -07:00
Debanjum Singh Solanky	9c76150895	Migrate from PyQT6 to PySide6	2023-07-11 18:43:44 -07:00
Debanjum	83ed8561ee	Reduce size of Docker image and build it from local code - Improvements - Install Khoj on Docker from local code instead of pulling from Github - Reduce Khoj Docker image size by 2Gb by not caching installed pip packages. Refer [issue comment](https://github.com/khoj-ai/khoj/issues/148#issuecomment-1627443570)	2023-07-11 01:30:06 -07:00
HyunggyuJang	88c42b3043	Encode data as utf-8 otherwise it will complain, see `1c85531090`	2023-07-11 17:06:05 +09:00
Debanjum Singh Solanky	6308388dfc	Install Khoj on Docker from local app instead of pulling from github Just use a random static version for Khoj on the Docker as otherwise the hatch vcs dynamic versioning requires the .git directory in the docker image too	2023-07-11 00:41:05 -07:00
Debanjum Singh Solanky	802472cd99	Reduce Khoj Docker image size by 2Gb by not caching pip packages Resolve #148	2023-07-10 23:27:02 -07:00
Debanjum Singh Solanky	f664a74e77	Update Khoj server to run on non standard port, 42110 instead of 8000 Resolves #295	2023-07-10 21:27:58 -07:00
Debanjum Singh Solanky	bfd516c1a4	Deprecate (unmaintained) support to setup Khoj via Conda	2023-07-10 21:27:58 -07:00
Debanjum Singh Solanky	58c2c3b71a	Add Documentation to Release Khoj	2023-07-10 21:27:58 -07:00
sabaimran	effb52f859	Fix demo rendering with the new header	2023-07-10 21:16:19 -07:00
sabaimran	55f5be7b03	Release Khoj version 0.8.2	2023-07-10 14:39:32 -07:00
sabaimran	9a63f89f33	Merge branch 'master' of github.com:debanjum/khoj	2023-07-10 14:31:19 -07:00
sabaimran	53809298c0	Release Khoj version 0.8.1	2023-07-10 14:30:04 -07:00
tjsousa	5b37e988e6	Allow using configured GPT chat model (#292 ) My account doesn't have gpt-4 enabled and it wouldn't work as the default value was always used from extract_questions, where the caller could use the configured model.	2023-07-10 14:24:40 -07:00
Debanjum Singh Solanky	75ff871217	Release Khoj version 0.8.0	2023-07-10 13:37:51 -07:00
Debanjum Singh Solanky	979088b3dc	Add tooltip helper text on web settings page buttons - Provide more details on what clicking configure, initialize buttons or changing the results count slider does - This shows up on user hovering over those buttons	2023-07-10 13:32:41 -07:00
Debanjum Singh Solanky	255781e135	Use relative link on logo to jump to correct page on local and cloud	2023-07-10 13:22:20 -07:00
Debanjum Singh Solanky	b2d229c116	Move header pane style to base khoj.css for reuse. Fix logo size	2023-07-10 13:10:17 -07:00
Debanjum Singh Solanky	f4cef377ca	Add details to run, configure Khoj via Web in Readme	2023-07-10 12:10:20 -07:00
Debanjum Singh Solanky	20cb314171	Open the Khoj config page in the browser on first run	2023-07-10 12:10:20 -07:00
sabaimran	07cf5a214a	Check if PDF files are present in the Obsidian vault before initializing the Khoj configuration (#293 )	2023-07-10 10:33:04 -07:00
sabaimran	7364bac8ae	Make the header take up less space - Use a single row for the header - Needed custom styling for each page because each of them are different in subtle ways, unfortunately	2023-07-09 22:31:37 -07:00
sabaimran	62704cac09	Add a plugin which allows users to index their Notion pages (#284 ) * For the demo instance, re-instate the scheduler, but infrequently for api updates - In constants, determine the cadence based on whether it's a demo instance or not - This allows us to collect telemetry again. This will also allow us to save the chat session * Conditionally skip updating the index altogether if it's a demo instance * Add backend support for Notion data parsing - Add a NotionToJsonl class which parses the text of Notion documents made accessible to the API token - Make corresponding updates to the default config, raw config to support the new notion addition * Add corresponding views to support configuring Notion from the web-based settings page - Support backend APIs for deleting/configuring notion setup as well - Streamline some of the index updating code * Use defaults for search and chat queries results count * Update pagination of retrieving pages from Notion * Update state conversation processor when update is hit * frequency_penalty should be passed to gpt through kwargs * Add check for notion in render_multiple method * Add headings to Notion render * Revert results count slider and split Notion files by blocks * Clean/fix misc things in the function to update index - Use the successText and errorText variables appropriately - Name parameters in function calls - Add emojis, woohoo * Clean up and further modularize code for processing data in Notion	2023-07-09 15:29:26 -07:00
Debanjum	77755c0284	Fix Packaging the Khoj Desktop Apps (#289 ) * Add langchain static files and pytorch metadata to Khoj native app * Add pillow static files, metadata & hidden imports to Khoj native app * Fix path to web interface static files on Khoj native app * Add tiktoken hidden imports to make chat work from Khoj native app * Fix Khoj native app to run with GUI mode enabled This got broken when we moved from using the --no-gui flag to using --gui in https://github.com/khoj-ai/khoj/pull/263	2023-07-09 10:21:16 -07:00
sabaimran	4c135ea316	Make streaming optional for the /chat endpoint (#287 ) * Update the /chat endpoint to conditionally support streaming - If streams are enabled, return the threadgenerator as it does currently - If stream is disabled, return a JSON response with the response/compiled references separated out - Correspondingly, update the chat.html UI to use the streamed API, as well as Obsidian - Rename chat/init/ to chat/history * Update khoj.el to use the /history endpoint - Update corresponding unit tests to use stream=true * Remove & from call to /chat for obsidian * Abstract functions out into a helpers.py file and clean up some of the error-catching	2023-07-09 10:12:09 -07:00
Debanjum Singh Solanky	0a86220d42	Use default values, delete content config on disable and update state	2023-07-07 20:36:16 -07:00
Debanjum Singh Solanky	362063f5fe	By default, connect to Khoj server over IPv4 from Obsidian plugin	2023-07-07 20:36:16 -07:00
Debanjum Singh Solanky	571e8c2548	Add rerank, index corruption hint on search page of web interface Similar to the hint already in the Obsidian search modal Closes #272	2023-07-07 20:36:16 -07:00
Debanjum	4b79d8216f	Move remaining chat actors to use OpenAI chat models - Deprecate the unused beta /answer and /search type identification endpoints and associated GPT functions - Update extract_questions to use GPT4 - Update summarize method to default to GPT-3.5 - Update date filter to support quoting values in single quotes too. So now both dt>'2023-04-01' and dt>"2023-04-01" should work - Remove "model" field from chat settings on the web interface	2023-07-07 18:53:05 -07:00
Debanjum Singh Solanky	61e131f95c	Hide unused model field from chat settings on web interface	2023-07-07 18:43:53 -07:00
Debanjum Singh Solanky	af30d01e85	Move to newer chat models to extract questions & summarize chats Deprecate usage of the older gpt3 models in place of the newer chat based models - text-davinci-003 is only 50% cheaper than gpt4 and less reliable for question extraction - Using gpt-3.5-turbo for summarization should reduce cost of chat - Keep conversation.chat_session as a list instead of a string - Update completion_with_backoff func to use ChatML format	2023-07-07 17:32:27 -07:00
Debanjum Singh Solanky	171ce19e1f	Update date filter to allow quoting values in single quotes	2023-07-07 17:13:47 -07:00
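One way to accept either quote style in a date filter is a regular expression with a backreference, so the closing quote must match the opening one. This is a sketch under assumptions: the pattern, the operator set, and the `extract_date_filters` helper are illustrative and not copied from Khoj's date filter.

```python
import re

# (["']) captures the opening quote; \2 requires the same character to close the value
DATE_FILTER = re.compile(r"""dt(>=|<=|>|<|=)(["'])(.+?)\2""")

def extract_date_filters(query):
    """Return (operator, value) pairs for every date filter found in the query."""
    return [(m.group(1), m.group(3)) for m in DATE_FILTER.finditer(query)]

filters = extract_date_filters("""trip notes dt>'2023-04-01' dt<="2023-05-01" """)
```

With this pattern a mismatched pair such as `dt>'2023-04-01"` is not treated as a filter.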
Debanjum Singh Solanky	e588f7c528	Deprecate unused beta search and answer API endpoints	2023-07-07 16:38:07 -07:00
Debanjum Singh Solanky	c9fc4d1296	Revert to using cross-encoder to improve search results used by chat	2023-07-07 15:31:34 -07:00
Debanjum Singh Solanky	11f0a9f196	Fix chat tests since streaming. Pass args correctly to chat methods - Fix testing gpt converse method after it started streaming responses - Pass stop in model_kwargs dictionary and api key in openai_api_key parameter to chat completion methods. This should resolve the arg warning thrown by OpenAI module	2023-07-07 15:23:44 -07:00
Debanjum Singh Solanky	48870d9170	Fix parsing questions generated by extract_questions actor into list The previous json parsing was failing to handle questions with date filters Fix the chat actor tests to run without throwing error with freezegun complaining about importing transformers.local_llama model Remove quote escapes from date filter examples provided to extract_questions actor	2023-07-07 15:18:55 -07:00
Debanjum Singh Solanky	279662620b	Move results count to settings page on web. Use it for search & chat - Before Only the search interface had the results count configuration option - After - The results count is set on the settings page instead of the search page - Both search and chat can use the configured results count instead of just search	2023-07-07 14:08:08 -07:00
Debanjum Singh Solanky	2ec8da89e8	Remove Update button from Khoj Search page on the Web interface The settings page on the Khoj web interface already has a configure button. Don't need the Update button on the search page as well	2023-07-07 12:49:58 -07:00
Debanjum Singh Solanky	bf427cd8dd	Set no. of results used to generate chat response from Khoj Emacs	2023-07-07 12:34:50 -07:00
Debanjum Singh Solanky	1d77fe712c	Set no. of results used to generate chat response from Khoj Obsidian	2023-07-07 12:32:32 -07:00
Debanjum Singh Solanky	2f31de5ed5	Set no. of references to use for chat configurable in Chat API	2023-07-07 12:29:36 -07:00
Debanjum Singh Solanky	d97682fdac	Use tooltip, placeholders to guide Khoj setup via web settings page	2023-07-06 21:37:48 -07:00
Debanjum Singh Solanky	f5cf09424b	Use more descriptive field names for content type settings on Khoj web Resolves #281	2023-07-06 20:47:39 -07:00
Debanjum Singh Solanky	a2c668268f	Use node-fetch >=3.1.0 in khoj obsidian plugin to avoid security vulnerability	2023-07-06 13:05:52 -07:00
sabaimran	d688ddf92c	Re-instate the scheduler for the demo instances (#279 ) * For the demo instance, re-instate the scheduler, but infrequently for api updates - In constants, determine the cadence based on whether it's a demo instance or not - This allows us to collect telemetry again. This will also allow us to save the chat session * Conditionally skip updating the index altogether if it's a demo instance	2023-07-06 11:01:32 -07:00
Debanjum Singh Solanky	8f36572a9b	Improve typing, null checks in controllers and gpt functions	2023-07-05 20:49:25 -07:00
Debanjum Singh Solanky	41ac1e24c9	Add docs for a pre-emptive setup of Khoj for later offline usage Closes #151	2023-07-05 20:48:51 -07:00
Debanjum	6c2a8a5bce	⚡️ Stream Responses by Khoj Chat on Web, Obsidian - What - Stream chat responses from OpenAI API to Web, Obsidian clients - Implement using a callback function which manages a queue where new tokens can be placed as they come on. As the thread is read from, tokens are removed. - When the final token has been processed, add the `compiled_references` to the queue to be rendered by the `chat` client - When the thread has been closed, save the accumulated conversation log in the user's history using a `partial func` - Incrementally decode tokens on the front end and add them as they appear from the streamed response - Why This significantly reduces perceived latency and OpenAI API request timeouts for Chat Closes https://github.com/khoj-ai/khoj/issues/257	2023-07-05 20:02:11 -07:00
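The queue-based streaming described above can be sketched with the standard library alone. The `TokenStream` class, the `fake_llm` producer, the `save_conversation` callback, and the reference marker text are all hypothetical stand-ins; only the overall shape (tokens pushed onto a queue, references appended after the final token, history saved through a partial function once the stream closes) follows the commit message.

```python
import queue
import threading
from functools import partial

class TokenStream:
    """Iterator fed by a producer thread through a queue."""
    _DONE = object()  # sentinel marking the end of the stream

    def __init__(self, compiled_references, completion_func=None):
        self.queue = queue.Queue()
        self.compiled_references = compiled_references
        self.completion_func = completion_func
        self.response = ""

    def __iter__(self):
        return self

    def __next__(self):
        item = self.queue.get()  # blocks until the producer sends something
        if item is self._DONE:
            if self.completion_func:
                # Save the accumulated response once the stream has ended
                self.completion_func(chat_response=self.response)
            raise StopIteration
        return item

    def send(self, token):
        self.response += token
        self.queue.put(token)

    def close(self):
        # After the final token, append the references for the client to render
        if self.compiled_references:
            self.queue.put("\n\nReferences: " + ", ".join(self.compiled_references))
        self.queue.put(self._DONE)

def fake_llm(stream, tokens):
    """Stand-in for the model callback that receives tokens as they arrive."""
    for token in tokens:
        stream.send(token)
    stream.close()

saved = {}

def save_conversation(user_message, chat_response):
    saved[user_message] = chat_response

stream = TokenStream(["notes.org"], completion_func=partial(save_conversation, "hi"))
threading.Thread(target=fake_llm, args=(stream, ["Hel", "lo"])).start()
chunks = list(stream)  # the web framework would stream these to the client instead
```

The consumer sees each token as soon as it is queued, which is what cuts perceived latency compared with waiting for the full response.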
Debanjum Singh Solanky	e111eda6ae	Make client, app_config optional in telemetry logger for correct typing	2023-07-05 18:57:38 -07:00
Debanjum Singh Solanky	e562114f6b	Improve comments, var names in js for chat streaming on web interface	2023-07-05 18:57:27 -07:00
Debanjum Singh Solanky	46269ddfd3	Fix chat logging messages to get context without flooding logs	2023-07-05 18:27:06 -07:00
Debanjum Singh Solanky	0ba838b53a	Show temp status message in Khoj Obsidian chat while Khoj is thinking - Scroll to bottom after adding temporary status message and references too	2023-07-05 18:02:43 -07:00
Debanjum Singh Solanky	8271abe729	Use optional chaining operator to extract khojBannerSubmit from conditional	2023-07-05 18:02:43 -07:00
Debanjum Singh Solanky	c12ec1fd03	Show temp status message in Khoj web chat while Khoj is thinking - Scroll to bottom after adding temporary status message and references too	2023-07-05 18:02:30 -07:00
sabaimran	257a421e45	Bonus: add try-catch logic around telemetry upload in case of JSON serializability issues	2023-07-05 15:12:18 -07:00
sabaimran	4e6b66b139	Add support for streaming chat response from OpenAI to Obsidian - I needed to install node-fetch to accomplish this, as the built-in request object from Obsidian doesn't seem to support streaming and the built-in fetch object is very sensitive to any and all cross origin requests	2023-07-05 15:01:22 -07:00
sabaimran	3ff5074cf5	Log the end-to-end time of generating a streamed response from OpenAI	2023-07-05 14:59:44 -07:00
sabaimran	68e635cc32	Remove additional comments and debug statements	2023-07-05 11:33:56 -07:00
sabaimran	67a8795b1f	Clean-up commented out code	2023-07-05 11:24:40 -07:00
sabaimran	79b1b1d350	Save streamed chat conversations via partial function passed to the ThreadGenerator	2023-07-04 17:33:52 -07:00
sabaimran	afd162de01	Add reference notes to result response from GPT when streaming is completed - NOTE: results are still not being saved to conversation history	2023-07-04 12:47:50 -07:00
sabaimran	8f491d72de	Initial code with chat streaming working (warning: messy code)	2023-07-04 10:14:39 -07:00
Debanjum Singh Solanky	5889eceba4	Make text selectable in Khoj chat modal on Obsidian Previously the text in the Khoj chat modal couldn't be copied as it was not selectable Resolves #206	2023-07-03 23:24:04 -07:00
sabaimran	89354def9b	Update request timeout window to 20 seconds	2023-07-03 22:28:18 -07:00
sabaimran	b1940519c3	Log error if unable to decode chunk from Github	2023-07-03 16:29:32 -07:00
Debanjum Singh Solanky	ecf9730cd7	Disable Chat, Search on Web if Khoj not configured & show next steps	2023-07-03 16:04:32 -07:00
sabaimran	017e8c1aef	Skip indexing a PDF that has an indexing error (#274 )	2023-07-03 15:55:11 -07:00
sabaimran	a6f313589e	Release Khoj version 0.7.1	2023-07-03 12:26:41 -07:00
Debanjum Singh Solanky	70f6b8266c	Upgrade minimum supported pydantic version	2023-07-03 12:22:56 -07:00
sabaimran	8bfd5828e6	Remove deprecation notice since we're opening the web UI by default	2023-07-03 12:01:09 -07:00
sabaimran	92d81d3b16	Initialize the search.model field to SearchModels() and fix Reinitialize API call (#273 )	2023-07-03 11:32:44 -07:00
sabaimran	61403138d5	Merge pull request #269 from khoj-ai/features/simplify-configuration-steps Simplify some common configuration steps	2023-07-03 00:16:51 -07:00
sabaimran	ea3dc2cfa3	Simplify rendering of content type pages and logic of selecting config	2023-07-03 00:15:29 -07:00
sabaimran	260272dca2	Check if state.config is populated before configuring via the update method	2023-07-03 00:10:56 -07:00
sabaimran	bf8914d0c8	Fix default config initialization for chat.html	2023-07-03 00:00:47 -07:00
Debanjum	faad1297f4	Drop Support for Org Music, Ledger Content Types Removing unused content types will reduce khoj code to manage - `0f993b3` Drop support for Ledger as a separate content type Khoj will soon get a generic text indexing content type in Index plain text files #237. This along with a file filter should suffice for searching through Ledger transactions - `c9db532` Remove unused org-music as an indexable content type from Khoj Org-music was just a custom content type that worked with org-music. It was mostly only useful for me.	2023-07-02 17:48:29 -07:00
Debanjum Singh Solanky	0f993b332e	Drop support for Ledger as a separate content type Khoj will soon get a generic text indexing content type. This along with a file filter should suffice for searching through Ledger transactions, if required. Having a specific content type for niche use-case like ledger isn't useful. Removing unused content types will reduce khoj code to manage.	2023-07-02 16:57:49 -07:00
sabaimran	fa218ff5aa	Fix call to update for Reinitialize button	2023-07-02 16:31:30 -07:00
sabaimran	a8b83da872	Merge branch 'master' of github.com:debanjum/khoj into features/simplify-configuration-steps	2023-07-02 16:21:54 -07:00
Debanjum Singh Solanky	c9db5321e7	Remove unused org-music as an indexable content type from Khoj Org-music was just a custom content type that worked with org-music. It was mostly only useful for me. Cleaning up that code will reduce number of content types for khoj to manage.	2023-07-02 16:21:21 -07:00
sabaimran	77a45f4215	Merge pull request #265 from khoj-ai/fix/obsidian-setup-issues Fix configuration setup logic in Obsidian	2023-07-02 16:21:18 -07:00
sabaimran	b86a3bb0c5	Merge branch 'master' of github.com:debanjum/khoj into fix/obsidian-setup-issues	2023-07-02 16:21:05 -07:00
sabaimran	a52c1c8380	Use built-in app.vault to determine whether there are any PDF files within	2023-07-02 16:20:43 -07:00
sabaimran	eff1436857	Overwrite existing PDFs in Obsidian as well, make if-block more legible	2023-07-02 16:17:25 -07:00
Debanjum Singh Solanky	30459ee4ba	Fix Khoj subtitle in desktop entry, pyproject, cli and Obsidian Readme	2023-07-02 16:09:07 -07:00
sabaimran	feac71ce1e	Merge pull request #268 from khoj-ai/fix/threading-issue-in-update-api Add try-except-finally blocks around configure calls in /update	2023-07-02 16:08:29 -07:00
sabaimran	1a1b044d12	Simplify settings pages for configuration - Add one-click disablement - Remove fields that probably don't need to be edited (our implementation details) - Add a green tick if a given field is configured	2023-07-02 16:04:05 -07:00
sabaimran	e4c445f805	Add try-except-finally blocks around configure calls in /update	2023-07-02 13:35:02 -07:00
sabaimran	4b02a8c788	Fix PDF setup in Obsidian plugin and force Obsidian configuration for markdown	2023-07-02 12:37:24 -07:00
sabaimran	b6772d8fc3	Merge pull request #264 from khoj-ai/fix/remove-guidance-for-desktop-gui Escape special characters in the URL when adding a link to the remote file	2023-07-02 09:14:08 -07:00
sabaimran	2a7e4f2b71	Escape special characters in the URL when adding a link to the remote file	2023-07-02 09:13:28 -07:00
sabaimran	4915b7214d	Merge pull request #263 from khoj-ai/fix/remove-guidance-for-desktop-gui [Fix] Remove the default behavior of using GUI for Khoj	2023-07-01 21:37:11 -07:00
sabaimran	c747562897	Update the GUI to just be a simple box with a button for the web UI	2023-07-01 20:37:21 -07:00
sabaimran	bab7f39d47	Move logic to open the web browser into the GUI section	2023-07-01 20:11:27 -07:00
sabaimran	36537606da	Update unit test and preserve prior operational ordering in main.py	2023-07-01 20:02:35 -07:00
sabaimran	ea9ae4ae28	Configure Khoj to automatically open the browser to their web home page when Khoj is up	2023-07-01 19:46:31 -07:00
sabaimran	d2083dd395	Remove bespoke processing for GithubToJsonl file demo	2023-07-01 19:09:22 -07:00
sabaimran	a71440f62a	Update the guidance in the error message if config is not set	2023-07-01 19:09:00 -07:00
sabaimran	7db97d8aa9	Fix: don't try to render the search_type.ALL	2023-07-01 19:08:19 -07:00
sabaimran	f0f6390366	Make --no-gui the default behavior of Khoj and update corresponding documentation	2023-07-01 19:07:59 -07:00
Debanjum Singh Solanky	2fbc609233	Add content write permission to jobs in github release workflow	2023-07-01 06:23:45 -07:00
Debanjum Singh Solanky	d77e05c279	Release Khoj version 0.7.0	2023-07-01 05:44:22 -07:00
Debanjum Singh Solanky	32d73500ba	Update Khoj Github Plugin details in main Readme	2023-07-01 02:18:47 -07:00
Debanjum Singh Solanky	30d87a9a01	Update color of Khoj chat in Obsidian plugin to Lantern theme	2023-07-01 02:18:47 -07:00
Debanjum Singh Solanky	51826d28d6	Ensure clicking Update in Khoj Obsidian indexes PDF files too	2023-07-01 02:18:47 -07:00
sabaimran	dac2d14380	Handle file names appropriately for md files and render commits in github results	2023-07-01 01:20:58 -07:00
sabaimran	dbe713604d	Fix error in tests for markdown_to_jsonl	2023-07-01 00:49:40 -07:00
sabaimran	931aab4464	Handle case for when headers value is None	2023-07-01 00:37:30 -07:00
sabaimran	d01afb3ee4	Fix path issues for URL-based markdown files	2023-07-01 00:25:11 -07:00
sabaimran	01aa285d7b	Merge pull request #260 from khoj-ai/features/add-demo-views-for-khoj Add demo view for Khoj	2023-06-30 21:57:43 -07:00
sabaimran	31655447e7	Add the sign-up list to the chat page as well and update copy	2023-06-30 21:43:01 -07:00
sabaimran	cebaa51c2f	Merge branch 'master' of github.com:debanjum/khoj into features/add-demo-views-for-khoj	2023-06-30 20:39:02 -07:00
sabaimran	796102c74e	Add separate configuration if the given Khoj instance is meant for demo - In theory, this will be suitable for any Khoj instance that's meant for external-facing purposes (as in, outside of the user's network) - Prevent re-indexing for Github data if this is a demo instance - Fix up some issues with the CSS which made settings page small in mobile - In the frontend views for Khoj, add a button to get on the waitlist and links to the landing page	2023-06-30 20:38:55 -07:00
sabaimran	a443af3a71	Merge pull request #256 from khoj-ai/features/improve-telemetry Add additional request headers to improve telemetry	2023-06-30 20:35:41 -07:00
sabaimran	db3026739d	Resolve diffs in api.py to make /chat endpoint async with new request parameter	2023-06-30 00:25:37 -07:00
sabaimran	ef72508914	Try/catch around github file decoding, await call to search in chat API, fix img width	2023-06-30 00:23:21 -07:00
Debanjum Singh Solanky	b950889f47	Fix org-mode web renderer to handle results containing list in block - Break out of rendering list if at end of org block in org.js - This would previously hang rendering results in web interface Should try to fix this upstream in org.js as well	2023-06-29 19:01:25 -07:00
sabaimran	780c769567	Add additional request headers to improve telemetry	2023-06-29 18:51:24 -07:00
sabaimran	6c10d68262	Merge pull request #253 from khoj-ai/features/github-issues-indexing Support indexing Github issues as well as corresponding comments	2023-06-29 16:02:47 -07:00
sabaimran	b2dd946c6d	Rename issue to entry method for accuracy	2023-06-29 15:23:50 -07:00
Debanjum Singh Solanky	51dfa48e2b	Have Khoj support Python 3.11 as Pytorch supports it now - Previously Khoj could only support Python up to 3.10 due to pytorch. But lots of folks had python 3.11 installed by default on their machines. This required installing python 3.10 and dealing with virtual envs. With Torch >= 2.0.1 now able to support python 3.11, at least one class of installation troubles for Khoj should drop. See https://github.com/pytorch/pytorch/issues/86566 for reference - Preliminary testing indicates using the new torch 2.x may reduce search time by 25% (from 80ms to 60ms on Mac M1) - Update Docs to no longer mention that python <=3.10 is required - Update Github test workflow to run khoj tests with python 3.11 too	2023-06-29 15:13:26 -07:00
sabaimran	65bf894302	Interpret org files as a list and put them in separate divs. Update styling of search results to separate into cards	2023-06-29 15:12:48 -07:00
Debanjum Singh Solanky	d212298573	Make Configure button on web interface incrementally update by default We should add a way to force index everything. But force indexing should not be the default when user is just trying to update content to index	2023-06-29 14:52:51 -07:00
Debanjum Singh Solanky	da2de21339	Only return requested result count even if search in multiple content types - Set results_count to default value at start so it is an int, never None	2023-06-29 14:49:05 -07:00
sabaimran	77672ac0ae	Demarcate different results with a border box - Add back support for searching by type Github - Remove custom class name in markdown js file	2023-06-29 14:14:25 -07:00
sabaimran	6edc32f2f4	Accept current changes to include issues in rendering flow	2023-06-29 12:25:29 -07:00
Debanjum	f272d4503e	Search across all Asymmetric Text Content Types in Parallel - Allow searching across asymmetric text content types using threads - Query time on my Mac averages 95ms latency (140ms at 90 percentile) across (Org, Markdown, Github, PDF and Music content types) - This is not too much more than search for a single content type (maybe max ~50% latency increase?). Encoding query is what takes most of the time anyway and that's just done once like before, threading adds some overhead - An average of `95 ms` latency or `140ms` at 90th percentile is inline with keeping an incremental search (search-as-you-type) experience - Put logic to remove filter terms from query in a `defilter` method for each filter - Encode query once during search to encode query once across all (asymmetric) content types - Search across all content types via the web and emacs interfaces in [`d5fb419`](`d5fb4196de`) and [`5c4eb95`](`5c4eb950d5`) respectively - Allow Khoj Chat to pull relevant data from across content types (without the perf hit). Khoj chat is only pulling data from a single content type currently	2023-06-29 12:21:27 -07:00
sabaimran	b41c14b258	Use *.markdown in the khoj_docker.yml	2023-06-29 11:55:18 -07:00
sabaimran	e6053951f0	In chat conftest fixtures, use .markdown rather than .md	2023-06-29 11:53:47 -07:00
sabaimran	ab7dabe74f	Explicitly use Union type for function parameters for lint checks	2023-06-29 11:44:30 -07:00
sabaimran	601b738135	Bonus: Rename all md files to markdown for cleanliness	2023-06-29 11:27:47 -07:00
sabaimran	fecf6700d2	Limit small image rendering to just the avatar images	2023-06-29 11:27:18 -07:00
sabaimran	70e550250a	Add an additional data source for issues from Github repositories + quality of life updates - Use a request session to reduce the overhead of setting up a new connection with the Github URL each request - Use the streaming feature for the REST api to reduce some of the memory footprint	2023-06-29 10:59:54 -07:00
Debanjum Singh Solanky	5f2717cc4b	Use logger.warning since logger.warn is deprecated	2023-06-28 22:15:27 -07:00
Debanjum Singh Solanky	5f7eaa7ded	Add trio, move freezegun, factory-boy to project test dependencies	2023-06-28 22:07:02 -07:00
Debanjum Singh Solanky	56ce97ef9e	Use async/await in tests for query method of text and image search The text, image search query method has become async. So async/await is required to get results correctly in tests etc	2023-06-28 22:07:02 -07:00
Debanjum Singh Solanky	f516d127c8	Update client tests to expect "all" as a valid new content type	2023-06-28 22:07:02 -07:00
Debanjum Singh Solanky	b1767f93d6	Get any configured asymmetric search model to encode query for search - Set image_search.query to async to use it with multi-threading This is same as text_search.query being set to an async method - Exit search early if no search_model is defined in state.model	2023-06-28 22:07:02 -07:00
Debanjum Singh Solanky	8eae7c898c	Put each result under org heading when query for "all" content type in khoj.el - Add "all" as default content type when no content type retrieved from server	2023-06-28 22:07:02 -07:00
Debanjum Singh Solanky	630bf995f1	Style each result based on its content type in same view on Khoj web - So when searching across content types (with content-type = "all") org-mode results get rendered differently than markdown, PDF etc. results - Set div class for each result separately instead of a single uber div for styling. This allows styling div of each result based on the content-type of that result - No need to create placeholder "all" content type on web interface as server is passing an all content type by itself	2023-06-28 22:07:01 -07:00
Debanjum Singh Solanky	1773a78339	Fix createRequestUrl method signature to fetch results from khoj web	2023-06-28 12:10:45 -07:00
Debanjum Singh Solanky	212b1a96c8	Create "all" search type for search across all content types on khoj server Allows moving logic to handle search across all content types to server from clients	2023-06-28 11:34:26 -07:00
Debanjum Singh Solanky	0636ceaf14	Merge branch 'master' of github.com:khoj-ai/khoj into parallelize-search-across-all-asymmetric-text-content-types Conflicts: - src/khoj/routers/api.py: Use theirs	2023-06-27 16:10:32 -07:00
Debanjum Singh Solanky	510bb7e684	Use typing union in text_search for python 3.8 compatible type hinting	2023-06-27 15:59:50 -07:00
Debanjum Singh Solanky	1b11d5723d	Extract search request URL builder into js function in web interface	2023-06-27 15:50:41 -07:00
Debanjum Singh Solanky	09f739b8cc	Null check config, log warning instead of error when configuring search	2023-06-27 15:48:48 -07:00
sabaimran	c0d35bafdd	Merge pull request #250 from khoj-ai/features/github-multi-repo-and-more Support multiple Github repositories and support indexing of multiple file types	2023-06-27 15:14:49 -07:00
sabaimran	9d62d66a77	Simplify construction of repo shorthand in GithubToJsonl	2023-06-27 15:05:03 -07:00
sabaimran	2697c7a186	Update org tests to use new method, update Github configuration in tests	2023-06-27 15:04:48 -07:00
sabaimran	227169ebde	Support configuration of multiple Github repositories in the settings interface - Add cards to configure each of the Github repositories - Fix a bug in the API which caused all other settings to be wiped when updating one of the content types - Provide an error message to the user if they have a misconfiguration in their chat settings	2023-06-27 14:10:09 -07:00
sabaimran	37a1f15c38	Add backend support for indexing multiple repositories - Add support for indexing org files as well as markdown files from the Github repository and update corresponding search view - Support indexing a list of repositories	2023-06-27 12:06:15 -07:00
Debanjum Singh Solanky	5da6a5e669	Build docker image using latest khoj from git master - Previous state Ideally docker image should use latest app code available locally. But this is better than the previous state where the latest Docker image was being built using older khoj package published to pypi This would happen because the workflow to publish the khoj-assistant pypi package runs in parallel to the dockerize workflow so the latest khoj pypi package isn't published before the latest docker image is built on master - Updated state Now at least the docker image published via the dockerize github workflow will be built using the latest khoj code on github	2023-06-26 20:16:07 -07:00
sabaimran	ddd550e6f4	Add call to use X-CSRFToken in relevant POST methods	2023-06-26 12:38:00 -07:00
sabaimran	35e24d7851	Fix null checking in state for content config API and telemetry API	2023-06-26 11:37:34 -07:00
sabaimran	5e39421f56	Merge branch 'master' of github.com:debanjum/khoj	2023-06-25 11:41:47 -07:00
sabaimran	4410a3bb4b	Limit max width of the pre tag to 100% of the screen width	2023-06-25 11:41:15 -07:00
sabaimran	ffe66b848a	Use a single column template for config plugins when in mobile	2023-06-25 11:27:41 -07:00
Debanjum Singh Solanky	b1890aa050	Null check intermediary objects when config not fully initialized	2023-06-24 15:34:18 -07:00
Debanjum Singh Solanky	946af0889d	Improve showing status message on saving config via web interface - Show success/failure status message much closer to the save button Previously status message was shown on top of the page, which wasn't always in view and wasn't easily seen - Improve the status message to more clearly show next steps on success	2023-06-24 00:49:57 -07:00
Debanjum Singh Solanky	40d1abfe50	Update the new /config APIs to configure Khoj for first time users - Setup state.config and sub-components from unset state - Setup search types with default settings	2023-06-24 00:45:30 -07:00
Debanjum Singh Solanky	05a3c81adb	Add beautiful as dependency to pass pytests	2023-06-23 15:10:09 -07:00
Debanjum Singh Solanky	edabede93a	Fix post configuration state update on error or success on config html	2023-06-23 14:52:25 -07:00
Debanjum	98642e01b5	Update Web Interface with Lantern Theme - Style all pages with consistent lantern theme styling - Add navigation pane to all web interface pages - a200af68b38d0625c42e2098d171c6ddab121bd2 Keep pico.css locally for offline usage - cd8d069e6673b4db4c14f736c3d8af80bf94614d Highlight currently active tab in web interface - Update config pages to use Lantern theme	2023-06-23 14:39:25 -07:00
Debanjum Singh Solanky	4744d69221	Resolve button name, anchor tag feedback. Add status message to settings page - Use "Configure" name for settings config action - Use more standard anchor tag instead of button - Add configure status message	2023-06-23 09:48:38 -07:00
Debanjum Singh Solanky	26abafa658	Highlight currently active tab in web interface for orientation	2023-06-22 00:33:28 -07:00
Debanjum Singh Solanky	2728c714d7	Put pico.css in local assets. Move common css styling into khoj.css	2023-06-22 00:33:11 -07:00
Debanjum Singh Solanky	20a37697de	Add Khoj header with navigation pane to Search and Chat Interfaces	2023-06-22 00:33:11 -07:00
Debanjum Singh Solanky	c467a0cbb0	Update UI of config sub pages to use khoj lantern theme styling	2023-06-22 00:33:11 -07:00
Debanjum Singh Solanky	0ce2ec590a	Update main config page on khoj server to match khoj lantern theme	2023-06-21 20:25:25 -07:00
Debanjum Singh Solanky	d30a9ddd33	Use Khoj Logo on Search, Chat pages of Web Interface	2023-06-21 12:34:53 -07:00
Debanjum Singh Solanky	6d4aad57e1	Use new Khoj Lantern Logo in Web, Emacs, Obsidian UIs and Docs	2023-06-21 01:57:22 -07:00
Debanjum Singh Solanky	69d4fa6525	Rename project links across repo from debanjum/khoj to khoj-ai/khoj	2023-06-21 00:13:21 -07:00
Debanjum Singh Solanky	5c4eb950d5	Search across all content types via khoj.el on Emacs If no content-type selected in transient menu option, khoj.el queries khoj server without content-type parameter (t) set. This results in search across all enabled asymmetric search text content types	2023-06-20 23:39:56 -07:00
Debanjum Singh Solanky	2cd3e799d3	Improve null and type checks	2023-06-20 23:30:59 -07:00
Debanjum Singh Solanky	d5fb4196de	Update web interface to allow querying all content types at once	2023-06-20 22:21:50 -07:00
Debanjum Singh Solanky	5c7c8d1f46	Use async/await to fix parallelization of search across content types	2023-06-20 22:21:50 -07:00
Debanjum Singh Solanky	1192e49307	Pass default value matching argument types expected by text_search methods	2023-06-20 22:21:50 -07:00
Debanjum Singh Solanky	0144e610d6	Only search across content types that work with asymmetric search	2023-06-20 22:21:46 -07:00
Debanjum Singh Solanky	f6a7aa6c96	Style Khoj chat on web interface with new lantern theme - Color khoj chat message with new yellow theme color - Update Khoj chat emoji to lantern - Add page type to title of pages on web interface	2023-06-20 01:39:33 -07:00
Debanjum Singh Solanky	6d94d6e75a	Encode the asymmetric, symmetric search queries in parallel for speed Use timer to measure time to encode queries and total search time	2023-06-20 01:18:17 -07:00
Debanjum Singh Solanky	d292dc03b3	Use new Khoj Logotype in Web interface	2023-06-20 01:13:06 -07:00
Debanjum Singh Solanky	db07362ca3	Encode user query as same across search types to speed up query time - Add new filter abstract method to remove filter terms from query - Use the filter method to remove filter terms, encode this defiltered query and pass it to the query methods of each search types TODO: Encoding query is still taking 100-200 ms unlike before. Need to investigate why	2023-06-19 23:29:54 -07:00
Debanjum Singh Solanky	285d17af2a	Search in parallel across all enabled content types requested via API - Update API to return content from all enabled content types when type is not set to specific type in HTTP request param - To do this efficiently run the search queries in parallel threads	2023-06-19 23:29:06 -07:00
Debanjum Singh Solanky	79d325fbb6	Fix triggering @general queries in Khoj Chat	2023-06-19 23:05:33 -07:00
Debanjum Singh Solanky	e97a20d70c	Set conversation type if query param set, else return chat history Only initialize variables if query is not empty, to avoid unnecessary compute, variable null checks etc. Fixes #230	2023-06-19 19:59:16 -07:00
sabaimran	6224dce49d	Merge pull request #228 from debanjum/features/pretty-config-page Update the config page to be more usable	2023-06-19 18:11:35 -07:00
sabaimran	4722a2c16d	Add Github configuration page and success notifications	2023-06-18 10:06:45 -07:00
sabaimran	668135c763	Merge branch 'master' of github.com:debanjum/khoj into features/pretty-config-page	2023-06-18 08:35:09 -07:00
sabaimran	81183a1fe1	Address misc PR comments and update logo in all clients - Rename the new logo to reflect accuracy on size (e.g., 128x128) - Update the icns file for Mac - Update nomenclature in settings pages	2023-06-18 08:34:58 -07:00
Debanjum Singh Solanky	a44cde2865	Show hint to re-index vault if wonky results in Obsidian search modal Remove spurious indentation in Obsidian styles.css Resolves #207	2023-06-18 04:53:51 -07:00
Debanjum Singh Solanky	595cc5b0f5	Use printer icon for PDF logs. Only split lines if file at web link in web interface	2023-06-18 02:26:03 -07:00
Debanjum	e06be395f9	Use Github REST API and Index Commit Messages off Github Repository - Migrate to Github REST API instead of Llama Hub to index Markdown Docs in Github Repository - Index Commit Messages from Github Repository as well	2023-06-18 14:51:32 +05:30
Debanjum Singh Solanky	e31a540a5e	Get all md files recursively in repository by passing recursive param Previously the `get_markdown_files' method was only getting files at root of the repository Fix, improve logger messages in github to jsonl processor	2023-06-18 01:47:15 -07:00
Debanjum Singh Solanky	6fdac24416	Set page size to 100 to reduce requests required to Github API to 1/3 - Default is 30. So number of paginated requests required to get all items (commits, files) will reduce by 67% - No need to increase page size for the get tree Github API request from `get_markdown_files' Get tree Github API doesn't support pagination and returns 100K items in response. This should be way more than enough for our current use-cases	2023-06-18 01:44:36 -07:00
Debanjum Singh Solanky	87975e589a	Fix passing auth token to Github API to increase rate limits by x85 - Previously wasn't prefixing "token" to PAT token in Auth header This resulted in the request being considered unauthenticated - Unauthenticated requests to Github API are limited to 60 requests/hour Authenticated requests to Github API are allowed 5000 requests/hour	2023-06-18 01:19:26 -07:00
Debanjum Singh Solanky	9c70af960c	Extract logic to get file content from Github into a separate method	2023-06-18 01:19:13 -07:00
Debanjum Singh Solanky	10d4c38ce9	Extract Wait for rate limit reset logic into a function for reuse	2023-06-18 01:06:46 -07:00
sabaimran	aad7f825e0	Remove music configuration	2023-06-17 21:23:56 -07:00
sabaimran	5f97afbfac	Ignore type checks from mypy in subindexed fields	2023-06-17 16:53:36 -07:00
sabaimran	c2d46de8bc	Add endpoint for regenerating directly from the config page and add music content-type	2023-06-17 15:47:33 -07:00
sabaimran	ded3100caf	Update the configuration page to make config management easier - Add a central configuration management page to make management of config details easier - Add relevant api endpoints both for client and server to update/request data as necessary - Attempt to update the favicon	2023-06-17 15:21:28 -07:00
Debanjum Singh Solanky	3f24e53b6e	Render URL as link in web interface if file param of result is a web link	2023-06-17 04:26:40 -07:00
Debanjum Singh Solanky	63ec84ad78	Store Github URL of Markdown files on Github in file jsonl param	2023-06-17 04:23:01 -07:00
Debanjum Singh Solanky	0c1c7583b5	Handle pagination, API rate limits. Get all commits from Github repo	2023-06-17 04:21:39 -07:00
Debanjum Singh Solanky	31d17d0b22	Index commits message from repository with the github plugin	2023-06-17 02:59:54 -07:00
Debanjum Singh Solanky	c29c141a7e	Use Github Rest API to index Markdown files in Github Repository The Llama_Hub Github plugin is fairly limited. The Github Rest API is well supported and can easily be extended to index commit messages, issues, discussions, PRs etc.	2023-06-17 02:16:13 -07:00
Debanjum	9f00a366ab	Add a Github plugin to index content from a Github repository - Use the Github plugin on LlamaHub to read in markdown files from specified Github repository for indexing - Update the desktop GUI application to take in the required parameters to read from Github - Requires a classic PAT token for Github access	2023-06-17 12:28:47 +05:30
Saba	ac96f43b1b	Remove try-catch specific to Github plugin; consolidate GUI logic	2023-06-16 23:46:25 -07:00
Saba	07ade2262a	Set default value of pat_token in conftest.py to be empty string	2023-06-13 17:03:03 -07:00
Saba	751edfefe5	Add separate unit test for github. Will only run if a PAT token is set	2023-06-13 16:55:58 -07:00
Saba	3a61919344	Fix failing unit tests by hard-coding model presence of expected search types	2023-06-13 16:32:47 -07:00
Saba	019d3732de	Rename orgmode_search to org_search	2023-06-13 16:06:54 -07:00
Saba	08d79f5ba4	Unify types used in Github and other text-based configs. Fix typing issues	2023-06-13 15:52:36 -07:00
Saba	a6cd96a6a9	Add a Github plugin which can be used to read from a Github repository	2023-06-13 14:40:06 -07:00
Debanjum	c68cde4803	Log clients calling API endpoints on Khoj server - Make API endpoints on Khoj server accept `client` as request parameter - Khoj API endpoints: /chat, /search, /update - Make Khoj clients set `client` request param when calling the API endpoints on the Khoj server - Khoj clients: Emacs, Obsidian and Web - Also log khoj server_version running to telemetry server	2023-06-09 18:36:49 +05:30
sabaimran	59fa48036f	Merge pull request #224 from debanjum/fix/message-exceeds-prompt-size Pass truncated message as string in ChatMessage when exceeding max prompt size	2023-06-08 17:32:53 -07:00
Debanjum Singh Solanky	139a3ba060	Update server to log new server version field to telemetry db	2023-06-08 14:14:21 +05:30
Saba	c5666e0404	Move factory dependencies to optional settings	2023-06-06 23:26:24 -07:00
Saba	5d5ebcbf7c	Rename truncate messages method and update unit tests to simplify assertion logic	2023-06-06 23:25:43 -07:00
Saba	7119ed0849	Run pre-commit script	2023-06-05 19:29:23 -07:00
Saba	948ba6ddca	Remove unused logger	2023-06-05 19:01:03 -07:00
Saba	6212d7c2e8	Remove debug line	2023-06-05 19:00:25 -07:00
Saba	f65ff9815d	Move message truncation logic into a separate function. Add unit tests with factory boy.	2023-06-05 18:58:29 -07:00
Debanjum Singh Solanky	eb6175e9b0	Update description field in webmanifest of Khoj, Khoj Chat PWA	2023-06-06 01:53:42 +05:30
Debanjum Singh Solanky	bb2363f324	Set client request param when calling khoj server APIs from Web	2023-06-06 00:05:00 +05:30
Debanjum Singh Solanky	caab55fbdd	Set client request param when calling khoj server APIs from Obsidian	2023-06-06 00:04:46 +05:30
Debanjum Singh Solanky	de2494154f	Set client request param when calling khoj server APIs from Emacs	2023-06-06 00:02:10 +05:30
Debanjum Singh Solanky	168c11cea7	Make server API endpoints accept client as query param - The chat, search and update API will accept client as request param. - This will allow logging the client from which these APIs were called.	2023-06-05 23:57:08 +05:30
Debanjum Singh Solanky	8617cf1389	Push telemetry to Posthog to grok Khoj usage	2023-06-05 22:47:49 +05:30
Debanjum Singh Solanky	d13db2e666	Make old telemetry server forward requests to new server	2023-06-05 13:06:45 +05:30
Saba	5f4223efb4	Increase timeout to OpenAI call	2023-06-04 20:49:47 -07:00
Saba	0e63a90377	Fix the mechanism to retrieve the message content	2023-06-04 20:25:37 -07:00
Saba	f0efe0177e	Pass truncated message as string in ChatMessage when exceeding max prompt size	2023-06-04 19:33:46 -07:00
Debanjum	f6ceb22373	Use api_key keyword argument to set the openai_api_key parameter for GPT	2023-06-04 15:05:34 +05:30
Saba	068ee0ac5e	Swap elif with else, as usage of this method does not use openai_api_key	2023-06-04 02:25:08 -07:00
Saba	6508379d7b	Use api_key keyword argument to set the openai_api_key parameter for GPT	2023-06-04 00:57:00 -07:00
Debanjum Singh Solanky	7af8a56434	Remove filename from reference before rendering references in khoj.el Fixes bug where actual reference heading in next line jumping out of references footnote section	2023-06-02 10:42:44 +05:30
Debanjum Singh Solanky	ec280067ef	Do not retrieve relevant notes when having a general chat with Khoj - This improves latency of @general chat by avoiding unnecessary compute - It also avoids passing references in API response when they haven't been used to generate the chat response. So interfaces don't have to add logic to not render them unnecessarily	2023-06-02 10:42:44 +05:30
Debanjum Singh Solanky	90439a8db1	Update Khoj subtitle to AI personal assistant for your digital brain	2023-06-02 10:42:44 +05:30
Debanjum	e022910f31	Search PDF files with Khoj. Integrate with LangChain - Introduce Khoj to LangChain: Call GPT with LangChain for Khoj Chat - Search (and Chat about) PDF files with Khoj - Create PDF to JSONL Processor: Convert PDF content into standardized JSONL format - Expose PDF search type via Khoj server API - Enable querying PDF files via Obsidian, Emacs and Web interfaces	2023-06-02 10:20:26 +05:30
Debanjum Singh Solanky	e9ed7a19fd	Update search prompt to extract PDF search type. Fix extract_question prompt	2023-06-02 10:06:03 +05:30
Debanjum Singh Solanky	89fbfce20a	Mention PDF are also supported in Khoj Readme	2023-06-01 21:42:48 +05:30
Debanjum Singh Solanky	bbe3bf9733	Render PDF search results in Khoj Obsidian interface - Make plugin update khoj server config to index PDF files in vault too - Make Obsidian plugin update index for PDF files in vault too - Show PDF results in Khoj Search modal as well - Ensure combined results are sorted by score across both types - Jump to PDF file when selecting a PDF search result from modal	2023-06-01 21:42:48 +05:30
Debanjum Singh Solanky	e3892945d4	Render PDF search results in Khoj.el Emacs interface	2023-06-01 21:42:48 +05:30
Debanjum Singh Solanky	85144006a1	Render PDF search results in khoj web interface	2023-06-01 21:42:48 +05:30
Debanjum Singh Solanky	acd14a5e41	Wire up PDF to jsonl processor to Khoj server layer (API, config) - Specify PDF content to index via khoj.yml - Index PDF content on app start, reconfigure - Expose PDF as a search type via API	2023-06-01 21:42:48 +05:30
Debanjum Singh Solanky	d63194c3a9	Create tests for PDF to JSONL processor	2023-06-01 21:42:48 +05:30
Debanjum Singh Solanky	286b500f66	Create PDF to JSONL processor using PyPDF and LangChain Switch `pydantic' to >= 1.9.1 else `langchain.document_loaders' starts throwing typing error for python 3.8, 3.9	2023-06-01 21:41:49 +05:30
Debanjum Singh Solanky	1b3effd8e6	Fork Markdown to JSONL processor as start template for PDF to Jsonl Processor	2023-06-01 09:13:31 +05:30
Debanjum Singh Solanky	1cd9ecd449	Truncate last message if still over max supported prompt size by model	2023-06-01 08:50:59 +05:30
Debanjum Singh Solanky	ed4d0f9076	Simplify argument names used in khoj openai completion functions - Match argument names passed to khoj openai completion funcs with arguments passed to langchain calls to OpenAI - This simplifies the logic in the khoj openai completion funcs	2023-06-01 08:50:59 +05:30
Debanjum Singh Solanky	703a7c89c0	Reduce retry count and request timeout for faster response or failure - Fix bug where both LangChain and Khoj retry requests 6 times each. So a total of 12 requests at >1minute intervals for each chat response in case of OpenAI API being down - Retrying too many times when the API is failing doesn't help - The earlier 60 second request timeout was spacing out the interval between retries way too much. This slowed down chat response times quite a bit when API was being flaky - With these updates you'll know if call to chat API failed in under a minute	2023-06-01 08:50:59 +05:30
Debanjum Singh Solanky	18081b3bc6	Use LangChain to call GPT over API	2023-06-01 08:50:59 +05:30
Debanjum Singh Solanky	277d2f5c96	Do not add "Notes:" suffix to chat messages when no notes retrieved This was causing spurious "Notes:" suffix being added to Khoj Chat in response	2023-06-01 08:50:59 +05:30
Debanjum Singh Solanky	334be4e600	Use LangChain to call OpenAI for Khoj Chat - Use ChatModel and ChatOpenAI to call OpenAI chat model instead of using OpenAI package directly - This is being done as part of migration to rely on LangChain for creating agents and managing their state	2023-06-01 08:50:59 +05:30
Debanjum Singh Solanky	efcf7d1508	Extract prompts as LangChain Prompt Templates into a separate module Improves code modularity, cleanliness. Reduces bloat in GPT.py module	2023-06-01 08:50:58 +05:30
Debanjum Singh Solanky	b484953bb3	Import app state correctly to generate embeddings with OpenAI model Resolves #216	2023-05-28 10:21:54 +05:30
Debanjum Singh Solanky	9cfaaf0941	Update docs to configure khoj.yml for using OpenAI model for embeddings	2023-05-28 10:21:54 +05:30
Debanjum Singh Solanky	a0d0dbaca7	Fix link to Khoj Obsidian Demo video in Readmes	2023-05-23 04:23:08 +05:30
Debanjum Singh Solanky	ebb5d7b8e5	Release Khoj version 0.6.2	2023-05-17 20:04:20 +05:30
Debanjum Singh Solanky	d02415edcc	Write generated server id to env file when env file does not contain it	2023-05-17 19:38:44 +05:30
Debanjum Singh Solanky	dc0626856e	Put the telemetry db in a separate directory by default	2023-05-17 18:58:47 +05:30
Debanjum	dc495babb3	Add Telemetry to Understand Khoj Usage ### Objective: Use telemetry to better understand Khoj usage. This will motivate and prioritize work for Khoj. Specific questions: - Number of active deployments of khoj server - How regularly is khoj used (hourly, daily, weekly etc)? - How much is which feature used (chat, search)? - Which UI interface is used most (obsidian, emacs, web ui)? ### Details - Expose setting to disable telemetry logging in khoj.yml - Create basic telemetry server to log data to a DB - Log calls to Khoj API /search, /chat, /update endpoints - Batch upload telemetry data to server at ~hourly interval	2023-05-17 19:09:50 +08:00
Debanjum Singh Solanky	55d72231b3	Generate docker image for telemetry server using Github workflow	2023-05-17 16:08:21 +05:30
Debanjum Singh Solanky	e9f04dc644	Add dockerfile to containerize telemetry server	2023-05-17 16:08:21 +05:30
Debanjum Singh Solanky	07b19964d4	Schedule jobs at (co-)prime intervals to reduce overlap in job runs	2023-05-17 16:08:21 +05:30
Debanjum Singh Solanky	d42f0f5055	Add basic telemetry server for khoj	2023-05-17 16:08:21 +05:30
Debanjum Singh Solanky	134cce9d32	Batch upload telemetry data at regular interval instead of while querying	2023-05-17 16:08:21 +05:30
Debanjum Singh Solanky	3ede919c66	Log usage of /search, /chat, /update API endpoints to telemetry server	2023-05-17 16:08:21 +05:30
Debanjum Singh Solanky	f2e89f6f46	Add khoj app helper methods to log app usage to a telemetry server	2023-05-17 16:08:21 +05:30
Debanjum Singh Solanky	9ca61d62ff	Enable/disable logging telemetry by setting bool in khoj.yml config We log usage telemetry by default, unless setting explicitly set in khoj.yml	2023-05-15 23:26:38 +08:00
Debanjum Singh Solanky	131b8407b5	Allow Khoj Chat to respond to general queries not in reference notes - Khoj chat will now respond to general queries if: 1. no relevant reference notes available or 2. when explicitly induced by prefixing the chat message with "@general" - Previously Khoj Chat would a lot of times refuse to respond to general queries not answerable from reference notes or chat history - Make chat quality tests more robust - Add more equivalent chat response options refusing to answer - Force haiku writing to not give any preamble, just the haiku	2023-05-12 18:42:40 +08:00
Debanjum Singh Solanky	cc75f986b2	Test text search index only updates on changes to text content	2023-05-12 17:37:34 +08:00
Debanjum Singh Solanky	f9ccce430e	Allow configuring OpenAI chat model for Khoj chat - Simplifies switching between different OpenAI chat models. E.g GPT4 - It was previously hard-coded to use gpt-3.5-turbo. Now it just defaults to using gpt-3.5-turbo, unless chat-model field under conversation processor updated in khoj.yml	2023-05-03 23:01:13 +08:00
Debanjum	f0253e2cbb	Include Filename, Entry Heading in All Compiled Entries to Improve Search Context Merge pull request #214 from debanjum/add-filename-heading-to-compiled-entry-for-context - Set filename as top heading in compiled org, markdown entries - Note: Khoj was already indexing filenames in compiled markdown entries but they weren't set as top level headings but rather appended as bare text. The updated structure should provide more schematic context of relevance - Set entry heading as heading for compiled org, md entries, even if split by max tokens - Snip prepended heading to avoid crossing model max_token limits - Entries with no md headings should not get heading prefix prepended	2023-05-03 22:59:30 +08:00
Debanjum Singh Solanky	6b535cc345	Snip prepended heading to avoid crossing model max_token limits Otherwise if heading > max_tokens then the search models will just see a heading (with repeated filename) for each compiled entry and not actual content. 100 characters should be sufficient to include filename (not path) and entry heading. If longer rather truncate to pass entry unique text to model for search context	2023-05-03 22:53:13 +08:00
Debanjum Singh Solanky	02aeee60aa	Set filename as top heading of org entries for better search context Previously filename was only being appended to markdown entries. Test filename getting prepended to compiled entry as heading	2023-05-03 22:53:13 +08:00
Debanjum Singh Solanky	94825a70b9	Set heading of md entries to improve search context for long entries Otherwise if a markdown entry is longer than max_tokens, the split entries (apart from first one) do not get their heading context set	2023-05-03 22:53:13 +08:00
Debanjum Singh Solanky	5de04621b5	Set filename as top heading of md entries for better search context Previously filename was appended to the end of the compiled entry. This didn't provide appropriate structured context Test filename getting prepended as heading to compiled entry	2023-05-03 22:50:31 +08:00
Debanjum Singh Solanky	0e3fb59e09	Entries with no md headings should not get heading prefix prepended Files with no headings would previously get their entry be prefixed with a markdown heading prefix (#)	2023-05-03 22:50:31 +08:00
Debanjum Singh Solanky	45a991d75c	Prepend entry heading to all compiled org snippets to improve search context All compiled snippets split by max tokens (apart from first) do not get the heading as context. This limits search context required to retrieve these continuation entries	2023-05-03 22:50:31 +08:00
Debanjum Singh Solanky	3386cc92b5	Fix khoj server config update in khoj.el by unquoting list to cl-push to - cl-push expects a generalized variable. Else throws (setf quote) undefined warning - This results in the config call failing on calling khoj entrypoint	2023-05-03 15:10:56 +08:00
Debanjum Singh Solanky	948a4274e4	Fix documentation strings and simplify not null checks	2023-05-02 21:47:50 +08:00
Debanjum Singh Solanky	731ef5688f	Use cl-pushnew to fix byte-compile errors with using add-to-list	2023-05-02 21:47:38 +08:00
Debanjum Singh Solanky	f046523b33	Improve khoj.el messages to convey state of khoj server - Remove waiting for server message as it hides the messages from the server - Fix the nil message that were being rendered, by checking before showing messages from server - Consistently prefix messages from khoj with khoj.el	2023-04-28 11:15:13 +08:00
Debanjum Singh Solanky	76df393eb5	Only call khoj server configure API from khoj.el when config updated Previously khoj.el was calling the server configure API even when config was same as before. This had broken the khoj search as you type experience from emacs Also show more details to user about what in khoj is being configured	2023-04-27 20:45:16 +08:00
Debanjum Singh Solanky	ceae06ae9d	Fix khoj.el compilation warnings around unused variables	2023-04-27 20:45:16 +08:00
Debanjum Singh Solanky	8269adf849	Refactor khoj-setup in khoj.el for readability. No functional change	2023-04-27 20:45:00 +08:00
Debanjum Singh Solanky	865d12b6f2	Fix escaping quote in chat references to prevent it breaking out of html	2023-04-27 20:45:00 +08:00
Debanjum Singh Solanky	26cb878327	Add Yarn lockfile for Khoj Obsidian	2023-04-18 00:57:11 +07:00
Debanjum Singh Solanky	e3180d63e6	Sync Khoj Obsidian Tagline with Khoj tagline	2023-04-18 00:56:50 +07:00
Debanjum Singh Solanky	62e6e09521	Release Khoj version 0.6.1	2023-04-17 23:31:35 +07:00
Debanjum Singh Solanky	b079fb31bc	Replace Windows path separators in indexName configured via Khoj Obsidian Resolves #185, #199 - Issue IndexName created from Obsidian Absolute Vault path wasn't replacing windows path, drive separators with underscore. It was only replacing unix path separators - Fix Also replace windows drive and path separators with _ while creating IndexName in Khoj Obsidian plugin	2023-04-17 16:55:33 +07:00
Debanjum Singh Solanky	d90df966a9	Make khoj logger use utf-8 encoding when writing to khoj log file Resolve logger error issue mentioned in #199	2023-04-17 16:55:07 +07:00
Debanjum Singh Solanky	dc3f399f91	Fix to get score associated with SearchResponse in result as string	2023-04-16 20:22:51 +07:00
Debanjum Singh Solanky	d5000c63e1	Update Readmes to use python -m pip install khoj-assistant Makes it easier to tell pip associated with which python is being used. Easier to debug when users have different versions of python installed (e.g 3.10 and 3.11)	2023-04-16 20:17:20 +07:00
Debanjum Singh Solanky	453c84ab79	Add Screenshots of Khoj Chat Interface on Emacs, Obsidian to Readmes	2023-04-07 23:19:47 +07:00
Debanjum Singh Solanky	35aa06067f	Release Khoj version 0.6.0 Upload styles.css via release workflow	2023-03-31 18:13:16 +07:00
Debanjum	8f4e5d3d83	Improve Styling of Khoj Search Modal on Obsidian and Indexing of Markdown Merge pull request #198 from debanjum/improve-khoj-search-for-markdown-obsidian ### Overview - Copied Khoj Search Modal styling from Jim Prince's PR #135 with minor improvements - Implements improvements to the Khoj Search in Markdown/Obsidian suggested by folks. Specifically: - #133 - #134 - #142 ### Changes - `5673bd5` Keep original formatting in compiled text entry strings - `a2ab68a` Include filename of markdown entries for search indexing - `6712996` Create Note with Query as title from within Khoj Search Modal - `d3257cb` Style the search result. Use Obsidian theme colors and font-size - `4009148` For each result: snip it by lines, show filename, remove frontmatter	2023-03-30 14:15:23 +07:00
Debanjum Singh Solanky	5673bd5b96	Keep original formatting in compiled text entry strings - Explicitly split entry string by space during split by max_tokens - Prevent formatting of compiled entry from being lost - The formatting itself contains useful information No point in dropping the formatting unnecessarily, even if (say) the current search models don't account for it (yet)	2023-03-30 14:02:46 +07:00
Debanjum Singh Solanky	a2ab68a7a2	Include filename of markdown entries for search indexing Append originating filename to compiled string of each entry for better search quality by providing more context to model Update markdown_to_jsonl tests to ensure filename being added Resolves #142	2023-03-30 13:51:36 +07:00
Debanjum Singh Solanky	67129964a7	Create Note with Query as title from within Khoj Search Modal This follows expected behavior for Obsidian search modals E.g Omnisearch and default Obsidian search. The note creation code is borrowed from Omnisearch. Resolves #133	2023-03-30 13:51:36 +07:00
Debanjum Singh Solanky	d3257cb24e	Style the search result. Use Obsidian theme colors and font-size Based on PR #135	2023-03-30 12:35:29 +07:00
Debanjum Singh Solanky	40091489c0	For each result: snip it by lines, show filename, remove frontmatter Based on PR #135 Resolves #134	2023-03-30 12:34:55 +07:00
Debanjum Singh Solanky	240db7b4f0	Add screenshot of Khoj chat on Obsidian to Readme. Fix links	2023-03-30 02:49:05 +07:00
Debanjum Singh Solanky	234be96e53	Fix processor key used to configure chat model in khoj obsidian	2023-03-30 01:47:09 +07:00
Debanjum	53d421f9c6	Create Chat Modal for Obsidian Plugin Merge pull request #196 from debanjum/create-chat-modal-for-obsidian - Set your OpenAI API key in the Khoj Obsidian Settings - Use Modal in Obsidian for Chat - Style Chat Modal combining the Khoj Web interface and Obsidian theme style	2023-03-30 01:37:07 +07:00
Debanjum Singh Solanky	c8c0cfd10e	Add Chat features, setup and usage to Khoj Obsidian plugin Readme	2023-03-30 00:32:24 +07:00
Debanjum Singh Solanky	7ecae224e7	Configure OpenAI API Key from the Khoj plugin setting in Obsidian	2023-03-29 23:54:08 +07:00
Debanjum Singh Solanky	3d616c8d65	Use Obsidian font sizes. Improve input field, reference indexing - Give space in the input field. Too narrow previously - References should be indexed from 1 instead of 0 - Use Obsidian font size variables to scale fonts in chat appropriately	2023-03-29 22:13:55 +07:00
Debanjum Singh Solanky	23bd737f6b	Use chat input element to send message on Enter. No send button required	2023-03-29 22:13:30 +07:00
Debanjum Singh Solanky	81e98c3079	Scroll to bottom of modal on open and message send	2023-03-29 18:12:12 +07:00
Debanjum Singh Solanky	59ff1ae27f	Use obsidian theme colors for bg, text. Restrict css namespace via prefix	2023-03-29 18:12:12 +07:00
Debanjum Singh Solanky	001ac7b5eb	Style Obsidian Chat Modal like Khoj Chat Web Interface - Add message sender, date metadata as message footer - Use css directly from Khoj Chat Web Interface. - Modify it to work under a Obsidian modal - So replace html, body styling from web interface to instead styling new "khoj-chat" class attached to contentEl of modal	2023-03-29 18:12:12 +07:00
Debanjum Singh Solanky	112f388ada	Render references next to chat responses by khoj in chat modal	2023-03-28 18:11:03 +07:00
Debanjum Singh Solanky	1d3d949962	Render conversation logs on page load	2023-03-28 14:56:29 +07:00
Debanjum Singh Solanky	cd46a17e5f	Add Khoj Chat Modal, Command in Khoj Obsidian to Chat using API	2023-03-28 14:56:29 +07:00
Debanjum Singh Solanky	c0972e09e6	Rename KhojModal to KhojSearchModal, a more specific name for it In preparation to introduce Khoj chat in Obsidian	2023-03-28 14:56:29 +07:00
Debanjum Singh Solanky	64fff1d372	Release Khoj version 0.5.0	2023-03-28 03:35:59 +07:00
Debanjum Singh Solanky	7478d08803	Update main readme to mention chat features	2023-03-27 22:02:53 +07:00
Debanjum Singh Solanky	fc218508f9	Update khoj.el docs and Emacs Readme for chat, simplified setup	2023-03-27 22:02:47 +07:00
Debanjum	87090531da	Install, Start and Configure Khoj Server from Emacs Merge pull request #193 from debanjum/simplify-khoj-server-setup-on-emacs ## Major Changes - `ae535a0` Configure Khoj chat using khoj.el by setting OpenAI API key in Emacs - `82eb4bf` Setup Khoj server on opening khoj.el - `99d19dc` Start Khoj server from Emacs using khoj.el - `c92d791` Install Khoj server from Emacs using khoj.el This assumes you have python (<3.11) and pip installed in a system path ### Sample Config - Enable Khoj Chat by configuring your OpenAI API Key - Specify Org Files, Directories to Index for Search (and Chat) By default, your org-agenda-files (including archive files) are indexed - Invoke khoj by calling `C-c s` ``` emacs-lisp (use-package khoj :after org :straight (khoj :type git :host github :repo "debanjum/khoj" :files ("src/interface/emacs/khoj.el")) :bind ("C-c s" . 'khoj) :config (setq khoj-openai-api-key "<YOUR_OPENAI_API_KEY_FOR_KHOJ_CHAT>" khoj-org-directories '("~/docs/notes" "~/docs/journals") khoj-org-files '("~/docs/tasks.org" "~/docs/journal.org" "~/docs/archive.org"))) ```	2023-03-27 18:49:43 +07:00
Debanjum Singh Solanky	83a7ccd729	Fix docstrings and method ordering in khoj.el	2023-03-27 18:33:09 +07:00
Debanjum Singh Solanky	5c2327ee4f	Configure org directories to index from khoj.el Converts paths to glob style regexes that will index all org files recursively under the specified list of path Should help setup for org-roam users from khoj.el	2023-03-27 18:30:53 +07:00
Debanjum Singh Solanky	6e8a40906d	Allow disabling automatic server setup. Fix server start vs ready logic - khoj-auto-setup controls whether to automatically check for and setup khoj server from within Emacs - extract install, start, configure sequence into public, interactive method. Allows calling khoj-setup during package load via init.el - Fix: Do not attempt to configure or wait for server ready if user has said no to auto-setup request - Fix logic to mark server started vs ready - Previously the started/running vs ready variables defs were getting intertwined - Server started indicates server bootup has been triggered - Server ready indicates server API ready to accept requests	2023-03-27 17:53:08 +07:00
Debanjum Singh Solanky	526a927bce	Fix org entry extraction test, variable prefixed with khoj in khoj.el Discovered via failing build and test workflows on Github	2023-03-27 16:44:50 +07:00
Debanjum Singh Solanky	7243059507	Track index update asynchronously via moon phase progressbar in khoj.el	2023-03-27 06:01:04 +07:00
Debanjum Singh Solanky	8a9055f918	Restrict server messages shown in echo area to main server files	2023-03-27 04:59:55 +07:00
Debanjum Singh Solanky	ae535a06eb	Configure Khoj chat using khoj.el by setting OpenAI API key in Emacs	2023-03-27 04:59:54 +07:00
Debanjum Singh Solanky	36b17d4ae0	Generalize the directory from config extraction elisp method	2023-03-27 03:44:03 +07:00
Debanjum Singh Solanky	924424c754	Throw actionable exceptions when content types or chat not configured	2023-03-27 02:47:44 +07:00
Debanjum Singh Solanky	359a2cacef	Fix khoj--server-running to work with unconfigured or external server - If khoj server started outside emacs, khoj--server-ready should be set to true by khoj--server-running method (instead of waiting for proc msg) - If khoj server is unconfigured the /config/types endpoint wouldn't return anything. Using config/data/default allows checking khoj server running status without requiring it to be configured as well	2023-03-27 02:45:59 +07:00
Debanjum Singh Solanky	d7fb9a596e	Auto configure server before loading khoj-menu If the config hasn't changed there'll be no update. If config has changed indexing will get triggered asynchronously. But user cannot make query till indexing done As easier to know when server ready to configure	2023-03-27 02:44:02 +07:00
Debanjum Singh Solanky	8a21aff438	Make khoj.el server start, stop, restart, setup methods interactive No need to erase temporary buffers before working on them	2023-03-27 01:53:15 +07:00
Debanjum Singh Solanky	cb40a96c85	Index configured org files from khoj.el - Set `khoj-org-files-index' to list of files to index - Defaults to indexing org-agenda-files - Uses khoj server api to configure org files to index	2023-03-27 01:05:26 +07:00
Debanjum Singh Solanky	50760acc37	Wait for Khoj server to get ready before opening khoj.el transient menu - Use process filter, sentinel to mark when khoj server is ready or not - Display server messages for visibility into server boot-up process - Wait until server ready to open khoj transient menu in Emacs Until then khoj features wouldn't work anyway, so avoids confusion	2023-03-26 13:00:01 +07:00
Debanjum Singh Solanky	82eb4bfd0d	Setup Khoj server on opening khoj from within Emacs - Create helper methods to check, stop, restart, setup khoj server - (Ask to) setup khoj server on calling khoj main entrypoint function	2023-03-26 10:12:06 +07:00
Debanjum Singh Solanky	99d19dcf43	Start Khoj server from Emacs using khoj.el	2023-03-26 09:38:46 +07:00
Debanjum Singh Solanky	c92d79118a	Install Khoj server from Emacs using khoj.el	2023-03-26 08:50:03 +07:00
Debanjum Singh Solanky	e281a498b4	Style Khoj search org buffer via elisp instead of in-buffer settings	2023-03-26 06:34:18 +07:00
Debanjum Singh Solanky	4f655d20ae	Style Khoj chat directly via elisp instead of via in-buffer settings	2023-03-26 06:03:30 +07:00
Debanjum Singh Solanky	f6ff7b1beb	Render footnote reference links as superscript for Khoj Chat on Emacs	2023-03-26 05:33:08 +07:00
Debanjum Singh Solanky	285a2b86d2	Use aiohttp version 3.8.4 as 4.x breaks docker image build	2023-03-26 05:33:02 +07:00
Debanjum Singh Solanky	67c850a4ac	Add retry logic to OpenAI API queries to increase Chat tenacity - Move completion and chat_completion into helper methods under utils.py - Add retry with exponential backoff on OpenAI exceptions using tenacity package. This is officially suggested and used by other popular GPT based libraries	2023-03-26 05:12:35 +07:00
Debanjum	0aebf624fc	Improve Khoj Chat in Emacs, Server Merge pull request #192 from debanjum/improvements-to-khoj-chat-in-emacs ### Khoj Chat on Emacs Improvements - `d78454d` Load Khoj Chat buffer before asking for query to provide context - `93e2aff` Use org footnotes to add references, allows jump to def on click - `5e9558d` Stylize reference links as superscripts and show definition on hover - bc71c19 Use `m` or `C-x m` in-buffer keybindings to send messages to Khoj ### Khoj Chat Server Improvements - `27217a3` Time chat API sub-components for performance analysis - `508b217` Update Chat API, Logs, Interfaces to store, use references as list - d4b3866 Truncate message logs to below max supported prompt size by chat model - `cf28f10` Register separate timestamps for user query and response by Khoj Chat	2023-03-25 05:49:27 +07:00
Debanjum Singh Solanky	ff846f05c5	Clean-up khoj.el based on linting helpers and manual review	2023-03-25 05:47:49 +07:00
Debanjum Singh Solanky	7e36f421f9	Truncate message logs to below max supported prompt size by model - Use tiktoken to count tokens for chat models - Make conversation turns to add to prompt configurable via method argument to generate_chatml_messages_with_context method	2023-03-25 05:13:56 +07:00
Debanjum Singh Solanky	4725416fbd	Use shortcut keybindings in buffer to ease sending messages to Khoj	2023-03-25 05:06:01 +07:00
Debanjum Singh Solanky	508b2176b7	Update Chat API, Logs, Interfaces to store, use references as list - Remove the need to split by magic string in emacs and chat interfaces - Move compiling references into string as context for GPT to GPT layer - Update setup in tests to use new style of setting references - Name first argument to converse as more appropriate "references"	2023-03-24 22:10:11 +07:00
Debanjum Singh Solanky	b08745b541	Keep chat messages at 1 empty line visible distance in khoj.el - Clean redundant concat, format string - Improve variable name to emojified sender	2023-03-24 22:10:11 +07:00
Debanjum Singh Solanky	27217a330d	Time chat API sub-components for performance analysis Time and the search query extraction, search and response generation components	2023-03-24 20:39:41 +07:00
Debanjum Singh Solanky	5e9558d39d	Stylize references shown as footnote links in chat messages - Render references as superscript - Show reference definitions on hover over reference links to ease access - Truncate reference def shown on hover to 70 char - Add continuation suffix, ..., when reference definition truncated	2023-03-24 20:38:05 +07:00
Debanjum Singh Solanky	cf28f104c7	Register separate timestamps for user query and response by Khoj Chat	2023-03-24 18:31:58 +07:00
Debanjum Singh Solanky	93e2aff786	Add references as org footnotes instead of links	2023-03-24 18:31:42 +07:00
Debanjum Singh Solanky	d78454d4ad	Load Khoj Chat buffer before asking for query to provide context	2023-03-24 13:43:46 +07:00
Debanjum	4070d13a96	Create Khoj Chat Interface in Emacs Merge pull request #191 from debanjum/create-chat-interface-on-emacs - Render conversation history in a read-only org-mode buffer for Khoj Chat - Add `chat` as a transient action in the Khoj transient menu - Style chat messages as org-mode entries - Put received date in property drawer and keep it hidden/folded by default - Add Khoj chat response as child entry of the users associated question org entry This allows folding back-n-forth between user and Khoj for easier viewing - Render source notes snippets used as references for response as org-mode links Hovering mouse on link or opening links shows reference note snippets used	2023-03-22 16:32:40 -06:00
Debanjum Singh Solanky	863933daaa	Resolve build issues found by melpazoid	2023-03-23 02:25:34 +04:00
Debanjum Singh Solanky	e9ca04af0d	Require dash, org to run ERT tests for khoj.el	2023-03-23 01:46:26 +04:00
Debanjum Singh Solanky	06df394d6c	Style chat messages as org-mode entries in Emacs - Style Message as Org Entries instead of List - Put khoj response as child of user query entry - Improves color coding for readability - Allows folding each back-n-forth - Put timestamp of message received into property drawer - Use standardized time format for new and old chat messages	2023-03-22 12:00:43 -06:00
Debanjum Singh Solanky	364e6c11af	Render chat history from API in chat buffer on first run - Generalize the render-chat-response method to handle rendering history or chat response from chat API response - Trigger rendering of khoj chat history if Khoj chat buffer not created for this session yet	2023-03-22 12:00:35 -06:00
Debanjum Singh Solanky	36b52fdd0a	Properly escape reference links before rendering - Use org-insert-link method to improve link rendering robustness Previous simple mechanism to create org-links would result in links escaping out of formatting. Use a user-facing org-mode method to remove/reduce probability of this - Replace newlines with space to render reference notes as links	2023-03-22 11:05:38 -06:00
Debanjum Singh Solanky	72f63a6ef7	Add basic chat interface for Khoj on Emacs - Query khoj chat API to get Khoj Chat response to user message - Render chat messages as a org-mode list in format: - [sender-name]: [message] - /[receive-date]/ - Add references as org links with context visible on hover, but no jump to note - Require dash library for khoj.el to simplify list manipulation. Use `-map-indexed' method from dash	2023-03-22 10:47:55 -06:00
Debanjum Singh Solanky	e4d67694e1	Add search to method, variable names meant for khoj search in khoj.el In preparation to introduce Khoj chat in Emacs	2023-03-21 21:44:11 -06:00
Debanjum Singh Solanky	98e5ea4940	Fix name of default encoder to replace in multi-lingual model setup docs	2023-03-21 20:38:17 -06:00
Debanjum Singh Solanky	2f6284872d	Mention Khoj needs Python version 3.10 or lower in docs	2023-03-20 15:18:19 -06:00
Debanjum Singh Solanky	a9b81975f2	Fix encoder model name to configure multilingual search in Readme See comment in issue #98 for stale model name comment	2023-03-19 17:27:53 -06:00
Debanjum	b351cfb8a0	Add Search Actor to Improve Querying Notes for Khoj Chat Merge pull request #189 from debanjum/add-search-actor-to-improve-notes-lookup-for-chat ### Introduce Search Actor Search actor infers Search Queries from user's message - Capabilities - Use previous messages to add context to current search queries[^1] This improves quality of responses in multi-turn conversations. - Deconstruct users message into multiple search queries to lookup notes[^2] - Use relative date awareness to add date filters to search queries[^3] - Chat Director now does the following: 1. [NEW] Use Search Actor to generate search queries from user's message 2. Retrieve relevant notes from Knowledge Base using the Search queries 3. Pass retrieved relevant notes to Chat Actor to respond to user ### Add Chat Quality Tests - Test Search Actor capabilities - Mark Chat Director Tests for Relative Date, Multiple Search Queries as Expected Pass ### Give More Search Results as Context to Chat Actor - Loosen search results score threshold to work better for searches with date filters - Pass more search results (up to 5 from 2) as context to Chat Actor to improve inference [^1]: Multi-Turn Example Q: "When did I go to Mars?" Search: "When did I go to Mars?" A: "You went to Mars in the future" Q: "How was that experience?" Search: "How my Mars experience?" This gives better context for the Chat actor to respond [^2]: Deconstruct Example: Is Alpha older than Beta? => What is Alpha's age? & When was Beta born? [^3]: Date Example: Convert user messages containing relative dates like last month, yesterday to date filters on specific dates like dt>="2023-03-01"	2023-03-18 18:02:12 -06:00
Debanjum Singh Solanky	601ff2541b	Revert to using GPT to extract search queries from users message - Reasons: - GPT can extract date aware search queries with date filters better than ChatGPT given the same prompt. - Need quality more than cost savings for now. - Need to figure ways to improve prompt for ChatGPT before using it	2023-03-18 17:56:13 -06:00
Debanjum Singh Solanky	e28526bbc9	Extract search queries from users message using ChatGPT as Search Actor - Reasons - ChatGPT should be better at following instructions than GPT - At 1/10th the cost, it's much cheaper than using older GPT models	2023-03-18 16:33:24 -06:00
Debanjum Singh Solanky	939d7731da	Fix-up Search Actor GPT's response for decoding it as valid JSON	2023-03-18 16:30:55 -06:00
Debanjum Singh Solanky	f63fd0995e	Pass more search results as context to Chat Actor to improve inference	2023-03-18 16:30:55 -06:00
Debanjum Singh Solanky	10836dedee	Search should return user message if GPT response is not valid JSON Previously would throw if GPT response is not valid JSON. Better to return original message to use for search instead	2023-03-18 16:30:55 -06:00
Debanjum Singh Solanky	08f5fb315f	Add answers to context for Search Actor to generate relevant queries Update Search Actor prompt with answers, more precise primer and two more examples for context Mark the 3 chat quality tests using answer as context to generate queries as expected to pass. Verify that the 3 tests pass now, unlike before when the Search Actor did not have the answers for context	2023-03-18 16:30:55 -06:00
Debanjum Singh Solanky	f09bdd515b	Expect Chat Director can extract relative dates using new Search Actor	2023-03-18 16:30:55 -06:00
Debanjum Singh Solanky	36c7389b46	Test Search Actor generating search query from Chat History	2023-03-18 16:30:55 -06:00
Debanjum Singh Solanky	2600cc9d4d	Test Search Actor extracting relative dates & multiple questions	2023-03-18 16:30:55 -06:00
Debanjum Singh Solanky	45cb510421	Loosen search results score threshold used by chat for more context	2023-03-18 16:30:55 -06:00
Debanjum Singh Solanky	d871e04a81	Use past user messages, inferred questions as context to extract questions - Keep inferred questions in logs - Improve prompt to GPT to try use past questions as context - Pass past user message and inferred questions as context to help GPT extract complete questions - This should improve search results quality - Example Expected Inferred Questions from User Message using History: 1. "What is the name of Arun's daughter?" => "What is the name of Arun's daughter" 2. "Where does she study?" => => "Where does Arun's daughter study?" OR => "Where does Arun's daughter, Reena study?"	2023-03-18 16:30:50 -06:00
Debanjum Singh Solanky	1a5d1130f4	Generate search queries from message to answer users chat questions The Search Actor allows for 1. Looking up multiple pieces of information from the notes E.g "Is Bob older than Tom?" searches for age of Bob and Tom in 2 searches 2. Allow date aware user queries in Khoj chat Answer time range based questions Limit search to specified timeframe in question using date filter E.g "What national parks did I visit last year?" adds dt>="2022-01-01" dt<"2023-01-01" to Khoj search Note: Temperature set to 0. Message to search queries should be deterministic	2023-03-18 16:28:51 -06:00
Debanjum Singh Solanky	d0f14d3f85	Test usage of = in date filter queries	2023-03-16 14:52:59 -06:00
Debanjum Singh Solanky	dfb277ee37	Set skipif at module level if OpenAI API key not set for chat tests - Remove stale message_to_prompt test It is too broad, reduces maintainability. Remove as it doesn't really need its own test right now - Setting skipif at module level for chat actor, director tests reduces code duplication as earlier was using decorator on each chat test	2023-03-16 12:23:52 -06:00
Debanjum	e75e13d788	Create Tests to Measure Chat Quality, Capabilities Create Rubric to Test Chat Quality and Capabilities ### Issues - Previously the improvements in quality of Khoj Chat on changes was uncertain - Manual testing on my evolving set of notes was slow and didn't assess all expected, desired capabilities ### Fix 1. Create an Evaluation Dataset to assess Chat Capabilities - Create custom notes for a fictitious person (I'll publish a book with these soon 😅😋) - Add a few of Paul Graham's more personal essays. [Easy to get as markdown](https://github.com/ofou/graham-essays) 2. Write Unit Tests to Measure Chat Capabilities - Measure quality at 2 separate layers - Chat Actor: These are the narrow agents made of LLM + Prompt. E.g `summarize`, `converse` in `gpt.py` - Chat Director: This is the chat orchestration agent. It calls on required chat actors, search through user provided knowledge base (i.e notes, ledger, image) etc to respond appropriately to the users message. This is what the `/api/chat` API exposes. - Mark desired but not currently available capabilities as expected to fail. This still allows measuring the chat capability score/percentage while only failing capability tests which were passing before on any changes to chat	2023-03-16 11:30:52 -06:00
Debanjum Singh Solanky	4e15b4e411	Create test notes dataset for chat testing Combine hand-written custom notes and PG essays with personal content to bulk up notes count Delete old documentation markdown as not a representative dataset for application (which is more tuned for personal notes)	2023-03-16 09:30:37 -06:00
Debanjum Singh Solanky	1b4d562700	Test Chat Director Capabilities: Answer from notes, chat history etc - Chat directors are broad agents. - Chat directors orchestrate narrow actor agents to synthesize final response for the user - Agents are Prompts + ML Model - Test Chat Director Capabilities 1. [X] Answer from retrieved notes 2. [X] Answer from chat history 3. [X] Answer general questions 4. [X] Carry out multi-turn conversation 5. [X] Say don't know when answer not in provided context 6. [X] Answers that require current date awareness This test is expected to fail as the chat is not capable of doing this without the Search actor. But the test allows assessing chat quality 7. [X] Date-aware aggregation across multiple different notes This test is expected to fail as the chat is not capable of doing this without the Search actor. But the test allows assessing chat quality 8. [X] Ask clarification questions if no unambiguous answer in provided context 9. [X] Retrieve answer from chat history beyond lookback window This test is expected to fail as the chat director is not capable of searching chat history yet. But the test allows assessing chat quality 10. [X] Retrieve context for answer using multiple independent searches on knowledge base This test is expected to fail as the chat is not capable of doing this without the Search actor. But the test allows assessing chat quality	2023-03-16 09:30:37 -06:00
Debanjum Singh Solanky	b6d63137f1	Setup Pytest fixture for conversation processor to test chat API - Index markdown test data as knowledge base. As easier to get good markdown content (vs org) - Setup markdown_content_config, processor_config and chat_client to test chat API	2023-03-16 09:30:37 -06:00
Debanjum Singh Solanky	3f719c9e17	Rename Chat Model+Prompt tests to chat actor tests	2023-03-16 09:30:37 -06:00
Debanjum Singh Solanky	7526a50dd4	Extract conversation processor utility funcs from gpt.py into utils.py	2023-03-16 09:30:37 -06:00
Debanjum Singh Solanky	7c4d546039	Configure tests to mark chat quality tests & filter unhelpful warnings - Mark chat quality tests, register custom mark for chat quality - Filter unhelpful deprecation warnings from within dateparser library - Error if tests use unregistered marks	2023-03-16 09:30:37 -06:00
Debanjum Singh Solanky	c1128a1ad8	Test Chat Actor Capabilities; ability to answer from notes, chat logs etc - Chat actors are narrow agents (prompt + ML model) Chat actors are different from the Chat director, who orchestrates the narrow actor agents to synthesize final response to the user - Test Chat Actor Capabilities 1. Answer from retrieved notes 2. Answer from chat history 3. Answer general questions 4. Carry out multi-turn conversation 5. Say don't know when answer not in provided context 6. Answers that require current date awareness 7. Date-aware aggregation across multiple different notes 8. Ask clarification questions if no unambiguous answer in provided context This test is expected to fail as the chat is not capable of doing this consistently yet. But having the test allows assessing chat quality - Use Openai API Key from OPENAI_API_KEY environment variable - Gitignore .env file, python virtualenv directory Put OpenAI API Key in .env file to run chatbot tests via vscode The .env file is default location for importing env vars	2023-03-16 09:30:37 -06:00
Debanjum Singh Solanky	9306cd901a	Clean up chat tests to work with updated chat methods in gpt.py	2023-03-16 09:30:37 -06:00
Debanjum Singh Solanky	24ddebf3ce	Make converse prompt more precise. Fix default arg vals in gpt methods - Set conversation_log arg default to dict - Increase default temperature to 0.2 for a little creativity in answering - Make GPT be more reliable in looking at past conversations for forming response	2023-03-16 09:30:37 -06:00
Debanjum Singh Solanky	8609e3129e	Fix, improve displaying chat messages, sources by Khoj in web interface Pretty print json in conversation logs	2023-03-14 11:24:47 -06:00
Debanjum	6c0e82b2d6	Merge Improve Khoj Chat PR #183 from debanjum/improve-chat-interface # Improve Khoj Chat ## Main Changes - Use the new [API](https://openai.com/blog/introducing-chatgpt-and-whisper-apis) for [ChatGPT](https://openai.com/blog/chatgpt) to improve conversation quality and cost - Improve Prompt to answer query using indexed notes - Previously was asking GPT to summarize the notes - Both the chat and answer API use this new prompt - Support Multi-Turn conversations - Pass previous messages and associated reference notes to ChatGPT for context - Show note snippets referenced to generate response - Allows fact-checking, getting details - Simplify chat interface by using only single unified chat type for now ## Miscellaneous - Replace summarize with answer API. Summarize via API not useful for now - Only pass Khoj search results above a threshold confidence to GPT for context - Allows Khoj to say don't know if it can't find answer to query from notes - Allows relying on (only) conversation history to generate response in multi-turn conversation - Move Chat API out of beta. Update Readme	2023-03-10 19:03:44 -06:00
Debanjum Singh Solanky	cccd225247	Deduplicate and simplify logic to render chat message with reference	2023-03-10 18:58:11 -06:00
Debanjum Singh Solanky	b9caad458e	Type score_threshold with union, not \|, to support python <3.10	2023-03-10 18:58:11 -06:00
Debanjum Singh Solanky	198d9af8cf	Update Readme to reflect Khoj Chat out of Beta	2023-03-10 18:58:11 -06:00
Debanjum Singh Solanky	a71f168273	Move the chat API out of beta. Save chat sessions at 15min intervals	2023-03-10 17:20:52 -06:00
Debanjum Singh Solanky	bcc0bed9db	Upgrade bump_version script to handle release and post-release commit - Updates version in khoj.el and Obsidian manifest, package, versions json files under interface and project root - Create and tag release commit with updated files - Creates commit with post-release version upgrade in files - Use flags to specify whether to create a release or post-release commit	2023-03-10 15:23:17 -06:00
Debanjum Singh Solanky	8bb8824d0c	Bump khoj versions in obsidian, emacs files	2023-03-10 15:23:17 -06:00
Debanjum Singh Solanky	e16d0b6d7e	Open references notes used for chat on mobile too (by clicking) Requires clicking the reference as hover doesn't work on mobile	2023-03-09 17:13:07 -06:00
Debanjum Singh Solanky	c3c7b8a951	Make Khoj chat a separate Progressive Web App (PWA) for easier access	2023-03-09 13:45:06 -06:00
Debanjum Singh Solanky	3838f9d8e3	Remove explicitly asking GPT to say I don't know in prompt for now GPT still mostly says I don't know when answer not in notes or chats But with this it's more inclined to answer general questions not in chats or notes while informing user that the information is not from existing chats or notes	2023-03-09 12:11:44 -06:00
Debanjum Singh Solanky	f7b8cdd02e	Log prompts being passed to GPT for debugging	2023-03-08 19:17:52 -06:00
Debanjum Singh Solanky	2739a492b4	Log message metadata along with Khoj message instead of user message References should be attached to khoj chat message rather than the user's message in the chat interface	2023-03-08 19:16:24 -06:00
Debanjum Singh Solanky	87d1e1341d	Show reference notes used as response context in chat interface	2023-03-08 19:16:24 -06:00
Debanjum Singh Solanky	280061e1fa	Do not deduplicate search results used for chat context - Chat uses compiled form of search results, not the raw entries to provide context for chat. The compiled snipped search results themselves are unique and using multiple of them for context from the same raw note is fine if they cross the score and rank thresholds This should improve the context provided for chat - Also apply score_threshold, no deduplication to the answers API	2023-03-06 23:51:31 -06:00
Debanjum Singh Solanky	672f61529e	Make getting deduped search results configurable via Search API	2023-03-06 23:48:46 -06:00
Debanjum Singh Solanky	4fb628975c	Fix jumping to note from Khoj Obsidian search modal result on Windows - Issue The file path separator used by khoj server and the Obsidian vault were different on Windows - Fix Normalize file path to use forward slash(/) to find the matching note file in the Obsidian vault to jump to it Resolves #177	2023-03-05 21:07:54 -06:00
Debanjum Singh Solanky	b6cdc5c7cb	Do not expose answer API as a chat type in chat web interface or API Answer does not rely on past conversations, just the knowledge base. It is meant for one off interactions, like search rather than a continuing conversation like chat For now it is only exposed via API. Later it will be exposed in the interfaces as well Remove ability to select different chat types from the chat web interface as there is only a single chat type Stop appending answers to the conversation logs	2023-03-05 18:21:59 -06:00
Debanjum Singh Solanky	7f994274bb	Support multi-turn conversations in chat mode - Only use decent quality search results, if any, as context - Pass source results used by previous chat messages as context - Loosen prompt to allow looking at previous chats and notes to answer - Pass current date for context - Make GPT provide reason when it can't answer the question. Gives user context to tune their questions	2023-03-05 18:21:39 -06:00
Debanjum Singh Solanky	d73042426d	Support filtering for results above threshold score in search API	2023-03-05 18:21:39 -06:00
Debanjum Singh Solanky	45f461d175	Keep search results passed to GPT as context in conversation logs This will be useful to 1. Show source references used to arrive at answer 2. Carry out multi-turn conversations	2023-03-05 16:00:19 -06:00
Debanjum Singh Solanky	7cad1c9428	Only use past chat message, not session summaries as chat context Passing only chat messages for current active, and summaries for past session isn't currently as useful	2023-03-05 16:00:18 -06:00
Debanjum Singh Solanky	ad1f1cf620	Improve and simplify Khoj Chat using ChatGPT - Set context by either including last 2 chat messages from active session or past 2 conversation summaries from conversation logs - Set personality in system message - Place personality system message before last completed back & forth This may stop ChatGPT forgetting its personality as conversation progresses given: - The conditioning based on system role messages is light - If system message is too far back in conversation history, the model may forget its personality conditioning - If system message at end of conversation, the model can think it's the start of a new conversation - Inserting the system message before last completed back & forth should prevent ChatGPT from assuming it's the start of a new conversation while not losing personality conditioning from the system message - Simplify the Khoj Chat API to, for now, just answer from user's notes instead of trying to infer other potential interaction types. - This is the default expected behavior from the feature anyway - Use the compiled text of the top 2 search results for context - Benefits of using ChatGPT - Better model - 1/10th the price - No hand rolled prompt required to make GPT provide more chatty, assistant type responses	2023-03-05 01:24:13 -06:00
Debanjum Singh Solanky	9d42b5d60d	Use multiple compiled search results for more relevant context to GPT Increase temperature to allow GPT to collect answer across multiple notes	2023-03-05 01:24:13 -06:00
Debanjum Singh Solanky	c3b624e351	Introduce improved answer API and prompt. Use by default in chat web interface - Improve GPT prompt - Make GPT answer users query based on provided notes instead of summarizing the provided notes - Make GPT be truthful using prompt and reduced temperature - Use Official OpenAI Q&A prompt from cookbook as starting reference - Replace summarize API with the improved answer API endpoint - Default to answer type in chat web interface. The chat type is not fit for default consumption yet	2023-03-05 01:24:13 -06:00
Debanjum Singh Solanky	7184508784	Mention Python and Pip need to be installed in Main and Emacs Readme	2023-03-02 21:28:54 -06:00
Debanjum Singh Solanky	211e460398	Output date filter from cache log at debug level. Remove unused imports Other logs not directly useful to user have already been converted to debug log levels in `1ae4016`. Just forgot to convert this log line too	2023-03-02 15:41:32 -06:00
Debanjum Singh Solanky	c823f46d89	Test error on missing fields in ContentConfig pulled from Khoj.yml Resolves #9	2023-03-02 15:35:39 -06:00
Debanjum Singh Solanky	b6dbe4dd1d	Do not try to retrieve an unconfigured core content type in Config GUI Previous behavior was resulting in a null reference error. As key for the core content/search type was not present in current config Fallback to using default config for unconfigured core content type instead See #165 for details	2023-03-02 11:09:31 -06:00
Debanjum Singh Solanky	1ae40163a9	Show user friendly information logs by default for context - Use emojis to make info logs easier to read - Inform when khoj is ready to use - Provide information on what khoj is doing while starting up - Inform when content/search types and processors are setup - Inform when models are being loaded from the web as this step can take time - Convert all other info logs to be only shown in verbose mode	2023-03-01 16:39:07 -06:00
Debanjum Singh Solanky	fe03ba3dce	Index intro text before headings in org files - Text before headings was not being indexed due to buggy orgnode parsing logic - Resolved indexing intro text from files with and without headings in them - Ensure intro text node has heading set to all title lines collected from the file Resolves #165	2023-03-01 12:11:33 -06:00
Debanjum Singh Solanky	ed177db2be	Emojify step names in workflows. Stop publishing to TestPyPi from PR	2023-03-01 10:56:39 -06:00
Debanjum Singh Solanky	7ad251b8ef	Log and Continue on OSError while collating dates for date filters Log to understand if error, date can be handled better Mitigates #172	2023-03-01 01:23:37 -06:00
Debanjum Singh Solanky	2bed4c3b50	Fix configuring search types & /config/types API when no plugin configured - Test /config/types API when no plugin configured, only plugin configured and no content configured scenarios - Do not throw null reference exception while configuring search types when no plugin configured - Do not throw null reference exception on calling /config/types API when no plugin configured Resolves bug introduced by #173	2023-03-01 01:23:37 -06:00
Debanjum Singh Solanky	8914dbd073	Fix creating GUI panels for unconfigured search, processor types Repro: 1. Open khoj server with `khoj` on first run 2. Install/enable Khoj Obsidian plugin (to configure khoj server) 3. Restart khoj server with `khoj` Bug: - Unconfigured processor and search_types are instantiated as None in self.current_config - While creating the desktop GUI, these null configs are attempted to be accessed as valid dictionaries for creating their GUI panels - This results in the null ref errors Fix: Use default config to create their GUI elements for unconfigured search and processor types Resolves #167	2023-03-01 01:20:58 -06:00
Debanjum	e77a5ffc83	Merge pull request #173 from debanjum/enable-creating-content-plugins ## Enable Creating Content Plugins ### Goal Index, Search text content not supported by default in Khoj using plugins ### Code Changes - `fcbbe8c` Configure content plugins to index using `khoj.yml` - Index content plugins from standardized JSONL format for ingestion - `55a032e` Add jsonl processor to index plugin content - `ab0d3a0` Index configured plugins on app start and via update API endpoint - Expose plugin content types for usage by interfaces - `47b58a2` Dynamically update available types on loading the Khoj server - Expose indexed types via API (`9d38ead`). Simplify getting enabled types in Web (`f3f2438`), Emacs (`1e43f1a`) interfaces - Search plugin content from the Web and Emacs Interfaces - `d91c7e2` Search plugin content via the search API - Render plugin content on Web (`88344f9`) and Emacs (`c2814fc`) interfaces - The Web, Emacs interfaces are general interfaces, they allow searching across all content types - The Obsidian interface is currently tuned for only markdown content It will be extended to render more content plugins later ### Testing - `fcbbe8c` Add unit tests to test reading plugin config from khoj.yml - `55a032e` Add unit tests for the `JsonlToJsonl` processor - `88a9ead` Add unit tests to validate search, incremental update, force-update API works with plugin content types - `b09350c` Add unit test to validate only configured search types returned by the new /api/config/types API endpoint - Manually test the config read, indexing, search and update with local khoj	2023-02-28 22:23:25 -06:00
Debanjum Singh Solanky	b09350c052	Fix to return only enabled content types via the new config/types API - Previously was returning all core content types even if they had not been setup - Add test to validate only configured content types are returned by the api/config/types API endpoint	2023-02-28 22:08:26 -06:00
Debanjum Singh Solanky	b177adf3a7	Return value of search_type in /config/type API endpoint - Remove need for interfaces to downcase content types returned by API before using the type in search and other API endpoint - Fix to check for search_type.name in plugin keys instead of value	2023-02-28 21:49:26 -06:00
Debanjum Singh Solanky	ede6eb6879	Re-enable testing search and update API with image content type It may have been disabled due to issues with image search earlier	2023-02-28 20:25:51 -06:00
Debanjum Singh Solanky	88a9eadfba	Use client pytest fixture to test API with plugin type configured	2023-02-28 20:25:51 -06:00
Debanjum Singh Solanky	ab501a56c9	Create pytest fixture to configure app with plugin, search types	2023-02-28 20:25:51 -06:00
Debanjum Singh Solanky	f944408e69	Update content_config pytest fixture to index plugin content	2023-02-28 20:25:51 -06:00
Debanjum Singh Solanky	88344f9ed2	Improve rendering search results of plugin content types on web interface Render only the entry from plugin search response instead of raw json Use the results-ledger styling for results-plugin styling	2023-02-28 20:25:51 -06:00
Debanjum Singh Solanky	c2814fce58	Improve rendering search results of plugin content types in khoj.el Render only the entry from plugin search response instead of raw json	2023-02-28 20:25:51 -06:00
Debanjum Singh Solanky	f3f24387ec	Use new config/types API to set enabled content types on web interface	2023-02-28 20:25:51 -06:00
Debanjum Singh Solanky	1e43f1a12e	Use new config/types API to set enabled content types in khoj.el menu	2023-02-28 20:25:51 -06:00
Debanjum Singh Solanky	9d38eadd42	Return enabled content types via api/config/types API endpoint Simplifies dynamically populating enabled content types for interfaces	2023-02-28 20:25:51 -06:00
Debanjum Singh Solanky	68bd5d9ebc	Configure API routes after set up search types while configuring server Configure app routes after configuring server. Import API routers after search type is dynamically populated. Allow API to recognize the dynamically populated plugin search types as valid type query param. Enable searching for plugin type content.	2023-02-28 20:25:51 -06:00
Debanjum Singh Solanky	d91c7e2761	Search for plugin content via the search API	2023-02-28 20:25:51 -06:00
Debanjum Singh Solanky	47b58a2a4d	Configure, use dynamically instantiated SearchType enum on app start The SearchType is now dynamically populated with core and configured plugin types Use the new dynamic SearchType enum from state.py across codebase	2023-02-28 20:25:51 -06:00
Debanjum Singh Solanky	ab0d3a08e2	Index configured plugins on app start and via update API endpoint	2023-02-28 20:25:51 -06:00
Debanjum Singh Solanky	55a032e8c4	Add processor to index entries from jsonl files for plugins - Read, merge entries from input jsonl files and filters - Mark new, modified entries for update	2023-02-24 02:54:12 -06:00
Debanjum Singh Solanky	fcbbe8c759	Read content plugin configs from Khoj config YAML Configure external text content plugins via the Khoj YAML Reuse existing TextContentConfig definition for external text content plugins	2023-02-23 23:57:32 -06:00
Debanjum Singh Solanky	f57d7bf5ad	Use pypi khoj to fix docker builds and dockerize github workflow - Instead of building the package locally like before The issue started since moving to dynamic git based versioning with hatch-vcs This should reduce image size of docker builds too - Also move to ubuntu image since pyqt6 builds available on it, so do not need to build it locally for image - This s	2023-02-19 01:57:01 -06:00
Debanjum Singh Solanky	fada617faa	Fix TOC links, Add how to auto start Khoj server to Readme Rename tools directory to more standard scripts directory	2023-02-18 23:51:02 -06:00
Debanjum Singh Solanky	61b6ee2857	Use helper script to bump khoj pre-release versions	2023-02-17 20:31:51 -06:00
Debanjum Singh Solanky	47c2cc63e1	Automate uploading Obsidian artifacts to new releases	2023-02-17 19:57:44 -06:00
Debanjum Singh Solanky	a8940462c4	Automate khoj python package versioning using hatch-vcs and Git tags	2023-02-17 18:19:01 -06:00
Debanjum Singh Solanky	053d6141f3	Ignore ts typing error, Fix SPDX license identifier in Obsidian plugin	2023-02-17 18:19:01 -06:00
Debanjum Singh Solanky	47569da38e	Fix usage of "\" in orgnode test string to resolve DeprecationWarning	2023-02-17 17:15:44 -06:00
Debanjum Singh Solanky	36be3c4b8f	Fix or ignore MyPy issues in PyQt desktop GUI code - Remove unneeded type ignore for mps with the latest mypy - Stop excluding PyQT desktop GUI code from MyPy checks - Do not warn about unused ignores. Some issue with mypy giving different errors in different environments (venv, system and pre-commit)	2023-02-17 16:13:05 -06:00
Debanjum Singh Solanky	fd0a2f55f8	Run mypy checks in test workflow and on push (via pre-commit) - Run mypy on git push (not every commit) but for all files - Running it on pre-commit, doesn't make sense as mypy wants to look at all files, not just diff files - But this is too time consuming to run every commit, so run on push - Update development section documentation on installing, manually running pre-commit for validation that includes running mypy checks	2023-02-17 16:08:56 -06:00
Debanjum Singh Solanky	5c0d340970	Update Development section in Readme. Add steps for code validation	2023-02-17 13:31:37 -06:00
Debanjum Singh Solanky	051f0e3fb5	Add, configure and run pre-commit locally and in test workflow	2023-02-17 13:31:36 -06:00
Debanjum Singh Solanky	5e83baab21	Use Black to format Khoj server code and tests	2023-02-17 11:55:17 -06:00
Debanjum Singh Solanky	6130fddf45	Install pytest as optional dev dependency of app in test workflow	2023-02-17 10:11:57 -06:00
Debanjum Singh Solanky	8b293edd7c	Move mypy config into pyproject.toml. Ignore 2 remaining mypy issues	2023-02-16 03:33:08 -06:00
Debanjum Singh Solanky	7a9a811874	Fix authors, homepage URL in pyproject.toml and workflow triggers	2023-02-16 03:19:56 -06:00
Debanjum Singh Solanky	dcb86c2d3e	Build khoj python package using hatchling, pyproject.toml - Why - pyproject.toml is the python standards compliant config format - allows collating python tooling configs into single standard file - hatch(-ling) is a new lightweight build system for python packages - Detailed Changes - Replace setup.py, setuptools with pyproject.toml, hatchling for khoj python config and build - move pytest into optional development dependencies - add more links to khoj in the project urls section - add topic classifiers and keywords to find khoj package - Delete setup.py, MANIFEST.in as moved to pyproject.toml based setup - Update pypi workflow to set python package version in pyproject.toml	2023-02-16 02:37:32 -06:00
Debanjum Singh Solanky	c641eb4ad6	Improve rendering log and error stacktraces using the Rich package - Use Rich to render uvicorn, fastAPI logs as well The previous CustomFormatter only worked on khoj logs - Improve rendering stacktrace on errors using Rich	2023-02-15 16:19:32 -06:00
Debanjum Singh Solanky	a403def19e	Fix workflow to publish Khoj python package to PyPi	2023-02-14 22:19:21 -06:00
Debanjum	eee57599ad	Improve Dockerize, Publish to PyPi Workflows - fb86dea Create tagged Docker image on new tag/release - 01fd98b Improve workflow to publish khoj to pypi	2023-02-14 21:11:56 -06:00
Debanjum Singh Solanky	af6d65a909	Create tagged Docker image on new tag/release	2023-02-14 20:04:06 -06:00
Debanjum Singh Solanky	25e06f26c0	Improve workflow to publish khoj to pypi - Use emojis to improve visual indicator of action step - Rename to pypi instead of the more ambiguous publish name Publish could mean publish docker image, publish to pypi, MELPA or Obsidian plugin - Update workflow badge, link pypi badge to khoj pypi package page - Use pypa official github action to upload package to (test) pypi instead of doing it manually using twine - Upload python package artifact for easier access for testing. As uploading to testpypi doesn't work for PRs by others from forked repos	2023-02-14 20:03:35 -06:00
Debanjum	11873795a6	Use src layout to fix packaging khoj for pypi ### Issue The khoj python package was using a common top level name[1], `src' instead of `khoj' due to incorrect usage of the src layout[2] ### Fix Put content meant for python packaging from `src/' to `src/khoj/' Update code, tests, configs and docs to reference new layout The `khoj' python package should now get unpacked under `khoj' instead of `src' directory ### Details - `25a749c` Use the src/ layout to fix packaging Khoj for PyPi - `bc7477e` Move Emacs, Obsidian plugin code out from under src/khoj directory - `f83cf4e` Check wheel contents in workflow before publishing Khoj to PyPI [1]: https://github.com/jwodder/check-wheel-contents#w005--wheel-contains-common-toplevel-name-in-library [2]: https://packaging.python.org/en/latest/discussions/src-layout-vs-flat-layout/	2023-02-14 16:26:07 -06:00
Debanjum Singh Solanky	e76c285bdc	No need to prune plugins as not included in pypi package. Mention Obsidian as supported Interfaces in Readme	2023-02-14 16:15:40 -06:00
Debanjum Singh Solanky	bc7477ea3e	Move Emacs, Obsidian plugin code out from under src/khoj directory - What - The Emacs and Obsidian interfaces stay in their original directories under src/ - src/khoj now only contains code meant for pypi packaging - Benefits - This avoids having to update khoj MELPA, Obsidian plugin config as the Emacs, Obsidian code is under their original directories - It separates the code in src/khoj meant for python packaging from code for external interfaces like Emacs and Obsidian	2023-02-14 15:44:22 -06:00
Debanjum Singh Solanky	f83cf4ebc6	Check wheel contents in workflow before publishing it to PyPI	2023-02-14 15:20:44 -06:00
Debanjum Singh Solanky	25a749ca1d	Use the src/ layout to fix packaging Khoj for PyPi - Why The khoj pypi packages should be installed in `khoj' directory. Previously it was being installed into `src' directory, which is a generic top level directory name that is discouraged from being used - Changes - move src/* to src/khoj/* - update `setup.py' to `find_packages' in `src' instead of project root - rename imports to form `from khoj.*' in complete project - update `constants.web_directory' path to use `khoj' directory - rename root logger to `khoj' in `main.py' - fix image_search tests to use the newly renamed `khoj' logger - update config, docs, workflows to reference new path `src/khoj'	2023-02-14 15:19:06 -06:00
Debanjum Singh Solanky	cc31cd070d	Enable the publish workflow for PRs created in the main repo The publish workflow was previously disabled for PRs in commit `d1945c5ba8`	2023-02-14 13:51:31 -06:00
Debanjum	84322b2a45	Demo using Search in Khoj Obsidian Plugin	2023-02-14 08:43:50 -08:00
Debanjum Singh Solanky	a4dcb20622	Add setting to toggle auto configuring of khoj backend from Obsidian - By default the obsidian plugin automatically configures the khoj backend to index the current vault - For more complex scenarios, users can manage their ~/.khoj/khoj.yml manually by toggling the auto-configure setting off in the khoj plugin settings Resolves #156	2023-02-13 20:15:28 -06:00
Debanjum Singh Solanky	24aa696ef5	Indicate indexing active on Update button in Obsidian plugin settings Use moon rotating through phases to indicate notes indexing in progress Resolves #129	2023-02-13 19:28:19 -06:00
Debanjum Singh Solanky	11517ba8eb	Encode jsonl data as utf8 for gzip write for consistent read/write encoding Should help with issue #89	2023-02-12 17:33:23 -06:00
Debanjum Singh Solanky	c156b3e087	Remove sub-dependencies from setup.py. Upgrade sentence-transformer - setup.py best practice recommends only specifying core dependencies, not dependencies of core dependencies in it - Latest sentence-transformer (version 2.2.2) correctly installs its huggingface_hub dependency. Else application fails to start	2023-02-12 10:42:05 -06:00
Debanjum Singh Solanky	3ec41c4d64	Wrap lines for org, markdown results in khoj search results buffer	2023-02-12 07:33:50 -06:00
Debanjum Singh Solanky	d1945c5ba8	Do not run publish workflow for PRs as forks do not have auth token	2023-02-12 07:31:24 -06:00
Debanjum Singh Solanky	9a013ec48f	Add more details to setup Khoj backend in Obsidian plugin readme	2023-02-12 07:31:13 -06:00
Debanjum	24c553877c	Merge pull request #152 from axelson/fix-obsidian-doc-link Fix link to Obsidian plugins doc in Khoj Obsidian Readme	2023-02-10 22:20:06 -06:00
Jason Axelson	6d5930363a	Fix obsidian plugins doc link Also make it more obvious where the link is going, initially I thought the link was to another official khoj documentation site.	2023-02-10 07:11:21 -10:00
Debanjum Singh Solanky	215235efd2	Bump khoj pre-release version	2023-02-08 20:24:36 -03:00
Debanjum Singh Solanky	55e4fa9719	Fix indentation in workflow yaml for testing khoj backend	2023-02-07 02:59:46 -03:00
Debanjum Singh Solanky	2445664d40	Deprioritize searching for Music content over other text content	2023-02-07 02:41:31 -03:00
Debanjum Singh Solanky	2e052913b6	Search in first configured content type when no search type set Instead of searching through all configured content types but only returning results of the last configured content type	2023-02-07 02:41:31 -03:00
Debanjum Singh Solanky	a26ab31d20	Allow chat with markdown notes if no org-mode content configured	2023-02-07 02:41:31 -03:00
Debanjum	99a03da3f7	Read Markdown file as utf8 instead of the default encoding used by OS ### Background 1. Obsidian stores markdown notes as `utf8`[1] 2. By default, the python `open` command uses the OS locale encoding[2] ### Issue Based on above background, if the OS locale encoding isn't `utf8` it causes the `UnicodeDecodeError: <locale_encoding> codec can't decode byte` error ### Fix - Read markdown files as `utf8` The Obsidian plugin is the main use-case for markdown files in khoj currently and that stores md files as `utf8`. Do not assume utf8 for other content types like org-mode, beancount for now. - Fail if error in reading file as utf8, instead of ignoring errors. Would rather have user realize that their files are not going to get indexed correctly. [1]: https://forum.obsidian.md/t/better-handle-md-files-not-stored-in-utf8-format/13524/3 [2]: https://docs.python.org/3/library/functions.html#open	2023-02-07 01:46:42 -03:00
Debanjum Singh Solanky	d3e82b918f	Make Khoj require python version below 3.11 until PyTorch works with it Closes #128	2023-02-06 23:11:51 -03:00
Debanjum Singh Solanky	c11f7b47e4	Update workflow to run backend tests for all supported python versions	2023-02-06 21:05:34 -03:00
Debanjum Singh Solanky	11a18cc452	Update khoj docker config to index sub directories for text content - Khoj supports indexing subdirectories but the khoj docker config wasn't updated to support the same - This should also allow khoj docker users to index multiple separate directory trees by mounting them into separate sub folders within /data/<content-type>/. For e.g /data/org/dir1, /data/org/dir2 etc in khoj_docker.yml	2023-02-06 21:04:50 -03:00
Debanjum Singh Solanky	fbb7747dcc	Read Markdown file as utf8 instead of the default encoding used by OS - Background 1. Obsidian stores markdown notes as utf8[1] 2. By default, the python `open' command uses the OS locale encoding[2] This was causing the `UnicodeDecodeError: <locale_encoding> codec can't decode byte' error - Fix - Read markdown files as utf8 The Obsidian plugin is the main use-case for markdown files in khoj currently and that stores md files as utf8. Do not assume utf8 for other content types like org-mode, beancount for now. - Fail if error in reading file as utf8, instead of ignoring errors. Would rather have user realize that their files are not going to get indexed correctly. [1]: https://forum.obsidian.md/t/better-handle-md-files-not-stored-in-utf8-format/13524/3 [2]: https://docs.python.org/3/library/functions.html#open	2023-02-06 21:04:50 -03:00
Debanjum Singh Solanky	66dca6cf33	Add Docs to Search across Languages, Uninstall Khoj to Readme Add details and fixes to Obsidian, Main readme based on feedback, confusion from the Obsidian plugin announcement	2023-02-06 21:04:50 -03:00
Debanjum Singh Solanky	cba9a6a703	Use List, Tuple, Set from typing to support Python 3.8 for khoj Before Python 3.9, you can't directly use list, tuple, set etc for type hinting Resolves #130	2023-02-06 01:23:52 -03:00
Debanjum	14f28e3a03	Mention Emacs, Obsidian plugins at top of main Readme Add badges for supported plugins at top of main readme. Link badges to plugin docs for easy navigation for plugin users from main readme/project root	2023-01-28 18:01:20 -08:00
Debanjum Singh Solanky	f26cee604d	Update Khoj Plugin Install Instructions. Rename main Readme to README Khoj plugin page from within Obsidian isn't recognized. Seems like it needs an uppercase readme file only. So it doesn't show the Khoj readme from within Obsidian itself.	2023-01-27 20:01:31 -03:00
Debanjum Singh Solanky	2e13e15625	Ensure markdown entries in khoj.el results separated by empty line - Update khoj.el test to reflect updated rendering logic - Move ledger render function before image rendered to group functions with similar logic closer	2023-01-26 19:13:02 -03:00
Debanjum Singh Solanky	85ae46f429	Use thread_last to make results rendering funcs more readable in khoj.el	2023-01-26 18:59:44 -03:00
Debanjum	a8ab9448da	Resolve Khoj Obsidian Plugin feedback ### Details - `b415f87` Split find and jump to notes code in `onChooseSuggestion' method - `37063f6` Truncate query to 8k chars for find similar notes from Obsidian plugin - `4456cf5` No need to use `then' or `finally' in `async' functions after an `await' - `4070be6` Pass app object from plugin instance to child objects and functions - `c203c6a` Use Sentence case for Find similar note Obsidian command name	2023-01-26 18:54:33 -03:00
Debanjum Singh Solanky	b415f87093	Split code in onChooseSuggestion method to make it more readable Split find file, jump to file code to make onChooseSuggestion more readable - Use find, instead of using return in forEach to get first match - Move the jump to file+heading code out from forEach	2023-01-26 18:26:24 -03:00
Debanjum Singh Solanky	37063f6a38	Truncate query to 8k chars for find similar notes from obsidian plugin Truncate current file data passed to khoj backend API via query string below default query size supported by popular servers	2023-01-26 18:26:24 -03:00
Debanjum Singh Solanky	4456cf5c8f	No need to use then or finally in async functions after an await	2023-01-26 18:26:24 -03:00
Debanjum Singh Solanky	4070be637c	Pass app object from plugin instance to child objects and functions Do not reference global app object from child objects and funcs directly. It is only available for debugging purposes and access to it may be dropped in the future.	2023-01-26 18:26:24 -03:00
Debanjum Singh Solanky	c203c6a3fd	Use Sentence case for Find Similar Note command name in Khoj Obsidian	2023-01-26 18:26:24 -03:00
Debanjum Singh Solanky	e18124ef6f	Add badge for tests and update project subtitle in khoj.el Readme	2023-01-23 20:52:03 -03:00
Debanjum	477ef28e08	Create and Automate Tests for Khoj.el on Emacs - Use ERT to test `khoj.el' - Test extracting and rendering of Org, Markdown and Ledger entries from Khoj API response - Automate `khoj.el' testing using Github workflow - Fix, Simplify and Test the get text around point code for the "Find Similar" feature	2023-01-23 20:40:18 -03:00
Debanjum Singh Solanky	f9fb58aec3	Automate khoj.el testing using Github workflow Install transient.el dependency as it is not available by default before Emacs 28.1	2023-01-23 20:33:47 -03:00
Debanjum Singh Solanky	86e808abfb	Test get-current-text helpers for Find Similar feature in khoj.el	2023-01-23 20:33:47 -03:00
Debanjum Singh Solanky	be6acda212	Create khoj.el tests. Test rendering results of each content types	2023-01-23 20:33:47 -03:00
Debanjum Singh Solanky	0d0bf3b5aa	Simplify get-current-text functions for Find Similar in khoj.el Use existing functions like `string-trim', `thing-at-point' and remove unneeded code from the two functions	2023-01-23 19:15:52 -03:00
Debanjum Singh Solanky	07e9e4ecc3	Get current paragraph text when point at start of paragraph in khoj.el Previously if cursor was at start of current paragraph, it would get text for the current and next paragraph, instead of just the current one	2023-01-23 18:05:54 -03:00
Debanjum Singh Solanky	a0b03c8bb1	Get current entry text when point at heading for Find Similar in khoj.el Previously if cursor was at heading of current entry, it would find entries similar to the previous outline heading, instead of the current one	2023-01-23 10:01:25 -03:00
Debanjum Singh Solanky	013c7c10a4	Bump khoj pre-release version	2023-01-22 18:45:56 -03:00
Debanjum Singh Solanky	ad3c9b5f44	Bump khoj version to 0.2.5 in preparation for release	2023-01-22 18:18:21 -03:00
Debanjum Singh Solanky	9ed056c7e7	Use consistent indentation in Khoj Emacs Readme	2023-01-22 18:04:12 -03:00
Debanjum Singh Solanky	0980c6e87f	Update Emacs Usage section in Readme. Add find-similar, menu usage	2023-01-22 18:04:12 -03:00
Debanjum Singh Solanky	6908b6eed3	Truncate image queries below max tokens length supported by ML model This would previously return the infamous tensor size mismatch error Verify this error is not raised since adding the query truncation logic	2023-01-21 14:11:00 -03:00
Debanjum Singh Solanky	3d9ed91e42	Search by image at path only if query of form "file:/path/to/image" Previously no query syntax helpers, like the "file:" prefix, were used before checking if query contains file path. This made query to image search brittle to misinterpretation and pointless checking Add test to verify search by image at file works as expected	2023-01-21 14:06:56 -03:00
Debanjum	655ef11653	Find Similar Notes, Transactions, Images from Khoj in Emacs ### Overview Find items of specified type similar to current text item at point ### Capabilities - Support querying with text surrounding point in any text buffer - Find similar items of specified content type indexed on Khoj ### Details - Query using text in current section if in a `outline-mode` buffer (i.e markdown heading, org-mode entry text) - Query using text in current paragraph if in non `outline-mode` buffer - Search for items of `content-type` set in khoj transient menu - Update last used khoj content-type and results from the find-similar and update functions for later reuse ### Related - Recently added [Find Similar Notes in Khoj Obsidian](https://github.com/debanjum/khoj/pull/122) as well	2023-01-20 22:44:28 -03:00
Debanjum Singh Solanky	b7aa22a059	Change order of arg passed to query-api-and-render-results by importance	2023-01-20 22:13:24 -03:00
Debanjum Singh Solanky	936a88fa7e	Find items of specified type similar to current text item at point - Support querying with text surrounding point in any text buffer Previously could only find items similar to org entry at point - Find similar items of specified content type indexed on khoj Previously only looked for similar org entries indexed on khoj Now uses the content-type configured in khoj transient menu to find items of the specified content type - Details - Generalize the get-current-org-entry-text func to get text for any outline section - Replace leading whitespaces from query text as well - Create method to get current paragraph text from non-outline mode buffers - Update transient, find-similar funcs to pass, use content-type configured in khoj transient menu - Generalize query title creation logic to remove markdown headings prefix (#) apart from org heading prefix (*) as well - Update last used khoj content-type and results from the find-similar and update funcs for later reuse - Jump to top of results buffer after results rendered	2023-01-20 22:12:54 -03:00
Debanjum Singh Solanky	17aaadea1f	Find notes similar to current org entry at point	2023-01-20 05:14:54 -03:00
Debanjum Singh Solanky	44bbc0a417	Add section separators to khoj.el for easier code traversal	2023-01-19 23:36:54 -03:00
Debanjum	7516435a0b	Automate khoj.el build and quality checks - `9f0bd0a` Build `khoj.el' and Run `package-lint', `checkdoc' and other melpa package quality checks - `48ad3c5` Use default content types if fail to call backend on `khoj.el` load	2023-01-19 20:21:55 -03:00
Debanjum Singh Solanky	48ad3c535e	Use default content types if fail to call backend on khoj.el load Do not want khoj.el to fail on init/load if khoj backend not running	2023-01-19 20:13:49 -03:00
Debanjum Singh Solanky	9f0bd0a361	Add Github workflow for khoj.el build and quality checks Add khoj.el build badge to khoj.el Readme	2023-01-19 20:13:19 -03:00
Debanjum	b58dd82141	Use Transient Menu to Improve Khoj.el Interface - `5f446b1` Convert `khoj' entry point method to transient.el menu for richer configuration - `9d64a00` Allow updating khoj content index from within `khoj.el'	2023-01-19 03:11:23 -03:00
Debanjum Singh Solanky	0dd1cba272	Rename configuration sections in khoj.el transient menu	2023-01-19 03:03:08 -03:00
Debanjum Singh Solanky	5d0f369186	Add ability to quit khoj transient with standard q keybinding	2023-01-19 02:47:07 -03:00
Debanjum Singh Solanky	87c7cf4272	Use single khoj func as entrypoint. Group khoj.el code into sections - Give more relevant, specific name to khoj suffix commands - Remove `khoj-simple'. Have single `khoj' function for entrypoint	2023-01-19 02:38:19 -03:00
Debanjum Singh Solanky	9d64a009fd	Allow updating khoj content index from within khoj.el - Split transient config menu by type	2023-01-18 23:07:59 -03:00
Debanjum Singh Solanky	a8d0c7d905	Rename search type to more apt content type in khoj.el	2023-01-18 22:13:49 -03:00
Debanjum Singh Solanky	00daea16df	Allow setting default-search-type to image. Make docstrings compact	2023-01-18 22:01:17 -03:00
Debanjum Singh Solanky	216b17cfd0	Dynamically populate content type choices when khoj transient invoked	2023-01-18 22:00:56 -03:00
Debanjum Singh Solanky	5f446b1440	Convert main khoj.el entrypoint into transient menu for richer configuration	2023-01-18 21:50:07 -03:00
Debanjum Singh Solanky	5c07dcd219	Fix, update Obsidian Readme. Add Find Similar Notes to Implementation section	2023-01-18 00:22:26 -03:00
Debanjum	b7fc344be1	Search for Similar Notes from Obsidian Plugin Enable searching for notes similar to the current note being viewed ## Main Changes - `39a18e2` Extend search modal to search for similar notes - Hide input field on init, Trigger search on opening modal when in similar notes mode - Set input to contents of current markdown file and get notes similar to it - Re-rank, by default, when searching for similar notes - Filter out current note from similar note search results - `0bed410` Only show `Find Similar Note' command in Editor	2023-01-18 00:10:10 -03:00
Debanjum Singh Solanky	6119d0a69e	Add usage of "Find Similar Notes" command to the Khoj Obsidian Readme	2023-01-18 00:03:13 -03:00
Debanjum Singh Solanky	657e455785	Remove unused `onunload' method in main.ts of khoj obsidian plugin	2023-01-17 23:46:38 -03:00
Debanjum Singh Solanky	0bed410712	Limit Find Similar Note command to be triggered from Editor Fixup indentation and comments	2023-01-17 19:34:48 -03:00
Debanjum Singh Solanky	39a18e2080	Add ability to search for similar notes in Khoj Obsidian - Hide input field on init, Trigger search on opening modal in similar notes mode - Set input to current markdown file and get similar notes to it - Enable rerank when searching for similar notes - Filter out current note from similar note search results	2023-01-17 19:07:18 -03:00
Debanjum Singh Solanky	ffaef92476	Encode query string before passing as query param to search API	2023-01-17 18:04:11 -03:00
Debanjum Singh Solanky	d5a7cc5b0f	Compact code to map results from search API into SearchResult objects Make code compact for readability Remove unneeded temporary variables and return statements	2023-01-17 18:04:11 -03:00
Debanjum Singh Solanky	8ab7a26bde	Update Khoj on Obsidian screenshots in Main and Plugin Readme - Screenshot querying "Setup Editor" on test vault with Khoj Readmes - New features showcase: - information keybindings, rerank keybinding at bottom of modal - fixed top level headings in search results - search results snipped if greater than N words	2023-01-17 13:58:50 -03:00
Debanjum Singh Solanky	7b4f78776c	Fix extracting Markdown Entries with Top Level Headings - Previously top level headings would get stripped of the space between heading text and the prefix # symbols. That is, `# Top Level Heading' would get converted to `#Top Level Heading' - This would mess up their rendering as a heading in search results - Add unit tests to text_to_jsonl processors to prevent regression	2023-01-17 13:06:28 -03:00
Debanjum Singh Solanky	1a296518c5	Limit total words for each Search Result rendered in search modal Provides a more consistent rendering of results in modal. Makes it easier to see more results in modal. To see complete entry, user can always just jump to entry from modal	2023-01-17 13:06:14 -03:00
Debanjum Singh Solanky	e7b89f7fd0	Return compiled entry in additional details of /api/search response This can be used to highlight portion of raw entry to highlight and for passing to summarizer to stay with max_tokens limit supported by GPT models	2023-01-16 22:56:06 -03:00
Debanjum Singh Solanky	7071d081e9	Increase max_tokens returned by GPT summarizer. Remove default params	2023-01-16 22:55:36 -03:00
Debanjum Singh Solanky	3d9cdadbbb	Add codebase visualization of Khoj Obsidian to Khoj Obsidian Readme	2023-01-15 14:09:21 -03:00
Debanjum Singh Solanky	d02ba325aa	Handle empty chat history returned by API to chat.html on web interface	2023-01-15 13:51:16 -03:00
Debanjum Singh Solanky	721bbbe15c	Update Readme. Add Chat with Notes Section to Advanced Usage - Add Setup OpenAI API key in Khoj Section to Miscellaneous Refer all mentions of setting up your OpenAI API key to that section - Add Demo Screenshot of Chat with Notes - Put existing Miscellaneous Section under Beta API sub heading - Fix to make Access Khoj on Mobile a Subsection of Advanced Usage - Trigger refresh of github image cache by adding ? at end of image paths	2023-01-14 00:39:15 -03:00
Debanjum Singh Solanky	42f8230b37	Update Troubleshooting Section in Main Readme - Convert Troubleshooting Issues into Headings instead of Bullets Allows them to be linked to more easily. E.g when pointing folks to it in github issues etc - Add index corruption issue and fix to the Troubleshooting section	2023-01-13 23:03:15 -03:00
Debanjum	3f2ea039a7	Add Chat page to the Khoj Web Interface ### Overview - Provide a chat interface to engage with and inquire your notes - Simplify interacting with the beta `chat` and `summarize` APIs ### Use - Open `<khoj-url>/chat`, by default at http://localhost:8000/chat?type=summarize - Type your queries, see summarized response by Khoj from your notes Note: - You will need to add an API key from OpenAI to your khoj.yml - Your query and top note from search result will be sent to OpenAI for processing ## Details - `177756b` Show chat history on loading chat page on web interface - `d8ee0f0` Save chat history to disk for persistence, seeing chat logs - `5294693` Style chat messages as speech bubbles - `d170747` Add khoj web interface and chat styling to new chat page on khoj web - `de6c146` Implement functional, unstyled chat page for khoj web interface	2023-01-13 23:02:19 -03:00
Debanjum Singh Solanky	16d4560ff8	Comment css styling of chat page for later reference	2023-01-13 22:40:01 -03:00
Debanjum Singh Solanky	cfef346d03	Do not update query field on every chat message It doesn't work as well with chat, unlike for search page Use more appropriate thinking face emoji for you instead of surprise face	2023-01-13 22:24:26 -03:00
Debanjum Singh Solanky	177756be7e	Fetch chat history from backend and render it on chat page load	2023-01-13 22:01:57 -03:00
Debanjum Singh Solanky	330febaa1a	Update conversation logs from /beta/summary API endpoint too	2023-01-13 22:01:57 -03:00
Debanjum Singh Solanky	cb6f0b53c9	Make user_message_metadata arg to message_to_log in gpt.py optional - Use a default user_message_metadata if arg not set - Update conversation to use `by' as `you' and `khoj'	2023-01-13 22:01:57 -03:00
Debanjum Singh Solanky	cc2456e411	Update /beta/chat API to return chat history if no query param passed	2023-01-13 22:01:57 -03:00
Debanjum Singh Solanky	d8ee0f0e9a	Use scheduler to save chat history to disk every 5 minutes - The previous mechanism to trigger saving on shutdown event did not work - Use scheduler to persist chat sessions to disk at a 5 minute interval - This improve time granularity, fixed interval of saving chat logs - It may lose ~5 minutes of chat history until mechanism to also write on shutdown found/resolved - Create conversation directory if it doesn't exist before attempting write - Reset chat_session after writing it to disk	2023-01-13 22:01:57 -03:00
Debanjum Singh Solanky	5294693e97	Style message as speech bubbles on chat page of web interface - Wrap messages into speech bubbles - Color messages by khoj blue, sender grey - Add those standard protrusions to the speech bubbles for fun - Align bubbles left or right based on sender - messages by khoj are left aligned, message by self are right aligned - Put message metadata like sender and time under speech bubble - use data-* attribute and ::after css pseudo-selector for this - Update renderMessage func to accept time param, remove unused type_ param	2023-01-13 22:01:57 -03:00
Debanjum Singh Solanky	7723d656dc	Do not force GPT to summarize note using past tense Not all notes are in the past. Notes can be about stuff in the future. Casting them to past tense gives the impression that they've already happened / been done.	2023-01-13 13:10:35 -03:00
Debanjum Singh Solanky	2842e3a035	Automatically scroll to bottom of chat body on new messages	2023-01-13 13:09:51 -03:00
Debanjum Singh Solanky	34014635d0	Improve colors, fix contrast for accessibility on web interface - Changes - Use blue color for khoj heading font - This fixes the title color issue - Update background to lighter shade - This fixes the body text color issue - Update colors for todo, done, miscellaneous todo state, tag color - This does not fix the color contrast issue but seems like an acceptable solution - Using white text rather than black text on blue background is better even though the black text on blue background passes the WCAG acceptable contrast score - For details see blog post: https://uxmovement.com/buttons/the-myths-of-color-contrast-accessibility/ - Add border to tags to give them tag pills look and differentiate from todo states - Buttons and inputs - Change background color of input fields like type dropdown, update button and results count counter, to match background color of page - Add shadow on hover over button, dropdowns Resolves #111	2023-01-12 21:59:50 -03:00
Debanjum Singh Solanky	d170747ec2	Add khoj web interface & chat styling to new chat page on khoj web - Ensure message input box sticks to bottom of screen - Ensure chat logs div is scrollable when logs become longer than screen Do not make the whole page scroll, just the chat logs body div	2023-01-12 21:58:46 -03:00
Debanjum Singh Solanky	de6c146290	Implement functional, unstyled chat page for khoj web interface Expose it at /chat URL	2023-01-12 21:53:25 -03:00
Debanjum Singh Solanky	f0213d0a82	Fix links to install khoj.el readme from main readme	2023-01-12 02:25:00 -03:00
Debanjum Singh Solanky	e6793816f9	Upgrade Khoj.el Readme. Add TOC, Screenshot, Features Sections - Update Query filter details	2023-01-12 02:14:02 -03:00
Debanjum Singh Solanky	2fe21f3a78	Update Advanced Usage section in main Readme - Update Khoj PWA image to show Khoj open as PWA on Android - Add section to show configuring Khoj to use OpenAI models for search	2023-01-12 01:49:12 -03:00
Debanjum Singh Solanky	26f791e9ad	Update Obsidian Plugin Readme. Add Khoj icon to Khoj Modal Placeholder text - Fold Query Filter, Demo Description - Add Limitations to Readme - Add Update index bullet to Troubleshooting Options	2023-01-12 01:48:52 -03:00
Debanjum Singh Solanky	3e63af5c94	Constrain grid rows to fix layout of Khoj web interface on Chrome	2023-01-12 01:48:52 -03:00
Debanjum Singh Solanky	a31002bf38	Revert obsidian plugin manifest, versions at project root to 0.2.1	2023-01-11 20:54:12 -03:00
Debanjum Singh Solanky	50c797962c	Jump to Search Result from Khoj Modal even on Obsidian Android Uses longest file path match to find markdown file in vault corresponding to file of search result returned by Khoj Allow jumping to search result from khoj plugin modal on Android too	2023-01-11 19:44:11 -03:00
Debanjum Singh Solanky	51ea6d9c9b	Do not force index update when configure backend on plugin load - Backend can handle incremental updates - Avoid khoj usability delay by avoiding recomputed everytime vault opened	2023-01-11 17:17:08 -03:00
Debanjum Singh Solanky	3fe5ce2721	Merge branch 'master' of github.com:debanjum/khoj	2023-01-11 17:02:30 -03:00
Debanjum	e28af68cbd	Fix, Improve Configuring Khoj from Obsidian Plugin ### Details - `1c813a6` Convert Results Count setting to `Slider` from `Text` in plugin settings pane - `4e1abd1` Disable `Update` button in plugin settings while indexing vault - `513c86c` Set index file paths relative to current or default path on Khoj backend - `4407e23` Only index current vault on Khoj. Remove `ObsidianVaultPath` setting from plugin - `86a1e43` Return HTTP Exception on /api/update API call failure - `5af2b68` Update plugin notifications for errors. Remove notification for success	2023-01-11 17:01:33 -03:00
Debanjum Singh Solanky	123b077c68	Use apt update before apt install in test workflow on Github	2023-01-11 16:51:16 -03:00
Debanjum Singh Solanky	5996d47d7c	Trigger input event to Get, Render Reranked results from Khoj backend Previous mechanism of manually triggering getSuggestions, renderSuggestions flow was corrupting traversing and opening reranked search results in KhojModal Emulate event that would anyway trigger the get & render of results in modal. This lets obsidian core handle the flow without digging too deep into obsidian cores handling of the flow. Lowers the chance of breakage	2023-01-11 16:39:23 -03:00
Debanjum Singh Solanky	1c813a6884	Convert results count setting to slider in plugin settings pane	2023-01-11 16:39:23 -03:00
Debanjum Singh Solanky	4e1abd1b72	Disable update button while indexing vault in plugin settings	2023-01-11 16:39:23 -03:00
Debanjum Singh Solanky	513c86c6a1	Set index file paths relative to current or default path on khoj backend We need the index file paths to make sense on the khoj backend server Having path of index on backend relative to current vault directory on frontend ignores the fact that the frontend maybe on a different machine than the khoj backend server Using unique index name per vault allows switching vaults without overwriting indices of other vaults created on khoj backend when khoj obsidian plugin is loaded on opening a different vault	2023-01-11 16:39:23 -03:00
Debanjum Singh Solanky	4407e23c19	Only index current vault on Khoj. Remove plugin setting to configure it - Overview Limits using Khoj with a single vault at a time. This is automatically configured to the most recently opened vault. Once directory filters are supported on backend, the plugin will be updated to index multiple vault but search only current vault from current vaults khoj obsidian plugin - Code Details - Remove setting to configure Vault directory from Khoj Obsidian plugin - Automatically configure Khoj to index only current Vault. - Overwrites any previous vaults that were intended to be indexed by Khoj backend - Force update of index after configuring vault - Why It's not helpful for now and can lead to more problems, confusion. Once directory filters	2023-01-11 16:39:23 -03:00
Debanjum Singh Solanky	86a1e43605	Return HTTP Exception on /api/update API call failure - Previously the backend was just throwing backend error. The frontend calling the /update API wasn't getting notified - Now the frontend can react appropriately and make the issue visible to the user	2023-01-11 16:39:23 -03:00
Debanjum Singh Solanky	5af2b68e2b	Update plugin notifications for errors and success - Only show notification on plugin load and failure. - In settings page, set current backend status at top of pane instead of showing notification Notices bubbles cluttered the UI while typing updates to settings - Show notification once index updated via settings pane button click There was no notification on index updated, which usually takes time on the backend	2023-01-11 16:39:23 -03:00
Debanjum Singh Solanky	853192932a	setCTA on Khoj Obsidian plugin button. Minor cleanup of space, tabs	2023-01-10 23:36:02 -03:00
Debanjum	531d423715	Enhance Search Modal, Error State Handling in Khoj Obsidian Plugin ### Search Modal Enhancements - `b52cd85` Allow Reranking results using Keybinding from Khoj Search Modal - `580f4ac` Add hints to Modal for available Keybindings - `da49ea2` Add placeholder text to modal in Khoj Obsidian plugin ### Handle Failure to Connect to Khoj Backend Load plugin but warn on failure to connect to Khoj backend - `f046a95` Track connectedToBackend as a setting. Use it across obsidian plugin to: - Disable command if not connected to backend - Trigger warning notice on clicking Khoj ribbon if not connected to backend - Show warning at top of Khoj Obsidian plugin settings pane - `768e874` Load obsidian plugin even if fail to connect to backend but show warning - Allows user to see reason for failure to try resolve it - Allows user to update Khoj URL settings to point to URL of Khoj server ### Miscellaneous - `7991ab7` Add button in Obsidian plugin settings to force re-indexing your vault - Useful if index gets corrupted	2023-01-10 23:20:32 -03:00
Debanjum Singh Solanky	da49ea272c	Add placeholder text to modal in Khoj Obsidian plugin	2023-01-10 22:50:11 -03:00
Debanjum Singh Solanky	580f4aca23	Add hints to Modal for available Keybindings	2023-01-10 22:03:47 -03:00
Debanjum Singh Solanky	b52cd85c76	Allow Reranking Results using Keybinding from Khoj Search Modal	2023-01-10 21:59:38 -03:00
Debanjum Singh Solanky	7991ab7a86	Add button in Obsidian plugin settings to force re-indexing your vault	2023-01-10 19:49:12 -03:00
Debanjum Singh Solanky	f046a95f3d	Track connectedToBackend as a setting. Use it across obsidian plugin - Display warning at top of khoj obsidian plugin settings - Make search command available only if connected to backend - Show warning notice on clicking khoj search ribbon button - Call saveData after configureKhojBackend to ensure connectedToBackend setting saved after being (potentially) updated in configureKhojBackend function	2023-01-10 17:28:47 -03:00
Debanjum Singh Solanky	768e874185	Load obsidian plugin even if fail to connect to backend but show warning - Previously the plugin would not load if cannot connect to Khoj backend - Silently failing to load with no reason provided is not helpful - Load plugin to allow user to fix the Khoj URL in their plugin setting - Show reason for khoj plugin not working. More helpful than failing silently	2023-01-10 17:20:02 -03:00
Debanjum Singh Solanky	aa22d83172	Create and use a context manager to time code Use the timer context manager in all places where code was being timed - Benefits - Deduplicate timing code scattered across codebase. - Provides single place to manage perf timing code - Use consistent timing log patterns	2023-01-09 19:48:16 -03:00
Debanjum Singh Solanky	93f39dbd43	Add typing to text_search. Reformat code to set existing_embedding	2023-01-09 19:47:27 -03:00
Debanjum Singh Solanky	db7483329c	Only import type hint packages for type checking. Avoids circular imports Use annotations from the __future__ package to avoid having to quote type hints. This import will not be required after Python 3.11	2023-01-09 19:47:27 -03:00
Debanjum Singh Solanky	e5254a8e56	Create BaseEncoder class. Make OpenAI encoder its child. Use for typing - Set type of all bi_encoders to BaseEncoder - Make load_model return type Union of CrossEncoder and BaseEncoder	2023-01-09 19:47:27 -03:00
Debanjum Singh Solanky	cf7400759b	Remove unused render_results method from text and image search It's a relic from when khoj was being used as a python module	2023-01-09 19:47:27 -03:00
Debanjum Singh Solanky	afcfc3cd62	Split text_search.query logic into separate methods for modularity The query method had become too big. Extract out filter, score, sort and deduplicate logic used by text_search.query into separate methods. This should improve readability of code.	2023-01-09 19:47:27 -03:00
Debanjum Singh Solanky	8dc6ee8b6c	Pass `model' arg to extract_search_type method from beta search API Issue caught by mypy	2023-01-09 19:47:27 -03:00
Debanjum Singh Solanky	8498903641	Fix, add typing to Filter and TextSearchModel classes - Changes - Fix method signatures of BaseFilter subclasses. Else typing information isn't translating to them - Explicitly pass `entries: list[Entry]' as arg to `load' method - Fix type of `raw_entries' arg to `apply' method to list[Entry] from list[str] - Rename `raw_entries' arg to `apply' method to `entries' - Fix `raw_query' arg used in `apply' method of subclasses to `query' - Set type of entries, corpus_embeddings in TextSearchModel - Verification Ran `mypy --config-file .mypy.ini src' to verify typing	2023-01-09 19:47:27 -03:00
Debanjum Singh Solanky	d40076fcd6	Deduplicate test code, make teardown more robust using pytest fixtures	2023-01-09 19:47:27 -03:00
Debanjum Singh Solanky	eace7c6215	Use torch.tensor as torch.Tensor cannot create tensor on MPS device - `torch.Tensor' is apparently a legacy tensor constructor - Using that to create tensor on MPS devices throws error: RuntimeError: legacy constructor expects device type: cpu but device type: mps was passed - `torch.tensor' can handle creating tensors on Mac GPU (MPS) fine	2023-01-09 19:47:19 -03:00
Debanjum Singh Solanky	9def3f8c6f	Add exception handling to beta APIs, in case OpenAI API call fails	2023-01-09 01:27:06 -03:00
Debanjum Singh Solanky	7b164de021	Add beta API to summarize top search result using an OpenAI model This is unlike the more general chat API that combines summarization of top search result and conversing with the OpenAI model This should give faster summary results. As no intent categorization API call required	2023-01-09 01:25:59 -03:00
Debanjum Singh Solanky	d36da46f7b	Truncate prompt to not exceed OpenAI prompt limit Truncate prompt containing the top retrieved entry to 500 words to avoid triggering the max_token limit error	2023-01-09 00:51:46 -03:00
Debanjum Singh Solanky	237123d18c	Fix tests for the conversation processor - Use latest davinci model for tests - Wrap prompt in triple quotes to improve legibility - `understand' method returns dictionary instead of string. Fix its test - Fix prompt for new model to pass `chat_with_history' test	2023-01-09 00:22:26 -03:00
Debanjum Singh Solanky	918af5e6f8	Make OpenAI conversation model configurable via khoj.yml - Default to using `text-davinci-003' if conversation model not explicitly configured by user. Stop using the older `davinci' and `davinci-instruct' models - Use `model' instead of `engine' as parameter. Usage of `engine' parameter in OpenAI API is deprecated	2023-01-09 00:17:51 -03:00
Debanjum Singh Solanky	7e05389776	Quote all values passed to input-filter fields in sample yaml files	2023-01-08 22:40:18 -03:00
Debanjum Singh Solanky	0440f3fd57	Add encoder-type field to the search-type sections in khoj_sample.yml	2023-01-08 22:07:13 -03:00
Debanjum Singh Solanky	8b8e202ab3	Set input-filter to list in khoj_docker.yml and khoj_sample.yml `input-filter' was converted to a list a while back but the sample khoj configs were not updated to reflect this. This change fixes that	2023-01-08 21:08:00 -03:00
Debanjum Singh Solanky	74e779f8d0	Fix /beta/chat API to use Entry class instead of old dictionary pattern Search returns response of type SearchResponse instead of a dict now	2023-01-08 15:28:26 -03:00
Debanjum Singh Solanky	f2436039a0	Improve readability of GPT prompt strings in conversation processor	2023-01-08 15:27:41 -03:00
Debanjum	1c091e509b	Make Encoder Type Configurable. Allow using OpenAI Model for Search - `2fe37a0` Make type of encoder to use for embeddings configurable via `khoj.yml' - Previously `encoder_type' was set in the setup code of search_type - All encoders were of type `SentenceTransformer' - All cross_encoders were of type `CrossEncoder' - Now the `encoder_type' can be configured via the new `encoder_type' field in `TextSearchConfig' under `search_type` in `khoj.yml' - All the specified `encoder-type' class needs is an `encode' method that takes entries and returns embedding vectors - `826f9dc` Drop long words from compiled entries to be within max token limit of models Long words (>500 characters) provide less useful context to models. Dropping very long words allow models to create better embeddings by passing more of the useful context from the entry to the model - `c0ae8ee` Allow using OpenAI models for search in Khoj To use OpenAI models for search in Khoj, in `~/.khoj/khoj.yml' 1. Set `encoder' to name of an OpenAI model. E.g text-embedding-ada-002 2. Set `encoder-type' to src.utils.models.OpenAI 3. Set `model-directory` to null, as this is an online model and cannot be stored on the file system	2023-01-08 11:10:25 -03:00
Debanjum Singh Solanky	6119005838	Improve comments, exceptions, typing and init of OpenAI model code	2023-01-08 00:36:18 -03:00
Debanjum Singh Solanky	c0ae8eee99	Allow using OpenAI models for search in Khoj - Init processor before search to instantiate `openai_api_key' from `khoj.yml'. The key is used to configure search with openai models - To use OpenAI models for search in Khoj - Set `encoder' to name of an OpenAI model. E.g text-embedding-ada-002 - Set `encoder-type' in `khoj.yml' to `src.utils.models.OpenAI' - Set `model-directory' to `null', as online model cannot be stored on disk	2023-01-07 23:13:56 -03:00
Debanjum Singh Solanky	826f9dc054	Drop long words from compiled entries to be within max token limit of models Long words (>500 characters) provide less useful context to models. Dropping very long words allow models to create better embeddings by passing more of the useful context from the entry to the model	2023-01-07 23:13:56 -03:00
Debanjum Singh Solanky	6a30a13326	Only create model directory if the optional field is set in SearchConfig	2023-01-07 23:13:56 -03:00
Debanjum Singh Solanky	2fe37a090f	Make type of encoder to use for embeddings configurable via khoj.yml - Previously `model_type' was set in the setup of each `search_type' - All encoders were of type `SentenceTransformer' - All cross_encoders were of type `CrossEncoder' - Now `encoder-type' can be configured via the new `encoder_type' field in `TextSearchConfig' under `search-type` in `khoj.yml`. - All the specified `encoder-type' class needs is an `encode' method that takes entries and returns embedding vectors	2023-01-07 23:09:12 -03:00
Debanjum Singh Solanky	fa92adcf0d	Add Visualization of Codebase to Readme under Development Section Source from Github vNext Repo Visualizer at https://githubnext.com/projects/repo-visualization/	2023-01-05 20:11:56 -03:00
Debanjum Singh Solanky	8c7ffd7aee	Add Readme doc to fix failure to build tokenizer dependency	2023-01-05 20:11:56 -03:00
Debanjum Singh Solanky	d55d7d53dc	Fix GPU usage by Khoj on Macs to speed up search and indexing - Ensure all tensors are on MPS device before doing operations across them - Background - GPU is used by default for Khoj on MacOS now - Needed PyTorch > 1.13.0 on Macs to use GPU, which we do now - MPS should speed up search and indexing on MacOS	2023-01-05 15:39:09 -03:00
Debanjum Singh Solanky	7380518f24	Upgrade PyTorch, Pillow version to resolve Dependabot Security Advisories This also enables GPU usage by Khoj on MacOS as MPS support is now in PyTorch mainline	2023-01-05 15:39:09 -03:00
Debanjum	abd035e2fa	Merge PR #112 to fix quote usage in khoj.el docstring from suliveevil/master Fix usage warning for unescaped single quote in `khoj.el' docstring. Converts usage of '<text>' into `<text>' to use the correct quote forms in generated docs	2023-01-05 13:24:11 -03:00
Debanjum Singh Solanky	1dc1472c55	In publish workflow, make twine upload verbose to troubleshoot	2023-01-05 12:56:46 -03:00
Debanjum Singh Solanky	e792523849	Bump version in metadata packages for khoj, khoj.el and obsidian plugin	2023-01-05 12:50:27 -03:00
suliveevil	b2812b409f	fix docstring usage warning ⛔ Warning (comp): khoj.el:119:2: Warning: docstring has wrong usage of unescaped single quotes (use \= or different quoting) ⛔ Warning (comp): khoj.el:120:2: Warning: docstring has wrong usage of unescaped single quotes (use \= or different quoting) ⛔ Warning (comp): khoj.el:121:2: Warning: docstring has wrong usage of unescaped single quotes (use \= or different quoting) ⛔ Warning (comp): khoj.el:168:2: Warning: docstring has wrong usage of unescaped single quotes (use \= or different quoting)	2023-01-05 16:47:38 +08:00
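
Commit aa22d83172 in the list above describes replacing timing code scattered across the codebase with a single context manager. The diff for that commit is not shown here, so the following is only a minimal Python sketch of the idea. The name `timer`, its parameters and the log format are assumptions for illustration, not Khoj's actual code.

```python
import logging
import time
from contextlib import contextmanager

logger = logging.getLogger(__name__)


@contextmanager
def timer(message, log=logger):
    """Log how long the wrapped block took (hypothetical helper)."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Runs even if the wrapped block raises, so the timing is always logged.
        elapsed = time.perf_counter() - start
        log.debug("%s: %.3f seconds", message, elapsed)
```

Wrapping a block in `with timer("Query took"):` keeps the timing logic and the log pattern in one place, which matches the benefits the commit message lists.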

742 changed files with 113097 additions and 7317 deletions

									
										41

.devcontainer/Dockerfile
									
										Normal file
									
				@@ -0,0 +1,41 @@

				ARG PYTHON_VERSION=3.10

				FROM mcr.microsoft.com/devcontainers/python:${PYTHON_VERSION}

				# Install Node.js and Yarn

				RUN curl -fsSL https://deb.nodesource.com/setup_lts.x | bash - && \

				    apt-get install -y nodejs

				# Setup working directory

				WORKDIR /workspace

				# --- Python Server App Dependencies ---

				# Create Python virtual environment

				RUN python3 -m venv /opt/venv

				# Add venv to PATH for subsequent RUN commands and for the container environment

				ENV PATH="/opt/venv/bin:${PATH}"

				# Copy files required for Python dependency installation.

				COPY pyproject.toml README.md ./

				# Setup python environment

				    # Use the pre-built llama-cpp-python, torch cpu wheel

				ENV PIP_EXTRA_INDEX_URL="https://download.pytorch.org/whl/cpu https://abetlen.github.io/llama-cpp-python/whl/cpu" \

				    # Avoid downloading unused cuda specific python packages

				    CUDA_VISIBLE_DEVICES="" \

				    # Use static version to build app without git dependency

				    VERSION=0.0.0

				# Install Python dependencies (with dev extras) from pyproject.toml

				RUN sed -i "s/dynamic = \\[\"version\"\\]/version = \"$VERSION\"/" pyproject.toml && \

				    pip install --no-cache-dir ".[dev]"

				# --- Web App Dependencies ---

				# Copy web app manifest files

				COPY src/interface/web/package.json src/interface/web/yarn.lock /tmp/web/

				# Install web app dependencies

				# note: yarn will be available from the "features" in devcontainer.json

				RUN yarn install --cwd /tmp/web --cache-folder /opt/yarn-cache

				# The .venv and node_modules are now populated in the image.

				# The rest of the source code will be mounted by VS Code from your local checkout,

				# overlaying any files copied here if they are part of the workspace mount.
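The `sed` step in the Dockerfile above swaps the dynamic version entry in `pyproject.toml` for a static one so the package can be built without git metadata. A minimal Python sketch of the same substitution follows. The function name and the sample `pyproject.toml` text are hypothetical, and the sketch assumes the entry appears only once in the file.

```python
import re


def pin_static_version(pyproject_text: str, version: str = "0.0.0") -> str:
    """Replace the `dynamic = ["version"]` entry with a static version string."""
    # Same pattern the sed command matches once shell escaping is resolved
    pattern = r'dynamic = \["version"\]'
    # A function replacement avoids backslash processing of the version string
    return re.sub(pattern, lambda _m: f'version = "{version}"', pyproject_text, count=1)


# Hypothetical pyproject.toml fragment, for illustration only
sample = '[project]\nname = "khoj"\ndynamic = ["version"]\n'
pinned = pin_static_version(sample)
print(pinned)
```

The Dockerfile passes the value through the `VERSION` environment variable, which defaults to `0.0.0` there; the default argument here mirrors that.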

									
										63

.devcontainer/devcontainer.json
									
										Normal file
									
				@@ -0,0 +1,63 @@

				{

				    "name": "Khoj Development Environment",

				    "build": {

				        "dockerfile": "Dockerfile",

				        "context": "..", // Build context is the project root

				        "args": {

				            "PYTHON_VERSION": "3.10"

				        }

				    },

				    "forwardPorts": [

				        42110

				    ],

				    "containerEnv": {

				        "USE_EMBEDDED_DB": "True"

				    },

				    "customizations": {

				        "vscode": {

				            "extensions": [

				                "ms-python.python",

				                "ms-python.vscode-pylance",

				                "ms-python.black-formatter",

				                "ms-python.mypy-type-checker",

				                "ms-python.isort",

				                "esbenp.prettier-vscode",

				                "GitHub.copilot",

				                "GitHub.copilot-chat",

				                "GitHub.vscode-pull-request-github",

				                "github.vscode-github-actions",

				                "unifiedjs.vscode-mdx"

				            ],

				            "settings": {

				                "python.defaultInterpreterPath": "/opt/venv/bin/python",

				                "python.formatting.provider": "black",

				                "python.linting.enabled": true,

				                "python.linting.mypyEnabled": true,

				                "python.linting.mypyArgs": [

				                    "--config-file=pyproject.toml"

				                ],

				                "mypy.configFile": "pyproject.toml",

				                "isort.args": [

				                    "--profile",

				                    "black",

				                    "--filter-files"

				                ],

				                "python.testing.pytestArgs": [

				                    "tests"

				                ],

				                "python.testing.unittestEnabled": false,

				                "python.testing.pytestEnabled": true

				            }

				        }

				    },

				    "postCreateCommand": "scripts/dev_setup.sh --devcontainer",

				    "features": {

				        "ghcr.io/devcontainers/features/github-cli:1": {},

				        "ghcr.io/devcontainers/features/node:1": {

				            "version": "lts",

				            "installYarnUsingApt": false,

				            "nodeGypDependencies": true

				        }

				    },

				    "remoteUser": "vscode"

				}

									
										29

.devcontainer/launch.json
									
										Normal file
									
				@@ -0,0 +1,29 @@

				{

				    "version": "0.2.0",

				    "configurations": [

				        {

				            "name": "Launch Khoj",

				            "type": "debugpy",

				            "request": "launch",

				            "module": "src.khoj.main",

				            "console": "integratedTerminal",

				            "justMyCode": false,

				            "sudo": true,

				            "args": [

				                "-v",

				                "--anonymous-mode",

				                "--non-interactive",

				                "--port=42110"

				            ],

				            "envFile": "${workspaceFolder}/.env",

				            "env": {

				                "KHOJ_ADMIN_EMAIL": "admin",

				                "KHOJ_ADMIN_PASSWORD": "admin",

				                "USE_EMBEDDED_DB": "true",

				                "KHOJ_DEBUG": "true",

				                "KHOJ_TELEMETRY_DISABLED": "true",

				                "PROMPTRACE_DIR": "${workspaceFolder}/promptrace"

				            }

				        }

				    ]

				}

14

.dockerignore

@@ -1,9 +1,11 @@
 .git/
 .pytest_cache/
 .vscode/
 .venv/
 docs/
 .*
 **/__pycache__/
 *.egg-info/
 documentation/
 tests/
 build/
 dist/
 *.egg-info/
 scripts/
 src/interface/
 src/telemetry/
 !src/interface/web

2

.gitattributes vendored Normal file

@@ -0,0 +1,2 @@
 # Exclude tests data file from programming stats on Github
 tests/data/** linguist-vendored

									
										144

.github/ISSUE_TEMPLATE/bug-report.yml
									
										vendored
									
										Normal file
									
				@@ -0,0 +1,144 @@

				name: "🪲 Bug Report"

				description: "Use this template to report a bug."

				labels: ["fix"]

				body:

				    - type: "markdown"

				      attributes:

				          value: |

				              > [!IMPORTANT]

				              > To save time for both you and us, try to follow these guidelines before submitting a new issue:

				              > 1. Check if there is an existing issue tracking your bug on our Github.

				              > 2. When unsure if your issue is an actual bug, first discuss it on a [Github discussion](https://github.com/khoj-ai/khoj/discussions/new?category=q-a) or the **#bugs** channel of our [Discord server](https://discord.gg/b6gUdpKr).

				              > These steps avoid opening issues which are duplicate or not actual bugs.

				    - type: "checkboxes"

				      id: "server"

				      attributes:

				          label: "Server"

				          description: "With which Khoj server are you experiencing the issue? (select at least one)"

				          options:

				              - label: "Cloud (https://app.khoj.dev)"

				                required: false

				              - label: "Self-Hosted Docker"

				                required: false

				              - label: "Self-Hosted Python package"

				                required: false

				              - label: "Self-Hosted source code"

				                required: false

				      validations:

				          required: true

				    - type: "checkboxes"

				      id: "clients"

				      attributes:

				          label: "Clients"

				          description: "With which Khoj client(s) are you experiencing the issue? (select at least one)"

				          options:

				              - label: "Web browser"

				                required: false

				              - label: "Desktop/mobile app"

				                required: false

				              - label: "Obsidian"

				                required: false

				              - label: "Emacs"

				                required: false

				              - label: "WhatsApp"

				                required: false

				      validations:

				          required: true

				    - type: "checkboxes"

				      id: "os"

				      attributes:

				          label: "OS"

				          description: "On which operating system do you experience the issue? (select at least one)"

				          options:

				              - label: "Windows"

				                required: false

				              - label: "macOS"

				                required: false

				              - label: "Linux"

				                required: false

				              - label: "Android"

				                required: false

				              - label: "iOS"

				                required: false

				      validations:

				          required: true

				    - type: "input"

				      id: "version"

				      attributes:

				          label: "Khoj version"

				          description: "Use `/help` command on the chat page to find the server version.\n

				                        If using the cloud service, you can input, latest.\n

				                        If self-hosting - please make sure to run the latest version of Khoj before reporting any issues as it may have already been fixed."

				      validations:

				          required: true

				    - type: "textarea"

				      id: "description"

				      attributes:

				          label: "Describe the bug"

				          description: "What is the problem? A clear and concise description of the bug."

				      validations:

				          required: true

				    - type: "textarea"

				      id: "current"

				      attributes:

				          label: "Current Behavior"

				          description: "What actually happened?\n\n

				                        Please include full errors, uncaught exceptions, stack traces, screenshots and other relevant logs."

				      validations:

				          required: true

				    - type: "textarea"

				      id: "expected"

				      attributes:

				          label: "Expected Behavior"

				          description: "What did you expect to happen?"

				      validations:

				          required: true

				    - type: "textarea"

				      id: "reproduction"

				      attributes:

				          label: "Reproduction Steps"

				          description: "Detail the steps needed to reproduce the issue. This can include a self-contained, concise snippet of

				                        code, if applicable.\n\n

				                        For more complex issues, provide a link to a repository with the smallest sample that reproduces

				                        the bug.\n

				                        If the issue can be replicated without code, please provide a clear, step-by-step description of

				                        the actions or conditions necessary to reproduce it. Any screenshots are also appreciated.\n

				                        Avoid including business logic or unrelated details, as this makes diagnosis more difficult.\n\n

				                        Whether it's a sequence of actions, code samples, or specific conditions, ensure that the steps

				                        are clear enough to be easily followed and replicated."

				      validations:

				          required: true

				    - type: "textarea"

				      id: "workaround"

				      attributes:

				          label: "Possible Workaround"

				          description: "If you find any workaround for this problem - please, provide it here."

				      validations:

				          required: false

				    - type: "textarea"

				      id: "context"

				      attributes:

				          label: "Additional Information"

				          description: "Anything else that might be relevant for troubleshooting this bug.\n

				                        Providing context helps us come up with a solution that is most useful in the real-world use case.\n

				                        For example, if self-hosting, provide environment details like OS version, Docker version etc."

				      validations:

				          required: false

				    - type: "input"

				      id: "discussion_link"

				      attributes:

				          label: "Link to Discord or Github discussion"

				          description: "Provide a link to the first message of bug's discussion on Discord or Github.\n

				                        This will help to keep history of why this bug exists."

				      validations:

				          required: false

									
										54

.github/ISSUE_TEMPLATE/feature-request.yml
									
										vendored
									
										Normal file
									
				@@ -0,0 +1,54 @@

				name: "🙋 Feature Request"

				description: "Use this template to request a new feature or suggest an idea for Khoj"

				labels: ["upgrade"]

				body:

				    - type: "markdown"

				      attributes:

				          value: |

				              > [!IMPORTANT]

				              > To save time for both you and us, try to follow these guidelines before submitting a feature request:

				              > 1. Check if there is an existing feature request that is similar to yours on our Github.

				              > 2. We encourage you to first discuss your idea on a [Github discussion](https://github.com/khoj-ai/khoj/discussions/categories/ideas) or the **#ideas** channel of our [Discord server](https://discord.gg/b6gUdpKr).

				              > This step helps in understanding the new feature and determining if it can be implemented at all.

				              Only proceed with this report if your idea was approved after the GitHub/Discord discussion.

				    - type: "textarea"

				      id: "description"

				      attributes:

				          label: "Describe the feature"

				          description: "A clear and concise description of the feature you are proposing."

				      validations:

				          required: true

				    - type: "textarea"

				      id: "use-case"

				      attributes:

				          label: "Use Case"

				          description: "Why do you need this feature? Provide real world use cases, the more the better."

				      validations:

				          required: true

				    - type: "textarea"

				      id: "solution"

				      attributes:

				          label: "Proposed Solution"

				          description: "Suggest how to implement the new feature. Please include prototype/sketch/reference implementation."

				      validations:

				          required: false

				    - type: "textarea"

				      id: "additional_info"

				      attributes:

				          label: "Additional Information"

				          description: "Any additional information you would like to provide - links, screenshots, etc."

				      validations:

				          required: false

				    - type: "input"

				      id: "discussion_link"

				      attributes:

				          label: "Link to Discord or Github discussion"

				          description: "Provide a link to the first message of feature request's discussion on Discord or Github.\n

				                        This will help to keep history of why this feature request exists."

				      validations:

				          required: false

									
										45

.github/workflows/build.yml
									
										vendored
									
				@@ -1,45 +0,0 @@

				name: build

				on:

				  push:

				    branches:

				      - master

				    paths:

				      - src/**

				      - config/**

				      - setup.py

				      - Dockerfile

				      - docker-compose.yml

				      - .github/workflows/build.yml

				  workflow_dispatch:

				env:

				  DOCKER_IMAGE_TAG: ${{ github.ref == 'refs/heads/master' && 'latest' || github.ref_name }}

				jobs:

				  build:

				    name: Build Docker Image, Push to Container Registry

				    runs-on: ubuntu-latest

				    steps:

				      - name: Checkout Code

				        uses: actions/checkout@v2

				      - name: Set up Docker Buildx

				        uses: docker/setup-buildx-action@v1

				      - name: Login to GitHub Container Registry

				        uses: docker/login-action@v1

				        with:

				          registry: ghcr.io

				          username: ${{ github.repository_owner }}

				          password: ${{ secrets.PAT }}

				      - name: Build and Push Docker Image

				        uses: docker/build-push-action@v2

				        with:

				          context: .

				          file: Dockerfile

				          push: true

				          tags: ghcr.io/${{ github.repository }}:${{ env.DOCKER_IMAGE_TAG }}

				          build-args: |

				            PORT=8000

									
										39

.github/workflows/build_khoj_el.yml
									
										vendored
									
										Normal file
									
				@@ -0,0 +1,39 @@

				# melpa quality checks like checkdoc, byte-compile, package-lint for khoj.el

				# using melpazoid: https://github.com/riscy/melpazoid

				name: build khoj.el

				on:

				  push:

				    branches:

				      - 'master'

				    paths:

				      - src/interface/emacs/*.el

				      - .github/workflows/build_khoj_el.yml

				  pull_request:

				    branches:

				      - 'master'

				    paths:

				      - src/interface/emacs/*.el

				      - .github/workflows/build_khoj_el.yml

				jobs:

				  build:

				    runs-on: ubuntu-latest

				    steps:

				    - uses: actions/checkout@v2

				    - name: Set up Python 3.11

				      uses: actions/setup-python@v1

				      with: { python-version: 3.11 }

				    - name: ⏬️ Install Dependencies

				      run: |

				        python -m pip install --upgrade pip

				        sudo apt-get install emacs && emacs --version

				        git clone https://github.com/riscy/melpazoid.git ~/melpazoid

				        pip install ~/melpazoid

				    - name: 🌡️ Validate Khoj.el

				      env:

				        # Khoj recipe from https://github.com/melpa/melpa/pull/8321/files

				        RECIPE: (khoj :fetcher github :repo "khoj-ai/khoj" :files ("src/interface/emacs/*.el"))

				        EXIST_OK: true

				        LOCAL_REPO: ${{ github.workspace }}

				      run: echo $GITHUB_REF && make -C ~/melpazoid

									
										99

.github/workflows/desktop.yml
									
										vendored
									
										Normal file
									
				@@ -0,0 +1,99 @@

				name: desktop

				on:

				  push:

				    tags:

				      - "*"

				    branches:

				      - 'master'

				    paths:

				      - src/interface/desktop/**

				      - .github/workflows/desktop.yml

				jobs:

				  build:

				    name: 🖥️ Build, Release Desktop App

				    runs-on: ubuntu-latest

				    env:

				      TODESKTOP_ACCESS_TOKEN: ${{ secrets.TODESKTOP_ACCESS_TOKEN }}

				      TODESKTOP_EMAIL: ${{ secrets.TODESKTOP_EMAIL }}

				    defaults:

				      run:

				        shell: bash

				        working-directory: src/interface/desktop

				    steps:

				      - name: ⬇️ Checkout Code

				        uses: actions/checkout@v3

				        with:

				          fetch-depth: 0

				      - name: ⤵️ Install Node

				        uses: actions/setup-node@v3

				        with:

				          node-version: "lts/*"

				      - name: ⚙️ Setup Desktop Build

				        run: |

				          yarn

				          npm install -g @todesktop/cli

				          sed -i "s/\"id\": \"\"/\"id\": \"${{ secrets.TODESKTOP_ID }}\"/g" todesktop.json

				      - name: ⚙️ Build Desktop App

				        run: |

				          npx todesktop build

				      # - name: 📦 Release Desktop App

				      #   if: startsWith(github.ref, 'refs/tags/')

				      #   run: |

				      #     npx todesktop release --latest --force

				      - name: ⤵️ Get Desktop Apps

				        if: startsWith(github.ref, 'refs/tags/')

				        run: |

				          build_no=`npx todesktop builds --latest | tail -n 1 | awk -F'/' '{print $NF}'`

				          sleep 900  # wait for 15 minutes for the build to be available

				          wget https://download.khoj.dev/builds/$build_no/mac/dmg/arm64 -O khoj-${{ github.ref_name }}-arm64.dmg

				          wget https://download.khoj.dev/builds/$build_no/mac/dmg/x64 -O khoj-${{ github.ref_name }}-x64.dmg

				          wget https://download.khoj.dev/builds/$build_no/windows/nsis/x64 -O khoj-${{ github.ref_name }}-x64.exe

				          wget https://download.khoj.dev/builds/$build_no/linux/deb/x64 -O khoj-${{ github.ref_name }}-x64.deb

				          wget https://download.khoj.dev/builds/$build_no/linux/appImage/x64 -O khoj-${{ github.ref_name }}-x64.AppImage

				      - name: ⏫ Upload Mac ARM App

				        if: startsWith(github.ref, 'refs/tags/')

				        uses: actions/upload-artifact@v4

				        with:

				          if-no-files-found: warn

				          name: khoj-${{ github.ref_name }}-arm64.dmg

				          path: src/interface/desktop/khoj-${{ github.ref_name }}-arm64.dmg

				      - name: ⏫ Upload Mac x64 App

				        if: startsWith(github.ref, 'refs/tags/')

				        uses: actions/upload-artifact@v4

				        with:

				          if-no-files-found: warn

				          name: khoj-${{ github.ref_name }}-x64.dmg

				          path: src/interface/desktop/khoj-${{ github.ref_name }}-x64.dmg

				      - name: ⏫ Upload Windows App

				        if: startsWith(github.ref, 'refs/tags/')

				        uses: actions/upload-artifact@v4

				        with:

				          if-no-files-found: warn

				          name: khoj-${{ github.ref_name }}-x64.exe

				          path: src/interface/desktop/khoj-${{ github.ref_name }}-x64.exe

				      - name: ⏫ Upload Debian App

				        if: startsWith(github.ref, 'refs/tags/')

				        uses: actions/upload-artifact@v4

				        with:

				          if-no-files-found: warn

				          name: khoj-${{ github.ref_name }}-x64.deb

				          path: src/interface/desktop/khoj-${{ github.ref_name }}-x64.deb

				      - name: ⏫ Upload Linux App Image

				        if: startsWith(github.ref, 'refs/tags/')

				        uses: actions/upload-artifact@v4

				        with:

				          if-no-files-found: warn

				          name: khoj-${{ github.ref_name }}-x64.AppImage

				          path: src/interface/desktop/khoj-${{ github.ref_name }}-x64.AppImage
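The "Get Desktop Apps" step in the workflow above takes the last line of the `npx todesktop builds --latest` output, keeps the text after the final `/` as the build number, and uses it to form five download URLs. A rough Python sketch of that parsing is below. The function names are hypothetical, and the sample CLI output is invented for illustration since the real output format is not shown in this diff.

```python
from typing import List

# Platform paths taken from the wget commands in the workflow
TARGETS = [
    "mac/dmg/arm64",
    "mac/dmg/x64",
    "windows/nsis/x64",
    "linux/deb/x64",
    "linux/appImage/x64",
]


def latest_build_id(cli_output: str) -> str:
    """Mimic `tail -n 1 | awk -F'/' '{print $NF}'` on the CLI output."""
    lines = cli_output.splitlines()
    if not lines:
        return ""
    # Last '/'-separated field of the last line
    return lines[-1].split("/")[-1]


def download_urls(build_id: str, base: str = "https://download.khoj.dev/builds") -> List[str]:
    """Build one download URL per platform target."""
    return [f"{base}/{build_id}/{target}" for target in TARGETS]


# Hypothetical CLI output; only the trailing path segment matters here
sample_output = "Latest build:\nhttps://example.invalid/apps/demo/builds/build123\n"
build_id = latest_build_id(sample_output)
urls = download_urls(build_id)
print(build_id, urls[0])
```

If the last line contains no `/`, the whole line is returned, which matches how `awk` treats a single-field record.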

									
										178

.github/workflows/dockerize.yml
									
										vendored
									
										Normal file
									
				@@ -0,0 +1,178 @@

				name: dockerize

				on:

				  push:

				    tags:

				      - "*"

				    branches:

				      - master

				    paths:

				      - src/khoj/**

				      - src/interface/web/**

				      - pyproject.toml

				      - Dockerfile

				      - prod.Dockerfile

				      - computer.Dockerfile

				      - docker-compose.yml

				      - .github/workflows/dockerize.yml

				  workflow_dispatch:

				    inputs:

				      tag:

				        description: 'Docker image tag'

				        default: 'dev'

				      khoj:

				        description: 'Build Khoj docker image'

				        type: boolean

				        default: true

				      khoj-cloud:

				        description: 'Build Khoj cloud docker image'

				        type: boolean

				        default: true

				      khoj-computer:

				        description: 'Build computer for Khoj'

				        type: boolean

				        default: true

				env:

				  # Tag Image with tag name on release

				  # else with user specified tag (default 'dev') if triggered via workflow

				  # else with run_id if triggered via a pull request

				  # else with 'pre' (if push to master)

				  DOCKER_IMAGE_TAG: ${{ github.ref_type == 'tag' && github.ref_name || github.event_name == 'workflow_dispatch' && github.event.inputs.tag || 'pre' }}

				jobs:

				  build:

				    name: Publish Khoj Docker Images

				    strategy:

				      fail-fast: false

				      matrix:

				        include:

				          - image: 'local'

				            platform: linux/amd64

				            runner: ubuntu-latest

				          - image: 'local'

				            platform: linux/arm64

				            runner: ubuntu-linux-arm64

				          - image: 'cloud'

				            platform: linux/amd64

				            runner: ubuntu-latest

				          - image: 'cloud'

				            platform: linux/arm64

				            runner: ubuntu-linux-arm64

				    runs-on: ${{ matrix.runner }}

				    steps:

				      - name: Checkout Code

				        uses: actions/checkout@v3

				        with:

				          # Get all history to correctly infer Khoj version using hatch

				          fetch-depth: 0

				      - name: Set up Docker Buildx

				        uses: docker/setup-buildx-action@v2

				      - name: Login to GitHub Container Registry

				        uses: docker/login-action@v2

				        with:

				          registry: ghcr.io

				          username: ${{ github.repository_owner }}

				          password: ${{ secrets.PAT }}

				      - name: Get App Version

				        id: hatch

				        run: echo "version=$(pipx run hatch version)" >> $GITHUB_OUTPUT

				      - name: 🧹 Delete huge unnecessary tools folder

				        run: rm -rf /opt/hostedtoolcache

				      - name: 📦 Build and Push Docker Image

				        uses: docker/build-push-action@v4

				        if: (matrix.image == 'local' && github.event_name == 'workflow_dispatch') && github.event.inputs.khoj == 'true' || (matrix.image == 'local' && github.event_name == 'push')

				        with:

				          context: .

				          file: Dockerfile

				          push: true

				          tags: |

				            ghcr.io/${{ github.repository }}:${{ env.DOCKER_IMAGE_TAG }}-${{ matrix.platform == 'linux/amd64' && 'amd64' || 'arm64' }}

				          build-args: |

				            VERSION=${{ steps.hatch.outputs.version }}

				            PORT=42110

				          cache-from: type=gha,scope=${{ matrix.image }}-${{ matrix.platform }}

				          cache-to: type=gha,mode=max,scope=${{ matrix.image }}-${{ matrix.platform}}

				          labels: |

				            org.opencontainers.image.description=Khoj AI - Your second brain powered by LLMs and Neural Search

				            org.opencontainers.image.source=${{ github.server_url }}/${{ github.repository }}

				      - name: 📦️⛅️ Build and Push Cloud Docker Image

				        uses: docker/build-push-action@v4

				        if: (matrix.image == 'cloud' && github.event_name == 'workflow_dispatch') && github.event.inputs.khoj-cloud == 'true' || (matrix.image == 'cloud' && github.event_name == 'push')

				        with:

				          context: .

				          file: prod.Dockerfile

				          push: true

				          tags: |

				            ghcr.io/${{ github.repository }}-cloud:${{ env.DOCKER_IMAGE_TAG }}-${{ matrix.platform == 'linux/amd64' && 'amd64' || 'arm64' }}

				          build-args: |

				            VERSION=${{ steps.hatch.outputs.version }}

				            PORT=42110

				          cache-from: type=gha,scope=${{ matrix.image }}-${{ matrix.platform }}

				          cache-to: type=gha,mode=max,scope=${{ matrix.image }}-${{ matrix.platform}}

				          labels: |

				            org.opencontainers.image.description=Khoj AI Cloud - Your second brain powered by LLMs and Neural Search

				            org.opencontainers.image.source=${{ github.server_url }}/${{ github.repository }}

				      - name: 📦️️💻 Build and Push Computer for Khoj

				        uses: docker/build-push-action@v4

				        if: github.event_name == 'workflow_dispatch' && github.event.inputs.khoj-computer == 'true' && matrix.image == 'local'

				        with:

				          context: .

				          file: computer.Dockerfile

				          push: true

				          tags: |

				            ghcr.io/${{ github.repository }}-computer:${{ env.DOCKER_IMAGE_TAG }}-${{ matrix.platform == 'linux/amd64' && 'amd64' || 'arm64' }}

				          cache-from: type=gha,scope=computer-${{ matrix.platform }}

				          cache-to: type=gha,mode=max,scope=computer-${{ matrix.platform }}

				          labels: |

				            org.opencontainers.image.description=Khoj AI Computer - A computer for your second brain to operate

				            org.opencontainers.image.source=${{ github.server_url }}/${{ github.repository }}

				  manifest:

				    needs: build

				    runs-on: ubuntu-latest

				    if: github.event_name != 'pull_request'

				    steps:

				      - name: Set up Docker Buildx

				        uses: docker/setup-buildx-action@v3

				      - name: Login to GitHub Container Registry

				        uses: docker/login-action@v2

				        with:

				          registry: ghcr.io

				          username: ${{ github.repository_owner }}

				          password: ${{ secrets.PAT }}

				      - name: Create and Push Local Manifest

				        if: github.event.inputs.khoj == 'true' || github.event_name == 'push'

				        run: |

				          docker buildx imagetools create \

				            -t ghcr.io/${{ github.repository }}:${{ env.DOCKER_IMAGE_TAG }} \

				            -t ghcr.io/${{ github.repository }}:${{ github.ref_type == 'tag' && 'latest' || env.DOCKER_IMAGE_TAG }} \

				            ghcr.io/${{ github.repository }}:${{ env.DOCKER_IMAGE_TAG }}-amd64 \

				            ghcr.io/${{ github.repository }}:${{ env.DOCKER_IMAGE_TAG }}-arm64

				      - name: Create and Push Cloud Manifest

				        if: github.event.inputs.khoj-cloud == 'true' || github.event_name == 'push'

				        run: |

				          docker buildx imagetools create \

				            -t ghcr.io/${{ github.repository }}-cloud:${{ env.DOCKER_IMAGE_TAG }} \

				            -t ghcr.io/${{ github.repository }}-cloud:${{ github.ref_type == 'tag' && 'latest' || env.DOCKER_IMAGE_TAG }} \

				            ghcr.io/${{ github.repository }}-cloud:${{ env.DOCKER_IMAGE_TAG }}-amd64 \

				            ghcr.io/${{ github.repository }}-cloud:${{ env.DOCKER_IMAGE_TAG }}-arm64

				      - name: Create and Push Computer Manifest

				        if: github.event.inputs.khoj-computer == 'true'

				        run: |

				          docker buildx imagetools create \

				            -t ghcr.io/${{ github.repository }}-computer:${{ env.DOCKER_IMAGE_TAG }} \

				            -t ghcr.io/${{ github.repository }}-computer:${{ github.ref_type == 'tag' && 'latest' || env.DOCKER_IMAGE_TAG }} \

				            ghcr.io/${{ github.repository }}-computer:${{ env.DOCKER_IMAGE_TAG }}-amd64 \

				            ghcr.io/${{ github.repository }}-computer:${{ env.DOCKER_IMAGE_TAG }}-arm64
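The `DOCKER_IMAGE_TAG` expression in the workflow above chains `&&` and `||` as a ternary. As written it has three outcomes: the tag name on a tag push, the user-supplied tag on a manual dispatch, and `pre` otherwise. The sketch below models that in Python, where `and`/`or` return operands in a comparable way. The function names are hypothetical, and this is an approximation of GitHub's expression semantics rather than a reimplementation of them.

```python
from typing import List, Optional


def docker_image_tag(ref_type: str, ref_name: str, event_name: str,
                     input_tag: Optional[str] = None) -> str:
    """Approximate the DOCKER_IMAGE_TAG expression from the workflow."""
    return (
        (ref_type == "tag" and ref_name)
        or (event_name == "workflow_dispatch" and input_tag)
        or "pre"
    )


def arch_suffix(platform: str) -> str:
    """Per-architecture suffix appended to image tags by the build job."""
    return "amd64" if platform == "linux/amd64" else "arm64"


def manifest_tags(ref_type: str, image_tag: str) -> List[str]:
    """Tags applied by the manifest job; a tag push also gets 'latest'."""
    second = "latest" if ref_type == "tag" else image_tag
    return [image_tag, second]


release_tag = docker_image_tag("tag", "1.42.5", "push")
print(release_tag, arch_suffix("linux/arm64"), manifest_tags("tag", release_tag))
```

One consequence of the chained form is that an empty dispatch input falls through to `pre`, since an empty string is falsy in both GitHub expressions and Python.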

									
										47

.github/workflows/dockerize_telemetry_server.yml
									
										vendored
									
										Normal file
									
				@@ -0,0 +1,47 @@

				name: dockerize telemetry server

				on:

				  push:

				    branches:

				      - master

				    paths:

				      - src/telemetry/**

				      - .github/workflows/dockerize_telemetry_server.yml

				  pull_request:

				    branches:

				      - master

				    paths:

				      - src/telemetry/**

				      - .github/workflows/dockerize_telemetry_server.yml

				  workflow_dispatch:

				env:

				  DOCKER_IMAGE_TAG: ${{ github.ref == 'refs/heads/master' && 'latest' || github.event.pull_request.number }}

				jobs:

				  build:

				    name: Build Docker Image, Push to Container Registry

				    runs-on: ubuntu-latest

				    steps:

				      - name: Checkout Code

				        uses: actions/checkout@v3

				      - name: Set up Docker Buildx

				        uses: docker/setup-buildx-action@v2

				      - name: Login to GitHub Container Registry

				        uses: docker/login-action@v2

				        with:

				          registry: ghcr.io

				          username: ${{ github.repository_owner }}

				          password: ${{ secrets.PAT }}

				      - name: 📦 Build and Push Docker Image

				        uses: docker/build-push-action@v2

				        with:

				          context: src/telemetry

				          file: src/telemetry/Dockerfile

				          push: true

				          tags: ghcr.io/${{ github.repository }}-telemetry:${{ env.DOCKER_IMAGE_TAG }}

				          secrets: |

				            "POSTHOG_API_KEY=${{ secrets.POSTHOG_API_KEY }}"

									
46

.github/workflows/github_pages_deploy.yml vendored Normal file

				@@ -0,0 +1,46 @@

				name: deploy documentation

				on:

				  push:

				    branches:

				      - 'master'

				permissions:

				  contents: read

				  pages: write

				  id-token: write

				jobs:

				    deploy:

				      environment:

				        name: github-pages

				        url: https://docs.khoj.dev

				      runs-on: ubuntu-latest

				      steps:

				        - name: Checkout

				          uses: actions/checkout@v4

				        # 👇 Build steps

				        - name: Set up Node.js

				          uses: actions/setup-node@v4

				          with:

				            node-version: 18.x

				            cache: yarn

				            cache-dependency-path: documentation/yarn.lock

				        - name: Install dependencies

				          run: |

				            cd documentation

				            yarn install --frozen-lockfile --non-interactive

				        - name: Build

				          run: |

				            cd documentation

				            yarn build

				        # 👆 Build steps

				        - name: Setup Pages

				          uses: actions/configure-pages@v5

				        - name: Upload artifact

				          uses: actions/upload-pages-artifact@v3

				          with:

				            # 👇 Specify build output path

				            path: documentation/build

				        - name: Deploy to GitHub Pages

				          id: deployment

				          uses: actions/deploy-pages@v4

									
48

.github/workflows/pre-commit.yml vendored Normal file

				@@ -0,0 +1,48 @@

				name: pre-commit

				on:

				  pull_request:

				    paths:

				      - src/**

				      - tests/**

				      - config/**

				      - pyproject.toml

				      - .pre-commit-config.yml

				      - .github/workflows/test.yml

				  push:

				    branches:

				      - master

				    paths:

				      - src/khoj/**

				      - tests/**

				      - config/**

				      - pyproject.toml

				      - .pre-commit-config.yml

				      - .github/workflows/test.yml

				jobs:

				  test:

				    name: Setup Application and Lint

				    runs-on: ubuntu-latest

				    strategy:

				      fail-fast: false

				    steps:

				      - uses: actions/checkout@v3

				        with:

				          fetch-depth: 0

				      - name: Set up Python 3.11

				        uses: actions/setup-python@v4

				        with:

				          python-version: 3.11

				      - name: ⏬️ Install Dependencies

				        run: |

				          sudo apt update && sudo apt install -y libegl1

				          python -m pip install --upgrade pip

				      - name: ⬇️ Install Application

				        run: pip install --no-cache-dir --upgrade .[dev]

				      - name: 🌡️ Validate Application

				        run: pre-commit run --hook-stage manual --all

									
95

.github/workflows/publish.yml vendored

				@@ -1,95 +0,0 @@

				name: publish

				on:

				  push:

				    tags:

				      - "*"

				    branches:

				      - 'master'

				    paths:

				      - src/**

				      - setup.py

				      - .github/workflows/publish.yml

				  pull_request:

				    branches:

				      - 'master'

				    paths:

				      - src/**

				      - setup.py

				      - .github/workflows/publish.yml

				jobs:

				  publish:

				    name: Publish App to PyPI

				    runs-on: ubuntu-latest

				    steps:

				      - uses: actions/checkout@v3

				      - name: Set up Python 3.10

				        uses: actions/setup-python@v4

				        with:

				          python-version: '3.10'

				      - name: Install Dependencies

				        run: |

				          python -m pip install --upgrade pip

				          pip install build twine

				      - name: Install Application

				        run: |

				          pip install --upgrade .

				      - name: Publish Release to PyPI

				        if: startsWith(github.ref, 'refs/tags')

				        env:

				          TWINE_USERNAME: __token__

				          TWINE_PASSWORD: ${{ secrets.PYPI_API_KEY }}

				        run: |

				          # Setup Environment for Reproducible Builds

				          export PYTHONHASHSEED=42

				          export SOURCE_DATE_EPOCH=$(git log -1 --pretty=%ct)

				          # Build and Upload PyPi Package

				          rm -rf dist

				          python -m build

				          twine check dist/*

				          twine upload dist/*

				      - name: Publish Master to PyPI

				        if: github.ref == 'refs/heads/master'

				        env:

				          TWINE_USERNAME: __token__

				          TWINE_PASSWORD: ${{ secrets.PYPI_API_KEY }}

				        run: |

				          # Set Pre-Release Version

				          sed -E -i "s/version=(.*)',/version=\1a$(date +%s)',/g" setup.py

				          # Setup Environment for Reproducible Builds

				          export PYTHONHASHSEED=42

				          export SOURCE_DATE_EPOCH=$(git log -1 --pretty=%ct)

				          # Build and Upload PyPi Package

				          rm -rf dist

				          python -m build

				          twine check dist/*

				          twine upload dist/*

				      - name: Publish PR to Test PyPI

				        if: github.event_name == 'pull_request'

				        env:

				          TWINE_USERNAME: __token__

				          TWINE_PASSWORD: ${{ secrets.TEST_PYPI_API_KEY }}

				          PULL_REQUEST_NUMBER: ${{ github.event.number }}

				        run: |

				          # Set Development Release Version

				          sed -E -i "s/version=(.*)',/version=\1.dev$PULL_REQUEST_NUMBER$(date +%s)',/g" setup.py

				          # Setup Environment for Reproducible Builds

				          export PYTHONHASHSEED=42

				          export SOURCE_DATE_EPOCH=$(git log -1 --pretty=%ct)

				          # Build and Upload PyPi Package

				          rm -rf dist

				          python -m build

				          twine check dist/*

				          twine upload -r testpypi dist/*

									
81

.github/workflows/pypi.yml vendored Normal file

				@@ -0,0 +1,81 @@

				name: pypi

				on:

				  push:

				    tags:

				      - "*"

				    branches:

				      - 'master'

				    paths:

				      - src/khoj/**

				      - src/interface/web/**

				      - pyproject.toml

				      - .github/workflows/pypi.yml

				  pull_request:

				    branches:

				      - 'master'

				    paths:

				      - src/khoj/**

				      - src/interface/web/**

				      - pyproject.toml

				      - .github/workflows/pypi.yml

				  workflow_dispatch:

				jobs:

				  publish:

				    name: Publish Python Package to PyPI

				    runs-on: ubuntu-latest

				    permissions:

				      id-token: write

				    steps:

				      - uses: actions/checkout@v4

				        with:

				          fetch-depth: 0

				      - name: Set up Python 3.11

				        uses: actions/setup-python@v4

				        with:

				          python-version: '3.11.12'

				      - name: ⬇️ Install Server

				        run: python -m pip install --upgrade pip && pip install --upgrade .

				      - name: ⬇️ Install Web Client

				        run: |

				          yarn install

				          yarn pypiciexport

				        working-directory: src/interface/web

				      - name: 📂 Copy Generated Files

				        run: |

				          mkdir -p src/khoj/interface/compiled

				          cp -r /opt/hostedtoolcache/Python/3.11.12/x64/lib/python3.11/site-packages/khoj/interface/compiled/* src/khoj/interface/compiled/

				      - name: ⚙️ Build Python Package

				        run: |

				          # Setup Environment for Reproducible Builds

				          export PYTHONHASHSEED=42

				          export SOURCE_DATE_EPOCH=$(git log -1 --pretty=%ct)

				          rm -rf dist

				          # Build PyPI Package

				          pipx run build

				      - name: 🌡️ Validate Python Package

				        run: |

				          # Validate PyPi Package

				          pipx run check-wheel-contents dist/*.whl --ignore W004

				          pipx run twine check dist/*

				      - name: ⏫ Upload Python Package Artifacts

				        uses: actions/upload-artifact@v4

				        with:

				          name: khoj

				          path: dist/khoj-*.whl

				      - name: 📦 Publish Python Package to PyPI

				        if: startsWith(github.ref, 'refs/tags') || github.ref == 'refs/heads/master'

				        uses: pypa/gh-action-pypi-publish@release/v1.12

				        with:

				          skip-existing: true

				          print-hash: true

									
127

.github/workflows/release.yml vendored

				@@ -12,106 +12,55 @@ on:

				        type: string

				jobs:

				  publish:

				    strategy:

				      matrix:

				        include:

				        - os: ubuntu-latest

				          extension: deb

				        - os: macos-latest

				          extension: dmg

				        - os: windows-latest

				          extension: exe

				    runs-on: ${{ matrix.os }}

				  publish_obsidian_plugin:

				    name: 💎 Publish Obsidian Plugin

				    runs-on: ubuntu-latest

				    permissions:

				      contents: write

				    defaults:

				      run:

				        shell: bash

				        working-directory: src/interface/obsidian

				    steps:

				      - uses: actions/checkout@v3

				      - name: Set up Python 3.9

				        uses: actions/setup-python@v4

				      - name: Install Node

				        uses: actions/setup-node@v3

				        with:

				          python-version: '3.9'

				          node-version: "lts/*"

				      - name: Install Dependencies

				        shell: bash

				      - name: ⚙️ Build Obsidian Plugin

				        run: |

				          if [ "$RUNNER_OS" == "Linux" ]; then

				            sudo apt install libegl1 libxcb-xinerama0 python3-tk -y

				          fi

				          python -m pip install --upgrade pip

				          pip install pyinstaller

				          yarn

				          yarn run build --if-present

				      - name: Install Khoj App

				        run: |

				          pip install --upgrade .

				      - name: Package Khoj App

				        shell: bash

				        run: |

				          # Setup Environment for Reproducible Builds

				          export PYTHONHASHSEED=42

				          export SOURCE_DATE_EPOCH=$(git log -1 --pretty=%ct)

				          pyinstaller --noconfirm Khoj.spec

				          if [ "$RUNNER_OS" == "Windows" ]; then

				            mv dist/Khoj.exe dist/khoj_"$GITHUB_REF_NAME"_amd64.exe

				          fi

				      - name: Create Mac App DMG

				        if: matrix.os == 'macos-latest'

				        run: |

				         # Install Mac DMG Creator

				          brew install create-dmg

				          # Copy app to separate dmg folder

				          mkdir -p dist/dmg && cp -r dist/Khoj.app dist/dmg

				          # Create disk image with the app

				          create-dmg \

				            --volname "Khoj" \

				            --volicon "src/interface/web/assets/icons/favicon.icns" \

				            --window-pos 200 120 \

				            --window-size 600 300 \

				            --icon-size 100 \

				            --icon "Khoj.app" 175 120 \

				            --hide-extension "Khoj.app" \

				            --app-drop-link 425 120 \

				            "dist/khoj_"$GITHUB_REF_NAME"_amd64.dmg" \

				            "dist/dmg/"

				      - uses: ruby/setup-ruby@v1

				        if: matrix.os == 'ubuntu-latest'

				      - name: ⏫ Upload Obsidian Plugin main.js

				        uses: actions/upload-artifact@v4

				        with:

				          ruby-version: '3.0'

				      - name: Create Debian Package

				        if: matrix.os == 'ubuntu-latest'

				        shell: bash

				        env:

				          DEBIAN_PACKAGE_VERSION: ${{ inputs.version }}

				        run: |

				          # Install Debian Packager

				          gem install fpm

				          if-no-files-found: error

				          name: main.js

				          path: src/interface/obsidian/main.js

				          # Copy app files into expected output directory structure

				          mkdir -p package/opt package/usr/share/applications package/usr/share/icons/hicolor/128x128/apps

				          cp -r dist/Khoj package/opt/Khoj

				          cp src/interface/web/assets/icons/favicon-128x128.png package/usr/share/icons/hicolor/128x128/apps/Khoj.png

				          cp Khoj.desktop package/usr/share/applications

				          # Fix permissions to be usable by non-root users

				          find package/usr/share -type f -exec chmod 644 -- {} +

				          chmod 755 package/opt/Khoj

				          # Package the app

				          if [ -z "$DEBIAN_PACKAGE_VERSION" ]; then

				            DEBIAN_PACKAGE_VERSION=$(echo $GITHUB_REF_NAME | sed -E 's/v(.*)/\1/g')

				          fi

				          fpm -C package -s dir -t deb -n Khoj --version $DEBIAN_PACKAGE_VERSION -p dist/khoj_"$GITHUB_REF_NAME"_amd64.deb

				      - uses: actions/upload-artifact@v3

				      - name: ⏫ Upload Obsidian Plugin manifest.json

				        uses: actions/upload-artifact@v4

				        with:

				          name: khoj_${{github.ref_name}}_amd64.${{matrix.extension}}

				          path: dist/khoj_${{github.ref_name}}_amd64.${{matrix.extension}}

				          if-no-files-found: error

				          name: manifest.json

				          path: src/interface/obsidian/manifest.json

				      - name: Release

				      - name: ⏫ Upload Obsidian Plugin styles.css

				        uses: actions/upload-artifact@v4

				        with:

				          if-no-files-found: error

				          name: styles.css

				          path: src/interface/obsidian/styles.css

				      - name: 🌈 Create Release

				        uses: softprops/action-gh-release@v1

				        if: startsWith(github.ref, 'refs/tags/')

				        with:

				          files: dist/khoj_${{github.ref_name}}_amd64.${{matrix.extension}}

				          generate_release_notes: true

				          files: |

				            src/interface/obsidian/main.js

				            src/interface/obsidian/manifest.json

				            src/interface/obsidian/styles.css

									
217

.github/workflows/run_evals.yml vendored Normal file

				@@ -0,0 +1,217 @@

				name: eval

				on:

				  # Run on every release

				  push:

				    tags:

				      - "*"

				  # Allow manual triggers from GitHub UI

				  workflow_dispatch:

				    inputs:

				      khoj_mode:

				        description: 'Khoj Mode (general/default/research)'

				        required: true

				        default: 'default'

				        type: choice

				        options:

				          - general

				          - default

				          - research

				      dataset:

				        description: 'Dataset to evaluate (frames/simpleqa)'

				        required: true

				        default: 'frames'

				        type: choice

				        options:

				          - frames

				          - simpleqa

				          - gpqa

				          - math500

				      sample_size:

				        description: 'Number of samples to evaluate'

				        required: false

				        default: 200

				        type: number

				      sandbox:

				        description: 'Code sandbox to use'

				        required: false

				        default: 'terrarium'

				        type: choice

				        options:

				          - terrarium

				          - e2b

				      chat_model:

				        description: 'Chat model to use'

				        required: false

				        default: 'gemini-2.0-flash'

				        type: string

				      max_research_iterations:

				        description: 'Maximum number of iterations in research mode'

				        required: false

				        default: 5

				        type: number

				      openai_api_key:

				        description: 'OpenAI API key'

				        required: false

				        default: ''

				        type: string

				      openai_base_url:

				        description: 'Base URL of OpenAI compatible API'

				        required: false

				        default: 'https://api.openai.com/v1'

				        type: string

				      auto_read_webpage:

				        description: 'Auto read webpage on online search'

				        required: false

				        default: 'false'

				        type: choice

				        options:

				          - 'false'

				          - 'true'

				      randomize:

				        description: 'Randomize the sample of questions'

				        required: false

				        default: 'true'

				        type: choice

				        options:

				          - 'false'

				          - 'true'

				jobs:

				  eval:

				    runs-on: ubuntu-latest

				    strategy:

				      matrix:

				        # Use input from manual trigger if available, else run all combinations

				        khoj_mode: ${{ github.event_name == 'workflow_dispatch' && fromJSON(format('["{0}"]', inputs.khoj_mode)) || fromJSON('["general", "default", "research"]') }}

				        dataset: ${{ github.event_name == 'workflow_dispatch' && fromJSON(format('["{0}"]', inputs.dataset)) || fromJSON('["frames", "gpqa"]') }}

				    services:

				      postgres:

				        image: ankane/pgvector

				        env:

				          POSTGRES_PASSWORD: postgres

				          POSTGRES_USER: postgres

				          POSTGRES_DB: postgres

				        ports:

				          - 5432:5432

				        options: >-

				          --health-cmd pg_isready

				          --health-interval 10s

				          --health-timeout 5s

				          --health-retries 5

				    steps:

				      - uses: actions/checkout@v3

				        with:

				          fetch-depth: 0

				      - name: Set up Python

				        uses: actions/setup-python@v4

				        with:

				          python-version: '3.10'

				      - name: Get App Version

				        id: hatch

				        run: |

				          # Mask relevant workflow inputs as secret early

				          OPENAI_API_KEY=$(jq -r '.inputs.openai_api_key' $GITHUB_EVENT_PATH)

				          echo ::add-mask::$OPENAI_API_KEY

				          echo OPENAI_API_KEY="$OPENAI_API_KEY" >> $GITHUB_ENV

				          # Get app version from hatch

				          echo "version=$(pipx run hatch version)" >> $GITHUB_OUTPUT

				      - name: ⏬️ Install Dependencies

				        env:

				          DEBIAN_FRONTEND: noninteractive

				        run: |

				          # install dependencies

				          sudo apt update && sudo apt install -y git python3-pip libegl1 sqlite3 libsqlite3-dev libsqlite3-0 ffmpeg libsm6 libxext6

				          # upgrade pip

				          python -m ensurepip --upgrade && python -m pip install --upgrade pip

				          # install terrarium for code sandbox

				          git clone https://github.com/khoj-ai/terrarium.git && cd terrarium && npm install --legacy-peer-deps && mkdir pyodide_cache

				      - name: ⬇️ Install Application

				        run: |

				          sed -i 's/dynamic = \["version"\]/version = "${{ steps.hatch.outputs.version }}"/' pyproject.toml

				          pip install --upgrade .[dev]

				      - name: 📝 Run Eval

				        env:

				          KHOJ_MODE: ${{ matrix.khoj_mode }}

				          SAMPLE_SIZE: ${{ github.event_name == 'workflow_dispatch' && inputs.sample_size || 200 }}

				          BATCH_SIZE: "20"

				          RANDOMIZE: ${{ github.event_name == 'workflow_dispatch' && inputs.randomize || 'true' }}

				          KHOJ_URL: "http://localhost:42110"

				          KHOJ_LLM_SEED: "42"

				          KHOJ_DEFAULT_CHAT_MODEL: ${{ github.event_name == 'workflow_dispatch' && inputs.chat_model || 'gemini-2.0-flash' }}

				          KHOJ_RESEARCH_ITERATIONS: ${{ github.event_name == 'workflow_dispatch' && inputs.max_research_iterations || 10 }}

				          KHOJ_AUTO_READ_WEBPAGE: ${{ github.event_name == 'workflow_dispatch' && inputs.auto_read_webpage || 'false' }}

				          GEMINI_API_KEY: ${{ secrets.GEMINI_API_KEY }}

				          OPENAI_BASE_URL: ${{ github.event_name == 'workflow_dispatch' && inputs.openai_base_url || 'https://api.openai.com/v1' }}

				          SERPER_DEV_API_KEY: ${{ matrix.dataset != 'math500' && secrets.SERPER_DEV_API_KEY || '' }}

				          OLOSTEP_API_KEY: ${{ matrix.dataset != 'math500' && secrets.OLOSTEP_API_KEY || ''}}

				          FIRECRAWL_API_KEY: ${{ matrix.dataset != 'math500' && secrets.FIRECRAWL_API_KEY || '' }}

				          HF_TOKEN: ${{ secrets.HF_TOKEN }}

				          E2B_API_KEY: ${{ inputs.sandbox == 'e2b' && secrets.E2B_API_KEY || '' }}

				          E2B_TEMPLATE: ${{ vars.E2B_TEMPLATE }}

				          KHOJ_ADMIN_EMAIL: khoj

				          KHOJ_ADMIN_PASSWORD: khoj

				          POSTGRES_HOST: localhost

				          POSTGRES_PORT: 5432

				          POSTGRES_USER: postgres

				          POSTGRES_PASSWORD: postgres

				          POSTGRES_DB: postgres

				          USE_EMBEDDED_DB: "true"

				          KHOJ_TELEMETRY_DISABLE: "True"  # To disable telemetry for tests

				        run: |

				          # Start Khoj server in background

				          khoj --anonymous-mode --non-interactive &

				          # Start code sandbox

				          npm install -g pm2

				          NODE_ENV=production npm run ci --prefix terrarium

				          # Wait for server to be ready

				          timeout=120

				          while ! curl -s http://localhost:42110/api/health > /dev/null; do

				            if [ $timeout -le 0 ]; then

				              echo "Timed out waiting for Khoj server"

				              exit 1

				            fi

				            echo "Waiting for Khoj server..."

				            sleep 2

				            timeout=$((timeout-2))

				          done

				          # Run evals

				          python tests/evals/eval.py -d ${{ matrix.dataset }}

				      - name: Upload Results

				        if: always()  # Upload results even if tests fail

				        uses: actions/upload-artifact@v4

				        with:

				          name: eval-results-${{ steps.hatch.outputs.version }}-${{ matrix.khoj_mode }}-${{ matrix.dataset }}

				          path: |

				            *_evaluation_results_*.csv

				            *_evaluation_summary_*.txt

				      - name: Display Results

				        if: always()

				        run: |

				          # Read and display summary

				          echo "## Evaluation Summary of Khoj on ${{ matrix.dataset }} in ${{ matrix.khoj_mode }} mode" >> $GITHUB_STEP_SUMMARY

				          echo "**$(head -n 1 *_evaluation_summary_*.txt)**" >> $GITHUB_STEP_SUMMARY

				          echo "- Khoj Version: ${{ steps.hatch.outputs.version }}" >> $GITHUB_STEP_SUMMARY

				          echo "- Chat Model: ${{ inputs.chat_model || 'gemini-2.0-flash' }}" >> $GITHUB_STEP_SUMMARY

				          echo "- Code Sandbox: ${{ inputs.sandbox || 'terrarium' }}" >> $GITHUB_STEP_SUMMARY

				          echo "\`\`\`" >> $GITHUB_STEP_SUMMARY

				          tail -n +2 *_evaluation_summary_*.txt >> $GITHUB_STEP_SUMMARY

				          echo "" >> $GITHUB_STEP_SUMMARY

				          echo "\`\`\`" >> $GITHUB_STEP_SUMMARY

				          # Display in logs too

				          echo "===== EVALUATION RESULTS ====="

				          cat *_evaluation_summary_*.txt

									
81

.github/workflows/test.yml vendored

				@@ -2,46 +2,91 @@ name: test

				on:

				  pull_request:

				    branches:

				      - 'master'

				    paths:

				      - src/**

				      - src/khoj/**

				      - tests/**

				      - '!tests/evals/**'

				      - config/**

				      - setup.py

				      - pyproject.toml

				      - .pre-commit-config.yml

				      - .github/workflows/test.yml

				  push:

				    branches:

				      - 'master'

				      - master

				    paths:

				      - src/**

				      - src/khoj/**

				      - tests/**

				      - '!tests/evals/**'

				      - config/**

				      - setup.py

				      - pyproject.toml

				      - .pre-commit-config.yml

				      - .github/workflows/test.yml

				jobs:

				  test:

				    name: Run Tests

				    runs-on: ubuntu-latest

				    container: ubuntu:latest

				    strategy:

				      fail-fast: false

				      matrix:

				        python_version:

				          - '3.10'

				          - '3.11'

				          - '3.12'

				    services:

				      postgres:

				        image: ankane/pgvector

				        env:

				          POSTGRES_PASSWORD: postgres

				          POSTGRES_USER: postgres

				        ports:

				          - 5432:5432

				        options: --health-cmd pg_isready --health-interval 10s --health-timeout 5s --health-retries 5

				    steps:

				      - uses: actions/checkout@v3

				        with:

				          fetch-depth: 0

				      - name: Set up Python 3.10

				      - name: Set up Python

				        uses: actions/setup-python@v4

				        with:

				          python-version: '3.10'

				          python-version: ${{ matrix.python_version }}

				      - name: Install Dependencies

				      - name: ⏬️ Install Dependencies

				        env:

				          DEBIAN_FRONTEND: noninteractive

				        run: |

				          sudo apt install libegl1 -y

				          python -m pip install --upgrade pip

				          pip install pytest

				          apt update && apt install -y git libegl1 sqlite3 libsqlite3-dev libsqlite3-0 ffmpeg libsm6 libxext6

				          # required by llama-cpp-python prebuilt wheels

				          apt install -y musl-dev && ln -s /usr/lib/x86_64-linux-musl/libc.so /lib/libc.musl-x86_64.so.1

				      - name: Install Application

				        run: |

				          pip install --upgrade .

				      - name: ⬇️ Install Postgres

				        env:

				          DEBIAN_FRONTEND: noninteractive

				        run : |

				          apt install -y postgresql postgresql-client && apt install -y postgresql-server-dev-16

				      - name: Test Application

				      - name: ⬇️ Install pip

				        run: |

				          pytest 

				          apt install -y python3-pip

				          python3 -m ensurepip --upgrade

				          python3 -m pip install --upgrade pip

				      - name: ⬇️ Install Application

				        env:

				          PIP_EXTRA_INDEX_URL: "https://download.pytorch.org/whl/cpu https://abetlen.github.io/llama-cpp-python/whl/cpu"

				          CUDA_VISIBLE_DEVICES: ""

				        run: sed -i 's/dynamic = \["version"\]/version = "0.0.0"/' pyproject.toml && pip install --break-system-packages --upgrade .[dev]

				      - name: 🧪 Test Application

				        env:

				          POSTGRES_HOST: postgres

				          POSTGRES_PORT: 5432

				          POSTGRES_USER: postgres

				          POSTGRES_PASSWORD: postgres

				          POSTGRES_DB: postgres

				        run: pytest

				        timeout-minutes: 10

									
52

.github/workflows/test_khoj_el.yml vendored Normal file

				@@ -0,0 +1,52 @@

				name: test khoj.el

				on:

				  push:

				    branches:

				      - 'master'

				    paths:

				      - src/interface/emacs/*.el

				      - src/interface/emacs/tests/*.el

				      - .github/workflows/test_khoj_el.yml

				  pull_request:

				    branches:

				      - 'master'

				    paths:

				      - src/interface/emacs/*.el

				      - src/interface/emacs/tests/*.el

				      - .github/workflows/test_khoj_el.yml

				jobs:

				  test:

				    runs-on: ubuntu-latest

				    strategy:

				      fail-fast: false

				      matrix:

				        emacs_version:

				          - 27.1

				          - 27.2

				          - 28.1

				          - 28.2

				          - snapshot

				    steps:

				      - uses: purcell/setup-emacs@master

				        with:

				          version: ${{ matrix.emacs_version }}

				      - uses: actions/checkout@v3

				      - name: 🧪 Test Khoj.el

				        run: |

				           # Run ERT tests on khoj.el

				           emacs -batch \

				           --eval "(progn \

				                    (require 'package) \

				                    (push '(\"melpa\" . \"https://melpa.org/packages/\") package-archives) \

				                    (package-initialize) \

				                    (unless package-archive-contents (package-refresh-contents)) \

				                    (unless (package-installed-p 'transient) (package-install 'transient)) \

				                    (unless (package-installed-p 'dash) (package-install 'dash)) \

				                    (unless (package-installed-p 'org) (package-install 'org)) \

				                   )" \

				           -l ert \

				           -l ./src/interface/emacs/khoj.el \

				           -l ./src/interface/emacs/tests/khoj-tests.el \

				           -f ert-run-tests-batch-and-exit

30

.gitignore vendored

@@ -1,9 +1,11 @@
 # Khoj artifacts
 *.gz
 *.pt
 src/.data
 tests/data/models
 tests/data/embeddings
 tests/evals/**.txt
 tests/evals/**.csv
 pgserver_data/
 # External app artifacts
 __pycache__
@@ -11,15 +13,21 @@ __pycache__
 .emacs.desktop*
 *.py[cod]
 .vscode
 .env
 .venv/*
 todesktop.json
 # Build artifacts
 /src/interface/web/images
 /src/khoj/interface/web/images
 /src/khoj/interface/built/
 /src/khoj/interface/compiled/404.html
 /build/
 /dist/
 /khoj_assistant.egg-info/
 khoj_assistant.egg-info
 /config/khoj*.yml
 .pytest_cache
 khoj.log
 *.log
 /src/khoj/static
 # Obsidian plugin artifacts
 # ---
@@ -28,10 +36,22 @@ node_modules
 # Don't include the compiled obsidian main.js file in the repo.
 # They should be uploaded to GitHub releases instead.
 main.js
 src/interface/obsidian/main.js
 # Exclude sourcemaps
 *.map
 # IntelliJ
 .idea
 # obsidian
 data.json
 # Android
 src/interface/android/.gradle
 src/interface/android/app/build
 src/interface/android/build
 src/interface/android/*.aab
 src/interface/android/*.apk
 src/interface/android/*.apk.idsig
 src/interface/android/*.keystore
 src/interface/android/local.properties

									
13

.mypy.ini

				@@ -1,13 +0,0 @@

				[mypy]

				strict_optional = False

				ignore_missing_imports = True

				install_types = True

				non_interactive = True

				show_error_codes = True

				exclude = (?x)(

				    src/interface/desktop/main_window.py

				    | src/interface/desktop/file_browser.py

				    | src/interface/desktop/system_tray.py

				    | build/*

				    | tests/*

				  )

									
33

.pre-commit-config.yaml Normal file

				@@ -0,0 +1,33 @@

				repos:

				- repo: https://github.com/psf/black

				  rev: 23.1.0

				  hooks:

				  - id: black

				- repo: https://github.com/pre-commit/pre-commit-hooks

				  rev: v4.4.0

				  hooks:

				  - id: end-of-file-fixer

				  - id: trailing-whitespace

				    # Exclude elisp files to not clear page breaks

				    exclude: \.el$

				  - id: check-json

				    exclude: (devcontainer\.json|launch\.json)$

				  - id: check-toml

				  - id: check-yaml

				- repo: https://github.com/pycqa/isort

				  rev: 5.12.0

				  hooks:

				  - id: isort

				    name: isort (python)

				    args: ["--profile", "black", "--filter-files"]

				- repo: https://github.com/pre-commit/mirrors-mypy

				  rev: v1.0.0

				  hooks:

				    - id: mypy

				      stages: [pre-push, manual]

				      pass_filenames: false

				      args:

				      - --config-file=pyproject.toml

									
										37

.vscode/launch.json
									
										vendored
									
										Normal file
									
												
				@@ -0,0 +1,37 @@

				{

				    "version": "0.2.0",

				    "configurations": [

				        {

				            "name": "Launch Khoj",

				            "type": "debugpy",

				            "request": "launch",

				            "module": "src.khoj.main",

				            "console": "integratedTerminal",

				            "justMyCode": false,

				            "sudo": false,

				            "args": [

				                "-v",

				                "--anonymous-mode",

				                // "--non-interactive",

				                "--port=42110",

				                // "--host=example.com",

				                // "--sslcert=ssl.crt",

				                // "--sslkey=ssl.key",

				            ],

				            "envFile": "${workspaceFolder}/.env",

				            "env": {

				                "KHOJ_ADMIN_EMAIL": "admin",

				                "KHOJ_ADMIN_PASSWORD": "admin",

				                "USE_EMBEDDED_DB": "true",

				                "KHOJ_DEBUG": "true",

				                "KHOJ_TELEMETRY_DISABLED": "true",

				                // Configure Code Sandbox

				                // "KHOJ_TERRARIUM_URL": "http://localhost:8080",

				                // Enable Promptracer to debug prompt flows

				                // "PROMPTRACE_DIR": "\${workspaceFolder}/promptrace",

				                // Enable Khoj Operator

				                // "KHOJ_OPERATOR_ENABLED": "True",

				            }

				        },

				    ]

				}

									
										7

.vscode/settings.json
									
										vendored
									
										Normal file
									
												
				@@ -0,0 +1,7 @@

				{

				    "python.testing.pytestArgs": [

				        "tests"

				    ],

				    "python.testing.unittestEnabled": false,

				    "python.testing.pytestEnabled": true

				}

									
										66

Dockerfile
									
												
				@@ -1,22 +1,64 @@

				# syntax=docker/dockerfile:1

				FROM python:3.10-slim-bullseye

				LABEL org.opencontainers.image.source https://github.com/debanjum/khoj

				FROM ubuntu:jammy AS base

				LABEL homepage="https://khoj.dev"

				LABEL repository="https://github.com/khoj-ai/khoj"

				LABEL org.opencontainers.image.source="https://github.com/khoj-ai/khoj"

				LABEL org.opencontainers.image.description="Your second brain, containerized for personal, local deployment."

				# Install System Dependencies

				RUN apt-get update -y && \

				    apt-get -y install python3-pyqt5

				RUN apt update -y && apt -y install \

				    python3-pip \

				    swig \

				    curl \

				    # Required by RapidOCR

				    libgl1 \

				    libglx-mesa0 \

				    libglib2.0-0 \

				    docker.io \

				    # Required by llama-cpp-python pre-built wheels. See #1628

				    musl-dev && \

				    ln -s /usr/lib/x86_64-linux-musl/libc.so /lib/libc.musl-x86_64.so.1 && \

				    # Clean up

				    apt clean && rm -rf /var/lib/apt/lists/*

				# Copy Application to Container

				COPY . /app

				# Build Server

				FROM base AS server-deps

				WORKDIR /app

				COPY pyproject.toml .

				COPY README.md .

				ARG VERSION=0.0.0

				# use the pre-built llama-cpp-python, torch cpu wheel

				ENV PIP_EXTRA_INDEX_URL="https://download.pytorch.org/whl/cpu https://abetlen.github.io/llama-cpp-python/whl/cpu"

				# avoid downloading unused cuda specific python packages

				ENV CUDA_VISIBLE_DEVICES=""

				RUN sed -i "s/dynamic = \\[\"version\"\\]/version = \"$VERSION\"/" pyproject.toml && \

				    pip install --no-cache-dir .

				# Install Python Dependencies

				RUN pip install --upgrade pip && \

				    pip install --upgrade .

				# Build Web App

				FROM node:23-alpine AS web-app

				# Set build optimization env vars

				ENV NODE_ENV=production

				ENV NEXT_TELEMETRY_DISABLED=1

				WORKDIR /app/src/interface/web

				# Install dependencies first (cache layer)

				COPY src/interface/web/package.json src/interface/web/yarn.lock ./

				RUN yarn install --frozen-lockfile

				# Copy source and build

				COPY src/interface/web/. ./

				RUN yarn build

				# Merge the Server and Web App into a Single Image

				FROM base

				ENV PYTHONPATH=/app/src

				WORKDIR /app

				COPY --from=server-deps /usr/local/lib/python3.10/dist-packages /usr/local/lib/python3.10/dist-packages

				COPY --from=web-app /app/src/interface/web/out ./src/khoj/interface/built

				COPY . .

				RUN cd src && python3 khoj/manage.py collectstatic --noinput

				# Run the Application

				# There are more arguments required for the application to run,

				# but these should be passed in through the docker-compose.yml file.

				ARG PORT

				# but those should be passed in through the docker-compose.yml file.

				ARG PORT=42110

				EXPOSE ${PORT}

				ENTRYPOINT ["khoj"]

				ENTRYPOINT ["python3", "src/khoj/main.py"]

									
										7

Khoj.desktop
									
											
				@@ -1,7 +0,0 @@

				[Desktop Entry]

				Type=Application

				Name=Khoj

				Comment=A natural language search engine for your personal notes, transactions and images.

				Path=/opt

				Exec=/opt/Khoj

				Icon=Khoj

									
										115

Khoj.spec
									
											
				@@ -1,115 +0,0 @@

				# -*- mode: python ; coding: utf-8 -*-

				from os.path import join

				from platform import system

				from PyInstaller.utils.hooks import copy_metadata

				import sysconfig

				datas = [

				    ('src/interface/web', 'src/interface/web'),

				    (f'{sysconfig.get_paths()["purelib"]}/transformers', 'transformers')

				]

				datas += copy_metadata('tqdm')

				datas += copy_metadata('regex')

				datas += copy_metadata('requests')

				datas += copy_metadata('packaging')

				datas += copy_metadata('filelock')

				datas += copy_metadata('numpy')

				datas += copy_metadata('tokenizers')

				block_cipher = None

				a = Analysis(

				    ['src/main.py'],

				    pathex=[],

				    binaries=[],

				    datas=datas,

				    hiddenimports=['huggingface_hub.repository'],

				    hookspath=[],

				    hooksconfig={},

				    runtime_hooks=[],

				    excludes=[],

				    win_no_prefer_redirects=False,

				    win_private_assemblies=False,

				    cipher=block_cipher,

				    noarchive=False,

				)

				# Filter out unused and/or duplicate shared libs

				torch_lib_paths = {

				    join('torch', 'lib', 'libtorch_cuda.so'),

				    join('torch', 'lib', 'libtorch_cpu.so'),

				}

				a.datas = [entry for entry in a.datas if not entry[0] in torch_lib_paths]

				os_path_separator = '\\' if system() == 'Windows' else '/'

				a.datas = [entry for entry in a.datas if not f'torch{os_path_separator}_C.cp' in entry[0]]

				a.datas = [entry for entry in a.datas if not f'torch{os_path_separator}_dl.cp' in entry[0]]

				pyz = PYZ(a.pure, a.zipped_data, cipher=block_cipher)

				if system() != 'Darwin':

				    # Add Splash screen to show on app launch

				    splash = Splash(

				        'src/interface/web/assets/icons/favicon-144x144.png',

				        binaries=a.binaries,

				        datas=a.datas,

				        text_pos=(10, 160),

				        text_size=12,

				        text_color='black',

				        minify_script=True,

				        always_on_top=True

				    )

				    exe = EXE(

				        pyz,

				        a.scripts,

				        a.binaries,

				        a.zipfiles,

				        a.datas,

				        splash,

				        splash.binaries,

				        [],

				        name='Khoj',

				        debug=False,

				        bootloader_ignore_signals=False,

				        strip=False,

				        upx=True,

				        upx_exclude=[],

				        runtime_tmpdir=None,

				        console=False,

				        disable_windowed_traceback=False,

				        argv_emulation=False,

				        target_arch='x86_64',

				        codesign_identity=None,

				        entitlements_file=None,

				        icon='src/interface/web/assets/icons/favicon-144x144.ico',

				    )

				else:

				    exe = EXE(

				        pyz,

				        a.scripts,

				        a.binaries,

				        a.zipfiles,

				        a.datas,

				        [],

				        name='Khoj',

				        debug=False,

				        bootloader_ignore_signals=False,

				        strip=False,

				        upx=True,

				        upx_exclude=[],

				        runtime_tmpdir=None,

				        console=False,

				        disable_windowed_traceback=False,

				        argv_emulation=False,

				        target_arch='x86_64',

				        codesign_identity=None,

				        entitlements_file=None,

				        icon='src/interface/web/assets/icons/favicon.icns',

				    )

				    app = BUNDLE(

				        exe,

				        name='Khoj.app',

				        icon='src/interface/web/assets/icons/favicon.icns',

				        bundle_identifier=None,

				    )

151

LICENSE


@@ -1,23 +1,21 @@
                     GNU GENERAL PUBLIC LICENSE
                        Version 3, 29 June 2007
                     GNU AFFERO GENERAL PUBLIC LICENSE
                        Version 3, 19 November 2007
  Copyright (C) 2007 Free Software Foundation, Inc. <http://fsf.org/>
  Copyright (C) 2007 Free Software Foundation, Inc. <https://fsf.org/>
  Everyone is permitted to copy and distribute verbatim copies
  of this license document, but changing it is not allowed.
                             Preamble
   The GNU General Public License is a free, copyleft license for
 software and other kinds of works.
   The GNU Affero General Public License is a free, copyleft license for
 software and other kinds of works, specifically designed to ensure
 cooperation with the community in the case of network server software.
   The licenses for most software and other practical works are designed
 to take away your freedom to share and change the works.  By contrast,
 the GNU General Public License is intended to guarantee your freedom to
 our General Public Licenses are intended to guarantee your freedom to
 share and change all versions of a program--to make sure it remains free
 software for all its users.  We, the Free Software Foundation, use the
 GNU General Public License for most of our software; it applies also to
 any other work released this way by its authors.  You can apply it to
 your programs, too.
 software for all its users.
   When we speak of free software, we are referring to freedom, not
 price.  Our General Public Licenses are designed to make sure that you
@@ -26,44 +24,34 @@ them if you wish), that you receive source code or can get it if you
 want it, that you can change the software or use pieces of it in new
 free programs, and that you know you can do these things.
   To protect your rights, we need to prevent others from denying you
 these rights or asking you to surrender the rights.  Therefore, you have
 certain responsibilities if you distribute copies of the software, or if
 you modify it: responsibilities to respect the freedom of others.
   Developers that use our General Public Licenses protect your rights
 with two steps: (1) assert copyright on the software, and (2) offer
 you this License which gives you legal permission to copy, distribute
 and/or modify the software.
   For example, if you distribute copies of such a program, whether
 gratis or for a fee, you must pass on to the recipients the same
 freedoms that you received.  You must make sure that they, too, receive
 or can get the source code.  And you must show them these terms so they
 know their rights.
   A secondary benefit of defending all users' freedom is that
 improvements made in alternate versions of the program, if they
 receive widespread use, become available for other developers to
 incorporate.  Many developers of free software are heartened and
 encouraged by the resulting cooperation.  However, in the case of
 software used on network servers, this result may fail to come about.
 The GNU General Public License permits making a modified version and
 letting the public access it on a server without ever releasing its
 source code to the public.
   Developers that use the GNU GPL protect your rights with two steps:
 (1) assert copyright on the software, and (2) offer you this License
 giving you legal permission to copy, distribute and/or modify it.
   The GNU Affero General Public License is designed specifically to
 ensure that, in such cases, the modified source code becomes available
 to the community.  It requires the operator of a network server to
 provide the source code of the modified version running there to the
 users of that server.  Therefore, public use of a modified version, on
 a publicly accessible server, gives the public access to the source
 code of the modified version.
   For the developers' and authors' protection, the GPL clearly explains
 that there is no warranty for this free software.  For both users' and
 authors' sake, the GPL requires that modified versions be marked as
 changed, so that their problems will not be attributed erroneously to
 authors of previous versions.
   Some devices are designed to deny users access to install or run
 modified versions of the software inside them, although the manufacturer
 can do so.  This is fundamentally incompatible with the aim of
 protecting users' freedom to change the software.  The systematic
 pattern of such abuse occurs in the area of products for individuals to
 use, which is precisely where it is most unacceptable.  Therefore, we
 have designed this version of the GPL to prohibit the practice for those
 products.  If such problems arise substantially in other domains, we
 stand ready to extend this provision to those domains in future versions
 of the GPL, as needed to protect the freedom of users.
   Finally, every program is threatened constantly by software patents.
 States should not allow patents to restrict development and use of
 software on general-purpose computers, but in those that do, we wish to
 avoid the special danger that patents applied to a free program could
 make it effectively proprietary.  To prevent this, the GPL assures that
 patents cannot be used to render the program non-free.
   An older license, called the Affero General Public License and
 published by Affero, was designed to accomplish similar goals.  This is
 a different license, not a version of the Affero GPL, but Affero has
 released a new version of the Affero GPL which permits relicensing under
 this license.
   The precise terms and conditions for copying, distribution and
 modification follow.
@@ -72,7 +60,7 @@ modification follow.
 0. Definitions.
   "This License" refers to version 3 of the GNU General Public License.
   "This License" refers to version 3 of the GNU Affero General Public License.
   "Copyright" also means copyright-like laws that apply to other kinds of
 works, such as semiconductor masks.
@@ -549,35 +537,45 @@ to collect a royalty for further conveying from those to whom you convey
 the Program, the only way you could satisfy both those terms and this
 License would be to refrain entirely from conveying the Program.
 13. Use with the GNU Affero General Public License.
 13. Remote Network Interaction; Use with the GNU General Public License.
   Notwithstanding any other provision of this License, if you modify the
 Program, your modified version must prominently offer all users
 interacting with it remotely through a computer network (if your version
 supports such interaction) an opportunity to receive the Corresponding
 Source of your version by providing access to the Corresponding Source
 from a network server at no charge, through some standard or customary
 means of facilitating copying of software.  This Corresponding Source
 shall include the Corresponding Source for any work covered by version 3
 of the GNU General Public License that is incorporated pursuant to the
 following paragraph.
   Notwithstanding any other provision of this License, you have
 permission to link or combine any covered work with a work licensed
 under version 3 of the GNU Affero General Public License into a single
 under version 3 of the GNU General Public License into a single
 combined work, and to convey the resulting work.  The terms of this
 License will continue to apply to the part which is the covered work,
 but the special requirements of the GNU Affero General Public License,
 section 13, concerning interaction through a network will apply to the
 combination as such.
 but the work with which it is combined will remain governed by version
 3 of the GNU General Public License.
 14. Revised Versions of this License.
   The Free Software Foundation may publish revised and/or new versions of
 the GNU General Public License from time to time.  Such new versions will
 be similar in spirit to the present version, but may differ in detail to
 the GNU Affero General Public License from time to time.  Such new versions
 will be similar in spirit to the present version, but may differ in detail to
 address new problems or concerns.
   Each version is given a distinguishing version number.  If the
 Program specifies that a certain numbered version of the GNU General
 Program specifies that a certain numbered version of the GNU Affero General
 Public License "or any later version" applies to it, you have the
 option of following the terms and conditions either of that numbered
 version or of any later version published by the Free Software
 Foundation.  If the Program does not specify a version number of the
 GNU General Public License, you may choose any version ever published
 GNU Affero General Public License, you may choose any version ever published
 by the Free Software Foundation.
   If the Program specifies that a proxy can decide which future
 versions of the GNU General Public License can be used, that proxy's
 versions of the GNU Affero General Public License can be used, that proxy's
 public statement of acceptance of a version permanently authorizes you
 to choose that version for the Program.
@@ -620,3 +618,44 @@ copy of the Program in return for a fee.
                      END OF TERMS AND CONDITIONS
             How to Apply These Terms to Your New Programs
   If you develop a new program, and you want it to be of the greatest
 possible use to the public, the best way to achieve this is to make it
 free software which everyone can redistribute and change under these terms.
   To do so, attach the following notices to the program.  It is safest
 to attach them to the start of each source file to most effectively
 state the exclusion of warranty; and each file should have at least
 the "copyright" line and a pointer to where the full notice is found.
     <one line to give the program's name and a brief idea of what it does.>
     Copyright (C) <year>  <name of author>
     This program is free software: you can redistribute it and/or modify
     it under the terms of the GNU Affero General Public License as published
     by the Free Software Foundation, either version 3 of the License, or
     (at your option) any later version.
     This program is distributed in the hope that it will be useful,
     but WITHOUT ANY WARRANTY; without even the implied warranty of
     MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
     GNU Affero General Public License for more details.
     You should have received a copy of the GNU Affero General Public License
     along with this program.  If not, see <https://www.gnu.org/licenses/>.
 Also add information on how to contact you by electronic and paper mail.
   If your software can interact with users remotely through a computer
 network, you should also make sure that it provides a way for users to
 get its source.  For example, if your program is a web application, its
 interface could display a "Source" link that leads users to an archive
 of the code.  There are many ways you could offer source, and different
 solutions will be better for different programs; see section 13 for the
 specific requirements.
   You should also get your employer (if you work as a programmer) or school,
 if any, to sign a "copyright disclaimer" for the program, if necessary.
 For more information on this, and how to apply and follow the GNU AGPL, see
 <https://www.gnu.org/licenses/>.

5

MANIFEST.in


@@ -1,5 +0,0 @@
 include Readme.md
 graft src/interface/*
 prune src/interface/web/images*
 prune docs*
 global-exclude .DS_Store *.py[cod]

									
										108

README.md
									
										Normal file
									
												
				@@ -0,0 +1,108 @@

				<p align="center"><img src="https://assets.khoj.dev/khoj-logo-sideways-1200x540.png" width="230" alt="Khoj Logo"></p>

				<div align="center">

				[![test](https://github.com/khoj-ai/khoj/actions/workflows/test.yml/badge.svg)](https://github.com/khoj-ai/khoj/actions/workflows/test.yml)

				[![docker](https://github.com/khoj-ai/khoj/actions/workflows/dockerize.yml/badge.svg)](https://github.com/khoj-ai/khoj/pkgs/container/khoj)

				[![pypi](https://github.com/khoj-ai/khoj/actions/workflows/pypi.yml/badge.svg)](https://pypi.org/project/khoj/)

				[![discord](https://img.shields.io/discord/1112065956647284756?style=plastic&label=discord)](https://discord.gg/BDgyabRM6e)

				</div>

				<div align="center">

				<b>Your AI second brain</b>

				</div>

				<br />

				<div align="center">

				[📑 Docs](https://docs.khoj.dev)

				<span>&nbsp;&nbsp;•&nbsp;&nbsp;</span>

				[🌐 Web](https://khoj.dev)

				<span>&nbsp;&nbsp;•&nbsp;&nbsp;</span>

				[🔥 App](https://app.khoj.dev)

				<span>&nbsp;&nbsp;•&nbsp;&nbsp;</span>

				[💬 Discord](https://discord.gg/BDgyabRM6e)

				<span>&nbsp;&nbsp;•&nbsp;&nbsp;</span>

				[✍🏽 Blog](https://blog.khoj.dev)

				<a href="https://trendshift.io/repositories/10318" target="_blank"><img src="https://trendshift.io/api/badge/repositories/10318" alt="khoj-ai%2Fkhoj | Trendshift" style="width: 250px; height: 55px;" width="250" height="55"/></a>

				</div>

				***

				### 🎁 New

				* Start any message with `/research` to try out the experimental research mode with Khoj.

				* Anyone can now [create custom agents](https://blog.khoj.dev/posts/create-agents-on-khoj/) with tunable personality, tools and knowledge bases.

				* [Read](https://blog.khoj.dev/posts/evaluate-khoj-quality/) about Khoj's excellent performance on modern retrieval and reasoning benchmarks.

				***

				## Overview

				[Khoj](https://khoj.dev) is a personal AI app to extend your capabilities. It smoothly scales up from an on-device personal AI to a cloud-scale enterprise AI.

				- Chat with any local or online LLM (e.g. llama3, qwen, gemma, mistral, gpt, claude, gemini, deepseek).

				- Get answers from the internet and your docs (including image, pdf, markdown, org-mode, word, notion files).

				- Access it from your Browser, Obsidian, Emacs, Desktop, Phone or Whatsapp.

				- Create agents with custom knowledge, persona, chat model and tools to take on any role.

				- Automate away repetitive research. Get personal newsletters and smart notifications delivered to your inbox.

				- Find relevant docs quickly and easily using our advanced semantic search.

				- Generate images, talk out loud, play your messages.

				- Khoj is open-source, self-hostable. Always.

				- Run it privately on [your computer](https://docs.khoj.dev/get-started/setup) or try it on our [cloud app](https://app.khoj.dev).

				***

				## See it in action

				![demo_chat](https://github.com/khoj-ai/khoj/blob/master/documentation/assets/img/quadratic_equation_khoj_web.gif?raw=true)

				Go to https://app.khoj.dev to see Khoj live.

				## Full feature list

				You can see the full feature list [here](https://docs.khoj.dev/category/features).

				## Self-Host

				To get started with self-hosting Khoj, [read the docs](https://docs.khoj.dev/get-started/setup).

				## Enterprise

				Khoj is available as a cloud service, on-premises, or as a hybrid solution. To learn more about Khoj Enterprise, [visit our website](https://khoj.dev/teams).

				## Frequently Asked Questions (FAQ)

				Q: Can I use Khoj without self-hosting?

				Yes! You can use Khoj right away at [https://app.khoj.dev](https://app.khoj.dev) — no setup required.

				Q: What kinds of documents can Khoj read?

				Khoj supports a wide variety: PDFs, Markdown, Notion, Word docs, org-mode files, and more.

				Q: How can I make my own agent?

				Check out [this blog post](https://blog.khoj.dev/posts/create-agents-on-khoj/) for a step-by-step guide to custom agents.

				For more questions, head over to our [Discord](https://discord.gg/BDgyabRM6e)!

				## Contributors

				Cheers to our awesome contributors! 🎉

				<a href="https://github.com/khoj-ai/khoj/graphs/contributors">

				  <img src="https://contrib.rocks/image?repo=khoj-ai/khoj" />

				</a>

				Made with [contrib.rocks](https://contrib.rocks).

				### Interested in Contributing?

				Khoj is open source. It is sustained by the community and we’d love for you to join it! Whether you’re a coder, designer, writer, or enthusiast, there’s a place for you.

				Why Contribute?

				- Make an Impact: Help build, test and improve a tool used by thousands to boost productivity.

				- Learn & Grow: Work on cutting-edge AI, LLMs, and semantic search technologies.

				You can help us build new features, improve the project documentation, report issues and fix bugs. If you're a developer, please see our [Contributing Guidelines](https://docs.khoj.dev/contributing/development) and check out [good first issues](https://github.com/khoj-ai/khoj/contribute) to work on.

									
										321

Readme.md
									
											
				@@ -1,321 +0,0 @@

				# Khoj 🦅

				[![build](https://github.com/debanjum/khoj/actions/workflows/build.yml/badge.svg)](https://github.com/debanjum/khoj/actions/workflows/build.yml)

				[![test](https://github.com/debanjum/khoj/actions/workflows/test.yml/badge.svg)](https://github.com/debanjum/khoj/actions/workflows/test.yml)

				[![publish](https://github.com/debanjum/khoj/actions/workflows/publish.yml/badge.svg)](https://github.com/debanjum/khoj/actions/workflows/publish.yml)

				*A natural language search engine for your personal notes, transactions and images*

				## Table of Contents

				- [Features](#Features)

				- [Demos](#Demos)

				  - [Khoj in Obsidian](#khoj-in-obsidian)

				  - [Khoj in Emacs, Browser](#khoj-in-emacs-browser)

				  - [Interfaces](#Interfaces)

				- [Architecture](#Architecture)

				- [Setup](#Setup)

				  - [Install](#1-Install)

				  - [Configure](#2-Configure)

				  - [Run](#3-Run)

				- [Use](#Use)

				  - [Interfaces](#Interfaces-1)

				  - [Query Filters](#Query-filters)

				- [Upgrade](#Upgrade)

				  - [Khoj Server](#upgrade-khoj-server)

				  - [Khoj.el](#upgrade-khoj-on-emacs)

				  - [Khoj Obsidian](#upgrade-khoj-on-obsidian)

				- [Troubleshoot](#Troubleshoot)

				- [Advanced Usage](#advanced-usage)

				  - [Access Khoj on Mobile](#access-khoj-on-mobile)

				- [Miscellaneous](#Miscellaneous)

				- [Performance](#Performance)

				  - [Query Performance](#Query-performance)

				  - [Indexing Performance](#Indexing-performance)

				  - [Miscellaneous](#Miscellaneous-1)

				- [Development](#Development)

				  - [Setup](#Setup)

				    - [Using Pip](#Using-Pip)

				    - [Using Docker](#Using-Docker)

				    - [Using Conda](#Test)

				  - [Test](#Test)

				- [Credits](#Credits)

				## Features

				- **Natural**: Advanced natural language understanding using Transformer based ML Models

				- **Local**: Your personal data stays local. All search, indexing is done on your machine[\*](https://github.com/debanjum/khoj#miscellaneous)

				- **Incremental**: Incremental search for a fast, search-as-you-type experience

				- **Pluggable**: Modular architecture makes it easy to plug in new data sources, frontends and ML models

				- **Multiple Sources**: Search your Org-mode and Markdown notes, Beancount transactions and Photos

				- **Multiple Interfaces**: Search using a [Web Browser](./src/interface/web/index.html), [Emacs](./src/interface/emacs/khoj.el) or the [API](http://localhost:8000/docs)

				## Demos

				### Khoj in Obsidian

				https://user-images.githubusercontent.com/6413477/210486007-36ee3407-e6aa-4185-8a26-b0bfc0a4344f.mp4

				<details><summary>Description</summary>

				- Install Khoj via `pip` and start Khoj backend in non-gui mode

				- Install Khoj plugin via Community Plugins settings pane on Obsidian app

				- Check the new Khoj plugin settings

				- Let Khoj backend index the markdown files in the current Vault

				- Open Khoj plugin on Obsidian via Search button on Left Pane

				- Search \"*Announce plugin to folks*\" in the [Obsidian Plugin docs](https://marcus.se.net/obsidian-plugin-docs/)

				- Jump to the [search result](https://marcus.se.net/obsidian-plugin-docs/publishing/submit-your-plugin)

				</details>

				### Khoj in Emacs, Browser

				https://user-images.githubusercontent.com/6413477/184735169-92c78bf1-d827-4663-9087-a1ea194b8f4b.mp4

				<details><summary>Description</summary>

				- Install Khoj via pip

				- Start Khoj app

				- Add this readme and [khoj.el readme](https://github.com/debanjum/khoj/tree/master/src/interface/emacs) as org-mode for Khoj to index

				- Search \"*Setup editor*\" on the Web and Emacs. Re-rank the results for better accuracy

				- Top result is what we are looking for, the [section to Install Khoj.el on Emacs](https://github.com/debanjum/khoj/tree/master/src/interface/emacs#installation)

				</details>

				<details><summary>Analysis</summary>

				- The results do not have any words used in the query

				  - *Based on the top result it seems the re-ranking model understands that Emacs is an editor?*

				- The results incrementally update as the query is entered

				- The results are re-ranked, for better accuracy, once user hits enter

				</details>

				### Interfaces

				![](https://github.com/debanjum/khoj/blob/master/docs/interfaces.png)

				## Architecture

				![](https://github.com/debanjum/khoj/blob/master/docs/khoj_architecture.png)

				## Setup

				These are the general setup instructions for Khoj.

				Check the [Khoj Obsidian Readme](https://github.com/debanjum/khoj/tree/master/src/interface/obsidian#Setup) to set up Khoj with the Obsidian Plugin. It's simpler, as it can skip the configure step below.

				### 1. Install

				```shell

				pip install khoj-assistant

				```

				### 2. Start App

				```shell

				khoj

				```

				### 3. Configure

				1. Enable content types and point to files to search in the First Run Screen that pops up on app start

				2. Click `Configure` and wait. The app will download ML models and index the content for search

				## Use

				### Interfaces

				- **Khoj via Obsidian**

				  - [Install](https://github.com/debanjum/khoj/tree/master/src/interface/obsidian#2-Setup-Plugin) the Khoj Obsidian plugin

				  - Click the *Khoj search* icon 🔎 on the [Ribbon](https://help.obsidian.md/User+interface/Workspace/Ribbon) or Search for *Khoj: Search* in the [Command Palette](https://help.obsidian.md/Plugins/Command+palette)

				- **Khoj via Emacs**

				  - [Install](https://github.com/debanjum/khoj/tree/master/src/interface/emacs#installation) [khoj.el](./src/interface/emacs/khoj.el)

				  - Run `M-x khoj <user-query>`

				- **Khoj via Web**

				  - Open <http://localhost:8000/> via desktop interface or directly

				- **Khoj via API**

				  - See the Khoj FastAPI [Swagger Docs](http://localhost:8000/docs), [ReDocs](http://localhost:8000/redocs)

				### Query Filters

				Use structured query syntax to filter the natural language search results

				- **Word Filter**: Get entries that include/exclude a specified term

				  - Entries that contain term_to_include: `+"term_to_include"`

				  - Entries that contain term_to_exclude: `-"term_to_exclude"`

				- **Date Filter**: Get entries containing dates in YYYY-MM-DD format from specified date (range)

				  - Entries from April 1st 1984: `dt:"1984-04-01"`

				  - Entries after March 31st 1984: `dt>="1984-04-01"`

				  - Entries before April 2nd 1984: `dt<="1984-04-01"`

				- **File Filter**: Get entries from a specified file

				  - Entries from incoming.org file: `file:"incoming.org"`

				- Combined Example

				  - `what is the meaning of life? file:"1984.org" dt>="1984-01-01" dt<="1985-01-01" -"big" -"brother"`

				  - Adds all filters to the natural language query. It should return entries

				    - from the file *1984.org*

				    - containing dates from the year *1984*

				    - excluding words *"big"* and *"brother"*

				    - that best match the natural language query *"what is the meaning of life?"*
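The filter syntax above can be illustrated with a small parser. This is a minimal sketch, not Khoj's actual implementation: the function name `split_query_filters`, the regular expressions and the layout of the returned dictionary are all hypothetical, and only the filter forms listed above are recognised.

```python
import re

# Hypothetical patterns, one per filter form documented above.
FILE_FILTER = re.compile(r'file:"([^"]+)"')
DATE_FILTER = re.compile(r'dt(>=|<=|:)"(\d{4}-\d{2}-\d{2})"')
WORD_FILTER = re.compile(r'(?<!\S)([+-])"([^"]+)"')


def split_query_filters(query: str) -> dict:
    """Separate structured filters from the natural language part of a query."""
    filters = {
        "files": FILE_FILTER.findall(query),
        "dates": DATE_FILTER.findall(query),  # list of (operator, date) pairs
        "include": [],
        "exclude": [],
    }
    for sign, term in WORD_FILTER.findall(query):
        filters["include" if sign == "+" else "exclude"].append(term)

    # Strip every filter so only the natural language text remains.
    text = query
    for pattern in (FILE_FILTER, DATE_FILTER, WORD_FILTER):
        text = pattern.sub("", text)
    filters["text"] = " ".join(text.split())
    return filters


parsed = split_query_filters(
    'what is the meaning of life? file:"1984.org" '
    'dt>="1984-01-01" dt<="1985-01-01" -"big" -"brother"'
)
print(parsed)
```

Splitting the query this way would let the cheap structured filters narrow the candidate entries before the remaining text is passed to semantic search.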

				## Upgrade

				### Upgrade Khoj Server

				```shell

				pip install --upgrade khoj-assistant

				```

				### Upgrade Khoj on Emacs

				- Use your Emacs Package Manager to Upgrade

				- See [khoj.el readme](https://github.com/debanjum/khoj/tree/master/src/interface/emacs#Upgrade) for details

				### Upgrade Khoj on Obsidian

				- Upgrade via the Community plugins tab on the settings pane in the Obsidian app

				- See the [khoj plugin readme](https://github.com/debanjum/khoj/tree/master/src/interface/obsidian#2-Setup-Plugin) for details

				## Troubleshoot

				- Symptom: Errors out complaining about Tensors mismatch, null etc

				  - Mitigation: Disable `image` search using the desktop GUI

				- Symptom: Errors out with "Killed" in the error message in Docker

				  - Fix: Increase RAM available to Docker Containers in Docker Settings

				  - Refer: [StackOverflow Solution](https://stackoverflow.com/a/50770267), [Configure Resources on Docker for Mac](https://docs.docker.com/desktop/mac/#resources)

				# Advanced Usage

				## Access Khoj on Mobile

				1. [Set up Khoj](#Setup) on your personal server. This can be any always-on machine, e.g. an old computer, Raspberry Pi(?) etc

				2. [Install](https://tailscale.com/kb/installation/) [Tailscale](https://tailscale.com/) on your personal server and phone

				3. Open the Khoj web interface of the server from your phone browser. It should be `http://tailscale-url-of-server:8000` or `http://name-of-server:8000` if you've set up [MagicDNS](https://tailscale.com/kb/1081/magicdns/)

				4. Click the [Install/Add to Homescreen](https://developer.mozilla.org/en-US/docs/Web/Progressive_web_apps/Add_to_home_screen) button

				5. Enjoy exploring your notes, transactions and images from your phone!

				![](https://github.com/debanjum/khoj/blob/master/docs/khoj_pwa_android.png)

				## Miscellaneous

				- The beta [chat](http://localhost:8000/api/beta/chat) and [search](http://localhost:8000/api/beta/search) API endpoints use [OpenAI API](https://openai.com/api/)

				  - They are disabled by default

				  - To use them, add your `openai-api-key` via the app configure screen

				  - Warning: *If you use the above beta APIs, your query and top result(s) will be sent to OpenAI for processing*

				## Performance

				### Query performance

				- Semantic search using the bi-encoder is fairly fast at \<50 ms

				- Reranking using the cross-encoder is slower at \<2s on 15 results. Tweak `top_k` to trade off speed against accuracy of results

				- Filters in query (e.g by file, word or date) usually add \<20ms to query latency
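The two-stage search described above, and the role of `top_k`, can be sketched in plain Python. This is only an illustration under stated assumptions: `retrieve_and_rerank`, `word_overlap` and `overlap_ratio` are hypothetical names, and the two toy scoring functions merely stand in for the bi-encoder and cross-encoder models.

```python
from typing import Callable, List


def retrieve_and_rerank(
    query: str,
    entries: List[str],
    fast_score: Callable[[str, str], float],
    slow_score: Callable[[str, str], float],
    top_k: int = 15,
) -> List[str]:
    """Shortlist top_k entries with the fast scorer, then reorder them with the slow one."""
    shortlist = sorted(entries, key=lambda e: fast_score(query, e), reverse=True)[:top_k]
    # The slow scorer runs once per shortlisted entry, so its cost grows with top_k.
    return sorted(shortlist, key=lambda e: slow_score(query, e), reverse=True)


def word_overlap(query: str, entry: str) -> float:
    """Toy stand-in for the bi-encoder: number of shared lowercase words."""
    return float(len(set(query.lower().split()) & set(entry.lower().split())))


def overlap_ratio(query: str, entry: str) -> float:
    """Toy stand-in for the cross-encoder: shared words normalised by entry length."""
    words = entry.lower().split()
    return word_overlap(query, entry) / len(words) if words else 0.0


notes = [
    "emacs is an extensible text editor",
    "buy milk and eggs",
    "text editor comparison: vim, emacs and many other editor choices today",
    "editor",
]
results = retrieve_and_rerank("text editor", notes, word_overlap, overlap_ratio, top_k=2)
print(results)
```

With `top_k=2` the short note `"editor"` never reaches the slower scorer. Raising `top_k` to 3 lets the reranker promote it to first place, at the cost of one more slow scoring call.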

				### Indexing performance

				- Indexing is more strongly impacted by the size of the source data

				- Indexing 100K+ line corpus of notes takes about 10 minutes

				- Indexing 4000+ images takes about 15 minutes and more than 8 GB of RAM

				- Note: *It should only take this long on the first run* as the index is incrementally updated
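One common way to get incremental index updates is to hash each entry and re-embed only the entries whose hash changed. The sketch below assumes that approach for illustration. The source does not say how Khoj detects changes, and `update_index` and `toy_embed` are hypothetical names.

```python
import hashlib
from typing import Callable, Dict, List, Tuple

# Maps an entry key to (content hash, embedding).
Index = Dict[str, Tuple[str, List[float]]]


def update_index(
    index: Index,
    entries: Dict[str, str],
    embed: Callable[[str], List[float]],
) -> List[str]:
    """Re-embed only new or changed entries and drop entries that no longer exist.

    Returns the keys that were (re)embedded on this run.
    """
    embedded = []
    for key, text in entries.items():
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if key not in index or index[key][0] != digest:
            index[key] = (digest, embed(text))
            embedded.append(key)
    for stale in set(index) - set(entries):
        del index[stale]
    return embedded


def toy_embed(text: str) -> List[float]:
    """Placeholder for an expensive embedding model call."""
    return [float(len(text)), float(text.count(" "))]


index: Index = {}
first = update_index(index, {"a.org": "first note", "b.org": "second note"}, toy_embed)
second = update_index(index, {"a.org": "first note", "b.org": "second note, edited"}, toy_embed)
print(first, second)
```

The first run embeds every entry, which matches the slow first run noted above. Later runs only pay for entries that changed.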

				### Miscellaneous

				- Testing done on a Mac M1 and a \>100K line corpus of notes

				- Search, indexing on a GPU has not been tested yet

				## Development

				### Setup

				#### Using Pip

				##### 1. Install

				```shell

				git clone https://github.com/debanjum/khoj && cd khoj

				python3 -m venv .venv && source .venv/bin/activate

				pip install -e .

				```

				##### 2. Configure

				- Copy the `config/khoj_sample.yml` to `~/.khoj/khoj.yml`

				- Set `input-files` or `input-filter` in each relevant `content-type` section of `~/.khoj/khoj.yml`

				  - Set `input-directories` field in `image` `content-type` section

				- Delete `content-type` and `processor` sub-section(s) irrelevant for your use-case
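The input rule in the steps above can be checked before starting the app. The sketch below is a hypothetical helper, not part of Khoj. It assumes the `content-type` section of `~/.khoj/khoj.yml` has already been loaded into a dictionary, and that `image` also accepts `input-filter` as an alternative to `input-directories`.

```python
from typing import Dict, List

# Fields of which at least one must be set, per content type (assumed rule).
IMAGE_ONE_OF = ("input-directories", "input-filter")
DEFAULT_ONE_OF = ("input-files", "input-filter")


def missing_inputs(content_types: Dict[str, Dict[str, object]]) -> List[str]:
    """Return the content types that set none of their accepted input fields."""
    problems = []
    for name, section in content_types.items():
        fields = IMAGE_ONE_OF if name == "image" else DEFAULT_ONE_OF
        if not any(section.get(field) for field in fields):
            problems.append(name)
    return problems


# Example dictionary with placeholder paths, mirroring the YAML layout.
config = {
    "org": {"input-files": None, "input-filter": "/path/to/org/*.org"},
    "markdown": {"input-files": None, "input-filter": None},
    "image": {"input-directories": ["/path/to/images/"]},
}
print(missing_inputs(config))
```

Any content type the helper reports should either be given an input path or deleted from the config, as the last step above suggests.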

				##### 3. Run

				```shell

				khoj -vv

				```

				Load ML model, generate embeddings and expose API to query notes, images, transactions etc specified in config YAML

				##### 4. Upgrade

				```shell

				# To Upgrade To Latest Stable Release

				# Maps to the latest tagged version of khoj on master branch

				pip install --upgrade khoj-assistant

				# To Upgrade To Latest Pre-Release

				# Maps to the latest commit on the master branch

				pip install --upgrade --pre khoj-assistant

				# To Upgrade To Specific Development Release.

				# Useful to test, review a PR.

				# Note: khoj-assistant is published to test PyPi on creating a PR

				pip install -i https://test.pypi.org/simple/ khoj-assistant==0.1.5.dev57166025766

				```

				#### Using Docker

				##### 1. Clone

				```shell

				git clone https://github.com/debanjum/khoj && cd khoj

				```

				##### 2. Configure

				- **Required**: Update [docker-compose.yml](./docker-compose.yml) to mount your images, (org-mode or markdown) notes and beancount directories

				- **Optional**: Edit application configuration in [khoj_docker.yml](./config/khoj_docker.yml)

				##### 3. Run

				```shell

				docker-compose up -d

				```

				*Note: The first run will take time. Let it run; it's most likely not hung, just generating embeddings*

				##### 4. Upgrade

				```shell

				docker-compose build --pull

				```

				#### Using Conda

				##### 1. Install Dependencies

				- [Install Conda](https://docs.conda.io/projects/conda/en/latest/user-guide/install/index.html)

				##### 2. Install Khoj

				```shell

				git clone https://github.com/debanjum/khoj && cd khoj

				conda env create -f config/environment.yml

				conda activate khoj

				python3 -m pip install pyqt6  # As conda does not support pyqt6 yet

				```

				##### 3. Configure

				- Copy the `config/khoj_sample.yml` to `~/.khoj/khoj.yml`

				- Set `input-files` or `input-filter` in each relevant `content-type` section of `~/.khoj/khoj.yml`

				  - Set `input-directories` field in `image` `content-type` section

				- Delete `content-type`, `processor` sub-sections irrelevant for your use-case

				##### 4. Run

				```shell

				python3 -m src.main -vv

				```

				  Load ML model, generate embeddings and expose API to query notes, images, transactions etc specified in config YAML

				##### 5. Upgrade

				```shell

				cd khoj

				git pull origin master

				conda deactivate

				conda env update -f config/environment.yml

				conda activate khoj

				```

				### Test

				```shell

				pytest

				```

				## Credits

				- [Multi-QA MiniLM Model](https://huggingface.co/sentence-transformers/multi-qa-MiniLM-L6-cos-v1), [All MiniLM Model](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2) for Text Search. See [SBert Documentation](https://www.sbert.net/examples/applications/retrieve_rerank/README.html)

				- [OpenAI CLIP Model](https://github.com/openai/CLIP) for Image Search. See [SBert Documentation](https://www.sbert.net/examples/applications/image-search/README.html)

				- Charles Cave for [OrgNode Parser](http://members.optusnet.com.au/~charles57/GTD/orgnode.html)

				- [Org.js](https://mooz.github.io/org-js/) to render Org-mode results on the Web interface

				- [Markdown-it](https://github.com/markdown-it/markdown-it) to render Markdown results on the Web interface

									
										129

computer.Dockerfile
									
										Normal file
									
				@@ -0,0 +1,129 @@

				FROM ubuntu:24.04

				ENV DEBIAN_FRONTEND=noninteractive

				# Install System Dependencies

				RUN apt update \

				    && apt install -y \

				        ca-certificates \

				        gnupg \

				        xfce4 \

				        xfce4-goodies \

				        x11vnc \

				        xvfb \

				        xdotool \

				        imagemagick \

				        x11-apps \

				        dbus-x11 \

				        sudo \

				        python3-pip \

				        python3-tk \

				        python3-dev \

				        build-essential \

				        scrot \

				        gnome-screenshot \

				        net-tools \

				        libx11-dev \

				        libxext-dev \

				        libxtst-dev \

				        libxinerama-dev \

				        libxmu-dev \

				        libxrandr-dev \

				        libxfixes-dev \

				        software-properties-common \

				    && add-apt-repository ppa:mozillateam/ppa && apt update \

				    && apt install -y --no-install-recommends \

				        # Desktop apps

				        firefox-esr \

				        libreoffice \

				        x11-apps \

				        xpdf \

				        gedit \

				        xpaint \

				        tint2 \

				        galculator \

				        pcmanfm \

				        unzip \

				        # Terminal apps like file editors, viewers, git, wget/curl etc.

				        less \

				        nano \

				        neovim \

				        vim \

				        git \

				        curl \

				        wget \

				        procps \

				        # Python/pyenv dependencies

				        libssl-dev  \

				        zlib1g-dev \

				        libbz2-dev \

				        libreadline-dev \

				        libsqlite3-dev \

				        libncursesw5-dev \

				        xz-utils \

				        tk-dev \

				        libxml2-dev \

				        libxmlsec1-dev \

				        libffi-dev \

				        liblzma-dev \

				    # set default browser

				    && update-alternatives --set x-www-browser /usr/bin/firefox-esr \

				    && apt-get clean && rm -rf /var/lib/apt/lists/* \

				    # remove screen locks, power managers

				    && apt remove -y light-locker xfce4-screensaver xfce4-power-manager || true

				# Create Computer User

				ENV USERNAME=operator

				ENV HOME=/home/$USERNAME

				RUN useradd -m -s /bin/bash -d $HOME -g $USERNAME $USERNAME && echo "${USERNAME} ALL=(ALL) NOPASSWD: ALL" >> /etc/sudoers

				USER $USERNAME

				WORKDIR $HOME

				# Setup Python

				RUN git clone https://github.com/pyenv/pyenv.git ~/.pyenv && \

				    cd ~/.pyenv && src/configure && make -C src && cd .. && \

				    echo 'export PYENV_ROOT="$HOME/.pyenv"' >> ~/.bashrc && \

				    echo 'command -v pyenv >/dev/null || export PATH="$PYENV_ROOT/bin:$PATH"' >> ~/.bashrc && \

				    echo 'eval "$(pyenv init -)"' >> ~/.bashrc

				ENV PYENV_ROOT="$HOME/.pyenv"

				ENV PATH="$PYENV_ROOT/bin:$PATH"

				ENV PYENV_VERSION_MAJOR=3

				ENV PYENV_VERSION_MINOR=11

				ENV PYENV_VERSION_PATCH=6

				ENV PYENV_VERSION=$PYENV_VERSION_MAJOR.$PYENV_VERSION_MINOR.$PYENV_VERSION_PATCH

				RUN eval "$(pyenv init -)" && \

				    pyenv install $PYENV_VERSION && \

				    pyenv global $PYENV_VERSION && \

				    pyenv rehash

				ENV PATH="$HOME/.pyenv/shims:$HOME/.pyenv/bin:$PATH"

				# Install Python Packages

				RUN python3 -m pip install --no-cache-dir \

				    pyautogui \

				    Pillow \

				    pyperclip \

				    pygetwindow

				# Setup VNC

				RUN x11vnc -storepasswd secret /home/operator/.vncpass

				ARG WIDTH=1024

				ARG HEIGHT=768

				ARG DISPLAY_NUM=99

				ENV WIDTH=$WIDTH

				ENV HEIGHT=$HEIGHT

				ENV DISPLAY_NUM=$DISPLAY_NUM

				ENV DISPLAY=":$DISPLAY_NUM"

				# Expose VNC on port 5900

				# run Xvfb, x11vnc, Xfce (no login manager)

				EXPOSE 5900

				CMD ["/bin/sh", "-c", "    export XDG_RUNTIME_DIR=/run/user/$(id -u); \

				    mkdir -p $XDG_RUNTIME_DIR && chown $USERNAME:$USERNAME $XDG_RUNTIME_DIR && chmod 0700 $XDG_RUNTIME_DIR; \

				    Xvfb $DISPLAY -screen 0 ${WIDTH}x${HEIGHT}x24 -dpi 96 -auth /home/$USERNAME/.Xauthority >/dev/null 2>&1 & \

				    sleep 1; \

				    xauth add $DISPLAY . $(mcookie); \

				    x11vnc -display $DISPLAY -forever -rfbauth /home/$USERNAME/.vncpass -listen 0.0.0.0 -rfbport 5900 >/dev/null 2>&1 & \

				    eval $(dbus-launch --sh-syntax) && \

				    startxfce4 & \

				    sleep 2 && echo 'Container running!' && \

				    tail -f /dev/null "]

									
										21

config/environment.yml
									
				@@ -1,21 +0,0 @@

				name: khoj

				channels:

				  - conda-forge

				dependencies:

				  - python=3.8.*

				  - numpy=1.22.4

				  - pytorch=1.11.0

				  - transformers=4.19.4

				  - sentence-transformers=2.1.0

				  - fastapi=0.77.1

				  - uvicorn=0.17.6

				  - pyyaml=6.0

				  - pytest=7.1.2

				  - pillow=8.4.0

				  - torchvision=0.12.0

				  - openai=0.20.0

				  - pydantic=1.9.1

				  - jinja2=3.1.2

				  - aiofiles=0.8.0

				  - huggingface_hub=0.8.1

				  - dateparser=1.1.1

									
										116

config/environment_osx-arm64.yml
									
				@@ -1,116 +0,0 @@

				name: khoj

				channels:

				  - conda-forge

				dependencies:

				  - aiofiles=0.8.0=pyhd8ed1ab_0

				  - asgiref=3.4.1=pyhd8ed1ab_0

				  - attrs=21.2.0=pyhd8ed1ab_0

				  - brotlipy=0.7.0=py39h5161555_1001

				  - ca-certificates=2022.6.15=h4653dfc_0

				  - certifi=2022.6.15=py39h2804cbe_0

				  - cffi=1.14.6=py39hda8b47f_0

				  - chardet=4.0.0=py39h2804cbe_1

				  - charset-normalizer=2.0.0=pyhd8ed1ab_0

				  - click=8.0.1=py39h2804cbe_0

				  - colorama=0.4.4=pyh9f0ad1d_0

				  - cryptography=3.4.7=py39h73257c9_0

				  - dataclasses=0.8=pyhc8e2a94_3

				  - dateparser=1.1.1=pyhd8ed1ab_0

				  - et_xmlfile=1.0.1=py_1001

				  - fastapi=0.68.2=pyhd8ed1ab_0

				  - filelock=3.0.12=pyh9f0ad1d_0

				  - freetype=2.10.4=h17b34a0_1

				  - future=0.18.2=py39h2804cbe_3

				  - h11=0.12.0=pyhd8ed1ab_0

				  - huggingface_hub=0.2.1=pyhd8ed1ab_0

				  - idna=3.1=pyhd3deb0d_0

				  - importlib-metadata=4.6.4=py39h2804cbe_0

				  - importlib_metadata=4.6.4=hd8ed1ab_0

				  - iniconfig=1.1.1=pyh9f0ad1d_0

				  - jbig=2.1=h3422bc3_2003

				  - jinja2=3.0.3=pyhd8ed1ab_0

				  - joblib=1.0.1=pyhd8ed1ab_0

				  - jpeg=9d=h27ca646_0

				  - lcms2=2.12=had6a04f_0

				  - lerc=2.2.1=h9f76cd9_0

				  - libblas=3.9.0=11_osxarm64_openblas

				  - libcblas=3.9.0=11_osxarm64_openblas

				  - libcxx=12.0.1=h168391b_0

				  - libdeflate=1.7=h27ca646_5

				  - libffi=3.3=h9f76cd9_2

				  - libgfortran=5.0.0.dev0=11_0_1_hf114ba7_23

				  - libgfortran5=11.0.1.dev0=hf114ba7_23

				  - liblapack=3.9.0=11_osxarm64_openblas

				  - libopenblas=0.3.17=openmp_h5dd58f0_1

				  - libpng=1.6.37=hf7e6567_2

				  - libprotobuf=3.16.0=hccf11d3_0

				  - libtiff=4.3.0=hc6122e1_1

				  - libwebp-base=1.2.1=h3422bc3_0

				  - llvm-openmp=12.0.1=hf3c4609_1

				  - lz4-c=1.9.3=hbdafb3b_1

				  - markupsafe=2.0.1=py39h5161555_1

				  - more-itertools=8.8.0=pyhd8ed1ab_0

				  - ncurses=6.2=h9aa5885_4

				  - ninja=1.10.2=h4d860bb_0

				  - nltk=3.6.2=pyhd8ed1ab_0

				  - numpy=1.21.4=py39h1f3b974_0

				  - olefile=0.46=pyh9f0ad1d_1

				  - openai=0.11.4=py39h2804cbe_0

				  - openjpeg=2.4.0=h062765e_1

				  - openpyxl=3.0.9=pyhd8ed1ab_0

				  - openssl=1.1.1q=ha287fd2_0

				  - packaging=21.0=pyhd8ed1ab_0

				  - pandas=1.3.4=py39h7f752ed_1

				  - pandas-stubs=1.2.0.38=py39h2804cbe_0

				  - pillow=8.3.2=py39ha74c66e_0

				  - pip=21.2.4=pyhd8ed1ab_0

				  - pluggy=0.13.1=py39h2804cbe_4

				  - py=1.10.0=pyhd3deb0d_0

				  - pycparser=2.20=pyh9f0ad1d_2

				  - pydantic=1.8.2=py39h5161555_2

				  - pyopenssl=20.0.1=pyhd8ed1ab_0

				  - pyparsing=2.4.7=pyh9f0ad1d_0

				  - pysocks=1.7.1=py39h2804cbe_3

				  - pytest=6.2.5=py39h2804cbe_1

				  - python=3.9.7=h54d631c_3_cpython

				  - python-dateutil=2.8.2=pyhd8ed1ab_0

				  - python-tzdata=2022.1=pyhd8ed1ab_0

				  - python_abi=3.9=2_cp39

				  - pytorch=1.9.0=cpu_py39he8fdc14_2

				  - pytorch-cpu=1.9.0=cpu_py39hd610c6a_2

				  - pytz=2021.3=pyhd8ed1ab_0

				  - pytz-deprecation-shim=0.1.0.post0=py39h2804cbe_2

				  - pyyaml=5.4.1=py39h5161555_1

				  - readline=8.1=hedafd6a_0

				  - regex=2021.8.21=py39h5161555_0

				  - requests=2.26.0=pyhd8ed1ab_0

				  - sacremoses=0.0.43=pyh9f0ad1d_0

				  - scikit-learn=0.24.2=py39hef7049f_1

				  - scipy=1.7.0=py39h5060c3b_0

				  - sentence-transformers=2.1.0=pyhd8ed1ab_0

				  - sentencepiece=0.1.95=py39h4d2d688_1

				  - setuptools=57.4.0=py39h2804cbe_0

				  - six=1.16.0=pyh6c4a22f_0

				  - sleef=3.5.1=h27ca646_1

				  - sqlite=3.36.0=h72a2b83_0

				  - starlette=0.14.2=pyhd8ed1ab_0

				  - threadpoolctl=2.2.0=pyh8a188c0_0

				  - tk=8.6.11=he1e0b03_0

				  - tokenizers=0.10.3=py39hab32027_1

				  - toml=0.10.2=pyhd8ed1ab_0

				  - torchvision=0.10.1=py39h0a40b5a_0_cpu

				  - tqdm=4.62.1=pyhd8ed1ab_0

				  - transformers=4.14.1=pyhd8ed1ab_0

				  - typing-extensions=3.10.0.0=hd8ed1ab_0

				  - typing_extensions=3.10.0.0=pyha770c72_0

				  - tzdata=2021a=he74cb21_1

				  - tzlocal=4.2=py39h2804cbe_1

				  - urllib3=1.26.6=pyhd8ed1ab_0

				  - uvicorn=0.16.0=py39h2804cbe_0

				  - wheel=0.37.0=pyhd8ed1ab_1

				  - xz=5.2.5=h642e427_1

				  - yaml=0.2.5=h642e427_0

				  - zipp=3.5.0=pyhd8ed1ab_0

				  - zlib=1.2.11=h31e879b_1009

				  - zstd=1.5.0=h861e0a7_0

				prefix: /opt/homebrew/Caskroom/miniforge/base/envs/khoj

									
										54

config/khoj_docker.yml
									
				@@ -1,54 +0,0 @@

				content-type:

				  # The /data/folder/ prefix to the folders is here because this is

				  # the directory to which the local files are copied in the docker-compose.

				  # If changing, the docker-compose volumes should also be changed to match.

				  org:

				    input-files: null

				    input-filter: "/data/org/*.org"

				    compressed-jsonl: "/data/embeddings/notes.jsonl.gz"

				    embeddings-file: "/data/embeddings/note_embeddings.pt"

				    index_heading_entries: false

				  markdown:

				    input-files: null

				    input-filter: "/data/markdown/*.md"

				    compressed-jsonl: "/data/embeddings/markdown.jsonl.gz"

				    embeddings-file: "/data/embeddings/markdown_embeddings.pt"

				  ledger:

				    input-files: null

				    input-filter: /data/ledger/*.beancount

				    compressed-jsonl: /data/embeddings/transactions.jsonl.gz

				    embeddings-file: /data/embeddings/transaction_embeddings.pt

				  image:

				    input-directories: ["/data/images/"]

				    embeddings-file: "/data/embeddings/image_embeddings.pt"

				    batch-size: 50

				    use-xmp-metadata: false

				  music:

				    input-files: ["/data/music/music.org"]

				    input-filter: null

				    compressed-jsonl: "/data/embeddings/songs.jsonl.gz"

				    embeddings-file: "/data/embeddings/song_embeddings.pt"

				search-type:

				  symmetric:

				    encoder: "sentence-transformers/all-MiniLM-L6-v2"

				    cross-encoder: "cross-encoder/ms-marco-MiniLM-L-6-v2"

				    model_directory: "/data/models/symmetric"

				  asymmetric:

				    encoder: "sentence-transformers/multi-qa-MiniLM-L6-cos-v1"

				    cross-encoder: "cross-encoder/ms-marco-MiniLM-L-6-v2"

				    model_directory: "/data/models/asymmetric"

				  image:

				    encoder: "sentence-transformers/clip-ViT-B-32"

				    model_directory: "/data/models/image_encoder"

				processor:

				  #conversation:

				  #  openai-api-key: null

				  #  conversation-logfile: "/data/embeddings/conversation_logs.json"

									
										52

config/khoj_sample.yml
									
				@@ -1,52 +0,0 @@

				content-type:

				  org:

				    input-files:  # ["/path/to/org-file.org"]  REQUIRED IF input-filter IS NOT SET OR

				    input-filter: # /path/to/org/*.org         REQUIRED IF input-files IS NOT SET

				    compressed-jsonl: "~/.khoj/content/org/org.jsonl.gz"

				    embeddings-file: "~/.khoj/content/org/org_embeddings.pt"

				    index_heading_entries: false  # Set to true to index entries with empty body

				  markdown:

				    input-files:  # ["/path/to/markdown-file.md"]  REQUIRED IF input-filter IS NOT SET OR

				    input-filter: # "/path/to/markdown/*.md"       REQUIRED IF input-files IS NOT SET

				    compressed-jsonl: "~/.khoj/content/markdown/markdown.jsonl.gz"

				    embeddings-file: "~/.khoj/content/markdown/markdown_embeddings.pt"

				  ledger:

				    input-files:  # ["/path/to/ledger-file.beancount"]  REQUIRED IF input-filter is not set OR

				    input-filter: # /path/to/ledger/*.beancount         REQUIRED IF input-files is not set

				    compressed-jsonl: "~/.khoj/content/ledger/ledger.jsonl.gz"

				    embeddings-file: "~/.khoj/content/ledger/ledger_embeddings.pt"

				  image:

				    input-directories: # ["/path/to/images/"]   REQUIRED IF input-filter IS NOT SET OR

				    input-filter:      # /path/to/images/*.jpg  REQUIRED IF input-directories IS NOT SET

				    embeddings-file: "~/.khoj/content/image/image_embeddings.pt"

				    batch-size: 50

				    use-xmp-metadata: false

				  music:

				    input-files:  # ["/path/to/music-file.org"] REQUIRED IF input-filter IS NOT SET OR

				    input-filter: # /path/to/music/*.org        REQUIRED IF input-files IS NOT SET

				    compressed-jsonl: "~/.khoj/content/music/music.jsonl.gz"

				    embeddings-file: "~/.khoj/content/music/music_embeddings.pt"

				search-type:

				  symmetric:

				    encoder: "sentence-transformers/all-MiniLM-L6-v2"

				    cross-encoder: "cross-encoder/ms-marco-MiniLM-L-6-v2"

				    model_directory: "~/.khoj/search/symmetric/"

				  asymmetric:

				    encoder: "sentence-transformers/multi-qa-MiniLM-L6-cos-v1"

				    cross-encoder: "cross-encoder/ms-marco-MiniLM-L-6-v2"

				    model_directory: "~/.khoj/search/asymmetric/"

				  image:

				    encoder: "sentence-transformers/clip-ViT-B-32"

				    model_directory: "~/.khoj/search/image/"

				processor:

				  conversation:

				    openai-api-key: # "YOUR_OPENAI_API_KEY"

				    conversation-logfile: "~/.khoj/processor/conversation/conversation_logs.json"

									
										144

docker-compose.yml
									
				@@ -1,29 +1,131 @@

				version: "3.9"

				services:

				  database:

				    image: docker.io/pgvector/pgvector:pg15

				    environment:

				      POSTGRES_USER: postgres

				      POSTGRES_PASSWORD: postgres

				      POSTGRES_DB: postgres

				    volumes:

				      - khoj_db:/var/lib/postgresql/data/

				    healthcheck:

				      test: ["CMD-SHELL", "pg_isready -U postgres"]

				      interval: 30s

				      timeout: 10s

				      retries: 5

				  sandbox:

				    image: ghcr.io/khoj-ai/terrarium:latest

				    healthcheck:

				      test: ["CMD-SHELL", "curl -f http://localhost:8080/health"]

				      interval: 30s

				      timeout: 10s

				      retries: 2

				  search:

				    image: docker.io/searxng/searxng:latest

				    volumes:

				      - khoj_search:/etc/searxng

				    environment:

				      - SEARXNG_BASE_URL=http://localhost:8080/

				  # Creates Computer for Khoj to use.

				  # Set KHOJ_OPERATOR_ENABLED=True in the server service environment variable to enable.

				  computer:

				    container_name: khoj-computer

				    image: ghcr.io/khoj-ai/khoj-computer:latest

				    # build:

				    #   context: .

				    #   dockerfile: computer.Dockerfile

				    ports:

				      - "5900:5900"

				    volumes:

				      - khoj_computer:/home/operator

				  server:

				    image: ghcr.io/debanjum/khoj:latest

				    depends_on:

				      database:

				        condition: service_healthy

				    # Use the following line to use the latest version of khoj. Otherwise, it will build from source. Set this to ghcr.io/khoj-ai/khoj-cloud:latest if you want to use the prod image.

				    image: ghcr.io/khoj-ai/khoj:latest

				    # Uncomment the following line to build from source. This will take a few minutes. Comment the next two lines out if you want to use the official image.

				    # build:

				      # context: .

				    ports:

				      # If changing the local port (left hand side), no other changes required.

				      # If changing the remote port (right hand side), 

				      #   change the port in the args in the build section, 

				      # If changing the remote port (right hand side),

				      #   change the port in the args in the build section,

				      #   as well as the port in the command section to match

				      - "8000:8000"

				      - "42110:42110"

				    extra_hosts:

				      - "host.docker.internal:host-gateway"

				    working_dir: /app

				    volumes:

				      - .:/app

				      # These mounted volumes hold the raw data that should be indexed for search. 

				      # The path in your local directory (left hand side)

				      #   points to the files you want to index.

				      # The path of the mounted directory (right hand side),

				      #   must match the path prefix in your config file.

				      - ./tests/data/org/:/data/org/

				      - ./tests/data/images/:/data/images/

				      - ./tests/data/ledger/:/data/ledger/

				      - ./tests/data/music/:/data/music/

				      - ./tests/data/markdown/:/data/markdown/

				      # Embeddings and models are populated after the first run

				      # You can set these volumes to point to empty directories on host

				      - ./tests/data/embeddings/:/data/embeddings/

				      - ./tests/data/models/:/data/models/

				      - khoj_config:/root/.khoj/

				      - khoj_models:/root/.cache/torch/sentence_transformers

				      - khoj_models:/root/.cache/huggingface

				      # uncomment line below to mount docker socket to allow khoj to use its computer.

				      # - /var/run/docker.sock:/var/run/docker.sock

				    # Use 0.0.0.0 to explicitly set the host ip for the service on the container. https://pythonspeed.com/articles/docker-connection-refused/

				    command: --no-gui --host="0.0.0.0" --port=8000 -c=config/khoj_docker.yml -vv

				    environment:

				      - POSTGRES_DB=postgres

				      - POSTGRES_USER=postgres

				      - POSTGRES_PASSWORD=postgres

				      - POSTGRES_HOST=database

				      - POSTGRES_PORT=5432

				      - KHOJ_DJANGO_SECRET_KEY=secret

				      - KHOJ_DEBUG=False

				      - KHOJ_ADMIN_EMAIL=username@example.com

				      - KHOJ_ADMIN_PASSWORD=password

				      # Default URL of Terrarium, the default Python sandbox used by Khoj to run code. Its container is specified above

				      - KHOJ_TERRARIUM_URL=http://sandbox:8080

				      # Uncomment line below to have Khoj run code in remote E2B code sandbox instead of the self-hosted Terrarium sandbox above. Get your E2B API key from https://e2b.dev/.

				      # - E2B_API_KEY=your_e2b_api_key

				      # Default URL of SearxNG, the default web search engine used by Khoj. Its container is specified above

				      - KHOJ_SEARXNG_URL=http://search:8080

				      # Uncomment line below to use with Ollama running on your local machine at localhost:11434.

				      # Change URL to use with other OpenAI API compatible providers like VLLM, LMStudio etc.

				      # - OPENAI_BASE_URL=http://host.docker.internal:11434/v1/

				      #

				      # Uncomment appropriate lines below to use chat models by OpenAI, Anthropic, Google.

				      # Ensure you set your provider specific API keys.

				      # ---

				      # - OPENAI_API_KEY=your_openai_api_key

				      # - GEMINI_API_KEY=your_gemini_api_key

				      # - ANTHROPIC_API_KEY=your_anthropic_api_key

				      #

				      # Uncomment line below to enable Khoj to use its computer.

				      # - KHOJ_OPERATOR_ENABLED=True

				      # Uncomment appropriate lines below to enable web results with Khoj

				      # Ensure you set your provider specific API keys.

				      # ---

				      # Free, Slower API. Does both web search and webpage read. Get API key from https://jina.ai/

				      # - JINA_API_KEY=your_jina_api_key

				      # Paid, Fast API. Only does web search. Get API key from https://serper.dev/

				      # - SERPER_DEV_API_KEY=your_serper_dev_api_key

				      # Paid, Fast, Open API. Only does webpage read. Get API key from https://firecrawl.dev/

				      # - FIRECRAWL_API_KEY=your_firecrawl_api_key

				      # Paid, Fast, Higher Read Success API. Only does webpage read. Get API key from https://olostep.com/

				      # - OLOSTEP_API_KEY=your_olostep_api_key

				      #

				      # Uncomment the necessary lines below to make your instance publicly accessible.

				      # Proceed with caution, especially if you are using anonymous mode.

				      # ---

				      # - KHOJ_NO_HTTPS=True

				      # Replace the KHOJ_DOMAIN with the server's externally accessible domain or I.P address from a remote machine (no http/https prefix).

				      # Ensure this is set correctly to avoid CSRF trusted origin or unset cookie issue when trying to access the admin panel.

				      # - KHOJ_DOMAIN=192.168.0.104

				      # - KHOJ_DOMAIN=khoj.example.com

				      # Replace the KHOJ_ALLOWED_DOMAIN with the server's internally accessible domain or I.P address on the host machine (no http/https prefix).

				      # Only set if using a load balancer/reverse_proxy in front of your Khoj server. If unset, it defaults to KHOJ_DOMAIN.

				      # For example, if the load balancer service is added to the khoj docker network, set KHOJ_ALLOWED_DOMAIN to khoj's docker service name: `server'.

				      # - KHOJ_ALLOWED_DOMAIN=server

				      # - KHOJ_ALLOWED_DOMAIN=127.0.0.1

				      # Uncomment the line below to disable telemetry.

				      # Telemetry helps us prioritize feature development and understand how people are using Khoj

				      # Read more at https://docs.khoj.dev/miscellaneous/telemetry

				      # - KHOJ_TELEMETRY_DISABLE=True

				    # Comment out this line when you're using the official ghcr.io/khoj-ai/khoj-cloud:latest prod image.

				    command: --host="0.0.0.0" --port=42110 -vv --anonymous-mode --non-interactive

				volumes:

				  khoj_config:

				  khoj_db:

				  khoj_models:

				  khoj_search:

				  khoj_computer:

BIN docs/interfaces.png (binary file not shown; size before: 994 KiB)

BIN docs/khoj_pwa_android.png (binary file not shown; size before: 389 KiB)

20

documentation/.gitignore vendored Normal file


@@ -0,0 +1,20 @@
 # Dependencies
 /node_modules
 # Production
 /build
 # Generated files
 .docusaurus
 .cache-loader
 # Misc
 .DS_Store
 .env.local
 .env.development.local
 .env.test.local
 .env.production.local
 npm-debug.log*
 yarn-debug.log*
 yarn-error.log*

									
										41

documentation/README.md
									
										Normal file
									
				@@ -0,0 +1,41 @@

				# Website

				This website is built using [Docusaurus](https://docusaurus.io/), a modern static website generator.

				### Installation

				```

				$ yarn

				```

				### Local Development

				```

				$ yarn start

				```

				This command starts a local development server and opens up a browser window. Most changes are reflected live without having to restart the server.

				### Build

				```

				$ yarn build

				```

				This command generates static content into the `build` directory and can be served using any static contents hosting service.

				### Deployment

				Using SSH:

				```

				$ USE_SSH=true yarn deploy

				```

				Not using SSH:

				```

				$ GIT_USER=<Your GitHub username> yarn deploy

				```

				If you are using GitHub pages for hosting, this command is a convenient way to build the website and push to the `gh-pages` branch.

0

src/init.py → documentation/assets/.nojekyll

View File

BIN documentation/assets/img/admin_get_emali_login.png Normal file. Binary file not shown. After: 89 KiB
BIN documentation/assets/img/admin_successful_login_url.png Normal file. Binary file not shown. After: 66 KiB
BIN documentation/assets/img/agents_page_full.png Normal file. Binary file not shown. After: 511 KiB
BIN documentation/assets/img/chrome_pwa_alt.png Normal file. Binary file not shown. After: 500 KiB
BIN documentation/assets/img/dream_house.png Normal file. Binary file not shown. After: 3.0 MiB
BIN documentation/assets/img/example_chatmodel_option.png Normal file. Binary file not shown. After: 34 KiB
BIN documentation/assets/img/example_openai_processor_config.png Normal file. Binary file not shown. After: 21 KiB
BIN documentation/assets/img/example_search_model_admin_settings.png Normal file. Binary file not shown. After: 1.2 MiB
BIN documentation/assets/img/favicon-128x128.ico Normal file. Binary file not shown. After: 170 KiB
BIN documentation/assets/img/file_filters_conversation.png Normal file. Binary file not shown. After: 18 KiB
BIN documentation/assets/img/khoj-logo-sideways-200.png Normal file. Binary file not shown. After: 13 KiB
BIN documentation/assets/img/khoj-logo-sideways-500.png Normal file. Binary file not shown. After: 36 KiB
5385 documentation/assets/img/khoj-logo-sideways.svg Normal file. File diff suppressed because one or more lines are too long. After: 1.2 MiB
0 docs/khoj_architecture.png → documentation/assets/img/khoj_architecture.png. Before: 350 KiB. After: 350 KiB
BIN documentation/assets/img/khoj_automation_email.png Normal file. Binary file not shown. After: 232 KiB
BIN documentation/assets/img/khoj_chat_on_emacs.png Normal file. Binary file not shown. After: 302 KiB
BIN documentation/assets/img/khoj_chat_on_obsidian.png Normal file. Binary file not shown. After: 394 KiB
BIN documentation/assets/img/khoj_chat_on_web.png Normal file. Binary file not shown. After: 187 KiB
74 documentation/assets/img/khoj_clients.svg Normal file. File diff suppressed because one or more lines are too long. After: 27 KiB
BIN documentation/assets/img/khoj_codebase_visualization_0.2.1.png Normal file. Binary file not shown. After: 544 KiB
BIN documentation/assets/img/khoj_create_automation.png Normal file. Binary file not shown. After: 102 KiB
62 documentation/assets/img/khoj_datasources.svg Normal file. File diff suppressed because one or more lines are too long. After: 43 KiB
BIN documentation/assets/img/khoj_documentation.png Normal file. Binary file not shown. After: 868 KiB
BIN documentation/assets/img/khoj_emacs_menu.png Normal file. Binary file not shown. After: 49 KiB
BIN documentation/assets/img/khoj_obsidian_codebase_visualization_0.2.1.png Normal file. Binary file not shown. After: 333 KiB
BIN documentation/assets/img/khoj_pwa_android.png Normal file. Binary file not shown. After: 445 KiB
BIN documentation/assets/img/khoj_search_on_emacs.png Normal file. Binary file not shown. After: 420 KiB
BIN documentation/assets/img/khoj_search_on_obsidian.png Normal file. Binary file not shown. After: 478 KiB
BIN documentation/assets/img/khoj_search_on_web.png Normal file. Binary file not shown. After: 268 KiB
BIN documentation/assets/img/khoj_web_app_home.png Normal file. Binary file not shown. After: 214 KiB
BIN documentation/assets/img/magic_link.png Normal file. Binary file not shown. After: 23 KiB
BIN documentation/assets/img/plants_i_got.png Normal file. Binary file not shown. After: 3.0 MiB
BIN documentation/assets/img/pwa_install_1.png Normal file. Binary file not shown. After: 103 KiB
BIN documentation/assets/img/pwa_install_2.png Normal file. Binary file not shown. After: 119 KiB
BIN documentation/assets/img/pwa_install_3.png Normal file. Binary file not shown. After: 265 KiB
BIN documentation/assets/img/pwa_install_desktop.png Normal file. Binary file not shown. After: 15 KiB
BIN documentation/assets/img/quadratic_equation_khoj_web.gif Normal file. Binary file not shown. After: 19 MiB
BIN documentation/assets/img/search_agents_markdown.png Normal file. Binary file not shown. After: 336 KiB
BIN documentation/assets/img/select_file_filter.png Normal file. Binary file not shown. After: 31 KiB
BIN documentation/assets/img/speaker_icon.png Normal file. Binary file not shown. After: 9.4 KiB
BIN documentation/assets/img/summarize.jpg Normal file. Binary file not shown. After: 94 KiB
BIN documentation/assets/img/text_to_speech.png Normal file. Binary file not shown. After: 4.9 KiB
BIN documentation/assets/img/using_khoj_for_studying.gif Normal file. Binary file not shown. After: 32 MiB

									
										3

documentation/babel.config.js
									
										Normal file
									
												View File
												
				@@ -0,0 +1,3 @@

				module.exports = {

				  presets: [require.resolve('@docusaurus/core/lib/babel/preset')],

				};

									
										8

documentation/docs/advanced/_category_.json
									
										Normal file
									
												View File
												
				@@ -0,0 +1,8 @@

				{

				  "label": "Advanced Self Hosting",

				  "position": 6,

				  "link": {

				    "type": "generated-index",

				    "description": "Advanced setup for Self Hosting Khoj server"

				  }

				}

									
										77

documentation/docs/advanced/admin.md
									
										Normal file
									
												View File
												
				@@ -0,0 +1,77 @@

				# Admin Panel

				> Describes the Khoj settings configurable via the admin panel

				By default, your admin panel is available at `http://localhost:42110/server/admin/`. You can access the admin panel by logging in with your admin credentials (this would be your `KHOJ_ADMIN_EMAIL` and `KHOJ_ADMIN_PASSWORD`). The admin panel allows you to configure various settings for your Khoj server.

				## App Settings

				### Agents

				Add all the agents you want to use for your different use-cases like Writer, Researcher, Therapist etc.

				- `Personality`: This is a prompt to tell the chat model how to tune the personality of the agent.

				- `Chat model`: The chat model to use for the agent.

				- `Name`: The name of the agent. This field helps give the agent a unique identity across the app.

				- `Avatar`: URL to the agent's profile picture. It helps give the agent a unique visual identity across the app.

				- `Style color`, `Style icon`: These fields help give the agent a unique, visually identifiable identity across the app.

				- `Slug`: This is the agent name to use in urls.

				- `Public`: Check this if the agent is expected to be visible to all users on this Khoj server.

				- `Managed by admin`: Check this if the agent is managed by admin, not by any user.

				- `Creator`: The user who created the agent.

				- `Tools`: The list of tools available to this agent. Tools include notes, image, online. This field is not currently configurable and only supports all tools (i.e. `["*"]`).

				### Chat Model Options

				Add all the chat models you want to try, use and switch between for your different use-cases. For each chat model you add:

				- `Chat model`: The name of an [OpenAI](https://platform.openai.com/docs/models), [Anthropic](https://docs.anthropic.com/en/docs/about-claude/models#model-names), [Gemini](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/models#gemini-models) or [Offline](https://huggingface.co/models?pipeline_tag=text-generation&library=gguf) chat model.

				- `Model type`: The chat model provider like `OpenAI`, `Offline`.

				- `Vision enabled`: Set to `true` if your model supports vision. This is currently only supported for vision-capable OpenAI models like `gpt-4o`.

				- `Max prompt size`, `Subscribed max prompt size`: These are optional fields. They are used to truncate the context to the maximum context size that can be passed to the model. This can help with accuracy and cost-saving.<br />

				- `Tokenizer`: This is an optional field. It is used to accurately count tokens and truncate context passed to the chat model to stay within the model's max prompt size.

				  ![example configuration for chat model options](/img/example_chatmodel_option.png)

				### Server Chat Settings

				The server chat settings are used as:

				1. The default chat models for subscribed (`Advanced` field) and unsubscribed (`Default` field) users.

				2. The chat model for all intermediate steps like intent detection, web search etc. during chat response generation.

				If a server chat setting is not added, the first ChatModelOption in your config is used as the default chat model.

				To add a server chat setting:

				- Set your preferred default chat models in the `Default` fields of your [ServerChatSettings](http://localhost:42110/server/admin/database/serverchatsettings/)

				- The `Advanced` field doesn't need to be set when self-hosting. When unset, the `Default` chat model is used for all users and the intermediate steps.

				### AI Model API

				These settings configure APIs to interact with AI models.

				For each AI Model API you [add](http://localhost:42110/server/admin/database/aimodelapi/add):

				- `Api key`: Set to your [OpenAI](https://platform.openai.com/api-keys), [Anthropic](https://console.anthropic.com/account/keys) or [Gemini](https://aistudio.google.com/app/apikey) API keys.

				- `Name`: Give the configuration any friendly name like `OpenAI`, `Gemini`, `Anthropic`.

				- `Api base url`: Set the API base URL. This is only relevant to set if you're using another OpenAI-compatible proxy server like [Ollama](/advanced/ollama) or [LMStudio](/advanced/lmstudio).

				  ![example configuration for ai model api](/img/example_openai_processor_config.png)

				### Search Model Configs

				Search models are used to generate vector embeddings of your documents for natural language search and chat. You can choose any [embeddings models on HuggingFace](https://huggingface.co/models?pipeline_tag=sentence-similarity) to create vector embeddings of your documents for natural language search and chat.

				<img src="/img/example_search_model_admin_settings.png" alt="Example Search Model Settings" style={{width: 500}} />

				### Text to Image Model Options

				Add text to image generation models with these settings. Khoj currently supports text to image models available via the OpenAI, Stability or Replicate API.

				- `api-key`: Set to your OpenAI, Stability or Replicate API key.

				- `model`: Set the model name available over the selected model provider

				- `model-type`: Set to the appropriate model provider

				- `openai-config`: For image generation models available via OpenAI (compatible) API you can set the appropriate OpenAI Processor Conversation Settings instead of specifying the `api-key` field above

				### Speech to Text Model Options

				Add speech to text models with these settings. Khoj currently supports only the Whisper speech to text model, either via the OpenAI API or offline.

				### Voice Model Options

				Add text to speech models with these settings. Khoj currently supports models from [ElevenLabs](https://elevenlabs.io/).

				### Reflective Questions

				This is a static list of starter question suggestions for each user. It is not currently used in any client app. It used to be shown on the web app home page. We may turn it into a dynamic list of starter questions personalized to each user, say based on their recent conversations or synced knowledge base.

				## User Data

				- Users, Entrys, Conversations, Subscriptions, Github configs, Notion configs, User search configs, User conversation configs, User voice configs

				## Miscellaneous Data

				- Process Locks: Persistent Locks for Automations

				- Client Applications:

				  Client applications allow you to set up third-party applications that can query your Khoj server using a client application ID and secret. The secret goes in a bearer token.

74

documentation/docs/advanced/authentication.mdx Normal file

View File

@@ -0,0 +1,74 @@
 # Authenticate (Multi-User Setup)
 ```mdx-code-block
 import Tabs from '@theme/Tabs';
 import TabItem from '@theme/TabItem';
 ```
 By default, most of the instructions for self-hosting Khoj assume a single user, and so the default configuration is to run in anonymous mode. However, if you want to enable authentication, you can do so with either [Magic Links](#using-magic-links) or [Google OAuth](#using-google-oauth) as shown below. This can be helpful to make Khoj securely accessible to you and your team.
 :::tip[Note]
 Remove the `--anonymous-mode` flag from your khoj start up command or docker-compose file to enable authentication.
 :::
 ## Using Magic Links
 The most secure way to do this is to integrate with [Resend](https://resend.com).
 1. Set up your account at https://resend.com
 2. Set an environment variable for `RESEND_API_KEY`. You can get your API key [here](https://resend.com/api-keys).
 3. Set an environment variable for `RESEND_EMAIL`. This is the email address that will show up in your `from` field when sending magic links.
 This will allow you to automatically send sign-in links to users who want to log in.
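 As a minimal sketch of the Resend setup above, the two variables can be exported in the shell that starts Khoj. The values shown are placeholders, not real credentials, and the commented-out `khoj` line only illustrates starting the server without `--anonymous-mode`.

 ```bash
 # Placeholder values: substitute your own Resend API key and sender address.
 export RESEND_API_KEY="re_placeholder_key"
 export RESEND_EMAIL="khoj@example.com"

 # Start Khoj without --anonymous-mode so that authentication is enabled.
 # khoj
 echo "Magic links will be sent from: ${RESEND_EMAIL}"
 ```

 With Docker, the same two variables would instead go under the `environment` section of your docker-compose file.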
 It's still possible to use the magic links feature without Resend, but you'll need to manually send the magic links to users who want to log in.
 ## Manually sending magic links
 1. The user will have to enter their email address in the login page at http://localhost:42110/login.
     They'll click `Get Login Link`. Without the Resend API key, this will just create an unverified account for them in the backend
 <img src="/img/magic_link.png" alt="Magic link login form" width="400"/>
 2. You can get their magic link using the admin panel
     Go to the [admin panel](http://localhost:42110/server/admin/database/khojuser/). You'll see a list of users. Search for the user you want to send a magic link to. Tick the checkbox next to their row, and use the action drop down at the top to 'Get email login URL'. This will generate a magic link that you can send to the user, which will appear at the top of the admin interface.
     | Get email login URL | Retrieved login URL |
     |---------------------|---------------------|
     | <img src="/img/admin_get_emali_login.png" alt="Get user magic sign in link" width="400" />| <img src="/img/admin_successful_login_url.png" alt="Successfully retrieved a login URL" width="400" />|
 3. Send the magic link to the user. They can click on it to log in.
     Once they click on the link, they'll automatically be logged in. They'll have to repeat this process for every new device they want to log in from, but they shouldn't have to repeat it on the same device.
     A given magic link can only be used once. If the user tries to use it again, they'll be redirected to the login page to get a new magic link.
 ## Using Google OAuth
 For this method, you'll need to use the prod version of the Khoj package. You can install it as below:
 <Tabs groupId="server" queryString>
   <TabItem value="docker" label="Docker">
   Update your `docker-compose.yml` to use the prod image
       ```yaml
       image: ghcr.io/khoj-ai/khoj-cloud:latest
       ```
   </TabItem>
   <TabItem value="pip" label="Pip">
   ```bash
   pip install 'khoj[prod]'
   ```
   </TabItem>
 </Tabs>
 To set up your self-hosted Khoj with Google Auth, you need to create a project in the Google Cloud Console and enable the Google Auth API.
 To implement this, you'll need to:
 1. [Create authorization credentials](https://developers.google.com/identity/sign-in/web/sign-in) for your application.
 2. Open your [Google Cloud console](https://console.developers.google.com/apis/credentials) and create a configuration like below for the relevant `OAuth 2.0 Client IDs` project:
 ![Google auth login project settings](https://github.com/khoj-ai/khoj/assets/65192171/9bcbf6f4-197d-4d0c-973a-c10b1331c892)
 3. Configure these environment variables: `GOOGLE_CLIENT_SECRET` and `GOOGLE_CLIENT_ID`. You can find these values in the Google Cloud console, in the same place where you configured the authorized origins and redirect URIs.
 That's it! That should be all you have to do. Now, when you reload Khoj without `--anonymous-mode`, you should be able to use your Google account to sign in.
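 The following is a hedged sketch of a pre-flight check you could run before starting Khoj with Google OAuth. The helper name `check_google_oauth_env` and the placeholder values are assumptions for illustration; only the two variable names come from the steps above.

 ```bash
 # Hypothetical helper: fail early if either Google OAuth variable is unset or empty.
 check_google_oauth_env() {
   missing=0
   for name in GOOGLE_CLIENT_ID GOOGLE_CLIENT_SECRET; do
     # printenv prints nothing for a variable that is not exported
     if [ -z "$(printenv "$name")" ]; then
       echo "Missing environment variable: ${name}" >&2
       missing=1
     fi
   done
   return "$missing"
 }

 # Placeholder values: copy the real ones from your Google Cloud console.
 export GOOGLE_CLIENT_ID="placeholder-client-id.apps.googleusercontent.com"
 export GOOGLE_CLIENT_SECRET="placeholder-client-secret"

 if check_google_oauth_env; then
   echo "Google OAuth variables are set"
   # khoj   # start Khoj without --anonymous-mode
 fi
 ```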

									
										45

documentation/docs/advanced/gcp-vertex.md
									
										Normal file
									
												View File
												
				@@ -0,0 +1,45 @@

				# Google Vertex AI

				:::info

				This is only helpful for self-hosted users. If you're using [Khoj Cloud](https://app.khoj.dev), you can directly use any of the pre-configured AI models.

				:::

				Khoj can use Google's Gemini and Anthropic's Claude family of AI models from [Vertex AI](https://cloud.google.com/vertex-ai) on Google Cloud. Explore Anthropic and Gemini AI models available on Vertex AI's [Model Garden](https://console.cloud.google.com/vertex-ai/model-garden).

				## Setup

				1. Follow [these instructions](https://cloud.google.com/vertex-ai/generative-ai/docs/partner-models/use-claude#before_you_begin) to use models on GCP Vertex AI.

				2. Create [Service Account](https://console.cloud.google.com/apis/credentials/serviceaccountkey) credentials.

				   - Download the credentials keyfile in JSON format.

				   - Base64 encode the credentials JSON keyfile. For example, by running the following command from your terminal:

				     `base64 -i <service_account_credentials_keyfile.json>`

				3. Create a new [AI Model API](http://localhost:42110/server/admin/database/aimodelapi/add) on your Khoj admin panel.

				   - **Name**: `Google Vertex` (or whatever friendly name you prefer).

				   - **Api Key**: `base64 encoded json keyfile` from step 2.

				   - **Api Base Url**: `https://{MODEL_GCP_REGION}-aiplatform.googleapis.com/v1/projects/{YOUR_GCP_PROJECT_ID}`

				     - MODEL_GCP_REGION: A region the AI model is available in. For example `us-east5` works for [Claude](https://cloud.google.com/vertex-ai/generative-ai/docs/partner-models/use-claude#regions).

				     - YOUR_GCP_PROJECT_ID: Get your project ID from the [Google Cloud dashboard](https://console.cloud.google.com/home/dashboard).

				4. Create a new [Chat Model](http://localhost:42110/server/admin/database/chatmodel/add) on your Khoj admin panel.

				   - **Name**: `claude-3-7-sonnet@20250219`. Any Claude or Gemini model on Vertex's Model Garden should work.

				   - **Model Type**: `Anthropic` or `Google`

				   - **Ai Model API**: *the Google Vertex Ai Model API you created in step 3*

				   - **Max prompt size**: `60000` (replace with the max prompt size of your model)

				   - **Tokenizer**: *Do not set*

				5. Select the chat model on [your settings page](http://localhost:42110/settings) and start a conversation.
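Steps 2 and 3 above can be sketched in the shell as follows. This is an illustration under stated assumptions: the project ID `my-gcp-project` is hypothetical, and a throwaway stand-in file is used in place of a real service account keyfile so the snippet runs anywhere. It reads the file via stdin redirection rather than `base64 -i`, which behaves the same way for encoding.

```bash
# Assumed example values: replace with your own region and project ID.
MODEL_GCP_REGION="us-east5"
YOUR_GCP_PROJECT_ID="my-gcp-project"

# Value for the "Api Base Url" field of the AI Model API.
API_BASE_URL="https://${MODEL_GCP_REGION}-aiplatform.googleapis.com/v1/projects/${YOUR_GCP_PROJECT_ID}"

# Stand-in keyfile for illustration; point KEYFILE at your downloaded JSON keyfile instead.
KEYFILE="$(mktemp)"
printf '{"type": "service_account"}' > "$KEYFILE"

# Value for the "Api Key" field: the keyfile, base64 encoded on a single line.
API_KEY="$(base64 < "$KEYFILE" | tr -d '\n')"
rm -f "$KEYFILE"

echo "Api Base Url: ${API_BASE_URL}"
echo "Api Key: ${API_KEY}"
```

Stripping newlines with `tr` matters because some `base64` implementations wrap long output, and the admin field expects one continuous string.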

				## Troubleshooting & GCP AI Tips

				- Permission Denied?

				  Ensure your service account has the `Vertex AI User` role and that the API is enabled in your GCP project.

				- Region Errors?

				  Double-check that the model you're trying to use is supported in your selected region. Some Claude or Gemini models are restricted to specific regions like `us-east5` or `us-central1`.

				- Prompt Size Limitations

				  The "Max prompt size" should align with the limits defined in the model documentation. Exceeding it can cause requests to fail silently or inputs to be truncated.

				- Testing the API Key

				  Before adding it to Khoj, you can verify that your key works by making a simple curl request to Vertex AI. This helps debug auth issues early.

				- Use Environment Variables

				  For better security, consider keeping sensitive keys in environment variables and injecting them at runtime, for example when base64 encoding the keyfile.

				If you encounter any issues, the [Khoj Discord](https://discord.gg/BDgyabRM6e) is a great place to ask for help!

									
										34

documentation/docs/advanced/litellm.md
									
										Normal file
									
												View File
												
				@@ -0,0 +1,34 @@

				# LiteLLM

				:::info

				This is only helpful for self-hosted users. If you're using [Khoj Cloud](https://app.khoj.dev), you're limited to our first-party models.

				:::

				:::info

				Khoj natively supports local LLMs [available on HuggingFace in GGUF format](https://huggingface.co/models?library=gguf). Using an OpenAI API proxy with Khoj may be useful for ease of setup, trying new models or using commercial LLMs via API.

				:::

				[LiteLLM](https://docs.litellm.ai/docs/proxy/quick_start) exposes an OpenAI compatible API that proxies requests to other LLM API services. This provides a standardized API to interact with both open-source and commercial LLMs.

				Using LiteLLM with Khoj makes it possible to turn any LLM behind an API into your personal AI agent.

				## Setup

				1. Install LiteLLM

				   ```bash

				   pip install 'litellm[proxy]'

				   ```

				2. Start LiteLLM and use Mistral tiny via Mistral API

				   ```bash

				   export MISTRAL_API_KEY=<MISTRAL_API_KEY>

				   litellm --model mistral/mistral-tiny --drop_params

				   ```

				3. Create a new [AI Model API](http://localhost:42110/server/admin/database/aimodelapi/add) on your Khoj admin panel

				   - **Name**: `litellm`

				   - **Api Key**: `any string`

				   - **Api Base Url**: `<URL of your OpenAI Proxy API>`

				4. Create a new [Chat Model](http://localhost:42110/server/admin/database/chatmodel/add) on your Khoj admin panel.

				   - **Name**: `llama3.1` (replace with the name of your local model)

				   - **Model Type**: `Openai`

				   - **Ai Model Api**: *the litellm Ai Model API you created in step 3*

				   - **Max prompt size**: `20000` (replace with the max prompt size of your model)

				   - **Tokenizer**: *Do not set for OpenAI, Mistral, Llama3 based models*

				5. Go to [your config](http://localhost:42110/settings) and select the model you just created in the chat model dropdown.

									
										31

documentation/docs/advanced/lmstudio.md
									
										Normal file
									
												View File
												
				@@ -0,0 +1,31 @@

				# LM Studio

				:::warning[Unsupported]

				Khoj does not work with LM Studio anymore. Khoj leverages [json mode](https://platform.openai.com/docs/guides/structured-outputs#json-mode) extensively but LMStudio's API seems to have dropped support for json mode. [1](https://x.com/lmstudio/status/1770135858709975547), [2](https://lmstudio.ai/docs/api/structured-output)

				:::

				:::info

				This is only helpful for self-hosted users. If you're using [Khoj Cloud](https://app.khoj.dev), you're limited to our first-party models.

				:::

				:::info

				Khoj natively supports local LLMs [available on HuggingFace in GGUF format](https://huggingface.co/models?library=gguf). Using an OpenAI API proxy with Khoj may be useful for ease of setup, trying new models or using commercial LLMs via API.

				:::

				[LM Studio](https://lmstudio.ai/) is a desktop app to chat with open-source LLMs on your local machine. LM Studio provides a neat interface for folks comfortable with a GUI.

				LM Studio can expose an [OpenAI API compatible server](https://lmstudio.ai/docs/local-server). This makes it possible to turn chat models from LM Studio into your personal AI agents with Khoj.

				## Setup

				1. Install [LM Studio](https://lmstudio.ai/) and download your preferred Chat Model

				2. Go to the Server tab in LM Studio, select your preferred chat model and click the green Start Server button

				3. Create a new [AI Model API](http://localhost:42110/server/admin/database/aimodelapi/add/) on your Khoj admin panel

				   - **Name**: `lmstudio`

				   - **Api Key**: `any string`

				   - **Api Base Url**: `http://localhost:1234/v1/` (default for LMStudio)

				4. Create a new [Chat Model](http://localhost:42110/server/admin/database/chatmodel/add) on your Khoj admin panel.

				   - **Name**: `llama3.1` (replace with the name of your local model)

				   - **Model Type**: `Openai`

				   - **Ai Model Api**: *the lmstudio Ai Model Api you created in step 3*

				   - **Max prompt size**: `20000` (replace with the max prompt size of your model)

				   - **Tokenizer**: *Do not set for OpenAI, Mistral, Llama3 based models*

				5. Go to [your config](http://localhost:42110/settings) and select the model you just created in the chat model dropdown.

78

documentation/docs/advanced/ollama.mdx Normal file

View File

@@ -0,0 +1,78 @@
 # Ollama
 ```mdx-code-block
 import Tabs from '@theme/Tabs';
 import TabItem from '@theme/TabItem';
 ```
 :::info
 This is only helpful for self-hosted users. If you're using [Khoj Cloud](https://app.khoj.dev), you can use our first-party supported models.
 :::
 :::info
 Khoj can directly run local LLMs [available on HuggingFace in GGUF format](https://huggingface.co/models?library=gguf). The integration with Ollama is useful to run Khoj on Docker and have the chat models use your GPU or to try new models via CLI.
 :::
 Ollama allows you to run [many popular open-source LLMs](https://ollama.com/library) locally from your terminal.
 For folks comfortable with the terminal, Ollama's terminal based flows can ease setup and management of chat models.
 Ollama exposes a local [OpenAI API compatible server](https://github.com/ollama/ollama/blob/main/docs/openai.md#models). This makes it possible to use chat models from Ollama with Khoj.
 ## Setup
 :::info
 Restart your Khoj server after first run or update to the settings below to ensure all settings are applied correctly.
 :::
 <Tabs groupId="type" queryString>
   <TabItem value="first-run" label="First Run">
     <Tabs groupId="server" queryString>
       <TabItem value="docker" label="Docker">
 1. Set up Ollama: https://ollama.com/
 2. Download your preferred chat model with Ollama. For example,
          ```bash
          ollama pull llama3.1
          ```
 3. Uncomment the `OPENAI_BASE_URL` environment variable in your downloaded Khoj [docker-compose.yml](https://github.com/khoj-ai/khoj/blob/master/docker-compose.yml#:~:text=OPENAI_BASE_URL)
 4. Start Khoj docker for the first time to automatically integrate and load models from the Ollama running on your host machine
          ```bash
          # run below command in the directory where you downloaded the Khoj docker-compose.yml
          docker-compose up
          ```
       </TabItem>
       <TabItem value="pip" label="Pip">
 1. Set up Ollama: https://ollama.com/
 2. Download your preferred chat model with Ollama. For example,
          ```bash
          ollama pull llama3.1
          ```
 3. Set the `OPENAI_BASE_URL` environment variable to `http://localhost:11434/v1/` in your shell before starting Khoj for the first time
          ```bash
          export OPENAI_BASE_URL="http://localhost:11434/v1/"
          khoj --anonymous-mode
          ```
       </TabItem>
    </Tabs>
   </TabItem>
   <TabItem value="update" label="Update">
 1. Set up Ollama: https://ollama.com/
 2. Download your preferred chat model with Ollama. For example,
       ```bash
       ollama pull llama3.1
       ```
 3. Create a new [AI Model API](http://localhost:42110/server/admin/database/aimodelapi/add) on your Khoj admin panel
       - **Name**: `ollama`
       - **Api Key**: `any string`
       - **Api Base Url**: `http://localhost:11434/v1/` (default for Ollama)
 4. Create a new [Chat Model](http://localhost:42110/server/admin/database/chatmodel/add) on your Khoj admin panel.
       - **Name**: `llama3.1` (replace with the name of your local model)
       - **Model Type**: `Openai`
       - **AI Model API**: *the ollama AI Model API you created in step 3*
       - **Max prompt size**: `20000` (replace with the max prompt size of your model)
 5. Go to [your config](http://localhost:42110/settings) and select the model you just created in the chat model dropdown.
    If you want to add additional models running on Ollama, repeat step 4 for each model.
   </TabItem>
 </Tabs>
 That's it! You should now be able to chat with your Ollama model from Khoj.
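 If chat does not work, one quick check is whether the OpenAI-compatible endpoint is reachable from where Khoj runs. The sketch below assumes `curl` is installed and that the server answers on the standard OpenAI `models` route under the base URL; the helper name `check_openai_api` is hypothetical.

 ```bash
 # Hypothetical helper: succeeds only if the models route under the given
 # base URL (which should end in /v1/) responds without an HTTP error.
 check_openai_api() {
   curl --silent --fail --max-time 5 "${1}models" > /dev/null
 }

 # Default Ollama base URL used in the steps above.
 OLLAMA_BASE_URL="http://localhost:11434/v1/"

 if check_openai_api "$OLLAMA_BASE_URL"; then
   echo "Ollama API reachable at ${OLLAMA_BASE_URL}"
 else
   echo "Could not reach ${OLLAMA_BASE_URL}: is Ollama running?"
 fi
 ```

 When Khoj runs in Docker, `localhost` inside the container is not your host machine, so run the check against the same URL that is set in your docker-compose file.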

									
										20

documentation/docs/advanced/remote.md
									
										Normal file
									
												View File
												
				@@ -0,0 +1,20 @@

				# Remote Access

				By default, self-hosted Khoj is only accessible on the machine it is running on. To securely access it from a remote machine:

				- Set the `KHOJ_DOMAIN` environment variable to your remotely accessible IP or domain via shell or docker-compose.yml.

				  Examples: `KHOJ_DOMAIN=my.khoj-domain.com`, `KHOJ_DOMAIN=192.168.0.4`.

				- Ensure the Khoj Admin password and `KHOJ_DJANGO_SECRET_KEY` environment variable are securely set.

				- Set up [Authentication](/advanced/authentication).

				- Open access to the Khoj port (default: 42110) from your OS and Network firewall.
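The first two bullets above could look like the following in a shell session. This is a sketch, not the only way to do it: the IP address and password are placeholders, and generating the secret key from `/dev/urandom` is one assumed approach among many.

```bash
# Placeholder: the IP or domain at which this machine is reachable remotely.
export KHOJ_DOMAIN="192.168.0.4"

# Generate a random 64-character hex string to use as the Django secret key.
KHOJ_DJANGO_SECRET_KEY="$(head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n')"
export KHOJ_DJANGO_SECRET_KEY

# Placeholder: pick a strong admin password of your own.
export KHOJ_ADMIN_PASSWORD="replace-with-a-strong-password"

# Then start Khoj listening on all interfaces, without --anonymous-mode:
# khoj --host="0.0.0.0" --port=42110
echo "Khoj will be served for domain: ${KHOJ_DOMAIN}"
```

Store the generated secret key somewhere persistent, since a new value on every restart would invalidate existing sessions.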

				:::warning[Use HTTPS certificate]

				To expose Khoj on a custom domain over the public internet, use of an SSL certificate is strongly recommended. You can use [Let's Encrypt](https://letsencrypt.org/) to get a free SSL certificate for your domain.

				To disable HTTPS, set the `KHOJ_NO_HTTPS` environment variable to `True`. This can be useful if Khoj is only accessible behind a secure, private network.

				:::

				:::info[Try Tailscale]

				You can use [Tailscale](https://tailscale.com/) for easy, secure access to your self-hosted Khoj over the network.

				1. Set `KHOJ_DOMAIN` to your machine's [Tailscale IP](https://tailscale.com/kb/1452/connect-to-devices#identify-your-devices) or [FQDN on tailnet](https://tailscale.com/kb/1081/magicdns#fully-qualified-domain-names-vs-machine-names). E.g. `KHOJ_DOMAIN=100.4.2.0` or `KHOJ_DOMAIN=khoj.tailfe8c.ts.net`

				2. Access Khoj by opening `http://tailscale-ip-of-server:42110` or `http://fqdn-of-server:42110` from any device on your tailscale network

				:::

									
										17

documentation/docs/advanced/support-multilingual-docs.md
									
										Normal file
									
												View File
												
				@@ -0,0 +1,17 @@

				# Support Multilingual Docs

				Khoj uses an embedding model to understand documents. Multilingual embedding models improve the search quality for documents not in English. This affects both search and chat with docs experiences across Khoj.

				To improve search and chat quality for non-English documents you can use a [multilingual model](https://www.sbert.net/docs/pretrained_models.html#multi-lingual-models).<br />

				For example, the [paraphrase-multilingual-MiniLM-L12-v2](https://huggingface.co/sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2) model supports [50+ languages](https://www.sbert.net/docs/pretrained_models.html#:~:text=we%20used%20the%20following%2050%2B%20languages) and has decent search quality and speed on a consumer machine.

				To use it:

				1. Open [the search config](http://localhost:42110/server/admin/database/searchmodelconfig/) on your server's admin settings page. Either create a new search model, if none exists, or update the existing one. For example,

				   - Set the `bi_encoder` field to `sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2`

				   - Set the `cross_encoder` field to `mixedbread-ai/mxbai-rerank-xsmall-v1`

				2. Regenerate your content index from all the relevant clients. This step is very important, as you'll need to re-encode all your content with the new model.

				:::info[Note]

				Modern search/embedding models like [mixedbread-ai/mxbai-embed-large-v1](https://huggingface.co/mixedbread-ai/mxbai-embed-large-v1) expect a prefix on the query (or docs) string to improve encoding. Update the `bi_encoder_query_encode_config` field of your [embedding model](http://localhost:42110/server/admin/database/searchmodelconfig/) with `{"prompt": "<prefix-prompt>"}` to improve the search quality of these models.

				E.g. `{"prompt": "Represent this query for searching documents"}`. You can pass any valid JSON object that the SentenceTransformer `encode` function accepts.

				:::
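Because the encode config has to be a valid JSON object, it can be worth checking the text before pasting it into the admin field. The sketch below is illustrative only: `parse_encode_config` is a hypothetical helper, not a Khoj function, and it assumes the field is parsed as strict JSON, which requires double-quoted keys.

```python
import json


def parse_encode_config(raw: str) -> dict:
    """Parse the text meant for the encode config field and require a JSON object."""
    config = json.loads(raw)  # raises json.JSONDecodeError on invalid JSON
    if not isinstance(config, dict):
        raise ValueError("encode config must be a JSON object")
    return config


# A quoted key parses cleanly.
config = parse_encode_config('{"prompt": "Represent this query for searching documents"}')

# An unquoted key is not valid JSON and is rejected.
try:
    parse_encode_config('{prompt: "Represent this query for searching documents"}')
    unquoted_key_accepted = True
except json.JSONDecodeError:
    unquoted_key_accepted = False
```

The keys of the parsed object are expected to line up with keyword arguments of `encode`, such as `prompt`.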

									
39 documentation/docs/advanced/tailscale.md Normal file
												
				@@ -0,0 +1,39 @@

				# Tailscale

				:::info

				This is only helpful for secure cross-device access to **self-hosted** Khoj. You **do not** need this if you're using [Khoj Cloud](https://app.khoj.dev).

				:::

				[Tailscale](https://tailscale.com) simplifies creating a private VPN using [WireGuard](https://www.wireguard.com/) and OAuth, so you can host and access services on your devices from anywhere.

				The instructions below are one way to simply and securely access your self-hosted Khoj from your phone, laptop etc.

				### Minimal Setup

				1. Set up Khoj on your preferred machine following the [standard steps](/get-started/setup).

				2. Sign up for [Tailscale](https://tailscale.com) and install the app on the machines you want to access Khoj from. This usually includes your Khoj server, your phone and your laptop. Note the Tailscale IP of your Khoj server.

				3. Start khoj on your server by including the flag `--host <your_server_tailscale_ip>`

				4. Open `http://<your_server_tailscale_ip>:42110` to access khoj from any device on your tailscale network!

				### HTTPS Certificate

				:::info

				Tailscale uses WireGuard to encrypt and route traffic between your machines, so HTTPS isn't required for secure access with Tailscale. HTTPS with Tailscale is only useful to stop browsers from showing security warnings and from blocking certain features, like clipboard access, that require HTTPS.

				:::

				1. Enable the [MagicDNS](https://tailscale.com/kb/1081/magicdns#enabling-magicdns) and [HTTPS](https://tailscale.com/kb/1153/enabling-https) toggles on the [DNS](https://login.tailscale.com/admin/dns) page of your tailscale admin console. Note your unique tailscale domain name (it usually ends with .ts.net).

				2. Create an HTTPS certificate for your Khoj server by running the following command:

				   ```bash
				   # Assuming the server is named `server` and your tailnet is `black-forest.ts.net`
				   # Note the path of the .crt and .key files generated
				   tailscale cert server.black-forest.ts.net
				   ```

				3. Start Khoj so it is served over HTTPS on the standard port (443):

				   ```bash
				   # Binding to port 443 usually requires root, hence sudo
				   sudo KHOJ_DOMAIN=server.black-forest.ts.net \
				   khoj \
				   --sslcert /path/to/your/tailscale.crt \
				   --sslkey /path/to/your/tailscale.key \
				   --host=server.black-forest.ts.net \
				   --port 443
				   ```

				4. You should now be able to access khoj on `https://server.black-forest.ts.net` from any device on your private tailscale network!
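If you would rather launch Khoj from a script than type the command in step 3 by hand, the same flags can be assembled programmatically. This is a rough sketch, not part of Khoj: `khoj_https_command` is a hypothetical helper, and it only reuses the flags and the `KHOJ_DOMAIN` variable shown above.

```python
import os


def khoj_https_command(domain: str, cert_path: str, key_path: str, port: int = 443):
    """Return (argv, env) for starting Khoj over HTTPS with the flags from step 3."""
    argv = [
        "khoj",
        "--sslcert", cert_path,
        "--sslkey", key_path,
        f"--host={domain}",
        "--port", str(port),
    ]
    # Copy the current environment and add KHOJ_DOMAIN for the child process.
    env = dict(os.environ, KHOJ_DOMAIN=domain)
    return argv, env


# Example launch (needs enough privileges to bind to port 443):
#   import subprocess
#   argv, env = khoj_https_command("server.black-forest.ts.net",
#                                  "/path/to/your/tailscale.crt",
#                                  "/path/to/your/tailscale.key")
#   subprocess.run(argv, env=env, check=True)
```

Passing the arguments as a list avoids shell quoting problems when the certificate paths contain spaces.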

									
34 documentation/docs/advanced/use-openai-proxy.md Normal file
												
				@@ -0,0 +1,34 @@

				---

				sidebar_position: 1

				---

				# Use OpenAI Proxy

				:::info

				This is only helpful for self-hosted users. If you're using [Khoj Cloud](https://app.khoj.dev), you're limited to our first-party models.

				:::

				:::info

				Khoj natively supports local LLMs [available on HuggingFace in GGUF format](https://huggingface.co/models?library=gguf). Using an OpenAI API proxy with Khoj may be useful for ease of setup, trying new models or using commercial LLMs via API.

				:::

				Khoj can use any OpenAI API compatible server including local providers like [Ollama](/advanced/ollama), [LMStudio](/advanced/lmstudio) and [LiteLLM](/advanced/litellm) and commercial providers like [HuggingFace](https://huggingface.co/docs/api-inference/tasks/chat-completion#using-the-api), [OpenRouter](https://openrouter.ai/docs/quick-start) etc.

				Configuring this allows you to use non-standard, open or commercial, local or hosted LLM models for Khoj.

				Combining them with Khoj can turn your favorite LLM into an AI agent, allowing you to chat with your docs, find answers from the internet, build custom agents and run automations.

				For specific integrations, see our [Ollama](/advanced/ollama), [LMStudio](/advanced/lmstudio) and [LiteLLM](/advanced/litellm) setup docs. For general instructions to set up Khoj with an OpenAI API proxy, see below.

				## General Setup

				1. Start your preferred OpenAI API compatible app locally or get API keys from commercial AI model providers.

				2. Create a new [AI Model API](http://localhost:42110/server/admin/database/aimodelapi/add) on your Khoj admin panel.

				   - **Name**: `any name`

				   - **Api Key**: `any string`

				   - **Api Base Url**: *The URL of your OpenAI-compatible API*

				3. Create a new [Chat Model](http://localhost:42110/server/admin/database/chatmodel/add) on your Khoj admin panel.

				   - **Name**: `llama3` (replace with the name of your local model)

				   - **Model Type**: `Openai`

				   - **Ai Model Api**: *The AI Model API you created in step 2*

				   - **Max prompt size**: `2000` (replace with the max prompt size of your model)

				   - **Tokenizer**: *Do not set for OpenAI, mistral, llama3 based models*

				4. Go to [your config](http://localhost:42110/settings) and select the model you just created in the chat model dropdown.
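Before entering the API base URL and model name into the admin panel, you may want to check that your server really answers OpenAI-style chat completion requests. The sketch below builds such a request with Python's standard library. The helper `build_chat_request` is hypothetical, and the base URL in the usage comment is only an example of what a local server might expose; substitute your own.

```python
import json
import urllib.request


def build_chat_request(api_base_url: str, api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build a POST request for the OpenAI-style /chat/completions endpoint."""
    url = api_base_url.rstrip("/") + "/chat/completions"
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),  # passing data makes this a POST
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


# Example usage (assumes a compatible server is listening at this address):
#   req = build_chat_request("http://localhost:11434/v1", "any string", "llama3", "Hello")
#   with urllib.request.urlopen(req, timeout=30) as resp:
#       print(json.load(resp)["choices"][0]["message"]["content"])
```

If this request succeeds, the same base URL, key and model name should be reasonable values for the **Api Base Url**, **Api Key** and **Name** fields above.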

									
8 documentation/docs/clients/_category_.json Normal file
												
				@@ -0,0 +1,8 @@

				{

				  "label": "Clients",

				  "position": 4,

				  "link": {

				    "type": "generated-index",

				    "description": "Different ways for indexing data with the Khoj backend. To see online data sources, go to https://docs.khoj.dev/category/data-sources"

				  }

				}

									
44 documentation/docs/clients/desktop.md Normal file
												
				@@ -0,0 +1,44 @@

				---

				sidebar_position: 1

				---

				# Desktop

				> Upload your knowledge base to Khoj and chat with your whole corpus

				## Companion App

				Share your files and folders with Khoj using the app.

				Khoj will keep these files in sync to provide contextual responses when you search or chat.

				## Setup

				:::info[Self Hosting]

				If you are self-hosting the Khoj server, update the *Settings* page on the Khoj Desktop app to:

				- Set the `Khoj URL` field to your Khoj server URL. By default, use `http://127.0.0.1:42110`.

				- Do not set the `Khoj API Key` field if your Khoj server runs in anonymous mode. For example, `khoj --anonymous-mode`

				:::

				1. Install the [Khoj Desktop app](https://khoj.dev/downloads) for your OS

				2. Generate an API key on the [Khoj Web App](https://app.khoj.dev/settings#clients)

				3. Set your Khoj API Key on the *Settings* page of the Khoj Desktop app

				4. [Optional] Add any files and folders you'd like Khoj to be aware of on the *Settings* page and click *Save*.

				   These files and folders will be automatically kept in sync for you.

				# Main App

				You can also install the Khoj application on your desktop as a progressive web app.

				1. Open the [Khoj Web App](https://app.khoj.dev) in Chrome.

				2. Click on the install button in the address bar to install the app. You must be logged into your Chrome browser for this to work.

				![progressive web app install icon](/img/pwa_install_desktop.png)

				Alternatively, you can also install using this route:

				1. Open the three-dot menu in the top right corner of the browser.

				2. Go to the "Cast, Save, and Share" option.

				3. Click on the "Open in Khoj" option.

				![progressive web app install route](/img/chrome_pwa_alt.png)

Compare commits

4017 Commits 0.2.1 ... 1.42.5

41 .devcontainer/Dockerfile Normal file
63 .devcontainer/devcontainer.json Normal file
29 .devcontainer/launch.json Normal file
14 .dockerignore
2 .gitattributes vendored Normal file
144 .github/ISSUE_TEMPLATE/bug-report.yml vendored Normal file
54 .github/ISSUE_TEMPLATE/feature-request.yml vendored Normal file
45 .github/workflows/build.yml vendored
39 .github/workflows/build_khoj_el.yml vendored Normal file
99 .github/workflows/desktop.yml vendored Normal file
178 .github/workflows/dockerize.yml vendored Normal file
47 .github/workflows/dockerize_telemetry_server.yml vendored Normal file
46 .github/workflows/github_pages_deploy.yml vendored Normal file
48 .github/workflows/pre-commit.yml vendored Normal file
95 .github/workflows/publish.yml vendored
81 .github/workflows/pypi.yml vendored Normal file
127 .github/workflows/release.yml vendored
217 .github/workflows/run_evals.yml vendored Normal file
81 .github/workflows/test.yml vendored
52 .github/workflows/test_khoj_el.yml vendored Normal file
30 .gitignore vendored
13 .mypy.ini
33 .pre-commit-config.yaml Normal file
37 .vscode/launch.json vendored Normal file
7 .vscode/settings.json vendored Normal file
66 Dockerfile
7 Khoj.desktop
115 Khoj.spec
151 LICENSE
5 MANIFEST.in
108 README.md Normal file
321 Readme.md
129 computer.Dockerfile Normal file
21 config/environment.yml
116 config/environment_osx-arm64.yml
54 config/khoj_docker.yml
52 config/khoj_sample.yml
144 docker-compose.yml
BIN docs/interfaces.png
BIN docs/khoj_pwa_android.png
20 documentation/.gitignore vendored Normal file
41 documentation/README.md Normal file
0 src/__init__.py → documentation/assets/.nojekyll
BIN documentation/assets/img/admin_get_emali_login.png Normal file
BIN documentation/assets/img/admin_successful_login_url.png Normal file
BIN documentation/assets/img/agents_page_full.png Normal file
BIN documentation/assets/img/chrome_pwa_alt.png Normal file
BIN documentation/assets/img/dream_house.png Normal file
BIN documentation/assets/img/example_chatmodel_option.png Normal file
BIN documentation/assets/img/example_openai_processor_config.png Normal file
BIN documentation/assets/img/example_search_model_admin_settings.png Normal file
BIN documentation/assets/img/favicon-128x128.ico Normal file
BIN documentation/assets/img/file_filters_conversation.png Normal file
BIN documentation/assets/img/khoj-logo-sideways-200.png Normal file
BIN documentation/assets/img/khoj-logo-sideways-500.png Normal file
5385 documentation/assets/img/khoj-logo-sideways.svg Normal file
0 docs/khoj_architecture.png → documentation/assets/img/khoj_architecture.png
BIN documentation/assets/img/khoj_automation_email.png Normal file
BIN documentation/assets/img/khoj_chat_on_emacs.png Normal file
BIN documentation/assets/img/khoj_chat_on_obsidian.png Normal file
BIN documentation/assets/img/khoj_chat_on_web.png Normal file
74 documentation/assets/img/khoj_clients.svg Normal file
BIN documentation/assets/img/khoj_codebase_visualization_0.2.1.png Normal file
BIN documentation/assets/img/khoj_create_automation.png Normal file
62 documentation/assets/img/khoj_datasources.svg Normal file
BIN documentation/assets/img/khoj_documentation.png Normal file
BIN documentation/assets/img/khoj_emacs_menu.png Normal file
BIN documentation/assets/img/khoj_obsidian_codebase_visualization_0.2.1.png Normal file
BIN documentation/assets/img/khoj_pwa_android.png Normal file
BIN documentation/assets/img/khoj_search_on_emacs.png Normal file
BIN documentation/assets/img/khoj_search_on_obsidian.png Normal file
BIN documentation/assets/img/khoj_search_on_web.png Normal file
BIN documentation/assets/img/khoj_web_app_home.png Normal file
BIN documentation/assets/img/magic_link.png Normal file
BIN documentation/assets/img/plants_i_got.png Normal file
BIN documentation/assets/img/pwa_install_1.png Normal file
BIN documentation/assets/img/pwa_install_2.png Normal file
BIN documentation/assets/img/pwa_install_3.png Normal file

BIN documentation/assets/img/pwa_install_desktop.png Normal file